Introduction to Sampling

The way in which we select a sample of

individuals to be research participants is critical. How we select participants

(random sampling) will determine the population to which we may generalize our

research findings. The procedure that we use for assigning participants to different

treatment conditions (random assignment) will determine whether bias exists in

our treatment groups.

Before describing sampling procedures, we need to define a few key terms. The term

population means all members that meet a set of specifications or a specified

criterion. For example, the population of the United States is defined as all

people residing in the United States. The population of New Orleans means all

people living within the city’s limits or boundary. A population of inanimate

objects can also exist, such as all automobiles manufactured in Michigan in the

year 2003. A single member of any given population is referred to as an

element. When only some elements are selected from a population, we refer to

that as a sample; when all elements are included, we call it a census.

Data derived from a sample are treated statistically. Using sample data, we

calculate various statistics, such as the mean and standard deviation. These

sample statistics summarize (describe) aspects of the sample data. These data,

when treated with other statistical procedures, allow us to make certain

inferences. From the sample statistics, we make corresponding estimates of the

population. Thus, from the sample mean, we estimate the population mean; from

the sample standard deviation, we estimate the population standard deviation.
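As a minimal sketch of the step from sample statistics to population estimates, the Python snippet below computes a sample mean and a sample standard deviation. The scores and the variable names are hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical sample of eight test scores (illustrative values only).
sample_scores = [72, 85, 90, 66, 78, 95, 81, 73]

# Sample mean: serves as the estimate of the population mean.
sample_mean = statistics.mean(sample_scores)

# Sample standard deviation (n - 1 in the denominator): serves as the
# estimate of the population standard deviation.
sample_sd = statistics.stdev(sample_scores)

print(f"estimated population mean: {sample_mean:.2f}")
print(f"estimated population standard deviation: {sample_sd:.2f}")
```

How trustworthy these estimates are depends on how the sample was drawn, which is the subject of the sections that follow.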

Types of Sampling

Simple Random Sampling

Researchers use two major sampling techniques:

probability sampling and nonprobability sampling. With probability sampling, a

researcher can specify the probability of an element’s (participant’s) being

included in the sample. With nonprobability sampling, there is no way of

estimating the probability of an element’s being included in a sample. If the

researcher’s interest is in generalizing the findings derived from the sample

to the general population, then probability sampling is far more useful and

precise. Unfortunately, it is also much more difficult and expensive than

nonprobability sampling.

Probability sampling is also referred to as random sampling or representative sampling. The

word random describes the procedure used to select elements (participants,

cars, test items) from a population.

When random sampling is used, each element in the population has an

equal chance of being selected (simple random sampling) or a known probability

of being selected (stratified random sampling). The sample is referred to as

representative because the characteristics of a properly drawn sample tend to

mirror those of the parent population.

One caution before we begin our description of simple random sampling: Random

sampling is different from random assignment. Random assignment describes the

process of placing participants into different experimental groups.
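To make the distinction concrete, here is a hedged Python sketch of random assignment: participants who are already in the study are shuffled and dealt into conditions in turn. The function name, the participant labels, and the two condition names are assumptions made for this illustration.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle the participants, then deal them into the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical participant labels P01 .. P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, ["treatment", "control"], seed=1)

for condition, members in groups.items():
    print(condition, len(members), members)
```

Nothing here selects people from a population. The code only decides which group each participant joins, which is why random assignment and random sampling answer different questions.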

Step 1. Defining the Population

Before a sample is taken, we must first define

the population to which we want to generalize our results. The population of

interest may differ for each study we undertake. It could be the population of

professional football players in the United States or the registered voters in

Bowling Green, Ohio. It could also be all college students at a given

university, or all sophomores at that institution. It could be female students,

or introductory psychology students, or 10-year-old children in a particular school,

or members of the local senior citizens center. The point should be clear: the

sample should be drawn from the population to which you want to generalize, that is, the

population in which you are interested.

It is unfortunate that many researchers fail to make explicit their population of

interest. Many investigators use only college students in their samples, yet

their interest is in the adult population of the United States. To a large

extent, the generalizability of sample data depends on what is being studied

and the inferences that are being made.

Step 2. Constructing a List

Before a sample can be chosen randomly, it is

necessary to have a complete list of the population from which to select. In

some cases, the logistics and expense of constructing a list of the entire

population are simply too great, and an alternative procedure is forced upon the

investigator. We could avoid this problem by restricting our population of

interest—by defining it narrowly. However, doing so might increase the

difficulty of finding or constructing a list from which to make our random

selection. For example, you would have no difficulty identifying female

students at any given university and then constructing a list of their names

from which to draw a random sample. It would be more difficult to identify female students coming from a

three-child family, and even more difficult if you narrowed your interest to

firstborn females in a three-child family. Moreover, defining a population narrowly also means

generalizing results narrowly.

Caution must be exercised in compiling a list

or in using one already constructed. The population list from which you intend

to sample must be both recent and exhaustive. If not, problems can occur. By an

exhaustive list, we mean that all members of the population must appear on the

list. Voter registration lists, telephone directories, homeowner lists, and

school directories are sometimes used, but these lists may have limitations.

They must be up to date and complete if the samples chosen from them are to be

truly representative of the population. In addition, such lists may provide

very biased samples for some research questions we ask.

Step 3. Drawing the Sample

After a list of population members has been

constructed, various random sampling options are available. Some common ones

include tossing dice, flipping coins, spinning wheels, drawing names out of a

rotating drum, using a table of random numbers, and using computer programs.

Except for the last two methods, most of the techniques are slow and

cumbersome. Tables of random numbers are easy to use, accessible, and truly

random. Random number tables and random number generators are also freely

available online.
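Drawing the sample by computer program can be sketched in a few lines of Python. This is one possible approach under stated assumptions: the population list of 1,200 student identifiers is invented, and the function name is hypothetical. The standard-library call `random.sample` draws without replacement, so every element on the list has the same chance of selection.

```python
import random

def simple_random_sample(population_list, sample_size, seed=None):
    """Draw sample_size distinct elements, each with an equal chance."""
    rng = random.Random(seed)   # a seed makes the draw reproducible
    return rng.sample(population_list, sample_size)

# Hypothetical population list: 1,200 student identifiers.
population_list = [f"student_{i:04d}" for i in range(1, 1201)]

sample = simple_random_sample(population_list, 50, seed=42)
print(sample[:5])
```

The draw is only as good as the list it starts from: anyone missing from `population_list` has no chance of being selected.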

Step 4. Contacting Members of a Sample

Researchers using random sampling procedures must be prepared to encounter difficulties at

several points. As we noted, the starting point is an accurate statement that

identifies the population to which we want to generalize. Then we must obtain a

listing of the population, accurate and up-to-date, from which to draw our

sample. Further, we must decide on the random selection procedure that we wish

to use. Finally, we must contact each of those selected for our sample and

obtain the information needed. Failing to contact all individuals in the sample

can be a problem, and the representativeness of the sample can be lost at this

point.

Stratified Random Sampling

The procedure known as stratified random

sampling is also a form of probability sampling. To stratify means to classify

or to separate people into groups according to some characteristics, such as

position, rank, income, education, sex, or ethnic background. These separate

groupings are referred to as subsets or subgroups. For a stratified random

sample, the population is divided into groups or strata. A random sample is

selected from each stratum based upon the percentage that each subgroup

represents in the population. Stratified random samples are generally more

accurate in representing the population than are simple random samples. They

also require more effort, and there is a practical limit to the number of

strata used. Because participants are to be chosen randomly from each stratum,

a complete list of the population within each stratum must be constructed.

Stratified sampling is generally used in two different ways. In one, primary

interest is in the representativeness of the sample for purposes of commenting

on the population. In the other, the focus of interest is comparison between

and among the strata.
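The first use, a sample that mirrors the population's makeup, can be sketched as proportional allocation. The Python example below is an illustration under assumptions: the class-year strata, their sizes, and the function name are hypothetical.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(population, stratum_of, sample_size, seed=None):
    """Randomly sample each stratum in proportion to its population share."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)

    sample = {}
    for name, members in strata.items():
        # Rounding means the total can differ slightly from sample_size.
        n = round(sample_size * len(members) / len(population))
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical student list of (student id, class year) pairs.
sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
population = [(f"{year}_{i}", year) for year, count in sizes.items() for i in range(count)]

sample = proportional_stratified_sample(population, lambda s: s[1], 100, seed=3)
for year, members in sample.items():
    print(year, len(members))
```

Note that the function needs a complete list for every stratum, which is the extra effort the text mentions.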

Stratified samples are sometimes used to optimize group comparisons. In this case, we are

not concerned about representing the total population. Instead, our focus is on

comparisons involving two or more strata. If the groups involved in our

comparisons are equally represented in the population, a single random sample

could be used. When this is not the case, a different procedure is necessary.

For example, if we were interested in making comparisons between whites and

blacks, a simple random sample of 100 people might include about 85 to 90

whites and only 10 to 15 blacks. This is hardly a satisfactory sample for

making comparisons. With a stratified random sample, we could randomly choose

50 whites and 50 blacks and thus optimize our comparison. Whenever strata

rather than the population are our primary interest, we can sample in different

proportions from each stratum. Although random sampling is optimal from a

methodological point of view, it is not always possible from a practical point

of view.
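The contrast drawn above between a single random sample and equal samples from each stratum can be sketched in Python. The population size, the 87/13 split, and the neutral group labels A and B are assumptions made for illustration.

```python
import random
from collections import Counter

rng = random.Random(7)

# Hypothetical population of 10,000 people: 87% in group A, 13% in group B.
population = [("A", i) for i in range(8_700)] + [("B", i) for i in range(1_300)]

# A simple random sample of 100 tends to mirror the lopsided population split.
srs = rng.sample(population, 100)
srs_counts = Counter(group for group, _ in srs)

# Equal allocation: draw 50 at random from each stratum separately.
per_stratum = 50
strata = {
    "A": [p for p in population if p[0] == "A"],
    "B": [p for p in population if p[0] == "B"],
}
equal_sample = {name: rng.sample(members, per_stratum) for name, members in strata.items()}

print("simple random sample counts:", dict(srs_counts))
print("equal allocation counts:", {name: len(s) for name, s in equal_sample.items()})
```

An equally allocated sample suits comparisons between strata. It no longer mirrors the population as a whole, so population-level estimates from it would need to be weighted by each stratum's share.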

Convenience Sampling

Convenience sampling is used because it is

quick, inexpensive, and convenient. Convenience samples are useful for certain

purposes, and they require very little planning. Researchers simply use

participants who are available at the moment. The procedure is casual and easy,

relative to random sampling. Contrast using any available participants with

random sampling, where you must (1) have a well-defined population, (2) construct

a list of members of the population if one is not available, (3) sample

randomly from the list, and (4) contact and use as many individuals from the

list as possible. Convenience sampling requires far less effort. However, such

convenience comes with potential problems, which we will describe. Convenience

samples are nonprobability samples. Therefore, it is not possible to specify

the probability of any population element’s being selected for the sample.

Indeed, it is not possible to specify the population from which the sample was

drawn.

Example: In shopping malls or airports, individuals are selected as they pass a

certain location and interviewed concerning issues, candidates, or other

matters.

Quota Sampling

In many large-scale applications of sampling

procedures, it is not always possible or desirable to list all members of the

population and randomly select elements from that list. The reasons for using

any alternative procedures include cost, timeliness, and convenience. One

alternative procedure is quota sampling.

This technique is often used by market researchers and those taking political polls.

Usually, when this technique is used, the population of interest is large and

there are no ready-made lists of names available from which to sample randomly.

The Gallup Poll is one of the best-known and best-conducted polls to use quota

sampling. This poll frequently reports on major public issues and on

presidential elections. The results of the poll are syndicated for a fee that

supports it. In this quota sampling procedure, localities are selected and

interviewers are assigned a starting point, a specified direction, and a goal

of trying to meet quotas for subsets (ethnic origins, political affiliations,

and so on) selected from the population. Although some notable exceptions have

occurred, predictions of national elections over the past few years have been

relatively accurate—certainly, much more so than guesswork.

With the quota sampling procedure, we first decide which subgroups of the population

interest us. This, in turn, is dictated by the nature of the problem being

investigated (the question being asked). For issues of national interest (such

as abortion, drug use, or political preference), frequently used subsets are

age, race, sex, socioeconomic level, and religion. The intent is to select a

sample whose frequency distribution of characteristics reflects that of the

population of interest. Obviously, it is necessary to know the percentage of

individuals making up each subset of the population if we are to match these

percentages in the sample. For example, if you were interested in ethnic groups

such as Italians, Germans, Russians, and so on, and knew their population

percentages, you would select your sample so as to obtain these percentages.

Within each subset, participants are not chosen randomly. This is simply because there

are usually no ready-made lists from which the researcher can select randomly.

Often individuals are selected in the sample on the basis of availability. For

this reason, quota sampling is less expensive. It would not be so if lists of

the population of interest had to be constructed. However, if exhaustive

ready-made lists were conveniently available for the population of interest,

then choosing participants randomly would be possible and preferable. In the

absence of such lists, it is much more convenient to select quotas by knocking

on doors, telephoning numbers, or sending mailings until the sample percentages

for subsets match those of the population. Obviously, even though the quotas

may be achieved and the sample may match the population percentages in terms of

subsets, the sample may still not represent (reflect) the population to which

we wish to generalize.
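The quota-filling procedure described above can be sketched as follows. This is a simplified illustration, not any polling organization's actual method: the age bands, the quotas, the stream of available people, and the function name are all hypothetical.

```python
def fill_quotas(arrivals, subgroup_of, quotas):
    """Accept people in the order they become available until every quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        group = subgroup_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break   # all quotas met, so stop interviewing
    return sample, remaining

# Hypothetical stream of available people as (name, age band) pairs.
bands = ["18-34", "35-54", "55+"]
arrivals = [(f"person_{i}", bands[i % 3]) for i in range(30)]

# Quotas chosen to match assumed population percentages (30%, 40%, 30%).
quotas = {"18-34": 3, "35-54": 4, "55+": 3}

sample, remaining = fill_quotas(arrivals, lambda p: p[1], quotas)
print(len(sample), remaining)
```

No random selection occurs anywhere in this code. Whoever happens to be available first fills each quota, which is why matching the subgroup percentages does not by itself make the sample representative.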

Often interviewers, for sampling purposes, concentrate on areas where large numbers

of people are likely to be. This could bias the findings. As we noted earlier,

samples taken in airports may overrepresent high-income groups, whereas those

at bus or rail depots may overrepresent low-income groups. Samples at either

place may underrepresent those who seldom travel. Also, people who are home

during the day, and are therefore available for house-to-house interviews or

telephone calls, may be quite different in important ways from those who are not

home. In this respect, quota sampling and convenience sampling are similar. In

spite of these difficulties, the quota system is widely used and will

unquestionably continue to be so for economic and logistic reasons.

Table No. 1

Sampling Technique | Advantages | Limitations
Simple Random Sampling | Representative of the population. | May be difficult to obtain the list. May be more expensive.
Stratified Random Sampling | Representative of the population. | May be difficult to obtain the list. May be more expensive.
Convenience Sampling | Simple. Easy. Convenient. No complete member list needed. | May not be representative of population.
Quota Sampling | Simple. Easy. Convenient. No complete member list needed. | May not be representative of population.

Sampling Error

Error can occur during the sampling process.

Sampling error can include both systematic sampling error and random sampling

error. Systematic sampling error is the fault of the investigation, but random

sampling error is not. When errors are systematic, they bias the sample in one

direction. Under these circumstances, the sample does not truly represent the

population of interest. Systematic error occurs when the sample is not drawn

properly, as in the 1936 presidential election poll conducted by Literary Digest magazine. It can also

occur if names are dropped from the sample list because some individuals were

difficult to locate or uncooperative. Individuals dropped from the sample could

be different from those retained. Those remaining could quite possibly produce

a biased sample. Political polls often have special problems that make

prediction difficult. Random sampling error, as contrasted to systematic

sampling error, is often referred to as chance error. Purely by chance, samples

drawn from the same population will rarely provide identical estimates of the

population parameter of interest. These estimates will vary from sample to

sample.
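Chance error can be seen in a small simulation. The sketch below builds a hypothetical population of 10,000 scores, draws five simple random samples of 50 from it, and prints how far each sample mean falls from the population mean. The population values, sample size, and seed are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 10,000 scores generated once (mean near 100, SD near 15).
population = [rng.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Five simple random samples of 50, all drawn from the same population.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

for i, m in enumerate(sample_means, 1):
    print(f"sample {i}: mean = {m:.2f}, error = {m - population_mean:+.2f}")
```

Each sample is drawn properly, yet the estimates differ from one another and from the population value. These differences reflect chance alone, not a flaw in the procedure.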

Conclusion

When we conduct research, we are generally interested in drawing some conclusion

about a population of individuals that have some common characteristic.

However, populations are typically too large to allow observations on all

individuals, and we resort to selecting a sample. In order to make inferences

about the population, the sample must be representative. Thus, the manner in

which the sample is drawn is critical. Probability sampling uses random

sampling, in which each element in the population (or, with stratified random

sampling, each element within a subgroup of the population) has an equal chance of being

selected for the sample. This technique is considered to be the best means of

obtaining a representative sample. When probability sampling is not possible,

nonprobability sampling must be used. Convenience sampling involves using

participants who are readily available (such as introductory psychology students).

It is the easiest technique but the poorest from a methodological standpoint.

Quota sampling is essentially convenience sampling in which there is an effort

to better represent the population by sampling a certain percentage of

participants from subgroups that correspond to the prevalence of those

subgroups in the population.

By their very nature, samples do not perfectly

match the population from which they are drawn. There is always some degree of

sampling error, and the degree of error is inversely related to the size of the

sample. Larger samples are more likely to accurately represent characteristics

of the population, and smaller samples are less likely to accurately represent

characteristics of the population. Therefore, researchers strive for samples

that are large enough to reduce sampling error to an acceptable level. Even

when samples are large enough, it is important to evaluate the specific method

by which the sample was drawn. We are increasingly exposed to information

obtained from self-selected samples that represent only a very narrow subgroup

of individuals. Much of such information is meaningless because the subgroup is

difficult to identify.
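The inverse relation between sample size and random sampling error noted above is commonly expressed through the standard error of the mean. The formula below is a standard textbook result, not something stated in this article. It assumes simple random sampling from a population much larger than the sample. The symbols are introduced here for illustration: σ is the population standard deviation, s is the sample standard deviation, and n is the sample size.

```latex
\[
\mathrm{SE}_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}
\]
% Illustration with an assumed sample standard deviation s = 15:
%   n = 25   gives  15 / 5  = 3.0
%   n = 100  gives  15 / 10 = 1.5
%   n = 400  gives  15 / 20 = 0.75
```

Quadrupling the sample size halves the standard error. This applies to random sampling error only. A larger sample does nothing to remove systematic error, such as that produced by a self-selected sample.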