Business Research Methods
Sample Designs and Sampling Procedures Prepared By Nusrat Jahan, Faculty, IUB
Sampling Terminology • Sample – a subset of a larger population • Population or universe – any complete group that shares some set of characteristics (e.g., people, sales territories, stores) • Population element (sampling unit) – an individual member of the population • Sampling frame – the listing of the population from which a sample is chosen • Census – an investigation of all the individual elements that make up a population • Survey – a polling or investigation of the sample
Sampling Terminology • Parameter – a characteristic of the population that is the variable of interest (e.g., a population mean or proportion) • Statistic – the information obtained from the sample about the parameter • The critical assumption is that the sample chosen is representative of the population. • The goal of sampling is to make inferences about the population parameter from knowledge of the relevant statistic, that is, to draw general conclusions about the entire body of units.
Why Sample? • It works! Properly selected samples yield accurate and reliable results. – If elements are similar smaller sample is needed
• May even be more accurate than census – Bureau of Census uses samples to check accuracy of the U. S. Census
• It saves resources
STAGES IN SELECTION OF A SAMPLE
2. Sampling Frame • The sampling frame is the listing of the population from which a sample is chosen. • It is also known as the working population. • Examples: a student mailing list, a phone book, a list of customers, or a list of all the employees working in the Agrabad branch of a bank. • Problems with lists: omissions, ineligible entries, and duplications.
• Sampling frame error occurs when the population is not accurately represented in the sampling frame.
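The three list problems named above can be checked mechanically when a reference list of eligible units is available. The sketch below is only an illustration: the function name `audit_frame`, the unit numbers, and the assumption that the full eligible population is known are all hypothetical (in practice the true population is often not listed anywhere).

```python
from collections import Counter

def audit_frame(frame, eligible_population):
    """Compare a sampling frame (list of unit IDs) with the eligible population."""
    counts = Counter(frame)
    eligible = set(eligible_population)
    return {
        # eligible units that are missing from the frame
        "omissions": sorted(eligible - set(counts)),
        # listed units that do not belong to the population
        "ineligible": sorted(set(counts) - eligible),
        # units listed more than once
        "duplications": sorted(u for u, c in counts.items() if c > 1),
    }

population = range(1, 18)  # hypothetical units 1..17
frame = [1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 99]
report = audit_frame(frame, population)
print(report)
```

Any non-empty entry in the report signals a potential sampling frame error.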
[Figure: a list of units numbered 1 to 17, shown first as a census (all units), then as the sampling frame, and finally as the sample drawn from that frame.]
Sampling Procedure: Selecting a Sample Design
1. Non-probability Sampling • The probability of selecting any particular member is unknown; for some population units the probability of selection is zero or unknown. • It is technically inappropriate to apply statistical techniques to project beyond the sample, so survey results cannot be projected to the population. • Sampling error cannot be computed.
Types of non-probability sampling: • a) Convenience sampling • b) Judgment sampling • c) Quota sampling • d) Snowball sampling
a) Convenience Sampling
• Also called haphazard or accidental sampling • The sampling procedure of obtaining the people or units that are most conveniently available • E.g., people in my class, mall intercepts, friends, relatives
a) Convenience Sampling
Advantages: • Very low cost and extensively used • No need for a list of the population
Disadvantages: • Bias associated with estimates cannot be measured or controlled • Projecting data beyond the sample is inappropriate
b) Judgment Sampling • Also called purposive sampling • Judgment sampling involves choosing the objects or sample units that the researcher believes will give accurate results. – E.g., in a study of toy stores, three stores are selected purposively because they are expected to yield accurate results.
b) Judgment Sampling
Advantages: • Moderate cost and average use by researchers • Useful for certain types of forecasting • Sample guaranteed to meet a specific objective
Disadvantages: • Bias due to the researcher's beliefs may make the sample unrepresentative • Projecting data beyond the sample is inappropriate
c) Quota Sampling • Quota samples are based on selecting objects until you have a certain number (the quota) of each type. • Still widely used (especially for telephone surveys with high nonresponse levels) • The population is divided into cells or groups on the basis of relevant control characteristics, e.g., a survey population divided into male and female. • A quota of sample units is established for each cell or group, e.g., from a survey population comprising 100 men and 100 women, 50 men and 50 women should be chosen. • A convenience sample is drawn for each cell until the quota is met. • It should not be confused with stratified sampling, which selects units within each group by a probability method.
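The quota procedure can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `quota_sample`, the stream of `arrivals`, and the alternating male/female pattern are all hypothetical, and "convenience" is modelled simply as taking respondents in the order they become available.

```python
def quota_sample(arrivals, quotas, cell_of):
    """Fill each cell's quota from respondents in the order they become available."""
    remaining = dict(quotas)
    chosen = []
    for person in arrivals:
        cell = cell_of(person)
        if remaining.get(cell, 0) > 0:   # cell still needs respondents
            chosen.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):  # every quota has been met
            break
    return chosen

# Hypothetical population of 100 men and 100 women, met in alternating order.
arrivals = [(i, "M" if i % 2 else "F") for i in range(1, 201)]
sample = quota_sample(arrivals, {"M": 50, "F": 50}, cell_of=lambda p: p[1])
print(len(sample))
```

Because selection within each cell depends on who happens to be available, no selection probabilities can be attached to the result.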
c) Quota Sampling
Advantages: • Moderate cost and extensively used by researchers • Introduces some stratification of the population • Requires no list of the population
Disadvantages: • Introduces bias in the researcher's classification of subjects • Non-probability selection within classes or groups means that error from the population cannot be estimated • Projecting data beyond the sample is inappropriate
d) Snowball Sampling • Initial respondents are selected by probability methods if possible. • Additional respondents are selected on the basis of referrals from the initial respondents, e.g., friends of friends. • It is used to sample rare populations.
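The referral process can be sketched as a walk outward from the initial respondents. The names, the `referrals` mapping, and the function `snowball_sample` below are hypothetical, and following referrals in first-come order is only one possible way to model the process.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals outward from the initial respondents."""
    sample, seen = [], set()
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        if person in seen:          # already interviewed, skip repeat referrals
            continue
        seen.add(person)
        sample.append(person)
        queue.extend(referrals.get(person, []))  # ask this respondent for referrals
    return sample

# Hypothetical referral network: who names whom.
referrals = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Dee"],
    "Cy": ["Ana", "Eli"],
    "Dee": [],
    "Eli": ["Fay"],
}
sample = snowball_sample(referrals, seeds=["Ana"], target_size=5)
print(sample)  # ['Ana', 'Ben', 'Cy', 'Dee', 'Eli']
```

Everyone after the seed enters the sample only through someone already in it, which is why the sample units are not independent.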
d) Snowball Sampling
Advantages: • Low cost and used in special situations • Useful in locating members of rare populations
Disadvantages: • High bias because sample units are not independent • Projecting data beyond the sample is inappropriate
2. Probability Sampling
Types of probability sampling: • a) Simple random sample • b) Systematic sample • c) Stratified sample • d) Cluster sample • e) Multistage area sample
a) Simple Random Sampling • A sampling procedure that ensures that each element in the population will have an equal chance of being included in the sample
For example, a school administrator of 4 schools wishes to find out students’ opinions about food served in the school cafeterias. He has a complete list of all students in the schools and decides to randomly select 150 students from the list. In this example, each student throughout the 4 schools has an equal probability of selection to be given the survey; therefore, it is a simple random sample. As the name implies, selecting a simple random sample is, well… simple!
a) Simple Random Sampling
Here are the steps to select the sample:
Step 1: Assign each member of your population a numerical label (e.g., 1, 2, 3).
Step 2: Use statistical software or a random digit table to select numerical labels at random. Use this site: http://stattrek.com/Tables/Random.aspx
There are many other ways to obtain a simple random sample. One traditional way is the lottery method. Each population member is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then a blindfolded researcher selects the numbers. Population members having the selected numbers are included in the sample.
a) Simple Random Sampling
Example: A small catering business serves 9 reception / community centers. The owner wants to interview a sample of 4 clients in detail to find ways to improve services to his/her clients. To avoid bias, the owner chooses a simple random sample of size 4.
Step 1: Each reception center is assigned a numerical label from 1 to 9.
1 - Darlene’s Wedding Center
2 - Magic Moments Reception Hall
3 - Rustic Realm Weddings
4 - Romance Gardens
5 - Classic Weddings
6 - Old Time Chapel
7 - Lovers Lane Weddings
8 - Accents-Modern Weddings
9 - Century Falls Reception Center
Step 2: The owner decides to use a statistical software program to generate 4 numerical labels between 1 and 9 at random. The software returns the following numbers: 5, 8, 6, 4.
Therefore, the simple random sample to be interviewed in detail will be:
Classic Weddings (5)
Accents-Modern Weddings (8)
Old Time Chapel (6)
Romance Gardens (4)
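The two steps of the catering example can be sketched with Python's standard `random` module standing in for the "statistical software". The function name `simple_random_sample` and the `seed` argument are assumptions made for illustration; the numbers drawn will generally differ from the 5, 8, 6, 4 shown in the example.

```python
import random

# Step 1: the position in the list plays the role of the numerical label 1-9.
centers = [
    "Darlene's Wedding Center",
    "Magic Moments Reception Hall",
    "Rustic Realm Weddings",
    "Romance Gardens",
    "Classic Weddings",
    "Old Time Chapel",
    "Lovers Lane Weddings",
    "Accents-Modern Weddings",
    "Century Falls Reception Center",
]

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every unit has the same chance of selection."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(units, n)    # sampling without replacement

# Step 2: draw 4 of the 9 centers at random.
chosen = simple_random_sample(centers, 4, seed=42)
for name in chosen:
    print(centers.index(name) + 1, "-", name)
```

Sampling without replacement is used here so that no client is selected twice, which matches the lottery method of drawing numbers from a bowl.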
a) Simple Random Sampling
Advantages: • Only minimum advance knowledge of the population is needed • Easy to analyze data and compute error
Disadvantages: • Requires a sampling frame • High cost • Not frequently used in practice • Respondents may be widely dispersed, hence higher cost
b) Systematic Sampling
This is random sampling with a system!
• From the sampling frame, select a starting point at random. • Thereafter, select members at regular intervals, e.g., every 3rd member of the population list.
• E.g. A sample of 1000 firms from a list of 200,000 firms can be taken by drawing every 200th name from the list.
Systematic Sampling
For example, suppose you want to sample 8 houses from a street of 120 houses. 120/8 = 15, so every 15th house is chosen after a random starting point between 1 and 15. If the random starting point is 11, then the houses selected are 11, 26, 41, 56, 71, 86, 101, and 116.
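The house example can be reproduced with a few lines of code. This is a minimal sketch: the function name `systematic_sample` and its parameters are hypothetical, and it assumes the population size is an exact multiple of the sample size, as in the 120-house case.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit after a random start, where k = N // n."""
    interval = population_size // sample_size
    if start is None:
        # random starting point between 1 and the interval, inclusive
        start = random.Random(seed).randint(1, interval)
    return [start + i * interval for i in range(sample_size)]

houses = systematic_sample(120, 8, start=11)
print(houses)  # [11, 26, 41, 56, 71, 86, 101, 116]
```

Passing `start=11` fixes the starting point so the output matches the example; leaving it out draws the start at random.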
Systematic Sampling
If there were 125 houses, 125/8 = 15.625, so should you take every 15th house or every 16th house? If you take every 16th house, the selections span 7 × 16 = 112 houses beyond the starting point, so a late start risks a last house that does not exist. To avoid this, the random starting point should be between 1 and 13 (13 + 112 = 125). If instead you take every 15th house, the selections span 7 × 15 = 105 houses beyond the starting point; with a start between 1 and 15, the last five houses (121 to 125) would never be selected, so the random starting point should be between 1 and 20 (20 + 105 = 125) to ensure that every house has some chance of being selected. In a simple random sample every member of the population has an equal chance of being chosen, which is clearly not the case here, but in practice a systematic sample is almost always accepted as being random.
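The starting-point question for 125 houses comes down to one inequality: a start s is usable when s + (n − 1) × k does not exceed N. The sketch below, with hypothetical helper names `max_start` and `unreachable_units`, computes the largest usable start for a given interval and lists the houses that can never be selected for a given range of starts. Any start up to the computed bound keeps the last selection on the street.

```python
def max_start(population_size, sample_size, interval):
    """Largest starting unit for which the last selected unit still exists."""
    return population_size - (sample_size - 1) * interval

def unreachable_units(population_size, sample_size, interval, last_start):
    """Units that no start in 1..last_start can ever select."""
    reachable = {
        start + i * interval
        for start in range(1, last_start + 1)
        for i in range(sample_size)
    }
    return sorted(set(range(1, population_size + 1)) - reachable)

print(max_start(125, 8, 16))  # 13
print(max_start(125, 8, 15))  # 20
print(unreachable_units(125, 8, 15, 20))        # []
print(unreachable_units(125, 8, 16, 13)[:3])    # [14, 15, 16]
```

Under these assumptions, an interval of 15 with a start between 1 and 20 reaches every house, whereas an interval of 16 leaves some houses with no chance of selection.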
b) Systematic Sampling
Advantages: • Moderate cost and moderately used by researchers • Simple to draw the sample and easy to check
Disadvantages: • Periodic ordering of the population may introduce variability or bias
c) Stratified Sampling
In a stratified sample the sampling frame is divided into non-overlapping groups, or strata, for example on the basis of geographical area, age group, or gender. A sample is taken from each stratum. When this sample is drawn by simple random sampling, the procedure is referred to as stratified random sampling.
c) Stratified Sampling
Choice of Sample Size for each Stratum
In general, the size of the sample in each stratum is taken in proportion to the size of the stratum. This is called proportional allocation. Suppose that a company has the following staff:

              Male   Female
Full time       90        9
Part time       18       63

and we are asked to take a sample of 40 staff, stratified according to the above categories.
The first step is to find the total number of staff (180) and calculate the percentage in each group.
% male, full time = (90 / 180) x 100 = 0.5 x 100 = 50%
% male, part time = (18 / 180) x 100 = 0.1 x 100 = 10%
% female, full time = (9 / 180) x 100 = 0.05 x 100 = 5%
% female, part time = (63 / 180) x 100 = 0.35 x 100 = 35%
This tells us that, of our sample of 40, 50% should be male, full time; 10% should be male, part time; 5% should be female, full time; and 35% should be female, part time. The following sample would be selected by applying the simple random sampling technique within each stratum:
50% of 40 is 20 (from the male, full time stratum)
10% of 40 is 4 (from the male, part time stratum)
5% of 40 is 2 (from the female, full time stratum)
35% of 40 is 14 (from the female, part time stratum)
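The proportional allocation above can be checked with a short calculation. The function name `proportional_allocation` is hypothetical, and rounding each share to the nearest whole person is an assumption: with other figures the rounded shares may not add up exactly to the requested sample size and would need adjusting by hand.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to each stratum's size."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)  # share of the sample for this stratum
        for name, size in strata_sizes.items()
    }

staff = {
    "male, full time": 90,
    "male, part time": 18,
    "female, full time": 9,
    "female, part time": 63,
}
allocation = proportional_allocation(staff, 40)
print(allocation)
# {'male, full time': 20, 'male, part time': 4, 'female, full time': 2, 'female, part time': 14}
```

A simple random sample of the allocated size would then be drawn separately inside each stratum.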
c) Stratified Sampling
Advantages: • Moderately used by researchers • Assures representation of all groups in the sample • Characteristics of each stratum can be estimated and comparisons can be made
Disadvantages: • Requires accurate information on the proportion of each stratum in the total population • If stratified lists are not already available, they can be costly to prepare
d) Cluster Sampling
Cluster sampling is a sampling technique in which the entire population is divided into groups, or clusters, and a random sample of these clusters is selected.
All observations in the selected clusters are included in the sample.
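A minimal sketch of one-stage cluster sampling follows. The branch names, employee IDs, and the function `cluster_sample` are hypothetical; the point is only that randomness enters at the cluster level, after which every member of a chosen cluster is taken.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then include every unit in the picked clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random sample of cluster names
    members = [unit for name in chosen for unit in clusters[name]]
    return chosen, members

# Hypothetical population: employees grouped by branch.
clusters = {
    "Branch A": ["A1", "A2", "A3"],
    "Branch B": ["B1", "B2"],
    "Branch C": ["C1", "C2", "C3", "C4"],
    "Branch D": ["D1"],
}
chosen, members = cluster_sample(clusters, 2, seed=7)
print(chosen, members)
```

Note that the final sample size is not fixed in advance, since it depends on how large the selected clusters happen to be.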
d) Cluster Sampling
Advantages: • Low cost and frequently used by researchers • If clusters are geographically defined, it yields the lowest field cost • Characteristics of clusters as well as of the population can be estimated
Disadvantages: • The researcher must be able to assign each population member to a unique cluster; otherwise duplication or omission of individuals will result
e) Multistage Sampling
In this sampling method, the researcher uses some combination of the first four probability sampling techniques.
Advantages: • Frequently used, especially in nationwide surveys
Disadvantages: • Depends on the other sampling techniques
Determining Sample Size
The sample size table was developed by Taro Yamane in 1967 and can be used to select a sample size. The level of precision, or sampling error, is often expressed in percentage points (e.g., ±5%). Thus, if a researcher finds that 60% of respondents in the sample have adopted a recommended practice with a precision of ±5%, then it can be concluded that between 55% and 65% of subjects in the population have adopted the practice. (For more information visit http://edis.ifas.ufl.edu/PD006#TABLE_1)
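The table itself is not reproduced here. As a rough companion, the sketch below uses Yamane's simplified formula, n = N / (1 + N e²), where N is the population size and e is the level of precision; it is usually stated for a 95% confidence level and maximum variability (p = 0.5). The function name `yamane_sample_size` and the choice to round up are assumptions, so results may differ by one from published tables that round to the nearest whole number.

```python
import math

def yamane_sample_size(population_size, precision):
    """Yamane's simplified formula: n = N / (1 + N * e**2), rounded up."""
    n = population_size / (1 + population_size * precision ** 2)
    return math.ceil(n)  # rounding up is a conservative choice made for this sketch

# Population of 1,000 with a precision of +/-5 percentage points.
n = yamane_sample_size(1000, 0.05)
print(n)  # 286
```

Tightening the precision (a smaller e) increases the required sample size sharply, because e enters the formula squared.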
Factors to be Considered in Sample Design
• Representativeness of the sample is always important.
• Degree of accuracy: the required accuracy depends on the researcher (other considerations, such as cost, may be traded off against a reduction in accuracy).
• Resources: financial or human resource constraints may eliminate certain methods.
• Time: if a deadline has to be met, the researcher may select a less time-consuming sample design.
• Advance knowledge of the population: for example, if no list of population members is available, some sampling methods are ruled out.
• National versus local project: the geographic proximity of population members will influence the sample design.
• Need for statistical analysis: non-probability methods do not allow the researcher to use statistical analysis to project beyond the sample.