SUMMER 2004 Dr. Susan Carol Losh GUIDE 9: TYPES OF ERROR AND BASIC SAMPLING DESIGNS FEEDBACK ASSIGNMENT 4
|
| KEY TO:
Huff, Chapters 9 and 10, pp. 100-142; Agresti & Finlay, Chapter 2, pp. 18-29 |
The goal of much research is to predict the true POPULATION VALUE. Whenever possible, we want to minimize ANY deviation from the true population value when we take a sample.
Further, when we test null hypotheses with our sample data, it is nearly always with reference to what's going on in the POPULATION. For example,
Our H0: χ² = 0
speaks to the value of Chi-square in the POPULATION. We then ask about the likelihood of observing our sample results given the null hypothesis about the population Chi-square (not the sample χ²).
Your tests of statistical significance, such as those in the Berkeley SDA program, assume that sampling error is RANDOM, and not systematic. We expect random variations from case to case and from sample to sample.
Positive fluctuations cancel out negative ones IN THE LONG RUN, although not necessarily in any ONE particular sample.
When we observe sample univariate results, such as a mean or a percentage, we often put error limits, called "confidence intervals" around that result in an attempt to estimate what is happening in the population.
|
|
Populations have PARAMETERS, samples provide ESTIMATES.
A POPULATION is the entire collection of elements that you wish to study, for example, ALL registered students at FSU, Spring 2004; ALL residential telephone numbers in Leon County, Florida.
A SAMPLE is some specified subpart or subset of your population.
In order to take a good sample, you must carefully define your population.
We use samples to generalize to populations and it is usually the well-defined populations we are interested in.
Somewhere along the line, you will need a good FRAME or list of all the elements in the population (even if it's very "far down the line" for a multi-stage sample).
Research practitioners usually distinguish between two main sources of error in measuring phenomena:
(1) SYSTEMATIC ERROR or BIAS--often tricky to discover and
(2) RANDOM ERROR which is often sampling error.
BIAS is often hidden. The more sources of input we get before starting a study, the more likely we are to discover bias. In the meantime, to minimize bias we can:
|
|
Random error, on the other hand, is exactly that. It is unpredictable (when tossing a penny, for example, in the short run, you could get six heads in a row) and in the long run has no discernible pattern. We assume that the prediction errors in techniques such as regression or analysis of variance are essentially random. In the long run, for example, positive deviations from group means should be cancelled out by negative deviations.
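The penny example can be tried out directly. The sketch below is only an illustration: the seed, the number of tosses, and the helper name longest_run_of_heads are arbitrary choices, not part of the course materials.

```python
import random

rng = random.Random(2004)  # fixed seed so the sketch is repeatable

def longest_run_of_heads(tosses):
    """Length of the longest streak of consecutive heads (True values)."""
    longest = current = 0
    for is_head in tosses:
        current = current + 1 if is_head else 0
        longest = max(longest, current)
    return longest

# A short run can look lopsided; a long run settles near 50 percent heads.
short_run = [rng.random() < 0.5 for _ in range(20)]
long_run = [rng.random() < 0.5 for _ in range(100_000)]

print("20 tosses: proportion heads =", sum(short_run) / len(short_run))
print("20 tosses: longest run of heads =", longest_run_of_heads(short_run))
print("100,000 tosses: proportion heads =", round(sum(long_run) / len(long_run), 4))
```

Change the seed and the short run will change noticeably from one try to the next, while the long-run proportion stays close to one half.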
We can control sampling error by the TYPES of samples we take and HOW LARGE a sample we take.
Larger and/or more representative samples have:
|
|
We make generalizations from SAMPLING DISTRIBUTIONS.
SAMPLING DISTRIBUTIONS are hypothetical distributions of a sample statistic (such as a mean or a Pearson's r correlation coefficient) taken from an infinite number of samples of the same size and the same type taken around the same time period (say, n = 900 for each sample and each sample is a Random Digit Dial telephone survey).
Remember that each element in a sampling distribution is a separate sample.
In the long run, we hope that the center of the sampling distribution, such as the "mean of the means" (the grand mean), will be the same as the true population value (such as the true population mean). This is often called the expected value.
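A sampling distribution is hypothetical, but a small simulation can approximate one. In this sketch the population (50,000 invented "ages"), the 2,000 repetitions, and the function name sample_mean are all assumptions made for illustration; a real sampling distribution involves infinitely many samples.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 50,000 invented "ages", for illustration only.
population = [rng.gauss(40, 12) for _ in range(50_000)]
true_mean = statistics.fmean(population)

def sample_mean(n):
    """Mean of one simple random sample of size n drawn without replacement."""
    return statistics.fmean(rng.sample(population, n))

# Each element of the sampling distribution is the mean of a separate sample.
sample_means = [sample_mean(900) for _ in range(2_000)]

grand_mean = statistics.fmean(sample_means)
print("true population mean:", round(true_mean, 2))
print("mean of the sample means:", round(grand_mean, 2))
print("spread of the sample means:", round(statistics.stdev(sample_means), 3))
```

The grand mean should land very close to the true population mean, even though any single sample mean can miss it.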
If we do a good job on sampling, we can estimate the population mean or percentage from just one sample and put approximate limits (called "confidence intervals") around our estimate.
Confidence intervals tell us how much on the average we can expect our results to vary from one sample to another (say, "plus or minus 3 percent").
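Here is a rough sketch of where a figure like "plus or minus 3 percent" can come from. It assumes a simple random sample, the usual large-sample normal approximation, and a 95 percent confidence level (z = 1.96); the function name and the example values are illustrative only.

```python
import math

def proportion_confidence_interval(p_hat, n, z=1.96):
    """Approximate interval for a sample proportion from a simple random sample."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin, margin

# Example: 50 percent of n = 900 respondents give some answer.
low, high, margin = proportion_confidence_interval(p_hat=0.50, n=900)
print(f"estimate 50%, margin of error about {margin * 100:.1f} points")
print(f"interval: {low * 100:.1f}% to {high * 100:.1f}%")
```

With these inputs the margin works out to roughly 3.3 percentage points.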
The size of the confidence interval typically depends on two entities:
All else equal, LARGER SAMPLE SIZES produce SMALLER confidence intervals thus more precise estimates.
Recall that the square root of the case base is in the DENOMINATOR for the standard error of the mean. Larger denominators mean that the end result is smaller.
In fact, if you QUADRUPLE THE SAMPLE SIZE, you will CUT THE STANDARD ERROR IN HALF. (Remember the square roots that are often taken in these calculations; that's why you have to quadruple the sample, not just double it.)
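The quadrupling rule is easy to check with the standard error of the mean under simple random sampling, the standard deviation divided by the square root of n. The standard deviation of 15 below is a made-up value used only to show the pattern.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of the mean under simple random sampling: sd / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 15.0  # hypothetical standard deviation, chosen only for illustration
for n in (100, 200, 400, 1500, 6000):
    print(n, round(standard_error_of_mean(sd, n), 3))
```

Going from 100 to 400 cases, or from 1500 to 6000, halves the standard error; merely doubling from 100 to 200 does not.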
Put into common terms, as we have referenced all semester, the results from large samples are more stable than the results from small samples. They vary less from one sample to the next. In a small sample, changes in one or two people can make a big difference in the results. In a large sample the same changes are virtually unnoticed.
Obviously, at some point, it is not cost-effective to keep increasing the sample size. It is one thing to increase your sample size from 100 to 400 people. It is quite another to quadruple your national sample from 1500 to 6000 people.
|
|
ALL THIS ASSUMES THAT WE TAKE PROBABILITY SAMPLES THROUGHOUT THE ENTIRE SAMPLING PROCESS
|
|
In probability samples, each element, person, or case has a KNOWN, NON-ZERO chance of selection.
That's it! That's all. That's the definition of a probability sample.
And the two entities, KNOWN and NON-ZERO, must be present in any definition that you give of a probability sample.
Notice that NOWHERE in this definition does it say "equal" probabilities.
VERY IMPORTANT NOTE: Probability samples can have elements selected with unequal probabilities. We call EQUAL probability samples EPSEM samples: Equal Probability of SElection Methods.
You could have a sample with unequal but known probabilities of selection--this would STILL be a probability sample. It would not be an EPSEM sample.
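One reason the word KNOWN matters: when selection probabilities are known, each case can be weighted by the inverse of its probability. The sketch below illustrates that idea under invented assumptions (the two groups, their sizes, and the "hours studied" values are all hypothetical); weighting is offered here as one common approach, not as a procedure taken from this guide.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: 9,000 undergraduates and 1,000 graduate students,
# with invented "hours studied per week" values.
undergrads = [rng.gauss(10, 3) for _ in range(9_000)]
grads = [rng.gauss(20, 3) for _ in range(1_000)]
true_mean = statistics.fmean(undergrads + grads)

# Unequal but KNOWN, NON-ZERO selection probabilities (so not EPSEM):
# 300 of 9,000 undergraduates (p = 1/30) and 300 of 1,000 graduates (p = 3/10).
sample = [(y, 300 / 9_000) for y in rng.sample(undergrads, 300)]
sample += [(y, 300 / 1_000) for y in rng.sample(grads, 300)]

unweighted = statistics.fmean(y for y, _ in sample)

# Weight each case by the inverse of its selection probability.
weights = [1 / p for _, p in sample]
weighted = sum(w * y for w, (y, _) in zip(weights, sample)) / sum(weights)

print("true mean:", round(true_mean, 2))
print("unweighted sample mean:", round(unweighted, 2))
print("weighted sample mean:", round(weighted, 2))
```

The unweighted mean is pulled toward the oversampled graduate students, while the weighted mean should sit close to the true population mean.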
|
|
To do probability samples, you need a complete FRAME or list at some point, which enumerates all the elements in your population. Or you need a sampling procedure that can APPROXIMATE such a list, if one is not available. Random Digit Dialing--RDD--approximates a complete list of telephone numbers. We would not be able to list all the numbers. Programs such as Genesys create lists of random digit telephone numbers.
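The basic idea behind Random Digit Dialing can be sketched in a few lines: take telephone exchanges assumed to be in service and attach random final digits. This is a toy illustration only; the exchange list and function name are hypothetical, and it does not describe how Genesys or any other commercial program actually works.

```python
import random

rng = random.Random(850)

# Hypothetical area code + exchange combinations assumed to be in service.
working_exchanges = ["850-555", "850-556", "850-557"]

def random_digit_number():
    """One RDD number: a known exchange plus four random final digits."""
    exchange = rng.choice(working_exchanges)
    suffix = rng.randrange(10_000)  # 0000 through 9999, each equally likely
    return f"{exchange}-{suffix:04d}"

numbers = [random_digit_number() for _ in range(5)]
for number in numbers:
    print(number)
```

Because every possible final four digits can come up, unlisted numbers have the same chance of selection as listed ones.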
If your population is very large, such a complete list will be impossible, unwieldy, or VERY expensive. If you break your sample into STAGES, all you need is a complete frame at each stage. Here is an example for a Random Digit Dial telephone sample survey for the entire United States:
|
|
|
|
| PROBABILITY SAMPLES | NON-PROBABILITY SAMPLES |
| Simple random samples (srs) | Self-selected samples (e.g., call-in/mail-in "polls") |
| Systematic samples with a random start | Available respondents (e.g., "grab"/haphazard/"convenience" samples) |
| Stratified samples (a) proportionate to size; (b) disproportionate to size | Purposive/judgement samples (including "snowball samples") |
| Cluster samples (usually based on geographical proximity) | Quota samples |
|
A SAMPLE MUST BE A PROBABILITY SAMPLE AT ALL STAGES IN ORDER TO STRICTLY QUALIFY AS A PROBABILITY SAMPLE. For example, if you do an RDD telephone survey, you also must use some type of probability method to select the respondent within households.
THERE REALLY IS NO SUCH THING AS A "QUASI-PROBABILITY" SAMPLE ALTHOUGH SOME COMMERCIAL AGENCIES WOULD LOVE YOU TO BELIEVE THIS.
|
|
SIMPLE RANDOM SAMPLING (srs).
SYSTEMATIC SAMPLING.
It is fairly common to see stratified samples (see below) with systematic sampling within strata.
STRATIFIED SAMPLES.
CLUSTER SAMPLES.
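The four probability designs named above can be sketched against a toy frame. Everything here is an assumption made for illustration: the frame of 1,000 students, the "year" strata, the "dorm" clusters, and the function names. The stratified version shown is the proportionate kind only.

```python
import random

rng = random.Random(9)

# Hypothetical frame: 1,000 students, each tagged with a class year (stratum)
# and a dorm (cluster). All names and sizes are invented for illustration.
frame = [
    {"id": i, "year": ("FR", "SO", "JR", "SR")[i % 4], "dorm": i // 50}
    for i in range(1_000)
]

def simple_random_sample(frame, n):
    """Every possible set of n elements is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n):
    """Random start, then every k-th element of the list."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def proportionate_stratified_sample(frame, n, key):
    """Simple random sample within each stratum, same fraction in each."""
    strata = {}
    for element in frame:
        strata.setdefault(element[key], []).append(element)
    fraction = n / len(frame)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, round(fraction * len(members))))
    return chosen

def cluster_sample(frame, n_clusters, key):
    """Randomly select whole clusters and keep every element inside them."""
    clusters = sorted({element[key] for element in frame})
    picked = set(rng.sample(clusters, n_clusters))
    return [element for element in frame if element[key] in picked]

srs = simple_random_sample(frame, 100)
systematic = systematic_sample(frame, 100)
stratified = proportionate_stratified_sample(frame, 100, "year")
clustered = cluster_sample(frame, 2, "dorm")
print(len(srs), len(systematic), len(stratified), len(clustered))
```

Each call returns 100 students, but who can end up together differs: the stratified sample always has 25 from each class year, and the cluster sample draws all of its students from just two dorms.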
|
|
SELF-SELECTED.
AVAILABILITY/GRAB/HAPHAZARD/CONVENIENCE SAMPLES.
PURPOSIVE SAMPLES (sometimes called judgment samples).
NOTE: Depending on the situation, this may, in fact, be the best that we can do.
QUOTA SAMPLES. THEY ONLY LOOK GOOD. They are still often used.
|
|
We often have no idea how refusals or absent households differ from those interviewed. Aim for a MINIMUM response rate of 50 PERCENT (this is actually pretty bad); 65 percent is more acceptable, and at over 70-odd percent, you will rival the General Social Survey.
|
DIFFERENCES BETWEEN SAMPLING AND RANDOM ASSIGNMENT TO TREATMENT GROUPS |
While sampling is often a focal point in survey research, researchers using other designs often feel they can overlook sampling altogether.
Those doing ethnographies feel it is the wealth of detail that they collect that is important.
Experimenters claim that random assignment to intervention or treatment groups is all that is needed for the validity of their research.
WRONG! WRONG! WRONG!
EVERYBODY needs to worry about sampling.
Don't you want to generalize your results at all?
You can't do it if your sample is terrible.
If you are doing an ethnography, you are very dependent upon who lets you into their group to study it. Thus, it is understandable in this instance that the researcher is unable to select their sample cases in advance and simply collect data. A judgement sample may be the best you can do. At least TRY not to let your own personal biases influence the group (or groups) that is selected. Get as much advice from others as you can. Consumers, watch for someone who at least tried to be careful in selecting their research group.
|
|
Notice that you will define your population and select your sample BEFORE you assign subjects in experiments or quasi-experiments to treatment groups. Here is where experimenters often get sloppy. They take grab samples or even cluster grab samples and sometimes never define their population at all!
RANDOMIZATION AND SIMPLE RANDOM SAMPLING ARE TWO DIFFERENT THINGS.
IT IS VERY COMMON FOR NOVICES TO CONFUSE THEM.
SIMPLE RANDOM SAMPLING: One way your total pool of subjects may be created before any intervention or treatment. However, many other sampling methods, such as cluster or convenience sampling, might be used.
The process of how you obtained subjects in the first place influences how well you can generalize and whom you can generalize to. If you used a simple random sample to select elements into your study before you began any intervention, other things equal, you will be able to generalize to a known population with known error limits. (Sometimes this is called external validity.)
RANDOMIZATION OR RANDOM ASSIGNMENT: One way of assigning subjects to treatment or intervention groups. Other methods, such as experimenter judgement, might be used.
The process of how you assigned subjects to treatment or intervention groups can affect the strength of your causal statements.
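The two steps can be kept apart in code as well as in concept. This is a minimal sketch under invented assumptions: a fully listed population of 5,000 student IDs, a study of 60 subjects, and two equal-sized groups.

```python
import random

rng = random.Random(3)

# Hypothetical, fully listed population of 5,000 student IDs.
population = list(range(5_000))

# Step 1, SAMPLING: who gets into the study at all (here, a simple random sample).
subjects = rng.sample(population, 60)

# Step 2, RANDOM ASSIGNMENT: which study subjects get which treatment.
shuffled = subjects[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:30]
control_group = shuffled[30:]

print("subjects in study:", len(subjects))
print("treatment:", len(treatment_group), "control:", len(control_group))
```

Step 2 would run exactly the same way if step 1 had been a grab sample; the assignment would still be random, but the pool being assigned would no longer represent a defined population.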
Randomization, or random assignment of subjects to treatment groups, DOES NOT CORRECT for sloppy sampling of groups or elements in the first place. What randomization means is that you can typically make strong causal statements about how the treatments influenced the outcomes.
Once that's done, whom can you generalize to? If your sample of groups or elements is poor, you can't generalize to anyone!
Now, that is a strong statement. Most researchers in practice aren't that fussy. But, where do you draw the line? What if you have a grab sample of classes from the FSU University school? (You grabbed where the instructor was cooperative.) You might generalize to the University school. You might even try to generalize to Leon County public schools. But what then? Your results don't represent Florida classrooms, Southeastern classrooms, and certainly not United States classrooms. Although you used random assignment of treatments, your sample of classes limits your external validity, or how much you can generalize.
Notice, too, that when you sample classrooms, you have a CLUSTER SAMPLE. If the students within a classroom are similar (say, grouped by ability level), you will artificially depress your standard errors. This means that your TRUE standard errors will be much larger than the typical statistical program calculations that you see on your computer output because your classrooms underestimated the true heterogeneity present in the entire school (most statistical packages use srs formulas to calculate standard errors, but see some of the options in the SDA program).
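A simulation can show how far off the srs formula can be for a cluster sample. The school below is entirely invented (40 classrooms of 25 students, grouped by ability), and the function names are illustrative; the point is only the comparison between the standard error a package would report under srs assumptions and how much the sample means actually vary.

```python
import math
import random
import statistics

rng = random.Random(11)

# Hypothetical school: 40 classrooms of 25 students, grouped by ability, so
# students in the same classroom have similar scores. All values are invented.
classrooms = []
for _ in range(40):
    room_level = rng.gauss(70, 10)
    classrooms.append([rng.gauss(room_level, 5) for _ in range(25)])

def one_cluster_sample(n_rooms=4):
    """Select whole classrooms at random and pool all of their students."""
    rooms = rng.sample(classrooms, n_rooms)
    return [score for room in rooms for score in room]

def srs_formula_se(scores):
    """Standard error as a package would report it if it assumed srs."""
    return statistics.stdev(scores) / math.sqrt(len(scores))

means, naive_ses = [], []
for _ in range(2_000):
    scores = one_cluster_sample()
    means.append(statistics.fmean(scores))
    naive_ses.append(srs_formula_se(scores))

actual_se = statistics.stdev(means)  # how much cluster-sample means really vary
typical_naive_se = statistics.fmean(naive_ses)

print("typical srs-formula standard error:", round(typical_naive_se, 2))
print("actual standard error across cluster samples:", round(actual_se, 2))
```

In a setup like this, the actual standard error should come out several times larger than the srs-formula figure, which is the sense in which the printed standard errors are artificially depressed.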
|
Susan Carol Losh July 20, 2004