In a field experiment, students or clients are assigned to experimental and control groups in a non-random fashion and instruction occurs in a non-laboratory setting. Because the random assignment of participants and a controlled instructional environment are not generally feasible in educational settings, field experiments are more commonly used than controlled experiments.
In the simplest field experiment design, one group participates in a learning intervention while another group does not, both groups complete an outcome measure, and the results are compared. The biggest problem with this design is that the groups may already differ on the outcome variable before the intervention begins. Measuring the outcome variable before the study starts can help you account for such pre-existing differences. Sometimes groups are also compared on other characteristics that may affect the outcome variable, such as SAT or IQ scores.
Group 1: Pretest ----------> Intervention ----------> Posttest
Group 2: Pretest ----------------------------------> Posttest
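The pretest-posttest comparison above can be sketched as a simple gain-score analysis. The scores and group names below are hypothetical, and this assumes the straightforward approach of comparing average change rather than raw posttest means:

```python
from statistics import mean

# Hypothetical pretest/posttest scores for two intact (non-random) groups.
intervention = {"pre": [62, 70, 65, 58, 74], "post": [71, 80, 72, 66, 82]}
control      = {"pre": [64, 69, 66, 60, 73], "post": [66, 72, 68, 61, 75]}

def mean_gain(group):
    """Average posttest-minus-pretest change for one group."""
    return mean(post - pre for pre, post in zip(group["pre"], group["post"]))

# Comparing gains, rather than raw posttest means, accounts for
# pre-existing differences between the groups on the outcome variable.
effect = mean_gain(intervention) - mean_gain(control)
print(round(effect, 1))
```

Comparing gains rather than posttest scores alone is one simple way to use the pretest; more elaborate adjustments (such as analysis of covariance) exist but follow the same logic.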
Experiment participants may improve on a certain outcome because they have matured, or decline because they have become fatigued. For example, an instructor who attends teacher-training sessions during an academic year and improves significantly on course instructor survey ratings at the end of that year may credit the training for gains that are actually due to maturing as a teacher. Fortunately, comparing similar groups through a field experiment provides evidence that observed changes are due to the intervention rather than maturation.
Are resources available?
Conducting a field experiment requires a moderate investment of time, the ability to collect baseline data before the intervention, and experience or training in one or more data collection methods.
What will you compare?
You may want to compare the performance of people who participate in an educational program with that of people who do not participate in any program. On the other hand, it may be more realistic to compare the performance of participants in a new program with that of participants in an existing effective program.
Do you have a good chance of detecting differences?
To find statistically significant differences between two groups, you need a good estimate of the expected difference between the control and intervention groups on the outcome variables. Reviewing similar interventions from previous research can help you make this estimate and design a more potent intervention. Larger samples are more likely to reveal real differences. The number of interventions being tested and the significance level of your statistical tests also affect your ability to detect differences. If your knowledge of statistics is limited, consult an expert.
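The relationship between expected difference, sample size, and detectability can be sketched with a standard power calculation. This is a minimal sketch using the normal approximation for a two-sided, two-sample comparison of means; the function name and defaults are illustrative, not from the source:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a given
    effect size (expected group difference in standard-deviation units,
    i.e., Cohen's d), using the normal approximation.
    Estimate effect_size from similar interventions in prior research."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided test
    z_beta = z.inv_cdf(power)           # quantile for desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

# A "medium" expected difference of half a standard deviation
# calls for roughly 63 participants in each group.
print(n_per_group(0.5))
```

Note how sharply the required sample grows as the expected difference shrinks; halving the effect size roughly quadruples the sample needed, which is why a realistic estimate of the expected difference matters before the study begins.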
How will you deal with nonparticipation and attrition?
Participants who do not complete a course or who participate only partially create a problem if their attrition or level of participation follows a pattern different from the rest of the group (i.e., the pattern is non-random). For example, if under-achieving students are more likely to drop out of the intervention group than the control group, the intervention may appear more effective than it actually is. Gathering background information on all participants, such as previous achievement records or socioeconomic status, can help you estimate the bias that is introduced and adjust for it.
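A small numerical illustration of this attrition problem may help. All scores here are hypothetical, and the dropout pattern is assumed for the sake of the example:

```python
from statistics import mean

# Hypothetical posttest scores; the two groups start out very similar.
intervention_all = [55, 60, 62, 70, 78, 80]
control_all      = [56, 59, 63, 69, 77, 79]

# Suppose the two weakest intervention students drop out before the
# posttest (non-random attrition), while the control group stays intact.
intervention_completers = intervention_all[2:]

# Comparing completers only inflates the apparent effect, because the
# remaining intervention group no longer resembles the control group.
naive_effect = mean(intervention_completers) - mean(control_all)
full_effect  = mean(intervention_all) - mean(control_all)

print(round(naive_effect, 2), round(full_effect, 2))
```

Background data such as prior achievement records are what let you notice that the dropouts were systematically weaker students, and so flag the completer-only comparison as biased.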
How will you elicit cooperation?
Collaborate with program staff, instructors, and students or clients to gain their support and cooperation.
Completing a pre-test measure may make participants aware of a deficit that they then address. If you use identical measures, students or clients may do better the second time because of practice.
How you measure the outcome and who measures it may change from pretest to posttest, and such changes can affect whether students or clients appear to improve. It is important that pretest and posttest measures be equivalent.
Because participants are not assigned randomly to groups, the groups may differ in ways that compromise your conclusions. For example, some students who are enrolled in an 8 a.m. class might have been forced to take an early morning class because they failed to register promptly. These students may be less motivated to attend class than those in a 10 a.m. class, which may affect their achievement. Comparing groups that are as similar as possible will improve validity.