Analyzing experiment data
Enter your data into a statistical program such as SPSS or SAS, which are available to UT Austin faculty, staff, and students for a modest fee through Information Technology Services (ITS). Inspect the data for errors that can occur during data entry or when respondents provide inconsistent answers. For large databases, check at least five percent of the entered data for accuracy. If you find any errors, check the remainder of the data and correct the errors.
You may need to recode some answers to questions that have an "other" response option. For example, one person may answer the question, "Do you consider yourself African American, Caucasian, Asian, Hispanic, or other?" by circling "other" and writing, "Chinese." To maintain consistency, code the answer as "Asian" rather than "other."
Calculate means for outcome measures. Determine if outcome and other variables are normally distributed, a requirement for many statistical tests. If a variable is not normally distributed, consult with a statistician to determine if you need to transform the variable.
While comparing means will give you a rough sense of differences on outcome measures, you must use statistical tests to demonstrate that these differences are unlikely to have occurred by chance. Many statistical programs provide a p value, which indicates how likely differences as large as those observed would be if chance alone were operating. For example, a p value of .05 indicates a 5% probability that the differences between groups occurred by chance rather than because of the intervention. Before analyzing your data, set the p value that you will use as a criterion for statistical significance. A p value of .05 is the most commonly used cutoff.
If you are comparing pre- and posttest scores for a single group, use a t-test for dependent means (also called a paired samples t-test, repeated measures t-test, or t-test for dependent samples) to determine if there is a statistically significant change. The easiest way to accomplish this is to enter the data into a statistical program like Excel or SPSS and to use a pull-down menu to run the test. If you are using Excel, click Data Analysis on the Tools menu to perform statistical analyses. If Data Analysis is not a listed option, you will need to install the Analysis ToolPak by clicking Add-Ins on the Tools menu.
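If your data are in Python rather than Excel or SPSS, SciPy runs the same paired samples t-test. A minimal sketch, using invented pre- and posttest scores for ten students:

```python
from scipy import stats

# Hypothetical pre- and posttest scores for the same ten students
pre = [62, 70, 55, 68, 74, 60, 66, 71, 58, 65]
post = [68, 75, 60, 70, 79, 63, 72, 74, 61, 70]

# Paired (dependent-samples) t-test: each student serves as their own control
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Compare p to the cutoff chosen before the analysis
if p_value < 0.05:
    print("Change from pre- to posttest is statistically significant")
```

The test pairs each student's two scores, so it is sensitive to consistent individual change even when scores vary widely between students.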
To compare scores at three or more points in time, one option is a repeated measures analysis of variance (ANOVA) (also called ANOVA for correlated samples). A significant F value for an ANOVA tells you that, overall, scores differ at different times, but it does not tell you which scores are significantly different from each other. To answer that question, you must perform post-hoc comparisons after you obtain a significant F, using tests such as Tukey's and Scheffe's, which set more stringent significance levels as you make more comparisons. However, if you make specific predictions about differences between means, you can test these predictions with planned comparisons, which enable you to set significance levels at p < .05. Planned comparisons are performed in place of an overall ANOVA.
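To make the logic of a repeated measures ANOVA concrete, the F value can be computed directly by partitioning variability into time points, subjects, and error. The sketch below uses invented scores for five students at three points in time; a statistical package would report the same F:

```python
from scipy import stats

# Hypothetical scores for five students measured at three points in time
scores = [
    [60, 66, 72],
    [55, 59, 65],
    [70, 74, 77],
    [62, 65, 71],
    [58, 64, 69],
]
n_subj, n_cond = len(scores), len(scores[0])

grand_mean = sum(x for row in scores for x in row) / (n_subj * n_cond)
cond_means = [sum(row[j] for row in scores) / n_subj for j in range(n_cond)]
subj_means = [sum(row) / n_cond for row in scores]

# Partition variability: time points (conditions), subjects, and residual error
ss_cond = n_subj * sum((m - grand_mean) ** 2 for m in cond_means)
ss_subj = n_cond * sum((m - grand_mean) ** 2 for m in subj_means)
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_error = ss_total - ss_cond - ss_subj

df_cond, df_error = n_cond - 1, (n_subj - 1) * (n_cond - 1)
f_value = (ss_cond / df_cond) / (ss_error / df_error)
p_value = stats.f.sf(f_value, df_cond, df_error)  # right-tail probability of F
print(f"F({df_cond}, {df_error}) = {f_value:.2f}, p = {p_value:.5f}")
```

Because subject-to-subject differences are removed from the error term, this design can detect change over time that an independent-groups ANOVA on the same numbers would miss.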
Linear regression enables you to predict the level of an outcome variable using one or more continuous variables. For example, you might institute an instructional innovation: students provide and receive on-line feedback from fellow students every week on essay organization and clarity. Before starting the innovation, you have students complete measures of openness to feedback and communication skill, and you use these scores to predict their degree of writing improvement at the end of the semester.
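A sketch of this kind of prediction with two predictors (all names and numbers here are invented, and the outcome values are constructed to fit the predictors exactly so the recovered coefficients are easy to check):

```python
import numpy as np

# Hypothetical pre-semester predictor scores for eight students
openness = np.array([3.1, 4.0, 2.5, 3.8, 4.5, 2.9])
comm     = np.array([2.8, 3.5, 2.2, 4.0, 4.4, 3.0])
# Outcome constructed from a known rule, purely for illustration
improve  = 0.2 + 0.3 * openness + 0.25 * comm

# Design matrix with an intercept column; ordinary least-squares fit
X = np.column_stack([np.ones(len(openness)), openness, comm])
coef, *_ = np.linalg.lstsq(X, improve, rcond=None)
print("intercept, openness weight, communication weight:", np.round(coef, 2))

# Predict writing improvement for a new student
new_student = np.array([1.0, 3.6, 3.3])
print(f"predicted improvement: {new_student @ coef:.2f}")
```

With real data the fit would not be exact; a statistical package would additionally report standard errors and p values for each coefficient.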
If participants are assessed multiple times, hierarchical linear models (HLM) may be a better choice than a repeated measures ANOVA. HLM is particularly well suited to analyzing data from repeated measurements or data with a hierarchical structure. For example, in much educational research, students are grouped within classrooms, which are grouped within schools. HLM takes into account that students from the same classroom or school have more in common than individuals randomly sampled from a larger population. HLM requires specialized software, available to UT faculty and staff at a discount.
You might also compute correlations to determine whether there is a statistically significant positive or negative relationship between two continuous variables. For example, you could determine if writing improvement is significantly related to course satisfaction. Be aware, however, that computing correlations between several variables increases the chances of finding a relationship due to chance alone, and that finding significant correlations between variables does not tell you what causes those relationships.
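A correlation like the one described can be computed with SciPy; the ratings below are invented for illustration. Keep in mind that running many such tests inflates the chance of a spurious "significant" result:

```python
from scipy import stats

# Hypothetical ratings: writing improvement and course satisfaction
improvement = [0.9, 1.2, 1.5, 1.8, 2.0, 2.4, 1.1, 1.7]
satisfaction = [3.0, 3.4, 3.9, 4.1, 4.6, 4.8, 3.2, 4.0]

# Pearson correlation: r measures strength and direction of the linear relationship
r, p = stats.pearsonr(improvement, satisfaction)
print(f"r = {r:.2f}, p = {p:.4f}")
```

Even a large, significant r here would not tell you whether satisfaction drives improvement, improvement drives satisfaction, or a third variable drives both.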
If you need additional help from someone knowledgeable about statistics, contact the research consulting staff at UT Austin's Division of Statistics & Scientific Computation.
Field and controlled experiments
To test for significant differences between two separate groups of students, the most commonly used option is the t-test for independent groups (also called an independent samples t-test, or the t-test for independent means). Administer a version of your outcome measure before your intervention to make sure there are not pre-existing differences that are statistically significant. If there are pre-test differences and you randomly assigned participants to the groups, you can control for these differences using an Analysis of Covariance (ANCOVA) procedure. You cannot use an ANCOVA, however, to control for pre-existing group differences in a field experiment, so consult with a statistician in this case.
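The independent-groups t-test is also straightforward in SciPy; the posttest scores below are hypothetical:

```python
from scipy import stats

# Hypothetical posttest scores for intervention and comparison groups
intervention = [78, 82, 75, 88, 84, 79, 81, 86]
comparison   = [72, 70, 75, 68, 74, 71, 69, 73]

# Independent samples t-test (pass equal_var=False for Welch's version
# if the two groups' variances look unequal)
t_stat, p_value = stats.ttest_ind(intervention, comparison)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Running the same test on the pretest scores first, as described above, is what tells you whether any posttest difference might simply reflect pre-existing differences.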
Make sure your data meet statistical assumptions for t-tests and other statistical procedures. For t-tests, the distribution of the outcome variable (for example, test scores) should be roughly normal or bell-shaped. When you are comparing two sets of scores, the spread of scores (variance) should be roughly equal for both sets. In addition, your outcome variable should be on a continuous scale (for example, age) and cannot be categorical (for example, dyslexic versus not dyslexic individuals).
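These assumption checks can be run directly before the t-test itself; a sketch with hypothetical scores, using the Shapiro-Wilk test for normality and Levene's test for equal variances:

```python
from scipy import stats

# Hypothetical test scores for two groups
group_a = [78, 82, 75, 88, 84, 79, 81, 86, 77, 83]
group_b = [72, 70, 75, 68, 74, 71, 69, 73, 76, 70]

# Shapiro-Wilk: a small p (e.g., < .05) suggests scores depart from normality
w, p_norm = stats.shapiro(group_a)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p_norm:.3f}")

# Levene: a small p suggests the two groups' variances are unequal
stat, p_var = stats.levene(group_a, group_b)
print(f"Levene: statistic = {stat:.3f}, p = {p_var:.3f}")
```

Note that for assumption checks a small p value is a warning sign rather than a goal: a non-significant result here is what licenses the standard t-test.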
When you assign individuals to groups based on a cutoff score on a placement measure, such as a math achievement test, you can use a regression-discontinuity design. Participants who score above the cutoff are assigned to one group, while participants below the cutoff are assigned to a second group. The effect of the treatment is estimated by using the placement score to predict scores on an outcome measure, such as a second test grade, and plotting regression lines separately for each group. If the treatment provided to one group is effective, you should see a "jump" or discontinuity of the regression lines at the cutoff point.
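The "jump" at the cutoff can be estimated by fitting a separate regression line on each side and comparing their predictions at the cutoff score. A sketch with invented placement and outcome scores:

```python
from scipy import stats

cutoff = 50  # hypothetical placement-test cutoff

# Hypothetical (placement score, outcome score) pairs
below = [(35, 52), (40, 55), (42, 57), (45, 58), (48, 60)]  # received treatment
above = [(52, 55), (55, 57), (60, 61), (65, 64), (70, 68)]  # comparison group

# Fit a regression line separately within each group
fit_lo = stats.linregress([x for x, _ in below], [y for _, y in below])
fit_hi = stats.linregress([x for x, _ in above], [y for _, y in above])

# The discontinuity between the two lines at the cutoff estimates the effect
at_cutoff_lo = fit_lo.intercept + fit_lo.slope * cutoff
at_cutoff_hi = fit_hi.intercept + fit_hi.slope * cutoff
print(f"estimated jump at cutoff: {at_cutoff_lo - at_cutoff_hi:.1f} points")
```

Plotting both lines against the scatter of points makes the discontinuity, or its absence, visually obvious.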
To compare three or more groups, use an independent samples analysis of variance (ANOVA). Again, you will need to conduct post-hoc comparisons after obtaining a significant F value to determine which specific groups differ.
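An independent-samples one-way ANOVA is a one-liner in SciPy; the three groups of scores below are invented:

```python
from scipy import stats

# Hypothetical posttest scores for three independent groups of students
group_1 = [78, 82, 75, 88, 84]
group_2 = [72, 70, 75, 68, 74]
group_3 = [80, 85, 83, 90, 87]

# One-way ANOVA: a significant F says the group means are not all equal,
# but not which pairs differ; post-hoc tests answer that question
f_value, p_value = stats.f_oneway(group_1, group_2, group_3)
print(f"F = {f_value:.2f}, p = {p_value:.4f}")
```

A significant result here would then be followed by post-hoc pairwise comparisons, as described above.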