
Item analysis

An item analysis involves many statistics that can provide useful information for improving the quality and accuracy of multiple-choice or true/false items (questions).  Some of these statistics are:

Item difficulty: the percentage of students that correctly answered the item.

Item discrimination: the relationship between how well students did on the item and their total exam score.

Reliability coefficient: a measure of the amount of measurement error associated with an exam score.

Item-total statistics: measure the relationship of individual exam items to the overall exam score.  

Currently, the University of Texas does not perform this analysis for faculty. However, one can calculate these statistics using SPSS or SAS statistical software.
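For those who want to compute the first of these statistics themselves, the sketch below shows one way to do it in Python with NumPy rather than SPSS or SAS. It is only a minimal illustration: the 0/1 score matrix is randomly generated as a stand-in for real exam data, and the variable names are our own, not part of any particular package.

    import numpy as np

    # Stand-in data: a 0/1 matrix with one row per student and one column per
    # item (1 = correct, 0 = incorrect). Real data would come from the exam.
    rng = np.random.default_rng(0)
    scores = rng.integers(0, 2, size=(100, 10))

    # Item difficulty: proportion of students who answered each item correctly.
    difficulty = scores.mean(axis=0)

    # Item discrimination: correlation between each item and the total score
    # of the remaining items (a corrected point-biserial correlation).
    totals = scores.sum(axis=1)
    discrimination = np.array([
        np.corrcoef(scores[:, j], totals - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

    # Cronbach's alpha: a common reliability coefficient for the whole exam.
    k = scores.shape[1]
    alpha = (k / (k - 1)) * (1 - scores.var(axis=0, ddof=1).sum()
                             / totals.var(ddof=1))

    for j in range(k):
        print(f"Item {j + 1}: difficulty = {difficulty[j]:.2f}, "
              f"discrimination = {discrimination[j]:.2f}")
    print(f"Cronbach's alpha = {alpha:.3f}")

The item-total statistics themselves are usually broken down further into the three quantities listed next.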

  1. Corrected item-total correlation
    • This is the correlation between an item and the rest of the exam, without that item considered part of the exam.
    • If the correlation is low for an item, this means the item isn't really measuring the same thing the rest of the exam is trying to measure.
  2. Squared multiple correlation
    • This measures how much of the variability in the responses to this item can be predicted from the other items on the exam.
    • If an item does not predict much of the variability, then the item should be considered for deletion.
  3. Alpha if item deleted
    • The change in Cronbach's alpha if the item is deleted.
    • When the alpha value is higher than the current alpha with the item included, one should consider deleting this item to improve the overall reliability of the exam.
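As a rough illustration of how these three item-total statistics could be computed outside SPSS or SAS, here is a Python/NumPy sketch that derives all of them from a students-by-items score matrix (0/1 for right/wrong, or partial-credit points). The function name and the least-squares approach to the squared multiple correlation are our own choices, not an established package API, and the sketch assumes the exam has at least three items.

    import numpy as np

    def item_total_statistics(scores):
        """Corrected item-total correlation, squared multiple correlation,
        and alpha-if-deleted for each column of a students-by-items matrix."""
        scores = np.asarray(scores, dtype=float)
        n_students, k = scores.shape
        results = []
        for j in range(k):
            item = scores[:, j]
            rest = np.delete(scores, j, axis=1)       # exam without this item
            rest_total = rest.sum(axis=1)

            # Corrected item-total correlation: item vs. rest of the exam.
            r_it = np.corrcoef(item, rest_total)[0, 1]

            # Squared multiple correlation: R^2 from regressing the item on
            # all of the other items (least squares with an intercept).
            X = np.column_stack([np.ones(n_students), rest])
            beta, *_ = np.linalg.lstsq(X, item, rcond=None)
            resid = item - X @ beta
            smc = 1 - (resid ** 2).sum() / ((item - item.mean()) ** 2).sum()

            # Alpha if deleted: Cronbach's alpha of the remaining k-1 items.
            alpha_del = ((k - 1) / (k - 2)) * (
                1 - rest.var(axis=0, ddof=1).sum() / rest_total.var(ddof=1))

            results.append((r_it, smc, alpha_del))
        return results

Each tuple in the returned list corresponds to one row of the kind of item-total statistics table shown in the example below.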

Example: item-total statistics table

Summary for scale:
Mean = 46.1100   S.D. = 8.26444   Valid n = 100
Cronbach's alpha = .794313   Standardized alpha = .800491
Average inter-item correlation = .297818

Variable   Mean if    Var. if    S.D. if    Corrected item-     Squared multiple   Alpha if
           deleted    deleted    deleted    total correlation   correlation        deleted
ITEM1      41.61000   51.93790   7.206795   .656298             .507160            .752243
ITEM2      41.37000   53.79310   7.334378   .666111             .533015            .754692
ITEM3      41.41000   54.86190   7.406882   .549226             .363895            .766778
ITEM4      41.63000   56.57310   7.521509   .470852             .305573            .776015
ITEM5      41.52000   64.16961   8.010593   .054609             .057399            .824907
ITEM6      41.56000   62.68640   7.917474   .118561             .045653            .817907
ITEM7      41.46000   54.02840   7.350401   .587637             .443563            .762033
ITEM8      41.33000   53.32110   7.302130   .609204             .446298            .758992
ITEM9      41.44000   55.06640   7.420674   .502529             .328149            .772013
ITEM10     41.66000   53.78440   7.333785   .572875             .410561            .763314

Looking at the corrected item-total correlations, we can see that items 5 and 6 correlate with the rest of the exam at only .05 and .12, while all other items correlate at .45 or better. The squared multiple correlations show that items 5 and 6 are again noticeably lower than the rest of the items. Finally, the alpha-if-deleted column shows that the reliability of the scale (alpha) would increase to about .82 if either of these two items were deleted. Thus, we would probably delete these two items from this exam.

Item deletion process: Items should be deleted one at a time, starting here with item 5, because deleting it produces the larger gain in the exam's reliability coefficient. We would then re-run the item-total statistics report before touching item 6, to make sure that removing it would not lower the overall alpha of the shortened exam. If item 6 still appears as a candidate for deletion after item 5 is gone, we would repeat the same process for it.
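The one-at-a-time deletion loop described above can also be automated. The following is a minimal Python sketch, not a definitive procedure: it assumes the same kind of students-by-items score matrix as the earlier sketches, drops whichever item raises Cronbach's alpha the most, and stops as soon as no single deletion improves reliability. The function names are hypothetical.

    import numpy as np

    def cronbach_alpha(scores):
        """Cronbach's alpha for a students-by-items score matrix."""
        k = scores.shape[1]
        return (k / (k - 1)) * (1 - scores.var(axis=0, ddof=1).sum()
                                / scores.sum(axis=1).var(ddof=1))

    def prune_items(scores, item_names):
        """Delete one item at a time, always the one whose removal raises
        alpha the most, until no single deletion improves reliability."""
        scores = np.asarray(scores, dtype=float)
        names = list(item_names)
        while scores.shape[1] > 2:
            current = cronbach_alpha(scores)
            # Alpha of the exam with each item deleted in turn.
            if_deleted = [cronbach_alpha(np.delete(scores, j, axis=1))
                          for j in range(scores.shape[1])]
            best = int(np.argmax(if_deleted))
            if if_deleted[best] <= current:
                break  # no single deletion helps; keep the remaining items
            print(f"Dropping {names[best]}: alpha "
                  f"{current:.3f} -> {if_deleted[best]:.3f}")
            scores = np.delete(scores, best, axis=1)
            del names[best]
        return scores, names

On data like the example above, such a loop would drop item 5 first (alpha rises to about .82) and would then re-evaluate item 6 before deciding whether to drop it as well.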

Distractor evaluation: another useful item review technique.

The distractors should be considered an important part of the item. Nearly 50 years of research shows that there is a relationship between the distractors students choose and total exam score. The quality of the distractors influences student performance on an exam item. Although the correct answer must be truly correct, it is just as important that the distractors be incorrect. Distractors should appeal to low scorers who have not mastered the material, whereas high scorers should select them only infrequently. Reviewing the options can reveal potential errors of judgment and distractors that perform inadequately; these poor distractors can be revised, replaced, or removed.

One way to study responses to distractors is with a frequency table. This table tells you the number and/or percentage of students who selected each distractor. Distractors selected by few or no students should be removed or replaced; such distractors are likely so implausible that hardly anyone chooses them.
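As a sketch of what such a frequency table might look like in code, the short Python example below counts a hypothetical set of 20 letter responses to a single item keyed "B" and flags any option that almost nobody selects; the data and the one-student threshold are illustrative only.

    from collections import Counter

    # Hypothetical responses of 20 students to one multiple-choice item whose
    # keyed (correct) answer is "B"; A, C, and D are the distractors.
    answers = ["B", "B", "A", "B", "D", "B", "C", "B", "B", "A",
               "B", "B", "B", "D", "B", "A", "B", "B", "B", "B"]

    counts = Counter(answers)
    n = len(answers)
    for option in "ABCD":
        count = counts[option]
        note = ("  <- rarely chosen; revise or replace"
                if option != "B" and count <= 1 else "")
        print(f"Option {option}: {count:2d} students "
              f"({100 * count / n:3.0f}%){note}")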

Example of an item analysis.

Additional information

DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park: Sage Publications.

Field, A. (2006). Research Methods II: Reliability Analysis. Retrieved August 5, 2006 from The University of Sussex Web site: http://www.sussex.ac.uk/Users/andyf/reliability.pdf  

Haladyna, T. M. (1999). Developing and validating multiple-choice exam items, 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates.

Suen, H. K. (1990). Principles of exam theories. Hillsdale, NJ: Lawrence Erlbaum Associates.

Yu, A. (n.d.). Using SAS for Item Analysis and Test Construction. Retrieved August 5, 2006 from Arizona State University Web site: http://seamonkey.ed.asu.edu/~alex/teaching/assessment/alpha.html
