This usage note describes how to run a factor analysis, specifically an exploratory common factor analysis, using the SAS FACTOR procedure. The document has three sections: Introduction, Outline of Use, and An Illustrative Example. The Introduction explains what factor analysis is and when to use it. The second section is a detailed outline for conducting a factor analysis. The last section illustrates common factor analysis with actual data.
Each observed variable (y) can be expressed as a weighted composite of a set of latent variables (f's) such that
$$y_i = a_{i1} f_1 + a_{i2} f_2 + \cdots + a_{ik} f_k + e_i$$
where $y_i$ is the $i$th observed variable, $a_{ij}$ is the weight of $y_i$ on the $j$th factor $f_j$, and $e_i$ is the residual of $y_i$ on the factors. Given the assumption that the residuals are uncorrelated across the observed variables, the correlations among the observed variables are accounted for by the factors.
The following is an example of a simple path diagram for a factor analysis model. This diagram is a schematic representation of the above formula.

F1 and F2 are two common factors. Y1, Y2, Y3, Y4, and Y5 are observed variables, possibly 5 subtests or measures of other observations such as responses to items on a survey. e1, e2, e3, e4, and e5 represent residuals or unique factors, which are assumed to be uncorrelated with each other. Any correlation between a pair of the observed variables can be explained in terms of their relationships with the latent variables.
The term "common" in common factor analysis describes the variance that is analyzed. It is assumed that the variance of a single variable can be decomposed into common variance that is shared by other variables included in the model, and unique variance that is unique to a particular variable and includes the error component. Common factor analysis (CFA) analyzes only the common variance of the observed variables; principal component analysis considers the total variance and makes no distinction between common and unique variance.
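The decomposition can be written out as a short sketch. It assumes that the observed variables are standardized, that the common factors have unit variance and are uncorrelated with each other, and that the residuals are uncorrelated with the factors and with each other. The symbols $h_i^2$ (common variance, or communality) and $u_i^2$ (unique variance) are introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{Var}(y_i)
  &= \underbrace{a_{i1}^2 + a_{i2}^2 + \cdots + a_{ik}^2}_{h_i^2\ \text{(common variance)}}
   + \underbrace{\operatorname{Var}(e_i)}_{u_i^2\ \text{(unique variance)}} = 1, \\
\operatorname{Corr}(y_i, y_m)
  &= a_{i1} a_{m1} + a_{i2} a_{m2} + \cdots + a_{ik} a_{mk}, \qquad i \neq m .
\end{aligned}
```

Under these assumptions only the weights $a_{ij}$ enter the correlation between two different variables, which is why common factor analysis works with the common part $h_i^2$ rather than the total variance.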
The selection of one technique over the other is based upon several criteria. First of all, what is the objective of the analysis? Common factor analysis and principal component analysis are similar in the sense that the purpose of both is to reduce the original variables into fewer composite variables, called factors or principal components. However, they are distinct in the sense that the obtained composite variables serve different purposes. In common factor analysis, a small number of factors are extracted to account for the intercorrelations among the observed variables--to identify the latent dimensions that explain why the variables are correlated with each other. In principal component analysis, the objective is to account for the maximum portion of the variance present in the original set of variables with a minimum number of composite variables called principal components.
Secondly, what are the assumptions about the variance in the original variables? If the observed variables are measured relatively error-free (for example, age, years of education, or number of family members), or if it is assumed that the error and specific variance represent a small portion of the total variance in the original set of variables, then principal component analysis is appropriate. But if the observed variables are only indicators of the latent constructs to be measured (such as test scores or responses to attitude scales), or if the error (unique) variance represents a significant portion of the total variance, then the appropriate technique is common factor analysis. Since the two methods often yield similar results, only CFA will be illustrated here.
Several important questions should be considered by a researcher preparing input data for a factor analysis. First, what variables should be included in the analysis? Factor analysis is designed to explain why certain variables are correlated. Moreover, common factor analysis is concerned only with that portion of total variance shared by the variables included in the model. Therefore, you should not include variables that are not believed to be related to each other in any way.
Second, how many variables should be included? Factors are unobserved latent variables that can be inferred from a set of observed variables. Therefore, factors cannot emerge unless there is a sufficient number of observed variables that vary along the latent continuum. You cannot define a factor with a single observed variable. You should have a minimum of three observed variables for each factor expected to emerge. In Thurstone's terminology, the factors defined by only one or two observed variables are called "singlet" or "doublet" factors, which are not desirable. Guttman[1] has shown that if a correlation matrix is suitable for common factor analysis, then $R^{-1}$ (the inverse of the correlation matrix) should approach a diagonal matrix as the number of variables increases while the number of factors remains constant. Kaiser and Rice[2] proposed a measure of sampling adequacy, which indicates how near $R^{-1}$ is to a diagonal matrix.
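In PROC FACTOR, Kaiser's measure of sampling adequacy can be requested with the MSA option. The following is a minimal sketch; the data set name d1 is borrowed from the example later in this note, and any raw-data or correlation-type data set could take its place.

```sas
/* Print the partial correlations and Kaiser's measure of sampling */
/* adequacy, overall and for each variable.                        */
/* d1 is assumed to be a raw-data or TYPE=CORR data set.           */
PROC FACTOR DATA=d1 MSA;
RUN;
```

Values of the measure nearer to 1 correspond to small partial correlations among the variables, which is the situation in which the inverse correlation matrix is close to diagonal.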
Third, is the number of observations sufficient to provide reliable estimates of the correlations between the variables? Correlation coefficients tend to be unstable and greatly influenced by the presence of outliers if the sample size is not large. It is generally unwise to conduct a factor analysis on a sample of fewer than 50 observations. Moreover, the sample size should also be considered in relation to the number of variables included in the analysis. Various rules of thumb have been proposed, with the minimum number of observations per variable ranging from 5 to 10. While there is no definitive answer to this problem, larger samples generally give more stable correlation estimates and therefore more trustworthy results.
Fourth, is correlation a valid measure of association among the variables to be analyzed? The correlation coefficient is being used as a measure of conceptual similarity of the variables. If strong curvilinear relationships are present among variables, for example, the correlation coefficient is not an appropriate measure. In such cases, the results of a factor analysis based on correlation coefficients will be invalid. The variables should meet the other assumptions required for the correlation coefficient as well. However, in social and behavioral sciences, we seldom have variables that strictly meet these assumptions. Ordinal and dichotomous variables have been submitted to a factor analysis in the social and behavioral sciences. Unless the distributions of the variables are strongly nonnormal, factor analysis seems to be robust to minor violations of these assumptions.
There are still other methods of estimating communalities available in SAS. Interested readers should refer to the SAS manual[4]. Some method should be chosen, because SAS by default sets all prior communalities to 1.0, which is the same as requesting a principal components analysis. This default has caused misunderstanding among novice users who are not aware of the consequences of overlooking default settings. Many researchers claim to have conducted a common factor analysis when a principal components analysis was actually performed.
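The difference between the default and an explicit choice of prior communalities can be sketched with two calls. The data set name d1 is assumed from the example later in this note; PRIORS=SMC is one possible choice among the available methods, not the only valid one.

```sas
/* PRIORS= omitted: with METHOD=PRINCIPAL the prior communalities */
/* are 1.0, so this is a principal components analysis.           */
PROC FACTOR DATA=d1 METHOD=PRINCIPAL;
RUN;

/* Squared multiple correlations as prior communalities: */
/* this is a common (principal) factor analysis.         */
PROC FACTOR DATA=d1 METHOD=PRINCIPAL PRIORS=SMC;
RUN;
```

Stating PRIORS= explicitly in every call makes it clear from the program itself which of the two analyses was run.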

The second Chi-square test statistic, labelled "Test of H0: N factors are sufficient", is the test of the null hypothesis that N common factors are sufficient to explain the intercorrelations among the variables, where N is the number of factors you specify with an NFACTORS=N option in the PROC FACTOR statement. This test is useful for testing the hypothesis that a given number of factors is sufficient to account for your data; in this instance your goal is a small chi-square value relative to its degrees of freedom. This outcome results in a large p-value (p > .05). One downside of this test is that the Chi-square test is very sensitive to sample size: given a large sample, this test will normally reject the null hypothesis of the residual matrix being a null matrix, even when the factor analysis solution is very good. Therefore, be careful in interpreting this test's significance value. Some data sets do not lend themselves to good factor solutions, regardless of the number of factors extracted.
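PROC FACTOR prints these chi-square tests when maximum likelihood extraction is requested. The call below is a sketch under stated assumptions: d1 is the correlation-type data set used later in this note, four factors is an illustrative choice, and NOBS=200 is a placeholder sample size rather than a value taken from the data.

```sas
/* Maximum likelihood extraction with a stated number of factors.  */
/* The output includes "Test of H0: N factors are sufficient".     */
/* NOBS=200 is a placeholder: supply the real sample size when the */
/* input is a correlation matrix that does not carry N.            */
PROC FACTOR DATA=d1 METHOD=ML NFACTORS=4 NOBS=200;
RUN;
```

Because the test statistic depends on the number of observations, a realistic NOBS= value matters when the input is a correlation matrix rather than raw data.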

The variable V1 initially has factor loadings (correlations) of .7 and .6 on factor 1 and factor 2 respectively. However, after rotation the factor loadings have changed to .9 and .2 on the rotated factor 1 and factor 2 respectively, which is closer to a simple structure and easier to interpret.
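A quick check on these numbers, assuming the rotation in this example is orthogonal (the symbol $h_{V1}^2$ for the communality of V1 is introduced here for illustration): the sum of squared loadings for V1 is the same before and after rotation.

```latex
\begin{aligned}
\text{before rotation:}\quad h_{V1}^2 &= 0.7^2 + 0.6^2 = 0.49 + 0.36 = 0.85, \\
\text{after rotation:}\quad  h_{V1}^2 &= 0.9^2 + 0.2^2 = 0.81 + 0.04 = 0.85 .
\end{aligned}
```

The rotation redistributes the explained variance of V1 between the two factors without changing how much of it the factors explain in total.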
The simplest case of rotation is an orthogonal rotation, in which the angle between the reference axes of the factors is maintained at 90 degrees. More complicated forms of rotation allow the angle between the reference axes to be other than a right angle, i.e., factors are allowed to be correlated with each other. These types of rotational procedures are referred to as oblique rotations. Orthogonal rotation procedures are more commonly used than oblique rotation procedures. In some situations, theory may mandate that underlying latent constructs be uncorrelated with each other, and therefore oblique rotation procedures will not be appropriate. In other situations where the correlations between the underlying constructs are not assumed to be zero, oblique rotation procedures may yield simpler and more interpretable factor patterns.
A number of orthogonal and oblique rotation procedures have been proposed. Each procedure has a slightly different simplicity function to be maximized. The ROTATE= option in the PROC FACTOR statement supports five orthogonal rotation methods: EQUAMAX, ORTHOMAX, QUARTIMAX, PARSIMAX, and VARIMAX; and two oblique rotation methods: PROCRUSTES and PROMAX. The VARIMAX method has been the most commonly used orthogonal rotation procedure.
1. Identifying significant loadings: The analyst starts with the first variable (row) and examines the factor loadings horizontally from left to right, underlining them if they are significant. This process is repeated for all the other variables. You can instruct SAS to perform this step by using the FUZZ= option in the PROC FACTOR statement. For instance, FUZZ=.30 prints only the factor loadings greater than or equal to .30 in absolute value.
Ideally, we expect a single significant loading for each variable on only one factor: across each row there is only one underlined factor loading. It is not uncommon, however, to observe split loadings, a variable which has multiple significant loadings. On the other hand, if there are variables that fail to load significantly on any factor, then the analyst should critically evaluate these variables and consider deriving a new factor solution after eliminating them.
2. Naming of Factors: Once all significant loadings are identified, the analyst attempts to assign some meaning to the factors based on the patterns of the factor loadings. To do this, the analyst examines the significant loadings for each factor (column). In general, the larger the absolute size of the factor loading for a variable, the more important the variable is in interpreting the factor. The sign of the loadings also needs to be considered in labeling the factors. It may be important to reverse the scoring of the negatively worded items in Likert-type instruments to prevent ambiguity. That is, in Likert-type instruments some items are often negatively worded so that high scores on these items actually reflect low degrees of the attitude or construct being measured. Remember that the factor loadings represent the correlation or linear association between a variable and the latent factor(s). Considering all the variables' loading on a factor, including the size and sign of the loading, the investigator makes a determination as to what the underlying factor may represent.
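The first interpretation step can be partly delegated to SAS. The call below is a sketch, assuming the data set d1 from the example later in this note; the choice of four factors and a VARIMAX rotation is illustrative only.

```sas
/* FUZZ=0.30 prints loadings below 0.30 in absolute value as missing. */
/* REORDER lists the variables grouped by the factor on which each    */
/* has its highest absolute loading.                                  */
PROC FACTOR DATA=d1 METHOD=PRINCIPAL PRIORS=SMC NFACTORS=4
            ROTATE=VARIMAX FUZZ=0.30 REORDER;
RUN;
```

With both options the printed loading matrix tends to show one block of visible loadings per factor, which makes split loadings and variables with no visible loading easier to spot.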
Factor Analysis Decision Diagram

The Wechsler Intelligence Scale for Children (WISC-III) was designed as a test of general intelligence to provide estimates of the intellectual abilities of children aged between 6 and 16. The WISC-III consists of 13 subtests, each measuring a different facet of intelligence. The matrix of intercorrelations among the 13 subtests, which served as the input data, was obtained from the manual[5] and is shown in Table 1. Inspection of the correlation matrix shows that the correlations are substantial, indicating the presence of a substantial general factor.
Table 1. Correlation matrix for 13 subscales
Subscale          Inf  Sim  Ari  Voc  Com  Dig  PiC  Cod  PiA  Blo  Obj  Sym
Information
Similarities      .66
Arithmetic        .57  .55
Vocabulary        .70  .69  .54
Comprehension     .56  .59  .47  .64
Digit Span        .34  .34  .43  .35  .29
Pic. Completion   .47  .45  .39  .45  .38  .25
Coding Subscale   .21  .20  .27  .26  .25  .23  .18
Pic. Arrang.      .40  .39  .35  .40  .35  .20  .37  .28
Block Design      .48  .49  .52  .46  .40  .32  .52  .27  .41
Object Assembly   .41  .42  .39  .41  .34  .26  .49  .24  .37  .61
Symbol Search     .35  .35  .41  .35  .34  .28  .33  .53  .36  .45  .38
Mazes             .18  .18  .22  .17  .17  .14  .24  .15  .23  .31  .29  .24
PROC FACTOR can handle input data consisting of either a correlation matrix or the raw data matrix used to produce the correlation matrix. The correlation matrix can be a SAS data set generated by PROC CORR or a text file containing the lower triangle (including the main diagonal) of a correlation matrix. For our example, a text file of correlations is created and called WISC.DAT. The following SAS DATA step code defines the type of the input data file WISC.DAT as a correlation matrix, and labels its variables. The _TYPE_='CORR'; statement must be typed exactly as shown:
DATA d1 (TYPE=CORR);
   _TYPE_='CORR';
   INFILE 'wisc.dat' MISSOVER;
   INPUT inf sim ari voc com dig pic cod pia blo obj sym maz;
RUN;
The following SAS code calls the FACTOR procedure with some options. METHOD=P or METHOD=PRINCIPAL specifies principal-axis factoring as the method for extracting factors. This option in conjunction with PRIORS=SMC performs a principal factor analysis. The option ROTATE=PROMAX performs an oblique rotation after an orthogonal VARIMAX rotation. It is specified here because the hypothetical constructs that constitute human intelligence, which the WISC-III attempts to measure, are believed to be interrelated. The CORR option requests that the correlation matrix be printed, and the RES or RESIDUALS option requests that a residual correlation matrix be printed. The residual correlation matrix shows the difference between the observed correlation matrix and the predicted correlation matrix. If the retained factors are sufficient to explain the correlations among the observed variables, the residual correlation matrix is expected to approximate a null matrix (most values <= .10).
PROC FACTOR DATA=D1 METHOD=P PRIORS=SMC ROTATE=PROMAX SCREE
CORR RES;
RUN;
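When the raw scores are available, the correlation-type data set can be produced by PROC CORR instead of being typed in. The following is a sketch under assumptions: raw is a hypothetical data set with one row per child and one column per subtest, using the same variable names as the DATA step above.

```sas
/* raw is hypothetical: one observation per child, 13 subtest scores. */
/* OUTP= writes the Pearson correlations as a TYPE=CORR data set.     */
PROC CORR DATA=raw OUTP=d1 NOPRINT;
   VAR inf sim ari voc com dig pic cod pia blo obj sym maz;
RUN;

/* The resulting data set is read by PROC FACTOR as before. */
PROC FACTOR DATA=d1 METHOD=P PRIORS=SMC ROTATE=PROMAX SCREE CORR RES;
RUN;
```

A data set created this way also carries the number of observations, which a hand-typed correlation matrix does not.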
Table 2 shows the prior communality estimates for 13 subtests used in this
analysis. The squared multiple correlations (SMC), which are printed below,
represent the proportion of variance of each of the 13 subtests shared by all
remaining subtests. The subtest MAZES has the prior communality estimate of
0.132, which means that only 13% of the variance of the subtest MAZES is shared
by all other subtests, indicating that this subtest measures a somewhat
different construct than the other subtests. A small communality estimate
might indicate that the variable or item may need to be modified or even
dropped.
Table 2. Initial Communality Estimates
Initial Factor Method: Principal Factors
Prior Communality Estimates: SMC
    INFO       SIM     ARITH       VOC      COMP
0.594574  0.587543  0.481994  0.636296  0.473358
   DIGIT   PICTCOM    CODING   PICTARG
0.224104  0.385580  0.306120  0.287693
   BLOCK    OBJECT    SYMBOL     MAZES
0.533202  0.439176  0.422932  0.132220
Eigenvalues of the Reduced Correlation Matrix: Total = 5.50479208 Average = 0.42344554
The sum of all prior communality estimates, 5.505 in this example, is the estimate of the common variance among all subtests. This initial estimate of the common variance constitutes about 42% of the total variance present among all 13 subtests.
Table 3 shows the factor numbers and corresponding eigenvalues. According to the Kaiser and Guttman rule, only one factor can be retained because only the first factor has an eigenvalue greater than one. However, as suggested in the previous section, this criterion may be applicable only to principal component analysis, not common factor analysis. Two factors can be retained if the average eigenvalue (0.423) instead of 1.0 is used as the criterion. The authors of WISC-III retained all factors with positive eigenvalues and thus retained the first four factors. The fifth and following factors have negative eigenvalues, which may not be intuitively appealing just as a negative variance is not. This oddity occurs only in common factor analysis due to the restriction that the sum of eigenvalues be set equal to the estimated common variance, not the total variance.
Table 3. Eigenvalues of the Reduced Correlation Matrix
                  1        2        3        4        5
Eigenvalue   5.1046   0.6838   0.4021   0.1479  -0.0130
Difference   4.4208   0.2817   0.2542   0.1609   0.0094
Proportion   0.9273   0.1242   0.0731   0.0269  -0.0024
Cumulative   0.9273   1.0515   1.1246   1.1514   1.1491
                  6        7        8        9       10
Eigenvalue  -0.0224  -0.0569  -0.0782  -0.0848  -0.0897
Difference   0.0345   0.0213   0.0065   0.0049   0.0412
Proportion  -0.0041  -0.0103  -0.0142  -0.0154  -0.0163
Cumulative   1.1450   1.1347   1.1205   1.1051   1.0888
                 11       12       13
Eigenvalue  -0.1310  -0.1547  -0.2031
Difference   0.0237   0.0485
Proportion  -0.0238  -0.0281  -0.0369
Cumulative   1.0650   1.0369   1.0000
The scree plot seems to suggest the presence of a general factor, as predicted from the inspection of the correlation matrix. A large first eigenvalue (5.10) and a much smaller second eigenvalue (0.68) suggest the presence of a dominant global factor. Stretching it to the limit, one might argue that a secondary elbow occurs at the fifth factor, implying a four-factor solution. That is equivalent to retaining all factors with positive eigenvalues. Research has suggested that the structure of the Wechsler intelligence scales is hierarchical. That is, at the top of the hierarchy all subtests converge to a single general factor, below which are several less general factors defined by clusters of subtests. A four-factor solution is more interesting and meaningful than a single-factor solution for investigating the hierarchical structure of the WISC-III. The results presented in the following section are based on a four-factor solution, which was obtained by repeating the analysis with the NFACTORS=4 option, specifying that the first four factors be retained.
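A sketch of the rerun that retains four factors, assuming the same data set d1 and the same options as the earlier PROC FACTOR call, with only the NFACTORS= option added:

```sas
/* Same analysis as before, now retaining exactly four factors. */
PROC FACTOR DATA=d1 METHOD=P PRIORS=SMC NFACTORS=4 ROTATE=PROMAX
            SCREE CORR RES;
RUN;
```

Fixing the number of factors in the call keeps the retained solution from depending on the procedure's default retention criterion.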

Table 4 below shows the initial unrotated factor structure matrix, which consists of the correlations between the 13 subtests and the four retained factors. The current estimate of the common variance is now 6.338, which is somewhat larger than the initial estimate of 5.505.
Table 4. Initial Factor Pattern
FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO 0.76124 -0.26507 0.00573 -0.00419 INFORMATION
SIM 0.75825 -0.26807 0.00088 -0.01733 SIMILARITY
ARITH 0.70320 -0.04219 0.07006 0.21817 ARITHMETIC
VOC 0.77712 -0.29967 0.08268 -0.07819 VOCABULARY
COMP 0.67220 -0.21792 0.11383 0.09479 COMPREHENSION
DIGIT 0.45938 0.01293 0.10982 0.23284 DIGIT SPAN
PICTCOM 0.61799 0.06079 -0.23502 -0.05384 PICTURE COMPLETION
CODING 0.40429 0.33855 0.34093 -0.06015 CODING
PICTARG 0.54687 0.11799 -0.0165 -0.13620 PICTURE ARRANGEMENT
BLOCK 0.71609 0.21503 -0.2255 0.06332 BLOCK DESIGN
OBJECT 0.62675 0.21928 -0.2652 -0.01736 OBJECT ASSEMBLY
SYMBOL 0.57731 0.36078 0.23968 -0.03620 SYMBOL SEARCH
MAZES 0.32498 0.21379 -0.12221 -0.00324 MAZES
Variance explained by each factor
FACTOR1 FACTOR2 FACTOR3 FACTOR4
5.104620 0.683788 0.402128 0.147927
Final Communality Estimates: Total = 6.338464
The off-diagonal elements of the residual correlation matrix are all close to 0.01, indicating that the correlations among the 13 subtests can be reproduced fairly accurately from the retained factors. The root mean squared off-diagonal residual is 0.0178. The inspection of the partial correlation matrix yields similar results: the correlations among the 13 subtests after the retained factors are accounted for are all close to zero. The root mean squared partial correlation is 0.038, indicating that four latent factors can accurately account for the observed correlations among the 13 subtests.
The table shown below is the factor structure matrix after the VARIMAX rotation. The correlations greater than 0.30 are underlined. There are some split loadings where a variable is significantly (> 0.3) loaded on more than one factor. This matrix, however, is not interpreted because an oblique solution has been requested.
Table 5. Rotated Factor Pattern (VARIMAX)
FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO 0.71862 0.29392 0.12616 0.17630 INFORMATION
SIM 0.72023 0.29506 0.12237 0.16230 SIMILARITY
ARITH 0.49726 0.30656 0.23918 0.38771 ARITHMETIC
VOC 0.77718 0.23819 0.17933 0.11727 VOCABULARY
COMP 0.65565 0.19763 0.21399 0.08092 COMPREHENSION
DIGIT 0.29024 0.16907 0.20796 0.34843 DIGIT SPAN
PICTCOM 0.37579 0.53504 0.10572 0.07124 PICTURE COMPLETION
CODING 0.12040 0.14820 0.59510 0.08546 CODING
PICTARG 0.33269 0.37653 0.28170 0.00121 PICTURE ARRANGEMENT
BLOCK 0.32270 0.64662 0.21651 0.21154 BLOCK DESIGN
OBJECT 0.26569 0.63181 0.17377 0.10766 OBJECT ASSEMBLY
SYMBOL 0.21005 0.32244 0.59566 0.13894 SYMBOL SEARCH
MAZES 0.07226 0.36298 0.15838 0.06487 MAZES
Variance explained by each factor
FACTOR1 FACTOR2 FACTOR3 FACTOR4
2.891010 1.894832 1.110948 0.441675
Table 6 shown below is the factor structure matrix after the oblique PROMAX rotation, which allows the latent factors to be correlated with each other. The matrix of inter-factor correlations (Table 7) shows that the factors are substantially correlated with each other. The inter-factor correlations range between 0.44 and 0.65. If we submit these intercorrelated factors to a new factor analysis, we might be able to obtain a single second-order factor, which could correspond to the general intelligence or g factor in previous research. One downside of an oblique rotation method is that if the correlations among the factors are substantial, then it is sometimes difficult to distinguish among factors by examining the factor loadings. In such situations, you should investigate the factor pattern matrix, which is a matrix of the standardized coefficients for the regression of the observed variables on the factors.
Table 6. Factor Structure (Correlations)
FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO 0.80153 0.56064 0.33700 0.52105 INFORMATION
SIM 0.80059 0.55913 0.33257 0.50906 SIMILARITY
ARITH 0.65384 0.55813 0.42927 0.65702 ARITHMETIC
VOC 0.84027 0.53362 0.37803 0.48942 VOCABULARY
COMP 0.71732 0.45943 0.37569 0.41350 COMPREHENSION
DIGIT 0.40958 0.35214 0.32514 0.50255 DIGIT SPAN
PICTCOM 0.53937 0.64229 0.30602 0.37733 PICTURE COMPLETION
CODING 0.28294 0.32896 0.63030 0.31811 CODING
PICTARG 0.47527 0.51677 0.41891 0.30366 PICTURE ARRANGEMENT
BLOCK 0.56601 0.77315 0.44326 0.54029 BLOCK DESIGN
OBJECT 0.48561 0.71459 0.37858 0.41641 OBJECT ASSEMBLY
SYMBOL 0.42630 0.52381 0.69512 0.44612 SYMBOL SEARCH
MAZES 0.21660 0.39830 0.25905 0.22942 MAZES
Table 7. Inter-factor Correlations
FACTOR1 FACTOR2 FACTOR3 FACTOR4
FACTOR1 1.00000 0.64770 0.43503 0.58664
FACTOR2 0.64770 1.00000 0.52336 0.57564
FACTOR3 0.43503 0.52336 1.00000 0.47436
FACTOR4 0.58664 0.57564 0.47436 1.00000
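The second-order analysis mentioned above could be sketched as follows, by typing the inter-factor correlations from Table 7 into a correlation-type data set. The data set name fcorr is hypothetical, and retaining one factor with SMC priors is an illustrative choice, not something the original analysis reports.

```sas
/* Inter-factor correlations from Table 7, entered as a full matrix. */
/* fcorr is a hypothetical data set name.                            */
DATA fcorr (TYPE=CORR);
   _TYPE_='CORR';
   INPUT _NAME_ $ factor1 factor2 factor3 factor4;
   DATALINES;
FACTOR1 1.00000 0.64770 0.43503 0.58664
FACTOR2 0.64770 1.00000 0.52336 0.57564
FACTOR3 0.43503 0.52336 1.00000 0.47436
FACTOR4 0.58664 0.57564 0.47436 1.00000
;
RUN;

/* Extract a single second-order factor from the four first-order factors. */
PROC FACTOR DATA=fcorr METHOD=P PRIORS=SMC NFACTORS=1;
RUN;
```

The loadings of the four first-order factors on the single extracted factor would then indicate how strongly each is related to a possible general factor.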
Table 8 is the factor pattern matrix, which will be used to interpret the meaning of the factors. The values in this matrix are the standardized regression coefficients, which are functionally related to the part or semipartial correlation between a variable and the factor when other factors are held constant. Therefore, a value in this matrix represents the individual and nonredundant contribution that each factor is making to predict a subtest. The regression coefficients greater than 0.30 are underlined to assist the interpretation.
Table 8. Rotated Factor Pattern (Standardized Regression Coefficients)
FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO 0.73663 0.06911 -0.0553 0.07540 INFORMATION
SIM 0.74378 0.07445 -0.05694 0.05688 SIMILARITY
ARITH 0.35704 0.08393 0.05243 0.37438 ARITHMETIC
VOC 0.85010 -0.02674 0.02492 -0.00572 VOCABULARY
COMP 0.71870 -0.0391 0.09895 -0.0325 COMPREHENSION
DIGIT 0.16057 -0.01159 0.08321 0.37555 DIGIT SPAN
PICTCOM 0.24101 0.54702 -0.06151 -0.04977 PICTURE COMPLETION
CODING 0.00651 -0.01816 0.62315 0.02916 CODING
PICTARG 0.25467 0.31837 0.20034 -0.12403 PICTURE ARRANGEMENT
BLOCK 0.06661 0.65410 0.01652 0.11685 BLOCK DESIGN
OBJECT 0.04111 0.69028 0.00237 -0.00618 OBJECT ASSEMBLY
SYMBOL 0.03508 0.17311 0.56088 0.05983 SYMBOL SEARCH
MAZES 0.08719 0.40886 0.07943 0.00754 MAZES
The subtests significantly loaded on the first factor are the Information, Similarities, Arithmetic, Vocabulary, and Comprehension subtests. These are the subtests that are orally presented and require verbal responses. Therefore, this factor may be named "Verbal Comprehension." The second factor is identified by the following subtests: Picture Completion, Picture Arrangement, Block Design, and Object Assembly. Mazes also loads above 0.30 on this factor (0.41). All of these subtests have a geometric or configural component: they measure skills that require the manual manipulation or organization of pictures, objects, blocks, and the like. Therefore, this factor may be named "Perceptual Organization." The two subtests loaded on the third factor are the Coding and Symbol Search subtests. Both measure the speed of a simple coding or searching process. Therefore, this factor can be named "Processing Speed." Finally, the Arithmetic and Digit Span subtests identify the fourth factor. Both subtests deal with arithmetic problems or numbers, so this factor can be named "Numerical Ability." The last two factors are doublets since they are identified by only two subtests each. Therefore, they are conceptually weak compared to the first two factors, and more subtests may need to be added to these factors to make them conceptually sound.
It is possible to estimate the factor scores, or a subject's relative standing on each of the factors, if the original subject-by-variable raw data matrix is available. To compute the factor scores for all subjects on all factors, use the following SAS code:
/* SCORE requests the factor scoring coefficients that PROC SCORE reads from fact */
PROC FACTOR DATA=raw {other options here} SCORE OUTSTAT=fact;
RUN;
PROC SCORE DATA=raw SCORE=fact OUT=scores;
RUN;
where raw is the original data matrix, fact is the matrix of factor scoring coefficients, and scores is the matrix of factor scores for subjects.
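As an alternative sketch, PROC FACTOR can write the estimated factor scores itself through the OUT= option, provided the input is raw data and NFACTORS= is specified. The extraction and rotation options below simply mirror the example in this note and are assumptions, as is the data set name raw.

```sas
/* raw must contain the subject-by-variable data, not correlations. */
/* NFACTORS= is required with OUT=; scores receives the original    */
/* variables plus Factor1-Factor4.                                  */
PROC FACTOR DATA=raw METHOD=P PRIORS=SMC NFACTORS=4 ROTATE=PROMAX
            OUT=scores;
RUN;
```

The two-step route through PROC SCORE remains useful when the scoring coefficients need to be applied to a different data set than the one used to estimate them.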
http://www.utexas.edu/cc/stat/packs/