  June 2020
Factor Analysis with SPSS

What is a Common Factor? • It is an abstraction, a hypothetical construct that affects at least two of our measurement variables. • We want to estimate the common factors that contribute to the variance in our variables. • Is this an act of discovery or an act of invention?

What is a Unique Factor? • It is a factor that contributes to the variance in only one variable. • There is one unique factor for each variable. • The unique factors are unrelated to one another and unrelated to the common factors. • We want to exclude these unique factors from our solution.

Iterated Principal Factors Analysis • The most common type of FA. • Also known as principal axis FA. • We eliminate the unique variance by replacing, on the main diagonal of the correlation matrix, 1’s with estimates of communalities. • Initial estimate of communality = R² between one variable and all others.

Let’s Do It • Using the beer data, change the extraction method to principal axis.

Look at the Initial Communalities • They were all 1’s for our PCA. • They sum to 5.675. • We have eliminated 7 – 5.675 = 1.325 units of unique variance.

Communalities

            Initial    Extraction
COST        .738       .745
SIZE        .912       .914
ALCOHOL     .866       .866
REPUTAT     .499       .385
COLOR       .922       .892
AROMA       .857       .896
TASTE       .881       .902

Extraction Method: Principal Axis Factoring.
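The arithmetic behind these communalities can be checked with a minimal sketch. The variable order (COST, SIZE, ALCOHOL, REPUTAT, COLOR, AROMA, TASTE) is an assumption, inferred from the order used in the other SPSS tables; the function name is hypothetical.

```python
# Communalities as printed in the SPSS output.
# ASSUMPTION: variable order is COST, SIZE, ALCOHOL, REPUTAT, COLOR, AROMA, TASTE.
initial = [.738, .912, .866, .499, .922, .857, .881]
extraction = [.745, .914, .866, .385, .892, .896, .902]

# Each standardized variable contributes one unit of variance.
n_variables = len(initial)

def summarize(communalities, n_vars):
    """Return (common variance, unique variance, proportion common)."""
    common = sum(communalities)
    unique = n_vars - common
    return common, unique, common / n_vars

init_common, init_unique, init_prop = summarize(initial, n_variables)
final_common, final_unique, final_prop = summarize(extraction, n_variables)

print(f"Initial:    common = {init_common:.3f}, unique = {init_unique:.3f}")
print(f"Extraction: common = {final_common:.3f}, unique = {final_unique:.3f}, "
      f"proportion common = {final_prop:.1%}")
```

The Initial column gives the 5.675 and 1.325 quoted above; the Extraction column holds the communalities after iteration.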

Iterate! • Using the estimated communalities, obtain a solution. • Take the communalities from the first solution and insert them into the main diagonal of the correlation matrix. • Solve again. • Take communalities from this second solution and insert into correlation matrix.

• Solve again. • Repeat this, over and over, until the changes in communalities from one iteration to the next are trivial. • Our final communalities sum to 5.6. • After excluding 1.4 units of unique variance, we have extracted 5.6 units of common variance. • That is 5.6 / 7 = 80% of the total variance in our seven variables.

• We have packaged those 5.6 units of common variance into two factors:

Total Variance Explained

         Extraction Sums of Squared Loadings       Rotation Sums of Squared Loadings
Factor   Total    % of Variance   Cumulative %     Total    % of Variance   Cumulative %
1        3.123    44.620          44.620           2.879    41.131          41.131
2        2.478    35.396          80.016           2.722    38.885          80.016

Extraction Method: Principal Axis Factoring.

Our Rotated Factor Loadings • Not much different from those for the PCA.

Rotated Factor Matrix (a)

            Factor 1     Factor 2
TASTE       .950         -2.17E-02
AROMA       .946         2.106E-02
COLOR       .942         6.771E-02
SIZE        7.337E-02    .953
ALCOHOL     2.974E-02    .930
COST        -4.64E-02    .862
REPUTAT     -.431        -.447

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 3 iterations.
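With orthogonal factors, the rotated loadings tie back to the earlier tables: squaring and summing down a column gives that factor's rotation sum of squared loadings, and squaring and summing across a row gives that variable's communality. A minimal sketch using the printed loadings (the names `loadings`, `ssl`, and `communality` are illustrative only):

```python
# Rotated (Varimax) loadings as printed: (Factor 1, Factor 2).
loadings = {
    "TASTE":   (.950, -2.17e-02),
    "AROMA":   (.946, 2.106e-02),
    "COLOR":   (.942, 6.771e-02),
    "SIZE":    (7.337e-02, .953),
    "ALCOHOL": (2.974e-02, .930),
    "COST":    (-4.64e-02, .862),
    "REPUTAT": (-.431, -.447),
}

# Column sums of squared loadings: variance packaged into each factor.
ssl = [sum(row[f] ** 2 for row in loadings.values()) for f in range(2)]

# Row sums of squared loadings: each variable's communality.
communality = {name: sum(a ** 2 for a in row) for name, row in loadings.items()}

print("Sums of squared loadings:", [round(s, 3) for s in ssl])
print("Total common variance:", round(sum(ssl), 3))
for name, h2 in communality.items():
    print(f"{name:8s} communality = {h2:.3f}")
```

Because the printed loadings are rounded, the results agree with the SPSS tables only to about the third decimal.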

Reproduced and Residual Correlation Matrices • Correlations between variables result from their sharing common underlying factors. • Try to reproduce the original correlation matrix from the correlations between factors and variables (the loadings). • The difference between the reproduced correlation matrix and the original correlation matrix is the residual matrix.

• We want these residuals to be small. • Check “Reproduced” under “Descriptives” in the Factor Analysis dialogue box, to get both of these matrices:

Reproduced Correlations

Reproduced Correlation
           COST        SIZE        ALCOHOL     REPUTAT     COLOR       AROMA       TASTE
COST       .745(b)     .818        .800        -.365       1.467E-02   -2.57E-02   -6.28E-02
SIZE       .818        .914(b)     .889        -.458       .134        8.950E-02   4.899E-02
ALCOHOL    .800        .889        .866(b)     -.428       9.100E-02   4.773E-02   8.064E-03
REPUTAT    -.365       -.458       -.428       .385(b)     -.436       -.417       -.399
COLOR      1.467E-02   .134        9.100E-02   -.436       .892(b)     .893        .893
AROMA      -2.57E-02   8.950E-02   4.773E-02   -.417       .893        .896(b)     .898
TASTE      -6.28E-02   4.899E-02   8.064E-03   -.399       .893        .898        .902(b)

Residual (a)
           COST        SIZE        ALCOHOL     REPUTAT     COLOR       AROMA       TASTE
COST                   1.350E-02   -3.295E-02  -4.02E-02   3.328E-03   -2.05E-02   -1.16E-03
SIZE       1.350E-02               1.495E-02   6.527E-02   4.528E-02   8.097E-03   -2.32E-02
ALCOHOL    -3.29E-02   1.495E-02               -3.47E-02   -1.88E-02   -3.54E-03   3.726E-03
REPUTAT    -4.02E-02   6.527E-02   -3.471E-02              6.415E-02   -2.59E-02   -4.38E-02
COLOR      3.328E-03   4.528E-02   -1.884E-02  6.415E-02               1.557E-02   1.003E-02
AROMA      -2.05E-02   8.097E-03   -3.545E-03  -2.59E-02   1.557E-02               -2.81E-02
TASTE      -1.16E-03   -2.32E-02   3.726E-03   -4.38E-02   1.003E-02   -2.81E-02

Extraction Method: Principal Axis Factoring. a. Residuals are computed between observed and reproduced correlations. There are 2 (9.0%) nonredundant residuals with absolute values greater than 0.05. b. Reproduced communalities
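With orthogonal factors, a reproduced correlation is just the sum, across factors, of the products of the two variables' loadings. A minimal sketch, assuming the rotated (Varimax) loadings printed earlier; the helper name `reproduced_r` is hypothetical, and results match the SPSS table only up to rounding:

```python
# Rotated (Varimax) loadings as printed: (Factor 1, Factor 2).
loadings = {
    "TASTE":   (.950, -2.17e-02),
    "AROMA":   (.946, 2.106e-02),
    "COLOR":   (.942, 6.771e-02),
    "SIZE":    (7.337e-02, .953),
    "ALCOHOL": (2.974e-02, .930),
    "COST":    (-4.64e-02, .862),
    "REPUTAT": (-.431, -.447),
}

def reproduced_r(var_a, var_b):
    """Correlation implied by the factor solution (orthogonal factors assumed)."""
    return sum(a * b for a, b in zip(loadings[var_a], loadings[var_b]))

# Reproduced correlation between COST and SIZE (the SPSS table shows .818).
r_hat = reproduced_r("COST", "SIZE")

# The printed residual is observed minus reproduced, so the observed r follows.
printed_residual = 1.350e-02
observed_r = r_hat + printed_residual

print(f"Reproduced r(COST, SIZE) = {r_hat:.3f}")
print(f"Implied observed r       = {observed_r:.3f}")

# A variable's reproduced "correlation" with itself is its communality.
print(f"Reproduced communality of REPUTAT = {reproduced_r('REPUTAT', 'REPUTAT'):.3f}")
```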

Nonorthogonal (Oblique) Rotation • The axes will not be perpendicular; the factors will be correlated with one another. • The factor loadings (in the pattern matrix) will no longer be equal to the correlation between each factor and each variable. • They will still equal the beta weights, the A’s in $X_j = A_{1j}F_1 + A_{2j}F_2 + \dots + A_{mj}F_m + U_j$

• Oblique solutions make me uncomfortable, but I did one just for you: a Promax rotation. • First a Varimax rotation is performed. • Then the axes are rotated obliquely. • Here are the beta weights, in the “Pattern Matrix,” the correlations in the “Structure Matrix,” and the correlation between factors:

Beta Weights: Pattern Matrix (a)

            Factor 1     Factor 2
TASTE       .955         -7.14E-02
AROMA       .949         -2.83E-02
COLOR       .943         1.877E-02
SIZE        2.200E-02    .953
ALCOHOL     -2.05E-02    .932
COST        -9.33E-02    .868
REPUTAT     -.408        -.426

Extraction Method: Principal Axis Factoring. Rotation Method: Promax with Kaiser Normalization. a. Rotation converged in 3 iterations.

Correlations: Structure Matrix

            Factor 1     Factor 2
TASTE       .947         .030
AROMA       .946         .072
COLOR       .945         .118
SIZE        .123         .956
ALCOHOL     .078         .930
COST        -.002        .858
REPUTAT     -.453        -.469

Extraction Method: Principal Axis Factoring. Rotation Method: Promax with Kaiser Normalization.

Factor Correlation Matrix

Factor      1            2
1           1.000        .106
2           .106         1.000

Extraction Method: Principal Axis Factoring. Rotation Method: Promax with Kaiser Normalization.
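The three oblique tables are linked: the structure matrix equals the pattern matrix post-multiplied by the factor correlation matrix. With two factors correlated .106, that reduces to the two-line function below. This is a sketch from the printed (rounded) values, and the function name `structure_from_pattern` is hypothetical.

```python
# Pattern matrix (beta weights) as printed: (Factor 1, Factor 2).
pattern = {
    "TASTE":   (.955, -7.14e-02),
    "AROMA":   (.949, -2.83e-02),
    "COLOR":   (.943, 1.877e-02),
    "SIZE":    (2.200e-02, .953),
    "ALCOHOL": (-2.05e-02, .932),
    "COST":    (-9.33e-02, .868),
    "REPUTAT": (-.408, -.426),
}

# Correlation between the two Promax factors, from the Factor Correlation Matrix.
phi = .106

def structure_from_pattern(p1, p2, r12):
    """Variable-factor correlations for a two-factor oblique solution."""
    s1 = p1 + r12 * p2   # direct path plus the path through the other factor
    s2 = p2 + r12 * p1
    return s1, s2

structure = {name: structure_from_pattern(p1, p2, phi)
             for name, (p1, p2) in pattern.items()}

for name, (s1, s2) in structure.items():
    print(f"{name:8s} {s1:7.3f} {s2:7.3f}")
```

Setting `phi` to 0 returns the pattern weights unchanged, which is why loadings and correlations coincide in an orthogonal solution.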

Exact Factor Scores • You can compute, for each subject, estimated factor scores. • Multiply each standardized variable score by the corresponding standardized scoring coefficient. • For our first subject, Factor 1 = (-.294)(.41) + (.955)(.40) + (-.036)(.22) + (1.057)(-.07) + (.712)(.04) + (1.219)(.03) + (-1.14)(.01) = 0.23.
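The calculation for the first subject is a weighted sum and can be sketched as follows. The z scores and the two-decimal scoring coefficients are copied from the expression above; pairing them with TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT (in that order) is an assumption.

```python
# Standardized scores for the first subject and the Factor 1 scoring
# coefficients (rounded to two decimals), in the order used in the slide.
# ASSUMPTION: that order is TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT.
z_scores = [-.294, .955, -.036, 1.057, .712, 1.219, -1.14]
coefficients = [.41, .40, .22, -.07, .04, .03, .01]

# Estimated factor score = sum of (z score x scoring coefficient).
factor1 = sum(z * b for z, b in zip(z_scores, coefficients))

print(f"Estimated Factor 1 score for subject 1: {factor1:.2f}")
```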

• SPSS will not only give you the scoring coefficients, but also compute the estimated factor scores for you. • In the Factor Analysis window, click Scores and select Save As Variables, Regression, Display Factor Score Coefficient Matrix.

• Here are the scoring coefficients:

Factor Score Coefficient Matrix

            Factor 1     Factor 2
COST        .026         .157
SIZE        -.066        .610
ALCOHOL     .036         .251
REPUTAT     .011         -.042
COLOR       .225         -.201
AROMA       .398         .026
TASTE       .409         .110

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization. Factor Scores Method: Regression.

• Look back at the data sheet and you will see the estimated factor scores.

Use the Factor Scores • Let us see how the factor scores are related to the SES and Group variables. • Use multiple regression to predict SES from the factor scores.

Model Summary

Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .988(a)    .976        .976                 .385

a. Predictors: (Constant), FAC2_1, FAC1_1

ANOVA (b)

Model 1       Sum of Squares    df     Mean Square    F           Sig.
Regression    1320.821          2      660.410        4453.479    .000(a)
Residual      32.179            217    .148
Total         1353.000          219

a. Predictors: (Constant), FAC2_1, FAC1_1 b. Dependent Variable: SES

Coefficients (a)

                                                     Correlations
Model 1       Standardized Beta    t          Sig.    Zero-order    Part
(Constant)                         134.810    .000
FAC1_1        .681                 65.027     .000    .679          .681
FAC2_1        -.718                -68.581    .000    -.716         -.718

a. Dependent Variable: SES
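The summary statistics in this output follow from the ANOVA sums of squares and from the betas and zero-order correlations. A minimal sketch of that arithmetic, using only the printed values; reading .679 and -.716 as the zero-order correlations (and .681, -.718 as both the betas and the part correlations) is an assumption about how the flattened output lines up.

```python
import math

# Values printed in the ANOVA table.
ss_regression, ss_residual, ss_total = 1320.821, 32.179, 1353.000
df_regression, df_residual, df_total = 2, 217, 219

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * df_total / df_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

# R-squared also equals the sum of beta x zero-order r across predictors.
# ASSUMPTION: .679 and -.716 are the zero-order correlations.
betas = [.681, -.718]
zero_order = [.679, -.716]
r_squared_from_betas = sum(b * r for b, r in zip(betas, zero_order))

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_ratio:.1f}")
print(f"Std. error of estimate = {std_error_estimate:.3f}")
print(f"R^2 from betas and zero-order r = {r_squared_from_betas:.3f}")
```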

• Also, use independent t to compare groups on mean factor scores.

Group Statistics

           GROUP    N      Mean         Std. Deviation    Std. Error Mean
FAC1_1     1        121    -.4198775    .97383364         .08853033
           2        99     .5131836     .71714232         .07207552
FAC2_1     1        121    .5620465     .88340921         .08030993
           2        99     -.6869457    .55529938         .05580969
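A pooled-variance independent t can be computed directly from the group statistics. This is a sketch of the textbook formula applied to the printed means, standard deviations, and sample sizes; it is not a copy of the SPSS test output, and the function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t assuming equal variances; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Group statistics as printed (Group 1 first, then Group 2).
t_fac1, df_fac1 = pooled_t(-.4198775, .97383364, 121, .5131836, .71714232, 99)
t_fac2, df_fac2 = pooled_t(.5620465, .88340921, 121, -.6869457, .55529938, 99)

print(f"FAC1_1: t({df_fac1}) = {t_fac1:.2f}")
print(f"FAC2_1: t({df_fac2}) = {t_fac2:.2f}")
```

The standard deviations differ noticeably between groups, so the separate-variances (Welch) t may be the safer choice; the pooled version is shown only because it is the simplest to follow.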

[Independent Samples Test table: Levene's Test for Equality of Variances (F, Sig.) and t-test for Equality of Means (t, df, Sig. (2-tailed), 95% Confidence Interval of the Difference), with equal variances assumed and not assumed]


Unit-Weighted Factor Scores • Define subscale 1 as simple sum or mean of scores on all items loading well (absolute value > .4) on Factor 1. • Likewise for Factor 2, etc. • Suzie Cue’s answers are Color = 80, Taste = 100, Aroma = 40, Size = 30, Alcohol = 75, Cost = 60, Reputation = 10.

• Aesthetic Quality = 80+100+40-10 = 210 • Cheap Drunk = 30+75+60-10 = 155
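Suzie Cue's two subscale scores can be sketched as sums with weights of +1 or -1. The weights below follow the sign of each item's loading (Reputation loads negatively on both factors); the dictionary and function names are illustrative only.

```python
# Suzie Cue's answers.
answers = {
    "Color": 80, "Taste": 100, "Aroma": 40,
    "Size": 30, "Alcohol": 75, "Cost": 60, "Reputation": 10,
}

# Unit weights: +1 or -1 for items loading well on the factor, by sign of loading.
aesthetic_quality_weights = {"Color": 1, "Taste": 1, "Aroma": 1, "Reputation": -1}
cheap_drunk_weights = {"Size": 1, "Alcohol": 1, "Cost": 1, "Reputation": -1}

def unit_weighted_score(scores, weights):
    """Sum the item scores, each multiplied by its +1/-1 unit weight."""
    return sum(weight * scores[item] for item, weight in weights.items())

aesthetic_quality = unit_weighted_score(answers, aesthetic_quality_weights)
cheap_drunk = unit_weighted_score(answers, cheap_drunk_weights)

print("Aesthetic Quality =", aesthetic_quality)
print("Cheap Drunk =", cheap_drunk)
```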

• It may be better to use factor scoring coefficients (rather than loadings) to determine unit weights. • Grice (2001) evaluated several techniques and found the best to be assigning a unit weight of ± 1 to each variable that has a scoring coefficient at least 1/3 as large as the largest for that factor. • Using this rule, we would not include Reputation on either subscale and would drop Cost from the second subscale.
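The one-third rule described above can be sketched against the factor score coefficients from the Varimax solution. This is one reading of the rule as stated here (compare absolute values, keep the sign as the unit weight), not Grice's own code; the function name `unit_weights` is hypothetical.

```python
# Factor score coefficients as printed: (Factor 1, Factor 2).
scoring_coefficients = {
    "COST":    (.026, .157),
    "SIZE":    (-.066, .610),
    "ALCOHOL": (.036, .251),
    "REPUTAT": (.011, -.042),
    "COLOR":   (.225, -.201),
    "AROMA":   (.398, .026),
    "TASTE":   (.409, .110),
}

def unit_weights(coefficients, factor_index):
    """Assign +1/-1 to variables whose coefficient is at least 1/3 of the largest."""
    largest = max(abs(row[factor_index]) for row in coefficients.values())
    cutoff = largest / 3
    return {
        name: (1 if row[factor_index] > 0 else -1)
        for name, row in coefficients.items()
        if abs(row[factor_index]) >= cutoff
    }

factor1_weights = unit_weights(scoring_coefficients, 0)
factor2_weights = unit_weights(scoring_coefficients, 1)

print("Subscale 1:", factor1_weights)
print("Subscale 2:", factor2_weights)
```

With these coefficients, Reputation falls below the cutoff on both factors and Cost falls below it on the second.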

Item Analysis and Cronbach’s Alpha • Are our subscales reliable? • Test-Retest reliability • Cronbach’s Alpha – internal consistency – Mean split-half reliability – With correction for attenuation – Is a conservative estimate of reliability

• AQ = Color + Taste + Aroma – Reputation • Must negatively weight Reputation prior to item analysis. • Transform, Compute, NegRep = -1∗Reputat.

• Analyze, Scale, Reliability Analysis

• Statistics • Scale if item deleted.

• Continue, OK

• Shoot for an alpha of at least .70 for research instruments.

• Note that deletion of the Reputation item would increase alpha to .96.
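For readers who want to see what Reliability Analysis computes, here is a minimal sketch of Cronbach's alpha using the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The five respondents' scores are made-up illustrative numbers, not the beer data, and the function name is hypothetical. Any negatively keyed item (such as Reputation) would need to be reflected before being passed in.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists (same respondents)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's scale total
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# HYPOTHETICAL scores for five respondents on three items (not the real data).
color = [80, 60, 90, 40, 70]
taste = [100, 70, 85, 45, 65]
aroma = [40, 55, 80, 35, 75]

alpha = cronbach_alpha([color, taste, aroma])
print(f"Cronbach's alpha = {alpha:.2f}")
```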

Comparing Two Groups’ Factor Structure • Eyeball Test – Same number of well defined factors in both groups? – Same variables load well on same factors in both groups?

• Cattell’s Salient Similarity Index – Factors (one from one group, one from the other group) are compared in terms of similarity of loadings. – Summary statistic, s, can be transformed to a p value testing the null that the factors are not related to one another. – See the handout for details.

• Pearson r – Just correlate the loadings for one factor in one group with those for the corresponding factor in the other group. – If there are many small loadings, r may be large due to the factors being similar on small loadings despite lack of similarity on the larger loadings.

• Cross-Scoring – Obtain scoring coefficients for each group. – For each group, compute factor scores using coefficients obtained from the analysis for that same group (SG) and using coefficients obtained from the analysis for the other group (OG). – Correlate SG factor scores with OG factor scores.

Required Number of Subjects and Variables • Rules of Thumb (not very useful) – 100 or more subjects. – at least 10 times as many subjects as you have variables. – as many subjects as you can, the more the better.

• It depends – see the references in the handout.

• Start out with at least 6 variables per expected factor. • Each factor should have at least 3 variables that load well. • If loadings are low, need at least 10 variables per factor. • Need at least as many subjects as variables. The more of each, the better. • When there are overlapping factors (variables loading well on more than one factor), need more subjects than when structure is simple.

• If communalities are low, need more subjects. • If communalities are high (> .6), you can get by with fewer than 100 subjects. • With moderate communalities (.5), need 100-200 subjects. • With low communalities and only 3-4 high loadings per factor, need over 300 subjects. • With low communalities and poorly defined factors, need over 500 subjects.

What I Have Not Covered Today

• LOTS. • For a general introduction to measurement (reliability and validity), see http://core.ecu.edu/psyc/wuenschk/docs2210/R

Practice Exercises

• Animal Rights, Ethical Ideology, and Misanthro • Rating Characteristics of Criminal Defendants

