Principal Components Analysis with SPSS. Biostatistics Consulting Center, 蔡培癸
Download The Instructional Documents
• Point your browser to http://core.ecu.edu/psyc/wuenschk/SPSS/SPS . • Click on Principal Components Analysis . • Save, Desktop, Save. • Do the same for Factor Analysis .
When to Use PCA • You have a set of p continuous variables. • You want to repackage their variance into m components. • You will usually want m to be < p, but not always.
Components and Variables • Each component is a weighted linear combination of the variables:
Ci = Wi1X1 + Wi2X2 + … + WipXp
• Each variable is a weighted linear combination of the components:
Xj = A1jC1 + A2jC2 + … + AmjCm
Factors and Variables • In Factor Analysis, we exclude from the solution any variance that is unique, not shared by the variables.
Xj = A1jF1 + A2jF2 + … + AmjFm + Uj
• Uj is the unique variance for Xj.
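These two models are easy to express numerically. A minimal NumPy sketch (not part of the original slides; the loading matrix A and component scores C are made-up illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 1000, 7, 2          # cases, variables, components (illustrative sizes)

A = rng.normal(size=(p, m))   # hypothetical loading matrix (the A weights)
C = rng.normal(size=(n, m))   # component scores, one column per component

# PCA model: Xj = A1j*C1 + A2j*C2 + ... + Amj*Cm
X_pca = C @ A.T

# FA model adds a unique term Uj for each variable:
# Xj = A1j*F1 + ... + Amj*Fm + Uj
U = rng.normal(size=(n, p))
X_fa = C @ A.T + U
```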
Goals of PCA and FA • Data reduction. • Discover and summarize the pattern of intercorrelations among variables. • Test a theory about the latent variables underlying a set of measurement variables. • Construct a test instrument. • There are many other uses of PCA and FA.
Data Reduction • Ossenkopp and Mazmanian (Physiology and Behavior, 34: 935-941). • 19 behavioral and physiological variables. • A single criterion variable: physiological response to four hours of cold-restraint. • Extracted five factors. • Used the five factors in a multiple regression model for predicting the criterion.
Exploratory Factor Analysis • Want to discover the pattern of intercorrelations among variables. • Wilt et al., 2005 (thesis). • Variables are items on the SOIS at ECU. • Found two factors, one evaluative, one on difficulty of course. • Compared FTF students to DE students, on structure and means.
Confirmatory Factor Analysis • Have a theory regarding the factor structure for a set of variables. • Want to confirm that the theory describes the observed intercorrelations well. • Thurstone: Intelligence consists of seven independent factors rather than one global factor.
Construct Test Instrument • Write a large set of items designed to test the constructs of interest. • Administer the survey to a sample of persons from the target population. • Use FA to help select those items that will be used to measure each of the constructs of interest. • Use Cronbach alpha to check reliability of resulting scales.
An Unusual Use of PCA • Poulson, Braithwaite, Brondino, and Wuensch (1997, Journal of Social Behavior and Personality, 12, 743-758).
• Simulated jury trial, seemingly insane defendant killed a man. • Criterion variable = recommended verdict – Guilty – Guilty But Mentally Ill – Not Guilty By Reason of Insanity.
• Predictor variables = jurors’ scores on 8 scales. • Discriminant function analysis. • Problem with multicollinearity. • Used PCA to extract eight orthogonal components. • Predicted recommended verdict from these 8 components. • Transformed results back to the original scales.
A Simple, Contrived Example • Consumers rate importance of seven characteristics of beer. – low Cost – high Size of bottle – high Alcohol content – Reputation of brand – Color – Aroma – Taste
• Download FACTBEER.SAV from http://core.ecu.edu/psyc/wuenschk/SPSS/SPS . • Analyze, Data Reduction, Factor. • Scoot beer variables into box.
• Click Descriptives and then check Initial Solution, Coefficients, and KMO and Bartlett’s Test of Sphericity. Click Continue.
• Click Extraction and then select Principal Components, Correlation Matrix, Unrotated Factor Solution, Scree Plot, and Eigenvalues Over 1. Click Continue.
• Click Rotation. Select Varimax and Rotated Solution. Click Continue.
• Click Options. Select Exclude Cases Listwise and Sorted By Size. Click Continue.
• Click OK, and SPSS completes the Principal Components Analysis.
Checking for Unique Variables • Check the correlation matrix. • If there are any variables not well correlated with some of the others, you might as well delete them. • Bartlett’s test of sphericity tests the null hypothesis that the correlation matrix is an identity matrix, but it does not help identify individual variables that are not well correlated with others.
• For each variable, check R2 between it and the remaining variables. • Look at partial correlations – variables with large partial correlations share variance with one another but not with the remaining variables – this is problematic. • Kaiser’s MSA will tell you, for each variable, how much of this problem exists. • The smaller the MSA, the greater the problem. • An MSA of .9 is marvelous, .5 miserable.
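SPSS does not show these per-variable diagnostics, but the computation is straightforward from the correlation matrix. A minimal sketch in NumPy, assuming you already have the correlation matrix R (the function name msa_per_variable is ours, not from any package):

```python
import numpy as np

def msa_per_variable(R):
    """Kaiser's measure of sampling adequacy for each variable.

    Partial correlations come from the inverse of the correlation
    matrix R (the anti-image approach)."""
    R = np.asarray(R, dtype=float)
    Rinv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
    P = -Rinv / scale                 # partial correlations (off-diagonal)
    r2 = R ** 2
    p2 = P ** 2
    np.fill_diagonal(r2, 0.0)
    np.fill_diagonal(p2, 0.0)
    # MSA_j = (sum of squared r's) / (sum of squared r's + sum of squared partials)
    return r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))

# Example: three variables, all pairwise r = .5
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
msa = msa_per_variable(R)
```

The smaller a variable's MSA, the more its shared variance is tied up in partial correlations rather than plain correlations.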
• Use SAS to get the partial correlations and individual MSAs. • SPSS only gives an overall MSA, which is of no use in identifying problematic variables.

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .665
Bartlett's Test of Sphericity: Approx. Chi-Square = 1637.9, df = 21, Sig. = .000
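Bartlett's statistic can be reproduced from the correlation matrix. A sketch using the usual chi-square approximation, -[(n - 1) - (2p + 5)/6]·ln|R| with df = p(p - 1)/2 (the sample size n below is made up for illustration):

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity: H0 is that R is an identity matrix.

    Returns (statistic, df) using the common chi-square approximation."""
    p = R.shape[0]
    sign, logdet = np.linalg.slogdet(R)
    chi2 = -((n - 1) - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# An identity matrix yields a statistic of 0: no shared variance to analyze
chi2, df = bartlett_sphericity(np.eye(7), n=250)
```

SPSS reports the p-value for this statistic against the chi-square distribution with the given df (21 for seven variables, as in the table above).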
Extracting Principal Components • From p variables we can extract p components. • Each of p eigenvalues represents the amount of standardized variance that has been captured by one component. • The first component accounts for the largest possible amount of variance. • The second captures as much as possible of what is left over, and so on. • Each is orthogonal to the others.
• Each variable has standardized variance = 1. • The total standardized variance in the p variables = p. • The sum of the m = p eigenvalues = p. • All of the variance is extracted. • For each component, the proportion of variance extracted = eigenvalue / p.
• For our beer data, here are the eigenvalues and proportions of variance for the seven components:

Initial Eigenvalues
Component   Total   % of Variance   Cumulative %
1           3.313        47.327         47.327
2           2.616        37.369         84.696
3            .575         8.209         92.905
4            .240         3.427         96.332
5            .134         1.921         98.252
6            .09          1.221         99.473
7            .04           .527        100.000
Extraction Method: Principal Component Analysis.
How Many Components to Retain • From p variables we can extract p components. • We probably want fewer than p. • Simple rule: Keep as many as have eigenvalues ≥ 1. • A component with eigenvalue < 1 captured less than one variable’s worth of variance.
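The eigenvalue table and the Kaiser criterion are easy to reproduce. A sketch with made-up data (any p-variable correlation matrix will do):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 7))           # made-up data: 200 cases, p = 7 variables
R = np.corrcoef(X, rowvar=False)        # p x p correlation matrix

evals = np.sort(np.linalg.eigvalsh(R))[::-1]   # eigenvalues, largest first
p = len(evals)

prop = evals / p                        # proportion of variance per component
cum = np.cumsum(prop)                   # cumulative proportion
n_keep = int((evals >= 1).sum())        # Kaiser criterion: keep eigenvalues >= 1
```

The eigenvalues always sum to p, so each one divided by p is the proportion of the total standardized variance that component captures.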
• Visual Aid: Use a Scree Plot • Scree is the rubble at the base of a cliff. • For our beer data:

[Scree plot: eigenvalue (0.0 to 3.5) plotted against component number (1 to 7)]
• Only the first two components have eigenvalues greater than 1. • Big drop in eigenvalue between component 2 and component 3. • Components 3-7 are scree. • Try a 2 component solution. • Should also look at solution with one fewer and with one more component.
Loadings, Unrotated and Rotated • loading matrix = factor pattern matrix = component matrix. • Each loading is the Pearson r between one variable and one component. • Since the components are orthogonal, each loading is also the β weight for predicting the variable from the components. • Here are the unrotated loadings for our 2 component solution:
Component Matrix(a)

           Component 1   Component 2
COLOR         .760          -.576
AROMA         .736          -.614
REPUTAT      -.735          -.071
TASTE         .710          -.646
COST          .550           .734
ALCOHOL       .632           .699
SIZE          .667           .675

Extraction Method: Principal Component Analysis.
a. 2 components extracted.
• All variables load well on first component, economy and quality vs. reputation. • Second component is more interesting, economy versus quality.
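That each loading really is the correlation between a variable and a component can be verified numerically. A sketch with made-up standardized data (not the beer data):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))
X[:, 1] += X[:, 0]                        # induce some correlation
Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize the variables
R = np.corrcoef(Z, rowvar=False)

evals, evecs = np.linalg.eigh(R)
order = np.argsort(evals)[::-1]           # largest component first
evals, evecs = evals[order], evecs[:, order]

loadings = evecs * np.sqrt(evals)         # loading matrix: variables x components
scores = Z @ evecs                        # component scores for each case
```

The Pearson r between any variable and any component score equals the corresponding entry of the loading matrix, and the squared loadings sum to p across the whole matrix.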
• Rotate these axes so that the two dimensions pass more nearly through the two major clusters (COST, SIZE, ALCOHOL and COLOR, AROMA, TASTE). • The number of degrees by which the axes are rotated is the angle psi (ψ). For these data, rotating the axes -40.63 degrees has the desired effect.
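The rotation itself is just post-multiplication of the loading matrix by a 2×2 rotation matrix. A sketch checking the -40.63 degree rotation on two of the unrotated loadings from the slides:

```python
import numpy as np

psi = np.deg2rad(-40.63)                  # rotation angle from the slides
T = np.array([[np.cos(psi), -np.sin(psi)],
              [np.sin(psi),  np.cos(psi)]])

# unrotated loadings for TASTE and COST (from the component matrix)
unrotated = np.array([[.710, -.646],      # TASTE
                      [.550,  .734]])     # COST
rotated = unrotated @ T
# after rotation, TASTE loads almost entirely on component 1
# and COST almost entirely on component 2
```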
• Component 1 = Quality versus reputation. • Component 2 = Economy (or cheap drunk) versus reputation.

Rotated Component Matrix(a)

           Component 1   Component 2
TASTE         .960          -.028
AROMA         .958           .01
COLOR         .952           .06
SIZE          .07            .947
ALCOHOL       .02            .942
COST         -.061           .916
REPUTAT      -.512          -.533

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
Number of Components in the Rotated Solution • Try extracting one fewer component; try one more component. • Which produces the more sensible solution? • Error = difference between the obtained structure and the true structure. • Overextraction (too many components) produces less error than underextraction. • If there is only one true factor and no unique variables, you can get “factor splitting.”
• In this case, first unrotated factor ≅ true factor. • But rotation splits the factor, producing an imaginary second factor and corrupting the first. • Can avoid this problem by including a garbage variable that will be removed prior to the final solution.
Explained Variance • Square the loadings and then sum them across variables. • Get, for each component, the amount of variance explained. • Prior to rotation, these are eigenvalues. • Here are the SSL for our data, after rotation:
Total Variance Explained

Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %
1           3.017        43.101         43.101
2           2.912        41.595         84.696

Extraction Method: Principal Component Analysis.
• After rotation the two components together account for (3.02 + 2.91) / 7 = 85% of the total variance.
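The SSL arithmetic can be checked directly from the rotated loading table. A sketch using the rotated loadings shown in the slides:

```python
import numpy as np

# rotated loadings from the slides: rows are TASTE, AROMA, COLOR,
# SIZE, ALCOHOL, COST, REPUTAT; columns are components 1 and 2
rotated = np.array([[ .960, -.028],
                    [ .958,  .010],
                    [ .952,  .060],
                    [ .070,  .947],
                    [ .020,  .942],
                    [-.061,  .916],
                    [-.512, -.533]])

ssl = (rotated ** 2).sum(axis=0)        # sum of squared loadings per component
pct = 100 * ssl / rotated.shape[0]      # percent of total variance per component
```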
• If the last component has a small SSL, one should consider dropping it. • If SSL = 1, the component has extracted one variable’s worth of variance. • If only one variable loads well on a component, the component is not well defined. • If only two load well, it may be reliable, if the two variables are highly correlated with one another but not with other variables.
Naming Components • For each component, look at how it is correlated with the variables. • Try to name the construct represented by that factor. • If you cannot, perhaps you should try a different solution. • I have named our components “aesthetic quality” and “cheap drunk.”
Communalities • For each variable, sum the squared loadings across components. • This gives you the R2 for predicting the variable from the components, which is the proportion of the variable’s variance that has been extracted by the components.
• Here are the communalities for our beer data. “Initial” is with all 7 components; “Extraction” is for our 2 component solution.

Communalities

           Initial   Extraction
COST        1.000       .842
SIZE        1.000       .901
ALCOHOL     1.000       .889
REPUTAT     1.000       .546
COLOR       1.000       .910
AROMA       1.000       .918
TASTE       1.000       .922

Extraction Method: Principal Component Analysis.
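The extraction communalities are just row sums of squared loadings. A sketch using the rotated loadings from the slides:

```python
import numpy as np

# rotated loadings (rows: TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT)
rotated = np.array([[ .960, -.028],
                    [ .958,  .010],
                    [ .952,  .060],
                    [ .070,  .947],
                    [ .020,  .942],
                    [-.061,  .916],
                    [-.512, -.533]])

h2 = (rotated ** 2).sum(axis=1)         # communality of each variable
```

Since varimax rotation is orthogonal, these match the communalities computed from the unrotated loadings.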
Orthogonal Rotations • Varimax -- minimize the complexity of the components by making the large loadings larger and the small loadings smaller within each component. • Quartimax -- makes large loadings larger and small loadings smaller within each variable. • Equamax -- a compromise between these two.
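A minimal varimax sketch in NumPy, using a standard SVD-based iteration (without SPSS's Kaiser normalization, so results can differ slightly from SPSS output):

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Rotate a loading matrix L (p variables x m components) toward
    the varimax criterion (gamma = 1). Returns the rotated loadings."""
    p, m = L.shape
    T = np.eye(m)                       # accumulated rotation matrix
    d = 0.0
    for _ in range(max_iter):
        Lam = L @ T
        u, s, vt = np.linalg.svd(
            L.T @ (Lam ** 3 - (gamma / p) * Lam @ np.diag((Lam ** 2).sum(axis=0))))
        T = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):       # criterion stopped improving
            break
        d = d_new
    return L @ T

# unrotated loadings for the beer example (from the component matrix)
L = np.array([[ .760, -.576], [ .736, -.614], [-.735, -.071],
              [ .710, -.646], [ .550,  .734], [ .632,  .699],
              [ .667,  .675]])
rotated = varimax(L)
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only how the variance is divided between the components changes.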
Oblique Rotations • Axes drawn between the two clusters in the upper right quadrant would not be perpendicular.
• May better fit the data with axes that are not perpendicular, but at the cost of having components that are correlated with one another. • More on this later.