Chi-Square Test

Overview
The Chi-square test was designed for nominal or categorical data, but it can be used with continuous variables if the data are first converted into frequency ranges. The test is used to see whether two discrete variables are related or independent. Essentially, Chi-square builds a table of counts and then checks whether the values are spread across the cells as chance alone would predict or whether there is some kind of pattern.

Hypotheses
H0: There is no relationship between the two variables (they are independent).
H1: There is a relationship between the two variables.

Equation
χ² = Σ (O − E)² / E
df = (number of rows − 1) × (number of columns − 1)
For the 2 × 4 table in the example below, df = (2 − 1) × (4 − 1) = 3.
Observed frequencies (O) are the actual counts that result from data collection. Expected frequencies (E) are the counts the researcher would expect if there were no relationship between the two variables; for each cell, E = (row total × column total) / grand total. The larger the Chi-square statistic, the stronger the evidence that a relationship exists between the two variables.
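The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not SPSS's own procedure: the function name chi_square_statistic and the 2 × 2 example counts are hypothetical, and the degrees of freedom are computed with the usual two-way table rule, (rows − 1) × (columns − 1).

```python
def chi_square_statistic(observed):
    """Return (chi2, df, expected) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell of the table
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Two-way table rule for degrees of freedom
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2 x 2 counts, for illustration only
example = [[10, 20], [20, 10]]
chi2, df, expected = chi_square_statistic(example)
print(f"chi2 = {chi2:.3f}, df = {df}")
```

The function returns the expected counts as well, so they can be compared cell by cell with the observed counts.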
Warning!
• Always use categorical data, or recode continuous data so that no variable has more than 5 possible levels. You can use the recode feature in SPSS to change continuous data to categorical data.
• Each cell in the table should have an expected count of at least 5, or the test may be invalid. SPSS says having fewer than 5 is OK as long as it doesn't happen in more than 20% of the cells. If a cell has too few observations, you can combine some categories to boost the count in that cell.

SPSS
• Click on Analyze/Descriptive Statistics/Crosstabs.
• Choose the variables. In this example, we are examining whether there is a relationship between a kitten's fur color and its preference for using PubMed or Ovid.
• Click on Statistics and check Chi-square. Click Continue.
• If you want to see the observed and expected counts for each cell, click Cells and make sure the boxes for observed and expected are checked. Click Continue and then OK.

UT Southwestern Medical Center Library—October 2007
Database * Furcolor Crosstabulation
Count

                      Furcolor
Database    Calico   Black   White   Tabby   Total
PubMed           7       8       3       8      26
Ovid             2      11       7       4      24
Total            9      19      10      12      50
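The expected counts that SPSS shows when the Cells option is checked can be approximated by hand. The sketch below uses the counts as printed in the crosstab above. The names OBSERVED, expected_counts and assumption_summary are hypothetical, and figures computed from the printed counts are not guaranteed to reproduce the SPSS output quoted elsewhere in this handout.

```python
# Counts copied from the crosstab as printed in this handout
FUR_COLORS = ["Calico", "Black", "White", "Tabby"]
OBSERVED = {
    "PubMed": [7, 8, 3, 8],
    "Ovid": [2, 11, 7, 4],
}


def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    rows = list(observed)
    n_cols = len(observed[rows[0]])
    row_totals = {r: sum(observed[r]) for r in rows}
    col_totals = [sum(observed[r][j] for r in rows) for j in range(n_cols)]
    n = sum(row_totals.values())
    return {r: [row_totals[r] * c / n for c in col_totals] for r in rows}


def assumption_summary(expected):
    """Return (share of cells with expected count < 5, minimum expected count)."""
    cells = [e for row in expected.values() for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5, min(cells)


expected = expected_counts(OBSERVED)
share_below_5, min_expected = assumption_summary(expected)

for database, row in expected.items():
    print(database, dict(zip(FUR_COLORS, (round(e, 2) for e in row))))
print(f"cells with expected count < 5: {share_below_5:.1%}")
print(f"minimum expected count: {min_expected:.2f}")
```

The two summary values correspond to the two checks in this handout: the share of cells with an expected count below 5 should stay under 20%, and the minimum expected count should be at least 1.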
When examining SPSS output, we must first check whether the assumptions have been met. If they have not, there may be no point in continuing with the analysis. Look in the "a" footnote at the bottom of the Chi-Square Tests table. Cells with an expected count of less than 5 must make up fewer than 20% of the cells, and the minimum expected count must be ≥ 1.

Next, look at the significance value to see whether there is a significant relationship. If there is, we interpret our results by looking at the differences between the observed and expected counts. In this case, the significance value (p) is .660, which is above the conventional .05 cutoff, so there is no significant relationship.

Writing results: χ² (3) = 1.599, p = .660

SPSS Help
For more information and examples of how to perform Chi-square tests in SPSS, please see the following resources:
• http://academic.uofs.edu/department/psych/methods/cannon99/level2d.html : Dr. Cannon of the University of Scranton explains how to perform a Chi-square test in SPSS.
• http://web.mit.edu/11.220/www/lab2_06/crosstabs.htm : An example of how to find crosstab percentages and perform a Chi-square test, from the Department of Urban Studies and Planning at MIT.
• http://core.ecu.edu/psyc/wuenschk/SPSS/SPSS-Lessons.htm : An example of how to do a Chi-square Goodness of Fit test by Dr. Karl L. Wuensch of East Carolina University. The Goodness of Fit test is used to see how well an experimental hypothesis fits the data, i.e., whether what we see is what we expected.
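The significance value in the reported result (a statistic of 1.599 with 3 degrees of freedom) can be checked without SPSS. The sketch below is one way to do it with the Python standard library only, using the closed-form upper-tail probability of the Chi-square distribution for whole-number degrees of freedom. The function names chi2_sf and format_result are hypothetical, and the output format simply mimics the "Writing results" line.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a Chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    m = int(df) // 2
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(m))
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{m} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    tail = sum(half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


def format_result(chi2, df, p):
    """Format a result line with p written without a leading zero."""
    p_text = f"{p:.3f}".lstrip("0")
    return f"χ² ({df}) = {chi2:.3f}, p = {p_text}"


p = chi2_sf(1.599, 3)
print(format_result(1.599, 3, p))
```

A p value above the chosen cutoff (commonly .05) means the data give no grounds to reject H0.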