Test and Item Analysis of Multiple Choice Test Questions Report
by
C. R. Adjah
July 2007
Table of Contents

List of Tables
List of Figures
1 Introduction
  1.1 Statement of purpose
  1.2 Methodology
  1.3 Report structure
2 Test analysis
  2.1 Distribution of students with correct option per question
  2.2 Distribution of percentage scores
  2.3 Descriptive statistics
3 Item analysis
  3.1 Difficulty index
  3.2 Discrimination index
  3.3 Item reliability
4 Conclusion
Bibliography
Addendum A
Addendum B
List of Tables

Table 1  Mean, Standard deviation, Skewness and Kurtosis per question
Table 2  The mean, median, mode and standard deviation of percentages
Table 3  Difficulty index
Table 4  Discrimination index
Table 5  Cronbach’s alpha
Table 6  Cronbach’s alpha on deleting an item

List of Figures

Figure 1  The approach
Figure 2  Report structure
Figure 3  Histogram of students with correct options
Figure 4  Frequency histogram of percentage scores
1 INTRODUCTION
This is a report on the test and item analysis of a 20-question multiple choice test taken by 25 students.

1.1 STATEMENT OF PURPOSE
The purpose of this report is to provide descriptive statistics and an item analysis of the 20 multiple choice test questions taken by the 25 students.

1.2 METHODOLOGY
The approach followed is shown in Figure 1.

Figure 1: The approach

Steps and Description
Data tabulation: The data collected from the students' answer scripts were captured in an Excel spreadsheet.
Recoding of data: For each student, the chosen options, captured as A, B, C and D, were recoded in the Excel spreadsheet as 1 for a correct option and 0 for an incorrect option.
Calculation of student score: The score per student was calculated and the students were sorted in descending order of percentage.
Grouping of students: The top 13 students were placed in an upper group and the remaining 12 in a lower group.
Analysis of data: The data were analysed using SPSS (Statistical Package for the Social Sciences) to determine the mean, standard deviation, mode, median, difficulty index, discrimination index and Cronbach’s alpha.
Histogram: A histogram of the number of students with correct options per question was drawn.

1.3 REPORT STRUCTURE
The report is made up of four main sections:
• Introduction
• Test analysis
• Item analysis
• Conclusion
These sections, as illustrated in Figure 2, are subdivided into subsections by their headings.
Figure 2: Report structure
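The recoding, scoring and grouping steps of Section 1.2 can be sketched in Python as follows. This is a minimal illustration, not the spreadsheet procedure actually used. The answer key is the one listed in Addendum A, while the four answer scripts and the function names (recode, percentage, split_groups) are hypothetical. Addendum A reports 52.94% for a student with 9 correct out of 17 answered, so the sketch assumes the percentage is taken over the questions a student actually answered.

```python
# Answer key for the 20 questions, taken from Addendum A.
KEY = list("CBDDBCDACBACBDAACDBC")


def recode(responses, key=KEY):
    """Recode chosen options into 1 (correct), 0 (incorrect) or None (no answer)."""
    return [None if r is None else int(r == k) for r, k in zip(responses, key)]


def percentage(coded):
    """Percentage of correct answers among the questions actually answered."""
    answered = [c for c in coded if c is not None]
    return 100 * sum(answered) / len(answered)


def split_groups(percentages, upper_size):
    """Sort students by descending percentage and split into upper and lower groups."""
    ranked = sorted(percentages, key=percentages.get, reverse=True)
    return ranked[:upper_size], ranked[upper_size:]


# Hypothetical answer scripts (not the report's data); None marks an unanswered question.
scripts = {
    "S1": KEY[:],                           # every option correct
    "S2": ["A"] * 20,                       # option A chosen throughout
    "S3": ["A"] + KEY[1:10] + [None] * 10,  # 9 correct out of 10 answered
    "S4": ["B"] * 20,                       # option B chosen throughout
}

percentages = {name: percentage(recode(r)) for name, r in scripts.items()}
upper, lower = split_groups(percentages, upper_size=2)
print(percentages)
print("upper:", upper, "lower:", lower)
```

For the report's 25 students the same split would use upper_size=13, leaving 12 students in the lower group.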
2 TEST ANALYSIS

2.1 Distribution of students with correct option per question
The number of students who chose the correct option for each question was determined and a histogram drawn. This is illustrated in Figure 3.

Figure 3: Histogram of students with correct options
[Histogram of number of students with correct options per question. Horizontal axis: question (Q1 to Q20). Vertical axis: frequency (0 to 25).]
The histogram shows that between 20 and 23 students (80% to 92%) chose the correct options for questions 1, 2, 5, 11, 14, 15 and 16. Between 8 and 13 students (32% to 52%) answered questions 4, 7, 8, 9, 10, 18 and 19 correctly.
2.2 Distribution of percentage scores
The number of students falling within each percentage score band is represented by a histogram, as illustrated in Figure 4.

Figure 4: Frequency histogram of percentage scores
[Histogram. Horizontal axis: percentage scores in bands 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90 and 90-100. Vertical axis: frequency (0 to 6).]
The percentage scores show that 11 students (44%) obtained scores above the mean, while 14 students (56%) scored below it.

2.3 Descriptive statistics
The mean, standard deviation, skewness and kurtosis per item are shown in Table 1.

Table 1: Mean, Standard deviation, Skewness and Kurtosis per question

QUESTION             N    Sum     Mean    Std. Deviation  Skewness  Kurtosis
Q1                   25   21.00   .8400   .37417          -1.975     2.061
Q2                   25   22.00   .8800   .33166          -2.491     4.563
Q3                   25   17.00   .6800   .47610           -.822    -1.447
Q4                   25   12.00   .4800   .50990            .085    -2.174
Q5                   25   21.00   .8400   .37417          -1.975     2.061
Q6                   25   17.00   .6800   .47610           -.822    -1.447
Q7                   25   11.00   .4400   .50662            .257    -2.110
Q8                   23   12.00   .5217   .51075           -.093    -2.190
Q9                   25   13.00   .5200   .50990           -.085    -2.174
Q10                  24    8.00   .3333   .48154            .755    -1.568
Q11                  25   23.00   .9200   .27689          -3.298     9.641
Q12                  25   19.00   .7600   .43589          -1.297     -.354
Q13                  25   15.00   .6000   .50000           -.435    -1.976
Q14                  25   21.00   .8400   .37417          -1.975     2.061
Q15                  25   20.00   .8000   .40825          -1.597      .593
Q16                  24   22.00   .9167   .28233          -3.220     9.124
Q17                  24   15.00   .6250   .49454           -.551    -1.859
Q18                  25    8.00   .3200   .47610            .822    -1.447
Q19                  25   13.00   .5200   .50990           -.085    -2.174
Q20                  25   16.00   .6400   .48990           -.621    -1.762
Valid N (listwise)   22
The mean percentage score is given in Table 2, together with the median, mode and standard deviation of the percentage scores.

Table 2: The mean, median, mode and standard deviation of percentages

Mean                 65.64
Median               65.00
Mode                 65.00
Standard deviation   21.60
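As a cross-check on Table 2, the summary statistics can be recomputed from the #Corr and #Ans values listed in Addendum A. The sketch below uses Python's standard statistics module rather than SPSS. It assumes that each percentage is the number correct divided by the number answered, and that the standard deviation is the sample (n - 1) form.

```python
import statistics

# Number correct and number answered per student, in the order listed in Addendum A.
n_correct = [20, 20, 18, 18, 18, 17, 17, 16, 15, 14, 14, 13, 13,
             13, 13, 12, 11, 11, 9, 10, 10, 8, 6, 6, 4]
n_answered = [20] * 18 + [17, 20, 20, 20, 19, 19, 20]

# Percentage score per student, based on the questions answered.
percentages = [100 * c / a for c, a in zip(n_correct, n_answered)]

mean = statistics.mean(percentages)
median = statistics.median(percentages)
mode = statistics.mode(percentages)
std_dev = statistics.stdev(percentages)  # sample standard deviation (n - 1)

print(f"Mean {mean:.2f}  Median {median:.2f}  Mode {mode:.2f}  SD {std_dev:.2f}")
```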
3 ITEM ANALYSIS

3.1 Difficulty index
Table 3 shows the p-value of each test item. The p-value is the proportion of students who answered the test item correctly.

Table 3: Difficulty index

QUE   #Correct   #Answered   p      REMARKS
Q1    21         25          0.84   Unacceptable item
Q2    22         25          0.88   Unacceptable item
Q3    17         25          0.68   Acceptable item
Q4    12         25          0.48   Acceptable item
Q5    21         25          0.84   Unacceptable item
Q6    17         25          0.68   Acceptable item
Q7    11         25          0.44   Acceptable item
Q8    12         23          0.52   Acceptable item
Q9    13         25          0.52   Acceptable item
Q10    8         24          0.33   Acceptable item
Q11   23         25          0.92   Unacceptable item
Q12   19         25          0.76   Acceptable item
Q13   15         25          0.60   Acceptable item
Q14   21         25          0.84   Unacceptable item
Q15   20         25          0.80   Acceptable item
Q16   22         24          0.92   Unacceptable item
Q17   15         24          0.63   Acceptable item
Q18    8         25          0.32   Acceptable item
Q19   13         25          0.52   Acceptable item
Q20   16         25          0.64   Acceptable item
From the table, the p-values of Q1, Q2, Q5 and Q14 are greater than 0.80, so these can be termed unacceptable test items. Q11 and Q16, with p-values above 0.90, are very easy items and should not be reused in subsequent tests. All other test items are acceptable, as their p-values fall between 0.20 and 0.80.
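The difficulty index and the acceptance rule described above can be sketched as follows. The counts are those of Table 3. The function names and the treatment of the 0.20 and 0.80 bounds as inclusive are assumptions, chosen to match the remark for Q15 (p = 0.80, acceptable).

```python
# Number of correct answers and number of students who answered, per question (Table 3).
n_correct = [21, 22, 17, 12, 21, 17, 11, 12, 13, 8,
             23, 19, 15, 21, 20, 22, 15, 8, 13, 16]
n_answered = [25, 25, 25, 25, 25, 25, 25, 23, 25, 24,
              25, 25, 25, 25, 25, 24, 24, 25, 25, 25]


def difficulty_index(correct, answered):
    """Proportion of the students who answered the item that got it correct."""
    return correct / answered


def remark(p, low=0.20, high=0.80):
    """Classify an item as acceptable when low <= p <= high (bounds assumed inclusive)."""
    return "Acceptable item" if low <= p <= high else "Unacceptable item"


p_values = {f"Q{i}": difficulty_index(c, a)
            for i, (c, a) in enumerate(zip(n_correct, n_answered), start=1)}
unacceptable = [q for q, p in p_values.items() if remark(p) == "Unacceptable item"]
print(unacceptable)
```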
3.2 Discrimination index
The discrimination index measures the extent to which a test item differentiates between students who did well on the overall test and students who did not. The discrimination indices determined are shown in Table 4.

Table 4: Discrimination index

QUE   #U (UPPER)   #L (LOWER)   D      REMARKS
Q1    12            9           0.23   Acceptable item
Q2    13            9           0.31   Acceptable item
Q3    13            4           0.69   Acceptable item
Q4     7            5           0.15   Unacceptable item
Q5    13            8           0.38   Acceptable item
Q6    11            6           0.38   Acceptable item
Q7     8            3           0.38   Acceptable item
Q8    10            2           0.62   Acceptable item
Q9    10            3           0.54   Acceptable item
Q10    8            0           0.62   Acceptable item
Q11   12           11           0.08   Unacceptable item
Q12   12            7           0.38   Acceptable item
Q13   11            4           0.54   Acceptable item
Q14   13            8           0.38   Acceptable item
Q15   12            8           0.31   Acceptable item
Q16   13            9           0.31   Acceptable item
Q17   10            5           0.38   Acceptable item
Q18    5            3           0.15   Unacceptable item
Q19   10            3           0.54   Acceptable item
Q20   10            6           0.31   Acceptable item
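The report does not state the formula behind the D column, but the tabulated values are consistent with D = (#U - #L) / 13, where 13 is the size of the upper group. The sketch below works under that inferred assumption, using the #U and #L counts of Table 4 and the 0.20 cut-off applied in this section.

```python
# Correct answers per question in the upper and lower groups (Table 4).
upper_correct = [12, 13, 13, 7, 13, 11, 8, 10, 10, 8,
                 12, 12, 11, 13, 12, 13, 10, 5, 10, 10]
lower_correct = [9, 9, 4, 5, 8, 6, 3, 2, 3, 0,
                 11, 7, 4, 8, 8, 9, 5, 3, 3, 6]

GROUP_SIZE = 13  # assumed divisor: size of the upper group (inferred from Table 4)


def discrimination_index(n_upper, n_lower, group_size=GROUP_SIZE):
    """Difference between upper- and lower-group correct counts, scaled by group size."""
    return (n_upper - n_lower) / group_size


d_values = {f"Q{i}": discrimination_index(u, l)
            for i, (u, l) in enumerate(zip(upper_correct, lower_correct), start=1)}

# Items below the 0.20 cut-off used in the report.
poor_items = [q for q, d in d_values.items() if d < 0.20]
print(poor_items)
```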
Although the discrimination indices of all the test items are positive, which is desirable, Q4, Q11 and Q18 have discrimination indices below 0.20. This indicates that these test items are poorly constructed and unacceptable (Measurement and Evaluation Center, 2003).

3.3 Item reliability
Cronbach’s alpha, which is the indicator of overall test reliability, is shown in Table 5.

Table 5: Cronbach’s alpha

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.804               .812                                           20
The high Cronbach’s alpha value of 0.804 (0.812 based on standardized items) indicates that the overall test is reliable. Deleting a test item either increases or decreases the Cronbach’s alpha. These changes are reflected in Table 6.

Table 6: Cronbach’s alpha on deleting an item

QUEST   Scale Mean if   Scale Variance if   Cronbach's Alpha   Comments
        Item Deleted    Item Deleted        if Item Deleted
Q1      13.0455         15.093              .802               Acceptable
Q2      13.0000         14.571              .791               Acceptable
Q3      13.0909         14.372              .791               Acceptable
Q4      13.3182         15.846              .821               Unacceptable
Q5      12.9545         15.474              .804               Acceptable
Q6      13.0909         15.420              .809               Unacceptable
Q7      13.4091         14.253              .795               Acceptable
Q8      13.3182         14.513              .799               Acceptable
Q9      13.3182         13.656              .783               Acceptable
Q10     13.5000         13.405              .777               Acceptable
Q11     12.9545         15.474              .804               Acceptable
Q12     13.0000         15.333              .804               Acceptable
Q13     13.2273         13.898              .787               Acceptable
Q14     12.9545         14.617              .789               Acceptable
Q15     13.0909         14.468              .793               Acceptable
Q16     12.9545         14.617              .789               Acceptable
Q17     13.2727         14.113              .792               Acceptable
Q18     13.5000         14.833              .804               Acceptable
Q19     13.2727         13.827              .786               Acceptable
Q20     13.1364         14.504              .795               Acceptable

Q4 and Q6 showed an increase in the Cronbach’s alpha value if deleted. This indicates that these questions need modification or deletion as test items in order to maintain the reliability of the test.
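For readers without SPSS, Cronbach's alpha and the "alpha if item deleted" values can be approximated with the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The sketch below is a plain-Python illustration on hypothetical 0/1 scores, not a reproduction of the SPSS output above. It assumes listwise deletion of students with any unanswered item (in line with the "Valid N (listwise)" row of Table 1) and sample variances. The function names are illustrative.

```python
from statistics import variance


def complete_cases(rows):
    """Keep only students who answered every item (listwise deletion)."""
    return [r for r in rows if None not in r]


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-student item-score lists."""
    k = len(rows[0])
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item removed in turn."""
    k = len(rows[0])
    return [cronbach_alpha([r[:i] + r[i + 1:] for r in rows]) for i in range(k)]


# Hypothetical scores for six students on three items; None marks no answer.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, None],
]

rows = complete_cases(scores)
alpha = cronbach_alpha(rows)
deleted = alpha_if_item_deleted(rows)
print(round(alpha, 3), [round(a, 3) for a in deleted])
```

An item whose deletion raises alpha above the full-test value is a candidate for review, which is the reading applied to Q4 and Q6 in Table 6.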
4 Conclusions
All test items discriminate well except for Q4, Q11 and Q18. Q1, Q2, Q5 and Q14, with difficulty indices above 0.80, are quite easy test items and may need a review. Questions 11 and 16, with difficulty indices above 0.90, are very easy items and should not be reused in subsequent testing. However, based upon the Cronbach’s alpha values, all the test items can be considered reliable and acceptable except for Q4 and Q6, which need modification or deletion in order to increase the reliability of the test.
Bibliography

Knoetze, J. (2007). Test data. Retrieved July 16, 2007 from

Measurement and Evaluation Center. (2003). Test Item Analysis & Decision Making. The University of Texas at Austin. Retrieved July 16, 2007 from

Varma, S. (n.d.). Preliminary Item Statistics Using Point-Biserial Correlation and P-Values. Educational Data Systems Inc., Morgan Hill, CA. Retrieved July 16, 2007 from
ADDENDUM A
Coding and grouping of students

Each row is one student, sorted in descending order of percentage score. 1 = correct option, 0 = incorrect option. A dash (-) marks a question the student did not answer.

St No  Grp  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20  #Corr  #Ans       %
Key          C   B   D   D   B   C   D   A   C   B   A   C   B   D   A   A   C   D   B   C
11     U     1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1     20    20  100.00
16     U     1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1     20    20  100.00
2      U     1   1   1   1   1   0   0   1   1   1   1   1   1   1   1   1   1   1   1   1     18    20   90.00
3      U     1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   0   0   1     18    20   90.00
25     U     1   1   1   1   1   0   0   1   1   1   1   1   1   1   1   1   1   1   1   1     18    20   90.00
13     U     1   1   1   0   1   1   1   1   1   1   1   1   1   1   1   1   0   0   1   1     17    20   85.00
20     U     1   1   1   1   1   1   1   1   1   0   0   0   1   1   1   1   1   1   1   1     17    20   85.00
14     U     0   1   1   0   1   1   1   1   1   1   1   1   1   1   1   1   0   0   1   1     16    20   80.00
5      U     1   1   1   0   1   1   0   1   1   0   1   1   1   1   1   1   0   0   1   1     15    20   75.00
4      U     1   1   1   0   1   1   0   1   1   1   1   1   0   1   0   1   1   0   0   1     14    20   70.00
12     U     1   1   1   1   1   1   1   0   0   0   1   1   0   1   1   1   1   0   1   0     14    20   70.00
8      U     1   1   1   0   1   1   0   0   0   0   1   1   1   1   1   1   1   0   1   0     13    20   65.00
9      U     1   1   1   0   1   1   1   0   0   0   1   1   1   1   1   1   1   0   0   0     13    20   65.00
18     L     1   1   0   1   1   0   1   0   0   0   1   1   0   1   1   1   1   0   1   1     13    20   65.00
23     L     1   1   1   0   1   1   0   0   0   0   1   1   1   1   1   1   1   0   1   0     13    20   65.00
10     L     1   1   0   0   1   1   1   0   0   0   1   0   0   1   0   1   1   1   1   1     12    20   60.00
21     L     1   0   1   1   0   1   0   0   1   0   1   1   0   1   1   1   0   0   0   1     11    20   55.00
22     L     1   1   0   0   1   1   0   0   0   0   1   1   1   1   0   1   0   1   0   1     11    20   55.00
17     L     1   1   0   0   1   0   1   0   1   -   1   0   1   1   1   -   -   0   0   0      9    17   52.94
6      L     0   0   1   1   0   1   0   0   1   0   1   1   0   1   1   1   0   0   0   1     10    20   50.00
7      L     0   1   0   0   1   1   0   0   0   0   1   1   1   1   0   1   0   1   0   1     10    20   50.00
15     L     0   1   1   1   1   0   0   1   0   0   1   1   0   0   1   0   0   0   0   0      8    20   40.00
1      L     1   1   0   0   0   0   0   -   0   0   1   0   0   0   1   1   1   0   0   0      6    19   31.58
24     L     1   1   0   0   0   0   0   -   0   0   1   0   0   0   1   1   1   0   0   0      6    19   31.58
19     L     1   0   0   1   1   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0      4    20   20.00

Mean of percentages: 65.64
Standard deviation of percentages: 21.60
Upper group (U): 13 students
Lower group (L): 12 students

ADDENDUM B
Difficulty index and discrimination index

QUE   #Corr   #Ans   p      #U   #L   D
Q1    21      25     0.84   12    9   0.23
Q2    22      25     0.88   13    9   0.31
Q3    17      25     0.68   13    4   0.69
Q4    12      25     0.48    7    5   0.15
Q5    21      25     0.84   13    8   0.38
Q6    17      25     0.68   11    6   0.38
Q7    11      25     0.44    8    3   0.38
Q8    12      23     0.52   10    2   0.62
Q9    13      25     0.52   10    3   0.54
Q10    8      24     0.33    8    0   0.62
Q11   23      25     0.92   12   11   0.08
Q12   19      25     0.76   12    7   0.38
Q13   15      25     0.60   11    4   0.54
Q14   21      25     0.84   13    8   0.38
Q15   20      25     0.80   12    8   0.31
Q16   22      24     0.92   13    9   0.31
Q17   15      24     0.63   10    5   0.38
Q18    8      25     0.32    5    3   0.15
Q19   13      25     0.52   10    3   0.54
Q20   16      25     0.64   10    6   0.31

Summary of percentage scores

M      65.64
MDN    65.00
MODE   65.00
STD    21.60

Frequency distribution of percentage scores

%        FREQ
20-30    1
30-40    2
40-50    1
50-60    5
60-70    5
70-80    3
80-90    3
90-100   5