1.
Introduction
Item analysis is valuable for improving test items and for eliminating misleading items from a test. It also strengthens instructors' skills in test construction and identifies specific areas of course content that need more emphasis or clarity, as described by the Office of Educational Assessment (2005). This report is based on a test and item analysis of 20 multiple-choice test items that were administered to 25 students (Appendix B).
2.
Purpose of report
The purpose of this report is to present the descriptive statistics and the item analysis for 20 multiple-choice test items administered to 25 students.
3.
Test analysis
Table 1: Descriptive statistics
Statistic    Value
Mean         65.79
Mode         65.00
Median       65.00
(STDEV)^2    479.57
STDEV        21.90

Table 1 shows that the mean, mode, and median all have approximately the same value of 65, which suggests that the scores are roughly symmetric and approximately normally distributed.
Frequency graphs
Table 2: Grouped frequency table
H      L     Range   Number of intervals   Size of interval
100    15    85      10                    8.5
Table 3: Cumulative frequency distribution
Lower Limit   Upper Limit   Interval   Middle Value   Frequency   Cumulative Frequency
15.00         24            15-24      19.5           1           1
25.00         34            25-34      29.5           2           3
35.00         44            35-44      39.5           0           3
45.00         54            45-54      49.5           4           7
55.00         64            55-64      59.5           3           10
65.00         74            65-74      69.5           6           16
75.00         84            75-84      79.5           1           17
85.00         94            85-94      89.5           6           23
95.00         104           95-104     99.5           2           25
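The cumulative frequency and middle value columns of Table 3 can be derived from the interval frequencies. The following is a minimal sketch in Python; the variable names are illustrative and do not come from the report.

```python
from itertools import accumulate

# Interval frequencies from Table 3 (intervals 15-24 through 95-104).
frequencies = [1, 2, 0, 4, 3, 6, 1, 6, 2]

# A running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))

# Lower limits are 15, 25, ..., 95; each upper limit is 9 above its lower
# limit, so the middle value is the lower limit plus 4.5.
lower_limits = list(range(15, 105, 10))
middle_values = [low + 4.5 for low in lower_limits]

print(cumulative)
print(middle_values)
```

The last cumulative value equals the number of students, which is a quick check that no score was dropped when grouping.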
Figure 2: Frequency Histogram (x-axis: Interval, 15-24 to 95-104; y-axis: Frequency)
Figure 2 is a graphical representation of the score distribution. It shows the number of scores that fall in each interval.
Figure 3: Frequency Polygon (x-axis: Middle values of intervals, 19.5 to 99.5; y-axis: Frequency)
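In the frequency polygon (Figure 3), the frequency of each class interval is plotted at the middle value of the interval and the points are connected by straight lines. This representation assumes that the scores are evenly spread across each class interval, which the histogram (Figure 2) represents as a rectangular bar.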
Figure 4: Cumulative Frequency Graph (x-axis: Upper values, 24 to 104; y-axis: Frequency)
Figure 4 shows the cumulative frequency up to each interval, with the upper values of the intervals on the x-axis and the cumulative frequency on the y-axis.
Reliability coefficient of a test
The Kuder-Richardson formula 20 (KR20) was used to calculate the reliability coefficient from the number of test items (k), the proportion of responses to an item that are correct (p), the proportion of responses that are incorrect (q), and the variance (the squared standard deviation) of the test scores.
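Table 4: Coefficients of reliability
k     k-1   Total pq   Stdev    (Stdev)^2   KR20
20    19    3.83       21.90    479.57      1.04

Table 4 gives a KR20 of 1.04. A reliability coefficient cannot exceed 1, so this value should be read with caution: the variance (479.57) appears to be computed from percentage scores, whereas the total pq (3.83) is on the raw-score scale of 20 items. If the variance is converted to the raw-score scale (479.57 x 0.2^2, approximately 19.18), the formula gives a KR20 of about 0.84, which still indicates that the test was reliable.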
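The calculation can be sketched in Python as KR20 = (k / (k - 1)) * (1 - total pq / variance). The function name `kr20` is illustrative. The first call uses the values of Table 4 as reported. The second call is an assumption added here, not part of the report: it rescales the variance from percentage scores to raw scores out of 20 so that it is on the same scale as the total pq.

```python
def kr20(k: int, total_pq: float, variance: float) -> float:
    """Kuder-Richardson formula 20: (k / (k - 1)) * (1 - total_pq / variance)."""
    if k < 2 or variance <= 0:
        raise ValueError("need at least two items and a positive variance")
    return (k / (k - 1)) * (1 - total_pq / variance)

# Values as reported in Table 4 (variance of the percentage scores).
reported = kr20(k=20, total_pq=3.83, variance=479.57)

# Assumption (not in the report): convert the variance from percent to raw
# scores out of 20 by multiplying by (20 / 100) ** 2.
raw_variance = 479.57 * (20 / 100) ** 2
rescaled = kr20(k=20, total_pq=3.83, variance=raw_variance)

print(round(reported, 2), round(rescaled, 2))
```

Keeping the variance and the total pq on the same scale is what keeps the coefficient at or below 1.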
4.
Item analysis
Difficulty indices of a test item
The difficulty index (p) is the proportion of students who answered the item correctly. It is calculated by dividing the number of students who selected the correct answer by the number of students who attempted the item, as shown in Table 5.
Table 5: Difficulty index
Question   #Correct   #Answered   p
q1         21         25          0.84
q2         22         25          0.88
q3         17         25          0.68
q4         12         25          0.48
q5         21         25          0.84
q6         17         25          0.68
q7         11         25          0.44
q8         12         23          0.52
q9         13         25          0.52
q10        8          24          0.33
q11        23         25          0.92
q12        19         25          0.76
q13        15         25          0.60
q14        21         25          0.84
q15        20         25          0.80
q16        22         24          0.92
q17        15         24          0.63
q18        8          24          0.33
q19        13         25          0.52
q20        16         25          0.64
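As a minimal sketch, the p column of Table 5 can be reproduced by dividing #Correct by #Answered for each item. The function name `difficulty_index` is illustrative, and only a few of the 20 items are shown.

```python
def difficulty_index(correct: int, answered: int) -> float:
    """Proportion of the students attempting an item who answered it correctly."""
    if answered <= 0:
        raise ValueError("answered must be positive")
    return correct / answered

# (#Correct, #Answered) for a few items from Table 5.
items = {"q1": (21, 25), "q8": (12, 23), "q10": (8, 24), "q16": (22, 24)}

for name, (correct, answered) in items.items():
    print(name, f"{difficulty_index(correct, answered):.2f}")
```

Dividing by the number of students who attempted the item, rather than by all 25 students, matters for items such as q8, which only 23 students answered.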
If the p value is less than 0.25, the item is too difficult; if the p value is greater than 0.75, the item is too easy. Items with p values between 0.25 and 0.75 are fair, or acceptable.
Table 6: Interpretation of the difficulty level of questions
Question   Proportion   Interpretation   Reason
q1         0.84         Unacceptable     Too easy
q2         0.88         Unacceptable     Too easy
q3         0.68         Acceptable       Fine
q4         0.48         Acceptable       Fine
q5         0.84         Unacceptable     Too easy
q6         0.68         Acceptable       Fine
q7         0.44         Acceptable       Fine
q8         0.52         Acceptable       Fine
q9         0.52         Acceptable       Fine
q10        0.33         Acceptable       Fine
q11        0.92         Unacceptable     Too easy
q12        0.76         Unacceptable     Too easy
q13        0.60         Acceptable       Fine
q14        0.84         Unacceptable     Too easy
q15        0.80         Unacceptable     Too easy
q16        0.92         Unacceptable     Too easy
q17        0.63         Acceptable       Fine
q18        0.33         Acceptable       Fine
q19        0.52         Acceptable       Fine
q20        0.64         Acceptable       Fine

The p values in Table 6 show that 40% of the questions (1, 2, 5, 11, 12, 14, 15, and 16) are unacceptable because they were too easy, and 60% (3, 4, 6, 7, 8, 9, 10, 13, 17, 18, 19, and 20) are acceptable.
Discrimination indices of a test item
Table 7: Number of students in upper and lower group
Upper   Lower
15      10
Table 8: Discrimination index
Question   #U   #L   D
q1         15   6    0.60
q2         15   7    0.53
q3         14   3    0.79
q4         8    4    0.50
q5         15   6    0.60
q6         12   5    0.58
q7         9    2    0.78
q8         10   2    0.80
q9         10   3    0.70
q10        8    0    1.00
q11        14   9    0.36
q12        14   5    0.64
q13        12   3    0.75
q14        15   6    0.60
q15        14   6    0.57
q16        15   7    0.53
q17        12   3    0.75
q18        5    3    0.40
q19        12   1    0.92
q20        11   5    0.55
The discrimination index (D) measures the extent to which a test item differentiates between students who do well on the overall test and those who do not, referred to as the upper and the lower group (Table 7). An item is acceptable if its D value is positive; all 20 items in Table 8 have positive D values.
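The report does not state the formula behind the D column. The tabulated values are consistent with D = (#U - #L) / #U, where #U and #L are the numbers of correct answers in the upper and lower groups; this is an inference from the numbers in Table 8, not a formula given by the author. The sketch below reproduces that ratio and, for comparison, a commonly used alternative for groups of unequal size, the difference between the proportions correct in each group. Both function names are illustrative.

```python
def d_as_tabulated(upper_correct: int, lower_correct: int) -> float:
    """(#U - #L) / #U, the ratio that matches the D column of Table 8."""
    if upper_correct <= 0:
        raise ValueError("upper_correct must be positive")
    return (upper_correct - lower_correct) / upper_correct

def d_proportion_difference(upper_correct: int, lower_correct: int,
                            n_upper: int = 15, n_lower: int = 10) -> float:
    """Proportion correct in the upper group minus that in the lower group.

    The default group sizes are those of Table 7.
    """
    return upper_correct / n_upper - lower_correct / n_lower

# (#U, #L) for q1, q3 and q10 from Table 8.
examples = {"q1": (15, 6), "q3": (14, 3), "q10": (8, 0)}
for name, (u, l) in examples.items():
    print(name,
          round(d_as_tabulated(u, l), 2),
          round(d_proportion_difference(u, l), 2))
```

The two definitions agree on the sign of D for these items but not on its size, so the choice of formula affects any cut-off stricter than "positive".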
5.
Conclusion
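The test has a reported KR20 of 1.04. A reliability coefficient cannot exceed 1, so this value most likely reflects a difference in scale between the variance (percentage scores) and the total pq (raw scores); with both on the raw-score scale, the KR20 is approximately 0.84. This indicates that the test was reliable and that students would obtain similar scores if they took another form of the same test.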
6.
References
1. A Guide to Interpreting the Item Analysis Report. (2004). Retrieved September 12, 2007, from http://www.asu.edu/uts/InterpIAS.pdf
2. Introduction to Statistical Inference. (2005). Retrieved September 11, 2007, from http://students.washington.edu/hdevans/lec_11.doc
3. Kubiszyn, T., & Borich, G. (2007). Educational Testing and Measurement: Classroom Application and Practice (8th ed.). John Wiley & Sons, Inc., United States of America.
4. Office of Educational Assessment. (2005). Understanding item analysis reports. Retrieved September 12, 2007, from http://personal.gscit.monash.edu.au/~dengs/teaching/GCHE/part33.pdf
5. Test Item Analysis. (2005). Retrieved September 12, 2007, from http://personal.gscit.monash.edu.au/~dengs/teaching/GCHE/part33.pdf
7.
Appendix
Appendix A
Question   #Correct   #Incorrect   Prop Correct (p)   Prop Incorrect (q)   pq
q1         21         4            0.84               0.16                 0.13
q2         22         3            0.88               0.12                 0.11
q3         17         8            0.68               0.32                 0.22
q4         12         13           0.48               0.52                 0.25
q5         21         4            0.84               0.16                 0.13
q6         17         8            0.68               0.32                 0.22
q7         11         14           0.44               0.56                 0.25
q8         12         11           0.52               0.48                 0.25
q9         13         12           0.52               0.48                 0.25
q10        8          16           0.33               0.67                 0.22
q11        23         2            0.92               0.08                 0.07
q12        19         6            0.76               0.24                 0.18
q13        15         10           0.60               0.40                 0.24
q14        21         4            0.84               0.16                 0.13
q15        20         5            0.80               0.20                 0.16
q16        22         2            0.92               0.08                 0.08
q17        15         9            0.63               0.38                 0.23
q18        8          16           0.33               0.67                 0.22
q19        13         12           0.52               0.48                 0.25
q20        16         9            0.64               0.36                 0.23
Total                                                                      3.83
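The total pq used in the KR20 calculation can be recomputed from the #Correct and #Incorrect columns above. This is a minimal sketch; the helper name `pq` is illustrative, and the products are summed before rounding.

```python
# (#Correct, #Incorrect) per item, q1 to q20, from Appendix A.
counts = [
    (21, 4), (22, 3), (17, 8), (12, 13), (21, 4),
    (17, 8), (11, 14), (12, 11), (13, 12), (8, 16),
    (23, 2), (19, 6), (15, 10), (21, 4), (20, 5),
    (22, 2), (15, 9), (8, 16), (13, 12), (16, 9),
]

def pq(correct: int, incorrect: int) -> float:
    """Product of the proportions correct (p) and incorrect (q) for one item."""
    answered = correct + incorrect
    p = correct / answered
    q = incorrect / answered
    return p * q

total_pq = sum(pq(c, i) for c, i in counts)
print(round(total_pq, 2))
```

Summing the rounded pq values in the table gives 3.82, so the reported total of 3.83 corresponds to summing the unrounded products.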
Appendix B
Answer key and student responses. Responses are listed in order of student number within each group. A group with fewer responses than students contains one unanswered item.

Question   Key   Students 1-13               Students 14-16   Students 17-23   Students 24-25
Q1         C     C C C C C C B C C C C C C   C C C            B C D C C B C    C C
Q2         B     B B B B B A B B B B B B B   B B B            B B C B A B B    B B
Q3         D     B D D D D D A D D B D D D   D D D            C B A D D A D    B D
Q4         D     A D D B C D B B A A D D A   A D D            C D D D D B B    A D
Q5         B     C B B B B C B B B B B B B   B B B            B B B B C B B    C B
Q6         C     D D C C C C C C C C C C C   C B C            A A A C C C C    D D
Q7         D     A A D B B A B B D D D D D   D A D            D D B D A B B    A A
Q8         A     A A A A D B D D C A D A     A A A            D D A A D B D    A
Q9         C     D C C C C C D B B D C D C   C B C            C D D C C D B    D C
Q10        B     D B B B D D D C D C B A B   B D B            D C A D D C      D B
Q11        A     A A A A A A A A A A A A A   A A A            A A C C A A A    A A
Q12        C     D C C C C C C C C B C C C   C C C            D C D D C C C    D C
Q13        B     A B B A B A B B B A B A B   B D B            B A A B A B B    A B
Q14        D     A D D D D D D D D D D D D   D A D            D D A D D D D    A D
Q15        A     A A A C A A C A A D A A A   A A A            A A D A A C A    A A
Q16        A     A A A A A A A A A A A A A   A C A            A B A A A A      A A
Q17        C     C C C C A A A C C C C C A   A B C            C B C A A C      C C
Q18        D     B D B B B B D A B D D B B   B D              C B B D B D A    B D
Q19        B     D B D C B D D B D B B B B   B D B            A B A B D D B    D B
Q20        C     B C C C C C C A A C C D C   C D C            D C B C C C A    B C