My Test And Item Analysis Report

Test and Item Analysis Report

By F Rufetu
Presented as a Major Assignment in Computer-based Assessment (CIA 722)
September 2007

Table of contents

List of tables
List of figures
1. Introduction
2. Purpose of report
3. Test analysis
3.1 Descriptive statistics
3.2 Frequency graphs
3.3 Test reliability
4. Item analysis
4.1 Difficulty index
4.2 Discrimination index
5. Conclusion
References
Appendix A
Appendix B
Appendix C

List of tables

Table 1: Mode, median, mean and standard deviation
Table 2: Grouped frequency table
Table 3: Cumulative frequency table
Table 4: Determining reliability coefficient (KR20)
Table 5: Calculation of difficulty index
Table 6: Calculation of discrimination index
Table 7: Number of students in upper and lower group

List of figures

Figure 1: Cumulative frequency graph
Figure 2: Frequency histogram
Figure 3: Frequency polygon

1. Introduction

This report analyses a test and its items using descriptive statistics (measures of central tendency and variability) for a given set of scores. Twenty-five students wrote a multiple-choice test containing twenty questions with four response options each (see appendix A).

2. Purpose of report

The purpose of this report is to disseminate information pertaining to test and item analysis for a given set of scores.

3. Test analysis

Test analysis examines how the items perform as a set. According to Kubiszyn and Borich (2007), “no test you construct will be perfect”, meaning that any test is likely to include invalid or deficient items. This necessitates analysis.

3.1 Descriptive statistics

The test data (see appendix B) are summarised by four statistics. The mode is the score that occurs most frequently, the median is the score that splits the distribution in half, the mean is the average of the group of scores, and the standard deviation is an estimate of variability given by the square root of the sum of (x-Mean)² divided by the number of students. The mode, median, mean and standard deviation are given in table 1. The table suggests an approximately normal distribution because the mode, median and mean are nearly the same.

Table 1: Mode, median, mean and standard deviation

Mode    Median    Mean     Standard deviation
65      65        65.79    21.90
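The statistics in table 1 can be checked with a short Python sketch. The scores are copied from appendix B; the variable names are illustrative only. The population form of the standard deviation (`pstdev`) is used because the report divides by the number of students rather than by one less.

```python
import statistics

# Percentage scores copied from Appendix B (25 students).
scores = [
    100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
    70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
    50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00,
]

mode = statistics.mode(scores)      # score that occurs most frequently
median = statistics.median(scores)  # middle score of the sorted list
mean = statistics.mean(scores)      # arithmetic average
# pstdev divides by N (the number of students), as the report's formula does.
std_dev = statistics.pstdev(scores)

print(f"mode={mode:.0f} median={median:.0f} mean={mean:.2f} sd={std_dev:.2f}")
```

Rounded to two decimals, the results agree with the values in table 1.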

3.2 Frequency graphs

The frequency graphs are built from a grouped frequency table. The values used to set up the intervals are given in table 2.

Table 2: Grouped frequency table

H (highest score)      100
L (lowest score)       15
Range                  85
Number of intervals    10
Size of interval       8.5

The cumulative frequency graph plots the upper limits on the x-axis and the cumulative frequency on the y-axis. The cumulative frequency table is shown in table 3.

Table 3: Cumulative frequency table

Lower limit   Upper limit   Middle value   Frequency   Cumulative frequency
15            24            19.5           1           1
25            34            29.5           2           3
35            44            39.5           0           3
45            54            49.5           4           7
55            64            59.5           3           10
65            74            69.5           6           16
75            84            79.5           1           17
85            94            89.5           6           23
95            104           99.5           2           25
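The frequency and cumulative frequency columns of table 3 can be rebuilt from the appendix B scores with the sketch below. It follows the interval limits listed in table 3 (nine intervals of width 10, starting at 15). The rule that a score falls in the interval whose lower limit it has reached is an assumption made for illustration, and the constant names are hypothetical.

```python
from itertools import accumulate

# Percentage scores copied from Appendix B (25 students).
scores = [
    100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
    70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
    50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00,
]

LOWEST = 15       # lower limit of the first interval in table 3
WIDTH = 10        # intervals in table 3 are 15-24, 25-34, ...
N_INTERVALS = 9   # number of rows in table 3

frequencies = [0] * N_INTERVALS
for score in scores:
    # Assumed binning rule: lower limit inclusive, next lower limit exclusive.
    index = int((score - LOWEST) // WIDTH)
    frequencies[index] += 1

# Running total of the frequencies gives the cumulative frequency column.
cumulative = list(accumulate(frequencies))

for i, (freq, cum) in enumerate(zip(frequencies, cumulative)):
    lower = LOWEST + i * WIDTH
    upper = lower + WIDTH - 1
    middle = (lower + upper) / 2
    print(lower, upper, middle, freq, cum)
```

Plotting `cumulative` against the upper limits gives the ogive, and plotting `frequencies` against the middle values gives the frequency polygon.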

The cumulative frequency graph is given in figure 1. An ‘ogive’ shape is formed.

Figure 1: Cumulative frequency graph (x-axis: upper values, 24 to 104; y-axis: cumulative frequency, 0 to 30)

The frequency histogram plots the intervals (lower values) on the x-axis and the frequency on the y-axis. The frequency histogram is given in figure 2.

Figure 2: Frequency histogram (x-axis: intervals, 15-24 to 95-104; y-axis: frequency, 0 to 7)

The frequency polygon plots the middle values on the x-axis and the frequency on the y-axis. The frequency polygon is given in figure 3.

Figure 3: Frequency polygon (x-axis: middle values, 19.5 to 99.5; y-axis: frequency, 0 to 7)

3.3 Test reliability

The reliability coefficient (KR20) is the appropriate index of test reliability for multiple-choice tests. The coefficient is determined by a formula that includes the number of test items (k), student performance on every item (the sum of pq; for pq values see appendix C) and the standard deviation squared (stddev²) of the set of student test scores: KR20 = (k / (k-1)) × (1 - Total pq / stddev²). The index ranges from 0.00 to 1.00. The larger the number, the more reliable the student scores are. The values used to determine KR20 are given in table 4.

Table 4: Determining reliability coefficient (KR20)

k           20
k-1         19
Total pq    3.83
stdev       21.90
stddev²     479.57
KR20        1.04

The computed reliability coefficient (KR20) is 1.04. Since the index cannot exceed 1.00, this value points to a scale mismatch: stddev² (479.57) is calculated from percentage scores, while Total pq (3.83) is on the scale of raw marks out of 20. Converting the variance to raw marks (479.57 × 0.2² ≈ 19.18) gives a KR20 of about 0.84, which is still large, so the student scores can be regarded as reliable.
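A minimal sketch of the KR20 calculation, using the values in table 4 and the standard Kuder-Richardson formula. The function name `kr20` is illustrative. The second calculation is an assumption that is not part of the report: it converts the variance from percentage scores to raw marks out of 20, which is one way to bring the result inside the 0.00 to 1.00 range stated above. It is approximate, because a few students did not attempt every item.

```python
def kr20(k: int, sum_pq: float, variance: float) -> float:
    """Kuder-Richardson 20: (k / (k - 1)) * (1 - sum_pq / variance)."""
    return (k / (k - 1)) * (1 - sum_pq / variance)


# Values taken from table 4.
k = 20                     # number of test items
sum_pq = 3.83              # total of the pq column in appendix C
variance_percent = 479.57  # stddev squared, computed on percentage scores

# Reproduces the figure reported in table 4.
reported = kr20(k, sum_pq, variance_percent)

# Assumption (not in the report): express the variance in raw marks out of
# 20 so that it is on the same scale as the item-level pq values.
variance_raw = variance_percent * (k / 100) ** 2
rescaled = kr20(k, sum_pq, variance_raw)

print(f"KR20 with table 4 inputs:      {reported:.2f}")
print(f"KR20 with raw-score variance:  {rescaled:.2f}")
```

With the table 4 inputs the function returns about 1.04; with the rescaled variance it returns about 0.84.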

4. Item analysis

Item analysis can be used to identify items that are deficient in some way so that they can be improved or eliminated. Matlock-Hetzel (2007) states that item analysis “investigates the performance of items considered individually in relation to the remaining items in the test”.

4.1 Difficulty index

The difficulty index indicates the proportion of students who answered the item correctly. The proportion (p) equals the number of students with the correct answer divided by the number of students who attempted the item. If p<0.25 the item is too difficult, and if p>0.75 the item is too easy; in both cases the item is unacceptable. The calculation and interpretation of the difficulty index for each question are given in table 5.

Table 5: Calculation of difficulty index

Question   #Correct   #Answered   p      Interpretation   Reason
q1         21         25          0.84   Unacceptable     Too easy
q2         22         25          0.88   Unacceptable     Too easy
q3         17         25          0.68   Acceptable       Fine
q4         12         25          0.48   Acceptable       Fine
q5         21         25          0.84   Unacceptable     Too easy
q6         17         25          0.68   Acceptable       Fine
q7         11         25          0.44   Acceptable       Fine
q8         12         23          0.52   Acceptable       Fine
q9         13         25          0.52   Acceptable       Fine
q10        8          24          0.33   Acceptable       Fine
q11        23         25          0.92   Unacceptable     Too easy
q12        19         25          0.76   Unacceptable     Too easy
q13        15         25          0.6    Acceptable       Fine
q14        21         25          0.84   Unacceptable     Too easy
q15        20         25          0.8    Unacceptable     Too easy
q16        22         24          0.92   Unacceptable     Too easy
q17        15         24          0.63   Acceptable       Fine
q18        8          24          0.33   Acceptable       Fine
q19        13         25          0.52   Acceptable       Fine
q20        16         25          0.64   Acceptable       Fine
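The rule behind table 5 can be sketched in a few lines of Python. The counts are copied from appendix C and the thresholds (0.25 and 0.75) from the text above; the function and variable names are illustrative only.

```python
# (question, number correct, number who answered), copied from Appendix C.
items = [
    ("q1", 21, 25), ("q2", 22, 25), ("q3", 17, 25), ("q4", 12, 25),
    ("q5", 21, 25), ("q6", 17, 25), ("q7", 11, 25), ("q8", 12, 23),
    ("q9", 13, 25), ("q10", 8, 24), ("q11", 23, 25), ("q12", 19, 25),
    ("q13", 15, 25), ("q14", 21, 25), ("q15", 20, 25), ("q16", 22, 24),
    ("q17", 15, 24), ("q18", 8, 24), ("q19", 13, 25), ("q20", 16, 25),
]


def difficulty(correct: int, answered: int) -> float:
    """Proportion of students attempting the item who answered correctly."""
    return correct / answered


def interpret(p: float) -> tuple[str, str]:
    """Apply the report's thresholds: p < 0.25 too difficult, p > 0.75 too easy."""
    if p < 0.25:
        return "Unacceptable", "Too difficult"
    if p > 0.75:
        return "Unacceptable", "Too easy"
    return "Acceptable", "Fine"


results = {}
for question, correct, answered in items:
    p = difficulty(correct, answered)
    results[question] = (round(p, 2), *interpret(p))
    print(question, *results[question])

acceptable = sum(1 for r in results.values() if r[1] == "Acceptable")
print(f"{acceptable} of {len(items)} items acceptable")
```

Twelve of the twenty items fall in the acceptable range, which is the sixty percent referred to in the conclusion.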

4.2 Discrimination index

According to Special Connections (2007), the discrimination index (D) is a “basic measure of item’s ability to discriminate between those who scored high (#U) on the total test and those who scored low (#L)”. If the D value is positive (closer to 1.00), there is a strong relationship between performance on that item and overall test performance. This means the discrimination is fine. If the D value is negative, this suggests poor validity for the item, and its distracters must be looked into. The calculation and interpretation of the discrimination index for each question are given in table 6. In this instance all items indicate a positive discrimination.

Table 6: Calculation of discrimination index

Question   #U   #L   D      Interpretation
q1         15   6    0.60   Fine
q2         15   7    0.53   Fine
q3         14   3    0.73   Fine
q4         8    4    0.27   Fine
q5         15   6    0.60   Fine
q6         12   5    0.47   Fine
q7         9    2    0.47   Fine
q8         10   2    0.53   Fine
q9         10   3    0.47   Fine
q10        8    0    0.53   Fine
q11        14   9    0.33   Fine
q12        14   5    0.60   Fine
q13        12   3    0.60   Fine
q14        15   6    0.60   Fine
q15        14   6    0.53   Fine
q16        15   7    0.53   Fine
q17        12   3    0.60   Fine
q18        5    3    0.13   Fine
q19        12   1    0.73   Fine
q20        11   5    0.40   Fine
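The D values in table 6 appear consistent with D = (#U - #L) / 15, where 15 is the number of students labelled U in appendix B. The report does not state this divisor explicitly, so it is treated here as an inference. The sketch below recomputes the column under that assumption; the names are illustrative.

```python
# Assumed divisor: the 15 students labelled "U" in Appendix B.
UPPER_GROUP_SIZE = 15

# question -> (#U, #L): correct answers in the upper and lower groups (table 6).
counts = {
    "q1": (15, 6), "q2": (15, 7), "q3": (14, 3), "q4": (8, 4),
    "q5": (15, 6), "q6": (12, 5), "q7": (9, 2), "q8": (10, 2),
    "q9": (10, 3), "q10": (8, 0), "q11": (14, 9), "q12": (14, 5),
    "q13": (12, 3), "q14": (15, 6), "q15": (14, 6), "q16": (15, 7),
    "q17": (12, 3), "q18": (5, 3), "q19": (12, 1), "q20": (11, 5),
}


def discrimination(upper_correct: int, lower_correct: int,
                   group_size: int = UPPER_GROUP_SIZE) -> float:
    """Difference in correct answers between groups, scaled by group size."""
    return (upper_correct - lower_correct) / group_size


d_values = {q: round(discrimination(u, l), 2) for q, (u, l) in counts.items()}

for question, d in d_values.items():
    # A negative D would flag an item whose distracters need review.
    flag = "Fine" if d > 0 else "Review distracters"
    print(question, f"{d:.2f}", flag)
```

Every item comes out positive, matching the interpretation column of table 6.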

The discrimination index measures the ability of an item to discriminate between students who have a high score on the test and those who have a low score. It is based on the difference between the number of correct responses in the upper group and the number of correct responses in the lower group. The number of students in the upper and lower groups is given in table 7.

Table 7: Number of students in upper and lower group

#Upper   #Lower
15       10

5. Conclusion

In conclusion, since the KR20 indicates reliable scores, sixty percent of the items are acceptable under the difficulty index, and the discrimination index is positive on all items, the overall test is valid. Analysis of response options allows educators to fine-tune and improve items they may wish to use again with future classes. If items are too difficult, teachers can adjust the way they teach. The greater the number of plausible distracters, the more accurate, valid and reliable the test becomes.

References

Kubiszyn, T. and Borich, G. (2007). Educational Testing and Measurement: Classroom Application and Practice, pp. 204-326. Eighth edition. John Wiley & Sons, Inc. USA.

Matlock-Hetzel, S. (2007). Basic Concepts in Item and Test Analysis. Texas A & M University. Retrieved October 02 2007, from http://ericae.net/ft/tamu/Espy.htm

Special Connections. (2007). Retrieved October 02 2007, from http://www.Specialconnections.ku.edu/cgibin/cgiwrap/cpecconn/print.php?path=page/ass..

Appendix A

Answer key and student responses. Each row gives the key for one question, followed by the responses of students 1 to 25 in student-number order.

Question   Key   Students 1-13               Students 14-16   Students 17-23   Students 24-25
Q1         C     C C C C C C B C C C C C C   C C C            B C D C C B C    C C
Q2         B     B B B B B A B B B B B B B   B B B            B B C B A B B    B B
Q3         D     B D D D D D A D D B D D D   D D D            C B A D D A D    B D
Q4         D     A D D B C D B B A A D D A   A D D            C D D D D B B    A D
Q5         B     C B B B B C B B B B B B B   B B B            B B B B C B B    C B
Q6         C     D D C C C C C C C C C C C   C B C            A A A C C C C    D D
Q7         D     A A D B B A B B D D D D D   D A D            D D B D A B B    A A
Q8         A     A A A A D B D D C A D A *   A A A            D D A A D B D    A *
Q9         C     D C C C C C D B B D C D C   C B C            C D D C C D B    D C
Q10        B     D B B B D D D C D C B A B   B D B            D C A D D C *    D B
Q11        A     A A A A A A A A A A A A A   A A A            A A C C A A A    A A
Q12        C     D C C C C C C C C B C C C   C C C            D C D D C C C    D C
Q13        B     A B B A B A B B B A B A B   B D B            B A A B A B B    A B
Q14        D     A D D D D D D D D D D D D   D A D            D D A D D D D    A D
Q15        A     A A A C A A C A A D A A A   A A A            A A D A A C A    A A
Q16        A     A A A A A A A A A A A A A   A C A            A B A A A A *    A A
Q17        C     C C C C A A A C C C C C A   A B C            C B C A A C *    C C
Q18        D     B D B B B B D A B D D B B   B D *            C B B D B D A    B D
Q19        B     D B D C B D D B D B B B B   B D B            A B A B D D B    D B
Q20        C     B C C C C C C A A C C D C   C D C            D C B C C C A    B C

* One response in this block is blank (the item was not answered); the position of the blank within the block is not recoverable from the source.

Appendix B

x        Group   x-Mean   (x-Mean)²
100.00   U       34.21    1170.49
100.00   U       34.21    1170.49
90.00    U       24.21    586.24
90.00    U       24.21    586.24
90.00    U       24.21    586.24
89.47    U       23.69    561.03
85.00    U       19.21    369.12
85.00    U       19.21    369.12
75.00    U       9.21     84.87
70.00    U       4.21     17.74
70.00    U       4.21     17.74
65.00    U       -0.79    0.62
65.00    U       -0.79    0.62
65.00    U       -0.79    0.62
65.00    U       -0.79    0.62
60.00    L       -5.79    33.50
55.00    L       -10.79   116.37
55.00    L       -10.79   116.37
50.00    L       -15.79   249.25
50.00    L       -15.79   249.25
47.06    L       -18.73   350.77
45.00    L       -20.79   432.12
31.58    L       -34.21   1170.23
31.58    L       -34.21   1170.23
15.00    L       -50.79   2579.38

Appendix C

Question   #Correct   #Answered   Proportion correct (p)   Proportion incorrect (q)   pq
q1         21         25          0.84                     0.16                       0.13
q2         22         25          0.88                     0.12                       0.11
q3         17         25          0.68                     0.32                       0.22
q4         12         25          0.48                     0.52                       0.25
q5         21         25          0.84                     0.16                       0.13
q6         17         25          0.68                     0.32                       0.22
q7         11         25          0.44                     0.56                       0.25
q8         12         23          0.52                     0.48                       0.25
q9         13         25          0.52                     0.48                       0.25
q10        8          24          0.33                     0.67                       0.22
q11        23         25          0.92                     0.08                       0.07
q12        19         25          0.76                     0.24                       0.18
q13        15         25          0.6                      0.4                        0.24
q14        21         25          0.84                     0.16                       0.13
q15        20         25          0.8                      0.2                        0.16
q16        22         24          0.92                     0.08                       0.08
q17        15         24          0.63                     0.38                       0.23
q18        8          24          0.33                     0.67                       0.22
q19        13         25          0.52                     0.48                       0.25
q20        16         25          0.64                     0.36                       0.23
