The Test and Item Analyses Report

By: Owen Maphisa Ncube BEd (Hons) Student number 26336686

Table of contents

List of tables
List of figures
1. Introduction
2. Purpose of report
3. Test analyses
4. Item analyses
5. Conclusion
6. References
7. Appendices


List of tables

Table 1   Number of students, total score, mean, standard deviation, mode and median
Table 2   Highest and lowest scores, range, number and size of intervals
Table 3   Number of questions, sum of products and reliability
Table 4   Question, correct and answered items, difficulty index and acceptability
Table 5   Questions, number correct and answered, upper and lower level and discrimination index


List of figures

Figure 1   Histogram
Figure 2   Frequency polygon
Figure 3   Ogive


Acknowledgements

I would like to thank Prof J. Knoetze for his tireless efforts and remarkable advice during the production of this report. I would also like to thank my loved ones Thandi, Lungile, Nokwanda, Ohayo and Owens for their incredible patience and amazing support during the compilation of this report.


1. INTRODUCTION

In academic institutions such as schools, educators administer tests and examinations for different reasons, all centred on decisions. Tests are tools that attempt to provide objective data for making diagnostic, instructional and grading decisions in the classroom. On a larger scale, tests can inform other decisions such as selection, placement, counselling, guidance, curriculum and administration. A test tells the educator where a student stands compared with classmates and provides information about the student's level of proficiency and mastery of a skill or set of skills.

Statistical analysis is a powerful technique available to instructors for guiding and improving instruction, tests and examinations. Tests can be analysed using the mean, mode, median, standard deviation and reliability. The quality of the test as a whole is assessed by estimating its internal consistency. Items, on the other hand, can be analysed by finding the item difficulty and item discrimination. For this analysis to be useful, the items must be valid measures of the instructional objectives the educator has set for the learners. Furthermore, the items must be diagnostic; that is, knowing which incorrect options students select must give a clue to the nature of the misunderstanding and thus suggest appropriate remedial strategies. Instructors who set their own examinations or tests can, through such analysis, greatly improve the effectiveness of test items and the validity of test scores.

This report deals with the reliability of a multiple-choice test written by twenty-five students, as well as the analysis of the twenty items contained in the test. Each of the items was labelled and the correct options indicated.

2. PURPOSE OF REPORT

The purpose of this report is to present the test and item analyses of a multiple-choice test written by twenty-five learners. The test contained twenty questions, each with four options labelled A, B, C and D.


3. TEST ANALYSES

3.1 Descriptive Statistics

Descriptive statistics describe the basic features of the data in a study and provide simple summaries of the sample and the measures. Together with simple graphical analysis, descriptive statistics form the basis of virtually every quantitative analysis of data. Measures of central tendency of a distribution, such as the mean, mode and median, are employed in analysing a test. Each of the items is recoded in preparation for quantitative test and item analysis.

(i) To find the mean, which is the average score, the scores earned by all students are added and the total is divided by the number of students. The formula

\[ M = \frac{\sum x}{N} \]

was used, where M is the mean, x is a student's score, ∑ is the summation and N is the total number of students who wrote the test. In Table 1 the mean is 65.79.

(ii) To find the standard deviation, which is a measure of the dispersion of the students' scores, the standard deviation formula was used. Table 1 shows the calculated standard deviation as 21.90.

(iii) To find the mode, which is the most frequently occurring score in a distribution, the frequency of each score is counted. Table 1 shows the mode as 65.

(iv) To find the median (50th percentile), which is the exact midpoint of a distribution, the scores are arranged in ascending or descending order and the middle score is identified. In Table 1 the median is 65.

Measure               Result
Number of students    25
Total score           1645
Mean                  65.79
Standard deviation    21.90
Mode                  65
Median                65

Table 1: Number of students, total score, mean, standard deviation, mode and median
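The figures in Table 1 can be rechecked from the percentage scores listed in Appendix C. The sketch below is illustrative only and uses Python's standard `statistics` module. The variable names are hypothetical, and the use of the population form of the standard deviation (dividing by N) is an assumption: the report's own formula is not legible in the source, but this form appears to reproduce the reported 21.90.

```python
from statistics import mean, median, mode, pstdev

# Percentage scores of the 25 learners, copied from Appendix C.
scores = [100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
          70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
          50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00]

n = len(scores)          # number of students, N
total = sum(scores)      # total score, sum of x
m = mean(scores)         # M = sum(x) / N
sd = pstdev(scores)      # assumption: population form, divides by N
mo = mode(scores)        # most frequently occurring score
med = median(scores)     # middle score of the sorted list

print(f"N = {n}, total = {total:.0f}, mean = {m:.2f}, "
      f"SD = {sd:.2f}, mode = {mo:.0f}, median = {med:.0f}")
```

Swapping `pstdev` for `stdev` (which divides by N - 1) gives a slightly larger value, so the choice of formula matters when comparing against Table 1.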


3.2 Frequency Graphs

In Table 2 the highest and lowest scores are 100 and 15 respectively; the range, which is the difference between the highest and lowest scores, is 85; the number of scores is 25; the number of intervals is 10 and the size of the interval is 9. There is a wide gap between the performances of the weakest and the strongest learners.

Measure                Result
Highest score          100
Lowest score           15
Range                  85
Number of scores       25
Number of intervals    10
Size of interval       9

Table 2: Highest and lowest scores, range, number and size of intervals

3.2.1 Histogram

A histogram is a bar graph of raw data that gives a picture of the data distribution. It illustrates the shape of the distribution, which can be used for predictions. The bars represent the frequency of occurrence of each class of data. The histogram shows basic information such as central location, width of spread, skewness and shape, which helps one decide how to improve the instruction or the test. Figure 1 shows that the histogram is negatively skewed: most of the scores lie in the upper intervals, with a tail toward the low scores. The students did well in the test.


[Histogram of the 25 scores. Vertical axis: Frequency (0 to 7). Horizontal axis: score intervals from 15-24 to 95-104.]

Figure 1: Histogram

3.2.2 Frequency polygon

The frequency polygon is an alternative way of representing grouped data. It conveys the same information as the histogram because both are constructed from the same data. Figure 2 shows two peaks, each with a frequency of six, at the middle values 69.5 and 89.5.

[Frequency polygon. Vertical axis: Frequency (0 to 7). Horizontal axis: Middle values.]

Figure 2: Frequency polygon





3.2.3 Ogive

An ogive is a cumulative frequency polygon, mostly presented in percentages. Cumulative frequencies show the running total, that is, the frequency below each class boundary, as shown in Figure 3. The main use of an ogive is to estimate important percentiles such as the median (50%), the lower quartile (25%) and the upper quartile (75%).
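The grouped frequencies behind the histogram and frequency polygon, and the running totals behind the ogive, can be rebuilt from the Appendix C scores. The sketch below is a minimal illustration and not the procedure the report necessarily followed. It assumes the nine classes of width 10 starting at 15 that appear in Appendices B and E, and it treats each class as running from its lower limit up to (but not including) the next lower limit, so that decimal scores such as 89.47 fall into a class. All names are hypothetical.

```python
from itertools import accumulate

# Percentage scores of the 25 learners, copied from Appendix C.
scores = [100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
          70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
          50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00]

LOWEST, WIDTH, N_CLASSES = 15, 10, 9   # classes 15-24, 25-34, ..., 95-104

# Class limits as (lower, upper) pairs, e.g. (15, 24).
classes = [(LOWEST + i * WIDTH, LOWEST + (i + 1) * WIDTH - 1)
           for i in range(N_CLASSES)]

# Tally each score into its class.
freq = [0] * N_CLASSES
for s in scores:
    freq[int((s - LOWEST) // WIDTH)] += 1

# Running total of the frequencies: the heights plotted on the ogive.
cum = list(accumulate(freq))

# The median lies in the first class whose cumulative frequency reaches N/2.
half = len(scores) / 2
median_class = next(c for c, cf in zip(classes, cum) if cf >= half)

for (lo, hi), f, cf in zip(classes, freq, cum):
    print(f"{lo:>3} - {hi:<3}  freq = {f}  cumulative = {cf}")
print("Median class:", median_class)
```

The class found for the 50th percentile can be compared with the median reported in Table 1.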

[Ogive. Vertical axis: Cumulative frequency (0 to 30). Horizontal axis: Upper limit, starting at 24.]

Figure 3: Ogive

3.2.4 TEST RELIABILITY

The reliability of a test, measured by KR20, refers to the extent to which the test is likely to produce consistent scores. The formula

\[ KR20 = \left(\frac{k}{k-1}\right)\left(1 - \frac{\sum pq}{(\mathrm{Stdev})^2}\right) \]

was used for calculating KR20, where:

k is the number of items
k - 1 is the difference
∑ is the summation
p is the proportion correct
q is the proportion incorrect
Stdev is the standard deviation

In Table 3 the number of items is 20, the difference is 19, the sum of products is 3.83 and the reliability is 1.04, which is taken to demonstrate consistency.
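The KR20 arithmetic can be traced with the values the report gives. The sketch below is a hedged illustration; the function and variable names are hypothetical. It first evaluates the formula with the standard deviation from Table 1 (21.90, on the percentage scale), which appears to reproduce the reported 1.04. Because KR20 cannot exceed 1 when the score variance and the pq terms come from the same item scores in the same units, the sketch then repeats the calculation with the standard deviation of the raw number-correct scores from Appendix C. That second step is an assumption about how the units could be aligned, not something the report states, and it ignores the few unanswered items.

```python
from statistics import pstdev

def kr20(k, sum_pq, sd):
    """KR20 = (k / (k - 1)) * (1 - sum(pq) / sd**2)."""
    return (k / (k - 1)) * (1 - sum_pq / sd ** 2)

k = 20          # number of items (Table 3)
sum_pq = 3.83   # sum of products (Table 3, Appendix D)

# Using the percentage-scale standard deviation from Table 1.
reported = kr20(k, sum_pq, 21.90)

# Assumption: raw number-correct scores (Appendix C) put the variance
# on the same 0-20 scale as the item-level pq terms.
raw_scores = [20, 20, 18, 18, 18, 17, 17, 17, 15, 14, 14, 13, 13,
              13, 13, 12, 11, 11, 10, 10, 8, 9, 6, 6, 3]
raw_sd = pstdev(raw_scores)
rescaled = kr20(k, sum_pq, raw_sd)

print(f"KR20 with SD = 21.90 (percent scale): {reported:.2f}")
print(f"Raw-score SD: {raw_sd:.2f}")
print(f"KR20 with raw-score SD: {rescaled:.2f}")
```

Comparing the two results shows how sensitive the coefficient is to the scale on which the standard deviation is expressed.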

Measure                Symbol    Result
Number of questions    k         20
Difference             k - 1     19
Sum of products        ∑ pq      3.83
Reliability            KR20      1.04

Table 3: Number of questions, sum of products and reliability

4. ITEM ANALYSES

Item analysis is a process that examines learners' responses to individual test items in order to assess the quality and accuracy of the items and of the test as a whole. It is especially valuable for improving the quality of items that may be used in later tests and for eliminating ambiguous or misleading questions. In addition, item analysis improves the educator's skill in setting tests or examinations and helps identify specific areas of course content that need emphasis or clarity. Each item can be analysed for its difficulty index and its discrimination index. Quantitative item analysis was used to assess performance in this multiple-choice test.


4.1 Difficulty Index (p)

The item difficulty index is relevant for determining whether learners have learnt the concepts being tested. It plays an important role in the ability of a question to discriminate between learners who have learnt the material being tested and those who have not. It is the proportion of the learners or examinees who answered the item that answered it correctly. The p value is calculated by dividing the number of learners who answered the item correctly by the total number of learners who answered the item. A high value indicates that a bigger proportion of learners responded to the item correctly, hence an easier item. An item with a p-value below 0.25 is very difficult and is therefore not accepted; an item with a p-value above 0.75 is not acceptable because it is very easy; an item with a value between 0.25 and 0.75 is acceptable because it is neither too difficult nor too easy. Table 4 shows that the difficulty index varies from 0.33 to 0.92, so the questions are either too easy or fine. Twelve items, namely q3, q4, q6, q7, q8, q9, q10, q13, q17, q18, q19 and q20, are good questions, whilst q1, q2, q5, q11, q12, q14, q15 and q16 are bad questions.
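The rule described above can be sketched in a few lines, using the counts from Table 4. This is an illustration under stated assumptions, not the report's own procedure: the names are hypothetical, and `Decimal` with half-up rounding is used only so that a value such as 15/24 = 0.625 displays as 0.63 rather than 0.62.

```python
from decimal import Decimal, ROUND_HALF_UP

# Counts per question q1..q20, copied from Table 4.
correct = [21, 22, 17, 12, 21, 17, 11, 12, 13, 8,
           23, 19, 15, 21, 20, 22, 15, 8, 13, 16]
answered = [25, 25, 25, 25, 25, 25, 25, 23, 25, 24,
            25, 25, 25, 25, 25, 24, 24, 24, 25, 25]

LOW, HIGH = Decimal("0.25"), Decimal("0.75")   # acceptability bounds

def classify(p):
    """Apply the report's rule: below 0.25 too difficult, above 0.75 too easy."""
    if p < LOW:
        return "Too difficult"
    if p > HIGH:
        return "Too easy"
    return "Fine"

# p = number answering correctly / number answering the item
p_values = [Decimal(c) / Decimal(a) for c, a in zip(correct, answered)]
p_shown = [str(p.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
           for p in p_values]
reasons = [classify(p) for p in p_values]
acceptable = [f"q{i}" for i, r in enumerate(reasons, start=1) if r == "Fine"]

for i, (shown, reason) in enumerate(zip(p_shown, reasons), start=1):
    print(f"q{i:<3} p = {shown}  {reason}")
print("Acceptable items:", ", ".join(acceptable))
```

Note that the classification is applied to the unrounded p, so an item at 19/25 = 0.76 falls just outside the acceptable band.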

Question   #Correct   #Answered   p      Comment        Reason
q1         21         25          0.84   Unacceptable   Too easy
q2         22         25          0.88   Unacceptable   Too easy
q3         17         25          0.68   Acceptable     Fine
q4         12         25          0.48   Acceptable     Fine
q5         21         25          0.84   Unacceptable   Too easy
q6         17         25          0.68   Acceptable     Fine
q7         11         25          0.44   Acceptable     Fine
q8         12         23          0.52   Acceptable     Fine
q9         13         25          0.52   Acceptable     Fine
q10        8          24          0.33   Acceptable     Fine
q11        23         25          0.92   Unacceptable   Too easy
q12        19         25          0.76   Unacceptable   Too easy
q13        15         25          0.60   Acceptable     Fine
q14        21         25          0.84   Unacceptable   Too easy
q15        20         25          0.80   Unacceptable   Too easy
q16        22         24          0.92   Unacceptable   Too easy
q17        15         24          0.63   Acceptable     Fine
q18        8          24          0.33   Acceptable     Fine
q19        13         25          0.52   Acceptable     Fine
q20        16         25          0.64   Acceptable     Fine

Table 4: Question, number of correct and answered items, difficulty index and acceptability

4.2 Discrimination Index (D)

The item discrimination index refers to the ability of an item to differentiate among learners on the basis of how well they know the content being tested. It is a measure of how well an item distinguishes between learners or examinees who are knowledgeable and those who are not. D is calculated by subtracting the number correct in the lower level from the number correct in the upper level and dividing the difference by the number of learners in the larger of the two groups. A good item discriminates between students who scored high and those who scored low on the examination as a whole. There are three types of discrimination index: a positive index, which applies when learners who did well on the overall test chose the correct answer for a particular item more often than those who did poorly; a negative index, which applies when learners who did poorly chose the correct answer more often than those who did well; and a zero index, where the numbers are equal. When interpreting the value of discrimination it is important to be aware of the relationship between an item's difficulty index and its discrimination index. In Table 5 the discrimination index varies between 0.13 and 0.73, and all the items discriminate positively, which implies that all of them can be kept.

Question   #Upper level   #Lower level   D
q1         15             6              0.60
q2         15             7              0.53
q3         14             3              0.73
q4         8              4              0.27
q5         15             6              0.60
q6         12             5              0.47
q7         9              2              0.47
q8         10             2              0.53
q9         10             3              0.47
q10        8              0              0.53
q11        14             9              0.33
q12        14             5              0.60
q13        12             3              0.60
q14        15             6              0.60
q15        14             6              0.53
q16        15             7              0.53
q17        12             3              0.60
q18        5              3              0.13
q19        12             1              0.73
q20        11             5              0.40

Table 5: Questions, upper and lower levels and discrimination index
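The D values in Table 5 can be recomputed from the upper-level and lower-level counts. The sketch below is a minimal illustration with hypothetical names. It assumes the group sizes given in Appendix F (15 learners in the upper level, 10 in the lower level) and divides by the larger group, which appears to match the tabulated values.

```python
# Number answering correctly in each group for q1..q20, copied from Table 5.
upper = [15, 15, 14, 8, 15, 12, 9, 10, 10, 8,
         14, 14, 12, 15, 14, 15, 12, 5, 12, 11]
lower = [6, 7, 3, 4, 6, 5, 2, 2, 3, 0,
         9, 5, 3, 6, 6, 7, 3, 3, 1, 5]

UPPER_N, LOWER_N = 15, 10          # group sizes from Appendix F
divisor = max(UPPER_N, LOWER_N)    # number of learners in the larger group

def discrimination(u, l):
    """D = (correct in upper level - correct in lower level) / larger group size."""
    return (u - l) / divisor

def kind(d):
    """Label the index as positive, negative or zero."""
    if d > 0:
        return "positive"
    if d < 0:
        return "negative"
    return "zero"

d_values = [discrimination(u, l) for u, l in zip(upper, lower)]
kinds = [kind(d) for d in d_values]

for i, (d, k) in enumerate(zip(d_values, kinds), start=1):
    print(f"q{i:<3} D = {d:.2f}  ({k})")
```

With groups of unequal size, dividing by the larger group is only one possible convention; other conventions would give different D values.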


5. CONCLUSION

Overall, the test is very reliable and can be used effectively and stored for later use. The results of the multiple-choice test form a normal distribution curve, with very few learners at either end of the curve who performed poorly or excellently. The rest of the learners gave an average performance. The scores are clustered around the mean, as indicated by the standard deviation. Sixty percent of the questions, namely q3, q4, q6, q7, q8, q9, q10, q13, q17, q18, q19 and q20, are very good questions which discriminate positively. Forty percent of the questions, namely q1, q2, q5, q11, q12, q14, q15 and q16, are bad items that must either be improved or discarded, although they discriminate positively. The distracters seem not to have played their role. I recommend that the statistics always be interpreted in the context of the type of test given, the ambiguity of the items, the length of the test, the individuals being tested and the number of learners, as pointed out by Mehrens and Lehmann (1973).



6. REFERENCES

1. Kubiszyn, T., & Borich, G. (2007). Educational Testing and Measurement: Classroom Application and Practice. Eighth Edition. USA: Wiley/Jossey-Bass Education.

2. Interpreting Item Analysis (n.d.). Retrieved September 08, 2007.

3. Seock-Ho K. (1999). A computer program for classical item analysis. Retrieved August 17, 2007.

4. Stroud, K. A. (2001). Engineering Mathematics. Fifth Edition. New York: Palgrave.

5. Interpreting item analysis: Test Validation & Construction Unit. California State Personnel Board.

6. Kehoe, J. (1995). Basic item analysis for multiple-choice tests. Practical Assessment, Research & Evaluation. Retrieved September 21, 2007.



7. APPENDICES

Appendix A: Recoded scores

Student 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

Q1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 0

Q2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0

Q3 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 0 0 0 1 0 0 0

Q4 1 1 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 1 0 0 0 1 0 0 1

Q5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0 0 1

Q6 1 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 0 0 0

Q7 1 1 0 1 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 0 1 0 0 0 0

Q8 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1



Q9 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0

Q10 1 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q11 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0

Q12 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 0 1 0 0 0

Q13 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0 0


Q14 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0

Q15 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 0 0 1 1 1 1 0

Q16 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0

Q17 1 1 1 1 1 0 0 1 0 1 1 1 1 1 1 1 0 0 0 0

Q18 1 1 1 0 1

Q19 1 1 1 0 1 1 1 1 1 0 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0

0 1 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0

0 1 1 0

Q20 1 1 1 1 1 1 1 1 1 1 0 0 0 1 0 1 1 1 1 1 0 0 0 0 0

Appendix B: Interval, middle value and frequency

Interval    Middle Value   Frequency
15 - 24     19.5           1
25 - 34     29.5           2
35 - 44     39.5           0
45 - 54     49.5           4
55 - 64     59.5           3
65 - 74     69.5           6
75 - 84     79.5           1
85 - 94     89.5           6
95 - 104    99.5           2


Appendix C: Students' responses to questions

Student   #Correct   #Answered   % Correct
1         20         20          100.00
2         20         20          100.00
3         18         20          90.00
4         18         20          90.00
5         18         20          90.00
6         17         19          89.47
7         17         20          85.00
8         17         20          85.00
9         15         20          75.00
10        14         20          70.00
11        14         20          70.00
12        13         20          65.00
13        13         20          65.00
14        13         20          65.00
15        13         20          65.00
16        12         20          60.00
17        11         20          55.00
18        11         20          55.00
19        10         20          50.00
20        10         20          50.00
21        8          17          47.06
22        9          20          45.00
23        6          19          31.58
24        6          19          31.58
25        3          20          15.00

Appendix D: Question, numbers correct and incorrect, proportion correct and product of p and q

Question   #Correct   #Incorrect   Prop correct (p)   Prop incorrect (q)   pq
q1         21         4            0.84               0.16                 0.13
q2         22         3            0.88               0.12                 0.11
q3         17         8            0.68               0.32                 0.22
q4         12         13           0.48               0.52                 0.25
q5         21         4            0.84               0.16                 0.13
q6         17         8            0.68               0.32                 0.22
q7         11         14           0.44               0.56                 0.25
q8         12         11           0.52               0.48                 0.25
q9         13         12           0.52               0.48                 0.25
q10        8          16           0.33               0.67                 0.22
q11        23         2            0.92               0.08                 0.07
q12        19         6            0.76               0.24                 0.18
q13        15         10           0.60               0.40                 0.24
q14        21         4            0.84               0.16                 0.13
q15        20         5            0.80               0.20                 0.16
q16        22         2            0.92               0.08                 0.08
q17        15         9            0.63               0.38                 0.23
q18        8          16           0.33               0.67                 0.22
q19        13         12           0.52               0.48                 0.25
q20        16         9            0.64               0.36                 0.23
Total:                                                                     3.83

Appendix E: Class, upper limit, frequency and cumulative frequency

Class       Upper Limit   Frequency   Cumulative Frequency
15 - 24     24            1           1
25 - 34     34            2           3
35 - 44     44            0           3
45 - 54     54            4           7
55 - 64     64            3           10
65 - 74     74            6           16
75 - 84     84            1           17
85 - 94     94            6           23
95 - 104    104           2           25

Appendix F: Learners in upper and lower levels

Learners in upper level   15
Learners in lower level   10
