Educational Assessment
The process used to evaluate the individual questions contained in an MCQ Test is called:
Item Analysis
Item Analysis
• What is item analysis?
  – Difficulty index
  – Discrimination index
  – Analysis of distractors
  – Purpose of item analysis
  – Performing item analysis
• What are the main difficulties and limitations of language testing?
Item Analysis
It is assumed that an assessment test will:
1. Produce a range of scores
2. Result in the more able students achieving higher marks than the less able
Item analysis is largely based on calculating the Difficulty Index and the Discrimination Index for each question (item) in an MCQ test.
Item Analysis: Difficulty Index
This is simply the proportion of students who obtain the correct answer to an individual question (item) in the test. It is often reported as a percentage.
– It denotes the extent to which an item is easy or difficult for the proposed group of test-takers.
– It is important because an item that is too easy or too difficult does little to separate high-ability from low-ability test-takers.
Item Difficulty
• Calculating the Item Difficulty
ID = (total no. of students with the correct answer) / (total no. of students)
Multiply by 100 to express the ID as a percentage.
Appropriate test items will generally have an ID between 0.15 and 0.85.
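The calculation above can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function name `item_difficulty` and the sample counts (18 correct answers out of 40 students) are assumptions chosen for the example.

```python
def item_difficulty(num_correct: int, num_students: int) -> float:
    """Return the proportion of students (0 to 1) who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    if not 0 <= num_correct <= num_students:
        raise ValueError("num_correct must lie between 0 and num_students")
    return num_correct / num_students


# Hypothetical item: 18 of the 40 students analysed chose the correct option.
difficulty = item_difficulty(18, 40)
print(f"ID = {difficulty:.2f} ({difficulty:.0%})")
```

Keeping the ID as a proportion and converting to a percentage only when printing makes it directly comparable with the 0.15 to 0.85 guideline.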
Item Difficulty
• Two good reasons for occasionally including a very easy item (0.85 or higher) are:
  – to build in some affective feelings of “success” among lower-ability students, and
  – to serve as a warm-up item.
• Very difficult items can provide a challenge to the highest-ability students.
Difficulty Index / Item Difficulty (ID) / Facility Value (FV)
The maximum range of the ID is from 0 (all answers incorrect) to 1 (all answers correct).
INTERPRETATION OF DIFFICULTY INDEX
RANGE (ID)        DIFFICULTY
0.2 and below     Very difficult
0.2-0.4           Difficult
0.4-0.6           Average
0.6-0.8           Easy
0.8 and above     Very easy
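The interpretation table can be expressed as a small lookup function. The sketch below is an assumption-laden illustration: the name `classify_difficulty` is hypothetical, and because adjacent bands in the table share their boundary values, the code assigns 0.2 to "very difficult" and 0.8 to "very easy" (following the table's wording) and places the shared 0.4 and 0.6 boundaries in "average" (an arbitrary choice).

```python
def classify_difficulty(item_difficulty: float) -> str:
    """Map an ID value (0 to 1) to a label from the interpretation table."""
    if not 0.0 <= item_difficulty <= 1.0:
        raise ValueError("ID must lie between 0 and 1")
    if item_difficulty <= 0.2:   # "0.2 and below"
        return "very difficult"
    if item_difficulty < 0.4:
        return "difficult"
    if item_difficulty <= 0.6:   # assumption: 0.4 and 0.6 count as average
        return "average"
    if item_difficulty < 0.8:
        return "easy"
    return "very easy"           # "0.8 and above"


for value in (0.15, 0.45, 0.625, 0.85):
    print(value, classify_difficulty(value))
```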
Discrimination Index
This is the power of an individual question to discriminate between the most able and the least able students in the group. It may also be referred to as the “Discriminating Power” or the “Index of Discrimination”.
E.g. an item that garners correct responses from most of the high-ability group and incorrect responses from most of the low-ability group has good discrimination power.
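Discrimination Index
• Calculating the Discrimination Index
DI = (Upper results - Lower results) / (33.3% of total population)
or (Douglas Brown)
DI = (high group no. correct - low group no. correct) / (½ × total of your two comparison groups)
When the upper and lower groups are each one third (33.3%) of the class, both formulas divide by the size of one comparison group.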
Item Analysis: Calculating the Discrimination Index
Results Upper = number of students in the upper group with the correct answer
Results Lower = number of students in the lower group with the correct answer
Total = number of students included in the analysis, i.e. number of students in the upper group + number of students in the lower group
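The discrimination formula can be sketched in the same way. The function name `discrimination_index` and the sample counts are assumptions made for illustration, and the sketch assumes the upper and lower groups are the same size, so that the divisor is the size of one comparison group.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """DI = (upper correct - lower correct) / size of one comparison group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    for count in (upper_correct, lower_correct):
        if not 0 <= count <= group_size:
            raise ValueError("counts must lie between 0 and group_size")
    return (upper_correct - lower_correct) / group_size


# Hypothetical item with 20 students in each comparison group.
print(discrimination_index(15, 6, 20))
# Extremes: only the upper group correct, then only the lower group correct.
print(discrimination_index(20, 0, 20))
print(discrimination_index(0, 20, 20))
```

The last two calls show the two ends of the scale: +1.0 when only the upper group answers correctly and -1.0 when only the lower group does.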
Item Analysis
Difficulty Index / Item Difficulty (ID) / Facility Value (FV)
The maximum range of the ID is from 0 (all answers incorrect) to 1 (all answers correct).
Discrimination Index (DI)
The maximum range of the DI is from -1.0 (all students in the LOWER group, and none in the UPPER group, obtaining the correct answer) to +1.0 (all students in the UPPER group, and none in the LOWER group, obtaining the correct answer).
It is important to note that the Item Difficulty/Facility Value and the Discrimination Index are calculated for each item (question), not for the assessment test as a whole.
Item Analysis
• Items with high discriminating power approach a perfect 1.0.
• Items with no discriminating power at all score zero.
• Discard items that score near or below zero.
• As with ID, no absolute rule governs what counts as an acceptable or unacceptable DI.
Using Information About Index of Discrimination
• The Index of Discrimination tells a teacher the degree to which a test item differentiates the high achievers from the low achievers in the class. A test item may have positive or negative discriminating power.
• An item has positive discriminating power when more students from the upper group than from the lower group get the right answer.
• When more students from the lower group than from the upper group get the correct answer on an item, the item has negative discriminating power.
INTERPRETATION OF THE DISCRIMINATION INDEX
RANGE (DI)        DESCRIPTION
0.6 and above     Very good
0.4-0.59          Good
0.3-0.39          Fair
Below 0.3         Very poor
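As with the difficulty bands, the DI table can be written as a threshold lookup. The name `classify_discrimination` is hypothetical, and the handling of boundary values is an assumption: a DI of exactly 0.3 is treated as "fair", and values that fall between the printed bands (for example 0.595) are assigned to the lower band.

```python
def classify_discrimination(di: float) -> str:
    """Map a DI value (-1 to +1) to a description from the interpretation table."""
    if not -1.0 <= di <= 1.0:
        raise ValueError("DI must lie between -1 and +1")
    if di >= 0.6:
        return "very good"
    if di >= 0.4:
        return "good"
    if di >= 0.3:        # assumption: exactly 0.3 counts as fair
        return "fair"
    return "very poor"   # includes zero and negative discrimination


for value in (0.75, 0.50, 0.30, 0.05, -0.10):
    print(value, classify_discrimination(value))
```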
If I were rich, I ………… work.
(A) shan’t  (B) won’t  (C) wouldn’t  (D) didn’t

Option   U    L    U+L
(A)      1    4    5
(B)      2    5    7
(C)      14   4    18
(D)      3    7    10

ID = (U + L) / (total no. of students) = 18 / 40 = 0.45
DI = (U - L) / (33.3% of total population) = 10 / 20 = 0.50
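The worked item above can be checked programmatically from its response table. The sketch below is illustrative only: `analyse_item` is a hypothetical helper, it assumes equal-sized upper and lower groups, and it takes (C) as the key, which is what the slide's arithmetic (18 and 10) implies.

```python
from typing import Dict, Tuple


def analyse_item(upper: Dict[str, int], lower: Dict[str, int], key: str) -> Tuple[float, float]:
    """Return (ID, DI) for one item from per-option counts in each group."""
    n_upper = sum(upper.values())
    n_lower = sum(lower.values())
    if n_upper == 0 or n_upper != n_lower:
        raise ValueError("this sketch assumes equal, non-empty comparison groups")
    # ID: correct answers in both groups over all students analysed (2n).
    difficulty = (upper[key] + lower[key]) / (n_upper + n_lower)
    # DI: difference in correct answers over the size of one group (n).
    discrimination = (upper[key] - lower[key]) / n_upper
    return difficulty, discrimination


# Counts copied from the table for the "If I were rich" item.
upper_group = {"A": 1, "B": 2, "C": 14, "D": 3}
lower_group = {"A": 4, "B": 5, "C": 4, "D": 7}

item_id, item_di = analyse_item(upper_group, lower_group, key="C")
print(f"ID = {item_id:.2f}, DI = {item_di:.2f}")
```

Deriving the group sizes from the table itself, rather than passing them in, guards against a mistyped row: if the two groups do not add up to the same total, the function raises an error.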
John Kennedy …… born in 1917 and died in 1963.
(A) is  (B) has been  (C) was  (D) had been

Option   U    L    U+L
(A)      0    2    2
(B)      0    3    3
(C)      13   12   25
(D)      7    3    10

ID = (U + L) / 2n = 25 / 40 = 0.625
DI = (U - L) / n = 1 / 20 = 0.05
Examining Distractor Effectiveness
• An ideal item is one that all students in the upper group answer correctly and all students in the lower group answer wrongly, with the lower group's responses evenly distributed among the incorrect alternatives.
Distractor Analysis
• A perfect test item would have 2 characteristics:
  1. Everyone who knows the item gets it right.
  2. People who do not know the item have responses equally distributed across the wrong answers.
• It is not desirable to have one of the distractors chosen more often than the correct answer.
• This result indicates a potential problem with the question. The distractor may be too similar to the correct answer, and/or there may be something misleading in either the stem or the alternatives.
Distractor Analysis
• When the number of persons choosing a distractor significantly exceeds the number expected, there are 2 possibilities:
  1. The choice reflects partial knowledge.
  2. The item is a poorly worded trick question.
• Unpopular distractors may lower item and test difficulty because they are easily eliminated.
• Extremely popular distractors are likely to lower the reliability and validity of the test.
Distractor Analysis
• The efficiency of distractors is the extent to which:
  a. the distractors “lure” a sufficient number of test-takers, especially the lower-ability ones, and
  b. those responses are somewhat evenly distributed across all distractors.
Distractor Analysis
Choices                       A    B    C*   D    E
High-ability students (10)    0    1    7    0    2
Low-ability students (10)     3    5    2    0    0
*Note: C is the correct response
What is the discrimination index?
• Distractor D doesn’t fool anyone, so it needs to be revised.
• Distractor E attracts more responses from the high-ability group than from the low-ability group. Why did this happen?
• What if more students chose option B than the correct answer C?
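The checks described in the distractor slides can be automated for a table like the one above. The following is a rough sketch, not a standard procedure: `flag_distractors` and its three warning messages are hypothetical, and the rules simply mirror the points raised in the bullets (a distractor nobody chooses, a distractor that attracts the high-ability group more than the low-ability group, and a distractor chosen more often than the key).

```python
from typing import Dict, List


def flag_distractors(high: Dict[str, int], low: Dict[str, int], key: str) -> Dict[str, List[str]]:
    """Return warnings for each distractor that shows a potential problem."""
    key_total = high[key] + low[key]
    flags: Dict[str, List[str]] = {}
    for option in high:
        if option == key:
            continue
        notes: List[str] = []
        total = high[option] + low[option]
        if total == 0:
            notes.append("chosen by no one")
        if high[option] > low[option]:
            notes.append("attracts more high-ability than low-ability students")
        if total > key_total:
            notes.append("chosen more often than the key")
        if notes:
            flags[option] = notes
    return flags


# Counts copied from the distractor table; C is the correct response.
high_group = {"A": 0, "B": 1, "C": 7, "D": 0, "E": 2}
low_group = {"A": 3, "B": 5, "C": 2, "D": 0, "E": 0}

for option, notes in flag_distractors(high_group, low_group, key="C").items():
    print(option, "->", "; ".join(notes))
```

On this table the sketch flags D and E, matching the observations in the bullets, while A and B pass because they draw mainly lower-ability students and are less popular than the key.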
• Exercises on Item Difficulty and Discrimination Index
He complained that he …… the same bad film the night before.
(A) had seen  (B) was seeing  (C) has seen  (D) would see

Option   U    L    U+L
(A)      14   8    22
(B)      4    7    11
(C)      2    5    7
(D)      0    0    0

ID = (U + L) / 2n = 22 / 40 = 0.55
DI = (U - L) / n = 6 / 20 = 0.30
I don’t think that anybody has seen him.
(A) Yes, someone has  (B) Yes, no one has  (C) Yes, none has  (D) Yes, anyone has

Option   U    L    U+L
(A)      4    6    10
(B)      9    8    17
(C)      7    5    12
(D)      0    1    1

ID = (U + L) / 2n = 10 / 40 = 0.25
DI = (U - L) / n = -2 / 20 = -0.10
When (A) …. Jim (B) …. crossed the (C) …. road, he (D) …. ran into a car.

Option   U    L    U+L
(A)      0    5    5
(B)      2    12   14
(C)      0    0    0
(D)      18   3    21

ID = (U + L) / 2n = 21 / 40 = 0.525
DI = (U - L) / n = 15 / 20 = 0.75
What kind of ……… is your new suit made of?
(A) clothes  (B) clothing  (C) cloth  (D) clothings

Option   U    L    U+L
(A)      0    5    5
(B)      2    8    10
(C)      17   3    20
(D)      1    4    5

ID = (U + L) / 2n =
DI = (U - L) / n =
Past Year Question: November 2013, Section B, Question 4
• Item Analysis (IA) can further enhance the design and preparation of multiple-choice questions (Brown, 2010).
• Analyse and explain how Item Analysis (IA) can be used in the appropriate selection and arrangement of multiple-choice questions. Base your discussion on the three main indices of Item Analysis (IA).