BHAVAN’S INSTITUTE OF MANAGEMENT, MYSORE

SUBJECT       : HRM
ASSIGNMENT ON : VALIDITY AND RELIABILITY
SUBMITTED TO  : PROF. ROHINI G. SHETTY, HR DEPARTMENT, BHAVAN’S INSTITUTE OF MANAGEMENT, MYSORE
SUBMITTED BY  : SRINIVAS D., IV {TRI-SEM}, P.G.PM, BHAVAN’S INSTITUTE OF MANAGEMENT, MYSORE
DATE          : 14-10-2008
DAY           : TUESDAY

VALIDITY

Introduction
Validity is arguably the most important criterion for the quality of a test. The term validity refers to whether or not the test measures what it claims to measure. On a test with high validity, the items will be closely linked to the test’s intended focus. For many certification and licensure tests this means that the items will be highly related to a specific job or occupation. If a test has poor validity, then it does not measure the job-related content and competencies it ought to. When this is the case, there is no justification for using the test results for their intended purpose. There are several ways to estimate the validity of a test, including content validity, concurrent validity, and predictive validity. The face validity of a test is sometimes also mentioned.

TYPES OF VALIDITY

Content Validity
While there are several types of validity, the most important type for most certification and licensure programs is probably that of content validity. Content validity is a logical process where connections between the test items and the job-related tasks are established. If a thorough test development process was followed, a job analysis was properly conducted, an appropriate set of test specifications was developed, and item writing guidelines were carefully followed, then the content validity of the test is likely to be very high. Content validity is typically estimated by gathering a group of subject matter experts (SMEs) together to review the test items. Specifically, these SMEs are given the list of content areas specified in the test blueprint, along with the test items intended to be based on each content area. The SMEs are then asked to indicate whether or not they agree that each item is appropriately matched to the content area indicated. Any items that the SMEs identify as being inadequately matched to the test blueprint, or flawed in any other way, are either revised or dropped from the test.

Concurrent Validity
Another important method for investigating the validity of a test is concurrent validity. Concurrent validity is a statistical method using correlation, rather than a logical method. Examinees who are known to be either masters or non-masters on the content measured by the test are identified, and the test is administered to them under realistic exam conditions. Once the tests have been scored, the relationship is estimated between the examinees’ known status as either masters or non-masters and their classification as masters or non-masters (i.e., pass or fail) based on the test. This type of validity provides evidence that the test is classifying examinees correctly. The stronger the correlation, the greater the concurrent validity of the test.
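
For illustration only, the correlation described above could be computed along the following lines. The data are hypothetical and Python is assumed; with two yes/no variables, the Pearson correlation is known as the phi coefficient.

    # A minimal sketch of a concurrent validity check, using hypothetical data
    # and Python's standard library (statistics.correlation, Python 3.10+).
    from statistics import correlation

    # 1 = known master, 0 = known non-master (status established outside the test)
    known_status = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
    # 1 = classified as master (pass) by the test, 0 = classified as non-master (fail)
    test_decision = [1, 1, 0, 0, 0, 1, 0, 1, 1, 0]

    # With two dichotomous variables, the Pearson correlation is the phi coefficient;
    # the closer it is to 1.0, the stronger the evidence of concurrent validity.
    phi = correlation(known_status, test_decision)
    print(f"Concurrent validity (phi): {phi:.2f}")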

Predictive Validity
Another statistical approach to validity is predictive validity. This approach is similar to concurrent validity, in that it measures the relationship between examinees’ performances on the test and their actual status as masters or non-masters. However, with predictive validity, it is the relationship of test scores to an examinee’s future performance as a master or non-master that is estimated. In other words, predictive validity considers the question, "How well does the test predict examinees’ future status as masters or non-masters?" For this type of validity, the correlation that is computed is between the examinees’ classifications as master or non-master based on the test and their later performance, perhaps on the job. This type of validity is especially useful for test purposes such as selection or admissions.
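
A similar sketch, again with hypothetical data, shows how pass/fail decisions might be correlated with later on-the-job ratings; correlating a dichotomous decision with a continuous rating gives a point-biserial correlation.

    # A minimal sketch of a predictive validity estimate, using hypothetical data.
    from statistics import correlation  # Python 3.10+

    # 1 = classified as master by the test at selection time, 0 = non-master
    test_decision = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    # Supervisor ratings of on-the-job performance collected later (illustrative 1-5 scale)
    later_rating = [4, 2, 5, 4, 3, 2, 4, 1, 3, 5]

    # Correlating a pass/fail decision with a continuous rating yields a
    # point-biserial correlation; higher values suggest better prediction.
    r = correlation(test_decision, later_rating)
    print(f"Predictive validity: r = {r:.2f}")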

Face Validity
One additional type of validity that you may hear mentioned is face validity. Like content validity, face validity is determined by a review of the items and not through the use of statistical analyses. Unlike content validity, face validity is not investigated through formal procedures and is not determined by subject matter experts. Instead, anyone who looks over the test, including examinees and other stakeholders, may develop an informal opinion as to whether or not the test is measuring what it is supposed to measure. While it is clearly of some value to have the test appear to be valid, face validity alone is insufficient for establishing that the test is measuring what it claims to measure. A well-developed exam program will include formal studies into other, more substantive types of validity.

Summary
The validity of a test is critical because, without sufficient validity, test scores have no meaning. The evidence you collect and document about the validity of your test is also your best legal defense should the exam program ever be challenged in a court of law. While there are several ways to estimate validity, for many certification and licensure exam programs the most important type of validity to establish is content validity.

RELIABILITY

Introduction
Reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee’s performance on the test. For example, if you were to administer a test with high reliability to an examinee on two occasions, you would be very likely to reach the same conclusions about the examinee’s performance both times. A test with poor reliability, on the other hand, might result in very different scores for the examinee across the two test administrations. If a test yields inconsistent scores, it may be unethical to take any substantive actions on the basis of the test. There are several methods for computing test reliability, including test-retest reliability, parallel forms reliability, decision consistency, internal consistency, and interrater reliability. For many criterion-referenced tests, decision consistency is often an appropriate choice.

Types of Reliability

Test-Retest Reliability
To estimate test-retest reliability, you must administer a test form to a single group of examinees on two separate occasions. Typically, the two separate administrations are only a few days or a few weeks apart; the time should be short enough so that the examinees’ skills in the area being assessed have not changed through additional learning. The relationship between the examinees’ scores from the two different administrations is estimated, through statistical correlation, to determine how similar the scores are. This type of reliability demonstrates the extent to which a test is able to produce stable, consistent scores across time.
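
As a rough illustration with hypothetical scores, the test-retest correlation might be computed as follows; the same calculation, applied to scores from two forms, gives the parallel forms estimate described next.

    # A minimal sketch of a test-retest reliability estimate, using hypothetical scores.
    from statistics import correlation  # Python 3.10+

    # The same examinees, the same form, administered a few weeks apart.
    scores_time1 = [78, 85, 62, 90, 71, 66, 88, 75]
    scores_time2 = [80, 83, 65, 92, 70, 64, 85, 77]

    # The correlation between the two administrations estimates score stability over time.
    # Parallel forms reliability uses the same calculation with Form A and Form B scores.
    r = correlation(scores_time1, scores_time2)
    print(f"Test-retest reliability: r = {r:.2f}")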

Parallel Forms Reliability
Many exam programs develop multiple, parallel forms of an exam to help provide test security. These parallel forms are all constructed to match the test blueprint, and the parallel test forms are constructed to be similar in average item difficulty. Parallel forms reliability is estimated by administering both forms of the exam to the same group of examinees. While the time between the two test administrations should be short, it does need to be long enough so that examinees’ scores are not affected by fatigue. The examinees’ scores on the two test forms are correlated in order to determine how similarly the two test forms function. This reliability estimate is a measure of how consistent examinees’ scores can be expected to be across test forms.

Decision Consistency
In the descriptions of test-retest and parallel forms reliability given above, the consistency or dependability of the test scores was emphasized. For many criterion-referenced tests (CRTs) a more useful way to think about reliability may be in terms of examinees’ classifications. For example, a typical CRT will result in an examinee being classified as either a master or non-master; the examinee will either pass or fail the test. It is the reliability of this classification decision that is estimated in decision consistency reliability. If an examinee is classified as a master on both test administrations, or as a non-master on both occasions, the test is producing consistent decisions. This approach can be used either with parallel forms or with a single form administered twice in test-retest fashion.
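
For illustration, with hypothetical pass/fail decisions from two administrations, decision consistency can be estimated as the proportion of examinees who receive the same classification both times.

    # A minimal sketch of a decision consistency estimate, using hypothetical decisions.
    # 1 = classified as master (pass), 0 = classified as non-master (fail).
    admin1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
    admin2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

    # Decision consistency here is the proportion of examinees who receive the same
    # classification on both administrations (or on two parallel forms).
    agreements = sum(a == b for a, b in zip(admin1, admin2))
    print(f"Decision consistency: {agreements / len(admin1):.2f}")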

Internal Consistency
The internal consistency measure of reliability is frequently used for norm-referenced tests (NRTs). This method has the advantage of being able to be conducted using a single form given at a single administration. The internal consistency method estimates how well the set of items on a test correlate with one another; that is, how similar the items on a test form are to one another. Many test analysis software programs produce this reliability estimate automatically. However, two common differences between NRTs and CRTs make this method of reliability estimation less useful for CRTs. First, because CRTs are typically designed to have a much narrower range of item difficulty and examinee scores, the value of the reliability estimate will tend to be lower. Additionally, CRTs are often designed to measure a broader range of content; this results in a set of items that are not necessarily closely related to each other. This aspect of CRT test design will also produce a lower reliability estimate than would be seen on a typical NRT.
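
The source does not name a particular coefficient, but Cronbach’s alpha is one widely used internal consistency estimate; the sketch below assumes alpha and uses hypothetical item scores.

    # A minimal sketch of Cronbach's alpha, one common internal consistency estimate,
    # computed on hypothetical item scores (rows = examinees, columns = items).
    from statistics import variance

    item_scores = [
        [1, 1, 0, 1, 1],
        [1, 0, 0, 1, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0],
        [1, 1, 1, 0, 1],
        [0, 1, 0, 0, 0],
    ]

    k = len(item_scores[0])                                    # number of items
    item_vars = [variance(col) for col in zip(*item_scores)]   # variance of each item
    total_var = variance([sum(row) for row in item_scores])    # variance of total scores

    # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
    print(f"Cronbach's alpha: {alpha:.2f}")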

Interrater Reliability
All of the methods for estimating reliability discussed thus far are intended to be used for objective tests. When a test includes performance tasks, or other items that need to be scored by human raters, then the reliability of those raters must be estimated. This reliability method asks the question, "If multiple raters scored a single examinee’s performance, would the examinee receive the same score?" Interrater reliability provides a measure of the dependability or consistency of scores that might be expected across raters.

Summary
Test reliability is the aspect of test quality concerned with whether or not a test produces consistent results. While there are several methods for estimating test reliability, for objective CRTs the most useful types are probably test-retest reliability, parallel forms reliability, and decision consistency. A type of reliability that is more useful for NRTs is internal consistency. For performance-based tests, and other tests that use human raters, interrater reliability is likely to be the most appropriate method.

Bibliography
www.proftesting.com
www.google.com
