
Item Content Validity: Its Relationship With Item Discrimination and Difficulty

Teresa M. Rushano
USAF Occupational Measurement Squadron

At the USAF Occupational Measurement Squadron (USAFOMS), subject-matter experts (SMEs) rate the questions on promotion tests for content validity. They also use standard statistical criteria to determine whether test questions should be reused on subsequent test revisions. The purpose of this research was to explore the relationship between SME content validity ratings (CVRs) and item statistics.

The Specialty Knowledge Tests (SKTs) used for enlisted promotions in the Air Force are written at USAFOMS by senior NCOs acting as SMEs under the guidance of USAFOMS psychologists. Within each specialty, one SKT is prepared for promotion to staff sergeant (E-5), and one for promotion to technical and master sergeant (E-6 and E-7).

The USAFOMS test development process includes a procedure based on the methodology of Lawshe (1975) for quantifying content validity on the basis of essentiality to job performance. As part of the process of revising an existing SKT, each SME independently assigns each test question a rating using the following scale:

Is the skill (or knowledge) measured by this test question:
(2) Essential,
(1) Useful but not essential, or
(0) Not necessary
for successful performance on the job?
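
The paper does not spell out how the panel's 0/1/2 ratings are combined, but since the procedure is based on Lawshe (1975), a minimal sketch of Lawshe's content validity ratio gives the flavor of the quantification; the function name and the sample panel below are illustrative, not the USAFOMS implementation.

def content_validity_ratio(ratings):
    """ratings: one SME rating per panelist on the 0/1/2 scale above.

    Lawshe's CVR = (n_e - N/2) / (N/2), where n_e is the number of SMEs
    rating the item Essential (2) and N is the panel size. CVR runs from
    -1 (no one rates it essential) to +1 (everyone does).
    """
    n = len(ratings)
    n_essential = sum(1 for r in ratings if r == 2)
    return (n_essential - n / 2) / (n / 2)

# Hypothetical panel of ten SMEs; seven rate the item essential.
print(content_validity_ratio([2, 2, 2, 2, 2, 2, 2, 1, 1, 0]))  # 0.4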

The SMEs as a team then use these ratings as a point of departure in discussing whether individual items should be retained on subsequent test revisions. Perry, Williams, and Stanley (1990) found that CVRs influence SME determination of an item's test-worthiness and its subsequent selection for continued use or deactivation. However, the ratings are not the only factors which may impact the SME decision whether to reuse an item on an SKT. After completing the CVRs, SMEs review item statistics.

For each SKT question, item statistics are provided which indicate how well an item is performing on the test. USAFOMS has an established set of statistical criteria that test items must meet. Test questions that do not meet these criteria must be revised before they can be incorporated in the revised version of the test. The two statistical elements examined in this research are the difficulty index and the discrimination index. The difficulty (DIFF) of a test item, sometimes known as its ease index, is defined as the percentage of examinees on a test who selected each choice. The DIFF value for the correct answer is examined to see whether the item as a whole is too easy or too hard. For example, an item answered correctly by 97% of the examinees is considered too easy for the purposes of the SKT and would not be reused on subsequent test revisions.
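
A minimal sketch of the DIFF computation as defined above, the percentage of examinees selecting each choice; the function name and the data are illustrative only.

from collections import Counter

def diff_by_choice(responses):
    """responses: the choice (e.g., 'A'-'D') each examinee selected."""
    n = len(responses)
    return {choice: 100.0 * count / n
            for choice, count in Counter(responses).items()}

# 100 examinees; 97 picked the keyed answer 'B', so the item is too easy.
responses = ['B'] * 97 + ['A', 'C', 'D']
print(diff_by_choice(responses))  # {'B': 97.0, 'A': 1.0, 'C': 1.0, 'D': 1.0}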

The second statistical element used in this research is the discrimination index (DISC). This statistic is calculated for each item choice by subtracting the percentage of low-scoring examinees (i.e., those scoring in the lower 50% of all examinees) who select a choice from the percentage of high-scoring examinees making that choice. If a test question is working properly, the correct answer yields a positive DISC value, with more high scorers than low scorers selecting it, while the distractors draw proportionally more low scorers.
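
A minimal sketch of the DISC computation as defined above: examinees are split into lower and upper halves by total score, and each choice's low-group percentage is subtracted from its high-group percentage. The function name and sample data are illustrative only.

def disc_by_choice(scored_responses, choices=('A', 'B', 'C', 'D')):
    """scored_responses: (total_score, selected_choice) per examinee."""
    ranked = sorted(scored_responses, key=lambda sr: sr[0])
    half = len(ranked) // 2
    low, high = ranked[:half], ranked[half:]  # lower/upper 50% by score

    def pct(group, choice):
        return 100.0 * sum(1 for _, c in group if c == choice) / len(group)

    return {ch: pct(high, ch) - pct(low, ch) for ch in choices}

# Keyed answer 'B' is picked mostly by high scorers, so DISC for 'B' is
# positive (+75) while the distractors come out negative.
data = [(40, 'A'), (45, 'C'), (50, 'D'), (55, 'B'),
        (70, 'B'), (75, 'B'), (80, 'B'), (85, 'B')]
print(disc_by_choice(data))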
