
45th Annual Conference
of the
International Military Testing Association

Pensacola, Florida
3-6 November 2003

www.InternationalMTA.org


TABLE OF CONTENTS

IMTA 2003 STEERING COMMITTEE ix
STEERING COMMITTEE MEETING MINUTES x
IMTA BY-LAWS (AS AMENDED BY STEERING COMMITTEE) xii

PAPERS

A01. Maliko-Abraham, H., and Lofaro, R.J. USABILITY TESTING: LESSONS LEARNED AND METHODOLOGY 1

A02. Elliott-Mabey, N.L. WHY WE STILL NEED TO STUDY UNDEFINED CONCEPTS 6

A03. Annen, H., and Kamer, B. DO WE ASSESS WHAT WE WANT TO ASSESS? THE APPRAISAL DIMENSIONS AT THE ASSESSMENT CENTER FOR PROFESSIONAL OFFICERS (ACABO) 13

A04. Gutknecht, S.P. PERSONALITY AS PREDICTOR OF JOB ATTITUDES AND INTENTION TO QUIT 22

A05. Snooks, S., and Luster, L. USING TASK MODULE DATA TO VALIDATE AIR FORCE SPECIALTY KNOWLEDGE TESTS 30

A06. Brugger, C. THE SCOPE OF PSYCHOLOGICAL TEST SYSTEMS WITHIN THE AUSTRIAN ARMED FORCES 44

A07. Sumer, H.C.; Bilgic, R.; Sumer, N.; and Erol, T. JOB-SPECIFIC PERSONALITY ATTRIBUTES AS PREDICTORS OF PSYCHOLOGICAL WELL-BEING 49

A08. Farmer, W.L.; Bearden, R.M.; Eller, E.D.; Michael, P.G.; Johnson, R.S.; Chen, H.; Nayak, A.; Hindelang, R.L.; Whittam, K.; Watson, S.E.; and Alderton, D.L. JOIN: JOB AND OCCUPATIONAL INTEREST IN THE NAVY 62

A09. Temme, L.A.; Still, D.L.; Kolen, J.; and Acromite, M. OZ: A HUMAN-CENTERED COMPUTING COCKPIT DISPLAY 70

A10. Cian, C.; Carriot, J.; and Raphela, C. PERCEPTUAL DRIFT RELATED TO SPATIAL DISORIENTATION GENERATED BY MILITARY SYSTEMS: POTENTIAL BENEFITS OF SELECTION AND TRAINING 91


A11. Krouse, S.L., and Irvine, J.H. PERCEPTUAL DYSLEXIA: ITS EFFECT ON THE MILITARY CADRE AND BENEFITS OF TREATMENT 96

A12. Cowan, J.D. NEUROFEEDBACK TRAINING FOR TWO DIMENSIONS OF ATTENTION: CONCENTRATION AND ALERTNESS 103

A13. Schaab, B.B., and Dressel, J.D. WHAT TODAY'S SOLDIERS TELL US ABOUT TRAINING FOR THE FUTURE 111

A14. Burns, J.J.; Giebenrath, J.; and Hession, P. OBJECTIVE-BASED TRAINING: METHODS, TOOLS, AND TECHNOLOGIES 116

A15. Helm, W.R., and Reid, J.D. RACE AND GENDER AS FACTORS IN FLIGHT TRAINING SUCCESS 123

A16. Phillips, H.L.; Arnold, R.D.; and Fatolitis, P. VALIDATION OF AN UNMANNED AERIAL VEHICLE OPERATOR SELECTION SYSTEM 129

A17. Sabol, M.A.; Schaab, B.B.; Dressel, J.D.; and Rittman, A.L. SUCCESS AT COLLABORATION AS A FUNCTION OF KNOWLEDGE DEPTH 140

A20. Janega, J.B., and Olmsted, M.G. U.S. NAVY SAILOR RETENTION: A PROPOSED MODEL OF CONTINUATION BEHAVIOR 150

A21. Morath, R.; Cronin, B.; and Heil, M. DOES MILITARY PERSONNEL JOB PERFORMANCE IN A DIGITIZED FUTURE FORCE REQUIRE CHANGES IN THE ASVAB: A COMPARISON OF A DYNAMIC/INTERACTIVE COMPUTERIZED TEST BATTERY WITH THE ASVAB IN PREDICTING TRAINING AND JOB PERFORMANCE AMONG AIRMEN AND SAILORS 156

A22. Lappin, B.M.; Klein, R.M.; Howell, L.M.; and Lipari, R.N. COMPARISONS OF SATISFACTION AND RETENTION MEASURES FROM 1999-2003 158

A23. Richardson, J. BRITISH ARMY LEAVERS SURVEY: AN INVESTIGATION OF RETENTION FACTORS 167

A24. Mitchell, D.; Keller-Glaze, H.; Gramlich, A.; and Fallesen, J. PREDICTORS OF U.S. ARMY CAPTAIN RETENTION DECISIONS 171



A25. Huffman, A.H.; Youngcourt, S.S.; and Castro, C.A. THE IMPORTANCE OF A FAMILY-FRIENDLY WORK ENVIRONMENT FOR INCREASING EMPLOYEE PERFORMANCE AND RETENTION 177

A26. Harris, R.N.; Mottern, J.A.; White, M.A.; and Alderton, D.L. TRACKING U.S. NAVY RESERVE CAREER DECISIONS 199

A27. Bowles, S.V. DUTIES AND FUNCTIONS OF A RECRUITING COMMAND PSYCHOLOGIST 205

B02a. Lancaster, A.R.; Lipari, R.N.; Howell, L.M.; and Klein, R.M. THE 2002 WORKPLACE AND GENDER RELATIONS SURVEY 208

B02b. Ormerod, A.J., and Wright, C.V. WORKPLACE REPRISALS: A MODEL OF RETALIATION FOLLOWING UNPROFESSIONAL GENDER-RELATED BEHAVIOR 219

B02c. Lawson, A.K., and Fitzgerald, L.F. UNDERSTANDING RESPONSES TO SEXUAL HARASSMENT IN THE U.S. MILITARY 237

B04. Ford, K.A. USING STAKEHOLDER ANALYSIS (SA) AND THE STAKEHOLDER INFORMATION SYSTEM (SIS) IN HUMAN RESOURCE ANALYSIS 252

B05. O'Connell, B.J.; Beaubien, J.M.; Keeney, M.J.; and Stetz, T.A. DESIGNING A NEW HR SYSTEM FOR NIMA 271

B07a. Peck, J.F. PERSONNEL SECURITY INVESTIGATIONS: IMPROVING THE QUALITY OF SUBJECT AND WORKPLACE INTERVIEWS 278

B07b. Crawford, K.S., and Wood, S. STRATEGIES FOR INCREASED REPORTING OF SECURITY-RELEVANT BEHAVIOR 283

B07c. Fischer, L.F. CHARACTERIZING INFORMATION SYSTEMS INSIDER OFFENDERS 289

B07d. Kramer, L.A.; Heuer, R.J., Jr.; and Crawford, K.S. TEN TECHNOLOGICAL, SOCIAL, AND ECONOMIC TRENDS THAT ARE INCREASING U.S. VULNERABILITY TO INSIDER ESPIONAGE 297

B07e. Wiskoff, M.F. DEVELOPMENT OF A WINDOWS-BASED COMPUTER-ADMINISTERED PERSONNEL SECURITY SCREENING QUESTIONNAIRE 300

B09. Filjak, T.; Cippico, I.; Debač, N.; Tišlarić, G.; and Zebec, K. OCCUPATIONAL ANALYSIS APPLIED FOR THE PURPOSE OF DEFINING OF SELECTION CRITERIA FOR NEW MILITARY OCCUPATIONAL SPECIALTIES IN THE ARMED FORCES OF THE REPUBLIC OF CROATIA 305

B10a. Lee, W.C., and Drasgow, F. USING DECISION TREE METHODOLOGY TO PREDICT ATTRITION WITH THE AIM 310

B10b. Chernyshenko, O.S.; Stark, S.E.; and Drasgow, F. PREDICTING ATTRITION OF ARMY RECRUITS USING OPTIMAL APPROPRIATENESS MEASUREMENT 317

B10c. Stark, S.E.; Chernyshenko, O.S.; and Drasgow, F. A NEW APPROACH TO CONSTRUCTING AND SCORING FAKE-RESISTANT PERSONALITY MEASURES 323

B11. Janega, J.B., and Olmsted, M.G. U.S. NAVY SAILOR RETENTION: A PROPOSED MODEL OF CONTINUATION BEHAVIOR 330

B12. Nederhof, F.V.F. PSYCHOMETRIC PROPERTIES OF THE DUTCH SOCIAL SKILLS INVENTORY 336

B13. Hendriks, B.; van de Ven, C.; and Stam, D. DEPLOYABILITY OF TEAMS: THE DUTCH MORALE QUESTIONNAIRE, AN INSTRUMENT FOR MEASURING MORALE DURING MILITARY OPERATIONS 344

B14. Cotton, A.J., and Gorney, E. MEASURES OF WELLBEING IN THE AUSTRALIAN DEFENCE FORCE 358

B15. Lescreve, F.J. WHY ONE SHOULD WAIT BEFORE ALLOCATING APPLICANTS 336

B16. Schreurs, B. FROM ATTRACTION TO REJECTION: A QUALITATIVE RESEARCH ON APPLICANT WITHDRAWAL 381

B17. Borman, W.C.; White, L.A.; Bowles, S.; Horgen, K.E.; Kubisiak, U.C.; and Penney, L.M. U.S. ARMY RECRUITER SELECTION RESEARCH: AN UPDATE 398

B18. Mylle, J. MODELLING COMMUNICATION IN NEGOTIATION IN PSO CONTEXT 404

C01. Schultz, K.; Sapp, R.; and Willers, L. ELECTRONIC ADVANCEMENT EXAMS – TRANSITIONING FROM PAPER-BASED TO ELECTRONIC FORMAT 412

C02. Pfenninger, D.T.; Klion, R.E.; and Wenzel, M.U. INTEGRATED WEB ASSESSMENT SOLUTIONS 418

C03. Oropeza, T.; Hawthorne, J.; Seilhymer, J.; Barrow, D.; and Balog, J. STREAMLINING OF THE NAVY ENLISTED ADVANCEMENT NOTIFICATION SYSTEM 431

C04. Edgar, E.; Zarola, A.; Dukalskis, L.; and Weston, K. THE ROLE OF PSYCHOLOGY IN INTERNET SELECTION 438

C05. O'Connell, B.J.; Caster, C.H.; and Marsh-Ayers, N. WORKING FOR THE UNITED STATES INTELLIGENCE COMMUNITY: DEVELOPING WWW.INTELLIGENCE.GOV 448

C06b. Farmer, W.L.; Bearden, R.M.; Borman, W.C.; Hedge, J.W.; Houston, J.S.; Ferstl, K.L.; and Schneider, R.J. ENCAPS – USING NON-COGNITIVE MEASURES FOR NAVY SELECTION AND CLASSIFICATION 455

C06c. Twomey, A., and O'Keefe, D. PILOT SELECTION IN THE AUSTRALIAN DEFENCE FORCE: AUSBAT VALIDATION 461

C07a. Styer, J.S. DEVELOPMENT AND VALIDATION OF A REVISED ASVAB CEP INTEREST INVENTORY 468

C07b. Watson, S.E. JOB AND OCCUPATIONAL INTEREST IN THE NAVY 474

C07c. Farmer, W.L., and Alderton, D.L. VOCATIONAL INTEREST MEASUREMENT IN THE NAVY - JOIN 481

C07d. Hanson, M.A.; Paullin, C.J.; Bruskiewicz, K.T.; and White, L.A. THE ARMY VOCATIONAL INTEREST CAREER EXAMINATION 485

C07e. Putka, D.J.; Iddekinge, C.H.; and Sager, C.E. DEVELOPING MEASURES OF OCCUPATIONAL INTERESTS AND VALUES FOR SELECTION 491

C08a. Boerstler, R.E., and Kammrath, J.L. OCCUPATIONAL SURVEY SUPPORT OF AIR AND SPACE EXPEDITIONARY FORCE (AEF) REQUIREMENTS 499

C08. Jones, P.L.; Strange, J.; and Osburn, H. OCCUPATIONAL ANALYTICS 505

C09a. Heffner, T.S.; Tremble, T.; Campbell, R.; and Sager, C. ANTICIPATING THE FUTURE FOR FIRST-TOUR SOLDIERS 507

C09b. Sager, C.E., and Russell, T.L. FUTURE-ORIENTED JOB ANALYSIS FOR FIRST-TOUR SOLDIERS 514

C09c. Keenan, P.A.; Katkowski, D.A.; Collins, M.M.; Moriarty, K.O.; and Schantz, L.B. PERFORMANCE CRITERIA FOR THE SELECT21 PROJECT 522

C09d. McCloy, R.A.; Putka, D.J.; Van Iddekinge, C.H.; and Kilcullen, R.N. DEVELOPING OPERATIONAL PERSONALITY ASSESSMENTS: STRATEGIES FOR FORCED-CHOICE AND BIODATA-BASED MEASURES 531

C09e. Waugh, G.W., and Russell, T.L. SCORING BOTH JUDGMENT AND PERSONALITY IN A SITUATIONAL JUDGMENT TEST 540

C09f. Iddekinge, C.H.; Putka, D.J.; and Sager, C.E. ASSESSING PERSON-ENVIRONMENT (P-E) FIT WITH THE FUTURE ARMY 549

C10. Heffner, T.S.; Campbell, R.; Knapp, D.J.; and Greenston, P. COMPETENCY TESTING FOR THE U.S. ARMY NONCOMMISSIONED OFFICER (NCO) CORPS 556

C11. Lane, M.E.; Mottern, J.A.; White, M.A.; Brown, M.E.; and Boyce, E.M. 1ST WATCH: ASSESSMENT OF COPING STRATEGIES EMPLOYED BY NEW SAILORS 561

C12. Brown, M.E.; Mottern, J.A.; White, M.A.; Lane, M.E.; and Boyce, E.M. 1ST WATCH: THE NAVY FIT SCALE 567

C13a. Steinberg, A.G., and Nourizadeh, S. USING RESULTS FROM ATTITUDE AND OPINION SURVEYS 573

C13b. Nourizadeh, S., and Steinberg, A.G. USING SURVEY AND INTERVIEW DATA: AN EXAMPLE 575

C13c. Rosenfeld, P.; Newell, C.E.; and Braddock, L. UTILIZING SURVEY RESULTS OF THE NAVY EQUAL OPPORTUNITY/SEXUAL HARASSMENT SURVEY 581

D01. Waldköetter, R., and Arlington, A.T. THE U.S. ARMY'S PERSONNEL REPLACEMENT SYSTEM 587

D02. Mylle, J. TEAM EFFECTIVENESS AND BOUNDARY MANAGEMENT: THE FOUR ROLES PRECONIZED BY ANCONA REVISITED 592

D03. Cotton, A.J., and Gorney, E. MENTAL HEALTH LITERACY IN THE AUSTRALIAN DEFENCE FORCE 599

D04. Dursun, S., and Morrow, R. DEFENCE ETHICS SURVEY: THE IMPACT OF SITUATIONAL MORAL INTENSITY ON ETHICAL DECISION MAKING 608

D05. Thompson, B.R. ADAPTING OCCUPATIONAL ANALYSIS METHODOLOGIES TO ACHIEVE OPTIMAL OCCUPATIONAL STRUCTURES 615

D07. Smith, G.A. WHOM AMONG US? PRELIMINARY RESEARCH ON POSITION AND PERSONNEL SELECTION CRITERIA FOR MALE UAV SENSOR OPERATORS 620

D09. Lim, B.C., and Ployhart, R.E. TRANSFORMATIONAL LEADERSHIP: RELATIONS TO THE FIVE FACTOR MODEL AND TEAM PERFORMANCE IN TYPICAL AND MAXIMUM CONTEXTS 631

D10. Cronin, B.; Morath, R.; and Smith, J. ARMY LEADERSHIP COMPETENCIES: OLD WINE IN NEW BOTTLES? 654

D11. Holtzman, A.K.; Baker, D.P.; Calderón, R.F.; Smith-Jentsch, K.; and Radtke, P. DEVELOPING APPROPRIATE METRICS FOR PROCESS AND OUTCOME MEASURES 661

D12. Douglas, I. SOFTWARE SUPPORT OF HUMAN PERFORMANCE ANALYSIS 671

D13. Beaubien, J.M.; Baker, D.P.; and Holtzman, A.K. HOW MILITARY RESEARCH CAN IMPROVE TEAM TRAINING EFFECTIVENESS IN OTHER HIGH-RISK INDUSTRIES 679

D14. Costar, D.M.; Baker, D.P.; Holtzman, A.; Smith-Jentsch, K.A.; and Radtke, P. DEVELOPING MEASURES OF HUMAN PERFORMANCE: AN APPROACH AND INITIAL REACTIONS 688

D15. Makgati, C.K.M. PSYCHOLOGICAL IMPLICATIONS OF DEPLOYMENTS FOR THE MEMBERS OF THE SOUTH AFRICAN NATIONAL DEFENCE FORCE (S.A.N.D.F.) 694

D16. Cotton, A.J. THE PSYCHOLOGICAL IMPACT OF DEPLOYMENTS 702

D18. Brown, K.J. THE LEADERS CALIBRATION SCALE 710

D19. Horey, J.D., and Fallesen, J.J. LEADERSHIP COMPETENCIES: ARE WE ALL SAYING THE SAME THING? 721

D20. Lett, J.; Thain, J.; Keesling, W.; and Krol, M. NEW DIRECTIONS IN FOREIGN LANGUAGE APTITUDE TESTING 734

D21. Willis, D. THE STRUCTURE & ANTECEDENTS OF ORGANISATIONAL COMMITMENT IN THE SINGAPORE ARMY 742

D22. Tan, C.; Soh, S.; and Lim, B.C. FURTHER UNDERSTANDING OF ATTITUDES TOWARDS NATIONAL DEFENCE AND MILITARY SERVICE IN SINGAPORE 750

D23. Bradley, P.; Charbonneau, D.; and Campbell, S. MEASURING MILITARY PROFESSIONALISM 760

D24. Truhon, S.A. DEOCS: A NEW AND IMPROVED MEOCS 766

D25. Rone, R.S. ADDRESSING PSYCHOLOGICAL STATE AND WORKPLACE BEHAVIORS OF DOWNSIZING SURVIVORS 772

D26. Devriendt, Y.A., and Levaux, C.A. VALIDATION OF THE BELGIAN MILITARY PILOT SELECTION TEST BATTERY 779

INDEX OF AUTHORS 784



Current Members

2003 IMTA Executive Steering Committee

Col Tony Cotton: Australia, Defence Health Service Branch, Australian Defence Force
Dr. Christian Langer: Austria, Establishment Command Structure and Force Organization
LtCol Francois Lescreve: Belgium, Ministry of Defense
Dr. Jacques Mylle: Belgium, Royal Military Academy
Ms. Susan Truscott: Canada, Department of National Defence
Dr. Corinne Cian: France, Centre de Recherches du Service de Santé des Armées
Ms. Wiltraud Pilz: Germany, Federal Ministry of Defense
Mr. Kian-Chye Ong: Singapore, Ministry of Defense
Dr. Henry Widen: Sweden
LtCol Frans Matser (Dr. Renier van Gelooven): The Netherlands, Royal Army
Ms. Jo Richardson: United Kingdom, Ministry of Defence (Army)
Dr. James Riedel: U.S. Defense Personnel Security Research Center
Dr. Mike Lentz: U.S. Navy, NETPDTC, Navy Advancement Center
LtCol John Gardner: U.S. Air Force, Occupational Measurements Squadron
Mr. Kenneth Schwartz: U.S. Air Force, Personnel Command
Ms. Mary Norwood: U.S. Coast Guard, Occupational Standards

Liaison Members

Dr. Mike Rumsey: U.S. Army, Research Institute

Potential New Organizational Members (Voted In During 2003 ESC Meeting)

Dr. Ferdinand Rameckers: Netherlands, Defence Selection Institute
Dr. Hubert Annen: Switzerland, The Military Academy at the Swiss Federal Institute of Technology


Minutes

International Military Testing Association (IMTA)
Executive Steering Committee Meeting
3 November 2003

The Steering Committee met for the 45th IMTA at 1530 hours, 4 November 2003, at the Hilton Garden Inn, Pensacola, Florida. Captain Gary Dye, United States Navy, chaired the meeting. Steering Committee members in attendance are listed on the attachment.

1. Introductions:

Dr. Lentz welcomed everyone to IMTA 2003. The following countries were represented: Austria, the Netherlands, United States, Australia, Canada, United Kingdom, Belgium, Switzerland, France, Germany, and Singapore.

2. Conference Administration:

Capt. Gary Dye gave a synopsis of the 45th IMTA. Approximately 200 people registered to attend, 80 papers were collected, and we received $15,000 in seed funding from IMTA 2002, held in Ottawa, Canada. The theme for this year's conference was "Optimizing Military Performance". The conference expanded this theme to include a fourth track entitled Human Performance. IMTA 2003 also added non-commercial exhibits to the conference, and one presenter will be demonstrating a SkillsNet tutorial.

The keynote speaker on Tuesday is Vice Admiral Harms, Commander of training for the United States Navy. On Tuesday afternoon there will be tours of the Naval Aviation Museum, a social hour, and the IMAX theatre.

The keynote speaker on Wednesday is Dr. Ford of the Institute for Human and Machine Cognition. Wednesday evening will be our IMTA banquet with a live band.

3. Website Report:

Monty Stanley presented a status report on the IMTA website, www.internationalmta.org, emphasizing his recent website makeover.

4. Presentations:

Two presentations were made by prospective IMTA members.


a. Dr. Hubert Annen of Switzerland, MILAK/ETHZ (Military Academy), presented information regarding the Training Centre for Officers of the Swiss Armed Forces, military sciences, and basic and continuing education of officers in various fields of study.

b. Dr. Ferdinand Rameckers of the Netherlands, Defence Selection Institute, presented information regarding the Defence Interservice Command, the Defence Selection Institute, and psychological selection.

5. New Inductions:

Dr. Lentz made a motion to accept both countries (Switzerland and The Netherlands) into the organization. It was seconded by Australia. The motion was approved by the committee.

6. IMTA Bylaws:

Dr. Lentz presented the proposed change to Article V, Section E, of the existing IMTA bylaws. LtCol Francois Lescreve made a motion to accept the change. The motion was approved.

Dr. Lentz explained that we would need to post the recommended change and that the attending membership would need to vote on the change at the IMTA Banquet.

7. IMTA 2004 will be held in Brussels, Belgium, 26-28 October. LtCol Francois Lescreve will be our host. Also, a NATO workshop on Officer Recruiting and Retention will take place the same week.


BY-LAWS OF THE INTERNATIONAL MILITARY TESTING ASSOCIATION

Article I – Name

The name of the organization shall be the International Military Testing Association (IMTA).

Article II – Purpose

A. Discuss and exchange ideas concerning the assessment of military personnel.

B. Discuss the mission, organization, operations and research activities of associated organizations engaged in military personnel assessment.

C. Foster improved personnel assessment through exploration and presentation of new techniques and procedures for behavioral measurement, occupational analysis, manpower analysis, simulation modeling, training, selection methodologies, and survey and feedback systems.

D. Promote cooperation in the exchange of assessment procedures, techniques and instruments among IMTA members and with other professional groups or organizations.

E. Promote the assessment of military personnel as a scientific adjunct to military personnel management.

Article III – Participation

The following categories shall constitute the membership within the IMTA:

A. Primary Membership shall be open to personnel assigned to organizations of the armed services and defense agencies that have been recognized by the IMTA Steering Committee as Member Organizations and whose primary mission is the assessment of military personnel. Representatives from the Member Organizations shall constitute the Steering Committee.

B. Associate Membership shall be open to personnel assigned to military, governmental or other public entities engaged in activities that parallel those of primary membership. Associate members (including prior members, such as retired military or civilian personnel who remain professionally active) shall be entitled to all privileges of the primary members with the exception of membership on the Steering Committee, which may be waived by a majority vote of the Steering Committee.

C. Non-Member Participants are all other interested organizations or personnel who wish to participate in the annual conference, present papers or participate in symposium/panel sessions. Non-Members will not attend the Steering Committee meeting nor have a vote in the association's affairs.

Article IV – Dues

No annual dues shall be levied against the members or participants.

Article V – Steering Committee

A. The governing body of the Association shall be the Steering Committee, which will consist of representatives from the Primary Members and those other members as voted by a majority of the Steering Committee. Commanders of the Primary Member organizations will each appoint their Steering Committee Member.

B. The Steering Committee shall have general supervision over the affairs of the Association and shall have responsibility for all activities of the Association. The Steering Committee shall conduct the business of the Association between the annual conferences of the Association by such means of communication as selected by the Chairman.

C. Meetings of the Steering Committee shall be held in conjunction with the annual conference of the Association and at such times as requested by the Chairman. Representation from a majority of the Primary Members shall constitute a quorum.

D. Each member of the Steering Committee shall have one vote toward resolving Steering Committee deliberations.

E. (Added November 2003) All past recipients of the Harry Greer Award will be ex officio, non-voting members of the Steering Committee, unless they still represent their organization, in which case they would still be a voting member. (The intent here is to maintain the institutional knowledge, the depth and breadth of experience, and the connection to our history that could be lost since Executive Steering Committee members are subject to change.)

Article VI – Officers

A. The officers of the Association shall consist of the Chairman of the Steering Committee and a Secretary.

B. The Commander of the Primary Member coordinating the annual conference of the Association shall select the Chairman of the Steering Committee. The term of the Chairman shall begin at the close of the annual conference of the Association and shall expire at the close of the next annual conference. The duties of the Chairman include organizing and coordinating the annual conference of the Association, administering the activities of the IMTA, and the duties customary to hosting the annual meeting.

C. The Chairman shall appoint the Secretary of the Association. The term of the Secretary shall be the same as that of the Chairman. The duties of the Secretary shall be to keep the records of the Association and the minutes of the Steering Committee, to conduct official correspondence for the Association, and to ensure notice for the annual conference. The Secretary shall solicit nominations for the Harry H. Greer Award.

Article VII – Meetings

A. The Association shall hold a conference annually.

B. The Primary Members shall coordinate the annual conference of the Association, either individually or as a consortium. The order of rotation shall be determined by the Steering Committee. The coordinating Primary Members and the tentative location of the annual conference for the following three years shall be announced at each annual conference.

C. The annual conference of the Association shall be held at a time and place determined by the coordinating Primary Member. Announcement of the time and place for the next annual conference will occur at the annual conference.

D. The coordinating Primary Member shall exercise planning and supervision over the program and activities of the annual conference. Final selection of program content shall be the responsibility of the coordinating Primary Member. Proceedings of the annual conference shall be published by the coordinating Primary Member.

E. Any other organization (other than a Primary Member) may coordinate the annual conference and should submit a formal request to the Chairman of the Steering Committee no less than 18 months prior to the date they wish to host.

Article VIII – Committees

A. Committees may be established by vote of the Steering Committee. The Chairman of each committee shall be appointed by the Chairman of the Steering Committee from among the members of the Steering Committee.

B. Committee members shall be appointed by the Chairman of the Steering Committee in consultation with the Chairman of the committee being formed. Committee chairmen and members shall serve in their appointed capacities at the discretion of the Chairman of the Steering Committee. The Chairman of the Steering Committee shall be an ex officio member of all committees.

C. All committees shall clear their general plans of action and new policies through the Steering Committee. No committee or committee chairman shall enter into activities or relationships with persons or organizations outside of the Association that extend beyond the approved general plan or work specified without the specific authorization of the Steering Committee.

Article IX – Amendments

A. Amendments of these By-Laws may be made at the annual conference of the Association.

B. Proposed amendments shall be submitted to the Steering Committee not less than 60 days prior to the annual meeting. Those amendments approved by a majority of the Steering Committee may then be ratified by a majority of the assembled membership. Those proposed amendments not approved by the Steering Committee may be brought to the assembled membership for review and shall require a two-thirds vote of the assembled membership to override the Steering Committee action.

Article X – Voting

All members attending the annual conference shall be voting members.

Article XI – Harry H. Greer Award

A. The Harry H. Greer Award signifies long-standing exceptional work contributing to the vision, purpose and aim of the IMTA.

B. Selection Procedures:

1. Prior to June 1st of each year, the Secretary will solicit nominations for the Greer Award from members of the Steering Committee. Prior Greer Award winners may submit unsolicited nominations. Award nominations shall be submitted in writing to the Secretary by 1 July.

2. The recipient will be selected by a committee drawn from the Primary Members; committee members will have attended at least the previous three Association annual conferences.

3. The Chairman of the Award Committee is responsible for canvassing other committee members to review award nominations and reach a consensus on the selection of a recipient of the award prior to the annual conference.

4. The Award Committee selection shall be reviewed by the Steering Committee.

5. No more than one person is to receive the award each year, but the Steering Committee may decide not to select a recipient in any given year.

C. The Award is to be presented during the annual conference. The Award is to be a certificate, with text prepared by the officers of the Association, and appropriate memorabilia at the discretion of the Chairman.

Article XII – Enactment

These By-Laws shall be in force immediately upon acceptance by a majority of the assembled membership of the Association.


USABILITY TESTING: LESSONS LEARNED AND METHODOLOGY

Helene Maliko-Abraham
Basic Commerce and Industries, Inc.
Helene.ctr.maliko-abraham@faa.gov

Ronald John Lofaro, PhD
Embry-Riddle Aeronautical University
lofaror@erau.edu

INTRODUCTION

Operational usability testing is an essential aspect of fielding new systems. The field of knowledge engineering holds great promise in developing new and effective methodologies for such tests. When developing a new system, you have to know, understand and work with people who represent the actual user community. It is the users who determine when a system is ready and easy to use. The user is commonly referred to as a Subject Matter Expert (SME). The efforts of SMEs add credibility and accuracy to the Operational Test (OT) process.

Based on the main author's recent OT experience, careful consideration must be given to how to properly utilize the input and expertise of the SME group. An evaluation of lessons learned from this activity revealed that the full potential of the SME contributions was not realized. The major contributing factor was the lack of an appropriate methodology to effectively focus the efforts of this group.

The Small Group Delphi Paradigm (SGDP) (Lofaro, 1989) could have been that methodology. The SGDP evolved from the Delphi process, which was originally developed in the 1950s as an iterative, consensus-building process for forecasting futures. The SGDP took the Delphi process in another direction by modifying it via merger with elements of group dynamics in order to have interactive (face-to-face) Delphi workshops. The modification resulted in a paradigm for eliciting evaluations, ratings and analyses from small groups of experts. The SGDP can be used for any project that requires a set of SMEs to identify, evaluate, and rate the criticality of tasks. It also has applications in recommending modifications to equipment, procedures and training. Finally, the SGDP can be used to sharpen, modify and revise methodologies.

The link between the SGDP and usability testing is that both use SMEs to elicit information. The information garnered from the SMEs can then be used to create realistic scenarios to evaluate the operability of the equipment being tested. This paper will discuss how the SGDP technique can be used to develop scenarios that will be used in operational usability testing.


OPERATIONAL USABILITY TESTING

Dumas and Redish (1999) define usability to mean that "the people who use the product can do so quickly and easily to accomplish their own tasks." There are four main parts to their definition: usability means focusing on users; people use products to be productive; users are busy people trying to accomplish tasks; and the users decide when a product is easy to use.

The FAA's Acquisition Management System (AMS) Test and Evaluation (TE) policy guidelines require that selected test processes include a verification of operational readiness. The policy defines two distinct components of operational readiness: operational effectiveness and operational suitability. "Operational suitability is the degree to which a system can be used satisfactorily in the field with consideration given to availability, maintainability, safety, human factors, logistics, supportability, documentation, and training". For the purpose of this paper, the authors will be concentrating on operational suitability, which is synonymous with usability.

SMALL GROUP MODIFIED DELPHI

Delphi techniques have become common methodologies for eliciting analyses. Standard Delphi techniques include anonymity of response, multiple iterations, convergence of the distribution of answers, and a statistical group response (Judd, 1972). However, as Meister (1985) has said, "the Delphi methodology is by no means fixed…[it] is still evolving and being researched." The SGDP is seen as another step in this evolution.

In the development of the SGDP technique, Fleishman's underlying abilities theory was merged with traditional Delphi techniques as well as group dynamics. The SGDP has been successfully used to knowledge-engineer SME data for the development of core competencies, selection tests, training, analyses, objectives and models. The SGDP is a highly structured, sequentially ordered process. This technique has been used in many environments, which demonstrates a robust flexibility and generalizability of the paradigm. This flexibility and generalizability are borne out as the SGDP has been used, with modifications resulting from initial conditions, multiple times (Lofaro and Intano 1990, Gibb and Lofaro 1991, Gibb and Garland 1994, Lofaro, 1998). Every use of the basic SGDP model results in modifications and takes "shape" as the objectives are defined, the SMEs are selected and time limits are set.

THE STRUCTURE

After carefully selecting the SMEs who will provide expert data, opinions and criticality ratings, the first step is to develop the objectives. Each objective must then be enumerated, and sub-objectives developed, so that all the components needed to achieve the whole objective are in place. This becomes the basis for developing and scheduling the times/types of group sessions.

A read-ahead package must be prepared and sent to the Workshop participants at least three weeks in advance. This package is vital: the SGDP will flow smoothly from it, or not. The package must contain, besides the pro forma where and when, the following:

A. The objectives of the workshop, including the longer-range goals that the workshop will enable. This is key to obtaining SME buy-in and maximum cooperation.

B. A clear statement describing not only what the participants will be doing, but also informing them that their input will be operationally implemented. In this package, the SMEs should be advised that they were hand-picked for their acumen and experience and that they are the final "word" on this. All of this is true, but the clarity, transparency and use-value of both the immediate and long-range goals in the operational environment are what will ensure maximal SME effort.

C. Participants should be advised that small-group processes will be used and that group dynamics training will be the first step in the process. Materials on "groupthink", group forming etc. should also be included in this package. At the first session, some group exercises should be conducted to demonstrate what the materials have described.

D. Finally, the read-ahead package should include a homework assignment. Participants should be instructed to read all of the read-ahead package and then begin thinking and working on the first objective. A day-by-day agenda should be provided as well.

NOTE: Along with the read-ahead, the facilitator must prepare a complete set of protocols to be given to the participants when they arrive. Besides the read-ahead materials, the protocols also contain the group processes materials and a step-by-step breakout of the objectives, their sequencing, and how each objective will be carried out. The protocols are the structure that keeps the groups on the process track.

THE SEQUENTIAL PROCESS

The optimum size of the workshop is 10 persons, broken into 2 sub-groups of 5. Upon arrival, each SME should be provided with a final agenda and the protocols that specify how each objective will be accomplished. The next step is to proceed with instructions and exercises in group dynamics and consensus. Work on the objectives can now begin. An iterative, step-wise process should be used where each objective is first addressed by anonymous and individual means. Sub-group discussions are then held to achieve consensus. Finally, the intact group (the 2 sub-groups meeting together) holds a discussion to achieve final consensus. The iterative, step-wise process is to go from a 5-person sub-group meeting on sub-objectives to an intact 10-person meeting to make a group decision on the main objective.
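To make the iterative rating step concrete, the sketch below (not part of the original paper) simulates one sub-group's anonymous rating rounds in Python. The 1-5 criticality scale, the interquartile-range (IQR) convergence check, and the threshold value are illustrative assumptions, not statistics prescribed by the SGDP.

```python
# Hypothetical sketch of the SGDP's iterative, anonymous rating step.
# The 1-5 scale, the IQR convergence check, and the threshold are assumptions.
from statistics import median, quantiles

def round_converged(ratings, iqr_threshold=1.0):
    """True when the spread of a round's anonymous ratings is small enough
    to stop re-rating and move to the consensus discussion."""
    q1, _, q3 = quantiles(ratings, n=4)   # quartile cut points
    return (q3 - q1) <= iqr_threshold

def consensus_rating(ratings):
    """The 'statistical group response': the median of the final round."""
    return median(ratings)

# One 5-person sub-group rating a task's criticality on a 1-5 scale.
round_1 = [2, 5, 3, 5, 1]                 # wide spread: discuss, then re-rate
round_2 = [3, 4, 3, 4, 3]                 # tighter after discussion
print(round_converged(round_1))           # False
print(round_converged(round_2))           # True
print(consensus_rating(round_2))          # 3
```

In an actual workshop the discussion between rounds is face-to-face, which is precisely what distinguishes the SGDP from the classical mail-round Delphi.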

DISCUSSION

One obvious candidate for the SGDP methodology is Operational Testing (OT). The goal of any OT is to determine the system's capability to perform its mission in the operational setting and to determine and evaluate core operational effectiveness and suitability problems. The SGDP can be modified so that SMEs can use it in OT.

A small set of carefully selected SMEs would be used to face-validate the existing scenarios. This SME set also would ensure that all the necessary operational issues were embedded in the scenarios. Dumas and Redish state, "a good scenario is short, composed in the user's words (not the product's), and is clear so all participants will understand it". A second group of SMEs would also assist in the test sequencing as well as the techniques/scalings to be used in workload analysis. While the process of multiple SGDP workshops may seem lengthy, this is not the case. The workshops would run consecutively, each new one beginning with the data from the prior one, and each SGDP workshop would need only approximately 4 days to complete. The SMEs' experience and insight into the realities of the operational arena are both needed and invaluable. Upon completion of the actual OT, all SME members of the 2 previous SGDPs would be convened to interpret the results and to make a set of recommendations. These recommendations may include retrofit, modification, and training. The two SGDP groups would work independently at first. Then, as they finished, they would convene into one group to finalize their results.


A major consideration is the impact of using face-to-face groups on ratings and evaluations. Pill (1970) has said that this may dilute the opinions of the real expert. This seems a strange objection, as the SMEs selected are the real experts. However, if what is meant is that one group member may have more expertise in the small, specific area being worked on, then the reality (based on conducting 7 or so of these) is that the other SMEs recognize, welcome and use that expertise, as their goal is the best product possible.

Another objection is that the group dynamics may force ratings and analyses towards a mean or middle ground that does not fully reflect all the SMEs' views. There are 2 answers to this: the first is that "real" SMEs will not allow that to happen, for reasons of personal pride. They will not "go along to get along." The second is that the instruction in group work, the facilitator and the iterative methodology used in accomplishing the sub-objectives are all structures in the SGDP process designed to ensure that this does not happen.

CONCLUSION

While the use of the SGDP technique has been suggested here for building operability testing scenarios and the evaluative criteria for them, its use may be extended. In developing the content for any usability test, to include requirements, procedures, scenarios and evaluation, a variant of the SGDP can be used. As stated previously, the goal of conducting any operability testing on a system is to ascertain the operational readiness of the system, i.e., the operational effectiveness and the operational suitability. The identification and refinement of system-critical operational requirements are eminently suited to being accomplished via the SGDP.

REFERENCES

Dumas, J.S., and Redish, J.C. (1999). A Practical Guide to Usability Testing. Portland, OR: Intellect.

Gibb, G.M., Lofaro, R.J., et al. (1991). "The Development of Enhanced Screening Techniques for the Selection of Air Traffic Controllers." Proceedings of the Annual Air Traffic Controller Association (ATCA) Symposium, September 1991.

Judd, R.C. (1972). "Use of Delphi Methods in Higher Education." Technological Forecasting and Social Change, 4, 176-196.

Lofaro, R.J. (1998). "Identifying and Developing Managerial and Technical Competencies: Differing Methods and Differing Results." Proceedings of the 42nd Annual Meeting of the Human Factors and Ergonomics Society.

Lofaro, R.J., Gibb, G.M., and Garland, D. (1994). "A Protocol for Selecting Airline Passenger Baggage Screeners." DOT/FAA/CT-94/110. Springfield, VA: National Technical Information Service.

Lofaro, R.J., and Intano, G.P. (1989). "Exploratory Research and Development: Army Aviator Candidate Classification by Specific Helicopter." Proceedings of the 5th International Symposium on Aviation Psychology, R.S. Jensen (Ed.).

Meister, D. (1985). Behavioral Analysis and Measurement Methods. New York: Wiley.

Pill, J. (1970). "The Delphi Method: Substance, Context, Critique and an Annotated Bibliography." Technical Memorandum 183, Case Western Reserve University, Cleveland, OH.


WHY WE STILL NEED TO STUDY UNDEFINED CONCEPTS

Nicola L Elliott-Mabey
Deputy Directorate Policy (Research and Management Information)
Room F107a, HQ PTC, RAF Innsworth
Gloucester, GL3 1EZ
trgres1.cos@ptc.raf.mod.uk

INTRODUCTION

The Deputy Directorate Policy (Research & Management Information) (DDP(Res & MI)) consists of 8 psychologists and a psychology student, and is situated at the UK Royal Air Force's (RAF) Headquarters Personnel and Training Command, RAF Innsworth, Gloucester. DDP(Res & MI) provides and co-ordinates appropriate applied psychological research in support of current and future RAF personnel and training policies. The core of the work programme relates in particular to the areas of ethos/culture, recruitment, training, retention, community support, diversity and equality. As part of its applied research studies programme, a number of surveys are conducted which quantify attitudes to a wide variety of factors, ranging from satisfaction with pay to the importance of promotion prospects. However, more nebulous terms and concepts such as 'morale', 'quality of life', 'ethos', 'stability', and 'overstretch' are also regularly assessed and measured; yet these concepts do not lend themselves well to agreed academic definition at either a conceptual or an operational level.

FOCUS

This paper will focus upon why it is important to study undefined or poorly defined concepts such as ethos and morale. The paper will discuss the importance of definition and language to the discipline of psychology and the consequences for the measurement of attitudes. It will then consider why certain terms fail to be conventionally defined yet remain significant to the study of the military because of their common usage and growing organisational interest.

IMPORTANCE OF DEFINITION AND MEASUREMENT

Importance of definition and language

There is no doubt that psychology has its own language. There are terms which are particular to psychology, for example 'ego-centric' and 'self-actualisation', but also terms which have both common and psychological definitions, such as 'extraversion', 'preference' and 'motivation'. Such a technical vocabulary requires definition for comprehensibility and common understanding, and to ensure that the terms are used in a specific and consistent manner. A definition is likely to include the elements which comprise the concept but also reference to the stability of these elements, eg that job satisfaction may vary over time.

A definition should refer to which elements are included but also what is excluded. An important distinction is not necessarily whether the definition is in some abstract sense 'correct' but "whether it is useful or not" (Liefooghe, Jonsson, Conway, Morgan, and Dewe, 2003, p28). A clear definition is therefore judged by many (eg McKenna, 1994) to be the necessary precursor for measurement. Researchers need to be certain what they are trying to quantify in order to employ an existing measure or construct a new one. A clear definition also sets a baseline and assists replication and the broadening of research in the future.



Types of definition

There are different types of definition available to psychologists. They may use a descriptive/conventional definition which is universally agreed, ie a dictionary definition, for example "brain: the part of the central nervous system which is encased by the skull" (Reber, 1985, p101). Alternatively, a stipulative/working definition may be more appropriate, where the researcher indicates how they will use the term; acknowledging ambiguity over the meaning, or that several meanings exist and the most relevant is chosen.

Definitions can also be conceptual or operational. Conceptual definitions are concerned with defining what a given construct means, for instance satisfaction is "an emotional state produced by achieving some goal" (Reber, 1985, p660). An operational definition, on the other hand, needs to be understood in relation to a given context in which the term is applicable, and will make reference to how the attribute is to be measured, for instance intelligence is "that which intelligence tests test" (Bell, Staines and Mitchell, 2001, p114).

Measurement

There is a strong school of thought which advocates that only terms which can be defined can be measured (eg Schwab, 1999). The reasoning behind this is to ensure precision of meaning, thus avoiding ambiguity of results. As well as the propensity in psychology for definition of terms, there is a natural tendency towards measurement. Through measurement, constructs are made researchable.

Attitudes are hypothetical constructs representing individuals' tendencies and so cannot be measured directly. As such, only inferences can be made about the nature of attitudes, by measuring the behaviour believed to stem from given attitudes or by asking individuals to report their feelings, thoughts and opinions. Nevertheless, many different measurement tools have been developed. Attitudes are measured because, although they are not directly observable, they facilitate or hinder activity; that is, they can be the underlying cause for an action.

Any measure must seek to be valid and reliable. Reliability relates to the consistency and stability of the tool, and is necessary for validity to exist. For an instrument to be valid it must measure what it purports to measure (Cannell and Kahn, 1968). In relation to validity, it is paramount that an instrument measures the construct "fully and exclusively" (Young, 1996, p1).
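As a concrete illustration of the reliability requirement, the sketch below (not from the original paper) computes Cronbach's alpha, one widely used internal-consistency estimate for multi-item attitude scales; the three Likert items and five respondents are hypothetical.

```python
# Hypothetical sketch: Cronbach's alpha as an internal-consistency
# (reliability) estimate for a small attitude scale. Data are invented.

def cronbach_alpha(items):
    """items: one list of scores per scale item, aligned by respondent."""
    def var(xs):                                      # sample variance (n - 1)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)                                    # number of items
    item_var_sum = sum(var(it) for it in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Three 5-point Likert items answered by five respondents.
item_scores = [
    [4, 2, 5, 3, 4],
    [5, 1, 4, 3, 4],
    [4, 2, 5, 2, 5],
]
print(round(cronbach_alpha(item_scores), 2))          # 0.92 for these data
```

Note that a high alpha alone does not establish validity: a scale can be perfectly consistent while still failing to measure the construct "fully and exclusively".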

Definition of attitudes

Whilst psychology has a desire to categorise, define and measure variables, as practitioners engaged in 'real world' research we are acutely aware of how difficult this is. Motivation, job satisfaction, team, culture and stress are examples of occupational psychology terms which have been defined, but in many different ways. Some might even call them 'definitional elusive' (Reber, 1995, p454), terms which have 'resisted clarification' (Reber, 1995, p101).

Researchers and practitioners alike recognise the difficulties of definition, and therefore measurement, but even so have attempted to construct working definitions and measurement tools. Often these are relevant to a given context or perspective; take for instance the following two definitions of leadership:

"The process by which an agent induces a subordinate to behave in a desired manner" (Bennis, 1959)

"The process of influencing an organised group towards accomplishing its goals" (Roach and Behling, 1984)

The first definition emphasises that a leader is someone with a subordinate(s), whereas the second focuses on the process of leadership, ie to attain a goal through the influence of a group member.

It is important to note that definitions vary due to a whole range of dependencies and influences such as context, theoretical perspective, how the definition is to be used or how it will be researched. What this shows, however, is an understanding amongst the psychology community that there is often no one single correct definition (Hughes, Ginnett, and Curphy, 2002), although of course some may believe their definition to be 'more appropriate' or 'better' than others. It also demonstrates how complex and multi-faceted such concepts are, in that they cannot be universally defined.

MEASURING POORLY DEFINED CONCEPTS

Presence of poorly defined concepts

What concepts are poorly defined? There are many concepts that lack universally agreed definitions, such as those listed above (eg motivation), but this does not mean they suffer from a lack of definition per se. There are terms such as morale, however, that have not been so successfully defined. This term is very significant to the military environment but seems to escape definitive classification. In a recent investigation of the definition and measurement of morale, Liefooghe et al (2003, p29) concluded that "whilst many researchers opt for the objective stance and attempt to investigate the nature of morale … many skip the part of explicitly defining the concept. The research process moves from the research question directly to the measurement, without explicitly addressing definitional issues". This finding is at odds with the importance of defining concepts in order to measure them. In the case of morale there was consensus that it was some form of psychological state, and many correlates and/or components were identified, but actual conceptual definitions were limited. Why is this the situation? It is agreed that morale is an important concept, especially in relation to work performance and group effectiveness, but what it comprises is trickier. Liefooghe et al's (2003) work shows that previous researchers could not decide if the term related to individual or group processes, whether it was a single entity or a process, or if indeed it was actually a portmanteau construct combining different characteristics.

Reasons to measure poorly defined concepts

So why should poorly defined concepts be measured? Good practice dictates clear definition before measurement, although this is not always achieved (e.g. morale), nor is a universal definition always agreed on (e.g. leadership). The terms which elude definition are often referred to as nebulous and vague. However, should this be a reason to ignore such concepts in occupational psychology research, and in particular in the military context? One argument is that it is not acceptable to discount these concepts, for a number of reasons, not least because they are commonly used in everyday parlance. The existence of such attitudes is ‘observed’ in the military in two forms. Although some individuals may not use the term in question, they may discuss components of it. For instance, a corporal may not actually mention his ‘level of job dissatisfaction’ but may talk about factors which contribute to the overall attitude, such as ‘not liking my current job’, ‘unhappy with pay and conditions’, ‘I clash with my superiors’, etc. Alternatively, there is evidence, from at least one recent RAF attitude survey, that individuals use terms like morale, ethos and stability. The following examples are from the 2001-2002 Officers' Reasons for Leaving the Service questionnaire:

“The relentless search for cost-cutting is very wearing and morale sapping.”

“Line managers are too busy to discharge proper welfare for subordinates, which undermines morale.”

“Stability and quality of life plays a great part in my decision to leave.”

“A gradual compounding degeneration of branch focus and quality of life, coupled with the Service’s ethos of paying an individual to leave rather than paying to retain.”

Therefore an interest in the wellbeing of Service personnel makes it a valid reason to investigate these issues, because of the common and colloquial use of these terms, albeit poorly defined in the academic sense. It is also apparent that individuals have a shared understanding of what these terms mean. This is not to say that Service personnel would universally define morale (or indeed that they could articulate what it meant in psychological terms), but it is clear that it is an important term to them. Another example is specific to aircrew. Pilots confirm the importance of ‘airmanship’ to flying/operating aircraft efficiently, effectively and safely. Many can describe components of it, for example ‘being able to prioritise’, ‘being able to multi-task’, ‘being aware of everything that is happening inter- and intra-cockpit’, but few seem able to define the concept as a whole. This does not diminish the significance of airmanship, nor make it any less a candidate for research. The key here is how it can be measured; more on this in a moment.

But what other reasons are there to measure terms which are poorly defined? One important reason is that we are asked to do so by the organisations we serve. To cite morale again, the following quotation illustrates the perceived significance of the concept:

“High morale, as well as being important in its own right, is a vital factor in retaining personnel, which in turn improves manning levels and helps to obtain the optimum return on investment in training. Our aim is to maintain excellent levels of retention and morale through policies that reflect the priorities of our people and their families” (Ingram, 2002 – UK MOD Minister for Armed Forces).

More and more there is a requirement to quantify performance against management targets and indicators. We have to pragmatically research issues which are notoriously difficult to measure (phrases like ‘poisoned chalice’ and ‘holy grail’ are conjured up) but which can suffer from a great deal of anecdotal belief and ‘gut feeling’. This is not to say that everything can be measured, but as occupational psychologists it is our job to use sound research to investigate these questions. There is a need to provide scientific evidence on which to base policy decisions, and this can mean researching concepts that are not psychologically well defined but which may have colloquial definitions.

Finally, as occupational psychologists we are constantly trying to make sense of the working environment, the people within it and the organisation itself. This curiosity means we should not disregard concepts that are at present vague (and may always elude conventional definition) but which can help explain the world of work, especially in the military context. We are of course not trying to make sweeping generalisations about the results. Often attitudes raised in survey research require further investigation, especially if a subset of the population has a different viewpoint to the mainstream. Findings are caveated because attitudes may not result in direct behaviour or action, or in the direction proposed. Additionally, the findings may tell us something about a tendency towards a situation but nothing about the attitude itself. However, our results will tell us something about individual or group tendencies and the prevalence of such tendencies.

Measurement of poorly defined concepts

So if there are compelling reasons for trying to measure poorly defined concepts, how can we achieve this reliably and validly? A full answer would create enough material for several additional discussion papers. However, there are a few key points worth highlighting.

Anderson, Herriot and Hodgkinson (2001) believe that the basis for, and the majority of, occupational/industrial psychology should be grounded in what they call ‘pragmatic science’, that is, psychology should have “both practical relevance and methodological rigour” (Anderson et al, 2001, p394). It has been suggested that only well defined concepts can be measured. However, in reality many attitudinal instruments have been developed to measure poorly defined terms. The relevance of and motivation for attitude measures has already been outlined, so the next question relates to ‘methodological rigour’.

Hinkin (1998) sets out the constituents of a good measure of a construct, i.e. ensuring construct validation: identify the construct domain (clear definition); develop items (based on previous research, subject matter experts and pilot studies); determine how the items measure the construct domain (content validity); and assess antecedents, predictors, correlates and consequences of the concept (convergent, discriminant and criterion validity). Measurement can therefore be made reliable and valid, although this does not necessarily help define the concepts in the first place. So what have other researchers done?
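As an illustration of Hinkin's item-development step, here is a small sketch of the corrected item-total correlation check often used to screen pilot items; the data and the four candidate 'morale' items are hypothetical:

```python
import numpy as np

def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """For each item, correlate it with the sum of the *other* items.
    Low or negative values flag items that may not tap the construct domain."""
    n_items = items.shape[1]
    out = np.empty(n_items)
    for j in range(n_items):
        rest = items[:, [k for k in range(n_items) if k != j]].sum(axis=1)
        out[j] = np.corrcoef(items[:, j], rest)[0, 1]
    return out

# Illustrative pilot data: 6 respondents, 4 candidate morale items (1-5 Likert).
pilot = np.array([
    [4, 4, 5, 2],
    [2, 3, 2, 4],
    [5, 5, 4, 3],
    [1, 2, 1, 5],
    [3, 3, 4, 1],
    [4, 5, 5, 2],
])
print(corrected_item_total(pilot).round(2))  # last item correlates negatively -> review it
```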

Citing morale yet again: as a concept it has been measured as a single entity, e.g. “How would you rate your own morale?” (Schumm & Bell, 2000), and this might reflect the lack of an operational definition (Liefooghe et al, 2003). Is this the best approach when there is no agreed definition? Alternatively, should we employ multi-item measures? For example, Paulus, Nagar, Larey and Camacho (1996) used seven items relating to feelings about being in the Army, unpleasant experiences in the Army, helpful Army experiences, relationships with other soldiers, satisfaction with leadership, reenlistment, and desire to leave the Army. The second option assumes that we are confident about the components of a given concept and that we can construct a ‘morale scale/score’ from the results.
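A minimal sketch of that second option, assuming we trust the chosen components: average multi-item responses into a composite 'morale score' and check how it converges with a single-item rating. All data here are invented for illustration:

```python
import numpy as np

# Hypothetical data: seven morale-related items (1-5) and a single-item
# "How would you rate your own morale?" rating for the same five respondents.
multi_item = np.array([
    [4, 4, 3, 5, 4, 4, 3],
    [2, 1, 2, 2, 3, 2, 1],
    [5, 4, 5, 4, 5, 5, 4],
    [3, 3, 2, 3, 3, 4, 3],
    [1, 2, 1, 2, 1, 1, 2],
])
single_item = np.array([4, 2, 5, 3, 1])

composite = multi_item.mean(axis=1)            # one 'morale score' per respondent
r = np.corrcoef(composite, single_item)[0, 1]  # convergence of the two approaches
print(composite.round(2), round(r, 2))
```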

One approach might be to explicitly define terms for respondents, i.e. to use stipulative definitions which indicate how we intend the term to be used. Here a new definition may be created, one which is specific to a given environment like the military. If a full definition is felt to be too prescriptive, examples of attributes and characteristics which comprise the term could be used instead; e.g. for morale: ‘dedication’, ‘willingness to sacrifice’, ‘motivation’, ‘confidence’, ‘commitment’. A note of caution though: even when a definition is provided, it does not mean that respondents will use it (Oppenheim, 1992).



It seems that, in the absence of universal definitions and/or theoretical frameworks, practitioners have to try to develop tools that are as reliable and valid as possible, which may seem unachievable without a clear definition of concepts. However, preliminary observations and previous research help us construct our items, and we are able to determine the consistency of responses and understanding of terms whilst seeking more definitive baselines. There is, therefore, the opportunity to develop working definitions for poorly defined concepts in order to measure them, and some exciting work could emerge in this field in years to come.

CONCLUSION

Although there is a strong tendency in psychology to define and measure constructs, there are several key concepts which remain poorly defined and which resist categorisation. These concepts are still measured in academic research, but importantly also by occupational psychology practitioners, including those working in the military environment. This is attempted because of the common usage of such terms as morale and ethos, and because of the increasing need of organisations to quantify personnel’s attitudes to a range of issues. In the interim period (however long that might be) before universal or at least working definitions (probably context based) and/or theoretical frameworks are constructed, measures will be developed as reliably and validly as possible in order to capture these key attitudes.

REFERENCES

Anderson, N., Herriot, P., and Hodgkinson, G.P. (2001). The practitioner-researcher divide in industrial, work and organizational (IWO) psychology: Where are we now, and where do we go from here? Journal of Occupational and Organizational Psychology, 74, 391-411.

Bell, P.B., Staines, P.J. and Mitchell, J. (2001). Evaluating, doing and writing research in psychology. Melbourne, Australia: Sage.

Bennis, W.G. (1959). Leadership theory and administrative behavior: The problem of authority. Administrative Science Quarterly, 4, 259-260.

Cannell, C.F. and Kahn, R.L. (1968). Experimental psychology. In G. Lindzey and E. Aronson (Eds), Handbook of social psychology, Vol 2. Reading, MA: Addison Wesley.

Hinkin, T.R. (1998). A brief tutorial in the development of measures for use in survey questionnaires. Organizational Research Methods, 1(1), 104-121.

Hughes, R.L., Ginett, R.C., and Curphy, G.J. (2002). Leadership: Enhancing the lessons of experience. New York, NY: McGraw Hill.

Ingram, A. (2002). House of Commons Written Answers to Questions, Mon 11 Feb 2002. http://www.parliament.the-stationary-office.co.uk/pa/cm/cmhansrd

Liefooghe, A., Jonsson, H., Conway, N., Morgan, S., and Dewe, P. (2003). The definition and measurement of morale: Report for the Royal Air Force. Extra-mural contract report for RAF: Contract PTC/CB/00677.


McKenna, E. (1994). Business psychology and organisational behaviour. Hove, UK: Lawrence Erlbaum Associates Ltd.

Oppenheim, A.N. (1992). Questionnaire design, interviewing and attitude measurement. Kings Lynn, Norfolk: Pinter Publishers Ltd.

Paulus, P.B., Nagar, D., Larey, T.S. and Camacho, L.M. (1996). Environmental, lifestyle, and psychological factors in the health and well-being of military families. Journal of Applied Social Psychology, 26(23), 2053-2057.

Reber, A.S. (1985). Dictionary of psychology. St Ives, UK: Penguin Books.

Roach, C.F. and Behling, O. (1984). Functionalism: Basis for an alternate approach to the study of leadership. In J.G. Hunt, D.M. Hosking, C.A. Schriesheim and R. Stewart (Eds), Leaders and managers: International perspectives on managerial behavior and leadership. Elmsford, NY: Pergamon.

Schumm, W.R. and Bell, D.B. (2000). Soldiers at risk for individual readiness or morale problems during a six-month peacekeeping deployment to the Sinai. Psychological Reports, 80, 623-633.

Schwab, D.P. (1999). Research methods for organizational studies. Mahwah, NJ: Lawrence Erlbaum Associates Ltd.

Young, C.A. (1996). Validity issues in measuring psychological constructs: The case of emotional intelligence. http://trochim.human.cornell.edu/tutorial/young/ieweb.

© British Crown Copyright 2003, MOD. Published with permission of the Controller of Her Britannic Majesty’s Stationery Office.



Do we assess what we want to assess?
The appraisal dimensions at the Assessment Center for Professional Officers (ACABO)

Dr. Hubert Annen & Lic. phil. Barbara Kamer
Military Academy at the Swiss Federal Institute of Technology
Steinacherstrasse 101b, CH-8804 Au
hubert.annen@milak.ethz.ch

INTRODUCTION

The Assessment Center is a widely used tool for the selection and development of managers in various organizations. Many studies have demonstrated that assessment center appraisals predict a variety of important organizational criteria, such as training and job performance or promotion (Gaugler, Rosenthal, Thornton & Bentson, 1987; McEvoy, Beaty, & Bernardin, 1987).

It appears therefore that the assessment center has good predictive validity. However, scientists seem to differ about what each individual dimension, and the tool itself, really measure, or what the observers actually assess. When it comes to construct validity, most studies show a similar picture: ratings of multiple dimensions within a single exercise correlate more highly than do ratings of the same dimension across multiple exercises (Annen, 1995; Bycio, Alvares, & Hahn, 1987; Kleinmann, Kuptsch, & Köller, 1996; Robertson, Gratton, & Sharpley, 1987; Sackett & Dreher, 1982; Turnage & Muchinsky, 1984).
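The pattern described here, the so-called exercise effect, can be illustrated with a small simulation (not the ACABO data): when ratings share exercise-specific variance, within-exercise correlations across dimensions exceed cross-exercise correlations of the same dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

# ratings[c, e, d]: candidate c, exercise e, dimension d (toy numbers).
# Each candidate gets an exercise-specific "halo" shared by both dimensions.
halo = rng.normal(size=(50, 3, 1))
ratings = halo + rng.normal(scale=0.5, size=(50, 3, 2))

def mean_r(pairs):
    """Average Pearson correlation over a list of rating-vector pairs."""
    return float(np.mean([np.corrcoef(a, b)[0, 1] for a, b in pairs]))

# same exercise, different dimensions
within = mean_r([(ratings[:, e, 0], ratings[:, e, 1]) for e in range(3)])
# same dimension, different exercises
across = mean_r([(ratings[:, e1, d], ratings[:, e2, d])
                 for d in range(2) for e1 in range(3) for e2 in range(e1 + 1, 3)])
print(round(within, 2), round(across, 2))  # within-exercise r is clearly larger
```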

There are many hypotheses which try to explain why the assessment center is nevertheless a good predictor of future job success. Russell and Domm (1995), for example, argue that assessment centers have such high prognostic value because they measure attitudes which are of importance for the future job. For a better understanding of the predictive value of the assessment center, Shore, Thornton and Shore (1990) claim that the construct validity of dimension ratings should be explored by building a nomological network of related constructs. Their own studies showed that, during an assessment center, cognitive ability correlates more strongly with problem-solving dimensions and that personality traits have a stronger connection with interpersonal dimensions. Other studies focus on the connection between personality factors or cognitive ability and performance in the assessment center, and they have produced significant results (Crawley, Pinder, & Herriot, 1990; Fleenor, 1996; Chan, 1996; Goffin, Rothstein & Johnston, 1996; Scholz & Schuler, 1993; Spector, Schneider, Vance, & Hezlett, 2000).

Taking into account the various studies made on the subject, it seems difficult to establish a correlation between the results of an assessment center and certain other criteria. Each assessment center in an organization is tailored to persons with a specific job background. Depending on the job profile which the successful candidate should meet, different observation criteria are used and operationalized according to the requirements of the job. It is therefore of vital importance to have a clear idea of what is really measured through the given dimensions, and whether or not we really measure what we mean to measure.


THE ASSESSMENT CENTER FOR PROFESSIONAL OFFICERS (ACABO)

During the winter semester 1991/92 the Swiss Military College at the Federal Institute of Technology Zurich (ETH) introduced a diploma study course for future professional officers. In much the same way as managers in the private sector, future professional officers not only need intellectual ability and technical skills but must also show a high level of social competence. Therefore in 1992 a three-day assessment center programme was developed, in order to provide students with an appraisal of their situation and hints for improvement concerning their personal and social competences, and to provide the trainers with more accurate information regarding their students. In this form the assessment center was neither a pure selection instrument nor a long-term potential appraisal. In 1996 the assessment center finally became a definitive selection tool, an obstacle to be overcome by every candidate before the beginning of his study course.

The ACABO is a classical three-day assessment center. The candidates have to deal with reality-based tasks in group discussions, presentations and role play. The observer team is composed of superiors and chiefs of training, who are recruited above all from divisions which have sent candidates. Because the Swiss Militia Army can still be considered a part of society, civilian assessors – usually psychologists or human resources specialists – are also employed. During the assessment center, each participant is appraised by several observers according to seven rating criteria.

Owing to the fact that the decisions taken during the assessment center have far-reaching consequences, regular scientific evaluation, with resulting adaptations and further developments of the procedure, is indispensable. Besides studies on social validity (Hophan, 2001) and interrater reliability (Wey, 2002), studies on construct- and criterion-related validity (Annen, 1995) as well as prognostic validity (Gutknecht, 2001) have been conducted. In more recent studies (Annen & Kamer, 2003) endeavours were made to make a further contribution to the nomological network of the assessment center and to show the connection between personality factors, cognitive competence and the assessment center results.

It has always been a basic principle to use the findings of the studies on the ACABO not only for scientific purposes but to implement them in practice in order to further develop the tool. Since the current paper can be seen as another contribution to the understanding of our assessment center dimensions, and since the research design is based on the findings of former studies, the following pages will again present the most important former studies and their practical implications.

FORMER STUDIES

Social validity

Based on the concept of social validity (Schuler, 1998), the candidates fill in a questionnaire on completion of the ACABO, in which they are asked to convey their impressions of and attitudes towards the ACABO they have just gone through. Hophan (2001) critically examined the results and came to the conclusion that the high level of acceptance by the participants is independent of their geographical origin, school qualifications, assessment center results, age or military rank.



Construct and criterion related validity

Annen (1995) examined the construct and criterion related validity of the ACABO and concluded that the ACABO, like many other assessment centers, shows no construct validity in a test-theoretical sense. Based on these findings, the question arises as to how the dimensions could be depicted in a more differentiated and valid way. Of special interest was the dimension “analysis”, because it was unclear whether some candidates were unable to present an adequate solution to the problem because of a lack of structural and analytical abilities or because of practical inexperience.

Further development of the ACABO based on this study: in order to better underpin the dimension “analysis”, it was decided in 1996 to introduce written cognitive ability tests and to integrate the result, at one fourth of its weight, into the rating of the dimension “analysis”.

Prognostic validity of the ACABO

Gutknecht (2001) examined the prognostic validity of the various tools used for the selection of professional officers with regard to study and job success. His findings showed that school performance (grades of the upper secondary certificate) and cognitive ability (tests in the ACABO) have the highest prognostic validity regarding study success. Cognitive ability together with study success turns out to be the best predictor of job success (assessed by means of job appraisals by superiors). Given that the assessment center should have high prognostic validity, a high correlation between the competences measured during the assessment center and job success is to be expected. But this is simply not the case. The results reveal that the predictor “social competence” can be called construct valid, but this predictor shows no significant correlation with study success or with job appraisals by superiors.

It has also been confirmed on various occasions that assessment centers predict career development rather than future performance ratings (e.g. Thornton, Gaugler, Rosenthal & Bentson, 1992). Scholz and Schuler (1993) strongly believe that in the assessment center “the qualities which are relevant are rather those which foster promotion than those which predict performance in the first place” (p. 82). Therefore, in a second study, not only the performance appraisal was taken into consideration, but also membership of the general staff, which can be considered an indicator of the successful career of a professional officer. Based on this study it can be concluded that the social competences rated at the ACABO do correlate with this operationalization of job success, or successful career respectively.

Further development of the ACABO based on this study: at the end of the assessment center an appraisal matrix is established for each participant, and in the concluding observer conference each matrix is reconsidered; ratings on the overall dimension which do not yield a clear arithmetic result are rounded (up or down) after a consensus-based decision. The cognitive ability tests, which earlier entered the rating of “analysis” at only one fourth of their weight, can now play an important role. Since 2002 these cognitive ability tests have been taken into consideration whenever a candidate reached a close or arithmetically unclear result, and have thus tipped the scales in the observer conference in favour or disfavour of the candidate.

Concerning the operationalization of job success, further endeavours have to be made in order to define this external criterion as accurately as possible.


CURRENT STUDY

The relation of cognitive ability and personality traits to assessment center performance in the ACABO

The studies by Annen (1995) and Gutknecht (2001) show that the ACABO has prognostic validity regarding the job success of professional officers (composed of job appraisal by superiors and membership of the general staff), but that the ACABO dimensions have no satisfying construct validity. The question now arises as to what the dimensions really measure. We therefore have to find out whether the assessments are based on hidden aspects such as certain personality traits.

Method

Participants: Assessment ratings were obtained on 214 assessees in a 3-day selection assessment process conducted during a 3-year period. All participants had a secondary degree (Matura) and were officers of the Swiss Army.

Assessors and ratings: As already mentioned, the observer team is composed of superiors and chiefs of training, recruited above all from divisions which have sent assessment center candidates, and of civilian specialists in psychology and human resources management. All assessors received extensive, specific instruction and on-the-job training in conducting assessments. Each assessor rated candidates on a 4-point rating scale. In order to guarantee a fair and well-founded judgement, the assessment follows a procedure involving several stages. During the perception and assessment stage, observation and judgement must be kept strictly separate. Next, the results of individual observers, main or secondary, are thoroughly discussed after each exercise. In the final observer conference the appraisal matrix of every candidate is discussed again.

Assessment center exercises and rating dimensions: Currently the requirement profile an ACABO candidate has to fulfil consists of seven dimensions (personal attitude, motivational behaviour, analysis, social contact, oral communication, dealing with conflicts, influencing behaviour). Focusing on activities a candidate might meet immediately after completion of the diploma study course, the following six exercises were designed: two presentation exercises (a spontaneous short oral presentation and a prepared 20-minute oral presentation), two group discussions (a leaderless group discussion and a debate) and two role-play exercises (a motivational talk and short cases).

Personality and cognitive ability measures: In order to assess personality, a short version of the MRS inventory by Ostendorf (1990), developed by Schallberger & Venetz (1999), is used. This 20-item short version assesses the dimensions of the five-factor model of personality (extraversion, agreeableness, conscientiousness, emotional stability and openness to experience). Despite being shortened, the tool still has high factorial validity and sufficiently high reliability for research purposes (Schallberger & Venetz, 1999).

Cognitive ability was measured by a test battery specifically designed by Saville & Holdsworth Ltd (SHL) for the selection of managers and tailored to the ACABO. The battery consists of three tests: “verbal comprehension” (VC1), “numerical comprehension” (NC2) and “diagram comprehension” (DT8). These three tests are focused on the construct “general intelligence”, and studies by Gutknecht (2001) have shown that they can be subsumed under the common construct “cognitive ability”.



Results

Due to the lack of a normal distribution in our data, we have abstained from analyses at a higher level and have limited ourselves to showing the correlative connections.

Table 1 shows the connection between the Big Five and the individual dimension ratings as well as the overall assessment center score in the ACABO. The ACABO dimensions were broken down into three categories: personality dimensions (personal attitude and motivational behaviour), social dimensions (social contact, oral communication and influencing behaviour) and cognitive dimensions (analysis and dealing with conflicts). It was shown that “emotional stability” is an especially basic personality trait, offering a good basis for passing the assessment center. It shows significant correlations with the overall assessment center score as well as with all personality dimensions and all social dimensions. Yet there is no significant correlation with the cognitive dimensions.

The personality trait “extraversion” correlates only with the rating of “personal attitude”, but not with the other dimension ratings or the overall score.

It seems that cognitive ability has a slightly higher influence on behaviour, or on ratings at the ACABO, than personality. Whereas correlations with the personality measures are low, the cognitive ability tests show highly significant correlations with the ratings of the cognitive dimensions “analysis” and “dealing with conflicts”. This comes as no surprise, as the dimension “analysis” measures the analytical ability in dealing with problems, which is indispensable for dealing with conflicts in a reasonable way.

The cognitive ability tests also show a clear and significant correlation with the ratings of “oral communication”, which could be due to the fact that one of the three tests measuring cognitive ability refers to verbal comprehension.

Furthermore, there appears to be a connection between the cognitive ability tests and “motivational behaviour” at the ACABO.

Table 1
Correlations of cognitive ability and personality with dimension and overall AC scores (n = 214)

                             Agre    Cons    Stbl    Extr    Open    CA
Social dimensions
  social contact             .13     .01     .17*    .11     -.01    .10
  oral communication         -.02    .06     .15*    .13     .01     .18**
  influencing behaviour      .07     .02     .15*    .10     .03     .11
Personality dimensions
  personal attitude          .10     -.02    .14*    .18*    .01     .11
  motivational behaviour     .02     .02     .14*    .02     .05     .16**
Cognitive dimensions
  analysis                   -.01    .01     .06     .08     .02     .18**
  dealing with conflicts     .02     .02     .07     -.01    .00     .15**
Overall AC rating            .05     .03     .16*    .11     .01     .18**

Note. Agre = agreeableness, Cons = conscientiousness, Stbl = emotional stability, Extr = extraversion, Open = openness to experience, CA = cognitive ability. *p < .05, **p < .01.
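A sketch of how one cell of such a table can be computed and flagged; the variable names and simulated data are illustrative, not the ACABO ratings:

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative check of one cell: correlate a trait score with a dimension
# rating for n = 214 assessees and flag significance the way the note does.
rng = np.random.default_rng(1)
stability = rng.normal(size=214)                  # e.g. emotional stability
rating = 0.17 * stability + rng.normal(size=214)  # e.g. 'social contact'

r, p = pearsonr(stability, rating)
stars = '**' if p < .01 else '*' if p < .05 else ''
print(f"r = {r:.2f}{stars} (p = {p:.3f})")
```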


All in all, it can be stated that both the personality trait “emotional stability” and cognitive ability are relevant to the ratings in the assessment center. “Emotional stability” is particularly well reflected in the social and personality dimensions, while cognitive ability correlates highly with the cognitive dimensions as well as with “oral communication” and “motivational behaviour”.

Acting on the assumption that the dimensions show no construct validity, we have also examined the correlations of overall performance on the assessment center exercises with the measures of cognitive ability and personality (Table 2). Given that “emotional stability” correlates with the ratings of the social dimensions, we can assume that this personality trait also correlates with the overall ratings of exercises with a strong interpersonal orientation, such as group exercises. This hypothesis was confirmed. “Extraversion” did not significantly correlate with the ratings of the social dimensions, yet there is a statistically conclusive correlation with the overall ratings in the group exercises, which could be interpreted as a hint of a halo effect. Finally, a significant correlation between cognitive ability and the ratings in the presentation exercises and role plays can be established, which is not surprising given that these exercises require systematic problem analysis, good comprehension and problem-solving skills.

Table 2
Correlations of cognitive ability and personality with exercise performance (n = 214)

                      Agre    Cons    Stbl    Extr    Open    CA
Group discussions     .06     .06     .16*    .23*    .10     .07
Presentations         .05     -.01    .13     .01     -.06    .21**
Role plays            -.01    .03     .06     .08     .07     .14*

Note. Agre = agreeableness, Cons = conscientiousness, Stbl = emotional stability, Extr = extraversion, Open = openness to experience, CA = cognitive ability. *p < .05, **p < .01.

Schuler (1991) considers “intelligence” and “emotional stability” to be essential determinants of professional performance. Based on their meta-analysis, Scholz and Schuler (1993) come to the conclusion that “emotional stability” is relevant to success in general (Baehr, 1987; Burke & Pearlman, 1988; Hoelemann, 1989) but does not show up in assessment centers. This statement cannot be backed by our findings.

Further development of the ACABO based on this study: like the studies by Gutknecht (2001), these current findings underscore the high prognostic value of the cognitive ability tests for the assessment center results. The question now arises whether cognitive ability tests should be given greater importance in the future by adding them as an individual dimension score to the overall assessment center result.

From the findings regarding “emotional stability” it can be concluded that this personality trait is conducive to showing a good performance during the ACABO, or to being rated favourably by the assessors. This result can be interpreted in the light of the requirement that a future professional officer has to be “a role model for the other members of the army in every situation … and to lead successfully under difficult conditions” (Schweizerische Armee, 2001, p. 6), which requires a certain amount of “emotional stability”. It would therefore make sense to measure this factor during the ACABO. The question was therefore raised whether this personality trait should become more relevant with respect to the job profile of a future professional officer, and whether it should become an explicit part of the ACABO dimensions.

OUTLOOK

We have presented a number of former studies and illustrated the ensuing consequences for the further development of the ACABO. Evaluation is an ongoing process, and next we will focus our interest especially on self and peer appraisal within the assessment center. Preliminary studies have shown that peer, self and assessor appraisals regarding influencing behaviour in a group exercise are very similar; yet it has also become clear that the participants have great difficulty in estimating their overall assessment center score themselves. However, the sample is still too small to make conclusive statements or to give recommendations. Depending on the results of a further study, some form of peer appraisal could be taken into consideration, e.g. as an additional source of information for the evaluation or as additional feedback for the participants. It would also be interesting to pay more attention to the self-evaluation of the candidates, given that studies have shown various links between the congruence of self-perception and perception by others and organizational criteria such as job performance or promotion (McCall & Lombardo, 1983; Van Velsor, Taylor & Leslie, 1993; Bass & Yammarino, 1991; McCauley & Lombardo, 1990; Yammarino & Atwater, 1993).

BIBLIOGRAPHY

Annen, H. (1995). Konstrukt- und kriterienbezogene Validität des MFS-Assessment Centers. Unveröff. Lizenziatsarbeit, Universität Zürich, Psychologisches Institut, Abt. Angewandte Psychologie.

Annen, H. & Gutknecht, S. (2002). Selektions- und Beurteilungsinstrumente in der Berufsoffizierslaufbahn - eine erste Wertung. Allgemeine Schweizerische Militärzeitschrift, 2/02, 19-20.

Annen, H. & Gutknecht, S. (2002). The validity of the assessment center for future professional officers. Proceedings of the 44th Annual Conference of the International Military Testing Association [CD-ROM].

Baehr, M.E. (1987). A review of employee evaluation procedures and a description of "high potential" executives and professionals. Journal of Business and Psychology, 1, 172-202.

Bass, B.M., & Yammarino, F. (1991). Congruence of self and others' leadership ratings of naval officers for understanding successful performance. Applied Psychology: An International Review, 40, 437-454.

Burke, M.J. & Pearlman, K. (1988). Recruiting, selecting, and matching people with jobs. In J.P. Campbell & R.J. Campbell (Eds.), Productivity in organizations. San Francisco: Jossey-Bass.

Bycio, P., Alvares, K.M., & Hahn, J. (1987). Situational specificity in assessment center ratings: A confirmatory factor analysis. Journal of Applied Psychology, 72, 463-474.


Chan, D. (1996). Criterion and construct validation of an assessment center. Journal of Occupational and Organisational Psychology, 69, 167-181.

Crawley, B., Pinder, R., & Herriot, P. (1990). Assessment center dimensions, personality, and aptitudes. Journal of Occupational Psychology, 63, 211-216.

Fleenor, J.W. (1996). Constructs and developmental assessment center: Further troubling empirical findings. Journal of Business and Psychology, 10, 319-335.

Gaugler, B.B., Rosenthal, D.B., Thornton, G.C., III, & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72, 493-511.

Goffin, R.D., Rothstein, M.G., & Johnston, N.G. (1996). Personality testing and the assessment center: Incremental validity for managerial selection. Journal of Applied Psychology, 81, 746-756.

Gutknecht, S. (2001). Eine Evaluationsstudie über die verschiedenen Instrumente der Berufsoffiziersselektion und deren Beitrag zur Vorhersage des Studien- und Berufserfolges. Unveröff. Lizenziatsarbeit, Universität Bern, Psychologisches Institut, Abt. für Arbeits- und Organisationspsychologie.

Hoelemann, W. (1989). Langzeitprognose von Aufstiegspotential. Zeitschrift für betriebswirtschaftliche Forschung, 41, 516-525.

Hophan, U. (2001). Participants' reactions to assessment centres. Manchester: Manchester School of Management.

Kleinmann, M., Kuptsch, C., & Köller, O. (1996). Transparency: A necessary requirement for the construct validity of assessment centers. Applied Psychology: An International Review, 45, 67-84.

McCall, M.W., & Lombardo, M.M. (1983). Off the track: Why and how successful executives get derailed. Greensboro, NC: Center for Creative Leadership.

McCauley, C.D., & Lombardo, M.M. (1990). Benchmarks: An instrument for diagnosing managerial strengths and weaknesses. In K.E. Clark & M.B. Clark (Eds.), Measures of leadership (pp. 535-545). West Orange, NJ: Leadership Library of America.

McEvoy, G., Beaty, R., & Bernardin, J. (1987). Unanswered questions in assessment center research. Journal of Business and Psychology, 2, 97-111.

Ostendorf, F. (1990). Sprache und Persönlichkeitsstruktur. Zur Validität des Fünf-Faktoren-Modells der Persönlichkeit. Regensburg: Roderer.

Robertson, I.T., Gratton, L., & Sharpley, D. (1987). The psychometric properties and design of managerial assessment centers: Dimensions into exercises won't go. Journal of Occupational Psychology, 60, 187-195.

Russell, C.I., & Domm, D.R. (1995). Two field tests of an explanation of assessment center validity. Journal of Occupational and Organisational Psychology, 68, 25-47.

Sackett, P.R., & Dreher, G.F. (1982). Constructs and assessment center dimensions: Some troubling empirical findings. Journal of Applied Psychology, 67, 401-410.

Schallberger, U. & Venetz, M. (1999). Kurzversion des MRS-Inventars von Ostendorf (1990) zur Erfassung der fünf "grossen" Persönlichkeitsfaktoren. Unveröff. Bericht, Universität Zürich, Psychologisches Institut, Abt. Angewandte Psychologie.



Scholz, G., & Schuler, H. (1993). Das nomologische Netzwerk des Assessment Centers: eine Metaanalyse. Zeitschrift für Arbeits- und Organisationspsychologie, 37, 73-85.

Schuler, H. (1991). Der Funktionskreis "Leistungsförderung" - eine Skizze. In H. Schuler (Hrsg.), Beurteilung und Förderung beruflicher Leistung (pp. 171-189). Göttingen: Hogrefe/Verlag für Angewandte Psychologie.

Schuler, H. (1998). Psychologische Personalauswahl. Göttingen: Verlag für Angewandte Psychologie.

Schweizerische Armee. (2001). Das Militärische Personal der Armee XXI: Leitbild. Bern: Chef Heer & Kdt Luftwaffe.

Shore, T.H., Thornton, G.C., III, & Shore, L.M. (1990). Construct validity of two categories of assessment center dimension ratings. Personnel Psychology, 43, 101-116.

Spector, P.E., Schneider, J.R., Vance, C.A., & Hezlett, S.A. (2000). The relation of cognitive ability and personality traits to assessment center performance. Journal of Applied Social Psychology, 30(7), 1474-1491.

Thornton, G.C., III, Gaugler, B.B., Rosenthal, D.B., & Bentson, C. (1992). Die prädiktive Validität des Assessment Centers - eine Metaanalyse. In H. Schuler & W. Stehle (Hrsg.), Assessment Center als Methode der Personalentwicklung (pp. 36-60). Göttingen: Hogrefe/Verlag für Angewandte Psychologie.

Turnage, J., & Muchinsky, P. (1984). A comparison of the predictive validity of assessment center evaluations versus traditional measures in forecasting supervisory job performance: Interpretive implications of criterion distortion for the assessment paradigm. Journal of Applied Psychology, 69, 595-602.

Van Velsor, E., Taylor, S. & Leslie, J. (1993). An examination of the relationship among self-perception accuracy, self-awareness, gender, and effectiveness. Human Resource Management, 32, 249-264.

Wey, M. (2002). ACABO. Assessment Center für angehende Berufsoffiziere. Eine Analyse der Interrater-Reliabilität in Bezug auf die Gesamtbeurteilung sowie die dimensions- sowie übungsspezifischen Urteile. Praktikumsbericht, Militärakademie, Dozentur Militärpsychologie und Militärpädagogik.

Yammarino, F., & Atwater, L. (1993). Understanding self-perception accuracy: Implications for human resource management. Human Resource Management, 32, 231-247.


Personality as predictor of job attitudes and intention to quit

Simon P. Gutknecht
Military Academy at the Swiss Federal Institute of Technology Zurich
Steinacherstrasse 101b, 8804 Au
Switzerland
simon.gutknecht@milak.ethz.ch

Introduction

The Swiss Army is going through a time of change. A radical change is in process at the moment: the reduction in size of the army also brings with it the establishment of new functions. The uncertainties with respect to future positions and functions are felt especially strongly among professional and non-commissioned officers. The question has to be raised of how affective commitment to the establishment and job satisfaction (job attitudes) are influenced. This is to be taken seriously, since these variables have proved to be important predictors of intention to quit and absenteeism (Lum, Kervin, Clark, Reid & Sirola, 1998; Michaels & Spector, 1982; You, 1996).

Besides the general questions about factors which influence attitudes toward work, such as salary, job security and leadership quality, it is, especially in times of change, interesting to see whether there are people who due to their disposition suffer more or less under the process of change. A related question is to what extent certain personality dispositions (e.g. extraversion) affect work-related attitudes as well as intentions to take action.

The consideration of personality traits is not only interesting from the point of view of basic research but also carries a certain relevance for practice. In addition to the traditional AC exercises done within the framework of the “assessment center for professional officers”, a personality test (assessing the Big 5: extraversion, agreeableness, conscientiousness, neuroticism and culture) has been introduced. The results of this test have so far only been used for research purposes, and therefore no weight within the selection process has been attributed to it. However, it is important for the people responsible at the assessment center to know what additional information is contained in this test. Although the relationship between personality variables and job performance has mainly been of interest (cf. Barrick & Mount, 1991; Day & Silverman, 1989; Schmidt & Hunter, 1998), organisational attitudes such as affective commitment or job satisfaction can also be used as external criteria to assess the criterion validity of this test. In the end one wants to select people who are committed to the organization even in times of difficulty. While collecting data on job satisfaction and affective commitment within the framework of a validation study, the scale for assessing the Big 5 was therefore also used.

Findings concerning personality and job attitudes



There are not many studies with the sole aim of looking more closely at the connection between affective commitment and personality. There are, however, elaborate studies with reference to job satisfaction which deal with the influence of traits such as general self-esteem, general self-efficacy, locus of control and neuroticism (emotional stability) (cf. Judge, Locke, Durham & Kluger, 1998; Judge, Bono & Locke, 2000). These studies point out that such traits influence the perception, or rather the judgement, of the work situation and in this way have an indirect influence on the construct job satisfaction. Direct effects, though present, are weaker.

As to the influence of the Big 5, there are isolated findings. Judge, Heller and Mount (2002) observed with the help of a meta-analysis that only the factors “neuroticism” and “extraversion” were significantly connected to general job satisfaction. On the other hand, Tanoff (1999) could demonstrate that all the Big 5 factors, with the exception of culture, were related to job satisfaction; it must be added that the variable neuroticism played a decisive role. Seibert and Kraimer (2001) also ascribe prognostic validity to the Big 5 factors in connection with job satisfaction, but the effects are rather minor; in this study the variable extraversion was of importance. Day, Bedeian and Conte (1998) also found an influence of extraversion on job satisfaction, but the coefficient of .10 is rather modest.

As mentioned above, there are only very few studies that deal directly with the relation between the Big 5 and commitment. In a recent study by Naquin and Holton (2002) the variables neuroticism, conscientiousness and agreeableness show a relation to affective commitment. Otherwise there are no findings in this respect.

These results are interesting and show that, in the debate over increasing job satisfaction or commitment, the hypothesis that personality disposition is relevant seems justified. On the basis of the above-mentioned results concerning traits and job satisfaction, where the influence of neuroticism is mainly indirect, the question has to be asked to what extent this is true for the other Big 5 traits. In particular, the job characteristics (skill variety, task identity, task significance, autonomy and feedback) introduced by Hackman and Oldham (1980) in the job characteristics model (JCM) can be expected to play a significant mediating role between personality features and job-related attitudes. This can be assumed in the context of the studies done by Judge et al. (1998, 2000).

The question to be asked is in what way, besides the content aspect, the so-called context factors (satisfaction with salary, colleagues, job security and leadership quality), as listed in the JCM, serve as mediators. These factors appear to correlate with job satisfaction, and especially so with affective commitment. Context factors play a role not to be underestimated, especially in times of change and the related uncertainty.

In the following it will be shown what kinds of personality traits influence the job attitudes of Swiss professional military personnel. To test the direct as well as the indirect influences, the personality factors, affective commitment, job satisfaction, job characteristics and context factors were considered in the calculations. These variables were then put into relationship with the variable intention to quit.

Method

Setting and Participants


820 anonymous (coded) questionnaires were sent to the private addresses of professional military personnel: 420 to professional officers and 400 to professional non-commissioned officers. Both samples of addressees were chosen at random. The return rate was 61% (n = 499). 19 questionnaires had to be disregarded because they were insufficiently filled in, leaving a total of n = 480. The average age of the total sample was 42.5 years.

Measures

The job characteristics (skill variety, task identity, task significance, autonomy and feedback), satisfaction with salary and colleagues, perceived job security and leadership quality (context factors), as well as job satisfaction, were measured with the German version (van Dick et al., 2001) of the Job Diagnostic Survey (JDS) by Hackman and Oldham (1975). The individual job characteristics were integrated into one scale value, as suggested by Hackman and Oldham. The respective items used a 6-point scale.
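Hackman and Oldham's suggested combination is commonly their Motivating Potential Score (MPS); a minimal sketch, assuming that standard formula was the integration used here:

```python
def motivating_potential_score(sv, ti, ts, autonomy, feedback):
    """Hackman & Oldham's Motivating Potential Score (MPS).

    sv/ti/ts: mean ratings for skill variety, task identity and task
    significance; all inputs here are assumed to be on the 6-point JDS scale.
    """
    return ((sv + ti + ts) / 3.0) * autonomy * feedback

# e.g. a job rated 4.5, 4.0, 5.0 on the content facets, 4.0 autonomy, 3.5 feedback
print(motivating_potential_score(4.5, 4.0, 5.0, 4.0, 3.5))  # 63.0
```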

A German translation of the Organizational Commitment Questionnaire (OCQ) by Maier and Woschée (2002) was used to measure affective commitment; the scale ranged from 1 to 7. Four items (scale 1-5) by Baillod (1992) were used to measure intention to quit.

The Big 5 were recorded with the version MRS-30 by Schallberger and Venetz (1999). The<br />

respective scale consisted of six bi-polar pairs of adjectives as “vulnerable” – “sturdy” or<br />

“secure” – “insecure”. The test person had to indicate out in a 6-poit Scale how these adjectives<br />

applied to him.<br />

The reliability coefficients of the respective scales are shown in table 1. The context<br />

factors could not be established because the items were too small to calculate a coefficient.<br />
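For readers who wish to reproduce reliability coefficients of the kind reported in Table 1, the following minimal sketch computes Cronbach's alpha from an item-score matrix. The function name and the simulated data are illustrative only and are not taken from the study.

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for an (n_respondents, n_items) score matrix.

        alpha = k / (k - 1) * (1 - sum of item variances / variance of the
        total score). At least two items are required, which is why no
        coefficients could be computed for the context factors above.
        """
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1.0 - item_vars / total_var)

    # Illustrative data: 5 respondents answering a 3-item, 6-point scale.
    scores = np.array([[4, 5, 4], [2, 3, 2], [5, 5, 6], [3, 4, 3], [4, 4, 5]])
    print(round(cronbach_alpha(scores), 2))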

Results

The internal consistencies of the scales were sufficient to very good. As to the means, the variables job satisfaction (4.60) and job characteristics (4.44) were judged satisfactory to good, but the context factors were not, with the exception of satisfaction with colleagues. The mean of affective commitment is not directly comparable to that of job satisfaction because commitment was measured on a scale from 1 to 7; its value of 4.55 represents a rather low level of affective commitment. Nothing was conspicuous about the means of the personality constructs.

The correlation matrix shows moderate to high associations, though to varying degrees, between job characteristics, context factors, and the variables commitment, job satisfaction, and intention to quit. The influence of the context factor satisfaction with colleagues is far smaller than that of the other variables.

                          M     SD    α    1    2    3    4    5    6    7    8    9   10   11   12
1  Job characteristics   4.60  .56  .84    -
   Context factors
2  Job security          3.72 1.16    -  .34    -
3  Salary                3.60 1.20    -  .24  .48    -
4  Leadership            3.60 1.20    -  .46  .48  .36    -
5  Colleagues            4.60  .92    -  .29  .04 -.03  .20    -
   Personality
6  Neuroticism           4.53  .56  .71 -.25 -.14 -.01 -.11 -.16    -
7  Extraversion          4.11  .64  .77  .16 -.02 -.09  .04  .14 -.20    -
8  Culture               4.32  .56  .66  .11 -.05 -.07  .04  .07 -.38  .33    -
9  Agreeableness         4.49  .55  .70  .05  .03  .09  .06  .10 -.22 -.02  .20    -
10 Conscientiousness     4.95  .59  .84  .18  .06 -.05  .02  .05 -.30  .14  .31  .35    -
   Attitudes
11 Job satisfaction      4.44  .85  .85  .55  .56  .47  .54  .23 -.21  .03  .03  .05  .09    -
12 Commitment            4.55  .93  .89  .44  .58  .48  .51  .11 -.15  .11  .07  .00  .14  .75    -
   Intention
13 Intention to quit     3.55  .97  .80 -.34 -.50 -.46 -.44  .06 -.09  .04  .02 -.03 -.02 -.75 -.68

Table 1: Bivariate correlations, means, standard deviations, and Cronbach's alpha. Correlations ≥ .09 are significant at the .05 level.

As far as the personality variables are concerned (see Table 2), they correlate only sporadically, and then only weakly to moderately, with the job characteristics, the context factors (salary, leadership, job security, and colleagues), and the job attitudes. Only neuroticism shows significant values in 7 of 8 correlations; extraversion does so in just 4, conscientiousness in only 3, agreeableness in two, and culture in just one. Neuroticism, extraversion, and conscientiousness correlate significantly with the job attitudes, but only neuroticism is significantly related to intention to quit.

What is striking is the high correlation of .75 between affective commitment and job satisfaction (see Table 1). From about .80 upward, multicollinearity has to be assumed. The respective tests did not yield a clear picture, so the variable job satisfaction was dropped from further analysis. Affective commitment was kept because it represents a newer construct that has hardly been tested, and because this variable is of special interest in the context of the changes in the Swiss Army.
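The multicollinearity tests mentioned above are not specified in the paper; one standard diagnostic is the variance inflation factor (VIF), sketched below with plain numpy. The simulated data and the conventional warning threshold (VIF > 10) are illustrative assumptions, not the authors' procedure.

    import numpy as np

    def vif(x, j):
        """Variance inflation factor of predictor column j:
        1 / (1 - R^2) from regressing column j on the other columns."""
        y = x[:, j]
        others = np.delete(x, j, axis=1)
        design = np.column_stack([np.ones(len(y)), others])  # add intercept
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        r2 = 1.0 - resid.var() / y.var()
        return 1.0 / (1.0 - r2)

    # Illustrative data: two strongly related predictors plus one independent.
    rng = np.random.default_rng(0)
    base = rng.normal(size=100)
    x = np.column_stack([base + rng.normal(scale=0.6, size=100),
                         base + rng.normal(scale=0.6, size=100),
                         rng.normal(size=100)])
    print([round(vif(x, j), 2) for j in range(x.shape[1])])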

                       Neuroticism  Extraversion  Culture  Agreeableness  Conscientiousness
Job characteristics       -.25          .16         .11         .05             .18
Job security              -.14         -.02        -.05         .03             .06
Salary                    -.01         -.09        -.07         .09            -.05
Leadership                -.11          .04         .04         .06             .02
Colleagues                -.16          .14         .07         .10             .05
Job satisfaction          -.21          .03         .03         .05             .09
Affective commitment      -.15          .11         .07         .00             .14
Intention to quit         -.09          .04         .04        -.03            -.02

Table 2: Bivariate correlations. Correlations ≥ .09 are significant at the .05 level.

Structural Models

In the following, the direct as well as the indirect effects of the personality variables are determined. In a procedure as complex as structural modelling it is wise to consider only as many variables as absolutely necessary. Consequently, only those constructs were included for which the bivariate correlations suggest a connection to affective commitment; this approach is justified by the exploratory character of the study. Among the personality variables, therefore, only neuroticism, extraversion, and conscientiousness are considered. This makes sense because, according to results reached in other studies, an effect can especially be expected from these constructs (Judge et al., 2002; Tanoff, 1999). In addition, the context factors were combined into one factor (α = .81).

To check model fit, the conventional indices were used: the goodness-of-fit index (GFI), the Tucker-Lewis index (TLI), and the root-mean-square error of approximation (RMSEA). The latter two are of particular importance because they can be interpreted largely independently of sample size.
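For reference, the standard textbook definitions of these two sample-size-robust indices are given below (these formulas are conventional and not taken from the paper; χ²_M and df_M refer to the fitted model, χ²_0 and df_0 to the null model, and N is the sample size):

    RMSEA = \sqrt{ \frac{\max(\chi^2_M - df_M,\, 0)}{df_M \,(N - 1)} },
    \qquad
    TLI = \frac{\chi^2_0 / df_0 - \chi^2_M / df_M}{\chi^2_0 / df_0 - 1}.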

In Figure 1 the model is estimated on the basis of the total sample. Hardly any differences were found between the fit indices of the fully mediated model (GFI = .93, TLI = .89, RMSEA = .085) and those of the model in which direct as well as indirect influences are allowed (GFI = .93, TLI = .89, RMSEA = .086). The model allowing direct and indirect personality effects accounts for slightly more variance (.02), even if the gain is minimal. Whereas the effects of neuroticism come about for the most part indirectly, conscientiousness has a direct influence on affective commitment, even though these effects are minimal. As to the total effects, the contribution of neuroticism in the mediated model (.17) as well as in the model allowing direct effects (.17) was larger than the effect of the job characteristics (.15/.11). Thus, with reference to affective commitment, neuroticism shows effects similar to those other studies have found with reference to job satisfaction. The same cannot be said for extraversion, which had neither a direct nor an indirect significant influence.

For job satisfaction the main mediators appeared to be the job characteristics, whereas for affective commitment the context factors carry a very high influence. To what extent this is related to the ongoing changes in the army is difficult to determine and can only be established with the help of successive studies. Many respondents seem content with the quality of the job characteristics but not with the context factors, with the exception of colleagues, as already described. This also means that, despite possible displeasure, respondents distinguish between content aspects and factors such as job security or salary, a distinction clearly illustrated in Figure 1, and it is mainly the latter that affect affective commitment.

[Figure 1: Structural model. Displayed path coefficients: neuroticism -.37**, extraversion -.38**, conscientiousness .21**; * = p < .05, ** = p < .01.]

Conscientiousness    .09 (8%)    .11 (10%)    .02 (1%)

Table 3: Total standardized effects on "affective commitment." Percentages of accounted variance are given in brackets.

Discussion

Generally it can be said that the personality factors extraversion and especially neuroticism have an indirect as well as a direct influence on affective commitment. It is interesting, however, that these effects differ across the sub-samples. Whereas the influence of neuroticism in the sub-samples "younger" and "older" was, as expected, moderately high, the direct effects of the personality traits are missing there. As to the findings in general: the context factors exert a large influence on commitment, and these effects could be established in the sub-samples as well. The job characteristics received a greater weight in the sub-sample "older" than in the other.

The results have to be interpreted cautiously. On the one hand, this is only a cross-sectional analysis, so the stability of the established effects cannot be tested and nothing can be said about causality. On the other hand, the sub-samples are small, so artefacts cannot be ruled out.

The results thus serve as first points of reference. Further studies are planned and will be analysed in more detail, for example with reference to distinctive features such as rank or function. It can be assumed that the use of a personality test in the selection process for professional officers serves, modestly, to clarify commitment as well as intention to quit. Whether the use of these tests can continue to be justified can only be established through further studies, and only if economic considerations allow it.

References

Baillod, J. (1992). Fluktuation bei Computerfachleuten. Bern: Lang.

Barrick, M.R., & Mount, M.K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.

Day, D.V., & Silverman, S.B. (1989). Personality and job performance: Evidence of incremental validity. Personnel Psychology, 42, 25-36.

Day, D.V., Bedeian, A.G., & Conte, J.M. (1998). Personality as predictor of work-related outcomes: Test of a mediated latent structural model. Journal of Applied Social Psychology, 28, 2068-2088.

Dick, R. van, Schnitger, C., Schwartzmann-Buchelt, C., & Wagner, U. (2001). Der Job Diagnostic Survey im Bildungsbereich. Zeitschrift für Arbeits- und Organisationspsychologie, 45, 74-92.

Hackman, J.R., & Oldham, G.R. (1975). Development of the Job Diagnostic Survey. Journal of Applied Psychology, 60, 159-170.

Judge, T.A., Locke, E.A., Durham, C.C., & Kluger, A.N. (1998). Dispositional effects on job and life satisfaction: The role of core evaluations. Journal of Applied Psychology, 83, 17-34.

Judge, T.A., Bono, J.E., & Locke, E.A. (2000). Personality and job satisfaction: The mediating role of job characteristics. Journal of Applied Psychology, 85, 237-249.

Judge, T.A., Heller, D., & Mount, M.K. (2002). Five-factor model of personality and job satisfaction: A meta-analysis. Journal of Applied Psychology, 87(3), 530-541.

Lum, L., Kervin, J., Clark, K., Reid, F., & Sirola, W. (1998). Explaining nursing turnover intent: Job satisfaction, pay satisfaction, or organizational commitment? Journal of Organizational Behavior, 19, 305-320.

Maier, G.W., & Woschée, R.-M. (2002). Die affektive Bindung an das Unternehmen. Zeitschrift für Arbeits- und Organisationspsychologie, 46, 126-136.

Michaels, C.E., & Spector, P.E. (1982). Causes of employee turnover: A test of the Mobley, Griffeth, Hand and Meglino model. Journal of Applied Psychology, 67, 53-59.

Naquin, S.S., & Holton, E.F. (2002). The effects of personality, affectivity, and work commitment on motivation to improve work through learning. Human Resource Development Quarterly, 13(4), 357-376.

Seibert, S.E., & Kraimer, M.L. (2001). The five-factor model of personality and career success. Journal of Vocational Behavior, 58, 1-21.

Schallberger, U., & Venetz, M. (1999). Kurzversion des MRS-Inventars von Ostendorf (1990) zur Erfassung der fünf "grossen" Persönlichkeitsfaktoren. Bericht aus der Abteilung Angewandte Psychologie, Psychologisches Institut der Universität Zürich.

Schmidt, F.L., & Hunter, J.E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274.

Tanoff, G.F. (1999). Job satisfaction and personality: The utility of the five-factor model of personality. Dissertation Abstracts International: Section B: The Sciences and Engineering, 60(4-B), 1904.

Tett, R.P., Jackson, D.N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703-742.

You, T.-J. (1996). The role of ethnic origins in the discriminant approach to employee turnover. Dissertation Abstracts International Section A: Humanities and Social Sciences, 56, 4469.


USING TASK MODULE DATA TO VALIDATE AIR FORCE SPECIALTY KNOWLEDGE TESTS

Shirley Snooks and Cindy Luster
Air Force Occupational Measurement Squadron
Randolph Air Force Base TX, USA
shirley.snooks@randolph.af.mil

Abstract

In support of the Weighted Airman Promotion System, the Air Force Occupational Measurement Squadron (AFOMS) Test Development Flight (TE), Occupational Analysis Flight (OA), and career field subject-matter experts (SMEs) work synergistically to produce the best promotion tests possible by ensuring the tests are valid, fair, and credible. Although the use of SMEs and the linking of individual items to Specialty Training Standard (STS) paragraphs already validate the specialty knowledge tests (SKTs), AFOMS goes one step further and validates test items by linking them to real-world occupational performance data. Traditionally, the occupational performance data are grouped by duty area and then ranked in descending order by percent members performing (PMP) data, by predicted testing importance (PTI) index scores, or by field-validated testing importance (FVTI) index scores. The data are then compiled into SKT extracts that SKT teams use extensively to determine the best possible item content for tests. Many TE test psychologists (TPs), OA analysts, and SMEs express concerns about the time involved and the difficulty of using SKT extracts to translate action-based tasks into knowledge-based test items.

This research provided selected SKT teams with additional products that organize the occupational data into logical task module (TM) extracts based on co-performance, and assessed the appeal of the TM extracts versus that of the SKT extracts.

It is the belief of these researchers that SMEs will report more satisfaction with the TM data because the tasks are logically grouped by co-performance rather than by general duty area. In addition, SME comments were encouraged in order to explore additional approaches to presenting task information.

USING TASK MODULE DATA TO VALIDATE AIR FORCE SPECIALTY KNOWLEDGE TESTS

In an innovative approach, the AFOMS TE and OA Flights are working synergistically to provide SMEs, TDY to AFOMS, with an alternative organization of occupational data intended to facilitate SKT item writing by making the data easier to understand and more logical. Ongoing criticism from AFOMS psychologists, analysts, and SMEs speaks to the difficulty of using performance tasks to develop questions for a knowledge test.

"The Air Force Comprehensive Occupational Data Analysis Program (CODAP) system is a collection of analysis tools and procedures which use, as raw material, information provided by members of the occupational field being studied" (Thew and Weissmuller, 1978). It is designed to furnish users with a wide variety of reports that facilitate the identification of individual and group characteristics and the detection of job similarities and differences. For over 35 years, AFOMS has analyzed career fields and provided occupational data to Air Force managers for use in decision-making. The analysis of the work performed within career fields, and of the demographic characteristics of the members performing this work, has long been the stronghold of objective, quantitative occupational data for personnel and training decisions.

These researchers applied a currently available but seldom used CODAP analysis program. This less common approach defines work by grouping tasks into TMs. With these TMs, natural groupings of performance tasks can be provided to SMEs TDY to AFOMS. The teams compared tasks organized by co-performance (the TM extract) with SKT extract data organized by major duty area and sorted by PMP, FVTI, or PTI index scores. It is the opinion of these researchers that organizing performance data into meaningful groups (i.e., TMs) will greatly alleviate some of the concerns and problems SMEs have with typical SKT extract data and will allow SMEs to make a more intuitive leap from performance-based tasks to knowledge-based test questions.

Co-Performance

The idea of grouping tasks into TMs by co-performance has been discussed within the research arena since the mid-1980s. For example, the Training Decisions System (TDS) was conceived as a computer-based training requirements planning and decision support system developed to meet Air Force needs for better decision information (Vaughan, Mitchell, Yadrick, Perrin, Knight, Eschenbrenner, Rueter, and Fledsott, 1989). The TDS allows managers and analysts to evaluate policy options in terms of the costs and training capacities of representative units and to conduct trade-off analyses between various formal training programs and on-the-job training (Mitchell, Knight, Buckenmeyer, and Hand, 1989). The TDS supports Air Force managers in making decisions as to the what, where, and when of technical training (Ruck, 1982) by using AFS-specific occupational survey task data as a starting point for training development decisions. Task clustering is used in the TDS to capture the economies of training different tasks at the same time, either because they share common skills and knowledge (including, perhaps, shared equipment) or because the tasks are generally performed by the same people.

AFOMS analysts have played an integral part in providing task data to support these training systems since the inception of the program; task modular data was a critical source of information for program functionality.

Occupational Data for SKT Development

Of particular importance to SKT teams is a specially compiled SKT extract containing occupational survey data specific to the test populations. AFOMS survey data provides test writers (SMEs) with PTI index scores derived from PMP, PTS, task difficulty (TD), and training emphasis (TE) data. Tasks rated highest in FVTI or PTI also tend to be high in all four of the primary indices (PMP, PTS, TD, and TE), and these are exactly the kinds of tasks one would generally consider job-essential and therefore appropriate for SKT content.


When possible, FVTI data are produced for SKT revisions. To obtain FVTI data, approximately 6 months before the start of the SKT development project a sample of 100 senior career field NCOs is sent a survey containing the 150-200 tasks rated highest in PTI. Respondents rate on a seven-point Likert scale ("1" is least important and "7" is most important) how important they believe each task is for coverage on an SKT. The responses are averaged for each task, yielding the FVTI index, a direct measure of the opinions of career field experts as to what constitutes job-essential knowledge.
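As a minimal illustration of this averaging step, the sketch below computes an FVTI-style index from hypothetical ratings; the column names and data are invented for the example, and the actual AFOMS processing pipeline is not described in this paper.

    import pandas as pd

    # Hypothetical survey records: one row per (respondent, task) pair,
    # with a 1-7 Likert rating of testing importance.
    ratings = pd.DataFrame({
        "task_id": ["A1", "A1", "A1", "B2", "B2", "B2"],
        "rating":  [6, 7, 5, 3, 4, 2],
    })

    # FVTI index: the mean importance rating per task, highest first.
    fvti = ratings.groupby("task_id")["rating"].mean().rename("FVTI")
    print(fvti.sort_values(ascending=False))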

Two separate data sets are prepared, one for the development of an E-5 SKT and one for the development of an E-6/7 SKT. Regardless of whether PTI or FVTI data are provided to the SKT team, the data provide a restricted set of tasks for use in SKT construction that will discriminate between the most knowledgeable and the least knowledgeable workers (Pohler, 1999).

Participants, Design, and Procedure

Forty SMEs, TDY to AFOMS for SKT major development projects with start dates from 5 August through 7 October 2003 and drawn from a total of 10 Air Force career fields, participated in this research. The career fields are as follows:

AFSC 1C3X1 Command Post
AFSC 1S0X1 Safety
AFSC 2A5X3D Integrated Avionics Systems (Airborne Surveillance Radar Systems)
AFSC 2E1X2 Meteorological and Navigation Systems
AFSC 2E1X4 Visual Imagery and Intrusion Detection Systems
AFSC 2T0X1 Traffic Management
AFSC 3C0X2 Communications/Computers Systems Programming
AFSC 3C3X1 Communications-Computer Systems Planning and Implementation
AFSC 3N0X2 Radio and Television Broadcasting
AFSC 3U0X1 Manpower

In addition, 13 full-time AFOMS TPs who conducted these SKT development projects participated in this research.

Occupational data were retrieved from archived storage and copied into new CODAP system files for this study.

Originally, these researchers planned to use the top 150 to 200 tasks as determined by the PTI and FVTI values for each AFSC. However, after much effort and consternation, it became apparent that converting only the top 150-200 tasks into readable and useable CODAP files was neither cost- nor time-effective. Subsequently, the TM clustering program was applied to the complete task lists for these AFSCs.

Once task clusters were identified as being distinct from one another, TASSET, another CODAP program, extracted an asymmetric matrix of percentage values indicating the degree to which each task is co-performed with each other cluster. For example, if a cluster consists of tasks "A", "B", "C", and "D", the average co-performance of "A" with "B", "A" with "C", and "A" with "D" is determined. Tasks within each cluster were then ordered from highest co-performance value to lowest through the use of PRTFAC, a TASSET output. The tasks with the highest co-performance values are the most representative of the TM (Phalen, Staley, and Mitchell, 1987). In addition, the PRTFAC allows an analyst to better characterize and name each TM, discern finer distinctions between tasks, and create new TMs if necessary. These adjustments can be manually input into GRPMOD (another TASSET output), then re-analyzed and readjusted until a final GRPMOD TM listing (the TM extract) is produced.
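The co-performance statistic itself is not defined in the paper. One plausible reading, sketched below, is the percentage of incumbents performing task i who also perform task j (hence the asymmetry), with a task's average co-performance against the rest of its cluster used for ordering. This is an illustrative reconstruction, not the actual CODAP/TASSET code.

    import numpy as np

    # Hypothetical incumbent-by-task matrix: perf[r, t] = 1 if respondent r
    # reports performing task t.
    perf = np.array([[1, 1, 0, 1],
                     [1, 1, 1, 0],
                     [0, 1, 1, 1],
                     [1, 0, 1, 1]])
    n_tasks = perf.shape[1]

    # Asymmetric co-performance matrix: of those who perform task i,
    # the percentage who also perform task j.
    co = np.zeros((n_tasks, n_tasks))
    for i in range(n_tasks):
        doers = perf[:, i] == 1
        co[i] = 100.0 * perf[doers].mean(axis=0)

    # Order a cluster's tasks by their average co-performance with the
    # other tasks in the same cluster, highest (most representative) first.
    cluster = [0, 1, 3]
    avg = {t: np.mean([co[t, u] for u in cluster if u != t]) for t in cluster}
    print(sorted(cluster, key=avg.get, reverse=True))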

Previous research by Snooks and Luster (2003) indicated that labeling task modules added no measurable benefit; therefore, none of the task modules were named for this study. Instead, each team was provided with unlabeled TMs showing tasks grouped by co-performance only (the TM extracts).

As is customary, SKT development teams were provided with SKT extracts during the first week of each project. The SMEs began using the SKT extract data in the first week of each 5-week major revision project for outline development in accordance with TE process standards, and then continued using the SKT extracts throughout item development to validate test items. The TM extracts were presented to the teams between week 2 and week 4 of the projects, after the teams were familiar with the SKT extract. Each team member was asked to review the TM extract and then complete a 10-item, 7-point Likert scale survey (see Attachment 1). AFOMS TPs assigned to these projects were also asked to review the TM extracts and complete a survey identical to the SME survey except for the heading (see Attachment 2). To avoid tainting the responses with the enthusiasm of the researchers, and to ensure standard survey administration, a cover letter (see Attachment 3) was developed that included a brief explanation of the two survey products and directions for completing the survey; in other words, the survey was designed for self-administration.

Results

In simple mean comparison tests, the perceived "importance," "accuracy," "ease of understanding," and "ease of use" of the SKT extract, as well as the "desire to use" it, were each compared with the corresponding ratings of the TM extract. In addition, both groups (SMEs and TPs) were encouraged to provide comments.
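The paper does not state which t-test was used; since each respondent rated both extracts, a paired comparison is the natural choice. The sketch below uses scipy with invented ratings purely to show the mechanics.

    from scipy import stats

    # Hypothetical 7-point "ease of use" ratings from the same eight
    # respondents for each product.
    skt_ease = [5, 4, 6, 4, 5, 3, 5, 4]
    tm_ease  = [4, 4, 5, 3, 5, 3, 4, 4]

    # Paired t-test; the difference is significant only if p < .05.
    t, p = stats.ttest_rel(skt_ease, tm_ease)
    print(f"t = {t:.3f}, p = {p:.4f}")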

No statistically significant differences were found, although some comparisons can be made. All respondents (N = 53) appeared slightly more positive about the SKT extract (mean = 5.1811) than about the TM extract (mean = 4.8604) across the five examined attributes, with "desire to use" rated highest for the SKT extract (mean = 5.8679) and "ease of use" rated lowest (mean = 4.5660). The highest rating for the TM extract was for the "importance" attribute (mean = 5.1698), and the lowest again for "ease of use" (mean = 4.4339). It is worth noting that "ease of use" received the lowest ratings for both products, indicating that it may be the data itself that is difficult to use.

TPs (N = 13, mean = 5.5000) appeared slightly more positive toward both products on all five examined attributes than the SMEs (N = 40, mean = 4.4688), with the greatest difference appearing in the "ease of use" attribute (TP mean = 5.0384, SME mean = 4.3250) and the smallest in the "importance" attribute (TP mean = 5.8462, SME mean = 5.2375). Again, there were no statistically significant differences between the TP and SME scores.

Interestingly, these results do not appear to be supported by the respondents' comments, in which both products were, for the most part, criticized for not being presented in a familiar format and thus not being easy to use (see Attachment 4).

Table 1. All-respondent mean score comparisons with t-values

Attribute               SKT Extract Mean   TM Extract Mean   T-Value
Importance                   5.6038             5.1698        0.0122
Accuracy                     4.8868             4.6792        0.2578
Ease of Understanding        4.9811             4.9245        0.7614
Ease of Use                  4.5660             4.4339        0.5950
Desire to Use                5.8679             5.0943        0.0244

Table 2. SME mean score comparisons with t-values

Attribute               SKT Extract Mean   TM Extract Mean   T-Value
Importance                   5.0000             4.6250        0.0483
Accuracy                     4.6875             4.0625        0.1561
Ease of Understanding        4.3125             4.5000        0.2980
Ease of Use                  3.8125             3.6875        0.2663
Desire to Use                5.1250             4.8750        0.0265

Table 3. TP mean score comparisons with t-values

Attribute               SKT Extract Mean   TM Extract Mean   T-Value
Importance                   5.0769             5.6153        0.0820
Accuracy                     5.0000             5.1538        0.5486
Ease of Understanding        4.9231             5.3846        0.2132
Ease of Use                  4.8461             5.2307        0.5221
Desire to Use                6.4615             6.3076        0.6872

Conclusion and Implications

In summary, the comments and results of this research reflect the need for more research. A review of TP and SME comments revealed a common theme: subjects perceived difficulty in comparing the two products, perhaps because the PMP, PTI, and FVTI data were not available on the TM extracts. In spite of these reported difficulties, these researchers feel the use of TM data in SKT development merits continued research. A follow-on study is planned for early 2004 in which both the SKT and TM extracts will be converted into spreadsheet format and hand-massaged before being provided to future SKT teams (TPs and SMEs); if the follow-on study is successful, the current study will be replicated. Furthermore, as mentioned in a previous study (Snooks and Luster, 2003), several important issues remain unresolved. For example, it must be decided who will determine which tasks comprise which module (i.e., OA analysts, TE psychologists, or SMEs). If, after further research, management determines that TM extracts should be run off the top PTI/FVTI tasks rather than off the whole task list, as was done in this research, additional funds and resources will have to be allotted to provide automated data runs conducive to CODAP. Modifications to the TM extract "final product" format also remain to be decided upon and incorporated into future research efforts.

Author's Note

Special thanks and acknowledgement go to Ms. Jeanie C. Guesman, AFOMS Occupational Analysis Flight, for her diligent and enthusiastic programming support. This project could not have gone forward without her help.

Comments and questions about this research can be addressed to Ms. Shirley Snooks, AFOMS/TEEQ, 1550 Fifth Street East, Randolph AFB TX 78150; (210) 652-5013, extension 3114, or DSN 487-5013, extension 3114; or to Ms. Cindy Luster, AFOMS/OAE, 1550 Fifth Street East, Randolph AFB TX 78150; (210) 652-6811, extension 3044, or DSN 487-6811, extension 3044.


Attachment 1
SME Questionnaire

AFSC ______________ NAME ______________ DATE ______________

Read each question, then, in the space next to the question number, mark the number that best corresponds to your opinion.

1. How important is the SKT Extract?
   1 2 3 4 5 6 7 (1 = Absolutely Unimportant; 4 = Neither Important Nor Unimportant; 7 = Absolutely Important)

2. How important is the Task Module document?
   1 2 3 4 5 6 7 (1 = Absolutely Unimportant; 4 = Neither Important Nor Unimportant; 7 = Absolutely Important)

3. How accurate is the SKT Extract?
   1 2 3 4 5 6 7 (1 = Absolutely Inaccurate; 4 = Neither Accurate Nor Inaccurate; 7 = Absolutely Accurate)

4. How accurate is the Task Module document?
   1 2 3 4 5 6 7 (1 = Absolutely Inaccurate; 4 = Neither Accurate Nor Inaccurate; 7 = Absolutely Accurate)

5. How easy is it to understand the SKT Extract?
   1 2 3 4 5 6 7 (1 = Extremely Difficult; 4 = Neither Easy Nor Difficult; 7 = Extremely Easy)

6. How easy is it to understand the Task Module document?
   1 2 3 4 5 6 7 (1 = Extremely Difficult; 4 = Neither Easy Nor Difficult; 7 = Extremely Easy)

7. How easy will it be to use the SKT Extract to link test items to tasks?
   1 2 3 4 5 6 7 (1 = Extremely Difficult; 4 = Neither Easy Nor Difficult; 7 = Extremely Easy)

8. How easy will it be to use the Task Module document to link test items to tasks?
   1 2 3 4 5 6 7 (1 = Extremely Difficult; 4 = Neither Easy Nor Difficult; 7 = Extremely Easy)

9. How often will you use the SKT Extract to link test items to tasks?
   1 2 3 4 5 6 7 (1 = Not at all; 4 = Half of the time; 7 = All of the time)

10. How often will you use the Task Module document to link test items to tasks?
   1 2 3 4 5 6 7 (1 = Not at all; 4 = Half of the time; 7 = All of the time)

Comments:

Attachment 2
TP Questionnaire

AFSC ___________ NAME ___________ DATE ___________ PROJECT DATE ___________

Read each question, then, in the space next to the question number, mark the number that best corresponds to your opinion. (Items 1-10 and their 7-point response scales are identical to those on the SME questionnaire in Attachment 1.)

Comments:



Attachment 3
Cover Letter

Dear SME or TP:

We are asking for your help in providing a better product for use in matching Occupational Survey Report (OSR) tasks to Specialty Knowledge Test (SKT) items.

Your assistance in this effort is completely voluntary. In addition, your privacy will be protected and no personal information will be made public. Please read and sign your name in the appropriate place below.

I agree to participate in this research. ____________________________ (Signature) ____________________________ (Printed Name)

Please read the following paragraphs, review the two OSR task documents, and then complete the questionnaire about these documents. Results from this questionnaire may affect how OSR tasks are presented to SKT teams in the future, so please read and respond carefully and thoughtfully.

Document 1 is the traditional SKT Extract that has been used to validate test items at AFOMS for some time. The SKT Extract contains the top 150 to 200 tasks identified in the OSR and presents these tasks in several different ways. It is sorted first by rank (i.e., E-5 and E-6/7), then by duty area (i.e., A, B, and C), then by percent members performing (PMP). The SKT Extract also lists a derived testing importance (TI) value, either field-validated (FVTI) or predicted (PTI), for each task. Finally, the SKT Extract provides a task listing in task number order that shows the PMP and TI values across ranks.

Document 2 is a task module listing that is currently used by AFOMS occupational analysts to help determine career field training needs and requirements. This document lists all tasks identified in the OSR, sorted by task module (tasks that are co-performed). For example, in the career field of cooking, one task module might be cake baking; tasks within that module include adding ingredients, mixing ingredients, greasing and flouring the cake pan, turning on the oven, pouring mix into the cake pan, putting the cake into the oven, setting the timer, and so on.

Please read each question, then circle the number that best corresponds with your IMMEDIATE reaction. If you have any questions, please have your TP contact me.

Thank you,

Shirley Snooks, Personnel Psychologist, AFOMS/TEEQ
Cindy Luster, Occupational Analyst, AFOMS/OAV


Attachment 4
Respondent Comments

"If you go to the Task Module, the format should incorporate the tasks from the STS as headers to align information under. Also, the task number from the STS should be reflected to make it easier to associate the STS with the Task Module task number; all of which has to be transcribed over to the item record card."

"Recommend linking items listed in the SKT Extract and Task Module to the STS. I.e., list each process under the appropriate STS heading (and in the same order)."

"The tasks should actually be aligned with the STS. If they were initially aligned this way, then it would make the referencing much easier."

"The SKT extract and Task Module would both be valuable documents if they were re-arranged to flow logically with the STS."

"Task Module was nearly useless because of the difficulty of linking test areas to the tasks offered. The intent is very good. Solution: Plan the survey with actual CDC areas in mind, by paragraph number. Will reduce linking time by probably 75%."

"The Task Module in our career field needs updating. Some things need to be added as there was no appropriate task."

"I would like to see SKT Extract as computer based to save research time. Using root words as a search engine on the computer would accurately match items for the SME."

"In the Task Module, we do not know the task category since it has only numeric headings. If all processes can be identified (as these headings) then the task module will be easier to use. Also, ordering should not be by verb, rather it should list nouns e.g., "The Shipment Planning Process". How about that?"

"The SKT is very useful in determining what personnel in career field are doing. However some items listed on survey were difficult to relate to test items: wording of some tasks were very broad. Perhaps a team of SMEs could refine wording of survey questions/items. SMEs do their best to inform personnel of importance of surveys, but we need to enlist help of unit training managers to stress importance. Some unit training managers do not do a very good job of spreading word."

"Logical grouping of tasks in task module does not fit "tree" structure of groupings I am comfortable and familiar with. Tree structure identifies major broad task areas as major branches with individual smaller groups and ultimately individual tasks stemming from the major broad task areas. This task module data might be more usable if the modules were sorted in recognizable groups similar to the way they are organized in alignment with AFOSH and Mishap prevention programs."

"I believe the current application of linking the SKT and career field task is the best method in developing test questions. However, the Extract could be more useful if the information received was based on the current operation task and time period. For example, when a major rewrite is scheduled, the Measurement Squadron must ensure the information is both valid and up-to-date."

"The sample Task Module seemed easier to read. However, there were some clusters that were not accurate. May need an SME to help insure accuracy of clusters. With both, more in depth descriptions of the entries and what they are used for is needed."

"I like the Task Module Data presentation/outline. I think by grouping the tasks together it gives the SME a better look at which task may be the best "true" fit. In both documents there were a few items that we had to make a best fit because there was no exact match. Overall, both documents are very helpful."

"The PMP and FVTI data is important to determine if questions need to come from certain task numbers. If the Task Module document and/or the SKT Extract were linked more closely to the STS, linking test items to tasks would be easier. Overall, I like the SKT Extract better than the Task Module document."

"The SKT Extract should match up better with the CFETP tasks. Most work centers don't have access to the SKT Extract information because they don't know it exists. Information is derived from surveys distributed to the field, but technicians don't understand the importance of the survey, or how it affects the entire career field."

"I understand the goal is to present SKT Extract information in the best possible format for SME use. First, the SKT Extract and Task Module should be linked to the STS by STS task #. At this time the STS and SKT Extracts are independent documents. Of these two documents, the STS Extract is the best. Why not keep the STS Extract format and sort tasks as modules?"

"Not a great tool, but better than the SKT Extract. Ideally an electronic version with keyword sequence capabilities is needed. The SKT Extract and/or the Task Module would be easier to use if it correlated to the STS."

"The method used to develop the Occupational survey should have outlined the current CFETP and the current CDCs. With minimal correlation it is difficult to use any document to assist and align test questions. This faulty comparison makes it simply another task for SMEs with minimal value added to the process."

"Task module document assumes "that if incumbents perform Task A and Task B, there is high likelihood the two tasks share common skills and knowledge and thus can be trained together", which is not correct. Aligning the task groupings to the most current STS seems to be a more logical method of grouping tasks."

"Areas of general information headings (i.e., Radio Production Skills, Writing Broadcast Products, etc.) not evident on Task Module. It would take additional time to look for information w/o headings."

"I do not feel the task list is 100% comprehensive, however I do understand this is a product of a sample of workers. I hope our 3-level tech school uses the task analysis to build their training program."

"SKT Extract should eliminate some items that aren't necessary, for example: PFE/Supervisor type items shouldn't be placed in SKT Extract (even though it does need to be asked during surveys). Any remove/replace tasks should be removed from the extract for SKT development – no SKT test questions will cover simple remove/replace tasks."

"Task Module usefulness is unknown without actual data, but compared to SKT Extract, it seems much easier to use the logical sequence of the TM Extract."

"The module data for tasks seems to be easier to follow. With a little bit of training on it, it should be no problem at all."


References

Institute for Job and Occupational Analysis (1994). Training Impact Decision & Evaluation System for Air Force Career Fields. TIDES Operational Guide.

Mitchell, J.L., Buckenmeyer, D.V., and Hand, D.K. (1989). The Use of CODAP in the Training Decision System.

Phalen, W.J., Staley, M.R., and Mitchell, J.L. (1987). New ASCII CODAP Programs and Products for Interpreting Hierarchical and Nonhierarchical Clusters. Proceedings of the 6th International Occupational Analysts' Workshop, San Antonio, TX: USAF Occupational Measurement Squadron.

Pohler, W.J. (1999). Test Content Validation – New Data or Available Data. Proceedings of the 11th International Occupational Analysts Workshop, San Antonio, TX: USAF Occupational Measurement Squadron.

Ruck, H.W. (1982). Research and Development of a Training Decisions System. Proceedings of the Society of Applied Learning Technology, Orlando, FL.

Snooks, S.F., and Luster, C. (2003). Occupational Analysis: Grouping Performance Tasks into Task Modules for Use in Test Development. Proceedings of the 13th International Occupational Analysts Workshop, San Antonio, TX: USAF Occupational Measurement Squadron.

Thew, M.C., and Weissmuller, J.J. (1978). CODAP: A new modular approach to occupational analysis. Proceedings of the 20th Annual Conference of the Military Testing Association, Oklahoma City, OK, 362-372.

Vaughan, D.S., Mitchell, J.L., Yadrick, R.M., Perrin, B.M., Knight, J.R., Eschenbrenner, A.J., Rueter, F.H., and Fledsott, S. (1989). Research and Development of the Training Decisions System (AFHRL-TR-88-50).


THE SCOPE OF PSYCHOLOGICAL TEST SYSTEMS WITHIN THE AUSTRIAN ARMED FORCES

Dr. Christoph Brugger
Austrian Armed Forces Military Psychology Service
Vienna, Austria
hpa.hpd@bmlv.gv.at

ABSTRACT

During his or her career, a soldier in the Austrian Armed Forces has to pass several psychological tests. Since Austria will also be involved in supporting future WEU missions, changes in preconditions are reflected in the test systems applied. The integration of existing and planned selection systems into the typical career, from recruitment to peace support operations or pilot selection, is shown. Some more detail is offered concerning the computer-based test systems used in the induction centers, as well as the problems arising from harmonizing selection for peace support operations with the selection of forces earmarked for international operations.

INTRODUCTION

Most papers on testing cover specific methodological issues or aspects of individual tests, but almost never show in detail how test systems are integrated within an organization. Even though many people are interested in such aspects and often ask how such systems are implemented in individual countries, very little information exists in this field. Based on these observations, the present state, as well as some related problems, within the Austrian Armed Forces is presented here.

FACTS ABOUT THE AUSTRIAN ARMED FORCES

For a better understanding of the following, some facts about Austria and the Austrian Armed Forces are presented first:

• Austria is a small country. To reach the necessary strength, Austria's defense is based on a conscript and militia/reserve system.
• Military service in Austria is compulsory. Every young man at the age of about 18 has to visit one of the six induction centers, where, after a detailed examination, a commission decides whether he is fit for military service or not.
• Persons not fit for the army drop out of this system. All others have the option to choose alternative service with a civilian non-profit organization. Alternative service lasts significantly longer than military service, to limit excessive losses of recruited personnel.
• Military service lasts either 8 months, or 7 months plus 3 reserve recalls of 10 days each, usually every two years.
• Austria has a long tradition of participation in peace support operations of the United Nations. PSO personnel consist of volunteers only, professional soldiers as well as members of the militia.
• Austria has also committed itself to supporting future missions of the European Union. Therefore, specially prepared and trained units – again based on volunteers – have to be raised.

PSYCHOLOGICAL TEST SYSTEMS

All these aspects have to be considered when deciding where and how to implement psychological test systems for the Armed Forces. According to the current specifications, testing and selection is applied to

• recruits at the induction centers
• soldiers applying to become cadre/professional soldiers
• soldiers applying for service with the International Operations Forces
• professional soldiers as well as members of the militia, to select PSO personnel
• soldiers applying or intended for special functions, such as air defense personnel, pilots, etc.

TESTING AT THE INDUCTION CENTERS

A young man officially comes into contact with the Austrian Armed Forces for the first time when he takes part in the examinations at an induction center. There are six induction centers in Austria, examining about 60,000 persons a year.

The examination lasts one and a half days, with an information section plus medical and psychological tests on the first day, and a psychological interview as well as a medical examination on the second day. Based on the data collected, a commission consisting of an officer, a physician (usually a medical officer), and a psychologist decides whether the draftee is fit for military service or not.

Given the number of people to be tested and the limited time available, the test system implemented at the induction centers is computer-based, including some adaptive tests as well as self-adapting test batteries. Adaptive testing is optimized to yield the highest precision in the lower ability range, since persons with low abilities belong to those possibly unfit for military service.
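The paper does not name the item response model behind these adaptive tests; the sketch below illustrates the underlying idea with a simple Rasch (1PL) model, where the next item administered is the one with maximum information near a provisional ability estimate placed in the low-ability region. The item difficulties and the cut score are invented for the example.

    import numpy as np

    def rasch_information(theta, b):
        """Item information I(theta) = P * (1 - P) under the Rasch model,
        where P is the probability of a correct response."""
        p = 1.0 / (1.0 + np.exp(-(theta - b)))
        return p * (1.0 - p)

    # Hypothetical item difficulties; precision matters most near the
    # low-ability threshold separating "fit" from "possibly unfit".
    difficulties = np.array([-2.0, -1.2, -0.5, 0.0, 0.8, 1.5])
    theta_est = -1.0  # provisional ability estimate in the lower range

    next_item = int(np.argmax(rasch_information(theta_est, difficulties)))
    print(next_item)  # picks the item whose difficulty is closest to theta_est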

To provide information for improved placement, the battery registers verbal abilities, reasoning, spatial performance, and perceptual speed, and also includes tests covering psychomotor skills using tracking tasks and anticipation of movement. Conditions imposing additional stress in the second half of the test battery provide information on performance under load.

A life event inventory and a recruitment-specific personality questionnaire are also part of the test battery.


To support the psychological interview, which is an important part of the selection procedure, printouts with detailed test results are produced for each candidate; the test results are also available online. If there is even a slight indication of a problem, the candidate is interviewed by one of the two psychologists present at each induction center. This happens in about half of the cases; the other half is interviewed by specially trained non-academic personnel.

ERGOPSYCHOMETRY

All of the following test systems have one thing in common: they also make use, in some way, of the concept of ergopsychometry, developed by Guttmann and others at the University of Vienna: testing under neutral conditions followed by testing under appropriate load conditions. It has been shown in various fields that the measured change of performance allows a surprisingly reliable prognosis of performance under real-life load conditions. As stress resistance is obviously an important aspect of a soldier's daily life, it is also one of the main dimensions our tests are supposed to reveal, and ergopsychometry is therefore the method of choice.
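The paper does not say how the change between neutral and load performance is quantified, so the index below is only one plausible choice: a residualized change score, i.e., performance under load with the part predictable from baseline regressed out. The scores are invented for illustration.

import numpy as np

def residualized_change(neutral, load):
    # Regress load scores on neutral scores and return the residuals;
    # positive values mean performance held up better than expected
    neutral = np.asarray(neutral, dtype=float)
    load = np.asarray(load, dtype=float)
    slope, intercept = np.polyfit(neutral, load, 1)
    return load - (slope * neutral + intercept)

neutral = [52, 47, 60, 55, 41]   # hypothetical scores, neutral conditions
load = [50, 38, 59, 49, 30]      # same tests repeated under load
print(residualized_change(neutral, load).round(2))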

SELECTION OF FUTURE CADRE

The Cadre Aptitude Test is the next hurdle young soldiers applying to become career officers or NCOs have to pass. In general, this test takes place about one to one and a half years after passing the tests at the induction centers, that is, after about half a year of military service. For female applicants, the Cadre Aptitude Test immediately follows the basic entrance exam at an induction center.

This test should not only provide information on the cognitive abilities necessary to successfully complete courses on schedule, but should also cover the basic aptitude necessary for serving abroad as well as at home. While participation in international operations is currently still based on volunteers, both cadre and militia, the next generation of cadre soldiers will probably have to commit themselves to service abroad. Based on long-term experience in PSO selection and other selection tasks, and supported by very proficient officers and NCOs, both of the Austrian International Peace Support Command and of the homeland troops, it was decided to use a mix of psychometric tests and assessment elements in the Cadre Aptitude Test to cover all relevant psychological aspects. These also include social competence (registered by assessing, for example, communication skills and conflict behavior), planning and organizational skills, motivational parameters, etc. The soldiers are tested on their ability to perform their duties even under stress, without becoming a potential danger to themselves or others.

The Cadre Aptitude Test starts at 0930 and continues after a lunch break until 1700. Tests of physical condition follow, including a swimming exam and a 16 km hike at night, again followed by another block of assessment elements and tests under load, for a total of almost 24 hours without sleep. The procedure is completed by a final personal interview with a psychologist.



INTERNATIONAL OPERATIONS FORCES ENTRANCE EXAM

As Austria has committed itself to supporting future missions of the European Union, specially prepared and trained units – still based on volunteers – have to be raised. To become part of such a unit, members of the existing cadre who are not yet part of the International Operations Forces also have to pass an examination. This exam is shorter than the one for young soldiers, as more information is already available on existing armed forces personnel. But to check the current psychological state, some testing is necessary.

To reveal possible impairment, this test system again combines the ergopsychometric approach based on tests (for example, of memory and concentration) with assessment elements, questionnaires, and a final personal interview by a psychologist.

SELECTION PROCEDURE FOR PEACE SUPPORT OPERATIONS

Except for soldiers who have successfully passed a test covering the aptitude for serving abroad within the last three years, all volunteers for PSO, professional soldiers as well as members of the militia, have to pass a test of medical, physical, and psychological fitness. Our concept of combining standard procedures with the so-called "shelter test" – lasting all night, with group dynamic tasks under stressful conditions, followed by a final personal interview by a psychologist – was presented by Slop at the 42nd IMTA conference in 2000.

All of the last three test systems mentioned also cover the aptitude for serving abroad, but not all are administered under the same command. They were developed for subjects with differing backgrounds, and the time frames within which test results had to be refreshed used to be shorter for PSO personnel. Some difficulties were therefore encountered in harmonizing the existing systems, but it was necessary to reach some kind of mutual agreement. Now the results of all three test systems concerning aptitude for serving abroad are valid for three years. Members of units earmarked for international operations have to participate in a short check of their current physical and psychological state once a year, as they might be deployed at any time. If a soldier wants to join a unit earmarked for international operations and the tests were taken more than one year ago, a short check is also required.

TESTS FOR SPECIAL FUNCTIONS

There are also specific test systems for pilot candidates or applicants for special units where particular abilities are necessary. While all of the test systems described before are based on selecting out persons with significant shortcomings, the tests for specific functions are designed to select the cream of the crop. The numbers of subjects tested per year for such functions range from about 400 candidates in the pilot pre-selection down to 50 for some very specific jobs. In most cases, physical or medical checks have to be passed before entering the psychological exam.


CONCLUSION

During his or her career, a soldier in the Austrian Armed Forces has to pass several psychological tests. Each of the test systems applied was designed taking into account the specific requirements of the function as well as the number and background of the subjects to be tested.

One might question whether the effort taken to obtain reliable and valid data, and the strain and stress imposed on the applicants, is still reasonable. Acceptance is quite good: even in the higher ranks, polls of officers at several points in time yielded positive judgements from more than 80% up to 100% of respondents. And especially within a military organization with a small budget, high-quality personnel selection can help to bridge some of the existing gaps.



JOB-SPECIFIC PERSONALITY ATTRIBUTES AS PREDICTORS OF PSYCHOLOGICAL WELL-BEING

Dr. H. Canan Sumer, Dr. Reyhan Bilgic, Dr. Nebi Sumer, and Tugba Erol, M.S.
Middle East Technical University
Department of Psychology
06531 Ankara, Turkey
hcanan@metu.edu.tr

ABSTRACT

The purpose of this study was to examine the nature of the relationships between job-specific personality dimensions and psychological well-being for noncommissioned officers (NCOs) in the Turkish Armed Forces (TAF). A job-specific personality inventory comprising measures of 11 personality dimensions (i.e., military bearing, determination, dependability, orderliness, communication, self-discipline, self-confidence, agreeableness, directing and monitoring, adaptability, and emotional stability) was developed for selection purposes. The inventory was administered to a representative sample of 1428 NCOs along with a general mental health inventory developed by the authors, which consisted of six dimensions of psychological well-being: depression, phobic tendencies, hostility, psychotic tendencies, psychosomatic complaints, and anxiety. Exploratory and confirmatory factor analyses suggested the existence of a single factor underlying the six psychological well-being dimensions, named Mental Health, and two latent factors underlying the 11 personality dimensions, named Military Demeanor and Military Efficacy. The Mental Health factor was regressed on the two personality constructs using LISREL 8.30 (Jöreskog & Sörbom, 1996). The two personality constructs explained 91 percent of the variance in the Mental Health construct. A stepwise regression indicated that the beta weights of the personality measures were significant except for military bearing, orderliness, and dependability. Results suggested that job-specific personality attributes were predictive of mental health. Implications of the findings for the selection of NCOs are discussed.

Paper presented at the International Military Testing Association 2003 Conference, Pensacola, Florida. Address correspondence to H. Canan Sumer, Middle East Technical University, Department of Psychology, 06531 Ankara, Turkey. Send electronic mail correspondence to hcanan@metu.edu.tr. This study is a part of a project sponsored by the Turkish Armed Forces.


INTRODUCTION

The past two decades have witnessed the resurgence of individual differences variables, especially personality attributes, in a variety of human resources management applications. An important share of the credit for this movement rightfully goes to the five-factor model of personality (i.e., Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) (Costa & McCrae, 1985; Goldberg, 1990), which stimulated a large quantity of both empirical and theoretical work on the relationships between personality variables and a number of outcome variables, performance being the most widely studied one. Recent literature suggests that personality predicts job performance, and that the validities of certain personality constructs, such as conscientiousness or integrity, generalize across situations (e.g., Barrick & Mount, 1991; Borman, Hanson, & Hedge, 1997; Hogan, Hogan, & Roberts, 1996; Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Ones, Viswesvaran, & Schmidt, 1993; Salgado, 1997). In a recent meta-analysis of 15 meta-analytic studies, Barrick, Mount, and Judge (2001) found that among the Big Five traits, Conscientiousness and Emotional Stability were valid predictors of performance in all occupations, whereas the other three traits predicted success in specific occupations. Empirical evidence also suggests that different facets of performance have different predictors, and that the attributes that lead incumbents to do well in task performance are different from those that lead incumbents to do well in contextual aspects of performance (e.g., McCloy, Campbell, & Cudeck, 1994; Motowidlo & Van Scotter, 1994; Van Scotter & Motowidlo, 1996). Motowidlo and Van Scotter reported that although both task performance and contextual performance contributed independently to overall job performance, personality variables were more likely to predict contextual performance than task performance.

The relationships of personality variables with outcome variables other than job performance have also received research attention. For example, Schneider's Attraction-Selection-Attrition (ASA) model lends support to the criticality of personality in both the attraction of potential candidates to the organization and the turnover process (Schneider, 1987; Schneider, Goldstein, & Smith, 1995). The model states that individuals are attracted to, selected by, and stay with organizations that suit their personality characteristics. The major assumption of the ASA model is that both the attraction and retention processes are based on some kind of person-environment (i.e., organization) fit.

Personality and Psychological Health in the Military Context

Personality variables have also been considered in the selection of military personnel. Specific personality characteristics or personality profiles have been shown to be associated with desired or undesired outcomes in military settings (e.g., Bartram, 1995; Sandal, Endresen, Vaernes, & Ursin, 1999; Stevens, Hemstreet, & Gardner, 1989; Stricker & Rock, 1998; Sumer, Sumer, & Cifci, 2000; Thomas, Dickson, & Bliese, 2001). Furthermore, in addition to job-related personality variables, psychological well-being or mental health has been among the individual differences factors considered in the selection/screening of military personnel (e.g., Holden & Scholtz, 2002; Magruder, 2000). As stated by Krueger (2001), compared to most civilian jobs, military jobs involve much more demanding physical and psychological conditions, such as fear, sensory overload, sensory deprivation, exposure to extreme geographies and climatic temperatures, and the like. These conditions call for individuals with not only physical but also psychological stamina. According to Cigrang, Todd, and Carbone (2000), mental-health-related problems play a critical role in a significant portion of the turnover/discharge within the first six months of enlistment in the U.S. Armed Forces. Holden and Scholtz (2002) used the Holden Psychological Screening Inventory (HPSI) to predict basic military training outcome for a sample of noncommissioned recruits in the Canadian Forces. Results indicated that the Depression scale of the inventory was predictive of training attrition, yielding support for the use of the inventory as a screening tool.

The Relationship Between Personality and Mental Health

The line between job-related personality attributes and psychological well-being is not always clear. According to Russell and Marrero (2000), "[personality] styles mirror the traits that, in extreme forms, are labeled disorder." These authors almost equate personality style with psychological/mental health or well-being. However, we believe that although these two constructs are closely associated, at the conceptual level a distinction needs to be made between personality style and the overall psychological well-being or mental health of a person. Personality is simply what makes people act, feel, and think differently from one another. Psychological health, on the other hand, refers to the extent to which an individual is functioning, feeling, and thinking within the "expected" ranges. Accordingly, while most measures of mental health are aimed at discriminating between clinical and nonclinical samples (or between the so-called normal and abnormal), personality measures, which are mostly nonclinical in nature, are descriptive of an individual's patterns of functioning in a particular domain of life (e.g., work, nonwork).

There exists empirical evidence concerning personality variables as predictors of mental health (e.g., Ball, Tennen, Poling, Kranzler, & Rounsaville, 1997; DeNeve & Cooper, 1998; Siegler & Brummett, 2000; Trull & Sher, 1994). Ball et al. reported that "normal" personality dimensions, such as Agreeableness, Conscientiousness, and Extraversion, contributed significantly to the prediction of psychopathology. For example, they found that Agreeableness and Conscientiousness contributed significantly to the prediction of antisocial and borderline personality disorders. Furthermore, as predicted, individuals with schizoid and avoidant personality disorders were lower in Extraversion. Hence, based on the available empirical and theoretical evidence, we state that personality and psychological health, although related, are conceptually distinct, and that one should be able to predict one from the other. Moreover, we believe that the prediction of psychological well-being from job-related personality characteristics has important practical implications, reduced cost of selection being an important one, for the selection of military personnel.

The majority of the studies examining the relationship between personality and mental health have looked at the relationship between Axis I or Axis II disorders of the American Psychiatric Association's (1994) Diagnostic and Statistical Manual of Mental Disorders, 4th edition, and the five-factor dimensions. It is believed that the power of personality attributes in predicting psychological health could be further improved when job/context-specific attributes are employed.

Thus, the purpose of the present study was to examine the nature of the relationships between personality variables and mental health within the work context. More specifically, the study was carried out to examine the predictive ability of job-specific personality variables concerning psychological well-being for NCOs. Consequently, two inventories, an 11-dimension measure of NCO personality and a 6-dimension measure of psychological well-being, were developed, and the hypothesized relationship was tested on a relatively large sample of NCOs in the TAF. It is important to note that the term personality is not used rigidly in this study; some skill-based individual differences variables were also included under the same term.


METHOD

Participants

In the current study, 1500 NCOs received a questionnaire package, comprising the scales used in the study, through the internal mailing system of the TAF. Of the 1500 NCOs receiving the package, 1428 (95.2 percent) returned the package with the scales completed. Respondents were employed in the Army (483), Navy (298), Air Force (345), and Gendarmerie (302), with a mean age of 33.11 years (SD = 6.85, range = 38). The average tenure of the participants was 13.17 years (SD = 6.86, range = 34.25). All but two participants were men.

Measures

Noncommissioned Officer Personality Inventory (NCOPI)

A personality inventory, developed by the researchers, was used to evaluate the job-related personality attributes of the NCOs of the TAF. The NCOPI comprises 11 job-related personality dimensions (i.e., military bearing, determination, dependability, orderliness, communication, self-discipline, self-confidence, agreeableness, directing and monitoring, adaptability, and emotional stability; see Table 1 for dimension descriptions). The development process of the NCOPI is summarized below. The present study involves the analyses of the data collected at Step III (i.e., the Norm Study) of the NCOPI development.

Step I - Identification and Verification of Critical NCO Attributes. Through 15 focus group discussions with both noncommissioned and commissioned officers (COs) (N = 152) and the administration of a Critical NCO Behavior Incidents Questionnaire (N = 214), 92 NCO attributes critical for performance were identified. The identified attributes were then rated by both NCOs (N = 978) and COs (N = 656) in terms of the extent to which they discriminated between successful and unsuccessful NCOs on a 6-point Likert-type scale (1 = Does not discriminate at all; 6 = Discriminates very much). The analyses yielded 56 attributes relevant to the job of NCO in the TAF. The 56 attributes were subjected to further screening by subject matter experts (i.e., personnel officers from all four forces). The SMEs were asked to evaluate each attribute, on two 5-point Likert-type scales, in terms of its importance and the cost of not assessing it in the selection of NCOs. The 46 attributes surviving the examination of the SMEs were then examined by the researchers, and these 46 specific attributes laid down the framework for item development.

Step II - Item Development and Pilot Study. In developing the items, the relevant literatures and the International Personality Item Pool (IPIP, 1999) were first examined, and an initial item pool tapping the dimensions identified at the previous stage was formed. The item development was carried out using an iterative approach. That is, the items developed for a given attribute by individual members of the research team were brought together and examined in group meetings. In these meetings, items were either kept, revised, or eliminated from the item pool, and the remaining items were then reexamined. The resulting item pool (N = 227) was then content-analyzed, and the items were further grouped under 28 broader personality variables.

The initial version of the NCOPI was administered to a sample of 483 NCOs representing the different forces. The respondents were asked to indicate the extent to which each item/statement was true of themselves (1 = Completely false; 4 = Completely true). Item analyses (reliability analyses and factor analyses) resulted in major revisions to the initial version of the NCOPI in terms of both item and dimension numbers. A few new items were also developed to keep the item numbers across dimensions within an established range. The resulting version had 166 items under 17 personality dimensions.



Table 1. The NCOPI Dimensions and General Descriptors

1. Military Bearing
   • complying with rules and regulations
   • respecting superiors and the chain of command
   • showing pride in being military personnel
2. Determination
   • wanting to be successful
   • liking challenge
   • having persistence
3. Dependability
   • being reliable
   • being honest
   • being reserved and keeping secrets
4. Orderliness
   • being organized, clean, and tidy
5. Communication
   • expressing oneself clearly
   • being an active listener
6. Self-Discipline
   • sense of responsibility
   • being hardworking
   • planning and execution
7. Self-Confidence
   • self-reliance
   • believing in oneself
8. Agreeableness
   • being able to establish and keep relationships
   • working with others in harmony
9. Directing and Monitoring
   • being able to direct, coordinate, and guide subordinates
10. Adaptability
   • being able to adapt to changing work conditions
   • stress tolerance
11. Emotional Stability
   • keeping calm
   • not experiencing sudden fluctuations in mood states

Step III - Norm Study. The revised version of the NCOPI was administered to a sample of 1500 NCOs, of whom 1428 returned the inventory. The purpose of this administration was twofold: first, to finalize the inventory, and second, to establish norms on the final version of the NCOPI for the population of interest. In analyzing the norm data, reliability analyses were followed by a series of exploratory and confirmatory factor analyses. These analyses resulted in the final version of the NCOPI, which comprised 103 items under 11 job-related personality dimensions. The internal consistency reliabilities for the 11 NCOPI dimensions are presented in Table 2. Norms were established for the final version.

General Mental Health Inventory (GMHI)

A mental health inventory, developed by the researchers as a screening tool to be used by the TAF, was used to evaluate the overall psychological well-being of the respondents. This inventory was developed in response to a need expressed by the management of the organization. The GMHI was developed in two steps alongside the NCOPI.

Step I - Item Development and Pilot Study. First, based on the data obtained from the focus group discussions, the Critical NCO Behavior Incidents Questionnaire described above, and the meetings with SMEs (i.e., personnel officers), six dimensions of psychological well-being were identified: psychotic tendencies, phobic tendencies, psychosomatic complaints, hostility, depression, and anxiety/obsessive-compulsive behaviors. An initial item pool was formed, composed of items related to the dimensions identified. In developing the items, relevant literatures and available screening tools were examined. Similar to the item development of the NCOPI, items of the GMHI were developed using an iterative approach. The initial version of the GMHI to be piloted consisted of 61 items under the identified categories.

The GMHI was piloted at the same time as the NCOPI on the same sample (N = 438), using the same scale format. Item analyses resulted in a major revision of the anxiety/obsessive-compulsive tendencies dimension. The items aiming to measure obsessive-compulsive tendencies had lower item-total correlations and consistently lowered the internal consistency reliability of the dimension. A decision was made to eliminate these items and to focus more on generalized anxiety. Hence, new items tapping into generalized anxiety were developed. The resulting version had 64 items grouped under 6 psychological well-being dimensions.

Step II - Norm Study. The revised version of the GMHI was administered (along with the NCOPI) to a sample of 1500 NCOs, of whom 1428 returned the inventory. Again, the purpose of this administration was twofold: first, to finalize the GMHI, and second, to establish norms on the final version of the inventory for the NCOs in the TAF.

In analyzing the GMHI norm data, reliability analyses were followed by a series of exploratory and confirmatory factor analyses. These analyses resulted in the final version of the GMHI, which is composed of 55 items under the 6 mental health dimensions. The internal consistency reliabilities for these dimensions are presented in Table 2. Norms were established for the final version.¹

Demographic Information Questionnaire

The questionnaire package sent to the participants included a demographic information questionnaire. This questionnaire consisted of questions on gender, age, tenure, specialty area, force, rank, posting, and the like.

Procedure

The questionnaire package was sent to an approximately representative sample of the NCOs employed in the Army, Navy, Air Force, and Gendarmerie of the TAF through the internal mail system, with a cover letter from the Chief of Command. The respondents were asked to fill out the forms and return them via the same system. The relatively high response rate obtained in the current study was thought to be partially a result of the cover letter accompanying the package, which encouraged participants to respond.

RESULTS

Correlations and reliabilities for the variables are presented in Table 2. As can be seen from the table, the correlations between the variables of the study were all significant, ranging from -.15 to .81. Reliabilities of both the job-specific personality dimensions and the psychological health dimensions were all above .70.

¹ For a more detailed account of the development process of both the NCOPI and the GMHI, please contact the corresponding author.
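As a reference point for the reliability coefficients on the diagonal of Table 2, a minimal Cronbach's alpha computation is sketched below. The NCOPI and GMHI item-level data are not public, so the responses are simulated around a shared latent component; the sample size mirrors the study's N but everything else is invented.

import numpy as np

def cronbach_alpha(items):
    # items: (n_respondents, n_items) array of item scores
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

rng = np.random.default_rng(1)
trait = rng.normal(size=(1428, 1))                     # shared latent component
items = trait + rng.normal(scale=1.0, size=(1428, 8))  # 8 noisy items
print(round(cronbach_alpha(items), 2))                 # high by construction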



Table 2. Correlations Between Personality and Mental Health Dimensions and Reliabilities

                                1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18
 1. Depression                .85
 2. Phobic Tendencies         .65  .74
 3. Hostility                 .58  .47  .80
 4. Psychotic Tendencies      .70  .59  .54  .75
 5. Psychosomatic Complaints  .62  .52  .50  .57  .83
 6. Anxiety                   .81  .68  .63  .73  .70  .88
 7. Military Bearing         -.52 -.45 -.47 -.44 -.29 -.44  .80
 8. Determination            -.54 -.46 -.34 -.33 -.33 -.43  .45  .73
 9. Dependability            -.43 -.36 -.40 -.47 -.22 -.37  .57  .35  .71
10. Orderliness              -.29 -.17 -.19 -.27 -.15 -.22  .30  .33  .34  .82
11. Communication            -.64 -.59 -.52 -.55 -.48 -.63  .43  .51  .35  .27  .78
12. Self-Discipline          -.66 -.52 -.44 -.53 -.44 -.57  .48  .64  .49  .51  .60  .81
13. Agreeableness            -.57 -.52 -.61 -.52 -.38 -.54  .53  .39  .40  .22  .55  .47  .70
14. Directing and Monitoring -.40 -.36 -.15 -.24 -.21 -.31  .21  .52  .15  .24  .49  .49  .27  .72
15. Adaptability             -.61 -.61 -.59 -.50 -.49 -.64  .59  .46  .37  .19  .58  .48  .58  .31  .75
16. Self-Confidence          -.77 -.59 -.47 -.59 -.52 -.69  .43  .55  .39  .28  .61  .65  .47  .45  .53  .81
17. Emotional Stability      -.81 -.64 -.68 -.68 -.61 -.81  .54  .49  .45  .27  .64  .61  .64  .35  .72  .67  .83
18. Mental Health             .88  .77  .78  .82  .79  .92 -.53 -.49 -.46 -.26 -.69 -.63 -.64 -.33 -.70 -.73 -.86  .95

Note 1. All of the correlations are significant at .01. Note 2. Reliabilities are presented on the diagonal.

Since both personality and mental health items were presented within the same instrument, using the same format, to the same data source (i.e., the NCOs), there was a possibility that the observed correlations could be an artifact of the common method employed. Hence, before testing the relationship between the personality and mental health dimensions, a series of confirmatory factor analyses using LISREL 8.30 (Jöreskog & Sörbom, 1996) was performed to test for common method variance. In these analyses, first a confirmatory model in which all indicators (personality and mental health-related dimensions all together) clustered under a single latent variable was tested. This model was then compared against two alternative models. The first alternative, against which the single-factor model was evaluated, had two latent constructs, one for job-related personality variables and the other for mental health variables (i.e., Personality and Mental Health). The second alternative model suggested three latent constructs, one for mental health variables and two for personality variables (i.e., Mental Health, Military Demeanor, and Military Efficacy). The two latent personality constructs in this model were identified through exploratory processes. The Military Demeanor latent variable included adaptability, emotional stability, military bearing, dependability, and agreeableness, whereas Military Efficacy included determination, self-discipline, orderliness, communication, self-confidence, and directing and monitoring.

An examination of the modification indices suggested that the errors between several indicator pairs be correlated. The majority of these pairs were conceptually related. Hence, a decision was made to free the errors between dependability and military bearing, dependability and self-discipline, orderliness and self-discipline, determination and directing and monitoring, adaptability and military bearing, and emotional stability and military bearing. The single-factor model was then compared against the two alternative models. Results suggested that the two-factor and three-factor alternatives had a better fit than the single-factor model, Δχ²(1, N = 1428) = 144.65, p < .001, and Δχ²(3, N = 1428) = 675.86, p < .001, respectively, decreasing the likelihood that the observed relationships were merely an artifact of common method variance. Furthermore, the results suggested that the alternative model with two latent constructs for personality variables (i.e., the three-factor model) had a relatively better fit than the model with only one latent variable embracing the personality variables (i.e., the two-factor model), Δχ²(2, N = 1428) = 531.21, p < .001. Hence, a decision was made to conceptualize the 11 NCO personality variables measured by the NCOPI as being grouped under two latent personality constructs in the following analyses. The correlations between the Mental Health-Military Demeanor, Mental Health-Military Efficacy, and Military Demeanor-Military Efficacy latent construct pairs were -.94, -.88, and .85, respectively. Figure 1 depicts the three-factor measurement model.
measurement model.<br />

[Figure 1. Three-factor Measurement Model: standardized loadings of the 17 observed dimensions on the Military Demeanor, Military Efficacy, and Mental Health latent factors - figure omitted]


To examine the relationship between job-related personality variables and mental health, a model was tested in which the two personality constructs, each with multiple indicators, predicted the mental health construct with multiple indicators (see Figure 2), using LISREL 8.30 (Jöreskog & Sörbom, 1996). The results indicated that the two personality constructs explained a significant portion of the variance in the mental health factor, R² = .91, and the fit of the model was satisfactory (χ² = 1774.06, df = 110, p < .001, GFI = .87, AGFI = .82, NFI = .90, NNFI = .89, RMSEA = .10).

[Figure 2. Personality Constructs as Predictive of Mental Health: Military Demeanor and Military Efficacy as latent predictors of the Mental Health construct - figure omitted]

A stepwise regression analysis was performed to see the contribution of the individual personality variables to the prediction of the overall mental health score, which was constructed by averaging the scores on the six mental health dimensions. Results indicated that, except for Military Bearing, Dependability, and Orderliness, all personality variables contributed significantly to the variance explained in the mental health factor. The R² at the final step was found to be .81. Emotional Stability had the highest contribution in explaining the variance in mental health. Table 3 displays the β, t, and standard error values resulting from the stepwise analyses.
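The exact stepwise procedure and software are not reported. The sketch below shows one common variant, forward selection on p-values, with simulated data and invented short variable names; the entry criterion of p < .05 is an assumption, not the study's documented rule.

import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y, names, p_enter=0.05):
    # Add, one at a time, the predictor with the smallest p-value,
    # stopping when no remaining candidate meets the entry criterion
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        pvals = {}
        for j in remaining:
            fit = sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit()
            pvals[j] = fit.pvalues[-1]   # p-value of the candidate term
        best = min(pvals, key=pvals.get)
        if pvals[best] >= p_enter:
            break
        selected.append(best)
        remaining.remove(best)
    return [names[j] for j in selected]

rng = np.random.default_rng(2)
X = rng.normal(size=(1428, 5))
y = -0.5 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(size=1428)
print(forward_stepwise(X, y, ["EmotStab", "SelfConf", "Comm", "Adapt", "Order"]))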


Table 3. Stepwise Regression Results for the Prediction of Mental Health

Predictors                      β       t     Standard Error
Emotional Stability           -.48  -23.13    .02
Self-Confidence               -.23  -12.92    .02
Communication                 -.15   -8.47    .02
Adaptability                  -.11   -6.13    .02
Directing and Monitoring      -.09    6.21    .01
Agreeableness                 -.08   -5.13    .02
Self-Discipline               -.08   -4.62    .02
Determination                 -.04    2.49    .02

DISCUSSION

The main purpose of the study was to explore the nature of the relationships between job-specific personality attributes and psychological well-being for the NCOs in the TAF. A model in which the two job-specific personality constructs, Military Demeanor and Military Efficacy, predicted mental health was found to be satisfactory. More specifically, the two personality constructs explained a large amount of variance in mental health for the NCOs. Furthermore, of the two latent personality constructs, Military Demeanor had the stronger association with mental health. Analyses of the individual effects of the personality dimensions suggested that, except for military bearing, dependability, and orderliness, the NCOPI dimensions contributed significantly to the prediction of the mental health composite. In short, consistent with the existing literature, the results provided support for the power of personality attributes (both in the form of latent traits and as individual dimensions) in predicting mental health (e.g., Ball, Tennen, Poling, Kranzler, & Rounsaville, 1997; DeNeve & Cooper, 1998; Siegler & Brummett, 2000; Trull & Sher, 1994).

Results of the present study yielded some support for the argument that there is a need to distinguish between personality variables and mental health variables, and that common method variance by itself cannot be responsible for the observed effects. Furthermore, the results suggested that the NCOPI, which has been developed as a selection tool for the NCOs in the TAF, could also serve the purpose of screening for mental health. The exceptionally large variance explained in mental health by both latent and individual personality factors suggested that the more fitting the personality profile of a candidate, the more likely he or she is to be mentally fit for the job. As discussed earlier in the paper, the military context calls for individuals with not only physical but also psychological stamina. This is why mental health has been among the individual differences factors considered in the selection/screening of military personnel (e.g., Holden & Scholtz, 2002; Magruder, 2000). Results of the present study imply that when job-relevant personality attributes are used in the selection process, a separate mental health assessment may be dispensable, resulting in significant cost savings.

On the other hand, one could still argue that the strong structural correlations between the latent variables, as well as between individual dimensions, may have resulted from a possible conceptual overlap among these constructs. The issue of construct overlap indeed deals with what is actually measured by different constructs/dimensions. Some of the dimensions included under the NCOPI have either direct or indirect conceptual links to the dimensions of psychological well-being. For instance, although emotional stability and self-confidence are treated and measured as independent dimensions of personality, they are also natural correlates of psychological well-being, specifically depression (Baumeister, 1993). However, despite the possibility of conceptual overlaps among the personality and mental health constructs, the three-factor model (i.e., the two-personality, one-mental-health factor model) fit significantly better than the alternative single- and two-factor models, suggesting that the measured constructs were relatively independent of each other.

Future studies are needed to establish both the criterion-related and the construct validity of the NCOPI. There exists preliminary evidence concerning the criterion-related validity of the NCOPI. That is, in a recent study, the present authors examined the extent to which the NCOPI dimensions predicted a set of performance criteria (i.e., the ranking of the NCOs in terms of cumulative performance ratings over their tenure, and the number of commendations and reprimands that they had received over their military career). Some of the NCOPI dimensions (e.g., directing and monitoring, self-discipline, and self-confidence) were found to contribute significantly to the variance in both cumulative performance rankings and the number of commendations received, providing some evidence concerning the predictive validity of the NCOPI. Yet, it needs to be noted that the job performance criteria used were not direct or complete indexes of current performance. Hence, studies using more direct and comprehensive indices of job performance are needed to establish the criterion-related validity of the NCOPI. Concerning construct validity, correlations of the NCOPI dimensions with measures of the same dimensions from different sources (i.e., inventories) should be examined.

REFERENCES

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., Rev.). Washington, DC: Author.

Ball, S. A., Poling, J. C., Tennen, H., Kranzler, H. R., & Rounsaville, B. J. (1997). Personality, temperament, and character dimensions and the DSM-IV personality disorders in substance abusers. Journal of Abnormal Psychology, 106, 545-553.

Barrick, M. R., & Mount, M. K. (1991). The big five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.

Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection & Assessment, 9, 9-22.

Bartram, D. (1995). The predictive validity of the EPI and 16PF for military training. Journal of Occupational & Organizational Psychology, 68, 219.

Baumeister, R. F. (Ed.). (1993). Self-esteem: The puzzle of low self-regard. New York: Plenum Press.

Borman, W. C., Hanson, M. A., & Hedge, J. W. (1997). Personnel selection. Annual Review of Psychology, 48, 299-337.

Cigrang, J. A., Todd, S. L., & Carbone, E. G. (2000). Stress management training for military trainees returned to duty after a mental health evaluation: Effect on graduation rates. Journal of Occupational Health Psychology, 5, 48-55.

DeNeve, K. M., & Cooper, H. (1998). The happy personality: A meta-analysis of 137 personality traits and subjective well-being. Psychological Bulletin, 124, 197-229.

Goldberg, L. R. (1990). An alternative "description of personality": The big five factor structure. Journal of Personality and Social Psychology, 59, 1216-1229.

Hogan, R., Hogan, J., & Roberts, B. W. (1996). Personality measurement and employment decisions. American Psychologist, 51(5), 469-477.

Holden, R. R., & Scholtz, D. (2002). The Holden Psychological Screening Inventory in the prediction of Canadian Forces basic training outcome. Canadian Journal of Behavioral Science, 34, 104-110.

Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75(5), 581-595.

IPIP. (1999). The 1,412 IPIP items in alphabetical order with their means and standard deviations. http://ipip.ori.org/ipip/1412.htm

Jöreskog, K., & Sörbom, D. (1996). LISREL 8: User's reference guide. Chicago, IL: Scientific Software International.

Krueger, G. P. (2001). Military psychology: United States. International Encyclopedia of the Social & Behavioral Sciences.

Magruder, C. D. (2000). Psychometric properties of the Holden Psychological Screening Inventory in the US military. Military Psychology, 12, 267-271.

McCloy, R. A., Campbell, J. P., & Cudeck, R. (1994). A confirmatory test of a model of performance determinants. Journal of Applied Psychology, 79, 493-505.

McCrae, R. R., & Costa, P. T. (1989). More reasons to adopt the five-factor model. American Psychologist, 451-452.

Motowidlo, S. J., & Van Scotter, J. R. (1994). Evidence that task performance should be distinguished from contextual performance. Journal of Applied Psychology, 79, 475-480.

Mount, M. K., & Barrick, M. R. (1995). The Big Five personality dimensions: Implications for research and practice in human resource management. Research in Personnel and Human Resource Management, 13, 153-200.

Ones, D. S., Viswesvaran, C., & Schmidt, F. L. (1993). Comprehensive meta-analysis of integrity test validities: Findings and implications for personnel selection and theories of job performance. Journal of Applied Psychology, 78, 679-703.

Russell, M., & Marrero, J. M. (2000). Personality styles of effective soldiers. Military Review, 80, 69-74.

Salgado, J. F. (1997). The five factor model of personality and job performance in the European Community. Journal of Applied Psychology, 82, 30-43.

Sandal, G. M., Endresen, I. M., Vaernes, R. V., & Ursin, H. (1999). Personality and coping strategies during submarine missions. Military Psychology, 11, 381.

Schneider, B. (1987). The people make the place. Personnel Psychology, 40, 437-453.

Schneider, B., Goldstein, H. W., & Smith, D. B. (1995). The ASA framework: An update. Personnel Psychology, 48(4), 747-773.

Siegler, I. C., & Brummett, B. H. (2000). Associations among NEO Personality Assessments and well-being at mid-life: Facet-level analysis. Psychology and Aging, 15, 710-714.

Stevens, G., Hemstreet, A., & Gardner, S. (1989). Fit to lead: Prediction of success in a military academy through the use of personality profile. Psychological Reports, 64, 227-235.

Stricker, L. J., & Rock, D. A. (1998). Assessing leadership potential with a biographical measure of personal traits. International Journal of Selection and Assessment, 6, 164.

Sumer, H. C., Sumer, N., & Cifci, O. S. (2000, November 7-9). Establishing construct and criterion-related validity of a personality inventory in the Turkish Armed Forces. Paper presented at the International Military Testing Association Annual Conference, Edinburgh, UK.

Thomas, J. L., Dickson, M. W., & Bliese, P. D. (2001). Values predicting leader performance in the U.S. Army Reserve Officer Training Corps Assessment Center: Evidence for a personality-mediated model. The Leadership Quarterly, 12, 181-196.

Trull, T. J., & Sher, K. J. (1994). Relationship between the five-factor model of personality and Axis-I disorders in a nonclinical sample. Journal of Abnormal Psychology, 103, 350-360.

Van Scotter, J. R., & Motowidlo, S. J. (1996). Interpersonal facilitation and job dedication as separate facets of contextual performance. Journal of Applied Psychology, 81, 525-531.


JOIN: Job and Occupational Interest in the Navy

William L. Farmer, Ph.D., Ronald M. Bearden, M.S., PNC Edward D. Eller,
Paul G. Michael, Ph.D., LCDR Roni S. Johnson, Hubert Chen, B.A., Aditi Nayak, M.A.,
Regina L. Hindelang, M.S., Kimberly Whittam, Ph.D., Stephen E. Watson, Ph.D.,
and David L. Alderton, Ph.D.

Navy Personnel Research, Studies, and Technology Department (PERS-1)
Navy Personnel Command
5720 Integrity Drive, Millington, TN, USA 38055-1000
William.L.Farmer@navy.mil

Navy researchers, along with other contributors, have recently developed new classification decision support software, the Rating Identification Engine (RIDE). The main goal behind the new system is to improve the recruit-rating job assignment process so that it provides greater utility in the operational classification system. While RIDE employs many tactics to improve the assignment procedure, one strategy is to develop a measure of Navy-specific interests that may be used in conjunction with the current classification algorithm during accessioning. Job and Occupational Interest in the Navy (JOIN) is a computer-administered instrument intended to inform applicants about activities and work environments in the Navy, and to measure the applicant's interest in these activities and environments. It is expected that the simultaneous utilization of the JOIN interest measure and the RIDE ability components will improve the match between the Navy recruit's abilities and interests, and ultimately serve as a means of increasing job satisfaction, performance, and retention.

The recruit typically faces some degree of uncertainty when presented with the wide array of opportunities available from among more than 70 entry-level jobs (in the Navy, ratings) and over 200 program-rating combinations. The Navy, in deciding which rating is best suited to a recruit, should strike a careful balance between filling vacancies with the most qualified applicants and satisfying the applicants' career preferences. Much is at stake in the process, and research in civilian and military organizations has produced several pertinent findings. First, a lack of qualifications has been shown to lead to training failures and degraded job performance. Additionally, people who occupy jobs that are inconsistent with their interests are less likely to be satisfied with their work and are more prone to leave the organization for other job opportunities. Finally, dissatisfied employees have higher absenteeism on the job, engage in more counterproductive behaviors, and seek alternative employment more often than their satisfied counterparts (Lightfoot, Alley, Schultz, Heggestad, Watson, Crowson, & Fedak, in press; Lightfoot, McBride, Heggestad, Alley, Harman, & Rounds, in press).

The JOIN vocational interest system will provide a critical component of the RIDE classification process. Current interest inputs to RIDE represent informal discussions with the classifier, which vary quantitatively and qualitatively by applicant. The JOIN system educates individuals about the variety of job-related opportunities in the Navy and creates a unique interest profile for the individual. The Sailor-rating interest fit for all Navy ratings is identified by comparing the applicant's Rating Interest Profile to each of the Rating Interest Profiles generated by JOIN. Once validated, JOIN provides a standardized and quantified measure of applicant vocational interests, which will be provided as an input to RIDE. If successful, RIDE/JOIN can be implemented for initial classification and transitioned to training and fleet commands for re-classification. Recent research efforts have focused on the development of the comprehensive JOIN Rating Interest Profile model for all Navy ratings, based on a series of analyses including iterative Subject Matter Expert (SME) interviews. Paralleling these efforts has been the development of the JOIN experimental software, also developed in concert with SMEs (see the section below for details).
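The matching metric used by RIDE/JOIN is not described in this paper. Purely as an illustration of profile-based fit scoring, the sketch below ranks ratings by cosine similarity between an applicant's interest vector and stored rating profiles; the profile contents, rating names, and the choice of cosine similarity are all assumptions, not the operational algorithm.

import numpy as np

def cosine_similarity(u, v):
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_ratings(applicant_profile, rating_profiles):
    # Return (rating, fit) pairs ordered from best to worst interest fit
    fits = {name: cosine_similarity(applicant_profile, profile)
            for name, profile in rating_profiles.items()}
    return sorted(fits.items(), key=lambda kv: kv[1], reverse=True)

applicant = [3, 1, 4, 2]  # invented interest levels on four items
profiles = {"Aviation": [3, 2, 4, 1], "Construction": [1, 4, 2, 3]}
print(rank_ratings(applicant, profiles))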

JOIN Model Development

Following an early effort by Alley, Crowson, and Fedak (in press), which was very much in the vein of typical contemporary interest inventories, it was determined that JOIN's format should be more pragmatically based. The development of the current JOIN tool is documented in Michael, Hindelang, Watson, Farmer, and Alderton (in press). The item development for Jobs and Occupational Interests in the Navy (JOIN) was an iterative process. The first challenge was to develop work activity and work environment items through an abbreviated job analytic procedure. A basic model of work served as the framework for the examination of Navy jobs and for the development of the inventory items. Conceptually, at the macro level, the Navy consists of various job families, or groupings of jobs according to organizational function, platform, and/or work process (e.g., administration, health care, submarine personnel, etc.). Examining the world of Navy work at the micro level reveals work activities or tasks that describe the work that is performed.

The first step in the item development process involved the collection of all of the available job descriptions from the Navy Enlisted Community Manager's (ECM) web site. A researcher reviewed each of these job descriptions and highlighted words that reflected the following categories: 1) job families, or Navy community areas (e.g., aviation, construction, submarine, etc.); 2) work activity dimensions, consisting of process (verb) and/or content (noun) words; and 3) work context dimensions, or the work environment (e.g., working indoors, working with a team, etc.). From these highlighted extracts, lists of the communities, processes, content words, and work environments that seemed most representative of each Navy rating (N = 79) were created. The process and content words were joined in various combinations to form process-content (PC) pairs. These PC pairs serve as individual interest items, allowing participants to indicate their level of interest in the work activity (e.g., maintain mechanical equipment). Currently, a total of 26 PC pairs are included in JOIN.
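
To illustrate how PC pairs can be formed from such lists, here is a minimal sketch; the sample words come from the examples above, and crossing every process with every content word is an illustrative simplification, since the actual 26 pairs were curated by researchers rather than generated exhaustively.

from itertools import product

# Illustrative process (verb) and content (noun) word lists.
processes = ["maintain", "operate", "repair"]
contents = ["mechanical equipment", "weapons", "radar systems"]

# Cross the lists to form candidate process-content (PC) pairs; in JOIN,
# only the meaningful combinations would be kept as interest items.
pc_pairs = [f"{p} {c}" for p, c in product(processes, contents)]
print(pc_pairs[0])  # maintain mechanical equipment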

After developing the content for the interest inventory, the next phase of the project was to design and create a computer-administered measure of interests. The current version of the interest inventory (JOIN 1.01e) assesses three broad dimensions of work associated with Navy jobs and integrates over three hundred pictures of personnel performing job tasks to enhance the informational value of the tool. The first dimension, Navy community area, contains seven Navy community areas (e.g., aviation, surface, construction, etc.). Participants are asked to rank the individual areas, each represented by four pictures, each picture with its own text description, based on level of interest (from most interesting to least interesting, plus a "not interested" option). The second dimension contains eight items describing work environments or work styles (e.g., work outdoors, work with a team, etc.). Participants are asked to indicate their level of preference for working in certain contextual conditions. Again, pictures with text descriptions represent each item. The final dimension, work activity, includes twenty-six PC pairs. Each PC pair serves as an individual interest item that allows participants to indicate their level of interest in the work activity dimension (e.g., maintain mechanical equipment, direct aircraft, etc.). Three pictures (in the initial release) and descriptive text represent each PC pair item.

JOIN Software Testing

Usability Testing I. The first test phase occurred during August of 2002 at the Recruit Training Center (RTC) Great Lakes and was conducted with a sample of 300 new recruits. Participants were presented with JOIN and its displays, images, question content, and other general presentation features in order to determine general test performance, item reliability, clarity of instructions and intent, and appropriateness for a new-recruit population in terms of overt interpretability, required test time, and software performance. The initial results from the usability testing were very promising on several levels. First, feedback from participants provided researchers with an overall positive evaluation of the quality of the computer-administered interest inventory. Second, descriptive statistical analyses of the JOIN items indicated that there was adequate variance across individual responses; in other words, participants differed in their level of interest in the various items. Finally, the statistical reliability of the work activity items was assessed, and the items proved very consistent in measuring participant interest in the individual enlisted rating job tasks. The results from this initial data collection effort were used to improve the instrument prior to subsequent usability and validity testing (Michael, Hindelang, Watson, Farmer, & Alderton, in press).

Instrument Refinement. Based on the results of the initial usability study, a number of changes were made, with three criteria in mind. First, we wanted to improve the interface from the perspective of the test taker. Second, it was imperative that testing time be shortened; though this modification does contribute to the "user-friendliness" of the tool, the initial impetus was a very real operational constraint, directed by representatives from the Navy Recruiting Command (CNRC), that the instrument take no more than ten to fifteen minutes to complete. Finally, it was necessary, if at all possible, that the technical/psychometric properties of the instrument be maintained, if not enhanced.


Though the initial usability testing was favorable overall, one concern was voiced on a fairly consistent basis: respondents stated that there was an apparent redundancy in the items presented. This redundancy was most often characterized as, "It seems like I keep seeing the same items one right after another."

One explicit feature targeted during the initial development was the use of a set of generalizable process and content statements. For instance, the process "maintain" is utilized in nine different PC pair combinations, and a number of the content areas are used in as many as three PC pairs. Because this redundancy was a deliberate design feature, it was decided that it would not be revised.

Also contributing to the apparent redundancy was the fact that three versions of each PC item were presented, yielding a total of 72 PC items administered to each respondent. This feature had been established as a way of ensuring psychometric reliability. With a keen eye toward maintaining technical standards, the number of items was cut by one-third, yielding a total of 56 PC items in the next iteration of the JOIN tool.

Finally, the original algorithm had specified that all items be presented in random order. Though the likelihood of receiving the alternate versions of a PC pair item one right after the other was low, we decided to place a "blocking constraint" in the algorithm, whereby an individual receives blocks containing one version of each of the 26 PC pairs, presented randomly within the block. With the number of PC pair presentations constrained to two, each participant receives two blocks of 26 items.
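
A minimal sketch of such a blocking constraint follows, assuming two interchangeable versions of each PC pair; the function and variable names are illustrative, not the operational JOIN code.

import random

def blocked_order(pc_pairs, versions_per_pair=2):
    # Build a presentation order in which each block contains exactly one
    # version of every PC pair, shuffled within the block. This prevents
    # alternate versions of the same item from appearing back to back,
    # except possibly across a block boundary.
    order = []
    for version in range(versions_per_pair):
        block = [(pair, version) for pair in pc_pairs]
        random.shuffle(block)
        order.extend(block)
    return order

# 26 hypothetical PC pairs -> two blocks of 26 items, 52 presentations.
schedule = blocked_order([f"pc_{i:02d}" for i in range(26)])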

As users had been pleased with the other major features of the interface, no refinements were made other than those mentioned. A reduction in testing time was assumed to follow from the deletion of PC item pairs. Decisions to delete items were made using a combination of rational and technical/psychometric criteria. As stated earlier, the initial item statistics had been favorable: internal consistencies within the 3-item PC scales were good (mean α = 0.90), and sufficient variation across scale endorsement indicated that individuals were actually making differential preference judgments. Items were deleted if they contributed little (in comparison to other items in the scale) to PC scale internal consistency or possessed response distributions that were markedly different from those of alternate versions of the same item. In lieu of definitive empirical information, items were also deleted if they appeared to present redundant visual information (as judged by trained raters). The resulting 2-item PC scales demonstrated good internal consistency (mean α = 0.88). Additional modifications were made that enhanced item data interpretation and allowed for the future measurement of item response time.
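
For reference, the internal-consistency coefficient (α) reported above can be computed for any respondents-by-items score matrix as in this minimal sketch; the response data shown are invented for illustration.

import numpy as np

def cronbach_alpha(responses):
    # Coefficient alpha for a respondents-by-items score matrix:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    x = np.asarray(responses, dtype=float)
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Invented data: 4 respondents x 3 versions of one PC item (1-5 interest).
print(cronbach_alpha([[5, 4, 5], [2, 2, 1], [4, 4, 4], [1, 2, 2]]))  # ~0.95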

Usability Testing II. The second phase of testing occurred over a three-and-a-half-month period in the spring of 2003 at RTC Great Lakes. A group of approximately 4,500 participants completed the refined JOIN (1.0e) instrument. The group was 82% male, 65% white, 20% black, and 15% other. From a usability perspective, 93.2% of all respondents rated JOIN "good" or "very good." Regarding the PC descriptors, 90.4% of respondents felt that the pictures did a "good" or "very good" job of conveying the information presented in the descriptors, and 80.5% stated that the items did a "good" or "very good" job of conveying Navy-relevant job information to new recruits. In terms of psychometric quality, the average PC scale α was 0.87. Descriptive statistics indicate that participants provided differential responses across and within the work activity scales. The average testing time decreased from 24 minutes with the original version to 13 minutes. The average time spent per item ranged from 8 to 10 seconds (except for special operations items, at 21 seconds). Special programs and aviation were the preferred communities, with working outside and working in a team as the work environment and style of choice. As in the initial pilot test, the most desirable work activity was operating weapons.

Criterion-Related Validity Testing. The data collected in the most recent round of testing are also being used to establish the criterion-related validity of the JOIN instrument. Because those who completed the instrument lack prior experience or knowledge of the Navy or Navy ratings, they are an ideal group for establishing the predictive validity of the tool. Criterion measures (e.g., A-school success) will be collected as participants progress through technical training and those data become available. Participants' social security numbers (SSNs) were collected to link the interest measures to longitudinal data, including the multiple-survey 1st Watch source data. Additional measures will include attrition prior to End of Active Obligated Service (EAOS), measures of satisfaction (on the job and in the Navy), propensity to leave the Navy, and desire to re-enlist. Additionally, JOIN results will be linked with performance criteria.


JOIN Model Enhancement

In addition to the establishment of criterion-related validity, current efforts are focused on establishing and enhancing the construct validity of the SME model upon which the JOIN framework rests. As mentioned previously, the tool was developed using the judgments of enlisted community managers. In addition to decomposing rating descriptions into core elements and matching photographs with these elements, this group also established the initial scoring weights, which were limited in the first iteration to unit weights. At present, NPRST researchers are conducting focus groups with Navy detailers, classifiers, and A-school instructors for the purpose of deriving SME-determined numerical weights that establish an empirical link between JOIN components and all existing Navy ratings available to first-term sailors. These weights will be utilized in the enhancement of the scoring algorithm that provides an individual preference score for each Navy rating. A rank ordering (based on preference scores) of all Navy ratings is provided for each potential recruit.
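
A minimal sketch of such weighted preference scoring follows, assuming each rating carries a dictionary of SME-derived component weights (unit weights reproduce the first-iteration scoring); the rating abbreviations, components, and weight values are invented for illustration.

def preference_scores(responses, rating_weights):
    # Score each rating as a weighted sum of the recruit's JOIN component
    # responses, then rank the ratings from highest to lowest preference.
    scores = {
        rating: sum(w * responses.get(c, 0) for c, w in weights.items())
        for rating, weights in rating_weights.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Invented example: two ratings weighted over two PC components.
ranked = preference_scores(
    {"maintain mechanical equipment": 5, "operate weapons": 2},
    {"AD": {"maintain mechanical equipment": 1.0},
     "GM": {"operate weapons": 1.0, "maintain mechanical equipment": 0.4}},
)  # AD (score 5.0) ranks ahead of GM (score 4.0)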

Future Directions

Plans include linking JOIN results with other measures, including the Enlisted Navy Computer Adaptive Personality Scales (ENCAPS) and other individual-difference measures currently being developed at NPRST. The establishment of a measurable relationship between job preference and such constructs as individual temperament, social intelligence, teamwork ability, and complex cognitive functioning will greatly advance the Navy's efforts to select and classify sailors and ensure the quality of the Fleet into the future.

References

Alley, W.E., Crowson, J.J., & Fedak, G.E. (in press). JOIN item content and syntax templates (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Lightfoot, M.A., Alley, W.E., Schultz, S.R., Heggestad, E.D., Watson, S.E., Crowson, J.J., & Fedak, G.E. (in press). The development of a Navy-job specific vocational interest model (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Lightfoot, M.A., McBride, J.R., Heggestad, E.D., Alley, W.E., Harmon, L.W., & Rounds, J. (in press). Navy interest inventory: Approach development (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Michael, P.G., Hindelang, R.L., Watson, S.E., Farmer, W.L., & Alderton, D.L. (in press). JOIN: Interest inventory development and pilot testing I (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.


OZ: A Human-Centered Computing Cockpit Display

Leonard A. Temme, Ph.D.
Naval Aerospace Medical Research Laboratory
U.S. Naval Air Station - Pensacola

David L. Still, O.D., Ph.D.
John Kolen, Ph.D.
Institute of Human and Machine Cognition
University of West Florida

LCDR Michael Acromite, MC USN
Training Air Wing SIX

dstill@ai.uwf.edu

Abstract

An aviator may control an aircraft by viewing the world through the windscreen (visual flight) or with information from the cockpit instruments (instrument flight), depending upon visibility, meteorological conditions, and the competence of the pilot. It seems intuitively obvious that instrument flight should be far more challenging than visual flight. However, since the same pilot controls the same aircraft through the same air using the same stick-and-rudder input devices, the only difference is the way the same information is presented. Consequently, instrument flight is harder than visual flight only because of the way the instruments display the information. Each instrument displays one flight parameter, and it is up to the pilot to convert the numeric value of the displayed parameter into useful information. We think it is possible to use modern display technologies and computational capabilities to make instrument flight safer, easier, and more efficient, possibly even beyond visual flight. The challenge is the design of the instruments. Using emerging principles of knowledge engineering derived from such areas as human-centered computing, cognitive task analysis, and ecological display design, we are designing a suite of aircraft cockpit instruments to enhance instrument flight performance. Our presentation describes our approach, methodology, instrumentation, and some experimental results. We have used commercially available, off-the-shelf desktop flight simulators for which we have developed precise flight performance measures that may have broad application for training and performance testing.

Introduction

In their 1932 book, Blind Flight, Ocker and Crane (26) delineated several requirements for an ideal cockpit instrument display. It should: (1) be a single instrument, (2) resemble closely the natural environment, (3) show the direction and degree of turn in a normal manner, (4) show angle of bank in a normal manner, (5) show climb and glide by the position of the "nose" of the aircraft, (6) eliminate vertigo, (7) prevent fatigue, (8) require little special training to use, (9) prompt the pilot to use the controls in a normal reflex manner, (10) not be affected by abnormal positions such as spins and acrobatics, and (11) be dependable and incorporate factors of safety so that, should one feature fail, the others would suffice for the time.

Ocker and Crane's list can be extended to: (1) enable pilots to fly as aggressively and as effectively on instruments as in visual flight conditions, (2) enable instrument group tactics of manned or unmanned air combat vehicles by allowing each pilot/operator to visualize the others, (3) adapt to changes in aircraft configuration and environment in order to show real-time aircraft capability (this is particularly important for tilt-rotor, vectored-thrust, or variable-geometry craft with their correspondingly complex flight envelopes), (4) facilitate instrument hover with motion parallax cues showing translation, (5) combat spatial disorientation and loss of situational awareness by providing a visually compelling 360° frame of reference as a true six-degrees-of-freedom, directly perceivable link to the outside world, (6) allow pilots to fly the aircraft without deficit while simultaneously studying a second display (i.e., radar, FLIR, or map), (7) enable pilots to control the aircraft even with reduced vision from laser damage or lost glasses, and (8) be realizable with a software change in any aircraft with a glass cockpit.

Conventional flight instrument displays clearly fail to meet these requirements for several reasons. For example, the pilot must scan the instruments, looking at or near each of a number of instruments in succession to obtain information. Studies of pilot instrument scanning have shown that it is not unusual for even trained pilots to spend as much as 0.5 sec viewing a single instrument, and durations of two seconds or more are to be expected even from expert pilots in routine maneuvers (10, 11, 32). Consequently, the time required to sequentially gather information can be substantial, severely limiting the pilot's ability to cope with rapidly changing or unanticipated situations and emergencies. Furthermore, the pilot must constantly monitor the instruments to ensure that the aircraft is performing as intended because the instruments do not "grab" the pilot's attention when deviations from prescribed parameters occur.

Another shortcoming is that current flight instrument displays use many different frames of reference, with information in a variety of units: degrees, knots, feet, rates of change, etc. The pilot must integrate these different units into a common frame of reference to create an overall state of situational awareness. Moreover, the basic flight instruments are not integrated with other cockpit instrumentation such as engine, weather, and radio instruments. The components of each of these, like the basic flight instruments, have different units and do not share a common frame of reference. The traditional practical solutions to these problems have been to severely limit flight procedures, emphasize instrument scan training, and require extensive practice.

The Development of OZ

OZ is a system based on principles of vision science, Human-Centered Computing (HCC) (12), computer science, and aerodynamics, aimed at meeting the requirements of an ideal cockpit display. OZ, as an example of HCC, is an effective technology that amplifies and extends the human's perceptual, cognitive, and performance capabilities while at the same time reducing mental workload. The controlling software (i.e., the calculations ordinarily imposed on the pilot) runs seamlessly "behind the curtain," but without hiding specific values of important parameters to which the pilot needs access.

Research on vision and cognition suggested ways to eliminate the fundamental speed barrier of traditional displays, the "instrument scan." The visual field can be divided into two channels, the focal (closely related to central or foveal vision) and the ambient (closely related to, but not identical with, peripheral vision) (20, 21, 22, 28, 30, 35). The focal channel is used for tasks such as reading, which require directed attention. The ambient channel is used primarily for tasks such as locomotion that can be accomplished without conscious effort or even awareness. In the normal environment both of these channels are simultaneously active, as when a running quarterback passes the ball to a receiver or when a driver reads a sign while controlling an automobile during a turn. Significantly, the design of conventional instruments requires that the focal channel be directed sequentially to each instrument, producing the "instrument scan" (18, 25), while the part of the visual system that is optimized for processing locomotion information, the ambient channel, is largely irrelevant for the task.

To harness the power of both focal and ambient channels, and therefore to reduce the delays imposed by sequential information gathering, OZ display elements are constructed using visual perceptual primitives: luminance discontinuities that are resilient to one- and/or two-dimensional optical and neurological demodulation (i.e., dots and lines). The resilience of the perceptual primitives to demodulation allows them to pass information through both the ambient and focal channels' optical and neurological filters (33, 34, 39). OZ organizes these perceptual primitives into meaningful objects using such visual perceptual phenomena as figure-ground (5, 16, 19, 29), pop-out (9, 17), chunking, texture (2, 5), effortless discrimination (2, 16, 28), and structure-from-motion (13, 23, 27). These phenomena organize the graphic primitives into the objects that constitute OZ symbology, objects that have perceptual meaning and are quickly understood. Concepts derived from the Human-Centered Computing approach (7, 12) enabled us to further refine OZ by reducing human information processing requirements. Specifically, OZ combines and reduces different data streams into proportionately scaled symbology that the pilot can immediately apprehend and use. For example, information on aircraft configuration, density altitude, and location is integrated into a directly perceivable picture of the aircraft's present capability.

Finally, to reduce the cognitive workload of instrument flight, OZ uses a common frame of reference to bring together all cockpit information to create a single, unified display, producing a picture that can be clearly and quickly understood. The frame of reference provides the structure that transforms OZ's separate perceptual objects into an ensemble of meaningfully interactive components. This is one reason that OZ can communicate spatial orientation, aircraft location, flight performance, aircraft configuration, and engine status all in the time it takes to look at a single conventional instrument.


Description of OZ

---------------------------------------------------------
Fig. 1: The star field metaphor
---------------------------------------------------------

OZ is organized into two components, the star field metaphor and the aircraft metaphor, which encode aircraft location and capability, respectively. The star field metaphor, Figure 1, shows the aircraft's attitude and location in the external world. It is a one-to-one mapping of the external world onto a coordinate system that displays both translations and rotations. The star field metaphor in Figure 1 shows horizontal angular displacements linearly projected along the x-axis and vertical angular displacements tangentially projected along the y-axis. Several star layers are within the star field, each located at a specific altitude. The forward edges of these altitude layers are 0.5 nautical miles in front of the aircraft and are composed of dots placed at every 10° of heading. The surface plane of the layers is defined by star "trails" flowing back around the aircraft from every third dot of the altitude layer's leading edge. The flow of these star trails shares a common expansion point, located at the center of the horizon line. This array of star trails creates apparent altitude planes and heading streams. The horizon line, relative to the star layers, shows the aircraft's altitude. The center of the horizon line corresponds to the aircraft's centerline and location. For example, an aircraft altitude of 3,250 feet would place the horizon line midway between the 3,000- and 3,500-foot layers. When the aircraft's altitude corresponds to a displayed star layer, the horizon line is located at the layer's forward edge and that layer's stars stream along the horizon line. The location of the lubber line (the blue line located above the center of the horizon line and perpendicular to it) provides heading information. In OZ's current embodiment, at any given instant the displayed portion of the coordinate system extends 360° horizontally and 60° vertically. This angular construction enables the locations of external objects such as runways, radio aids, weather, terrain, and air traffic to be mapped in the same coordinate space used to show aircraft attitude and flight path.
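
As a minimal sketch of the projection just described (linear in azimuth, tangential in elevation), the following maps a direction relative to the aircraft onto star field screen coordinates; the scale factors are invented placeholders, not OZ's actual display constants.

import math

def star_field_xy(azimuth_deg, elevation_deg, x_px_per_deg=4.0, y_gain_px=240.0):
    # Azimuth maps linearly to the x-axis; elevation maps through its
    # tangent to the y-axis, per the projection described in the text.
    x = x_px_per_deg * azimuth_deg
    y = y_gain_px * math.tan(math.radians(elevation_deg))
    return x, y

# A runway 30 degrees left of the nose and 9 degrees below the horizon
# (as in the Figure 3a walkthrough) plots left of and below center.
print(star_field_xy(-30.0, -9.0))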

The graphic elements of the star field metaphor shown in Figure 1 are:

• Horizon Line – Aircraft attitude is shown by the orientation of the horizon line. Aircraft altitude is indicated by the location of the horizon line within the star field.
• Lubber Line – Aircraft heading is shown by the location of the lubber line within the star field.
• Pitch Ladder – The vertical angular displacement scale is provided by the pitch ladder. Cross marks are 5° apart.
• Star Layers and Streams – Star layers mark every 500 feet of altitude. Each star layer's leading edge is constructed with stars placed 10° apart. Star streams originate from every third star and mark 30° heading increments.
• Highlighted Star Layer – Specific altitudes may be marked with additional star layers. These layers are constructed with larger stars for clarity.
• Highlighted Star Stream – Specific headings may be marked with additional star streams. These streams are constructed with larger stars for clarity.
• Runway – The ends and middle of a runway are marked with filled circles connected by straight lines. Alignment with the runway centerline and the runway's location are shown.

In addition to these elements, the OZ star field can display the three-dimensional location of waypoints, other aircraft, obstructions, and thunderstorms.


-------------------------------------------------------------
Fig. 2: The aircraft metaphor
-------------------------------------------------------------

The aircraft metaphor depicted in Figure 2 is a stylized triplane composed of lines and circles. The location of the aircraft metaphor within the star field metaphor shows attitude and flight path information. The size and interrelationship of the triplane's parts map the aircraft's configuration, airspeed, engine output, and flight envelope. The span-wise location of the struts connecting the wings is proportionate to airspeed: the further outboard the struts, the greater the airspeed, with the structurally limited airspeed located at the wingtips and the minimum flying speed located at the wing roots. This span location also provides the x-axis speed scale of the embedded power-required curve. The shape of the upper and lower wings is a stylized graph of the aircraft's power requirements, with the perpendicular distance between the upper and lower bent wings indicating the amount of power required for unaccelerated flight at the corresponding airspeeds. The length of the struts indicates power available, and the extent to which the wing struts are colored green indicates power in use. The struts are scaled so that power equals demand when the green of the struts reaches the upper and lower wings. With this design, the wings and struts depict the complex interrelationship between power, drag, lift, airspeed, configuration, and performance.
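
A minimal sketch of this span-wise airspeed mapping follows, assuming a simple linear interpolation between the wing root (minimum flying speed) and the wingtip (maximum safe structural speed); the speed values are invented placeholders, not OZ's actual scaling.

def strut_span_fraction(airspeed_kts, v_min=50.0, v_max=160.0):
    # 0.0 places the strut at the wing root (minimum flying speed);
    # 1.0 places it at the wingtip (maximum safe structural speed).
    frac = (airspeed_kts - v_min) / (v_max - v_min)
    return min(max(frac, 0.0), 1.0)  # clamp to the drawable wing span

print(strut_span_fraction(100.0))  # about 0.45: cruise sits near mid-span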

Figure 2 shows the components of the aircraft metaphor when the aircraft is at cruising speed, with power set to less than required for level flight. Digital readouts for some parameters have been toggled off to simplify the illustrations. The graphic elements shown are:

• Stick – Turn rate and direction are shown by the stick's lateral and angular position relative to the star field.
• Pitch Line – Aircraft attitude is shown by the location of the pitch lines in the star field.
• Speed Rings – Vertical speed magnitude and direction are encoded by the radius and location of the speed rings in the star field. The outboard speed rings mark a selected airspeed location along the wingspan.
• Pendulum – Inclinometer and Gz forces are shown by the orientation and size of the pendulum.
• Speed Struts – Airspeed is shown by the location of the speed struts along the wing span. Available propulsion is reflected in the length of a strut relative to the separation of the upper and lower bent wings. Power is correct for the current airspeed, aircraft configuration, and density altitude when the inner green segment of the strut just touches the upper and lower bent wings.
• Straight Wing – The airspeed scale is provided by the straight wing. The aircraft's wings are level when the straight wing is parallel with the horizon line and star layers.
• Bent Wings – The aircraft drag curve, adjusted for aircraft configuration (weight, landing gear position, flap setting, etc.) and density altitude effects, is displayed in a stylized fashion by the bent wings. The angle between them and the straight wing corresponds to the bank angle required for a standard rate turn in coordinated flight.
• Wing Pinch – Minimum drag speed is marked by the location of the wing pinch along the wing span.
• Wing Tips – Maximum safe structural speed is marked by the location of the wing tips.
• Wing Roots – Minimum flying speed is marked by the location of the wing roots.

For comparison, Figure 3 illustrates the computer screens of the OZ (Figure 3a) and conventional (Figure 3b) displays. These screens were captured within seconds of each other and therefore show correspondingly similar information, but of course in very different ways.

---------------------------------------------------------
Fig. 3: Screen shot of OZ with the star field and aircraft metaphors combined. The screen shots of the conventional and OZ displays were separated by a few moments.
---------------------------------------------------------

Examination of Figure 3a reveals the following: The near end of the runway is 30° to the left of the aircraft centerline and 9° below the horizon. The aircraft is turning left at a low rate toward the runway. The current heading is 081°. The highlighted heading is set on the runway heading of 092°. Notice that the extended runway centerline is well to the left of the aircraft. The aircraft descent angle is 2.5°, and the aircraft is at 2,900 feet. Flaps are retracted, and airspeed is 6 knots below that marked by the speed rings. Airspeed is above minimum drag speed and below maximum structural speed. The power is set to less than that required for level flight under these conditions. No obstructions, weather, or other aircraft are visible. The conventional instrument panel was captured slightly after the OZ screen capture: altitude has decreased to 2,760 feet, heading has changed to 070°, and the rate of turn has increased.

Depicting aerodynamic relationships by the size and interaction of structure, as illustrated by the wings and struts, is a general concept carried throughout OZ. As a consequence of this design approach, OZ produces an explicit graphic depiction of aircraft performance that the pilot would otherwise have to construct and maintain as a mental model. This has several benefits, as demonstrated in the experiments. First, it reduces the pilot's need to recall the currently correct model. Second, it reduces the amount of mental calculation required to apply the model to current conditions. Third, it can ensure that all pilots are using the same model. The overall result is that OZ shifts the workload of flight from visual scanning of separate instruments and displays, requiring intensive integration and computation, to nearly instantaneous or "direct" perception of an integrated picture. This allows a glance at OZ to convey most of the information contained in an entire panel of conventional instruments that may take several seconds to scan. Although OZ presents the pilot with processed data, the processing does not obscure information, nor does it make covert, de facto decisions for the operator.

Experiments

Two experiments are reported. Experiment 1 evaluated the overall OZ approach, the general scaling of flight parameters to OZ symbology, and the software implementation. Experiment 2 involved the task of simultaneously flying on instruments and reading. We considered this a particularly compelling demonstration, since it is a task that is clearly important operationally but patently impossible with conventional flight instruments; with OZ, it is easy.

Experiment 1: The Demonstration of the OZ Concept

This was the first formal evaluation of OZ; the goal was to ensure that OZ was scaled such that flight performance with OZ under simple conditions was comparable with that obtained with conventional instruments.

Method

Participants: Two first-year medical students with no previous pilot experience volunteered for these studies, conducted over a two-month period. Neither participant required a refractive correction, and both were less than twenty-five years of age.

Simulator and Equipment: Elite (Prop. Version 4) simulator software running on a Macintosh computer provided the Cessna 172 aerodynamic model and the conventional instrument display (CD) used in this and the following study. The manufacturer modified the commercial software to export flight data. The OZ display was created with custom C++ code running on a Pentium PC receiving the exported simulator data. Both OZ and the CD were presented on 19-inch monitors placed adjacent to each other. An opaque screen blocked one or the other monitor from view, depending on which display the participant was flying. The flight controls were the same for both displays. Aileron, elevator, and engine controls were input with a Precision Flight Control Console, Jack Birch Edition. Rudder control was disabled.

Task: Participants were to fly straight and level on a course heading due south at 3,000 feet at a constant indicated airspeed of 100 knots for about three minutes per trial. Participants could rest between trials. Completing a condition required one to two hours per participant.

Independent Variables: In a two-factor experiment, two levels of flight display (OZ and the CD) were compared across the four levels of default turbulence that the Elite simulator provided (none, low, moderate, and severe).

Training: The participants had no formal experience with either the CD or OZ before the experiment. The task of flying straight and level was described and illustrated to the participants with both displays. Participants were given instructions about the instruments, and all questions about the displays were answered until the participants said they were satisfied that they understood the task. Data collection started with their first flight.

Procedure: Data collection consisted of the participant flying one display for 3,000 simulator cycles (about three minutes), then flying the other display for 3,000 cycles, alternating between the displays for one to two hours. Data collection for both displays continued for a given turbulence condition until the participant felt that flight performance had stabilized on both displays. Whether OZ or the CD began the run was determined randomly. The condition without turbulence was completed first, followed by the low turbulence, then the moderate turbulence, and finally the extreme turbulence. One turbulence level was collected per participant per day. Several days could intervene between data collection sessions. The two participants worked as a pair, one serving as the experimenter for the other.

Data Reduction: Flight performance was scored as root mean square (RMS) error. RMS error was chosen because it combines two different aspects of performance into a single metric: the variability of the observed performance relative to a target performance, which itself can be displaced from the average performance (1, 14, 31). RMS errors for both heading (in degrees) and altitude (in feet) were calculated over blocks of 300 successive simulation cycles, each block providing one heading and one altitude RMS error. A trial consisted of 10 heading and 10 altitude RMS error scores, summarizing the 3,000 successive simulation cycles. For a trial, the 10 RMS scores for heading and the 10 RMS scores for altitude were further reduced to a mean altitude RMS error and a mean heading RMS error.
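
A minimal sketch of this RMS-error scoring follows, assuming one sampled value per simulator cycle; the array names are illustrative, while the 300-cycle window and 3,000-cycle trial length follow the text.

import numpy as np

def rms_error(samples, target):
    # Root mean square error of observed samples about a target value;
    # it combines variability and displacement from the target in one score.
    x = np.asarray(samples, dtype=float)
    return float(np.sqrt(np.mean((x - target) ** 2)))

def trial_scores(samples, target, window=300):
    # Score a 3,000-cycle trial as ten successive 300-cycle RMS errors,
    # then reduce them to a single mean RMS error for the trial.
    blocks = [samples[i:i + window] for i in range(0, len(samples), window)]
    scores = [rms_error(b, target) for b in blocks]
    return scores, float(np.mean(scores))

# Example: simulated altitude samples scored against the 3,000-foot target.
rms_10, mean_rms = trial_scores(np.random.normal(3000, 15, 3000), 3000.0)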

Results

To illustrate how the data were reduced for analysis, Figure 4 plots, as a function of time, flight performance data from individual trials obtained for a single condition and participant. Panel (a) shows RMS altitude error in feet, and panel (b) shows RMS heading error in degrees, under the condition of light turbulence. Each dot displays the RMS error for 300 simulator cycles obtained with OZ; the line connects comparable RMS errors obtained with the CD. The horizontal axis shows that data collection for this run lasted 66 minutes (33,000 simulator cycles). Before this run, the participant had about 45 minutes of flight experience, obtained during the no-turbulence condition. Data from the other participant were essentially identical to those shown in the figure.

----------------------------------------------
Fig. 4a: Altitude performance of the volunteer (as RMS error in feet) for light turbulence over the complete run, alternating between OZ and the CD.

Fig. 4b: Heading performance of the volunteer (as RMS error in degrees) for light turbulence over the complete run, alternating between OZ and the CD.
----------------------------------------------

Figure 4 shows that the flight performance obtained with the two displays differed. Among the most obvious differences is that flight performance with OZ has smaller RMS values than that obtained with the CD; this was true for altitude as well as heading control. With the CD, this participant frequently lost control of heading (over 135° error) and/or altitude (over 200 feet error), but never lost control of either with OZ. Furthermore, there was a difference in how performance changed as the run progressed. Performance with both displays improved at about the 40-minute mark. OZ's already superior flight performance improved slightly. The CD's relatively poor starting flight performance improved by over a factor of 8 for altitude and a factor of 4 for heading by the end of the run; nonetheless, performance was still far worse with the CD than with OZ.

To evaluate the effects of display, turbulence, and experience, the data from each combination of conditions were divided chronologically into a first and last half for each run, and each half was averaged to produce a mean RMS error. Thus, the data of Figure 4 produced four mean RMS altitude values and four mean RMS heading values, two for OZ and two for the CD. Panel (a) of Figure 5 shows mean RMS altitude error; panel (b) shows mean RMS heading error. The horizontal axis of each panel is divided into the first half and the second half of the run. The turbulence condition is indicated for each of these halves. Error bars are standard error of the mean.

---------------------------------------------
Fig. 5a: Altitude performance of the volunteer (as RMS error in feet) for each of the four turbulence levels (none, light, moderate, and severe) over the first and second halves of the run, for the CD and for the OZ display.

Fig. 5b: Heading performance of the volunteer (as RMS error in degrees) for each of the four turbulence levels (none, light, moderate, and severe) over the first and second halves of the run, for the CD and for the OZ display.
---------------------------------------------

To quantitatively evaluate the data shown in Figure 5, ANOVAs were performed (15). The ANOVAs for altitude (F = 5.642, MSe = 9803.741, df = 3,746; p < 0.0008) and heading (F = 17.579, MSe = 5836.413, df = 3,746; p < 0.0001) showed a statistically significant three-way interaction of display by turbulence by flight experience for both altitude and heading mean RMS error. Post hoc analyses (Newman-Keuls test procedures) revealed the general pattern of differences among the test conditions. The important comparison was between OZ and the CD. As can be seen in Figure 5, with OZ neither experience nor turbulence had any impact on mean RMS error for altitude or heading. With the CD, on the other hand, there was a relatively complicated interaction between turbulence level and the experience of the participant. The analysis of data obtained from the other participant yielded a similar pattern.
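
For readers who wish to reproduce this style of analysis, here is a minimal sketch of a factorial ANOVA on the mean RMS scores using the statsmodels library; the data file and its layout (one row per RMS score, labeled by display, turbulence, and run half) are assumptions, and the original analysis and Newman-Keuls post hoc tests may well have been run in different software.

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Assumed layout: columns rms, display, turbulence, half (one row per score).
df = pd.read_csv("rms_scores.csv")  # hypothetical file name

# Three-way factorial ANOVA: display x turbulence x run half.
model = smf.ols("rms ~ C(display) * C(turbulence) * C(half)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # F and p for each effect/interaction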

Discussion

The results showed that participants with minimal training and no previous flight experience executed a simple flying task using OZ with greater consistency and far greater precision than when using the CD. Turbulence had no impact on OZ performance, although turbulence radically degraded performance with the CD. OZ was also easier to learn than the CD: performance with the CD was still improving at the end of the runs, whereas with OZ performance did not change but remained consistently superior.

Experiment 2: Dual Task Performance

Experiment 1 suggested that OZ supports precise aircraft control and situational awareness even under severe turbulence. Therefore, we put OZ to a most extreme test, one that is operationally important but patently impossible with the CD. We thought that reading text aloud while flying would be a demanding task that would dramatically demonstrate that OZ enables the pilot to attend to other mission-relevant tasks (e.g., map reading, communications, targeting, checklists) while maintaining control of the aircraft in instrument flight.

Method

Participants: The same two medical students who volunteered for Experiment 1 volunteered for Experiment 2.

Equipment: The same instrumentation was used as in Experiment 1. However, the OZ software was modified so that text could be presented at the rate of one word per second in a centered, blacked-out circular patch approximately one inch in diameter. No other changes were made to the instrumentation.

Task: The task was to fly due south at 100 knots at 3,000 feet under the same four levels of turbulence used in Experiment 1. There were three display conditions. The first two were essentially replications of Experiment 1, flying with OZ and with the CD. The third condition was to fly with OZ while reading aloud the text presented in the middle of the screen.

Independent Variables: In a two-factor experiment, three levels of flight display (CD, OZ, and OZ with reading) were compared across the four levels of default turbulence that the Elite simulator provided (none, low, moderate, and severe).

Training: The two medical student volunteers had learned during Experiment 1 to use OZ, the CD, and the data collection methodology and procedures. The participants practiced the current task before initiating the data collection session.

Data Reduction: The data reduction methods were identical to those used in Experiment 1.

Results

Figure 6 shows one participant's average flight performance under four levels of turbulence and three display conditions, as mean RMS altitude error (feet) in panel (a) and mean RMS heading error (degrees) in panel (b). Error bars are standard error of the mean. On the left side of each panel is flight performance with conventional instruments without the secondary reading task; in the middle, flight performance with OZ without the secondary reading task; and on the right, flight performance with OZ while performing the secondary reading task.

---------------------------------------------
Fig. 6a: Altitude performance of the volunteer (as RMS error in feet) for each of the four turbulence levels (none, light, moderate, and severe) for the CD (left group of histograms), OZ (middle group of histograms), and OZ while reading (right group of histograms).

Fig. 6b: Heading performance of the volunteer (as RMS error in degrees) for each of the four turbulence levels (none, light, moderate, and severe) for the CD (left group of histograms), OZ (middle group of histograms), and OZ while reading (right group of histograms).
---------------------------------------------

With the CD, performance degraded as turbulence increased, as shown by the RMS error values on the y-axis: the larger the RMS error, the poorer the performance. Note that for the task of flying with the CD, the participants did not perform the secondary reading task; their only responsibility was to control the aircraft. The secondary reading task was simply impossible for the participants with the CD. Performance with OZ without the secondary task replicated Experiment 1, although the participants were by then much more experienced.

ANOVAs for altitude (F = 4.542, MSe = 2480.073, df = 11, 308; p < 0.0001) and heading (F = 38.8159, MSe = 576.668, df = 11, 308; p < 0.0001) and post hoc tests (LSD) showed that while turbulence significantly degraded performance with the CD, it did not degrade performance with OZ, with or without the additional reading task. This difference between OZ and the CD under turbulence is clearly shown in Figure 6. Particularly noteworthy is the absence of a statistically significant difference between OZ flight performance with and without the secondary reading task. This finding is also visually evident in Figure 6.

Discussion

The most important conclusion of this experiment is that the secondary reading task (with OZ) did not impact flight performance in the least. OZ enabled control of the simulated aircraft while reading, regardless of turbulence.

One of the design goals of OZ was to scale the display so that it would support precision flight equal to the best obtainable with the CD. Optimal precision flight with the CD can be expected when a trained operator's sole task is flying straight and level in smooth air, without the burden of secondary tasks (radio, map reading, etc.). Data for these ideal conditions were collected in Experiments 1 and 2 and are shown in the no-turbulence conditions of the second half of Experiment 1 (Figure 5) and of Experiment 2 (Figure 6). It is evident there that OZ's design goal of equaling the best flight performance obtained with the CD was achieved.

General Discussion

The two experiments reported here evaluated a revolutionary new approach to the display of flight information. The goal was to find out whether performance with OZ was at least comparable with that supported by conventional instrumentation. However, the results suggest that far stronger claims can be made. OZ enabled the students to develop accuracy more quickly than the CD did. We reported only the RMS scores rather than the component variability and displacement (bias) scores, a practice not uncommon in the literature (38). However, our conclusions are completely consistent with conclusions based on the variability and displacement scores: OZ enabled significantly more precise flight than the conventional instruments.

It might be argued that precision of flight is not a meaningful performance parameter. After all, the difference between an altitude RMS error of 5 feet and one of 15 feet at 3,000 feet may not be important in a practical sense. However, the specific task we used, precise flight measured by RMS error, should actually favor conventional instruments composed of dials, gauges, arrows, and pointers. In other words, the experiments stacked the cards against OZ in favor of the conventional display.

To our knowledge, OZ is the only cockpit environment that, while avoiding the conventional reliance on alphanumerics and the tools of dials, gauges, arrows, and pointers, supports total flight performance that is as precise as, or even more precise than, that obtained with conventional instruments. (In addition to altitude and heading, OZ also indicates attitude, configuration, power, radio navigation, and location management.)

An important feature of OZ is its one-to-one mapping of 360° of horizontal and 60° of vertical airspace onto the two-dimensional CRT. The OZ star-field map accomplishes this with a high degree of precision and without confusion, while preserving perspective and distance. Because of the power of OZ's unique representation of airspace, we expect OZ to demonstrate superiority in supporting situational awareness and spatial orientation. These are aspects of flight performance for which 'ecological' displays that use graphic and iconic elements have demonstrated superiority over the conventional pointers and arrows (6, 37).
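To make the mapping idea concrete, a toy projection is sketched below. It assumes a simple equirectangular layout (azimuth across the screen width, a ±30° elevation band down its height); the actual OZ star-field projection is more sophisticated, preserving perspective and distance, and its formula is not given in this paper.

```python
def airspace_to_screen(azimuth_deg, elevation_deg, width_px=1024, height_px=768):
    """Map a viewing direction to screen pixels (toy equirectangular version).

    360 degrees of azimuth span the screen width; the 60-degree elevation
    band (-30 to +30 degrees) spans its height. Screen y grows downward.
    """
    if not -30.0 <= elevation_deg <= 30.0:
        raise ValueError("elevation outside the displayed 60-degree band")
    x = (azimuth_deg % 360.0) / 360.0 * width_px
    y = (30.0 - elevation_deg) / 60.0 * height_px
    return x, y

print(airspace_to_screen(90.0, 10.0))  # a point 90 deg right of the nose, 10 deg up
```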

OZ was specifically designed to reduce, if not eliminate, the need to scan separate flight instruments. There are reports in the literature of previous efforts to accomplish this by using so-called peripheral vision display strategies (3, 4, 8, 24, 36). All such efforts met with limited success, at best. The approach OZ uses is different; it is not a peripheral vision display. OZ presents the information in a graphical fashion. As such, the information is processed by the human visual system at the speeds it uses to process images, speeds faster than those required to foveate and read dials and gauges and integrate numerical data.

OZ is an invention with functional characteristics that we are discovering as we continue to study it. For example, we have recently found that trained pilots can actually fly two simulators simultaneously under severe turbulence conditions, executing different maneuvers in each, with each simulator driven by its own OZ display. This demonstration strongly suggests that OZ can be considered a single instrument that integrates all the information needed to fly the aircraft. This observation and the results of the experiments reported here have strong implications for the development of remotely piloted vehicles, while also having theoretical implications for the pilot's mental model of an aircraft's situation. We anticipate further discoveries about OZ and its human users, and we are exploring additional applications of OZ. For example, OZ may be ideal for use in aircraft whose capabilities change in a dynamic fashion, such as tilt-rotor aircraft. We are also exploring the possibility that OZ design principles may find a role in the design of information displays for use in contexts other than aviation.

References

1. Bain, L. J., & Engelhardt, M. (1992). Introduction to probability and mathematical statistics (2nd ed.). Belmont, CA: Duxbury Press.
2. Bergen, J. R. (1991). Theories of visual texture perception. In D. Regan (Ed.), Spatial vision (Vision and Visual Dysfunction, Vol. 10, J. R. Cronly-Dillon, General Ed., pp. 71-92). Boca Raton, FL: CRC Press.
3. Beringer, D. B., & Chrisman, S. E. (1991). Peripheral polar-graphic displays for signal/failure detection. International Journal of Aviation Psychology, 1, 133-148.
4. Brown, I. D., Holmqvist, S. D., & Woodhouse, M. C. (1961). A laboratory comparison of tracking with four flight-director displays. Ergonomics, 4, 229-251.
5. Caputo, G. (1996). The role of the background: Texture segregation and figure-ground segmentation. Vision Research, 36(18), 2815-2826.
6. Flach, J. M., & Warren, R. (1995). Low altitude flight. In J. M. Flach, P. A. Hancock, J. K. Caird, & K. J. Vicente (Eds.), An ecological approach to human machine systems. Hillsdale, NJ: Erlbaum.
7. Flanagan, J. L., Huang, T. S., Jones, P., & Kasif, S. (1997). Final report of the National Science Foundation workshop on human-centered systems: Information, interactivity, and intelligence (HCS). Hosted by the Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, July 1997. Arlington, VA.
8. Fenwick, C. A. (1963). Development of a peripheral vision command indicator for instrument flight. Human Factors, 5, 117-128.
9. Goodenough, B., & Gillam, B. (1997). Gradients as visual primitives. Journal of Experimental Psychology: Human Perception and Performance, 23(2), 370-387.
10. Harris, L. R., & Christhilf, D. M. (1980). What do pilots see in displays? Paper presented at the Human Factors Society Meeting, Los Angeles, CA.
11. Harris, L. R., Glover, B. J., & Spady, A. A. (1986, July). Analytic techniques of pilot scanning behavior and their application (NASA Technical Paper 2525). Moffett Field, CA: NASA-Ames Research Center.
12. Hoffman, R. R., Ford, K. M., & Coffey, J. W. (2000). The handbook of human-centered computing (Report). Pensacola, FL: Institute for Human and Machine Cognition, University of West Florida.
13. Hogervorst, M. A., Kappers, A. M., & Koenderink, J. J. (1996). Structure from motion: A tolerance analysis. Perception & Psychophysics, 58(3), 449-459.
14. Hubbard, D. (1987). Inadequacy of root mean square error as a performance measure. In Proceedings of the International Symposium on Aviation Psychology (pp. 698-704). Columbus, OH: Ohio State University.
15. Johnson, R. A., & Wichern, D. W. (1998). Applied multivariate statistical analysis. Upper Saddle River, NJ: Prentice Hall.
16. Julesz, B. (1981). Figure and ground perception in briefly presented isodipole textures. In M. Kubovy & J. Pomerantz (Eds.), Perceptual organization. Hillsdale, NJ: Lawrence Erlbaum Associates.
17. Kastner, S., Nothdurft, H. C., & Pigarev, I. N. (1997). Neuronal correlates of pop-out in cat striate cortex. Vision Research, 37(4), 371-376.
18. Kershner, W. K. (1991). The instrument flight manual (4th ed.). Ames: Iowa State University Press.
19. Lamme, V. A. (1995). The neurophysiology of figure-ground segregation in primary visual cortex. Journal of Neuroscience, 15(2), 1605-1615.
20. Leibowitz, H., & Post, R. B. (1982). The two modes of processing concept and some implications. In J. Beck (Ed.), Organization and representation in perception. Hillsdale, NJ: Erlbaum.
21. Leibowitz, H., & Shupert, C. L. (1984). Low luminance and spatial orientation. In Proceedings of the Tri-Service Aeromedical Research Panel Fall Technical Meeting (NAMRL Monograph-33, pp. 97-104). Pensacola, FL: Naval Aerospace Medical Research Laboratory.
22. Leibowitz, H., Shupert, C. L., & Post, R. B. (1984). The two modes of visual processing: Implications for spatial orientation. In Peripheral Vision Horizon Display (PVHD), NASA Conference Publication 2306 (pp. 41-44). Edwards Air Force Base, CA: Dryden Flight Research Facility, NASA Ames Research Center.
23. Lind, M. (1996). Perceiving motion and rigid structure from optic flow: A combined weak-perspective and polar-perspective approach. Perception & Psychophysics, 58(7), 1085-1102.
24. Malcolm, R. (1984). Pilot disorientation and the use of a peripheral vision display. Aviation, Space, and Environmental Medicine, 55, 231-238.
25. Naval Air Training Command. (1993). Flight training instruction TH-57, helicopter advanced phase (CNATRA P-457 New (08-93) PAT). Corpus Christi, TX: NAS Corpus Christi.
26. Ocker, W. C., & Crane, C. J. (1932). Blind flight in theory and practice. San Antonio, TX: The Naylor Co.
27. Pollick, F. E. (1997). The perception of motion and structure in structure-from-motion: Comparisons of affine and Euclidean formulations. Vision Research, 37(4), 447-466.
28. Post, R. B., & Leibowitz, H. W. (1986). Two modes of processing visual information: Implications for assessing visual impairment. American Journal of Optometry and Physiological Optics, 63(2), 94-96.
29. Siddiqi, K., Tresness, K. J., & Kimia, B. B. (1996). Parts of visual form: Psychophysical aspects. Perception, 25(4), 399-424.
30. Simoneau, G. G., Leibowitz, H. W., Ulbrecht, J. S., Tyrrell, R. A., & Cavanagh, P. R. (1992). The effects of visual factors and head orientation on postural steadiness in women 55 to 70 years of age. Journal of Gerontology, 47(5), M151-M158.
31. Temme, L. A., Chapman, F. A., Still, D. L., & Person, P. C. (1999). The performance of the standard vertical S-1 flight (VS-1) maneuver by student navy helicopter pilots in a training simulator [Abstract]. Aviation, Space and Environmental Medicine, 70, 428.
32. Temme, L. A., Woodall, J., & Still, D. L. Calculating a helicopter pilot's instrument scan patterns from discrete 60 Hz measures of the line-of-sight: The evaluation of an algorithm. Manuscript in review.
33. Thibos, L. N., Still, D. L., & Bradley, A. (1996). Characterization of spatial aliasing and contrast sensitivity in peripheral vision. Vision Research, 36, 249-258.
34. Thibos, L. N., & Bradley, A. (1995). Modeling off-axis vision II: The effects of spatial filtering and sampling by retinal neurons. In Vision models for target detection and recognition. Singapore: World Scientific Press.
35. Turano, K., Herdman, S. J., & Dagnelie, G. (1993). Visual stabilization of posture in retinitis pigmentosa and in artificially restricted visual fields. Investigative Ophthalmology & Visual Science, 34(10), 3004-3010.
36. Vallerie, L. L. (1966). Displays for seeing without looking. Human Factors, 8, 507-513.
37. Warren, R. (1990). Preliminary questions for the study of egomotion. In R. Warren & A. H. Wertheim (Eds.), Perception and control of self motion (pp. 3-32). Hillsdale, NJ: Lawrence Erlbaum Associates.
38. Weinstein, L. F., & Wickens, C. D. (1992). Use of nontraditional flight displays for the reduction of central visual overload in the cockpit. International Journal of Aviation Psychology.
39. Williams, D. R., & Coletta, N. J. (1987). Cone spacing and the visual resolution limit. Journal of the Optical Society of America A, 4, 1514-1523.


PERCEPTUAL DRIFT RELATED TO SPATIAL DISORIENTATION GENERATED BY MILITARY SYSTEMS: POTENTIAL BENEFITS OF SELECTION AND TRAINING

Corinne Cian (a), Jérôme Carriot (b) & Christian Raphel (a)

(a) Département des facteurs humains, Centre de Recherches du Service de Santé des Armées, Grenoble, France.
(b) UPR-ES 597 Sport et Performance Motrice, Université Joseph Fourier, UFRAPS, Grenoble, France.

The development of new technologies in weapon systems generates sensory flows that can induce sensory conflicts and dysfunction. The consequences can be pathological disorders, but besides these extreme cases the most common consequence is spatial disorientation. Spatial disorientation is characterized by the failure of the operator to correctly sense the position or motion of an object, or of himself, within the fixed coordinate system provided by the surface of the earth and the gravitational vertical. For example, spatial disorientation may induce a misperception of the location of a visual target, resulting in a perceptual drift. This phenomenon may be related to the functional properties of the central nervous system. Locating an object requires information about the observer's body orientation; the perceived location therefore depends on visual, vestibular, and somaesthetic information. Thus, when the relationship between an observer and the gravitational frame of reference is altered, as when the subject is tilted with respect to gravity or when the magnitude or direction of the gravito-inertial force changes, as often occurs in an accelerating vehicle, the apparent locations of seen objects are usually altered (Cohen, 2002).

Most of these problems have been studied in the field of aviation and concern large gravito-inertial forces. However, perceptual drifts have also been observed for the lower gravito-inertial forces generated by antiaircraft guns, in which the operator rotates and/or is tilted together with the system. These very low body rotations unconsciously affect the spatial perception of a target. This perceptual drift may be related to the oculogravic illusion already observed in operational aviation environments.

To study this illusion in the laboratory, we generally ask the subject to determine whether a given target is above or below the level of his eyes. For an upright subject, a target is considered to be at eye level when an imaginary line connecting the target to the eyes is perpendicular to the direction of gravity. The angular deviation between the visual target set to appear at eye level and this horizontal plane defines the visually perceived eye level (Dizio et al., 1997; Matin et al., 1992; Li et al., 1993). The perceived eye level is strongly influenced by variations in the gravitational-inertial forces acting on the subject. This is the case when an upright subject faces toward the center of a centrifuge that rotates at a steady velocity for some time: a target that remains at true eye level appears to be above its true location, so the oculogravic illusion induces a lowering of the visually perceived eye level (Cohen, 1973; Cohen et al., 2001; Graybiel, 1952; Whiteside et al., 1965). For large gravito-inertial force changes, this illusion is explained by an illusory perception of body tilt in pitch arising from mechanical action on the otolithic organs of the vestibular system as well as on the muscle and cutaneous proprioceptors (Cohen, 1973; Wade et al., 1971). For very limited variations of G, the bodily senses affected are restricted, and the lowering of the perceived eye level is probably due to stimulation of the otolithic system alone (Raphel et al., 1994, 1996).

In the range of very low gravitational-inertial stimulation, there were large individual differences: the gravitational-inertial disturbances did not induce the same negative consequences on the eye level for all subjects. Some were not subject to the illusion at all; others showed a smaller sensitivity to the oculogravic illusion, that is, a smaller perceptual drift. These individual differences may be explained by the sensitivity with which subjects perceive a variation of sensory state. This sensitivity would depend on the comparison of signals related to body orientation with internal models that specify expected sensory configurations (Berthoz et al., 1999). These internal representations are elaborated from the subject's experience (Lackner, 1992; Young et al., 1996), so their fit to environmental reality would differ from one individual to another.

EXPERIMENTAL DESIGN

To investigate the extent to which spatial experience modifies the oculogravic illusion, we studied the effect of limited variations of G in two populations differing in gymnastic skill: 20 subjects who practiced trampoline at a national or international level (expert group) and 20 subjects with no special sport expertise (control group). Acrobats were chosen because they are trained to cope with high postural constraints, and their activity requires them to finely associate unusual sensory configurations with a precise body orientation. In doing so, they would have unconsciously improved the functional characteristics of their sensory systems and learned perceptual strategies.

The subjects were seated in a centrifuge facing toward the axis of rotation, or lay on a horizontal plane that rotated around the chest-to-spine axis of the supine subject (Figure 1). The supine position corresponds to the maximum pitch tilt of the operator on the antiaircraft gun; in this condition, the subjective eye level corresponds to the plane parallel to the direction of gravity (the zenith). Subjects were asked to set a luminous target at the place perceived as eye level (or zenith) while in total darkness and undergoing very low centrifugation (less than 1.01 G). For each gravitational-inertial condition, the visually perceived eye level was averaged over the trials. The mean value measured while the subject was motionless and in total darkness served as a reference value. The experimental data, expressed in degrees of visual angle, consisted of the algebraic difference between the perceived eye level measured under a given gravitational-inertial condition and the reference value. When the difference was negative, the perceived eye level was below the reference value; when it was positive, it was above the reference value.
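The scoring just described reduces to a per-condition mean minus the motionless baseline. A minimal sketch of that computation follows; the trial values are hypothetical, chosen only to show the sign convention.

```python
import numpy as np

def drift_scores(settings_by_condition, reference_settings):
    """Score perceived-eye-level (or zenith) drift against the motionless reference.

    settings_by_condition maps each centrifugation condition to the target
    settings (degrees of visual angle) collected over its trials; the reference
    comes from the motionless, total-darkness trials. Negative scores mean the
    perceived eye level fell below the reference.
    """
    reference = np.mean(reference_settings)
    return {condition: np.mean(trials) - reference
            for condition, trials in settings_by_condition.items()}

# Hypothetical trials; keys are radial accelerations in m/s^2
scores = drift_scores({0.55: [-1.2, -0.8, -1.0], 2.98: [-4.9, -5.3, -5.1]},
                      reference_settings=[0.1, -0.2, 0.1])
print(scores)  # mean deviations per condition, e.g. {0.55: -1.0, 2.98: -5.1}
```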

Figure 1: The apparatus consisted of a centrifuge in which the subject could take place in the upright position (left panel) or in the supine position (right panel). The illustration shows the initial position of the target, the horizontal and vertical planes through the eyes, and the gravito-inertial force GIF, the vector sum of G and the radial acceleration γr under centrifugation (θ is the tilt of GIF relative to G).

RESULTS

In the seated position (Figure 2), for the control subjects, increasing the radial acceleration lowered the perceived eye level; that is, the settings shifted toward the lower part of the body, as indicated by the negative values. The visual drift fell below the reference value with a modification of G of less than 0.01 percent. For the expert subjects, the theoretical threshold of sensitivity to radial acceleration was higher: only settings obtained under the highest value of radial acceleration, corresponding to 1.02 G, were lower than the reference. Moreover, the lowering of the eye level, when it occurred, was much smaller than for the control subjects.

Figure 2: Mean deviation (deg) of the visually perceived eye level (VPEL) settings relative to the reference (motionless) for the control and expert groups and for the different conditions of radial acceleration (γr = 0.0152, 0.0609, 0.38077, and 1.67 m·s⁻²).

In the supine position (Figure 3), for the control subjects, the zenith perceived under each centrifugation condition was lower than the reference (motionless), and increasing the radial acceleration lowered the perceived eye level further. The expert group was not sensitive to low gravitational-inertial stimulation up to 1.04 G, which represents a tilt of the gravito-inertial force of about 17 degrees relative to G; the zenith they perceived under each centrifugation condition was not different from the motionless reference.
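As a check on these figures (our arithmetic, using the highest supine acceleration from Figure 3, γr ≈ 2.98 m·s⁻², and g ≈ 9.81 m·s⁻²), the gravito-inertial force is the vector sum of gravity and the radial acceleration, so its magnitude and tilt are

\[
|\mathrm{GIF}| = \sqrt{g^{2} + \gamma_r^{2}} = \sqrt{9.81^{2} + 2.98^{2}} \approx 10.25\ \mathrm{m\,s^{-2}} \approx 1.04\,g,
\qquad
\theta = \arctan\!\frac{\gamma_r}{g} = \arctan\!\frac{2.98}{9.81} \approx 17^{\circ},
\]

consistent with the 1.04 G and 17-degree values quoted above.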

Figure 3: Mean deviation (deg) of the zenith settings relative to the reference (motionless) for the control and expert groups and for the different conditions of radial acceleration (γr = 0.55, 0.97, 1.52, 2.19, and 2.98 m·s⁻²).

DISCUSSION

Thus, the gravitational-inertial disturbances did not induce the same negative consequences on the eye level for the spatial experts. Whatever the orientation of the body in space, the control subjects set the target progressively lower as the magnitude of the gravito-inertial force increased. In the upright position, the spatial experts showed a smaller sensitivity to the oculogravic illusion, both in their theoretical threshold of sensitivity to radial acceleration and in the magnitude of the illusion itself. In the supine position, the acrobats were not subject to the illusion at all.

The origin of these inter-individual differences may be a greater efficiency in the use of sensory information. When limited variations of G are used, only the vestibular system is stimulated. However, the otolithic sensors do not distinguish a linear acceleration from a head tilt, and the central nervous system prefers to interpret the gravitational modifications in terms of a physical tilt in pitch. A target that remains at true eye level therefore appears to be above its true location, and the oculogravic illusion induces a lowering of the visually perceived eye level. It can be suggested that the spatial experts were less sensitive to the oculogravic illusion because of a capacity to distinguish linear accelerations from a head-body tilt. Through learning, new configurations corresponding to postural states would be stored in the central nervous system, enriching the internal models of orientation; the experts could thus have developed internal models at the origin of this perceptive distinction (Merfeld et al., 1999).

Besides this interpretation, the origin of the individual differences may be related to a sensory weighting that depends on the subject's experience. In the absence of vision, the perception of body orientation in the gravitational field is based on the coordination of the vestibular and somaesthetic sensory modalities. The somaesthetic system is assumed to provide information about body orientation, notably in response to the anti-gravitational forces. Most of the time, the information provided by these two systems is redundant. Errors of perceptive judgment would then depend on an inadequate resolution of a sensory conflict generated by the presence of non-redundant information, implying an inappropriate sensory dominance during the integration process (Young, 1984). In that context, the eye-level lowering observed for very low gravitational-inertial stimulation may be the result of a sensory conflict between the otolithic information, whose sensors do not distinguish a linear acceleration from a head tilt, and the somaesthetic information, which indicates no change in the postural state with regard to gravity. The perceptive shift would then be associated with a process of sensory integration that gives more weight to the otolithic information and interprets the gravitational modifications in terms of physical tilt. Conversely, the absence of the oculogravic illusion may be the result of a somaesthetic sensory dominance.

In conclusion, the relations the subject maintains with the spatial environment, and the knowledge acquired through experience, modify the processing of sensory information and the perceptive construction resulting from it. The extensive practice of acrobatics, which requires finely associating sensory configurations with a precise physical orientation, allows these spatial experts to be less sensitive to the oculogravic illusion stemming from radial accelerations similar to those met in daily life.

REFERENCES

Berthoz, A., & Viaud-Delmon, I. (1999). Multisensory integration in spatial orientation. Current Opinion in Neurobiology, 9, 708-712.
Cohen, M. (1973). Elevator illusion: Influences of otolith organ activity and neck proprioception. Perception & Psychophysics, 14, 401-406.
Cohen, M. (2002). Visual and vestibular determinants of perceived eye-level. In RTO-MP-086: Spatial Disorientation in Military Vehicles: Causes, Consequences and Cures (pp. 37-1 to 37-8).
Cohen, M., Stopper, A., Welch, R., & DeRochia, C. (2001). Effects of gravitational and optical stimulation on the perception of target elevation. Perception & Psychophysics, 63, 29-35.
Dizio, P., Li, W., Lackner, J. R., & Matin, L. (1997). Combined influences of gravito-inertial force level and visual field pitch on visually perceived eye level. Journal of Vestibular Research, 7, 381-392.
Graybiel, A. (1952). Oculogravic illusion. Archives of Ophthalmology, 48, 605-615.
Lackner, J. R. (1992). Multimodal and motor influences on orientation: Implications for adapting to weightless and virtual environments. Journal of Vestibular Research, 2, 307-322.
Li, W., & Matin, L. (1993). Eye and head position, visual pitch, and perceived eye level. Investigative Ophthalmology & Visual Science, 34, 1311.
Matin, L., & Li, W. (1992). Visually perceived eye level: Changes induced by a pitched-from-vertical 2-line visual field. Journal of Experimental Psychology: Human Perception and Performance, 18(1), 257-289.
Merfeld, D. M., Zupan, L., & Peterka, R. J. (1999). Humans use internal models to estimate gravity and linear acceleration. Nature, 398, 615-618.
Raphel, C., & Barraud, P. A. (1994). Perceptual thresholds of radial acceleration as indicated by visually perceived eye level. Aviation, Space and Environmental Medicine, 65, 204-208.
Raphel, C., Cian, C., Barraud, P. A., & Micheyl, C. (2001). Effects of supine body position and low radial accelerations on the visually perceived apparent zenith. Perception & Psychophysics, 1, 36-46.
Wade, N. J., & Schöne, H. (1971). The influence of force magnitude on the perception of body position: I. Effects of head posture. British Journal of Psychology, 62, 157-163.
Whiteside, T. C. D., Graybiel, A., & Niven, J. I. (1965). Visual illusions of movement. Brain, 88, 193-210.
Young, L. R. (1984). Perception of the body in space: Mechanisms. In I. Darian-Smith (Ed.), Handbook of physiology: The nervous system, Vol. 3 (pp. 1023-1066). New York: Academic Press.
Young, L. R., Mendoza, J. C., Groleau, N., & Wojcik, P. W. (1996). Tactile influences on astronaut visual spatial orientation: Human neurovestibular studies on SLS-2. Journal of Applied Physiology, 81, 44-49.


PERCEPTUAL DYSLEXIA: ITS EFFECT ON THE MILITARY CADRE AND BENEFITS OF TREATMENT

Susann L. Krouse
Naval Education and Training Professional Development and Technology Center
Pensacola, FL, USA

James H. Irvine
Naval Air Warfare Center, Weapons Division
China Lake, CA, USA

Perceptual dyslexia—also known as Irlen Syndrome, Scotopic Sensitivity Syndrome, SSS, scotopic sensitivity/Irlen syndrome, and, in the United Kingdom, Meares-Irlen Syndrome—is a perceptual disorder that affects an estimated 46-50 percent of those with learning disabilities or reading problems; 33 percent of those with dyslexia, attention deficit (hyperactivity) disorder, and other behavior problems; and approximately 12-14 percent of the general population (Irlen, 1999). It is not a dysfunction of the physical process of sight: people with perceptual dyslexia can have 20/20 vision, or they may wear corrective lenses. Rather, perceptual dyslexia is a problem with how the nervous system encodes and decodes visual information and transmits it to the visual cortex of the brain (Warkentin & Morren, 1990).

SYMPTOMS OF PERCEPTUAL DYSLEXIA

People affected by perceptual dyslexia have problems accommodating specific wavelengths of light, and each person's troublesome frequency is unique. Factors such as bright light, fluorescent light, high-gloss paper, and black-and-white contrast can aggravate the disorder. The victim's scope of focus may be restricted, so that he or she sees only very small bits of a line of text instead of the entire line. The text that the person sees might blur, swirl, move, pulsate, vibrate, or even disappear. The white page may seem too bright, may flicker or flash, or colors may appear. SSS victims rarely report these symptoms to others because they think that everyone experiences the same problems (Irlen, 1991). Those with perceptual dyslexia often avoid reading at all costs, and, as a result, they may be affected physically, academically, and psychologically (Irlen, 1991).

From a physical standpoint, the text distortions make reading extremely difficult, often physically painful. Without intervention, victims of Irlen Syndrome exhibit symptoms such as sensitivity to light, headaches, nausea, eyestrain, sleepiness while reading, attention deficit, and distortions of text (Irlen, 1991).

Academically, everything derives from reading, and victims of Irlen Syndrome invariably find it difficult to read. They may skip words or reverse or change letter order—seeing the word “saw” as “was,” for instance. They may have poor penmanship, a result of difficulty with spatial orientation: they misjudge how much space to leave between a pair of letters or words. Because they frequently cannot envision an entire word, they find it difficult to spell or to work with large numbers (Irlen, 1991).



Psychologically, the victim of perceptual dyslexia is prone to problems with behavior, motivation, and self-esteem. Those with SSS frequently exhibit symptoms of attention deficit disorder, acting out, and behavior problems (Irlen, 1991). They are often poorly motivated to succeed: almost invariably they tried early on, when they were young, but with few successes and many “failures,” their attitude became “why bother?” Their self-esteem is low because, while everyone around them is reading and learning, they cannot—no matter what they do or how hard they work, they just can't seem to “get it.”

Identification of SSS

Helen Irlen, a literacy instructor in California, first identified this perceptual dyslexia in the early 1980s and labeled it “scotopic sensitivity syndrome.” Irlen had received a grant from California State University, Long Beach, in 1980, to set up a literacy program for adults. She chose to work with adults because adults can communicate better than children and are more accurate “reporters” of what they experience; they are less intimidated by authority than children and are less likely to be swayed without some evidence; and adults are more motivated to succeed, having reached a point in their lives where they recognize the importance of learning in general and reading in particular.

After three years of in-depth research, Irlen discovered that many problems appeared after readers had been actively reading for a relatively short period of time (usually about 10 minutes or more). Those who had trouble reported that distortions began to appear on the page, and those distortions prevented them from comprehending the words. All of their energy was going into perceiving the words, holding them on the page, or even just finding them. As a result, many stopped reading; it was just too difficult for them. As Irlen explained in her speech at the Dyslexia in Higher Education Conference (October 31-November 2, 1994, Plymouth University, England), once she began asking the more definitive question, “WHAT do you see?” instead of “DO you see?”, the answers made it apparent to her that these poor readers were victims of a unique syndrome that was not being adequately addressed by the professional educational community (Dyslexia in higher education: strategies and the value of asking).

Serendipitous Discovery

One day, one of Irlen's students discovered that when she placed a red overlay—left over from previous eye-dominance exercises—on the page she was reading, the sensation of movement that she had always experienced stopped. For the first time, she could actually read without having the words constantly sway back and forth (Irlen, 1991). The red didn't work for everybody, however; it made no difference to the rest of the students.

So Irlen tried other colors and found that the vast majority of those who tried the colored overlays were helped. Each person who was helped responded to one specific color. Once that particular color was determined and used, the individual was able to read better and longer, and reported that the distortions previously experienced disappeared immediately. Irlen didn't know at that time why the overlays worked, just that they did.


RESEARCHING THE CAUSE

With the advent of magnetic resonance imaging, we have been able to determine that the brains of all dyslexics—including perceptual dyslexics—work differently from those of non-dyslexics (Lewine et al., in press). Dyslexics use a different part of the brain than non-dyslexics when they read, and they use a larger portion of their brain when they read or perform visual tasks.

Receptor Field Theory

In the 1980s, visual physiologists developed the receptor field theory of color vision. This theory hypothesizes that the cones of the eyes are organized into eight sets of concentric, counterbalancing fields. Cones help us distinguish things clearly and distinctly, and because they contain photopigments that are sensitive to red, green, and blue light wavelengths, we are able to see color (Irvine, 2001).

Each type of field is determined by the arrangement of its color regions and the balance of its energy or signal output. The output should be balanced—that is, neither positive nor negative—as it passes through the optic nerve to enter the brain's visual processing center (Irvine, 2001).

If the receptor fields sum to unity as they enter the brain's processing center, and each receptor field is equal to the others (so that none is governing or dominant), there will be no perceptual distortion, and the image formed will be accurate. On the other hand, if any of the receptor fields does not sum to unity or is, in fact, dominant under a given set of spectral input conditions, the visual control system will change, and the image formed will overlap, swirl, or jump about—in general, be distorted (Irvine, 2001).

The Pathways to the Visual Cortex

Over the years since Irlen's discovery, numerous studies of this visuo-perceptual disorder have been conducted, and the general consensus is that scotopic sensitivity syndrome affects the way the visual pathways carry messages from the eye to the brain. There are two pathways to the visual cortex:

1. the magnocellular, which does fast processing of information for perceiving position, motion, shape, and low contrast; and
2. the parvocellular, which carries out slower processes for perceiving still images, color, detail, and high contrast.

It is theorized that when the receptor fields do not sum to unity, the pathways are affected, causing the magnocellular impulses to be slowed, so only partial perception occurs. This results in words that blur, fuse, or seem to jump off the page (Newman, 1998).



Individualized colored filters seem to restore the balance between the two processing systems, preventing this overlapping (Robinson, 1994). The colored overlays and filters reduce or eliminate the perceptual problem by screening out the wavelengths of light troublesome to the individual (Sims, 1999). Studies of both the long- and short-term efficacy of the transparencies and filters have shown that they do, indeed, provide benefits to the individual afflicted with SSS (Whiting, Robinson, & Parrott, 1990; Robinson & Conway, 1990).

THREE STUDIES

Although there have been numerous studies of perceptual dyslexia since its recognition in 1983, we will look at just three in this paper: Irvine's, Lewine's, and Wilkins'.

Irvine's Experiment for the Navy

The Navy wanted to see whether the visual performance of those afflicted with perceptual dyslexia changed as the energy spectrum presented to them changed. In 1995, James Irvine conducted an experiment at China Lake, California, which showed that for certain perceptual dyslexics the receptor fields do NOT sum to unity, so the image sent to the brain is not crisp and clear. When this happens, the subject's visual control system alters radically, and the subject does not see the image properly (Irvine & Irvine, 1997).

Lewine's Study

In the late 1990s, Dr. Jeffrey Lewine, a neuroscientist then at the University of Utah Center for Advanced Medical Technologies, discovered that modifying the light frequency spectrum presented to a perceptual dyslexic's visual system could cause the brain to revert to a more normal activity pattern. He also noted that he could actually cause five to six percent of the “normal” population to develop dyslexic-type dysfunction when they were exposed to “abnormal” light frequency environments (Lewine et al., in press). This means that some ordinarily non-dyslexic personnel can develop gross inefficiency and degraded performance, or become dysfunctional and unable to perform normally, under certain lighting conditions such as red battle lighting, blue precision operating bays, or foggy or hazy conditions.

Wilkins' Studies

Professor Arnold Wilkins, while a research scientist at the Medical Research Council Applied Psychology Unit of Cambridge University in the United Kingdom, studied the neuropsychology of vision, reading and color, photosensitive epilepsy, and attention, conducting double-blind experiments to validate the existence and potential treatment of perceptual dyslexia. He did this using four different groups of readers, mostly children, randomizing the presentation order of the overlays and further randomizing the use of the appropriate overlays versus placebo overlays (Wilkins, 2003).

Wilkins' studies determined that, when given the choice, about half the readers would choose clear overlays, and the other half would choose the colored overlays. Given that only approximately 15 percent of the population is afflicted with perceptual dyslexia, we can assume from Wilkins' experiments that, in addition to those people being helped by the colored overlays or filters, some of the not-so-afflicted can also benefit from color.

TESTING AND TREATMENT

Generally speaking, before we can treat perceptual dyslexics, we have to identify them.

Types of Screening

There are generally three types of screening that would be used, two of which are based on the Irlen Method:

1. In the field or at the recruiting site, a simple, 10- to 15-question inquiry of the subject, and trial-and-error determination of the appropriate colored overlay.
2. At the Recruit Training Center or a major command, an in-depth inquiry consisting of questions concerning the subject's symptoms and related history.

The third test, the Wilkins Rate of Reading Test, is also easy to administer and consists of four easy one-minute tests. The entire process should not take more than about a half hour.

Resources Required

There are many ways to improve the situation for perceptual dyslexics without having to spend a penny. Such simple and cost-free actions as dimming the lights in a room, using natural instead of fluorescent lighting, allowing students to use colored paper and to wear caps or visors indoors, and avoiding the use of white boards will all help (Irlen, undated; Wilkins, 2003).

But to alleviate the problem requires intervention in the form of screening and, ultimately, selection of appropriate colored overlays or filters.

The outlay required to implement such a program would be minimal. Only basic instruction would be required at the recruiting sites: enough training for the recruiter to be able to administer the simple Irlen Type-1 test or the Wilkins Rate of Reading Test and to assist the applicant in choosing the appropriate overlay. At the Recruit Training Center, it is anticipated that one or two Educational Specialists who have backgrounds in education and have been trained in the Irlen Method will be required to administer the screening and perform the diagnostic analysis.

Supplies of overlays or transparencies for recruiting sites and the Recruit Training Center will also be necessary. Overlays from the Irlen Institute cost approximately $1.25 each, although less expensive transparencies are available from other commercial sources. (It must be remembered, however, that Irlen overlays are specifically designed and developed for the purpose of alleviating SSS.) Tinting of lenses (whether corrective or not) adds about $50 to $100 per pair at this point in time; under contract, however, the price would certainly drop to a more nominal figure (Irvine, 1997).

MILITARY APPLICATION

And what will we get back for this investment? The individual service members will benefit, of course, with improved reading speed and comprehension. Because they will experience less visual fatigue, their attention spans will increase. As they begin to understand and realize that they can do what they thought they couldn't, their self-confidence will improve, as will their attitude toward training and the job itself. Just knowing that a solution is available will often be enough to change an attitude and strengthen a resolve to succeed.

The military services will also reap the rewards of this program because, in addition to increasing the qualified pool of applicants for enlistment, the young people affected will be able to train more efficiently. Remedial, basic, and ongoing training will be more effective and, as a result, more efficient. With more effective training, the service member will be more knowledgeable and efficient in the field. It can further be anticipated that there will be fewer behavioral problems—both during and after training—primarily due to the change in attitude that has been shown to occur following screening and diagnosis for SSS.

All in all, we believe that a higher-quality service member will be delivered to the field or fleet, both academically and attitudinally.


References

Borlik, SSG A. (1999). Services grapple with recruiting challenges. American Forces Information Service.
Irlen, H. (1991). Reading by the colors. Garden City Park, NY: Avery Publishing Company.
Irlen, H. (1999). Irlen Syndrome/Scotopic Sensitivity: Most frequently asked questions. Retrieved September 20, 1999, from the World Wide Web: http://Irlen.com/sss.faq.htm
Irlen, H. (undated). Tips and research from Helen Irlen. ACNOnline: Association for Comprehensive Neuropathy. Retrieved October 31, 2003, from the World Wide Web: http://www.latitudes.org/articles/irlen_tips_research.htm
Irvine, J. H. (1997). Dyslexic effect on the Navy training environment and operational efficiency: A prognosis for improvement. (Briefing.)
Irvine, J. H. (2001). The cause of Irlen syndrome. (Briefing.)
Irvine, J. H., & Irvine, E. W. (1997, April). Scotopic sensitivity syndrome in a single individual (a case study). China Lake, CA: Naval Air Warfare Center, Weapons Division.
Lewine, J. D., Davis, J. T., Provencal, S., Edgar, J. C., & Orrison, W. W. (in press). A magnetoencephalographic investigation of visual information processing in Irlen's scotopic sensitivity syndrome. Perception.
Newman, R. M. (1998). Technology for dyslexia: Federal education & disability law compliance.
Robinson, G. L., & Conway, R. N. (1990). Irlen filters and reading strategies: The effect of coloured filters on reading achievement, specific reading strategies and perception of ability. Retrieved August 19, 1999, from the World Wide Web: http://www.edfac.usyd.au/centres/children/Greg.html
Robinson, G. L. (1994). Coloured lenses and reading: A review of research into reading achievement, reading strategies and causal mechanisms. Australasian Journal of Special Education, 18(1), 3-14, citing M. C. Williams, K. Lecluyse, & A. Rock-Facheux (1992), Effective interventions for reading disability, Journal of the American Optometric Association, 63(6), 411-417.
Sims, P. (1999). Awakening brilliance. Retrieved August 19, 1999, from the World Wide Web: http://www.onlineschoolyard.com
Warkentin, M., & Morren, R. (1990). A perceptual learning difference. Notes on Literacy, 64 (Oct. 1990).
Whiting, P. R., Robinson, G. L., & Parrott, C. F. (1990). Irlen coloured filters for reading: A six-year follow-up. Retrieved August 19, 1999, from the World Wide Web: http://www.edfac.usyd.edu.au/centres/childrens/SixYr.html
Wilkins, A. J. (2003). Reading through colour. West Sussex: John Wiley & Sons Ltd.



Neurofeedback Training for Two Dimensions of Attention: Concentration and Alertness

Jonathan D. Cowan, Ph.D.
President and Chief Technical Officer
NeuroTek, LLC d/b/a Peak Achievement Training
1103 Hollendale Way
Goshen, KY 40026
jon@peakachievement.com

Dr. Louis Csoka founded the United States Military Academy's Center for Enhanced Performance at West Point in 1989, when he was a Colonel and Professor in the Department of Behavioral Science and Leadership. It has grown to be the largest performance enhancement center in the U.S. because the Army has found it so valuable. Dr. Csoka recently stated: “In today's military, learning the cognitive skills is not enough. One must also learn the optimal sequence of concentration, alertness and relaxation for each activity. At no time has the demand for improved attention been greater than in today's Army with its high tech weapons and the extensive use of advanced information technology. The demand for attention has grown exponentially while our basic attention skills lag far behind. We still basically learn attention as a by-product of the education and skills training we receive. To be sure, many enhancements have been made in the Army's training programs. But still today, none directly target the attention network in the brain. Attention control training with the Peak Achievement Trainer does exactly that.”

BRAINWAVE BIOFEEDBACK

The Peak Achievement Trainer uses a simpler and clearer method of brainwave biofeedback, or neurofeedback, to measure and enhance both concentration—single-pointed focus—and alertness, or arousal. Biofeedback trains people to better regulate their brains and bodies by telling them, in an understandable way, what is happening there. Brainwave biofeedback is also known as neurofeedback or EEG biofeedback; the Peak Achievement Trainer uses a more accurate form of it to detect and improve concentration and relaxation, one that is easier to understand and faster to train with than other neurofeedback. Using the older procedures that he developed, Dr. Joel Lubar at the University of Tennessee (Lubar, 1994) found that neurofeedback was 80% effective in more than 1,000 children with Attention Deficit Disorder. In his case series, the average increase in grade level on standardized tests was 2.5 years. The typical increase in IQ test scores was 8-19 points, and the average Grade Point Average improved 1.5 levels (C to B+). Sixty percent of his clients decreased or eliminated medication. There were major improvements in behavior, with decreased hyperactivity and violence (see Nash, 2001). As a result of instituting a school neurofeedback program in Yonkers, New York, the number of students suspended at Enrico Fermi School decreased from 53 in 1996-97 to 17 in 1997-98 and 22 in 1998-99 (Sabo, M. J., personal communication).


PEAK ACHIEVEMENT TRAINING PROCEDURES

The Peak Achievement Trainer detects the brainwaves from the Executive Attention Network and converts them to a measure of single-pointed focus and interest, which it then converts to audio and video outputs. Students can use these outputs to learn how to control their concentration, and they can improve through practice with the Trainer.

The improved procedures used in the Peak Achievement Trainer were developed by combining research performed by Dr. Barry Sterman on Air Force B-2 bomber pilots with neuroimaging studies that pinpointed the parts of the brain most associated with the executive control of attention. In NASA and Air Force research (Sterman, DTIS Reports) that was classified at the time (the late 1980s) and is still largely unpublished, Dr. Sterman found that as pilots focused on a particular aviation task, the alpha brainwave decreased. There was an alpha burst as focus ended, and then suppression as the next task began; we call this alpha period the “microbreak.” The more difficult the task, the greater the alpha suppression. The better pilots in a B-2 instructor selection process suppressed parietal alpha more completely, and they needed a shorter microbreak before starting to focus again. From these data, Dr. Sterman developed EEG measurements that proved a very powerful differentiator between the 6 “Top Guns” who became instructors and the other 12 pilots. The Air Force used a regression line with hundreds of variables to make the determination; none of those variables correlated above 0.4, except for his EEG measures, which were above 0.6. He selected the same pilots with just his metrics that the Air Force did with all of theirs.
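The quantity at the heart of these findings is parietal alpha-band power, which falls during focus and rebounds during the microbreak. A generic band-power estimate is sketched below; Sterman's actual (largely unpublished) metrics were more elaborate, and the 256 Hz sampling rate, 8-12 Hz band edges, and toy signals here are all assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

def alpha_power(eeg_window, fs=256.0, band=(8.0, 12.0)):
    """Estimate alpha-band power for one EEG window (e.g., a parietal channel)."""
    freqs, psd = welch(eeg_window, fs=fs, nperseg=min(len(eeg_window), 512))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.trapz(psd[mask], freqs[mask])  # integrate the PSD over the band

# Toy signals: a strong 10 Hz rhythm at rest, suppressed during focus
fs = 256.0
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
resting = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
focused = 0.2 * np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(t.size)
print(alpha_power(resting, fs) > alpha_power(focused, fs))  # True: alpha suppressed
```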

From Sterman's studies, we developed the idea that the healthy individual cycles between concentration and relaxation by focusing on (part of) a task until it is done, and then taking a brief rest. Even the best-trained brain cannot concentrate forever; it is not designed to. Constant, intense concentration has its costs in stress, emotional tension, and disease. The Peak Achievement Trainer therefore teaches relaxation through breathing instructions on an audiotape and then integrates it into training for the cycle.

More recently, we developed a second, quite different measurement: a way to determine the degree of alertness, or arousal, of the central nervous system. By concentration, we mean the degree of single-pointed focus on a perception, thought, or image—like a camera zooming in. You can be relaxed, very alert, or in between and still have single-pointed focus. Many people see a parallel between this intense focus and the popular term for the state of an athlete at his peak: “the Zone.” In fact, The New York Times Magazine ran an article about the Peak Achievement Trainer in a special issue called Tech 2010: A Catalog of the Near Future, focusing on technology that will change all of our lives in this decade. They called the article “The Coach Who Will Put You in the Zone.”

Increasing Alertness/Arousal creates more intense stimulation or excitement. It enhances emotion. High Alertness is also associated with summoning resources to respond to a challenging situation, and many studies relate it to stimulation of the Reticular Activating System. It has a high cost in energy, and there is a quick "burnout" if the energy is not conserved. We can determine both Concentration and Alertness from the same brainwave at the same location. We call this the ConAlert protocol. The Trainer's Concentration measure decreases as you focus more, while the Alertness measure increases with greater arousal and effort.
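The ConAlert computation itself is unpublished, so the sketch below is only a plausible stand-in showing how two indices with these opposite behaviors could be read from a single channel: a slow-band number that falls with focus and a fast-band number that rises with arousal. The band edges and the pairing are assumptions, not the Trainer's algorithm:

```python
import numpy as np

def band_power(segment, fs, lo, hi):
    """Mean spectral power of one EEG window in the [lo, hi] Hz band."""
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(segment * np.hanning(len(segment)))) ** 2
    return psd[(freqs >= lo) & (freqs <= hi)].mean()

def con_alert(segment, fs):
    """Toy stand-in for a ConAlert-style reading, NOT the actual protocol.
    Assumption: Concentration tracks slow-band power (falls with focus);
    Alertness tracks fast-band power (rises with arousal)."""
    concentration = band_power(segment, fs, 8.0, 12.0)
    alertness = band_power(segment, fs, 13.0, 21.0)
    return concentration, alertness
```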

One of our trainers is also a specialist in teaching a particular reading technique, the Scholar's Edge. In Figure 1, he is applying this to a book that he has never read. First, he examines the Table of Contents in a particular way. From about 5 to 15 seconds, numbered at the bottom, this tracing shows the second part of his effort, in which he reviews the schemata he has constructed. He is shifting his attention back and forth from a narrow focus on particular entries in the Table of Contents to a broader view of the synopsis of the book. The Concentration line reflects this, but he is not very intensely alert. From 15 to 18 seconds, he takes a "microbreak", resting and giving his brain time to recharge. He then reviews two points of interest from 18 to 24 seconds. Both of them are absorbing, as can be seen by the dips in the concentration line. The Alertness line shows that one is perhaps slightly arousing. Then, from 24 to 27 seconds, he asks himself to set an intent for this reading session. This requires both high Alertness and Concentration. This is even more evident for the next five seconds, when he sets his strategy for reading the book.

There are many activities, military and otherwise, that can be analyzed in this fashion. With study and statistics, expert performance patterns can be differentiated, traced out, and described in detail by reviewing the record. Audio and video can be recorded; our software will soon provide for synchronized recording and playback. Additional measures such as the direction of eye gaze can be added to the analysis. These very sophisticated measures may be particularly helpful in designing new user interfaces and making training more interesting. Once the problem areas can be delineated, appropriate corrective steps can be taken.

PEAK ACHIEVEMENT TRAINING METHODS

Our DARPA Project Officer suggested the idea that the Peak Achievement Trainer develops the "Cognitive Athlete". We believe there are at least 8 different ways the cognitive athlete can be trained:

• Strengthening the ability of the Executive Attention Network to focus attention.
• Strengthening the ability of the midbrain to intensify alertness/arousal.
• Focusing attention on parts of the body that the coach wishes to work with.
• Training the user to take brief, relaxing microbreaks which recharge the brain.
• Finding the best possible degree of alertness/arousal to perform particular activities optimally.
• Performing arbitrary sequences of concentration, alertness, and microbreaks.
• Discovering and enhancing performance of the sequences that are optimal for particular activities.
• Performing these sequences despite distractions such as self-talk and crowd noise.




THE HILLENBRAND PEAK PERFORMANCE PROGRAM

The Peak Achievement Trainer is an integral part of the Attention Control Training program at the Center for Enhanced Performance. Dr. Csoka purchased his first Peak Achievement Trainer at the recommendation of Dr. Nate Zinsser, the psychologist who currently runs the Attention Control Training program. Dr. Csoka has recently created a very similar Peak Performance Center at the executive offices of Hillenbrand Industries, a Fortune 1000 health care company. He has created enough value there that all 22 of their top executives have completed the 20-session training program, doing about one session a week. Response to the program was overwhelmingly positive. These executives have returned voluntarily for many additional sessions, primarily using the Peak Achievement Trainer's ConAlert protocol, and the program is being expanded to 39 more executives. They have ordered Trainers for their top executives. Dr. Csoka states, "Training attention control begins the entire process of developing peak performance, in the military and elsewhere. The Peak Achievement Trainer provides an excellent tool for developing enhanced skills in focus and concentration. We have been working with business executives at a Fortune 1000 company on their attention skills as part of a broader peak performance training program that is an integral component of their leadership development program. By providing business situations and scenarios as input for the Peak Achievement Trainer, we have been able to demonstrate measurable improvements in their ability to attend in crucial meetings, engage in critical performance appraisals with employees, and deliver exceptional presentations. The Peak Achievement Trainer provides the critical feedback during attention control training needed to develop this skill to a level equal to what elite athletes have been able to achieve."

The change in the ability to concentrate of the group of Hillenbrand executives after five sessions of Peak Achievement Training is shown in Figure 2. The two cones on the left show how long they could hold the concentration line below 30 during the pre-test, while the last two cones are the post-test data. They were given four trials during each test. The first trial is shown on the left, and the best trial is shown on the right. During the pre-test, the first trial averaged 19 seconds, ranging from 10 to 40. The first trial in the post-test was more than twice as long, 44 seconds, with the range running from 25 to 65 seconds. The best-trial average almost doubled from 65 seconds at the pre-test to 128 seconds at the post-test. The ranges were 18 to 180 and 48 to 220 seconds, respectively.
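The "hold" scores reported above are simply the duration of the longest stretch for which the Concentration trace stays below the criterion of 30. A minimal sketch (the sampling rate and trace format are assumptions):

```python
import numpy as np

def longest_hold(trace, fs, criterion=30.0):
    """Longest continuous time (seconds) a Concentration trace stays
    below the criterion value. `trace` holds one sample per 1/fs seconds."""
    best = run = 0
    for below in np.asarray(trace) < criterion:
        run = run + 1 if below else 0
        best = max(best, run)
    return best / fs

# Example: a 4 Hz trace hovering near 28 for 44 s would score about 44.0.
```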

THE MISSING COMPONENT OF MILITARY TRAINING

Dr. Csoka and I believe that there is a missing component of military training. Learning the optimal sequences of concentration, alertness, and microbreaks is as integral to skill development as having the correct cognitive information at the right time. A high-quality training experience should produce sequences of focus and alertness that are very similar to those in the real battle.

Dr. Csoka further states, "The applications for direct attention training in the Army are almost endless. Both the individual soldier and crew weapons systems involve highly sophisticated equipment and technology requiring much higher levels of attention and concentration. Take for example the tank crew. Target identification, acquisition, and locking while on the move and under fire require very fine-tuned concentration on the part of the crew. West Point's Center for Enhanced Performance staff conducted a study at the National Training Center with Army M1 tank crews undergoing tank gunnery. The training group was given performance enhancement training involving relaxation, imagery, and attention training. They significantly outperformed a control group on all the tank gunnery measures. This was without the availability of the Peak Achievement Trainer. Had this attention technology been available, the results would have been even more compelling and the training time could have been reduced."

POTENTIAL MILITARY USES

Our initial DARPA project produced several suggestions for military use of this new technology. We proposed an advanced system designed to include a mental exerciser: a separate program for taking the brain to the gym and enhancing the capacity for focusing and maintaining alertness. The military spends billions of dollars each year on physical fitness training, but very little to directly train mental fitness, even though many recruits have problems paying attention. Considering the importance of mental fitness for many of the decision-making tasks in today's high-tech military, it will be very useful to provide tools that can be used like physical conditioning equipment to create a state of mental toughness via repetition of demanding mental tasks. Training the "cognitive athlete" to enhance the capacity for focus and alertness also increases the envelope within which task demands can be structured and reasonably met, thus permitting more efficient use of human resources within the Armed Forces. We believe there are specific ways to use this technology to enhance memory and decrease fatigue. It can also be developed into a metric for monitoring workloads during a particular task, and it can help the military design better Computer-Based Training by measuring the interest of the student from moment to moment.

The original DARPA project description, the Student State Project, focused on developing a way to monitor the state of a student being given Computer-Based Training, so that the computer could improve the tutoring it administered to the student. A variety of tutoring systems for attention control training can be created. One suggested approach would include a semi-transparent overlay placed in the lower right corner of the screen. This display can show both instantaneous and time-averaged Concentration and Alertness, and/or a plot of the last minute. Audio feedback employing particular sounds can steer the user back to the optimal state. Alternatively, visual warning signals can be provided as flashing text or interruptions. Later, a dynamically generated review of the sequence and events could be presented, along with coaching generated by the computer or a coach.
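As a sketch of the overlay's feedback logic, the loop below checks the current readings a few times per second and issues a corrective prompt when either measure drifts out of range. `read_conalert` and `warn` are hypothetical hooks, and the thresholds are illustrative, not values from the product:

```python
import time

# Illustrative thresholds only; `read_conalert` and `warn` stand in for
# the EEG driver and the overlay's audio/visual warnings.
FOCUS_MAX, ALERT_MIN = 30.0, 55.0

def coach_step(read_conalert, warn):
    concentration, alertness = read_conalert()
    if concentration > FOCUS_MAX:      # lower Concentration = deeper focus
        warn("Refocus on the lesson.")
    elif alertness < ALERT_MIN:
        warn("Sit up and re-engage.")

def run(read_conalert, warn, duration_s=60, hz=4):
    for _ in range(int(duration_s * hz)):
        coach_step(read_conalert, warn)
        time.sleep(1.0 / hz)
```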

There is an enormous potential for developing Concentration and Alertness databases that can assist in training the optimal sequences for each task. Measurements of experts at the particular task could be used to provide a library of optimal sequences for comparison. Collecting information on those who perform similar tasks would produce a group of norms that will allow the system to flag unusually poor performances and forward the information for appropriate remedial action. Comparison of present performances with the data from past ones by the same individual would provide useful information for teaching and evaluation. Our patent #5,983,129 covers analyzing these brainwave signals in association with an event and also using them to modify the subsequent presentation by a computer or simulator. Integrating this approach with measurements of eye fixation could produce even more refined measurements and descriptions of the sequences.
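As a sketch of the flagging idea (the norm format and the cutoff are illustrative assumptions, not the patented method):

```python
import numpy as np

def flag_poor_performance(session_scores, norm_mean, norm_std, cutoff=-1.5):
    """Compare one trainee's per-event scores against group norms and
    return the indices of events scoring far below the norm."""
    z = (np.asarray(session_scores, dtype=float) - norm_mean) / norm_std
    return np.where(z < cutoff)[0]

# Example: with norms mean=50, std=10, a score of 30 (z = -2.0) is flagged.
print(flag_poor_performance([52, 30, 48], 50.0, 10.0))   # -> [1]
```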

Due to a change in priorities associated with the development of the DARWARS project and a funding crunch at DARPA, we are currently looking for additional funding to continue research and development. Our collaborators have developed a new system small enough to fit completely on a golf visor; it will soon be wirelessly connected to the PC. We have planned a very sophisticated series of further validation studies for our ConAlert measures and a new, more powerful measure still under development.

REFERENCES

Lubar, J.F. (1994). Neurofeedback for the management of attention deficit-hyperactivity disorders. In M.S. Schwartz & Associates (Eds.), Biofeedback (2nd ed., pp. 493-525). New York: Guilford Press.

Nash, J.K. (2000). Treatment of Attention Deficit Hyperactivity Disorder with neurotherapy. Clinical Electroencephalography, 31(1), 30-37.

Sterman, M.B., et al. Defense Technical Information Center Reports ADA151901, ADA171093, ADA186351, ADP006101, ADA142919.



Figure 1: Changes in Concentration and Alertness during "Scholar's Edge" reading program.



Figure 2: Length of consistent concentration before and after the Hillenbrand Peak Performance program for the first and best trial. [Bar chart; mean hold times in seconds: pre-test 19 (1st trial) and 65 (best trial); post-test 44 (1st trial) and 128 (best trial).]


WHAT TODAY'S SOLDIERS TELL US ABOUT TRAINING FOR THE FUTURE

Brooke B. Schaab and J. Douglas Dressel
U.S. Army Research Institute for the Behavioral and Social Sciences
5001 Eisenhower Ave.
Alexandria, VA 22333
Dressel@ari.army.mil

How do we train mid- and junior-level Soldiers in the information technology (IT) skills needed for operational units? How can we maximize the acquisition, transfer, adaptability, and retention of the skills necessary for transformation to the future force? To help answer such questions, scientists from the U.S. Army Research Institute (ARI) administered questionnaires and conducted interviews with operators of Army Battlefield Command Systems (ABCS). Sixty-two enlisted Soldiers from three Army posts participated.

OBJECTIVES

Soldiers who are currently using Army digital systems gave their perspective on current and future training needs for the "digital Soldier." Findings provided insight into the type of training the Soldiers view as most productive in developing the expertise needed to take full advantage of new technologies. This paper summarizes Soldier perspectives about:

• How current training practices prepare Soldiers to perform successfully in units supported by digitization;
• Learning preferences for new technologies, noting opportunities presently available to capitalize on training;
• How digital systems change the jobs or the tasks that Soldiers perform.

METHOD

Researchers met with groups of four to eight Soldiers to gather information on current digital training practices on multiple systems. First, Soldiers were administered a questionnaire requesting information on their training background, training environment, training preferences, computer experience, and digital team performance.

Next, each Soldier was given the first of a series of four questions concerning digital training and allowed 10 minutes to write a response. Soldiers then passed their question sheets counter-clockwise to the next Soldier, who would answer this new question. A Soldier could expand upon the previous answer or give a different response; instructions were to write whatever seemed appropriate. This rotation of questions and additional responses continued until each Soldier had answered each of the four questions.


Following this, a similar rotation approach was taken in which each Soldier selected and ranked what he or she saw as the best two responses to each question. This resulted in each of the four questions having four sets of responses and four sets of rankings.

Finally, researchers conducted an interview with each group of Soldiers, which generally took 45-60 minutes. The interviews were audio-recorded for later review. Soldiers were asked to speak freely and to give their full and complete impressions of digital training practices that would be valuable to the Army. Comments were not for individual attribution. Tape recording of the session did not seem to constrain or inhibit the Soldiers.

FINDINGS/IMPLICATIONS

"Our biggest problem is that we need more training."

Current training on ABCS systems focuses on how to operate the system and is context-free. Soldiers complete this training at a novice level, with knowledge of facts, features, and rules that they can verbalize and apply by rote, but they have difficulty applying their knowledge in new contexts. To move beyond this novice level, Soldiers want and need practical experience in multiple situations to form a more sophisticated understanding of system uses. For example, they need to learn to prioritize and organize information to achieve a variety of goals. Lengthening the time Soldiers spend on their initial IT training is not the answer. When it comes to the most effective training beyond basic "knobology," field exercises are preferred by far, followed by exploring the system on their own.

The most pervasive and consistent finding was that junior-level enlisted Soldiers needed and wanted additional training to become proficient at their jobs.

Army digital systems are a never-ending work in progress: build one, try it out, make modifications, and build it again. Soldiers must continuously update their knowledge and adapt to new, changed, or absent functionality. More important, they must understand how these changes influence their ability to do their Army job. This type of training goes beyond the content of New Equipment Training (NET), which focuses on how to operate the system.

"A lot of hands on, that's important for today's up and coming Army."

Soldiers said what kind of training they found successful: "Give us hands-on training, using a full job flow sequence." Here, they received the inputs they would normally receive in an actual mission and produced and transmitted the required outputs. Soldiers said that field experiences and interacting with their peers were the "best" ways to learn the systems. Field exercises should include setting up and connecting the digital systems as well as operating the systems in various situations. Soldiers complained that without connectivity, an all too common occurrence, training does not happen. In short, Soldiers identified two problems: one associated with training to use the system, and a second associated with training to set up and



troubleshoot interconnectivity problems. Some digital system training addressed both problems, while other training addressed only system use.

Learning Preferences and Computer Experience

Soldier responses show they want to learn Army digital systems the same way they acquire much of their non-military digital expertise: by exploring the software and equipment to solve real problems (see Figure 1).

Figure 1. Preferred method of learning new software. [Bar chart, percent of Soldiers endorsing each method: read the manual, watch someone use it, take a course, explore the program, have someone help me.]

Soldiers were queried about their familiarity with using technology aids to supplement or support training. Most Soldiers reported a great deal of experience with the Internet and instant messaging, but they had limited experience with distance learning (DL), web-based gaming, or hardware installation (see Figure 2). This suggests that training delivered via distance learning or web-based gaming might require added support to implement, at least until Soldiers become familiar with these techniques.

Opportunities for Training

Soldiers indicated that they do have the training time and resources available to take advantage of DL opportunities.

• Ninety-two percent (92%) had their ABCS digital systems available in their unit for training.
• Fifty-eight percent (58%) had time to train during their work hours if training resources were available (e.g., CD-ROM, manuals, on-line help, practice vignettes/scenarios).


• Seventy-four percent (74%) would train on their own time if computer systems and training resources were available.

Figure 2. Computer experience reported by Soldiers. [Bar chart; percent reporting "a lot," "some," "a little," or "none" of experience with the Internet, instant messaging, networks, web-based gaming, installing hardware, and distance learning.]

"Technology Changes the Way We Fight"

Troops at all levels are beginning to understand how information technology changes their duties and responsibilities.

• "The system is a good thing because you can give and receive messages instead of walking far across the firing point or using a radio because the enemy may intercept the channels."
• "With the FBCB2, I can send messages to the commander when I am lost or in a dangerous situation. I can tell what's going on where I am, or set up an attack, or plan where to go next."
• "It is my belief that field training is the best training that an analyst can benefit from. It is valuable because it gave me an understanding of what the other ABCS components provided me within the Army."

One commander enthusiastically recounted a recent field exercise where Soldiers left from dispersed points to converge at a common location at the designated time. Soldiers did not talk with each other, but used their digital system to track themselves and their allies. "This would have been impossible without our digital systems," reported the commander.


Soldiers rated the following advantages as applying to a moderate or great extent to their digital systems:

• Digital systems make it much safer for troop movement in enemy territory.
• Once we understood the limitations and capabilities of the digital systems, we were able to use them in new and better ways.
• Planning and preparation are much faster when we can collaborate using our digital systems.

Summary

Soldiers expressed a desire and need for more training using digital technology. They wanted training to be hands-on and scenario-based. In short, there were three major findings from the research:

• Soldiers want more training to integrate their knowledge of their digital system with their Army job.
• Soldiers see opportunities available now for additional training at their home station. Although unfamiliar with distributed learning methods, they express a willingness to use technology to advance their training.
• Soldiers recognize the value of technology to augment their military capacities.


OBJECTIVE-BASED TRAINING: METHODS, TOOLS, AND TECHNOLOGIES

Dr. John J. Burns, Janice Giebenrath, and Paul Hession
Sonalysts, Inc.
12501 Research Parkway
Orlando, FL 32826
burns@sonalysts.com

INTRODUCTION

The United States Surface Navy has adopted a "train by objectives" approach as an essential element in achieving and maintaining readiness. This approach is backed by years of R&D in individual and team training. However, for this approach to be fully embraced at the deck-plate level, flexible and tailorable methods and tools are needed. In addition, the software and hardware associated with these methods and tools must be accessible and affordable.

The Coalition Readiness Management System (CReaMS) effort seeks to provide methods, tools, and technology to support objective-based training for warfighters within a coalition environment. In particular, the CReaMS effort has leveraged the Navy's investment in the Battle Force Tactical Training system and associated efforts. In this paper we first present an overview of objective-based training. Then we turn to a description of the first two phases of the CReaMS effort, with an emphasis on Phase II. We conclude with a short discussion of ongoing CReaMS Phase III efforts.

OBJECTIVE-BASED TRAINING

The US Navy adopted an objective-based approach to training as a result of analysis of the effectiveness of existing training approaches. This move was predicated on the fact that systems such as the Battle Force Tactical Training (BFTT) system provided extremely capable shipboard training systems. BFTT can in fact provide significant training opportunities for Sailors onboard ship, using their own tactical systems, training the way they will fight. However, it was recognized that in addition to the hardware and software manifest in systems such as BFTT, there was also a requirement for methods and tools to support Sailors in the effective use of these embedded training systems (Acton and Stevens, 2001).

What was needed was a way to exploit this capability and provide form and structure to the Navy's afloat training processes. The approach chosen would ultimately fulfill the Center for Naval Analysis' recommendation that ". . . an appropriate course is for the Navy training establishment to focus on ensuring the minimum acceptable training requirements are both defined and executed." The challenge was to provide a process that would:

• Assist shipboard training teams;
• Help quantify training readiness;
• Attain a synergy with existing training systems;
• Be sufficiently robust to support both existing and emerging training systems.



This new training process viewed the ship as a system capable of a finite number of tasks based on its systems. Each task was used to generate a list of subordinate actions or steps required to ensure the success of the higher-level task. These higher-level tasks could have a team focus, they could be an aggregation of individual watch-stander tasks, or they could be a combination of both. In all cases, team and individual tasks were defined by measurable standards of performance. These "standards" were criterion-based measures tied to absolute standards, using specific objective behavioral items. This follows the generally accepted premise that "The best information for evaluation of training can be acquired by using objective, criterion-referenced measures."

The Joint Mission Essential Task List (JMETL), Navy Mission Essential Task List (NMETL), and Required Operational Capability/Projected Operational Environment (ROC/POE) documents were integral to this effort. The tasks, sub-tasks, and measures generated had to be realistic, and they had to link to and support higher-level requirements.

Training Objective Construction

Training objectives were constructed using the same basic model shore-based schoolhouses and academia have used for years: a hierarchy of terminal and enabling objectives supported by measures of performance and effectiveness. As mentioned earlier, each MOE/MOP contained a standard, a data point observable and quantifiable in terms of time, distance, and/or quality. This approach was critical to ensure that trainer subjectivity was removed or lessened.

The final structure included the terminal objective (the "what" to be accomplished), the enabling objective (the "how" it would be accomplished), and the measure of performance or standard (the "how well" it would be accomplished). This terminology is derived from a task-based training paradigm. In a Personnel Performance Profile (PPP) based system, the terminal objective would be a level-one objective and the enabling objective a level-two objective.
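The hierarchy described above maps naturally onto a small data structure. The sketch below is illustrative only; the field names and the "lower observed value is better" comparison are assumptions, not drawn from BFTT, ATEAMS, or PPP documentation:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MeasureOfPerformance:
    """The 'how well': a criterion-based standard in time, distance, or quality."""
    description: str                  # e.g. "Report contact within 30 seconds"
    standard: float                   # the absolute criterion value
    units: str                        # "seconds", "yards", "percent", ...
    observed: Optional[float] = None  # filled in during the exercise

    def met(self) -> bool:
        # Assumes a lower observed value is better (e.g. elapsed time).
        return self.observed is not None and self.observed <= self.standard

@dataclass
class EnablingObjective:
    """The 'how': one step supporting a terminal objective."""
    description: str
    measures: List[MeasureOfPerformance] = field(default_factory=list)

@dataclass
class TerminalObjective:
    """The 'what': the top of the hierarchy for one team or individual task."""
    description: str
    enabling: List[EnablingObjective] = field(default_factory=list)

    def standards_met(self) -> float:
        """Fraction of all subordinate standards met."""
        mops = [m for eo in self.enabling for m in eo.measures]
        return sum(m.met() for m in mops) / len(mops) if mops else 0.0
```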

OBJECTIVE-BASED TRAINING IN A COALITION ENVIRONMENT: CREAMS

The Coalition Readiness Management System (CReaMS) is a congressionally mandated effort that evolved to address operational requirements with increased emphasis on combined or coalition operations. The US Congress funded the CReaMS project to encourage international collaboration in exploring areas of cooperation in training and readiness (Clark, Ryan, O'Neal, Brewer, Beasley, & Zalcman, 2001). CReaMS builds upon several past Office of Naval Research (ONR) sponsored research and development efforts in the areas of team performance and decision-making, performance measurement, objective-based team training, and various data collection tools (Cannon-Bowers & Salas, 1998; Brewer, Baldwin-King, Beasley, & O'Neal, 2001). While the CReaMS effort involves a plethora of government and industry participants whose expertise ranges from network topologies, to military training, to weapons and combat systems, early on a Learning Methodology (LM) team was stood up and tasked with infusing the effort with theory and methods that would lead to enhanced warfighter performance. This paper focuses on the work of the LM team.


CReaMS Phase I

CReaMS Phase I was a collaborative effort between the United States Navy (USN) and the Royal Australian Navy (RAN) in adapting, implementing, and evaluating the BFTT learning model in a coalition training environment. The BFTT learning model is presented below in Figure 1.

Figure 1: BFTT conceptual learning model.

The Phase I effort uncovered challenges for coalition training and measurement. The Afloat Training, Exercise, and Management System (ATEAMS) was used to apply and manage the team learning methodology during these exercises. ATEAMS contains an extensive database of training objectives for each ship in the Afloat Training Group. Based on the chosen training conditions, the user selects specific training objectives to be associated with various events throughout the training session. This carries with it the measures of performance associated with each objective (Hession, Burns, & Boulrice, 2000).

While this training management system has been successful, it presented challenges and limitations when applied to the CReaMS effort (e.g., management of such a large database has been shown to be cumbersome). In addition, the focus of ATEAMS is on intra-ship training, while the CReaMS effort involves inter-ship training. The results of the Phase I effort showed the ATEAMS tool was not the best choice for CReaMS performance measurement. Lessons learned were applied to the Phase II effort.

CReaMS Phase II

Using lessons learned from Phase I, CReaMS Phase II involved the development of a new approach to coalition performance measurement, from developing training objectives to data collection and analysis. Also included was the development of novel approaches for organizing and representing data.

Following the objective-based training model, training objectives to be measured during the exercises were identified and developed. In keeping with objective- and mission-based development, the focus remained on objective-based training oriented up to the ship level for the USN and utilized NMETL-derived measures. Building upon this effort, the RAN objectives were then aligned with the USN approach for performance measurement. Training objectives were developed for both procedural and process measurement. Procedural measurement involves the assessment of individual and team completion of training objectives, usually within a prescribed sequence. Process measurement examines team-level skills necessary for effective performance (e.g., effective communication).

The development of this new approach for collecting performance data also involved front-end data organization and analysis software, hand-held software for data collection, and methods for transferring data between each tool. The front-end data management tool, stored on a PC, housed the entire population of event scenarios, training objectives, and associated performance measures. Here, a data collection plan is built by linking desired training objectives with event scenarios. This data collection plan then transfers to the hand-held device for implementation. Once data has been collected, it is transferred back into the data management software for analysis. Output includes summary data regarding performance standards met for the training objectives (see Giebenrath, Burns, Hockensmith, Hession, Brewer, and McDonald, 2003, for a detailed description of the methods and tools developed and implemented during the CReaMS Phase II effort).
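In essence the plan is a mapping from scenario events to the objectives scored during each event. A minimal sketch of the round trip just described, with entirely hypothetical names (none of these identifiers come from the CReaMS or ATEAMS software):

```python
# PC-side plan: which training objectives are scored in which scenario event.
plan = {
    "EVT-01 air raid":      ["TO-3.1", "TO-3.2"],
    "EVT-02 surface track": ["TO-5.4"],
}

# Data returned from the hand-held device: (event, objective, standard met?).
collected = [
    ("EVT-01 air raid", "TO-3.1", True),
    ("EVT-01 air raid", "TO-3.2", False),
    ("EVT-02 surface track", "TO-5.4", True),
]

def summarize(collected):
    """Percent of standards met per event, the core of the output report."""
    totals = {}
    for event, _obj, met in collected:
        hit, n = totals.get(event, (0, 0))
        totals[event] = (hit + int(met), n + 1)
    return {e: 100.0 * hit / n for e, (hit, n) in totals.items()}

print(summarize(collected))  # {'EVT-01 air raid': 50.0, 'EVT-02 surface track': 100.0}
```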

Putting it all Together: CReaMS Phase III

In Phase II of the CReaMS effort, a new approach to the development and implementation of objective-based performance measurement was successfully employed in a coalition training environment. The use of task sequences provided an effective method for organizing, storing, and transferring procedurally oriented performance data. Although the actual data collection tools differed across sites (laptop computers with Excel-based tools were used in Australia, while the HHD was used for data collection with the USN participants), there was consensus among all CReaMS participants that the methodology worked well, both in terms of being "user-friendly" for data collection and in terms of providing performance data at the right level of detail for exercise participants.

Building on lessons learned and the successes of Phases I and II, a specific goal was set in CReaMS Phase III to conduct a distributed, participative, and facilitated debrief. Once again the primary participants in the "Virtual Coalition Readiness" (VCR) exercise were the USN (including the USS HOWARD (DDG-83) and Tactical Training Group Pacific) and the RAN (including RAN ship trainers at HMAS WATSON with watch standers from the ADELAIDE and the ARUNTA). The VCR exercise was designed to incorporate all of the elements of the CReaMS Phase I and Phase II events with an additional emphasis on the integration of procedural and process measurement and feedback. While a lengthy description of the CReaMS Phase III effort is beyond the scope of the current paper, a discussion of key constructs and how they were implemented will provide the reader with a big-picture understanding of this latest phase.

Integrating Process and Procedural Measurement

In CReaMS Phase II the LM team brought together procedural and process measurement through the integration of the previously discussed task sequences and the Team Dimensional Training (TDT) methodology (Smith-Jentsch, Zeisig, Acton, and McPherson, 1998). This effort addressed questions such as "What to measure?", "What level of detail to measure?", and "What resources are required?" It was determined that for coalition training, the focus of performance measurement needed to be on process and procedure within and between warfare areas (intra-ship level), between ships and between warfare areas (inter-ship level), and within the task unit level and between the task unit commander level and the ships in company.

The TDT process measurement methodology specifies four super-ordinate dimensions against which team performance is to be evaluated: Information Exchange; Communication; Supporting Behavior; and Initiative/Leadership. Each of these dimensions is further delineated by a set of sub-items (e.g., phraseology, brevity, clarity, and completeness for Communication) that provide behavioral anchors for raters to use in assessing performance. To this, the CReaMS effort added a fifth dimension, Critical Thinking, defined by the sub-items: maintaining awareness of tasks in progress; effective management of integrated tasks; appropriate allocation of tasks among team members; recognizing unexpected/emergency situations; and appropriately implementing pre-planned responses. In CReaMS Phase III, the LM team worked with the VCR scenario development team to create events within the scenario that would stress both specific team tasks (to be measured by the procedural measurement scheme) and team process (as measured by TDT).
Implementing a Participative and Facilitated Debrief<br />

Previous research has demonstrated that debriefing skills and engaging all members of a team are<br />

factors critical to realizing the benefits of objective-based training (Tannenbaum, Smith-Jentsch,<br />

and Behson, 1998, Smith-Jentsch, Zesig, Acton, and McPhereson, 1998). Thus, in CReaMS<br />

Phase III, Video Tele Conferencing (VTC) hook-ups were established at all participant sites and<br />

the LM team developed a web-based debriefing tool that integrated performance data (process<br />

and procedural) collected from all participant sites. Each day across the 4-day exercise, data<br />

collectors at each site collected targeted (by event) performance data, met with their local data<br />

collection teams to integrate assessment data, and then provided inputs to LM team members at<br />

HMAS WATSON. Using the Distributed Debrief Product tool, LM team members at HMAS<br />

WATSON integrated procedural and process data across multiple levels of performance and<br />

within an hour of the end of the exercise, a comprehensive debrief product was available for<br />

review by local facilitators. After their review, the facilitators then engaged the VCR<br />

participants in a Task Unit debrief with the Distributed Debrief Product providing structure for<br />

facilitators to provide context, specific performance feedback, replay (from BFTT Debrief<br />

Product sets), and, most importantly, input from exercise participants. Figure 2 provides a<br />

representative sample screen from the debrief tool—a typical debrief included hundreds of such<br />

screen allowing for multiple levels of debrief across multiple warfare areas.<br />



Figure 2: Sample Distributed Debrief Product screen shot.

CONCLUSION

As events in the new millennium underscore, global forces are coming together that demand the formation of an effective nation-to-nation coalition training capability. The CReaMS project provides a case study in how coalition partners can work together, leveraging the technologies, methods, and tools that individual coalition partners bring to the endeavor, in order to create a training environment in which the whole is greater than the sum of the parts.

In particular, the CReaMS project has seen the successful adaptation of the BFTT system within a virtual training environment. In addition, multiple navies worked together to develop methods for data collection, data reduction, and debrief that enabled successful coalition training in a virtual environment. While quantitative analyses now in the planning stage will allow for an empirical evaluation of the value of the CReaMS effort, the value to the warfighter is best evidenced in their words. As the Sea Combat Commander, CAPT Bill Hoker, USN, COMDESRON SEVEN, debriefed his multi-national warfare commanders in a facilitated after action review in one of the eight intense Strike Force synthetic exercises, he stated, "I am amazed at the level of intensity, interoperability and training value exhibited between the US Navy and the Royal Australian Navy during this coalition event."


REFERENCES

Acton, B., & Stevens, B.J. (2001). Objective-based training and the Battle Force Tactical Training system: Focusing our fleet training processes. 23rd Interservice/Industry Training, Simulation, and Education Conference Proceedings.

Brewer, J., Baldwin-King, V., Beasley, D., & O'Neal, M. (2001). Team learning model: A critical enabler for development of effective and efficient learning environments. 23rd Interservice/Industry Training, Simulation, and Education Conference Proceedings.

Cannon-Bowers, J.A., & Salas, E. (1998). Making decisions under stress: Implications for individual and team training. Washington, D.C.: American Psychological Association.

Clark, P., Ryan, P., O'Neal, M., Brewer, J., Beasley, D., & Zalcman, L. (2001). Building towards coalition warfighter training. 23rd Interservice/Industry Training, Simulation, and Education Conference Proceedings.

Giebenrath, J., Burns, J., Hockensmith, T., Hession, P., Brewer, J., & McDonald, D. (2003). Extending the team learning methodology to coalition training. Paper accepted for presentation at the 25th Annual Interservice/Industry Training, Simulation and Education Conference.

Smith-Jentsch, K.A., Zeisig, R.L., Acton, B., & McPherson, J.A. (1998). Team dimensional training: A strategy for guided team self-correction. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 271-297). Washington, D.C.: American Psychological Association.

Tannenbaum, S.I., Smith-Jentsch, K.A., & Behson, S.J. (1998). Training team leaders to facilitate team learning and performance. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 247-270). Washington, D.C.: American Psychological Association.



RACE AND GENDER AS FACTORS IN FLIGHT TRAINING SUCCESS

Dr. Wade R. Helm and Jonathan D. Reid
Embry-Riddle Aeronautical University
P.O. Box 33360
NAS Pensacola, FL 32508
Pensacola.center@erau.edu

ABSTRACT

Success in flight training, in terms of attrition and performance scores, has differed between male Caucasians and other groups both before and after implementation of affirmative action programs. It has been suggested that institutional or instructor bias may account for the lower success rates of candidates who are not male Caucasians. Two studies were conducted to examine various prediction variables related to flight training success. It was hypothesized that both minority status and gender would be significant variables in a multiple-regression equation predicting flight training success. Results indicate that for both pilot and Naval Flight Officer candidates, minority status and gender were significant predictor variables. However, when selection test scores were normalized at the beginning of flight training and then compared to normalized completion scores, differences for all groups but one were non-significant. Only female pilots and Naval Flight Officers had lower normalized prediction scores than normalized completed flight training scores; in other words, the Aviation Selection Test Battery underpredicts female success in flight training. All other groups, when adjusted for Aviation Selection Test Battery prediction scores, performed as predicted.

INTRODUCTION

Roughly 12,000 individuals annually contact accession sources with an interest in Naval aviation. Through an initial screening, the number is reduced to 10,000 who take the Aviation Selection Test Battery (ASTB), a series of six tests used to select future Naval aviators (What, n.d.). Those scores are combined with flight physicals, physical fitness scores, and officership ratings to select aviation candidates. All Naval Aviator and Naval Flight Officer (NFO) candidates then attend Aviation Pre-Indoctrination (API) at Naval Air Station Pensacola. After taking courses in weather, engines, aerodynamics, navigation, and flight rules and regulations, they head to their respective training wings to start primary training. Pilot students attend primary training at Training Wing FIVE at Whiting Field in Milton, FL, and Training Wing FOUR at Corpus Christi NAS, TX. NFO students remain at Pensacola to start Joint Undergraduate Navigator Training with Training Wing SIX (CTW-6).

CTW-6 conducts primary, intermediate, and advanced training for NFOs. Primary training lasts 15 weeks and is conducted using the T-34C Mentor aircraft by Training Squadrons FOUR (VT-4) and TEN (VT-10). Primary Undergraduate Student Naval Flight Officer/Navigator Training is designed to provide officers in U.S. and international services the skills and knowledge required to safely aviate, navigate, communicate, and manage aircraft systems and aircraft in visual and instrument conditions (CNATRA Instruction 1542.54L, 2002).


After primary training, some students can select panel navigation training conducted at Randolph AFB, TX, for the P-3C Orion or E-6 Mercury. The remaining students continue in VT-4 and VT-10 to start intermediate training (CNATRA Instruction 1542.54L, 2002). After intermediate training, students receive assignments to the Airborne Early Warning pipeline (training at Norfolk, VA) or the Strike and Strike-Fighter pipeline (training at Training Squadron 86 in Pensacola) (CNATRA Instruction 1542.131B, 2001). After students finish the different training pipelines, they receive their NFO wings. It is a long process that not everyone successfully completes.

It is extremely expensive to put an aviator through the years of training required to prepare him or her for an operational aircraft. If students cannot complete the program, the service cannot recover the sunk costs, which vary from $500,000 to $1,000,000 depending on the stage of training at which the student failed (Fenrick, 2002). Historically, the attrition rate is 20-30% of students, with a majority of the losses occurring during the API and primary phases. The Navy, therefore, has spent much time and money developing the ASTB as an economical tool to predict a candidate's performance and likelihood of attrition. If the ASTB predicts positive performance, the student should theoretically succeed regardless of the student's race or gender. Past research, though, has detected a difference in performance based on an individual's race or gender.

STATEMENT OF THE PROBLEM

Since the ASTB has gone through an extensive process to remove racial, ethnic, and gender bias, it would follow that the performance of aviator candidates would not vary among racial, ethnic, or gender groups. It has been observed, however, that a large proportion of minority students fail to complete primary training, and few finish at the top of their class. These two studies explored the perceived difference in performance between minority and female candidates and male Caucasians in aviator primary flight training.

STATEMENT OF HYPOTHESIS

The research hypothesis states that minority and female aviator candidates achieve different primary Naval Standardized Scores (NSS) than male Caucasian candidates. The null hypothesis states that there will be no significant difference in NSS scores as determined through multiple regression analysis at the 95% confidence level.

METHODOLOGY

The names and social security numbers of the subjects who completed API were obtained from the Naval Schools Command (NASC) database. ASTB scores and race codes were obtained from the CTW-6 TMS 2 database. This database was also used to obtain the subjects' primary training phase academic, flight, and simulator grades, represented by their grade point averages. A regression equation was computed using ASTB academic and flight scores. These grades were converted into an NSS and combined into an overall primary NSS. The subjects were then divided into majority and minority groups, and this classification was added as a variable in a multiple-regression equation. The process was repeated using minority status and ASTB academic and flight scores as predictor variables and primary NSS as the criterion variable.

Race information on each student came from questionnaires completed when the candidates began the aviation pipeline. Based upon student answers, a three-digit alphanumeric code containing sex, race, and ethnic classifications was entered in the database. Candidates for this study were divided according to their race codes: white or minority.

The performance databases contain grade point information on each subject. The grade point represents a student's performance in flight or simulator events. An academic average is also stored in the system. This raw information is used to compute a Naval Standardized Score. An NSS of 50 is assigned to the mean, and a score above or below 50 correspondingly describes performance above or below the mean for a particular area, such as academic, simulator, or flight grades. Grades are awarded according to criteria specified in CNATRA Instruction 1500.4F.

The data were compiled in a Microsoft Excel spreadsheet. Males were assigned a value of 0 and females a value of 1. Students identified as non-minority were given a 0; all minorities received a 1. The population's academic, flight, and simulator means were then computed. The means and standard deviations were used to compute academic, flight, and simulator NSS using the following formula: NSS = ((x - mean) / std dev) * 10 + 50. An overall NSS was finally computed by weighting the individual NSS. Using the overall NSS as the criterion variable, the study first determined, via multiple regression in Microsoft Excel, how well ASTB scores predict primary performance. AQR scores were entered as the first variable, and FAR scores were the second variable. The process was repeated adding minority status as a third variable to determine whether minority status predicts beyond ASTB scores. Finally, multiple regression was done adding sex. For this study, minority status was set to 1 and non-minority status to 0, and males were coded 0 and females 1.
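The standardization and the incremental regressions are easy to reproduce; below is a minimal sketch using synthetic data (the study itself used Microsoft Excel, and all names and values here are illustrative):

```python
import numpy as np

def nss(x):
    """Naval Standardized Score: the mean maps to 50, each SD to 10 points."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std() * 10.0 + 50.0

def fit_r(y, *predictors):
    """Least-squares fit with an intercept; returns coefficients and multiple R."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = np.corrcoef(y, X @ beta)[0, 1]
    return beta, r

# Synthetic stand-ins for AQR, FAR, minority (0/1), sex (0/1), overall NSS.
rng = np.random.default_rng(0)
aqr, far = rng.normal(5, 1.5, 200), rng.normal(5, 1.5, 200)
minority, sex = rng.integers(0, 2, 200), rng.integers(0, 2, 200)
overall = nss(0.8 * aqr + 1.5 * far + rng.normal(0, 2, 200))

_, r2 = fit_r(overall, aqr, far)                  # ASTB scores only
_, r3 = fit_r(overall, aqr, far, minority)        # add minority status
_, r4 = fit_r(overall, aqr, far, minority, sex)   # add sex
print(r2, r3, r4)   # on training data, R can only rise as predictors are added
```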

RESULTS

Primary-phase simulator, academic, and flight grades were converted to an NSS and then combined into an overall NSS. For comparison, NSS scores were also calculated for AQR and FAR. Tables 1 and 2 summarize the NSS scores for the groups in these two studies.

Table 1
NSS Summary for Pilot Candidates

Group            Data                                 NSS
Male Caucasians  Average of AQRnss                    50.90
                 Average of PFARnss                   51.00
                 Average of Overall Primary Training  50.50
Minority         Average of AQRnss                    43.33
                 Average of PFARnss                   43.96
                 Average of Overall Primary Training  45.60
Female           Average of AQRnss                    43.50
                 Average of PFARnss                   41.80
                 Average of Overall Primary Training  46.40


Table 2
NSS Summary for NFO Candidates

Group            Data                                 NSS
Male Caucasians  Average of AQRnss                    50.74
                 Average of FOFARnss                  50.37
                 Average of Overall Primary Training  50.53
Minority         Average of AQRnss                    47.78
                 Average of FOFARnss                  48.10
                 Average of Overall Primary Training  47.25
Female           Average of AQRnss                    46.44
                 Average of FOFARnss                  47.55
                 Average of Overall Primary Training  51.06

Due to limited sample size, regression analysis was not conducted on the pilot sample. For the NFO sample, a multiple regression using AQR and FOFAR raw scores yielded the equation y = 38.120 + 0.81666x1 + 1.5348x2, with an R of .3451. With 737 degrees of freedom, the correlation coefficient was significant at α = .05. Adding race yielded the equation y = 38.875 + 0.71272x1 + 1.5700x2 - 2.5652x3, with an R of .3632. Significance was again achieved, and the addition of race improved the equation. The final regression, using sex as the fourth predictor, yielded the equation y = 38.26 + 0.87x1 + 1.48x2 - 2.61x3 + 2.35x4. Sex added predictive value, though to a lesser extent than minority status. The strength of the relationships among the variables considered also supports this. Table 3 summarizes these relationships.

Table 3
Correlation Matrix

              Race    Sex    AQR   FOFAR  Acad NSS  A/C NSS  Sim NSS  Overall NSS
Race          1.00
Sex           0.04   1.00
AQR          -0.11  -0.15   1.00
FOFAR        -0.08  -0.09   0.87   1.00
Acad NSS     -0.13   0.01   0.35   0.33     1.00
A/C NSS      -0.12   0.04   0.23   0.25     0.40     1.00
Sim NSS      -0.12   0.05   0.31   0.32     0.55     0.45     1.00
Overall NSS  -0.15   0.05   0.33   0.34     0.61     0.90     0.78     1.00
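A minimal sketch of this hierarchical comparison of R values (Python rather than Excel; the variables aqr, fofar, race, sex, and overall_nss are hypothetical names standing in for columns of the study database):

```python
import numpy as np

def multiple_r(predictors, y):
    """Least-squares fit of y on the predictors plus an intercept; returns multiple R."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return np.sqrt(1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean())))

# R is compared as predictors are added, mirroring the steps reported above:
# multiple_r([aqr, fofar], overall_nss)             # reported R = .3451
# multiple_r([aqr, fofar, race], overall_nss)       # reported R = .3632
# multiple_r([aqr, fofar, race, sex], overall_nss)
```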



DISCUSSION

These studies found that minority status added predictive value for pilot and NFO primary flight training; correspondingly, the null hypothesis was rejected. As expected, the regression equation using AQR and FOFAR scores was statistically significant. The ASTB was designed to help select those most likely to succeed in flight training. For the groups in this study, the Navy used minimum scores of 3 on the AQR and 3 on the FAR to increase the numbers entering the program.

When minority status was added to the regression equation, the R value increased, supporting the claim that such status adds predictive value to a selection equation. Even though the ASTB underwent analysis to ensure the absence of racial bias, race was related to performance: minorities typically had lower NSS values than non-minorities. When AQR and FOFAR scores were converted to NSS, minorities likewise had a lower average NSS than non-minorities. Using ASTB NSS as the beginning NSS and primary training overall NSS as the ending NSS, the only significant difference was noted among females. The ASTB underestimated the performance of females in primary training. For males, both non-minority and minority, the ASTB was accurate. In all phases of primary training, females, on average, performed better than the test projected.

For females, ASTB scores underestimated performance. This disparity could be blamed on generous grading by instructors; when academic performance is examined, however, the same underestimation appears. Academic scores act as a control, since those grades are assigned objectively. A gender bias in the ASTB could possibly explain the phenomenon. The difference may also have been exaggerated by the relatively short time period covered in this study.

RECOMMENDATIONS

The Navy should investigate whether gender bias exists in the ASTB. Hundreds of capable female candidates may have been turned away because the test incorrectly anticipated their performance.

This study excluded Marine Corps and Air Force students who attended primary training. Future studies could evaluate whether similar performance disparities exist among minorities in those services.

REFERENCES

Baisden, A. G. (1980). A comparison of college background, pipeline assignment, and performance in aviation training for black student naval flight officers and white student naval flight officers (NAMRL-SR-80-2). NAS Pensacola, FL: Naval Aerospace Medical Research Laboratory.

CNATRA Instruction 1542.54L (2002). Primary student naval flight officer/navigator training curriculum. NAS Corpus Christi, TX: Naval Air Training Command.


CNATRA Instruction 1542.131B (2001). Intermediate naval flight officer (NFO)/Air Force navigator (AF NAV) training curriculum. NAS Corpus Christi, TX: Naval Air Training Command.

Fenrick, R. (2002). The effect of Navy and Air Force commissioning sources on performance in naval flight officer/Air Force navigator intermediate training. Graduate Research Project presented to Embry-Riddle Aeronautical University, NAS Pensacola, FL.

Hopson, J. A., Griffin, G. R., Lane, N. E., & Ambler, R. K. Development and evaluation of a naval flight officer scoring key for the naval aviation biographical inventory (NAMRL-1256). NAS Pensacola, FL: Naval Aerospace Medical Research Laboratory.

Miller, S. A. (1994). Perceptions of racial and gender bias in naval aviation flight training. Master's Thesis submitted to Naval Postgraduate School, Monterey, CA.

What is the aviation selection test battery (ASTB)? (n.d.). Retrieved November 3, 2002, from http://navyrotc.mit.edu/www/aviation/astb.htm



Validation of an Unmanned Aerial Vehicle Operator Selection System

LT Henry L. Phillips
LT Richard D. Arnold
Naval Aerospace Medical Institute

LT Philip Fatolitis
Naval Aerospace Medical Research Laboratory

Abstract

The purpose of this study was to validate selection performance standards for the screening of candidates for entrance into the US Navy and Marine Corps Unmanned Aerial Vehicle (UAV) Pioneer Pilot training program. A minimum Pioneer crew consists of an external pilot (EP), internal pilot (IP), and a mission commander/payload specialist (MC). The EP is responsible for take-offs, landings, and control of the vehicle when it is within visual range. The IP is responsible for control of the aircraft when it is beyond visual range. The MC is responsible for planning and execution of the mission, operation of the payload, and for information gathering during the mission. In the development and initial validation phases of this system, a task analysis was completed in training and fleet squadrons to identify both tasks that are critical for safe flight and skills required to perform piloting tasks. Specific computer-based psychomotor tests were chosen as predictor variables based on the task analysis and initial validation. In the present study, subjects consisted of 39 students: 5 IPs and 34 Ground Control Station Operators (who received combined IP and MC training) for whom both psychomotor test battery scores and training outcome data were available. A single, four-component, unit-weighted, composite scoring algorithm was generated to indicate performance on the computerized test battery. This composite score was found to be a significant predictor of final average in primary UAV training (r = .59, p < .01).



reduced training costs: the Navy's Aviation Selection Test Battery (ASTB) yields annual estimated savings of $38.1 million by improving the quality of training accessions, reducing the flight hours needed to meet winging requirements, and lowering the number of trainees who flunk out or quit (NAMI, 2001). A similarly effective selection procedure for UAV operators could yield substantial savings to the Naval services by ensuring that the individuals most likely to succeed in training are selected.

The present study represents an initial step toward that goal and is a follow-up to a preliminary validation study conducted by Biggerstaff et al. (1998). It evaluates the validity of the Naval Aerospace Medical Research Laboratory (NAMRL) selection system for internal pilot (IP) operators of the Pioneer UAV. Biggerstaff et al. also conducted a task analysis to inform subtest selection as well as physical requirements for different positions within the Pioneer crew.

More extensive descriptions of the Pioneer, its capabilities, and its crew requirements are provided in Biggerstaff et al. (1998), but some basic facts are presented here. The Pioneer is a relatively small UAV (wingspan 16.9 ft., length 14.0 ft., height 3.3 ft.) used for real-time surveillance. It has a service ceiling of 12,000 ft., a maximum range of 185 km, and a cruising speed of 65 kts. The Pioneer may be launched from a ship or an airfield, and requires only 21 m for pneumatic launches and 70 m for recovery. It was used successfully in both the first Gulf War and Operation Enduring Freedom. Each Pioneer costs over $800,000.

A minimum Pioneer crew consists of an external pilot (EP), internal pilot (IP), and a mission commander/payload specialist (MC). The EP is responsible for take-offs, landings, and control of the vehicle when it is within visual range. The IP is responsible for control of the aircraft when it is beyond visual range. The MC is responsible for planning and execution of the mission, operation of the payload, and for information gathering during the mission. An additional training curriculum offered to Pioneer trainees is Ground Control Station Operator (GCSO), which combines IP and MC training.

Method

Participants

Participants were 39 students trained at OLF Choctaw between 1995 and 1997 for whom both selection test battery and training outcome data were available. Of the 39 students, 5 were IPs and 34 were GCSOs who received combined IP and MC training. While race and ethnicity data were unavailable, 2 of the 39 were female and 2 were left-handed. Participant ages ranged from 18 to 30, with a mean age of 22.08 years (SD = 3.26). This sample did not include the 14 individuals described in the Biggerstaff et al. (1998) preliminary validation.

Procedures

Participants were administered a battery of 5 different types of tasks in combinations over 11 distinct subtests, listed below (more information is provided in Table 1). The entire battery took approximately two hours to administer. Computer requirements included a PC with a processing speed of at least 25 MHz. Peripherals attached to the machine included a monitor, two joysticks, rudder pedals, a numeric keypad, and headphones.

Measures

Tasks

Psychomotor tasks. Three psychomotor tasks, titled stick, rudder, and throttle, were administered in additive combination with one another.

Stick. The stick test required subjects to use a joystick to keep a crosshair cursor ('+') centered at the intersection of a row and a column of dots. The stick could be moved in any direction, and stick movement produced cursor movement in the opposite direction. The cursor movement algorithm introduced a constant rate of drift to the upper right. The 3-minute test was preceded by 3 minutes of practice.

Rudder. The rudder task required subjects to operate a set of rudder pedals with their feet to keep a crosshair cursor centered along a horizontal axis of dots. This axis and cursor appeared on the screen simultaneously with the stick cursor and dot axes, beneath the latter. Depressing the left pedal caused the cursor to move to the right, and depressing the right pedal caused cursor motion to the left. This test was also preceded by a 3-minute practice session.

Throttle. The throttle task added a second joystick, operated with the left hand, to the controls used for the stick and rudder tasks. The goal of the throttle task was to keep a third cursor centered along a vertical axis of dots, to the left of the stick cursor and dot axes. The throttle joystick moved in two directions only, toward the examinee and toward the screen. Movement of the throttle produced movement of the vertical cursor in the opposite direction. This task also was preceded by a 3-minute practice session.

Figure 1 from Biggerstaff et al. (1998) displays the apparatus used for all three psychomotor tasks. Each task was scored as the total number of pixels the respective cursor was off-center, sampled at random intervals throughout the test. The stick task was administered alone over 3 minutes and in conjunction with rudder twice, for 3 minutes each trial. Throttle was administered only in conjunction with stick and rudder, once for 4 minutes. The stick and rudder tasks were also administered in conjunction with the dichotic listening task over 4 minutes.
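A minimal sketch of this pixel-offset scoring rule (the sampling count, distance metric, and names are illustrative assumptions, not the laboratory's code):

```python
import math
import random

def tracking_score(cursor_offsets, n_samples=30, seed=0):
    """Sum of the cursor's pixel distances from center at randomly sampled instants.

    cursor_offsets: per-frame (dx, dy) offsets of the cursor from the center
    point, in pixels. Lower totals indicate better tracking.
    """
    rng = random.Random(seed)
    sampled = rng.sample(range(len(cursor_offsets)), n_samples)
    return sum(math.hypot(*cursor_offsets[i]) for i in sampled)

# Illustrative usage with made-up offsets for a 60-frame recording.
offsets = [(random.uniform(-20, 20), random.uniform(-20, 20)) for _ in range(60)]
print(tracking_score(offsets))
```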

Dichotic listening task. This task required subjects to focus on one stream of audio information in the presence of competing audio stimuli. Over 12 trials in 4 minutes, subjects were presented with simultaneous but different strings of alphanumeric characters in each ear. A cue indicated which ear participants were to attend to during a given trial. Subjects were instructed to key in the numbers they heard in the designated ear using a numeric keypad, but to ignore the letters. The task was preceded by 4 practice sessions over 3 minutes.

Horizontal tracking. This task required subjects to keep a square cursor centered on a horizontal axis using a joystick, as depicted in Figure 2 from Biggerstaff et al. (1998). The cursor algorithm made the cursor accelerate as its distance from center increased, forcing participants to 'balance' the cursor over the center point through small corrective adjustments. The direction of joystick input matched the direction of cursor movement for this task. The horizontal tracking task was administered in 7 sessions over 15 minutes.
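A toy sketch of the unstable dynamics described above (the gain constants and time step are invented for illustration): because acceleration grows with distance from center, the cursor behaves like an inverted pendulum that must be continually balanced.

```python
def step(pos, vel, stick_input, k=2.0, gain=40.0, dt=0.02):
    """Advance the cursor one frame; acceleration grows with distance from center."""
    acc = k * pos + gain * stick_input  # uncorrected drift accelerates away from 0
    vel += acc * dt
    pos += vel * dt
    return pos, vel

# With no corrective input, a small offset grows rapidly over a few seconds.
pos, vel = 1.0, 0.0
for _ in range(150):  # roughly 3 seconds at dt = 0.02
    pos, vel = step(pos, vel, stick_input=0.0)
print(round(pos, 1))  # far from center, illustrating the need to 'balance'
```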

Digit cancellation task. This task required subjects to enter randomly generated numbers ranging from 1 to 4 on a keypad with the left hand as the numbers appeared on the screen. It was administered alone for 2 minutes and in conjunction with the horizontal tracking task for 8 minutes.

Manikin task. This test assessed the ability to perform mental rotations and reversals. It consisted of drawings of a human figure holding a square in one hand and a circle in the other. The figure was oriented in four ways over 48 trials: facing forward or backward, and upside-down or upright. The task was to determine whether the square was held in the right or left hand on a given trial. This task was not timed and was not administered in conjunction with any other tasks.

Score components

Due to the relatively small sample size, a priori unit-weighted combinations of task scores from various subtests were used to generate 4 broad component scores: psychomotor ability, multitasking calculation, multitasking psychomotor, and visuospatial ability. Specific details of score component calculation are provided in Table 1. The mean of the four components is reported as an index score.
reported as an index score.<br />

Psychomotor ability. This component assessed eye-hand coordination. It did not assess multitasking or divided attention.

Multitasking calculation. This component assessed the ability to process audio and visual numeric inputs under conditions of divided attention. Distracting activities performed simultaneously with the calculations included dichotic listening and digit cancellation.

Multitasking psychomotor. This component assessed psychomotor performance under conditions of divided attention introduced by competing psychomotor activities as well as dichotic listening.

Visuospatial ability. This component captured the ability to perform mental rotations and reversals. It was based solely on trials of the manikin task and did not assess multitasking or divided attention.
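A minimal sketch of the unit-weighted compositing described above (the standardization and averaging follow the note to Table 2; the sign convention for error-based task scores is an assumption here, since lower tracking error means better performance):

```python
import numpy as np

def component(task_scores):
    """Unit-weighted component: the mean of the contributing standardized task scores.

    task_scores: list of 1-D numpy arrays, one per contributing task, already
    signed so that higher values mean better performance (an assumption).
    """
    z = [(s - s.mean()) / s.std() for s in task_scores]
    return np.mean(z, axis=0)

# Per the note to Table 2, the index score is the mean of the four components:
# index = np.mean([psychomotor, mt_calc, mt_psych, visuospatial], axis=0)
```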

Criterion variables

Two outcome variables were predicted in this study: final average across test performance and flight evaluations in the UAV operator training curriculum, scored on a continuous scale, and post-primary-phase attrition from training, scored dichotomously.

Results

Variable minimums, maximums, means, and standard deviations are presented in Table 2 (Index M = -.01, SD = .77; Visuospatial M = -.03, SD = .99; Multitasking psychomotor M = -.01, SD = 1.10; Multitasking calculation M = .04, SD = .81; Psychomotor M = -.02, SD = .90; Training performance M = 93.67, SD = 3.18). The same table also displays correlations among all variables. All correlations were significant at p < .01, including correlations of score components with training performance (Index r = .59; Visuospatial r = .54; Multitasking psychomotor r = .51; Multitasking calculation r = .42; Psychomotor r = .43).

Differences between students attriting from training and those completing training were significant for all score components (Visuospatial t(12.4) = 2.38, p < .05; Multitasking psychomotor t(37) = 2.47, p < .05; Multitasking calculation t(37) = 2.01, p < .05; Psychomotor t(37) = 3.06, p < .01), including the index score (t(37) = 2.91, p < .01), but not for training performance (t(37) = 1.15, ns) (see Table 3).
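A minimal sketch of these group comparisons (the arrays are illustrative stand-ins, not the study data): the pooled-variance t-test corresponds to the df = 37 results, while Welch's test, which the authors apparently used for the visuospatial comparison, yields the fractional df = 12.4 under unequal group variances.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
attrite = rng.normal(-0.8, 0.7, 6)    # stand-in scores for 6 attriting students
complete = rng.normal(0.15, 0.8, 33)  # stand-in scores for 33 completers

print(stats.ttest_ind(complete, attrite))                  # pooled variance (df = 37)
print(stats.ttest_ind(complete, attrite, equal_var=False)) # Welch's t, fractional df
```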

Discussion

Results were impressive. The index score and all score components correlated strongly with training performance and reliably differentiated between attriting and non-attriting students. Additionally, because the unit-weighted score component computations were determined a priori based on content validity alone (Cascio, 1991), it was not necessary to cross-validate the results using a separate sample for purposes of the present study.

Performance on this test battery appears to be an excellent predictor of both training performance and attrition. Adoption of this or a similar selection procedure incorporating reasonable minimum performance standards should serve to both improve mean trainee performance and reduce training attrition, likely resulting in substantial savings to the Marine Corps (NAMI, 2001).

Subsequent exploratory work on this and future samples will investigate the joint roles of accuracy and reaction time in the prediction of UAV training performance and attrition. Due to the relatively small sample size available for this study, the number of a priori relationships between predictor combinations and criterion variables tested was kept small to avoid inflation of alpha error. Even so, results were extremely promising.



References

Biggerstaff, S., Blower, D. J., Portman, C. A., & Chapman, A. D. (1998). The development and initial validation of the unmanned aerial vehicle (UAV) external pilot selection system (NAMRL-1398). Pensacola, FL: Naval Aerospace Medical Research Laboratory, Selection Division.

Cascio, W. F. (1991). Applied psychology in personnel management (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.

Dolgin, D., Kay, G., Wasel, B., Langlier, M., & Hoffman, C. (2002). Identification of the cognitive, psychomotor, and psychosocial skill demands of uninhabited combat aerial vehicle (UCAV) operators. Survival and Flight Equipment Journal, 30, 219-225.

Helton, K. T., Nontasak, T., & Dolgin, D. L. (1992). Landing craft air cushion crew selection system manual (Tech. Rep. No. 92-4). Pensacola, FL: Naval Aerospace Medical Research Laboratory, Selection Division.

Hilton, T. F., & Dolgin, D. L. (1991). Pilot selection in the military of the free world. In R. Gal & A. D. Mangelsdorff (Eds.), Handbook of Military Psychology (pp. 81-101). Sussex, England: John Wiley and Sons.

McHenry, J., Hough, L. M., Toquam, J. L., Hanson, M. A., & Ashworth, S. (1990). Project A validity results: The relationship between predictor and criterion domains. Personnel Psychology, 43, 335-354.

NASA mission demonstrates practical use of UAV technology (2002, October 17). Online at http://www.uavforum.com/library/news.htm; visited March 25, 2003.

Street, D. R., & Dolgin, D. L. (1994). Computer-based psychomotor tests in optimal training track assignment of student naval aviators (NAMRL-1391). Pensacola, FL: Naval Aerospace Medical Research Laboratory, Selection Division.

Yelland, B. (2001). UAV technology development: A node within a system. Flight International's UAV Australia Conference Proceedings, 8-9 February 2001, Melbourne, Australia.


Table 1.
Test Combinations and Score Component Derivation

Test Administration Order and Combinations
Test 1: stick
Test 2: DL
Test 3: DL & stick
Test 4: stick & rudder
Test 5: DL, stick, & rudder
Test 6: stick, rudder, & throttle
Test 7: not used
Test 8: HT (4 trials)
Test 9: DC (correct, RT, & RT SD tracked)
Test 10: HT & DC (3 joint trials)
Test 11: Manikin (correct, RT, & RT SD tracked; four trials)

Score Component Derivation
Psychomotor:
  Test 1: stick
  Test 8: 4 trials of HT
Multitasking-calculation:
  Test 3: DL
  Test 10: 3 trials of DC
Multitasking-psychomotor:
  Test 3: stick
  Test 4: stick & rudder
  Test 5: stick & rudder
  Test 6: stick, rudder, & throttle
  Test 10: HT
Visuospatial:
  Test 11: 4 Manikin trials

Note: DC: digit cancellation; DL: dichotic listening; HT: horizontal tracking; RT: reaction time; SD: standard deviation.



Table 2.
Correlations and descriptive statistics (N = 39).

Descriptives                     Min     Max    Mean    SD
1. Index Score                  -1.86    1.39   -.01    .77
2. Visuospatial                 -1.64    2.23   -.03    .99
3. Multitasking – psychomotor   -2.40    1.67   -.01   1.10
4. Multitasking – calculation   -1.73    1.72    .04    .81
5. Psychomotor                  -2.51    1.40   -.02    .90
6. Training Performance         87.71   99.09  93.67   3.18

Correlations                      1     2     3     4     5
1. Index Score
2. Visuospatial                  75
3. Multitasking – psychomotor    90    52
4. Multitasking – calculation    73    48    53
5. Psychomotor                   82    39    81    43
6. Training Performance          59    54    51    42    43

Note: Variables 2-5 computed as means of contributing standardized variables. Index score computed as the mean of variables 2-5. All correlations significant at p < .01. Decimals omitted.


Table 3.
Variable means by attrite status (N = 39)

Attrite (n = 6)                   Mean    SD     SE    95% CI
1. Index Score **                 -.77    .65    .26   -1.30 to -.24
2. Visuospatial ***               -.61    .55    .23   -1.06 to -.16
3. Multitasking – psychomotor *   -.97   1.02    .42   -1.81 to -.14
4. Multitasking – calculation *   -.55    .95    .39   -1.32 to .22
5. Psychomotor **                 -.96    .73    .30   -1.55 to -.36
6. Training performance          92.30   3.66   1.49   89.31 to 95.29

Complete (n = 33)                 Mean    SD     SE    95% CI
1. Index Score **                  .13    .71    .12   -.11 to .38
2. Visuospatial ***                .07   1.02    .18   -.28 to .43
3. Multitasking – psychomotor *    .16   1.03    .18   -.20 to .52
4. Multitasking – calculation *    .14    .75    .13   -.12 to .41
5. Psychomotor **                  .16    .83    .14   -.13 to .45
6. Training performance          93.92   3.08    .54   92.85 to 94.99

Note. ** Complete-attrite difference significant at p < .01; * at p < .05; *** significant at p < .05 assuming unequal group variances.



Figure 1.
Stick, rudder, and throttle task apparatus and display for psychomotor test (from Biggerstaff et al., 1998). [Figure omitted: diagram labeling the throttle, throttle cursor, stick, stick cursor, rudder cursor, and rudder pedals.]


Figure 2.
Horizontal tracking task controls and display from Biggerstaff et al. (1998). [Figure omitted: diagram labeling the stick, with the cursor at the zero position.]



Figure 3.
Scatterplot of index score against training performance by training completion status (N = 39). [Figure omitted: training performance (86 to 100) plotted against index score (-2.0 to 1.5), with points marked by attrite status (Complete vs. Attrite). Index Score – Training r = .59, p < .01.]



SUCCESS AT COLLABORATION
AS A FUNCTION OF KNOWLEDGE DEPTH*

Mark A. Sabol, Brooke B. Schaab, and J. Douglas Dressel
U. S. Army Research Institute for the Behavioral and Social Sciences, Alexandria, Virginia

Andrea L. Rittman
George Mason University, Fairfax, Virginia, Consortium Research Fellows Program**

Abstract

Pairs of college students played a computer game, SCUDHunt***, that requires collaboration. Separated but communicating players deployed information-gathering "assets" to locate 3 missile launchers on a 5x5 grid. Since each player controlled only half the assets (either the "ground" or "air" set), pairs had to coordinate asset deployment to maximize the value of the information collected. Before the game, each player in 13 pairs received Deep-but-Narrow (DbN) training, i.e., two identical sessions on the attributes (possible movements and reliability) of the assets controlled by that player; 13 other pairs received Broad-but-Shallow (BbS) training, a session on one's own assets followed by equivalent training on one's partner's assets. A quiz on training content followed each session. Pairs then played two 5-turn games, each turn requiring each player to guess the fixed launcher locations.

Results suggest that knowledge of one set of assets – those of the "ground controller" – was more crucial to game-playing success than knowledge of the other set – those of the "air controller." But knowledge of that more crucial set proved more complex and difficult to acquire. During the first game, players assigned the more crucial set needed DbN training to succeed. However, players given BbS training appeared to gain knowledge of their partners' assets while playing the first game, leading to improvement in later performance. Players given DbN training on the less crucial assets did poorly throughout. We interpret these preliminary results as addressing the question of whether training for collaborative tasks should include system-wide aspects or concentrate on a single role.

____________
* The views expressed in this paper are those of the authors and do not necessarily represent an official position of the U. S. Army or Department of Defense.
** Farrasha L. Jones made important contributions to the data collection phase of this research while a student at George Mason University and a participant in the Consortium Research Fellows Program.
*** Reference to and use of this product does not constitute endorsement by the U. S. government.



The application of new information technologies to the battlefield allows widely dispersed combatants to work together in new ways, altering the traditional conduct of warfare. Such network-centric operations have been described as "based upon a new model of command and control, one that features sharing information (synchronization in the information domain) and the collaborative processes to achieve a high degree of shared situational awareness" (Alberts, 2002, p. 60). Successful employment of these new technologies, which join personnel from diverse military jobs into interactive networks, relies on the Soldiers' tendency to engage in -- and their skill at accomplishing -- collaborative interaction. But Alberts cautions that, without appropriate training and practice, the network-centric environment might actually increase the fog of war rather than provide superior situational understanding. To ensure the latter result and avoid the former, the training side of the Army, in particular, needs to understand the dynamics of this new environment, where Soldiers interact with their peers and leaders electronically. The purpose of this paper is to describe the preliminary results of our research team's attempts to identify training issues that arise when unacquainted Soldiers must collaborate at a distance rather than face-to-face.

Research on collaboration. The Army defines collaboration as "people actively sharing information, knowledge, perceptions, or concepts when working together toward a common purpose." It is well established that the basis for collaboration is a shared understanding of the situation (Clark & Brennan, 1991). But this understanding is more than shared information or even what is sometimes called a Common Relevant Operating Picture (CROP). Establishing a CROP should be seen as the beginning, not the endpoint, of establishing situational awareness. As Hevel (2002) has said, each person's interpretation of the CROP depends on that individual's training, experience, and values.

To gain further insights into such issues involving collaboration and training, we first conducted observations and interviews of Army personnel in units that were in the process of incorporating digital systems (Schaab & Dressel, 2003). It soon became clear that classroom training on how to use digital systems is not enough. Even inexperienced Soldiers know that their digital jobs require an understanding of how the system they are learning to operate interacts with other systems. But they may need to experience multiple training exercises, incorporating numerous scenarios, in order to develop both a clear sense of how to collaborate with the people operating those other systems and an appreciation of how important such collaboration is in achieving and maintaining situational understanding. In one command center, we saw Soldiers actually place two different systems side by side and cross-train each other in order to promote face-to-face collaboration. They already grasped the need to understand the interrelationship between their roles. Such opportunities to foster mutual understanding become more difficult, of course, when members are dispersed.

Successful collaboration in distributed environments requires the same abilities as collaboration when co-located, but the means of training must differ when groups are distributed (Klein, Pliske, Wiggins, Thordsen, Green, Klinger, & Serfaty, 1999). Challenges with distributed groups include the loss of visual and verbal cues, added effort in working together, and difficulty in knowing when goals need to be adjusted. In short, good communication is an antecedent of effective team performance, and communication becomes more difficult when teams are dispersed.

Indeed, previous research has suggested that the most important aspects of collaboration may be these intertwined issues of communication and shared mental models of the combat situation (Schrage, 1990). Collaboration helps create shared meaning about a situation, and this shared meaning is important for effective decision-making performance. At the same time, some prior shared situational awareness is essential for effective communication, and communication is crucial in maintaining and refining that shared awareness. We designed our research program with these complexities in mind.

We selected the game SCUDHunt for our research on collaboration precisely because it provides a simplified model of this interplay of shared awareness and communication, while permitting independent manipulation of variables thought to affect them. SCUDHunt requires participants to (1) collaborate from distributed locations and (2) share unique information from their intelligence assets for optimal game performance. The goal of the game is simple: to locate three SCUD missile launchers on a map. To accomplish this, separated but communicating players use a computer mouse to deploy information-gathering "assets" across the map they share on their computer screens. Players get five "turns" during which they can gather and accumulate information regarding launcher locations. The game thus requires players to execute digital tasks in order to achieve a shared goal, while performing their different tasks in geographically separate locations.

Our research began with a cognitive task analysis of this game to identify critical points where collaboration would be beneficial.* These are points where players need to communicate planning strategies and to share gathered information in order to perform effectively. The general collaboration areas identified were:

Coordinating deployment: This is the discussion among players of where best to place their assets on the map grid, with the goals of (1) maximizing coverage of the area remaining to be searched, and (2) using certain assets to verify the results of earlier searches;

Interpreting results: This is the discussion among players of the reliability of reports from different intelligence-gathering assets, leading to a determination of the likelihood that a SCUD launcher is at any particular location. This involves interpretation of results from the current turn, as well as integration of findings from previous searches.

____________
* Ross, K. G. (September, 2003). Perspectives on Studying Collaboration in Distributed Networks. Contractor Report prepared for the U. S. Army Research Institute for the Behavioral and Social Sciences by Klein Associates Inc., Fairborn, Ohio, under Contract PO DASW01-02-P-0526.



Communication. Recent research has shown that communication content (Urban, Weaver, Bowers, & Rhodenizer, 1996), statement patterns (Kanki, Lozito, & Foushee, 1989; Achille, Schulze, & Schmidt-Nielsen, 1995), and statement sequences (Bowers et al., 1998) all influence team coordination and performance. To investigate these relationships, we included in our research design a manipulation of communication ease: all participating pairs wore headsets that allowed direct oral communication during one of the two games they played; during the other game, participants were permitted to communicate only by sending typed messages via an on-screen "chat" box. For a random half of the pairs, the "chat" game came first. The data we collected included measures of the types and frequency of communication between participants in the "chat" game, during which each transmitted message was recorded. The analysis of these data is not yet complete. Effects of that manipulation and results of that analysis will be presented in a later paper.

Knowledge. The notion of networked individuals who have shared goals but unique roles raises a new question for research on collaboration at a distance: To what extent does knowledge about a partner's role matter to an individual's performance effectiveness? Knowledge about others' roles, responsibilities, and job requirements has been termed "interpositional knowledge." One training strategy effective for increasing interpositional knowledge among team members is cross-training (Blickensderfer, Cannon-Bowers, & Salas, 1998).

Volpe, Cannon-Bowers, Salas, and Spector (1996) defined cross-training as a strategy where "each team member is trained on the tasks, duties, and responsibilities of his or her fellow team members" (p. 87). This involves having team members understand, and sometimes practice, each other's skills. The initial Volpe et al. (1996) research on cross-training, as well as an extension (Cannon-Bowers, Salas, & Blickensderfer, 1998), showed that those who received cross-training were better able to anticipate each other's needs, shared more information, and were more successful in task performance. Additional research has found that cross-training and a common understanding of roles contribute to shared mental model development, effective communication, and improved coordination (McCann, Baranski, Thompson, & Pigeau, 2000; Marks, Sabella, Burke, & Zaccaro, 2002).

Some researchers have even suggested that "implicit coordination" may be an important mechanism: cross-trained teams may be better able to coordinate without depending on overt communication (Blickensderfer et al., 1998). This phenomenon has been proposed as an intervening factor that explains the benefit cross-training imparts to a team task. Implicit coordination may only be possible given the reduced interpositional uncertainty regarding other team members' roles that comes with cross-training. However, distributed environments may limit the extent to which implicit coordination can operate.

To investigate such issues, we included cross-training versus intensive training in one role as the primary independent variable in the portion of our research reported here. Players received either a double dose of training on the tasks (deployment and interpretation of "their" assets) they would be expected to perform later, or a single dose of training on their own tasks and on those expected of their partners. The question being asked by the inclusion of this variable may be stated in terms of training depth and breadth. That is, those participants who receive a double exposure to training on their own set of assets are receiving deep but narrow training. They should be turned into fairly good "experts," with confidence in their understanding of their own role in finding the missing SCUD launchers, but with little or no understanding of the contribution that could be made by their partners. On the other hand, those participants who receive training on both their own assets and those of their partners may be said to receive broad but shallow training. They may have some understanding of how they and their partners can work together toward successfully finding the launchers, but they may not be very confident in their ability to apply any of the specific knowledge they were taught. We are asking: Which type of training, deep but narrow (DbN) or broad but shallow (BbS), leads to better performance?

Two other independent variables were automatically included in the experiment by the nature of the task. The first is position: whether a participant was assigned the "ground controller" position, in charge of the information-gathering assets that are "on the ground" (a single spy, a team of Navy Seals, a Joint Special Operations team, and a communications analysis unit), or the "air controller" position, in charge of the gathering assets that are "airborne" (a reconnaissance satellite, a manned aircraft, and an unoccupied aerial vehicle). The second is the sequence of turns in the two games all participants played, a variable that may be thought of as representing increasing on-the-job experience. Our experimental questions are, therefore, how the first two variables, training and position, affect game performance across the five turns of the two games, separately and in interaction.

Method

Participants. This experiment employed undergraduate students at a large university in Virginia. A total of 52 students received course credit for two hours of participation, for which they were scheduled as 26 pairs.

Instruments. In addition to an Informed Consent form and various questionnaires used in other aspects of this research, the following measurement instruments were used in the course of this experiment: 1) the asset quizzes on the knowledge the participants acquired during training prior to playing the game, and 2) the SCUDHunt game itself, described below.

The SCUDHunt game presents players with the mission of determining where – on a five-by-five board representing the map of a hostile country – the launchers for SCUD missiles are located. The players are told that there are three such launchers, each in a different fixed location on one square among the 25 squares on the board. On each of five turns, the players deploy intelligence-gathering assets (for example, a reconnaissance satellite or a team of Navy Seals), receive reports from those assets, and create a "strike plan" (to be sent to their fictional commander) indicating their best guess at that point as to the launcher locations. They are told that only the final strike plan – after the fifth turn – will actually be used by their commander to direct an attack on the launchers, and they are given the results of this final strike plan in terms of which bombed locations held now-destroyed launchers. This game is a representation of the kind of situation in which Soldiers would use digital systems to execute tasks requiring collaboration. The primary measures generated as the game is played are 1) the number of launchers that would have been hit by each strike plan submitted and 2) the degree to which the two players on a team chose to include the same grid squares in their independent strike plans. Only the first of these will be discussed here.
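A minimal sketch of that first measure (the grid coordinates and names here are illustrative, not taken from the game's implementation):

```python
# Three fixed launcher locations on the 5x5 grid, as (row, column) pairs.
LAUNCHERS = {(0, 3), (2, 2), (4, 1)}

def strike_plan_score(strike_plan):
    """Number of launchers that would be hit by a submitted strike plan."""
    return len(LAUNCHERS & set(strike_plan))

print(strike_plan_score([(0, 3), (1, 1), (2, 2)]))  # -> 2 of 3 launchers hit
```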

Design. The primary independent variable for this experiment (ALL versus OWN) involved training on the characteristics of the information-gathering assets used in the SCUDHunt game. All participants received, as their first training module, an explanation of the characteristics of the assets they would be controlling. Half of the pairs (the OWN condition) received a second exposure to the same asset training; the other half (the ALL condition) received training in which each participant learned the characteristics of the assets to be controlled by that participant's partner. A secondary independent variable is the position to which participants were randomly assigned, either "air" or "ground" controller. This position determined the particular set of information-gathering assets that were under the participant's control. The main dependent variables in this experiment are 1) each participant's performance on the asset-knowledge quizzes administered after each asset training module, and 2) success at playing the SCUDHunt game, as measured by the number of missile launcher locations correctly identified in a strike plan.

Procedure. Upon arrival at the laboratory, participants completed a preliminary questionnaire on their experience with computers and computer games. The experimenter then explained that the experiment would involve the participants playing such a computer game with a partner in another room. First, they would watch a training video giving an overview of how the game is played and explaining the concept of information-gathering "assets." They would then see a video providing details on their assets, after which they would be asked a few questions about what they had just learned.

Several computer-based training modules were then presented on 1) the overall aspects of playing the SCUDHunt game and 2) the characteristics of the information-gathering assets used in playing the game. Participants took paper-and-pencil quizzes on the material just presented following each training module. Immediately after this training, the pair played a one-turn practice game to ensure that the mechanics of playing the game were understood. After the experimenters answered any questions the participants might have, the pair played two complete five-turn games of SCUDHunt. During these games, data were automatically collected on 1) the messages participants sent to each other, 2) the degree to which grid squares chosen as targets in the "strike plans" (submitted at the end of each turn) were identical for the two members of the pair, and 3) the number of those chosen target squares that actually contained missile launchers.

Results and Discussion

The primary results are presented in Figures 1 and 2. Figure 1 depicts results for those participants in the "air controller" position; it presents, for them, the main measure of success at playing the SCUDHunt game – the number of SCUD launcher positions correctly identified – on each of the five turns of both games played. Figure 2 presents the same data for the participants in the "ground controller" position. The following pattern can be seen: regardless of position, during the second game, participants given the "all" training (cross-training) seem to be at least as successful as their counterparts given the "own" training, that is, double training on one set of assets. That advantage for cross-training is not evident in the first game. In fact, cross-trained individuals in the "ground controller" position seem to have been at a disadvantage during the first game.

[Figure omitted: mean number of launchers found (0 to 3) across the ten turns of the two games, plotted separately for the "Own" and "All" conditions.]

Figure 1. Success by "air controllers" at finding launcher positions, separately for 13 given cross-training ("All") and 13 given double training on one set of assets ("Own").

[Figure omitted: mean number of launchers found (0 to 3) across the ten turns of the two games, plotted separately for the "Own" and "All" conditions.]

Figure 2. "Ground controllers'" success at finding launcher positions, separately for 13 given cross-training ("All") and 13 given double training on one set of assets ("Own").


An analysis of variance was performed on the data in these two figures, taking into account the matching of the "air" and "ground" controllers in each pair. In order to simplify the analysis, only the data from turn 5 (of each of the 2 games) were entered into this analysis. This transforms the multi-level variable of "turns" in the figures into the binary variable of games, first versus second. These simplified data are depicted below (Figure 3).

[Figure omitted: mean number of launchers found on the 5th turn (0 to 3) for the four combinations of position (Air, Ground) and game (1st, 2nd), plotted separately for the "Own" and "All" conditions.]

Figure 3. Success at finding launcher positions on the fifth (last) turn of each game, separately for players in the "Air Controller" and "Ground Controller" positions and separately for those given cross-training ("All") and those given double training on one set of assets ("Own"). Each point represents 13 players.

None of the three main effects ("all" versus "own" training, "ground" versus "air" position, and first versus second game) was significant. However, a highly significant result was found for the two-way interaction between position and games (F(1,24) = 10.29, p < .01).



performed better on the second game. It may be that, while playing the first game, these broadly trained participants gained additional knowledge of their partners' role. Such learning could have been facilitated by their exposure to training on all assets, and it could have led to the improvement in later performance. However, players given DbN training on the less crucial assets of the "air controller" position seem to have been at an initial disadvantage, one that only increased in the second game.

The results presented here support the view that collaborative tasks benefit when the collaborating participants have a broader, system-wide view of the entire situation. At least in this experiment, training that concentrated on a single role was beneficial only if that role was complex enough to require considerable training depth. It should be emphasized, however, that these results are preliminary and this research is continuing. In particular, further iterations of this experiment will include a final measure of asset knowledge, in order to provide a direct test of the hypothesis that cross-training facilitates on-the-job learning.

References

Achille, L. B., Schulze, K. G., & Schmidt-Nielsen, A. (1995). An analysis of communication and the use of military terms in navy team training. Military Psychology, 7(2), 95-107.

Alberts, D. S. (2002). Information age transformation: Getting to a 21st century military. Washington, DC: DoD Command and Control Research Program.

Blickensderfer, E., Cannon-Bowers, J. A., & Salas, E. (1998). Cross-training and team performance. In J. A. Cannon-Bowers & E. Salas (Eds.), Decision making under stress: Implications for training and simulation (pp. 299-312). Washington, DC: American Psychological Association.

Bowers, C. A., Jentsch, F., Salas, E., & Braun, C. C. (1998). Analyzing communication sequences for team training needs assessment. Human Factors, 40(4), 672-679.

Brannick, M. T., Roach, R. M., & Salas, E. (1993). Understanding team performance: A multimethod study. Human Performance, 6(4), 287-308.

Cannon-Bowers, J. A., Salas, E., & Blickensderfer, E. (1998). The impact of cross-training and workload on team functioning: A replication and extension of initial findings. Human Factors, 40, 92-101.

Clark, H. H., & Brennan, S. E. (1991). Grounding in communication. In L. B. Resnick, J. M. Levine, & S. D. Teasley (Eds.), Perspectives on socially shared cognition. Washington, DC: American Psychological Association.

Hevel, J. R. (2002). The Objective Force Battle Staff? Monograph, School of Advanced Military Studies, U. S. Army Command and General Staff College, Fort Leavenworth, Kansas.

Kanki, B. G., Lozito, S., & Foushee, H. C. (1989). Communication indices of crew coordination. Aviation, Space, and Environmental Medicine, 60, 56-60.

Klein, G., Pliske, R., Wiggins, S., Thordsen, M., Green, S., Klinger, D., & Serfaty, D. (1999). A model of distributed team performance (SBIR N613399-98-C-0062). Orlando, FL: NAWCTSD.

Marks, M. A., Sabella, M. J., Burke, C. S., & Zaccaro, S. J. (2002). The impact of cross-training on team effectiveness. Journal of Applied Psychology, 87, 3-13.

McCann, C., Baranski, J. V., Thompson, M. M., & Pigeau, R. A. (2000). On the utility of experiential cross-training for team decision-making under time stress. Ergonomics, 43, 1095-1110.

Schaab, B. B., & Dressel, J. D. (2003). Training the troops: What today's Soldiers tell us about training for information age digital competency (Research Report 1805). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Schrage, M. (1990). Shared minds: The new technologies of collaboration. New York: Random House.

Volpe, C. E., Cannon-Bowers, J. A., Salas, E., & Spector, P. E. (1996). The impact of cross-training on team functioning: An empirical investigation. Human Factors, 38, 87-100.

149<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


150<br />

U.S. NAVY SAILOR RETENTION: A PROPOSED MODEL OF CONTINUATION BEHAVIOR²

Jessica B. Janega and Murrey G. Olmsted
Navy Personnel Research, Studies and Technology Department
jessica.janega@persnet.navy.mil

² The opinions expressed are those of the authors. They are not official and do not represent the views of the U.S. Department of the Navy.

Sailor turnover reduces the effectiveness of the Navy. Turnover has declined significantly since the late 1990s due to the implementation of a variety of retention programs, including selective re-enlistment bonuses, increased sea pay, changes to the Basic Allowance for Housing (BAH), and other incentives. In many cases the Navy now retains adequate numbers of Sailors; however, it faces the problem of retaining the best and brightest Sailors in active-duty service (Visser, 2001). Changes in employee values will require that organizations such as the Navy adapt their strategies to retain the most qualified personnel (Withers, 2001). Attention to quality-of-life issues is one way in which the military has addressed the changing needs of its members (Kerce, 1995). One of the most effective ways to assess quality of life in the workplace is to look at job satisfaction. Job satisfaction represents the culmination of the feelings a Sailor has toward the Navy. Job satisfaction, in combination with variables like organizational commitment, can be used to predict employee (i.e., Sailor) retention (for a general overview see George & Jones, 2002). The purpose of this paper is to explore the relationship of job satisfaction, organizational commitment, career intentions, and continuation behavior in the U.S. Navy.

Job Satisfaction

According to Locke (1976), job satisfaction is predicted by satisfaction with rewards, satisfaction with work, satisfaction with work context (or working conditions), and satisfaction with other agents. Elements directly related to job satisfaction include direct satisfaction with the job, action tendencies, career intentions, and organizational commitment (Locke, 1976). Olmsted and Farmer (2002) replicated a version of Locke's (1976) model of job satisfaction proposed by Staples and Higgins (1998) by applying it to a Navy sample. Staples and Higgins (1998) proposed that job satisfaction is both a factor predicted by other factors and an outcome in and of itself. Olmsted and Farmer (2002) applied the Staples and Higgins (1998) model directly to Navy data using the Navy-wide Personnel Survey 2000. That paper evaluated two parallel models, which provided equivalent results, indicating that a similar version of Locke's model could be successfully applied to Navy personnel.

Organizational Commitment

Organizational commitment involves feelings and beliefs about entire organizations (George & Jones, 2002). Typically, organizational commitment can be viewed as a combination of two to three components (Allen & Meyer, 1990). The affective (or attitudinal) component of organizational commitment involves positive emotional attachment to the organization, while continuance commitment is based on the potential losses associated with leaving the organization, and normative commitment involves a commitment to the organization based on a feeling of obligation (Allen & Meyer, 1990). Commonalities across the affective, normative, and continuance forms of commitment indicate that each component should affect an employee's intentions and final decision to continue as a member of the organization (Jaros, 1997). The accuracy of these proposed relationships has implications for turnover reduction because "turnover intentions is the strongest, most direct precursor of turnover behavior, and mediates the relationship between attitudes like job satisfaction and organizational commitment and turnover behavior" (Jaros, 1997, p. 321). This paper primarily addresses affective commitment, since it has a significantly stronger correlation with turnover intentions than either continuance or normative commitment (Jaros, 1997).

Career Intentions

Career intentions represent an individual's intended course of action with respect to continuation in their current employment. While a person's intentions are not always the same as their actual behavior, an important assumption is that these intentions represent the basic motivational force or direction of the individual's behavior (Jaros, 1997). In general, Jaros (1997) suggests that the combination of organizational commitment and career intentions appears to be a good approximation of what is likely to occur in future career behavioral decisions (i.e., to stay or leave the organization).

Purpose

This paper looks at job satisfaction, organizational commitment, career intentions, and continuation behavior using structural equation modeling. It was hypothesized that increased job satisfaction would be associated with increased organizational commitment, which in turn would be positively related to career intentions and increased continuation behavior (i.e., retention) in the Navy. A direct relationship was also hypothesized to exist between career intentions and continuation behavior.

METHODS

Participants

The sample used in this study was drawn from a larger Navy quality-of-work-life study using the Navy-wide Personnel Survey (NPS) from the year 2000. The NPS 2000 was mailed to a stratified random sample of 20,000 active-duty officers and enlisted Sailors in October 2000. A total of 6,111 usable surveys were returned to the Navy Personnel Research, Studies, & Technology (NPRST) department of Navy Personnel Command, a return rate of 33 percent. The current sample consists of a sub-sample of 700 Sailors who provided social security numbers for tracking purposes. Sailors whose employee records contained a loss code 12 months after the survey were flagged as having left the Navy (10.4%). Sailors who remained on active duty in the Navy (i.e., those who could be tracked by social security number and did not have a loss code in their records) were coded as still being present in the Navy (87.8%). Sailors whose status was not clear from their employment records (i.e., those who could not be tracked by social security number) were retained in the analysis with "unknown status" (1.8%).

Materials

The NPS 2000 primarily focuses on issues related to work life and career development for active-duty personnel in the U.S. Navy. The survey contains 99 questions, many of which include sub-questions. Formats for most of the 99 questions follow a five-point Likert-type scale.

Analysis Procedures

This sample contained missing data: of those who returned the NPS 2000, not every Sailor completed it fully. For this reason, Amos 4.0 was chosen as the statistical program for the structural equation models in this sample. Amos 4.0 is better equipped to handle missing data than most other structural equation modeling programs (Byrne, 2001). Once acceptable factors were found via data reduction with SPSS 10, the factors and observed variables were input into Amos 4.0 for structural equation modeling via maximum likelihood estimation with an EM algorithm (Arbuckle & Wothke, 1999).
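The study itself used Amos and SPSS; for readers without access to those packages, the same general approach can be sketched with open-source tools. The following is a minimal sketch, assuming the Python package semopy and its full-information maximum likelihood (FIML) objective for missing data. The factor structure mirrors the model described in this paper, but every item and file name is an illustrative placeholder, not an actual NPS 2000 variable.

import pandas as pd
import semopy

# Lavaan-style model description; latent factors measured by hypothetical
# placeholder items (q_*), with the structural paths hypothesized above.
MODEL_DESC = """
SatWork =~ q_work1 + q_work2 + q_work3
SatRewards =~ q_rew1 + q_rew2
GlobalSat =~ q_glob1 + q_glob2
Commitment =~ q_com1 + q_com2 + q_com3
Intentions =~ q_int1 + q_int2
GlobalSat ~ SatWork + SatRewards
Commitment ~ GlobalSat
Intentions ~ Commitment
continued ~ Intentions + Commitment
"""

data = pd.read_csv("nps2000_subsample.csv")   # hypothetical data file
model = semopy.Model(MODEL_DESC)
model.fit(data, obj="FIML")                   # ML estimation tolerant of missing responses
print(model.inspect())                        # path estimates and standard errors
print(semopy.calc_stats(model))               # chi-square and other fit indices

The FIML objective plays a role similar to the EM-based maximum likelihood estimation in Amos: incomplete surveys contribute whatever information they carry rather than being dropped listwise. (Treating the observed stay/leave indicator as continuous, as this sketch does, is a simplification.)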

RESULTS

Overall, the proposed model ran successfully and fit the data adequately. A significant chi-square test was obtained for the model, indicating that more variance remains to be accounted for, χ²(938) = 7637.94, p < .05.
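As a quick arithmetic check on that statistic, the tail probability of a chi-square of 7637.94 on 938 degrees of freedom can be computed directly; a minimal sketch in Python using scipy:

from scipy.stats import chi2

# Tail probability of the model chi-square reported above.
p_value = chi2.sf(7637.94, df=938)
print(p_value)   # effectively zero, i.e., far below any conventional alpha

A chi-square this far above its degrees of freedom is significant at any conventional level, which is consistent with the authors' note that the model does not account for all of the variance in the data.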


[Figure 1. Exploratory Model. Path diagram of the structural equation model: the latent factors Satisfaction with Rewards, Satisfaction with Work, Satisfaction with Working Conditions, and Satisfaction with Other Agents predict Global Job Satisfaction; Global Job Satisfaction predicts Organizational Commitment, which in turn predicts Career Intentions and Continuation Behavior (measured by a status indicator). The observed survey items (Q47-Q94), error terms, and individual path coefficients shown in the original diagram are omitted here.]


DISCUSSION

This model provides an adequate fit to Navy data for relating job satisfaction and organizational commitment to career intentions and continuation behavior. Advantages over previously tested models include the use of structural equation modeling rather than regression path analysis, and the treatment of job satisfaction and organizational commitment as separate factors. Several points of interest are apparent in evaluating the results of the model. First, several factors and observed variables contributed to global job satisfaction. Satisfaction with work predicted the most variance in global job satisfaction of any of the factors (path weight = .80). Satisfaction with other agents was the next largest predictor of global job satisfaction, followed by working conditions and satisfaction with rewards. Interestingly, the contribution of satisfaction with rewards to global job satisfaction was essentially nil (path weight = -.02). This suggests either that the rewards listed on this survey are not as important to job satisfaction as being generally satisfied with the job itself, or that these rewards do not adequately capture what Sailors value when considering satisfaction with their job. These results may also reflect the difference between intrinsic and extrinsic rewards as predictors of job satisfaction. The relationships between variables relating to intrinsic and extrinsic motivation should be explored further in this model as they pertain to job satisfaction.

Job satisfaction as modeled here is a good predictor of affective organizational commitment: the path weight from job satisfaction to organizational commitment is .70 in the exploratory model. Adding a path from global job satisfaction to career intentions did not add any predictive value to the structural equation model. Here, organizational commitment mediates the relationship between job satisfaction and career intentions/continuation behavior. Organizational commitment predicted both career intentions and continuation behavior adequately in the model. Since the model did not explain all of the variation present (as evidenced by the significant chi-square statistic), unmeasured third variables may be influencing these relationships; this possibility should be explored in future work.

The more the Navy understands about Sailor behavior, the more effectively change can be implemented to improve the Navy. The results of this study suggest that job satisfaction is a primary predictor of organizational commitment and that both play an important role in predicting career intentions and actual continuation behavior. In addition, the results suggest that career intentions are actually stronger than organizational commitment in predicting continuation behavior when both are evaluated in the context of all the other variables in the model. More research is needed to fully understand these relationships and to identify the specific contributors to job satisfaction that the Navy can act upon. A validation of this model should be conducted in the future to verify these relationships. However, it is clear at this point that an understanding of Sailor continuation behavior would be incomplete without measurement of job satisfaction, organizational commitment, and career intentions.

REFERENCES

Allen, N. J., & Meyer, J. P. (1990). The measurement and antecedents of affective, continuance, and normative commitment to the organization. Journal of Occupational Psychology, 63, 1-18.

Arbuckle, J. L., & Wothke, W. (1999). Amos 4.0 user's guide. Chicago, IL: SmallWaters Corporation.

Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.

Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445-455.

Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.

George, J. M., & Jones, G. R. (2002). Organizational behavior (3rd ed.). New Jersey: Prentice Hall.

Jaros, S. (1997). An assessment of Meyer and Allen's (1991) three-component model of organizational commitment and turnover intentions. Journal of Vocational Behavior, 51, 319-337.

Kerce, E. W. (1995). Quality of life in the U.S. Marine Corps (NPRDC TR-95-4). San Diego: Navy Personnel Research and Development Center.

Locke, E. A. (1976). The nature and causes of job satisfaction. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 1297-1349). New York: John Wiley & Sons.

Olmsted, M. G., & Farmer, W. L. (2002, April). A non-multiplicative model of Sailor job satisfaction. Paper presented at the annual meeting of the Society for Industrial & Organizational Psychology, Toronto, Canada.

SPSS, Inc. (1999). SPSS 10.0 syntax reference guide. Chicago, IL: SPSS, Inc.

Staples, D. S., & Higgins, C. A. (1998). A study of the impact of factor importance weightings on job satisfaction measures. Journal of Business and Psychology, 13(2), 211-232.

Visser, D. (2001, January 1-2). Navy battling to retain sailors in face of private sector's allure. Stars and Stripes. Retrieved March 3, 2003, from http://www.pstripes.com/jan01/ed010101a.html

Withers, P. (2001, July). Retention strategies that respond to worker values. Workforce. Retrieved September 24, 2003, from http://www.findarticles.com/cf_0/m0FXS/7_80/76938893/print.jhtml



Does Military Personnel Job Performance in a Digitized Future Force Require Changes in the ASVAB? A comparison of a dynamic/interactive computerized test battery with the ASVAB in predicting training and job performance among airmen and sailors

Ray Morath, Brian Cronin, & Mike Heil
Caliber Associates

In the late 1990's, as part of an award-winning project (SIOP Scott-Meyers Professional Practices Award, 1999), a team of researchers developed a battery of dynamic, interactive computerized tests that is currently being used to select air traffic controllers. This battery is known as the Air Traffic Selection and Training (AT-SAT) battery. An empirical validation study using a concurrent sample of over 1,000 job incumbents found the AT-SAT to possess high predictive validity as well as high face validity. The AT-SAT was designed to measure the cognitive, perceptual, and psychomotor abilities critical to the air traffic control job; however, many of these same cognitive, perceptual, and psychomotor abilities have also been identified as important for military officer, non-commissioned officer, and enlisted personnel performance (Campbell, Knapp, & Heffner, 2002; Horey, Cronin, Morath, Franks, Cassella, & Fallesen, in press; Noble & Fallesen, 2000; Rumsey, 1995).

Our team of researchers has recently created a parallel form of the original AT-SAT and is conducting an equating study to ensure that the new form is of equivalent difficulty and complexity and measures the same performance domains as the original battery. This equating study involves collecting data from approximately 1,500 Air Force and Navy personnel who have recently completed boot camp and are about to enter technical training programs for their assigned MOS. Our study will compare airmen and sailor scores on the AT-SAT battery to their scores on the ASVAB, and will also investigate the ability of the AT-SAT to predict variability in training and job performance beyond that already predicted by the ASVAB.
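The incremental-validity question just described reduces to a hierarchical regression: does adding AT-SAT scores to a model that already contains ASVAB scores increase the variance explained in a training or job performance criterion? A minimal sketch in Python, assuming statsmodels and entirely hypothetical column names (this is not the authors' actual analysis code):

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("equating_sample.csv")        # hypothetical airman/sailor dataset

# Step 1: criterion regressed on the ASVAB composite alone.
base = sm.OLS(df["training_grade"],
              sm.add_constant(df[["asvab_composite"]])).fit()

# Step 2: add AT-SAT sub-test scores to the same regression.
predictors = ["asvab_composite", "atsat_scan", "atsat_dials", "atsat_planes"]
full = sm.OLS(df["training_grade"],
              sm.add_constant(df[predictors])).fit()

# Incremental validity is the gain in R-squared; the nested-model F test
# indicates whether that gain is statistically reliable.
print("Delta R^2:", full.rsquared - base.rsquared)
print(full.compare_f_test(base))               # (F statistic, p-value, df difference)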

Our paper will present the research methodology for creating and validating the AT-SAT and will present data on the correlations of the various sub-tests with multiple dimensions of air traffic controller performance. We will also present data linking the AT-SAT to the individual cognitive, perceptual, and psychomotor abilities required by air traffic controllers, and discuss how many of these abilities are required not only within various technical jobs but also across officers, NCOs, and enlisted personnel. Finally, we will discuss the influence of digitization on military performance requirements, specifically in the areas of technical training performance and tactical and technical performance, and the need for new, dynamic, and interactive methods, such as the AT-SAT, for measuring the abilities associated with these changing requirements.

Campbell, R. C., Knapp, D. J., & Heffner, S. T. (2002). Selection for leadership: Transforming NCO promotion (ARI Special Report 52). Alexandria, VA: US Army Research Institute for the Behavioral and Social Sciences.

Horey, J., Cronin, B., Morath, R., Franks, W., Cassella, R., & Fallesen, J. (in press). Army Training and Leader Development Panel consolidation phase: U.S. Army future leadership requirement study. Prepared for the U.S. Army Research Institute under contract (DASW01-98D0049). Fairfax, VA: Caliber Associates.

Noble, S. A., & Fallesen, J. J. (2000). Identifying conceptual skills of future battle commanders (ARI Technical Report 1099). Alexandria, VA: US Army Research Institute for the Behavioral and Social Sciences.

Rumsey, M. G. (1995). The best they can be: Tomorrow's Soldiers. In R. L. Phillips & M. R. Thurman (Eds.), Future Soldiers and the quality imperative: The Army 2010 conference (pp. 123-157). Fort Knox, KY: US Army Recruiting Command.



COMPARISONS OF SATISFACTION AND RETENTION MEASURES FROM 1999-2003

Brian M. Lappin, Regan M. Klein, Lee M. Howell, and Rachel N. Lipari
Defense Manpower Data Center
1600 Wilson Blvd., Suite 400
Arlington, VA 22209-2953
lappinbm@osd.pentagon.mil

Introduction

Like many organizations, the U.S. Department of Defense (DoD) is interested in employee satisfaction and turnover intentions. Although satisfaction and retention are distinct concepts, they are closely related in that higher satisfaction with characteristics of work life is associated with stronger organizational commitment (Ting, 1997; Mathieu, 1991). Therefore, maintaining a high level of satisfaction with the military way of life can be a key element in the retention of Service members.

Observing the fluctuation in levels of satisfaction and retention over time provides the military with insight into how its current programs, services, and policies affect the quality of life of its personnel. Using data from surveys conducted by the Defense Manpower Data Center (DMDC) from 1999 to 2003, this paper will analyze Service members' levels of satisfaction and retention intentions. The 1999 Active Duty Survey (ADS) continued a line of research begun in 1969 with a series of small-scale surveys administered approximately every 2 years. These surveys were expanded in 1978 to provide policymakers with information about the total population directly involved with active duty military life (Doering, Grissmer, Hawes, and Hutzler, 1981). Like its predecessors, the 1999 ADS provided timely, policy-relevant information about the military life cycle (Wright, Williams, & Willis, 2000). The 1999 ADS was a large-scale, paper-and-pencil survey which included items on attitudes toward service and retention intentions. Since the 1999 ADS administration, Web-based surveys (Status of Forces Surveys (SOFS) of Active-Duty Members) were conducted in July 2002 and March 2003. These surveys also included measures of satisfaction and the likelihood of remaining in service.

This paper will analyze differences in satisfaction levels and retention intentions by military Service. Aldridge, Sturdivant, Smith, Lago, and Maxfield (1997) found differences among Service groups in levels of satisfaction with specific components of military life. For example, Navy, Marine Corps, and Air Force officers reported higher levels of satisfaction with military life than did Army officers. Furthermore, Air Force enlisted members were more satisfied with military life than were Army enlisted members (Aldridge et al., 1997).

In addition, this paper will analyze differences in satisfaction levels and retention intentions by military paygrade. Paygrade has accounted for some differences in levels of member satisfaction with military life (GAO, 1999; Norris, Lockman, and Maxfield, 1997). In general, higher paygrade groups reported higher levels of satisfaction with military life. GAO (1999) found that among members in retention-critical specialties, more enlisted members than officers were dissatisfied with the military.

Although most demographic characteristics were not strong predictors of retention intention in 1985, and were not significant in 1992, Norris et al. (1997) found that, in 1985, paygrade was the major demographic predictor of retention intention. Specifically, higher paygrade was associated with lower reenlistment intentions among enlisted personnel but higher retention intentions among officers. However, Wong (2000) notes the possibility of generational differences in attitudes toward serving in the military that could impact retention intentions. Loosely defined, one might expect paygrade groups E7-E9 and O4-O6 to represent Baby Boomers, and paygrade groups below these to represent Generation X members. Differences between the collective experiences of the Baby Boom Generation and those of Generation X have resulted in different attitudes toward work ethics (Wong, 2000).

In presenting survey results from 1999 to 2003, this paper will also discuss the relevance of the economy and the September 11th terrorist attacks to satisfaction and retention in the military. As the United States transitioned from peacetime to wartime during this period, the military's role shifted from an already accelerated schedule of training and peacekeeping operations to heavy involvement in world affairs and very high rates of activity. Assessing changes in satisfaction and retention during this timeframe may provide insight into the stability of military personnel's intentions during this changing historical context.

Methods

In the analysis of satisfaction and retention across this 4-year period, three different surveys are utilized. The 2002-2003 surveys are part of the Human Resources Strategic Assessment Program (HRSAP), which consists of both Web-based and traditional paper-and-pencil surveys that assess the attitudes and opinions of the entire DoD community: active, Reserve, DoD civilian, and family members. Whereas the 1999 ADS employed a paper-and-pencil administration method, both the July 2002 and March 2003 SOFS were Web-only surveys.

Each of the three surveys targeted similar populations. The population of interest for the 1999 ADS consisted of all Army, Navy, Marine Corps, Air Force, and Coast Guard active-duty members (including Reservists on active duty) below the rank of admiral or general, with at least 6 months of service when surveys were first mailed. Similarly, the population of inferential interest for the July 2002 and March 2003 SOFS consisted of active-duty members of the Army, Navy, Marine Corps, and Air Force who had at least 6 months of service and were below flag rank when the sample was drawn, and who were not National Guard or Reserve members in active-duty programs. Coast Guard members and Reserve component members in full-time active duty programs were excluded from the 1999 ADS data prior to analyses for this report in order to maximize comparability between the surveys.

For all three surveys, single-stage, nonproportional stratified random-sampling procedures were used to ensure adequate sample sizes for the reporting categories. The initial sample for the 1999 ADS consisted of 66,040 individuals drawn from the sample frame constructed from DMDC's May 1999 Active-Duty Master Edit File. The survey was distributed from August 1999 to January 2000. Completed surveys were received from 33,189 eligible military members. The overall weighted response rate for eligible members, corrected for nonproportional sampling, was 51%.

The initial sample for the July 2002 SOFS consisted of 37,918 individuals drawn from the sample frame constructed from DMDC's December 2001 Active-Duty Master Edit File. The July 2002 SOFS was conducted July 8 to August 13, 2002. Completed surveys were received from 11,060 eligible members, yielding an overall weighted response rate, corrected for nonproportional sampling, of 32%.

The initial sample for the March 2003 SOFS consisted of 34,929 individuals drawn from the sample frame constructed from DMDC's August 2002 Active-Duty Master Edit File. The March 2003 SOFS was conducted March 10 to April 21, 2003. Completed surveys were received from 10,828 eligible respondents. The overall weighted response rate for eligible members, corrected for nonproportional sampling, was 35%.

Data from all three surveys were weighted to reflect the population of interest. These weights reflect (1) the probability of selection, (2) a nonresponse adjustment factor to minimize bias arising from differential response rates among demographic subgroups, and (3) a poststratification factor to force the response-adjusted weights to sum to the counts of the target population as of the month the sample was drawn and to provide additional nonresponse adjustments.
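That three-part weighting scheme can be expressed compactly in code. The following is a minimal sketch, assuming pandas; the column names, weighting cells, and population totals are hypothetical illustrations, not DMDC's actual procedure.

import pandas as pd

df = pd.read_csv("survey_respondents.csv")     # hypothetical respondent-level file

# (1) Base weight: inverse of the selection probability from the stratified design.
df["w_base"] = 1.0 / df["p_selection"]

# (2) Nonresponse adjustment: inflate weights by the inverse of the response
#     rate within each weighting cell (here, Service by paygrade group).
resp_rate = df.groupby(["service", "paygrade_group"])["responded"].transform("mean")
df["w_adj"] = df["w_base"] / resp_rate

# (3) Poststratification: scale respondent weights so they sum to known
#     population counts as of the month the sample was drawn (totals illustrative).
pop_totals = {"Army": 480000, "Navy": 370000, "Air Force": 355000, "Marine Corps": 172000}
responded = df["responded"] == 1
for svc, total in pop_totals.items():
    mask = responded & (df["service"] == svc)
    df.loc[mask, "weight"] = df.loc[mask, "w_adj"] * total / df.loc[mask, "w_adj"].sum()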

The 1999 ADS was an omnibus personnel survey covering such topics as military assignments, retention issues, personal and military background, preparedness, mobilizations and deployments, family composition, use of military programs and services, housing, perceptions of military life, family and childcare concerns, spouse employment, financial information, and other quality-of-life issues. In comparison, the July 2002 and March 2003 SOFS were somewhat shorter surveys. Although the content of the three surveys was not identical, each included questions pertaining to attitudes and behaviors, and all three surveys included questions concerning Service members' overall satisfaction with the military way of life and retention intentions.



Results

Satisfaction with the Military Way of Life

All three surveys asked Service members how satisfied they were with the military way of life. For the purposes of this paper, the five response categories were collapsed into three categories: satisfied, neither satisfied nor dissatisfied, and dissatisfied.
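This kind of recode is a one-liner in most analysis environments. A minimal sketch with pandas, where the item name, the file name, and the assumption that responses are coded 1 (very dissatisfied) through 5 (very satisfied) are all hypothetical:

import pandas as pd

df = pd.read_csv("survey_items.csv")     # hypothetical item-level file

# Collapse the five response codes into the three reporting categories.
collapse = {1: "Dissatisfied", 2: "Dissatisfied",
            3: "Neither satisfied nor dissatisfied",
            4: "Satisfied", 5: "Satisfied"}
df["satisfaction_3cat"] = df["q_sat_way_of_life"].map(collapse)

# Weighted category percentages (assuming the survey weights described earlier).
pct = df.groupby("satisfaction_3cat")["weight"].sum() / df["weight"].sum() * 100
print(pct.round(1))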

From 1999 to 2003, satisfaction with the military way of life increased 18 percentage points, with the largest increase occurring between 1999 and 2002. In 1999, more Service members indicated they were satisfied with the military way of life (49%) than indicated they were dissatisfied (29%). In 2002, the percentage of members who indicated they were satisfied increased to 61% and the percentage of members who were dissatisfied decreased to 20%. Similarly, in 2003, the percentage of Service members reporting they were satisfied (67%) increased, while the percentage of members who were dissatisfied (16%) decreased.

Across the three surveys, a higher proportion of Air Force members indicated they were satisfied with the military way of life than did members of other Services. The percentage of Air Force members who were satisfied increased from 1999 (56% vs. 45-49%) to 2002 (68% vs. 54-61%), and again in 2003 (74% vs. 61-69%). From 1999 to 2003, the percentage of members who were satisfied increased in all of the Services, with the percentages for Navy members increasing (from 45% to 61% to 69%) such that Navy members were almost as satisfied as Air Force members in 2003 (69% vs. 74%).

Satisfaction with the military way of life tended to increase with rank across the three surveys. In 1999, fewer junior enlisted members (E1-E4) were satisfied than members in other paygrade groups (36% vs. 54-72%). This finding held true in 2002 (46% vs. 68-85%) and in 2003 (53% vs. 74-87%). Similarly, across the three surveys, senior officers (O4-O6) were the most satisfied with the military way of life, and the percentage reporting they were satisfied increased across the three years (72% to 85% to 87%).

Retention Intentions

Three measures of retention intentions were included in each of the three surveys. First, members were asked to indicate their willingness to stay on active duty. Next, members were asked if they intended to remain in the military for 20 years, a full career. For the purposes of this paper, the five response categories for likelihood were collapsed into three categories: likely, neither likely nor unlikely, and unlikely. Finally, members were asked if their spouse, girlfriend, or boyfriend favored their remaining in the military. The five response categories for this question were also collapsed into three categories: favors staying, has no opinion one way or the other, and favors leaving.

From 1999 to 2003, likelihood to stay on active duty increased 11 percentage points. In 1999, more Service members indicated they were likely to stay on active duty (50%) than said they were unlikely to stay (36%). In 2002, the percentage of members who indicated they were likely to stay increased to 58%, while the percentage of members who were unlikely to stay decreased to 26%. In 2003, the percentage of Service members who were likely to stay (61%) increased from the previous year; however, the percentage of members who were unlikely to stay (27%) remained roughly the same.

Across the three surveys, fewer Marine Corps members indicated they were likely to stay on active duty. However, percentages for Marine Corps members indicating that they were likely to stay did increase from 1999 (42% vs. 48-56%) to 2002 (46% vs. 58-63%), and again in 2003 (53% vs. 59-65%). More Air Force members indicated they were likely to stay than members of other Services across the three surveys, but this difference was significant only in 1999 (56% vs. 42-50%).

In each of the three surveys, fewer junior enlisted members responded that they were likely to stay on active duty than other paygrade groups. Although fewer junior enlisted members indicated they were likely to stay, percentages did increase from 1999 (32% vs. 53-72%) to 2002 (41% vs. 67-78%), and again in 2003 (46% vs. 63-80%). In 1999, more senior officers were likely to stay on active duty than members of other paygrade groups (72% vs. 32-66%). Likewise, more senior officers were likely to stay than other paygrade groups in 2002 (78% vs. 41-69%) and 2003 (80% vs. 46-72%), with the exception of warrant officers (W1-W5) (73% and 79%, respectively).

From 1999 to 2003, likelihood to stay for 20 years increased 10 percentage points. In 1999, more Service members indicated they were likely to stay on active duty for at least 20 years (51%) than said they were unlikely to stay (36%). In 2002, the percentage of members who indicated they were likely to stay for 20 years increased to 59% and the percentage of members who were unlikely to stay decreased to 28%. In 2003, the percentage of Service members who were likely to stay for 20 years (61%) increased and the percentage of members who were unlikely to stay (28%) remained constant.

Across the three surveys, a lower proportion of Marine Corps members indicated they were likely to remain in the military for at least 20 years. However, the percentage of Marine Corps members responding that they were likely to stay for 20 years did increase from 1999 (43% vs. 49-58%) to 2002 (49% vs. 59-65%), and again in 2003 (52% vs. 59-66%). More Air Force members indicated they were likely to stay in the military for 20 years than members of other Services across the three surveys, but this difference was significant only in 1999 (58% vs. 43-51%).

Fewer junior enlisted members responded that they were likely to stay for at least 20 years than other paygrade groups across the three surveys. Although fewer junior enlisted indicated they were likely to stay for 20 years, percentages did increase from 1999 (26% vs. 51-87%) to 2002 (37% vs. 64-92%), and again in 2003 (40% vs. 60-91%). In 1999, more senior officers (87%) were likely to stay for 20 years than members of other paygrade groups (26-83%). Furthermore, more senior officers were likely to stay for 20 years than other paygrade groups in 2002 (92% vs. 37-76%) and 2003 (91% vs. 40-79%), with the exception of warrant officers (91% and 89%, respectively).

Spouse/significant other support to stay on active duty increased 8 percentage points from 1999 to 2002 (44% vs. 52%), but decreased 6 percentage points from 2002 to 2003 (52% vs. 46%). Meanwhile, the percentage of spouses/significant others who favored leaving active duty decreased 7 percentage points from 1999 to 2002 (40% vs. 33%), but increased 3 percentage points from 2002 to 2003 (33% vs. 36%).

In 1999, fewer Marine Corps members (37% vs. 43-48%) indicated that their spouse/significant other favored staying on active duty, whereas Air Force members (48% vs. 37-43%) were the most likely to indicate they had spouse/significant other support for staying. Again, in 2002, fewer Marine Corps members indicated that their spouse/significant other was supportive of their remaining on active duty (44% vs. 52-56%). However, in 2003, there were no significant differences among the Services.

Across the three surveys, fewer junior enlisted members indicated that their spouse/significant other favored staying on active duty than other paygrade groups. The percentage indicating that their spouse/significant other favored staying increased from 1999 (27% vs. 42-57%) to 2002 (35% vs. 57-67%), but decreased from 2002 to 2003 (30% vs. 47-60%). In both 1999 (42%) and 2003 (47%), fewer junior officers indicated that their spouse/significant other favored staying on active duty than other paygrade groups in that year, excluding junior enlisted.

Correlations of Satisfaction and Retention Measures

As Table 1 shows, strong correlations were found across the three surveys between member satisfaction with military life and intention to remain in the military. The correlations of greatest magnitude were among likelihood of staying, likelihood of staying for 20 years, and spouse/significant other support.

Table 1. Correlation Matrix: Satisfaction and Retention Indices from 1999-2003

                                      Likelihood of Staying   Likelihood of Staying   Spouse/Significant
                                                                  for 20 Years          Other Support
                                       1999   2002   2003     1999   2002   2003     1999   2002   2003
Overall Satisfaction                    .55    .55    .53      .51    .54    .51      .44    .46    .39
Likelihood of Staying                    -      -      -       .75    .75    .80      .66    .61    .56
Likelihood of Staying for 20 Years       -      -      -        -      -      -       .57    .53    .51

Note: Correlations significant at the p < .0001 level.

However, not everyone who is satisfied intends to stay, and not everyone who is dissatisfied intends to leave. While 70% of members who were satisfied with the military way of life indicated they were likely to remain in the military in 1999, 11% of members who were dissatisfied also indicated an intention to stay. By 2002, the percentage of satisfied members who intended to stay in the military rose to 81%, while the percentage of dissatisfied members who intended to stay decreased to 5%. Percentages remained roughly the same in 2003, with 83% of satisfied members, and 5% of dissatisfied members, indicating an intention to stay.

A similar pattern was seen across the three surveys when members were asked if they intended to stay for at least 20 years. In 1999, 66% of satisfied members indicated an intention to remain in the military for 20 years, whereas 14% of dissatisfied members intended to stay for 20 years. The gap widened in 2002, with 78% of satisfied members, and 8% of dissatisfied members, wanting to stay for a full career. In 2003, 81% of satisfied members, and 6% of dissatisfied members, indicated an intention to stay for 20 years.

Lastly, when asked about support in 1999, 68% of satisfied members indicated that their spouse/significant other favored staying in the military, compared with 13% of dissatisfied members. As seen in the other retention measures, the gap widened in 2002, with 79% of satisfied members, and 7% of dissatisfied members, indicating that their spouse/significant other favored staying in the military. In 2003, 83% of satisfied members, and 6% of dissatisfied members, indicated that their spouse/significant other favored staying.

Conclusion

Between 1999 and 2003, the historical landscape was marked by the events of September 11, 2001, and the ensuing global war on terrorism. During this period, Service members' overall satisfaction with the military way of life, their likelihood to stay on active duty, and their likelihood to remain in the military for a full career all increased. Furthermore, the largest increases in satisfaction and retention intentions occurred between 1999 and 2002. Increases in satisfaction and retention indices may have been the result of a renewed sense of patriotism following the terrorist attacks of September 11th. The downturn in the economy preceding September 11th is another possible explanation for the increases in satisfaction and retention: it is possible that, as the economy took a turn for the worse, the military became a more attractive employment option. It is also interesting to note that, of the three points in time, spouse/significant other support to stay on active duty peaked in 2002. The slight decrease in support in 2003 may have been the result of spouses and significant others growing weary of members' time away from home.

Consistent with previous research (Aldridge et al., 1997), there were notable differences among the Services. These differences largely reflect the varying organizational structures across the Services that are needed to support Service-unique roles and missions (OASD, 2002). For example, across the three surveys, Air Force members were more satisfied and more likely to remain in the military than members of the other Services. This trend reflects the emphasis on quality of life in the Air Force and its organizational structure, which is characterized by higher retention rates that create an "older" force. In contrast, the Marine Corps does not have the same organizational need (OASD, 2002). The Marine Corps is a smaller force that retains fewer members, particularly among enlisted members, the majority of its force. Although percentages did improve over the 4-year period, fewer Marine Corps members indicated that they were likely to stay on active duty than members of the other Services.

Fewer junior enlisted members were satisfied than members in other paygrade groups. Although percentages did improve from 1999 to 2003, fewer junior enlisted members responded that they were likely to stay on active duty than other paygrade groups. Also, fewer junior enlisted members indicated that their spouse/significant other favored staying on active duty than other paygrade groups. Senior officers were the most satisfied with the military way of life. In addition, senior officers were more likely to stay on active duty than members of other paygrade groups, with the exception of warrant officers in 2002 and 2003. These results are not surprising, as junior enlisted members are newer to their respective Services and, therefore, may have lower levels of organizational commitment. Senior officers, in contrast, have been in service longer and have more invested in the military as a career.

Satisfaction and retention remain important factors in sustaining a military organization. As the military organization in the United States does not accommodate lateral entry into mid- and senior-level paygrades, it is essential to retain the appropriate number of personnel at each paygrade to ensure manpower and readiness requirements are met. In the post-September 11th period of heightened personnel tempo, the survey results from 1999 to March 2003 indicate that satisfaction and retention are stable, if not improving. However, given the current military involvement in Iraq, it will be essential to continuously monitor fluctuations in the satisfaction and retention intentions of military personnel.

References

Aldridge, D., Sturdivant, T., Smith, C., Lago, J., & Maxfield, B. (1997). The military as a career: 1992 DoD Surveys of Officers and Enlisted Personnel and Their Spouses (Report No. 1997-006). Arlington, VA: DMDC.

Doering, Z. D., Grissmer, D. W., Hawes, J. A., & Hutzler, W. P. (1981). 1978 DoD Survey of Officers and Enlisted Personnel: User's manual and codebook (Rand Note N-1604-MRAL). Santa Monica, CA: Rand.

General Accounting Office. (1999). Military personnel: Perspectives of surveyed Service members in retention critical specialties (GAO Report No. NSIAD-99-197BR). Washington, DC: United States General Accounting Office.

Mathieu, J. E. (1991). A cross-level nonrecursive model of the antecedents of organizational commitment and satisfaction. Journal of Applied Psychology, 76, 607-618.

Norris, D. G., Lockman, R. F., & Maxfield, B. D. (1997). Understanding attitudes about the military way of life: Analysis of longitudinal data from the 1985 and 1992 DoD surveys of officers and enlisted personnel and military spouses (DMDC Report No. 97-008). Arlington, VA: DMDC.

Office of the Assistant Secretary of Defense (Force Management Policy). (2002). Population representation in the military services fiscal year 2000. Washington, DC.

Ting, Y. (1997). Determinants of job satisfaction of federal government employees. Public Personnel Management, 26(3), 313-334.

Wong, L. (2000). Generations apart: Xers and boomers in the officer corps. Carlisle, PA: Strategic Studies Institute, U.S. Army War College.

Wright, L. C., Williams, K., & Willis, E. J. (2000). 1999 Survey of Active Duty Personnel: Administration, datasets, and codebook (Report No. 2000-005). Arlington, VA: DMDC.



BRITISH ARMY LEAVERS SURVEY: AN INVESTIGATION OF RETENTION FACTORS

Johanna Richardson
Ministry of Defence: British Army,
Directorate Army Personnel Strategy (Science), Building 396a,
Trenchard Lines, Upavon, Wiltshire, UK

The British Army has been operating a rolling programme of Continuous Attitude Surveys (CASs) for the last twenty years. The surveys are a management information tool to facilitate effective planning. The aim of the CAS programme is to obtain information from a representative sample on aspects of Army life. Surveys are administered to Serving Personnel (Regulars and the Territorial Army), families of Serving Personnel, and those who leave the Army. They aim to monitor aspects of duty of care, factors affecting retention, satisfaction with the main areas of Army life, and the impact of personnel policies. Data from the surveys provide information which assists with personnel planning and manning strategy. The results are also used within Army-wide Performance Indicators (PIs) for the Adjutant General and the Executive Committee of the Army Board. Analyses of the data are made available to the Armed Forces Pay Review Board, and in the future may also be made available for use by the Defence Management Board.

Retention is a key issue for the British Army, for both financial and operational reasons. Factors influencing retention can be conceptualised as being either retention positive or retention negative. Retention positive factors are those which impact upon an individual's intention to stay within the Army, while retention negative factors are those that impact upon an individual's intention to leave. Previous research has shown that if attitudes can reliably be related to behavioural intentions, a more reliable prediction of actual behaviour may be obtained. Data on factors influencing retention are obtained primarily from two surveys: the Serving Personnel survey and the Leavers survey.

The Serving Personnel survey is administered twice annually to a random sample of 4% of all regular soldiers and 10% of all regular officers. The sample is stratified by rank to ensure representation of relatively small groups of interest. Overall response rates are typically between 40 and 55%, although they vary between officers and soldiers: typically, around 65% of officers respond, compared to approximately 40% of soldiers. The results in this paper include some items from a recent wave of this survey (SP4).

The Leavers survey is administered on an ongoing basis to all regular serving personnel who are leaving the Army. This includes cases where the individual has reached the end of a term of engagement or commission, has applied for premature voluntary release (PVR), or is being discharged for medical, administrative or disciplinary reasons. The Leavers survey began with a pilot study in which questionnaires were administered from October 2001 to January 2002. Following this trial, administration continued on an ongoing basis, and questionnaires in their current form have been issued to leavers at their exit point since that time. Hence, this paper includes analyses of leavers' questionnaires administered between January 2002 and October 2003.

Of all personnel who leave the Army, those who apply for premature voluntary release probably provide the most insight into retention issues. Army officers are able to PVR with a six-month notice period, and soldiers are currently required to give twelve months' notice. In some situations it is possible for an individual to leave the Army with a shorter notice period, although this is determined on a case-by-case basis and depends on factors such as the operational commitments of the Army and any return of service required to repay training investment. Statistics indicate that approximately 40% of those who apply for PVR later withdraw their application. Because the leavers' questionnaires are anonymous, there is currently no way to determine whether any are completed and returned by personnel who apply for PVR and subsequently change their minds.

PVR levels in the Army are slightly higher than those in the other Armed Forces. Annual PVR rates from the trained strength for the Army are 3.3% of officers and 5.2% of soldiers. Within the Royal Navy/Royal Marines, the annual PVR rate is 2.5% for officers and 5.4% for ratings/other ranks, while within the RAF the figures are 2.1% and 4.0% respectively. These rates exclude PVR from training establishments and are not an indication of overall attrition. Of the 590 completed leavers' questionnaires received by DAPS Science between January 2002 and October 2003, 282 were from personnel who had applied for PVR, 130 were from personnel who had reached the end of their engagement/commission, and 44 were from personnel discharged for medical, administrative or disciplinary reasons. The reason for leaving was missing in 134 cases.

Of the 282 PVR personnel who returned a questionnaire during the period, 8.5% were officers and 91.4% were soldiers. Of the total, 83.9% were male and 16.1% were female, and 76.2% were aged 30 or under at the time they completed the questionnaire. The majority of soldiers were from the lower ranks: privates, lance corporals/bombardiers or corporals/bombardiers. Unless soldiers buy themselves out, or are discharged for medical reasons, they serve a minimum of four years. Unsurprisingly, therefore, most of those who applied for PVR had served for between 4 and 7 years, which may explain the large proportion of soldiers from the lower ranks. Overall, the largest group of PVR personnel had applied for PVR within six months of deciding to leave (37.2%), compared to 29.4% who waited between 7 and 12 months and 30.9% who waited more than twelve months before handing in their notice.

In terms of the reasons given for applying to leave the British Army, the majority of PVR cases said that the decision was related entirely or in part to the impact of the Army on personal and/or domestic life (81.6%). 80.1% said that the decision was related, entirely or in part, to general aspects of Army life; 79.1% said that it was related, entirely or in part, to job satisfaction; and 76.2% said that it was related, entirely or in part, to factors outside the Army. These categories are not mutually exclusive, and it therefore appears that the reasons why people apply for PVR are many and varied, even for a single individual.

A number of statements were included within each of these four categories, and PVR respondents were asked to state which had contributed to, or been critical in, their decision to leave the Army. Across all four categories, PVR personnel said that the two most important reasons for leaving the British Army were a feeling that civilian life would be better, and that there would be better employment opportunities outside the Army. In third place was the belief that if the individual stayed in the British Army any longer, it would be difficult to start a civilian career. This statement is particularly interesting in light of the fact that the majority of PVR questionnaires were returned by personnel aged 30 or under.

These top three reasons for taking PVR from the British Army are fairly similar to those given in surveys conducted by the Royal Navy/Royal Marines and the Royal Air Force. In the RN/RM, the top three reasons cited are the desire to live at home, the wish to take up another career, and the wish to marry and/or raise a family. In the RAF, the reasons given are a lack of family stability, career prospects outside the RAF, and the difficulty of starting a second career if the individual stays in the service any longer.

For the British Army, the fourth most frequently endorsed contributory or critical factor in the decision to PVR was the statement that there was too much separation from a spouse or partner. Interestingly, this statement was cited as contributing or being critical to a decision to leave among married personnel, single personnel, and personnel in long-term relationships. When responses related to the impact of the Army on personal and/or domestic life were analysed in greater detail, an interesting pattern emerged. The top two factors in this category were the same for married personnel, single personnel, and those in a long-term partnership: the degree of separation from a spouse/partner, and the detrimental effects of Army life on the relationship. The third most important factor for single personnel and those in a long-term relationship was the poor standard of single living accommodation (SLA). For married personnel, it was the detrimental effect of Army life upon children.

Personnel expectations will certainly have a role in retention. For example, the Serving Personnel survey asks those who joined the British Army within the last five years about the factors that most influenced their decision to join. Recruitment positive factors include the opportunities for sport and an active life, and the opportunities for adventure training. However, 59% of PVR leavers stated that a lack of opportunity for sporting activities or adventurous training had contributed, or was critical, to their decision to leave the British Army. Similarly, 43% of PVR leavers stated that an insufficient amount or quality of training had contributed, or was critical, to their decision to leave the Army. Clearly, providing a positive image of Army life is a key recruitment factor. However, expectations need to be managed throughout an Army career to avoid disappointment and enhance retention.

The surveys administered to serving personnel also provide valuable information on retention factors. These data can be used to compare intentions to leave with Leavers survey data on actual exit behaviour. Table 1 shows the factors that increase intention to leave among serving personnel, and those that are important in actually deciding to leave for those who PVR.


Table 1: Factors influencing intentions to leave (among serving personnel) and exit behaviour (among PVR leavers).

Ranked reason | Serving Personnel: factors increasing intention to leave | Leavers (PVR): reasons for leaving the British Army
1 | Impact of Army lifestyle on personal/domestic life | Opportunities outside the Army
2 | Operational commitments and over-stretch | Impact of Army lifestyle on personal/domestic life
3 | Amount of extra duties | Your own morale
4 | Frequency of operational tours | Management in your workplace
5 | Accommodation | Job satisfaction

The impact of the Army lifestyle on personal/domestic life is a key retention negative factor for both groups. For the PVR leavers, job satisfaction, own morale and management issues are key retention negative factors. However, for the serving personnel, these are retention positive factors; these personnel identify other irritants as key influences on their intentions to leave. Opportunities outside the Army appear to be a key issue for PVR leavers, but it is not known whether this is a cause or an effect. The policy implications of these differences are another issue. Does one concentrate on alleviating the factors influencing intention to leave for serving personnel? Or is it preferable to focus on remedying factors which are known to be associated with exit behaviour? Perhaps the debate is theoretical, since the British Army would like to provide a satisfying job and retention positive service conditions to all.

The analyses reported here are the first available from the British Army Leavers survey since the pilot study was completed. Future plans for the survey include refinement of the instrument and achieving some consistency with the questions asked of serving personnel. In addition, it must be acknowledged that there were administration issues associated with the questionnaires included in the current analyses; unfortunately, these precluded calculation of an accurate and reliable response rate. Also, given that approximately 40% of PVR applications are later withdrawn, it would be preferable to be able to account for this within the data. DAPS Science is now addressing these issues so that the exit data from leavers' questionnaires can provide enhanced information to military policy makers. This will assist in manpower planning and personnel policy development.


PREDICTORS OF U.S. ARMY CAPTAIN RETENTION DECISIONS

Debora Mitchell, Ph.D., Heidi Keller-Glaze, Ph.D., and Annalisa Gramlich
Caliber Associates

Jon Fallesen, Ph.D.
Army Research Institute

ABSTRACT

In June 2000, at the direction of the U.S. Army Chief of Staff, the Army began the largest assessment it has ever conducted on training and leader development. The research assessed organizational culture, training and leader development, perceptions of advancement opportunity, and the effect of these factors on retention. In all, approximately 13,500 leaders and spouses provided their input through surveys, focus groups, and interviews. Data were collected from lieutenants, captains, majors, lieutenant colonels, colonels, and NCOs.

To identify the variables that affect captains' decisions to leave the Army before retirement, logistic regression was conducted with intent to stay or leave as the dependent variable and demographic data and factors related to benefits, pay, self-development, mentoring, performance evaluation, and training as independent variables. Results showed that length of time as an officer, source of commissioning, gender of the respondent, benefits, mentoring, and counseling were significant predictors of intent to leave. These results provide evidence of the importance of professional development to retention.

INTRODUCTION

The Army Training and Leader Development Panel was initiated in June 2000, at the direction of the U.S. Army Chief of Staff. The Panel's charter was to review, assess, and provide recommendations for the development and training of 21st century leaders. The panel was made up of Army researchers, subject matter experts, and officers who collected data on satisfaction with training and leader development. They also collected data on a wide variety of related topics, such as institutional and unit training, self-development, performance appraisal, mentoring, selection and retention, satisfaction, commitment, and Army culture.

The panel's major emphasis was on training and leader development. A meta-analysis by Hom and Griffeth (1995) suggests that the quality of one's management positively affects satisfaction and retention. By targeting leader development, the Army should be able to improve retention as well as Soldiers' ability to meet mission requirements.

The captain rank is a decision point for many Soldiers. Because of incentives provided by the retirement system, if a Soldier decides to stay beyond the rank of captain, he or she typically plans to stay until retirement. To better understand the relationship between training and development and the retention of captains, a logistic regression analysis was conducted. This paper describes the analysis and results.

METHOD

Officer and NCO data collectors were trained in conducting focus groups and administering surveys. They were provided with a sampling plan and asked to gather data from a representative group of officers, warrant officers, and NCOs. They collected data from Army personnel and spouses worldwide. Approximately 13,500 Soldiers and spouses provided data through focus groups, interviews, or surveys.

Participants in the Officer Comprehensive survey were 5,525 officers, warrant officers, and NCOs from both active and reserve components. Data collectors distributed the instrument in June 2000 to groups of participants. Of the 1,548 captains who responded to the officer survey, 1,296 provided complete information and comprised the sample used in the logistic regression described below.

Logistic regression is a type of regression analysis in which the dependent variable is dichotomous. In this case, logistic regression estimates the odds that a captain intends to leave the Army before retirement rather than stay, based on a set of predictor variables. The result of this analysis provides insight into the characteristics of captains who have indicated that they are planning to leave prior to retirement.

Dependent variable. The dependent variable for this analysis was career intention, which reflects whether the captain plans to leave the Army before retirement. Career intention was coded as a dichotomous variable (1 = leaving, 0 = staying) computed from officers' responses to two survey items.
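
As a minimal sketch of this setup (the variable names, coefficients, and data below are simulated for illustration and are not the study's actual items), the coding of the dichotomous outcome and the logistic fit might look as follows in Python:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300

# Hypothetical predictors: years of commissioned service and a
# standardized "expectations met with benefits" component score.
years_as_officer = rng.uniform(3, 12, n)
benefits_met = rng.normal(0, 1, n)

# Simulate intent to leave so that longer service and met expectations
# both lower the log-odds of leaving (the directions found in Table 3).
log_odds = 1.5 - 0.25 * years_as_officer - 0.8 * benefits_met
leaving = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))  # 1 = leaving, 0 = staying

# Fit the logistic regression; coefficients are log-odds of leaving.
X = sm.add_constant(pd.DataFrame({"years_as_officer": years_as_officer,
                                  "benefits_met": benefits_met}))
fit = sm.Logit(leaving, X).fit(disp=False)
print(fit.summary())
print(np.exp(fit.params))  # Exp(B): odds ratios, as reported in Table 3
```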

Independent variables. Both demographic and survey items were investigated as possible predictors of officers' intent to leave. The following demographic items were included: months in current position, number of deployments, number of PCS moves, type of unit, gender, rank, source of commission, ethnicity, career field, branch, functional area, highest echelon of command, Combat Training Center (CTC) experience, and length of time as an officer.

A collinearity analysis suggested that rank, length of time as an officer, and number of PCS moves were strongly correlated with one another. Of the three, the model including the variable "length of time as an officer" was selected as having the best fit.
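
A common way to screen for this kind of collinearity is the variance inflation factor (VIF); the paper does not say which diagnostic was used, so the following is only an illustrative sketch with simulated, hypothetical variables:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 300

# Hypothetical demographics: time in service drives both rank and the
# number of PCS moves, so the three predictors are strongly correlated.
years = rng.uniform(3, 12, n)
rank = years / 3 + rng.normal(0, 0.3, n)
pcs_moves = years / 2 + rng.normal(0, 0.5, n)

X = sm.add_constant(pd.DataFrame({"years": years, "rank": rank,
                                  "pcs_moves": pcs_moves}))
# A VIF above roughly 5-10 is a common flag for problematic collinearity.
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 2))
```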

Due to the long list of potentially important demographic variables, a step-wise logistic regression was conducted to determine which demographic variables were the best predictors of intent to leave. Those demographic variables that contributed the most to the model were retained for the subsequent analyses. The retained demographic variables were: CTC experience, years as an officer, source of commission, gender, and months in current position.
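
Statsmodels has no built-in step-wise routine, so a selection of this kind is typically scripted. A minimal forward-selection sketch follows, using AIC as the entry criterion (one plausible choice; the paper does not state which criterion was used):

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise_logit(y, X):
    """Greedily add the column of X that most improves AIC; stop when none do."""
    selected, remaining, best_aic = [], list(X.columns), np.inf
    while remaining:
        trials = []
        for var in remaining:
            exog = sm.add_constant(X[selected + [var]])
            trials.append((sm.Logit(y, exog).fit(disp=False).aic, var))
        aic, var = min(trials)
        if aic >= best_aic:  # no candidate improves the fit
            break
        best_aic = aic
        selected.append(var)
        remaining.remove(var)
    return selected

# Usage with the simulated frame from the earlier sketch:
# keep = forward_stepwise_logit(leaving, X.drop(columns="const"))
```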



Survey items were then aggregated into components based on a principal components analysis, and the resulting components were used as predictor variables. The components included in the logistic regression were as follows: satisfaction with leader development, expectations met with regard to benefits, retention factors (service and job aspects), retention factors (pay, benefits, quality of life), service ethic, performance orientation and understanding of service, obstacles to self-development, individuals that aid in self-development, importance and effectiveness of mentors, usefulness of counseling by rater, effectiveness of leadership skills training, and quality of home station training. All variables, demographic and components, were entered into the model simultaneously.
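
A minimal sketch of this aggregation step, using scikit-learn (item and component names are hypothetical; the paper does not give extraction details such as rotation or the number of components retained):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Hypothetical block of 12 Likert-type survey items (1-5 responses).
items = pd.DataFrame(rng.integers(1, 6, size=(300, 12)).astype(float),
                     columns=[f"item_{i}" for i in range(1, 13)])

# Standardize the items, then extract components; the component scores
# (not the raw items) become predictors in the logistic regression.
scores = PCA(n_components=4).fit_transform(StandardScaler().fit_transform(items))
components = pd.DataFrame(scores, columns=["leader_dev", "benefits_met",
                                           "retention_job", "service_ethic"])
print(components.head())
```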

RESULTS

The Hosmer and Lemeshow test was used to assess the goodness of fit of the model. The test was not significant at the .05 level (Χ²(8) = 12.884, p > .05), indicating acceptable model fit.
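
The Hosmer and Lemeshow statistic is straightforward to compute from the fitted probabilities; a minimal sketch (a hand-rolled helper, not a library routine, with scipy used only for the chi-square p-value):

```python
import pandas as pd
from scipy.stats import chi2

def hosmer_lemeshow(y, p_hat, g=10):
    """Chi-square over g groups of cases binned by predicted probability."""
    df = pd.DataFrame({"y": y, "p": p_hat})
    df["group"] = pd.qcut(df["p"], g, labels=False, duplicates="drop")
    stat = 0.0
    for _, grp in df.groupby("group"):
        obs1, exp1 = grp["y"].sum(), grp["p"].sum()
        obs0, exp0 = len(grp) - obs1, len(grp) - exp1
        stat += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
    dof = df["group"].nunique() - 2
    return stat, chi2.sf(stat, dof)  # a non-significant p indicates adequate fit

# Usage with the fitted model from the earlier sketch:
# stat, p = hosmer_lemeshow(leaving, fit.predict(X))
```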



Table 3: Variables in the Equation

Variable Name | B | S.E. | Wald | df | Sig. | Exp(B)
CTC experience | 0.132 | 0.14 | 0.884 | 1 | 0.347 | 1.141
Length of time as an officer | -0.25 | 0.03 | 69.309 | 1 | 0.000 | 0.779
Source of commission (other) |  |  | 80.426 | 3 | 0.000 |
Source of commission (ROTC) | 0.462 | 0.317 | 2.126 | 1 | 0.145 | 1.588
Source of commission (Military Academy) | 1.456 | 0.354 | 16.94 | 1 | 0.000 | 4.288
Source of commission (officer candidate) | -1.024 | 0.378 | 7.356 | 1 | 0.007 | 0.359
Sex of respondent (male) | -0.766 | 0.213 | 12.998 | 1 | 0.000 | 0.465
Months in current position | 0.012 | 0.008 | 2.162 | 1 | 0.141 | 1.013
Leader development | -0.107 | 0.112 | 0.903 | 1 | 0.342 | 0.899
Expectations met with benefits | -0.803 | 0.131 | 37.664 | 1 | 0.000 | 0.448
Retention factors (service and job aspects) | -0.107 | 0.13 | 0.683 | 1 | 0.408 | 0.898
Retention factors (pay, benefits, quality of life) | -0.091 | 0.096 | 0.898 | 1 | 0.343 | 0.913
Service ethic | 0.058 | 0.089 | 0.416 | 1 | 0.519 | 1.059
Performance orientation and understanding of service | 0.097 | 0.097 | 0.997 | 1 | 0.318 | 1.102
Obstacles to self-development | 0.064 | 0.092 | 0.495 | 1 | 0.482 | 1.067
Individuals that aid in self-development | -0.018 | 0.108 | 0.028 | 1 | 0.868 | 0.982
Importance and effectiveness of mentors | -0.229 | 0.095 | 5.78 | 1 | 0.016 | 0.795
Rater provides useful counseling | -0.198 | 0.082 | 5.893 | 1 | 0.015 | 0.82
Effectiveness of leadership skills training | 0.118 | 0.075 | 2.501 | 1 | 0.114 | 1.126
Quality of home station training | -0.031 | 0.094 | 0.109 | 1 | 0.741 | 0.969
Constant | 5.172 | 0.774 | 44.697 | 1 | 0.000 | 176.26

The results indicate that captains who have been an officer longer are less likely to consider leaving than those who have been an officer for a shorter period of time. Captains whose source of commission or appointment was the Military Academy are 4.3 times more likely to be planning to leave than those in the reference group, who were commissioned or appointed some other way. In addition, men are less likely to plan to leave than women, and those who indicate that their expectations have been met with respect to benefits are less likely to plan to leave. Also, captains who have had effective mentors or mentoring experiences are less likely to plan to leave. Finally, captains who indicate their raters provide useful mentoring and counseling are less likely to consider leaving the Army.
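
As a check on the interpretation above, the Exp(B) column in Table 3 is simply the exponentiated coefficient, so the 4.3 figure follows directly from the Military Academy row:

\[
\mathrm{Exp}(B) = e^{B}, \qquad e^{1.456} \approx 4.29
\]

Coefficients below zero give odds ratios below 1; for example, each additional year as an officer multiplies the odds of planning to leave by \(e^{-0.25} \approx 0.78\).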

A follow-up analysis of other survey items was conducted to gain more insight. Captains who reported that they were planning to leave before retirement at 20 years answered a set of questions about the importance of various issues in their decision to leave. The top issues were: a belief that the Army no longer demonstrates that it is as committed to captains as it expects them to be committed to it; that they are not having fun anymore; a perception of excessive micro-management; a lack of a sense of control or self-determination in their careers; and an inability to balance the needs of the Army with the needs of their families.

DISCUSSION

The following variables predicted captains' intent to leave the Army: length of time in the Army, source of commissioning, gender, benefits satisfaction, the perceived effectiveness of counseling received, and having a mentor. Captains who have been an officer longer are less likely to consider leaving than those who have been an officer for a shorter period of time. According to Mackin, Hogan, and Mairs (1993), this finding is expected for at least two reasons. First, the Army's retirement system provides an increasingly greater incentive to stay for 20 years. Second, self-selection is occurring: Soldiers who are better suited to Army life are more likely to stay at each decision point.

Captains whose source of commission was a military academy are more likely to plan to leave than those who were commissioned some other way. This may be due in part to a disconnect between what is taught in the military academies and what actually goes on in the field. Also, captains who have had a highly selective military academy education may have more opportunities in the private sector. Those whose source of commission was Officer Candidate School are less likely to plan to leave than others. These captains moved up through the NCO ranks and therefore have a considerable amount of experience in the Army and a realistic job preview. Furthermore, they most likely would not have attended OCS if they were not intending to make the Army a career.

Gender is also a significant predictor of career intentions, with men less likely to plan to leave than women. Work-family balance was one of the top reasons captains said they were planning to leave. In a related study, time separated from family was the top reason officers reported thinking about leaving or planning to leave before retirement (ARI, 2002). Family-related issues are important to both men and women; however, women may find it harder to balance the needs of their families with the demands of work. This issue needs more research.

Those who indicate that their expectations have been met with respect to benefits are less likely to plan on leaving. This finding corresponds to the top reason for planning to leave, which suggests that there is a perceived imbalance between the commitment of the individual to the Army and the Army's commitment to the individual. Improving benefits, or better communicating the value of existing benefits, may help to improve these perceptions.

Also, captains who have had effective mentors or mentoring experiences are less likely to consider leaving, as are captains who indicate their raters provide useful mentoring and counseling. The significance of these variables may be that both mentoring and counseling provide important feedback to the individual, as well as guidance on professional development and career advice. In addition, mentoring and counseling provide a one-on-one connection to a very large organization and can reflect an investment by the organization in the individual.

In conclusion, this paper provides the results of an analysis of captain retention. The results can lead to a better understanding of the factors that affect retention, and suggest some improvements that the Army can make to attempt to improve retention.

REFERENCES

Army Research Institute for the Behavioral and Social Sciences. (2002, August). Reasons for leaving the Army before retirement (Survey Report No. 2002-13).

Hom, P. W., & Griffeth, R. W. (1995). Employee turnover. Cincinnati, OH: South-Western College Publishing.

Mackin, P. C., Hogan, P. F., & Mairs, L. S. (1993). A multiperiod model of US Army officer retention decisions (ARI Technical Report 93-03). Alexandria, VA: US Army Research Institute for the Behavioral and Social Sciences.



THE IMPORTANCE OF A FAMILY-FRIENDLY WORK ENVIRONMENT FOR INCREASING EMPLOYEE PERFORMANCE AND RETENTION

Ann H. Huffman and Satoris S. Youngcourt
Texas A&M University
College Station, TX 77843-4235
annhuffman@tamu.edu

Carl Andrew Castro
Division of Neuropsychiatry
Walter Reed Army Institute of Research

ABSTRACT

This study tested perceptions of a family-friendly work environment as a moderator of the relationship between work-life conflict and job performance. Survey data and actual performance measures from 230 US Army soldiers were examined. Findings indicated that a perceived family-friendly work environment was negatively related to intentions to leave the organization, and positively related to individual and future performance. Furthermore, although employees who had more family responsibilities benefited from the family-friendly work environment, there was no apparent adverse impact on single, childless individuals. The results underscore the importance of family-friendly work environments in facilitating employee performance and career decisions.

Authors' Notes

The views expressed in this paper are those of the authors and do not necessarily represent the official policy or position of the Department of Defense (paragraph 4-3, AR 360-5) or the U.S. Army Medical Command.

The findings described in this paper were collected under WRAIR Research Protocol #700, entitled "A Human Dimensions Assessment of the Impact of OPTEMPO on the Forward-Deployed Soldier," under the direction of C.A. Castro (1998). The authors thank Amy B. Adler, Robert Bienvenu, Jeffrey Thomas and Carol Dolan, co-investigators on this research project. We would also like to thank Millie Calhoun, Coleen Crouch, Alexandra Hanson, Tommy Jackson, Rachel Prayner, Shelly Robertson and Angela Salvi for their excellent technical support. This research was funded by the Research Area Directorate for Military Operational Medicine, U.S. Army Medical Research and Materiel Command, Ft. Detrick, Maryland, and the U.S. Army, Europe, Heidelberg, Germany.


In recent years, researchers have been interested in the balance between employees' work and home lives (Kossek & Ozeki, 1998; Major, Klein, & Ehrhart, 2002). Ideally, individuals are able to effectively manage the requirements of both roles without undue difficulty. Unfortunately, work and life demands frequently clash, making it difficult for the individual to be simultaneously successful in both domains and resulting in work-life conflict.

Work-life conflict has been linked to numerous negative consequences for the individual, including lower general well-being (Aryee, 1992; Frone, 2000; Frone, Russell, & Cooper, 1992; Thomas & Ganster, 1995), lower job satisfaction (Adams, King, & King, 1996), greater burnout (Burke, 1988), and greater alcohol use and poor health (Allen, Herst, Bruck, & Sutton, 2000; Frone, Russell, & Barnes, 1996). The organization also experiences negative consequences. For example, researchers have suggested that conflict leads to negative organizational outcomes such as increased turnover and decreased performance (e.g., Jex, 1998). Although researchers have consistently demonstrated a link between work-life conflict and increased turnover intentions (Burke, 1988; Greenhaus, Collins, Singh, & Parasuraman, 1997; Greenhaus, Parasuraman, & Collins, 2001), few have empirically examined the relationship between work-life conflict and job performance (Allen et al., 2000).

As the potential adverse effects of work-life conflict become more apparent, organizations have become more proactive in their attempts to buffer these negative effects. One way in which organizations have tried to assist employees is by fostering a family-friendly culture that allows employees the support and flexibility to successfully sustain both their work and personal lives (Kossek & Lobel, 1996).

The current study examined whether a family-friendly work environment buffers the negative relationship between work-life conflict and performance and organizational outcomes. Specifically, we examined employee perceptions of a family-friendly work environment and how these perceptions directly and indirectly related to subjective and objective measures of performance. Additionally, we assessed whether perceptions of a family-friendly work environment were beneficial for all employees regardless of family responsibilities, or whether they were detrimental to employees with few family responsibilities.

WORK-LIFE CONFLICT

There are numerous work/nonwork conflict constructs in the work and family literature (e.g., work-life conflict, work-nonwork conflict, work-family conflict). Although many of the constructs are similar, or are used interchangeably, there are some subtle differences. Work-life conflict is based on a broader definition than the more specific construct of work-family conflict. Whereas work-family conflict focuses on time and strain due to family responsibilities, work-life conflict encompasses family factors in addition to personal responsibilities not necessarily related to families (e.g., shopping for personal items, exercising, spending time with friends). We chose to operationalize role conflict specifically as work-life conflict for three reasons. First, the broader work-life conflict construct allows us to include both single and married individuals. Second, researchers have advocated the use of more flexible and broader constructs, such as work-life conflict, in work and nonwork role research (Behson, 2002; Grover & Crooker, 1995). Finally, although many have stated that the two constructs are similar, scant research has empirically tested the relationship between work and life roles (Frone, 2003).

According to role theory (Hart, 1999), all of the work-nonwork variables are similar. Role theory asserts that strain will occur when individuals face competing demands from multiple life roles (Kahn, Wolfe, Quinn, Snoek, & Rosenthal, 1964). Work-life conflict can be conceptualized as a type of interrole conflict that occurs when the pressures and demands of work collide with the pressures and demands of one's personal life (Kopelman, Greenhaus, & Connolly, 1983). The strength of the conflict depends on the pressures or demands of both roles, with more conflict occurring when the capacity of the individual to meet all the demands from both roles is exceeded. With this in mind, the literature base for the current study draws on research into both specific and general work-life demands, and we use the terms interchangeably.

WORK-LIFE CONFLICT AND WORK-RELATED OUTCOMES

Measures of organizational outcomes can be "objective" or "subjective". Objective measures focus on discrete, countable behaviors or outcomes, such as scores on proficiency exams or turnover rates, and are important because they give independently verifiable depictions of employee performance. These measures, however, may be deficient indicators of performance because they might fail to capture other important markers of performance. For example, many job-relevant knowledge, skills, and abilities (e.g., leadership or communication skills) are not easily assessed objectively. For these, other measures may be necessary.

Subjective measures focus on outcomes requiring individual perceptions, such as supervisory ratings of performance, turnover intentions, and customer satisfaction reports. Subjective ratings, although necessary to capture important aspects of most jobs, suffer from various biases (Cascio, 1998). Self-ratings are often only moderately related to independent measures of performance (e.g., Bommer, Johnson, Rich, Podsakoff, & Mackenzie, 1995; Katosh & Traugott, 1981; Kleck, 1982), unless the self-ratings are based on data that can be readily observed or verified (Spector, Dwyer, & Jex, 1988). Although researchers have consistently shown that individuals are typically lenient with self-ratings (Harris & Schaubroeck, 1988; Thornton, 1980), the bias is not always in favor of the individual. For example, Adler, Thomas, and Castro (2002) found that, when asked to provide information on their performance, individuals tended to portray themselves in a more negative light than the independent records would indicate, calling into question the accuracy and validity of the supposedly more reliable non-self-report measures.

Although intuitively related, objective and subjective measures of performance often exhibit low correlations with one another (e.g., Bommer et al., 1995; Heneman, 1986); therefore, when either is used alone, results should be interpreted with caution. Each data type provides unique information, and together they can provide a more complete representation of performance. Therefore, in this study, both objective measures (i.e., marksmanship scores and physical training scores) and subjective measures (i.e., perceptions of future combat performance and turnover intentions) of performance are used.

Individual Job Performance. Although job performance is one of the most relevant outcomes to organizations, it is one of the least studied in relation to work-life conflict (Frone, Yardley, & Markel, 1997; Kossek & Ozeki, 1998; Perrewe, Treadway, & Hall, 2003). Among the few studies that examine the relationship between work-life conflict and job performance, the results are equivocal, with some reporting a negative relationship (Aryee, 1992; Frone et al., 1997; Kossek & Nichol, 1992) and some reporting no relationship (Greenhaus, Bedeian, & Mossholder, 1987; Netemeyer, Boles, & McMurrian, 1996). While we found no published studies that reported a positive relationship between work-life conflict and job performance, Greenhaus et al. (1987) provided two arguments for why researchers might expect to find such a relationship. First, they noted that conflict might exist for high-performing individuals because they spend more time at work than others, and therefore have less time for their personal lives. They proposed that "the very activities that produce high job performance, in other words, may estrange an employee from his or her family and produce feelings of personal dissatisfaction" (p. 202). Second, they suggested that there are particular behaviors required to attain a high level of performance that are only appropriate in the work domain, and may in fact be detrimental in the family domain. For example, employees may be required to conform to the values and norms of the organization, or they may need to alienate themselves from a satisfying personal relationship in order to be successful within an organization.

In their meta-analytic review, Allen et al. (2000) found an overall weighted mean correlation of -.12 across all studies examining the relationship between work-life conflict and job performance. This finding was based on only four samples, however, and therefore should be interpreted with caution. Furthermore, these studies all shared one common limitation: performance was measured solely with subjective instruments. For example, whereas Aryee (1992) used three measures of work-family conflict (job-parent conflict, job-spouse conflict, and job-homemaker conflict), he assessed performance with a single self-report measure of work quality. Similarly, Frone et al. (1997) assessed performance with a self-report measure that tapped work role behaviors. Although Netemeyer et al. (1996) used real estate sales as their performance measure, they nevertheless depended on normative self-report ratings. Indeed, the only sample in the Allen et al. meta-analysis that did not rely on self-ratings was Greenhaus et al.'s (1987) study, which used supervisor ratings.

Despite the rationale that has been provided for a positive relationship between work-life conflict and job performance (see Greenhaus et al., 1987), we suspect a negative relationship will exist, based primarily on the generally negative effects of interrole conflict. That is, we propose that individuals who report greater levels of work-life conflict will have decreased job performance, as indicated by objective measures, because of the strain inherent in the conflict. Specifically, we propose that individuals reporting greater levels of work-life conflict have more personal distractions interfering with their work roles, and therefore will not be able to devote as much cognitive, emotional, or physical energy to preparing for or actually engaging in their work tasks, which will contribute to decreased performance. Based on this logic, we propose the following:

Hypothesis 1a: Work-life conflict is negatively related to job performance.

Collective Efficacy. One subjective organizational outcome related to work-life conflict is an employee's perception of his or her group's future performance. These perceptions are akin to collective efficacy, defined by Bandura (1982) as personal judgments about the group's capacity to execute the behaviors necessary to perform specified tasks. Highly efficacious individuals tend to be more productive in setting their goals and more persistent in reaching those goals than non-efficacious individuals (Gist, 1987). Bandura (1984) suggested that individuals with greater levels of efficacy are also more effective under stressful conditions. Many researchers, in fact, have demonstrated that higher levels of efficacy lead to higher levels of performance (e.g., Sadri & Robertson, 1993; Stajkovic & Luthans, 1998). Therefore, collective efficacy can be considered a positive organizational outcome.

Although we could find no studies directly examining work-life conflict and efficacy beliefs, we anticipate that a negative relationship exists between the two. Our rationale is that work-life conflict is a strain that potentially saps the energy needed to perform well. Employees experiencing work-life conflict may therefore have low energy levels, and consequently feel they are unable to perform well on tasks. Based on this logic, we provide the following hypothesis:

Hypothesis 1b: Work-life conflict is negatively related to collective efficacy perceptions.

Turnover Intention. Employee turnover, both pervasive and costly, is often one of the biggest dilemmas for organizations, and it is the final organizational outcome of interest in this study. Excessive turnover is linked to numerous negative consequences, including the loss of good employees, loss of loyalty among current employees, lower perceived job quality, lower customer satisfaction, and loss of money for the organization (Abassi & Hollman, 2000; Gupta & Jenkins, 1992; Hacker, 1997; White, 1995). Although actual turnover has rarely been studied in relation to work-life conflict, several studies have examined the relationship between work-life conflict and turnover intentions. The findings consistently show a positive relationship between work-life conflict and intentions to leave (Allen et al., 2000; Kossek & Ozeki, 1999; Netemeyer et al., 1996). Based on these findings, we propose the following hypothesis:

Hypothesis 1c: Work-life conflict is positively related to turnover intentions.

PERCEPTIONS OF FAMILY-FRIENDLY WORK ENVIRONMENT

Jex (1998) suggested that work stressors have both a direct and an indirect effect on performance. It may be, then, that the inconsistent findings for the relationship between work-life conflict and performance are due to the indirect nature of the relationship. Although numerous moderators of the relationship may exist, we examine employee perceptions of the family-friendliness of the work environment.

Organizations do not have to feel powerless in dealing with factors that occur outside of their domain. Recently, many organizations have been receptive to "family-friendly" programs to decrease employees' work-life stress. Work environments are considered family-friendly when they "(a) help workers manage the time pressures of being working parents by having policies such as vacation time, sick leave, unpaid or personal leave, or flexible work schedules, or (b) help workers meet their continuing family responsibilities through such programs as maternity and paternity leave, leave that can be used to care for sick children or elders, affordable health insurance, and child-care or elder care programs" (Marshal & Barnett, 1996, p. 253).

Despite the purported benefits of family-friendly policies, such as increased productivity, job satisfaction, and organizational commitment, relatively few researchers have empirically examined family-friendly work environments (Aldous, 1990; Bourg & Segal, 1999; Glass & Finley, 2002). The concept is fairly new, and the little research that has been conducted has been methodologically weak, based primarily on anecdotes. Although the existence of these programs seems important, researchers have suggested that an ideal family-friendly workplace goes beyond the availability of programs (Fredriksen-Goldsen & Scharlach, 2001; Secret & Sprang, 2001). A true family-friendly environment exists where both day-to-day taskings and important policy decisions address the needs of the family.

Few studies go beyond measuring the mere presence or number of family-friendly programs to actually examine the family-friendly culture of an organization. Three measures have recently been developed to capture the essence of a family-friendly work culture. The first scale, developed by Thomas and Ganster (1995), measures family-friendly culture and taps into the idea that, for organizational policies to be successful, the organization needs supervisors who support the policies. More recently, Eaton (2003) constructed two scales measuring perceptions of the availability of informal work-family policies and perceptions of the usability of work-family policies, and found that perceived usability of work-family programs was related more strongly to productivity than was the actual presence of formal or informal policies. Finally, Allen (2001) developed a scale measuring family-supportive organization perceptions. The construct tapped by her measure is very similar to perceived organizational support (see Rhoades & Eisenberger, 2002, for a recent review of this literature), because both are measures of global support rather than specific measures of support, such as supervisor or coworker support. Allen differentiated her construct from perceived organizational support, however, maintaining that hers specifically concerns reactions to the organization regarding family-supportiveness, whereas perceived organizational support concerns responses to the supportiveness of the organization as a whole, not specific to family-related aspects.

FAMILY-FRIENDLY WORK ENVIRONMENT AND WORK-RELATED OUTCOMES

Given the similarity between perceived organizational support and family-friendly work environments, the conceptual framework should be similar in nature. Like Eisenberger, Huntington, Hutchison, and Sowa (1986), we stress the importance of the norm of reciprocity between the employee and the organization. Specifically, we propose that the nature of the relationship between the employee and the organization can be explained by the tenets of social exchange theory (Blau, 1964) and equity theory (Adams, 1963, 1965).

Adams' (1963, 1965) equity theory, based on cognitive dissonance theory (Festinger, 1957), posits that individuals compare their own ratio of inputs to outcomes with that of others to establish whether something is fair. For example, if an employee compares the ratio of his or her inputs (the amount of work he or she is doing) to his or her outcomes (e.g., a flexible work schedule) with the organization's ratio of inputs to outcomes, and perceives a discrepancy, then feelings of dissonance will emerge that must be resolved.
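
In its usual formal statement (a textbook rendering, not the authors' own notation), equity holds when the two input-outcome ratios match,

\[
\frac{O_{\text{self}}}{I_{\text{self}}} = \frac{O_{\text{other}}}{I_{\text{other}}},
\]

with perceived inequity, and hence dissonance, arising whenever the two ratios diverge.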

Adams (1963) noted several ways in which individuals can reduce feelings of inequity. These methods of regaining balance include increasing or decreasing inputs or outcomes to match those of the organization, quitting, distorting perceptions, or changing the referent. In terms of family-friendly work environments, if an employee perceives he or she is giving more to the organization (e.g., working long hours, helping coworkers) than he or she is getting in return (e.g., no flexibility in schedule or little vacation time), he or she will attempt to reduce the inequity. That is, the employee may decrease his or her effort in order to rebalance the perceived relationship. The same logic applies to social exchange theory (Blau, 1964), whereby the employee reciprocates that which he or she feels the organization is providing.

Job Performance. Individuals who perceive an organization as supportive of their needs may feel indebted to the organization and therefore reciprocate the exchange by increasing their performance, their efficacy beliefs, and their intentions to remain with the organization. Previous findings concerning family-friendly work environments and job-related outcomes support these assertions. For example, Kossek and Ozeki (1999) identified five studies (Dunham, Pierce, & Castenada, 1987; Kossek & Nichol, 1992; Orthner & Pittman, 1986; Pierce & Newstrom, 1982, 1983) that examined the effects of a family-friendly work environment on job performance. Results were generally positive but small.

The small effects in the five studies reported by Kossek and Ozeki (1999) could be because actual family-friendly policies were measured when perceptions may be the more appropriate measure (Allen, 2001; James & McIntyre, 1996). Studies have consistently shown, for example, that work environment perceptions (specifically, perceived organizational support) are related to higher job performance (e.g., Bhanthumnavin, 2003; Rhoades & Eisenberger, 2002). Similarly, Eaton (2003) contrasted perceptions of work-family culture with actual policies and found that perceptions of the usability of work-family policies were more strongly related to organizational productivity than were the actual policies. We therefore propose the following hypothesis:

Hypothesis 2a: Family-friendly work environment perceptions are positively related to job performance.

Collective Efficacy. As discussed earlier, one subjective measure of organizational outcomes is the individual's perception of future performance, which relates closely to self- and collective efficacy. Just as we argued that a negative relationship between work-life conflict and efficacy beliefs could be due to unsuccessful past performances, we argue that successful past performance facilitated by family-friendly policies could result in a positive relationship between perceptions of family-friendly work environments and efficacy beliefs. That is, if an employee was able to successfully complete his or her duties with help from family-friendly policies, such as a flexible work schedule or onsite childcare, then he or she should feel just as capable of performing well in the future, provided he or she perceives the same benefits to be available and usable. Empirical evidence supports the logic that a positive relationship should exist between perceptions of a family-friendly work environment and efficacy beliefs (Bhanthumnavin, 2003). Based on the preceding logic and the extant literature, the following hypothesis is presented:

Hypothesis 2b: Family-friendly work environment perceptions are positively related to collective efficacy perceptions.

Turnover Intentions. Researchers have also become increasingly interested in the effects of family-friendly policies on turnover and turnover intentions (e.g., Huselid, 1995). While the majority of studies have reported a negative relationship between family-friendly policies and intentions to leave (e.g., Grover & Crooker, 1995; Kossek & Ozeki, 1999), a few have found no relationship (Dalton & Mesch, 1990; Dunham et al., 1987). Despite the inconsistent findings, we expect a negative relationship to exist between perceptions of a family-friendly work environment and turnover intentions, because individuals will be less likely to want to leave an organization they feel is treating them fairly and allowing them the flexibility necessary to manage their work and personal lives with greater ease. This logic is supported in the organizational justice literature, where withdrawal behaviors (including absenteeism, turnover, and neglect) typically relate negatively to perceptions of procedural, distributive, and informational justice (Colquitt, Conlon, Wesson, Porter, & Ng, 2001). We therefore propose the following hypothesis:

Hypothesis 2c: Family-friendly work environment perceptions are negatively related to turnover intentions.

MODERATING EFFECTS OF FAMILY-FRIENDLY WORK ENVIRONMENT PERCEPTIONS

Following the same social exchange logic, the negative effects of work-life conflict on performance may be lessened when an individual perceives the organization to be family-friendly. Equity theory and social exchange theory suggest that a healthy relationship is based on an equitable exchange between the two parties: employees who felt they were giving more than the organization was returning would expect more from the organization to regain a sense of equity. One considerable organizational contribution is a family-friendly work environment. That is, a family-friendly work environment can act as a buffering agent between employee conflict and organizational outcomes.


Bourg and Segal (1999) examined the impact of the military's family-friendly policies and practices on the perceived conflict between enlisted soldiers' families and units. They concluded that "responsiveness to families on the part of the military will lessen the degree of conflict between the two greedy institutions of the military and the family" (p. 647). They further noted that such policies and practices can serve as a way for the organization (i.e., the military in their study) "to send a message to soldiers and family members that the family is no longer viewed as a competing outside influence" (p. 648).

The mere presence of family-friendly policies, however, is not enough (Behson, 2002; Raabe & Gessner, 1988). That is, the employee's perceptions of the workplace are more important than the workplace itself in shaping attitudinal and behavioral organizational responses (Allen, 2001; James & McIntyre, 1996). For example, two individuals working for the same organization could perceive the family-friendliness of the work environment completely differently. The first employee may perceive that there are resources available to reduce the conflict felt between the demands of work and home. Holding these perceptions, he or she may be more likely to pursue these resources and receive their associated benefits than if the perceptions were absent, and would therefore be less likely to experience the detrimental effects associated with the conflict. The second employee, however, may not feel there are adequate resources available, despite having the same actual resource availability. He or she may not pursue, and therefore not get, resources that could be beneficial in reducing work-life conflict, and would therefore be more likely to experience the adverse effects of the work-life conflict. This logic leads to the following hypotheses (see the sketch after the hypotheses for the form these tests take):

Hypothesis 3a: Family-friendly work environment perceptions moderate the work-life conflict-job performance relationship.

Hypothesis 3b: Family-friendly work environment perceptions moderate the work-life conflict-collective efficacy perceptions relationship.

Hypothesis 3c: Family-friendly work environment perceptions moderate the work-life conflict-turnover intentions relationship.
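
Hypotheses of this form are commonly tested with moderated regression, i.e., an interaction between the conflict and environment-perception scores; the sketch below illustrates that approach with simulated, hypothetical data (it is not the study's actual analysis). The Hypothesis 4 tests below take the same form, with number of family responsibilities as the moderator.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 230  # same order of magnitude as the study's sample

conflict = rng.normal(0, 1, n)   # standardized work-life conflict score
ff_env = rng.normal(0, 1, n)     # family-friendly environment perceptions

# Simulated outcome: conflict hurts performance less when ff_env is high,
# i.e., a buffering (moderating) effect.
performance = (-0.4 * conflict + 0.3 * ff_env
               + 0.2 * conflict * ff_env + rng.normal(0, 1, n))

df = pd.DataFrame({"conflict": conflict, "ff_env": ff_env,
                   "performance": performance})

# An H3a-style test: a significant conflict:ff_env term indicates moderation.
fit = smf.ols("performance ~ conflict * ff_env", data=df).fit()
print(fit.summary().tables[1])
```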

One concern with family-friendly work environments is that they discriminate against the single, childless employee (Rothausen, Gonzalez, Clarke, & O'Dell, 1998). When employees leave work early to attend to a sick child or spouse, the remaining employees must compensate for their absence. This could lead the remaining employees, who do not have such demands, to feel resentment toward the absent employees, and possibly toward the organization. Furthermore, by allowing employees with families to leave as needed (i.e., providing a family-friendly work environment), the remaining employees may be adversely affected in terms of their performance. Few empirical studies, however, have tested the notion that family-friendly work environments may have negative effects on employees with fewer family responsibilities or demands.

Because a family-friendly work environment is intended to assist employees with families, employees who are more likely to need such an environment (i.e., employees with more family responsibilities) should benefit from it more than those who are less likely to need it. Behson (2002) noted that "organizational policies, programs, and attitudes that specifically address the topic of work-family balance may be of limited salience to 'non-familied' employees" (p. 67). So, whereas the policies may not be detrimental to employees without families, they may not be beneficial to them either. With this in mind, we propose that the number of family responsibilities moderates the relationship between family-friendly work environment perceptions and performance. Specifically, individuals with more family responsibilities will benefit more from positive family-friendly work environment perceptions than those with fewer family responsibilities. We therefore propose the following hypotheses:

Hypothesis 4a: Number of family responsibilities moderates the relationship between family-friendly work environment perceptions and job performance.

Hypothesis 4b: Number of family responsibilities moderates the relationship between family-friendly work environment perceptions and collective efficacy perceptions.

Hypothesis 4c: Number of family responsibilities moderates the negative relationship between family-friendly work environment perceptions and turnover intentions.

CURRENT STUDY

Work stressors, work-life conflict, and family-friendly work environments are issues that are salient to military and civilian workers alike. First, the military population is similar to many civilian organizations in that it has excessive work stressors such as long hours, unpredictable schedules, and high levels of perceived overload (Castro & Adler, 1999). Second, as in many civilian organizations (Allen, 2001), family-friendly policies are standard in the military (Department of Defense, 1996). Finally, the military has been described as a reflection of civilian society (Martin, Rosen, & Sparacino, 2000), and thus shares similar interrelationships between work and family.

The current study investigated how family-friendly work environments act as a buffering mechanism in the stressor-strain relationship. Specifically, we examined employees' perceptions of family-friendly work environments and how these perceptions directly and indirectly affected both subjective and objective measures of performance.

METHOD

Participants

The participants were soldiers (N = 230) stationed in Europe. All were active duty U.S. Army personnel with an average of 8 years in the military: 61.4% were non-commissioned officers, 31.3% junior enlisted soldiers, and 7.4% commissioned officers. The sample was predominantly male (84.8%), and the largest ethnic group was White (51.8%), followed by African-American (27.9%), Hispanic (10.6%), and other (9.8%). In terms of marital status, 64.6% of the participants were married, 24.5% had never been married (single), and 11.0% were separated or divorced. Approximately half of the sample (51.1%) had children living at home.

Procedure

This paper is part of a larger study examining the effects of workload on individual and organizational outcomes. Military personnel in 10 units stationed in Germany and Italy were surveyed every three months for two years. Questionnaires were administered on-site at the military unit by research principal investigators or trained research assistants, with follow-up data collections to obtain data from absent personnel. We included only data obtained from January 2001 to May 2001, because questionnaires during this period included scales assessing perceptions of a family-friendly work environment. In addition to the survey items, research staff collected actual performance measures by visiting the units approximately three months after the survey administration and recording physical training and marksmanship scores that coincided with the survey data. Participants were included only if complete data were available from both the survey and unit records. See Castro, Adler, and Bienvenu (1998) for a full description of the methodology.

Measures

Family Responsibilities. Family responsibilities were determined by combining marital status (married = 1, single = 0) and the number of children living at home. For example, if the individual was married and had two children living at home, family responsibilities would be three.
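To make the composite concrete, the following is a minimal sketch of this scoring rule; the column names are hypothetical, since the paper does not report the actual variable names in its data set.

    import pandas as pd

    # Hypothetical survey records; marital_status and n_children_at_home
    # are illustrative column names, not the study's actual variables.
    df = pd.DataFrame({
        "marital_status": ["married", "single", "married"],
        "n_children_at_home": [2, 0, 1],
    })

    # Married = 1, single = 0, plus the number of children living at home,
    # so a married soldier with two children scores 3.
    df["family_responsibilities"] = (
        (df["marital_status"] == "married").astype(int) + df["n_children_at_home"]
    )
    print(df["family_responsibilities"].tolist())  # [3, 0, 2]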

Work-Life Conflict. Work-life conflict was measured using a four-item scale modified by Gutek, Searle, and Klepa (1991; based on Kopelman et al., 1983). This scale was designed to measure the extent to which work roles interfere with life roles. Sample items include "After work, I come home too tired to do some of the things I'd like to do" and "On the job, I have so much work to do that it takes away from my personal interests." Response choices ranged from 1 (strongly disagree) to 5 (strongly agree). Scores were calculated by summing all items.

Family-Friendly Work Environment. The extent to which the environment is perceived as family-friendly was assessed using items adapted from Allen's (2001) measure of family supportive organizational perceptions. Sample items from the eight-item scale include "In this unit, it is assumed that the most productive employees put their work before their family" and "In this unit, it is assumed that work should be the primary priority in the employees' lives" (both items reverse-scored). Response choices ranged from 1 (strongly disagree) to 5 (strongly agree). Scores were calculated by summing all items.
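A minimal sketch of this scoring procedure follows, assuming hypothetical item columns; which two columns are reverse-keyed is an assumption for illustration, since the paper identifies only two sample items as reverse-scored.

    import pandas as pd

    # Two hypothetical respondents answering the eight FFWE items (1-5).
    items = [f"ffwe_{i}" for i in range(1, 9)]
    df = pd.DataFrame([[4, 5, 2, 3, 1, 2, 3, 4],
                       [2, 1, 4, 4, 5, 4, 3, 2]], columns=items)

    def score_scale(data, item_cols, reverse=(), lo=1, hi=5):
        """Sum a Likert scale after reflecting reverse-keyed items: x -> (hi + lo) - x."""
        scored = data[item_cols].copy()
        for col in reverse:
            scored[col] = (hi + lo) - scored[col]
        return scored.sum(axis=1)

    # Assumed reverse-keyed columns, for illustration only.
    df["ffwe_total"] = score_scale(df, items, reverse=["ffwe_1", "ffwe_2"])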

Job Performance. Objective performance ratings were obtained from unit records. Military personnel are tested on their shooting capability twice each year, and all soldiers must obtain qualifying scores with their assigned weapon. In the current study, weapon scores (i.e., marksmanship scores) were based on M16 total scores from the participant's most recent qualification record. The M16 is the standard weapon issued to all enlisted military personnel. The possible range of scores was 0 to 40, with a score of 24 necessary to qualify.

Participants' total physical training scores were also used. All soldiers are required to take a physical fitness test twice each year, consisting of a two-mile run and the total numbers of push-ups and sit-ups that can be performed in two minutes each. The run time and the numbers of push-ups and sit-ups are then standardized based on sex and age, with scores ranging from 0 to 100 for each event. Physical training scores were calculated by summing the standardized sit-up, push-up, and run scores.
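As a worked illustration of this composite (a sketch only; the Army's age- and sex-specific standardization tables are not reproduced here), the total is simply the sum of the three standardized event scores:

    def pt_total(pushup_std: int, situp_std: int, run_std: int) -> int:
        """Sum of the three standardized event scores (0-100 each, 0-300 total)."""
        return pushup_std + situp_std + run_std

    print(pt_total(78, 85, 90))  # 253, near the sample mean of 249.65 in Table 1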

Collective Efficacy. A four-item combat readiness scale that assessed participants' perceptions of their future level of performance was used to measure perceptions of collective efficacy (Vaitkus, 1994). This measure has been used in previous studies to assess collective efficacy (e.g., Jex & Bliese, 1999). Sample items include "I think my unit would do a better job in combat than most US Army units" and "I have real confidence in my unit's ability to perform its mission." Response choices ranged from 1 (strongly disagree) to 5 (strongly agree). Scores were calculated by summing all items.

Turnover Intentions. Turnover intentions were measured with a single item: "Which best describes your current active-duty Army career intentions?" The response options were: (1) definitely stay in until retirement; (2) probably stay in until retirement; (3) definitely stay in beyond present obligation, but not until retirement; (4) undecided; (5) probably leave upon completion of current obligation; or (6) definitely leave upon completion of current obligation. This item has been used in previous military research to measure career intent (Tremble, Payne, Finch, & Bullis, 2003). Previous studies have found that one-item measures can be psychometrically comparable to multiple-item measures (Gardner, Cummings, Dunham, & Pierce, 1998; Wanous, Reichers, & Hudy, 1997).

RESULTS

The means, standard deviations, and reliabilities (where appropriate) for all of the key variables are presented in Table 1.

Table 1
Means, Standard Deviations, and Correlations between Work-Life Variables, Organizational Outcomes, and Control Variables

                               Mean      SD      1       2       3       4       5       6       7      8
Work and Life Variables
1. FFWE                        18.44    4.74   (.82)
2. Work-Life Conflict          14.30    3.64   -.49**  (.93)
Organizational Outcomes
3. Collective Efficacy         12.53    3.67    .25**  -.17*   (.71)
4. Physical Training Scores   249.65   31.98    .18**   .02     .23**   --
5. Marksmanship Scores         31.84    5.37    .13    -.13     .04     .01     --
6. Turnover Intentions          3.23    1.93   -.29**   .22**  -.25**  -.18**  -.32**   --
Demographics
7. Sex                          1.85     .36   -.01     .09     .05    -.07     .23**  -.17**   --
8. Rank                        15.64    2.70    .22**   .02     .15*    .20**   .09    -.39**  -.05    --

Note. Coefficient alphas are presented in parentheses on the diagonal. For sex, females are coded as 1 and males are coded as 2. FFWE = family-friendly work environment. N = 230. ** p < .01; * p < .05.

Sex and rank were used as control variables in all analyses. Because moderated regression carries a high likelihood of Type II error (Aiken & West, 1991), we selected an alpha level of .10 when testing interactions; a .05 alpha level was used for all other analyses.
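The three-step analysis summarized in Tables 2 through 4 can be sketched as follows. This is a hedged reconstruction rather than the authors' code: variable names are illustrative, and mean-centering the predictors before forming the product term follows the Aiken and West (1991) convention rather than anything stated in the paper.

    import statsmodels.api as sm

    def hierarchical_moderated_regression(df, outcome, x1="wlc", x2="ffwe"):
        """Steps: (1) controls, (2) + centered main effects, (3) + interaction."""
        y = df[outcome]
        df = df.assign(x1_c=df[x1] - df[x1].mean(), x2_c=df[x2] - df[x2].mean())
        df = df.assign(x1x2=df["x1_c"] * df["x2_c"])
        steps = [["sex", "rank"],
                 ["sex", "rank", "x1_c", "x2_c"],
                 ["sex", "rank", "x1_c", "x2_c", "x1x2"]]
        r2_prev, fit = 0.0, None
        for i, cols in enumerate(steps, start=1):
            fit = sm.OLS(y, sm.add_constant(df[cols])).fit()
            print(f"Step {i}: R2 = {fit.rsquared:.2f}, "
                  f"delta R2 = {fit.rsquared - r2_prev:.2f}")
            r2_prev = fit.rsquared
        return fit  # Step 3 model; fit.params holds the B weights

With an analogous product term for family responsibilities in place of work-life conflict, the same scaffold covers the Table 3 and Table 4 analyses.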

Hypothesis 1a proposed that work-life conflict was negatively related to job performance. We found no support for this hypothesis. Hypothesis 1b proposed that work-life conflict was negatively related to perceptions of future performance. This hypothesis was supported, with work-life conflict being negatively related to perceptions of future combat performance (β = -.177, p = .007); the control variables (sex and rank) and work-life conflict explained 6% of the variance in perceptions of future combat performance. Hypothesis 1c proposed that work-life conflict was positively related to turnover intentions. This hypothesis was also supported (β = .25, p < .001); the control variables and work-life conflict explained 25% of the variance in turnover intentions.

Hypotheses 2a, 2b, and 2c predicted that family-friendly work environment perceptions would be positively related to job performance and to perceptions of future performance, and negatively related to turnover intentions, respectively. Hypothesis 2a was partially supported, with family-friendly work environment perceptions being related to physical training scores (β = .141, p = .036) but not to marksmanship scores. Hypotheses 2b and 2c were supported, with significant relationships between family-friendly work environment perceptions and perceptions of future combat performance (β = .223, p = .001) and turnover intentions (β = -.235, p < .001). After controlling for sex and rank, family-friendly work environment perceptions accounted for 6% (physical training), 8% (combat readiness), and 30% (turnover intentions) of the variance in the outcome measures.

Hypotheses 3a, 3b, and 3c proposed that family-friendly work environment perceptions would moderate the relationships between work-life conflict and job performance, perceptions of future performance, and turnover intentions, respectively. Only Hypothesis 3a was supported, with the interaction shown in Figure 1: family-friendly work environment perceptions moderated the relationship between work-life conflict and physical training scores (Table 2).

Figure 1
Interaction between Work-Life Conflict, FFWE, and Physical Training Scores
[Line graph: physical training score (y-axis, 300-425) plotted against work-life conflict (low to high), with separate lines for low and high FFWE.]

Table 2
Interaction between Work-Life Conflict and FFWE Perceptions on Physical Training Scores

Variable                          B       SE B     β       R2    ∆R2
Step 1:                                                   .04    .04*
  Sex                           -5.05     5.78    -.06
  Rank                           2.30**    .77    -.07
Step 2:                                                   .08    .04*
  Sex                           -6.23     5.74    -.07
  Rank                           1.68*     .79
  FFWE                           1.46**    .52     .22
  Work-Life Conflict             1.16      .66
Step 3:                                                   .11    .03**
  Sex                           -5.75     5.65    -.07
  Rank                           1.81*     .78     .15
  FFWE                           1.35**    .51     .20
  Work-Life Conflict              .89      .66     .10
  FFWE x Work-Life Conflict       .27**    .09     .20

Notes. N = 289. FFWE = family-friendly work environment. For sex, females are coded as 1 and males are coded as 2. The B weights shown are from the step of entry into the model. ** p < .01. * p < .05.



For employees with both high and low family-friendly work environment perceptions, higher levels of work-life conflict were related to higher physical training scores. The physical training scores, however, were significantly higher for employees with higher family-friendly work environment perceptions.
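One way to unpack such an interaction is to compute the simple slope of work-life conflict at low (-1 SD) and high (+1 SD) levels of the moderator. A sketch, assuming the centered-predictor Step 3 model from the earlier regression sketch (variable names carried over from there):

    def simple_slopes(fit, sd_moderator):
        """Slope of work-life conflict at +/-1 SD of centered FFWE:
        b_wlc + b_interaction * ffwe_c."""
        b_wlc, b_int = fit.params["x1_c"], fit.params["x1x2"]
        return {"low_ffwe": b_wlc - b_int * sd_moderator,
                "high_ffwe": b_wlc + b_int * sd_moderator}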

Hypotheses 4a, 4b, and 4c proposed that the number of family responsibilities would moderate the relationships between family-friendly work environment perceptions and job performance, perceptions of future combat performance, and turnover intentions, respectively. As shown in Tables 3 and 4, family responsibilities moderated the relationship between family-friendly work environment perceptions and marksmanship scores (Figure 2), providing partial support for Hypothesis 4a, and the relationship between family-friendly work environment perceptions and turnover intentions (Figure 3), providing full support for Hypothesis 4c.

Table 3
Interaction between Family Responsibilities and FFWE Perceptions on Marksmanship Scores

Variable                                B       SE B     β      R2    ∆R2
Step 1:                                                        .06    .06*
  Sex                                  3.05**   1.12     .23
  Rank                                  .24      .21     .10
Step 2:                                                        .08    .02
  Sex                                  2.80**   1.67     .21
  Rank                                  .18      .21     .07
  FFWE                                  .18      .11     .14
  Family Responsibilities               .26      .33     .07
Step 3:                                                        .11    .03*
  Sex                                  3.20**   1.17     .24
  Rank                                  .11      .21     .05
  FFWE                                  .15      .11     .11
  Family Responsibilities               .25      .32     .07
  FFWE x Family Responsibilities        .18*     .08     .18

Notes. N = 289. FFWE = family-friendly work environment. For sex, females are coded as 1 and males are coded as 2. The B weights shown are from the step of entry into the model. ** p < .01. * p < .05.

Table 4
Interaction between Family Responsibilities and FFWE Perceptions on Turnover Intentions

Variable                                B       SE B     β      R2    ∆R2
Step 1:                                                        .18    .18**
  Sex                                 -1.02**    .32    -.19
  Rank                                 -.28      .04    -.39
Step 2:                                                        .30    .12**
  Sex                                  -.68**    .31    -.13
  Rank                                 -.19*     .04    -.26
  FFWE                                 -.09**    .02    -.24
  Family Responsibilities              -.39**    .08    -.30
Step 3:                                                        .31    .01~
  Sex                                  -.74*     .31    -.14
  Rank                                 -.18**    .04    -.25
  FFWE                                 -.10**    .02    -.24
  Family Responsibilities              -.41**    .08    -.31
  FFWE x Family Responsibilities       -.02~     .02    -.10

Notes. N = 289. FFWE = family-friendly work environment. For sex, females are coded as 1 and males are coded as 2. The B weights shown are from the step of entry into the model. ** p < .01. * p < .05. ~ p < .10.


Figure 3
Interaction between FFWE, Family Responsibilities, and Turnover Intentions

Specifically, employees with more family responsibilities had higher marksmanship scores if they perceived their environment to be family-friendly, compared with employees with fewer family responsibilities. Similarly, employees with more family responsibilities were more likely than employees with fewer family responsibilities to indicate that they would remain in the military if they perceived their work environment to be friendly toward families. No support was found for Hypothesis 4b.

DISCUSSION

One solution for dealing with high work-life conflict is establishing a family-friendly work environment. The current study explored how social exchange theory and equity theory can help explain the relationships between work-life conflict and work outcomes in relation to family-friendly work environments. Whereas previous studies have examined work outcomes and family-friendly work environments, few have assessed both subjective and objective performance measures.

We found mixed results in examining the relationships between work-life conflict and performance and organizational outcomes. Whereas perceptions of work-life conflict were related to perceptions of future combat performance and to turnover intentions, they were not related to individual performance measures (i.e., marksmanship or physical training scores). One possible explanation for this pattern of findings is that soldiers, as well as unit leaders (see below), will always ensure that individual performance remains relatively high, particularly for skills such as physical training and marksmanship, even when other work-life demands are high. Soldiers are particularly motivated to maintain high physical fitness and marksmanship scores, as both are important factors that directly and indirectly determine job advancement. Indeed, as Greenhaus et al. (1987) have pointed out, there are many job skills that must be maintained at a high level of performance. Physical fitness and marksmanship certainly fall into this category, and the maintenance of such skills appears to trump other responsibilities and demands, both in and outside the work domain.

Within the work domain, while employees may continue to complete their expected job duties and maintain the skills directly tied to their own promotion and performance evaluations, they might be less likely to perform extra, non-mandated duties, especially when other demands become high. Whereas the individual's required performance would remain unchanged, not performing these additional tasks could adversely affect the organization. Such non-required tasks have been referred to as organizational citizenship behaviors and have been shown to be linked to organizational effectiveness (Chen, Hui, & Sego, 1998; Podsakoff & MacKenzie, 1997).

Perhaps the most interesting finding in our study in terms of unit readiness is that while work-life conflict did not affect individual job performance measures such as physical training and marksmanship, it was related to soldiers' perceptions of team performance in future combat. Military units must function effectively as teams during war in order to be successful. The present data indicated that although soldiers' job performance skills were unrelated to work-life conflict, soldiers became more pessimistic about the future combat performance of their group as their work-life conflict increased. Given that efficacy beliefs have been consistently linked to performance (e.g., Sadri & Robertson, 1993; Stajkovic & Luthans, 1998), these findings suggest that work-life conflict can be detrimental to the group, even if not directly through decreased performance of individual team members.

Our hypotheses that perceptions of a family-friendly work environment would moderate the relationship between work-life conflict and organizational outcomes were based on social exchange and equity theories. Disappointingly, only one of these hypotheses was partially supported: perceptions of a family-friendly work environment moderated the relationship between work-life conflict and physical training scores. Surprisingly, however, the nature of the interaction was in the opposite direction from that predicted. That is, higher work-life conflict was related to higher physical training scores in general, and especially when perceptions of a family-friendly work environment were high. One possible explanation for this anti-buffering effect has to do with the nature of the performance measure, physical training. Physical fitness training is mandatory training that usually occurs early in the morning, three times a week. Thus, while a routine physical fitness training program will no doubt lead to higher physical fitness scores, the time at which the training is conducted is also likely to interfere with morning family responsibilities, such as helping to get children fed, dressed, and transported to daycare or school.

We hypothesized direct relationships between family-friendly work environment perceptions and performance and organizational outcomes. These hypotheses were generally supported: family-friendly work environment perceptions were related to individual job performance (i.e., physical training scores), perceptions of future combat performance, and turnover intentions. We also found that perceptions of a family-friendly work environment moderated the relationship between work-life conflict and physical training. In other words, regardless of the level of work-life conflict, employees who perceived their organization to have a family-friendly work environment had higher physical training scores. These results suggest that perceptions of a family-friendly work environment are important regardless of the level of work-life conflict.

Researchers have suggested that family-friendly work policies might benefit only those employees with family demands or responsibilities, while penalizing those without such demands (Jenner, 1994). We examined this by testing whether the number of family responsibilities an employee has moderates the relationship between perceptions of a family-friendly work environment and organizational outcomes. This hypothesis was only partially supported: employees with more family responsibilities had higher marksmanship scores if they perceived their environment to be family-friendly, compared with employees with fewer family responsibilities. More importantly, perceived family-friendly environments did not hinder the performance of individuals with fewer family responsibilities. Contrary to Jenner's (1994) speculation, these results suggest that individuals who do not directly benefit from perceived family-friendly environments are not hindered by them either. It is possible that even employees without family responsibilities will support family-friendly work environments with the expectation that when they have families, they too will benefit.

Limitations

There were several limitations to the current study. The results are based on a military sample and therefore may not generalize to a civilian population. Furthermore, the military units studied were stationed overseas, where life stressors may have been higher than those of a stateside sample, including a higher likelihood of deployments, separation from personal support networks (e.g., parents), and everyday cultural differences. Finally, turnover is viewed differently in the military than in civilian organizations: whereas most military members are obligated to fulfill a written contract specifying their length of military service, civilians have relatively more freedom in job movement. These differences may have affected the ratings of turnover intentions. Nevertheless, the effect of these limitations would likely have been range restriction, making it harder to find relationships should they exist. Therefore, the use of a military sample should not be considered a limitation of this study.

Another limitation is the nature of the data collection procedure. Specifically, cross-sectional data do not allow us to establish causal relationships; without longitudinal data we are unable to determine the direction of the relationships we examined. For example, it could be that individuals who score higher on their physical training tests are subsequently treated better by the organization and therefore view the organization as family-friendly. Future studies should examine the effects of work-life conflict, family-friendly work environment perceptions, and organizational outcomes using a longitudinal design.

Implications and Future Studies

Our results support the underlying tenets of social exchange theory and equity theory. That is, the relationship between the employee and the organization is a balancing act. As the organization's demands come to dominate the relationship, the employee will attempt to tip the scale back in his or her favor. This response may take the form of an attitude shift (i.e., perceptions of future performance and career intentions) or an actual behavioral change (performance). Organizations have the ability to overcome some of the negative effects that can occur in an unbalanced relationship; the current study showed that maintaining a family-friendly work environment is one way to modify imbalances between the employee and the organization.

The U.S. military has numerous programs to assist families, yet having programs is not always enough (Secret & Sprang, 2001). As the present study shows, perceptions of a family-friendly work environment are also an important factor in key organizational outcomes. While senior leadership is usually responsible for establishing policy that encourages a family-friendly work environment, it is up to local leadership to foster and support the policy in order to create a family-friendly culture.

There appear to be direct beneficial outcomes associated with family-friendly work environments, and there may also be more indirect benefits. An individual's career choice may be due partly to the family-friendly culture of the work environment, so future studies should examine variables such as recruitment and applicant attraction (Rau & Hyland, 2002).

The current trend is for organizations to adopt family-friendly policies with the intent of improving and retaining effective employees. We suggest that organizations may want to expand the notion of a "family-friendly work environment" to an "employee- or life-friendly work environment." Single and married employees alike have pressures and responsibilities that extend beyond the family and that can interfere with their ability to perform their jobs successfully. Lockwood (2003) suggested that the trend in work-family research is to broaden the term from work-family to work-life; we suggest the same be done for the policies that benefit these interrelated domains.


REFERENCES

Abbasi, S. M., & Hollman, K. W. (2000). Turnover: The real bottom line. Public Personnel Management, 29, 333-343.

Adams, G. A., King, L. A., & King, D. W. (1996). Relationships of job and family involvement, family social support, and work-family conflict with job and life satisfaction. Journal of Applied Psychology, 81, 411-420.

Adams, J. S. (1963). Toward an understanding of inequity. Journal of Abnormal and Social Psychology, 67, 422-436.

Adams, J. S. (1965). Inequity in social exchange. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 2, pp. 267-299). New York: Academic Press.

Adler, A. B., Thomas, J., & Castro, C. A. (2002, August). Measuring up: A comparison of self-reports and unit records for assessing soldier performance. Paper presented at the annual meeting of the American Psychological Association, Chicago, IL.

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.

Aldous, J. (1990). Specification and speculation concerning the politics of workplace family policy. Journal of Family Issues, 11, 921-936.

Allen, T. D. (2001). Family-supportive work environments: The role of organizational perceptions. Journal of Vocational Behavior, 58, 414-435.

Allen, T. D., Herst, D. E. L., Bruck, C. S., & Sutton, M. (2000). Consequences associated with work-to-family conflict: A review and agenda for future research. Journal of Occupational Health Psychology, 5, 278-308.

Aryee, S. (1992). Antecedents and outcomes of work-family conflict among married professional women: Evidence from Singapore. Human Relations, 45, 813-837.

Bandura, A. (1982). Self-efficacy mechanism in human agency. American Psychologist, 37, 122-147.

Bandura, A. (1984). Recycling misconceptions of perceived self-efficacy. Cognitive Therapy and Research, 8, 231-255.

Behson, S. J. (2002). Which dominates? The relative importance of work-family organizational support and general organizational context on employee outcomes. Journal of Vocational Behavior, 61, 53-72.

Bhanthumnavin, D. (2003). Perceived social support from supervisor and group members' psychological and situational characteristics as predictors of subordinate performance in Thai work units. Human Resource Development Quarterly, 14(1), 79-97.

Blau, P. M. (1964). Exchange and power in social life. New York: Wiley.

Bommer, W. H., Johnson, J. L., Rich, G. A., Podsakoff, P. M., & MacKenzie, S. B. (1995). On the interchangeability of objective and subjective measures of employee performance: A meta-analysis. Personnel Psychology, 48, 587-605.

Bourg, C., & Segal, M. W. (1999). The impact of family supportive policies and practices on organizational commitment to the Army. Armed Forces & Society, 25, 633-652.

Burke, R. J. (1988). Some antecedents and consequences of work-family conflict. Journal of Social Behavior and Personality, 3, 287-302.

Cascio, W. F. (1998). Applied psychology in human resource management (5th ed.). Upper Saddle River, NJ: Prentice Hall.

Castro, C. A., & Adler, A. B. (1999, Autumn). The impact of operations tempo on soldier and unit readiness. Parameters, 86-95.



Castro, C. A., Adler, A. B., & Bienvenu, R. V. (1998). A human dimensions assessment of the impact of OPTEMPO on the forward-deployed soldier (WRAIR Protocol #700). Silver Spring, MD: Walter Reed Army Institute of Research.

Chen, X., Hui, C., & Sego, D. J. (1998). The role of organizational citizenship behavior in turnover: Conceptualization and preliminary tests of key hypotheses. Journal of Applied Psychology, 83, 922-931.

Colquitt, J. A., Conlon, D. E., Wesson, M. J., Porter, C. O. L. H., & Ng, K. Y. (2001). Justice at the millennium: A meta-analytic review of 25 years of organizational justice research. Journal of Applied Psychology, 86, 425-445.

Dalton, D. R., & Mesch, D. J. (1990). The impact of flexible scheduling on employee attendance and turnover. Administrative Science Quarterly, 35, 370-387.

Department of Defense. (1996, December). Department of Defense and families: A total force partnership. Author.

Dunham, R. B., Pierce, J. L., & Castaneda, M. B. (1987). Alternative work schedules: Two field quasi-experiments. Personnel Psychology, 40, 215-242.

Eaton, S. C. (2003). If you can use them: Flexibility policies, organizational commitment, and perceived performance. Industrial Relations, 42, 145-167.

Eisenberger, R., Huntington, R., Hutchison, S., & Sowa, D. (1986). Perceived organizational support. Journal of Applied Psychology, 71, 500-507.

Festinger, L. (1957). A theory of cognitive dissonance. Stanford, CA: Stanford University Press.

Fredriksen-Goldsen, K. I., & Scharlach, A. E. (2001). Families and work: New directions in the twenty-first century. New York: Oxford University Press.

Frone, M. R. (2000). Work-family conflict and employee psychiatric disorders: The national comorbidity survey. Journal of Applied Psychology, 85, 888-895.

Frone, M. R. (2003). Work-family balance. In J. C. Quick & L. E. Tetrick (Eds.), Handbook of occupational health psychology. Washington, DC: American Psychological Association.

Frone, M. R., Russell, M., & Barnes, G. M. (1996). Work-family conflict, gender, and health-related outcomes: A study of employed parents in two community samples. Journal of Occupational Health Psychology, 1, 57-69.

Frone, M. R., Russell, M., & Cooper, M. L. (1992). Antecedents and outcomes of work-family conflict: Testing a model of the work-family interface. Journal of Applied Psychology, 77, 65-78.

Frone, M. R., Yardley, J. K., & Markel, K. S. (1997). Developing and testing an integrative model of the work-family interface. Journal of Vocational Behavior, 50, 145-167.

Gardner, D. G., Cummings, L. L., Dunham, R. B., & Pierce, J. L. (1998). Single-item versus multiple-item measurement scales: An empirical examination. Educational and Psychological Measurement, 58, 898-915.

Gist, M. E. (1987). Self-efficacy: Implications for organizational behavior and human resource management. Academy of Management Review, 12, 472-485.

Glass, J. S., & Finley, A. (2002). Coverage and effectiveness of family-responsive workplace policies. Human Resource Management Review, 12, 313-337.

Greenhaus, J. H., Bedeian, A. G., & Mossholder, K. W. (1987). Work experiences, job performance, and feelings of personal and family well-being. Journal of Vocational Behavior, 31, 200-215.


Greenhaus, J. H., Collins, K. M., Singh, R., & Parasuraman, S. (1997). Work and family influences on departure from public accounting. Journal of Vocational Behavior, 50, 249-270.

Greenhaus, J. H., Parasuraman, S., & Collins, K. M. (2001). Career involvement and family involvement as moderators of relationships between work-family conflict and withdrawal from a profession. Journal of Occupational Health Psychology, 6, 91-100.

Grover, S. L., & Crooker, K. J. (1995). Who appreciates family-responsive human resource policies: The impact of family-friendly policies on the organizational attachment of parents and non-parents. Personnel Psychology, 48, 271-288.

Gupta, N., & Jenkins, G. D., Jr. (1992). The effects of turnover on perceived job quality. Group and Organization Management, 17, 431-446.

Gutek, B. A., Searle, S., & Klepa, L. (1991). Rational versus gender role expectations for work-family conflict. Journal of Applied Psychology, 76, 560-568.

Hacker, C. (1997, October). The cost of poor hiring decisions…and how to avoid them. HR Focus, 5-13.

Harris, M. M., & Schaubroeck, J. (1988). A meta-analysis of self-supervisor, self-peer, and peer-supervisor ratings. Personnel Psychology, 41, 43-62.

Hart, P. M. (1999). Predicting employee life satisfaction: A coherent model of personality, work and nonwork experiences, and domain satisfactions. Journal of Applied Psychology, 84, 564-584.

Hartline, M. D., & Ferrell, O. C. (1996). The management of customer-contact service employees: An empirical investigation. Journal of Marketing, 60, 52-70.

Heneman, R. L. (1986). The relationship between supervisory ratings and results-oriented measures of performance: A meta-analysis. Personnel Psychology, 39, 811-826.

Huselid, M. A. (1995). The impact of human resource management practices on turnover, productivity, and corporate financial performance. Academy of Management Journal, 38, 635-672.

James, L. R., & McIntyre, M. D. (1996). Perceptions of organizational climate. In K. R. Murphy (Ed.), Individual differences and behavior in organizations (pp. 416-450). San Francisco: Jossey-Bass.

Jenner, L. (1994). Family-friendly backlash. Management Review, 7.

Jex, S. M. (1998). Stress and job performance: Theory, research, and implications for managerial practice. Thousand Oaks, CA: Sage.

Jex, S. M., & Bliese, P. D. (1999). Efficacy beliefs as a moderator of the impact of work-related stressors: A multilevel study. Journal of Applied Psychology, 84, 340-361.

Kahn, R. L., Wolfe, D. M., Quinn, R., Snoek, J. D., & Rosenthal, R. A. (1964). Organizational stress. New York: Wiley.

Katosh, J. P., & Traugott, M. W. (1981). The consequences of validated and self-reported voting measures. Public Opinion Quarterly, 45, 519-535.

Kleck, G. (1982). On the use of self-report data to determine the class distribution of criminal and delinquent behavior. American Sociological Review, 47, 427-433.

Kopelman, R. E., Greenhaus, J. H., & Connolly, T. F. (1983). A model of work, family, and interrole conflict: A construct validation study. Organizational Behavior and Human Performance, 32, 198-215.

Kossek, E. E., & Lobel, S. (1996). Beyond the family friendly organization. In E. E. Kossek & S. A. Lobel (Eds.), Managing diversity: Human resource strategies for transforming the workplace (pp. 221-244). Oxford: Blackwell.

Kossek, E. E., & Nichol, V. (1992). The effects of on-site child care on employee attitudes and performance. Personnel Psychology, 45, 485-509.

Kossek, E. E., & Ozeki, C. (1998). Work-family conflict, policies, and the job-life satisfaction relationship: A review and directions for future organizational behavior-human resources research. Journal of Applied Psychology, 83, 139-149.

Lockwood, N. R. (2003). Work/life balance: Challenges and solutions. Society for Human Resource Management Research Quarterly, 2, 1-11.

Major, V. S., Klein, K. J., & Ehrhart, M. G. (2002). Work time, work interference with family, and psychological distress. Journal of Applied Psychology, 87, 427-436.

Marshall, N. L., & Barnett, R. C. (1996). Family-friendly workplaces, work-family interface, and worker health. In G. P. Keita & J. J. Hurrell (Eds.), Job stress in a changing workforce: Investigating gender, diversity, and family issues (pp. 253-264). Washington, DC: American Psychological Association.

Martin, J. A., Rosen, L. N., & Sparacino, L. R. (2000). The military family: A practice guide for human service providers. Westport, CT: Praeger.

Netemeyer, R. G., Boles, J. S., & McMurrian, R. (1996). Development and validation of work-family conflict and family-work conflict scales. Journal of Applied Psychology, 81, 400-410.

Orthner, D. K., & Pittman, J. F. (1986). Family contributions to work commitment. Journal of Marriage and the Family, 48, 573-581.

Perrewe, P. L., Treadway, D. C., & Hall, A. T. (2003). The work and family interface: Conflict, family-friendly policies, and employee well-being. In D. A. Hofmann & L. E. Tetrick (Eds.), Health and safety in organizations: A multilevel perspective (pp. 285-315). San Francisco: Jossey-Bass/Pfeiffer.

Pierce, J. L., & Newstrom, J. W. (1982). Employee responses to flexible work schedules: An inter-organization, inter-system comparison. Journal of Management, 8, 9-25.

Pierce, J. L., & Newstrom, J. W. (1983). The design of flexible work schedules and employee responses: Relationships and process. Journal of Occupational Behaviour, 4, 247-262.

Podsakoff, P. M., & MacKenzie, S. B. (1997). Impact of organizational citizenship behavior on organizational performance: A review and suggestions for future research. Human Performance, 10, 133-151.

Raabe, P. H., & Gessner, J. (1988). Employer family-supportive policies: Diverse variations on the theme. Family Relations, 37, 196-202.

Rau, B. L., & Hyland, M. M. (2002). Role conflict and flexible work arrangements: The effects on applicant attraction. Personnel Psychology, 55, 111-136.

Rhoades, L., & Eisenberger, R. (2002). Perceived organizational support: A review of the literature. Journal of Applied Psychology, 87, 698-714.

Rothausen, J. J., Gonzalez, J. A., Clark, N., & O'Dell, L. (1998). Family-friendly backlash: Fact or fiction? The case of organizations' on-site child care centers. Personnel Psychology, 51, 685-703.

Sadri, G., & Robertson, I. T. (1993). Self-efficacy and work-related behaviour: A review and meta-analysis. Applied Psychology: An International Review, 42, 139-152.

Secret, M., & Sprang, G. (2001). The effects of family-friendly workplace environments on the work-family stress of employed parents. Journal of Social Service Research, 28, 21-41.


Spector, P. E., Dwyer, D. J., & Jex, S. M. (1988). Relation of job stressors to affective, health, and performance outcomes: A comparison of multiple data sources. Journal of Applied Psychology, 73, 11-19.

Stajkovic, A., & Luthans, F. (1998). Self-efficacy and work-related performance: A meta-analysis. Psychological Bulletin, 124, 240-261.

Thomas, C. A., & Ganster, D. C. (1995). Impact of family-supportive work variables on work-family conflict and strain: A control perspective. Journal of Applied Psychology, 80, 6-15.

Thornton, G. C. (1980). Psychometric properties of self-appraisals of job performance. Personnel Psychology, 33, 263-271.

Tremble, T. R., Jr., Payne, S. C., Finch, J. F., & Bullis, R. C. (2003). Opening organizational archives to research: Analog measures of organizational commitment. Military Psychology, 15, 167-190.

Vaitkus, M. (1994). Unit manning system: Human dimensions field evaluation of the COHORT company replacement model (Technical Report ADA285942). Washington, DC.

Wanous, J. P., Reichers, A. E., & Hudy, M. J. (1997). Overall job satisfaction: How good are single-item measures? Journal of Applied Psychology, 82, 247-252.

White, G. L. (1995). Employee turnover: The hidden drain on profits. HR Focus, 72(1), 15-18.



Tracking U.S. Navy Reserve Career Decisions 3

Rorie N. Harris, Ph.D.
Jacqueline A. Mottern, Ph.D.
Michael A. White, Ph.D.
David L. Alderton, Ph.D.

U.S. Navy Personnel Research, Studies and Technology, PERS-13
Millington, TN 38055-1300
jacqueline.mottern@navy.mil

Since December 2000, the U.S. Navy Reserve has tracked the career decisions of Selected Reservists with a web-based survey, the Reserve Career Decision Support System. Designed by Navy Personnel Research, Studies and Technology (NPRST), the survey relies on adaptive question branching to keep questions relevant to the respondent, thus reducing respondent burden. The web-based survey was designed for administration at transition points (e.g., retirement, promotion, mobilization, and demobilization). In addition to the questionnaire, the system includes a near real-time query system available to commanders and career counselors to answer questions about their commands. Through the Reserve Career Decision Support System, the Navy Reserves are able to track the career decisions of Selected Reservists and assess the impact of mobilization on those decisions.

INTRODUCTION

In order to maintain a qualified, productive workforce, organizations must identify talented employees, train them effectively, and engage in actions and behaviors that encourage employees to remain with the organization. Both private and public sector organizations, such as the military, face challenges in attracting, motivating, and retaining competent employees. When highly skilled, qualified members leave, the organization suffers losses in talent, level of readiness, and the monetary costs of the training those members received. As such, military researchers continually attempt to identify and examine the factors that influence whether a member chooses to stay or leave the organization (Harris, 2003).

A key segment of the Navy personnel population that contributes greatly to the Navy's mission is the Naval Reserve. Members of the reserves are volunteers who are trained to serve the expanded needs of the Navy, and they make up almost 30% of the military personnel serving with the Navy (U.S. Navy, 2003). The Naval Reserve relies on timely and accurate retention and attrition statistics to guide its officer and enlisted personnel policies and programs. To plan and manage accession, retention, separation, and advancement targets, Naval Reserve planners, managers, and career counselors need accurate, understandable, timely, and easily accessible information on career decisions. The Naval Reserve has lacked a standardized means of collecting accurate, comprehensive data on why personnel stay in or leave the Reserves and on other critical separation and retention issues.

3. The opinions expressed are those of the authors. They are not official and do not represent the views of the U.S. Department of the Navy.

In an effort to address these issues, the U.S. Navy Personnel Research, Studies and Technology Department (PERS-1), with the Naval Reserve as project sponsor, developed a career decision support survey and query system. The career decision survey is a web-based questionnaire administered at transition points (promotion, retirement, voluntary or involuntary separation, mobilization, and demobilization) in Selected Reservists' (SELRES) careers. The web-based query system allows commands to access their data in near real-time.

THE RESERVE CAREER DECISION QUESTIONNAIRE AND QUERY SYSTEM

The questionnaire relies on extensive item branching to allow a maximum number of items with a minimum response burden. The questionnaire adapts itself to each individual based on demographics and responses to initial question topics. Answers on marital status, dependents, rank, and reason for completing the questionnaire serve as branching trunks; for example, a SELRES who is single with no dependents will not see questions concerning a spouse or dependents. In addition, a series of 14 broad items (see Table 1) also serve as major branches; for example, a SELRES who is mobilizing will only see questions related to mobilization (a sketch of this branching logic follows Table 1).

Table 1. List of Item Branching Topics

Training/Promotion/Advancement Opportunities
Education and Other Benefits
Career Assignments
Pay and Retirement
Command Climate
Civilian Job Opportunities
Time Away From Home
Mobilization
Recognition (FITREPs/Evaluations/Awards)
Navy Culture (Regulation/Discipline/Standards)
Maintenance and Logistic Support
Navy Leadership
Current Job Satisfaction
Personal and Family Life
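The branching logic can be illustrated with a small sketch. This is assumed behavior inferred from the description above, not NPRST's actual implementation; the block names and trunk rules are illustrative only.

    def select_blocks(respondent: dict) -> list:
        """Choose which question blocks a SELRES sees, given the branching trunks."""
        blocks = ["career_assignments", "command_climate", "current_job_satisfaction"]
        if (respondent.get("marital_status") == "married"
                or respondent.get("dependents", 0) > 0):
            blocks.append("personal_and_family_life")  # spouse/dependent questions
        if respondent.get("reason") in ("mobilizing", "demobilizing"):
            blocks.append("mobilization")
        return blocks

    # A single SELRES with no dependents who is mobilizing skips the family
    # block but sees the mobilization block.
    print(select_blocks({"marital_status": "single", "dependents": 0,
                         "reason": "mobilizing"}))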

The questionnaire also departs from the traditional use of satisfaction scales by using a 7-point Likert-type scale that asks whether an item has "influenced you (contributed to your decision) to stay, influenced you to leave, or had no effect on your Naval Reserve career decision." As SELRES near completion of the survey, the computer generates a list of items they have identified as strong influences to stay, and each SELRES then selects the five most important items influencing his or her career decision. A similar list of influences to leave is generated as well.
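A sketch of how such lists might be generated from a respondent's ratings on the 7-point influence scale; the cutoffs for "strong" influences (>= 6 to stay, <= 2 to leave) are assumptions for illustration.

    def influence_lists(ratings: dict, k: int = 5):
        """Return up to k strongest influences to stay and to leave.
        Ratings: 1 = influence to leave, 4 = no effect, 7 = influence to stay."""
        stay = sorted((i for i in ratings if ratings[i] >= 6),
                      key=ratings.get, reverse=True)[:k]
        leave = sorted((i for i in ratings if ratings[i] <= 2), key=ratings.get)[:k]
        return stay, leave

    ratings = {"pay_and_retirement": 7, "navy_leadership": 6,
               "time_away_from_home": 2, "civilian_job_opportunities": 1}
    print(influence_lists(ratings))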

In FY03 we added a web-based query system that gives commanders and career counselors direct access to their command data, based on RUIC and echelon. Using drop-down menus, a commander or career counselor can generate reports for all questions in the dataset, except for SSN, and can run Stay-Leave reports (a list of the most important influences to stay and to leave for officers and enlisted, in order of frequency) for their commands compared with Reserve-wide data.

METHOD

Between December 1, 2000 and February 28, 2001, the U.S. Naval Reserve asked all of its SELRES to complete a version of the web-based questionnaire. With a population of 71,300, the survey had a 71% completion rate and served as the baseline data collection. Within 30 days of the end of data collection, a detailed briefing of these data was delivered. A revised questionnaire was implemented in May 2001 (N = 13,163) and included a section on the impact of mobilization and demobilization experiences on SELRES careers.

The mobilization process offers many potential elements that could influence a member's retention decisions, including effects on family, pay and benefits, and mobilization jobs and tasks. These influences on staying in or leaving the reserves are of interest to those in a position to change the mobilization process, particularly those aspects of mobilization that most strongly influence members' retention decisions. On the Career Decisions Survey, the questions regarding mobilization are designed to examine those aspects that influence a reserve member either to remain in or to leave the Navy Reserves. Questions cover a variety of general topics, including the mobilization process in general, the gaining command, family issues, pay and benefits, effects on the civilian job, and willingness to extend the mobilization term. Within each topic, questions are rated on a scale from 1 (influence to leave) to 7 (influence to stay), with a response of 4 representing no effect on leaving or staying in the Reserves.

RESULTS

Approximately 5,700 respondents were branched to the questions on mobilization. Almost 90% of the Reservists who responded to the mobilization questions were currently mobilized. A majority of these respondents (83%) reported having been mobilized during the last year, and 85% had not been mobilized previously. Approximately three-fourths of the respondents did not volunteer for mobilization.

Gaining Command Issues

In examining issues with the gaining command that influence reserve members, most aspects either influenced staying in the reserves or had no effect in either direction. The gaining command to which members were assigned was an influence to stay for 43% of the respondents (see Fig. 1), as was the location of the gaining command (43%). Forty-two percent of the respondents reported that their treatment by Active Duty sailors influenced them to stay, and the job assigned on arrival was an influence to stay for almost 45% of the respondents.

The decision to leave was influenced by several different aspects of the gaining command (Fig. 1). The morale of the gaining command was reported to be an influence to leave by almost 50% of the respondents. Another influence to leave was the leadership at the gaining command, with 48% of respondents indicating a negative influence. Approximately 40% of respondents indicated that treatment at the gaining command was an influence to leave, although roughly the same percentage reported their treatment as an influence to stay in the Reserves.

[Figure 1. Gaining Command Questions: Influences to Stay and Leave the Reserves. Items rated (influence to leave / no effect / influence to stay, 0-100%): gaining command assigned; location of gaining command; treatment by AD sailors; job assigned on arrival; morale of gaining command; leadership at gaining command; treatment at gaining command.]

Mobilization Process Issues

Questions regarding mobilization probed issues such as satisfaction with the mobilization experience, satisfaction with mobilization assignments, and evaluation of job tasks. When asked how smooth the process was, almost 69% of respondents indicated that the process was at least moderately smooth. In terms of task assignments, approximately 70% of respondents characterized the tasks assigned to them as interesting. Almost 65% of those who responded reported that their mobilization job is or was relevant to their rank (see Fig. 2).

[Figure 2. Task assignment ratings (percentage Yes/No): assigned interesting tasks; mobilization job relevant to rank.]


Respondents were divided in their reports of the effect of the overall mobilization experience on their career decisions, with 42% reporting their experience as an influence to stay and 38% reporting it as an influence to leave (Fig. 3). The mobilization assignment received was reported as an influence to stay by 43% of the respondents. Reports on aspects such as the time given to report to the new command, the manner in which Reservists were notified about mobilization, and the time needed to get correct orders indicated that, for a majority of respondents, these issues had no strong influence on the decision to stay in or to leave the Naval Reserves.

[Figure 3. Mobilization Process Questions: Influences to Stay and Leave the Reserves. Items rated (influence to leave / no effect / influence to stay, 0-100%): overall mobilization experience; mobilization assignment received; time given to report; manner notified; time to get correct orders.]

Family Related Issues

A final area in which to examine the effects of mobilization on Reservists' decisions to stay or leave the organization is family related issues. Sixty-eight percent of respondents indicated that they are married, and almost 60% reported children living in their household. A majority of respondents said that they saw their families once a month or less. For the majority of respondents, benefits such as family use of the commissary, the exchange, and TRICARE either had no effect or served as influences to stay in the Naval Reserves. More than half of the Reservists who responded indicated that having to leave their families for mobilization and the inability to move their families were influences to leave the organization. Family concern for the Reservist's safety was an influence to leave for approximately 43% of the respondents. Other influences to leave were the effects of mobilization on children, as well as the additional stress that mobilization causes for spouses (Fig. 4).

[Figure 4. Family Related Questions: Influences to Leave. Items rated (influence to leave / no effect / influence to stay, 0-100%): family use of commissary; family use of exchange; family use of TRICARE; leaving family for mobilization; inability to move family; family concern for your safety; effect of mobilization on children; additional stress of mobilization on spouse.]

Summary

Results from the information provided by mobilized Reservists indicate areas of interest that should be further examined by Reserve leaders. Areas showing higher percentages of influence to leave include family issues, such as separation from family during mobilization and effects on spouses and children. The command to which Reservists are assigned also appears to influence the decision to remain, with leadership and morale being rated as reasons to consider leaving the Reserves. The web-based survey and query system reported here provide Reserve leadership with an important tool for monitoring the Reserve force and the effects of events such as mobilization, as well as other personnel policies, on Reserve career decisions.



Duties and Functions of a Recruiting Command Psychologist

LTC Stephen V. Bowles
U.S. Army Recruiting Center One

This paper discusses the duties and functions of the U.S. Army Recruiting Command (USAREC) Psychologist, a position with applications to other national and international recruiting psychologist positions as well as to other organizational psychologist positions. The incumbent serves as advisor to the USAREC Commanding General and senior leadership on all psychological issues, programs, projects, and initiatives in the Command, and as liaison to the Office of the Army Surgeon General Psychology Consultant, the Department of Defense, and civilian agencies regarding psychological aspects of Soldiers in recruiting. The position oversees relevant screening and selection projects for the recruiting command; trains recruiters and leaders on leadership, team building, and psychological principles at the recruiting school and in the field; provides consultation on Soldiers' and family members' well-being and command climate; and conducts research in recruiting and related areas to enhance the screening, training, well-being, and performance of recruiters and leadership. The position serves as a force multiplier, identifying the best personnel to recruit and lead while providing sustainment training.

The Command Psychologist serves as a member of the USAREC Commanding General's staff. In addition, the Command Psychologist advises all senior leadership on psychological issues, programs, projects, and initiatives in this Command as well as in other major Army Commands and Department of Defense agencies. The person in this position serves as liaison to the Office of The Army Surgeon General Psychology Consultant and interfaces with other military psychologists on recruiter operational training and psychological research for Soldiers and leadership in recruiting. The role of the Command Psychologist is to serve as the command advisor for screening, assessment, well-being, and enhanced performance training. These programs are described over the course of this paper.

Recruiter and Leader Screening

The objective of the screening program is to develop research-based programs to identify the best personnel in the Army for recruiting and leadership. Past research overseen by the Command Psychologist in this area included a concurrent validation project in the development of a recruiter screening test; a predictive validation study is currently under way. Most recently, under the guidance of the Command Psychologist, an Army Model web-based test program has been developed to screen all soldiers on recruiting orders. In this capacity, the Command Psychologist served as the program manager, directing Army staff from agencies in charge of personnel, human resources, research, software development, and testing facilities. This program has recommended a screening process that is currently under examination, while the testing process has been operationalized in Army facilities worldwide. The test will undergo further research and development in the web-based phase for several more years as accumulating test data refine the scoring algorithm. The test will also be placed into a new host testing system this year as the Army continues to adopt advanced technology. The focus of this process will be to continue screening larger numbers of soldiers from the applicant pool to enhance recruiter selection for Army recruiting.

Recruiter Assessment

The purpose of the assessment board evaluation (ABE) is to evaluate marginal students by reviewing their performance records. Recruiting students complete the recruiter screening test prior to reporting to class or upon arriving at the United States Recruiting and Retention School (RRS). These factors help determine whether a student should continue at RRS. The focus of the assessment is on technical and interpersonal skills. A future project for this program is to examine the use of developmental tests to provide feedback to recruiters on interpersonal skills.

Recruiter and Station Commander Well-being

The word stress has become synonymous with recruiting. Eighty-four percent of recruiters report that their stress levels are higher than in their previous occupations. Since the summer of 2002, stress and well-being material has been provided to recruiters to help alleviate this increased stress. Feedback to recruiters covers stress level; health habits; alcohol, caffeine, and cigarette use; social support network; Type A behavior; cognitive hardiness; coping style; and psychological well-being. To date, over 1,500 recruiters have been provided with well-being feedback material. Consultation is also provided to senior leadership on Soldiers' and family members' well-being and command climate.

Leader Coach Program

The objective of the Leader Coach Program is to enhance the performance of leaders, increase production, reduce stress, improve well-being, and develop more capable leaders, with a focus on station commanders. Training proceeds in the following phases: assessment, development of a leader plan of action, two stress resilience training sessions, recognition of personal and professional developmental goals, and individual coaching at RRS and in the field for one year. Leaders volunteer for the program while attending courses at the RRS or while serving as cadre at RRS. Currently the program has a satisfaction rate above 90% for both the program and the coaching.

Enhanced Performance Program

This program provides classroom training for all station commanders (junior-level managers) on leadership development. The Enhanced Performance Program identifies the leadership characteristics of station commanders, such as decisiveness, expressiveness, self-confidence, extroversion, assertiveness, positive motivation, and ability to mentor subordinates. Station commanders are provided with individualized feedback forms during their classroom training. Upon graduating from training, station commanders are coached in the field by their first sergeants, who are provided with coaching feedback forms covering the station commanders' strengths and weaknesses.

The in-house coaching provided to station commanders consists of three coaching sessions, with Center One and RRS staff members mentoring or coaching the station commanders. The focus of this coaching is on developing business and personal goals to enhance the production and quality of life of these leaders.


THE 2002 WORKPLACE AND GENDER RELATIONS SURVEY

Anita R. Lancaster, Rachel N. Lipari, Lee M. Howell, and Regan M. Klein
Defense Manpower Data Center
1600 Wilson Blvd., Suite 400
Arlington, VA 22209-2953
liparirn@osd.pentagon.mil

Introduction

This paper provides results for sections of the Status of Armed Forces: Workplace and Gender Relations Survey (2002 WGR). The United States Department of Defense (DoD) has conducted three sexual harassment surveys of active-duty members in the Army, Navy, Marine Corps, Air Force, and Coast Guard, in 1988, 1995, and 2002. The surveys not only document the extent to which Service members report experiencing unwanted, uninvited sexual attention, but also provide information on the details surrounding those events (e.g., where they occur) and on Service members' perceptions of the effectiveness of sexual harassment policies, training, and programs.

This paper examines the circumstances in which unprofessional, gender-related behaviors occur, as reported in the 2002 survey. (Comparisons are made to the 1995 survey but not to the 1988 survey, as it was substantially different.) Service members who experienced at least one unprofessional, gender-related behavior were asked to consider the "one situation" occurring in the year prior to taking the survey that had the greatest effect on them. Members then reported on the circumstances surrounding that experience. Specifics related to the situation provided answers to questions such as: (1) What was the unprofessional, gender-related experience, (2) Who were the offenders, (3) Where did the experience occur, (4) How often did the situation occur, (5) How long did the situation last, (6) Was the situation reported, and if so, to whom, and (7) Were there any repercussions as a result of reporting the incident?

This paper analyzes gender differences in the circumstances surrounding the one situation, reporting behaviors, and problems at work resulting from unprofessional, gender-related behavior. In addition, it includes an analysis of paygrade by gender differences. Members in the W1-W5 paygrades are not presented or analyzed because estimates would be unstable due to low cell sizes. Only statistically significant differences are presented in this paper.

Survey Methodology

The population of interest for the survey consisted of all active-duty members of the Army, Navy, Marine Corps, Air Force, and Coast Guard, below the rank of admiral or general, with at least 6 months of active-duty service. The sampling frame included Service members who were on active duty in May 2001, with eligibility conditional on their also being on active duty in September 2001 and December 2001.

The sample consisted of 60,415 Service members. A total of 19,960 eligible members returned usable surveys, yielding an adjusted weighted response rate of 36%. Data were collected by mail and Web between December 26, 2001 and April 23, 2002, and were weighted to reflect the active-duty population as of December 2001. The nonresponse-adjusted weights were raked to force estimates to known population totals at the midpoint of data collection.
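
The raking step can be illustrated with a minimal sketch of iterative proportional fitting: base weights are repeatedly rescaled so that weighted totals match each known population margin in turn. The variable names and the two margins used here (gender and an enlisted/officer split) are illustrative assumptions, not the actual DMDC weighting cells.

    # Minimal sketch of raking (iterative proportional fitting), under
    # two assumed margins. Real weighting uses the survey's actual
    # nonresponse-adjusted base weights and population control totals.
    def rake(records, base_weights, margins, n_iter=50):
        """records: list of dicts; margins: {var: {level: population_total}}."""
        w = list(base_weights)
        for _ in range(n_iter):
            for var, targets in margins.items():
                # Current weighted total within each level of this margin.
                totals = {level: 0.0 for level in targets}
                for rec, wt in zip(records, w):
                    totals[rec[var]] += wt
                # Rescale weights so each level hits its population total.
                for i, rec in enumerate(records):
                    level = rec[var]
                    if totals[level] > 0:
                        w[i] *= targets[level] / totals[level]
        return w

    records = [{"gender": "F", "grade": "E"}, {"gender": "M", "grade": "E"},
               {"gender": "M", "grade": "O"}]
    weights = rake(records, [1.0, 1.0, 1.0],
                   {"gender": {"F": 200000, "M": 1100000},
                    "grade": {"E": 1100000, "O": 200000}})

Cycling over the margins until the weights stabilize yields estimates that reproduce each control total simultaneously, which is the property the text describes.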



Metrics of Unprofessional, Gender-Related Behavior

The 2002 WGR contains 19 behaviorally based items intended to represent a continuum of unprofessional, gender-related behaviors, not just sexual harassment, including an open item for write-in responses of "other gender-related behaviors." The 18 closed-ended sub-items can be grouped into three primary types of behavior: (1) Sexist Behavior, (2) Sexual Harassment, and (3) Sexual Assault. The sexual harassment behaviors can be further categorized as (1) Crude/Offensive Behavior, (2) Unwanted Sexual Attention, and (3) Sexual Coercion. The 12 sexual harassment behaviors are consistent with the U.S. legal system's definition of sexual harassment (i.e., behaviors that could lead to a hostile work environment and others that represent quid pro quo harassment). Service members were asked to indicate whether any of these behaviors happened to them in the past 12 months. The rates of unprofessional, gender-related behaviors are based on these items. However, details are not collected for every behavior; rather, details are obtained about one specific situation, and only from those who had experienced some behaviors in the past year. Service members were asked to pick the one situation that had the greatest effect on them from the list of 19 unprofessional, gender-related behaviors, and to indicate, for each item, whether the offender "did this" or "did not do this" in that situation. Those analyzed in this paper represent the members who experienced at least one behavior and chose to answer the questions pertaining to the one situation with the greatest effect.
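
The grouping just described can be summarized as a small nested structure; a minimal sketch follows, with the item lists left empty because the individual survey item wordings are not reproduced in this paper.

    # Category hierarchy described above for the 2002 WGR items.
    # Item lists are placeholders; the actual item wordings are not
    # reproduced in this paper.
    UGRB_CATEGORIES = {
        "Sexist Behavior": [],
        "Sexual Harassment": {              # the 12 harassment items
            "Crude/Offensive Behavior": [],
            "Unwanted Sexual Attention": [],
            "Sexual Coercion": [],
        },
        "Sexual Assault": [],
    }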

Results

Types of Behaviors

Figure 1 shows that in 2002, over half of the women and one-third of the men indicated that multiple types of behaviors occurred in the one situation, with the remainder reporting that only Sexist Behavior, Crude/Offensive Behavior, or Unwanted Sexual Attention occurred. Both women and men reported experiencing Sexual Coercion and Sexual Assault only in combination with other behaviors. In 2002, Sexist Behavior was the most common type of behavior occurring alone for women (26%), whereas Crude/Offensive Behavior was most commonly experienced alone by men (48%). While levels were different in 1995, with fewer combinations, the overall pattern was very similar.

[Figure 1. Percentage Distribution of Behaviors in One Situation, by Gender and Year (2002 and 1995, females and males). Categories: Sexist Behavior; Crude/Offensive Behavior; Unwanted Sexual Attention; Combination of Behaviors. Margin of error does not exceed ±4.]

Gender of Offenders

To obtain information on the perpetrators of unprofessional, gender-related behavior, Service members were asked about the identity of the offender(s) in the situation that had the greatest effect on them. It should be noted that it is possible for there to be multiple offenders during the one situation.

Given the gender make-up of the active-duty military, 85% male and 15% female, it is not unexpected that the majority of women (85%) and men (51%) reported their offender as male. Comparing 2002 to 1995, more women (14% vs. 6%) and men (27% vs. 16%) reported that the offenders included both genders (see Figure 2). The complementary change for women and men was in the percentages who said the offenders were solely of the opposite gender.

[Figure 2. Gender Percentages of Reported Offenders in One Situation, by Year (2002 and 1995, females and males). Categories: Male; Female; Both Males and Females. Margin of error does not exceed ±4.]


Regardless of paygrade, both women and men most often reported the gender of the offenders as male (see Table 1). With the exception of senior officers, across paygrades roughly twice as many women and men reported in 2002 as in 1995 that the offenders included both men and women.

Table 1
Percentage of Reported Offenders in One Situation, by Gender, Paygrade, and Year

                   Junior Enlisted   Senior Enlisted   Junior Officer   Senior Officer
                      (E1-E4)           (E5-E9)           (O1-O3)          (O4-O6)
                   1995    2002      1995    2002      1995    2002     1995    2002
Females
  Men               92      85        92      83        92      89       93      89
  Women              2       1         1       1         3       2*       1       2*
  Both               6      14         7      16         5       9        5       9
Males
  Men               53      53        51      47        57      62       51      51
  Women             32      20        32      22        33      17       33      29
  Both              15      26        17      30        10      21       17      20
Margin of Error    ±5      ±6        ±6      ±4        ±9      ±8      ±11      ±8

* Low precision and/or unweighted denominator size between 30 and 59.

Organizational Affiliation of Offenders

Another characteristic of interest regarding perpetrators of unprofessional, gender-related behavior is their organizational affiliation. Service members interact with both other military personnel and civilians of various paygrades; therefore, the perpetrators of unprofessional, gender-related behaviors can be found in both groups. Service members were asked to identify whether the offenders in the situation that had the greatest effect on them were military members and/or civilians. Offenders were categorized as military members, civilians, or both military and civilian personnel.

Given that during duty hours Service members are more likely to interact with other military personnel than with non-military personnel (excluding family members), it was expected that the majority of both women (84%) and men (82%) would report that the offenders in the situation that had the greatest effect on them were Service members (see Figure 3). Both women (84% in 2002 vs. 82% in 1995) and men (82% vs. 78%) were more likely in 2002 than in 1995 to report that the offenders were military members, and correspondingly less likely to report that the offenders included only civilians (see Figure 3).

[Figure 3. Percentage of Offenders' Organizational Affiliation (military only; both military and civilian; civilian only), by gender and year (2002 and 1995). Margin of error does not exceed ±4.]

Female (68% vs. 82-88%) and male (57% vs. 82-87%) senior officers were the least likely to report that the offenders were military members (see Table 2). The complementary findings for both female (14% vs. 3-6%) and male (23% vs. 2-7%) senior officers were in the percentages who said the offenders were solely civilians.

Table 2
Percentage of Offenders' Organizational Affiliation, by Paygrade

                              Junior Enlisted  Senior Enlisted  Junior Officer  Senior Officer
                                 (E1-E4)          (E5-E9)          (O1-O3)         (O4-O6)
                                 F     M          F     M          F     M         F     M
Military only                   88    87         82    80         83    82        68    57
Both military and civilians     10    11         13    14         11    12        17    20
Civilians only                   3     2          5     7          6     7        14    23
Margin of Error                 ±2    ±4         ±2    ±4         ±4    ±6        ±5    ±8

Place and Time One Situation Occurred

Members were asked questions to describe the characteristics of the one situation with the greatest effect. To understand this section, it is necessary to remember that these behaviors can happen in various locations, at multiple times in a single day, and can also span a long period of time. By examining these characteristics, it is possible to identify commonalities among incidents of unprofessional, gender-related behavior.

The majority of women and men reported that some or all of the behaviors occurred at an installation (Females 86%; Males 75%), at work (Females 81%; Males 74%), and during duty hours (Females 84%; Males 76%) (see Tables 59a.1-59d.4 in Greenlees et al. (2003)). Approximately twice as many men as women (24% vs. 13%) reported that none of the behaviors occurred on a military installation. However, women and men were less likely to report in 2002 than in 1995 that all of the behaviors in the situation occurred during duty hours (Females 46% vs. 54%; Males 40% vs. 48%), on a military installation (Females 51% vs. 73%; Males 42% vs. 62%), or at work (Females 44% vs. 51%; Males 39% vs. 51%) (see Table 3).

Among women, junior enlisted members (37% vs. 49-61%) were the least likely to report that all of the behaviors occurred at their work (see Table 3). In contrast, female senior officers were the most likely to report that all of the behaviors occurred at work (61% vs. 37-50%). Similarly, among women, junior enlisted members (39%) were the least likely, and senior officers (63%) were the most likely, to report that all of the behaviors occurred during duty hours. This may be partially explained by the findings that, among women, junior enlisted members (62%) were the least likely, and senior officers (83%) were the most likely, to report that none of the behaviors occurred in the local community near an installation (see Tables 59a.4-59d.4 in Greenlees et al. (2003)). For men, there were no significant paygrade differences.

Regardless of paygrade, women were at least 15 percentage points less likely to report in 2002 than in 1995 that all of the behaviors occurred on a military installation (see Table 3). Regardless of gender, senior enlisted members were less likely in 2002 than in 1995 to report that all of the behaviors occurred at work (Females 50% vs. 57%; Males 39% vs. 56%) or during duty hours (Females 53% vs. 62%; Males 40% vs. 52%). Moreover, junior (43% vs. 57%) and senior (40% vs. 66%) enlisted men were less likely to report in 2002 than in 1995 that all of the behaviors occurred on a military installation (see Table 3).

Table 3
Percentage of Members Reporting All of the Behaviors Occurred at a Particular Time or Location, by Gender and Paygrade

                              Total DoD   Junior Enlisted  Senior Enlisted  Junior Officer  Senior Officer
                                             (E1-E4)          (E5-E9)          (O1-O3)         (O4-O6)
                             1995  2002    1995   2002      1995   2002      1995   2002     1995   2002
Females
  In the local community      ---     5     ---      6       ---      5       ---      5      ---      4
  At a military installation   73    51      70     47        76     45        71     53       76     61
  At work                      51    44      45     37        57     50        57     49       69     61
  During duty hours            54    46      45     39        62     53        59     51       73     63
  Margin of Error             ±2    ±2      ±3     ±3        ±3     ±3        ±4     ±5       ±6     ±5
Males
  In the local community      ---     5     ---      4       ---      5       ---      7      ---      8
  At a military installation   62    42      57     43        66     40        62     47       61     50
  At work                      51    39      44     38        56     39        55     44       58     47
  During duty hours            48    40      40     38        52     40        56     46       58     50
  Margin of Error             ±4    ±3      ±5     ±5        ±6     ±4        ±9     ±8      ±11     ±8


Frequency and Duration of Incidents Concerning Unprofessional, Gender-related Behavior

Regarding the frequency and duration of incidents of unprofessional, gender-related behavior, women were less likely than men to report that such incidents had happened only once (22% vs. 32%) and that the situation lasted less than a month (45% vs. 60%) (see Tables 4 and 5).

Among women, junior enlisted members were the most likely to report that the incidents of unprofessional, gender-related behavior occurred almost every day or more than once a day (9% vs. 1-5%) (see Table 4). Among men, there were no paygrade differences in the frequency of behaviors. Regardless of gender, there were no paygrade differences in the duration of the situation (see Table 5).

Table 4
Percentage of Members Reporting Frequency of Behaviors, by Paygrade

                              Total DoD   Junior Enlisted  Senior Enlisted  Junior Officer  Senior Officer
                                             (E1-E4)          (E5-E9)          (O1-O3)         (O4-O6)
                               F    M        F    M           F    M           F    M          F    M
Once                          22   32       21   29          23   35          25   33         27   38
Occasionally                  52   50       50   46          53   53          56   57         55   54
Frequently                    17   11       19   16          17    8          15    9         14    3
Almost every day/more
  than once a day              9    6       11    9           8    5           4    1          4    5
Margin of Error               ±2   ±3      ±3   ±5          ±3   ±5          ±5   ±8        ±5   ±8

Table 5
Percentage of Members Reporting Duration of the Situation, by Paygrade

                              Total DoD   Junior Enlisted  Senior Enlisted  Junior Officer  Senior Officer
                                             (E1-E4)          (E5-E9)          (O1-O3)         (O4-O6)
                               F    M        F    M           F    M           F    M          F    M
Less than 1 month             45   60       43   55          46   62          52   64         45   65
1 month to less than 6 months 27   17       30   19          24   16          25   15         20   15
More than 6 months            28   23       27   25          30   22          23   21         35   21
Margin of Error               ±2   ±3      ±3   ±5          ±3   ±4          ±5   ±8        ±8   ±5

Reporting

A series of survey questions asked Service members to provide details regarding reporting and various aspects of the reporting process. Overall, 30% of women and 17% of men reported the situation to a supervisor or person responsible for follow-up (see Table 6). However, fewer women reported behaviors in 2002 than in 1995 (30% vs. 38%). For more details, see Tables 66a.3-66e.3 in Greenlees et al. (2003).



Table 6
Frequency of Reporting Behavior in One Situation to Any Supervisor or Person Responsible for Follow-up (%)

                  Females   Males
1995                 38      15
2002                 30      17
Margin of Error     ±3      ±3
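
As a rough guide to how margins of error like these arise, a common approximation for a proportion estimated from a complex sample is shown below; the design effect (deff), sample size, and proportion are illustrative assumptions, since the survey's actual variance estimation may differ.

    % Approximate margin of error for an estimated proportion \hat{p}
    % from a complex sample; deff is an assumed design effect (the
    % numeric values here are illustrative only).
    \[
      \mathrm{MOE} \;\approx\; z_{0.975}\,
      \sqrt{\mathrm{deff}\cdot\frac{\hat{p}(1-\hat{p})}{n}}
      \;=\; 1.96\sqrt{2\cdot\frac{0.30\times 0.70}{2000}}
      \;\approx\; 0.028
    \]

With these illustrative inputs (p-hat = 0.30, n = 2,000, deff = 2), the margin of error is about ±3 percentage points, the same order as the values shown in the tables.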

To Whom Behaviors Are Reported

Fewer than 10% of women and men chose to report unprofessional, gender-related behavior either to a special military office responsible for these types of behaviors or to another installation/Service/DoD official. Rather, female and male Service members tend to report to members of their chain-of-command, such as their immediate supervisor (Females 21%; Males 12%) or the supervisor of the offender (Females 16%; Males 10%) (see Tables 66a.1-66e.4 in Greenlees et al. (2003)). Among women, enlisted members were more likely than officers to report unprofessional, gender-related behavior to someone in their chain-of-command (15-17% vs. both 10%) or to a special military office responsible for these types of behaviors (7-8% vs. both 3%) (see Tables 66a.4-66e.4 in Greenlees et al. (2003)).

Reasons for Not Reporting Behaviors

Service members were asked to indicate which of 19 items explained why they chose not to report any or all of the behaviors they experienced. The five reasons Service members most frequently indicated for not reporting are shown in Table 7. Women (67%) and men (78%) most often indicated that they did not report behaviors because they felt the situation was not important enough to report. For detailed information on all 19 items, see Tables 74a.1-74s.4 in Greenlees et al. (2003).

Table 7
Top Five Reasons for Not Reporting Any or All Behaviors in One Situation (%)

                                                                  Females  Males
Was not important enough to report                                   67      78
You took care of the problem yourself                                65      63
You felt uncomfortable making a report                               40      26
You did not think anything would be done if you reported             33      28
You thought you would be labeled a troublemaker if you reported      32      22
Margin of Error                                                     ±2      ±3

Junior enlisted women were more likely than other women to indicate they did not report behaviors because they felt uncomfortable (48% vs. 30-36%), thought they would not be believed (22% vs. 11-16%), thought coworkers would be angry (31% vs. 16-20%), did not want to hurt the person (34% vs. 16-26%), or were afraid of retaliation from the offender (28% vs. 18-19%) (see Tables 74a.1-74s.4 in Greenlees et al. (2003)). In contrast, more junior enlisted men indicated they did not report because it would take too much time (29% vs. 11-17%).

Reasons for Not Reporting Behaviors by Reporting Category

For those Service members who reported either none or only some of the behaviors, this section includes an analysis of their reasons for not reporting. Women were more likely than men to identify retaliatory behaviors as reasons not to report any of the behaviors (see Table 8). These reasons included being labeled a troublemaker (29% vs. 19%), fear of retaliation from the offender (18% vs. 10%), fear of retaliation from friends of the offender (13% vs. 8%), and fear of retaliation from their supervisor (12% vs. 8%). Men were more likely than women to report either none (81% vs. 71%) or only some (59% vs. 50%) of the behaviors because they believed the behaviors were not important enough to report.

Table 8
Percentage of Reasons for Not Reporting the Behaviors, by Gender and Reporting Category

                                                                        Reported No    Reported Some
                                                                         Behaviors       Behaviors
Reasons for Not Reporting                                                F     M         F     M
Was not important enough to report                                      71    81        50    59
You did not know how to report                                          13     9        26    21
You felt uncomfortable making a report                                  37    24        53    48
You took care of the problem yourself                                   67    63        57    58
You talked to someone informally in your chain-of-command               10     8        70    62
You did not think anything would be done if you reported                30    24        46    47
You thought you would not be believed if you reported                   15    10        28    25
You thought your coworkers would be angry if you reported               23    17        29    33
You wanted to fit in                                                    15    14        19    21
You thought reporting would take too much time and effort               23    21        28    29
You thought you would be labeled a troublemaker if you reported         29    19        45    48
A peer talked you out of making a formal complaint                       2     1        10    10
A supervisor talked you out of making a formal complaint                 1     1        16    14
You did not want to hurt the person's feelings, family, or career       28    20        32    34
You thought your performance evaluation or chance of promotion
  would suffer                                                          14    10        28    31
You were afraid of retaliation from the person(s) who did it            18    10        39    30
You were afraid of retaliation/reprisals from friends of the
  person(s) who did it                                                  13     8        26    29
You were afraid of retaliation/reprisals from your supervisors          12     8        26    26
Some other reason                                                       22    18        25    27
Margin of Error                                                        ±3    ±4        ±5   ±11

Satisfaction With Complaint Outcome

Satisfaction with the outcome of the complaint can be indicative of a Service member's perception of the reporting and complaint process. Approximately a third of women and men were satisfied with the outcome of their complaint. This trend remained consistent across years, as women (34% vs. 36%) and men (37% vs. 36%) were about equally satisfied with the outcome of the complaint process in 2002 and in 1995 (see Tables 72.1-72.3 in Greenlees et al. (2003)).

Complaint Outcome

This section includes an analysis of the outcome of the complaint by Service members' satisfaction with that outcome. As expected, Service members were most likely to be satisfied with the outcome of their complaint when the situation was corrected (Females 92%; Males 91%), when the outcome of the complaint was explained to them (Females 69%; Males 70%), and when some action was taken against the offender (Females 55%; Males 66%). Women and men (both 48%) were most likely to be dissatisfied with the outcome of their complaint when nothing was done about it. For both women and men, satisfaction with the complaint outcome was not predicated solely on whether the complaint was found to be true (see Table 9). For more detailed paygrade findings regarding complaint outcomes, see Tables 71a.1-71h.4 in Greenlees et al. (2003).

Table 9
Percentage of Complaint Outcome, by Satisfaction with Outcome and Gender

                                                                  Satisfied      Dissatisfied
                                                                 with Outcome    with Outcome
Outcome of Complaint                                              F     M         F     M
They found your complaint to be true                             78    85        33    48
They found your complaint to be untrue                            0*    0*       14     5
They were unable to determine whether your complaint
  was true or not                                                 8     6*       12    14
The outcome of your complaint was explained to you               69    70        20    22
The situation was corrected                                      92    91        12    12
Some action was taken against the person(s) who bothered you     55    66        14     4
Nothing was done about the complaint                              9    10*       48    48
Action was taken against you                                      0*    6*       19    17
Margin of Error                                                 ±6   ±11        ±6   ±16

* Low precision and/or unweighted denominator size between 30 and 59.

Problems at Work

Service members were asked to describe problems they have had at work as a result of their experience or of how they responded to it. These problems can include both social (e.g., hostile interpersonal behaviors) and professional (e.g., behaviors that interfere with their career) reprisals. Overall, 29% of women and 23% of men reported experiencing some type of problem at work as a result of unprofessional, gender-related behavior (see Figure 4). Women and men most often reported being gossiped about by people in an unkind way (15% and 20%). Women were more likely than men to report being ignored or shunned by others at work (10% vs. 6%), blamed for the situation (9% vs. 6%), or mistreated in some other way (10% vs. 6%) (see Tables 75a.4-75l.4 in Greenlees et al. (2003)).

Both junior enlisted women (33%) and men (31%) were more likely to report experiencing at least some kind of problem at work than members in other paygrades (see Figure 4). Junior enlisted women (15% vs. 9-18%) and men (21% vs. 5-11%) were the most likely to report experiencing unkind gossip (see Tables 75a.4-75l.4 in Greenlees et al. (2003)).

[Figure 4. Percentage of Members Who Experienced Any Problems at Work, by Gender and Paygrade (Males/Females): DoD Total 23/29; E1-E4 31/33; E5-E9 18/27; O1-O3 20/21; O4-O6 11/10. Margin of error ±5.]

Conclusions

This paper provides an overview of the characteristics of situations involving unprofessional, gender-related behavior. By analyzing these characteristics, the DoD can better target the areas and individuals affected by these behaviors and implement strategies to reduce occurrences.

The experiences of those who encounter unprofessional, gender-related behavior vary and often include multiple types of behaviors. However, some characteristics are common across most experiences. The majority of the offenders are male, although the perpetrators increasingly include both genders; for example, more women and men reported in 2002 than in 1995 that the offenders included both genders. Another common characteristic is that the majority of women (84%) and men (82%) reported that the offenders were military personnel.

When Service members report their experiences, there are more opportunities to address these problems. Therefore, an analysis of various reporting factors is important to the process of attending to these issues and more effectively resolving their troublesome effects. Overall, 30% of women and 17% of men reported the situation. These rates are a concern because, in this analysis, reporting was not limited to formal reports; rather, it included informal discussion with any installation, Service, or DoD individual or organization. Hence, formal reporting rates would be expected to be even lower. It is important to note that whether or not Service members report may also be a function of the types of behaviors they experience. The unprofessional, gender-related behaviors measured in the 2002 WGR represent a continuum ranging from Sexist Behavior, which, although considered a precursor to sexual harassment, is not illegal, to Sexual Assault, a criminal offense. One explanation for why reporting rates may seem low is that the most commonly cited reason for not reporting, for women (67%) and men (78%) alike, was that the situation was not important enough to report.

References

Greenlees, J.B., Deak, M.A., Rockwell, D., Lee, K.S., Perry, S., Willis, E.J., & Mohomed, S.G. (2003). Tabulations of Responses from the 2002 Status of the Armed Forces Survey: Workplace and Gender Relations. Volume 2: Gender Related Experiences in the Military and Gender Relations. DMDC Report No. 2003-013. Arlington, VA: DMDC.


Workplace Reprisals: A Model of Retaliation Following Unprofessional Gender-Related Behavior 1

Alayne J. Ormerod, Ph.D., and Caroline Vaile Wright
University of Illinois
603 East Daniel Street
Champaign, IL 61820
aormerod@s.psych.uiuc.edu

Retaliation is considered to be both a deterrent to and a consequence of reporting sexual harassment. Existing research suggests that reporters of harassment routinely experience retaliation and that reporting worsens outcomes beyond those of harassment alone (Coles, 1986; Hesson-McInnis & Fitzgerald, 1997; Loy & Stewart, 1984; Stockdale, 1998). Interestingly, the relationship between reporting and outcomes may be indirect; that is, reporting appears to trigger "post-reporting" variables that exert a negative influence on outcomes (Bergman, Langhout, Cortina, Palmieri, & Fitzgerald, 2002). The goal of this paper is to examine one such post-reporting variable, retaliation that follows from experiences of unprofessional, gender-related behavior, in a sample of military personnel.

UNPROFESSIONAL, GENDER-RELATED BEHAVIOR IN THE MILITARY

Retaliation occurs in the context of sexual harassment and other unprofessional, gender-related behavior (UGRB); 2 thus it is important to consider that context. In an issue of Military Psychology entirely devoted to sexual harassment in the Armed Forces (Drasgow, 1999), Fitzgerald, Drasgow, and Magley (1999) tested a model that provided an integrated framework for understanding the predictors and outcomes of sexual harassment. Their findings, applicable to both male and female personnel, suggest that harassment occurs more often when personnel perceive that leadership efforts, practices, and training do not address issues of harassment and when work groups are not gender-integrated. Experiences of harassment were associated with decrements in job satisfaction and in psychological and physical well-being. In turn, lowered job satisfaction was associated with lowered organizational commitment and work productivity. These relationships also apply to the civilian workforce (Fitzgerald, Drasgow, Hulin, Gelfand, & Magley, 1997). Given that UGRB is the major stimulus for retaliation, the general model is a starting point for examining the variables considered antecedents and outcomes of retaliation.

1 This paper is part of a symposium, entitled Sexual Harassment in the Military: Recent Research Findings, presented at the 2003 International Military Testing Association Conference in Pensacola, Florida (T. W. Elig, Chair). This research was supported in part by the Defense Manpower Data Center (DMDC) through the Consortium of Universities of the Washington Metropolitan Area, Contract M67004-03-C-0006, and also by NIMH grant # MH50791-08. The opinions in this paper are those of the authors and are not to be construed as an official DMDC or Department of Defense position unless so designated by other authorized documents. The authors wish to thank Louise F. Fitzgerald for her comments.

2 Survey measurement of sexual harassment is defined by the U.S. Department of Defense as the presence of behaviors indicative of sexual harassment (Crude/Offensive Behavior, Sexual Coercion, and Unwanted Sexual Attention; Sexist Behavior and Sexual Assault are not counted in the DoD survey measure of sexual harassment) and the labeling of those behaviors as sexual harassment (Survey Method for Counting Incidents of Sexual Harassment, 2002). In this paper we examine behaviors indicative of sexual harassment and sexist behavior and refer to them together as unprofessional, gender-related behavior (UGRB). We use the term sexual harassment to refer to the existing literature.


PROCESS OF RETALIATION

There are currently only a handful of empirical studies that examine the correlates of retaliation following sexual harassment; however, the literature on whistle-blowing offers insight into the retaliatory process. Miceli and Near (1992) conceptualize retaliation as a collection of punitive work-related behaviors (e.g., poor performance appraisal, denial of promotion or other advancement opportunities, transfer to a different geographic location, suspension, demotion), both experienced and threatened. They consider the number and status (e.g., coworkers, supervisors) of those who engage in the retaliation (Near & Miceli, 1986) and suggest that retaliation from coworkers and from management may differ, as each may be a function of dissimilar variables (Miceli & Near, 1988; 1992).

Near, Miceli, and colleagues (Miceli & Near, 1992; Miceli, Rehg, Near, & Ryan, 1999; Near & Miceli, 1986) describe retaliation as a phenomenon arising from an employee's disclosure of some form of organizational wrong-doing to someone who can take action. They propose that retaliation can best be understood from the perspective of resource dependency theory; that is, when an organization is dependent on the wrong-doing, it is more likely to resist change and to engage in retaliation toward the (relatively less powerful) whistle-blower. Further, they suggest that the determinants of retaliation may be sensitive to context, such as the type of wrong-doing, the organizational setting, and victim variables such as whether the victim of the retaliation was also the victim of the wrong-doing (Miceli & Near, 1992).

Sexual harassment is a type of wrong-doing in which the "whistle-blower," or reporter, is almost always the target of the wrong-doing. Recent research on the nature of retaliation following sexual harassment suggests that retaliation can include both personal (e.g., isolating and targeting victims of harassment with hostile interpersonal behaviors) and professional (e.g., behaviors that interfere with career advancement and retention) reprisals that may contribute differentially to outcomes (Cortina & Magley, in press; Fitzgerald, Smolen, Harned, Collinsworth, & Colbert, in preparation). Although these studies do not directly assess the source of the retaliation, it is likely that professional retaliation results largely from actions by a supervisor or another person in a more powerful position than the target. Professional retaliation is prohibited by Title VII of the Civil Rights Act of 1964 (Crockett & Gilmere, 1999) and is thought to be less common than social retaliation. Social forms of retaliation, on the other hand, can arise from any individual with whom the target interacts, including coworkers. Social retaliation has been linked to negative outcomes (Cortina & Magley, in press) but is not explicitly prohibited by law.

Retaliation following sexual harassment is thought to arise from a general process through which an employee (a) is victimized by another organizational member, (b) makes an external response (e.g., support seeking, confrontation, or reporting the mistreatment), (c) is retaliated against by an organizational member, either the original perpetrator or others, and (d) subsequently suffers negative consequences (Cortina & Magley, in press; Fitzgerald et al., in preparation; Magley & Cortina, 2002).



INCIDENCE OF RETALIATION

Studies define whistle-blowing (that is, response to wrong-doing) in slightly different ways; some include all forms of active response, and others limit the notion to official reporting. The whistle-blowing literature suggests that fewer than one quarter of those who report wrong-doing experience retaliation (Miceli & Near, 1988; Near & Miceli, 1986). Other research suggests that formally filing charges of sexual harassment or discrimination is associated with a retaliation rate of between 40% and 60% (Coles, 1986; Loy & Stewart, 1984; Parmerlee, Near, & Jensen, 1982). In a study of military personnel who experienced sexual harassment and reported it to someone in an official position, 15.7% of the men and 19.4% of the women reported some form of retaliation (Magley & Cortina, 2002). These disparate findings suggest that incidence rates may vary depending on the type of organization, the nature of the wrong-doing, the target's response, and the gender of the target. They also lead us to conclude that there is, at present, no reliable general estimate of the extent of retaliation.

ANTECEDENTS OF RETALIATION

Organizational Climate

In the sexual harassment research literature, organizational climate regarding sexual harassment has been linked to both harassment and negative outcomes (Fitzgerald, Hulin, & Drasgow, 1995; Hunter Williams, Fitzgerald, & Drasgow, 1999; Pryor, Giedd, & Williams, 1995; Pryor & Whalen, 1997). Miceli and Near (1992) argue that organizations that engage in one type of wrong-doing will engage in others, and that retaliation against whistle-blowers is just one type of unfair practice found within an organization. Supporting this line of reasoning, they reported that employee perceptions that an organization's reward distribution system is fair were negatively associated with retaliation (Miceli & Near, 1989). Thus it is reasonable to suppose that a climate that tolerates sexual harassment will also be tolerant of retaliation. Two recent studies support this contention, reporting an association between climate and experiences of retaliation for male and female military personnel. In one study, implementation of policies prohibiting harassment (an indicator that the organization does not tolerate harassment) was associated with less frequent retaliation (Magley & Cortina, 2002). In the second study, an organizational climate that tolerates harassment was one of several predictors of retaliation for female personnel and the sole predictor for male personnel (Bergman et al., 2002). In addition, this same study found that retaliation against female personnel was associated with working in a highly masculinized job context and having perpetrators of higher status.

Unprofessional, Gender-Related Behavior

In the whistle-blowing literature, the relationship between the frequency of wrong-doing and retaliation is not entirely clear-cut; an association appears in some research (Miceli, Rehg, Near, & Ryan, 1999) but not in others (Near & Miceli, 1986). In the sexual harassment literature, on the other hand, the frequency of experiencing harassment is directly related to increased experiences of retaliation. In a study of sexually harassed male and female federal court employees who had confronted the harasser, reported, or sought social support, more frequent harassment and interpersonal mistreatment predicted both personal and professional retaliation (Cortina & Magley, in press). Male and female military personnel who endorsed more frequent experiences of harassment also endorsed more experiences of retaliation (Magley & Cortina, 2002). In another study, female military personnel who reported harassment and who experienced more frequent sexual harassment experienced more retaliation (Bergman et al., 2002).


Findings are mixed concerning whether the severity of the wrong-doing predicts increases in retaliation. Near and Miceli (1986) reported that retaliation was related to the seriousness of the wrong-doing but later argued that severity cannot be reliably measured across contexts because contextual and individual variables make a standard hierarchy of severity difficult to determine (Miceli & Near, 1992). Magley and Cortina (2002) attempted to model severity by assigning harassing behaviors to hostile environment, or hostile environment plus quid pro quo, categories. They found a small but significant relationship between harassment that included quid pro quo behaviors and increased experiences of retaliation; however, they note that differences between retaliation associated with hostile environment experiences and retaliation associated also with quid pro quo were minimal, suggesting that retaliation is associated with both types of harassment. In a second study, Cortina and Magley (in press) dichotomized work mistreatment into either incivility alone or incivility with sexual harassment (arguably a more severe type of wrong-doing) but found no relationship between type and retaliation. Thus, research attempting to link severity of wrong-doing to retaliation has, to date, been largely unsuccessful.

Primary and Secondary Appraisal

The target's subjective assessment of whether the harassing event was stressful or threatening, and their subsequent responses, may prove useful when attempting to better understand the relationship of wrong-doing to retaliation. We draw on Lazarus and Folkman's (1984) cognitive stress framework and its application to sexually harassing events (Fitzgerald, Swan, & Fischer, 1995; Fitzgerald, Swan, & Magley, 1997) to explore the role that appraisal plays in the process of retaliation following from an unprofessional, gender-related event. Primary appraisal is the cognitive evaluation of an event to determine whether it is "stressful," whereas secondary appraisal is the process of determining a response to the event or stressor (Lazarus & Folkman, 1984). Appraisal is thought of as a complex process, influenced by multiple determinants that can change as the stressful event changes. In the case of sexual harassment and other UGRB, these influences are thought to include individual, contextual, and objective factors. Fitzgerald, Swan et al. (1997) consider that the stressfulness of a harassing event inheres in the appraisal of the event (as stressful) rather than in the event itself, and they suggest that the frequency, intensity, and duration of the harassment, the victim's resources and personal attributes, and the context (e.g., organizational climate, gender context) all shape evaluations of whether the event is stressful or threatening. Appraisal, in turn, is linked to decisions about response and to outcomes.

Secondary appraisal, or coping, is thought of as attempts to manage both the stressful event and one's cognitive and emotional reactions to the event (Fitzgerald, Swan et al., 1995). Responses to UGRB can include reporting the behavior, confronting the person, seeking social support, behavioral avoidance (e.g., avoiding the person), and cognitive avoidance (e.g., pretending not to notice, trying to forget). 3 Reporting, the most studied form of coping, has been linked to retaliation against the reporter (Bergman et al., 2002; Hesson-McInnis & Fitzgerald, 1997), particularly when reporting wrong-doing outside of one's organization (Near & Miceli, 1986) or when reporting harassment to multiple people in official positions (Magley & Cortina, 2002). Confronting a harasser is associated with increased retaliation (Cortina & Magley, in press; Stockdale, 1998). Indeed, complaining about the retaliation itself can result in further retaliation (Near & Miceli, 1986). The type of response can also interact with the status of the target and the harasser. In a sample of federal court employees, work-related retaliation increased when the target confronted offenders who held more organizational power, and personal retaliation increased when the harasser was more powerful and the target sought social support (Cortina & Magley, in press).

3 See Fitzgerald, Swan et al. (1995) for a detailed description of coping responses.



Far less is known about the consequences of cognitive or behavioral avoidance. These less direct behaviors are far more common than confrontation or reporting, the two most often recommended responses to harassment. It is not unreasonable to ask whether common behaviors, such as ignoring the UGRB, can also trigger a retaliatory response.

Organizational Power

Findings concerning the status of the reporter or target and retaliation are mixed. Two studies, one with military personnel, the other with federal employees, showed no direct relationship between the reporter's status within the organization and retaliation (Bergman et al., 2002; Near & Miceli, 1986). However, a study of federal court employees demonstrated that victims with lower occupational status were more likely to receive both personal and work-related retaliation than higher status victims; when a lower status victim reported, confronted, or sought social support regarding a higher status harasser, the victim experienced more retaliation of both types (Cortina & Magley, in press). Holding power within the organization, as measured by support from management and one's immediate supervisor, has been consistently associated with receiving less retaliation in the whistle-blowing literature (Miceli, Rehg, Near, & Ryan, 1999; Near & Miceli, 1986). Thus it seems important to consider the relationship of organizational power or status to retaliation.

OUTCOMES OF RETALIATION

Work-Related Outcomes

Given that harassment is associated with negative outcomes for the individual, it is likely that retaliation will also be negatively related to an individual's work attitudes, performance, and commitment to the organization. In a study of civilian women involved in a class action lawsuit against a private sector firm, outcomes associated with retaliation (after controlling for the harassment and effects due to reporting) included decreased satisfaction with coworkers and supervisors and increased work withdrawal (Fitzgerald et al., in preparation). For public sector employees, as retaliation increased so did the targets' job dissatisfaction, job stress, and organizational withdrawal (Cortina & Magley, in press). In a study of federal employees, retaliation resulted in increased involuntary exit from the organization, including forced transfer or leaving (Near & Miceli, 1986). For male and female military personnel, those who experienced more retaliation generally had poorer work outcomes (Magley & Cortina, 2002), and for those who reported harassment, retaliation was associated with lower procedural satisfaction with reporting (Bergman et al., 2002).

Well-Being

In a study of civilian women involved in a class action lawsuit against a large private firm, outcomes associated with retaliation included decreased health satisfaction and increased psychological distress (Fitzgerald et al., in preparation). For public sector employees, as retaliation increased, the targets' psychological and physical health decreased (Cortina & Magley, in press), and military personnel who experienced more retaliation demonstrated generally poorer psychological and health-related outcomes (Magley & Cortina, 2002).


A MODEL OF RETALIATION

Figure 1. Conceptual model of retaliation. Dashed lines represent paths from climate, leadership, power, unprofessional, gender-related behavior, and retaliation to supervisor, coworker, work, and military satisfaction. RET = retaliation. UGRB = unprofessional, gender-related behavior. CLIM = climate. LEAD = leadership. POW = organizational power. CON = masculinized work context. APP = appraisal. REP = reporting. COPE = coping. PSY = psychological well-being. WB = physical well-being. JOB SAT = coworker, supervisor, and work satisfaction. MIL SAT = military satisfaction. OC = organizational commitment.

We place retaliation within the context of UGRB and utilize the general model of sexual harassment as a framework within which to consider the antecedents and consequences of retaliation. Retaliation is defined as the frequency with which a target perceives either personal or professional reprisals following an active or indirect response to an incident of unprofessional, gender-related behavior. Consistent with the general model, the whistle-blowing literature, and research on retaliation, we conceptualize retaliation (and the original UGRB) as arising from an organizational climate that tolerates wrong-doing and negative interpersonal behaviors and from leadership that does not make reasonable efforts to stop harassment. A masculinized work context is thought to lead to more frequent UGRB (Fitzgerald, Drasgow et al., 1997), and more frequent UGRB to more frequent retaliation. Targets with low organizational status are expected to receive more UGRB and retaliation. We draw on Lazarus and Folkman's (1984) cognitive stress framework as applied to sexual harassment (Fitzgerald, Swan et al., 1995; Fitzgerald, Swan et al., 1997) and suggest that the individual will appraise the "stressfulness" of the UGRB incident prior to responding and that this personal appraisal will influence the nature of the subsequent responses, which will in turn be associated with retaliation. In theory, primary appraisal and secondary appraisal (coping response and reporting) have additional antecedents; in our model we limited the antecedents of the appraisal process so that we could test the major propositions about retaliation and keep the path model from becoming overly complex. Retaliation is thought to lead to negative outcomes beyond those associated with UGRB, climate indicators, and organizational power. Drawing from the organizational literature and the general model, the multiple aspects of job satisfaction and satisfaction with the military are expected to positively influence organizational commitment and well-being. Figure 1 expresses these conceptual relationships.



METHOD

Participants and Procedure

The data for this study were taken from The 2002 Status of the Armed Forces Surveys - Workplace and Gender Relations (Form 2002GB). This survey used a non-proportional stratified, single-stage random sample of 60,415 active-duty military personnel who were below the rank of admiral or general and had at least 6 months of active-duty service. Respondents were given the choice of either returning a paper-and-pencil questionnaire by mail or completing the same questionnaire on the Web. Both women and ethnic minorities were oversampled relative to the overall military population. The target population consisted of 56,521 eligible members; of those, 19,960 eligible respondents (10,235 men and 9,725 women) returned usable surveys, for an adjusted weighted response rate of 36%. The current study utilizes a subsample of 5,795 service members (1,408 men and 4,387 women).

On average, the women were 28.73 years old (range = 17-60, SD = 7.98). Sixty-six percent of the women self-identified as White, 22% as Black/African American, 11% as Hispanic/Latino, 5% as Asian, 3% as American Indian/Alaska Native, and less than 1% as Native Hawaiian/Pacific Islander. 4 Nearly half of the female respondents were married (45%), 38% had never married, and 17% were separated, divorced, or widowed. Eighteen percent of the women held a GED or high school diploma, 43% had attended some college, 6% had a degree from a 2-year college, 15% had received a degree from a 4-year college, 5% had attended some graduate school, and 13% had obtained a graduate or other professional degree. Twenty-nine percent of the female respondents were in the Army, 24% in the Air Force, 22% in the Navy, 15% in the Marine Corps, and 10% in the Coast Guard. 5

On average, the men were 30.43 years old (range = 18-55, SD = 8.19). Seventy percent of the men self-identified as White, 15% as Black/African American, 14% as Hispanic/Latino, 5% as Asian, 4% as American Indian/Alaska Native, and 1% as Native Hawaiian/Pacific Islander. Nearly two-thirds (65%) of the male respondents were married, 29% had never married, and 6% were separated, divorced, or widowed. Twenty-two percent of the men held a GED or high school diploma, 40% had attended some college, 5% had a degree from a 2-year college, 14% had received a degree from a 4-year college, 5% had attended some graduate school, and 14% had obtained a graduate or other professional degree. Thirty-one percent of the male respondents were in the Air Force, 23% in the Navy, 22% in the Army, 14% in the Marine Corps, and 10% in the Coast Guard.

4 Respondents were able to endorse more than one race/ethnicity category.
5 Sample demographics are not significantly different from the full sample of eligible respondents.


Instrumentation

Unprofessional, Gender-Related Behaviors

The Sexual Experiences Questionnaire - DoD - Shortened Version (SEQ-DoD-S; Stark, Chernyshenko, Lancaster, Drasgow, & Fitzgerald, 2002) consists of 16 behavioral items that assess respondents' unwanted sex-related experiences during the last 12 months involving military personnel or civilian employees/contractors in the military workplace. The SEQ-DoD-S assesses four general categories of unprofessional, gender-related behaviors. Sexist Behavior (4 items) includes gender-based discriminatory behaviors such as offensive sexist remarks and differential, negative treatment based on gender. Crude/Offensive Behavior (4 items) is more explicitly sexual in nature and includes behaviors such as repeatedly telling sexual stories or jokes and making crude sexual remarks. Unwanted Sexual Attention (4 items) includes unwanted sexual behaviors such as repeated requests for dates and touching or stroking. Sexual Coercion (4 items) is defined as implicit or explicit demands for sexual favors through the threat of negative job-related consequences or the promise of job-related benefits or bribes. 6 Responses were provided on a 5-point Likert-type scale ranging from 0 (never) to 4 (very often). Higher scores indicated more frequent UGRB or more types of UGRB. The 16-item SEQ-DoD-S is highly reliable (coefficient alphas for all scales can be seen in Table 1), and considerable validity information is available.

Retaliation

Respondents were asked to indicate whether or not, as a result of any unprofessional, gender-related behavior or their response to that behavior (e.g., reporting, confronting, avoiding), they had experienced any of 11 types of retaliatory behaviors. Three of these behaviors are classified as personal retaliation (e.g., gossiped about you in an unkind or negative way) and eight as professional (e.g., given an unfair performance evaluation). Responses were arranged along a 3-point response scale and were recoded such that 1 = "no," 2 = "don't know," and 3 = "yes," based on research indicating that a "don't know" option tends to act as a midpoint (Drasgow, Fitzgerald, Magley, Waldo, & Zickar, 1999). Higher scores reflected greater amounts of retaliation. Although the scale contains two types of retaliation, confirmatory factor analysis indicated that the personal and professional factors were highly correlated; thus the scale is considered to be unidimensional (Ormerod et al., in preparation).
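
To make the recode-and-sum scoring concrete, the following minimal sketch (Python/pandas) illustrates the scheme described above; the item names are hypothetical, since the survey's actual item labels are not given here.

```python
import pandas as pd

# Hypothetical names for the 11 retaliatory-behavior items (illustration only).
RET_ITEMS = [f"ret_{i:02d}" for i in range(1, 12)]

# Recode per the text: "don't know" is treated as a midpoint between "no" and
# "yes" (Drasgow, Fitzgerald, Magley, Waldo, & Zickar, 1999).
RECODE = {"no": 1, "don't know": 2, "yes": 3}

def score_retaliation(responses: pd.DataFrame) -> pd.Series:
    """Sum the 11 recoded items; possible range 11-33, higher = more retaliation."""
    return responses[RET_ITEMS].replace(RECODE).sum(axis=1)
```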

Reporting

Five items assessed whether and to whom (e.g., supervisor; office designated to handle such complaints) the respondent reported the unprofessional, gender-related behavior. The items were scored dichotomously, with higher scores indicating that the person reported such behavior through one or more channels.

Coping Responses to Harassment

This scale asks respondents to indicate the extent to which they engaged in specific non-reporting coping strategies in response to unprofessional, gender-related behavior. The 17 items comprise four individual scales (cognitive avoidance, confrontation, social support, behavioral avoidance). However, for current purposes, all items were combined into one scale to represent the frequency of the targets' non-reporting responses to UGRB. Responses were provided on a 5-point Likert-type scale, ranging from 0 (not at all) to 4 (very large extent). Higher scores indicated that the respondent engaged in more frequent coping responses to the UGRB.

6 Two additional items asking about sexual assault and an item asking about "other unwanted gender-related behavior" were not utilized in these analyses.



Subjective Appraisal

The subjective appraisal scale contains six items that ask respondents to rate the degree to which a critical incident involving unprofessional, gender-related behavior was distressing (e.g., "offensive," "threatening," or "embarrassing"). Responses were provided on a 5-point Likert-type scale, ranging from 0 (not at all) to 4 (extremely), with higher scores reflecting personal appraisals of greater distress.

Organizational Climate

Climate was assessed by adapting the Organizational Tolerance of Sexual Harassment Scale (OTSH; Hulin, Fitzgerald, & Drasgow, 1996) to a military context. Respondents were presented with three hypothetical scenarios describing different types of UGRB and asked to indicate the degree to which they agreed with statements about the climate for UGRB within workgroups or broader organizational units. The climate scale assesses individual perceptions of organizational tolerance for UGRB through scenarios about Crude and Offensive Behavior, Unwanted Sexual Attention, and Sexual Coercion. Response options ask whether, if the respondent made a complaint, he or she would incur risk, would be taken seriously, and would see corrective action taken. Responses to these nine items were provided on a 5-point Likert-type scale, ranging from 1 (strongly disagree) to 5 (strongly agree). Higher scores reflected a work climate that is more tolerant of UGRB.

Masculinized Work Context

A masculinized work context (i.e., the degree to which the gender composition of the workgroup and the respondents' jobs are traditionally masculine) was assessed with four items: the gender of the immediate supervisor (male or female), the gender ratio of coworkers (response scale recoded to range from 1 = all women to 7 = all men), and two dichotomously scored questions asking whether the respondent's job was typically held by a person of their gender and whether members of their gender were common in their work environment. These four items were standardized and summed to create a single variable, with high scores representing the degree to which the respondent's work context was masculine.

Target's Organizational Power

Two items assessed the organizational power of the respondent. Respondents' pay grade (i.e., military pay classifications recoded to range from 1 to 20) and number of years of completed active-duty service were standardized and summed to create a scale on which higher scores reflect holding a greater amount of organizational power.
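
Both this power composite and the masculinized-work-context composite above use the same standardize-then-sum scheme; a minimal sketch, assuming hypothetical column names, might look as follows.

```python
import pandas as pd

def z_composite(data: pd.DataFrame, items: list[str]) -> pd.Series:
    """Standardize each item against the sample, then sum across items."""
    z = (data[items] - data[items].mean()) / data[items].std()
    return z.sum(axis=1)

# Hypothetical column names, for illustration only:
# power = z_composite(df, ["pay_grade_1_to_20", "years_active_duty"])
# context = z_composite(df, ["supervisor_is_male", "coworker_gender_ratio",
#                            "job_gender_typical", "gender_common_at_work"])
```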

Leadership Efforts to Stop Sexual Harassment

This 3-item scale assessed respondents' beliefs regarding whether senior leadership "made honest and reasonable efforts to stop sexual harassment." Responses were provided on a 3-point response scale and were recoded such that 1 = "no," 2 = "don't know," and 3 = "yes." A higher score indicated a stronger perception that senior leadership made honest and reasonable efforts to stop sexual harassment.


Job Satisfaction

Three indices of job satisfaction were assessed: supervisor satisfaction (9 items; e.g., "Leaders ... treat service members with respect"), coworker satisfaction (6 items; e.g., "You like your coworkers"), and work satisfaction (6 items; e.g., "You like the kind of work you do"). Responses were provided on a 5-point Likert-type scale, ranging from 1 (strongly disagree) to 5 (strongly agree). Higher scores reflected more satisfying experiences with leaders, coworkers, and work, respectively.

Military Life Satisfaction

The military life satisfaction scale consisted of seven items asking respondents to rate their degree of satisfaction with their life and work in the military (e.g., "Quality of your current residence," "Quality of your work environment," "Opportunities for professional development"). Responses were provided on a 5-point Likert-type scale, ranging from 1 (very dissatisfied) to 5 (very satisfied). Items were summed so that higher scores reflected greater satisfaction with various aspects of military life.

Commitment

The organizational commitment scale consists of three items that assessed the respondents' commitment to their service. Responses were provided on a 5-point Likert-type scale, ranging from 1 (strongly disagree) to 5 (strongly agree). Higher scores reflected a higher degree of commitment.

Psychological Outcomes

Two indices of psychological well-being were assessed: emotional effects (3 items; e.g., "Didn't do work or other activities as carefully as usual") and psychological distress (5 items; e.g., "Felt downhearted and blue"). Responses for both scales were provided on a 4-point Likert-type scale, ranging from 1 (little or none of the time) to 4 (all or most of the time). Items from both scales were recoded and summed into a composite variable, with higher scores reflecting greater psychological well-being.

Health Outcomes

Two indices of health satisfaction were assessed: general health (4 items; e.g., "My health is excellent") and health effects (4 items; e.g., "Accomplished less than you would like"). Responses for the two scales were provided on 4-point Likert-type scales, ranging from 1 (definitely false; little or none of the time) to 4 (definitely true; all or most of the time). Items from both scales were recoded and summed into a composite variable, with higher scores reflecting greater physical well-being.

Analysis Plan

We conducted path analysis separately for men and women to test the proposed model of retaliation following unprofessional, gender-related behavior (UGRB) shown in Figure 1. This approach was utilized instead of structural equation modeling because certain constructs (e.g., organizational power, leadership, masculinized work context, reporting) had only single indicators available. Each sample (male and female) was randomly split in half so that the model could be tested, with potential modifications, in the first half-sample and confirmed in the second half-sample. Analyses were conducted using LISREL 8.30 and PRELIS 2.30 (Jöreskog & Sörbom, 1999). The path analysis utilized product-moment correlation matrices and maximum likelihood estimation. The following fit statistics from LISREL were used to evaluate whether the specified model adequately fit the data: root mean square error of approximation (RMSEA), non-normed fit index (NNFI), standardized root mean square residual (SRMR), goodness-of-fit index (GFI), and adjusted goodness-of-fit index (AGFI). The residual components of the job satisfaction variables were allowed to covary because previous modeling with military samples suggested that the basic integrated model of sexual harassment does not include all relevant antecedents of job satisfaction (Fitzgerald, Drasgow, & Magley, 1999).
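
The original analyses were run in LISREL; the sketch below (Python/pandas, an illustrative reconstruction rather than the authors' code) shows only the split-half assignment and the product-moment correlation matrix that served as model input.

```python
import numpy as np
import pandas as pd

def split_half(sample: pd.DataFrame, seed: int = 0):
    """Randomly assign cases to an exploratory half and a confirmation half."""
    rng = np.random.default_rng(seed)
    mask = rng.random(len(sample)) < 0.5
    return sample[mask], sample[~mask]

# explore, confirm = split_half(women_df.dropna())  # listwise deletion, as in the text
# r_explore = explore.corr()  # product-moment correlation matrix; LISREL then
#                             # estimated the path model by maximum likelihood
```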



RESULTS

Exploratory Model

The range, mean, standard deviation, and coefficient alpha for the variables, by gender, are presented in Table 1. 7 All descriptive analyses were performed using SPSS 11.5.0.

Table 1. Descriptive and Psychometric Information for Variables Included in the Path Analysis Model

                               Women (n = 4,387)                  Men (n = 1,408)
Scale                          Range        M a,b   SD a,b  α     Range        M a,b   SD a,b  α
Exogenous Variables
  Organizational Climate       9-45         --      --      .90   9-45         --      --      .90
  Masculinized Work Context c  -6.76-6.20   -.004   2.78    .64   -15.7-2.15   .005    2.34    .37
  Organizational Power c       -2.33-9.53   .001    1.67    .56   -2.50-8.75   .001    1.68    .58
  Leadership Practices         3-9          7.56    1.73    .77   3-9          7.94    1.60    .81
Endogenous Variables
  UGRB                         1-64         --      --      .90   1-64         --      --      .86
  Appraisal                    0-24         --      --      .86   0-24         --      --      .84
  Coping Response              0-68         --      --      .82   0-68         --      --      .82
  Reporting d                  0-5          --      --      .73   0-5          --      --      .75
  Retaliation                  11-33        --      --      .88   11-31        --      --      .89
  Supervisor Satisfaction      9-45         28.35   8.01    .89   9-45         28.94   7.60    .88
  Coworker Satisfaction        6-30         20.49   5.15    .91   6-30         20.91   4.83    .90
  Work Satisfaction            6-30         20.41   6.09    .91   6-30         20.59   5.99    .90
  Military Satisfaction        7-35         22.83   5.55    .79   7-35         22.62   5.92    .82
  Psychological Well-Being     8-32         26.20   4.78    .89   8-32         26.39   4.71    .88
  Physical Well-Being          9-32         27.81   3.99    .84   8-32         28.01   3.83    .83
  Organizational Commitment    3-15         11.81   2.45    .84   3-15         11.97   2.43    .82

Notes. a The data are not yet released to the public; therefore we were unable to report certain statistics, such as the mean, standard deviation, and the number of individuals who experienced unprofessional, gender-related behavior (unreported cells are marked --). b The means and standard deviations are based on unweighted data, and caution is urged in their interpretation. c The low reliability of these scales is likely due to the small number of items. d Respondents may have reported one time only. UGRB = unprofessional, gender-related behavior.

7 The data are not yet released to the public; therefore we were unable to report certain statistics, such as the means and the number of individuals who experienced unprofessional, gender-related behavior or retaliation. Intercorrelations among the scales are available from the first author.


Investigating the women's data first, the proposed model (see Figure 1) was examined via path analysis in an exploratory random 50% sample (n = 1,659 after listwise deletion). This model of retaliation following UGRB utilizes the integrated model of sexual harassment with military personnel (Fitzgerald et al., 1999; Fitzgerald, Hulin et al., 1995) as a theoretical starting point and incorporates variables consistent with the whistle-blowing literature, research about harassment and retaliation, and the appraisal of stressful events and harassment. The fit statistics for this initial model were acceptable (χ²/df ratio = 445.88/62 = 7.19, RMSEA = .06, NNFI = .90, SRMR = .05, GFI = .97, AGFI = .93), suggesting that the model fit well. Inspection of the β and Γ matrices, standardized residuals, and modification indices suggested three minor and theoretically justified revisions: the nonsignificant path from organizational power to retaliation was dropped, the residual components of coping response and appraisal were allowed to covary, and the residual components of coping response and reporting were allowed to covary. This is logical because appraisal is considered to be a process rather than a static event, and appraisal of the stressfulness of UGRB would likely undergo reevaluation following any response to the behavior. For example, the target may originally appraise the UGRB as annoying and ask the perpetrator to stop; if this is unsuccessful and the behavior continues or increases, the target may then appraise the event as more stressful or threatening and elect to report the behavior. Coping response is also considered to be an ongoing process that can involve more than one type of response over time. Reporting, a type of response, is measured separately from other responses but is thought to be related. This modification improved fit (χ²/df ratio = 245.70/61 = 4.03, RMSEA = .04, NNFI = .95, SRMR = .03, GFI = .98, AGFI = .96) and did not affect the estimated elements of the β and Γ matrices.
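
For readers who want to check the reported indices against common benchmarks, the sketch below screens them using conventional rule-of-thumb cutoffs; these thresholds are our illustrative assumptions, not values stated in the paper.

```python
# Conventional rule-of-thumb cutoffs for the indices reported above; the paper
# does not state explicit thresholds, so these values are illustrative only.
CUTOFFS = {"rmsea": (0.06, "max"), "srmr": (0.08, "max"),
           "nnfi": (0.90, "min"), "gfi": (0.95, "min"), "agfi": (0.90, "min")}

def screen_fit(stats: dict) -> dict:
    """Flag each reported fit index as acceptable (True) or not (False)."""
    return {name: (stats[name] <= cut if rule == "max" else stats[name] >= cut)
            for name, (cut, rule) in CUTOFFS.items()}

# Women's revised exploratory model, using the values reported above:
# screen_fit({"rmsea": .04, "srmr": .03, "nnfi": .95, "gfi": .98, "agfi": .96})
# -> every index passes
```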

Next, the proposed model was examined for the men via path analysis in the men's exploratory random 50% sample (n = 545 after listwise deletion). The fit statistics for this initial model were acceptable (χ²/df ratio = 227.53/62 = 3.67, RMSEA = .07, NNFI = .87, SRMR = .07, GFI = .95, AGFI = .89), suggesting that the model fit reasonably well. We again allowed the residual components of coping response and appraisal and the residual components of coping response and reporting to covary and dropped the nonsignificant path from organizational power to retaliation. Following these modifications, the model was examined again, and the fit statistics were found to be acceptable (χ²/df ratio = 154.22/61 = 2.53, RMSEA = .05, NNFI = .93, SRMR = .05, GFI = .97, AGFI = .92).

Cross-Validation Model

For the women, the model was cross-validated on the remaining 50% sample (n = 1,702 after listwise deletion), and the fit was excellent (χ²/df ratio = 203.41/61 = 3.33, RMSEA = .04, NNFI = .96, SRMR = .03, GFI = .99, AGFI = .97). The same process took place for the men, and the model was cross-validated on the remaining 50% sample (n = 568 after listwise deletion). The fit was again excellent (χ²/df ratio = 140.73/61 = 2.31, RMSEA = .05, NNFI = .93, SRMR = .04, GFI = .97, AGFI = .93).

Model Summary

Path coefficients for both women and men can be seen in Tables 2, 3, 4, and 5. The paths suggest that for both sexes more frequent retaliation is predicted by an organizational climate that tolerates sexual harassment and, conversely, that when leadership makes reasonable efforts to stop harassment, retaliation is less frequent (see Table 2).



More frequent occurrences of UGRB, and more frequent use of coping responses such as cognitive and behavioral avoidance, confronting the perpetrator, and seeking social support, are associated with increases in retaliation (see Table 3). For women, reporting is linked to more retaliation (see Table 3).

Table 2. Paths from Antecedent Variables to Retaliation and UGRB for Women and Men

                              Antecedent
Outcome          Climate    Leader    Power    UGRB    Context
Retaliation      .21        -.07      ns       .22     -
                 .22        -.13      ns       .10     -
UGRB             .30        -.21      -.17     -       .15
                 .20        -.23      ns       -       ns

Note. The first entry in each cell is for the women's cross-validation sample; the second entry in each cell is from the men's sample. UGRB = unprofessional, gender-related behavior. ns = not significant. - = path not included in the model.

Table 3. Paths from Primary and Secondary Appraisal to Retaliation for Women and Men

                              Antecedent
Outcome          Appraisal    Coping    Reporting    UGRB
Retaliation      -            .13       .16          .22
                 -            .21       ns           .10
Appraisal        -            .93       .41          .53
                 -            .87       .24          .46

Note. The first entry in each cell is for the women's cross-validation sample; the second entry in each cell is from the men's sample. UGRB = unprofessional, gender-related behavior. ns = not significant.

Retaliation is associated with lowered levels of coworker satisfaction and of psychological and physical well-being for both men and women (see Table 4). For women, retaliation is related to lower levels of supervisor and work satisfaction and of satisfaction with the military. However, several of these paths are small and should be viewed cautiously (see Table 4).

As expected, for both men and women, more frequent experiences of UGRB were associated with an organizational climate that is tolerant of behaviors indicative of sexual harassment (see Table 2). When personnel perceive that leadership makes efforts to stop harassment, they also report less frequent UGRB. For women, working in a masculinized work context and holding lower organizational power are related to higher scores on the SEQ-DoD-S (see Table 2). More frequent experiences of UGRB were associated with appraising such experiences as more distressing or threatening for male and female personnel, which, in turn, was related to reporting and other types of coping response (see Table 3). Perceptions that one's organization is tolerant of harassment were related to decrements in job satisfaction and satisfaction with the military for men and women (see Table 4). Conversely, perceiving one's leadership as making reasonable efforts to stop harassment and holding greater organizational power were related to increased satisfaction for personnel. More frequent experiences of UGRB were related to decrements in coworker satisfaction, satisfaction with the military, and psychological well-being for personnel. Additionally, UGRB was related to decreased supervisor and work satisfaction for the men, although these paths were small and should be interpreted with caution. Unexpectedly, the path from UGRB to work satisfaction was positive (see Table 4), albeit small (.06; t = 2.13), in the women's cross-validation sample. No explanation is offered because this path was negative in the derivation sample. Finally, strong paths were observed between psychological and physical well-being (women = .34, men = .27), and organizational commitment was associated with satisfaction with coworkers, work, and the military for both men and women. For women, satisfaction with the supervisor also predicted organizational commitment (see Table 5).

Table 4. Paths from Antecedent Variables to Outcomes for Women and Men

                                    Antecedent
Outcome                      Climate    Leader    Power    UGRB    Retaliation
Supervisor Satisfaction      -.34       .25       .12      ns      -.05
                             -.31       .19       .20      -.10    ns
Coworker Satisfaction        -.22       .14       .15      -.11    -.06
                             -.22       .10       .15      -.08    -.14
Work Satisfaction            -.20       .16       .15      .06     -.05
                             -.19       .17       .11      -.09    ns
Military Satisfaction        -.27       .13       .16      -.07    -.10
                             -.26       .11       .15      -.14    ns
Psychological Well-Being     -          -         -        -.11    -.09
                             -          -         -        -.10    -.14
Physical Health              -          -         -        ns      -.05
                             -          -         -        ns      -.18

Note. The first entry in each cell is for the women's cross-validation sample; the second entry in each cell is from the men's sample. UGRB = unprofessional, gender-related behavior. ns = not significant.

Table 5. Paths from Satisfaction to Organizational Commitment and Psychological Well-Being for Women and Men

                                        Antecedent
Outcome                      Supervisor      Coworker        Work            Military
                             Satisfaction    Satisfaction    Satisfaction    Satisfaction
Organizational Commitment    .08             .07             .23             .23
                             ns              .13             .33             .22
Psychological Well-Being     ns              .09             .15             .19
                             ns              .11             .14             .16

Note. The first entry in each cell is for the women's cross-validation sample; the second entry in each cell is from the men's sample. ns = not significant.

In sum, military personnel reported more retaliation when they (1) worked in a climate where UGRB was believed likely to occur, (2) endorsed more unprofessional, gender-related behaviors, and (3) experienced these behaviors as more threatening or severe and responded by seeking social support, confronting or avoiding the perpetrator, or attempting to cope by managing their cognitive and emotional reactions to the behavior. Female personnel endorsed more retaliation when they reported UGRB to their supervisors, leadership, or organization. Conversely, retaliation was less frequent when personnel perceived that leaders made efforts to stop harassment. Retaliation was directly and inversely related to (1) coworker satisfaction, (2) psychological well-being, and (3) physical well-being for male and female personnel. Decrements in elements of job satisfaction and satisfaction with the military were in turn related to lowered organizational commitment, and psychological well-being affected physical well-being. For women, retaliation was associated with lowered satisfaction with supervisors, work, and the military.

DISCUSSION

Our findings suggest that retaliation is associated with harm to male and female personnel, including damage to their psychological and physical well-being and their satisfaction with coworkers. Retaliation is likely to occur when UGRB is severe and when the organization is tolerant of such behavior. Leadership efforts to stop harassment exerted a negative effect on retaliation, suggesting that such efforts also contribute to curbing retaliation. The appraisal process appears to play a critical role in determining retaliation. The relationships among UGRB, appraisal, response, reporting, and retaliation were as expected, with the exception of reporting for men. When UGRB is appraised as more distressing or threatening, personnel are likely to engage in increased responding (and, for women, reporting), which is related to increased retaliation. Although the determinants of UGRB (masculinized work context, holding organizational power) may differ for men and women, those of retaliation do not. It was surprising that organizational power was not related to retaliation for men or women. However, this is consistent with Near and Miceli (1986), who found that the power of the whistle-blower was unrelated to retaliation. It is possible that the status of the target is irrelevant in the face of the potential threat or damage to the organization (Near & Miceli, 1986) from a charge of harassment, particularly given the high degree of media attention that follows charges of sexual harassment in the military. Alternatively, those at the very highest levels of power (e.g., admirals, generals) were not surveyed; therefore it is impossible to say whether those who hold the highest positions experience less retaliation. It is possible that the status of the target functions significantly only relative to other variables, such as the status of the perpetrator.

It was unexpected that reporting had no relationship to retaliation for men. It is possible that men report less often, preferring other types of responses. That the composite measure of coping response had such a robust relationship with appraisal and retaliation bears further investigation. The direct and mediating effects of the appraisal process on retaliation have until now been unstudied. In future studies it will be important to understand the contributions of each type of response to retaliation (e.g., confronting, seeking support, cognitive and behavioral avoidance).
cognitive and behavioral avoidance).<br />

Retaliation exerted less of an effect on job satisfaction and satisfaction with the military<br />

than was expected. This may have to do with the numerous other predictor variables included<br />

in the model and bears closer investigation. At the same time, it is important to look at the<br />

whole picture and observe that, taken together, job satisfaction (and indirectly organizational<br />

commitment) was strongly influenced by multiple forms of negative workplace behavior<br />

(retaliation, UGRB) and climate related to harassment (organizational tolerance). Although we<br />

did not examine costs to the military, such behaviors likely have significant organizational<br />

costs, given that they affect the commitment and well-being of personnel. Related to costs, it<br />

will be important to investigate turnover intentions and actual turnover in future studies. That<br />

increases in retaliation were associated with decreases in satisfaction with the military for<br />

female personnel is notable. A next step for future research is to examine whether links exist<br />

between retaliation, military satisfaction, and exit strategies.<br />

That the two measures of organizational climate (leadership and climate) were such<br />

important correlates for retaliation (and UGRB and outcomes) supports both theory and<br />

research in the whistle-blowing and sexual harassment literatures. Climate is consistently<br />

associated with negative workplace experiences and negative outcomes directly and indirectly<br />

through the negative experiences (Fitzgerald, Drasgow et al., 1997; Glomb et al., 1997). In our<br />

study, tolerant climate was strongly related to increased experiences of retaliation, and when<br />

leadership implemented efforts to reduce sexual harassment retaliation decreased. Of course<br />

these findings are correlational and should not be interpreted causally. They are important<br />

233<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


234<br />

because active efforts from leadership to implement policy to stop harassment (e.g., punish<br />

those who harass) has been identified as the single most effective strategy from leadership<br />

(among those studied thus far) that relates to reduced rates of harassment in the military<br />

(Hunter Williams, Fitzgerald, & Drasgow, 1999). Our findings support that it is important to<br />

promote an organizational climate that does not tolerate wrong-doing.<br />

Our general framework for understanding retaliation following from UGRB appears to be supported. An additional variable to consider in future research is the fear of retaliation. Magley and Cortina (2002) found that fear of retaliation was related to harassment, the power of the perpetrator, and leadership behaviors, and that even in the absence of reporting or other active coping responses, fear of retaliation was associated with negative outcomes.

Although this is a strong first attempt to understand the process of retaliation following UGRB, it raises several issues. One issue is whether personal and professional retaliation have the same antecedents and outcomes. Given that our measure included a majority of items about professional retaliation, the model may be more reflective of professional retaliation. There is far less research about personal retaliation, and in civilian contexts it is not a legally actionable offense. However, what little research exists suggests that interpersonal retaliation is strongly associated with damages that can lead to costs to the organization (Cortina & Magley, in press; Fitzgerald et al., in preparation); thus it seems important to include both types of retaliation in future research.

This study is not without limitations. Our data are cross-sectional and the analytic methods correlational; therefore no assumptions about causality can be made. Common method variance may also explain some of the significant relationships among variables, given that data were single-source and self-report. However, questions about unprofessional, gender-related experiences and retaliation were asked after the outcome variables in an attempt to minimize method variance due to self-report.

In conclusion, findings from this study suggest that military personnel who experience retaliation are also likely to experience decrements in job satisfaction and decreased well-being. Supporting our framework, retaliation was associated with a climate that tolerates harassment, with unprofessional, gender-related experiences, and with the appraisal process. This study supports the contention that organizational climate is of paramount importance for reducing negative workplace experiences and that, left unchecked, negative reprisals will be associated with outcomes that can be costly for individuals and organizations. That leadership efforts to reduce harassment are associated with reduced retaliation suggests that it would be effective to continue to implement policies and procedures that inhibit unprofessional, gender-related behavior.

REFERENCES

Bergman, M. E., Langhout, R. D., Palmieri, P. A., Cortina, L. M., & Fitzgerald, L. F. (2002). The (un)reasonableness of reporting: Antecedents and consequences of reporting sexual harassment. Journal of Applied Psychology, 87, 230-242.

Coles, F. S. (1986). Forced to quit: Sexual harassment complaints and agency response. Sex Roles, 14, 81-95.

Cortina, L. M., & Magley, V. J. (in press). Raising voice, risking retaliation: Events following interpersonal mistreatment in the workplace. Journal of Occupational Health Psychology.

Crockett, R. W., & Gilmere, J. A. (1999). Retaliation: Agency theory and gaps in the law. Public Personnel Management, 28, 39-49.

Drasgow, F. (1999). Preface to the special issue. Military Psychology, 11, 217-218.

Drasgow, F., Fitzgerald, L. F., Magley, V. J., Waldo, C. R., & Zickar, M. J. (1999). The 1995 Armed Forces sexual harassment survey: Report on scales and measures (DMDC Report No. 98-004). Arlington, VA: Defense Manpower Data Center.

Fitzgerald, L. F., Drasgow, F., Hulin, C. L., Gelfand, M. J., & Magley, V. J. (1997). The antecedents and consequences of sexual harassment in organizations: A test of an integrated model. Journal of Applied Psychology, 82, 578-589.

Fitzgerald, L. F., Drasgow, F., & Magley, V. J. (1999). Sexual harassment in the Armed Forces: A test of an integrated model. Military Psychology, 11, 329-343.

Fitzgerald, L. F., Hulin, C. L., & Drasgow, F. (1995). The antecedents and consequences of sexual harassment in organizations: An integrated model. In G. P. Keita & J. J. Hurrell, Jr. (Eds.), Job stress in a changing workforce: Investigating gender, diversity, and family issues (pp. 55-74). Washington, DC: American Psychological Association.

Fitzgerald, L. F., Smolen, A. C., Harned, M. S., Collinsworth, L. L., & Colbert, C. L. (in preparation). Sexual harassment: Impact of reporting and retaliation. University of Illinois at Urbana-Champaign.

Fitzgerald, L. F., Swan, S., & Fischer, K. (1995). Why didn't she just report him? The psychological and legal implications of women's responses to sexual harassment. Journal of Social Issues, 51, 117-138.

Fitzgerald, L. F., Swan, S., & Magley, V. J. (1997). But was it really sexual harassment? Legal, behavioral, and psychological definitions of the workplace victimization of women. In W. O'Donohue (Ed.), Sexual harassment: Theory, research, and treatment (pp. 5-28). Boston: Allyn and Bacon.

Glomb, T. M., Richman, W. L., Hulin, C. L., Drasgow, F., Schneider, K. T., & Fitzgerald, L. F. (1997). Ambient sexual harassment: An integrated model of antecedents and consequences. Organizational Behavior and Human Decision Processes, 71, 309-328.

Hesson-McInnis, M. S., & Fitzgerald, L. F. (1997). Sexual harassment: A preliminary test of an integrative model. Journal of Applied Social Psychology, 27, 877-901.

Hulin, C. L., Fitzgerald, L. F., & Drasgow, F. (1996). Organizational influences on sexual harassment. In M. Stockdale (Ed.), Sexual harassment in the workplace (Vol. 5, pp. 127-150). Thousand Oaks, CA: Sage.

Hunter Williams, J., Fitzgerald, L. F., & Drasgow, F. (1999). The effects of organizational practices on sexual harassment and individual outcomes in the military. Military Psychology, 11, 303-328.

Jöreskog, K., & Sörbom, D. (1999). LISREL 8.30 and PRELIS 2.30 [Computer software]. Scientific Software International, Inc.

Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal and coping. New York: Springer.

Loy, P. H., & Stewart, L. P. (1984). The extent and effects of sexual harassment of working women. Sociological Focus, 17, 31-43.

Magley, V. J., & Cortina, L. M. (2002, April). Retaliation against military personnel who blow the whistle on sexual harassment. In V. J. Magley & L. M. Cortina (Cochairs), Intersections of workplace mistreatment, gender, and occupational health. Symposium presented at the annual meeting of the Society for Industrial and Organizational Psychology, Toronto, Ontario, Canada.

Miceli, M. P., & Near, J. P. (1988). Individual and situational correlates of whistle-blowing. Personnel Psychology, 41, 267-281.

Miceli, M. P., & Near, J. P. (1992). Blowing the whistle: The organizational and legal implications for companies and employees. New York: Lexington Books.

Miceli, M. P., Rehg, M., Near, J. P., & Ryan, K. C. (1999). Can laws protect whistle-blowers? Results of a naturally occurring field experiment. Work and Occupations, 26, 129-151.

Near, J. P., & Miceli, M. P. (1986). Retaliation against whistle-blowers: Predictors and effects. Journal of Applied Psychology, 71, 137-145.

Ormerod, A. J., Lawson, A. K., Sims, C. S., Lytell, M. C., Wadlington, P. L., Yaeger, D. W., Wright, C. V., Reed, M. E., Lee, W. C., Drasgow, F., Fitzgerald, L. F., & Cohorn, C. A. (in preparation). The 2002 Status of the Armed Forces Surveys - Workplace and Gender Relations: Report on scales and measures (DMDC Report No.). Arlington, VA: Defense Manpower Data Center.

Parmerlee, M. A., Near, J. P., & Jensen, T. C. (1982). Correlates of whistle-blowers' perceptions of organizational retaliation. Administrative Science Quarterly, 27, 17-34.

Pryor, J. B., Giedd, J. L., & Williams, K. B. (1995). A social psychological model for predicting sexual harassment. Journal of Social Issues, 51, 69-84.

Pryor, J. B., & Whalen, N. J. (1997). A typology of sexual harassment: Characteristics of harassers and the social circumstances under which sexual harassment occurs. In W. O'Donohue (Ed.), Sexual harassment: Theory, research, and treatment (pp. 129-151). Boston: Allyn & Bacon.

SPSS for Windows 11.5.0 [Computer software]. (2002). Chicago, IL: SPSS Inc.

Stark, S., Chernyshenko, O. S., Lancaster, A. R., Drasgow, F., & Fitzgerald, L. F. (2002). Toward standardized measurement of sexual harassment: Shortening the SEQ-DoD using item response theory. Military Psychology, 14, 49-72.

Stockdale, M. S. (1998). The direct and moderating influences of sexual-harassment pervasiveness, coping strategies, and gender on work-related outcomes. Psychology of Women Quarterly, 22, 521-535.

Survey Method for Counting Incidents of Sexual Harassment (2002, April 28). Washington, DC: Office of the Under Secretary of Defense for Personnel and Readiness.



UNDERSTANDING RESPONSES TO SEXUAL HARASSMENT IN THE U.S. MILITARY *

Angela K. Lawson
Louise F. Fitzgerald
University of Illinois at Urbana-Champaign
603 East Daniel
Champaign, Illinois 61820
alawson@s.psych.uiuc.edu

INTRODUCTION

Research on sexual harassment prevalence confirms its presence in a wide variety of organizational environments and finds it to be strongly associated with a number of negative outcomes for both individuals and organizations (Fitzgerald, Drasgow, Hulin, Gelfand, & Magley, 1997; Malamut & Offermann, 2001; McKinney et al., 1988; U.S. Merit Systems Protection Board, 1994). Workplace gender ratio, gender stereotyping of jobs, and organizational climate have all been linked to the prevalence of sexually harassing behaviors; further, female employees in male-dominated work groups and/or organizations that appear to tolerate sexually inappropriate behavior are more likely to be targets of harassment than are those employed in more gender-balanced environments intolerant of sexual harassment (Fitzgerald et al., 1997).

Whatever the context, employees rarely report such experiences to management (Bergman et al., 2002; Marin & Guadagno, 1999). Marin and Guadagno (1999) suggest that such non-reporting may be linked to non-labeling of the incident (as harassment), fear of retaliation, or negative appraisals from supervisors and coworkers. Fitzgerald and Swan (1995) postulate that reluctance to report may also arise from a belief that complaints would not be taken seriously, whereas Baker et al. (1990) implicate the organization's perceived tolerance of inappropriate behavior as an important influence. Additionally, research suggests that the reasons for not reporting can be grouped into two categories: the first is fear arising from the perceived risks to the target's occupational and personal well-being, and the second centers on issues associated with organizational policies and procedures for reporting sexual harassment (Peirce, Rosen, & Hiller, 1997).

* Paper presented at the 2003 IMTA Conference, Pensacola, Florida.
This research is funded by the Defense Manpower Data Center (DMDC), through the Consortium of Universities of the Washington Metropolitan Area, Contract M67004-03-C-0006, as well as National Institute of Mental Health grant #MH50791-08. The opinions in this paper are those of the authors and are not to be construed as an official DMDC or Department of Defense position unless so designated by other authorized documents.
Please do not cite or quote without permission. Correspondence should be addressed to Angela K. Lawson, 603 E. Daniel, Champaign, IL 61820 or alawson@s.psych.uiuc.edu.


Determinants of Reporting

Research on the determinants of reporting sexual harassment has typically focused on a combination of individual, stimulus (i.e., severity-related), and/or organizational variables. Exploration of the organizational determinants of reporting behavior has examined the climate, or tolerance, for sexual harassment in the organization. Organizations that do not proactively discourage this problem, do not take reports of harassment seriously, do not discourage retaliation, or that have inadequate or nonexistent harassment policies and investigative procedures are less likely to be told about the problems their employees may be having (Offermann & Malamut, 2002; Fitzgerald et al., 1997; Brooks & Perot, 1991; Perry et al., 1997; Malamut & Offermann, 2001; Bergman, Langhout, Palmieri, Cortina, & Fitzgerald, 2002). Stimulus antecedents include variables specific to the incident of harassment, such as frequency, intensity, and duration (Rudman et al., 1995; Malamut & Offermann, 2001; Bergman et al., 2002), whereas individual variables focus on demographic characteristics such as age, race, etc. (Perry et al., 1997; Brooks & Perot, 1991; Knapp, Faley, Ekeberg, & DuBois, 1997; Rudman et al., 1995; Malamut & Offermann, 2001; Bergman et al., 2002). Although each of these determinants has been shown to affect reporting behavior, individual variables are typically less influential than organizational and situational variables in the decision to report (Fitzgerald et al., 1995).

While work on the determinants of reporting sexual harassment provides invaluable information, it has operated under a dichotomous model of reporting. Such research ignores the findings of Malamut et al. (2002), which suggest that targets of sexual harassment do not respond to harassment in such a black-and-white manner. Instead, it appears that targets employ reporting strategies far less rigidly and choose to report only some of the harassment they have experienced rather than all or none of it. Thus, combining individuals who report only some of the harassment experience(s) with either non-reporters or those who report all of the harassment may confound our understanding of reporting behavior. Additionally, analysis of sexual harassment litigation suggests that details of the harassing experiences gradually emerge during the litigation process. This gradual emergence of details can be used to discredit complainants by suggesting that the complainant is fabricating the description of the experience. A better understanding of reporting behavior could be utilized to lend credibility to targets during litigation.

Further, while it is interesting to know which individual, stimulus, and organizational variables impact reporting behavior, it is arguably equally if not more important to understand targets' reasons for not reporting sexual harassment. An examination of this type of data can provide researchers and organizations with valuable information that could assist them in effectively encouraging reporting. Enactment of changes based on targets' responses would protect individuals from physical and psychological harm and also protect an organization's investment in its employees by possibly increasing job satisfaction, decreasing attrition, etc.

METHOD

Participants

This study utilized data from the Status of the Armed Forces Survey: Workplace and Gender Relations 2002 (WGR 2002) collected by the Defense Manpower Data Center. The sample was selected through a stratified random sampling procedure in an effort to adequately represent all relevant subgroups, including branch of service, gender, pay grade, and racial/ethnic group membership. The original sample consisted of 60,415 individuals, including 3,894 undeliverable records. The survey response rate was based on only those surveys with a minimum of 50% item completion and completion of at least one item on the Sexual Experiences Questionnaire (SEQ-DoD). Survey distribution resulted in a 36% adjusted weighted response rate.

Following survey completion, data from 5,886 respondents (1,671 men and 4,215 women) were classified as eligible for data analysis. Participant data were deemed eligible for analysis in this study if, in addition to the eligibility requirements for response rate calculation, the participant had answered at least one item on the Sexual Experiences Questionnaire with a score of 1 or greater; at least one item on the One Situation with the Greatest Effect question (a situation-specific Sexual Experiences Questionnaire) with a score of 1; and at least one item on either the Reporting scale or the Non-Reporting scale with a score of 1.
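A minimal sketch of this eligibility screen is shown below, assuming the survey data sit in a pandas DataFrame; the column-name lists are hypothetical placeholders, not the actual WGR 2002 variable names.

import pandas as pd

# Hypothetical eligibility screen; item lists are illustrative placeholders.
def is_eligible(row, seq_items, se_items, report_items, nonreport_items):
    """True if a respondent meets all of the analysis-eligibility criteria."""
    seq_ok = (row[seq_items] >= 1).any()            # >= 1 on at least one SEQ item
    se_ok = (row[se_items] == 1).any()              # endorsed one "One Situation" item
    rep_ok = (row[report_items] == 1).any()         # >= 1 on the Reporting scale, or ...
    nonrep_ok = (row[nonreport_items] == 1).any()   # ... >= 1 on the Non-Reporting scale
    return seq_ok and se_ok and (rep_ok or nonrep_ok)

Applying is_eligible row-wise (e.g., df.apply(..., axis=1)) would reproduce the kind of filter described above.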

Participants were somewhat evenly divided across branches of service, with 27% in the Army, 23% in the Navy, 14% in the Marine Corps, 26% in the Air Force, and 10% in the Coast Guard. The majority of participants were white women: 72% were women and 28% men, while 12% were Hispanic, 60% non-Hispanic white, 18% non-Hispanic black or African American, and 10% of other racial/ethnic backgrounds. Participants' marital status was fairly evenly split between those who were married or separated (54%) and those who were never married, divorced, or widowed (46%). Roughly half of the participants (48%) completed at least two years of college but did not obtain a degree beyond an associate's degree, 33% received a four-year college degree or higher, and 19% received a GED or high school diploma or completed less than 12 years of school. Nearly half of the participants (47%) had completed less than 6 years of active-duty service, 14% had completed 6-9 years, 30% had completed 10-19 years, and 9% had completed 20 years or more. Finally, 70% of participants were enlisted personnel, 17% were warrant officers, and 26% were commissioned officers. Participant age was not assessed in the survey.

Procedure

A 16-page survey booklet was mailed to Active Duty and Coast Guard members via a primarily residential mailing list. Additionally, a web site was created to provide service members with the option of online survey completion. A notification of the upcoming survey was mailed in December 2001, followed by the first wave of the survey mailing three weeks later. Two weeks after the initial mailing, service members were sent a thank-you letter via direct mail, followed by a second wave of surveys mailed to individuals who had not yet returned the initial survey. The third and final wave of survey mailings was sent four weeks after the second survey mailing. The survey was closed on April 23, 2002.

Measures

Climate. Respondents were presented with three scenarios of harassment and asked to assess the degree to which they thought that a report of this behavior would be taken seriously, whether or not it would be risky to complain, and whether or not they thought any action would be taken as a result of the complaint. The three item variables (one item per scenario) of Serious, Risk, and Action were then combined to form an overall nine-item Climate variable. These items are modified from the Organizational Tolerance of Sexual Harassment scale (Hulin, 1993) and are intended to assess participants' perceptions of climate in the military. A higher score on the Climate variable indicates a perception that the organization is tolerant of sexual harassment.
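As a rough illustration of how such a composite might be assembled, the sketch below averages the nine scenario items into a single Climate score; the column names are hypothetical, and any reverse-coding needed so that higher scores indicate greater perceived tolerance is assumed to have been done upstream.

import pandas as pd

# Hypothetical column names: one Serious, Risk, and Action item per scenario.
CLIMATE_ITEMS = [f"{stem}_{scenario}"
                 for stem in ("serious", "risk", "action")
                 for scenario in (1, 2, 3)]  # 3 items x 3 scenarios = 9 items

def climate_score(df: pd.DataFrame) -> pd.Series:
    """Mean of the nine climate items; items are assumed already coded so
    that higher values indicate greater perceived tolerance of harassment."""
    return df[CLIMATE_ITEMS].mean(axis=1)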

Frequency. Participants were asked to reflect on the One Situation with the Greatest Effect to answer questions related to the frequency of, and other aspects specific to, the individual's experience with the unwanted behavior(s). Frequency was determined by the participant's response to a single item that asked how often the offensive behavior occurred. Using a 5-point response scale, respondents were given the following options: "once", "occasionally", "frequently", "almost every day", and "more than once a day". A higher score indicates more frequent harassment.

One Situation with the Greatest Effect. All 19 items from the Department of Defense Sexual Experiences Questionnaire (DoD SEQ) were used to measure the frequency of unwanted sex/gender-related talk and/or behavior as it pertained to the one situation of sexual harassment that the target perceived to have had the greatest effect on them.⁵ The response scale for the situation-specific version of the SEQ was modified to contain a dichotomous scoring option of "did this" or "did not do this" instead of the 5-point response scale ranging from "never" to "very often" utilized in the DoD SEQ. Respondents were asked to indicate, using the dichotomous response scale, whether the unwanted verbal and/or physical behavior(s) had occurred. The response options were divided into subgroups based on the types of behaviors indicated in each item.

Sexist Behavior was identified by the endorsement of one or more of four response options referencing unwanted behaviors that included "referring to people of your gender in insulting or offensive terms", "treated you 'differently' because of your gender", "made offensive sexist remarks", and "put you down or was condescending to you because of your gender".

Unwanted Sexual Attention goes beyond verbal discourse on sexual topics and includes behaviors such as unwanted touching and unreciprocated "attempts to establish a romantic sexual relationship". Respondents were identified as having experienced Unwanted Sexual Attention if they endorsed at least one of six response options relevant to these types of behavior. These response options include "made unwanted attempts to establish a romantic sexual relationship with you despite your efforts to discourage it", "continued to ask you for dates, drinks, dinner, etc., even though you said 'No'", "touched you in a way that made you feel uncomfortable", "made unwanted attempts to stroke, fondle or kiss you", "attempted to have sex with you without your consent or against your will, but was not successful", and "had sex with you without your consent or against your will".

⁵ Survey measurement of sexual harassment is defined by the U.S. Department of Defense as the presence of behaviors indicative of sexual harassment (Crude/Offensive Behavior, Sexual Coercion, and Unwanted Sexual Attention; Sexist Behavior and Sexual Assault are not counted in the DoD survey measure of sexual harassment) and the labeling of those behaviors as sexual harassment (Survey Method for Counting Incidents of Sexual Harassment, 2002). The WGR 2002 did not include a labeling item specific to the "One Situation with the Greatest Effect"; rather, labeling was tied to all incidents of behaviors indicative of harassment. As such, the use of the phrase "sexual harassment" or "harassment" in this document reflects only one of the two qualifiers of the phenomenon. Further, use of these terms is in no way meant to reflect the legal definition of sexual harassment. This wording is utilized for consistency and comparison with the literature on reporting sexual harassment.



Finally, individuals were identified as having experienced Sexual Coercion if they endorsed one of four response items describing behaviors wherein the harasser bribed, threatened, or in some way coerced the target into participating in sexual activities, or treated the target badly for not participating. These response options include "made you feel like you were being bribed with some sort of reward or special treatment to engage in sexual behavior", "made you feel threatened with some sort of retaliation for not being sexually cooperative", "treated you badly for refusing to have sex", and "implied faster promotions or better treatment if you were sexually cooperative".
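The subscale logic above amounts to flagging a respondent once any item in a behavior group is endorsed. The sketch below illustrates that derivation, assuming the dichotomous SE items live in a pandas DataFrame; the item groupings are placeholders, not the actual DoD SEQ item numbers.

import pandas as pd

# Placeholder item groupings; the real DoD SEQ item numbers differ.
SEXIST_ITEMS = ["se01", "se02", "se03", "se04"]                     # 4 items
ATTENTION_ITEMS = ["se05", "se06", "se07", "se08", "se09", "se10"]  # 6 items
COERCION_ITEMS = ["se11", "se12", "se13", "se14"]                   # 4 items

def add_subscale_flags(df: pd.DataFrame) -> pd.DataFrame:
    """Flag each respondent 1 if they endorsed ('did this' = 1) any item in a group."""
    df = df.copy()
    df["sexist_behavior"] = (df[SEXIST_ITEMS] == 1).any(axis=1).astype(int)
    df["unwanted_attention"] = (df[ATTENTION_ITEMS] == 1).any(axis=1).astype(int)
    df["sexual_coercion"] = (df[COERCION_ITEMS] == 1).any(axis=1).astype(int)
    return df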

Reasons for Not Reporting. Survey participants were asked to indicate whether or not any of the 19 items listed in this checklist matched their reasons for not reporting the behavior(s) indicative of sexual harassment they had experienced as part of the One Situation with the Greatest Effect.

Reporting Status. Research participants were separated into one of three groups: Complete Reporters (those individuals who reported all of the harassment that had occurred as part of the One Situation with the Greatest Effect), Non-Reporters (those individuals who did not report any of the harassing behaviors indicated in the One Situation with the Greatest Effect), and Partial Reporters (those individuals who reported only some of the behaviors that had occurred as part of the One Situation with the Greatest Effect). Non-Reporters were identified through the use of a 5-item dichotomous (yes or no) reporting checklist that asked participants to specify whether or not they had reported the unwanted sexual talk and/or behaviors they had indicated on the Sexual Experiences Questionnaire as applied specifically to the one situation (incident(s) of sexual harassment) that had the greatest effect on them. Respondents who did not endorse reporting the One Situation with the Greatest Effect to any of the five listed individuals/groups were placed in the Non-Reporters group.

Participants were placed into the Complete Reporters group based on their responses to both the reporting checklist used to identify Non-Reporters and an additional survey item relevant to the comprehensiveness of the participant's formal or informal report. Respondents were asked to indicate (yes or no) on the reporting checklist whether they had reported the incident of sexual harassment that had the greatest effect on them to their immediate supervisor; someone else in their chain of command (including their commanding officer); supervisor(s) of the person(s) who did it; a special military office responsible for handling these kinds of complaints (for example, a Military Equal Opportunity or Civil Rights Office); or another installation/Service/DoD person or office with responsibility for follow-up.

Participants who endorsed reporting the situation to any of the above individuals/groups and indicated that they had reported all of the behaviors that had occurred as part of the One Situation with the Greatest Effect were designated as Complete Reporters. Participants who endorsed reporting the situation to any of the individuals/groups listed in the reporting checklist but indicated that they had not reported all of the behaviors that had occurred as part of the One Situation with the Greatest Effect were designated as Partial Reporters.
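Read as a decision rule, the trichotomization reduces to two questions: did the respondent tell any of the five listed parties, and if so, did they report all of the behaviors? A minimal sketch of that rule:

from typing import Optional

def reporting_status(told_anyone: bool, reported_all: Optional[bool]) -> str:
    """Trichotomize reporting status per the rule described above.

    told_anyone  -- endorsed any of the five reporting-checklist targets
    reported_all -- whether all behaviors in the One Situation were reported
                    (None when the follow-up item does not apply)
    """
    if not told_anyone:
        return "Non-Reporter"
    return "Complete Reporter" if reported_all else "Partial Reporter"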

Supervisor Harassment. Bergman et al. (2002) suggest that a harasser's status in an organization may influence a target's willingness to report sexual harassment. Therefore, perpetrators were categorized on the basis of their power within the military context. Respondents who endorsed one of five items indicating that a supervisor or someone else of a higher rank than themselves perpetrated the harassment were identified as having experienced Supervisor Harassment.

Subordinate Harassment. Subordinate Harassment was assessed by participants' responses to one of two questions regarding a subordinate's perpetration of harassment.

Missing Data

Missing data were imputed utilizing a technique suggested by Bernaards and Sijtsma (2000). The authors recommend a two-way imputation based on both the item mean and the person mean rather than relying on one or the other alone: "Two-way imputation (TW) calculates across available scores the overall mean, the mean for item j and the mean for person i, and imputes IM (item mean) + PM (person mean) - OM (overall mean) for missing observation (i,j)" (Bernaards & Sijtsma, 2000, p. 331). The number of items imputed per scale depended on the scale's length: for scales of 4-10 items, at most 1 item was imputed; for 11-20 items, up to 2 items; and for 21-30 items, up to 3 items.
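A minimal sketch of the TW rule follows, assuming an n-persons x n-items score matrix with NaN marking missing responses; the per-scale cap on the number of imputed items is bookkeeping omitted here.

import numpy as np

def two_way_impute(scores: np.ndarray) -> np.ndarray:
    """Replace each missing (i, j) with person mean + item mean - overall mean,
    all means computed over the available (non-missing) scores."""
    x = scores.astype(float).copy()
    om = np.nanmean(x)            # overall mean
    pm = np.nanmean(x, axis=1)    # person means (rows)
    im = np.nanmean(x, axis=0)    # item means (columns)
    rows, cols = np.where(np.isnan(x))
    x[rows, cols] = pm[rows] + im[cols] - om
    return x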

ANALYSIS

Following data collection and cleaning, the dataset was randomly divided into a developmental and a confirmatory sample (N=2951). The developmental sample was used to examine variables that could differentiate between individuals who reported none of the harassment (Non-Reporters, N=4471), some of the harassment (Partial Reporters, N=644), or all of the harassment they had experienced (Complete Reporters, N=771). A host of variables were included in the Multinomial Logistic Regressions performed separately for women and men on the developmental sample. These variables included sex, level of education, race/ethnicity, marital status, branch of service, pay grade, years of active-duty service, gender of supervisor, gender mix of work group, perception of leadership, sexual behaviors, sexist behaviors, unwanted sexual attention, sexual coercion, appraisal of harassment, where and when harassment occurred, gender of the harasser(s), rank of the harasser(s), frequency and duration of harassment, organizational tolerance, and sexual harassment training. Only those variables that, when alone in the model, resulted in significant (p ≤ .05) or marginally significant (p ≤ .15) differentiation between the three groups were kept in the model for inclusion in testing using the confirmatory sample.⁶
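The screening step can be pictured as a loop of one-predictor multinomial logits over the developmental sample, retaining any predictor whose overall test clears the p ≤ .15 bar. A sketch using statsmodels is below; the DataFrame dev, the 0/1/2 outcome column status, and the candidate list are assumptions for illustration, not the authors' actual code.

import statsmodels.api as sm

def screen_predictors(dev, candidates, outcome="status", alpha=0.15):
    """Retain predictors whose single-variable multinomial logit is at least
    marginally significant (overall likelihood-ratio test p <= alpha)."""
    kept = []
    for var in candidates:
        X = sm.add_constant(dev[[var]])
        fit = sm.MNLogit(dev[outcome], X).fit(disp=0)
        if fit.llr_pvalue <= alpha:
            kept.append(var)
    return kept

# The retained set would then be refit jointly on the confirmatory sample, e.g.:
# sm.MNLogit(conf["status"], sm.add_constant(conf[kept])).fit()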

Following developmental sample analysis, all significant or marginally significant variables were input into separate Multinomial Logistic Regression models for the men and women and run on the confirmatory sample.⁷ Only two variables resulted in significant discrimination between groups in the female confirmatory sample (see Figure 1). Interpretation of these results reveals that, in comparison to Non-Reporters, individuals who reported only some of the harassment they experienced were more likely to endorse experiencing sexist behaviors and sexual coercion. Additionally, when compared to Complete Reporters, Partial Reporters were less likely to indicate they had experienced sexist behaviors but more likely to endorse experiencing sexual coercion.

⁶ Despite significant results, Gender Harassment, a combination of sexual and sexist behaviors, is omitted from the training sample results and confirmatory sample model due to its redundancy and the authors' interest in understanding its components.
⁷ Marital Status was erroneously entered into the model used in the confirmatory sample and is therefore omitted.

Theoretically, the results differentiating Non-Reporters from Partial Reporters make sense, as one would expect a Non-Reporter to experience less harassment than someone who reports these behaviors. However, the comparison between the Complete Reporters and the Partial Reporters is somewhat less clear. One might argue that sexual coercion is a more severe form of harassment, owing to its inherently threatening nature, and would therefore be more likely to lead to reporting, whereas sexist behavior, a more minor offense, would be less likely to lead to such an end result. Yet it appears that it is sexist behavior, rather than sexual coercion, that may have driven the Complete Reporters to report their harassment (see Figure 1).

Sexist Behavior
● Non-Reporters < Partial Reporters < Complete Reporters
Sexual Coercion
● Non-Reporters < Partial Reporters > Complete Reporters

Figure 1. Female Confirmatory Three-Group Comparison of Significant Variables.

Frequency analysis of the Sexual Experiences Questionnaire reveals that Partial and Complete Reporters equally endorse experiencing sexual coercion, but the Partial Reporters appear more likely to list that behavior as part of the one situation that had the greatest effect on them. Therefore, it does not appear that individuals in the Partial Reporting category have the most severe experience of harassment. Hypotheses as to the cause of this result are difficult in that the survey asked participants to indicate the behaviors they had experienced but did not ask which behaviors they had reported. Assuming the Partial Reporters actually reported the sexual coercion, one could hypothesize that their experience of sexual coercion was so horrific as to warrant reporting, whereas other behaviors were viewed as minor incidents and went unreported.

An examination of the Multinomial Logistic Regression results for the male three-group confirmatory sample reveals similarly interesting results. When compared to Non-Reporters, it appears that Partial Reporters are more likely to endorse experiencing more frequent harassment and sexist behaviors (see Figure 2).
and sexist behaviors (see figure 2).<br />

Frequency<br />

● Non-Reporters < Partial Reporters > Complete Reporters<br />

Sexist Behavior<br />

● Non-Reporters < Partial Reporters<br />

Supervisor Harassment<br />

● Partial Reporters < Complete Reporters<br />

Figure 2. Male Confirmatory Three-Group Comparison of Significant Variables.<br />


Although less is known about male responses to sexual harassment, this result seems logical in that one would expect individuals who experience more harassment, and more frequent harassment, to be more likely to report that behavior. As with the female sample, however, it is the comparison between the Complete Reporters and the Partial Reporters that is somewhat more ambiguous. In comparison to the Complete Reporters, the men in the Partial Reporters group are more likely to report experiencing more frequent harassment and less likely to report being harassed by a supervisor or multiple supervisors. Perhaps, while the harassment is less frequent for men in the Complete Reporters group than in the Partial Reporters group, it is the recognition that the harassment was perpetrated by a supervisor that the target identifies as worthy of reporting, whereas for the Partial Reporters it is the frequency with which a behavior occurs that results in the behavior being reported. The implication is that men in the Partial Reporting category appear to have the most severe experience of harassment, at least in terms of frequency.

The original sample was later split into two groups in order to better understand the impact of combining individuals in the Partial Reporters group with either the Non-Reporters or the Complete Reporters on our interpretation and understanding of the determinants of reporting sexual harassment. Individuals who reported none of the harassment they experienced comprised the Non-Reporters group, and individuals who reported some or all of the harassment that occurred to them were placed in the Reporters group. As with the previous analyses, Multinomial Logistic Regressions⁸ were run separately for women and men on both the developmental and confirmatory samples using the same set of initial variables. Only those variables that, when alone in the model, resulted in significant (p ≤ .05) or marginally significant (p ≤ .15) differentiation between the two groups were kept in the model for inclusion in testing using the confirmatory sample.⁹

Comparison between the Non-Reporters and Reporters in the confirmatory two-group female sample resulted in two significant differentiating variables (see Figure 3). When compared to the Non-Reporters, Reporters were more likely to endorse experiencing sexist behaviors and sexual coercion. Intuitively this appears correct, in that the more one experiences these types of negative behaviors, the more likely one would be to report them. However, in light of the findings in the three-group female sample, it becomes clear that a dichotomous analysis provides only a molar view of the phenomenon of reporting rather than a deeper, molecular understanding. Not only does the two-group analysis fail to provide much more than is already accessible through common sense, it also confounds our understanding of the role these two variables, sexist behavior and sexual coercion, play in reporting sexual harassment.

Sexist Behavior
● Non-Reporters < Reporters
Sexual Coercion
● Non-Reporters < Reporters

Figure 3. Female Confirmatory Two-Group Comparison of Significant Variables.

⁸ Multinomial Logistic Regression was utilized instead of (binomial) Logistic Regression for comparative purposes. Use of this method should not alter the results or interpretation of the analyses.
⁹ Despite significant results, Gender Harassment, a combination of sexual and sexist behaviors, is omitted from the training sample results and confirmatory sample model due to its redundancy and the authors' interest in understanding its components.

As in the three-group comparisons of Partial and Complete Reporters in the male sample (and, incidentally, in the female sample as well), a review of the two-group analysis revealed a similar set of variables significantly differentiating the comparison groups (see Figure 4). Examination of the two-group confirmatory male sample reveals that, when compared to Non-Reporters, Reporters are more apt to indicate experiencing sexist behaviors and to endorse being harassed by one or more supervisors. It is apparent from the differences between the male and female models that male reporting of sexual harassment does not follow the logic typically utilized in discussions of female reporting of sexual harassment.

Sexist Behavior
● Non-Reporters < Reporters
Supervisor Harassment
● Non-Reporters < Reporters

Figure 4. Male Confirmatory Two-Group Comparison of Significant Variables.

Following the above two- and three-group comparisons, additional analyses were conducted in an effort to better understand targets' reasons for not reporting the harassment experience(s). Frequency analyses (see Table 1) for both women and men suggest that the most frequent reasons for not reporting the harassment are that the behavior was not important enough to report and that the individual took care of the problem by herself or himself.

Table 1
Percentage of respondents endorsing each reason for not reporting harassment.

Reason for not reporting                                                          Women   Men
Was not important enough to report                                                 59.9  74.3
You did not know how to report                                                     11.1   8.7
You felt uncomfortable making a report                                             32.1  20.2
You took care of the problem yourself                                              57.4  60.0
You talked to someone informally in your chain-of-command                          19.5  12.4
You did not think anything would be done if you reported                           27.4  20.6
You thought you would not be believed if you reported                              14.5   8.7
You thought your coworkers would be angry if you reported                          19.5  14.9
You wanted to fit in                                                               15.3  12.0
You thought reporting would take too much time and effort                          19.1  16.0
You thought you would be labeled a troublemaker if you reported                    27.5  16.6
A peer talked you out of making a formal complaint                                  3.1   1.5
A supervisor talked you out of making a formal complaint                            3.0   1.5
You did not want to hurt the person's or persons' feelings, family, or career      22.6  16.8
You were afraid of retaliation or reprisals from the person(s) who did it          18.4  10.0
You were afraid of retaliation or reprisals from friends/associates of the
  person(s) who did it                                                             13.2   7.9
You were afraid of retaliation or reprisals from your supervisors or
  chain-of-command                                                                 12.6   8.1
Some other reason                                                                  18.9  16.0

Further analyses were conducted to explore the possible factor structure of the above-mentioned reporting items. Exploratory factor analysis using principal axis factoring with varimax rotation suggests that these items do not form a cohesive scale structure (Ormerod, Lawson, Sims, Lytell, Wadlington, Yaeger, Wright, Reed, Lee, Drasgow, Fitzgerald, & Cohorn, 2002). However, cluster analysis suggests the presence of two interpretable clusters and a third clustering (Importance) of the two most frequently endorsed items ("was not important enough to report" and "you took care of the problem yourself"). The first cluster (Other) is difficult to interpret in that it is a compilation of items regarding lack of knowledge, being talked out of reporting, and other seemingly dissimilar items. The second cluster (Negative Outcome) is easier to interpret and contains items suggesting the target feared there would be negative repercussions should they report the harassment. Scales based on these clusters were created, and subsequent alpha coefficients for women, men, and the total sample ranged from .45 to .89. Alpha coefficients for all items in the Reasons for Not Reporting measure range from .80 to .81 (see Table 2).

Table 2
Alpha coefficients for items in the Reasons for Not Reporting measure.

                               Women   Men   Total Sample
All Items (74A-S)                .80   .81            .80
Cluster 1 (Other)                .45   .51            .47
Cluster 2 (Negative Outcome)     .89   .89            .89
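For reference, the internal-consistency figures in Table 2 are Cronbach's alpha, which can be computed directly from an item-score matrix; a minimal sketch:

import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_persons, n_items) matrix of item scores:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)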

Frequency analyses on SE and SEQ scores for men and women were then conducted for those individuals who endorsed only fear-based reasons (Negative Outcome) for not reporting. Approximately 1% of men (N=13) and 2% of women (N=84) indicated only fear-based reasons for not reporting. Items endorsed in both the One Situation with the Greatest Effect (SE) and the Sexual Experiences Questionnaire (SEQ) were summed for these individuals. The average SEQ scores for men and women in this group were relatively high (men = 5.23, women = 10.93) compared to SEQ scores from all respondents regardless of reason(s) for not reporting (men = 4.79, women = 8.66). The average SE scores for men and women were likewise relatively high (men = 3.10, women = 3.79) compared to SE scores from all respondents regardless of reason(s) for not reporting (men = 2.14, women = 3.59). It appears, then, that individuals who indicate solely fear-based reasons for not reporting endorse experiencing more harassing behaviors.
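The subgroup means above are straightforward to reproduce once cluster indicators exist; the sketch below assumes hypothetical 0/1 indicator columns (neg_outcome, other, importance), a gender column, and a summed seq_total score in a pandas DataFrame.

import pandas as pd

def fear_only_means(df: pd.DataFrame) -> pd.Series:
    """Mean summed SEQ score, by gender, for respondents whose only endorsed
    reasons fall in the Negative Outcome (fear) cluster."""
    fear_only = df[(df["neg_outcome"] == 1)
                   & (df["other"] == 0)
                   & (df["importance"] == 0)]
    return fear_only.groupby("gender")["seq_total"].mean()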

Approximately 1% of men (N=8) and 1% of women (N=29), incidentally all individuals in the Non-Reporter group, indicated that the only reasons they did not report were that they either took care of the problem themselves or it was not important enough to report (Other). Items endorsed in both the One Situation with the Greatest Effect (SE) and the Sexual Experiences Questionnaire (SEQ) were summed for these individuals. The average SEQ scores for men and women were relatively low (men = 3.88, women = 4.30) compared to SEQ scores from all male and female respondents regardless of reason(s) for not reporting (men = 4.79, women = 8.66). The average SE scores were somewhat high for men and relatively low for women (men = 2.50, women = 2.33) compared to SE scores from all respondents regardless of reason(s) for not reporting (men = 2.14, women = 3.59). This suggests that, at least for the women, the harassment was less severe and perhaps not severe enough to warrant reporting.

Further analyses show that men and women typically indicate more than one reason for not reporting the harassment experience (see Table 3). The most frequent response pattern for women and men was a combination of responses from the Other, Negative Outcome, and Importance groups. Again, hypotheses as to the cause of this result are difficult in that the survey asked participants to indicate the reasons they did not report but did not ask them to identify the behavior(s) associated with those reasons.

Table 3
Number of individuals who endorsed combinations of Other, Negative Outcome, and Importance.

                                         Women   Men   Total
Other & Negative Outcome & Importance     1262   381    1643
Negative Outcome & Importance              225   109     334
Negative Outcome & Other                   366    76     442
Importance & Other                         569   250     819

DISCUSSION

Analyses of reporting status generally support the classification of individuals into groups of Non-Reporters, Partial Reporters, and Complete Reporters, as significant group differences were found in the self-reported experiences of these individuals. Although group differentiation was not always found in stimulus, organizational, and individual variables, all three variable types were found to be significant in either the developmental and/or confirmatory analyses. It appears that stimulus factors played a consistent role in group differentiation for both men and women across developmental and confirmatory analyses. In particular, the presence of sexist behavior(s) contributed significantly to group differentiation for both men and women in all analyses. This is likely due to the higher frequency, as compared to other forms of harassment, with which these behaviors occur. Further, analyses suggest that individuals' reasons for not reporting the harassment typically include a combination of both fear-based and non-fear-based reasons.

Implications

If we are ever to fully understand the determinants of reporting sexual harassment, we must first acknowledge that research on reporting sexual harassment may require the inclusion of exploratory analyses. Reliance on hypotheses about the determinants of reporting that neglect influential variables identified by previous research will likely result in incomplete answers. Researchers cannot know, prior to conducting their work, which variables will play a statistically significant role in reporting harassment, nor should they assume the importance of the same variables from one year to the next, as the harassment experience is apt to differ from one time period to the next. This is evident in the differing findings of this research and those of Bergman et al. (2002) and Malamut and Offermann (2001). All three analyses are based on surveys of the same organization, although the current project utilizes a more recent dataset, and yet they all reveal differing results, in essence validating the need to be more inclusive of variables in our hypotheses (or pointing to possible methodological problems) and providing insight into the impact of the passage of time, perhaps due to policy and procedural changes, on results. An extension of this argument is that this type of research should be conducted not only on a year-to-year basis but also on an organizational basis, as women in different organizations are likely to have differing experiences of sexual harassment in very different contexts. The workplace is likely a dynamic entity wherein changes occur within and across time.

Regardless of whether or not researchers accept the benefits of exploratory research on this topic, it should be understood that creating dichotomous reporting categories that fail to take into account the existence of Partial Reporters, or that merge them into either the Non-Reporters or the Complete Reporters, blurs our understanding of the phenomenon. This research shows that analyses of two- and three-group samples result in a similar set of discriminating variables that, when further explored in three-group analyses, begin to broaden our understanding of their importance.

Further, examination of target response behavior by a trichotomous grouping has implications for sexual harassment litigation. This research suggests that targets are almost as likely to report only some of the harassment as they are to report all of it. Litigation outcomes often rely heavily on the behavior of the target, and a failure to report either all or some of the unwanted experience(s) may lead to summary judgment or an unfavorable verdict. Expert testimony regarding the nature of reporting behavior that includes information on the normality of partial reporting will likely enhance the credibility of a complainant's actual reporting behavior, whether she reported none or some of the experience.

Limitations

As previously discussed, a limitation of the current study lies in the lack of information regarding the specific behaviors reported or not reported by participants; such information could supplement our understanding of reporting behavior. Analyses of such data could be used to ascertain the relationship between reporting/non-reporting and the harassing behavior(s). It is also unfortunate that this research was unable to incorporate a variable specific to the labeling of a behavior, as previous research suggests labeling plays an important role in determining whether or not a behavior is reported. Although a labeling item was included in the survey, it was omitted from analyses because it was not specifically associated with the One Situation with the Greatest Effect but rather with all of the harassment experienced. Given its nature, it is likely that targets would be most apt to label events associated with the One Situation with the Greatest Effect as sexual harassment. However, this assumption could not be tested in the current survey format.

An examination of this type of data may provide researchers and organizations with valuable information that could assist them in effectively encouraging reporting. Enactment of changes based on targets' responses may serve to protect individuals from physical and psychological harm and also protect an organization's investment in its employees by possibly increasing job satisfaction, decreasing attrition, strictly enforcing anti-retaliation policy, and positively impacting other negative outcomes of reporting harassment. Further, these data can be used to support litigants' concerns regarding reporting sexual harassment and perhaps encourage the courts to require organizations to prove that they not only provide but also strictly enforce policy and procedure relevant to harassment while protecting complainants from negative repercussions due to reporting.

Additionally, limitations may arise from the self-report nature of the data. Analyses of corroborative data, such as assessment of formal reports, and comparisons to self-report items could lend support for the trichotomization of reporting status. Further, the unnatural structure of item response scaling or unintended content ambiguity may distort participant responses in ways likely to go undetected in the absence of interview assessments. Added limitations may arise from the dichotomous scaling utilized in several items, which diminishes response variability and could therefore affect outcomes based on multinomial logistic regression. The large sample size mitigates these limitations but could also generate more significant outcomes than would otherwise be found. As such, MANOVAs and other statistical analyses that are sensitive to sample size were avoided. The absence of a multitude of significant outcomes supports the appropriateness of the statistical analyses used.

Future Directions

Future directions for this research include comparing the current study with data from the 1995 Department of Defense Work and Gender Relations survey. Of particular interest in this comparison is the ability to test the hypothesized role of labeling in reporting sexual harassment, as the 1995 survey included a labeling item specific to the One Situation with the Greatest Effect. Significant group differentiation by the labeling variable in the 1995 survey would have implications for the results of the current study. Additionally, this comparison could shed light on the dynamic nature of reporting sexual harassment within a military context, as differences between time periods can be assessed.

Further analyses should focus on the role of organizational variables as determinants of reporting sexual harassment. The findings from the current study regarding the importance of organizational variables contradict results from work on the 1995 Department of Defense Work and Gender Relations survey. It is possible that these contradictions exist due to the passage of time (e.g., sexual harassment policy/procedural changes), different statistical analyses, or differences in construct operationalization. Comparison of the 1995 and 2002 data sets could help clarify these findings, as the research could employ uniformity in construct operationalization and method of data analysis. Additionally, as the importance of organizational climate on reporting in a military context is unclear, participants' self-report data regarding organizational reasons for non-reporting can be compared with responses on the Organizational Tolerance for Sexual Harassment scale (OTSH; Hulin, 1993), thereby comparing the usefulness of scenario versus self-report data collection on the perception of climate in the military.


References

Bergman, M. E., Langhout, R. D., Palmieri, P. A., Cortina, L. M., & Fitzgerald, L. F. (2002). The (un)reasonableness of reporting: Antecedents and consequences of reporting sexual harassment.

Bernaards, C. A., & Sijtsma, K. (2000). Influence of imputation and EM methods on factor analysis when item nonresponse in questionnaire data is nonignorable. Multivariate Behavioral Research, 35, 321-364.

Brooks, L., & Perot, A. R. (1991). Reporting sexual harassment: Exploring a predictive model. Psychology of Women Quarterly, 15, 31-47.

Fitzgerald, L. F., Drasgow, F., Hulin, C. L., Gelfand, M. J., & Magley, V. J. (1997). Antecedents and consequences of sexual harassment in organizations: A test of an integrated model. Journal of Applied Psychology, 82(4), 578-589.

Fitzgerald, L. F., & Swan, S. (1995). Why didn't she just report him? The psychological and legal implications of women's responses to sexual harassment. Journal of Social Issues, 51(1), 117-138.

Hulin, C. L. (1993). A framework for the study of sexual harassment in organizations: Climate, stressors, and patterned responses. Paper presented at a symposium on sexual harassment at the Society for Industrial and Organizational Psychology, San Francisco, CA.

Hulin, C. L., Fitzgerald, L. F., & Drasgow, F. (1996). Organizational influences on sexual harassment. In M. Stockdale (Ed.), Sexual harassment in the workplace, Vol. 5 (pp. 127-150). Thousand Oaks, CA: Sage.

Knapp, D. E., Faley, R. H., Ekeberg, S. E., & DuBois, C. L. Z. (1997). Determinants of target responses to sexual harassment: A conceptual framework. Academy of Management Review, 22(3), 687-729.

Malamut, A. B., & Offermann, L. R. (2001). Coping with sexual harassment: Personal, environmental, and cognitive determinants. Journal of Applied Psychology, 86(6), 1152-1166.

Marin, A. J., & Guadagno, R. E. (1999). Perceptions of sexual harassment victims as a function of labeling and reporting. Sex Roles, 41(11/12), 921-940.

McKinney, K., Olson, C. V., & Satterfield, A. (1988). Graduate students' experiences with and responses to sexual harassment. Journal of Interpersonal Violence, 3(3), 319-325.

Offermann, L. R., & Malamut, A. B. (2002). When leaders harass: The impact of target perceptions of organizational leadership and climate on harassment reporting and outcomes. Journal of Applied Psychology, 87(5), 885-893.

Peirce, E. R., Rosen, B., & Hiller, T. B. (1997). Breaking the silence: Creating user-friendly sexual harassment policies. Employee Responsibilities and Rights Journal, 10(3), 225-242.

Perry, E. L., Kulik, C. T., & Schmidtke, J. M. (1997). Blowing the whistle: Determinants of responses to sexual harassment. Basic and Applied Social Psychology, 19(4), 457-482.

Rudman, L. A., Borgida, E., & Robertson, B. A. (1995). Suffering in silence: Procedural justice versus gender socialization issues in university sexual harassment grievance procedures. Basic and Applied Social Psychology, 17(4), 519-541.

Survey Method for Counting Incidents of Sexual Harassment (April 28, 2002). Washington, DC: Office of the Under Secretary of Defense for Personnel and Readiness.

U.S. Merit Systems Protection Board. (1994). Sexual harassment in the federal workplace: Trends, progress and continuing challenges. Washington, DC: U.S. Government Printing Office.


Using Stakeholder Analysis (SA) and the Stakeholder Information System (SIS) in Human Resource Analysis

Kimberly-Anne Ford
Directorate of Strategic Human Resources (DStratHR)
Department of National Defence (DND) Canada
Ford.KA@Forces.gc.ca

45th Annual Conference of the International Military Testing Association (IMTA)
November 05, 2003
Pensacola Beach, Florida



Table of Contents

Table of Contents
List of Figures
1.0 Introduction
1.1 Purpose
1.2 Background
1.2.1 The Need for Effective Consultation Frameworks
1.2.2 The Value of Participatory Methods
1.2.3 The Utility of SA
2.0 How could SA be Used in Human Resource Analysis?
2.1 Stakeholder Brainstorming
2.2 Stakeholder Influence and Importance
2.3 Stakeholder Salience
3.0 Overview of The SIS
3.1 What is the 'Stakeholder Information System'?
3.2 Using the SIS
3.2.1 The SIS Online
4.0 Discussion: Potential Applications of the SIS in Human Resource Analysis
4.1 Strategic Human Resources
4.2 Quality of Life Research
5.0 Conclusion
References

253<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


254<br />

List of Figures<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong><br />

Page No.<br />

Figure 1: Basic Stakeholder Matrix 259<br />

Figure 2: Stakeholder Importance and Influence Matrix 261<br />

Figure 3: Representing Stakeholder Saliency 263<br />

Figure 4: The Stakeholder Information System Website 266


1.0 Introduction

1.1 Purpose

This paper provides an overview of Stakeholder Analysis (SA), a social science methodology, and the ‘Stakeholder Information System’ (SIS), a software package presently under development and testing at Carleton University in Ottawa, Canada. SA is used to understand relationships between various stakeholders or stakeholder groups; to determine and obtain adequate levels of participation from each stakeholder group involved in a research project or activity; and to ascertain their potential influence over the process. In its present form, the SIS is an online resource that contains information on a wide array of research methods and techniques [10] used to conduct SA. SA and the SIS have applications for social science researchers and human resource analysts who want to share information or consult with a wide range of stakeholders. The following paragraphs address in depth the usefulness of SA and the SIS for creating effective consultation and communication frameworks for research and knowledge-sharing activities within the Department of National Defence (DND). The paper addresses the utility of SA and the SIS for social science research and human resource activity within DND, and answers the following questions:

• What is Stakeholder Analysis (SA)?
• How does the Stakeholder Information System (SIS) facilitate SA?
• In what ways are the SIS and SA relevant to Human Resource Analysis?

1.2 Background

The impetus for using the SIS within DND is grounded in three inter-related factors: first, the need for effective consultation and knowledge-sharing frameworks in DND and in government research in general; second, the value of Participatory Action Research (PAR) to accomplish this task; and third, the utility of SA for adding depth and rigour to PAR. Each of these is addressed in the following paragraphs.

[10] Access in this form is presently free on the World Wide Web, see: http://www.carleton.ca/~jchevali/STAKEH.html; the system will soon be made available by subscription.


1.2.1 The Need for Effective Consultation Frameworks

The need for effective consultation models and information sharing throughout the Department of National Defence is widely acknowledged. For example, Military HR Strategy 2020 states that:

“We must ensure that a continuous, effective internal communications network is established… We must continually improve the effectiveness of our internal communications strategy to ensure that all members are aware of HR issues… We must maintain an effective consultation process that shares expertise within the department, nationally and internationally” (DND, 2002: 20-21).

The importance of effective communication in DND was also underlined in the 2002 Chief of the Defence Staff’s Annual Report, which states that:

“Effective communication is an essential part of the modern military. Communications is a force multiplier on operations. It is also a vital tool in nurturing our relationship with Canadians and strengthening public awareness and public understanding of the relevance of the CF, as well as our issues and challenges… As the CF grapples with the demands of adapting to our changing geo-strategic environment, it is more important than ever to explain our issues, priorities, and decisions to CF members” (CDS, 2002).

SA can also serve to actualise the government-wide policy objective of increasing transparency, as mentioned in the 2003 Speech From The Throne. SA and the SIS provide researchers with the theoretical knowledge and concrete techniques required to meet the challenge of ensuring transparency, accountability and engagement, and ultimately to create a learning process in which all parties can benefit from the exchange of knowledge.

1.2.2 The Value of Participatory Methods

Increasingly, our research unfolds through the formation of partnerships with various groups and individuals, and the research process is largely one that necessitates a mutual exchange of information, in which all parties learn from the process. Traditional methods of objective research – in which a researcher goes into the field to extract data from subjects – are therefore not always sufficient to meet our objectives. SA is an in-depth, analytically rigorous form of participatory research. The value of ‘Participatory Action Research’ (PAR) for use within the Department of National Defence, especially as it pertains to quality of life measurement among CF personnel and their family members, has been described elsewhere (Ford, 2001).

The Department of National Defence already has a history of conducting research that is ‘participatory’ in nature (Ford, 2001: 20). For example, the Standing Committee on National Defence and Veterans Affairs (SCONDVA) inquiries and the PERSTEMPO project (as defined by Flemming, 2000) can both be seen as examples of participatory research: participants were consulted early in the research process and asked to identify important issues. In fact, DND researchers and human resource analysts often solicit input from various participants, or CF stakeholders, especially in the planning stage of a research process or knowledge-sharing activity. However, there is little follow-through on that participation: participants are often consulted at the onset of a project or initiative, but they are rarely consulted again over the course of the project to discuss project outcomes or to disseminate results. In sum, the importance of stakeholder involvement in planning a research project is widely acknowledged and often practiced in DND, but the importance of sustaining that participation throughout the research process is not. This is at times necessitated by the decision-making process that follows a research endeavour, and it can also be due to time limitations, financial constraints, or a lack of knowledge of existing participatory techniques that can facilitate the exchange of information throughout a research process. Furthermore, it is difficult to find a comprehensive list of participatory techniques and guidelines for their use in the existing literature. SA and the SIS fill these gaps in the participatory literature by providing methodological techniques to determine and elicit the appropriate level of participation required from various stakeholders at different stages of a project.

1.2.3 The Utility of SA

In order to properly evaluate a proposed project, to gain the viewpoints of many parties, and to anticipate and mitigate potential conflicts, it is vital to identify the groups and individuals who will be affected by, and therefore have an interest in, project outcomes. Key questions to ask are:

• Who are all the players involved in or impacted by the project?
• How important are those individuals or groups to project success?
• How much influence can they exert before, during, or after project completion?
• How can they become involved in, or informed of, the project?

SA provides some answers to these questions.


Originally a prospecting term (a stakeholder being someone who holds a gold claim), SA has its contemporary roots in the field of management. It is widely used in social science research today to identify all social groups impacted by a research process, and to assess their relative influence and importance against the criteria identified for project success. SA allows researchers to identify the various individuals, groups and organisations that are “significantly affected by someone else’s decision making activity” (Chevalier, 2001: 1). The United Kingdom’s Department for International Development (DFID) defines SA as “the identification of key stakeholders, an assessment of their interests, and the ways in which these interests affect project viability and riskiness” (DFID, 1995: 1). A defining feature of SA is that it forces researchers to think through the numerous levels of impact a project may have on diverse stakeholder groups: to differentiate between indirectly and directly, and positively and negatively, affected stakeholders; to examine levels of influence and importance; and to uncover stakeholder saliency. The following pages describe the basics of SA and provide the reader with some rudimentary analytical techniques.

2.0 How could SA be Used in Human Resource Analysis?

The SIS has numerous potential applications for social science research and human resource analysis within DND. Examples are presented in the discussion section of this research note. To allow the reader to see how SA could be used in DND research, the methodological techniques presented in the following pages are discussed in reference to the following fictitious research scenario [11]: You are at the planning stage of a research project aimed at modernizing or transforming the services offered in Military Family Resource Centres (MFRCs), in light of changing definitions of ‘the military family’. You would soon realise that a number of ‘key players’ – important civilian and military stakeholder groups across Canada – are already involved in activities that relate to your project. You would also foresee numerous ways in which your project could impact various individuals or groups. SA would allow you to plan a research project that captures the needs of MFRC clients, service providers, and other stakeholders, and to anticipate and mitigate potential conflicts that might arise over the course of your project. Moreover, SA could be used to ensure co-operation and knowledge sharing between many of the ‘key players’ involved in or impacted by MFRC service delivery. SA and the SIS provide researchers with the theoretical understanding and practical techniques needed to get to know key stakeholders and to develop the communication and consultation frameworks required to share knowledge effectively among all of them.

[11] This fictitious example is used for the sole purpose of demonstrating SA techniques; there are presently no such plans for MFRC modernisation.



2.1 Stakeholder Brainstorming

The first step in conducting a basic SA is to brainstorm in order to create an exhaustive list of potential stakeholders. A list of stakeholders generated for the fictitious ‘MFRC Modernisation’ outlined in the introduction might include the following:

• researchers working in the Directorate of Quality of Life
• military service providers working in MFRCs across Canada (such as social workers and support staff)
• military members
• the ‘loved ones’ or family members of military personnel
• commanding officers
• civilian service providers
• members of the civilian population who are responsible for some aspect of a military child’s care: for example, school teachers and daycare workers
• extended family members
• human resources policy analysts

Once this list is complete, stakeholders can be categorised in a basic stakeholder matrix – a rudimentary tool used to differentiate between potential stakeholders who are positively or negatively and directly or indirectly affected. Imagine that one proposed action in the MFRC modernization project is the closure of an MFRC. Figure 1 below presents a sample basic stakeholder matrix that has been filled out accordingly.


Figure 1: Basic Stakeholder Matrix (Adapted from Chevalier, 2001)

Proposed Action: The Closure of a Military Family Resource Centre (fictitious)

Directly-Affected Stakeholders
    Positively-Affected: Local daycare owner (will gain business)
    Negatively-Affected: MFRC staff – social workers, administrators, etc. (who will be out of work)

Indirectly-Affected Stakeholders
    Positively-Affected: DND Budgeting Officers (will have more money to spend on other items)
    Negatively-Affected: Commanding Officers (will have more requests for family leave); Local School Teachers (will see more behavioural problems in children of deployed CF members); Extended Family Members (will be called upon to take care of children)
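The grouping in Figure 1 translates directly into code. The following Python sketch is purely illustrative (the paper proposes no software; the class, fields, and stakeholder entries are invented stand-ins for a real brainstormed list): it tags each stakeholder and collects the list into the four cells of a basic stakeholder matrix.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Stakeholder:
    """One brainstormed stakeholder, tagged for the basic matrix."""
    name: str
    directly_affected: bool    # True = directly affected, False = indirectly
    positively_affected: bool  # True = positively affected, False = negatively
    note: str = ""

# Illustrative entries for the fictitious MFRC-closure example.
stakeholders = [
    Stakeholder("Local daycare owner", True, True, "will gain business"),
    Stakeholder("MFRC staff", True, False, "will be out of work"),
    Stakeholder("DND budgeting officers", False, True, "freed-up funds"),
    Stakeholder("Commanding officers", False, False, "more family-leave requests"),
]

# Group stakeholders into the four cells of the matrix.
matrix = defaultdict(list)
for s in stakeholders:
    row = "Directly-affected" if s.directly_affected else "Indirectly-affected"
    col = "Positively-affected" if s.positively_affected else "Negatively-affected"
    matrix[(row, col)].append(s)

for (row, col), members in sorted(matrix.items()):
    print(f"{row} / {col}:")
    for s in members:
        print(f"  - {s.name} ({s.note})")
```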


2.2 Stakeholder Influence and Importance

The next step in a basic SA is to determine whether each stakeholder, or stakeholder group, ranks high or low on scales of importance and influence, according to the following definitions. Important stakeholders are critical to project success; in other words, the project is meant to serve their interests in some way. For example, clients – military families or patients, say – are important stakeholders in a health care reform project. Influential stakeholders, on the other hand, have the means to exert their will and influence project outcomes. MFRC administrators, budget allocation officers, and commanding officers are all examples of influential stakeholders in an MFRC modernisation. The influence and importance of the diverse stakeholder groups in the MFRC modernization project can be mapped out as shown in Figure 2.

Figure 2: Stakeholder Importance and Influence Matrix

Proposed Action: Modernising the MFRC (fictitious)

[Figure: a two-axis matrix with Importance (low to high) on the vertical axis and Influence (low to high) on the horizontal axis. CF Members and their Loved Ones and Civilian Social Workers appear toward the high-importance, low-influence corner; MFRC Service Providers, MFRC Administrators, and DND Budget Allocators appear toward the high-influence side.]


2.3 Stakeholder Salience

SA also allows researchers to map out power differentials and to determine saliency among stakeholder groups – i.e., the relevance of stakeholder goals and objectives to those of the research project. According to Chevalier (2001), contemporary understandings of stakeholder saliency encompass three elements: power, interest (or urgency), and legitimacy. While ‘legitimacy’ in its different forms is an important variable, two other factors must not be ignored when determining the relevance of stakeholder claims to project objectives. Power, defined as “the ability to influence the actions of other stakeholders and to bring out the desired outcomes”, can be actualized “through the use of coercive-physical, material-financial and normative-symbolic resources at one's disposal”. The other factor is interest, which relates in part to the ability of stakeholders to impress the critical and pressing character of their claims or interests. Chevalier remarks that these three attributes are transient and have a cumulative effect on salience:

“[They] are highly variable; they are socially constructed; and they can be possessed with or without consciousness and willful exercise. They can also intersect or be combined in multiple ways, such that stakeholder salience will be positively related to the cumulative number of attributes effectively possessed” (Chevalier, 2001).

To further assess the various stakeholders’ places in the research process, they can be categorised into the following groups (a minimal sketch of this classification follows the list):

• dormant stakeholders have only power (in the MFRC Modernisation, one example could be DND Budget Allocators);
• discretionary stakeholders have only legitimacy (one example could be the Minister of National Defence);
• demanding stakeholders have only strong interest (one example could be CF loved ones who do not fit traditional definitions of the ‘military family’);
• dependent stakeholders have legitimacy and interest (one example could be CF ‘military families’);
• dominant stakeholders have power and legitimacy (one example could be Commanding Officers);
• dangerous stakeholders have interest and power (one example could be civilian social workers); and
• definitive stakeholders have legitimacy, power and urgency (one example could be DQOL researchers).
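Since each category corresponds to a distinct combination of the three salience attributes, the typology reduces to a simple lookup. A minimal Python sketch follows (illustrative only; the boolean encoding and function name are ours, not Chevalier’s):

```python
def classify_salience(power: bool, legitimacy: bool, interest: bool) -> str:
    """Map the three salience attributes to a stakeholder category.

    Salience rises with the cumulative number of attributes possessed.
    """
    categories = {
        (True,  False, False): "dormant",        # power only
        (False, True,  False): "discretionary",  # legitimacy only
        (False, False, True):  "demanding",      # interest only
        (False, True,  True):  "dependent",      # legitimacy + interest
        (True,  True,  False): "dominant",       # power + legitimacy
        (True,  False, True):  "dangerous",      # power + interest
        (True,  True,  True):  "definitive",     # all three
    }
    # A party holding none of the attributes is not a salient stakeholder.
    return categories.get((power, legitimacy, interest), "non-stakeholder")

# Illustrative use with examples from the fictitious MFRC modernisation:
print(classify_salience(power=True, legitimacy=False, interest=False))  # dormant
print(classify_salience(power=True, legitimacy=True, interest=True))   # definitive
```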



Figure 3 graphically represents stakeholder saliency – the relevance of stakeholders’ claims to overall project success – in the MFRC Modernisation project.

Figure 3: Representing Stakeholder Saliency (Adapted from Chevalier, 2001)

Proposed Action: Modernising the MFRC (fictitious)

[Figure: a diagram locating stakeholder groups, including DQOL, according to their combination of power, legitimacy and interest.]

In sum, SA is done to identify all groups or individuals affected by, or involved in, a research process or knowledge-sharing activity. It encourages project managers to draw out and identify the interests of stakeholder groups in relation to the project objectives, to assess stakeholder saliency, to identify potential conflicts of interest between stakeholder groups, to build upon existing relationships and form new networks between stakeholders, and to assess the appropriate type of participation by different stakeholders at successive stages of the research cycle (DFID, 1995: 3). With SA at its core, the SIS allows researchers to select the appropriate participatory techniques to use, depending upon the project objective.

3.0 Overview of The SIS

3.1 What is the ‘Stakeholder Information System’?

The SIS is an initiative funded by the International Development Research Centre (IDRC) of Canada. It is important to note that the system is presently under development. First and foremost, the SIS is a research process management system, allowing researchers to identify and to complete a variety of research objectives while informing or enlisting all relevant stakeholders in the tasks at hand. The SIS contains a comprehensive listing of over seventy-five participatory research techniques – surveys, focus groups, steering committees, value-structured referendums, open house sessions, visioning sessions, and historical timelines, among others – each defined in depth, accompanied by suggestions for its use and links to Internet resources and readings, and organised according to the research objective to which it pertains. The creators of the SIS define the system as follows:

“[The SIS] … offers flexible techniques to analyze the social aspects of conflict, problem, project or policy management activities. The methodology proposes ways of doing social analysis with the active involvement of the parties concerned, i.e., actors, groups, constituencies or institutions that can affect or be affected (adversely or positively) by a given set of problems or interventions” (Chevalier and de la Gorgendière, 2003). [12]

[12] The SIS will soon be made publicly available through a user-friendly CD-ROM and an interactive web site, both of which are presently under construction and being field-tested with development workers in Africa and South America. The SIS can now be acquired through workshops, and a prototype of the SIS is available online for free. This web-based version is non-interactive, but provides researchers with comprehensive definitions of participatory techniques in downloadable PDF files, and links to a wide array of Internet resources. Interested parties can visit the web site to become familiar with a wide array of participatory techniques: http://www.carleton.ca/~jchevali/STAKEH.html



The SIS is a ‘research process management’ system; as such, it is used to guide researchers from the ‘problem definition’ stage of a project through to the dissemination of results. The system allows project leaders to outline objectives, define their methodological approaches, identify the stakeholders, and plan out their use of resources. Once this initial planning stage is completed, researchers can proceed through the various stages of the project, identifying objectives and selecting from the appropriate research techniques offered. The system itself can serve as a knowledge-sharing tool, and can provide a record of the various research or knowledge-sharing activities accomplished throughout the research process. The creators of the SIS state that “the SIS techniques are divided into interlocking modules (Pointers, Problems, Players, Profiles, Positions, Paths) designed to be ‘scaled up’ or ‘scaled down’ according to project needs” (Chevalier and de la Gorgendière, 2003). In this case, ‘scaling up’ or ‘scaling down’ the techniques refers to adding simplicity or complexity, depending upon the stakeholders involved and the level of analysis required in any given research activity.

3.2 Using the SIS

The SIS is designed to lead researchers through a process of identifying project objectives, outlining the resources available to complete various stages of the research process, and then selecting from a variety of participatory techniques. Researchers and stakeholders can keep track of progress made in various areas of the research project by referring back to the system. Figure 4 below gives the reader a sense of the organisation of the system in its current state. It shows the display of nodes, modules and technique files that appear on the SIS website.


Figure 4: The SIS Website (http://www.carleton.ca/~jchevali/STAKEH.html)



3.2.1 The SIS Online

As already mentioned, the online version of the SIS is only a prototype and is not yet interactive. However, researchers involved in participatory research or knowledge-sharing activities can now visit the SIS website to gain information on a wide range of participatory techniques. For example, a number of alternatives to the traditional meeting are described in depth, including: caucusing; citizens’ jury; planning cell; consensus conference; deliberative polling; open space meeting; workshop; roundtable; public hearing; open house; visioning session; working group; and multi-stakeholder forum. For each of these techniques or meeting strategies, as for every technique included in the SIS, users are provided with a description of the technique; its strengths and weaknesses; some recommendations for its use; and a list of readings and Internet links. The SIS also defines the various levels of participation that might be required from different stakeholders involved in a research or knowledge-sharing activity. These levels of participation are defined as:

• persuasion: using techniques to change stakeholder attitudes without raising expectations of involvement;
• education: distributing information to promote awareness of project activities and related issues and goals;
• survey and information extraction: seeking information to assist in the pursuit of project activities and goals;
• consultation: promoting information flow between an organization and stakeholders invited to express their own views and positions;
• participation for material incentives: primary stakeholders provide resources such as labour in exchange for material incentives;
• functional participation: active engagement of primary stakeholders to meet predetermined objectives without concrete incentives and without involvement in the decision-making process;
• joint planning or shared decision-making: active primary stakeholder representation in the decision-making process at all stages of the project cycle, with voting and decision-making authority;
• delegated authority: transferring responsibilities normally associated with the organization to primary stakeholders; and
• self-mobilisation: primary stakeholders taking and pursuing their own project initiative with limited external intervention.


In sum, the SIS provides researchers and human resource analysts with detailed descriptions of seventy-five techniques that can be used to realise a participatory research project.

4.0 Discussion: Potential Applications of the SIS in Human Resource Analysis

SA and the SIS have a number of potential applications in the domain of human resource analysis. Potential applications are discussed in the following paragraphs.

4.1 Strategic Human Resources

The SIS responds directly to two of the core strategic objectives of HR 2020. First, as noted above, the system provides users with approximately seventy-five different techniques that can be used to achieve one of the key objectives: to “ensure that a continuous, effective internal communications network is established… [and] continually improve the effectiveness of our internal communication strategy to ensure that all members are aware of HR issues” (DND, 2002: 20). Second, the SIS applies directly to the strategic objective of “maintaining an effective consultation process that shares expertise within the department, nationally and internationally” (DND, 2002: 21).

The SIS can be put to use in human resource analysis as a means to share information between a variety of stakeholders and knowledge partners. The system directly addresses one of the key difficulties faced by those engaged in this type of work: not everyone can be present at every meeting, all of the time. Hence the transfer and sharing of information is of critical importance. The SIS can be maintained and updated so that a number of parties can remain informed of progress in any activity or area. Moreover, the SIS provides numerous alternatives to traditional methods of information sharing; for example, for more efficient and interesting use of meeting time, it offers at least five alternatives to the traditional meeting style. Strategic human resource analysis is largely an exercise in knowledge-sharing, since much of the material to be ‘analysed’ comes from a variety of disparate sources: meeting notes, briefings, presentations, conferences, media reports, etc. A knowledge-sharing system can therefore be a very useful tool for strategic human resource analysis.

4.2 Quality of Life Research

The Directorate of Quality of Life (DQOL) has a history of involving various stakeholder groups in its research initiatives. Previous DQOL projects have enlisted the participation of former CF members, ‘loved ones’ of CF members, CF service providers, and CF members in a variety of operational theatres. SA can serve to improve researchers’ knowledge of the diversity of stakeholders that are involved in, or impacted by, QOL activities. The SIS can further assist researchers in finding appropriate sampling techniques to obtain representation of the diversity of stakeholder groups. Furthermore, the system can be used to decide appropriate levels of participation for each stakeholder group, from consultation to the creation of full partnerships. Overall, use of the SIS would provide a more holistic conception of the numerous issues that DQOL must deal with in its research program. For example, SA and the SIS could be used to develop accommodation strategies that fit the diversity of CF family needs, to improve civilian-military relations in theatres and in domestic settings, to address CF family issues, and to address many other quality of life concerns. The SIS also offers project managers the techniques needed to address problem areas in research, such as minimising the repetition of tasks and reducing ‘respondent burnout’.

Strategic human resource analysis and quality of life research are just two among the many knowledge-sharing and research activities in which SA and the SIS can be put to use; the possible applications of the SIS in DND are wide-ranging.

5.0 Conclusion

The Department of National Defence, Canada, has a history of consulting with its members and with key stakeholders in order to create policies and programs that address their needs. SA and the SIS provide the methodological techniques required to create effective consultation and communication strategies, and thus to enhance knowledge sharing throughout the department. We frequently hear of ‘respondent fatigue’ and the need to be concerned with research ethics. We also speak of the need for horizontal integration in our organisations and try to conceive of strategies for doing away with a top-down or ‘stove-pipe’ way of doing business. SA and the SIS allow us to achieve these ends. By using SA and the SIS and approaching all research activities as the formation of partnerships, we learn how best to share knowledge, thus minimising the repetition of tasks, reducing respondent fatigue and building social and cultural capital throughout our organisations.


References

Chevalier, J. (2001). Stakeholder Analysis and Natural Resource Management. Carleton University, Ottawa. www.carleton.ca/~jchevali/stakeh2.html, last accessed on February 05, 2003.

Chevalier, J., and de la Gorgendière, L. (2003). The Stakeholder/Social Information System. Carleton University, Ottawa. www.carleton.ca/~jchevali/stakeh.html, last accessed on February 05, 2003.

Chief of the Defence Staff (CDS). (2002). Annual Report 2001-2002. http://cds.mil.ca/pubs/anrpt2002/intro_e.asp, last accessed on March 11, 2003.

Christians, C. (2000). “Ethics and Politics in Qualitative Research.” In Norman Denzin and Yvonna Lincoln, eds., The Handbook of Qualitative Research. Sage, Thousand Oaks.

Department for International Development (DFID). (1995). Guidance Note on How to do Stakeholder Analysis of Aid Programs. DFID, London. www.dfid.gov.uk/, last accessed on March 01, 2002.

Department of National Defence (DND). (2002). Military HR Strategy 2020: Facing the People Challenges of the Future. Ottawa, Canada.

Fine, M., Weis, L., Weseen, S., and Wong, L. (2000). “For Whom? Qualitative Research, Representations and Social Responsibilities.” In Norman Denzin and Yvonna Lincoln, eds., The Handbook of Qualitative Research. Sage, Thousand Oaks.

Flemming, S. (2000). CF PERSTEMPO and Human Dimensions of Deployments Project: Research Plan Measurement Concepts and Indicators. Department of National Defence, PMO QOL/DSHRC, Ottawa, Canada.

Ford, K. (2001). Using Participatory Action Research (PAR) in Quality of Life Measurement Among CF Personnel and Their Loved Ones. DSHRC Research Note RN 08/01. Department of National Defence, Ottawa, Canada. http://www.dnd.ca/qol/pdf/par_e.pdf, last accessed on February 05, 2003.



DESIGNING A NEW HR SYSTEM FOR NIMA

Brian J. O’Connell, Ph.D.
Principal Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
boconnell@air.org

Jeffrey M. Beaubien, Ph.D.
Senior Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
jbeaubien@air.org

Michael J. Keeney, Ph.D.
Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
mkeeney@air.org

Thomas A. Stetz, Ph.D.
Industrial and Organizational Psychologist
Human Resources Department
U.S. National Imagery and Mapping Agency
4600 Sangamore Road - MS: D-18
Bethesda, MD 20816
stetzt@nima.mil

INTRODUCTION

The U.S. National Imagery and Mapping Agency (NIMA) was formed in 1996 by consolidating employees from several Federal agencies. These include the Defense Mapping Agency (DMA), the Central Imagery Office (CIO), the Defense Dissemination Program Office (DDPO), and the National Photographic Interpretation Center (NPIC), as well as the imagery exploitation and dissemination elements of the Defense Intelligence Agency (DIA), the National Reconnaissance Office (NRO), the Defense Airborne Reconnaissance Office (DARO), and the Central Intelligence Agency (CIA). NIMA’s mission is to provide geospatial intelligence and related services to policy makers, military commanders, and civilian agencies in support of all national security initiatives (for additional information, see www.nima.mil).

From the outset, NIMA’s management faced a complex, high-pressure situation that involved a critical – but predictable – set of issues. These included the need to develop a shared vision, the need to integrate work processes, and the need to meld multiple cultures. An additional challenge was the significant societal pressure to continuously improve efficiency and quality (GAO/Comptroller General of the United States, 1996). Because each legacy organization had its own unique human resources (HR) system, the newly-formed NIMA workforce was organized into approximately 600 unique position titles. To make effective personnel decisions, NIMA would first have to describe its work requirements. Unfortunately, traditional job descriptions – which list the primary duties and tasks to be performed within the position – are costly to develop, often lack the required precision, and have been criticized as static snapshots of dynamic jobs (Cascio, 1995). Therefore, NIMA’s management decided to forgo traditional job descriptions in favor of dynamic work roles.

Work roles are distinct from job descriptions in that they define not only the job tasks but also the competencies that are required to perform those tasks (Mulqueen, Stetz, Beaubien, & O’Connell, 2003). Each work role includes a title, a brief description of the work, a list of core job-related competencies, a list of relevant license and education requirements, and a description of the typical physical and environmental demands. In essence, work roles define different kinds of work – such as secretary or security guard – that require unique competency sets. By extension, work roles also define sets of employees who are essentially interchangeable: any employee in a given work role should be able to perform the duties of any other employee in that same work role with at least minimal proficiency within 90 days (Mulqueen et al., 2003).

Recognizing the need to strategically manage their human capital and to promote a single organizational identity, NIMA management decided to create a new, integrated HR management system based on the work roles concept. The new HR system was designed to be strategically oriented, person-based, and broad-banded to leverage the flexibility provided by NIMA’s exemption from many civilian personnel regulations under the DoD Authorization Act of 1996. Further, basing the new HR system on skills rather than tasks would provide the capability to support all HR initiatives, such as recruitment, selection, manpower planning, compensation, promotion, training, and individual career path planning.

To achieve this goal, NIMA contracted with the American Institutes for Research (AIR) to develop a competency-based HR management system. This system – which was based on the O*Net Occupational Information Network (Peterson, Mumford, Borman, Jeanneret, & Fleishman, 1999) – will eventually serve as the basis for all HR functions at NIMA. The O*Net model evolved from a thorough review of previous work-related taxonomies, and represents state-of-the-art thinking about the world of work. Unlike many other models, it conceptualizes both the general and the specific aspects of the work domain in an integrated fashion and was thus ideally suited to NIMA’s needs. For example, O*Net’s Basic and Cross-Functional Skills (BCFS) taxonomy was used to capture broad competencies that cut across NIMA jobs, so that these jobs could be compared and grouped to create a skills-based occupational structure. Similarly, O*Net’s Generalized Work Activities (GWA) taxonomy was used to ensure homogeneity of work activities within each occupation. Finally, O*Net’s Occupational Skills taxonomy was used to characterize the specific requirements of particular NIMA occupations.

METHODOLOGY

NIMA’s new personnel system was developed in a four-step process. The first step involved grouping legacy jobs into a parsimonious occupational structure and collecting data to statistically examine the quality and comprehensiveness of these groupings. The second step involved developing work roles within each occupation, again using empirical data to examine quality and comprehensiveness. The third step involved populating the newly-developed skills database with employee competency data. The fourth and final step involved periodically reviewing and updating the work roles to account for new developments in the ever-changing world of geospatial intelligence analysis.

Grouping Legacy Jobs into Occupations

Our initial goal was to create 20 to 30 broad occupations of employees who used similar skills in performing similar activities. We first assembled a comprehensive list of the Agency’s legacy position descriptions. A panel of Industrial and Organizational Psychologists then grouped the position descriptions into a smaller, yet comprehensive, list of composite titles. We then created a description for each composite which summarized the aggregate duties and principal differences among the positions within the composite. Subject Matter Experts (SMEs) then reviewed the descriptions and eliminated or combined composites to reflect near-term outsourcing or other changes that were expected to occur within NIMA. The result was a set of approximately 125 composite descriptions that summarized all of the work performed at NIMA.

We used statistical data to guide our decisions regarding the most appropriate occupational structure. To statistically group the composite positions into a smaller number of relatively broad occupations, it was necessary to create profiles on a common set of descriptors. We collected SME ratings on the O*Net BCFSs and cluster analyzed the profile ratings. Hierarchical agglomerative procedures were used to iteratively build increasingly larger clusters until all of the composites had been clustered together into a single group (Everitt, 1993; Kaufman & Rousseeuw, 1990). At the end, a dendrogram graphically displayed the order in which the composites were combined and indicated the magnitude of the differences between the composites being combined at each step. We reviewed each dendrogram, along with the matrix of distances between composites. Based on this information, we identified approximately 30 clusters – or potential occupations – that appeared to be both statistically and practically viable. We then presented the results to NIMA managers and other critical stakeholders to elicit their reactions, concerns, and approval. The results were considered along with a host of practical and political organizational factors to arrive at a final structure of about 25 broad occupations.
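As an aside, this clustering step can be reproduced with standard statistical tooling. The sketch below uses SciPy’s hierarchical clustering as a stand-in; the paper does not name its software, and the simulated rating matrix, descriptor count, and average-linkage choice are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(0)

# Simulated SME ratings: 125 composite positions rated on a common set of
# O*Net BCFS descriptors (the descriptor count of 40 is illustrative).
profiles = rng.integers(1, 8, size=(125, 40)).astype(float)

# Hierarchical agglomerative clustering: iteratively merge the two closest
# clusters until every composite belongs to a single group.
Z = linkage(profiles, method="average", metric="euclidean")

# The dendrogram records the merge order and the distance at each merge;
# with a plotting backend available, set no_plot=False to display it.
tree = dendrogram(Z, no_plot=True)

# Cut the tree to obtain ~30 candidate occupations for stakeholder review.
labels = fcluster(Z, t=30, criterion="maxclust")
print(f"{labels.max()} candidate occupations")
```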

Defining Work Roles within Occupations

We began to define work roles within each occupation by assembling panels of SMEs from each occupation to offer guidance regarding the preliminary work roles. These panels identified meaningful distinctions among the jobs within their occupation, and developed preliminary titles and general descriptions for each work role. At the conclusion of the SME panels, each occupation had a defined set of work roles, with a total of about 200 preliminary work roles throughout the Agency. SMEs also identified a sample of up to 12 prototypical employee representatives for each work role. These representatives were chosen because their current work duties indicated that they were working in one of the approximately 200 preliminary work roles. Each work role representative completed a competency profile of the skills, knowledge, and tools that he or she currently used to perform the work.

After collecting the employee competency profiles, we used the Jaccard similarity coefficient to assess the degree to which pairs of work roles had the same competencies identified as essential for performing the work they describe. The Jaccard coefficient ranges from 0.0 (indicating no overlap) to 1.0 (indicating complete overlap). The coefficient is based upon binary yes/no coding, which in this case indicated whether an employee used each competency in their work role. This statistic has a significant advantage in reflecting only the presence of characteristics in one or both work roles (Kaufman & Rousseeuw, 1990); it does not reflect the mutual absence of a competency from both work roles in a pair. We used the representatives’ competency data to create a profile for each work role. The profile listed each knowledge, skill, and tool that representatives identified as currently used in the work role, as well as the number of representatives who reported using each competency. We then assessed the degree of overlap within each work role and across work roles within each occupation (Mulqueen et al., 2003). Each occupation required three Jaccard analyses. The first computed the degree of similarity among representatives within each work role, producing a measure of agreement among representatives regarding their individual work role requirements. The second computed the degree of similarity between each individual work role and the pool of all other work roles within the occupation. The third computed the degree of competency similarity between each pair of work roles within the occupation.
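Concretely, for two work roles coded as sets of used competencies, the Jaccard coefficient is the size of the intersection of the two sets divided by the size of their union. A minimal sketch, with invented competency names, follows:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two binary competency profiles.

    Mutual absences are ignored: only competencies present in at least
    one of the two roles enter the calculation.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Competencies flagged as used in each of two work roles (illustrative).
role_a = {"imagery analysis", "report writing", "GIS tools"}
role_b = {"imagery analysis", "report writing", "briefing"}

print(f"{jaccard(role_a, role_b):.2f}")  # 0.50: 2 shared of 4 present overall
```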

Next, we organized a second set of SME panels to evaluate the work role competency profiles and to create the final set of work roles. The SMEs reviewed the Jaccard similarity matrices to determine whether there was unusual redundancy of competency requirements among work roles. For these determinations, we found the pairwise similarity matrix to be more helpful than the pooled matrix. A high degree of overlap among two or more work roles might indicate that the roles were too similar to be considered separate, suggesting that they should be combined into a single work role. We established a criterion of 40% or greater similarity, at which the SMEs would discuss the affected work roles and their requirements. Factors other than competency similarity – such as critical mission function and staffing requirements – were also used to determine whether work roles should be combined or remain separate. The SMEs made their final determinations based on their knowledge of the role requirements and any other reasons for keeping similar work roles separate. During the SME panels, the number of work roles was reduced from around 200 to under 170.

In addition to the Jaccard similarity indices, each work role profile contained the list of associated competencies, organized according to the number of representatives who indicated that they currently use each one. If a majority of representatives used a competency, it was identified as a core competency. The SMEs reached consensus on the final list of competencies for each work role by noting which competencies were core competencies and whether any competencies were redundant or missing. Once again, this was a matter of expert judgment that was guided by empirical information from the work role analyses. As a final step in work role refinement, the SMEs reviewed each work role description. At this time, they indicated whether any specific education or licensure requirements were necessary for the work role, and whether there were any special environmental or physical requirements for performing the work.
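The majority rule for core competencies is straightforward to make concrete. In the following sketch (representative profiles invented for illustration), a competency is flagged as core when more than half of a work role’s representatives report using it:

```python
from collections import Counter

# Each representative's set of currently-used competencies (illustrative).
reps = [
    {"imagery analysis", "report writing", "GIS tools"},
    {"imagery analysis", "report writing"},
    {"imagery analysis", "briefing"},
]

# Count how many representatives use each competency, then keep those
# used by a majority of the representatives.
counts = Counter(c for profile in reps for c in profile)
core = {c for c, n in counts.items() if n > len(reps) / 2}
print(sorted(core))  # ['imagery analysis', 'report writing']
```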

Populating the Skills Database with Employee Competency Data

Next, NIMA employees assessed their proficiency levels on the work role competencies (i.e., knowledges, skills, and tools) identified during the previous phase, and entered this information into a database using the Skill Inventory Library (SKIL). SKIL is a Microsoft Access database equipped with a special user interface for gathering proficiency ratings. The software – which was designed to require minimal expertise with computers – walked employees through the process of entering proficiency and current-use information about each skill, tool, and knowledge in the database with which the employee had expertise.

The cross-occupational skills were addressed first; each employee entered proficiency data for whichever ones he or she considered applicable. Next, the employee provided proficiency data for the skills, tools, and knowledges of their current occupation, along with those from related occupations, as appropriate. At any stage in the process, employees were allowed to view the proficiencies already entered and to change their proficiency ratings as necessary. The SKIL software was available in a centrally located computer lab and at computer terminals within the employees’ office spaces.

Periodic Review and Update

NIMA’s initial set of work roles became operational during 1998. However, work roles require periodic review to keep them current. This periodic review, which began during Fall 2002, had six primary goals. The first goal was to add, delete, and merge work roles as needed. The second goal was to replace obsolete competencies. The third goal was to ensure Agency-wide consistency in how work is described. The fourth goal was to reduce redundancy at both the work role and competency levels. The fifth goal was to systematically collect importance ratings for each competency. The sixth and final goal was to load all of the work role and competency changes into NIMA’s new PeopleSoft database.

Panels of SMEs first reviewed each work role. The panels analyzed the existing work role descriptions in their occupation to determine whether they adequately described the current work performed. For each work role, they reviewed, revised, added, or replaced any obsolete or missing competencies and educational, physical, and environmental requirements. HR representatives reviewed all proposed changes to verify that they conformed to relevant legal requirements. Finally, the panels identified a “short list” of no more than 60 competencies – including no more than 20 occupation-specific skills, 10 cross-occupational skills, 20 knowledges, and 10 tools – for each work role. This short list was needed because the SKIL database had become populated with redundant skills (such as “Human Resources Mentoring,” “Imagery Analysis Mentoring,” and “Geospatial Analysis Mentoring”).
“Imagery Analysis Mentoring,” and “Geospatial Analysis Mentoring”).<br />

Each work role's short list of competencies was converted into a survey and deployed to a sample of up to 20 of the employees in the work role, plus the Professional Advisory Board (PAB) associated with that work role. When a work role encompassed fewer than 20 employees, we surveyed all of them. Whenever the SME panels created a new work role or merged existing work roles, they were required to identify a suitable sample of employees to complete the survey. Each survey gathered respondent background information and ratings of the core competencies. For each competency, a dichotomous (yes/no) item asked whether the competency is used. When the respondent indicated "yes," the survey asked three additional questions: the competency's importance to performing the work, the extent to which it is needed upon hire, and the amount of time the respondent spends using it. The importance, hire, and time items used 5-point Likert scales, with anchors ranging from "Strongly Disagree" (1) to "Strongly Agree" (5). Because the survey was deployed electronically, validation rules prevented out-of-range values.
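As an illustration of the survey logic, the following sketch shows the branching and range-validation rules described above; the function and field names are hypothetical, not the survey software's actual interface.

```python
LIKERT_MIN, LIKERT_MAX = 1, 5  # "Strongly Disagree" .. "Strongly Agree"

def validate_response(uses_competency, importance=None,
                      needed_at_hire=None, time_spent=None):
    # The three Likert items are asked only after a "yes" on the
    # dichotomous item, and each must fall within the 1..5 range.
    if not uses_competency:
        return {"uses": False}  # follow-up items are skipped entirely
    for name, value in (("importance", importance),
                        ("needed_at_hire", needed_at_hire),
                        ("time_spent", time_spent)):
        if value is None or not LIKERT_MIN <= value <= LIKERT_MAX:
            raise ValueError(f"{name} must be an integer from 1 to 5")
    return {"uses": True, "importance": importance,
            "needed_at_hire": needed_at_hire, "time_spent": time_spent}

print(validate_response(True, importance=4, needed_at_hire=5, time_spent=3))
```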

Next, we weighted the response data in two steps. The first weighting was made at the work role level; its purpose was to adjust the PAB ratings to contribute 50 percent of the final weighted response. For each work role, we identified the overall importance of each competency by calculating the weighted mean and standard deviation of the ratings for need at hire, frequency, and importance across all the incumbent raters for that work role and its PAB (with the PAB ratings weighted as previously discussed). After calculating the weighted overall importance ratings, we sorted the competencies within competency type (knowledge, skill, cross-occupational skill, and tool) in decreasing order of importance (mean) and increasing consensus about importance (standard deviation). The second weighting, made at the occupation level, was designed to give greater weight to competencies identified in a large number of roles. Therefore, we weighted each competency by the number of work roles in which it appeared; for example, a competency that appeared in 5 work roles was weighted by 5, while a competency that appeared in only 1 work role was effectively unweighted. This procedure ensured that the most "critical" competencies were in fact used by a large percentage of the NIMA workforce.
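A minimal sketch of the two weighting steps may help. It assumes a simple 50/50 split between the PAB and incumbent means at the work role level, which is one way to read the description above; the competency names and ratings are purely illustrative.

```python
import statistics

def role_level_mean(incumbent_ratings, pab_ratings):
    # Step 1: PAB ratings contribute 50 percent of the final weighted
    # response, regardless of how many raters sit in either group.
    return 0.5 * statistics.mean(pab_ratings) + 0.5 * statistics.mean(incumbent_ratings)

def occupation_level_weight(competency, work_roles):
    # Step 2: weight a competency by the number of work roles whose
    # short list includes it; a weight of 1 leaves it unweighted.
    return sum(1 for role in work_roles if competency in role)

# Illustrative data: ratings on the 5-point scale, hypothetical roles.
roles = [{"Imagery Analysis", "Mentoring"}, {"Mentoring"}]
print(role_level_mean([4, 5, 3], [5, 4]))           # -> 4.25
print(occupation_level_weight("Mentoring", roles))  # -> 2
```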

We then used Cronbach's alpha to calculate the overall degree of inter-rater reliability (or consensus) within each work role. Cronbach's alpha represents the degree to which raters provided consistent evaluations of the importance of each competency. Because the statistic depends on both the number of raters and the similarity of their ratings, low values may be due to a small sample size, low consensus, or both. Values for Cronbach's alpha range from 0.0 (indicating no consistency) to 1.0 (indicating perfect agreement). Values under 0.50 may suggest that the raters disagreed about the importance of the competencies they rated, and possibly that the work role requires additional review; however, low values can also result when a work role had only a small number of reviewers. Conversely, high values indicate agreement among the reviewers about the importance of the competencies they rated, suggesting that the work role accurately describes the work.
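For readers who wish to reproduce the statistic, the following sketch computes Cronbach's alpha with raters treated as "items" and competencies as cases, which is one standard way to apply the formula to inter-rater data; the ratings are illustrative.

```python
import statistics

def cronbach_alpha(ratings):
    # ratings: one list per rater, one importance rating per competency.
    # Classic formula with raters as "items": values near 1.0 indicate
    # that raters rank the competencies consistently.
    k = len(ratings)     # number of raters (must be at least 2)
    n = len(ratings[0])  # number of competencies rated
    rater_vars = [statistics.pvariance(r) for r in ratings]
    totals = [sum(ratings[i][j] for i in range(k)) for j in range(n)]
    return (k / (k - 1)) * (1 - sum(rater_vars) / statistics.pvariance(totals))

# Three raters, four competencies, high agreement -> alpha near 1.
print(round(cronbach_alpha([[5, 4, 2, 1],
                            [5, 5, 2, 1],
                            [4, 4, 3, 1]]), 2))  # -> 0.97
```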

Finally, we used the Jaccard similarity coefficient to indicate the degree to which pairs of work roles had the same competencies identified as essential for performing the work they describe. Specifically, we treated values of the Jaccard coefficient greater than 0.80 as suggesting that further review should be considered. As before, statistical results were combined with expert judgments to determine whether individual pairs of work roles should be combined.
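The coefficient itself is simple to compute, as the following sketch shows for a hypothetical pair of work roles; the competency names are illustrative, and the 0.80 threshold is the review trigger described above.

```python
def jaccard(a, b):
    # Shared competencies over all competencies named by either role.
    return len(a & b) / len(a | b)

role_a = {"Imagery Analysis", "Mentoring", "Report Writing"}
role_b = {"Imagery Analysis", "Mentoring", "Collection Management"}
similarity = jaccard(role_a, role_b)   # 2 shared of 4 total -> 0.5
if similarity > 0.80:                  # the review threshold used above
    print("Flag this pair of work roles for further review.")
```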

Before the results data could be integrated into PeopleSoft, they were reviewed and signed off by NIMA management and other key stakeholders. We prepared four main types of documentation for this Agency-level review: tables displaying the return rate for each work role survey; changes (if any) in the number of work roles before and after the periodic review; occupation-specific tables displaying the inter-rater reliabilities and pairwise similarities for all work roles within that occupation; and tables showing the mean ratings for each competency by work role and occupation.

CONCLUSION

As a result of this occupational development effort, NIMA now has: 1) an empirically derived structure of broad occupations, each consisting of work roles that apply similar skills toward the performance of similar work activities; 2) a comprehensive database of the competencies that are used to perform the full range of work at the Agency; and 3) a database of employee self-rated proficiencies for over 90% of the NIMA workforce. Competencies have now become a major part of NIMA's career development and promotion processes, and have influenced the development of Agency training programs. Moreover, during recent international crises, the competency data have been used to "search for the expert" to perform specific, quick-turnaround geospatial intelligence analysis missions. The possible uses of the data are numerous, and the work roles will continue to form the basis for other HR initiatives, such as recruitment, selection, and manpower planning.


Personnel Security Investigations: Improving the Quality of Subject and Workplace Interviews

Jeremy F. Peck
Northrop Grumman Mission Systems / Defense Personnel Security Research Center

Introduction

Keeping our nation's security secrets is a great challenge. Each year thousands of individuals are evaluated for initial or continuing access to sensitive or classified information. For Top Secret access eligibility, the principal methods for initially screening and periodically reevaluating individuals are the Single Scope Background Investigation (SSBI) and the SSBI Periodic Reinvestigation (SSBI-PR). These personnel security investigations (PSIs) gather information about an individual's background using a variety of sources, including interviews (e.g., subject, supervisor, coworker, reference, neighbor, ex-spouse, medical), self-report background questionnaires, record checks (e.g., national agency checks, local agency checks, public records checks, credit checks), and, in some agencies, polygraphs. Of these sources, subject interviews are among the most productive in terms of providing information relevant to the clearance adjudication process. Workplace interviews, on the other hand, are less productive in gathering information of security relevance, possibly because coworkers and supervisors are not aware of the information, are concerned about defying social norms against disclosing negative information about their coworkers or subordinates, or fear legal recourse or retaliation. Both interviews are among the most expensive components of the PSI process.

Determining what makes one interview method superior to another is vital for meaningful monitoring of investigative effectiveness, as well as for establishing system improvements. The underlying challenge, therefore, is defining, measuring, and improving interviewing quality and effectiveness.

Purpose and Overview

Empirical evidence suggests that structured interviews provide more valid information than unstructured interviews. This study was conducted to examine ways to improve interviewing practices for the subject and workplace references using a set of structured questions enhanced to cover each area of security concern outlined in the Adjudicative Guidelines.¹

The objectives of, and the specific methods used by, investigators in conducting interviews are integral to the overall quality of personnel security investigations. Although the subject interview does a good job of obtaining issue-relevant information,² there are important procedural and methodological characteristics that require attention in conducting both subject and workplace interviews.

¹ Guidelines used by defense and intelligence community adjudicative agencies in making determinations whether to grant a security clearance.


Observations from a review of 1,200 SSBIs conducted by the Defense Security Service (DSS) provide anecdotal evidence suggesting that the problematic procedural components include:

• A lack of coordination between the interviews and other investigative components (e.g., issue-relevant information developed through other sources may not be followed up with the subject)
• Investigators tend not to acquire the names of additional references from the subject or workplace references during the interview; if acquired, these names could be passed on to other investigators conducting reference interviews
• Investigators with less experience may not know why they are asking certain questions

Interview Objectives

The DSS investigative manual asserts that the subject interview has two basic objectives:

1. To ensure that all information listed by the subject on his or her security forms is accurate and complete.
2. To resolve issue or discrepant information by offering the subject an opportunity to explain, refute, amplify, or mitigate information developed by an investigation or known at the onset of the investigation.

The first objective consists of reviewing and validating the responses the subject provides to each item on the security form (SF 86, SF 85P, SF 85PS, EPSQ, etc.). While it is important to ensure the information on the forms is accurate and complete, questions remain as to the costs and benefits of the form-validation portion of the interview. One question is whether the time spent on validating forms could be better used to uncover issue-relevant information related to substantive security concerns.

Investigators who conduct the SSBI and the SSBI-PR are tasked with gathering as much information about the subject as possible under time and resource constraints. Validating the subject's security form can be done relatively quickly compared to conducting a more thorough interview, which tends to take longer. Such time pressure discourages investigators and review personnel (supervisors, case analysts, and adjudicators) from being thorough. The emphasis placed on high production (e.g., conducting as many investigations as possible under severe time constraints) creates the risk of compromising quality. Such quality concerns have become particularly important because of the current trend toward outsourcing PSIs and efforts to make clearance eligibility determinations reciprocal across federal agencies. Private contractors increasingly conduct personnel security investigations for the government, yet there is no standard method of conducting interviews or of assessing their quality.

² Information relevant to establishing that an issue is of potential current security concern and/or information that an adjudicator would want to review in making a clearance decision.


Anecdotal evidence suggests more could be done to improve the quality of interviews with regard to the amount of information adjudicators require to make well-informed clearance determinations. Existing evidence suggests that much of the pertinent information that is either volunteered by the subject or provided in response to specific questions is not followed up with probing questions. Without the appropriate level of follow-up questioning, the Report of Investigation (ROI) completed by investigators and forwarded to adjudicators provides less information on which adjudicators can base their clearance decisions. Interviews conducted in a standardized manner that involve asking specific follow-up questions of security relevance, as well as questions that mitigate³ security concerns, can go a long way toward assisting adjudicators by providing them with additional and necessary information. Without this information, adjudicators often require investigators to conduct a second interview to obtain the missing information before making a clearance determination. However, because of time pressures to close cases (make a clearance determination), adjudicators might be reluctant to send investigative cases back into the field for follow-up interviews.

Method

The development of the enhanced interview questions consisted of the following four steps.

1. Reviewing the available research literature on interviewing methods used across industries. Key points of this research include: longer questions tend to produce longer responses and more reporting of undesirable behaviors; probe questions that elicit additional information are effective in motivating respondents to reveal personal information; and structured interviews are more valid than unstructured interviews.

2. Reviewing the investigative manuals of several federal agencies that are based on Executive Order (EO) 12968, Access to Classified Information (1995), which sets the standard for eligibility for access to classified information. Each agency applies EO 12968 standards and guidelines to its investigative manual differently, based on the particular mission of the agency. This review was conducted in order to a) determine what recommendations, if any, exist on what questions should be asked when interviewing the subject and workplace sources, and b) provide the basis for making the enhanced interview questions applicable to each agency.

³ Information that explains, refutes, moderates, or lessens issue-relevant information.



3. Generating the specific questions. For this, the Adjudicative Guidelines were relied upon to ensure the questions coincided with those guidelines. The guidelines include the conditions that could raise a security concern and may be disqualifying factors in obtaining a clearance, as well as the conditions that could mitigate such security concerns.

4. Obtaining input from individuals with expertise in conducting background investigations for several federal agencies. These experts reviewed drafts of the enhanced interview questions and provided specific feedback, which was compiled and integrated into the final enhanced interview questions document.

Once a standard set of questions was developed, a pilot test was initiated to assess the value of the enhanced questioning protocol. The test sample consisted of approximately 150 SSBI-PR subject and workplace interviews, each conducted using the enhanced interview protocol. In addition, approximately 150 baseline interviews were conducted using the traditional line of questioning. The structured format of the enhanced interview questions provides investigators with specific question ordering and wording, guidance on when to use closed- versus open-ended questions, and specific follow-up questions to ask based on responses to the initial question. The structured format therefore provides greater coverage of information that may be of potential security relevance. Limiting the "holes" in coverage prevents the adjudicator from having to fill in those holes by either speculating as to what information should have been obtained or requesting a follow-up interview to obtain missing information.

One federal government agency was chosen to participate. Investigators from this agency were trained on how to use the protocol. An Interview Preparation Guide was also developed and given to investigators to use prior to, and in conjunction with, conducting the interviews. During the training, emphasis was placed on having investigators ask the appropriate follow-up questions listed on the document. These follow-up questions are an integral part of the enhancements made to the line of questioning currently used. For example, on a question related to a subject's financial situation, the investigator currently might ask questions associated only with validating what has been provided on the subject's security form. An enhanced question on the same topic asks: "Do you have any difficulties paying your bills or concerns about your current level of debt?" If the subject answers "yes," the investigator is instructed to get details and the circumstances surrounding this issue.
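One way to picture the structured format is as a set of question nodes, each carrying its follow-up rule. The sketch below encodes the financial example above; the node structure itself is an illustration under that assumption, not the protocol's actual representation.

```python
# The question wording comes from the example above; the dictionary
# layout is hypothetical, not the protocol's actual format.
financial_question = {
    "ask": ("Do you have any difficulties paying your bills or concerns "
            "about your current level of debt?"),
    "type": "closed",
    "on_yes": "Obtain details and the circumstances surrounding the issue.",
}

def conduct(question, answered_yes):
    # Ask the scripted question, then apply the follow-up rule.
    print(question["ask"])
    if answered_yes:
        print("Follow-up:", question["on_yes"])

conduct(financial_question, answered_yes=True)
```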

Results

Rating forms were provided to investigators at the federal agency used in the pilot test to rate the quality of the enhanced interviews compared to the traditional line of questioning. Investigators were instructed to indicate on the form the extent to which they agreed or disagreed with several statements; an example item is "The enhanced interview provided more complete coverage of disqualifying factors for this case." Different rating forms were provided to adjudicators at this agency to rate the extent to which the Report of Investigation for each of the enhanced interviews contained unresolved issues or inadequate information for each of the required elements of the Investigative Standards (required by EO 12968).

Preliminary data analysis suggests that while the enhanced subject interviews take longer to complete on average, the structured format provides more coverage, whether or not that coverage surfaces information of security concern. For this study, the rating forms are expected to provide the bulk of the data needed for a thorough analysis of the overall effectiveness of the enhanced interview questions. However, at the time of this writing, the sample cases have yet to be adjudicated. Once the rating forms have been completed, comparisons will be made between the adjudicator rating forms for the baseline sample and those for the enhanced interview sample. The rating forms completed by the investigators will also be analyzed to determine how well the enhanced interview protocol performed in comparison to the traditional approach to conducting interviews.

Conclusion

There is an inherent tension in the PSI program between validating the information provided on the subject's security form (a non-confrontational approach) and probing deeper into areas of the subject's life that may be of security concern (an investigative approach). This study intends to reconcile some of that tension, as well as the tension between conducting effective, thorough interviews and completing and adjudicating them in a timely manner.

A number of important findings are expected to emerge from this study. In determining what makes one interview method superior to another, the rating data will provide meaningful information on improving interview and overall investigative effectiveness. Because the relative productivity of the subject and workplace interviews is similar across federal agencies, the findings from this study could have implications for improving interviewing techniques across all the agencies that conduct security background investigations.

References

Bosshardt, M.J., & Lang, E.L. (2002, November). Improving the subject and employment interview processes: A review of research and practice. Minneapolis, MN: Personnel Decisions Research Institutes, Inc.

Carney, R.M. (1996, March). SSBI source yield: An examination of sources contacted during the SSBI (PERS-TR-96-001). Monterey, CA: Defense Personnel Security Research Center.

Defense Personnel Security Research Center. (2003). Unpublished analyses. Monterey, CA: Author.

Defense Security Service. (2001, September 10). DSS Investigations Manual.

Director of Central Intelligence. (1998). Adjudicative guidelines for determining eligibility for access to classified information (DCID 6/4, Annex C, July 2, 1998). Washington, DC: Author.

Executive Order 12968, "Access to Classified Information," August 2, 1995.

Kramer, L.A., Crawford, L.S., Heuer, R.J., Jr., & Hagen, R.R. (2001, August). SSBI-PR source yield: An examination of sources contacted during the SSBI-PR (PERS-TR-01-6). Monterey, CA: Defense Personnel Security Research Center.

Privacy Act, 5 U.S.C. 552a (1974).



Strategies for Increased Reporting of Security-Relevant Behavior*

Kent S. Crawford
Defense Personnel Security Research Center

Suzanne Wood
Consultant to Northrop Grumman Mission Systems

Introduction

United States federal policies and those of the Department of Defense (DoD) are designed to ensure that the cleared workforce is reliable, trustworthy, and loyal. One requirement of such policies is that supervisors and coworkers who work in classified environments report to security managers any behavior they observe among workplace colleagues that may be of security concern. In essence, supervisors and coworkers, working closely with each other as they do, can observe or become aware of behaviors that might suggest a risk to national security.

These policy requirements are in place as one means to prevent Americans from committing espionage. However, espionage is a rare event: one is not likely ever to encounter a spy, much less observe an act of espionage. So while catching spies may be an ultimate goal of these policies and a reason to report counterintelligence- (CI) and security-related behavior, the policies are also designed to regularly evaluate the millions of ordinary people in the workplace who have access to classified information, people who have no intention of committing espionage but are likely, over a period of time or in changing contexts, to develop personal problems that may call their reliability and trustworthiness into question. The philosophy behind current policies is that the government is not just trying to root out spies: by asking employees to report, the government is not only identifying potential security risks but actually helping employees take care of the kinds of human problems that plague us all from time to time. However, reporting policy, which is based on the adjudicative guidelines, mixes the kinds of behavior that should always be reported, namely CI- and security-related behaviors, with reliability and suitability problems. This has led to confusion among supervisors and coworkers about which behaviors are the most important to report. It is this confusion that often paralyzes employees: if they are not sure exactly what to report, they simply report nothing.

Evidence gathered during a recent PERSEREC study (Wood & Marshall-Mies, 2003) shows that the reporting rate is in fact very low. Supervisors and coworkers are reluctant to inform security managers about many behaviors they observe in the workplace because they believe them to be too personal to report. It is ironic that the very behaviors the government wants people to report, in order to be able to help them, are the ones that supervisors and coworkers are loath to share with authorities. Employees are, however, more willing to report behaviors that are egregious and appear to have a more direct connection with national security.

_______________________________________________________________________
*The views expressed in this paper are those of the authors and do not necessarily reflect those of the United States Department of Defense.


The present paper, based on the 2003 PERSEREC report mentioned above, examines current reporting policies, discusses research on reporting, and describes supervisor and employee confusion about what to report. It recommends ways to reduce the disconnect between the requirements to report and the actual reporting of security-relevant behavior in DoD. The aim of this paper is to recommend changes in organizational practice that might lead to the establishment of conditions under which employees would be more likely to report egregious security-relevant behaviors.

Methodology

The research methodology consisted of four steps: (1) reviewing policies related to supervisor and coworker reporting; (2) conducting literature reviews of commission studies and other research to learn about the willingness of people in general to report on colleagues; (3) interviewing military service, DoD, and non-DoD security and other management personnel to determine the frequency of reporting and to gather recommendations for improving reporting policy and its implementation; and (4) conducting focus groups with supervisors and coworkers in the field to discuss their reporting responsibilities, willingness to report, and recommendations.

Results

Review of Policies

The key policy documents that concern the reporting of CI- and security-related behaviors were compared and contrasted, exploring areas of overlap, degree of specificity, and whether one policy superseded another.

Executive Order. In August 1995, Executive Order 12968, Access to Classified Information, addressed the subject of employee reporting responsibilities. The order states, inter alia, that employees are expected to "report any information that raises doubts as to whether another employee's continued eligibility for access to classified information is clearly consistent with the national security." The order also expands recommended prevention, treatment, and rehabilitation programs beyond drug and alcohol abuse, and emphasizes retaining personnel while they deal with a wide range of problems through counseling or the development of appropriate life skills.

Directive for Sensitive Compartmented Information (SCI). Director of Central Intelligence Directive (DCID) 6/4, Personnel Security Standards and Procedures Governing Eligibility for Access to Sensitive Compartmented Information (July 2, 1998), covering individuals with SCI access, requires that security awareness programs be established for supervisors that provide practical guidance on indicators that may signal matters of security concern. DCID 6/4 discusses individuals' responsibilities for reporting activities by anyone, including their coworkers, that could conflict with those individuals' ability to protect highly classified information.

DoD Directives and Regulations. DoD Directive 5200.2-R, Personnel Security Program (January 1987; amended 1996 and soon to be completely revised), implements the personnel security requirements of various executive orders. The directive outlines personnel security policies and procedures, including categories of behavior to be reported and provisions for helping troubled employees. The categories of behavior that serve as adjudicative guidelines and are to be reported are allegiance to the U.S.; foreign influence; foreign preference; sexual behavior; personal conduct; financial considerations; alcohol consumption; drug involvement; emotional, mental, and personality disorders; criminal conduct; security violations; outside activities; and misuse of information technology systems. While the directive requires that infractions in all the above categories of behavior be reported, like the other formal documents described above it is vague on definitions of these behaviors, and it mixes CI, security, and personal problems together in a single list. It requires that supervisors and coworkers report all relevant personnel security behaviors.

Literature Review

Sarbin (2001), in exploring the psychological literature, suggested that the lack of reporting in the workplace is due to cultural prohibitions against informing on one's colleagues and friends, especially for behaviors that are not strictly violations of security rules. Except in cases where the behavior is egregious, Sarbin questioned the effectiveness of current DoD policy that requires employees to inform on their fellow workers. Reviewing proxy measures of reporting in fields such as whistleblowing, Giacalone (2001) also found that supervisors and coworkers in the general workplace report only a small percentage of the questionable behaviors they observe. In spite of the low rate of reporting and the cultural injunction against informing on others, Giacalone recommended several interventions to help increase the rate of reporting. These interventions were designed to make policies clearer and more transparent, and to train supervisors and workers on these policies, the behaviors of concern, and the nexus between these behaviors and national security.

A review of commission studies (Joint Security Commission, 1994; Joint Security Commission II, 1999) and related research (Bosshardt, DuBois, & Crawford, 1991; Kramer, Crawford, Heuer, & Hagen, 2001; Fischer & Morgan, 2002; Wood, 2001; Erdreich, Parks, & Amador, 1993) confirmed Sarbin's and Giacalone's findings that few individuals report security-related issues. Supervisors provide more security-relevant information than do coworkers, but neither is a very productive source. For example, in a PERSEREC study of four federal agencies, supervisors and coworkers provided very little information compared to sources such as subject interviews, personal history questionnaires (SF-86/SPHS), and credit reports, as the table below shows.

Percentage of SSBI-PRs Where Source Yielded Issue Information

Source                | DoD (n = 1,611) | OPM (n = 1,332) | CIA (n = 855) | NRO (n = 923)
Subject Interview     | 15%             | 18%             | 23%           | 25%
SF-86/SPHS            | 11%             | 15%             | 5%            | 10%
Credit Reports        | 11%             | 18%             | 10%           | 9%
Supervisor Interviews | 3%              | 5%              | 5%            | 2%
Coworker Interviews   | 1%              | 3%              | 3%            | 1%

Interviews with Security Managers

For the 2003 PERSEREC study, interviews were conducted with 45 security managers and management personnel at 20 federal agencies, including intelligence agencies, military organizations, the Department of Energy, the State Department, the Federal Bureau of Investigation, and others. These personnel supported the notion that supervisors report more often than coworkers, but that neither group reports much. They offered a series of reasons why this should be so and made some recommendations to improve the situation. Security managers suggested the following reasons why people may not report:

• Cultural resistance.
• Negative perceptions of reporting and its consequences (to the reporter and to the person reported).
• Lack of knowledge and experience of security officers, supervisors, and the workforce with reporting requirements.
• Unclear relationships between security, employee assistance programs, and other functions in the organization.

Focus Groups

Supervisors and coworkers in focus groups supported the managers' estimates of the frequency of supervisor and coworker reports. They noted their own reluctance to report on their colleagues. Reasons for not reporting included:

• Cannot see the nexus between certain reportable behaviors and national security.
• Fear they will cause people problems.
• Fear that reported colleagues will be harmed because the system may not be fair to them.
• Fear that they will lose control once the report has been made to Security.
• Fear of negative repercussions to themselves for reporting.

However, they are not resistant to reporting serious infractions. It is simply that the DoD Directive 5200.2-R reporting requirements are perceived as being too broad and amorphous and, thus, very difficult to implement. The regulation requires that supervisors be trained to recognize "indicators that may signal matters of personnel security concern" and that supervisors and coworkers report "information with potentially serious security significance." While these phrases may have been clear to the original framers of the directive, they are far from obvious to personnel in the field. Noted one supervisor, "We need clearer rules about what should be reported up the chain." However, even in the absence of such guidance, supervisors and coworkers do intuitively distinguish between behaviors that are directly related to national security (which they say they have no problem reporting) and behaviors that are associated with reliability and suitability for employment (which they are hesitant to report).

The single most important reason employees gave for seldom reporting is that they personally cannot see the precise connection, the nexus, between certain behaviors and national security. They said that they do not know where to draw the line between egregious security-related behaviors and gray-area suitability or personal behaviors: the kinds of problems that, while important, are seen as less critical in terms of security risk management and are not directly linked in people's minds with the compromise of security or with espionage.

If the connection were made apparent, they said, they would be more motivated to report in order to protect their country and national security. In response to this concern, PERSEREC subsequently developed a brochure that lists behaviors that should always be reported if they become known; reporting these behaviors requires no judgment calls. The brochure, Counterintelligence Reporting Essentials (CORE): A Practical Guide for Reporting Counterintelligence and Security Indicators, will be distributed to all CI and other security organizations for use in the field as part of security awareness presentations and CI briefings.

Conclusions

Findings from the PERSEREC study show that there will always be some tension between the rules requiring reporting and our cultural values against informing on colleagues. This is especially likely in cases where the "infraction" is not perceived to be an illegal activity or security violation but a common, and often transient, personal problem. Yet, provided they understood the nexus, study participants had no objection to reporting serious security-related behaviors, so long as it was made clear what constitutes such behaviors. They believe that temporary personal problems may be better handled in a different manner, perhaps by the supervisor through referral to employee assistance programs or to other kinds of monitored treatment programs.

The PERSEREC study points to the need to increase the reporting of critical and obvious security-related behaviors, which employees say they are willing to report. It suggests drawing a clearer distinction between the reporting, and consequences, of egregious security-related behaviors and suitability-type behaviors of a more personal nature that realistically are not likely to be reported. By clearly communicating this distinction to supervisors and coworkers through use of PERSEREC's CORE brochure, and by encouraging supervisors to become more proactive in addressing suitability issues, the rate of reporting of truly serious security infractions may well be increased.

References

Bosshardt, M.J., DuBois, D.A., & Crawford, K.S. (1991). Continuing assessment of cleared personnel in the military services, Reports 1-4 (PERS-TR-91-1 through 4). Monterey, CA: Defense Personnel Security Research Center.

Erdreich, B.L., Parks, J.L., & Amador, A.C. (1993). Whistleblowing in the government: An update. Washington, DC: U.S. Merit Systems Protection Board.

Fischer, L.F., & Morgan, R.W. (2002). Sources of information and issues leading to clearance revocations. Monterey, CA: Defense Personnel Security Research Center.

Giacalone, R.A. (2001, April). Coworker and supervisor disclosure of reportable behavior: A review of proxy literature and programs. Paper presented at a colloquium on Obtaining Information from the Workplace: Supervisor and Coworker Reporting. Monterey, CA: Defense Personnel Security Research Center.

Joint Security Commission. (1994). Redefining security: A report to the Secretary of Defense and the Director of Central Intelligence. Washington, DC: Author.

Joint Security Commission. (1999). A report by the Joint Security Commission II. Washington, DC: Author.

Kramer, L.A., Crawford, K.S., Heuer, R.J., & Hagen, R.R. (2001). Single-Scope Background Investigation-Periodic Reinvestigation (SSBI-PR) source yield: An examination of sources contacted during the SSBI-PR (TR-01-5). Monterey, CA: Defense Personnel Security Research Center.

Sarbin, T.R. (2001, April). Moral resistance to informing on coworkers. Paper presented at a colloquium on Obtaining Information from the Workplace: Coworker and Supervisor Reporting. Monterey, CA: Defense Personnel Security Research Center.

Wood, S. (2001). Public opinion of selected national security issues: 1994-2000 (MR-01-04). Monterey, CA: Defense Personnel Security Research Center.

Wood, S., & Marshall-Mies, J.C. (2003). Improving supervisor and coworker reporting of information of security concern. Monterey, CA: Defense Personnel Security Research Center.



CHARACTERIZING INFORMATION SYSTEMS INSIDER OFFENDERS

Lynn F. Fischer
Defense Personnel Security Research Center

Introduction

The development of a database to track trends and common characteristics of information systems insider offenses in the Department of Defense has been underway at PERSEREC for over 3 years. An early analysis of data drawn from the Insider Events Database was presented to IMTA at the Edinburgh meeting in 2000. Since then, additional information has been obtained from the investigative agencies of the military services, and we can now more clearly define common characteristics and motivations of offenders, as well as the types of offenses they commit against defense systems. Insiders are defined here as individuals holding a position of trust and given authorized access to a defense information system. As military services throughout the world become increasingly dependent on computer systems, internal networks, and the Internet, it is probable that scenarios such as those described in this report will be repeated in the military organizations of other countries.

While many approaches to detecting and preventing cyber-offenses committed by insiders focus on technical countermeasures, we at PERSEREC are persuaded that the insider threat is essentially a trust-betrayal issue. That is, the insider threat is a human problem related to the selection and monitoring of persons who have use of, or administrative control over, our critical networks. Adequate security education is another important factor working against misuse or abuse of systems. We have found that many offenders did not know, or had not been informed of, what behavior was unacceptable and what its consequences would be for the integrity and operability of their systems.

Who Are the People Doing This and Why Do They Do It?

Based on a review of over 80 insider events of varying seriousness in the database, most offenses were committed by younger service members or by information technology (IT) professionals under contract to a defense facility. Forty-seven percent were attributed to misuse by uniformed service members. Of the 33 events for which the rank of the offender is known, 22 involved junior enlisted personnel, nine were committed by non-commissioned officers, and two by commissioned officers. With few exceptions, the service members, whether IT professionals or not, knew a great deal more about computer systems than their jobs required. Several engaged in hacking or computer-related private enterprise from home during off-duty hours.

This paper reviews several significant findings, or generalizations, based on data available to date that are emerging from the analysis. However, a much better understanding of situational factors, motivations, and contributing causes can be gained from in-depth case studies of those events that had significant consequences for the organization. Therefore, each of the general observations will be illustrated by a case study that demonstrates the importance of situational factors at the place of employment, interpersonal and social relationships (often hostile and vindictive), and the attitudes of the offenders. The following observations are clearly emerging from the data acquired from sources of record to date.


Findings

Almost all insider offenders were system administrators or had some level of administrative access to the system being abused.

Of the identified offenders, 20% were system administrators, 34% were assistant administrators, and another 41% had limited administrative access beyond that of a normal user. Among the many events that could illustrate this point, one stands out as unique: that of a Private First Class (PFC) who had helped to develop the U.S. Army's database for enlisted records. This junior service member, whose later conduct was particularly egregious, was responsible for three events, each separated by several months.¹

Age 22 at the time of these events, the service member was an information systems operator at Ft. Benjamin Harrison, Indiana. He claimed to have been interested in computers from an early age and had received advanced systems training while in the Army. He also operated a small computer business from his home. When he first arrived at Ft. Harrison in 1995, he reported to a Captain who depended upon his computer skills and gave him considerable freedom on the job. The PFC's work position was information systems operator and software analyst, and he was assigned to the Information Support Agency Enlisted Records and Evaluation Center (EREC). A subsequent branch chief, however, exercised greater authority over his activities and in fact ordered him to remove unauthorized personal files from the system server. The soldier's resistance to this policy resulted in a non-judicial punishment in November 1998.

After continued animosity between him and his new branch chief, the soldier apparently attempted to get even by disabling the system users' accounts in April 1999, shutting down the EREC database system for about 3.5 hours. The PFC was accused of damaging computer information and of unauthorized computer access. A decision was made not to court-martial him; instead, the action against him resulted in another non-judicial punishment by which he was reduced in rank, fined, and removed from all systems administrator-level duties.

But he was still intent on getting even. With the assistance of a chat room acquaintance located in Jamaica, he was able to steal passwords and infect several of the workstations on his organization's system with a Trojan horse program (BO2K) that gave him remote control of these workstations. He then proceeded to delete over 1,000 work-related files of system users. The culprit was not difficult for special investigators to identify. In September 1999, the service member was arrested and his residence searched for evidence. Later that month, an unlawful intrusion into a U.S. Army computer network in Indianapolis was detected and traced to someone attacking from Montego Bay, Jamaica.

For this final attack on the Army system, the service member was formally prosecuted. In June 2000, he appeared before a general court-martial convened at Ft. Knox, Kentucky, and pleaded guilty to all charges. He was sentenced to reduction to the lowest enlisted rank, loss of all benefits and pensions, and 4 months of criminal confinement, to be followed by a Bad Conduct Discharge.

¹ Information in this case summary is based on interviews with personnel at the scene of the offense, interviews with case agents, and transcripts from the court-martial proceedings.



In about 60% of these recorded events the offender demonstrated malicious or criminal intent.

Notwithstanding this fact, nearly 30% of offenders were not motivated by malice. Too often, as in the previous case, an offender has a sense of ownership of a system that leads him to believe that he can use it for personal gain or convenience and, in doing so, will neither be criticized nor jeopardize the integrity of the government system. In these cases there appears to be no intent to damage or destroy the system or to seek revenge on another employee. Approximately a third of the events in the Insider Events Database fall into this category. At one U.S. facility in Korea, for example, four service members actually set up their own business web site on the government server, apparently thinking that no harm would be done.

However, in some cases non-malicious behavior can result in serious trouble for the offender. One example is the case of Michael Scott Moody, who at the time of the offense was an Airman First Class with the 90th Fighter Squadron at Elmendorf Air Force Base, Alaska.² In November 1998, information was received from the Air Force Computer Emergency Response Team (AFCERT) that two computers at Elmendorf Air Force Base had been illegally accessed. The intrusion was traced to a home computer owned by Moody, who had set up NetBus³ on two workstations in his office so that he could operate them remotely. A search of his personal computer revealed evidence of hacking, software piracy, and possession of child pornography. In May 1999, Moody was discharged from the service so that he could face charges in a Federal court. He pleaded guilty to illegal access to government computers and possession of child pornography. Moody was sentenced to 10 months in prison.

According to a media account of the trial, Moody told the judge, "Honestly, at the time, I didn't consider it hacking. I thought of it more as a prank…I was curious to know if I could access the computer at work. Being a government computer, I considered it a challenge. It worked. I didn't mean to hurt no one" ("Hacker gets time," 1999).

As is often the case, the accused claimed to have been misled by chat-room acquaintances who not only sent him child pornography but provided him with the NetBus software over the Internet. However, once Moody installed the software, it allowed anyone with knowledge of NetBus to access the Elmendorf AFB computers, which contained personnel records and maintenance records for an F-15 squadron.

² This case summary has been developed from numerous media reports published at the time of Moody's arrest.

³ NetBus is one of several software systems that permit control of a workstation from a remote location. It may have legitimate uses, but its illicit use as a Trojan horse requires the loading of a program on a targeted workstation, usually via an attachment to an e-mail message. The unwitting user of the workstation is unaware that by opening the attachment NetBus is loaded and the system is made available to the remote intruder. Hackers usually send NetBus to unsuspecting computer owners by e-mail, disguised in the attachment as a computer game or graphic file.


Over 60% of these offenses resulted in serious damage or compromise to the system.

Among the worst consequences of a malicious insider's actions are denial of service to authorized users and the destruction of official files and software, since our primary concern is with the unimpaired operability of defense information systems, particularly in the event of armed conflict. Data obtained to date show that 23% of the events involved denial-of-service attacks, while 11% resulted in the destruction of official files or software. An additional 29% resulted from the introduction of unauthorized software into a government system, some of it malicious or damaging. The first case discussed in this paper is the best example of this type of offense; however, there are several others. Another prosecuted case involved a disgruntled government civilian employee of the U.S. Coast Guard.

In early 1998, Shakuntla Singla, a civilian employee and systems administrator for the U.S. Coast Guard in Washington, DC, used the password and identification of another employee to gain access to the Coast Guard system from her home after resigning from the organization. Singla was reported to be angry over the fact that the organization had ignored her reports about improper conduct by an IT contractor employee. She had in fact filed a complaint with the Equal Employment Opportunity Commission claiming that she was subject to a hostile working environment.

Two months later, employees noticed that critical files had been deleted from the Coast Guard nationwide personnel database, causing the system to shut down. According to a news report, “The July crash wiped out almost two weeks’ worth of personnel data used to determine promotions, transfers, assignments and disability claim reviews for Coast Guard personnel nationwide” (“Woman gets five months,” 1998). The prosecuting Assistant U.S. Attorney stated, “It took 115 Coast Guard employees across the country working more than 1,800 hours to recover and reenter the data, at a cost of more than $40,000.”

It was clear that, because of the precision with which the hacking was accomplished, the culprit was an insider or had inside information. The FBI linked Singla to the crime through computer and phone records and the fact that she had used an access code, known only to a few people, to enter the system. Singla had helped to build the personnel database she later attacked.

While claiming that she had not intended the computer system to crash, Singla did plead guilty to unauthorized access and deletion of files. She was sentenced to 5 months in prison, ordered to pay $35,000 in restitution to the Coast Guard, and placed on several months of home detention. Singla told a reporter, “I wanted to get even with them. I was frustrated and depressed because no one listened to my complaints of sexual harassment in the workplace. I did delete information, but I did not crash the system” (“Coast Guard,” 1998).

Many offenses resulted from unauthorized use of a defense system for personal convenience.

There are several accounts in the Insider Database of government systems being used by service members or employees for personal pleasure or convenience; in most of these cases, however, the offenders had no malicious or criminal intent. In at least seven events, system administrators or their assistants set up unauthorized storage directories on government servers for bootlegged game software, music, or pornography collections. In the following example, however, a cadet at the U.S. Air Force Academy was accused not only of misusing the academy system for personal chat room activity, but of using it as a platform from which to launch a criminal attack on companies in the private sector (“Academy Jurors,” 1999).

A second-year cadet, Christopher Wiest, along with other cadets, was ordered to stop using Internet chat rooms out of security concerns. Several months later Wiest resumed active participation in chat rooms with the assistance of several cyber-friends not connected to the Air Force. He in fact set up an Internet Relay Chat (IRC) server on his PC that was connected to the USAF network. Unfortunately, his “friends” were engaged in extensive hacking around the Internet and involved Wiest in their activities. Wiest claimed that at the time he had no idea what these people were doing. In November 1997, the Air Force Office of Special Investigations searched Wiest’s room and seized his computer. Wiest was initially charged with using the Air Force system to illegally enter three companies and cause $80,000 in damage. Two of these charges were later dropped. Prosecutors argued that Wiest used the Air Force platform to connect to the Internet, and then hacked into company systems, erased data, and planted destructive programs.

In March 1999, Wiest was found guilty by court-martial of using an Air Force system to break into and damage a private company’s computer system, causing $6,000 in damage. He was dismissed from the academy and the service (“Air Force Academy,” 1999).

An unexpectedly high frequency of offenses can be categorized as inside hacking.

While computer hacking is generally not thought of as an insider offense committed by trusted employees, the data revealed an unanticipated high frequency of insider hacking, that is, the use of a government platform to gain unauthorized access either to another defense system or to a system outside of government. Sixteen percent of the events concerned the former, and 11% involved hacking into a private-sector system. A particularly serious example of this type of case was reported in the press as attempted espionage; however, the court convicted the offender only of unauthorized use of government property.

PFC Eric Jenott, assigned to duty as a communication switch operator at Ft. Bragg, North Carolina, since June 26, 1996, was charged on August 21st of that year with espionage, damaging military property, larceny, and unauthorized access to government computer systems (“Court-martial,” 1996). Specifically, Jenott was accused of providing a classified system password to a Chinese citizen located at Oak Ridge, Tennessee. Prosecutors contended that Jenott was attempting to gain favor with the Chinese government because he wanted to defect to China. According to the accused soldier, the password was not classified as secret, and the charges related to his penetration of defense computer systems stemmed from his attempt to be helpful when he discovered a weakness in an encoded Army system. However, he did admit to having been an active hacker for several years before joining the Army, during which period he broke into Navy, Air Force, and Defense Secretary systems.

On January 3, 1997, the court found Jenott not guilty of espionage, but guilty of damaging the security of an Army encoded system, exceeding authorized access to a government system, and transmitting a code with the intent to damage. He was given a Bad Conduct Discharge, reduced to the lowest enlisted rank, and sentenced to 3 years of imprisonment, less the 6 months already served in pre-trial confinement (“Jury finds,” 1996).

Few offenders were aware that their unauthorized activities could be easily monitored.

Systems abuse, particularly where it involves electronic communication across organizations, is frequently detected by network monitoring systems that alert law enforcement services to anomalies. These monitoring systems, or computer emergency response teams (CERTs), routinely monitor traffic in and out of defense networks for intrusions and attempted intrusions. Thirty-three percent of the events recorded in the database resulted from CERT notification; twenty-three percent were detected by internal monitoring of a network. 4
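The kind of internal monitoring referred to here can be pictured with a minimal, purely illustrative sketch. This is not the CERT or ASIMS software discussed in this paper; the field names, the baseline length, and the three-sigma threshold are all assumptions made for the example:

    # Illustrative sketch only -- not the ASIMS/CERT systems described in
    # this paper. Field names and the 3-sigma threshold are assumptions.
    from statistics import mean, stdev

    def flag_anomalous_transfers(history_mb, today_mb, min_obs=10, sigmas=3.0):
        """Flag a user whose outbound transfer volume today far exceeds
        the mean of that user's own past daily volumes."""
        if len(history_mb) < min_obs:
            return False  # not enough baseline data to judge
        mu, sd = mean(history_mb), stdev(history_mb)
        return today_mb > mu + sigmas * max(sd, 1.0)

    # Example: a user who normally moves ~20 MB/day suddenly moves 900 MB.
    baseline = [18, 22, 19, 25, 21, 17, 23, 20, 24, 19]
    print(flag_anomalous_transfers(baseline, 900))  # True -> alert the manager

The point of the sketch is simply that per-user baselines make unusual file transfers mechanically detectable, which is why so many of the offenders described here were caught.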

One of the most disconcerting aspects of the misuse of defense information systems is seen in several cases in which access to a government system was given to unauthorized persons by an insider. In the following example, there is little evidence that the offender had malicious intent against his organization or the system itself; rather, he wanted to use the government server for personal convenience.

In May 1999, the Air Force CERT detected suspicious connections from Israel and the Netherlands into a computer located at a U.S. Air Force base. The recipient of these communications was identified as a contractor employed as the system administrator. The administrator admitted that he had reconfigured the computer and created accounts for two unknown individuals so that they could trade pirated computer gaming software, and that he had copied game software to a compact disc using his government computer. He also stored unauthorized software on system media.

This was not a case of ignorance of regulations. The employee acknowledged that he had completed the required USAF Computer Security and Awareness Training Program and was aware that these activities were not official. His access to USAF information systems was removed by his employer, and he was dismissed.

Of particular concern is the frequency of cases in which the insider offender was a contracted IT professional, and sometimes a foreign national.

The outsourcing of IT support to manage critical defense networks is increasingly common due to the scarcity of service members with the necessary technical training. The downside to this trend is that unless the employee requires access to classified information, he or she is unlikely to receive any personnel security vetting prior to employment. The government may have little or no control over who is employed by a primary contractor under a sub-contract to service or maintain a sensitive information system. Twenty percent of the offenders identified in the database were civilian employees under contract to a Defense organization.

4 Automated Security Insider Monitoring Systems (ASIMS) alert systems managers to anomalous actions by users, such as unusual file transfers and hacking within the system.



The most significant event of this type is the compromise of a highly sensitive but unclassified Air Force aircraft maintenance and inventory database that took place in 1996 (Caruso, 2003). 5 In December of that year, a system administrator at Wright-Patterson Air Force Base discovered a security breach in the operations side of the Air Force Reliability and Maintainability Information System (REMIS), which tracks all aircraft and weapon systems. The breach was soon traced to Zhangyi “Steven” Liu, a subcontractor employee, one of 11 young Chinese nationals who had recently been brought over to work on the software development side of the database system. Somehow Liu was able to access a “super super password file” that gave him access to the operational database and the power to change or delete any file in the system. He and two other coworkers proceeded to download unauthorized files to a personal directory that could have been accessed by Internet users. It was never established whether these data were transmitted outside the country, or what his true motivations were in breaking into the system. The prime contractor was forced to spend $350,000 to examine the code and database to ensure that no malicious code had been installed by Liu or his coworkers.

In March 1997, Liu pleaded guilty to two counts of gaining illegal access to the $148 million REMIS. He later withdrew his plea and was found guilty on two misdemeanor counts by a jury. He received a sentence of 4 months’ confinement, 1 year of supervised work-release, and a fine of $4,000 (“Chinese national,” 2000).

Summary and Conclusions

The following conclusions are based on the analysis of information in the Insider Database and the case studies described above. They have been reinforced by a parallel study of insider events in the private sector, sponsored by PERSEREC, that has provided a number of additional insights into the patterns of activity associated with these attacks on sensitive information technology resources. 6

• Technical security measures offer minimal protection from abuse when the offender is a systems administrator or has some level of administrative access to the system.

• Interpersonal relations within the workplace and the organization’s climate are very important for understanding IT systems misuse. In almost a quarter of the cases there was evidence of prior hostility in the workplace involving the offender and, usually, a supervisor.

• Some of these events could have been avoided by better security education. Personnel need to know the rules concerning use of the system, what is and is not acceptable use of that system, and what the consequences are for stepping over the line.

• Both enhanced personnel security and technical deterrents should be applied to minimize the threat posed by angry or indifferent personnel who have legitimate access to defense information systems.

5 This case summary is based on information contained in a recent thesis by Lt. Valerie L. Caruso at the Air Force Institute of Technology and on media reports of that time.

6 Undertaking this project for PERSEREC is Dr. Eric Shaw, Consulting & Clinical Psychology, Ltd. Shaw is focusing on prosecuted cases in which an insider has attacked a corporate system that is related to the critical national infrastructure. A report on this work is forthcoming in 2004.


• Many offenses occurred after discharge or transfer to a new duty station (within 60 days after separation), indicating the need for greater attention to discharge security and personnel planning.

• Several attacks involved employee remote access to the corporate system, indicating a need for a review of safeguards covering this practice.

• In several cases examined, a lack of personnel and/or security policies can be cited as having contributed to the event.

• In some cases, evidence of disgruntlement or performance problems was visible to management well in advance of an attack. Delay in intervening in the underlying personnel problem contributed to the episode or failed to divert the subject from his destructive path.

The results of these studies of insider events thus far indicate that there may be significant “gaps” in policies and practices designed to reduce the risk of insider events or to detect and manage this risk when it exists. While work continues in this area, it is not too early to offer specific recommendations that might prevent the inoperability or impairment of a defense information system at a critical time. The importance of continuous network monitoring cannot be stressed enough. This should be reinforced with the articulation and enforcement of clear policies regarding the use or misuse of information systems. On the non-technical side, both administrators and end-users require security awareness training appropriate to their use of the information system. And lastly, while supervisors and managers must deal with disgruntlement and interpersonal conflict in the workplace in a timely fashion, it is essential for defense organizations to vet or screen IT job applicants for evidence of past systems abuse, hacking, or illegal behavior.

References

Academy jurors get lesson in hacking during cadet’s trial. (March 16, 1999). Colorado Springs Gazette Telegraph.

Air Force Academy dismisses cadet for hacking into computer. (March 14, 1999). Chicago Tribune.

Caruso, V. L. (2003). Outsourcing information technology and the insider threat. Dayton, OH: Graduate School of Engineering and Management, Air Force Institute of Technology.

Chinese national gets sentence of 4 months. (January 15, 2000). Dayton Daily News.

Coast Guard beefs up security after hack. (July 20, 1998). Computer World.

Court-martial to begin in computer spy case. (December 9, 1996). San Diego Union-Tribune.

Fischer, L. F., Riedel, J. A., & Wiskoff, M. F. (2000). A new personnel security issue: Trustworthiness of defense information systems insiders. Proceedings of the 2000 IMTA Conference, International Military Testing Association.

Hacker gets time in prison; former airman downloaded porn. (July 2, 1999). Anchorage Daily News.

Jury finds Ft. Bragg soldier innocent of espionage; computer fraud, property damage charges draw 3-year sentence. (December 23, 1996). Durham Herald-Sun.

Woman gets five months for hacking; tampering ruined Coast Guard files. (June 20, 1998). Washington Post.

Ten Technological, Social, and Economic Trends That Are Increasing U.S. Vulnerability to Insider Espionage

Lisa A. Kramer
Defense Personnel Security Research Center

Richards J. Heuer, Jr.
RJH Research/Defense Personnel Security Research Center

Kent S. Crawford
Defense Personnel Security Research Center

Introduction

Permanent and temporary employees, vendors, contractors, suppliers, ex-employees, and other types of “insiders” are among those most capable of exploiting organizational assets at greatest expense to U.S. interests. Due to their knowledge of the public agencies and private companies that employ them, their familiarity with computer systems that contain classified and proprietary information, and their awareness of the value of protected information in the global market, insiders constitute a significant area of vulnerability for national security (Fialka, 1997; Freeh, 1996; Nockels, 2001; Shaw, Ruby, & Post, 1998; Thurman, 1999; Venzke, 2002). An estimated 2.4 million insiders currently have access to classified information, and, while difficult to approximate, insiders with access to proprietary and sensitive technological information are likely to number in the tens of millions (National Security Institute, June 2002). While the deliberate compromise of classified or proprietary information to foreign entities is a relatively rare crime, even one case of insider espionage can cause extraordinary damage to national security.

Because espionage is a secret activity, we cannot know how many undiscovered spies are currently active in American organizations, or what the future will bring in terms of discovered espionage cases. Nonetheless, we are not entirely in the dark when assessing the magnitude of the insider espionage threat. We can draw inferences from relevant changes in technology, society, and the international environment that affect opportunity and motivation for espionage.

In exploring the current and future prevalence of insider espionage, this study employs a methodology similar to that used in epidemiological research, where scientists explain or forecast changes in the prevalence of certain diseases within various populations. The medical researcher knows that heart disease is associated with age, weight, amount of exercise, blood pressure, diet, stress, genetics, and other factors, and can thus estimate the prevalence of heart disease by analyzing changes in these variables. Similarly, because we know that certain factors influence the likelihood that insider espionage will occur, we can forecast changes in the prevalence of insider espionage by examining changes in these factors. This study examines U.S. vulnerability to insider espionage by exploring ten technological, social, and economic trends that affect opportunity and motivation for spying.
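The logic of this epidemiological analogy can be expressed as a short, purely illustrative calculation. The factor names, elasticities, and baseline rate below are invented for the example and are not estimates from this study; the sketch only shows how upward movement in known risk factors translates into a higher forecast prevalence:

    # Purely illustrative sketch of the forecasting logic described above.
    # Factor names, elasticities, and the baseline rate are invented
    # assumptions, not findings of this study.

    baseline_rate = 1.0e-5   # assumed baseline prevalence of insider espionage

    # Relative change in each risk factor (1.0 = no change) and an assumed
    # elasticity describing how strongly prevalence responds to that factor.
    factors = {
        "access_to_digital_information": (1.5, 0.6),
        "foreign_market_demand":         (1.3, 0.4),
        "financial_distress":            (1.2, 0.5),
    }

    forecast = baseline_rate
    for name, (relative_change, elasticity) in factors.items():
        forecast *= relative_change ** elasticity

    print(f"Baseline: {baseline_rate:.2e}, forecast: {forecast:.2e}")
    # Every factor trending upward implies a higher expected prevalence,
    # mirroring the finding below that no countervailing trend was identified.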

Opportunity for espionage consists of access to classified or proprietary information that can be exchanged for money or other benefits, access to foreign entities interested in obtaining this information, and means for transferring this information to foreign recipients. Motivation, broadly defined, is a feeling or state of mind that influences one’s choices and actions. While motivation for espionage results from a complex interaction between personality characteristics and situational factors (Crawford & Bosshardt, 1993; Eoyang, 1994; Sarbin, 1994; Parker & Wiskoff, 1991; Shaw, Ruby, & Post, 1998; Timm, 1991), this study focuses primarily on the latter.

Findings of this study suggest that the information revolution, global economic competition, the emergence of new and non-traditional intelligence adversaries, and other changes in the domestic and international environment have converged to create unusually fertile ground for insider espionage: greater numbers of insiders have the opportunity to commit espionage and are more often encountering situations that can provide motivation for doing so.

1. Technological advances in information storage and retrieval are dramatically improving insiders’ ability to access and steal classified and proprietary information.

2. The global market for protected U.S. information is expanding. American insiders can sell more types of information to a broader range of foreign buyers than ever before.

3. The internationalization of science and commerce is placing more employees in a strategic position to establish contact with foreign scientists, businesspersons, and intelligence collectors, and to transfer scientific and technological material to them.

4. The increasing frequency of international travel is creating new opportunities for motivated sellers of information to establish contact with, and transfer information to, foreign entities. Foreign buyers have greater opportunity to contact and assess the vulnerabilities of American personnel with access to valuable information.

5. Global Internet expansion is providing new opportunities for insider espionage. The Internet allows sellers and seekers of information to remain anonymous and provides means by which massive amounts of digitized material can be transmitted to foreign parties in a secure manner.

6. Americans are more vulnerable to severe financial crisis due to aggressive consumer spending habits and other factors. Financial problems are one of the primary sources of motivation for insider espionage.

7. The increasing popularity of gambling and the prevalence of gambling disorders suggest that greater numbers of insiders will commit workplace crimes such as espionage to pay off debts and to sustain gambling activities.

8. Changing conditions in the American workplace suggest that greater numbers of insiders may become motivated to steal information from employers to exact revenge for perceived mistreatment. Because organizational loyalty is diminishing, fewer employees may be deterred from committing espionage by a sense of obligation to the agencies and companies that employ them.

9. More insiders now have ethnic ties to other countries, communicate with friends and family abroad, and interact with foreign businesspersons and governments. Foreign connections provide insiders with opportunities to transfer information outside the U.S., and foreign ties can provide motivation to do so.

10. More Americans view human society as an evolving system of ethnically and ideologically diverse, interdependent persons and groups. While this is obviously beneficial, it is also possible that some insiders with a global orientation to world affairs will view espionage as morally justifiable if they feel that sharing information will benefit the “world community” or prevent armed conflict.

Despite the significance of individual characteristics in determining which specific insiders will commit espionage, if more insiders are encountering situations that can provide motivation and opportunity for espionage – as the findings of this study suggest – it is likely that the crime of insider espionage will occur more frequently. In our research we were unable to identify a single countervailing trend that will make insider espionage more difficult or less likely in the immediate future. These findings suggest that increased investment of government resources to counteract the insider espionage threat is warranted.

References

Crawford, K., & Bosshardt, M. (1993). Assessment of position factors that increase vulnerability to espionage. Monterey, CA: Defense Personnel Security Research Center.

Eoyang, C. (1994). Models of espionage. In T. R. Sarbin, R. M. Carney, & C. Eoyang (Eds.), Citizen espionage: Studies in trust and betrayal (pp. 69-91). Westport, CT: Praeger.

Fialka, J. (1997). War by other means: Economic espionage in America. New York: W. W. Norton and Company.

Freeh, L. (1996). Statement of Louis J. Freeh, Director, Federal Bureau of Investigation, before the House Judiciary Committee Subcommittee on Crime. Retrieved December 2002 from http://www.fas.org/irp/congress/1996_hr/h/h960509f.htm

National Security Institute. (June 2002). U.S. security managers warned to brace for more terrorism, espionage. National Security Institute Advisory, June 2002.

Nockels, J. (2001). Changing security issues for government. (http://www.law.gov.au/SIG/papers/nockels.html)

Parker, J., & Wiskoff, M. (1991). Temperament constructs related to betrayal of trust. Monterey, CA: Defense Personnel Security Research Center.

Sarbin, T., Carney, R., & Eoyang, C. (Eds.) (1994). Citizen espionage: Studies in trust and betrayal. Westport, CT: Praeger.

Shaw, E., Ruby, K., & Post, J. (1998). The insider threat to information systems. Security Awareness Bulletin, 2-98. Department of Defense Security Institute.

Thurman, J. (1999). Spying on America: It’s a growth industry. Christian Science Monitor, 80, 1.

Timm, H. (1991). Who will spy? Five conditions must be met before an employee commits espionage. Here they are. Forewarned is forearmed. Security Management, 49-53.

Venzke, B. (2002). Economic/industrial espionage. Retrieved June 19, 2002 from http://www.infowar.com/class


Development of a Windows-Based Computer-Administered Personnel Security Screening Questionnaire 1

Martin F. Wiskoff
Northrop Grumman Mission Systems/Defense Personnel Security Research Center

Introduction

The Defense Personnel Security Research Center (PERSEREC) developed a computer-administered questionnaire for screening enlisted applicants to sensitive Navy occupations that has been used operationally by the U.S. Navy Recruiting Command since 1996. The goals of the questionnaire, called the Military Applicant Security Screening (MASS 3.0), are to:

1. Reduce the number of Navy enlisted applicants who are processed for security clearances and subsequently found ineligible.

2. Identify these applicants early in the accessioning process - at the Military Entrance Processing Stations (MEPS) - before they are accepted into high-security Navy jobs.

3. Reduce the number of unfilled school seats and jobs due to the later ineligibility of enlisted personnel assigned to these occupations.

4. Develop a more flexible mode for administering personnel security screening items and collecting more detailed information.

A detailed description of the MASS system, including development of the questionnaire and the manner in which it is operationally administered, is contained in Wiskoff et al. (1996) and Wiskoff and Zimmerman (1994). As stated in Wiskoff et al., “the MASS questionnaire inquires about the following areas of security concern: (1) alcohol consumption; (2) allegiance; (3) drug involvement; (4) emotional and mental health; (5) financial responsibility; (6) foreign travel and connections; (7) law violations; (8) personal conduct; and (9) security issues. These areas, and the specific questions within the areas, were developed by reviewing DoD security guidelines, evaluating existing paper and pencil security questionnaires and discussing specific issues to be included with security and legal professionals.”

“Each applicant for a sensitive rating is individually administered the MASS questionnaire by a Navy classifier. The system includes a decision aid that automatically informs the classifier whether the information provided by the applicant is disqualifying or potentially disqualifying for the rating being considered, or whether it requires that a waiver be obtained to allow the applicant to enter the Navy. This decision aid, appearing as a flag, is triggered whenever an applicant response meets criteria for one or more of these situations. The rules for the decision aid were established by linking all possible responses to MASS questions to criteria contained in the Navy Recruiting Manual concerning acceptance into ratings and into the Navy.”

1 MASS 4.0 was demonstrated as part of this presentation.



Requirement

Inspection of MASS questionnaires completed since 1996 indicates that considerable numbers of applicants self-disclose information that needs to be reviewed before a decision can be reached on whether to accept them into a high-security occupation. In addition, according to Navy classification personnel who administer MASS, its very presence leads potential applicants with serious derogatory information in their backgrounds to self-select out of these sensitive positions.

However, in recent years the program has begun to show its age. MASS 3.0 was programmed in Turbo Pascal and designed for the IBM 286 computers that were available at the MEPS in 1996. As newer computers have replaced the 286s, some incompatibility issues have arisen in running MASS that have required temporary fixes. Difficulties have arisen at some MEPS locations in printing MASS interview summaries, and there has been dissatisfaction with the inability to store and retrieve the results of previous applicant interviews.

In the summer of 2000 a survey was conducted of MASS classifiers at the MEPS to determine changes that would facilitate its use (Reed, 2000). The primary recommendation was the need to upgrade the platform to a Windows-based one. Other desirable features, according to those who responded, were:

1. Quicker MASS completion time. MASS 3.0 takes at least 20 minutes even for applicants with little to report, and can take as long as 45 minutes to an hour.

2. Error-reducing features such as drop-down lists.

3. Online help such as popup definitions.

4. Flexibility in designating Navy ratings when the system matches applicant responses to Navy Recruiting Manual criteria.

5. Increased detail in the printed interview report, such as including the full question asked along with the responses.

6. Storage and easy retrieval of previous MASS interviews to facilitate re-interviewing applicants when they return to the MEPS for final processing from the Delayed Entry Program.

7. Enabling printing of the interview report from a network printer.

8. Enabling electronic forwarding of the interview report to the office responsible for providing guidance on whether to continue processing the applicant.

Based on the results of the survey and subsequent discussions, a request was received from Navy Recruiting Command in February 2001 to develop a Windows-based version of MASS that would incorporate the field recommendations and add other features to enhance the screening process.

Development of MASS 4.0

The MASS 4.0 design addressed all of the field recommendations. Questionnaire administration was conceptualized as a three-stage procedure. This resulted in a streamlined questionnaire containing a set of 20 first-level questions that cover the 9 areas of security concern included in MASS 3.0 (see the introduction), plus a newer area, “information technology systems.” The number of first-level questions in each security area is shown in column 2 of Table 1.


Table 1
MASS 4.0 Areas of Inquiry

Security Area                              First-Level      Second-Level
                                           Questions (N)    Questions (N)
Alcohol consumption                              1                7
Allegiance - Espionage                           1                3
Allegiance - Other                               1                6
Drug involvement - Marijuana                     1                1
Drug involvement - Other                         1               11
Emotional/mental health - Treatment              1                4
Emotional/mental health - Suicide                1                2
Financial responsibility - Problems              1               18
Financial responsibility - Debts                 1                2
Foreign travel/connections - Contact             1               15
Foreign travel/connections - Other               1                8
Information technology                           1                8
Law violations - Arrested/charged                1               53
Law violations - Vehicle related                 1               18
Law violations - Detained                        1                1
Law violations - Civil                           1                3
Personal conduct - School/job                    1                8
Personal conduct - Other                         1                4
Security issues - Denied clearance               1                2
Security issues - Problems                       1                4

For example, the first-level question in the Law violations (Vehicle related) area is:

• Have you ever been cited, arrested or charged by civilian or military law enforcement officials for any vehicle related violations (e.g., improperly licensed or unregistered vehicle, operating an unsafe vehicle, driving without a license, speeding, hit and run, DUI)?

If a positive response is received to that question, the program presents 18 second-level questions to determine the nature of the violation, e.g., hit and run or DUI. The number of second-level questions by security area is displayed in column 3 of Table 1.


Finally, third-level questions would be asked to obtain details of the incident, such as the following for “hit and run”:

How many times have you been cited, arrested or charged for hit and run?
Why were you cited, arrested or charged?
Where and when did this occur?
Was the offense a felony?
What was the final outcome of the case?
Were you given a jail or prison sentence?...length of sentence
Were you given any other punishment?...nature of punishment
When did your jail term or other punishment end?

Upon completion of all questions, the classifier selects the Navy rating being considered for the applicant, and flags are shown indicating possible issues that might arise during a background investigation. The MASS 4.0 program, in addition to containing the MASS 3.0 linkage of applicant responses to the Navy Recruiting Manual guidelines, relates the responses to the Adjudicative Guidelines for Determining Eligibility for Access to Classified Information (Director of Central Intelligence, 1998). These Guidelines are used by defense and intelligence community adjudicative agencies in making determinations on whether to grant a security clearance. The classifier is advised, depending on the nature of the flags, to contact the appropriate authorities for permission to proceed with processing the applicant. A final step is to print a form documenting the results of the interview, which is then signed by both the classifier and the applicant.
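The three-level branching and the flag-based decision aid can be pictured with a short sketch. This is not the actual MASS 4.0 code, whose rules are tied to the Navy Recruiting Manual and the adjudicative guidelines; the questions, the felony rule, and the data structures below are simplified stand-ins:

    # Simplified illustration of MASS-style three-level branching and
    # flagging. Not the MASS 4.0 implementation; questions and the flag
    # rule are invented stand-ins for the operational criteria.

    FIRST_LEVEL = {
        "vehicle": "Have you ever been cited, arrested or charged for any "
                   "vehicle-related violation?",
    }
    SECOND_LEVEL = {"vehicle": ["DUI", "hit and run"]}  # abbreviated list
    THIRD_LEVEL = ["How many times?", "Where and when did this occur?",
                   "Was the offense a felony?", "What was the final outcome?"]

    def interview(ask):
        """Run the branching interview; `ask` poses a question, returns a reply."""
        flags = []
        for area, question in FIRST_LEVEL.items():
            if ask(question).lower() != "yes":
                continue  # negative first-level answer: skip deeper levels
            for violation in SECOND_LEVEL[area]:
                if ask(f"Were you ever cited for {violation}?").lower() == "yes":
                    details = {q: ask(q) for q in THIRD_LEVEL}
                    # Illustrative flag rule: any felony is potentially
                    # disqualifying and must be referred for review.
                    if details["Was the offense a felony?"].lower() == "yes":
                        flags.append((area, violation, "potentially disqualifying"))
        return flags

    # Example with canned answers instead of a live classifier session:
    canned = {"Was the offense a felony?": "yes"}
    print(interview(lambda q: canned.get(q, "yes")))

The design point the sketch captures is that negative first-level answers keep the interview short, while positive answers drill down only as far as needed to trigger or clear a flag.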

MASS Evaluation and Implementation

MASS 4.0 was tested and evaluated at 4 MEPS during May and June 2003 as a replacement for MASS 3.0. All classifiers who used the program with applicants agreed that MASS 4.0 should be made operational at all MEPS. Following some additional minor programming changes, the system was delivered to Navy Recruiting Command and implemented nationwide in September 2003.

Within the next year we plan to improve some of the MASS 4.0 screens without changing the basic nature of the program. Perhaps the most important future modification will be the capability to electronically capture applicant responses for analysis. This will permit us to establish a database of responses that could be related to future personnel actions, such as whether the applicant was found not acceptable during a security interview at recruit training, or did not receive a security clearance after being processed for a background investigation.

References

Director of Central Intelligence. (1998). Adjudicative guidelines for determining eligibility for access to classified information (DCID 6/4, Annex C, July 2, 1998). Washington, DC: Author.

Reed, S. C. (2000). Unpublished analyses. Monterey, CA: Defense Personnel Security Research Center.

Wiskoff, M. F., Zimmerman, R. A., & Moore, C. V. (1996). Developing and implementing a computer-administered personnel security screening questionnaire. Paper in symposium, Personnel Security in the Post-Cold War Era. Proceedings of the 38th Annual Military Testing Association Meeting. San Antonio, TX.

Wiskoff, M. F., & Zimmerman, R. A. (1994). Military Applicant Security Screening (MASS): Systems development and evaluation (PERSEREC Technical Report 94-004). Monterey, CA: Defense Personnel Security Research Center.



OCCUPATIONAL ANALYSIS APPLIED FOR THE PURPOSE OF DEFINING SELECTION CRITERIA FOR NEW MILITARY OCCUPATIONAL SPECIALTIES IN THE ARMED FORCES OF THE REPUBLIC OF CROATIA

Tomislav Filjak, Ingrid Cippico, Nada Debač, Goran Tišlarić, Krešimir Zebec
MINISTRY OF DEFENCE OF THE REPUBLIC OF CROATIA
Stančićeva 6, 10 000 Zagreb, Croatia

ABSTRACT

The decision on all-inclusive re-organisation and downsizing of the Armed Forces of the Republic of Croatia, made in early 2000, also envisaged a new military specialties structure. The demands included a reduced number of specialties, less specialised duties, and NATO compatibility. Within each branch and service, one expert was assigned the new classification of specialties, which he performed in consultation with branch (service) specialists. Each expert was assisted by a psychologist and a physician, who could conduct occupational analysis to serve as background for more radical modification of previous specialties, if so required.

The definition of the new specialty structure was followed by an occupational analysis aimed at defining the entry criteria (psychological, physical, medical) for each individual specialty. The analysis was based on qualitative and quantitative analysis of data collected by means of a questionnaire administered to a group of experts. The version of the questionnaire used was one adapted and tested through previous job-analysis assignments, and it comprised the psychological, the physical and the medical aspects of a duty. One military psychologist per branch (service) was tasked with administering the questionnaire and analysing the data for all specialties within it, in co-operation with the specialty expert and the physician. They had to “defend” the analysis results before the team leading the project. The procedure “bequeathed” us a “manual” containing the job analysis results, which will be transposed into future regulation books on selection and similar military documents.


INTRODUCTION

The Croatian armed forces came into existence with Croatia’s fight for independence in 1991; following the war, the war-time military no longer corresponded to the new security environment of the Republic of Croatia. Thus, in 1996 the Croatian Armed Forces underwent a first re-organisation, which still did not accommodate the new exigencies. Therefore, early in 2000 a new, radical reform, entailing major cuts, was launched and is still under way. Its extent is best illustrated by the reduction figures: from 50,000 members in late 2002 to a projected 22,000 active personnel (plus 4,000 conscripts and 3,000 civilian employees) by the year 2005. The Armed Forces, as envisaged, are to be manned largely by career personnel (as much as 80%). Moreover, a new national security strategy and defence strategy envisage new missions for the Croatian Armed Forces, among them participation in international operations (Croatian observers in the UN missions in Sierra Leone, Ethiopia and Eritrea, Western Sahara and Kashmir, and in ISAF).

The re-organisation and re-assignment entail a new military specialties structure to match the reduced manpower, the career military, and the altered military duties.

EXIGENCIES FOR NEW SPECIALTIES

The war-time Croatian military was a large force compared to the overall population, at one moment comprising 240,000 members. It had been organised for traditional warfare, and its very diverse specialties structure (e.g., 260 soldier specialties) corresponded to that aim. It could not support a career military or the development of new capabilities. A number of previous specialties disappeared naturally, and others altered as a result of the changing military, even before the re-organisation project. In order to match the specialties system with the military exigencies, well-defined criteria were set for the new specialties.

The exigencies for the new system were as follows:

- reduced number of specialties (compared to the prior situation)
- entry to a specialty is achieved in enlisted soldier or officer status (NCO status is excluded, as NCOs develop from enlisted soldiers)
- 14-week hands-on training (enlisted soldiers)
- less specialised duties (an increased number of duties contained in a single specialty)
- compatibility with the NATO system of specialties and classification



CLASSIFICATION OF DUTIES INTO NEW SPECIALTIES

The first step in the new specialties system was to define branches and services at the Armed Forces level, and to define specialties within each individual branch and service. This was executed without previous empirical studies, in view of the brief term allowed for development of the new system.

The inventory of branches and services was agreed at the senior-authorities level and was mostly based on the existing system and planned reforms. This was followed by the appointment of a Commission authorised to propose changes to the inventory and entrusted with co-ordination of the new specialties system. The Commission designated an expert for each branch and service, mostly an experienced and respected officer in the branch/service, who was then tasked with preparing a new inventory of specialties for the respective branch/service. He was encouraged to consult other experts of the branch. His “pool” also included a psychologist and a physician, who were counted on to do job analysis at his request in possibly vague situations and radical changes, to enable decision-making.

While for some of the branches the job was performed quickly and univocally, for others it took quite a long time, mostly in the case of new duties. Prior to deciding on the specialties structure, additional clarifications were requested regarding the projected scope, content and modality of operation. Again, as the debate on how to organise new domains may take some time and involves a number of subjects, the job is still not finished for some domains.

ENTRY CRITERIA - REQUIRED JOB ANALYSES

With the new specialties structure defined, job analysis was conducted for each specialty considered.

Analysis objective

The objective of the analysis was to define the entry criteria (psychological, physical and medical) that candidates for a respective specialty are expected to meet.

Analysis modality

As mentioned above, each branch/service was assigned a task group made up of the respective branch/service leader-expert, a psychologist and a physician - all of them, as a rule, with lengthy service in the branch/service and experience with the duties analysed.

The analysis was based on quantitative and qualitative processing of data compiled by means of a questionnaire administered to a group of experts.

Military psychologists were tasked with the questionnaire administration and quantitative analysis of the results. The final step was qualitative analysis of the data, which was done by all the members of the group, and the drafting of a final report.

Questionnaire

The job analysis used the adapted and previously tested version of the questionnaire, the “VSSp-1”, employed previously for purposes of this kind. It combines the psychological, the physical and the medical aspects of duties. In addition to basic data (bio data, unit, date), the questionnaire includes the following units:

- duty description; list of tools/instruments utilised; protective gear used
- general features (physical strain, exposure of particular senses, frequency of particular movements, working conditions)
- relevance of particular senses; use of aids
- social working conditions (co-workers)
- relevance rating for 24 different abilities
- relevance rating for 20 personality traits
- main cause of underperformance in a specialty
- major incidence of injuries, accidents and occupational diseases in the specialty

The questionnaire was based on different schemes for job analysis employed in Croatia over the past 40 years. The list of abilities and personality traits rated by relevance was composed based on descriptions derived from classical theories in the field (Thurstone, Burt, Vernon and Cattell for abilities; Cattell, Eysenck and the Big Five for personality). The original questionnaire emerged in 1993 and, having been used several times for specific purposes, underwent significant structural and substantive revisions and adaptations. In substance, the questionnaire is now a combination of job- (behaviour-) oriented and psychologically (required-characteristics) oriented approaches. Previous testing demonstrated the intelligibility of the questionnaire to raters and the univocality of the data obtained.

Administration of the questionnaire

As stated above, each branch/service had its own psychologist to conduct administration of the questionnaire and quantitative analysis of the results for each specialty within it. The psychologists consulted the branch expert-leaders and physicians.

Each specialty was assigned no fewer than 30 experts - 1/3 officers and 2/3 NCOs and enlisted soldiers. Respondents were selected based on the following criteria:

a) years of experience in the duty
b) respectability (as judged by the expert-leaders and psychologists), and
c) assessed ability to provide credible answers to the questionnaire’s items (as judged by psychologists)

The respondents undoubtedly held the competence for rating the new specialties, as the specialties were derived from previous ones (through repetition, merging or separation) with which they were more than familiar.

Selection of respondents was followed by compilation of the ratings obtained in individual and group (up to 10 respondents) administrations; that was the psychologists’ task.
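As a concrete, purely hypothetical illustration of the quantitative processing described here, the sketch below averages expert relevance ratings per ability and retains, as candidate entry criteria, the abilities whose mean rating exceeds a cutoff. The 1-5 scale, the 3.5 cutoff, and the ability names are assumptions for the example, not details of the VSSp-1 procedure reported above:

    # Hypothetical sketch of aggregating expert relevance ratings for one
    # specialty. The 1-5 scale, the 3.5 cutoff, and the ability names are
    # illustrative assumptions, not the VSSp-1 procedure itself.
    from statistics import mean

    def candidate_criteria(ratings_by_expert, cutoff=3.5):
        """ratings_by_expert: list of {ability: rating} dicts, one per expert.
        Returns abilities whose mean rated relevance meets the cutoff."""
        abilities = ratings_by_expert[0].keys()
        means = {a: mean(r[a] for r in ratings_by_expert) for a in abilities}
        return {a: round(m, 2) for a, m in means.items() if m >= cutoff}

    # Three (of the required thirty or more) experts rating three abilities:
    experts = [
        {"spatial orientation": 5, "verbal fluency": 2, "reaction speed": 4},
        {"spatial orientation": 4, "verbal fluency": 3, "reaction speed": 5},
        {"spatial orientation": 5, "verbal fluency": 2, "reaction speed": 4},
    ]
    print(candidate_criteria(experts))
    # -> {'spatial orientation': 4.67, 'reaction speed': 4.33}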

Analysis results

The report on the job analysis for each specialty was submitted to the Commission in charge of the project. This was followed by the groups’ presentation and discussion of the analysis results before the Commission, and by harmonisation and elaboration of the conclusions for certain specialties based on the discussion. Once agreed, the conclusions were integrated into a final report, which unified the entry psychological, physical and medical criteria for all the specialties considered.

The analysis to a certain extent affected the job/duty structure within a specialty, as some results revealed the unacceptability of the proposed job/duty structure within a branch/service, which then needed modifying.



CONCLUSIONS

The procedure described yielded a “manual” containing the conclusions of the job analysis, which will be incorporated into new selection regulations and related documents in the Armed Forces. The results obtained enable radical and swift re-organisation. However, the pace at which the procedure was conducted came partly at the expense of its quality.

We expect that in the time ahead the results, and “the lessons learned,” will serve as a basis for continuous job analysis that will bring detailed harmonisation of job substance, training and selection for the respective specialties.


USING DECISION TREE METHODOLOGY TO PREDICT ATTRITION WITH THE AIM

Wayne C. Lee
Department of Psychology
University of Illinois at Urbana-Champaign
603 East Daniel Street
Champaign, IL 61820
wlee@s.psych.uiuc.edu

Dr. Fritz Drasgow
Department of Psychology
University of Illinois at Urbana-Champaign
603 East Daniel Street
Champaign, IL 61820
fdrsagow@s.psych.uiuc.edu

This paper describes the Assessment of Individual Motivation (AIM) and efforts to use it to predict first-term attrition in the United States Army. This description provides context for the three papers presented in this symposium, including this one, in which the results of one investigation are presented. In this first investigation, a non-linear, “configural” approach to prediction is applied to examine whether we can improve on the linear methods used to determine the predictive validity of the AIM with respect to attrition within a 12-month interval.
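As a sketch of what such a non-linear, configural approach can look like in practice, a decision tree partitions recruits into configurations of scores rather than weighting scores additively. The feature names and data below are invented placeholders, not AIM scales or Army data; the paper’s actual analysis is described in the sections that follow:

    # Invented toy example of a configural (decision tree) attrition model.
    # Feature names and data are placeholders, not AIM scales or Army data.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Each row: two hypothetical temperament scores; label 1 = attrited.
    # Attrition here occurs only when BOTH scores are low -- an interaction
    # a single linear composite cannot represent.
    X = [[1, 1], [2, 1], [1, 2], [2, 2], [1, 5], [5, 1], [4, 4], [5, 5]]
    y = [1, 1, 1, 1, 0, 0, 0, 0]

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["score_a", "score_b"]))

    # The tree captures "high risk only when score_a is low AND score_b is low":
    print(tree.predict([[2, 1], [2, 5]]))  # -> [1 0]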

The AIM

Attrition is among the most studied of the organizationally relevant outcomes among personnel researchers. One estimate, as early as 1980, put the number of articles and book chapters devoted to attrition at between 1,500 and 2,000 (Muchinsky & Morrow, 1980). Certainly, the popularity of this topic is due in part to the high cost associated with attrition in organizations. Earlier research conducted by the U.S. Army Research Institute for the Behavioral and Social Sciences (ARI) with the Assessment of Background and Life Experiences (ABLE) suggested that temperament measures might indeed be good predictors of attrition in the U.S. Army. Unfortunately, as with many temperament measures, concerns regarding the potential effects of faking and coaching restricted the implementation of the ABLE in new-recruit selection (White, Young, & Rumsey, 2001).

Beginning in the mid-1990’s, this line of research was continued formally by ARI with<br />

the development of the AIM. This new measure was designed specifically to target first-term<br />

attrition while also being less susceptible to faking and coaching. The AIM is comprised of 27<br />

items in a forced-choice format measuring 6 constructs. While “27 items” may seem like a small<br />

number of items for any measure of multiple constructs, it is important to understand that each<br />

item is comprised of four descriptive stems. Each of these stems –108 in all—could very easily<br />

be presented as a single-item in Likert-type format. For each item in the AIM, two stems are<br />

worded negatively and two are worded positively –representing low and high levels of a<br />

particular construct, respectively, if a respondent endorses them.<br />

For each of the 27 items, the respondent is asked to indicate which of the four stems is "Most like me" and which stem is "Least like me." This item format results in four quasi-ipsative measurement opportunities, with each stem receiving a particular score. Each item is constructed such that each of the four component stems measures a separate construct. One of the primary reasons behind this item format and scoring scheme is to provide a measure that is less transparent, and thus less susceptible to faking and coaching. Research with this measure suggests that the goals of the AIM (predicting first-term attrition and being resistant to faking and coaching) are indeed met (White & Young, 1998).

With this evidence, the AIM has been in use operationally since early 2000 in a 3-year pilot program for non-high school diploma graduate recruits. These candidates are tested with the AIM and, if they do not have their General Education Development (GED, high school equivalency) certificate, are sponsored to complete a GED program and are then processed under the U.S. Army's Delayed Entry Program (DEP). This pilot program allows additional recruits, from a labor market which otherwise may be inaccessible, to enlist in the U.S. Army, while at the same time screening out potential recruits likely to leave the military soon after entry.

This screening of potential recruits is, in part, based on cutoffs associated with an "AIM Adaptability Composite Score," comprised of a portion of the items across the 6 content scales. Unfortunately, we cannot describe or discuss the six content scales or the Adaptability Composite any further without compromising the AIM and/or its usage. As such, we will refer to the six content scales as Scales A through F.

While the evidence from the previously mentioned research is enough to justify the use of the AIM in selection, such research based on traditional statistical approaches (e.g., development of the Adaptability Composite) may suffer from potentially limiting characteristics. For example, with approaches such as linear and logistic regression, the terms of the equation, or the weights used to determine a composite score based on these frameworks, act identically across cases (i.e., respondents). Only a small handful of prediction or classification approaches are sensitive to the profile of scale scores associated with each case, that is, approaches that may delineate separate mechanisms that lead to the same or similar outcomes. One such non-linear, "configural" method can be found in decision tree methodology.

Classification and regression trees

Decision tree methods can be used to predict any number of outcomes based on a set of predictor variables. The foundation of these methods lies in decision rules using predictor variables, arranged hierarchically, that split the data into smaller and smaller groups of increasing homogeneity with respect to some outcome. The hierarchical arrangement of splitting rules can be depicted graphically as an inverted tree, where:

• The "root" of the tree represents an initial split (first node) of the entire dataset based on a cutoff score associated with one variable
• "Branches" (internal nodes) depict additional splitting rules that further define the underlying relationship between the variables and the criterion by increasing the homogeneity of the resulting nodes
• "Leaves" (terminal nodes) represent predicted outcomes or levels of a criterion variable (e.g., those who stay with the U.S. Army and those who do not)
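To make the tree-growing procedure concrete, the following minimal Python sketch (ours, not part of the original study) grows and prints a small classification tree with the scikit-learn library; the data file and column names are hypothetical placeholders standing in for the six scale scores and the 12-month attrition flag.

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Hypothetical file with six AIM-like scale scores and a 12-month attrition flag.
    data = pd.read_csv("aim_scores.csv")  # columns: scale_a ... scale_f, left_12mo
    X = data[["scale_a", "scale_b", "scale_c", "scale_d", "scale_e", "scale_f"]]
    y = data["left_12mo"]                 # 1 = leaver, 0 = stayer

    # Grow a small tree; capping depth and leaf size is a crude stand-in for pruning.
    tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=100, random_state=0)
    tree.fit(X, y)

    # Print the hierarchy of splitting rules (root, branches, leaves).
    print(export_text(tree, feature_names=list(X.columns)))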


Classification and regression trees (CART; Breiman, Friedman, Olshen, & Stone, 1984) is one algorithm associated with this approach. Through brute force, CART examines all possible binary splits of the data (answers to "yes/no" questions) based on all of the predictor variables. It places the best split at the root of the tree and continues this process until no further splits can be made. Later, the resulting decision tree is "pruned" according to misclassification rates, user-determined preferences (e.g., the permitted number of cases in a terminal node), or by eliminating redundant nodes. Additionally, competing trees may develop depending on the nature of the data.

To assess classification accuracy, CART uses "v-fold cross-validation," in which the sample is divided into v subsamples; a decision tree is grown after combining v-1 subsamples, and classification accuracy is assessed using the hold-out subsample. This process is iterated so that each of the v subsamples is used as the hold-out sample exactly once. Classification accuracy is estimated as the average classification accuracy across the v hold-out subsamples.
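Continuing the sketch above (and reusing its hypothetical tree, X, and y), the v-fold idea with v = 10 might look as follows in scikit-learn:

    from sklearn.model_selection import cross_val_score

    # 10-fold cross-validation: each fold is held out once while a tree is grown
    # on the remaining nine folds; accuracy is then averaged across the folds.
    scores = cross_val_score(tree, X, y, cv=10, scoring="accuracy")
    print("mean cross-validated accuracy:", scores.mean())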

As mentioned earlier, CART may delineate separate "paths" that lead to the same outcome, identifying configural relationships in the data. Also, CART may reuse any number of variables in separate parts of the tree and thus may capture non-linear relationships. Below we describe an investigation that examined whether we could improve upon the Adaptability Composite in predicting attrition with the AIM.

Applying Decision Tree Methodology to AIM Data

Sample, data, and software

A file was created containing the AIM scale scores and a 12-month retention variable for 22,328 enlisted U.S. Army personnel. The data for this file came from the AIM Grand Research Database, managed by ARI and a contractor, the Human Resources Research Organization (HumRRO). This database is the source of much of the recent research surrounding the AIM; for this database, the AIM was administered to these personnel between 1998 and 1999 for research purposes only (Knapp, Heggestad, & Young, in preparation). The 12-month time interval was selected because it provided the CART 4.0 software package (Breiman, Friedman, Olshen, & Stone, 1997) with a sufficient number of respondents with which to "grow" a tree. The six content scales were used as input (predictor variables), and the 12-month attrition variable (criterion) was treated dichotomously (i.e., "stayers" and "leavers").

Results

The analysis yielded 39 trees ranging in complexity from two terminal nodes to a tree with 2,802 terminal nodes and a depth of 51 levels or tiers. However, the larger trees exhibited high rates of misclassification among the stayers (as much as 60 percent). Of particular interest were five trees resulting from this analysis. Table 1 summarizes the misclassification rates for these five "best" trees using the v-fold cross-validation approach (v = 10).



Table 1: Misclassification rates for five classification trees

    Number of         False positives            Hits
    terminal nodes    (misclassification of      (correct classification of
                      stayers, percentages)      leavers, percentages)
     3                31.14%                     45.32%
     6                34.23%                     48.19%
     7                33.64%                     47.40%
    11                33.40%                     47.13%
    18                32.09%                     45.68%

The first and third of these trees are depicted in Figures 1 and 2, respectively (where left branches indicate a "yes" response, and right branches a "no" response, to the decision rule in the parent node). For example, Figure 1 shows that the root (i.e., initial) node splits the sample on the basis of Scale D scores; individuals with scores on Scale D of less than 8.5 are predicted to be leavers, and individuals with scores greater than 8.5 are branched to another node. In this internal node, individuals with relatively high scores on Scale D (i.e., greater than 8.5) but low scores on Scale B (less than 14.89) are predicted to leave. Only individuals with high scores on both Scales D and B are predicted to be stayers.

Figure 1: Classification tree with 3 terminal nodes

CART also rank-orders the relative importance of the predictor variables. In this analysis, CART identified the two best predictors as Scales D and B. Scales A, E, and C played a smaller role in these classification trees, whereas Scale F played a nearly insignificant role. In comparing the performance of these trees to the Adaptability Composite, we turn to a receiver operating characteristic (ROC; Figure 3) curve depicting the relative hit and false positive rates associated with separate cut-off scores for the Adaptability Composite and for the classification tree with 7 terminal nodes. A similar pattern was found in comparing this tree against results from logistic regression, where all six of the content scales served as predictors.
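As an illustrative sketch only: a tree yields a single (false positive, hit) operating point, while a continuous composite traces a full ROC curve. Reusing the hypothetical names from the earlier sketch, and assuming a made-up column risk_score in which higher values indicate higher predicted attrition risk:

    from sklearn.metrics import confusion_matrix, roc_curve

    # Full ROC trace for the hypothetical continuous score.
    fpr, tpr, thresholds = roc_curve(y, data["risk_score"])

    # A fitted tree, by contrast, yields a single "discrete" operating point.
    tn, fp, fn, tp = confusion_matrix(y, tree.predict(X)).ravel()
    print("tree false positive rate:", fp / (fp + tn))
    print("tree hit rate:", tp / (tp + fn))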


Figure 2: Classification tree with 7 terminal nodes

Discussion and Conclusions

One goal of this investigation was to determine whether decision tree methodology, specifically the CART algorithm, could improve upon the prediction of attrition based on the Adaptability Composite and logistic regression. These results, particularly the ROC curve in Figure 3, suggest that CART can indeed produce a selection algorithm that outperforms the Adaptability Composite and logistic regression. However, a number of caveats should be noted. First, CART produces trees that are "discrete" in their rates of hits and false positives (e.g., the single point on the ROC associated with the 7-terminal-node tree). In practice, it may be preferable to set the cut-off score based on a set false-positive rate, which may not be available from the CART output. Second, implementing the decision scheme associated with any decision tree may be computationally complex, depending on the depth and width of the tree, as a number of nested IF statements would have to be used. Third, in this case, the difference in performance of these two approaches is not that large (a few percentage points with respect to the hit and false positive rates). Whether this is a meaningful or significant difference is better determined by examining all of the possible costs and benefits associated with choosing one approach over the other. This might include factors such as economic conditions, recruitment and staffing goals, or even changes within the labor market.

Finally, in examining Figure 2 we do indeed see evidence of non-linear, configural relationships in the data (i.e., different cut-scores associated with the same variable, and the same outcome described by separate paths within the tree). This characteristic of decision tree methodology has proven to be of tremendous use in fields as diverse as biology, mechanical engineering, and finance (Breiman et al., 1997). In addition to improving prediction, decision tree methodology may also prove to be a valuable tool in developing theories for many content domains within personnel research and organizational science.

Figure 3: ROC curve depicting the Adaptability Composite and one CART tree

Acknowledgements

We wish to thank the U.S. Army Research Institute for access to the AIM data and for supporting this research. Assistance from the Human Resources Research Organization (HumRRO) was particularly helpful with data management and recordkeeping. The Consortium of Universities of the Washington Metropolitan Area was also helpful in securing research funds. All statements expressed in this document are those of the authors and do not necessarily reflect the official opinions or policies of the U.S. Army Research Institute, the U.S. Army, the Department of Defense, HumRRO, or the Consortium of Universities.

References

Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Pacific Grove, CA: Wadsworth.

Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1997). CART (Version 4.0) [Computer program and documentation]. San Diego, CA: Salford Systems.

Knapp, D. J., Heggestad, E. D., & Young, M. C. (Eds.). (in preparation). Understanding and improving the Assessment of Individual Motivation (AIM) in the Army's GED Plus Program (ARI Study Note). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Muchinsky, P. M., & Morrow, P. C. (1980). A multidisciplinary model of voluntary employee turnover. Journal of Vocational Behavior, 17, 263-290.

White, L. A., & Young, M. C. (1998, August). Development and validation of the Assessment of Individual Motivation (AIM). Paper presented at the Annual Meeting of the American Psychological Association, San Francisco, CA.

White, L. A., Young, M. C., & Rumsey, M. G. (2001). ABLE implementation issues and related research. In J. P. Campbell & D. J. Knapp (Eds.), Exploring the limits in personnel selection and classification (pp. 525-558). Mahwah, NJ: Erlbaum.

Young, M. C., Heggestad, E. D., Rumsey, M. G., & White, L. A. (2000, August). Army pre-implementation research findings on the Assessment of Individual Motivation (AIM). Paper presented at the Annual Meeting of the American Psychological Association, San Francisco, CA.



PREDICTING ATTRITION OF ARMY RECRUITS USING OPTIMAL APPROPRIATENESS MEASUREMENT

Dr. Oleksandr S. Chernyshenko
Department of Psychology
University of Canterbury
Private Bag 4800
Christchurch, New Zealand
sasha.chernyshenko@canterbury.ac.nz

Dr. Stephen E. Stark
Department of Psychology
University of South Florida
4202 E. Fowler Ave.
Tampa, FL 33620
sstark@cas.usf.edu

Dr. Fritz Drasgow
Department of Psychology
University of Illinois at Urbana-Champaign
603 E. Daniel St.
Champaign, IL 61820
fdrasgow@s.psych.uiuc.edu

The purpose of this research was to determine whether item response theory (IRT) optimal appropriateness measurement methods could improve the prediction of attrition for the six content scales of the AIM. Optimal appropriateness measurement (OAM) provides the statistically most powerful methods for classifying examinees into two groups, such as "stayers" and "leavers." If the item response model is correctly specified for each studied group, then the Neyman-Pearson lemma states that no other method can be used on the same data to provide more accurate classification. Thus, the procedures are said to be optimal (Levine & Drasgow, 1988).

Our application of OAM methodology for predicting attrition involved a three-step process: 1) calibration of the AIM scales with an appropriate IRT model, 2) examination of model-data fit, and 3) classification via optimal appropriateness measurement. A detailed description of each of these steps is presented below.

Sample

We used the 22,666 active Army cases contained in the Army AIM Grand Research Database. The subjects were enlisted applicants for the Army's GED Plus Program who completed the AIM during the 2000-2002 period. The majority (92%) had applied to join the Regular Army, while the remainder applied for the Army Reserves. At the time of application, 92% had a GED certificate, while the remainder had neither a high school diploma nor an alternative high school credential. For every subject, retention status was available at 12 months after enlistment.


Calibration of the AIM Content Scales

Unidimensionality. All analyses were conducted using the AIM trichotomous item scoring (2, 1, and 0). The majority of IRT models that may be suitable for describing trichotomous AIM scoring require that the data are essentially unidimensional and, in connection, that the item responses are locally independent. Both assumptions are satisfied when factor analysis of the data reveals the presence of one dominant dimension (e.g., Hulin, Drasgow, & Parsons, 1983).

To investigate dimensionality, classical test statistics were first computed for all of the stems of the six AIM content scales: Physical Conditioning, Leadership, Work Orientation, Adjustment, Agreeableness, and Dependability. Two stems (the first stem in both the Agreeableness and Dependability scales) had a negative corrected item-total correlation, so they were removed from the respective scales and the statistics were recomputed. The resulting coefficient alphas varied between .57 and .70.

Next, factor analyses were carried out on the stems of each of the six content scales separately. The results indicated the presence of a relatively strong dominant factor for each scale. In examining the eigenvalues, a sharp elbow was apparent in each case (see Figure 1). Consequently, a unidimensional IRT model may be suitable to describe AIM responding at the stem level.

Figure 1. Scree plots following factor analysis for the six content scales (eigenvalue on the vertical axis, factor number on the horizontal axis; one curve per scale: Physical Conditioning, Leadership, Work Orientation, Adjustment, Agreeableness, and Dependability)

Description of the SGR model. Because a single dominant dimension was found to underlie each of the six AIM content scales and the response data were scored polytomously with options arranged in increasing order (i.e., 0, 1, 2), Samejima's Graded Response (SGR) model was selected for item parameter estimation. For the SGR model, the probability of endorsing a response option, or category, depends on the discriminating power of the item and the location of the threshold parameter for that option on the latent trait (theta) continuum. The mathematical form of the SGR model is



$$P(v_i = j \mid \theta = t) = \frac{1}{1 + \exp[-1.7\,a_i\,(t - b_{i,j})]} - \frac{1}{1 + \exp[-1.7\,a_i\,(t - b_{i,j+1})]},$$

where v_i denotes a person's response to the polytomously scored item i; j is the particular option selected by the respondent (j = 1, ..., J, where J refers to the number of options for item i); a_i is the item discrimination parameter and is assumed to be the same for each option within a particular item; b_{i,j} is the extremity parameter, which varies from option to option given the constraints b_{i,j-1} < b_{i,j} < b_{i,j+1}, with b_{i,J} taken as positive infinity.

For stems having three options, as in the AIM scales, three parameters are estimated for each stem: one discrimination parameter that reflects the steepness of the option response functions (ORFs) and two location parameters that reflect the positions of the ORFs along the horizontal axis.
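To make the category probabilities concrete, here is a small illustrative Python function (ours, with made-up parameter values) that evaluates the SGR option probabilities for a three-option stem:

    import numpy as np

    def sgr_prob(theta, a, b):
        """Probabilities of options 0, 1, 2 under the SGR model for one stem.

        theta : latent trait value
        a     : discrimination parameter for the stem
        b     : the two threshold parameters [b_1, b_2]
        """
        def p_star(bj):  # probability of responding in category j or higher
            return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - bj)))
        p1, p2 = p_star(b[0]), p_star(b[1])
        return np.array([1.0 - p1, p1 - p2, p2])  # P(v=0), P(v=1), P(v=2)

    # Example with made-up parameter values:
    print(sgr_prob(theta=0.5, a=1.2, b=[-1.0, 0.8]))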

Item parameter estimation. Item parameters for the SGR model were estimated separately for the total samples of stayers (N = 18,016) and leavers (N = 4,521) using the MULTILOG computer program (Thissen, 1991). Space limitations prohibit the presentation of the resulting item parameters for the six AIM content scales.

Examining Model-Data Fit

Graphical and statistical methods were used to examine the fit of the SGR model to the AIM content scale data for both stayers and leavers. This required that the total samples be split into calibration and validation subsamples. Item parameters were reestimated for the calibration subsamples using MULTILOG. The validation subsamples were used for computing empirical response functions and chi-square fit statistics.

Fit plots and chi-square statistics were computed using the MODFIT computer program (Stark, 2001); see Drasgow, Levine, Tsien, Williams, and Mead (1995) for a detailed description of the methods. Fit plots provide a graphical method of evaluating model-data fit. In this method, a theoretical option response function, computed using the parameters estimated from the calibration subsample, is compared to the empirical response function computed using the cross-validation subsample. A close correspondence between the two functions indicates good fit.

One fit plot was produced for each response option. In each plot, there was a close correspondence between the theoretical and empirical response functions, which suggests that the SGR model fit the data well.

Model-data fit was also examined using chi-square statistics. These statistics were computed from the expected and observed frequencies for each individual stem (one-way chi-square tables) and for combinations of pairs and triples of stems (two-way and three-way tables). The latter were computed to detect violations of local independence and forms of misfit that are often missed by analyses of single items. The chi-squares were also adjusted to a sample size of 3,000 and divided by their degrees of freedom to facilitate comparisons across samples of different sizes. According to Drasgow et al. (1995), adjusted chi-square to degrees of freedom ratios of 3 or less indicate good model-data fit.
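One common form of this adjustment (as we read Drasgow et al., 1995; treat the exact formula as our assumption) rescales the excess of the chi-square over its degrees of freedom to the target sample size:

    def adjusted_chi2_ratio(chi2, df, n, target_n=3000):
        # Rescale the excess chi-square to a common sample size, then divide by
        # df so that fit can be compared across samples (assumed formula).
        return ((target_n / n) * (chi2 - df) + df) / df

    # Example: a doublet chi-square of 180 on 8 df from a sample of 9,000
    # gives ((1/3) * 172 + 8) / 8, i.e., about 8.2.
    print(adjusted_chi2_ratio(chi2=180.0, df=8, n=9000))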

The results indicated that relatively small χ²/df statistics for stem singles, doubles, and triples were obtained for all AIM content scales. The average adjusted χ²/df for single items ranged from 0.6 to 2.2; the average for doublets ranged from 2.4 to 3.7; for triplets the range was from 2.4 to 3.3. These results, in conjunction with the fit plots, indicate that the SGR model fit the AIM data well and could be used for classification of respondents based on OAM methods.


Classification Via Optimal Appropriateness Measurement

Optimal appropriateness measurement (OAM) was used to classify respondents into groups of "stayers" or "leavers" based on the value of their likelihood ratio statistic. Specifically, the likelihood ratio statistic for each respondent was computed by dividing the marginal likelihood of being a leaver by the marginal likelihood of being a stayer. In this situation, we assumed that the same process underlies responding for stayers and leavers, so the same marginal likelihood equation can be used for both groups. The only difference lies in the estimated item parameters used in the marginal likelihood equation shown below:

$$\mathrm{Prob}(v^{*}) = \int \prod_{i=1}^{n} \sum_{j=1}^{J} \left\{ \delta_{j}(v_{i}^{*})\, P(v_{i} = j \mid t) \right\} f(t)\, dt$$

In the equation above, n is the number of items in an AIM scale, t is an individual's standing on the latent trait, and J is the number of response options for item i; δ_j(v_i*) = 1 if option j was endorsed, and 0 otherwise; P(v_i = j | t) is the probability of choosing option j given t (computed using the parameters for either stayers or leavers); and f(t) is the normal density.

As an example of the OAM procedure, consider the following. For responses to, say, the Physical Conditioning scale, first compute the marginal probability of a respondent's Physical Conditioning responses using the SGR item parameters for leavers. Second, compute the probability of the responses using the parameters for stayers. Third, compute the ratio of these two probabilities. Finally, if the ratio is large (i.e., the responses are better described by the model for leavers), predict that the respondent will be a leaver; otherwise, predict that the respondent will be a stayer.
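As an illustrative sketch of this procedure (reusing the sgr_prob function from the earlier SGR sketch, with made-up item parameters and a crude grid approximation to the normal density):

    import numpy as np

    def marginal_likelihood(responses, params):
        # Integrate the response-pattern likelihood over a standard normal
        # trait density, approximated on an evenly spaced grid.
        grid = np.linspace(-4.0, 4.0, 81)
        weights = np.exp(-0.5 * grid ** 2)
        weights /= weights.sum()
        like = np.ones_like(grid)
        for v, (a, b) in zip(responses, params):
            like *= np.array([sgr_prob(t, a, b)[v] for t in grid])
        return np.sum(like * weights)

    # Made-up (a, [b1, b2]) parameters per stem, one set per group.
    leaver_params = [(1.1, [-0.5, 0.7]), (0.9, [-1.0, 0.2]), (1.3, [0.0, 1.1])]
    stayer_params = [(1.0, [-1.2, 0.1]), (0.8, [-1.5, -0.3]), (1.2, [-0.6, 0.6])]

    responses = [0, 1, 0]  # hypothetical trichotomous responses (0, 1, or 2)
    lr = (marginal_likelihood(responses, leaver_params) /
          marginal_likelihood(responses, stayer_params))
    print("likelihood ratio (leaver/stayer):", lr)  # large values favor "leaver"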

Six likelihood ratio statistics were computed for each respondent (one per AIM content scale) using Stark's OAM computer program (Stark, 2001). Once all the likelihood ratios were obtained, logistic regression was used to determine the best linearly weighted sum of LR values for predicting the dichotomous stayer/leaver outcome. Receiver operating characteristic (ROC) curves were then generated for each AIM content scale and for the logistic regression composite to examine how well the OAM procedure differentiated between groups of stayers and leavers. Figure 2 presents an example ROC curve for one of the AIM scales.
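A minimal sketch of this step, with placeholder data in place of the actual LR statistics and outcomes:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve

    # Placeholder data: six likelihood ratio statistics per respondent (one per
    # content scale) and a dichotomous 12-month outcome (1 = leaver).
    rng = np.random.default_rng(0)
    lr_stats = rng.normal(size=(1000, 6))
    outcome = rng.binomial(1, 0.2, size=1000)

    # Logistic regression finds the best linearly weighted sum of LR values.
    model = LogisticRegression().fit(lr_stats, outcome)
    composite = model.decision_function(lr_stats)

    # Trace the ROC curve for the composite.
    fpr, tpr, thresholds = roc_curve(outcome, composite)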

Figure 2. ROC based on likelihood ratio values for an AIM content scale (hit rate on the vertical axis versus false positive rate on the horizontal axis)


It can be seen that for this AIM scale, the OAM procedure differentiated stayers and leavers to a moderate degree. For example, for this scale, at a 20% false positive rate, 33% of leavers were correctly identified. Note that because the AIM is currently used operationally to predict attrition, we do not present results that would identify which AIM content scales worked best.

The results also indicated that the LR composite provided the highest hit rates among the seven decision variables. It correctly identified 22% of leavers at a 10% false positive rate, 35% at a 20% false positive rate, 47% at 30%, 56% at 40%, and 65% at 50%. The success of the LR composite indicates that the AIM content scales provide incremental validity in the prediction of attrition and, thus, should be used collectively.

It is important to note that the use of OAM methodology provided an improvement over the current application of the Adaptability score in predicting attrition. For instance, at about a 20 percent false positive rate, the current Adaptability score yields a 27 percent correct identification rate for those who leave the service, while the OAM composite yields a 33 percent correct identification rate. A graphical comparison of the ROC curves for the two identification procedures (see Figure 3) showed that the OAM method performed better than the Adaptability score at every level of the false positive rate. Thus, based on these results, we recommend using the OAM-based LR statistic, instead of the Adaptability composite, to predict the likelihood of attrition with the AIM.

Fig. 3. ROCs for the OAM and Adaptability composites (hit rate versus false positive rate; curves shown for a reference line, the Adaptability composite, and the OAM composite)



Acknowledgements

We wish to thank the U.S. Army Research Institute for access to the AIM data and for supporting this research. Assistance from the Human Resources Research Organization (HumRRO) was particularly helpful with data management and recordkeeping. We also thank the Consortium of Universities of the Washington Metropolitan Area. All statements expressed in this document are those of the authors and do not necessarily reflect the official opinions or policies of the U.S. Army Research Institute, the U.S. Army, the Department of Defense, HumRRO, or the Consortium of Universities.

References

Drasgow, F., Levine, M. V., Tsien, S., Williams, B. A., & Mead, A. D. (1995). Fitting polytomous item response theory models to multiple-choice tests. Applied Psychological Measurement, 19, 143-165.

Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item response theory: Applications to psychological measurement. Homewood, IL: Dow Jones-Irwin.

Levine, M. V., & Drasgow, F. (1988). Optimal appropriateness measurement. Psychometrika, 53, 161-176.

Stark, S. (2001). MODFIT: Computer program for examining model-data fit using fit plots and chi-square statistics. University of Illinois at Urbana-Champaign.

Stark, S. (2001). OAM_SGR: Computer program for optimal appropriateness measurement. University of Illinois at Urbana-Champaign.

Thissen, D. (1991). MULTILOG user's guide (Version 6.0). Mooresville, IN: Scientific Software.



A NEW APPROACH TO CONSTRUCTING AND SCORING FAKE-RESISTANT PERSONALITY MEASURES

Dr. Stephen E. Stark
Department of Psychology
University of South Florida
4202 E. Fowler Ave.
Tampa, FL 33620
sstark@cas.usf.edu

Dr. Oleksandr S. Chernyshenko
Department of Psychology
University of Canterbury
Private Bag 4800
Christchurch, New Zealand
sasha.chernyshenko@canterbury.ac.nz

Dr. Fritz Drasgow
Department of Psychology
University of Illinois at Urbana-Champaign
603 E. Daniel St.
Champaign, IL 61820
fdrasgow@s.psych.uiuc.edu

Because of concerns about faking, defined as intentional response distortion, many researchers have begun exploring methods for constructing and scoring personality tests that are fake-resistant. At the forefront of this effort are Army researchers who developed the Assessment of Individual Motivation (AIM; White & Young, 1998) inventory, which assesses the temperament of Army recruits. The AIM is composed of items involving tetrads of statements that are similar in social desirability but represent different dimensions. A respondent's task is to choose the statement in each tetrad that is "most like me" and the statement that is "least like me." Preliminary examinations of AIM data, collected under research conditions where scores were not being used operationally, suggest that this multidimensional format for administering items reduces score inflation due to faking to as little as one tenth of a standard deviation (White & Young, 1998), as compared to the differences of 1 SD that have been observed with traditional, single-stimulus (statement) items (see White, Nord, Mael, & Young, 1993).

In this paper, we build on the idea of the AIM to address the general problem of faking in personality assessment. Specifically, we propose a new item response theory (IRT) approach to constructing and scoring multidimensional personality tests that are, in principle, fake-resistant. Rather than focusing on tetrads, however, we create fake-resistant pairwise preference items by pairing similarly desirable statements representing different dimensions. Using a simulation study, we show that by scaling stimuli (individual statements) and persons in separate steps, using different IRT models, it is possible to recover known latent trait scores, representing different personality dimensions, with a high degree of accuracy, meaning that interindividual comparisons are possible.

An IRT Approach to Constructing and Scoring Pairwise Preference Items

In his dissertation, Stark (2002) proposed a general IRT approach for constructing and scoring pairwise preference items involving statements on different dimensions. A multi-step procedure is required:

1) Develop a large number of statements representing different personality dimensions.
2) Administer the statements to a group of respondents instructed to indicate how well, on, say, a scale of 1 to 5, each statement describes him or her. Also administer the statements to a separate group of judges instructed to rate the desirability of each statement using a similar scale.
3) Estimate stimulus parameters for the individual statements representing each dimension separately, using a unidimensional IRT model that provides good model-data fit; one possibility would be the Generalized Graded Unfolding Model (GGUM; Roberts, Donoghue, & Laughlin, 2000a).
4) Create fake-resistant items by pairing statements similar in desirability but representing different dimensions; also create a small proportion of unidimensional items by pairing statements that are similar in desirability but have different stimulus location parameters. These pairings constitute the fake-resistant test.
5) Administer the resulting test to respondents, who are instructed to choose the statement in each pair that better describes him or her.
6) Score the pairwise preference data using a Bayes modal latent trait estimation procedure, based on the following general model:

$$P_{(s>t)_{i}}(\theta_{d_s}, \theta_{d_t}) = \frac{P_{st}\{1,0\}}{P_{st}\{1,0\} + P_{st}\{0,1\}} \approx \frac{P_{s}\{1\}\,P_{t}\{0\}}{P_{s}\{1\}\,P_{t}\{0\} + P_{s}\{0\}\,P_{t}\{1\}}, \qquad (1)$$

where:
i = index for items (pairings), where i = 1 to I,
d = index for dimensions, where d = 1, ..., D,
s, t = indices for the first and second stimuli, respectively, in a pairing,
θ_{d_s}, θ_{d_t} = latent trait values for a respondent on dimensions d_s and d_t, respectively,
P_s{1}, P_s{0} = probabilities of endorsing/not endorsing stimulus s at θ_{d_s},
P_t{1}, P_t{0} = probabilities of endorsing/not endorsing stimulus t at θ_{d_t},
P_st{1,0} = joint probability of endorsing stimulus s and not endorsing stimulus t at (θ_{d_s}, θ_{d_t}),
P_st{0,1} = joint probability of not endorsing stimulus s and endorsing stimulus t at (θ_{d_s}, θ_{d_t}), and
P_{(s>t)_i}(θ_{d_s}, θ_{d_t}) = probability of a respondent preferring stimulus s to stimulus t in pairing i.

In essence, the model above assumes that when a respondent is presented with a pair of statements (stimuli), s and t, and is asked to indicate a preference, he or she evaluates each stimulus separately and makes independent decisions about endorsement. If a respondent endorses both stimuli, or does not endorse either, he or she must reevaluate the stimuli, independently, until a preference is reached. A preference is represented by the joint outcome {Agree (1), Disagree (0)} or {Disagree (0), Agree (1)}. An outcome of {1,0} indicates that stimulus s was preferred to stimulus t and is considered a positive response; an outcome of {0,1} indicates that stimulus t was preferred to s (a negative response). Thus, the response data for this model are dichotomous. Note that this model makes no assumption about item dimensionality. The statements involved in a pair may be on the same or different dimensions and, in fact, a small number of unidimensional pairings is required to identify the latent trait metric and permit interindividual comparisons. In addition, because the stimuli in each pair are assumed to be evaluated independently, stimulus parameters can be estimated for each dimension separately by using software for calibrating unidimensional single-stimulus responses, such as the GGUM2000 computer program (Roberts, Donoghue, & Laughlin, 2000b). Therefore, this model is referred to not as a multidimensional model, but rather as a multi-unidimensional model called MUPP (Multi-Unidimensional Pairwise Preferences; Stark, 2002).
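A minimal sketch of Equation 1 (ours, not the authors' code), using a simple 2PL-style endorsement curve as a placeholder for the GGUM calibration described above:

    import math

    def endorse_prob(theta, a, b):
        # Placeholder unidimensional endorsement curve (2PL-style); the paper
        # instead calibrates stimuli with the GGUM.
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def prefer_s_over_t(theta_s, theta_t, params_s, params_t):
        """Equation 1: probability of preferring stimulus s to stimulus t."""
        ps = endorse_prob(theta_s, *params_s)
        pt = endorse_prob(theta_t, *params_t)
        return (ps * (1 - pt)) / (ps * (1 - pt) + (1 - ps) * pt)

    # Example with made-up (a, b) stimulus parameters on two dimensions:
    print(prefer_s_over_t(0.5, -0.2, (1.2, 0.0), (0.9, 0.4)))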

Scoring respondents. Once the fake-resistant tests have been administered and the dichotomous pairwise preference data have been collected, a multidimensional Bayes modal estimation procedure can be used to obtain scores for each respondent on each dimension. This amounts to maximizing the following equation:

ui<br />

⎧ n<br />

⎪ 1−u<br />

⎫ i ⎪<br />

Lu ( % , θ % ) = ⎨ ⎡P⎤ ⎡ ( s> t) 1 −P ⎤ ( ) * ( )<br />

i s> t ⎬ f θ%<br />

∏ ⎣ ⎦ ⎣ i⎦<br />

, (2)<br />

⎪⎩ i=<br />

1<br />

⎪⎭<br />

% ( , , ..., ) represents a vector of latent trait values (one for each<br />

dimension), u%represents a dichotomous response pattern, P ( > ) ( θ , θ ) is the probability of<br />

where θ= θd'= 1 θd'= 2 θd'=<br />

D<br />

s t i ds dt<br />

preferring stimulus s to stimulus t in item i, and f ( θ ) % represents the prior density, whose<br />

dimensions, d ' = 1 to D, are assumed uncorrelated.<br />

Equation 2 can be solved numerically to obtain a vector of latent trait estimates for each<br />

respondent using subroutine DFPMIN (Press, Flannery, Teukolsky, & Vetterling, 1990) in<br />

conjunction with functions that compute the log likelihood and its first derivatives. DFPMIN<br />

performs a D-dimensional minimization, using a Broyden-Fletcher-Goldfarb-Shanno (BFGS)<br />

algorithm, so the first derivatives and log likelihood values must be multiplied by –1 when<br />

maximizing the likelihood of a response pattern. The primary advantage of this approach, over<br />

Newton-Raphson iterations, is DFPMIN does not require an analytical solution for the second<br />

derivatives of the log likelihood. Instead, it provides an approximation to the inverse Hessian<br />

matrix of second derivatives, from which standard errors of the latent trait estimates can be<br />

obtained by taking the square roots of the diagonal elements<br />
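A toy sketch of this scoring step (reusing prefer_s_over_t from the previous sketch), using scipy's BFGS routine in place of DFPMIN; the three-item test and its parameters are made up for illustration:

    import numpy as np
    from scipy.optimize import minimize

    # A made-up three-item, two-dimensional test: each item pairs stimulus s
    # on dimension "ds" with stimulus t on dimension "dt"; (a, b) per stimulus.
    items = [
        {"ds": 0, "dt": 1, "ps": (1.2, 0.0), "pt": (0.9, 0.4)},
        {"ds": 0, "dt": 0, "ps": (1.0, -0.8), "pt": (1.0, 0.8)},  # unidimensional
        {"ds": 1, "dt": 0, "ps": (0.8, 0.3), "pt": (1.1, -0.2)},
    ]
    u = np.array([1, 0, 1])  # observed preferences (1 = s preferred to t)

    def neg_log_posterior(theta):
        # Negative log of Equation 2 with independent standard normal priors.
        nll = 0.5 * np.sum(theta ** 2)  # -log prior, up to a constant
        for resp, item in zip(u, items):
            p = prefer_s_over_t(theta[item["ds"]], theta[item["dt"]],
                                item["ps"], item["pt"])
            nll -= resp * np.log(p) + (1 - resp) * np.log(1 - p)
        return nll

    # BFGS minimization of the negative log posterior (cf. DFPMIN).
    result = minimize(neg_log_posterior, x0=np.zeros(2), method="BFGS")
    theta_hat = result.x
    se = np.sqrt(np.diag(result.hess_inv))  # SEs from the inverse Hessian
    print(theta_hat, se)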

A Monte Carlo Study to Examine MUPP Scoring Accuracy

Constructing Tests for Simulations

To examine the accuracy of latent trait estimation, one- and two-dimensional tests were constructed using AIM pretest data provided by the U.S. Army Research Institute (ARI) through the Human Resources Research Organization (HumRRO). Specifically, in the early stages of AIM development, nearly 500 stimuli, representing 6 temperament dimensions, were administered to 738 recruits who were instructed to indicate their level of agreement using a scale of 1 (very untrue of me) to 6 (very true of me). Of those 738 recruits, 469 were instructed to answer honestly, and 269 were told to fake good (i.e., to answer in a manner that would improve their score). HumRRO researchers screened those data and flagged six unusual response patterns, which we excluded from the following analyses.

Based on the 465 honest respondents, stimulus parameters were estimated for each of the six AIM dimensions separately using the GGUM2000 computer program. For all dimensions except Leadership, the parameter estimation procedure converged after eliminating just a few stimuli and, overall, good model-data fit was observed. Next, a social desirability rating was obtained for each statement by computing the mean proportion endorsement score using the responses of the 267 persons in the fake-good condition. Based on the similarity of the distributions of stimulus parameters and social desirability ratings, the Adjustment and Agreeableness dimensions were chosen for test construction.

Constructing 1-D tests. To investigate the accuracy of MUPP latent trait estimation as a function of test length, three conventional tests of 10, 20, and 40 items were created. An effort was made to construct tests having equal proportions of items that discriminated well at high, moderate, and low values of theta. However, because few stimuli had large, and very few had moderate, location parameters, it was necessary to repeat some stimuli several times to create items that provided information above theta equals 1. Once a final set of 40 items was selected, the items were ordered to balance extremity and discriminating power across subsets 1-10, 11-20, and 21-40. Items 1-10 and 1-20 were used for the conventional tests of 10 and 20 items, respectively; the entire set was used for the 40-item test.

Constructing 2-D tests. The accuracy of MUPP latent trait estimation for multi-unidimensional tests is most likely influenced by two factors: 1) the number of items involving each dimension (i.e., test length in the two-dimensional case); and 2) the percentage of items involving stimuli on the same dimension. (A small proportion of unidimensional pairings, representing each dimension, is required to identify the metric.) Therefore, these factors were chosen as independent variables, having 3 levels each.

To implement a fully crossed factorial design, nine conventional tests were constructed according to the design specifications shown below.

Percent of Items Involving Stimuli on Same Dimension<br />

10% 20% 40%<br />

Total Test<br />

1-2 1-2 1-2<br />

Length 1-1 2-2 2-1 1-1 2-2<br />

2-1 1-1 2-2 2-1<br />

20 1 1 18 2 2 16 4 4 12<br />

40 2 2 36 4 4 32 8 8 24<br />

80 4 4 72 8 8 64 16 16 48<br />

In the table, the entries in the first row and column represent the levels of the independent variables. For example, Column 1 indicates total test length, ranging from 20 to 80. Row 1 indicates the percent of items involving unidimensional pairings, ranging from 10% to 40%. Just below, the entries in the columns labeled 1-1, 2-2, and 1-2/2-1 represent the required numbers of items involving stimuli on dimension 1 (Adjustment) only, dimension 2 (Agreeableness) only, and dimensions 1 and 2. For illustration, a conventional test of 80 items (pairings), 40% of which are unidimensional, must contain 16 1-1 items, 16 2-2 items, and 48 1-2/2-1 items.

Eighty stimuli representing Adjustment (dimension "1") and 49 stimuli representing Agreeableness (dimension "2") were chosen, as in the 1-D case. Multidimensional items (1-2 and 2-1) were created by pairing Adjustment and Agreeableness stimuli that had similar desirability; unidimensional (1-1 and 2-2) items were created by pairing stimuli representing the same dimension that had different location parameters but fairly similar desirability ratings. Once the three 80-item tests were constructed, three 20-item and three 40-item tests were created using the first 20 and 40 items, respectively, of the 80-item test in each condition.

Investigating Parameter Recovery

1-D simulations. To determine whether the accuracy of parameter recovery varied across points on the unidimensional trait continuum, each conventional test was administered to 50 simulated examinees (simulees) at 31 points on the interval [-3.0, -2.8, ..., +3.0]. At each grid point, the average estimated theta and standard error were computed over the 50 replications and used to compute error statistics, which were compared across conditions using MANOVA.

2-D simulations. Each of the nine conventional tests was administered to 50 simulees at points on a (θ1, θ2) grid, where each θd ranged from -3 to +3 in increments of 0.5. As above, error statistics for the estimated thetas and standard errors were compared using MANOVA.
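A toy-scale sketch of such a recovery check (reusing the items, u, and neg_log_posterior objects from the scoring sketch above; with only three items the estimates will be very noisy, so this illustrates the bookkeeping, not the paper's design):

    # Simulate preferences at a known theta, re-estimate, and accumulate
    # error statistics over 50 replications.
    rng = np.random.default_rng(1)
    theta_true = np.array([1.0, -0.5])
    errors = []
    for _ in range(50):
        probs = [prefer_s_over_t(theta_true[it["ds"]], theta_true[it["dt"]],
                                 it["ps"], it["pt"]) for it in items]
        u[:] = rng.random(len(items)) < np.array(probs)  # overwrite responses
        est = minimize(neg_log_posterior, x0=np.zeros(2), method="BFGS").x
        errors.append(est - theta_true)
    bias = np.mean(errors, axis=0)                     # average signed error
    rmse = np.sqrt(np.mean(np.square(errors), axis=0))
    print("bias:", bias, "rmse:", rmse)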

Results

1-D simulations. Bias and root mean square errors of the latent trait estimates decreased as test length increased, but accurate parameter recovery was observed across a wide range of theta even for the short 10-item test. The estimated standard errors were also accurate, approaching zero at moderate thetas for tests of 20 and 40 items. Overall, the results suggested that latent trait and standard error estimation was quite accurate. In fact, a follow-up simulation, examining the correlation between estimated and known thetas for 1,000 simulees sampled from a standard normal distribution, showed correlations between estimated and known thetas for the 10-, 20-, and 40-item tests of .90, .95, and .97, respectively.

2-D simulations. Two independent variables, test length (TESTLEN) and the percent of unidimensional pairings (UNIPCT), were fully crossed to produce 9 tests. Error statistics were computed for each dimension separately and averaged for comparison using MANOVA. As before, a follow-up simulation was conducted for each test by sampling known thetas for 1,000 simulees from independent standard normal distributions and computing the correlations between the estimated and known thetas.

As in the 1-D study, bias in the latent trait estimates decreased as test length increased, and the largest bias statistics occurred at the endpoints of the trait continua, where the regression toward the mean effect was greatest and the items provided the least information. In addition, while the accuracy of latent trait estimation did increase as the percentage of unidimensional pairings increased from 10% to 20%, there was little improvement in going from 20% to 40%. These results were supported by the MANOVA, which showed main effects for both independent variables, but only a weak linear trend for UNIPCT (eta-squared was about .10).


This finding is important from a substantive perspective, because fewer unidimensional pairings mean a more fake-resistant test. As a final note, the correlations between the estimated and known thetas ranged from .77 in the most unfavorable condition (a 20-item test with 10% unidimensional pairings) to .96 in the most favorable (an 80-item test with 40% unidimensional pairings). The average correlation was about .90 for the 40-item tests, regardless of the percentage of unidimensional pairings.

Discussion and Conclusions

This paper outlines a method of constructing and scoring fake-resistant multidimensional pairwise preference items. Individual statements are administered and calibrated using a unidimensional single-stimulus model. Social desirability ratings are obtained for the statements using, say, a panel of judges, and fake-resistant items are created by pairing similarly desirable statements representing different dimensions. Tests are created by combining multidimensional items with a small number of unidimensional pairings needed to identify the latent metric. Trait scores are then obtained using a multidimensional Bayes modal estimation procedure based on the MUPP model developed by Stark (2002).

As shown here, the MUPP approach to test construction and scoring provides accurate parameter recovery in both the one- and two-dimensional cases, even with relatively few (say, 15%) unidimensional pairings. The accuracy of this approach generally improves as a function of test length. Even with nonadaptive tests, good estimates may be attained using only 20 to 30 items per dimension, meaning that a 5-D test would require 100 to 150 items. If adaptive item selection were used to improve efficiency, the required number of items might decrease by as much as 40%. We are currently developing and validating a 5-D inventory using this approach, and comparing the scores to those obtained using traditional methods.

Acknowledgements

We wish to thank the U.S. Army Research Institute for access to the AIM data and for supporting this research. Assistance from the Human Resources Research Organization (HumRRO) was particularly helpful with data management and recordkeeping. The Consortium of Universities of the Washington Metropolitan Area was also helpful in securing research funds. All statements expressed in this document are those of the authors and do not necessarily reflect the official opinions or policies of the U.S. Army Research Institute, the U.S. Army, the Department of Defense, HumRRO, or the Consortium of Universities.

References

Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. (1990). Numerical recipes: The art of scientific computing. New York: Cambridge University Press.

Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000a). A general item response theory model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3-32.

Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000b). GGUM2000: A DOS-based program for unfolding ideal point responses [Computer program]. Department of Measurement and Statistics, University of Maryland.

Stark, S. (2002). A new IRT approach to test construction and scoring designed to reduce the effects of faking in temperament assessment (Doctoral dissertation). University of Illinois at Urbana-Champaign.

Stark, S., & Drasgow, F. (2002). An EM approach to parameter estimation for the Zinnes and Griggs paired comparison ideal point IRT model. Applied Psychological Measurement, 26, 208-227.

White, L. A., Nord, R. D., Mael, F. A., & Young, M. C. (1993). The Assessment of Background and Life Experiences (ABLE). In T. Trent & J. H. Laurence (Eds.), Adaptability screening for the armed forces (pp. 101-162). Washington, DC: Office of the Assistant Secretary of Defense (Force Management and Personnel).

White, L. A., & Young, M. C. (1998, August). Development and validation of the Assessment of Individual Motivation (AIM). Paper presented at the Annual Meeting of the American Psychological Association, San Francisco, CA.


U.S. NAVY SAILOR RETENTION: A PROPOSED MODEL OF CONTINUATION BEHAVIOR [13]

Jessica B. Janega and Murrey G. Olmsted
NAVY PERSONNEL RESEARCH, STUDIES AND TECHNOLOGY DEPARTMENT
jessica.janega@persnet.navy.mil

[13] The opinions expressed are those of the authors. They are not official and do not represent the views of the U.S. Department of the Navy.

Sailor turnover reduces the effectiveness of the Navy. Turnover has declined significantly since the late 1990s, due to the implementation of a variety of retention programs including selective re-enlistment bonuses, increased sea pay, changes to the Basic Allowance for Housing (BAH), and other incentives. Now, in many cases, the Navy retains adequate numbers of Sailors; however, it faces the problem of retaining the best and brightest Sailors in active-duty service (Visser, 2001). Changes in employee values will require that organizations such as the Navy make the necessary changes in their strategy to retain the most qualified personnel (Withers, 2001). Attention to quality-of-life issues is one way in which the military has addressed the changing needs of its members (Kerce, 1995). One of the most effective ways to assess quality of life in the workplace is to look at the issue of job satisfaction. Job satisfaction represents the culmination of the feelings the Sailor has toward the Navy. Job satisfaction, in combination with variables like organizational commitment, can be used to predict employee (i.e., Sailor) retention (for a general overview, see George & Jones, 2002). The purpose of this paper is to explore the relationships among job satisfaction, organizational commitment, career intentions, and continuation behavior in the U.S. Navy.

Job Satisfaction

According to Locke (1976), job satisfaction is predicted by satisfaction with rewards, satisfaction with work, satisfaction with work context (or working conditions), and satisfaction with other agents. Elements directly related to job satisfaction include direct satisfaction with the job, action tendencies, career intentions, and organizational commitment (Locke, 1976). Olmsted and Farmer (2002) replicated a version of Locke's (1976) model of job satisfaction proposed by Staples and Higgins (1998) by applying it to a Navy sample. Staples and Higgins (1998) proposed that job satisfaction is both a factor predicted by other factors and an outcome in and of itself. Olmsted and Farmer (2002) applied the Staples and Higgins (1998) model directly to Navy data using the Navy-wide Personnel Survey 2000. That paper evaluated two parallel models, which provided equivalent results, indicating that a similar version of Locke's model could be successfully applied to Navy personnel.

Organizational Commitment

Organizational commitment involves feelings and beliefs about entire organizations (George & Jones, 2002). Typically, organizational commitment is viewed as a combination of two to three components (Allen & Meyer, 1990). The affective (or attitudinal) component of organizational commitment involves positive emotional attachment to the organization, while continuance commitment is based on the potential losses associated with leaving the organization, and normative commitment involves a commitment to the organization based on a feeling of obligation (Allen & Meyer, 1990). Commonalities across the affective, normative, and continuance forms of commitment indicate that each component should affect employees' intentions and final decision to continue as a member of the organization (Jaros, 1997). The accuracy of these proposed relationships has implications for turnover reduction because "turnover intentions is the strongest, most direct precursor of turnover behavior, and mediates the relationship between attitudes like job satisfaction and organizational commitment and turnover behavior" (Jaros, 1997, p. 321). This paper primarily addresses affective commitment, since it has a significantly stronger correlation with turnover intentions than either continuance or normative commitment (Jaros, 1997).

Career Intentions

Career intentions represent an individual's intended course of action with respect to continuation in their current employment. While a person's intentions are not always the same as their actual behavior, an important assumption is that these intentions represent the basic motivational force or direction of the individual's behavior (Jaros, 1997). In general, Jaros (1997) suggests that the combination of organizational commitment and career intentions appears to be a good approximation of what is likely to occur in future career behavioral decisions (i.e., to stay or leave the organization).

Purpose

This paper examines job satisfaction, organizational commitment, career intentions, and continuation behavior using structural equation modeling. It was hypothesized that increased job satisfaction would be associated with increased organizational commitment, which in turn would be positively related to career intentions and increased continuation behavior (i.e., retention) in the Navy. A direct relationship was also hypothesized between career intentions and continuation behavior.
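To make the hypothesized structure concrete, the sketch below expresses it in lavaan-style model syntax using the Python semopy package. This is an illustration only: the indicator names (js1, oc1, ci1, and so on) are hypothetical stand-ins for the survey items, and this is not the authors' implementation (their analysis, described under Methods, used SPSS 10 and Amos 4.0).

```python
# Sketch of the hypothesized path model in semopy (hypothetical
# indicator names; not the authors' Amos 4.0 implementation).
import pandas as pd
import semopy

MODEL_DESC = """
# Measurement model: latent factors with survey-item indicators
JobSatisfaction =~ js1 + js2 + js3
OrgCommitment   =~ oc1 + oc2 + oc3
CareerIntent    =~ ci1 + ci2

# Structural model: the hypothesized paths
OrgCommitment ~ JobSatisfaction                # satisfaction -> commitment
CareerIntent  ~ OrgCommitment                  # commitment -> intentions
continuation  ~ OrgCommitment + CareerIntent   # both predict retention
"""

def fit_model(data: pd.DataFrame) -> pd.DataFrame:
    """Fit the path model and return parameter estimates.

    `continuation` is treated here as an observed outcome column,
    mirroring the status code derived from personnel records.
    """
    model = semopy.Model(MODEL_DESC)
    model.fit(data)          # maximum likelihood by default
    return model.inspect()   # path weights, SEs, p-values
```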

METHODS

Participants

The sample used in this study was drawn from a larger Navy quality-of-work-life study using the Navy-wide Personnel Survey (NPS) from the year 2000. The NPS 2000 was mailed to a stratified random sample of 20,000 active-duty officers and enlisted Sailors in October 2000. A total of 6,111 usable surveys were returned to the Navy Personnel Research, Studies, & Technology (NPRST) department of Navy Personnel Command, a return rate of 33 percent. The current sample consists of a sub-sample of 700 Sailors who provided social security numbers for tracking purposes. Sailors whose employee records contained a loss code 12 months after the survey were flagged as having left the Navy (10.4%). Sailors who remained on active duty in the Navy (i.e., those who could be tracked by social security number and did not have a loss code in their records) were coded as still being present in the Navy (87.8%). Sailors whose status was not clear from their employment records (i.e., those who could not be tracked by social security number) were retained in the analysis with "unknown status" (1.8%).

Materials

The NPS 2000 primarily focuses on issues related to work life and career development for active-duty personnel in the U.S. Navy. The survey contains 99 questions, many of which include sub-questions. Most of the 99 questions follow a five-point Likert-type response format.

Analysis Procedures

This sample contained missing data: not every Sailor who returned the NPS 2000 filled it out completely. For this reason, Amos 4.0 was chosen as the statistical program for the structural equation models, because it handles missing data better than most other structural equation modeling programs (Byrne, 2001). Once acceptable factors were found via data reduction with SPSS 10, the factors and observed variables were input into Amos 4.0 for structural equation modeling via maximum likelihood estimation with an EM algorithm (Arbuckle & Wothke, 1999).

RESULTS

Overall, the proposed model ran successfully and fit the data adequately. A significant chi-square test was obtained for the model, indicating that variance remains to be accounted for, χ²(938) = 7637.94, p < .001.
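As a quick arithmetic check on the reported statistic (an illustration, assuming SciPy is available; not part of the original analysis):

```python
# Sanity-check the reported model chi-square: with 938 degrees of
# freedom, a statistic of 7637.94 is far in the upper tail.
from scipy.stats import chi2

p_value = chi2.sf(7637.94, df=938)
print(p_value)  # underflows to 0.0 in double precision, i.e. p < .001
```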


Figure 1. Exploratory model (path diagram not reproduced from the original). The diagram shows the latent factors Satisfaction with Rewards, Satisfaction with Work, Satisfaction with Working Conditions, Satisfaction with Other Agents, Global Job Satisfaction, Organizational Commitment, Career Intentions, and Continuation Behavior, each measured by NPS 2000 items (Q47–Q94) with associated error terms, together with standardized path weights.


DISCUSSION

This model provides an adequate fit to Navy data for relating job satisfaction and organizational commitment to career intentions and continuation behavior. Advantages over previously tested models include the use of structural equation modeling rather than regression path analysis, and the treatment of job satisfaction and organizational commitment as separate factors. Several points of interest are apparent in the results. First, several factors and observed variables contributed to global job satisfaction. Satisfaction with work predicted the most variance in global job satisfaction of any of the factors (path weight = .80). Satisfaction with other agents was the next largest predictor of global job satisfaction, followed by satisfaction with working conditions and satisfaction with rewards. Interestingly, the amount of variance in global job satisfaction predicted by satisfaction with rewards was very low (path weight = -.02). This suggests that the rewards listed on this survey are not as important to job satisfaction as being generally satisfied with the job itself, or that these rewards do not adequately capture what Sailors value when considering satisfaction with their job. These results may also reflect differences between intrinsic and extrinsic rewards as predictors of job satisfaction. The relationships between variables relating to intrinsic and extrinsic motivation should be explored further in this model as they pertain to job satisfaction.

Job satisfaction as modeled here is a good predictor of affective organizational commitment: the path weight from job satisfaction to organizational commitment is .70 in the exploratory model. Adding a path from global job satisfaction to career intentions did not add predictive value to the structural equation model. Here, organizational commitment mediates the relationship between job satisfaction and career intentions/continuation behavior. Organizational commitment predicted both career intentions and continuation behavior adequately in the model. Since the model did not explain all of the variation present (as evidenced by the significant chi-square statistic), the unexplained variance could reflect an unknown third variable influencing these relationships. This possibility should be explored in future work.

The more the Navy understands about Sailor behavior, the more effectively change can be implemented to improve the Navy. The results of this study suggest that job satisfaction is a primary predictor of organizational commitment and that both play an important role in predicting career intentions and actual continuation behavior. In addition, the results suggest that career intentions are stronger predictors of continuation behavior than organizational commitment when evaluated in the context of all of the other variables in the model. More research is needed to fully understand these relationships and to identify the specific contributors to job satisfaction that the Navy can act on. A validation of this model should be conducted in the future to verify these relationships. However, it is clear at this point that an understanding of Sailor continuation behavior would be incomplete without measurement of job satisfaction, organizational commitment, and career intentions.



REFERENCES

Allen, N. J., & Meyer, J. P. (1990). The measurement and antecedents of affective, continuance, and normative commitment to the organization. Journal of Occupational Psychology, 63, 1-18.

Arbuckle, J. L., & Wothke, W. (1999). Amos 4.0 user's guide. Chicago, IL: SmallWaters Corporation.

Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.

Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 445-455.

Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications and programming. Mahwah, NJ: Lawrence Erlbaum Associates.

George, J. M., & Jones, G. R. (2002). Organizational behavior (3rd ed.). New Jersey: Prentice Hall.

Jaros, S. (1997). An assessment of Meyer and Allen's (1991) three-component model of organizational commitment and turnover intentions. Journal of Vocational Behavior, 51, 319-337.

Kerce, E. W. (1995). Quality of life in the U.S. Marine Corps (NPRDC TR-95-4). San Diego: Navy Personnel Research and Development Center.

Locke, E. A. (1976). The nature and causes of job satisfaction. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 1297-1349). New York: John Wiley & Sons.

Olmsted, M. G., & Farmer, W. L. (2002, April). A non-multiplicative model of Sailor job satisfaction. Paper presented at the annual meeting of the Society for Industrial & Organizational Psychology, Toronto, Canada.

SPSS, Inc. (1999). SPSS 10.0 syntax reference guide. Chicago, IL: SPSS, Inc.

Staples, D. S., & Higgins, C. A. (1998). A study of the impact of factor importance weightings on job satisfaction measures. Journal of Business and Psychology, 13(2), 211-232.

Visser, D. (2001, January 1-2). Navy battling to retain sailors in face of private sector's allure. Stars and Stripes. Retrieved March 3, 2003. http://www.pstripes.com/jan01/ed010101a.html

Withers, P. (2001, July). Retention strategies that respond to worker values. Workforce. Retrieved September 24, 2003. http://www.findarticles.com/cf_0/m0FXS/7_80/76938893/print.jhtml


PSYCHOMETRIC PROPERTIES OF THE DUTCH SOCIAL SKILLS INVENTORY

Drs. Frans V.F. Nederhof
Senior Researcher
Defence Selection Agency
Kattenburgerstraat 7, PO Box 2630, 1000 CP Amsterdam, The Netherlands
vf.nederhof@mindef.nl

INTRODUCTION

Social skills are thought to be a valuable asset for all personnel. Still, instruments that enable personnel selectors to assess social skills in an efficient, reliable, and valid manner are not readily available in the Netherlands. Roughly speaking, Dutch personnel selectors have two options: to assess social skills in an interview on the basis of an impression, or to assess these skills in an assessment centre. Although efficient, and perhaps better than not assessing social skills at all, the reliability and validity of assessing social skills in an interview may be questioned. The use of an assessment centre may yield more reliable and valid measures, but does so at the expense of considerable time and money. In short, neither option is quite satisfactory for a selection agency faced with the task of psychologically examining thousands of recruits a year.

An instrument that could help is the Social Skills Inventory (Riggio, 1986). The Social Skills Inventory (SSI) is a commercial off-the-shelf 90-item self-report measure of six basic social and communication skills. Although the instrument has yet to be tested as an instrument for selecting military personnel, the properties of the SSI are promising. Firstly, the skills measured with the SSI seem highly relevant to military personnel. Secondly, the SSI has excellent psychometric properties, which puts it on a par with most personality questionnaires. Thirdly, the items that compose the SSI are relevant to military personnel as well as to civilians, which makes the instrument suited to selecting persons in both categories. Fourthly, completing the instrument takes only about 40 minutes. Taken together, the SSI could be the reliable, valid, and efficient instrument that provides the personnel selector with an informative picture of a person's social make-up.

The SSI measures six basic skills, covering the domains of non-verbal/emotional skills and verbal/social skills. Within each domain three sub-domains are defined, which focus on expressivity (skills in sending information), sensitivity (skills in receiving information), and control (skills in regulating the interaction). Emotional expressivity refers to the ability to send emotionally relevant messages such as feelings. Emotional expressivity is thought to be a relevant skill in military teamwork, especially in the phases in which horizontal cohesion between team members and vertical cohesion between team members and leaders have to develop. Example items are "I have been told that I have expressive eyes" and "Quite often I tend to be the life of the party".

Emotional sensitivity is the ability to receive emotional messages such as overt and hidden body language. Emotional sensitivity is thought to be a relevant skill in military teamwork in the group-dynamic phases in which people develop an individual identity in the team, and could help military leaders in developing trust between leaders and their men. Example items are "It is nearly impossible for people to hide their true feelings from me" and "At parties I can instantly tell when someone is interested in me".

Emotional control reflects the ability to regulate the sending of emotional and non-verbal signals. Emotional control is thought to be relevant in leadership situations that require the unbiased gathering of information and the neutral sending of sensitive information, as well as in stress situations in which military leaders are expected to maintain a calm exterior. Example items are "I am very good at maintaining a calm exterior, even when upset" and "When I am really not enjoying myself at some social function, I can still make myself look as if I am having a good time".

Social expressivity reflects the liveliness of social behaviour and the initiative a person takes in social situations. In a close-knit society like the armed forces, social expressivity is thought to help in creating a network of interpersonal relationships. Example items are "At parties I enjoy speaking to a great number of people" and "When in discussions I find myself doing a large share of the talking".

Social sensitivity reflects the ability to receive socially relevant signals such as norms, and the awareness of these norms. The skill is thought to help recruits fit into a rule- and role-based environment like the military, thereby reducing the risk of a mismatch between person and organisation. Example items are "I often worry that people will misinterpret something that I have said to them" and "While growing up, my parents were always stressing the importance of good manners".

Social control reflects the ability to act according to certain roles, including self-presentational skills. Social control is thought to be an asset to military leaders, especially in situations of uncertainty, where it comes down to convincing people by setting an example. Example items are "I find it very easy to play different roles at different times" and "When in a group of friends, I am often the spokesperson for the group".

Each of the six SSI scales is composed of 15 statements. Respondents are asked to indicate on a five-point Likert scale the degree to which each statement applies to them, choosing between "Not at all like me", "A little like me", "Like me", "Very much like me", and "Exactly like me". Each of the ninety statements of the SSI was translated into Dutch, after which the instrument was pre-tested on a sample of applicants for military leadership functions. The Dutch research findings are compared with American findings using the original instrument, and conclusions for future research and selection practices are drawn.
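A minimal sketch of how scale scores could be computed from the 1–5 item responses is shown below. The item-to-scale assignment used here is hypothetical; the actual keying (including any reverse-scored items) is specified in the SSI manual (Riggio, 1989).

```python
# Sketch of SSI scale scoring: each of the six scales is the sum of its
# fifteen 1-5 responses. The item numbers per scale are hypothetical;
# the actual keying is given in the SSI manual (Riggio, 1989).
import pandas as pd

SCALES = {  # hypothetical item numbers per 15-item scale
    "EE": range(1, 91, 6), "ES": range(2, 91, 6), "EC": range(3, 91, 6),
    "SE": range(4, 91, 6), "SS": range(5, 91, 6), "SC": range(6, 91, 6),
}

def score_ssi(responses: pd.DataFrame) -> pd.DataFrame:
    """responses: one row per person, columns item1..item90 valued 1-5."""
    scores = pd.DataFrame(index=responses.index)
    for scale, items in SCALES.items():
        cols = [f"item{i}" for i in items]
        scores[scale] = responses[cols].sum(axis=1)
    scores["SSI_total"] = scores[list(SCALES)].sum(axis=1)
    return scores
```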

METHOD

Translation

A psychologist and a linguist independently translated the SSI, after which the similarity of the translations was assessed. Due to differences in structure between American English and Dutch, small but meaningful differences were found between the translations. Therefore an iterative approach was chosen, discussing each item in detail until a translation was found that accurately reflected the meaning of the original item. Four items caused substantial discussion among the translators; for the exact meaning of these items the author of the SSI was contacted.

To ensure that the Dutch translation of the SSI is comprehensible to personnel of all levels of education, the translation was iteratively presented to a panel of civilian personnel with higher education and to several panels of military personnel in active service, with ranks ranging from soldier to captain. These panels were encouraged to mark any part of the translation they found odd, confusing, or otherwise inadequate. Markings were then discussed, whereupon some minor changes were made in the translation. After three iterations no more suggestions were made by the panels.

Participants

In this study data were gathered from 147 persons applying for military service at non-commissioned officer level (N=80) or officer level (N=67). 14 Forty-one of the participants applied for NCO functions with the army, twenty-six applied for NCO functions with the navy, and thirteen applied for NCO functions with the military police. Sixty-seven participants applied for officer functions with the military police. Fifteen of the participants were female. The mean age of the participants was 20 years, with a standard deviation of 4.4. Analysis of variance was used to test for differences between the subgroups. Perhaps surprisingly, no significant differences between the groups were found; the sample was therefore considered homogeneous.

14 The onset of this study coincided with a relatively sudden downsizing of the Netherlands armed forces. As a consequence the number of available participants was reduced.
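The kind of subgroup comparison described could be run per scale as a one-way ANOVA; a sketch follows (assuming SciPy and pandas; the data frame and column names are hypothetical, and this is not the authors' code).

```python
# Sketch of the reported subgroup comparison: a one-way ANOVA of a
# scale score across the applicant subgroups. Column names hypothetical.
from scipy.stats import f_oneway

def compare_groups(df, scale: str, group_col: str = "group"):
    """One-way ANOVA of `scale` scores across applicant subgroups."""
    samples = [g[scale].dropna() for _, g in df.groupby(group_col)]
    f_stat, p = f_oneway(*samples)
    return f_stat, p  # a non-significant p supports pooling the sample
```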

Measures

Participants were asked to complete the 90-item Dutch translation of the SSI, the 240-item NEO PI-R five-factor personality inventory, the 64-item WIMAS measure of influence behaviour, and the frequently used 10-item five-point Likert social desirability scale (Crowne & Marlowe, 1964; Nederhof, 1981; Nederhof, 1985). From the NEO PI-R, the five main personality scales (neuroticism, extraversion, openness to experience, altruism, and conscientiousness) were used. From the WIMAS, the four types of influence behaviour (manipulation, directness, diplomacy, and assertiveness) were used.

Instructions

Many psychological questionnaires have different sets of norm groups for research purposes and for selection purposes. In order to probe the usefulness of the Dutch translation of the SSI as a selection instrument for the Dutch military, participants were led to believe that completing the questionnaires was part of the standard psychological examination of applicants for the armed forces. Under the guidance of a test assistant, the questionnaires were issued and completed in a classroom setting by applicants for military service at the Defence Selection Agency. The setting and instructions are thought to have had the expected effect: 50% of the 147 participants had a score on the Crowne-Marlowe social desirability scale of 37 or higher 15, which is considered to be quite high.

15 Cronbach's α = .89; Min = 10; Max = 50; M = 35.3; SD = 10.5.

RESULTS AND DISCUSSION

Reliability

An important question is whether the items that compose the respective scales measure the same dimension. In a study by Riggio (1986) on a sample of undergraduate students, relatively strong Cronbach's α coefficients ranging from .75 for emotional expressivity to .88 for social expressivity are reported, indicating good internal consistency (see table 1).

                  EE         ES        EC        SE        SS          SC        SSI-total
Riggio (1986)     .75        .78       .76       .88       .84         .87       (90 items)
                  15 items   15 items  15 items  15 items  15 items    15 items
Nederhof (2003)   .52        .72       .70       .77       .73         .76       .80
                  13 items*  15 items  15 items  15 items  12 items**  15 items  85 items

* Items 25 and 37 deleted. ** Items 5, 17 and 53 deleted.

Table 1: Internal consistency of the SSI scales in a sample of American undergraduate students and a sample of Dutch applicants for military service.

The internal consistency of most scales of the Dutch translation of the SSI is found to be satisfactory too, although somewhat lower than in the research by Riggio. On the emotional expressivity and social sensitivity scales, some items were deleted because they lowered the internal consistency of the scale too much. For the social sensitivity scale this resulted in an acceptable α coefficient; the internal consistency of the emotional expressivity scale is nevertheless regarded as unsatisfactory.
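The Cronbach's α coefficients in Table 1 can be computed directly from item scores; a minimal sketch (standard formula, not the authors' code):

```python
# Minimal Cronbach's alpha: alpha = k/(k-1) * (1 - sum of item
# variances / variance of the total score), over complete responses.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_respondents, k_items) array of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)
```

Dropping an item and recomputing α in this way is also how one finds the items that, as reported above, lowered a scale's internal consistency too much.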

Scale correlations

Significant correlations between most SSI scales are found in the American study as well as in the Dutch study (see table 2). The signs of the respective correlations between the scales are the same in the two samples. In the Dutch study we find somewhat higher absolute correlations between emotional sensitivity, emotional control, social expressivity, social sensitivity, social control, and the SSI total score than in the American study. Correlations between emotional expressivity and the other scales are lower in the Dutch study. This finding may partly be explained by the low internal consistency of the Dutch emotional expressivity scale.

            EE      ES      EC      SE      SS      SC      SSI-total
EE          1       .44     -.38    .53     .00     .43     .59
ES          .17     1       -.10    .41     .11     .31     .60
EC          -.20    .23     1       .00     -.20    .08     .24
SE          .32     .52     .22     1       -.17    .66     .78
SS          .05     -.04    -.29    -.20    1       -.46    .06
SC          .23     .42     .23     .69     -.45    1       .64
SSI-total   .44     .74     .44     .83     -.01    .71     1

Table 2: Correlations between the SSI scales and the SSI total scale in research by Riggio (above the diagonal, N = 629) and Nederhof (below the diagonal, N = 147).

On the one hand, the correlations between the subscales of the SSI could be interpreted as a clear sign that the scales of the SSI are not independent and should therefore be improved. Had the SSI been a personality inventory, this interpretation would be very tempting.

On the other hand, the correlations may be the result of a natural process in which the development of one social skill influences the development of one or more related social skills. Some evidence for this interpretation is found in the negative correlations between some of the SSI scales. For instance, a negative correlation is found between emotional expressivity and emotional control, indicating that some of the more expressive persons find it difficult to control the expression of their emotions (and vice versa), a notion that appeals to common sense. Also, a negative correlation is found between social sensitivity and emotional control, indicating that persons who are better at controlling the expression of their emotions are less able to receive relevant information concerning social norms. This finding may also appeal to common sense when we realise that social sensitivity is the 'passive' social skill of interpreting socially relevant signals, which could be associated more with the listener group role, whereas emotional control is a more active social skill of sending emotionally relevant messages, which could be associated more with the active sender group role.

The negative correlations between social sensitivity on the one hand and social expressivity and social control on the other again seem to stress the difference between the active and the passive social skills. We will return to this thought later.

A strong positive correlation is found between the SSI total score and five of the SSI scales. This correlation is interpreted as an indication that the SSI total score may serve as a global indicator of the development of social skills. This thought too will be explored further later in the paper.

Test-retest reliability

An assessment of test-retest reliability with a two-week interval was planned. Due to the downsizing of the Dutch military that coincided with the study, this assessment was no longer possible. Riggio (1986) found test-retest reliabilities ranging from .81 for emotional expressivity to .94 for the SSI total score and .96 for social expressivity.

Validity

Social desirability

In personnel selection situations a social desirability bias in test results is generally expected. In accordance with earlier research findings by Riggio, significant correlations are found between the Crowne-Marlowe social desirability scale and the social scales, but not the non-verbal/emotional scales, of the Social Skills Inventory (see table 3), indicating that the non-verbal/emotional scales of the SSI may be free of social desirability bias. A small positive correlation is found between social desirability and social expressivity, indicating that more socially and verbally skilled persons are more inclined to give socially desirable answers (r = .21). A positive correlation is found between social desirability and social control, indicating that persons with greater skills in managing social situations are also more inclined to give socially desirable responses (r = .25). A remarkable but significant negative correlation is found between social desirability and social sensitivity, indicating that more socially sensitive persons are actually less inclined to give socially desirable responses (r = -.30).

            EE      ES      EC      SE      SS      SC      SSI-total
Riggio      -.15    .12     .10     .26     -.31    .48     .04
Nederhof    -.08    .09     .10     .21     -.30    .25     .10

Table 3: Correlations between the Crowne-Marlowe social desirability scale and the SSI scales.

These findings indicate that the social scales of the SSI are prone to socially desirable answering by some persons. This conclusion is not unique, since the same holds for most instruments used for selection purposes, as indicated by the significant differences between norm groups for research purposes and norm groups for selection purposes. The problem with coping with social desirability through norm groups, however, is that in the same situation some people are more inclined to socially desirable answering than others (Nederhof, 1985), which increases the chance of false negatives. As a solution, different norm groups might be devised for people who score low versus people who score high on social desirability.

Personality

The relationship between behaviour and personality has been a topic of discussion for decades. Recently, with the increasing popularity of five-factor personality inventories, some consensus is growing around the notion that personality can influence behaviour in a direct way. As in the work of Riggio (1986), in this study the relationship between personality and social skills is explored. The choice was made to measure personality with a Dutch translation of the NEO PI-R five-factor personality inventory 18, a relatively new but well-validated personality questionnaire. The flip side of this choice is that comparing the findings with the work of Riggio (1986, 1989) is more difficult.

18 Riggio (1986) used the 16 Personality Factor Test. However, the reliability and validity of the Dutch translation of the 16 Personality Factor Test have been questioned (Evers et al., 2000).

As discussed earlier, some of the social skills of the SSI refer to behaviour directed at changing or influencing the situation, whereas other skills are directed at interpreting the situation without changing it. In a way, social skills could therefore be considered ways of coping with social situations, and of coping with oneself. A positive correlation is found between neuroticism and the 'passive' social sensitivity (see table 4). Negative correlations are found between the neuroticism scale and the active skills emotional control, social expressivity, and social control. These findings may be seen as support that the SSI measures relevant behaviour dimensions. The sign of the correlations stresses the difference between the active and the passive social skills.

            Neuroticism  Extraversion  Openness  Altruism  Conscientiousness
EE          .02          .38           .31       -.01      .06
ES          -.16         .33           .43       .10       .24
EC          -.24         .07           .02       -.04      .28
SE          -.35         .58           .32       .22       .42
SS          .57          -.06          -.04      -.10      -.32
SC          -.50         .44           .28       .15       .49
SSI-total   -.24         .54           .43       .09       .39

Table 4: Correlations between the SSI scales and the NEO PI-R five-factor personality scales.

Extraversion is found to correlate positively with the expressivity scales of the SSI, thereby supporting the expressivity construct. Also, a positive relationship between extraversion and social control is found, which is readily understood since extraverts may be expected to be more inclined towards influencing a situation than introverts.

Openness to experience refers to a mindset in which new and different ideas, values, feelings, and impressions are welcomed. Most of the active scales of the SSI correlate positively with the openness scale, as expected. No correlation was found between openness on the one hand and emotional control and social sensitivity on the other.

The construct of altruism refers to a mindset in which the experiences, stakes, and goals of other people are weighed as more or less important in comparison to one's own experiences, stakes, and goals. Most correlations between altruism and the SSI scales are found to be insignificant (see table 4). Since altruism refers to a mindset and social skills refer to behaviour, this finding is not surprising.

Conscientiousness refers to engaging in tasks actively, in an orderly manner, and with stamina. Positive correlations are found between the active social skills and conscientiousness. A negative correlation is found with the passive skill social sensitivity.

The correlations between the SSI and the personality dimensions generally support the notion that the SSI measures behaviour dimensions in an expected manner. Thus the validity of the SSI is supported by these findings.

Influence behaviour

The participants in this study applied for leadership functions at the level of non-commissioned officer or officer, and it is expected that more socially skilled persons self-select for these jobs. In order to further judge convergent and discriminant validity, the relationship between SSI scores and a measure of influence behaviour is explored. The WIMAS measures four styles of influencing others: exerting influence by manipulating others, influencing others by diplomacy, influencing others by assertive behaviour, and influencing others by open and direct requests.

            Manipulation  Diplomacy  Assertiveness  Directness
EE          .13           -.03       .18            .19
ES          .13           .30        .16            .29
EC          .21           .25        .12            .18
SE          .01           .25        .37            .40
SS          .15           -.20       -.35           -.43
SC          -.07          .31        .44            .39
SSI-total   .16           .31        .31            .33

Table 5: Correlations between influence behaviour and the SSI scales.

The SSI was originally composed of seven scales, the seventh being named "social manipulation". With the exception of a correlation with emotional control, social manipulation did not correlate with the other social skills (Riggio, 1986). The present findings confirm these results, indicating that manipulation is perhaps more a cognitive skill than a social skill.

Positive correlations are found between the active social skills as measured by the SSI and the influence styles assertiveness, diplomacy, and directness. These findings further support the claim that the SSI measures behaviour.

GENERAL DISCUSSION

In this study a Dutch translation of the Social Skills Inventory was pre-tested on a sample of applicants for military leadership functions. Although this study may have suffered from an imperfect sample, most results of this pre-test are seen as encouraging and stimulate further exploration of the instrument. Even so, there are two concerns regarding parts of the instrument.

Firstly, the internal consistency of the Dutch translation of the emotional expressivity scale is found to be inadequate. This finding could partly be caused by Dutch culture. For instance, items with regard to touching other people may be prone to culturally different answers, since the Dutch may not be very inclined to touch others. Also, a self-selection effect may have lowered the internal consistency of the emotional expressivity scale, because items regarding the expression of emotion may be less appealing to persons who want to join the armed forces.

Secondly, the social sensitivity scale gives pause because of the strong positive correlation between social sensitivity and neuroticism, and the low or even negative correlations between social sensitivity and the other scales. Inspection of the items that compose the scale shows that some items may touch on the topic of neurotic behaviour.



However, possibilities are seen to adjust the emotional expressivity scale to Dutch culture, which may improve the internal consistency of the scale. Improvement of the social sensitivity scale, by replacing some items with others of more neutral content, is also thought possible. Since social skills are relevant to military leaders, and an instrument measuring these skills is useful both for selection and for leadership training, investing in the Social Skills Inventory is worth considering, provided the predictive validity of the four strong scales proves to be encouraging. With the current sample, an indication of the predictive validity of the four strong scales of the Dutch translation could be obtained in a military training environment in the coming months.

REFERENCES

Bartone, P. T., & Kirkland, F. R. (1991). Optimal leadership in small army units. In R. Gal & A. D. Mangelsdorff (Eds.), Handbook of military psychology. Chichester: John Wiley & Sons.

Crowne, D. P., & Marlowe, D. (1964). The approval motive. New York: Wiley.

Evers, A., van Vliet-Mulder, J. C., & Groot, C. J. (2000). Documentatie van Tests en Testresearch in Nederland. Assen: Van Gorcum.

Hoekstra, H. A., Ormel, J., & de Fruyt, F. (1996). NEO PI-R & NEO FFI, Handleiding Big Five Persoonlijkheidsvragenlijsten. Lisse: Swets & Zeitlinger.

Luteijn, F., Starren, J., & van Dijk, H. (1985). Nederlandse Persoonlijkheidsvragenlijst (Herziene uitgave 1985). Lisse: Swets & Zeitlinger.

Nederhof, A. J. (1981). Beter Onderzoek. Leiden: SISWO.

Nederhof, A. J. (1985). Methods of coping with social desirability bias: A review. European Journal of Social Psychology, 15, 263-280.

Riggio, R. E. (1986). Assessment of basic social skills. Journal of Personality and Social Psychology, 51(3), 649-660.

Riggio, R. E. (1989). Manual of the Social Skills Inventory, Research Edition. Redwood City: Mind Garden.

Riggio, R. E., & Taylor, S. J. (2000). Personality and communication skills as predictors of hospice nurse performance. Journal of Business and Psychology, 15(2), 351-359.

Riggio, R. E., Mayes, B. T., & Schleicher, D. J. (2003). Using assessment center methods for measuring undergraduate business student outcomes. Journal of Management Inquiry, 12(1).


DEPLOYABILITY OF TEAMS: THE DUTCH MORALE QUESTIONNAIRE, AN INSTRUMENT FOR MEASURING MORALE DURING MILITARY OPERATIONS

Bert Hendriks MA, Cyril van de Ven MA & Daan Stam MA *
Behavioral Science Division
Royal Netherlands Army

* A.B. Hendriks MA and C.P.H.W. van de Ven MA work as researchers at the Behavioral Sciences Division, Directorate of Personnel and Organisation, RNLA. D. Stam is associated with Leiden University.

ABSTRACT

Unit morale is without doubt an important factor in the deployability and combat power of military units. Combat power is nowadays considered to depend, in addition to tactical, logistic, and technical capacity, on the morale of military personnel. Morale is a concept comprising a number of factors. This paper describes the development of the Dutch Morale Questionnaire (DMQ). Based on 25 surveys administered to deployed units since 1997, a theoretical model was constructed. The model itself and its theoretical background are described. Finally, practical experiences with morale measurement in the Royal Netherlands Army are discussed.

INTRODUCTION

Throughout the evolution of the art of war, armed forces have paid a great deal of attention to psychological phenomena such as morale, élan, esprit de corps, and other, similar concepts that may influence combat power. Combat power is nowadays considered to depend, in addition to tactical, logistic, and technical capacity, on the morale of military personnel. That is why any attempt at predicting morale attracts the full attention of operational commanders. Their attention focuses on all kinds of individual and group processes that positively affect morale and the resulting resistance to stress. To a large extent, leaders of military units can influence these factors. The development and application of a morale measuring instrument therefore not only contributes to restricting or preventing negative effects on personnel (such as dysfunctioning, repatriation, and exit), but also acts as a force multiplier.

Research has shown that high morale reduces the risk of mental collapse in a group of military personnel. High morale and a high degree of group cohesion have proved to limit the development of combat stress during operations. They act as a kind of buffer against all kinds of negative consequences of war (Tibboel, van Tintelen, Swanenberg & van de Ven, 2001). Military personnel in highly cohesive sections feel more secure, have greater self-confidence, less fear, and higher motivation.

If commanders are to be responsible for improving morale, it is important to have an instrument for measuring it. In 1997 the Behavioral Sciences Division of the RNLA designed such an instrument, called 'Deployability of teams' (Tibboel et al., 2001). Since the mission of the Dutch part of the Stabilization Force rotation 7 (November 1999), the Behavioral Sciences Division has gained a great deal of experience with this instrument and constantly evaluates and improves the questionnaire.



Article structure

This paper describes the development and application of the instrument. First we discuss morale in relation to military operations. Then the factors used in the instrument are described, analyzed, and presented in a theoretical model. Finally we discuss experiences with the instrument and how it helps commanders improve morale before and during operations.

Morale in relation to military operations

The following definition of morale is used: 'Morale is a mental attitude held by an individual soldier in a task-oriented group in relation to achieving the operational objectives for which the group exists' (van Gelooven, Tibboel, Slagmaat & Flach, 1997).

The following can be derived from this definition:
• morale is a mental attitude, and cannot therefore be directly observed;
• morale is a characteristic of an individual within a unit; the morale of a unit is comprised of the morale of its members;
• morale is linked to (operational) objectives. Morale may be high (good) or low (bad). High morale is by definition good: it means that the individual has a positive mental attitude towards achieving the given objectives of the unit.

A unit with high morale consistently performs at a high level of efficiency and carries out its allocated tasks accurately and effectively. In such units, each member makes a willing contribution, assumes that his or her contribution is worth making, and assumes that the other members of the unit will also make their contribution. If necessary, the members help each other without their help having to be requested. The few members who would prefer not to make their contribution feel pressure to carry it out anyway. Members of such units rate themselves highly, often develop a strong sense of identification with each other, and are proud of their unit. They are aware of the reputation of the unit and take pleasure in showing off their membership (Shibutani, in Manning, 1991).

Morale analysis

Since commanders are expected to influence morale, it is important to be able to measure this concept.

The armed forces of, in particular, Israel and the United States use questionnaires on morale. The Israeli Defense Forces have psychologists who advise commanders on matters concerning morale, discipline, sleep management, the prevention of stress, and the deployability of troops in general. For advice with respect to morale and the deployability of troops, the IDF makes use of 'morale and combat readiness questionnaires' (Tibboel, van Tintelen, Swanenberg & van de Ven, 2001).

In the Yom Kippur and Lebanese wars these morale questionnaires were actually used and formed a major source of information for the psychologists advising the brigade and/or battalion commanders. For example, units were regularly replaced when it could be demonstrated that they were too (mentally) exhausted by specific operations to remain effectively deployable.

In the United States, morale research is used in a similar way. There, a great deal of research is done into combat stress, group cohesion, morale, and the deployability of personnel from combat units (Mangelsdorff et al., 1985).


The French army, too, uses a morale questionnaire, called the 'Force Morale' (Centre des Relations Humaines, 1998). This serves as a reference and aid for commanders.

Almost all armed forces currently recognize the importance of morale, and an effort is made during training to create a strong group bond, high morale, and high motivation by means of shared experience.

Development of the Dutch Morale Questionnaire (DMQ)

In the development of the DMQ, international theoretical morale models were used. Van den Bos, Tibboel and Willigenburg (1994) developed a theoretical model for the Dutch situation. The basis of this model is formed by factors distinguished by Gal (1986) and Mangelsdorff et al. (1985). In 1997 the questionnaire was analyzed for the first time, questions were rephrased, and scales were added. In 2002, after nearly five years of experience with the questionnaire, a second modification of the model was carried out. Table 1 presents an overview of the measurements used in the development of the DMQ.

Table 1. Overview of measurements used

Unit                            Rotation  Year  Measurement
1 (NL) Mechbat, SFOR in Bosnia  7         1999  1, 2, 3
                                8         2000  1, 2
                                9         2000  1, 2, 3
                                10        2001  1, 2, 3
                                11        2001  1, 2
                                12        2002  1, 2
                                13        2002  1, 2
                                14        2003  1, 2, 3
1 (NL) Coy UNFICYP in Cyprus              2000  1, 2
41 Medical coy                            2001  1
210 Fuel distribution coy                 1997  1
400 Medical battalion                     1998  1

The central question when developing a morale instrument was how that instrument might contribute to raising morale. To answer this question, it is necessary to look at the factors that influence morale. The concept of morale is composed of a number of factors, and the importance of each factor depends on the specific circumstances. Four kinds of aspects can be distinguished: individual, unit, leadership, and organizational.

Figure 1 illustrates a simplified model for predicting morale within military units. Morale can be predicted by measuring factors related to morale, and is influenced by the net result of a combination of the factors shown, acting interactively.



Figure 1. Simplified version of the model (diagram not reproduced from the original; it shows personal, individual, environmental, leadership, unit, and organizational aspects jointly feeding into morale).

A separate scale was developed for each factor in this instrument, using a number of statements. The instrument includes the following factors:

Individual aspects
1. Trust
2. Home front support
3. Job satisfaction
4. Organizational Citizenship Behavior

Unit aspects
5. Deployability
6. Unit cohesion
7. Identification with the unit
8. Respect

Leadership aspects
9. Group
10. Platoon
11. Team

Organizational aspects
12. Appreciation of the military environment
13. Involvement in the objectives of the army
14. Familiarity with the assignment and the terrain
15. Perceived organizational support


The factors applied in the model are explained below. At the end of each description, several example items are given from the morale questionnaire. At the end of this section the scale reliability (Cronbach's α) and the factor analysis are presented. Respondents answer the items on a five-point scale (from totally disagree (1) to totally agree (5)).
(1) to totally agree (5)).<br />

Individual aspects

Trust
A great deal of attention in the morale questionnaire is devoted to trust. It involves trust in oneself, colleagues, arms and equipment, leaders, and the presence of adjacent and support units. Trusting and being trusted promotes dedication and discipline and helps military personnel to function well under difficult conditions. Trust in one's own talents within a specific situation or under operational conditions requires that one knows the objective and the role one plays within that situation. One needs confidence in one's capacity to carry out the tasks. Furthermore, it also concerns trust in the skills and willingness of the group members to protect each other under operational conditions.

It is also important for high morale that military personnel have confidence in the logistic and combat-support units. In general, military personnel will be prepared to expose themselves to danger when they are convinced that medical care will be effective if they get wounded; the same applies to counseling and treatment of combat stress. Furthermore, they must be confident that defects are repaired as quickly as possible and that the supply of food, fuel and ammunition is guaranteed.

Examples of items:
- I make an important contribution to the success of my group.
- I think my platoon will perform well in combat situations.
- The equipment used by my group is up to its task.
- I think we will receive sufficient support in our tasks from logistic units.

Home front support
Morale is strongly influenced by the extent to which military personnel are concerned about the situation at home. Concern about the relationship between the home front and work is closely linked to the degree of self-confidence, the reduction of uncertainty, the level of fear and motivation. This factor includes the assessment of survival chances by military personnel and worry about the family members left behind, and vice versa.

Limited opportunities for communication with the home front in particular may cause isolation and frustration. Separation from family, and not being able to help if things go wrong at home, is generally perceived as stressful. Military personnel must be able to start missions with their minds at ease, free from the cares and problems that negatively influence morale. Concern about the home front comprises the care which military personnel wish to give to the home front and the fear that something might happen to the home front. On the other hand, this factor also implies that soldiers have to feel supported by their home front in what they are doing. If they do not feel supported, they feel they have to choose between the army and the home front.

Examples of items:
- My home front is proud of me.
- My home front respects my decision to work for the Royal Netherlands Army.
- I think there are sufficient opportunities to let the home front know how I am.

Job satisfaction
Characteristics of the job go hand-in-hand with satisfaction and (inner) motivation, and thus also with morale. According to Hackman and Oldham (1976), five objective task characteristics can be distinguished which influence employee motivation: skill variety, task identity, task significance, autonomy, and feedback. The more a job meets these characteristics, the more motivated employees will be.

In order to achieve high morale, it is important that individuals have a clear idea of their role and view their role as useful and significant. On the one hand, this concerns their share in the unit's objectives in a specific situation. On the other hand, it is the role which relevant other people (the group and the leader) expect of an individual. Roles ensure predictable behavior and regulate the coordination of tasks among the group members.

Examples:
- In my job I can show my capabilities.
- I have a useful function/task within my section.

Organizational Citizenship Behavior (OCB)
Members of the unit show organizational citizenship behavior when they are willing to do more than their job description asks of them, without being ordered to. They show 'extra-role' behavior without expecting a reward. This behavior is linked to unit cohesion and raises the efficiency and effectiveness of the unit (Organ, 1988). Where unit members show more OCB, unit cohesion will grow.

Examples:
- If necessary I will die for the interest of my unit.
- Even under the worst circumstances I will try hard to fulfill my tasks.

Personal aspects
The definition showed that morale is an individual attitude. Every individual soldier contributes to morale within the unit in his own way, for example through his age, military experience and experience in humanitarian missions (Labuc, 1991). To inform the commander about the personal aspects of his unit members and their relationship with the development of morale, the questionnaire starts with a few personal questions.

Unit aspects
Cooperation within the unit, identification and respect are all related to unit cohesion. Unit cohesion is a major factor of influence on morale. In a cohesive unit, members feel secure and protected. The higher the unit cohesion, the more influence the unit will have on its members: unit members accept objectives, decisions and norms more quickly. There is stronger pressure to conform to the unit and there is less tolerance of unit members who do not agree.

Where there is high cohesion, membership of a unit is maintained for longer, unit objectives are achieved sooner, there is greater participation and loyalty from unit members, good cooperation, better communication and less absenteeism, and the members feel more secure (Van Gelooven, Tibboel, Slagmaat & Flach, 1997). Members identify themselves with the unit and respect each other.

Strong unit cohesion does not, however, necessarily have to be positive. It may cause aggressive behavior. A sense of 'us' which is too strong may lead to deindividuation, groupthink and aggressive behavior towards the outgroup. The positive effects of unit cohesion may, however, be stimulated and the negative ones minimized (Avenarius, 1994).

Shared experiences, such as joint exercises and recreational activities, offer the opportunity to build cohesion. These shared experiences may confirm whether the section members are willing to help each other and may distinguish the group from others. The greater the awareness of group members that they depend on each other for success, the more this contributes to group cohesion (Van Gelooven, Tibboel, Slagmaat & Flach, 1997).

Examples of items:
- I think my group is satisfied with my performance.
- In my group, people feel responsible for each other.
- In our group, we work together well.
- In our group we respect each other.

Leadership
Trust in the leader is based on the personal characteristics of the leader, such as professionalism, the example set, integrity, patience and being able to assess realistically the performance capability of the subordinate units and military personnel. If leaders want to have the unconditional support of their unit, the unit not only needs to know that the leaders are competent, but also that they care about them. It is the trust of the unit members in the skills and willingness of their direct superiors to protect them under operational conditions. Of great importance is often the emotional bond, which stimulates unit members to follow their commander even in life-threatening conditions.

In the Netherlands Armed Forces a high degree of independence and initiative is expected of military personnel. This enables mission command and control down to the lowest level (Van Gelooven, Tibboel, Slagmaat & Flach, 1997).

Military personnel identify with the unit leaders and adopt (to a large extent) the intentions and objectives of those leaders. As the leaders are also members of 'higher' groups, they link the group to the objectives of the 'higher' unit. Insofar as the leaders of the small groups are seen as representatives of the larger organization, the trust they gain may be passed on to the organization. Leaders therefore play a key role in transferring the organization's objectives to the group. That is why leadership is measured at three different levels: the group, platoon and team commander.

Examples of items:
- My group commander always clearly tells us what needs to be done.
- My platoon commander has sufficient skills to do what has to be done.
- In general I am satisfied with the way in which my commander leads our team.

Organizational aspects

Appreciation of the military environment
The military environment cannot always be as comfortable as a 'regular' working environment. During military operations, vital needs such as food and drink, relaxation and rest, clean clothes, sufficient accommodation and postal deliveries become very important. A shortage of these necessities of life has a negative effect on morale. Whether these necessities are provided properly is largely subjective and relative. Among other things, a comparison is made with what personnel had expected or were used to before the mission. If the organization does not take good enough care of those aspects, morale is negatively influenced (Van Gelooven, Tibboel, Slagmaat & Flach, 1997).

The appreciation of the military environment is based on an individual perception of the situation and is related to expectations. The worse the situation in the barracks, the better the appreciation of the operational working environment.

Examples of items:
- I am satisfied with the relaxation facilities on the base.
- I am satisfied with the quality of the meals on the base.
- The organization tries hard to deliver the mail to the right place at the right time.

Involvement in the objectives of the army
Involvement with the objectives concerns both the objectives of the unit and the objectives of the army and country. Esprit de corps concerns the relationship with and trust in a higher unit (the brigade, the corps), or, even more abstractly, the organization (the RNLA or, in peacekeeping operations, the EU or UN). This involvement with the reputation of an organization exceeds the borders of the primary unit.

The objective and the legitimacy of a deployment under operational conditions must be clear and convincing to the individual. This objective does not necessarily have to be of international importance (such as the protection of democracy), but nothing is worse for morale than the sense that activities are pointless and serve no purpose whatsoever. The lower in the hierarchy, the more specific the objectives will be. Objectives must be challenging and realistic in order to fully motivate those carrying them out.

Examples of items:
- I support the objectives of the Dutch army.
- I contribute positively to international security by serving in the Dutch army.

Familiarity with assignment and terrain
Familiarity with the assignment and the terrain reduces uncertainty, raises self-confidence and consequently contributes to higher morale. It is important that leaders devote attention to the environmental factors which influence morale, and thus the deployability of personnel. Data about the terrain also need to be gathered for peacekeeping operations.

According to Labuc (1991), morale is determined by the background of the military personnel and their unit and the environmental factors which apply at that time. This involves a considerable number of factors, such as: operational conditions, the high or low level of the spectrum of force, whether the operation is offensive or defensive, the seriousness of the situation, the length of time involved, logistic support, losses, the terrain and the climate.

Examples of items:
- I was sufficiently informed of the assignments my platoon could expect during the operation.
- In my platoon, we are well informed on our future tasks.

Perceived organizational support (POS)
Military personnel have a need for support from the organization. Especially during military operations, they need the perception that the army is interested in the individual soldier and cares for him or her. POS contributes to organizational citizenship behavior (OCB): the more organizational support is perceived, the more OCB may be expected.

Examples of items:
- The Dutch army is interested in my well-being.
- The Dutch army will help me when I am in trouble.
- The Dutch army appreciates my work.

Table 2 presents the scale reliabilities. Most of the scales reached a high level of reliability.

Table 2. Scale reliability (Cronbach's α)

Dimension      Variable                                      Number of questions   Cronbach's α
Individual     Trust                                         5                     .8091
               Home front support                            4                     .7660
               Job satisfaction                              6                     .8322
               Organizational citizenship behavior           6                     .7943
Group          Unit cohesion                                 4                     .8429
               Identification                                6                     .8468
               Respect                                       4                     .9294
Leadership     Group commander                               8                     .9279
               Platoon commander                             8                     .9485
Organization   Appreciation of military environment          3                     .6915
               Involvement in the objectives of the army     4                     .8126
               Familiarity with the assignment and terrain   6                     .7928
               Perceived organizational support              6                     .9009
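For reference, a minimal sketch of how a scale reliability of this kind can be computed. This is the standard Cronbach's α formula in Python, not the authors' analysis code, and the generated data are purely illustrative:

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for an (n_respondents, n_items) matrix of 1-5 scores."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)      # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)  # variance of the scale total
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Illustrative check on simulated correlated item scores (not DMQ data).
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 1))
    scores = np.clip(np.rint(3 + latent + rng.normal(scale=0.8, size=(200, 5))), 1, 5)
    print(round(cronbach_alpha(scores), 4))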

A factor analysis (varimax rotation with Kaiser normalization) confirmed the assumed data structure. The covariance structure was then tested through confirmatory factor analysis. 'Job satisfaction' appeared not to fit in the 'individual aspects' dimension. After these items were moved to the 'organizational aspects' dimension, the analysis confirmed the data structure completely.
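As an illustration of the exploratory step only (confirmatory factor analysis is typically done with dedicated structural equation modelling software), here is a sketch of varimax rotation with Kaiser normalization applied to principal-component loadings; the random input stands in for the real item responses:

    import numpy as np

    def varimax(Phi, gamma=1.0, max_iter=100, tol=1e-6):
        """Varimax rotation of a (p, k) loading matrix (standard SVD algorithm)."""
        p, k = Phi.shape
        R = np.eye(k)
        d = 0.0
        for _ in range(max_iter):
            L = Phi @ R
            u, s, vh = np.linalg.svd(
                Phi.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))))
            R = u @ vh
            if s.sum() < d * (1 + tol):
                break
            d = s.sum()
        return Phi @ R

    def rotated_loadings(data, n_factors):
        """Principal-component loadings from the correlation matrix,
        varimax-rotated with Kaiser normalization (rows scaled by communalities)."""
        corr = np.corrcoef(data, rowvar=False)
        eigval, eigvec = np.linalg.eigh(corr)
        order = np.argsort(eigval)[::-1][:n_factors]
        loadings = eigvec[:, order] * np.sqrt(eigval[order])
        h = np.sqrt((loadings ** 2).sum(axis=1, keepdims=True))  # Kaiser weights
        return varimax(loadings / h) * h

    # Illustrative call: 15 items, 4 assumed dimensions, random stand-in data.
    rng = np.random.default_rng(1)
    print(rotated_loadings(rng.normal(size=(300, 15)), 4).shape)  # (15, 4)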


PRACTICAL APPLICATION OF THE DUTCH MORALE QUESTIONNAIRE (DMQ)

The Behavioral Sciences Division of the Royal Netherlands Army has used this morale monitor since 1990, mostly for determining the morale of units during international military operations. Experience with the DMQ therefore chiefly involves peacekeeping operations. The recently modified instrument can make valid statements about the factors that influence the morale of units.

With the help of the DMQ, the feedback and the advice of the investigators, the leaders of the units can improve the morale of the team, platoon or group. Especially the measures taken before and during the deployment can be of great help.

Method
As high morale cannot be achieved in a short period of time, it is important that the unit to be deployed already possesses a reasonably high level of morale. That is why it is important to assess unit morale just before operational deployment. Bottlenecks can then be mapped in time, and the commander is able to emphasize relevant aspects and to tackle and solve any problems. The instrument therefore consists of three measurements:
a. the first measurement is just before operational deployment (approx. 1 month beforehand);
b. the second is during deployment (approx. 3 months after the start of the operation);
c. the third is after return from deployment (approx. 1 month after return).

Before the survey begins, the Behavioral Sciences Division makes contact with the operational commander of the battalion to be deployed. The battalion commander is informed about the objectives of the project, the characteristics of the process, the current questionnaire and the aspects of anonymity. If this commander agrees to the survey, the same information is given to the team commanders. The team usually consists of an extended company, complemented with logistic and engineer units. During the introduction, emphasis is placed on anonymity and on the fact that the information is to be used as an analysis of the strengths and weaknesses of the team, and therefore as advice for improvement. A contract is drawn up with the team commander which clearly states how the survey and the corresponding report will be handled. Because of the defined objective and guaranteed anonymity, no information on the results is passed on to the battalion commander by the researchers without the team commander's permission (only the approval of the battalion commander is needed for the survey). Commitment from the personnel in the unit is essential for the data collection and for broad acceptance and recognition of the results.

Each measurement consists of a description of the morale of the team. Each factor which influences morale is given a score using a number of questions and a scale of 1 (poor) to 5 (good). A distinction is made here between soldiers and officers/NCOs; a minimal sketch of this kind of scoring follows below.
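As an illustration of the per-factor team scoring described above, a minimal sketch assuming one row per respondent; the column names, item sets and rank split are hypothetical, not the Division's actual processing code:

    import pandas as pd

    # Hypothetical respondent-level data: items answered on the 1-5 scale,
    # plus a rank category used to split soldiers from officers/NCOs.
    df = pd.DataFrame({
        "rank_cat": ["soldier", "soldier", "nco_officer", "nco_officer"],
        "trust_1": [4, 3, 5, 4], "trust_2": [4, 2, 4, 5],
        "cohesion_1": [5, 4, 4, 4], "cohesion_2": [4, 4, 5, 3],
    })
    factors = {"Trust": ["trust_1", "trust_2"],
               "Unit cohesion": ["cohesion_1", "cohesion_2"]}

    # Factor score per respondent = mean of its items (still on the 1-5 scale),
    # then aggregated per rank category for the team report.
    for name, items in factors.items():
        df[name] = df[items].mean(axis=1)
    print(df.groupby("rank_cat")[list(factors)].mean().round(2))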

The results of each measurement, combined at team level, are fed back to the direct commander of the team. This is done in an assessment in which the survey results are linked to the specific background, recent (combat) experience, characteristics, culture, etc. of the team, which are not entirely known to the researcher. Thus the results can be further refined and discussed with the team commander, and noteworthy scores can be highlighted. Next, on the advice of the researcher, the team commander determines what specific action, if any, should be undertaken to improve one or more aspects of morale in (part of) the team. Specific measures may be the structural introduction of more breaks, the issue of more information about progress and the objective of the combat (or the exercise), or more attention to communication among individuals or between individuals and those in positions of authority.

Preconditions
The application of the instrument is linked to the following preconditions.
1. A reliable response to the questions in the morale questionnaire is promoted by the anonymity and confidentiality guaranteed by the researcher. If that anonymity is not guaranteed, this may lead to biased responses.
2. Commitment from the respondents is required.
3. The results are geared to (adjusting the) steering of the unit along the line of command, i.e. via the team commander.
4. The timing of the first measurement is determined by the point in time at which the unit is complete, trained for deployment and aware of the assignment.
5. The instrument aims to make a 'photo' of a unit with respect to morale aspects; it is not meant to be an instrument of judgement, but one of improvement; it concerns a mutual agreement between the researchers and the team leadership to achieve that improvement.

RESULTS / EXPERIENCE

Our experiences derive from the several surveys that have been conducted. The DMQ has been used to measure morale within combat, support and logistic units. The measurement takes approximately seven working days, from issuing the questionnaires to the respondents up to and including discussion of the results with the team commander. So far the survey has been conducted in various units in different environments. For example, in 1990 a survey was started about the situation at the barracks, and subsequently among the Special Forces, IFOR and SFOR (6, 7, 8). Only the latter surveys encompass measurements prior to, during and after the operation, in the barracks situation and under operational conditions. In 2003 a tailor-made morale questionnaire was designed on behalf of the Airmobile Brigade because of its high readiness state. In this way the brigadier is informed about the morale state of his companies and is able to send units on missions only when their morale state is sufficient. In 2004 the morale questionnaire will be used with the Dutch stabilization force in Iraq (SFIR).

Advantages and disadvantages of the DMQ
The following advantages of the instrument can be seen from the surveys.
a) The instrument provides insight into the quality of personnel within the team at operational level, in specific and measurable terms with respect to morale, and indicates (predicts) an increased or decreased risk of dropouts using the morale indicator.
b) The team commander receives a clear overview of the morale aspects. The survey offers insight into the relevant personnel variables in relation to deployment and combat readiness. The information is used for a strengths-and-weaknesses analysis of the team and as advice for improving aspects. The use of a morale instrument at unit level enables team commanders to gain insight into the state of affairs concerning the factors that influence morale. On the basis of this survey and the corresponding advice, specific measures to improve morale can be implemented.
c) The analysis is not complicated, provides a structured and systematic overview of morale aspects, and contains a great deal of information on the situation before, during and after the mission.
d) Conducting the survey and producing the (brief) report take little time.
e) The response rate is very high (80-90%).

The following disadvantages or points of attention were noted.
a) The results of the measurements can be viewed as threatening or painful by the (leadership of the) unit. This is certainly true of information on leadership, cohesion and trust. Even though the report is not meant to be a judging instrument, team commanders might use it that way. The researchers must therefore address this misunderstanding before reporting the results.
b) Although the information on the morale of a unit may be interesting for the next level up (battalion), passing on information about morale is a sensitive issue. Care should be taken to avoid the passing on of information gained by the morale instrument being seen as an inspection by the next level up; in that case the information value of the instrument would be lost, which is not desirable. This is why the team commander is asked for approval before the team results are sent to the battalion commander.
c) Change requires response time. High morale is not something that can be achieved in a short time, whereas a specific (negative) event can cause a fairly quick decrease in morale. For instance, when there are low scores on group cohesion and trust in leaders in a platoon, these will not be easy to improve quickly.

The results described below were partly derived from interviews with team commanders and are partly based on their experience. Anonymity means that specific results cannot be discussed, but in general the following conclusions can be drawn so far.

Application of the DMQ
During SFOR, the DMQ was used in units in Bosnia as a direct advisory instrument for team commanders in the field. The results are the most reliable and widely available indicators so far for the morale of a team, and team commanders are satisfied with the applied analysis and advice. They usually recognize and confirm the results during the presentation and, if this is not the case, they return to it at a later measurement and generally confirm the earlier findings.

The respondents are open, positive and willing to complete the questionnaires. Almost no one misses the sessions, and the discussions are serious and open. The team commanders process the (positive and negative) results and plan the solutions together with the behavioral scientists.

The DMQ is currently used wherever a commander so requires. Experience has shown that almost all commanders decide, following consultation with the Behavioral Sciences Division, to use the instrument in the context of the mission abroad. After the reports are finished, commanders are asked to assess the usefulness of the instrument by means of a report mark. These report marks show how much the DMQ is appreciated.

Training prior to the mission
Within the armed forces, morale is often used as an indicator of deployment readiness and combat readiness. In view of the fact that aspects of morale only demonstrate their effectiveness under operational conditions, the important thing is to act preventively and to gain insight into aspects of morale before deployment. However, during combat-readiness preparation for the mission, frequently as a result of a lack of time, insufficient attention is often paid to teambuilding, and the conditions for creating cohesion and trust cannot always be achieved. In particular, since logistic units are often formed just before deployment, their morale scores are in general lower than those of combat units. The same applies to military personnel sent on missions abroad individually.

HOW TO PROCEED

Behavioral scientists are increasingly being deployed in the operational environment. This has proved its worth in the application of the morale instrument. The Dutch Morale Questionnaire is currently being expanded with variables concerning dealing with stress and the influence of personal characteristics; coping, for instance, is also being looked at. Whether and when the DMQ is expanded with items concerning coping styles depends on the number of items that need to be added. There is a risk of the list becoming too long and less acceptable if too many 'soft' psychological elements are included which seem to have no direct relationship with the operational task.

The DMQ has frequently been under construction. The Behavioral Sciences Division keeps improving the questionnaire by collecting as much data as possible, analyzing the data set and reformulating items. Eventually the aim is to apply this morale instrument to each unit to be sent abroad, prior to, during and after the mission.



LITERATURE

Avenarius, M.J.H. (1994). De positieve en negatieve kanten van groepscohesie. Den Haag: DPKL/Afdeling Gedragswetenschappen.

Bos, C.J. van de, Tibboel, L.J., & Willigenburg, T.G.E. (1994). Een onderzoek naar de bruikbaarheid van de Nederlandse moreelvragenlijst. Den Haag: Rapport DPKL/GW 94-05.

Centre des Relations Humaines (1998).

Gal, R. (1986). Unit morale: From a theoretical puzzle to an empirical illustration - An Israeli example. Journal of Applied Social Psychology, 549-564.

Hackman, J.R., & Oldham, G.R. (1976). Motivation through the design of work: Test of a theory. Organizational Behavior and Human Performance, 16, 250-279.

Gelooven, R. van, Tibboel, L.J., Slagmaat, G.P., & Flach, A. (1997). Studie Masterplan KL-2000. Moreel: Vakmanschap - Kameraadschap - Incasseringsvermogen. Den Haag: CDPO/Afdeling Gedragswetenschappen.

Labuc, S. (1991). Cultural and societal factors in military organisations. In R. Gal & A.D. Mangelsdorff (Eds.), Handbook of Military Psychology (pp. 471-489). New York: Wiley.

Mangelsdorff, A.D., King, J.M., & O'Brien, D.E. (1985). Battle stress survey. Fort Sam Houston: Consultation report.

Manning, F.J. (1991). Morale, cohesion, and esprit de corps. In R. Gal & A.D. Mangelsdorff (Eds.), Handbook of Military Psychology (pp. 453-470). New York: Wiley.

Organ, D.W. (1988). Organizational citizenship behavior: The good soldier syndrome. Lexington, MA: Lexington.

Tibboel, L.J., Tintelen, G.J.A. van, Swanenberg, A.B.G.J., & Ven, C.P.H.W. van de (2001). The Human in Command: Peace Support Operations (pp. 363-380). Breda: Mets & Schilt.


Measures of Wellbeing in the Australian Defence Force

Colonel A.J. Cotton
Director of Mental Health
Australian Defence Force, Canberra, Australia
Anthony.Cotton@defence.gov.au

Ms Emma Gorney
Directorate of Strategic Personnel Planning and Research
Department of Defence, Canberra, Australia
Emma.Gorney@defence.gov.au

Abstract

Cotton (2002) reported on the establishment of a Wellbeing program in the Australian Defence Force, concluding that a key issue was to identify appropriate measures of wellbeing that would allow comparison with measures of objective personnel capability, in order that wellbeing could be incorporated into ADF capability planning. This paper reports the results of the ADF's first attempts to identify such measures and how they might relate to personnel capability and other key strategic personnel indicators.

INTRODUCTION

The Australian Defence Force Mental Health Strategy (ADF MHS) was established in 2002 to provide an overarching strategy for the provision of mental health services to the ADF (Cotton, 2002). The strategy is built around eight initiatives:
• Improving mental health literacy in the ADF.
• Integrating the provision of mental health services by ADF providers.
• Improving treatment options available to ADF members.
• The development of a comprehensive training and accreditation framework for ADF providers.
• The implementation of a comprehensive mental health research and surveillance program in the ADF.
• The implementation of the ADF Drug and Alcohol Program (DAP).
• The implementation of the ADF Suicide Prevention Program (SPP).
• The enhancement of wellbeing and resilience in ADF members.

The last of these initiatives has been operationalised through the establishment of the Australian Defence Organisation (ADO[19]) Wellbeing Forum (Cotton, 2002). This is a voluntary organisation of those agencies within the ADO that feel that they make some contribution to the wellbeing of ADF members or Defence civilians. One of the early decisions made by the Wellbeing Forum was that the ability to measure wellbeing as objectively as possible was a key requirement for the success of the program in the ADF.

[19] ADO refers to the collective group of ADF (i.e., uniformed) and Defence civilian employees. Note that the focus of this paper is wellbeing in ADF members.



In particular, it was identified that demonstrating the ability of wellbeing to contribute to the "bottom line" of the ADF was needed to provide sufficient evidence to win support for programs that might draw scarce resources away from other personnel initiatives. The measurement of wellbeing has been shown to be possible in the civilian sector. Research from organisations like the Corporate Leadership Council[20] showed returns on investment in different civilian companies ranging from $220-$300 per employee, and return on investment (ROI) rates of between three and seven dollars per dollar spent on wellbeing programs.

[20] Corporate Leadership Council Literature Search, ROI of Wellness Programs, March 2002, www.corporateleadershipcouncil.com

Measuring ROI for wellbeing programs in the military is more difficult; in particular, measuring the potential costs to the organisation is difficult, and many of the givens of military tradition (e.g., medical care) are seen as the primary elements of wellbeing programs in the civilian sector. A number of strategies were identified that could be applied to provide some indication of the contribution of wellbeing to the effectiveness of the ADF. These included identifying how the ADF compared with the general Australian population, which would provide an external benchmark of wellbeing in the ADF. A second strategy was to identify personnel markers within the ADF population to provide an internal benchmark of wellbeing, and then to identify possible outcome variables that might contribute to an understanding of how wellbeing contributes to the effectiveness of the ADF.

One of the difficulties with measuring wellbeing is that there are two types of measures: objective and subjective. The types of indicators that might be described as objective measures of wellbeing include absenteeism for health or other reasons, staff turnover rates, rates of industrial accidents, and so on; in other words, the behavioural manifestations of wellbeing. Subjective indicators of wellbeing incorporate measures of job satisfaction and commitment, or morale and cohesion (to use more military terms).

The use of objective indicators can present a range of problems, most particularly the difficulty of linking them directly to wellbeing, which is essentially a subjective term. The use of subjective indicators of organisational health has been common practice for decades, and is usually effected through staff surveys, which are commonly used in the civilian and particularly the military sectors. In the ADF the primary staff survey tool is the Defence Attitude Survey, and this instrument provides a possible means for measuring wellbeing in the ADF.

AIM

The aim of this paper is to report initial attempts to measure wellbeing in the ADF through the collection and analysis of subjective measures gathered through the Defence Attitude Survey.

THE DEFENCE ATTITUDE SURVEY

The Directorate of Strategic Personnel Planning and Research (DSPPR) has responsibility for the administration of the Defence Attitude Survey (DAS). The DAS was first administered in 1999. It replaced the existing single-Service attitude surveys, drawing on content from each of the RAN Employee Attitude Survey (RANEAS), the Army's Soldier Attitude and Opinion Survey (SAOS) and Officer Attitude and Opinion Survey (OAOS), and the RAAF General Attitude Survey (RGAS). The amalgamation of these surveys has facilitated comparison and benchmarking of attitudes across the three Services whilst maintaining a measure of single-Service attitudes.

The survey was re-administered to 30% of Defence personnel in April 2001, and the results were widely used throughout the organisation. To provide key information more frequently, the Your Say Survey was developed; it takes a number of key items from the DAS and is administered more regularly to gather trend data on the organisation. The Your Say Survey is administered to a 10% sample of Defence members twice a year, and while it provides useful and current information, the sample size is not large enough to allow detailed breakdowns of the data.

It was determined by the Defence Committee[21] in May 2002 that the DAS should be administered annually to a 30% sample of Defence personnel, allowing for more comprehensive data analysis. The Committee also directed that an Attitude Survey Review Panel (ASRP) be established, with representatives from all Defence Groups, to review and refine the content of the DAS. The final survey was the result of thorough consultation through the ASRP. The item selection both maintained questions from previous surveys to gather trend data and incorporated new questions to feed into Balanced Scorecard and other Group requirements. The purpose of the Defence Attitude Survey is threefold:
• to inform personnel policy and planning, both centrally and for the single Services/APS;
• to provide Defence Groups with a picture of organisational climate; and
• to provide ongoing measurement in relation to the Defence Matters scorecard.

[21] The Defence Committee is the ADF's senior management committee. It is responsible for making all high-level decisions affecting the ADF.

METHODOLOGY OF THE DAS

Questionnaire

The DAS consists of four parallel questionnaires, one for each Service and one for Civilians. The Civilian form excludes ADF-specific items and includes a number of items relevant to APS personnel only. Terminology in each form was Service-specific. Each survey contained a range of personal details/demographic items including gender, age, rank, information on deployments, specialisation, branch, Group, years of Service, education level, postings/promotion, and family status (44 items for Navy, 40 for Army and Air Force, 35 for Civilians). Navy personnel received additional questions regarding sea service. The survey forms contained 133 attitudinal items (some broken into parts) for Service personnel and 122 for Civilians. As in previous iterations, respondents were given the opportunity to provide written comments at the end of the survey.

As directed by the Defence Committee, a number of changes were made to the survey items through discussion in the ASRP. This refinement process attempted to balance the maintenance of sufficient items for gathering trend data against reducing the number of items to decrease the length of the survey. While a number of items were excluded because they were no longer relevant or appeared to duplicate other questions, further items were added to address issues previously excluded. The new additions included items on Wellbeing, Internal Communication, Security, Occupational Health and Safety, and Equity and Diversity (which had been included in the 1999 iteration of the survey). Further demographic items were also added regarding work hours (predictability of, and requirement to be, on-call) as well as awareness of Organisational Renewal and the Defence Strategy Map. The additions resulted in more items being included in the 2002 survey than in the 2001 version; however, the total number was still lower than in the original 1999 questionnaire.

Attitudinal items were provided with response options on a five-point scale where one equalled 'Strongly Disagree' and five equalled 'Strongly Agree' (a number of the items were rated on satisfaction or importance scales rather than the more common agreement scale).

Sample

The sample for the Defence Attitude Survey is typically stratified by rank; however, concerns had been raised by Group Heads that Groups were not being representatively sampled. Thus, for the 2002 sample, the thirty percent representation of the organisation was stratified by both rank and Group. Recruits and Officer-Cadets were not included in the sample, as per the 2001 administration. Upon request, the whole of the Inspector General's Department was surveyed to provide sufficient numbers for reporting on this small Group.

Administration

The survey was administered as a 'paper and pencil' scannable form and employed a 'mail-out, mail-back' methodology. For a selection of personnel in the Canberra region, where correct e-mail addresses could be identified, the survey was sent out electronically. This methodology allowed the survey to be completed and submitted on-line, or printed out and mailed back in the 'paper and pencil' format.

Due to declining response rates for surveys and inaccuracies encountered in address information, additional attempts were made to ensure that survey respondents received their surveys and were encouraged to complete them. Surveys were grouped into batches to be delivered to individual units. In coordination with representatives from each of the Service personnel areas, units were identified and surveys were sent to unit CO/OCs for distribution to sampled personnel, accompanied by a covering letter from the Service Chiefs. A number of issues were encountered in this process, including the fact that some COs are responsible for vast numbers of personnel (for example, HMAS Cerberus), and that this process entailed double handling of the survey forms.

Surveys were sent out directly from DSPPR in Canberra, with Civilian forms delivered via regional shopfronts as specified by pay locations. Completed questionnaires were returned via pre-addressed return envelopes directly to DSPPR.

Table 1 below outlines the response rate[22] by Service/APS. The response rate from 2001 is also included; the decline indicates that delivery via unit CO/OCs was not an improved methodology, and also highlights the operational commitments of personnel, particularly those in Navy.

[22] The response rate is calculated as the number of useable returns divided by the number of surveys mailed out minus the number of surveys returned to sender; for Navy, for example, 1532 / (4640 - 489) = 36.9%.

Table 1
                       Navy     Army     Air Force   APS      Total
Sent                   4640     6841     3461        5625     20567
Return to Sender       489      265      179         312      1245
Useable Returns        1532     2669     1808        3504     9513
Response Rate          36.9%    40.6%    55.1%       66.0%    49.2%
2001 Response Rate     52.0%    50.7%    60.9%       56.2%    54.5%
2001-2002 Difference   -15.1%   -10.1%   -5.8%       +9.8%    -5.3%

The DAS is a comprehensive organisational measurement tool that has provided very good advice to senior ADF management. Given the stability of the item set in the DAS, the inclusion of a set of wellbeing items in the DAS should provide an appropriate tool for the measurement of wellbeing in the ADF and of its possible links to broader organisational outcomes.

Wellbeing Items

The Australian Unity Wellbeing Index is a national measure of Australians' views on a range of economic and social indicators in Australia at both a personal and a national level[23]. The Index asks respondents their satisfaction with their:
• health,
• standard of living,
• achievements in life,
• personal relationships,
• sense of personal safety,
• community connectedness,
• future security, and
• overall satisfaction with life.

[23] Taken from the Introduction section of the Australian Unity Well Being Index web site, www.australianunity.com.au

These items were modified (after consultation with the ASRP) by the inclusion of an item on connectedness to the military community and the removal of the items on future security and personal safety, and were then incorporated into the ADO General Attitude Survey administered early in 2003. The inclusion of these items should allow the measurement of wellbeing in the ADF to be compared with that of the general population. It will also provide a standard set of wellbeing items that can be linked to other organisational measures.

Demographics

A randomly selected sample of 1,500 ADF cases (i.e., uniformed members) was taken from this data set to provide the analysis for this paper. Demographic data for the sample are:
• Gender - 87% male, 13% female.
• Age - mean 32.52 years, median 32 years.
• Service - RAN 26.5%, Army 43.9%, RAAF 29.6%.
• Length of Service - mean 11.73 years, median 11 years.
• Proportion having served on operations - 47.7%.
• Proportion in a recognised relationship - 61.6%.

RESULTS

Psychometric Properties of Wellbeing Items

Analysis of the distributions of the wellbeing items showed that all were non-normal, primarily due to negative skew; some also deviated from normality in terms of kurtosis, although this varied across items. All item distributions were dome-shaped and demonstrated some balance in their distribution.

A principal components analysis was conducted on the individual wellbeing items (i.e., excluding the overall item), which yielded, as expected, a single component with an eigenvalue greater than one (2.557) that accounted for 42% of the variance in the data set. The overall item was then regressed on the individual wellbeing items, yielding a solution that accounted for 45.3% of the variance in the overall item, a significant result in which all items contributed significantly (all at alpha = 0.01, except one item with a p value of 0.012). As a result, the individual wellbeing items were summed to produce a Total Wellbeing scale for use in further analysis.

Overall, the wellbeing items proved to be a consistent set of items, close to normally distributed, that together adequately predicted the overall item.
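For the interested reader, a minimal sketch of these two analysis steps (component extraction with Kaiser's eigenvalue-greater-than-one rule, then regression of the overall item on the individual items); the randomly generated data stand in for the DAS responses and will not reproduce the figures above:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(42)
    latent = rng.normal(size=(1500, 1))                     # shared wellbeing factor
    items = latent + rng.normal(scale=1.2, size=(1500, 6))  # six individual items
    overall = latent.ravel() + rng.normal(scale=1.0, size=1500)

    # Step 1: principal components of the standardised items; Kaiser's rule
    # keeps components with eigenvalue > 1.
    z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
    pca = PCA().fit(z)
    print("components with eigenvalue > 1:", (pca.explained_variance_ > 1).sum())
    print("variance explained by first component:", pca.explained_variance_ratio_[0])

    # Step 2: regress the overall item on the individual items (R^2 reported).
    reg = LinearRegression().fit(items, overall)
    print("R^2:", reg.score(items, overall))

    # Step 3: sum the items into a Total Wellbeing scale.
    total_wellbeing = items.sum(axis=1)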

Comparison with the Australian Population

Having determined the psychometric adequacy of the wellbeing item set, the results from the ADF sample were then compared with those of the general population. The results (percent satisfied) are contained in Table 2 below.

Table 2
Item                         ADF     07/02   09/02   02/03   04/03   07/03
Satisfaction overall         69.5    78.1    77.2    77.7    78.2    78.2
Standard of living           78.0    77.7    76.5    77.3    77.7    77.8
Health                       73.3    75.4    74.9    75.8    76.0    75.2
Achievements                 71.7    74.8    74.0    74.9    75.0    74.8
Personal relationships       71.3    79.2    79.0    80.6    80.6    81.3
Links - general community    49.6    70.7    69.5    70.0    71.0    71.2
Links - military community   46.9    -       -       -       -       -

Examination of the data in the table shows that the ADF sample is very close to the general community on several items and rates higher in terms of satisfaction with standard of living. The ADF sample rates are, however, noticeably below the general community in terms of overall satisfaction, achievements, personal relationships, and links to the general community.

Comparison within the ADF

The wellbeing items were then compared across rank categories (PTE-LCPL, CPL-WO1, LT-CAPT, MAJ-BRIG) and Service (RAN, Army, RAAF) within the ADF by cross-tabulation, calculation of chi-square values and examination of the standardised residuals in each cell (a sketch of the procedure follows the list). This yielded the following results:
• A significant difference across Services in terms of links to the general community, with RAN members more likely to indicate that they were unsatisfied with this.
• A significant difference across rank categories for overall life satisfaction, with junior ranks more likely to be dissatisfied and senior ranks more likely to be satisfied.
• A significant difference in satisfaction with standard of living, again with junior ranks more likely to be dissatisfied.
• A similar result for satisfaction with achievements.
• A similar result for satisfaction with personal relationships.
• A similar result for satisfaction with links to the military community.
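A minimal sketch of the cross-tabulation procedure, with invented counts (not DAS data); scipy's chi2_contingency returns the expected frequencies needed for the standardised residuals:

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical rank-category x response cross-tabulation for one item
    # (columns: dissatisfied / neutral / satisfied).
    table = np.array([
        [120, 80, 200],   # PTE-LCPL
        [ 90, 85, 260],   # CPL-WO1
        [ 30, 40, 150],   # LT-CAPT
        [ 15, 25, 120],   # MAJ-BRIG
    ])
    chi2, p, dof, expected = chi2_contingency(table)
    # Standardised residuals flag the cells driving a significant result.
    residuals = (table - expected) / np.sqrt(expected)
    print(f"chi2={chi2:.1f}, p={p:.4f}, dof={dof}")
    print(residuals.round(2))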

Comparison with Organisational Markers

Total Wellbeing scores were correlated with a number of organisational markers to establish any relationships. This yielded the following correlations:
• Confidence in immediate superior - 0.17
• Confidence in senior officers/staff - 0.23
• Confidence in senior Defence leadership - 0.25
• I like the work in my present position - 0.26
• There are insufficient personnel in units - -0.03 (NS)
• Adequate opportunities to clear leave - 0.17
• Impact of work on family responsibilities - -0.31
• Actively looking to leave the service - -0.24
• Satisfaction with salary - 0.235
• Personal morale - 0.43
• Unit morale - 0.32

Total Wellbeing scores were then compared across categories of number of deployments, time since last deployment, marital status and intention to leave; this yielded the following results (see the sketch after this list):
• There was no significant difference in total wellbeing across the number of deployments a member had.
• There was a significant effect of time since deployment on wellbeing, with a general increase in wellbeing as time since last deployment increases.
• There was a significant effect of marital status on wellbeing. Post-hoc tests (Scheffé) indicated that the major differences were between those who were in a recognised relationship and those who were not in a relationship.
• When total wellbeing scores were compared across intention-to-leave categories, a significant effect was identified; post-hoc comparisons showed very clearly that those intending to remain until retirement, or those who have not considered leaving, have a higher level of wellbeing than those who have considered leaving.
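A minimal sketch of this kind of group comparison (one-way analysis of variance across intention-to-leave categories); the group means and sizes are invented for illustration, and a significant F would be followed by Scheffé post-hoc tests as in the paper:

    import numpy as np
    from scipy.stats import f_oneway

    # Hypothetical Total Wellbeing scores per intention-to-leave category.
    rng = np.random.default_rng(7)
    stay_to_retirement = rng.normal(loc=24.0, scale=4.0, size=300)
    not_considered     = rng.normal(loc=23.5, scale=4.0, size=500)
    considering        = rng.normal(loc=21.0, scale=4.0, size=400)

    f_stat, p_value = f_oneway(stay_to_retirement, not_considered, considering)
    print(f"F={f_stat:.2f}, p={p_value:.2g}")
    # Locate the differing groups with post-hoc pairwise comparisons
    # (the paper used Scheffé tests).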

DISCUSSION

The results presented above indicate that the wellbeing items from the Australian Unity Wellbeing Index are a useable measure for the ADF. The psychometric properties of the scale constructed from the items are sound, and the scale correlates well with the single wellbeing item (satisfaction with life overall).

Overall wellbeing in the ADF is lower than in the general population, with the main differences in the areas of connection with the community, personal relationships, and satisfaction with achievements. Within the ADF, the main differences occur for junior members in these areas and also in satisfaction with their standard of living. This is an important result, as it indicates where the priority of effort should be placed to improve wellbeing overall.

Comparisons across a number of important organisational indicators (intention to leave, time since last deployment) indicate that wellbeing may have a causative effect on these. Correlation analysis with other organisational markers shows that wellbeing has a good relationship with a number of markers. The bi-directional nature of these analyses means that there is significant scope to further examine these variables in a more complex model of wellbeing and organisational attitudes and behaviour. There is significant scope in particular for model-testing procedures such as structural equation modelling.

These results indicate that wellbeing is a concept that can be measured with some utility in the ADF and with immediate applicability, particularly in terms of comparison with the general community and in terms of targeting service delivery and policies to better meet wellbeing needs.

CONCLUSION

Wellbeing is a concept of great promise as an organisational tool. It has a clear relationship with a key organisational behaviour (intention to leave), and its relationship with a number of important organisational markers suggests that there is significant scope for further analysis, and for model testing in particular.


Why One Should Wait Before Allocating Applicants

LtCol Psych Francois J. LESCREVE[24]
Belgian Defense Staff
Human Resources Directorate General
Accession Policy Research & Technology Section

[24] Contact author at Lescreve@skynet.be

Abstract

An important policy issue in the design of an accession system for military personnel is the question of how and when allocation decisions are made. Two major options are available: immediate allocation or batch classification. With immediate allocation, an applicant knows which occupation he or she is accepted for on the selection day itself. With batch classification, a number of applicants are assessed during a certain period of time and their allocation is decided later. This allows the applicants to be compared and assigned in a smarter way.

It can be demonstrated that overall recruit quality is significantly better when batch classification is chosen. Yet immediate allocation is the preferred method in many countries. A major reason for this is that it seems more appropriate to give candidates certainty about their application without delay.

In this paper we take a closer look at the relationship between the quality of the enlisted group of recruits and the time between assessment and the allocation decision. Empirical data are used to simulate different conditions. It is found that waiting before making allocation decisions yields better recruit quality. In addition, the setting used shows a relationship between time before classification and recruit quality that approximates a logarithmic function. This indicates that even minor departures from immediate classification in favor of batch classification can yield significant improvements in recruit quality.

Introduction

Military organizations need to enlist recruits for different trades in order to compensate for departures. To reach this goal, recruiting and selection & classification (S&C) systems are set up. Recruiting systems are primarily aimed at attracting large numbers of quality people. S&C systems deal with a number of applicants characterized by varying aptitudes, interests and other pertinent attributes on the one hand, and a number of vacancies on the other. Usually, a certain number of positions are available for different trades. The trades or entries often require different levels of achievement on a number of aptitudes. During a selection phase, the different attributes of an applicant are assessed. Together with the applicant's preferences or interest for the different entries, these selection measures make it possible to quantify the appropriateness of assigning a candidate to a particular trade. How the actual assignment decision is made varies from one S&C system to another. We can roughly distinguish two main systems: immediate systems and batch classification systems. In immediate systems, decisions are typically made one at a time, immediately after the assessment phase, that is, while the applicant is still present in the selection facility. In batch classification systems, a relatively large number of applicants is processed simultaneously. By comparing the applicants to each other before assigning them, these systems typically yield significantly better recruits for a given set of applicants and vacancies 25. Yet it is often seen that military organizations prefer immediate assignment systems to smarter batch classification. This paradox is probably due to the major drawback of batch classification systems, namely that applicants need to wait a certain time before knowing the outcome of their application. This is not very client-friendly and bears the risk that applicants continue their quest for a job elsewhere.

24 Contact author at Lescreve@skynet.be

This paper will focus on the relationship between the quality of the enlisted recruits and the timing of classification. If it is confirmed that, for a given applicant pool and a given set of vacancies, better recruit quality can be reached by assigning applicants through batch classification, the question arises of how long one needs to wait in order to benefit from the increased quality. This research question is primarily of interest for continuous recruiting systems. A recruiting system can be considered continuous when a person can apply at any time and recruits are enlisted frequently, usually on a monthly basis. Other systems, such as those used for the recruitment of officers, are not continuous but annual or semi-annual. Typically, the candidates have to apply before a certain date and batch classification is used once, after all applicants have been examined.

Method

In order to assess the influence of waiting time before classification on recruit quality, the following method was used.

Dataset

A sufficiently large dataset is needed to conduct this research. Since we also wanted to include the applicants' preferences, this limited our choice among available datasets. It was therefore decided to aggregate data originating from different recruiting sessions. As will become clear below, this does not introduce any bias into the study. The dataset is composed of the Belgian NCO applicants who applied from 1999 to 2003 and the vacancies available to them 26. The measurements for each applicant are described in Enclosure 1. The vacancies encompass 27 different trades, listed in Table 1.

For each trade, the aptitude of the applicants was computed as a weighted sum of scores. The weights vary from one trade to another. To qualify for a trade, an applicant needs to meet certain requirements. These pertain to categorical measurements, such as a medical profile, and/or to metric measurements for which minima are set. A person not meeting the requirements is given an aptitude of zero for the entry. To illustrate the differences in weights used for different entries, Figure 1 shows the weights used for the entries 'Infantry' and 'Air Traffic Controller'. In both cases the weights add up to one. The meaning of the variables can be found in Enclosure 1.

25 Lescreve, F. Improving Military Recruit Quality Through Smart Classification Technology. Report of an International Collaborative Research Sponsored by the US Office of Naval Research, October 2002.

26 In total 1529 vacancies encompassing 27 trades were available to 3366 applicants who were eligible for at least one trade.


[Figure 1. Used weights to compute aptitudes, example: Infantry and Air Traffic Control. Bar chart of the weights (approximately -0,05 to 0,45) given to the variables ATC, ELEC, ENG_G, ENG_T, KAHO, MECH, PHYS and PINP for each of the two entries.]

As was mentioned earlier, the dataset also includes the applicants' preferences for the entries. In Belgium, NCO applicants are asked to express their preferences on a scale from 99 (first choice) down to 1. They cannot give the same preference twice. They can also give a zero preference to entries they reject, which prevents them from being assigned to those trades.

Given the applicant's aptitude and preference for an entry, a payoff value is then computed 27. This value combines aptitude and preference in order to express the overall appropriateness of assigning the applicant to that particular entry. Applicants without a non-zero payoff for at least one entry, i.e. those who did not qualify for any entry, were removed from the dataset; 3366 applicants remained. Only the payoff, aptitude and preference scores are used in this research. The first two scores are standardized per trade with a mean of 500 and a standard deviation of 200. Standardized payoffs are limited to the range 0 to 999. The preferences are untransformed and range from 99 to 0.
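To make the scoring concrete, here is a minimal Python sketch of the payoff computation and the per-trade standardization, assuming hypothetical `aptitude` and `preference` arrays for one trade. It applies the formula given in footnote 27 and treats a zero preference as blocking assignment, as described above.

```python
import numpy as np

def payoff(aptitude: np.ndarray, preference: np.ndarray) -> np.ndarray:
    """Combine aptitude (weighted-sum scale, 0 if not qualified) and
    preference (99 = first choice, 0 = rejected) into a payoff, using
    the formula from footnote 27. A zero preference or zero aptitude
    yields a zero payoff, which blocks assignment to the trade."""
    raw = aptitude * ((preference / 99.0) * 0.6 + 0.4)
    return np.where(preference > 0, raw, 0.0)

def standardize(scores: np.ndarray) -> np.ndarray:
    """Rescale one trade's scores to mean 500 and SD 200, clipping to
    the 0-999 range used for standardized payoffs. (Whether the zero
    scores of non-qualified applicants enter the standardization is
    not specified in the paper; this sketch includes them.)"""
    z = (scores - scores.mean()) / scores.std()
    return np.clip(500.0 + 200.0 * z, 0.0, 999.0)
```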

Table 1 summarizes the complete dataset. The columns represent:

• Job ID: an identification number for each entry available to the applicants;
• MOS: a 'Military Occupation Specialty' code;
• Job Title: a description of the entry;
• Positions: the total number of individual positions for the entry;
• Qualified: the number of applicants in the dataset meeting the requirements for the entry and for whom an aptitude score was computed;
• Preference: the number of applicants in the dataset who are willing to be assigned to the entry (preference score > 0);
• Non-zero Payoff: the number of applicants in the dataset who can be assigned to the entry; to be included in that number, an applicant must meet the requirements to qualify and express a preference > 0 for the entry;
• Ratio: the number of applicants who can be assigned to the entry per available position.

27 The formula used is: Payoff = Aptitude × [((Preference / 99) × 0.6) + 0.4]

Job ID  MOS  Job Title                            Positions  Qualified  Preference  Non-zero Payoff  Ratio
1       100  Aircraft mechanics Army              14         2746       559         509              36,36
2       102  Electro mechanics Army               27         2385       384         335              12,41
3       114  Signal electronics Army              45         2425       442         374              8,31
4       116  Armaments electronics Army           9          2425       316         274              30,44
5       140  Infantry                             252        1730       2005        1156             4,59
6       142  Armor                                84         1703       1914        1056             12,57
7       144  Artillery                            99         2837       1857        1711             17,28
8       146  Engineer                             46         1703       1415        788              17,13
9       150  Signal                               101        2837       1347        1056             10,46
10      152  Supply (Services)                    120        3072       1411        1235             10,29
11      212  Electricity Electronics Air Force    157        2591       478         443              2,82
12      240  Air Traffic Controller               98         101        668         100              1,02
13      244  Computer operator Air Force          97         2591       759         660              6,80
14      250  Airfield defense Air Force           60         2125       1385        898              14,97
15      320  Computer operator Navy               16         1065       100         61               3,81
16      322  Sonar operator Navy                  10         1065       87          54               5,40
17      154  Transport and movement control Army  18         3072       849         759              42,17
18      326  Radio operator Navy                  12         1065       84          53               4,42
19      364  Signal Navy                          15         1318       172         131              8,73
20      156  Cook Army                            34         3092       558         480              14,12
21      318  Electrician Navy                     15         1190       108         86               5,73
22      358  Detector Navy                        16         618        147         73               4,56
23      200  Aircraft mechanics Air Force         76         2746       421         384              5,05
24      202  Electro mechanics Air Force          34         2385       341         299              8,79
25      246  Administration Air Force             26         3360       749         749              28,81
26      420  Medical support personnel            39         2837       521         370              9,49
27      312  Radar maintenance Navy               9          1065       75          51               5,67
             Sum                                  1529

Table 1

Subsets

From the described dataset, 60 subsets were drawn. The persons were randomly sorted and then sequentially assigned to subsets of 56 persons each (adding up to 3360 persons). On the vacancy side, the available vacancies were distributed proportionally over the 60 subsets. Since the number of vacancies per entry must be an integer, it was ensured that rounding effects did not affect some subsets more than others in a systematic way. The original number of vacancies per trade for each subset is given in Enclosure 2.
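A minimal sketch of this subset construction follows. All names are hypothetical, and a largest-remainder style rule stands in for the paper's unspecified procedure for avoiding systematic rounding effects.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is arbitrary

def make_subsets(n_applicants: int = 3366, n_subsets: int = 60, size: int = 56):
    """Randomly sort the applicants, then deal them sequentially into
    60 subsets of 56 persons (60 * 56 = 3360; the remainder is unused)."""
    order = rng.permutation(n_applicants)[: n_subsets * size]
    return order.reshape(n_subsets, size)   # row i = applicant indices of subset i

def apportion_vacancies(positions: np.ndarray, n_subsets: int = 60):
    """Spread each trade's vacancies proportionally over the subsets.
    The integer remainders are scattered at random so that rounding
    does not systematically favour particular subsets."""
    base = positions // n_subsets
    alloc = np.tile(base, (n_subsets, 1))
    for j, extra in enumerate(positions % n_subsets):
        alloc[rng.choice(n_subsets, size=extra, replace=False), j] += 1
    return alloc    # alloc[i, j] = vacancies of trade j allotted to subset i
```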

Procedure

Why did we divide the original dataset into subsets? The reason is that this helps us simulate what is of interest: how does the time between classifications influence recruit quality? Of course, time in itself is not really the point. In continuous recruiting systems, time correlates with the number of assessed applicants, and it is the increased number of applicants classified simultaneously that matters for recruit quality. From the applicants' or the organization's point of view, however, it is the time between classifications that matters; hence our interest in it. To understand the approach, consider a hypothetical S&C setting in which, by chance, 56 persons are found eligible for at least one entry each day. This system would wait until the end of the day and then use a batch classification system to assign these applicants to the vacancies that need to be filled that day. This corresponds to running the classification for the persons and vacancies in one subset. This procedure can be repeated each day or, expressed more generically, subset by subset. A variant of this approach would classify the applicants not every day but every two days (the persons and vacancies of two subsets added together), or every week (the persons and vacancies of 5 subsets together, assuming a 5-working-day week), and so on. A different S&C setting may of course 'produce' more or fewer eligible applicants per time unit. The principle remains, however: in continuous recruiting settings, batch classification can be done more or less frequently.

The dataset was divided into 60 parts because 60 is divisible by 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30 and 60 with no remainder. This proved very convenient for this research.

It was also decided to add the unfilled vacancies from one classification to the vacancies for the next one, since that seemed the most logical way in which continuous S&C systems function.

In summary: the independent variable for this research is the number of subsets that are processed simultaneously, and the dependent variables are the mean payoff, aptitude and preference obtained once all 60 subsets have been processed. The payoff value of a person for the job he or she is assigned to is the operationalized definition of his or her quality for the organization.
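Under these definitions, the simulation of one condition reduces to a loop like the sketch below (hypothetical Python; `classify` is a placeholder for the batch classification step described in the next subsection, unfilled vacancies roll over, and non-assigned applicants drop out).

```python
import numpy as np

def run_condition(payoffs, subsets, vacancies, k, classify):
    """Process the 60 subsets k at a time (k = 1, 2, 3, ..., 60).

    payoffs   : (n_applicants, n_trades) standardized payoff matrix
    subsets   : (60, 56) array of applicant indices per subset
    vacancies : (60, n_trades) vacancies allotted to each subset
    classify  : callable(payoffs, applicant_ids, open_slots)
                -> list of (applicant, trade) assignments
    """
    assignments = []
    carry = np.zeros(vacancies.shape[1], dtype=int)   # unfilled slots roll over
    for start in range(0, len(subsets), k):
        batch = subsets[start:start + k].ravel()       # applicants in this run
        open_slots = vacancies[start:start + k].sum(axis=0) + carry
        assigned = classify(payoffs, batch, open_slots)
        assignments.extend(assigned)
        trades = np.array([t for _, t in assigned], dtype=int)
        filled = np.bincount(trades, minlength=len(open_slots))
        carry = open_slots - filled                    # non-assigned applicants drop out
    return assignments   # mean payoff, aptitude and preference follow from these
```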

Classification method

Probably the most crucial aspect of this research setting has not been touched on yet: the classification method itself. We used a smart classification system developed by the author, called the 'Psychometric Model'. This method has been in use with the Belgian Defense since 1995. Its algorithm maximizes the sum of payoffs over the applicants assigned to the different jobs. The Psychometric Model allows a great deal of fine-tuning, such as setting Defense priorities, giving coefficients to the different classes of categorical data, setting minimum preferences for assigning persons to jobs, including study background in the computation of payoffs, etc. For reasons of simplicity, these possibilities were not used in the current research.
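The Psychometric Model itself is not reproduced here. As an illustrative stand-in for its payoff-sum maximization, a standard linear assignment solver (for example SciPy's `linear_sum_assignment`) can be applied to one batch by expanding each trade into as many columns as it has open slots. This sketches the optimization principle only, without the fine-tuning options just mentioned; such a `classify` could be passed to the `run_condition` loop sketched above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def classify(payoffs, applicant_ids, open_slots):
    """Maximize the summed payoff of one batch: a stand-in for the
    Psychometric Model, not the model itself. Each trade contributes
    one column per open slot, so a trade with 4 slots can receive up
    to 4 applicants."""
    trade_of_col = np.repeat(np.arange(len(open_slots)), open_slots)
    score = payoffs[np.ix_(applicant_ids, trade_of_col)]
    rows, cols = linear_sum_assignment(score, maximize=True)
    # Keep only positive-payoff pairings: a zero payoff means the
    # applicant is ineligible for, or has rejected, that trade.
    return [(int(applicant_ids[r]), int(trade_of_col[c]))
            for r, c in zip(rows, cols) if score[r, c] > 0]
```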

Results

The following graphs show the main results of this research. In the next three graphs, the abscissa represents the condition of the independent variable: the number of subsets processed simultaneously. A value of one means that the subsets were processed one by one, so that 60 classifications were performed to process the whole dataset. Table 2 shows, for each condition, the number of subsets processed simultaneously and the number of classifications needed to process the whole dataset.

Condition                           1   2   3   4   5   6   7   8   9  10  11  12
# Subsets processed simultaneously  1   2   3   4   5   6  10  12  15  20  30  60
# Performed classifications        60  30  20  15  12  10   6   5   4   3   2   1

Table 2



The ordinates indicate the average value (payoff, aptitude or preference) for all applicants assigned in the different conditions. The values are given in Table 3. Note that in some conditions not all 1529 available jobs were filled.

Subsets processed   Persons    Average   Average    Average
simultaneously      assigned   payoff    aptitude   preference
 1                  1525       674,92    630,61     89,75
 2                  1528       695,71    646,77     90,88
 3                  1526       700,67    651,00     91,18
 4                  1528       704,56    653,16     91,49
 5                  1529       705,15    652,80     91,81
 6                  1528       707,70    654,91     91,97
10                  1529       710,12    656,15     92,30
12                  1529       710,49    655,95     92,33
15                  1529       712,56    657,72     92,30
20                  1529       713,87    658,57     92,61
30                  1529       716,41    660,33     92,80
60                  1529       719,21    662,78     93,00

Table 3

[Figure 2. Average Payoff of Enlisted Persons as a Function of Classification Frequency. Average payoff (roughly 670 to 730) plotted against the number of subsets processed simultaneously (1 to 60).]

[Figure 3. Average Aptitude of Enlisted Persons as a Function of Classification Frequency. Average aptitude (roughly 625 to 665) plotted against the number of subsets processed simultaneously (1 to 60).]

[Figure 4. Average Preference of Enlisted Persons as a Function of Classification Frequency. Average preference (roughly 89,5 to 93,5) plotted against the number of subsets processed simultaneously (1 to 60).]


In the next, rather busy, graph the average payoff is given for each entry separately. What is important to note for the moment is that the curvilinear relationship shown in Figure 2 does not seem to apply to all entries.

[Figure 5. Average Payoff of Enlisted Persons per Trade as a Function of Classification Frequency. Average payoff of assigned applicants (roughly 300 to 1000) plotted against the number of subsets processed simultaneously (1 to 60), with one line per trade (J1 to J27).]

Discussion

To begin with, it is clear that the conditions processing larger numbers of subsets simultaneously yield better average payoffs than those processing smaller numbers. The magnitude of the difference might seem rather small at first sight: the difference between the first and last conditions is about 45 points on a scale with a mean of 500 and a standard deviation of 200. Yet, given that all conditions used the same applicant pool, the same vacancies, the same eligibility rules and the same classification tool, and considering that the difference pertains to the average of more than 1500 persons, the effect is in fact quite important. Put another way, if one considers the measures that would otherwise be needed to yield a similar increase in average recruit quality, such as increasing the selection ratio or improving applicant pool quality through recruiting actions, one would most probably conclude that waiting some time before classifying the applicants is a very cheap and effective option.

Secondly, it is quite interesting to note the curvilinear relationship between average payoff and the number of subsets processed simultaneously. The relationship approximates a logarithmic function: the steepest increase in payoff occurs at the left side of the abscissa and then flattens out.

The second and third graphs show very similar relationships. This means that both aptitude and preference benefit from classification in larger groups.
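As a quick check of how closely these averages follow a logarithmic curve, one can fit payoff ≈ a + b·ln(subsets) to the twelve conditions of Table 3 (a two-parameter model assumed here purely for illustration):

```python
import numpy as np

# Averages from Table 3: subsets processed simultaneously vs. average payoff
subsets = np.array([1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60])
payoff = np.array([674.92, 695.71, 700.67, 704.56, 705.15, 707.70,
                   710.12, 710.49, 712.56, 713.87, 716.41, 719.21])

# Least-squares fit of payoff = a + b * ln(subsets)
b, a = np.polyfit(np.log(subsets), payoff, deg=1)
fitted = a + b * np.log(subsets)
r2 = 1.0 - ((payoff - fitted) ** 2).sum() / ((payoff - payoff.mean()) ** 2).sum()
print(f"payoff ~ {a:.1f} + {b:.2f} * ln(subsets), R^2 = {r2:.3f}")
```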


What can we infer from these results? As indicated earlier, we used subsets to simulate the time between successive classifications. As time goes by, the number of eligible applicants increases, as does the number of vacancies that need to be filled (assuming a proportional distribution of vacancies over a large time period); aggregating different subsets simulates this. From the obtained results it is therefore possible to conclude that increasing the time between classifications yields better recruit quality. This is because the number of degrees of freedom for assigning persons to jobs increases. Provided a classification tool is used that capitalizes on those degrees of freedom, the outcome can be expected to improve as their number grows. Of course, the average payoff obtained by classification is constrained by the payoff distribution in the processed subsets. It is therefore normal that the effect of increased degrees of freedom flattens out.

If we accept that recruit quality does improve when we wait before performing batch classification, it is very interesting to note, at least in this particular case, that an adequate batch classification system does not need to wait very long before yielding significantly better results. This finding is of great practical consequence. Since one of the major reasons not to perform batch classification is that organizations are reluctant to let applicants wait before knowing the outcome of their application, it is good to know that they should not have to wait for long!

It has to be said that the curvilinear relationship found does not hold for every individual entry. Figure 5 shows some entries for which there is no improvement at all, or for which conditions processing more subsets simultaneously yield poorer results than conditions with fewer subsets. Analysis of the data shows that these entries either have very low selection ratios (for instance Job 12: 100 persons with a non-zero payoff for 98 vacancies) or very low numbers of vacancies (for instance Job 16, with only 10 vacancies). Further research should take a closer look at the reasons why some trades do not benefit from classification in larger groups.

A major difficulty in S&C research is controlling all parameters. A vast number of elements condition a particular S&C situation, which practically prevents us from designing useful theoretical models. To name just a few of these parameters:

• the number of entries;
• the number of eligible applicants per entry;
• the number of positions per entry;
• the interactions between eligibility for different entries;
• the way in which new vacancies are added to the S&C system;
• the way in which new candidates are added to the S&C system;
• the way in which enlistment dates are managed;
• the assignment decision process used (the classification method).

The present S&C research is no exception. In order to conduct it, a number of choices had to be made. These choices may have influenced the results and may limit the conclusions to a certain extent.

One choice was to add the unfilled vacancies of one classification to the vacancies of the following one. This seemed to reflect what most S&C system managers would do. Had this not been done, a number of vacancies would have remained unfilled, which would mainly have had an adverse impact on the conditions where only one or a few subsets are processed simultaneously. As Table 3 shows, even with unfilled vacancies added to the next classification, not all conditions are able to fill all 1529 vacancies. In fact, the problem of unfilled vacancies only occurs in the first six conditions, that is, when only a few subsets are processed simultaneously.

Another choice was not to carry applicants who were not assigned to a job in one classification over to the next one. That decision might be somewhat more controversial. Transferring all non-assigned applicants to the next classification increases the number of degrees of freedom for that classification and should, in theory, result in a better outcome. The effect of the added degrees of freedom may however be tempered by the fact that the added applicants can be expected to be of lesser quality, since the better applicants were assigned to a job in the previous classification. For our research this would probably mean that the differences in recruit quality found between the conditions would be smaller had we transferred the non-assigned applicants. The main reason we decided not to transfer them is that doing so would also increase the time between their assessment and the moment they learn the result of their application. In other words, it would bias the research design by blurring the differences between the conditions!

An important choice relates to the classification system used. As mentioned earlier, the system we used, the Belgian 'Psychometric Model', capitalizes on the number of degrees of freedom available in the S&C setting to produce a high-quality classification. It is quite obvious that less powerful classification methods will yield less marked results. The real question here is why S&C managers would prefer to stick with less powerful methods.

In our setting, we used subsets of equal size. In the first condition each classification was done for 56 persons at a time, in the second 112 persons were processed simultaneously, and so on. It is however quite unlikely that in practice each selection day or period would produce exactly the same number of persons eligible for at least one trade. It is not clear what the influence of unequal subset sizes would be, but it seems likely that conditions processing larger numbers are less subject to non-representative variance.

The present research effort only looks at one side of the problem: the influence of waiting time on recruit quality. There is of course another side: the influence of waiting time on applicant behavior. Daily practice indicates that when applicants have to wait after their assessment to learn the outcome of their application, some of them lose interest or continue their search for a job elsewhere. This is bad news for the organization. It is therefore important to study the interaction between the two: the increased recruit quality obtained by postponing batch classification and the loss of applicants when doing so. As mentioned earlier, the complexity of most S&C systems makes it quite impossible to give general advice. However, given the importance of recruit quality for the military, this should be studied! Another aspect that deserves closer attention is that smart batch classification systems are better able to respect the applicants' preferences when processing larger groups. This means that by waiting somewhat longer, the applicants who are assigned to jobs are more likely to get a trade they like 28. This might have a positive influence on early turnover and other relevant aspects of training.

Further research should also take a closer look at the left end of the graphs presented here. In the first condition of our research design, we classified subsets one at a time, each subset representing 56 persons. Given the quasi-logarithmic shape of the curve, it would be interesting to look at even smaller subsets. In the most extreme condition a subset would contain only one person, and the batch classification method would then in effect be equivalent to immediate classification. Such a research design would most probably yield more evidence as to why immediate classification should be banned from S&C practice.

28 On the obvious condition that the classification system includes applicant preferences.

Conclusion

We found improved recruit quality in conditions where more subsets were processed simultaneously. This indicates that performing batch classification on larger groups is beneficial to the quality of recruits. In practice this means that, given a set of applicants, vacancies and eligibility rules, waiting some time before assigning applicants to jobs is beneficial to the military organization.

The curvilinear relationship found indicates that a significant improvement in recruit quality can be obtained without having to wait long before classification.

Further research is needed to:
– confirm the results in different settings;
– understand the mechanisms causing the differential impact on different entries;
– model the impact of waiting time on applicant behavior and relate this to the obtained improvement in quality.

References

• Alley, W. E. Recent advances in classification theory and practice. In M. G. Rumsey, C. B. Walker & J. Harris (Eds.), Personnel Selection and Classification. Hillsdale, NJ: Lawrence Erlbaum Associates, 1994.
• Burke, E., Kokorian, A., Lescrève, F., Martin, C., Van Raay, P. & Weber, W. Computer based assessment: A NATO survey. International Journal of Selection and Assessment, 1995.
• Darby, M., Grobman, J., Skinner, J. & Looper, L. The Generic Assignment Test and Evaluation Simulator. Human Resources Directorate, Manpower and Personnel Research Division, Brooks AFB, 1996.
• Green, B. & Mavor, A. (Eds.). Modeling Cost and Performance for Military Enlistment. Washington, DC: National Academy Press, 1994.
• Hardinge, N. M. Selection of military staff. In N. Anderson & P. Herriot (Eds.), International Handbook of Selection and Assessment. Wiley, 1997, pp. 177-178.
• Keenan, T. Selection for potential: The case of graduate recruitment. In N. Anderson & P. Herriot (Eds.), International Handbook of Selection and Assessment. Wiley, 1997, p. 510.
• Kroeker, L. & Rafacz, B. Classification and Assignment within PRIDE (CLASP): A Recruit Assignment Model. US Navy Personnel Research and Development Center, San Diego, CA, 1983.
• Lawton, D. A review of the British Army potential officer selection system. In Proceedings of the 36th Annual Conference of the International Military Testing Association, 1994.
• Lescrève, F. A psychometric model for selection and assignment of Belgian NCOs. In Proceedings of the 35th Annual Conference of the Military Testing Association. US Coast Guard, 1993, pp. 527-533.
• Lescrève, F. The selection of Belgian NCOs: The psychometric model goes operational. In Proceedings of the 37th Annual Conference of the International Military Testing Association. Canadian Forces Personnel Applied Research Unit, 1995, pp. 497-502.
• Lescrève, F. The use of neural networks as an alternative to multiple regressions and subject matter experts in the prediction of training outcomes. Paper presented at the International Applied Military Psychology Symposium, Lisboa, 1995.
• Lescrève, F. The Psychometric Model for the Selection of NCOs: A Statistical Review. International Study Program in Statistics, Catholic University of Leuven, 1996.
• Lescrève, F. The Determination of a Cut-off Score for the Intellectual Potential. Center for Recruitment and Selection: Technical Report 1997-3.
• Lescrève, F. Data modeling and processing for batch classification systems. In Proceedings of the 39th Annual Conference of the International Military Testing Association, Sydney, 1997.
• Lescrève, F. Immediate assessment of batch classification quality. In Proceedings of the 37th Annual Conference of the International Military Testing Association, 1998. Internet: www.internationalmta.org
• Lescrève, F. Equating distributions of aptitude estimates for classification purposes. In Proceedings of the 40th Annual Conference of the International Military Testing Association, 2001.
• Lescrève, F. Why smart classification does matter. In Proceedings of the 41st Annual Conference of the International Military Testing Association, 2002. 29
• Lescrève, F. Improving Military Recruit Quality Through Smart Classification Technology. Report of an International Collaborative Research Sponsored by the US Office of Naval Research, October 2002.
• Robertson, I., Callinan, M. & Bartram, D. Organizational Effectiveness: The Role of Psychology. New York: John Wiley & Sons, 2002.
• Stevens, S. S. Mathematics, measurement and psychophysics. In S. S. Stevens (Ed.), Handbook of Experimental Psychology. New York: Wiley, 1951, pp. 1-49.

29 The paper was accidentally omitted from the proceedings. An electronic copy can be obtained at Lescreve@skynet.be


Enclosure 1

Metric Variables Included in the Dataset

Variable Name   Description
ATC             Air Traffic Controller suitability score
ELEC            Test score for Electricity
ENG_G           General English comprehension 30
ENG_T           Technical English comprehension
KAHO            Personality score
MECH            Composite score for Mechanics
PHYS            Physical fitness score
PINP            General intelligence score

Categorical Variables Included in the Dataset

Variable Name   Description
FAC_A           Medical profile: audio
FAC_C           Medical profile: color perception
FAC_E           Medical profile: emotional stability
FAC_G           Medical profile: navy
FAC_I           Medical profile: lower limbs
FAC_K           Medical profile: navy
FAC_M           Medical profile: mental capacity
FAC_O           Medical profile: navy
FAC_P           Medical profile: general
FAC_S           Medical profile: upper limbs
FAC_V           Medical profile: visual acuity
FAC_Y           Medical profile: navy
IN_COMB         Interest in combat activities
IN_GROUP        Interest in group activities
IN_OUTD         Interest in outdoor activities
IN_SPORT        Interest in sport activities
IN_TECH         Interest in technical activities

30 English is not a national language in Belgium but is considered to be of great importance for certain trades.


Enclosure 2

Vacancies per Trade (Columns) for Each Subset (Rows)

      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27  SUM
S1    0  1  1  0  4  2  2  1  2  2  2  2  1  1  0  0  1  0  1  0  0  0  1  0  0  0  1   25
S2    0  0  1  0  4  1  1  1  1  2  3  1  2  1  0  0  0  0  0  1  0  0  1  1  1  1  0   23
S3    0  1  1  0  4  2  2  1  2  2  2  2  1  1  0  0  1  0  1  0  0  0  1  0  0  0  1   25
S4    0  0  1  0  4  1  1  1  1  2  3  1  2  1  0  0  0  0  0  1  0  0  1  1  1  1  0   23
S5    1  1  1  0  4  1  2  1  2  2  2  2  1  1  0  0  1  0  1  0  0  0  1  0  0  0  1   25
S6    0  0  1  0  4  1  1  1  1  2  3  1  2  1  0  0  0  0  0  1  0  0  1  1  1  1  0   23
S7    1  1  1  0  4  1  2  1  2  2  2  2  1  1  0  0  1  0  1  0  0  0  1  0  0  0  1   25
S8    0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  0  0  0  0  1  0  1  1  1  1  1  0   23
S9    1  1  1  0  4  1  2  1  2  2  2  2  1  1  0  0  1  0  1  0  0  0  1  0  0  0  1   25
S10   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  0  0  0  0  1  0  1  1  1  1  1  0   23
S11   1  1  1  0  4  1  2  1  2  2  2  2  1  1  1  0  0  0  1  1  0  0  1  0  0  0  1   26
S12   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  0  0  1  0  1  0  1  1  1  1  1  0   24
S13   1  1  1  0  4  1  2  1  2  2  2  2  1  1  1  0  0  0  1  1  0  0  1  0  0  0  0   25
S14   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  0  0  1  0  1  0  1  1  1  1  1  0   24
S15   1  1  1  0  4  1  2  1  2  2  2  2  1  1  1  0  0  0  1  1  0  0  1  0  0  0  0   25
S16   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  0  0  1  0  1  0  1  1  1  1  1  0   24
S17   1  1  1  0  4  2  2  1  2  2  2  2  2  1  1  0  0  0  1  1  0  0  1  0  0  0  0   27
S18   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S19   1  1  0  0  4  2  2  1  2  2  2  2  2  1  1  0  0  0  1  0  0  0  1  0  0  0  0   25
S20   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S21   1  1  0  0  4  2  2  1  2  2  2  2  2  1  1  0  0  0  1  0  0  0  1  0  0  0  0   25
S22   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S23   1  1  0  0  4  2  2  1  2  2  2  2  2  1  1  0  0  0  1  0  1  0  1  0  0  0  0   26
S24   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S25   1  1  0  1  4  2  2  1  2  2  2  2  2  1  1  0  0  0  0  0  1  0  2  0  0  0  0   27
S26   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S27   1  1  0  1  5  2  2  1  2  2  2  2  2  1  1  0  0  0  0  0  1  0  2  0  0  1  0   29
S28   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S29   1  0  0  1  5  2  2  1  2  2  2  2  2  1  1  0  0  0  0  0  1  0  2  0  0  1  0   28
S30   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25
S31   1  0  0  1  5  2  2  1  2  2  2  2  1  1  1  0  0  0  0  0  1  0  2  0  0  1  0   27
S32   0  0  1  0  4  1  1  0  1  2  3  1  2  1  0  1  0  1  0  1  0  1  1  1  1  1  0   25

[Rows S33 through S60 are garbled in the source extraction; only the column totals below are recoverable.]

Sum  14 27 45  9 252 84 99 46 101 120 157 98 97 60 16 10 18 12 15 34 15 16 76 34 26 39  9  1529


From Attraction to Rejection: A Qualitative Research on Applicant Withdrawal

Bert Schreurs 31

ABSTRACT

This study examined why prospective and actual applicants for the Belgian military decide to withdraw from the hiring process. We inventoried reasons for applicant withdrawal by gathering qualitative data through focus groups and in-depth interviews (face-to-face and by telephone) with prospective applicants, applicants withdrawing from the hiring process, applicants completing the selection procedure, trainees, and employees involved in the hiring process. Because it is generally accepted that applicant reactions to recruitment and selection procedures may influence whether applicants pursue job offers, this factor was examined in more detail. Results indicated that one of the main reasons for applicant withdrawal was that the military had become less attractive relative to other options; that withdrawal decisions were frequently influenced by the opinions of significant others (parents, partner, and peers) about the military; and that, based on the information they had received at the recruiting station, withdrawing applicants had serious doubts about whether they would match the organization. Inconsistent with previous research, we found that withdrawing applicants were generally unaffected by early recruitment practices. Suggestions for strengthening organizational recruitment programs and for directing further research are discussed.

31 Belgian Ministry of Defence, Human Resources Directorate General, Accession Policy – Research & Technology, Bruynstraat 1, B-1120 Brussels (Neder-Over-Heembeek), bert.schreurs@mil.be or schreurs.bert@skynet.be


From Attraction to Rejection: A Qualitative Research on Applicant Withdrawal

It is recognized that the recruitment process consists of multiple stages or phases. The first stage involves the identification and generation of applicants (from the organization's perspective) or job opportunities (from the individual's perspective). During the next stage, applicants become selectees if they pass the selection tests. At the last step, jobs are offered to the persons with the highest ranking. At any moment applicants can decide to self-select out of the recruitment process. As the term selection has been reserved for the processes used by organizations in hiring, self-selection is the term used to refer to the individual's selection decision (Ryan, Sacco, McFarland, & Kriska, 2000). Until now, research has overlooked applicant withdrawal that takes place at an early stage of the hiring process (Barber, 1998). In this research, we focused on applicants who decided to withdraw from the hiring process after the first hurdle of the selection process. These applicants passed the initial screening test at the career office but did not show up at the selection center for their physical, medical and psychological screening.

The study of self-selection is important for a number of reasons. Firstly, applicants' decisions to withdraw from the selection process may affect the size and quality of the applicant pool (Barber & Roehling, 1993). If the organization's top choices withdraw, this leads to a reduced utility of the hiring system (Murphy, 1986). Conversely, self-selection can have positive results for the organization in terms of reduced turnover and higher employee satisfaction, commitment, and performance (see Wanous, 1992). Secondly, while there are plenty of studies examining why job offers are accepted, it remains unclear why job offers are rejected (Turban, Eyring, & Campion, 1993). Thirdly, if the number of qualified women and minorities advancing through the selection process decreases, this may affect adverse impact statistics and the ability to meet diversity goals (Schmit & Ryan, 1997). Thus, understanding the causes of applicant withdrawal is important.

Research on Applicant Withdrawal

Although applicant withdrawal is often discussed in the literature on recruitment and job choice (e.g., Rynes, 1991, 1993), empirical research has largely been directed at the rejection of job offers and not at withdrawal behavior earlier in the process (Barber, 1998; Schmit & Ryan, 1997). As a result, relatively little is known about applicants' decisions to withdraw from an organization's selection process prior to the point of a job offer (Schmit & Ryan, 1997). In one of the earlier studies in this area, it was found that the time delay between application and the next step in the selection process was related to applicant withdrawal from civil service selection processes (Arvey, Gordon, Massengill, & Mussio, 1975). The strongly negative effects of recruitment delays were also observed by Rynes, Bretz, and Gerhart (1991), particularly among male students with higher grade point averages and greater job search success. More recently, Schmit and Ryan (1997) examined the role of test-taking attitudes and racial differences in applicants' decisions to withdraw from the selection process. They found small effects of comparative anxiety, motivation, and literacy scales on withdrawal behavior, and small race differences on test attitude scales. Applicant withdrawal or self-selection has also been examined as a theoretical rationale for the effects of realistic job previews (RJPs): "Applicants who are not likely to be satisfied with the job will not accept job offers, and those who do accept will therefore be more likely to remain" (Barber, 1998, p. 85). Several studies found support for this theory in that exposure to RJPs was associated with higher job rejection rates (Meglino, DeNisi, Youngblood, & Williams, 1988; Premack & Wanous, 1985; Suszko & Breaugh, 1986; Wiesner, Saks, & Summers, 1991). Bretz and Judge (1998) examined whether self-selection based on job expectation information may be adverse from the organization's perspective, that is, whether the best qualified applicants are most likely to self-select out when presented with negative information about the organization. The results of this study yielded mixed support for the adverse self-selection hypothesis.

Correlates of Applicant Withdrawal

Process perceptions. One possible correlate of applicant withdrawal that has elicited a substantial body of research relates to the way applicants perceive and react to hiring processes. The bulk of this research deals with applicant reactions to initial screening interviews (e.g., Goltz & Giannantonio, 1995; Harris & Fink, 1987; Liden & Parsons, 1986; Maurer, Howe, & Lee, 1992; Powell, 1984, 1991; Rynes, 1991; Taylor & Bergmann, 1987; Turban, 2001; Turban & Dougherty, 1992). Only a few studies have examined applicant reactions to later recruitment events, such as site visits (Rynes et al., 1991; Taylor & Bergmann, 1987; Turban, Campion, & Eyring, 1995), or to elements of administrative procedures, such as time lags and delays (Arvey et al., 1975; Rynes et al., 1991; Taylor & Bergmann, 1987). The primary mechanism through which hiring practices are expected to influence applicants' reactions is signaling. Based on propositions from signaling theory (Spence, 1973, 1974), it is suggested that because applicants at early stages of the recruitment process have incomplete information about organizations, they make inferences about the attractiveness of the job or their probability of receiving a job offer based on their recruitment experiences (Breaugh, 1992; Rynes, 1991), and that these inferences are directly related to applicants' decisions to pursue employment opportunities (Barber, 1998). Unfortunately, applicant perceptions of recruitment activities are only rarely connected to behavioral responses, such as rejecting a job offer or dropping out of the hiring process (for exceptions see Ryan et al., 2000; Schreurs et al., 2003). Recently, an increasing number of studies in this area have concentrated on applicant reactions to various selection devices, such as drug testing, honesty testing, computerized secretarial tests, bio-data, cognitive ability tests, work-sample tests, and Assessment Centers (e.g., Crant & Bateman, 1990; Iles & Robertson, 1997; Macan, Avedon, Paese, & Smith, 1994; Schmitt, Gilliland, Landis, & Devine, 1993; Smither, Reilly, Millsap, Pearlman, & Stoffey, 1993; Steiner & Gilliland, 1996). Several models have been proposed to account for applicants' reactions to selection procedures (for an overview see Anderson, Born, & Cunningham-Snell, 2001). For instance, Schuler, Farr, and Smith (1993) postulate that five components influence the perceived acceptability of selection: (1) the presence of job- and organization-relevant information, (2) participation by the applicant in the development and execution of the selection process, (3) transparency of the assessment, so that applicants understand the objectives of the evaluation process and its relevance to organizational requirements, (4) the provision of feedback with appropriate content and form, and (5) a dynamic personal relationship between the applicant and the assessor. Derous and De Witte (2001) put forward six components: (1) provision of general information on the job opening, (2) active participation of applicants in the selection programme, (3) creation of transparency of testing, (4) provision of feedback, (5) guarantee of objectivity in selection through both a professional approach and equal treatment of candidates, and (6) assurance of humane treatment and respect for privacy. Anderson and Ostroff (1997) proposed a model of 'Socialization Impact': an empirically testable, five-domain framework covering information provision, preference impact, expectation impact, attitudinal impact, and behavioral impact. This model closely fits the above-mentioned signaling theory.


Social influence. People are unlikely to make organization-relevant choices in a social vacuum. Yet social influences were long ignored in research on organizational choice. Kilduff (1990) noticed that "decision-making research has been generally silent concerning social influences on choices" (pp. 270-271) and that "a good example of scholarly neglect of social influences on behavior occurs in the area of organizational choice" (p. 271). In his research on the interpersonal structure of decision making, Kilduff found that the social network influenced individuals' choices of organizations to interview with: pairs of students who were either friends or who perceived each other as similar tended to make similar organizational choices, even if they had different academic concentrations and different job preferences. Similarly, there is considerable evidence indicating that prospective applicants are more likely to acquire information about job vacancies through informal networks of friends, family, and acquaintances than through official sources such as advertisements or employment offices (Granovetter, 1974; Reynolds, 1951; Rynes et al., 1991; Schwab, Rynes, & Aldag, 1987). Liden and Parsons (1986) were among the first to suggest that parents and friends may have an important influence on job acceptance among job applicants. They found that these reference groups had an even larger impact on job acceptance intentions than general job affect. More recently, Turban (2001) found that the social context was related to organizations' attractiveness to potential applicants. More specifically, he found that university personnel's perceptions of a firm's presence on campus and its image as an employer were positively related to college students' attraction to that firm. Legree et al. (2000) surveyed 2,731 young men and their parents about their attitudes and intentions toward the military to understand factors associated with military enlistment. The results indicated that youth perceptions of parental attitudes toward the military correlated significantly with stated enlistment propensity, which in turn predicted actual enlistment. Surprisingly, youth perceptions of parental attitudes were often inaccurate. With regard to applicant withdrawal, Ryan and McFarland (1997) and Ryan et al. (2000) found that family and friend support for pursuing a particular job had a significant relation to self-selection decisions. Applicants who self-selected out felt their families were less supportive of their careers. In their study, Schmit and Ryan (1997) observed that more than 10% of applicants withdrew because they felt that the selection process or the job interfered with family obligations, or because of how the job was viewed by family members. There is a clear parallel between these findings and recent developments in research on career choice. Social cognitive career theory (Lent, Brown, & Hackett, 1994), for instance, emphasizes that besides person and behavioral variables, contextual variables such as social supports and social barriers play a key role in the career choice process.

Employment alternatives. Applicants may withdraw from the hiring process because the job opportunity has become less attractive to them relative to other options (Barber, 1998). Schmit and Ryan (1997) found that a large portion of those withdrawing from the hiring process did so because of perceived employment alternatives. In some cases, withdrawals believed they could get a better job or had already taken another offer. In other cases, one's current job was seen as the better alternative. Ryan et al. (2000) also found that other alternatives were a major reason for withdrawing. These results are consistent with findings on the role of perceived employment alternatives in turnover behavior. Turnover researchers have long argued that job opportunities may induce even satisfied employees to give up their current job (e.g., March & Simon, 1958; Mobley, Griffeth, Hand, & Meglino, 1979; Steel, 1996; Steel & Griffeth, 1989).

Need to relocate. According to Noe and Barber (1993), relocation may have negative influences on one's non-work life, requiring adjustments in housing, education, friendships, and activities by the relocated individuals and their families. Research on geographic boundaries in recruitment suggests that applicants 'rule out' jobs located outside their preferred geographic area, and that the importance of location is not limited to low-level employees (e.g., Barber & Roehling, 1993; Osborn, 1990; Rynes & Lawler, 1983). In a related vein, research on job pursuit has indicated that the need to relocate plays an important role in decisions about accepting job offers (Brett, 1982; Brett & Reilly, 1988; Gould & Penley, 1985; Noe & Barber, 1993; Noe, Steffy, & Barber, 1988). Ryan et al. (2000) found that applicants withdrawing from the hiring process expressed a greater need to relocate than those who continued in the process.

Objective factors: The role of job attributes. There is ample evidence that objective job attributes, such as pay, working conditions, and the nature of the work, influence applicant job pursuit and job acceptance (for a review see Turban et al., 1993). Evidence from field studies on RJPs suggests that applicants are more likely to reject jobs when presented with negative information about the job (e.g., Premack & Wanous, 1985; Suszko & Breaugh, 1986; Wiesner et al., 1991). Ryan et al. (2000) examined the relationship between withdrawing from a selection process and job attribute perceptions. Contrary to expectations, perceptions of job attributes were unrelated to withdrawal, and job attribute perceptions were generally positive. From these findings, Ryan et al. concluded that applicants screen jobs on attributes prior to application. This conclusion has important implications for the military, which has a tradition of informing prospective applicants about organizational characteristics and career possibilities prior to application.

Subjective factors: 'Fit'. Schmit and Ryan (1997) found that a number of applicants withdrew because of perceptions of a lack of job and organization fit. Some were of the opinion that the job was not right for them; others argued, rightly or wrongly, that they did not have the required qualifications for the job. Previous research (see Kristof, 1996, for a review) has repeatedly demonstrated that applicants are more attracted to organizations that best fit their personal characteristics. That is, applicants are more attracted to organizations that best fit their individual values (e.g., Cable & Judge, 1996; Chatman, 1989, 1991; Judge & Bretz, 1992; Judge & Cable, 1997; O'Reilly, Chatman, & Caldwell, 1991; Posner, 1992), goals (e.g., Pervin, 1989; Vancouver, Millsap, & Peters, 1994; Vancouver & Schmitt, 1991; Witt & Nye, 1992), needs (Bretz, Ash, & Dreher, 1989; Bretz & Judge, 1994; Cable & Judge, 1994; Turban & Keon, 1993), and personality (Burke & Deszca, 1982; Slaughter et al., 2001; Tom, 1971). Although organizational attraction and self-selection are not synonymous (Wanous & Colella, 1989), it is reasonable to assume that fit perceptions also influence self-selection.

Commitment to obtaining the job. Results from previous research suggest that career commitment, or the motivation to work in a particular profession, has a strong negative relation with intentions to withdraw from a career (Blau, 1985; Carson & Bedeian, 1994). Using social identity theory, Mael and Ashforth (1995) demonstrated that identification with the military may occur even prior to enlistment, and that this sense of professional identity relates negatively to turnover soon after hire. Therefore, it can be expected that individuals who are more committed to obtaining the job are more likely to remain in the process (Ryan et al., 2000).

Impression of the organization. According to Barber (1998), "real-world applicants do not start out as 'blank slates' from a recruitment standpoint; rather, they often have some impression of employing organizations even before they are exposed to recruitment materials. These general impressions have been referred to as organizational images and are expected to be related to the organization's ability to attract applicants (e.g., Fombrun & Shanley, 1990; Stigler, 1962)" (p. 32). Gatewood, Gowan, and Lautenschlager (1993) found that an applicant's decision to pursue contact with an organization was influenced by its corporate and recruitment image. Turban and Greening (1997) demonstrated that corporate social performance (i.e., the organization's tendency to act responsibly in dealing with employees, customers, and the community) is related to organizational attraction. Perceptions of the organization therefore refer not only to applicants' own perceptions but also take into account how applicants think the community perceives the organization (Ryan et al., 2000).

Present Study

This study examined why applicants for the Belgian military decide to withdraw from the hiring process. We inventoried reasons for applicant withdrawal by gathering qualitative data through focus groups and in-depth interviews (face-to-face and by telephone) with prospective applicants, applicants withdrawing from the hiring process, applicants completing the selection procedure, trainees, and employees involved in the hiring process. Because it is generally accepted that applicant reactions to recruitment and selection procedures may influence whether applicants pursue job offers, this factor was examined in more detail. The following research questions were the focus of this research:

Research Question 1. How do prospective and actual applicants evaluate and react to the recruitment and selection process of the Belgian military?

Research Question 2. How can the organizational entry process be changed in order to increase the number of applicants and to reduce the voluntary withdrawal rate before, during, and after the selection process?

Method

Procedure and Sample

To join the Belgian military, prospects are required to visit a military career office (one in every province) for a preliminary information session on military life and career possibilities prior to application. If prospects still want to enter the organization after this initial preview, they are invited to fill out the application form and take a cognitive screening test at the same career office. Within a week of application, applicants who pass are invited to take the remaining tests (medical, physical, and psychological) at the central selection center in Brussels. According to current selection rules, selectees should be incorporated within one month of application. In practice, however, this objective is not always reached.
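The procedure just described is in effect a multiple-hurdle funnel: information session, application plus cognitive screening, central testing, and incorporation. As a purely illustrative sketch of how such a funnel can be tallied, the following Python snippet walks a hypothetical cohort through the hurdles; the cohort size and all continuation rates are invented assumptions, except the roughly 10% post-application no-show rate reported below.

# Hypothetical illustration of the accession funnel described above.
# Stage labels follow the text; all numbers are assumptions, except the
# ~10% post-application withdrawal rate reported in this study.
STAGES = [
    ("applied after the career office information session", 0.50),    # assumed
    ("passed the cognitive screening test", 0.70),                    # assumed
    ("showed up at the selection center in Brussels", 0.90),          # ~10% withdraw (this study)
    ("passed the medical, physical, and psychological tests", 0.60),  # assumed
    ("were incorporated within one month of application", 0.95),      # assumed
]

def run_funnel(cohort: int) -> None:
    """Print how many of `cohort` career office visitors survive each hurdle."""
    remaining = cohort
    for stage, continuation_rate in STAGES:
        remaining = round(remaining * continuation_rate)
        print(f"{remaining:5d} {stage}")

run_funnel(1000)

Run with a cohort of 1,000 visitors, the sketch prints the number remaining after each hurdle; only the stage labels and the 10% figure come from the text.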

First, we conducted face-to-face interviews with prospects (N = 35) after their information session with a career counselor at the career office. Second, we conducted telephone interviews with applicants who never showed up at the selection center despite their appointment (N = 200). Approximately 10% of all applicants voluntarily withdraw from the selection process after application; attempts were made to contact each individual who self-selected out. Third, we contacted a small sample of applicants who completed the selection process, both applicants who passed (N = 25) and applicants who failed (N = 25). In addition, we organized three focus groups with newcomers who had just begun their initial military training, and one with employees involved in the hiring process.

Measures

Prospects. The primary purpose of the face-to-face interview at the career office was to find out how potential applicants had experienced these early recruitment activities. The interview included eight questions. The first question, a warm-up, asked how the visitor had learned of the military career offices. The next two questions were open-ended questions asking prospects for positive and negative elements of the visit, respectively. The last five questions asked about specific recruitment experiences: (a) whether the prospect was satisfied with the content of the information he/she had received, (b) whether the prospect was satisfied with the amount of information he/she had received, (c) whether the career counselor had shown interest in the prospect's questions and problems, (d) whether the prospect had been able to actively participate in the information session, and (e) whether the career counselor had tried to sell the prospect a job. All five questions were followed by open-ended probes to assess reasons for the response. Interviews were conducted by the author and two undergraduate assistants; they typically lasted 10-15 minutes.

Withdrawals. Applicants who withdrew from the selection process were contacted by telephone. The primary purpose of the telephone interview was to determine why applicants withdrew from the selection process before taking the tests at the selection center. Four attempts were made to contact each withdrawal, each at a different time of day. The first question was an open-ended question asking for the main reason why the respondent chose not to continue in the selection process. The second question, also open-ended, asked for additional reasons. The remaining questions were designed to assess specific reasons for withdrawal: (a) treatment by career office personnel, (b) perceptions of test fairness, (c) handling of the application, (d) test anxiety, (e) social influence, (f) fit perceptions, and (g) employment alternatives. Positive responses were followed by open-ended probes to assess reasons for the response. Interviews were conducted by the author and four undergraduate assistants; they typically lasted 5-10 minutes.
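Because each respondent's open-ended answers were subsequently condensed into a single primary reason for withdrawal (see Table 1), a minimal tallying sketch may clarify this analysis step. The category labels follow Table 1, but the coded responses in the coded_reasons list are invented for illustration:

from collections import Counter

# Hypothetical content-coded primary reasons, one per reachable respondent.
# Category labels follow Table 1; the particular values are invented.
coded_reasons = [
    "available employment alternatives",
    "significant others",
    "available employment alternatives",
    "administrative problems",
    "lost interest/doubts",
    "medical/physical problems",
]

def reason_distribution(reasons):
    """Return (category, percentage) pairs sorted by descending frequency."""
    counts = Counter(reasons)
    total = sum(counts.values())
    return [(category, 100 * n / total) for category, n in counts.most_common()]

for category, pct in reason_distribution(coded_reasons):
    print(f"{pct:6.2f}%  {category}")

Sorting by frequency and converting counts to percentages of contacted respondents yields a distribution of the kind reported in Table 1.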

Successful and unsuccessful applicants. Both successful and unsuccessful applicants who completed the selection process were contacted by telephone. The primary purpose of the telephone interview was to find out how applicants had experienced the selection procedure. The interview included eight questions. The first two questions were open-ended questions asking applicants for positive and negative elements of the selection encounter, respectively. The third question asked whether the applicant was satisfied with the selection procedure in general; it was always followed by an open-ended probe to assess the reason for the response. The next four questions asked about specific selection experiences: (a) respect for privacy, (b) the practical organization of the selection procedure, (c) transparency of testing, and (d) whether they had been in a position to demonstrate their potential. The final question asked respondents what they would change about the selection procedure if they had the chance to do so. Interviews were conducted by the author and two undergraduate assistants; they typically lasted 5-10 minutes.

Trainees. Three focus groups were held with newcomers who had just begun their training. They had taken their selection tests at the selection center three weeks before the focus groups were organized. This short interval ensured that the trainees could still recall what they had experienced at the selection center. In addition, the short training period they had already gone through permitted them to compare the information they had received with reality. The semi-structured guide used for the focus groups contained questions referring to advertising and organizational image, the amount and content of the information provided, information realism, the visit to the career office and the selection center, selection methods (medical, physical, psychological), recruitment, selection, and retention policy, reasons for withdrawal during the hiring process, and reasons for withdrawal during initial training. It took about four hours to discuss all these topics.

Employees. One focus group was held with employees familiar with the accession policy of the military (advertising, recruitment, and selection specialists). The semi-structured guide used for this focus group contained mainly questions referring to current and possible future recruitment, selection, and retention policies. The focus group lasted about four hours.


Results

At the career office

A first finding is that prospects on the whole looked back positively on their career office visit. They referred to the visit in generally positive terms, and most could not spontaneously think of any negative experience. When asked about specific experiences, most were enthusiastic about the career counselors' attitude. The career counselor was described as a warm, empathic person who showed interest in the prospect's case. However, a number of negative experiences were recurrently mentioned by prospects, applicants, and trainees. First, there was general agreement that the amount and content of feedback on the cognitive screening test at the career office were unsatisfactory. Most respondents disliked receiving nothing more than a pass/fail message, without any further explanation of their test score. Second, there was some disagreement about the appropriate amount of information that career counselors should provide. Some respondents argued that they needed more information in order to choose a specific military career; others complained that they were not able to process all the information they had received at the career office. Third, looking back on their visits, trainees criticized the unrealistic preview presented at the career office. The expectations that trainees had formed, based on what they were told and how they were treated at the career office, did not correspond with real military life. Fourth, the duration of the enrollment process was mentioned. Respondents did not understand why the enrollment process had to be so cumbersome and extensive, although most of them were incorporated within one month of application. Next, several comments were made about the test conditions at the career office. In particular, the noisiness of the test environment was criticized: in some career offices the test computers are located in the same room where the information sessions take place and where telephone calls are answered. Other frequently mentioned remarks referred to the lack of supervision during test administration and the occurrence of computer breakdowns. Finally, some respondents complained about the accessibility of the career offices.

At the selection center

It was striking that respondents (applicants, withdrawals, and trainees) spontaneously put forward several criticisms of their visit to the selection center, and that the employees involved in the hiring process openly acknowledged some of these criticisms. To begin with, respondents recurrently labeled the personnel of the selection center as unprofessional and unmotivated. In a related vein, most respondents found that the personnel had treated them in an impersonal manner, "as if they were just a number" or "a piece on an assembly line". Also criticized was the interviewers' presumptuous attitude toward the applicants. The trainees mentioned that this perception of arrogance was strengthened by the impression the uniform made on them at the time. Yet few concluded that this lack of professionalism and humane treatment was symptomatic of the whole organization. The majority made a clear distinction between hiring practices and the "real military". According to most respondents, recruitment, selection, and training were not really part of the military, and were "just things that one has to plough through". Most trainees were still enthusiastic about their new employer: they appreciated that their visit to the selection center provided a realistic preview of life at the training center, but at the same time they believed that "everything will be different once training is completed". Again, several remarks were made about the amount and content of feedback on the selection outcome. Those who passed wanted to know why they were assigned to a particular occupation; those who failed wanted more detailed information on what went wrong. Several comments were made about the duration of the personality inventory (CPI); it is not uncommon for applicants to need more than one hour to complete this questionnaire. Due to time-consuming tests and a tight schedule, applicants sometimes do not have the opportunity to have lunch. It is therefore not surprising that there were several complaints about the time pressure experienced in selection. Finally, some respondents were disappointed in the physical fitness test. Trainees especially were of the opinion that the bicycle ergometer test was too easy as a physical selection test for the military, a judgment the group of employees agreed with. The sole positive experience respondents could spontaneously recall was that the visit to the selection center gave them the opportunity to get acquainted with other applicants.

Reasons for applicant withdrawal

As Table 1 shows, the most important reason for self-selecting out was the perception of available employment alternatives (21%). Most withdrawals had applied to several organizations and often gave preference to another employer; a large proportion of withdrawals favored the police force. About 13% of all respondents said that the main reason for withdrawal was that they had lost interest after their visit to the career office. Based on the information they received, they had serious doubts about whether they would fit the organization, and vice versa. More than 10% withdrew because they were influenced by the opinions of significant others (parents, partner, peers). Most parents did not want their child to go to war; the partner was usually not in favor of the lengthy missions abroad; and often the peer group, due to its negative perception of the military, pressured the potential candidate to choose another career path. Contrary to what we had expected, perceptions of the hiring process were rarely mentioned as reasons for applicant withdrawal. Still, more than 9% of all withdrawals mentioned administrative problems as the main reason for self-selection. These problems had to do with the loss of documents required to participate in the selection (e.g., diploma, birth certificate) and with misunderstandings about the selection date (e.g., the selection center forgot to send a confirmation of the date). Most of these withdrawals were willing to reapply as soon as the problem had been solved. Surprisingly, more than 9% cited physical or medical problems as their main reason for self-selection. These problems varied from broken toes to the perception of being too heavy to pass the physical and medical tests. Several withdrawals were motivated to continue in the selection procedure but were prevented by transportation problems (8%): their car broke down on the way to the selection center, or they did not have enough money to pay for the train ticket to Brussels. Other reasons referred to persons who had to work on the day of selection (6%), who were too sick to attend (5%), who had simply forgotten about their appointment (5%), or who had family or personal problems at that time (5%). Table 1 provides an overview of what respondents mentioned as their primary reason for withdrawal.

General discussion

This study examined applicant withdrawal that takes place early in the hiring process. In many nations the military uses career offices to inform prospective applicants about military job opportunities and military life. Although prospects often make a considerable effort to travel to a career office, approximately 10% of all visitors who apply at that time never show up at the selection center. In this study we used a combination of qualitative research methods to identify the main reasons for applicant withdrawal.

Although respondents had many criticisms of the hiring process, this was not the main reason for withdrawing from the selection procedure. Only a small percentage of all withdrawals referred to the recruitment or selection practices as their primary motive (time delay, test anxiety). This finding contrasts with previous research on this issue (e.g., Rynes et al., 1991). In addition, trainees made a clear distinction between hiring practices and the "real" organization, which is the opposite of what would be expected from signaling theory (Spence, 1973, 1974). As a result, it is tempting to conclude that process perceptions are of minor importance to the study of applicant withdrawal. Then again, the high percentage of respondents mentioning administrative problems suggests that the application process is too complex and cumbersome. The high proportion of withdrawals citing physical and medical problems might suggest that applicants believe they have to be extremely fit to pass the physical and medical selection hurdles, which is contradicted by the group of trainees, who were of the opinion that the bicycle ergometer test was too easy as a physical selection test for the military. Although we did not find evidence that perceptions of the hiring process had a direct effect on applicant withdrawal, it is possible that recruitment activities indirectly influenced applicants' decisions to withdraw. Recently, Turban (2001) found that recruitment activities influenced firm attractiveness by influencing perceptions of organizational attributes. In a related vein, previous research has found that recruiter behaviors influence attraction to the organization by providing information about working conditions in the firm (Goltz & Giannantonio, 1995; Turban, Forret, & Hendrickson, 1998).

Consistent with previous research (Ryan et al., 2000; Schmit & Ryan, 1997), we found that an important reason for withdrawing was the availability of employment alternatives. It is naïve to believe that applicants would consider only a military occupation. Although there are exceptions, the majority of job seekers typically generate a large number of potential employers for future consideration. We found that for many applicants the military is less attractive than other potential employers. Apparently, for many youngsters the military is an option that is kept in reserve in case other applications end in rejection. More specifically, it turned out that the military's strongest competitor in the struggle for manpower is the police force, which is not unexpected in view of its more attractive salary scale. In other cases, one's current job was seen as the better alternative. Some withdrawals reported that they had to work at their current job at the time of testing and therefore could not attend. Schmit and Ryan (1997) suggest that "this is probably a combination of individuals who decided that risking their current job for a slim chance at another job was not worth it, and individuals who decided that making an extra effort (e.g., taking a vacation day, rearranging one's schedule) was not worth it" (p. 871). In both cases, staying with the current job is seen as the better alternative.

Another set of frequently mentioned reasons for withdrawal related to perceptions of a lack of job and organization fit. Usually, this perception was based on the information provided at the career office. This finding startled us at first, because one would not expect a prospect to apply to the military when s/he has doubts about fitting the organization. Then again, as long as the application process does not require a significant commitment of time and energy on the part of the applicant, it is unlikely that s/he will refrain from applying. At this stage of the process, applicants are still generating a pool of opportunities from which one (eventually) will be chosen (Barber, 1998). In most cases, this kind of withdrawal is beneficial to the organization, because it spares the person a likely disillusionment and saves the organization a considerable amount of money and time. However, organizations should be concerned when perceptions of a lack of fit are based on erroneous information (Schmit & Ryan, 1997). This is particularly true for cases in which the reason for withdrawing had to do with hearsay stories about the military, told by relatives or friends after the person had visited the career office.

This brings us to another set of frequently mentioned reasons for withdrawal: the influence of significant others on the applicant's decision. For 10% of the respondents, this was the primary motive for withdrawal. For many others, the opinion of significant others about the military played a role in their decision but was not the main reason for withdrawal. For example, some individuals preferred their current job to a military occupation because their parents had convinced them that this was the right thing to do. It should be noted that at the time of the data collection the Iraq war had just begun. For most applicants this was not an issue, but withdrawals often mentioned that their parents were strongly opposed to the possibility that their child would go to war. We believe that military recruitment programs could be strengthened by involving parents, partners, and friends in the application process, for instance by organizing an information evening or a site visit for relatives.

Finally, several interviewees told us that they were not able to attend the selection session because they did not have sufficient resources to pay for their train or bus ticket. Unlike some private firms, the Belgian military does not reimburse travel expenses, as this is considered too costly for the treasury.

Limitations, future research opportunities and contributions

Admittedly, there are several limitations to our study. First, the interview data represent retrospective self-reports; thus, the reported primary reason for withdrawal may be distorted in some cases (Schmit & Ryan, 1997). For example, some individuals may have reported that they withdrew because they had a physical problem or because their car broke down, when in reality less socially desirable motives were involved.

Second, withdrawal behavior was studied at an early stage in the total selection process. It is not unlikely that reasons for withdrawal vary across time in the selection process, even when the focus is on withdrawal prior to offer extension (Schmit & Ryan, 1997). Taylor and Bergmann (1987) found that recruitment activities affected applicant reactions only in the initial stage of the hiring process; after that point, only job attributes were found to significantly affect applicant reactions. According to Schmit and Ryan (1997), "future studies should include an explicit role for timing in the explanation of the withdrawal decision." For instance, an interesting group of 'withdrawals' for the military are the individuals who paid a visit to the career office but decided not to apply. In view of the high proportion of withdrawals who gave preference to another employer, we argue that future research on military recruitment should also consider why other organizations are favored and what can be done to make the military more attractive relative to its competitors.

A third limitation of this study is that we analyzed only the primary motive for withdrawal, although in reality it is often a cluster of reasons that leads to the decision to withdraw. Future research should analyze combinations of reasons instead of isolated motives.

The limitations of this study are offset by several strengths. First, we departed from the traditional approach by taking into account the perspectives of different stakeholders. We interviewed not only the persons who withdrew from the hiring process but also prospects, applicants who completed the process, trainees, and even employees familiar with the accession policy of the military. This approach was very helpful in conducting the telephone interviews and in interpreting the results. Second, in contrast to previous research in this area, we examined applicant withdrawal at a very early stage of the hiring process. To date, relatively little is known about applicants' decisions to withdraw from an organization's selection process prior to the point of a job offer (Schmit & Ryan, 1997). This is unfortunate, because these decisions are surely important to organizations (Barber, 1998). The inclusion of non-applicants, in particular, should be informative for both research and practice.


Conclusions

Inconsistent with previous research, we found that withdrawals were generally unaffected by early recruitment practices. The primary motive for withdrawal was the availability of preferred employment alternatives; apparently, for many youngsters the military is an option that is kept in reserve in case other applications end in rejection. Several applicants changed their mind about joining the military because of perceptions of a lack of fit between person and organization. This is beneficial to the organization insofar as those perceptions were valid. In this study, however, some applicants withdrew because of controversial stories about the military told by parents, friends, or partners. The influence of significant others is probably underestimated: most interviewees acknowledged that it played a role in their decision to withdraw, but usually other reasons were said to be more important. Future research should take into account that an applicant seldom withdraws for one isolated reason. Finally, although few applicants mentioned hiring practices as a direct reason for withdrawal, we believe that process perceptions may have influenced applicants indirectly by shaping perceptions of organizational attributes.



References

Anderson, N., Born, M., & Cunningham-Snell, N. (2001). Recruitment and selection: Applicant perspectives and outcomes. In N. Anderson, D. Ones, H.K. Sinangil, & C. Viswesvaran (Eds.), Handbook of industrial, work, and organizational psychology: Vol. 1. Personnel psychology (pp. 200-218). London: Sage.

Anderson, N., & Ostroff, C. (1997). Selection as socialization. In N. Anderson & P. Herriot (Eds.), International handbook of selection and assessment. Chichester: Wiley.

Arvey, R., Gordon, M., Massengill, D., & Mussio, S. (1975). Differential dropout rates of minority and majority job candidates due to time lags between selection procedures. Personnel Psychology, 28, 175-180.

Barber, A.E. (1998). Recruiting employees: Individual and organizational perspectives. Thousand Oaks, CA: Sage.

Barber, A.E., & Roehling, M.V. (1993). Job postings and the decision to interview: A verbal protocol analysis. Journal of Applied Psychology, 78, 845-856.

Blau, G.J. (1985). The measurement and prediction of career commitment. Journal of Occupational Psychology, 58, 277-288.

Breaugh, J.A. (1992). Recruitment: Science and practice. Boston: PWS-Kent.

Brett, J.M. (1982). Job transfer and well-being. Journal of Applied Psychology, 67, 450-463.

Brett, J.M., & Reilly, A.H. (1988). On the road again: Predicting the job transfer decision. Journal of Applied Psychology, 73, 614-620.

Bretz, R.D., Ash, R.A., & Dreher, G.F. (1989). Do people make the place? An examination of the attraction-selection-attrition hypothesis. Personnel Psychology, 42, 561-581.

Bretz, R.D., & Judge, T.A. (1994). Person-organization fit and the theory of work adjustment: Implications for satisfaction, tenure, and career success. Journal of Vocational Behavior, 44, 32-54.

Bretz, R.D., & Judge, T.A. (1998). Realistic job previews: A test of the adverse self-selection hypothesis. Journal of Applied Psychology, 83, 330-337.

Burke, R.J., & Deszca, E. (1982). Preferred organizational climates of Type A individuals. Journal of Vocational Behavior, 21, 50-59.

Cable, D.M., & Judge, T.A. (1994). Pay preferences and job search decisions: A person-organization fit perspective. Personnel Psychology, 47, 317-348.

Cable, D.M., & Judge, T.A. (1996). Person-organization fit, job choice decisions, and organizational entry. Organizational Behavior and Human Decision Processes, 67(3), 294-311.

Carson, K.D., & Bedeian, A.G. (1994). Career commitment: Construction of a measure and examination of its psychometric properties. Journal of Vocational Behavior, 44, 237-262.

Chatman, J.A. (1989). Improving interactional organizational research: A model of person-organization fit. Academy of Management Review, 14, 333-349.

Chatman, J.A. (1991). Matching people and organizations: Selection and socialization in public accounting firms. Administrative Science Quarterly, 36, 459-484.

Crant, J.M., & Bateman, T.S. (1990). An experimental test of the impact of drug-testing programs on potential job applicants' attitudes and intentions. Journal of Applied Psychology, 75, 127-131.

Derous, E., & De Witte, K. (2001). Sociale procesfactoren, testmotivatie en testprestatie. Een procesperspectief op selectie geëxploreerd via een experimentele benadering [Social process factors, test motivation, and test performance: A process perspective on selection explored via an experimental approach]. Gedrag & Organisatie, 14(3), 152-170.

Fombrun, C., & Shanley, M. (1990). What's in a name? Reputation building and corporate strategy. Academy of Management Journal, 33, 233-258.

Gatewood, R.D., Gowan, M.A., & Lautenschlager, G.J. (1993). Corporate image, recruitment image, and initial job choice decisions. Academy of Management Journal, 36, 414-427.


Goltz, S.M., & Giannantonio, C.M. (1995). Recruiter friendliness and attraction to the job: The mediating role of inferences about the organization. Journal of Vocational Behavior, 46, 109-118.

Gould, S., & Penley, L. (1985). A study of the correlates of the willingness to relocate. Academy of Management Journal, 28, 472-478.

Granovetter, M.S. (1974). Getting a job: A study of contacts and careers. Cambridge, MA: Harvard University Press.

Harris, M.M., & Fink, L.S. (1987). A field study of applicant reactions to employment opportunities: Does the recruiter make a difference? Personnel Psychology, 40, 765-784.

Iles, P.A., & Robertson, I.T. (1997). The impact of personnel selection procedures on candidates. In N. Anderson & P. Herriot (Eds.), International handbook of selection and assessment. Chichester: Wiley.

Judge, T.A., & Bretz, R.D., Jr. (1992). Effects of work values on job choice decisions. Journal of Applied Psychology, 77, 261-271.

Judge, T.A., & Cable, D.M. (1997). Applicant personality, organizational culture, and organization attraction. Personnel Psychology, 50, 359-394.

Kilduff, M. (1990). The interpersonal structure of decision making: A social comparison approach to organizational choice. Organizational Behavior and Human Decision Processes, 47, 270-288.

Kristof, A.L. (1996). Person-organization fit: An integrative review of its conceptualizations, measurement, and implications. Personnel Psychology, 49, 1-50.

Legree, P.J., Gade, P.A., Martin, D.E., Fischl, M.A., Wilson, M.J., Nieva, V.F., McCloy, R., & Laurence, J. (2000). Military enlistment and family dynamics: Youth and parental perspectives. Military Psychology, 12(1), 31-49.

Lent, R.W., Brown, S.D., & Hackett, G. (1994). Toward a unifying social cognitive theory of career and academic interest, choice, and performance. Journal of Vocational Behavior, 45, 79-122.

Liden, R.C., & Parsons, C.K. (1986). A field study of job applicant interview perceptions, alternative opportunities, and demographic characteristics. Personnel Psychology, 39, 109-122.

Macan, T.H., Avedon, M.J., Paese, M., & Smith, D.E. (1994). The effects of applicants' reactions to cognitive ability tests and an assessment center. Personnel Psychology, 47, 715-738.

Mael, F.A., & Ashforth, B.E. (1995). Loyal from day one: Biodata, organizational identification, and turnover among newcomers. Personnel Psychology, 48, 309-333.

March, J., & Simon, H. (1958). Organizations. New York: Wiley.

Maurer, S.D., Howe, V., & Lee, T.W. (1992). Organizational recruiting as marketing management: An interdisciplinary study of engineering graduates. Personnel Psychology, 45, 807-833.

Meglino, B.M., DeNisi, A.S., Youngblood, S.A., & Williams, K.J. (1988). Effects of realistic job previews: A comparison using an enhancement and a reduction preview. Journal of Applied Psychology, 73, 259-266.

Mobley, W.H., Griffeth, R.W., Hand, H.H., & Meglino, B.M. (1979). Review and conceptual analysis of the employee turnover process. Psychological Bulletin, 86(3), 493-522.

Murphy, K.R. (1986). When your top choice turns you down: Effect of rejected job offers on the utility of selection tests. Psychological Bulletin, 99, 128-133.

Noe, R.A., & Barber, A.E. (1993). Willingness to accept mobility opportunities: Destination makes a difference. Journal of Organizational Behavior, 14, 159-175.



Noe, R.A., Steffy, B.D., & Barber, A.E. (1988). An investigation of the factors influencing employees' willingness to accept mobility opportunities. Personnel Psychology, 41, 559-580.

O'Reilly, C.A., Chatman, J.A., & Caldwell, D.F. (1991). People and organizational culture: A profile comparison approach to assessing person-organization fit. Academy of Management Journal, 34, 487-516.

Osborn, D.P. (1990). A reexamination of the organizational choice process. Journal of Vocational Behavior, 36, 45-60.

Pervin, L.A. (1989). Persons, situations, interactions: The history of a controversy and a discussion of theoretical models. Academy of Management Review, 14, 350-360.

Posner, B.Z. (1992). Person-organization values congruence: No support for individual differences as a moderating influence. Human Relations, 45, 351-361.

Powell, G.N. (1984). Effects of job attributes and recruiting practices on applicant decisions: A comparison. Personnel Psychology, 37, 721-732.

Powell, G.N. (1991). Applicant reactions to the initial employment interview: Exploring theoretical and methodological issues. Personnel Psychology, 44, 67-83.

Premack, S.L., & Wanous, J.P. (1985). A meta-analysis of realistic job preview experiments. Journal of Applied Psychology, 70, 706-719.

Reynolds, L.G. (1951). The structure of labor markets. New York: Harper.

Ryan, A.M., & McFarland, L.A. (1997, April). Organizational influences on applicant withdrawal from selection processes. Paper presented at the Twelfth Annual Conference of the Society for Industrial and Organizational Psychology, St. Louis, MO.

Ryan, A.M., Sacco, J.M., McFarland, L.A., & Kriska, S.D. (2000). Applicant self-selection: Correlates of withdrawal from a multiple hurdle process. Journal of Applied Psychology, 85(2), 163-179.

Rynes, S.L. (1991). Recruitment, job choice, and post-hire consequences: A call for new research directions. In M.D. Dunnette & L.M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2, pp. 399-444). Palo Alto, CA: Consulting Psychologists Press.

Rynes, S.L. (1993). Who's selecting whom? In N. Schmitt & W.C. Borman (Eds.), Personnel selection in organizations. San Francisco, CA: Jossey-Bass.

Rynes, S.L., Bretz, R.D., & Gerhart, B. (1991). The importance of recruitment in job choice: A different way of looking. Personnel Psychology, 44, 487-521.

Rynes, S.L., & Lawler, J. (1983). A policy-capturing investigation of the role of expectancies in decisions to pursue job alternatives. Journal of Applied Psychology, 68, 620-631.

Schmit, M.J., & Ryan, A.M. (1997). Applicant withdrawal: The role of test-taking attitudes and racial differences. Personnel Psychology, 50, 855-876.

Schmitt, N., Gilliland, S.W., Landis, R.S., & Devine, D. (1993). Computer-based testing applied to selection of secretarial applicants. Personnel Psychology, 46, 149-165.

Schreurs, B., Derous, E., De Witte, K., Proost, K., Andriessen, M., & Glabeke, K. (2003). Attracting potential applicants to the military: The effects of initial face-to-face contacts. Manuscript submitted for publication.

Schuler, H., Farr, J.L., & Smith, M. (1993). The individual and organizational sides of personnel selection and assessment. In H. Schuler, J.L. Farr, & M. Smith (Eds.), Personnel selection and assessment: Individual and organizational perspectives. Hillsdale, NJ: Lawrence Erlbaum.


Schwab, D.P., Rynes, S.L., & Aldag, R.A. (1987). Theories and research on job search and choice. In K. Rowland & G. Ferris (Eds.), Research in personnel and human resources management (Vol. 5, pp. 129-166). Greenwich, CT: JAI Press.

Slaughter, J.E., Zickar, M., Highhouse, S., Mohr, D.C., Steinbrenner, D., & O'Connor, J. (2001, April). Personality trait inferences about organizations: Development of a measure and tests of the congruence hypothesis. Paper presented at the Sixteenth Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Smither, J.W., Reilly, R.R., Millsap, R.E., Pearlman, K., & Stoffey, R.W. (1993). Applicant reactions to selection procedures. Personnel Psychology, 46, 49-76.

Spence, A.M. (1973). Job market signaling. Quarterly Journal of Economics, 87, 355-374.

Spence, A.M. (1974). Market signaling. Cambridge, MA: Harvard University Press.

Steel, R.P. (1996). Labor market dimensions as predictors of the reenlistment decisions of military personnel. Journal of Applied Psychology, 81, 421-428.

Steel, R.P., & Griffeth, R.W. (1989). The elusive relationship between perceived employment opportunity and turnover behavior: A methodological or conceptual artifact? Journal of Applied Psychology, 74, 846-854.

Steiner, D.D., & Gilliland, S.W. (1996). Fairness reactions to personnel selection techniques in France and the United States. Journal of Applied Psychology, 81, 134-141.

Suszko, M.J., & Breaugh, J.A. (1986). The effects of realistic job previews on applicant self-selection and employee turnover, satisfaction, and coping ability. Journal of Management, 12, 513-523.

Taylor, M.S., & Bergmann, T.J. (1987). Organizational recruitment activities and applicants' reactions at different stages of the recruitment process. Personnel Psychology, 40, 261-285.

Tom, V.R. (1971). The role of personality and organizational images in the recruiting process. Organizational Behavior and Human Performance, 6, 573-592.

Turban, D.B. (2001). Organizational attractiveness as an employer on college campuses: An examination of the applicant population. Journal of Vocational Behavior, 58, 293-312.

Turban, D.B., Campion, J.E., & Eyring, A.R. (1995). Factors related to job acceptance decisions of college recruits. Journal of Vocational Behavior, 47, 193-213.

Turban, D.B., & Dougherty, T.W. (1992). Influences of campus recruiting on applicant attraction to firms. Academy of Management Journal, 35, 739-765.

Turban, D.B., Eyring, A.R., & Campion, J.E. (1993). Job attributes: Preferences compared with reasons given for accepting and rejecting job offers. Journal of Occupational and Organizational Psychology, 66, 71-81.

Turban, D.B., Forret, M.L., & Hendrickson, C.L. (1998). Applicant attraction to firms: Influences of organization reputation, job and organizational attributes, and recruiter behaviors. Journal of Vocational Behavior, 52, 24-44.

Turban, D.B., & Greening, D.W. (1997). Corporate social performance and organizational attractiveness to prospective employees. Academy of Management Journal, 40, 658-672.

Turban, D.B., & Keon, T.L. (1993). Organizational attractiveness: An interactionist perspective. Journal of Applied Psychology, 78, 184-193.

Vancouver, J.B., Millsap, R.E., & Peters, P.A. (1994). Multilevel analysis of organizational goal congruence. Journal of Applied Psychology, 79, 666-679.

Vancouver, J.B., & Schmitt, N.W. (1991). An exploratory examination of person-organization fit: Organizational goal congruence. Personnel Psychology, 44, 333-352.

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


Wanous, J.P. (1992). Organizational entry: Recruitment, selection, and socialization of<br />

newcomers (2 nd ed.). Reading, MA: Addison-Wesley.<br />

Wanous, J.P., & Colella, A. (1989). Organizational entry research: Current status and future<br />

directions. In K. Rowland & G. Ferris (Eds.), Research In Personnel and Human<br />

Resources Management, pp. 59-120, Greenwich, CT: JAI Press.<br />

Wiesner, W.H., Saks, A.M., & Summers, R.J. (1991). Job alternatives and job choice. Journal of<br />

Vocational Behavior, 38, 198-207.<br />

Witt, L.A., & Nye, L.G. (1992, April). Goal congruence and job attitudes revisited. Paper<br />

presented at the Seventh Annual Conference of the Society for Industrial and<br />

Organizational Psychology, Montreal, Canada.<br />

Table 1

Reasons for Withdrawal

Reason                                   Percentage
1. Available employment alternatives     21.18%
2. Lost interest/doubts                  12.94%
3. Significant others                    10.59%
4. Administrative problems                9.41%
5. Medical/physical problems              9.41%
6. Transportation problems                8.24%
7. Had to work                            5.88%
8. Too sick to attend                     4.71%
9. Forgot                                 4.71%
10. Personal/family problems              4.71%
11. Other things to do                    2.35%
12. Further education                     2.35%
13. Test anxiety                          1.18%
14. Time delay                            1.18%
15. Other military career                 1.18%

Note. N = 200.


U.S. ARMY RECRUITER SELECTION RESEARCH: AN UPDATE 32<br />

Walter C. Borman 33<br />

Personnel Decisions Research Institutes, Inc. and<br />

University of South Florida<br />

100 South Ashley Drive, Suite 375<br />

Tampa, FL 33602<br />

wally.borman@pdri.com<br />

Leonard A. White<br />

U.S. Army Research Institute<br />

5001 Eisenhower Avenue<br />

Alexandria, VA 22333<br />

WhiteL@ARI.army.mil<br />

Stephen Bowles<br />

Command Psychologist, U.S. Army Recruiting Command<br />

10,000 Hampton Pkwy, Room 1119, Center One<br />

Fort Jackson, SC 29207<br />

bowless@jackson.army.mil<br />

Kristen E. Horgen, U. Christean Kubisiak, and Lisa M. Penney<br />

Personnel Decisions Research Institutes, Inc.<br />

100 South Ashley Drive, Suite 375<br />

Tampa, FL 33602<br />

kristen.horgen@pdri.com, chris.kubisiak@pdri.com, lisa.penney@pdri.com<br />

INTRODUCTION

The objective of this research program is to develop and validate a new screening test battery for selecting U.S. Army recruiters. The approach has been first to conduct a concurrent validation study, administering a trial test battery to production recruiters currently on the job and obtaining performance measures on these same recruiters. The concurrent validation research has been completed, and this paper describes its results.

32 Paper presented at the Annual Conference of the International Military Testing Association, Pensacola, FL. The views expressed in this paper are those of the authors and do not necessarily reflect those of the Army Research Institute, or any other department of the U.S. Government.

33 Also contributing technical support to the research program are Valentina Bruk, Patrick Connell, Elizabeth Lentz, and Vicky Pace, Personnel Decisions Research Institutes, Inc., and Mark C. Young, U.S. Army Research Institute.



We also have under way a predictive validation study to evaluate the validity of the test battery for predicting the subsequent performance of recruiters, in a testing setting closer to the operational environment in which the battery will actually be administered. Non-commissioned officers (NCOs) entering the Recruiting and Retention School (RRS) are being administered the test battery, and we have now followed up on a sample of 1,466 of these recruiters. Attrition data from the RRS and production data (i.e., number of recruits brought into the Army per month) were available for these recruiters, and we also present initial predictive validation results using these indicators as criteria.

Long-term goals for the project are to establish a standard screening process for NCO<br />

candidates for a recruiting assignment prior to their being accepted into the RRS. NCOs scoring<br />

well on the battery might be encouraged to volunteer for recruiting. An even broader goal is to<br />

develop a classification test battery to target placement into other possible second-stage jobs<br />

(e.g., drill instructor). In one scenario, this classification battery would be administered routinely<br />

to NCOs at the beginning of their second tour, and predicted performance scores would be<br />

generated for each target job. Then, NCOs could be counseled about which second-stage job(s)<br />

suit them best.<br />
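To make the counseling scenario concrete, the following minimal sketch shows how predicted performance scores for several target jobs could be computed as weighted sums of battery scale scores. The scale names and weights are invented placeholders for illustration, not operational values.

# Minimal sketch of the classification idea: one predicted performance score
# per candidate second-stage job, as a weighted sum of battery scale scores.
# Scale names and weights are invented placeholders, not operational values.
ILLUSTRATIVE_WEIGHTS = {
    "recruiter":      {"work_orientation": 0.28, "leadership": 0.26, "natural_leader": 0.32},
    "drill_sergeant": {"work_orientation": 0.20, "leadership": 0.35, "natural_leader": 0.15},
}

def predicted_scores(scale_scores):
    """Return one predicted-performance score per target job."""
    return {
        job: sum(w * scale_scores.get(scale, 0.0) for scale, w in weights.items())
        for job, weights in ILLUSTRATIVE_WEIGHTS.items()
    }

# An NCO's standardized scale scores; counseling would favor the highest-scoring job.
nco = {"work_orientation": 0.8, "leadership": 0.5, "natural_leader": 1.1}
print(predicted_scores(nco))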

THE CONCURRENT VALIDATION STUDY<br />

We first conducted a job analysis of the Army recruiter military occupational specialty<br />

(MOS). This analysis had two purposes: to identify the recruiter performance requirements and<br />

thus to suggest performance measures that might be used as criteria in the validation study; and<br />

to identify candidate predictor tests for the validation research.<br />

Criteria

The job analysis suggested that production rates (as mentioned, the number of prospects brought into the Army per unit time) and peer and supervisory ratings of job performance on the main dimensions of recruiter performance would provide relevant criteria. We also decided to develop a situational judgment test to measure the problem-solving, judgment, and decision-making skills important in recruiting. More details on the criterion measures are provided below.

Predictors

Predictor measures for the test battery included: (1) the Army Research Institute's Background Information Questionnaire (BIQ), with eight scales including "natural" leader, social perceptiveness, and interpersonal skills; (2) the Army Research Institute's Assessment of Individual Motivation (AIM), with six scales including work orientation, leadership, and adjustment; (3) the Sales Achievement Profile (SAP), with 21 scales including validity scales, sales success, motivation and achievement, work strengths, interpersonal strengths, and inner resources; (4) the Emotional Quotient Inventory (EQI), with 15 scales covering intrapersonal, adaptability, general mood, interpersonal, and stress management components; and (5) the NEO, with five scales measuring the Big Five personality factors.

Sample

A total of 744 Army recruiters from 10 recruiting battalions made up our concurrent sample.

Results

First, regarding the criteria, we obtained production data for a 12-month period for many of the recruiters in the sample. Not all members of the sample had 12 months of data; on average, they had about eight months. We computed reliability coefficients for two through 12 months of data and found that four months provided a reasonable level of reliability (intraclass correlation = .59). Fewer months of production data reduced reliability substantially. Thus, only recruiters who had at least four months of production data were included in the validation analyses.
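For readers who wish to reproduce this kind of analysis, the sketch below estimates the reliability of k months of aggregated production counts under a one-way random-effects model (ICC(1,k)). The simulated data are illustrative only; the study itself used observed monthly production.

import numpy as np

def icc_average(data):
    """Reliability of the mean of k monthly counts: ICC(1,k) = (MSB - MSW) / MSB
    under a one-way random-effects ANOVA (rows = recruiters, columns = months)."""
    n, k = data.shape
    row_means = data.mean(axis=1)
    msb = k * ((row_means - data.mean()) ** 2).sum() / (n - 1)        # between recruiters
    msw = ((data - row_means[:, None]) ** 2).sum() / (n * (k - 1))    # within recruiters
    return (msb - msw) / msb

rng = np.random.default_rng(0)
stable_rate = rng.gamma(4.0, 0.25, size=400)     # stable recruiter differences (illustrative)
for k in (2, 4, 8, 12):                          # months of data available
    months = rng.poisson(stable_rate[:, None], (400, k)).astype(float)
    print(k, "months:", round(icc_average(months), 2))

With these illustrative parameters, reliability rises quickly with additional months, mirroring the pattern reported above.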

It is well documented that some territories are inherently easier or more difficult to recruit<br />

in than others. Thus, we experimented with correcting production data using the mean values for<br />

territories of various sizes, including at the brigade (N = 5), battalion (N = 41), and company<br />

(N = 243) levels. Corrections using brigade mean production levels proved to yield the most<br />

reliable production data, and this was therefore the correction employed.<br />
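A minimal sketch of this kind of correction follows, assuming simple mean-centering within brigade; the paper does not specify the exact functional form of the correction used.

import pandas as pd

# Hypothetical data: each recruiter's average monthly production and brigade.
df = pd.DataFrame({
    "recruiter":  ["a", "b", "c", "d", "e", "f"],
    "brigade":    [1, 1, 1, 2, 2, 2],
    "production": [1.2, 0.8, 1.0, 0.6, 0.4, 0.5],
})

# Center each recruiter's production on his or her brigade mean so that recruiters
# in easy and hard territories are compared on a common footing.
df["corrected"] = df["production"] - df.groupby("brigade")["production"].transform("mean")
print(df)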

Peer and supervisor ratings of job performance were gathered on eight behavior-based rating scales. A total of 1,542 raters provided ratings on 619 recruiters, an average of 2.41 sets of ratings per ratee. The performance ratings showed reasonable distributions (i.e., means around 6.5 on a 1-10 scale). Interrater agreement was also quite good, with peers and supervisors showing comparatively high agreement in their ratings of recruiters (rs = .55 to .67). Finally, to summarize the eight dimensions into a simpler system, three factors were identified: Selling, Human Relations, and Organizing Skills.

The situational judgment test (SJT) had 25 items in multiple-choice format that presented<br />

difficult but realistic recruiting-related situations and four to five response options that<br />

represented different possible ways to handle each situation. Effectiveness ratings for each<br />

response were provided by subject matter experts (SMEs), and these effectiveness ratings formed<br />

the basis of the scoring key. The best scores on the SJT were obtained when the recruiters’<br />

responses most closely corresponded to the responses the SMEs regarded as most effective.<br />

Relationships between the SJT and the other criteria (e.g., ratings, sales volume) were somewhat<br />

lower than expected. As a result of these unexpected findings, we did not include the SJT as a<br />

component of the criterion in our validation analyses. Future work on the SJT will examine an<br />

empirical keying approach as an alternative to SME judgments of effectiveness for scoring this<br />

measure.<br />
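One common way to key such a test, consistent with the description above though not necessarily identical to the scoring rule actually used, is to credit each response with the mean SME effectiveness rating of the option chosen:

# Sketch of SME-keyed SJT scoring: each option carries a mean SME effectiveness
# rating; an examinee's item score is the rating of the option chosen.
# Items and ratings are invented placeholders.
SME_KEY = {
    "item_01": {"A": 4.2, "B": 1.8, "C": 3.1, "D": 2.0},
    "item_02": {"A": 2.5, "B": 4.6, "C": 1.2, "D": 3.3, "E": 2.9},
}

def score_sjt(responses):
    """Total score = sum of the SME effectiveness ratings of the chosen options."""
    return sum(SME_KEY[item][choice] for item, choice in responses.items())

print(score_sjt({"item_01": "A", "item_02": "B"}))  # 8.8, the maximum for these two items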



For the validation analyses, a final sales performance composite was derived. Recruiting Command policy makers provided the following weights on the criterion measures: (1) production data (corrected) = 50%; (2) Selling Skills ratings = 30%; (3) Human Relations and Organizing Skills ratings = 10% apiece. Weighted standard scores for each component of the composite were summed and became the criterion against which the validity of the predictors was determined.
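A minimal sketch of the composite computation follows; the component values are toy data, while the weights are those stated above.

from statistics import mean, stdev

# Policy weights stated above.
WEIGHTS = {"production": 0.50, "selling": 0.30, "human_relations": 0.10, "organizing": 0.10}

def standardize(values):
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite(components):
    """Sum of policy-weighted standard (z) scores, one composite per recruiter."""
    z = {name: standardize(vals) for name, vals in components.items()}
    n = len(next(iter(z.values())))
    return [sum(WEIGHTS[name] * z[name][i] for name in WEIGHTS) for i in range(n)]

data = {  # toy values for three recruiters
    "production":      [1.2, 0.9, 1.0],
    "selling":         [7.0, 6.0, 6.5],
    "human_relations": [6.8, 6.2, 7.1],
    "organizing":      [6.0, 6.9, 6.4],
}
print(composite(data))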

Table 1 presents the significant validities for the refined AIM and BIQ scales. In the scale refinement process, items with higher correlations against the criterion were given a higher weight in the scale score. Various cross-validation analyses suggested that the validities in Table 1 are reasonable estimates of these two instruments' validity, without capitalizing on chance. The SAP, EQI, and NEO results were not quite as positive as the AIM and BIQ results.

Table 1

Correlations of AIM and BIQ Scales With the Sales Performance Criterion

Scale                          Performance Composite (N = 446-453)
AIM Work Orientation           .28**
AIM Leadership                 .26**
AIM Agreeableness              .10*
BIQ Hostility to Authority     -.14**
BIQ Social Perceptiveness      .18**
BIQ "Natural" Leader           .32**
BIQ Self-Esteem                .25**
BIQ Interpersonal Skill

**p < .01. *p < .05.



THE PREDICTIVE VALIDITY RESEARCH

The recruiter screening battery has considerable potential for identifying NCOs likely to be successful recruiters. However, we believe it is also important to evaluate the validity of the battery over time in a predictive validation design. As mentioned, this research is under way at the RRS, where incoming students at the school are being administered the test. In fact, more than 1,700 recruiters who completed the test battery, recently named the NCO Leadership Skills Inventory (NLSI), have now progressed through the RRS (although 10.9% dropped out during training), and 1,466 have at least five months of production data available for the predictive validation analyses. Below we describe that research.

Criteria

We used two criteria in the predictive work. One was attrition from the RRS; the other was production rate corrected for territorial differences, as in the concurrent study. We included recruiters in the study only if they had five or more months of production data. The reliability of the corrected production index was .64.

Predictors

As mentioned previously, the test battery now includes the BIQ and the AIM and is referred to as the NLSI.

Sample

More than 1,700 NCOs entering the RRS were administered the NLSI. Approximately 160 of these NCOs were dismissed from or otherwise failed to complete training. A total of 1,466 recruiters graduated from the RRS and had five or more months of production data.

Results

The predictive validity of the NLSI was also promising, although at this point the criteria are somewhat limited. First, the attrition results showed that when NLSI scores are divided into quartiles, attrition rates from the highest- to the lowest-scoring quartile were 6.5%, 8%, 10.5%, and 18.8%, respectively. Even more dramatically, the bottom 5% of NLSI scorers had an attrition rate of 36%, whereas the rest of the sample attrited at a 9% rate.

Second, the production rate results indicated that recruiters who scored in the top 25% on the NLSI brought in 1.10 recruits per month on average. For the second highest quartile the number was 1.06, for the third highest quartile 1.03, and for the lowest quartile .91. At the extreme bottom of the NLSI distribution, the lowest 5% brought in .75 recruits, whereas the rest of the sample averaged 1.05. For comparison purposes, the BIQ-AIM composite cross-validity against production in the concurrent study was .22; the corresponding validity coefficient in the predictive study was .15.



NLSI IMPLEMENTATION ON A WEB-BASED PLATFORM

In order to implement the NLSI most efficiently and effectively, we were asked to develop a web-based version. We coordinated with ePredix, an organization specializing in web-based testing, and the U.S. Army Research Institute to generate HTML-based AIM and BIQ forms that were as similar as possible to the original paper versions. The project staff generated instructions and item formats that could be used in conjunction with ePredix's established computer-based testing engine. This system allows for immediate scoring of the tests and real-time reporting of results. Furthermore, it enables candidates for recruiting duty, who are deployed throughout the world, to take the test at any military base equipped with a Digital Training Facility (DTF). To maintain the security of the battery, the test can be accessed only at appointed times and in a proctored setting at the DTF. The test was deployed on a trial basis at a limited number of DTFs in early 2003 and will gradually be expanded to all DTF locations. Data gathered so far indicate that recruiter candidates tested using the on-line system obtain scores comparable to those obtained by candidates who took the paper version.

CONCLUSION<br />

The NLSI appears to be a reasonably valid predictor of Army recruiter job performance.<br />

Concurrent and predictive validation results demonstrate substantial relationships with attrition<br />

from recruiter training and several indicators of recruiter performance. A web-based,<br />

computerized version of the NLSI has now been developed and is being administered to NCOs<br />

in Army DTFs. We anticipate that this instrument will be used operationally to encourage NCOs<br />

likely to succeed in a recruiting environment to apply for recruiting duty. We are also<br />

investigating potential uses of the NLSI for assignment to other NCO specialties (e.g., Drill<br />

Sergeant).<br />


Modelling Communication in Negotiation in PSO Context

Prof Dr MSc Jacques Mylle 34

Human Factors & Military Operations Research
Royal Military Academy
B1000 Brussels, Belgium
jacques.mylle@rma.ac.be

34 In close collaboration with Nicholas Yates, researcher and specialist in computational linguistics, and Prof Dr André Helbo, director of the Language Training Centre at the Royal Military Academy.

Scope

Although people are not conscious of it, negotiation is a very common behaviour. A major problem that often arises is the lack of a "shared mental model" of the negotiation situation, which constitutes the core of the negotiation. People often leave a great deal of information implicit, which in turn leads to misunderstandings. In other words, clear communication depends on the quantity, the quality, and the relevance of the information provided.

Soldiers deployed in a peace support context are no exception to these observations; on the contrary. Different cultural backgrounds, parties who have not mastered each other's language, interpreters who do not translate reliably, problems with role perception, and so on are all factors that make clear communication very difficult, and negotiation even more so.

Research aim

The aim of our project is to use mathematical and computational tools to unravel and represent the logical and informational structure of a dialogue in a given setting. The ultimate goal is to create a "machine" that can "dialogue" with a living person about a particular object. It will be used, on the one hand, to enhance people's communicative abilities in a military setting and, on the other hand, to train people in negotiation in a structured way.

It goes without saying that such a project requires contributions from several fields, among others psychology, linguistics, and computer science. In this paper we look specifically at the computational linguistic aspects.

Approach

Formally speaking, three separate linguistic aspects have to be considered, each implemented in a specific module.

1. The syntactic aspect.
The purpose of this module is to analyze a sentence from a grammatical point of view, i.e., identifying the subject, the verb, and so on. The output of this first module serves as structured input for the second module.

2. The semantic aspect.
This module builds a logical representation of the meaning of the sentence, starting from its syntactic structure. In most cases these logical representations are underspecified with respect to certain relationships within and between sentences. Both of these modules work at the sentence level. This ambiguity is inherent to human language when used out of context.



3. The pragmatic aspect.
The module for this aspect works on the underspecified relationships created by the semantic module. It builds a general representation of a series of sentences, which may be a continuous text or a dialogue. This module has to link a given sentence to the preceding discourse by confronting them with the given context/setting, which can be reduced to a set of "givens" and relationships. This set of information must make it possible to resolve the ambiguities that result from the initial linguistic underspecification in the semantic representation.

Each of these three modules requires a specific formal language. We have chosen the following ones for our project:

Head-driven Phrase Structure Grammar (Pollard & Sag, 1994), in short HPSG, for the syntactic module.

Minimal Recursion Semantics (Copestake et al., 1999), in short MRS, for the semantic module.

Segmented Discourse Representation Theory (Asher & Lascarides, 2003), in short SDRT, for the pragmatic module.

Both HPSG and MRS are already implemented in a computational language in a system called the Linguistic Knowledge Builder (Copestake et al., 2000), in short LKB. Unfortunately, this is not the case for SDRT, so this tough task falls to our research group.
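To make the intended three-module architecture concrete, the sketch below chains stand-in functions for the three stages. The real syntactic and semantic analyses are performed by LKB; parse(), compose_mrs(), and the rule set here are hypothetical placeholders that only show how the modules hand their output to one another.

from dataclasses import dataclass, field

@dataclass
class MRS:                       # underspecified sentence-level semantics
    predicates: list
    unresolved: list             # relations left open at the sentence level

@dataclass
class SDRS:                      # discourse structure built by the pragmatic module
    segments: list = field(default_factory=list)
    relations: list = field(default_factory=list)

def parse(sentence):             # syntactic module (HPSG; done by LKB in reality)
    return {"sentence": sentence, "features": "typed-feature-structure"}

def compose_mrs(tfs):            # semantic module (MRS; done by LKB in reality)
    return MRS(predicates=[tfs["sentence"]], unresolved=["discourse-attachment"])

def attach(sdrs, mrs, world_rules):   # pragmatic module (SDRT; to be implemented)
    # Choose the attachment consistent with the world rules that maximises
    # coherence; reduced here to appending with a default relation.
    sdrs.segments.append(mrs)
    sdrs.relations.append(world_rules.get("default", "Narration"))
    return sdrs

discourse, rules = SDRS(), {"default": "Def-Cons"}
for turn in ["This is the person who had an accident.", "But he did not get his money yet."]:
    discourse = attach(discourse, compose_mrs(parse(turn)), rules)
print(discourse.relations)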

An example of communication / negotiation in a PSO setting

The text below is part of the transcript of a dialogue recorded in Leposavic (Kosovo) in January 2003 during a working visit in the context of our project. The two parties involved are the Damage Control Officer (DCO) of the Belgian Battle Group in Kosovo, whose mother tongue is French, and a Kosovar civilian, assisted by a Kosovar interpreter with a more or less adequate command of English.

Β01 Officer: Hello Dragan.
Β02 Interpr: Sacha, Sacha! How are you? (Gives a document to the DCO)
Β03 Officer: I am OK.
Β04 Interpr: This is the person who had an accident with a Belgian bus six months ago.
Β05 Officer: Yes…?
Β06 Interpr: But he did not get his money yet.
B07 Officer: Mm. Mm. (Reads the document) Vu-ko-ma…
B08 Interpr: Vukomanovic.
B09 Officer: I don't have a file on this person.
B10 Interpr: Hun? He told me that they told him to wait four months. Within this four months he would be refunded. But he is waiting since more than six months and a halve.
…

The syntactic module: HPSG and LKB

Figure 1 shows how LKB analyses a simple sentence like B09, "I don't have a file on this person."


The left side of the window shows the graphical representation of the syntactic analysis. One can easily locate the subject, the verb, the negation, and the object. The right side of the window gives the computational representation of the syntactic elements of this particular sentence in the HPSG language. Such a structure is, mathematically speaking, a "typed feature structure". It contains all the elements that are relevant and necessary for the subsequent steps in the analysis.

Figure 1. Example of a sentence analysis by HPSG in LKB. (Left: graphical representation of the syntactic structure of the sentence. Right: computational representation of the syntactic information in the HPSG language; mathematically, this is called a typed feature structure.)


The semantic module: MRS and LKB

The information contained in the HPSG representation allows the underspecified semantic representation of the same sentence to be computed (Figure 2). It can be visualized in its computational form, as shown in the left part of Figure 2, and/or in its mathematical form, as shown in the right part.

Figure 2. Example of output produced by the semantic module. (Left: computational version. Right: mathematical version.)


The pragmatic module: SDRT

The implementation of this module constitutes the largest share of the work. As already said, SDRT is a theory that predicts how successive sentences of a text/dialogue are linked to each other. To be able to do so, a number of rules and axioms about the "state of the world" for the problem at hand must be specified beforehand. In other words, what is left implicit in natural language, because the participants share a mental picture of the state of the world, must be made explicit in the program. By means of this complementary set of information, it is possible to determine what type of relationship links the sentence under consideration to one or more of the preceding ones.

For example, take sentences Β04, Β05, and Β06.

Β04 Interpr: This is the person who had an accident with a Belgian bus six months ago.
Β05 Officer: Yes…?
Β06 Interpr: But he did not get his money yet.

After Β04, SDRT builds a "segmented discourse representation structure" (SDRS). Figure 3 below shows these SDRSs in their standard graphical form. Each of the parts after a π constitutes a unit of information from the minimal recursion semantics structures (MRS) whose parts can be univocally determined, in the sense that there are at least no contradictions with what is already known. This graphical form is a notational gloss for a more precise, but less readable, representation.

Figure 3a. Example of rules about the "state of the world" for SDRT.

How does SDRT work with these entities? First, an expression like "this is the person who…" implies that the speaking party (here the interpreter) supposes that the listening party (here the officer) knows who "this person" is (the Kosovar who is complaining). This assumption is a general axiom. Moreover, in the given context, we can assume as "normal" that, if the person is known to Damage Control, there is a file on that person. This is a contextual common-sense rule.

If this type of information is available to SDRT, it can compute this complementary information (π4c) and link it to the sentence π4 through a causal relation, in this case "consequence by default" (Def-Cons).
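The rules and axioms discussed here might be encoded as defeasible implications over simple facts, as in the sketch below. The predicates and rule format are invented for illustration and are far cruder than the actual SDRT axiomatisation.

# "State of the world" rules encoded as defeasible implications over simple
# facts. Predicates and rule format are invented for illustration.
RULES = [
    # general axiom: "this is the person who..." presupposes the hearer knows the person
    {"if": {"referred_as_given"}, "then": "known_to_damage_control", "rel": "Def-Cons"},
    # contextual common-sense rule: a person known to Damage Control has a file
    {"if": {"known_to_damage_control"}, "then": "has_file", "rel": "Def-Cons"},
]

def default_consequences(facts):
    """Forward-chain the defeasible rules until no new consequence appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            if rule["if"] <= derived and rule["then"] not in derived:
                derived.add(rule["then"])
                changed = True
    return derived - set(facts)

# From pi4 ("this is the person who...") the complementary information pi4c
# (there should be a file) can be derived as a consequence by default.
print(default_consequences({"referred_as_given"}))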



Next the program has to deal with Β05. This reply is in fact a question, but it can only be understood as a relevant question if it is interpreted as a query for further specification. This is represented by the relation Request for Explanation, or Explanation* in SDRT terminology (Figure 3b). In this concrete case, the officer still does not know the reason for the visit (formally speaking, he does not have enough information to "compute" its purpose) and asks for further specification with his "yes…?"

Figure 3b. Example of lack of information about the "state of the world" for SDRT.

The interpreter correctly understood the meaning of the "yes" (π5) and answered with Β06. Here another type of rule has to be introduced: it is known that, in general, a "but" implies a contrast with an implicit default consequence belonging to an earlier part of the discourse. In the example, the implicit consequence is that the Kosovar civilian should in fact already have been paid. To be able to "compute" this implicit consequence, some other contextual rules must be introduced, such as "If a person who was involved in an accident was in the right, (s)he has to be reimbursed for the repair of the damage."

In the example the interpreter leaves this idea implicit by giving the motive of his visit, "but he did not get the money yet" (π6), as an answer to π5, which also leaves the object implicit.

As a result SDRT produces the nested structure shown in Figure 3c.

What guarantees that we end up with the representation shown in Figure 3c is that this solution, among all possible meaningful alternatives, maximises the coherence between the already elaborated structure and the information to be added. In other words, each sentence is linked to the discourse structure by at least one relation and simultaneously resolves a maximum of ambiguities.

Under ideal circumstances such a structure represents:
- what the parties are saying;
- what their intentions are at the communicative level;
- what is achieved by saying what they said.


Figure 3c. Example of a contextual rule about the state of the world for SDRT.<br />



Conclusion

Communication abilities are a critical operational issue at all levels of the organisation. To optimise performance in operations, people must therefore be trained in them. A computer-based tool allows for training anytime, anywhere.

HPSG and MRS do not pose fundamental modelling problems. Analysis with SDRT is only possible if an effective system of rules and axioms about the state of the world for the problem at hand exists. These rules and axioms have to be derived from the analysis of empirical data; in the context of our project, this means the analysis of a number of recorded "real life" negotiations and/or good simulations/role plays. The "theory" about the state of the world is then built step by step and adjusted when needed to fit new data.

This is precisely what we will start in the coming months: recording "input" through role-playing games at the Royal Higher Defence Institute, because we experienced a number of unbridgeable difficulties with the real-life data in Kosovo, especially due to the language problems and the use of a third party, the interpreter. The next steps are then deriving the rules and axioms, followed by implementing them in an algorithm that builds an SDRT structure from a series of MRS produced by LKB. Stated otherwise, we have to create a "glue language" that allows a sentence to be hooked onto the already elaborated structure.

References

Asher, N., & Lascarides, A. (2003). Logics of Conversation. Cambridge: Cambridge University Press.

Copestake, A., Flickinger, D., Sag, I. A., & Pollard, C. (1999). Minimal Recursion Semantics: An Introduction. CSLI, Stanford University. http://www-csli.stanford.edu/aac/papers.html

Copestake, A., Carroll, J., Malouf, R., & Oepen, S. (2000). The (new) LKB system. CSLI, Stanford University. http://wwwcsli.stanford.edu/aac/doc5-2.pdf

Pollard, C., & Sag, I. (1994). Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications.


ELECTRONIC ADVANCEMENT EXAMS –<br />

TRANSITIONING FROM PAPER-BASED TO ELECTRONIC FORMAT<br />

Kirk Schultz, Robert Sapp and Larry Willers<br />

Naval Education and Training Professional Development and Technology Center (NETPDTC)<br />

Pensacola, Florida, USA, 32509<br />

lee.schultz@navy.mil; robert.d.sapp@navy.mil; larry.willers@navy.mil<br />

ABSTRACT<br />

The U. S. Navy’s increased emphasis on Human Performance is changing the way<br />

Sailors are trained. It is important that the Advancement Exam process leverages continuing<br />

technological progress to better assess an individual Sailor’s performance in an accurate and<br />

meaningful manner. This paper reviews the current status of the Electronic Advancement Exam<br />

Initiative, which capitalizes on the ability to use multimedia to better present questions with a<br />

performance orientation. The initiative’s goals and objectives, design and procurement<br />

decisions, content development methodology, and considerations for integration with the current<br />

exam process are addressed.<br />

INTRODUCTION<br />

The transition process from a paper-based to an electronic format for U. S. Navy Enlisted<br />

Advancement Exams was originally presented by Baisden and Rymsza (2002), who conducted<br />

Phase I of the initiative and established the viability of replacing the existing paper-based exam<br />

with an electronic, multimedia format. This paper deals with Phase II issues, which included:<br />

• addressing organizational culture issues associated with Advancement Exams<br />

• reviewing and addressing internal/external process issues related to Advancement Exams<br />

• defining and procuring specific hardware solutions for electronic exam implementation<br />

• integrating the electronic exam software solution with existing database resources<br />

• establishing standards and processes for developing multimedia assessment items<br />

ORGANIZATIONAL CULTURE

The first major hurdle faced was the issue of change management. Navy paper-and-pencil testing has been around for over fifty years. Moving from a validated process to one entailing a major shift in exam design (increased performance orientation using multimedia) as well as a different presentation mode (electronic) left many skeptical. Three factors have had a significant impact on the success of the initiative to date.

First, a motivated and capable Advancement Exam Development Team (hereafter referred to as the "Team") was established. To conduct Phase II, three lead Team members were selected to research and implement the change. In an effort to ensure an objective look at implementation options, two of the three were brought in from outside the Advancement Exam Center to work with the project full-time. The third lead member was a relative newcomer to the Center and dedicated half-time to the project. Their backgrounds included engineering, computer programming, instructional design, project management, procurement, and human performance expertise. Project responsibilities fell into three general areas (hardware, software, and content), with one lead member responsible for each. Process issues tended to cross all three of these general areas and were addressed both by individuals and by the Team as a whole. The lead Team tapped expertise from other departments as required for short-term guidance and assistance.
assistance.<br />

Second, every effort was made to achieve a series of rapid successes in small, but critical,<br />

elements of the electronic exam process to prove the concept, gain acceptance, build momentum<br />

and establish ultimate success as a realistic possibility. As Phase II was initiated, a number of<br />

individuals expressed concern and voiced their reasons why this initiative would fail. The Team<br />

moved quickly to review and select hardware and software that, out of the box, addressed key<br />

questions people had about the viability of creating and delivering electronic exams. With Team<br />

coordination and assistance from the Exam Development Software (EDS) programmer, required<br />

modifications that were anticipated by some to take three to six months were achieved to an 80<br />

percent level in two weeks, with a prototype ready for implementation in less than one month.<br />

SME content developers moved quickly to generate sample multimedia exam questions that were<br />

performance-oriented and illustrated content that would have been difficult or impossible to<br />

convey through text alone. Key employees and persons of influence within and associated with<br />

the Navy Advancement Center were kept informed of progress through regular meetings held<br />

every two weeks. The short time frame within which these successes were achieved started to<br />

move the organizational culture from a “This will never happen” position to “This initiative is<br />

going to happen and we need to be a part of it.”<br />

Finally, efforts were undertaken to brief higher-level echelons on the plan and solicit<br />

support. Tying this initiative to the Navy’s current Revolution in Training with its emphasis on<br />

human performance was key. The same infrastructure that will allow Navy members electronic<br />

access to elements of their career progression can be used to provide Navy advancement testing<br />

in an all-electronic format. While implementation of this access is not expected for some time, it<br />

is important to work out the details of the electronic advancement testing process now, so that it<br />

can be expanded as soon as the infrastructure is in place.<br />

PROCESS ISSUES

The United States Navy prides itself on its track record of consistently creating and administering examinations to rank-order enlisted Sailors for advancement in grades E4-E7. To smooth the transition, Phase II adhered to the existing paper-based processes to the greatest extent practicable. Review of the current paper-and-pencil examination process indicated that only minor modification was needed to accommodate electronic testing. For example, electronic testing will utilize existing Education Services Officers (ESOs), who will administer the exams in a manner consistent with the current paper-and-pencil process. Where variations between the processes exist (like charging computer batteries the night before the exam), these additional responsibilities will be clearly communicated. Following the exam, in non-networked environments the ESO will express-mail


digitized answer files (saved to Secure Digital (SD) cards) back to the Center, in a manner<br />

similar to the way completed machine-scoreable answer sheets are sent back now. These answer<br />

files will be formatted so the results can be processed and item statistics evaluated without<br />

requiring any modification to current procedures.<br />

HARDWARE ISSUES<br />

Hardware Platform Selection<br />

Any shift to an electronic advancement exam process would be predicated on the<br />

availability of a sufficient number of appropriately located and configured computing devices to<br />

support delivery of the exams. A fundamental issue addressed by the Team was that an<br />

insufficient quantity of suitable computers currently exists in the fleet. Since this meant that<br />

equipment procurement was required, and because there have been advances in technology and<br />

decreases in equipment costs in the year since Baisden and Rymsza (2002) performed their<br />

original Phase I study, taking another look at the suitability of various devices for use in exam<br />

delivery was justified.<br />

At the start of Phase II, the latest models of Personal Digital Assistant (PDA) were<br />

reviewed. A major concern remained that, in delivering multimedia questions, the small screen<br />

size would require the test taker to toggle between two screens or scroll to see the content of each<br />

question. Also, testing to determine maximum useful battery life indicated an operating duration<br />

of just slightly over two hours - insufficient to deliver a three-hour exam. The PDA was<br />

therefore eliminated as a delivery vehicle for the foreseeable future.<br />

Another potential exam delivery platform evaluated in the Phase I effort was a tablet<br />

computer. The recent release of Windows XP Tablet Edition, with its integrated support for<br />

stylus operation and handwriting recognition, raised the possibility of delivering exams in a<br />

manner that closely replicated the “pencil and paper” approach so familiar to Sailors.<br />

Additionally, the extended screen height of tablet computers permitted questions with larger<br />

content (like graphic alternative choices) to be viewed on-screen without requiring the user to<br />

scroll. Evaluation of the Electrovaya Scribbler indicated that it has sufficient battery life to<br />

deliver a three-hour exam with an adequate reserve capacity. However, at almost $2,800 per<br />

unit, this unit was determined to be too costly to acquire in the quantity necessary to support<br />

deployment of an electronic exam system.<br />

While brief consideration was given to the idea of using desktop PCs due to the very<br />

aggressive pricing available (typically half the cost of a similarly equipped notebook computer),<br />

shipping costs and other logistical issues associated with deploying desktop systems and<br />

returning them after the exam cycle left notebooks as the most promising option. Management<br />

determined that, due to budgetary constraints, $1,000 was the target price for the initial Phase II<br />

transition systems. After surveying the offerings of the major computer vendors with whom the<br />

government has purchasing agreements, the Gateway M305E notebook computer was selected.<br />

System testing identified that, when outfitted with a high-capacity (4400 mAh) lithium-ion<br />

battery, the system delivered a 3 hour and 40 minute battery life when running BWS<br />

BatteryMark (Version 1.0) set to simulate usage similar to that expected when delivering an<br />



exam. Gateway was able to provide these systems, configured to Phase II specifications, at close<br />

to the target price.<br />

Hardware Configuration Considerations<br />

In evaluating and selecting a suitable hardware platform, decisions had to be made<br />

regarding the specific configuration that would best support the delivery and administration of<br />

the examinations. At the same time, the processes developed must place the fewest possible<br />

demands on the exam proctors, as they likely will have little familiarity with the exam software<br />

and may not have extensive computer experience. After careful consideration of a host of<br />

options, the Phase II exam deployment strategy selected provides the exam on separate media, to<br />

be plugged into the exam station at the time of the test. While various digital media were<br />

analyzed, Secure Digital (SD) cards were ultimately selected for their economy, wide availability<br />

and broad industry support. The Gateway M305E has an internal digital media reader accepting<br />

SD cards. Using the SD card addressed three primary processes. First, it provided a way to get<br />

the exam to the Sailor initially, and also established a fallback process that can be used if a Sailor<br />

misses an exam and needs a makeup or alternate test. A second SD card with a new test can be<br />

sent, rather than a new notebook computer. Second, it provided a way to get the exam back from<br />

the Sailor. The concept of having exam proctors download answer files and e-mail them back to<br />

the Center is attractive, but represents a significant change from current practice and presents<br />

special challenges for exams with classified content. When SD cards are used to deliver the<br />

exam, the results can be written back to the card, providing the administering activities with a<br />

removable component that can be express-mailed back to the Navy Advancement Center in a<br />

manner similar to that used for paper answer sheets. Finally, providing the test on SD cards<br />

reduces the classification security storage concerns by allowing the computers to be shipped as<br />

unclassified devices. Only the SD cards have to be stored in a classified safe prior to the day of<br />

the exam. The computer systems themselves would only become classified once the exam was<br />

inserted into the machine at test time, reducing the time the computers were classified devices<br />

from several weeks to a few days. Classified storage issues could possibly be avoided<br />

completely if the administering activities repackaged and returned the computer systems to the<br />

Center immediately upon conclusion of the exam. If that is not possible, one solution is to<br />

remove the computer hard drives and store them separately. This returns the computers to an<br />

unclassified status free from security controls and reduces physical storage requirements.<br />

SOFTWARE ISSUES<br />

In Phase I, Perception for Windows by Questionmark was selected as the authoring<br />

tool and exam delivery vehicle, “primarily because of its widespread commercial use and its<br />

success in Navy training for end-of-course testing” (Baisden and Rymsza, 2002). As in the case<br />

of the hardware, Phase II began with a review of available assessment software to determine<br />

whether significant advances had occurred since Phase I. Two software packages were<br />

determined to be the leading candidates: OutStart Evolution ® (a Learning Content Management<br />

System (LCMS) that has been adopted by the Naval Education and Training Command (NETC)<br />

as the primary content development tool for use with Navy E-Learning) and, once again,<br />


Questionmark Perception. While both software packages have extensive capabilities and features, only Perception provides: a question bar indicating which questions have been answered, a method to flag questions for review, a user prompt if any questions are still unanswered when the exam is submitted, a way to turn off the display of the final score, and encryption for delivered questions. Additionally, Perception has the capability of importing XML (eXtensible Markup Language) files conforming to the IMS QTI Specifications (see IMS Global Learning Consortium, Inc., 2000). At the time of this review there do not appear to be any benefits to using Evolution instead of Perception for assessments. For these reasons, Questionmark Perception was retained as the assessment development and delivery software.

Questionmark Perception permits questions to be delivered in either a scrolling format or a question-by-question format. As in Phase I, Phase II adopted the question-by-question format. However, the question presentation template was modified to display a security banner at the top of the screen, reduce the height of the button bar at the bottom of the screen, reduce the font size, and eliminate the timer, to permit more content to appear on the screen. This reduces the need for the user to scroll to see all of the question content.

Subject Matter Experts use an in-house developed and maintained Exam Development Software (EDS) to generate exam questions and create paper-based advancement exams. To date, paper-based exams have made only limited use of black-and-white graphics. These graphics can be associated with the question stem, but not the question alternative choices. Moving to an electronic exam introduces the capability of including audio, video, and interactive graphics in various parts of exam question content, so EDS has been modified to support these file types.

Taking advantage of the Perception feature that permits importing content conforming to the IMS QTI Specifications, EDS now has the capability of generating an XML file that can be seamlessly imported into Perception. All required media files are copied from the EDS master resource folder into an exam-specific folder to expedite packaging for final delivery. This allows EDS to remain the single source for all question generation and editing. EDS now also produces a "preview" XML file that uses an XSL (eXtensible Stylesheet Language) file to format the questions so the SMEs can use Microsoft Internet Explorer to rapidly review all of the exam questions in a scrolling format. This reduces the learning curve, cost, and time associated with individual SMEs using Perception for this purpose.
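As an illustration of this export step, the sketch below generates a QTI 1.x-style XML item. The element names follow the spirit of the published QTI 1.x layout, but the example is abridged and hypothetical (no response processing or scoring key is included), it is not a schema-validated instance, and the item content is invented.

import xml.etree.ElementTree as ET

def item_to_qti(ident, stem, choices, media_uri=None):
    """Build one abridged QTI 1.x-style multiple-choice item element."""
    item = ET.Element("item", ident=ident)
    pres = ET.SubElement(item, "presentation")
    material = ET.SubElement(pres, "material")
    ET.SubElement(material, "mattext").text = stem
    if media_uri:                                  # multimedia attached to the stem
        ET.SubElement(material, "matvideo", uri=media_uri)
    rl = ET.SubElement(pres, "response_lid", ident="RESP", rcardinality="Single")
    rc = ET.SubElement(rl, "render_choice")
    for label, text in choices.items():
        resp = ET.SubElement(rc, "response_label", ident=label)
        ET.SubElement(ET.SubElement(resp, "material"), "mattext").text = text
    return item

item = item_to_qti(
    "SAMPLE-0001",
    "Which cloud type is shown in the video clip?",
    {"A": "Cumulonimbus", "B": "Cirrus", "C": "Stratus", "D": "Altocumulus"},
    media_uri="media/cloud_clip.mpg",
)
print(ET.tostring(item, encoding="unicode"))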

Finally, as part of the advancement exam process, the exam taker is required to provide additional organizational and personal information. In Phase II, the initial server-based version used Perception to perform this function, similar to what was done in Phase I. However, using a separate Active Server Page or Visual Basic program permits more flexible screen formatting, provides more control over how the information is manipulated, and is less time-consuming to set up and maintain as enterprise deployment is implemented. A separate program will be used after the exam has been taken to extract the required data from stand-alone or server processes and format it for analysis and statistical processing.

ASSESSMENT DEVELOPMENT CONSIDERATIONS<br />

Navy Advancement Exams consist of 200 4-alternative (4-alt) multiple-choice questions.<br />

Question development is governed by a stringent set of guidelines and the quality of the<br />



questions is monitored statistically based on by-item exam results. In selecting the first ratings to transition to electronic exams, consideration was given to several factors: size of the rate/rating, demographic distribution of exam candidates, classification of the exam, appropriateness of multimedia for use with the rate/rating content, status of existing multimedia materials, number of exam bank control items, willingness of the SME exam writer to work on an electronic exam, marketing impact a successful exam would have, and exam candidates' anticipated familiarity with computers. The two rates selected, AG3 (Aerographer's Mate, Third Class) and STG3 (Sonar Technician - Surface, Third Class), met the desired criteria. Successfully distributing exams to these two communities will address key issues common to the broad spectrum of Sailors in various ratings across the Navy. In addition to occupationally oriented content, Professional Military Knowledge (PMK) questions common to all rating exams were developed that used multimedia to better illustrate job performance. For the initial test development, between 35 and 45 percent of the questions used multimedia.

A major concern is the increased time it takes to create a multimedia-based exam item. Approximately 75 percent of the required multimedia did not already exist: the required animations were created using vector graphics, and photos/videos were obtained from photo shoots. For the remainder, existing media needed to be modified. It took an average of twenty times longer to create a multimedia-based exam item than to create a simple text-based one. Only government-created multimedia resources were used, to avoid the possibility of any copyright violations.

CONCLUSION

The implementation of electronic exams is no longer a question of "if," but "when." Phase II is off to a good start, but there is still much work to be done. Ongoing technological advances in hardware and software will need to be monitored so that they can be leveraged to enhance this project. Required changes in organizational and fleet culture will need additional consideration and careful implementation. But the solutions adopted to date have been designed to be scalable with a minimum of alteration. The efforts invested through these initial stages of Phase II provide numerous lessons learned of benefit to other organizations making a similar transition.

REFERENCES

Baisden, A., & Rymsza, B. (2002). New directions in navy assessment: Developing a multimedia prototype. In 44th Annual Conference of the International Military Testing Association Proceedings (22-24 October 2002, Ottawa, Canada). http://www.internationalmta.org/2002/02IMTAproceedings.pdf (retrieved October 7, 2003).

IMS Global Learning Consortium, Inc. (2000). IMS question & test interoperability specification: A review (IMSWP-1 Version A, 10 October 2000). http://www.imsglobal.org/question/whitepaper.pdf (retrieved October 23, 2003).

417<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


418<br />

Integrated Web Assessment Solutions

David T. Pfenninger, Ph.D., Reid E. Klion, Ph.D., and Marc U. Wenzel, Ph.D.
Pan, Inc.
111 Congressional Blvd., Suite 120
Carmel, IN USA 46032
david@pantesting.com

One of the most exciting and practical new technological developments for psychologists and other test administration professionals is the incorporation of psychological and behavioral assessment tools into efficient and secure Internet-based platforms.

Web-based assessment holds the promise of improved performance and efficiencies of scale for practitioners and researchers alike and, by extension, a potentially superior methodology for their clients. However, this potential is counterbalanced by special challenges attendant to the web delivery medium itself, as well as by the dynamics of both a methodology (test administration and processing) and a market (testing, assessment, survey, and collateral measurement domains) in transition.

This paper will review some of the recent developments and emerging trends and practices in the world of integrated web assessment to help orient test users and consumers to this new tool, its promise, and its challenges.

Why Web Assessment?

The rise of web assessment (aka "e-testing") appears to be inevitable. Indeed, the director of the Association of Test Publishers stated unequivocally that the "…emphasis on e-testing means that current test and assessment tools will probably be replaced by electronic versions or by new tests with exclusive online use…." (Harris, 1999).

Similarly, influential psychological researchers Patrick DeLeon, Leigh Jerome, and their colleagues (1999) have observed that "…behavioral telehealth is emerging as a clinically important and financially viable option for psychologists…the Internet is particularly well-suited for self-report questionnaires and surveys."

Web assessment and testing is a variant of Computer Based Testing (CBT), leveraging Internet content delivery and data transmission with the goals of creating greater client access and faster processing of data. The key advantages of CBT are preserved, and possibly accentuated, in the e-testing variant:

1. Psychologists are comfortable using computer-based test administration, scoring, and electronic report generation, with 85% having done so (McMinn, Ellens, & Soref, 1999).

2. Computerized administration and scoring of tests have become a generalized practice (Silzer & Jeanneret, 1998).

3. Computerized testing has been shown to save significant clinician and patient time (Yokley, Coleman, & Yates, 1990).

4. Researchers have found in controlled studies that hand scoring of personality tests by trained scorers results in 53% of profiles containing errors, 19% of them significant enough to affect a diagnosis (Allard, Butler, Faust, & Shea, 1995). Similar scoring inaccuracies with scanner technology are currently plaguing the standardized educational testing market, as evidenced by tort suits filed by plaintiffs alleging harm from such errors.

5. Psychometric research on computerized tests yields conclusive support for test characteristics of stability and validity (Alexander & Davidoff, 1990).

6. Computerized testing provides a highly standardized and uniform process not influenced by examiner characteristics (Lewis, Pelosi, et al., 1988).

7. Computerized testing formats are judged acceptable by patients, rated by patients as easy to use, and apparently preferred to conventional paper-pencil and interactive testing in perceived comfort (Campbell, Rohlman, et al., 1999; Navaline, Snider, et al., 1994; Pisoneault, 1996; Telwall, 2000).

8. Test-takers tend to divulge more information to a computer test module than to human examiners (Hart & Goldstein, 1985; Malcom, Sturgis, et al., 1989).

Test Quality and Method Variance Issues

Web assessment and e-testing sites and programs vary widely in terms of the information they provide to help assess the quality of the available instruments. Many sites claim to have "reliable and valid" tests; as is the usual practice, practitioners should verify these claims with data. Some sites offer online manuals, psychometric white papers, and other information allowing the same evaluations psychologists have done with paper-and-pencil assessments.

Web assessment per se does not appear to have a significant impact upon validity or reliability considerations for most kinds of testing, although commensurability and method variance studies are in their infancy for more complex stimulus presentations. The following is a brief review of extant sources of commensurability or method variance study involving computer or web assessment.

The Mead and Drasgow (1993) meta-analysis of cognitive ability test equivalency studies found average cross-mode correlations of .97 for power tests and .72 for speeded tests, a remarkable level of method equivalence that sets a formidable bar for those who would challenge equivalence of methods.

There are more recent examples relative to the form of CBT referred to here as "e-testing" or "web-based", "internet-based", or "online" testing or assessment. For example, the Canadian Public Service Commission Personnel Psychology Centre (Chiocchio et al., 2003) has executed method variance studies comparing paper/pencil vs. online versions of several timed cognitive skills tasks, including performance measures of reading ability, numerical ability, and visual pattern analysis ability. They found virtually no method variance and concluded that the forms were functionally equivalent and require no correction factor.


Weiss (2001) reviewed the literature and presented design considerations and outcome summaries for several in-house method variance studies, concluding that method variance effects, even for perceptual-response-based tests, tend to be trivial in most cases. His group's own studies with the Beck Scales (paper/pencil vs. online) yielded equivalence with no correction factor, suggesting "there is ample evidence that computer administration of clinical personality tests, such as the Beck Depression Inventory, are comparable with paper administration when certain essential design elements are carefully considered."

Other researchers have found similar results. Biggerstaff, Blower, and Portman (1996) found equivalence of paper-pencil and computerized versions of the Aviation Selection Test Battery without the need for score transformations, while Neuman and Baydoun (1998), Clark (2000), and Donovan et al. (2000) all conducted comparisons of paper/pencil vs. CBT for personality questionnaires, finding strong equivalency.

Potosky and Bobko (1997) provided support for the equivalence of non-cognitive computerized versions of paper/pencil-normed tests in a personnel selection program. Smith and Leigh (1997) found online and paper versions of a clinical questionnaire to be comparable. Coffee and colleagues (2000) reported on web-based civil service testing with equivalent forms readily established.

Burnkrant and Taylor (2001) presented the results of three method variance studies and suggested, in line with previous work, that "data collected over the internet and in a traditional paper-and-pencil setting are largely equivalent" (p. 5). Weiner and Gibson (2002) presented several studies on the PSI Employee Aptitude Survey cognitive ability tests, ultimately concluding that "web- and paper-based battery scores were found to be highly equivalent."

Thus, a growing corpus of available research finds paper-pencil, computer-based, and online versions of various cognitive ability tests (both speeded and power), personality tests, and rating scales to be de facto equivalent.

If a method difference is established, one should identify and enter a correction factor into the scoring algorithm for the web version, and slowly retire the paper version, if the online test is to use the same norms that were the basis of the paper/pencil test. Of course, many tests are now in development without regard for paper/pencil administration at all, based upon computerized web administration from the outset.
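To make the correction-factor idea concrete, the following is a minimal sketch assuming a simple linear equating model; the function name and constants are hypothetical, and an operational program would derive them from a formal equating study:

    # Sketch: report web-administered scores on the paper/pencil norms.
    # SLOPE and INTERCEPT are placeholders for constants that would come
    # from a paper-vs.-web equating study.
    SLOPE = 0.98
    INTERCEPT = 1.2

    def equate_web_score(raw_web_score: float) -> float:
        """Map a web-administered raw score onto the paper/pencil scale."""
        return SLOPE * raw_web_score + INTERCEPT

    # With no method difference (SLOPE = 1, INTERCEPT = 0), web scores are
    # interpreted against the paper norms without adjustment.
    print(equate_web_score(50.0))  # 50.2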

Based upon the authors' experiences in developing over 400 online testing products for more than 40 commercial test publishers and a dozen major proprietary corporate or government testing programs, the following tentative conclusions may be offered:

1. The consensus in the commercial test and measurement industry is to assume equivalence of forms as a general baseline heuristic. This seems to be accepted by the firms as one of the clearest results in the aggregated history of scientific method variance study, a result that extends across test content domains (clinical, I-O).

2. Publishers and users by and large consider CBT and internet or e-testing equivalent.

3. Most publishers have chosen not to execute method variance studies for their online tests, especially for CBT-to-e-testing conversions.

4. For those publishers executing method variance reviews of online products, the findings have been fairly uniform: demonstrable equivalence, even across power and speeded cognitive tests.

5. Some publishers present a caveat emptor notification to the test administrator prior to purchase if method variance studies have not been completed on a product formerly available only in paper/pencil form.

Factors Driving Web Assessment

These psychometric and ergonomic considerations aside, the main reason for the advent of web assessment or e-testing is that test publishers and users perceive cost savings and revenue enhancement potential in the new format. The new e-testing product offerings are beginning to have an impact on the test industry similar to what has occurred in recent years with online retailing in the book-selling industry, delivery of e-learning content in the training industry, and e-cruiting in the recruitment industry.

Web assessment allows for continued test and item refinement by allowing for the aggregation of anonymous raw data that in the past had eluded the test developer, which should make the development or refinement of test products much faster and less labor-intensive. Further, with improved document control technologies, publishers can increasingly protect their test items against unlawful copying and copyright violations.

Because e-testing has the potential to attenuate many of the costs traditionally associated with paper-pencil and desktop-diskette modes of testing, publishers may enjoy better margins without appreciably raising prices. Some cost-comparison data have demonstrated substantial cost reductions (Chiocchio et al., 2003), and no doubt this area will receive increased scrutiny in coming years. The now well-established capabilities of instant ordering access, enhanced remote data harvesting capability, bundled products (i.e., no separate purchases of questionnaires, answer sheets, scoring templates, shipping, etc.), reduced or eliminated scoring and clerical requirements, and immediately produced results are often posited as valuable improvements for testing consumers. Thus, web assessment appears to hold the promise of an enhanced delivery solution of value to publishers, test professionals, and ultimately, the test-taking client.

That the test-taking subject should derive benefit from the online paradigm is evidenced by the rationale of the Canadian Public Service Commission in developing their online pre-employment testing programs: namely, that the citizenry of Canada would enjoy enhanced access to employment opportunities by making pre-screening more widely accessible via the web delivery model (Chiocchio et al., 2003). Readers are also encouraged to refer to the proposed Association of Test Publishers guidelines for web-based e-testing (Harris, 2000) for more information on evolving standards.


Integrated Web Assessment Solutions Model

The "web services" variant of the "business process management" model underlies most web testing platforms. It is essentially the accessing of web-based software on a "plug-in/plug-out" basis in lieu of buying and installing PC-based programs. The Internet becomes the software dispensary. Because the application is fully web-based, the system is constantly upgraded and improved with no need to buy or install new versions of software. Moreover, true "process integration" becomes feasible due to the specific features of the Internet in terms of efficient data linkage and delivery.

The concepts of business process management and integration have attracted considerable attention for the past several years. The intent of such models is to create a fully integrated system from the previously disparate aspects of a business process. The goal is to share information across all aspects of the process as well as to monitor and manage the entire system efficiently. An example of such a system would be a consumer's purchase at a checkout counter triggering a transfer of information to the store's local inventory control system (so the manager knows when an item needs to be re-stocked), to the corporate office (so that the marketing department receives feedback on store performance), and to the supplier of the purchased product (so that a new shipment can be directed to the store when it is next needed).

Any standard web browser provides access to the integrated web assessment for both administrator and client on a 24/7 basis. Hence, clients no longer have to bother purchasing diskettes for every computer they wish to use for testing, suffer through painful installations and complex manuals, or incur information technology resource drains. The web services model preserves the best of CBT while allowing the practitioner to log into an online assessment office virtually anywhere in the world, on any web-connected machine (and the same flexibility is extended to the client).

Further, aggregate data are easily made available in standard database formats that can readily interface with other software applications. For example, XML and XSL data transfers serve as bridges for communication between often distributed or disparate third-party applications. The formerly isolated applications become "integrated" into a coherent system. The client gains new functional value by streamlining the exchange and flow of relevant client data (including test data), and this data in fact becomes more "dynamic" by virtue of its ability to trigger other processes or decision points within the system (Klion et al., 2003).
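As a minimal sketch of the XML "bridge" idea, the following shows one system serializing a test result and another parsing it; the element names are invented for illustration, and a real integration would follow an agreed schema:

    # One application serializes a test result as XML; a separate (possibly
    # third-party) application parses it back into its own structures.
    import xml.etree.ElementTree as ET

    def serialize_result(candidate_id: str, scale: str, score: float) -> bytes:
        root = ET.Element("testResult")
        ET.SubElement(root, "candidateId").text = candidate_id
        ET.SubElement(root, "scale").text = scale
        ET.SubElement(root, "score").text = str(score)
        return ET.tostring(root)

    def import_result(payload: bytes) -> dict:
        root = ET.fromstring(payload)
        return {child.tag: child.text for child in root}

    payload = serialize_result("A12345", "numerical", 27.0)
    print(import_result(payload))
    # {'candidateId': 'A12345', 'scale': 'numerical', 'score': '27.0'}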

Security

Compared to the problems of keeping paper-and-pencil test reports secure, advanced e-testing systems can be far more secure than traditional paper-and-pencil testing (Bartram, 2000). When reviewing e-testing sites, consumers should take the time to review the security and privacy statements and endorsements (such as Verisign). For the highest privacy standards, sites should only use temporary cookies to enable page turning and other basic web navigation functions, but should not house permanent cookies (cookies are small identifier files that leave a copy on the user's machine so that the machine may be "identified" when it returns to the web site).
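The distinction drawn above can be sketched as follows; this is an illustrative fragment, not any particular vendor's implementation, and the cookie names are invented:

    # A "temporary" (session) cookie carries no expiry attribute, so the
    # browser discards it when the session ends; a permanent cookie sets
    # one and persists on the user's machine.
    from http import cookies

    session = cookies.SimpleCookie()
    session["exam_session"] = "abc123"          # no expiry: session-only
    session["exam_session"]["secure"] = True    # sent only over SSL/TLS

    persistent = cookies.SimpleCookie()
    persistent["tracker"] = "xyz"
    persistent["tracker"]["max-age"] = 31536000  # persists for one year

    print(session.output())     # Set-Cookie: exam_session=abc123; Secure
    print(persistent.output())  # Set-Cookie: tracker=xyz; Max-Age=31536000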

Access to secure, restricted instruments should be granted in the same manner traditional paper-and-pencil publishers determine access: by verifying professional status and/or educational experience. Interactions with the system should be encrypted via current Secure Sockets Layer (SSL) technology, and security statements on the site should indicate how patient or client data are stored and transmitted.

Sites vary widely in report handling. Some routinely e-mail reports to the psychologist; this method is subject to hacking and interception and is not recommended. The best method is secure SSL delivery to an ID- and password-controlled online testing "office." Some systems can "lock down" the testing machine to prevent other programs from running, and still other systems are capable of delivering online tests only to identified computers (using IP address or another identifier). These methodologies offer possibly the most secure format for test data transfer yet available. Reports in easily editable formats (e.g., MS Word or HTML) should be treated with caution, as results are more easily altered than with, for example, delivery of a closed PDF-type file.
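A minimal sketch of the "identified computers" idea follows; the allowlist and addresses are invented, and a production system would combine this with authentication rather than rely on IP checks alone:

    # Deliver a test only if the request comes from a registered test station.
    ALLOWED_TEST_STATIONS = {"198.51.100.10", "198.51.100.11"}

    def may_deliver_test(client_ip: str) -> bool:
        """True if the requesting machine is a registered test station."""
        return client_ip in ALLOWED_TEST_STATIONS

    print(may_deliver_test("198.51.100.10"))  # True
    print(may_deliver_test("203.0.113.5"))    # False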

Well-designed and maintained e-testing systems appear to offer advanced security, a point clearly made by researchers:

"A well-designed Internet system can be far more secure than many local computer network systems or intranets, and certainly more secure than paper and filing cabinets….In most areas, Internet-based assessment provides the potential for higher levels of control and security…Test scores are also potentially far more secure." (Bartram, 2000, p. 269)

Users of web assessment should expect state-of-the-art concern and methods for protecting the integrity of tests consistent with the guidelines of the American Psychological Association (1999a; 1999b). In fact, the bundled nature of many online testing formats may provide enhanced protection of test information from abuses related to HIPAA or other nascent disclosure mandates.

Future trends and developments

Web assessment is already a significant modality for tests and measures in the business and government testing arena and is gaining momentum in the educational and certification testing markets as well. We can anticipate that e-testing will be the norm in a very short time, probably within three to five years (Pfenninger, 2000; 2002). The staggering increase in classroom connectivity (cf. Fantasma Networks, 2001), the reduction in the "digital divide" (Sailer, 2001), and the previously described economic return-on-investment factors all but guarantee this outcome.

The next trends in web assessment and e-testing will involve globalization, as the medium naturally lends itself to international distribution and access. Local norming and translation issues will become increasingly important and controversial as web assessment now easily traverses national boundaries (Klion et al., 2002).

Delivery of media-rich assessments is an emerging mega-trend (Jones & Higgins, 2001) awaiting fuller implementation of broadband delivery capability. High-quality audio and video delivery will be able to present complex assessment stimuli as "virtual" realities, allowing for an advanced form of simulation-based evaluation.

Meanwhile, the advent of reliable, low-cost Internet devices is upon us. Many hospital and health care systems are already using hand-held thin clients as the basic data point-of-entry for clinicians. Similar devices are being used in corporate surveys, and it is probably inevitable that most large-scale certification and standardized educational testing will eventually utilize thin-client Internet devices as the data harvesting hardware.

Case example

The U.S. Transportation Security Administration (TSA) utilizes a rigorous skills-based standards framework for job role definition, identification of skills and competencies, and the alignment of selection and assessment strategies within this model (Figure 1, from Kolmstetter, 2003). The project is among the largest web assessment-based testing programs ever executed and involves close coordination with TSA and a host of other vendors in the hiring and human resource management of a 75,000-employee federal agency.

Figure 1. TSA Skill Standards – Integrated Web Assessment Model. [The figure shows a cycle around central Skill Standards: Advertise, Recruit, Provide Applicant Information; Hire (Qualifications Screen, Competency Assessment, Medical Examination, Background/Security Check); Train; Certification – Annual Proficiency Review; and Performance Appraisal & Management.]


Figure 2. Five-process model per integrated panel (a redacted extract showing typical sequential panels).

Panel: Biodata Form. Interface: online biodata form completed at applicant's home. Data Input: provided by applicant. Decision Analysis: if Index > 70, qualify. Communication: (1) qualifiers: send link to Online Proctored Testing Scheduler; (2) non-qualifiers: send e-mail regrets.

Panel: Online Proctored Testing Scheduler. Interface: online scheduling system for the Proctored Test Center. Data Input: provided by applicant. Decision Analysis: none needed. Communication: (1) send e-mail to applicant confirming appointment; (2) send data to XYZ; (3) send data to MNO.

Panel: Check-in at test center. Interface: online check-in form. Data Input: provided by proctor based upon applicant documentation. Decision Analysis: meet documentation criteria? Communication: (1) applicant is seated to begin testing; (2) non-qualifiers: invited to return tomorrow with correct paperwork.

Panel: Assessment Battery. Interface: online test. Data Input: provided by applicant. Decision Analysis: qualified if Math > 19 and English > 20. Communication: (1) qualifiers: send link to Physical Exam Scheduler; (2) non-qualifiers: send e-mail regrets.

The TSA demand articulated in their contract for "integrated services" highlights the new XML-driven "web services" models and reflects the state of the art in technology-based global assessment with systemic, data flow, and content integration capabilities.

A solution was needed to manage all aspects of computerized pre-employment test delivery, including design and development of Java testing applets, proctored testing centers in over 450 sites across the United States, provision of remote testing facilities in overseas territories, and maintenance of a complex stream of transactions and .xml data exchanges among multiple vendors.

First, an integrated process was developed in which data collected at one phase were used to drive subsequent aspects of the process. For example, results of the test battery were used to generate a summary report and individualized interview questions for use in the employment interview. Second, it enabled the use of automated communications as well as a variety of web-based tools to manage the entire process.

Across this multi-step, multi-vendor vetting process, five functions occur regularly at each phase ("panel" or "module"): Function, Interface, Data Input, Decision Analysis, and Communication. Figure 2 above gives a redacted extract illustrating the five processes for some typical sequential panels in the integrated web solution for TSA; a small code sketch of the same idea follows.
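The sketch below is a minimal, hypothetical rendering of two of the panels' decision rules; the thresholds mirror the redacted extract, while the function and field names are invented:

    # Each panel takes the applicant's data, applies its decision rule, and
    # returns the communication to trigger next.

    def biodata_panel(record: dict) -> str:
        # Decision Analysis: if Index > 70, qualify
        if record["index"] > 70:
            return "send link to Online Proctored Testing Scheduler"
        return "send e-mail regrets"

    def assessment_panel(record: dict) -> str:
        # Decision Analysis: qualified if Math > 19 and English > 20
        if record["math"] > 19 and record["english"] > 20:
            return "send link to Physical Exam Scheduler"
        return "send e-mail regrets"

    applicant = {"index": 82, "math": 23, "english": 25}
    print(biodata_panel(applicant))     # qualifies at the biodata panel
    print(assessment_panel(applicant))  # qualifies at the assessment panel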

The integrated solution comprises the following features:

1. Integration of multi-partner assessment and related HR content
2. Multi-hurdle select-out model adopted due to stringent hiring eligibility criteria
3. Content "travels" with the electronic candidate record through the sequence
4. Aggregate content data posted to TSA for analysis
5. Seamless integration for a total candidate experience
6. Import of 3rd-party assessment data and export to required databases and applications
7. Dynamic status moves triggered by data decision rules

TSA uses a display console for data management and real-time review of candidates passing through the various steps in this integrated selection and assessment solution. See Figure 3 for an example of a typical display.

Figure 3. Data management display tracking candidate assessment information.


Results

As of October 2003, the TSA has deployed the integrated web solution with few difficulties, with over 30,000 test batteries completed and over 7,000 candidates completing the second-phase assessment center procedures.

Summary

Business testing for selection and development is no longer done in isolation from other practices and applications, and opportunities in corporate testing are increasingly tied to the ability to integrate seamlessly with other systems.

The "best-of-breed" web services integration approach renders a solution characterized by "emergent properties"; that is, the whole of the solution is greater than the sum of its parts, and as such reflects a real innovation that, while portable in core respects, is also highly sensitive to the client's needs for customization. Such emergent properties avoid the generic drift of monolithic enterprise human capital processing installations, instead offering plug-in capability, focused support, and design collaboration with (as opposed to imposition of design upon) the client.

Testing providers who adapt their offerings to take advantage of web services integrations will find a receptive audience among I-O psychologists and other responsible users of testing and assessments.

References

Alexander, J., & Davidoff, D. (1990). Psychological testing, computers, and aging. International Journal of Technology and Aging, 3, 47-56.

Allard, Butler, Faust, & Shea (1995). Errors in hand scoring objective personality tests: The case of the Personality Diagnostic Questionnaire-Revised (PDQ-R). Professional Psychology: Research and Practice, 26, 304-308.

American Psychological Association (1999). Test security: Protecting the integrity of tests. American Psychologist, 54(12), 1078.

Bartram, D. (2000). Internet recruitment and selection: Kissing frogs to find princes. International Journal of Selection and Assessment, 8(4), 261-274.

Biggerstaff, S., Blower, D., & Portman, L. (1996). Equivalence of the computer-based Aviation Selection Test Battery (ASTB). International Military Testing Association 38th Annual Conference, 1996. Available at http://www.ijoa.org/imta96/paper47.html.

Burnkrant, S., & Taylor, C. (2001, April). Equivalence of traditional and Internet-based data collection: Three multigroup analyses. Paper presented at the 16th Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Campbell, K., Rohlman, D., Anger, W., Kovera, C., Davis, K., & Grossmann, S. (1999). Test-retest reliability of psychological and neurobehavioral tests self-administered by computer. Assessment, 6(1), 21-32.

Carr, A., Ancill, R., Ghosh, A., & Margo, A. (1981). Direct assessment of depression by microcomputer: A feasibility study. Acta Psychiatrica Scandinavica, 64, 415-422.

Chiocchio, F., Degagnes, P., Kruidenier, B., Thibault, D., & Lalonde, S. (2003). Online tests: Phase II implementation. Report to stakeholders. Public Service Commission Canada, Personnel Psychology Centre. http://www.pscfp.gc.ca/ppc/online_testing_pg02K_e.htm.

Clark, D. (2000). Evaluation of a networked self-testing program. Psychological Reports, 86, 127-128.

Donovan, M., Drasgow, F., & Probst, T. (2000). Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight. Journal of Applied Psychology, 85, 305-313.

Duffy, J., & Waterton, J. (1984). Under-reporting of alcohol consumption in sample surveys: The effect of computer interviewing in fieldwork. British Journal of Addiction, 79, 303-308.

Fantasma Networks (2001). Network classrooms of the future: An economic perspective. Retrieved March 2001 from http://www.fantasma.net.

Harris, W.G. (1999). Following the money. White paper: Association of Test Publishers, Washington, DC. http://www.testpublishers.org.

Harris, W.G. (2000). Best practices in testing technology: Proposed computer-based testing guidelines. Journal of e-Commerce and Psychology, 1(2), 23-35.

Hart, R., & Goldstein, M. (1985). Computer assisted psychological assessment. Computers in Human Services, 1, 69-75.

Jerome, L., DeLeon, P., James, L., Folen, R., Earles, J., & Gedney, J. (2000). The coming of age of telecommunications in psychological research and practice. American Psychologist, 55(4), 407-421.

Jones, J., & Higgins, K. (2001). Megatrends in personnel testing: A practitioner's perspective. Journal of Association of Test Publishers, January 2001. http://www.testpublishers.org.

Klion, R., Pfenninger, D., Chiocchio, F., & Callender, J. (2003). Cross-cultural test design for global selection programs. Presentation to the Association of Test Publishers annual conference, Orlando, FL.

Kolmstetter, E. (2003). The TSA story: Screener selection. Presented as part of Pfenninger, D., Kolmstetter, E., Davis, B., & Jung, P., Automating the testing and assessment process. Presentation, International Conference on Assessment Center Methods, Atlanta, GA.

Lewis, G., Pelosi, A., Glover, E., Wilkinson, G., Stansfeld, S., Williams, P., & Shepherd, M. (1988). The development of a computerized assessment for minor psychiatric disorder. Psychological Medicine, 18, 737-743.

Malcom, R., Sturgis, E., Anton, R., & Williams, L. (1989). Computer-assisted diagnosis of alcoholism. Computers in Human Services, 5, 163-170.

Mead, A.D., & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114, 449-458.

McMinn, M., Ellens, B., & Soref, E. (1999). Ethical perspectives and practice behaviors involving computer-based test interpretation. Assessment, 6(1), 74.

Navaline, H., Snider, E., Petro, C., Tobin, D., Metzger, D., Alterman, A., & Woody, G. (1994). Preparations for AIDS vaccine trials. An automated version of the Risk Assessment Battery: Enhancing the assessment of risk behaviors. AIDS Research & Human Retroviruses, 10(Suppl. 2), S281-282.

Neuman, G., & Baydoun, R. (1998). Computerization of paper-and-pencil tests: When are they equivalent? Applied Psychological Measurement, 22, 71-83.

Potosky, D., & Bobko, P. (1997). Computer versus paper-and-pencil administration mode and response distortion in noncognitive selection tasks. Journal of Applied Psychology, 82, 293-299.

Pfenninger, D. (2000, August). e-testing: A new methodology for professional assessment. Paper presented at the American Psychological Association annual conference.

Pfenninger, D. (2002, February). Remote clinical assessment: An introduction. Paper presented at the Association of Test Publishers annual conference.

Sailer, S. (2001). Analysis: The web's true digital divide. United Press International. Retrieved from http://www.vny.com/cf/news/upidetail.cfm?QID=203267.

Silzer, R., & Jeanneret, R. (1998). Anticipating the future: Assessment strategies for tomorrow. In R. Jeanneret & R. Silzer (Eds.), Individual psychological assessment: Predicting behaviors in organizational settings (pp. 445-477). San Francisco: Jossey-Bass.

Smith, W.A., & Leigh, B. (1997). Virtual subjects: Using the internet as an alternative source of subjects and research environment. Behavior Research Methods, Instruments, and Computers, 29, 496-505.

Weiner, J., & Gibson, W. (2002, February). Transition to technology: Design and application issues with employment tests. Paper presented at the Association of Test Publishers annual conference.

Weiss, L., & Trent, J. (2002, February). Equivalency analysis of paper-and-pencil and web-based assessments. Association of Test Publishers annual conference.

Yokley, J., Coleman, D., & Yates, B. (1990). Cost effectiveness of three child mental health assessment methods: Computer-assisted assessment is effective and inexpensive. Journal of Mental Health Administration, 17, 99-107.



STREAMLINING OF THE NAVY ENLISTED ADVANCEMENT NOTIFICATION SYSTEM

LCDR Tony Oropeza, Navy Advancement Planner, OPNAV N132C4
Jim Hawthorne
PNCS Jim Seilhymer
Darlene Barrow
Joanne Balog
Navy Enlisted Advancement System, Pensacola, FL, USA

In February 2002, the major stakeholders of the Navy Enlisted Advancement System (NEAS) convened with the goal of streamlining the Navy Enlisted Advancement Notification (NEAN) process. Numerous ideas were shared and discussed among the stakeholders, and initiatives were then delegated to various stakeholders for action. Many of the initiatives assigned to the Naval Education and Training Professional Development and Technology Center (NETPDTC) are presented here. These initiatives are:

• Auto-Ordering Exams and Bar-Coded Answer Sheets
• Rapid Advancement Planning Estimation Algorithm
• Accelerated NEAS Processing
• Internet Posting of Examination Results

Through the implementation of these and other initiatives, a significant reduction in the length of the notification process has been accomplished.

Prior to streamlining, the time from exam day to publication of results was 11 to 13 weeks; the target after streamlining was 5 weeks. Initially, several constraints were imposed so that there would be no negative impact on Sailors, no sacrifice of exam quality, and no increase in the workload of the Fleet. The outcome would result in improvements in quality of life, learning, and retention, and a reduction in the workload of the Fleet. Additional conferences were held in June 2002 and January 2003 for the NEAS stakeholders to meet, discuss their progress, and further improve the plan of action.

Auto-Ordering Exams and Bar-Coded Answer Sheets

At the NEAN conference in February 2002, it was decided that commands should have the ability to order exams for candidates via the web, based on time-in-rate (TIR) eligibility according to the Enlisted Master File. In turn, exams and bar-coded answer sheets would be printed and mailed to commands (testing sites) as determined by the TIR eligibility lists.

Providing a website where commands can review, change, delete, or add candidates has several advantages. There has been a reduction in the workload of testing officers, who can spend less time figuring out how many exams to order and when to order them. In the past, some testing officers would over-order, order too close to the deadline, or forget to order. Over-ordering wastes resources; ordering too close to the deadline or forgetting to order requires the mailing of substitute exams, so another advantage is a reduction in substitute exams. Also, by producing a TIR eligibility list, commands will be sent exams and answer sheets even if the list has not been reviewed, so exams and answer sheets will arrive promptly at the testing sites. Figure 1 is a snapshot of a TIR eligibility list.

Figure 1. Time-in-rate eligibility list.
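A minimal sketch of the auto-ordering logic follows; the record fields, cutoff date, and counts are invented for illustration:

    # Order exams for every candidate who meets the time-in-rate (TIR)
    # cutoff on the Enlisted Master File, totaled per command (UIC).
    from datetime import date

    TIR_CUTOFF = date(2003, 1, 1)  # placeholder eligibility cutoff

    candidates = [
        {"name": "SN Smith", "uic": "55555", "tir_date": date(2002, 6, 1)},
        {"name": "SN Jones", "uic": "55555", "tir_date": date(2003, 5, 1)},
    ]

    orders: dict = {}
    for c in candidates:
        if c["tir_date"] <= TIR_CUTOFF:      # TIR-eligible candidate
            orders[c["uic"]] = orders.get(c["uic"], 0) + 1

    print(orders)  # {'55555': 1} -- one exam and answer sheet for this command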

By bar-coding answer sheets with pertinent information, candidates are required to "bubble" in only minimal entries, and a reduction in discrepancies is obtained. If any of the bar-coded information is incorrect, the candidate can "bubble" in the particular item, which will override the bar code.
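The override rule can be sketched as follows; the field names and values are invented:

    # Bar-coded values pre-fill the answer-sheet record; any field the
    # candidate bubbles in replaces the corresponding bar-coded value.
    def merge_answer_sheet(barcoded: dict, bubbled: dict) -> dict:
        record = dict(barcoded)          # start from the bar-coded data
        for field, value in bubbled.items():
            if value is not None:        # a bubbled entry overrides
                record[field] = value
        return record

    barcoded = {"rate": "AG3", "uic": "55555"}
    bubbled = {"uic": "44444"}           # candidate corrects one field
    print(merge_answer_sheet(barcoded, bubbled))
    # {'rate': 'AG3', 'uic': '44444'}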

Rapid Advancement Planning (ADPLAN) Estimation Algorithm

The ADPLAN estimation algorithm was developed to give manning planners a basic idea of how many candidates within each exam rate would pass upcoming exams. The algorithm was beta-tested for the March 2002 exam and assisted the manning planners in projecting exam passers and advancement counts. It was implemented for the September 2002 exam and was helpful in trimming two weeks off the advancement exam process. It has been determined that the ADPLAN algorithm will continue to be a part of the advancement exam process.

The ADPLAN algorithm begins by projecting the total number of exam passers. It continues by determining the number of passers by duty group: Training and Administration of Reserves (TARs), Active Duty (USN/R), and Canvasser Recruiters (CVRs). These numbers are totaled and compared with the projected total of exam passers. The number of passers by paygrade is also determined, and the total number of passers by paygrade is compared to the total passers by duty group and the total exam passers. The same procedure is performed for projecting exam takers as well. Exam-taker projections are used by NETPDTC for scheduling tasks.
tasks.<br />

An increase in exam takers was expected from 2002 to <strong>2003</strong>. The number of projected takers<br />

was 110,193. The actual number of exam takers was 112,056, which was a 10% increase from<br />

the September 2002 advancement exam. In analyzing where the increase of exam takers actually<br />

occurred, it was determined the majority of the increase came from paygrade E4. There were<br />

1,640 more E4 exam takers than expected. Refer to figure 2.<br />

Figure 2. Actual versus projected exam takers by paygrade for March 2003. [Line chart of candidates (in thousands) across March exam cycles, 1993-2003, by paygrade. March 2003 values: E5 projected 52,151 vs. actual 52,067; E6 projected 30,763 vs. actual 30,713; E4 projected 27,656 vs. actual 29,276.]

During algorithm development it was found that, starting in 2000, there was a significant difference between the number of candidates taking the March exams and the September exams. Because of this disparity, it was determined that the algorithm would have to account for it: history for March exams would be used to predict the number of candidates taking an upcoming March exam, and likewise for September exams. This phenomenon is showing signs of becoming less of a concern; Figure 3 shows that the number of candidates taking March exams is once again coming closer to the number taking September exams.

Figure 3. Total exam takers in March and September cycles over the past 10 years. [Line chart of total candidates (in thousands) for exam cycles from September 1993 through March 2003, with March and September cycles plotted as separate series.]

As with any forecasting model, predictions will improve as the model is refined. Additional resources have been found that will assist in determining increases and decreases in manning numbers.

Accelerated NEAS Processing

As stated previously, the main goal of the NEAN conferences was to streamline and accelerate NEAS processing. Another decision was made to validate exam answer keys prior to exam day rather than on exam day; since time is a factor, any problem that can be resolved without slowing down the advancement process is an advantage. Another important decision was made to mandate mailing of examination answer sheets, in the most expedient traceable manner available, within 24 hours of administering advancement exams, because this one process is the main factor slowing publication of advancement results.

Several internal processes at NETPDTC were streamlined by rewriting many of the processing programs. Prior to NEAN, all internal processing was done by administration cycle; now all internal processing up to advancement planning is done by paygrade, giving manning planners an early look at paygrade advancement planning.

Another process that has been refined is quota entry. To expedite advancement planning quota entry, a web quota site was developed for the manning planners. The data at this site are retrieved by NETPDTC and placed in the NEAS.

Prior to the NEAN initiative, exam cycles took 11 to 13 weeks to process. A great deal of the processing time depended on the receipt of answer sheets: different stages of the advancement process depend on a certain percentage of answer sheets being returned, and the goal for advancement planning is 90%. Additional factors can slow the advancement process. In September 2000, a necessary program change slowed down the process; in September 2001 it was the 9/11 tragedy; and in March 2002, the invasion of Afghanistan.



March 2002 marked the implementation of the NEAN initiatives. NETPDTC processing time was reduced; however, other process changes had not yet been implemented. By September 2002, many of the process changes had been implemented and the processing time was greatly reduced. By March 2003, the NEAN initiatives were complete and processing time was further reduced, but administration and mailing were affected by the onset of Operation Iraqi Freedom. See Figure 4.

Figure 4. NEAS process from exam administration to publication. [Bar chart of working days from administration to publication, broken out into NETPDTC processing, quotas, and answer sheets, for cycles MAR 00 through MAR 03. Pre-NEAN cycles (SEP 01 and prior) ran 11 to 13 weeks; a standard-score program change delayed SEP 00 publication, and the 9/11 tragedy delayed SEP 01 E-4/5 administration until early October, with the anthrax scare affecting some mailing. MAR 02 reflected the initial NEAN initiatives, with raw-score processing and exam administration affected by the Afghanistan invasion. SEP 02 continued the NEAN initiatives as the first cycle with separate E-4/5/6 paygrade processing from administration to raw score, reaching 7 weeks for E-6. MAR 03 completed the NEAN initiatives, with separate E-4/5/6 paygrade processing from administration to advancement planning and about 9 weeks for E-6; raw-score processing and E-4 administration were affected by Iraqi Freedom.]

Currently, the obstacles to consistency in accelerated processing are: no single source of Navy overnight mail, command compliance with new requirements, acts of nature, terrorism, wartime conflicts, and manning/budget constraints.

Internet Posting of Advancement Examination Results

Another very important accomplishment by NETPDTC is the web posting of advancement status and statistics. Under the former method of advancement publication, results were typically mailed in paper form from NETPDTC to all Navy commands. This paper mailing of examination results could take anywhere from 1 to 4 weeks to reach a command, sometimes longer if the command was deployed. Consequently, Sailors may have had to wait as long as 17 weeks after testing to see how they performed. Part of the NEAN initiative was to find a way to provide the fastest possible examination and advancement feedback on a central website by utilizing existing Internet technology. This effort consisted of three basic goals. The first goal was to post advancement examination profile sheets on a web page accessible by every active and reserve Sailor, at home, at work, ashore, or afloat; the profile sheet provides exam takers with a summary of their performance in relation to their peers. The second goal was to provide commanding officers with the necessary administrative reports showing how Sailors under their charge performed, who was selected for advancement, and who was not. The third goal was to provide commanding officers with statistical reports showing how Sailors under their charge performed compared to all commands in the Navy.

The tasking for these goals consisted of a partnership between Navy military and civilian personnel, which resulted in several accomplishments. A web page design team was established. Computer programs were created to extract data from existing databases and present the information recognizably on the web page. The website was tested by various afloat platforms, such as aircraft carriers, support ships, and aircraft squadrons, as well as shore commands. These platforms also provided feedback concerning the accessibility and performance of the website.

The results of the project exceeded expectations for customer satisfaction. Exam takers can now view their exam results 96 hours after initial publication. This savings in time allows exam takers additional time to study for the next exam in the event they were not selected for advancement, and exam takers can also access their information from past cycles. Commanding officers can access required administrative reports upon initial publication of results, which has proven to be a valuable work and time saver for personnel managers. Commanding officers can also view how their own commands performed compared to all other commands in the Navy. This particular element has reduced workload at NETPDTC and is a valuable counseling tool for commanding officers when assisting Sailors in making career decisions.

Once inside the website, a menu of options is offered. Commands can view several exam cycle results and statistics from each cycle, and can look at individual profile sheets. The profile sheet shows the test taker's standard score, the average standard score for candidates who advanced, the topics the individual was tested on, how many questions pertaining to each section were on the exam, how many questions the individual answered correctly, and the individual's percentile in that section in relation to all persons taking that particular exam. Figure 5 shows an example of a profile sheet.

Figure 5. Profile sheet.
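One profile-sheet element, the section percentile, can be sketched as follows; the scores and the at-or-below convention are invented for illustration:

    # Percent of all takers of a given exam scoring at or below the
    # candidate within one topic section.
    def section_percentile(candidate_correct: int, all_scores: list) -> float:
        at_or_below = sum(1 for s in all_scores if s <= candidate_correct)
        return 100.0 * at_or_below / len(all_scores)

    all_scores = [9, 12, 14, 14, 15, 17, 18, 18, 19, 20]  # one section
    print(section_percentile(17, all_scores))  # 60.0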



Due to bandwidth restrictions and satellite connectivity limits for deployed ships, several changes were made in the development of the web page. Large amounts of data in the profile sheets had to be placed in smaller files, and the number of website pages was reduced to make extracting data easier. The website runs in Microsoft Internet Explorer and is password-protected to ensure the security of the data.

The statistics option shows how a particular command performed, by duty group, by rating, and by overall paygrade; refer to Figure 6. The total number of test takers is shown, with the number and percentage in each advancement status, along with the average standard score of exam takers who were selected for advancement. These statistics are compared against Navy-wide performance, and are useful to commanding officers in building study and training programs and making career management decisions.

Figure 6. Statistics web page.

Conclusion

Through the implementation of these and other initiatives, a significant reduction in the length of the notification process has been accomplished. Prior to these initiatives, the turnaround time for Sailors to receive their advancement results following examination was between 11 and 13 weeks. Utilizing these initiatives, the turnaround time has been nearly cut in half, to about 6 weeks. Improvements in the process will continue as procedures are refined and as individuals become more accustomed to the new regulations.


THE ROLE OF PSYCHOLOGY IN INTERNET SELECTION

E. Edgar, A. Zarola, L. Dukalskis & K. Weston
QinetiQ Ltd
A50 Cody Technology Park, Farnborough, Hampshire GU14 0LX, UK
eedgar@qinetiq.com

This paper reports the second year of a three-year programme of work to examine how the UK Armed Forces can develop their use of the Internet to best effect in selection. Given the large numbers of non-graduate recruits employed by the three Services, the research programme has focused on recruit selection. The first-year programme included a review of Internet selection practices and the development of a prototype recruitment and selection (R&S) website (Weston, Edgar, Dukalskis & Zarola, 2002). During the second year of the programme, the researchers focused attention on the potential applicant user group. This report describes the rationale behind consideration of the applicant user group, the methods applied, and the implications of the findings for Internet-based R&S systems.

INTRODUCTION

The popularity of e-recruitment has grown substantially over the last five years. According to Reed (2002), 78% of UK graduate recruiters prefer online applications. This compares with 1998, when only 44% of employers had a facility for online applications (Park, 1999). However, the sudden growth in popularity creates its own problems. For example, according to Park (2002), employers find it increasingly difficult to decide where to advertise jobs, given the variety of media available. As technology develops to support a range of R&S approaches, features of traditional selection systems beyond recruitment activities are also becoming more available online, e.g. personality testing, ability testing and even interviewing.

Inevitably, the proliferation of e-recruitment and e-selection has raised questions across professions. Whilst electronic R&S features may look the part, reviews of e-practices are sparse when it comes to evaluating key reliability and validity features. For psychologists and other professionals, security, authenticity, fairness, privacy and standardisation have been increasingly recognised as issues of concern. These same issues appear to mediate the extent to which traditional methods have been adapted to the Internet forum. Much of the discussion tends to focus on the organisational perspective: for example, how to manage the increased volume of applications that can be generated from e-application systems; how to address security issues, such as authenticating the user's identity and preventing cheating; and how to monitor and evaluate the effectiveness of on-line systems. Initiatives tend to focus on addressing employers' concerns and optimising the use of the Internet for the employer. In contrast, significantly less attention has been given to another important user group: the potential applicant group.

Human factors research across various domains illustrates the importance of considering the user when designing systems and other artefacts (e.g. Norman, 1988; Booth, 1989). Such research also highlights the costs of not giving sufficient attention to different user groups. When considering user groups related to Internet-based R&S systems, distinctions between users can be made. Employers who purchase on-line testing facilities from test publishers are users of the on-line facilities and support. Test publishers are users of the Internet medium when, for example, they employ the technology to collect data to monitor the performance of their tests. However, job seekers or browsers are the users at the interface. They are the group at which all efforts are targeted, in order to encourage applications from those suitable to fill vacancies. Thus, it makes sense to attempt to understand their perspective when designing and applying Internet R&S systems. It is noted that the success of online recruitment can be quite variable (Park, 2002). By considering the applicant population, factors may emerge to help explain why the success of the new technology in supporting e-recruitment and e-selection is so variable. Effective R&S systems (traditional or electronic) rely on the calibre and number of applications made in the first instance. If e-systems do not appeal to or attract responses from potential new employees, then, at best, they become a costly supplement to traditional methods and, at worst, portray a negative image of the organisation.

From the limited research that has considered the applicant user in the e-recruitment context, some interesting findings have emerged. For example, in the UK, Foulis and Bozionelos (2002) reported on the attitudes of a postgraduate group to Internet recruitment. Their findings indicate that the perceived advantages of using the Internet for job search activities outweighed the disadvantages. Key advantages were identified as the completeness and quality of the information provided; convenience and accessibility; and the speed of the process. Disadvantages included restriction to certain employers and jobs; technical difficulties; the less direct approach afforded by the Internet; excessive amounts of information; and the length of time employers take to respond via email. Some concerns were expressed about the opportunity for 'cheating', but student confidence in the systems depended on the context. For example, the respondents had few concerns about sending their completed application forms through the Internet, but they did have reservations about automated screening. Interestingly, the students indicated that they wanted a choice of traditional and electronic methods to support their job search activities. The quality of the website was also found to have an impact on applicants' perceptions of the organisation.

Also in the UK, a study by Price and Patterson (2003) questioned twenty undergraduates (21-26 years of age) about their e-application experiences and reported that nearly all participants had a favourable attitude towards using the Internet in general. The researchers considered accessibility and found that most candidates had relatively easy access to the Internet, being able to use it at home or at the University. The vast majority, however, opted for the cheaper option of the free facilities at the University. Indeed, the respondents raised cost as a factor that inhibited their use of the Internet when they had to pay for it, e.g. for time online to complete application forms.

Price and Patterson's interviewees reported a number of reactions that the researchers labelled psychological processes. These included a concern for privacy and a greater desire for support with online applications; for example, feedback in the form of an acknowledgement of application was considered essential. Finally, applicants felt that using the Internet dehumanised the application process and also made it easier for individuals to exaggerate their responses, be more casual and offer socially desirable responses.

Some usability issues were addressed by Price and Patterson, and whilst technology has evolved considerably, users still experience technical problems that can affect their attitude. Some practical preferences were identified. For example, respondents reported that they thought electronic forms were more convenient, but wished that facilities allowed them to preview, flick through and check their application forms (just as they can with paper application forms). Design features also left respondents feeling restricted. Some felt that too many drop-down menus limited choice and that the lack of space restricted their ability to give a full picture of themselves; unlike with paper forms, there was no facility to add an extra sheet when necessary. Like the participants in the Foulis and Bozionelos study, the students believed that organisations who provided effective online application facilities created an innovative, forward-thinking image and appeared more appealing as an employer. Conversely, when the students had a negative experience using online application systems, their image of the organisation was affected negatively.

Equity of access to the Internet has been an issue of concern since the introduction of the Internet as a medium for recruitment and selection. Price and Patterson echo the concerns of others by suggesting that use of the Internet is limited on the basis of demographics such as age, sex, race and income and that the use of the Internet may therefore introduce adverse impact. It is important to investigate the possibility that the Internet medium might serve to exclude candidates on the basis of demographic factors. However, it is possible that the characteristics of each applicant population will determine whether access to, and the ability to use, the Internet is an issue for concern in each specific case.

Beyond access to Internet-based R&S facilities, there is another important issue to consider: the extent to which the target population is motivated to use career-related websites and the specific R&S features they accommodate.

Applicant groups vary in their characteristics, just as the jobs and roles for which they apply vary. This is why it is important for organisations to consider the specific groups of people they want to target when they design their Internet applications. Many of the R&S features found on the Internet, particularly selection applications, are targeted at graduates. Studies such as that documented by Foulis and Bozionelos can help direct initiatives aimed at a similar graduate applicant population. However, whilst some Internet applications, such as job boards, accommodate non-graduate populations, little research attention has been paid to understanding the large non-graduate population of job seekers and their perceptions of the Internet as a medium used in the R&S process.

The QinetiQ researchers participated in a programme of work aimed at fulfilling a number of objectives. These can be summarised in the following three questions:

a. Is the recruit target population limited in its access to the Internet for carrying out job search activities?

b. Does the target population use the Internet to carry out job search activities?

c. How is the prototype Tri-Service web site received by the target group?

The first question addresses the very practical issue of whether potential applicants have access to the facilities they need to take advantage of electronic recruitment and selection. The second and third questions, although addressing different elements, both relate to the motivation to use these electronic facilities. By asking about current interest in using the Internet for job search activities, the researchers aimed to identify the extent to which this target group uses the Internet for R&S activities, and what preferences they may already have in relation to the different types of R&S features available. In eliciting views about the Tri-Service website, the researchers also wanted to elicit attitudes towards the traditional and more novel R&S features that could be accommodated on the Internet. However, a large element of this latter research activity was aimed at canvassing opinion on specific interface design elements, so that, for example, the physical design of the site could be improved with respect to ease of use, preferences for colours, removal of constraints, etc.

The methodology used to address these objectives is described below, and selected results follow. It is beyond the remit of this paper to give a full account of results specific to the interface design. Rather, results are restricted to applicant access and applicants' interest in, and willingness to use, different e-selection features.

METHODOLOGY

Three research activities were designed to address the research issues.

Internet and Careers Survey (ICS): A survey design was used whereby a questionnaire was developed and administered to students at secondary schools around the UK. The survey was designed in consideration of the available literature and recent public statistics about Internet access. In addition to access, the questionnaire included items on frequency of Internet use, preferred media, and career search activities. A team of researchers visited the schools to administer the questionnaires to classes of students aged between 14 and 18 years. Five schools participated in the study, and completed questionnaires were obtained from 170 students.

Web site usability trials (UTs) – structured walkthroughs: A semi-structured evaluation protocol was designed in consideration of usability guidelines and a working definition of usability (Booth, 1989). Booth separates the concept of usability into four key areas:

1. Effectiveness (or Ease of Use): This concerns how easy the site is to use. In simple terms, it relates to the ease with which the system can be used, e.g. to get from A to B, and the effectiveness of the design in allowing people to use it.

2. Learnability: This relates to the degree to which the design used, e.g. on one website, supports the use of knowledge acquired from previous exposure to other systems. For example, users should be able to transfer their knowledge of other sites to learn how to use a new site quickly. Additionally, the site should be flexible enough to support previous experience, e.g. if someone wants to use keyboard shortcuts, this should be supported.

3. Attitude (Likeability): This covers a broader spectrum. It relates to issues of personal taste, e.g. colour; factors which increase frustration, anxiety or pleasure; and any other subjective feeling resulting from interaction with the system.

4. Usefulness (or Functionality): This relates to the degree to which the user can, and wants to, use the system (or its applications) to achieve their goals. That is, it is not about how easy the system is to use; it is about the worth of what the web site offers.

Four features of the web site were explored in detail: an application form; a biodata questionnaire; a medical questionnaire; and an eligibility form. Students were randomly allocated to one of the four features.


Twenty-one students from the schools involved in the Internet and Careers Survey participated. Individually, students sat next to a member of the research team, who acted as a facilitator, and were given an explanation of the study. It was emphasised that they were not being evaluated, and that any comments they could offer, positive or negative, would be valued. Each student was presented with the Home Page of the web site and was asked to complete a task, for example: 'There is an application form on the web site. Try to find it, complete it and then return to the Home Page.' Each task involved negotiating the web site to find the relevant location, following the instructions on the page and returning to the Home Page. Students were encouraged to comment freely as they progressed. To supplement free responses, a series of open-ended questions was put to the students. The free-response questions were categorised under the following headings: the look of the web site; ease of use; response and feedback; experiences; and functions.

To supplement the limited responses made by students, structured questions were also used in an effort to obtain a full picture of the students' experiences. The facilitator recorded the responses made by each student. Following completion of each task, students were given the opportunity to ask questions and to comment on other features of the site they might have encountered outside the parameters of the task. Each session lasted approximately 30 minutes. The responses were analysed using qualitative and quantitative techniques. Comments were coded by three researchers according to task scenario and in relation to the usability elements to which they referred. Structured questions that involved responding on a rating scale were analysed using SPSS for Windows v10.
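Although the analysis was run in SPSS, the kind of descriptive summary involved can be sketched in a few lines of Python. The feature names and 5-point ratings below are hypothetical stand-ins, not the trial data.

```python
from statistics import mean, median

# Hypothetical 5-point usefulness ratings, grouped by web-site feature.
ratings = {
    "application form":      [4, 3, 5, 4, 4],
    "biodata questionnaire": [3, 2, 4, 3, 4],
    "medical questionnaire": [2, 3, 2, 4, 3],
    "eligibility form":      [5, 4, 4, 5, 3],
}

# Simple per-feature summary of the rating-scale responses.
for feature, scores in ratings.items():
    print(f"{feature}: n={len(scores)}, mean={mean(scores):.1f}, "
          f"median={median(scores)}")
```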

Job Search Preferences Questionnaire (JSPQ): Following the initial survey and usability trials, an opportunity arose to question a further student sample. A short self-completion questionnaire was designed to be administered at a school careers fair. The questionnaire included items that could be used to help clarify student preferences for job search and job application activities, and allowed the researchers to consider further reactions to some of the recruitment and selection functions that could be supported using Internet technology. A total of 30 students completed this questionnaire.

RESULTS

A total of 170 students from five schools around the country responded to the Internet and Careers Survey (ICS) (57% from the North of the country and 43% from the South). Fifty-three per cent were female and 42% were male (5% missing data). Twenty-one students from the same five schools completed the usability trials (62% male and 38% female). Students' ages ranged from 14 to 18 years, with the majority aged 14 to 15 years. Thirty students, aged between 13 and 15 years, completed the JSPQ (77% male and 23% female).

Access

All the schools provided student access to the Internet, and only two students out of 170 (1.2%) stated that they had never used the Internet. A large majority of students access the Internet at least once a week, with 58% reporting that they access it at least 2-3 times per week (at school or at home). The least popular locations for accessing the Internet are a) an Internet café (10%), b) a careers centre (18%) and c) a library (48%). The results indicate that, for the whole sample, the most common method of accessing the Internet is a computer (PC/Mac), and that many students access the Internet in this way both in and out of school. A significant minority of students use technology other than a computer to access the Internet, e.g. a WAP/mobile phone (27%), television (17%) or games console (17%).

Access to the Internet does not appear to be a significant problem for this age group. Nevertheless, potential recruit applicants who have left school and who have not gone on to a higher educational establishment may not enjoy the same free access to the Internet. At the same time, the proliferation of alternative technologies for accessing the Internet indicates that, for some groups at least, fairness concerns relating to Internet access might be diminishing.

Job Search Activities

Forty-one per cent of all respondents stated that they had used the Internet to search for job or college opportunities. Most of this group (86%) stated that they accessed the Internet for such information once a month or less. Forty-three per cent of respondents said that they had used the Internet to search for college courses; 27% stated that they had searched for a job; 19% had searched for information about an organisation; 9% had registered on a career or job search web site; and 2% had submitted a CV online. Whilst interest was expressed in using the Internet for job search activities, experience of doing so appears limited for this age group. Age and expectations about when to search for a job, as well as restricted exposure to employment processes, may have limited responses to the job search questions.

Motivation to use e-selection features

Website Functionality: UT participants were asked to offer their opinions about the different features included on the site. They were asked to rate the usefulness of the tool that formed the basis of the task, and how much they liked the specific example with which they interacted. The students thought most of the R&S features were very useful. Not all felt as strongly about the medical questionnaire or the application form; nevertheless, most thought them useful to some degree. The exception was the biodata questionnaire, which one participant rated as being of little use. In rating how much they liked the specific feature shown on the website, most tended to like the features they had used, but did not rate them as highly as the associated concept. The medical questionnaire stood out, with half of the students reporting that they did not like it. (Evidence elicited elsewhere with respect to specific design issues helped to explain why this might be the case.)

Application Preferences: When given a choice between using the Internet or writing and posting application forms, UT participants indicated that their preferred method of completing and returning an application was to use Internet facilities for all actions; their least favoured option was the hybrid of downloading a paper version and posting it back. Additionally, JSPQ respondents showed a preference for using the phone to make contact with organisations at the application stage and to request application forms. When it comes to completing and returning application forms, students appear to want a choice of traditional and new methods, and prefer an element of consistency in the genre of methods chosen.

Initial selection preferences: JSPQ participants appear to have varied preferences when it comes to initial selection features, such as biodata forms, essays, ability tests, personality tests, tests of knowledge and personal skills. For most features, responses indicated that having a choice of media is important. The responses show that some traditional methods remain popular, but that the choice of method differs depending on the activity. For example, it would appear that for ability testing more students want to complete the tests at the organisation (e.g. under traditional conditions at the HR department) than for personality testing, which they appear happy to do at home or via the Internet. Such preferences may be related to different levels of confidence in the technology to support fair or standardised assessment under the different conditions. For example, factors such as standardisation and security may be perceived to be more crucial for ability testing than for personality testing, but less reliable under Internet conditions. Some comments elicited during the research referred to concerns about the possibility of others 'cheating' on the Internet and about accidental deletion of information.

Other selection preferences: For activities that tend to be administered later in a selection process, e.g. interviews, presentations and activities with others, the most popular means of participation is on location at the organisation. Email was not popular (or feasible) at this stage, but a significant minority stated that they would be interested in using Internet-supported groupware for presentations (33%) or activities with others (27%). For interviews, 23% stated that they would be interested in using web-cam technology. Whilst this figure is not great, it indicates that around a fifth of participants are willing to consider interviews supported by Internet technology.

This picture was echoed by the usability trial participants. Their stated preferences for Internet use in the selection process tended to be for obtaining information and for the process of application. They offered some support for using the Internet for testing and communication activities; however, fewer students supported the idea of using the Internet to facilitate selection interviews.

Again, it would appear that the preferred choice of media depends on the selection activity. However, it also appears that some individuals may have a preference for modern technology, whilst others tend towards more traditional approaches.

Impact of website on organisational perceptions

To investigate whether there was any effect of experience with the website on attitude towards employment with the 'host' organisation, the researchers included a final question at the end of the usability trial: 'Would you consider a career with the Armed Forces?' Responses were compared with an item that had been posed at the beginning of the usability trial: 'Have you ever considered a career with the Armed Forces?' There was no significant increase in the number of students who responded 'yes', they would consider a career with the Armed Forces (27% to 33%), but there was a decrease in those who gave a categorical 'no' (71% to 38%), with more respondents expressing uncertainty (0% to 14%). Whilst not conclusive, the results appear to support the notion that exposure to a satisfactory organisational interface and a positive experience could create a more favourable view of an organisation.
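One way such a paired before/after comparison could be tested is with an exact McNemar test on the students who changed their answers. The sketch below is illustrative only: the counts of switchers in each direction are assumed, since the paper reports only the marginal percentages, not the paired table.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on the discordant pair counts:
    b = 'no' before -> 'yes' after, c = 'yes' before -> 'no' after."""
    n = b + c
    k = min(b, c)
    # Two-sided binomial tail probability with p = 0.5, capped at 1.
    p = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** (n - 1)
    return min(1.0, p)

# Hypothetical switcher counts for the 21 usability-trial students,
# consistent with the small before/after shift reported above.
b, c = 3, 2   # assumed values; not reported in the paper
print(f"exact McNemar p = {mcnemar_exact(b, c):.3f}")
```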

Self-disclosure and self-assessment

The researchers investigated the degree to which potential applicants were willing to disclose personal information through the Internet, and the extent to which students would like to know more about their suitability for a job before committing to an application.



The students were asked about their attitudes towards giving detailed personal information about themselves at an early stage in the application process. Forty-seven per cent stated that they would feel comfortable doing so, 23% said that they would not feel comfortable, and 30% did not respond to the question. The phone was chosen as the most popular means of giving detailed personal information to the organisation, followed by email and using a website directly. Interestingly, the least popular methods involved posting hand-written or type-written materials. Face-to-face contact was not given as an option, and it may be that some students would always prefer to give detailed information about themselves on a face-to-face basis. However, at an early stage in the selection process it is unlikely that employers would have the resources to offer face-to-face opportunities, hence the omission of this option; nevertheless, its absence should be acknowledged.

In the JSPQ, the majority of respondents (67%) expressed interest in being able to find out more about their suitability for a given job prior to making an application. The following features were presented as options to aid this self-assessment: a detailed job description; written, frank comments by employees; a job knowledge quiz; a job interest inventory; personalised careers advice; personality testing; and ability testing. Each of the features was chosen by between 77% and 97% of students. The preferred mode of helping to self-determine suitability for a role was also investigated. Once more, the results show that whilst the majority of students appear willing to use the Internet, a substantial minority want the choice of using traditional methods to find out more about their suitability for a job role.

DISCUSSION AND CONCLUSIONS

In response to one of the main research questions, it seems that the target population is not limited with respect to access to Internet selection facilities. This has important implications for issues of fairness, specifically in relation to adverse impact. Price and Patterson (2003) may be correct in suggesting that Internet access is affected by demographic determinants, but this may not apply to all applicant groups. However, whilst the age group sampled in this study can access the Internet at no personal cost at school, it remains to be seen whether the older recruit applicant would be deterred by having to pay for Internet selection services. This has implications for how the military organisations can encourage or support Internet access, for example through the provision of free Internet access at Armed Forces Careers Offices.

With respect to the extent to which the target group carry out job search activities, given the frequency with which they use the Internet, the results indicated that the majority of participants did not use the Internet for such activities, and those who had did so infrequently. This result, and other signs throughout the research (e.g. indications that some students were unaware of key activities carried out in traditional selection systems), suggest that this target group is relatively inexperienced when it comes to recruitment and selection processes. This is likely to have affected some of their responses but, equally importantly, it has implications for the design of e-selection systems aimed at a similar age and experience group. Explanations of what is expected, and descriptions of why certain information is important, may need to be much more explicit for certain target groups. Similarly, because this target group does not appear to have 'favourite' job sites or job search experience, compared with graduate counterparts, employers may need to be inventive about where they place advertisements, e.g. on mobile phone alerts, sports websites, at school events, etc. They may need to 'attract' the applicants they seek to their selection web sites, rather than assuming that good applicants will find their own way.

The extent to which this target group used technologies other than desktop computers was of interest. It alerts us to the fact that employers need to consider the way they present their information and questions, as the visible space on a desktop screen tends to be much greater than on a phone. These differences in media choice add support for a move to adaptable interfaces, whereby the interface changes according to responses to a set of questions relating to the media being used, demographic determinants, qualifications or, simply, personal preferences.
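A minimal sketch of such an adaptable interface follows. The screening answers, device labels and presentation settings are hypothetical illustrations of the idea, not part of the prototype site.

```python
def choose_interface(device: str, prefers_traditional: bool) -> dict:
    """Pick presentation settings based on the applicant's own answers."""
    small_screen = device in {"wap_phone", "games_console", "tv"}
    return {
        "questions_per_page": 1 if small_screen else 8,
        "use_dropdowns": not small_screen,        # avoid long menus on phones
        "offer_paper_form": prefers_traditional,  # retain a traditional route
    }

# The same form rendered differently for two hypothetical applicants.
print(choose_interface("wap_phone", prefers_traditional=False))
print(choose_interface("pc", prefers_traditional=True))
```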

In relation to motivational issues, and in support of previous studies (Foulis & Bozionelos, 2002; Price & Patterson, 2003), participants seemed motivated to use the Internet for at least some job search activities. Interestingly, once more, their concerns reflected some of those expressed by employers: the students raised problems relating to standardisation and the ability of other applicants to 'cheat'. This suggests that employers need to offer some assurances that their procedures are fair and that they understand applicant concerns. To enhance motivation, it appears that employers need to retain elements of choice between modern technology and traditional approaches. A clear theme running throughout the various sets of results is that applicants want a choice of how to respond to various selection activities. These choices appear to depend on a preference for new technology over traditional methods, but also vary according to what applicants are asked to do: for example, the desire to go to the organisation for ability testing but a preference for Internet facilities when completing application forms. These preferences contrast with those of Foulis and Bozionelos's students, who stated that they were happy to complete tests online. The implication of these findings is that employers should not be too hasty to eradicate traditional approaches. Indeed, Freeserve, an Internet company which adopted e-recruitment techniques, continues to use paper-based advertising to complement web applications (The Sunday Times, 2003, July 13).

Comparisons between the graduate groups used in previous studies and the non-graduate group in this study illustrate differences in the issues volunteered, in preferences and in experience. Such differences illustrate the worth of considering different applicant groups separately. The researchers expect the design of the prototype website to be enhanced by having elicited views from the target users directly. For example, the research has elicited positive feedback about the utility of some of the features, and constructive feedback about the ways in which the specific examples of these features on the prototype are limited.

It is argued that not only is it egalitarian to consider the applicant alongside the employer, but that it also makes pragmatic sense if the technology is to support, rather than lead, future recruitment and selection systems. Work psychologists would appear to be among the professionals suitably positioned to help facilitate understanding and draw out the implications in this area.

ACKNOWLEDGEMENTS

This work was funded by the Human Sciences Domain of the UK Ministry of Defence Scientific Research Programme.

REFERENCES

Booth, P. (1989). An introduction to human-computer interaction. Hove: Lawrence Erlbaum.

Foulis, S., & Bozionelos, N. (2002). The use of the internet in graduate recruitment: The perspective of the graduates. Selection & Development Review, 18(4), 12-15.

Norman, D. A. (1988). The design of everyday things. London: MIT Press.

Park (1999). Graduate in the eyes of the employers, 1999. London: Park HR and The Guardian.

Park (2002). Graduate in the eyes of the employers, 2002. London: Park HR and The Guardian.

Price, R. E., & Patterson, F. (2003). Online application forms: Psychological impact on applicants and implications for recruiters. Selection & Development Review, 19(2), 12-19.

Reed Executive PLC (2002). Cited in Price, R. E., & Patterson, F. (2003). Online application forms: Psychological impact on applicants and implications for recruiters. Selection & Development Review, 19(2), 12-19.

The Sunday Times (2003, July 13). Appointments, p. 7.

Weston, K. J., Edgar, E., Dukalskis, L., & Zarola, A. (2002). Internet-supported Tri-Service recruit selection: An exciting future. QinetiQ Report QINETIQ/CHS/CAP/CR020199/1.0.


WORKING FOR THE UNITED STATES INTELLIGENCE COMMUNITY: DEVELOPING WWW.INTELLIGENCE.GOV

Brian J. O'Connell, Ph.D.
Principal Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
boconnell@air.org

Cheryl Hendrickson Caster, Ph.D.
Senior Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
chendrickson@air.org

Nancy Marsh-Ayers
Intelligence Community Program Manager IC/CIO

INTRODUCTION

One of the most significant problems that the Intelligence Community (IC) faces in recruiting applicants is the set of stereotypes that exist concerning these agencies and the work performed by their employees. In particular, these stereotypes focus public attention on a very narrow portion of the actual (or fictional) job opportunities within the IC and also give a glamorized and inaccurate view of actual work and working conditions. Media portrayals are, quite literally, a very poor "Realistic Job Preview". The Director of Central Intelligence's Chief Information Office mandated the creation of a web portal that would provide accurate information about all job-related matters for members of the intelligence community.

The Chief Information Office (CIO) of the Central Intelligence Agency (CIA) contracted with the American Institutes for Research (AIR) to assist in the development of an IC-wide web site. This web site represents 15 different agencies that are, wholly or in part, involved with intelligence work. One section of the website, A Place For You, includes information about IC careers and occupations. The goal of this section is to assist web site visitors in career planning activities, such as determining which occupations are most relevant to their background (e.g., education, interests).



AIR's past experience in the IC enabled us to support this requirement, based on previous occupational analysis (and other work) at three of the major agencies: the National Security Agency (NSA), the Defense Intelligence Agency (DIA), and the National Imagery and Mapping Agency (NIMA).

Background

In the mid-1990s, IC member agencies were responding to radical changes in the threats facing the US and, as a result, Agency missions began to change. For example, the primary threat of fighting conventional wars against the former Soviet bloc vanished, and the necessity of fighting low-intensity conflicts became apparent. Both of these factors led to questioning of whether the IC had the necessary blend of skills and abilities to carry out new missions. An additional challenge was the significant societal pressure to continuously improve efficiency and quality (for example, see General Accounting Office, 1994; GAO/Comptroller General of the United States, 1996; and the work of the National Performance Review [35]).

[35] Now the National Partnership for Reinventing Government. See http://govinfo.library.unt.edu/npr/library/review.html and http://www.nima.mil/ast/fm/acq/nima_commission.pdf

The then-current approach to position management and recruitment in the IC was very much driven by legacy systems and missions. Changing these systems was difficult and faced significant organizational barriers. AIR implemented a pilot project at NSA to evaluate the possibility of transitioning one of its core business units to a more flexible process for describing its work and the skills necessary to achieve mission goals.

AIR's approach was based on the Occupational Information Network, O*NET (Peterson, Mumford, Borman, Jeanneret, & Fleishman, 1999). The O*NET model evolved from a thorough review of previous work-related taxonomies and represents state-of-the-art thinking about the world of work. Unlike many other models, it conceptualizes both the general and the specific aspects of the work domain in an integrated fashion and was thus ideally suited to NSA's needs. For example, O*NET's Basic and Cross-Functional Skill (BCFS) taxonomy was used to capture broad competencies that cut across NSA jobs, so that these jobs could be compared and grouped to create a skills-based occupational structure. Similarly, O*NET's Generalized Work Activities (GWA) taxonomy was used to ensure homogeneity of work activities within each occupation. Finally, O*NET's Occupational Skills taxonomy was used to characterize the specific requirements of particular NSA occupations.
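To illustrate how a skills-based structure of this kind could be derived, the sketch below groups jobs whose skill-rating profiles are highly similar. The job names, ratings and similarity threshold are hypothetical, and this is a conceptual illustration rather than the procedure AIR actually used.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two skill-rating profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Hypothetical ratings of three jobs on three broad O*NET-style skills
# (e.g., critical thinking, programming, coordination), scale 1-7.
jobs = {
    "Signals Analyst":  [6, 5, 3],
    "Network Engineer": [5, 6, 2],
    "Staff Officer":    [5, 2, 7],
}

# Group any pair of jobs whose skill profiles exceed a similarity cutoff.
THRESHOLD = 0.95
names = list(jobs)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        sim = cosine(jobs[a], jobs[b])
        flag = "group together" if sim >= THRESHOLD else "keep separate"
        print(f"{a} vs {b}: similarity={sim:.2f} -> {flag}")
```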

The O*NET approach was well received at NSA, and the pilot program was expanded to include the entire agency. That is, AIR had the task of describing all work carried out at the agency in a skills-based taxonomy. This work, and its follow-on activities for the agency, continues today.

Other members of the Community were impressed with the impact and flexibility of the occupational redesign at NSA. In 1998, NIMA adopted the approach, which was particularly well suited to that agency: NIMA had been formed in 1996 by consolidating employees from several Federal agencies, including the Defense Mapping Agency (DMA), the Central Imagery Office (CIO), the Defense Dissemination Program Office (DDPO), and the National Photographic Interpretation Center (NPIC), as well as the imagery exploitation and dissemination elements of DIA, the National Reconnaissance Office (NRO), the Defense Airborne Reconnaissance Office (DARO), and the Central Intelligence Agency (CIA). However, the amalgamation of these agencies also brought together a disparate set of approaches to HR issues. O*NET was a logical solution for providing a new, flexible, and consistent approach on which to base HR practices.

In 1999, DIA awarded AIR a contract to transition its legacy position-based HR system to a skills-based system of the kind that was rapidly becoming the benchmark for the IC. Over the next two years, AIR research scientists transitioned over 1,000 job titles into 21 Occupational Groups.

The occupational analyses at these agencies provided AIR research scientists with several elements that would facilitate the development of a website describing work in the IC. These included (a) a consistent taxonomy for describing work throughout the agencies, (b) agencies that would provide the "lion's share" of unique jobs in the IC, and (c) broad-based experience of work carried out at these three large IC agencies. These factors laid the groundwork for the identification of occupational similarities across the IC and, ultimately, the web portal content for the IC website www.intelligence.gov.

This IC-wide web site represents 15 different agencies of the IC. Some of these agencies are exclusively designed to support the intelligence mission (e.g., NSA, CIA), while others (e.g., the Department of State and the Department of Energy) have hybrid missions in which only part of the agency's overall mission involves intelligence-related work.

One of the goals of the web site is to educate the public about the types of work and missions carried out by each member agency and to provide a Community perspective. In addition, the A Place For You section of the web site assists visitors with identifying the most relevant careers and occupations based on their background (e.g., education, interests). This section also provides information about where individuals can contact member agencies to apply for advertised positions. In essence, the site is to serve as a one-stop shopping location for members of the public who have an interest in working within the Community.

Challenges

In the past decade the US economy has been growing at a record pace, which has meant stiff competition for employees in all sectors of the economy. The IC must compete with the private sector to hire qualified individuals for a wide variety of positions. Unlike the private sector, the IC faces unique challenges when recruiting, not the least of which are the media-generated stereotypes about day-to-day work at these agencies. Unfortunately, these stereotypes can lead to unrealistic work expectations among job applicants, subsequent dissatisfaction with the work, and ultimately personnel turnover. Given the cost to recruit, train, and obtain the clearances necessary to work in the IC, the community must do everything possible to ensure that applicants understand the work they are undertaking. In a technical sense, media portrayals are a very poor "Realistic Job Preview" for any applicant considering working in the IC. Further, they tend to focus on a fairly narrow range of jobs, which can also hinder recruiting.

Additionally, different agencies enjoy very different degrees of public awareness of their contribution to "intelligence" work. While agencies like the CIA and NSA have very high public visibility, organizations like the Office of Naval Intelligence (ONI) or the Department of Energy (DOE) have a much lower profile. The www.intelligence.gov site puts a frame of reference around the IC, identifying all the components and their roles in the overall IC. The front page of the web site is shown below in Figure 1.

Figure 1. Home page of www.intelligence.gov

The development of an unclassified web site describing work in the IC presented several unique technical challenges. The primary challenge was identifying a common metric for describing jobs within the IC. The next major challenge was the security classification level: some of the jobs are classified and, in addition, a job that is unclassified in one agency can be (and often is) classified in another. Therefore, extreme care was taken in making agency attributions about certain jobs or occupations. The final major challenge was to get agency "buy-in" to have their data represented in a public forum such as this web site (until relatively recently, the very identity of some of these agencies, e.g., the NRO, was classified).

How Common Jobs Were Identified

AIR's experience conducting occupational analyses within three Community agencies (i.e., NSA, DIA, and NIMA) was leveraged during the development of the A Place For You section of the website. The same methodology for describing work (i.e., O*NET) was used in all three agencies. This was a key advantage in developing the IC website, as AIR's work employing the O*NET taxonomy had provided a common metric, or descriptor set, for describing all work in these agencies.

Based on these data, and our understanding of the work involved within these IC agencies, AIR staff proposed to the government initial decisions about the commonalities among the occupational structures of the three agencies. Our decisions were made at two levels of detail. First, does this career exist in the agency? Second, does a specific occupation exist within a given career? For example, a career might be "Intelligence Analysis", with an associated occupation of "Signals Intelligence Analyst". These three agencies formed a baseline [36] from which to work, as their job analysis information had been developed comprehensively over a number of years. We had complete confidence in the data from these agencies.

[36] This baseline was drawn from three very large agencies consisting of approximately 30,000 personnel worldwide.
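The two-level decision described above can be pictured as a simple nested mapping. The sketch below uses hypothetical careers and occupations (only the "Intelligence Analysis"/"Signals Intelligence Analyst" pair comes from the example above) to show how career-level and occupation-level commonality checks differ.

```python
# Hypothetical two-level occupational map: agency -> career -> occupations.
agency_occupations = {
    "NSA":  {"Intelligence Analysis": {"Signals Intelligence Analyst"},
             "Engineering":           {"Electrical Engineer"}},
    "DIA":  {"Intelligence Analysis": {"All-Source Analyst"}},
    "NIMA": {"Intelligence Analysis": {"Imagery Analyst"},
             "Engineering":           {"Electrical Engineer"}},
}

def agencies_with_career(career):
    """Level 1: which agencies have this career at all?"""
    return sorted(a for a, careers in agency_occupations.items()
                  if career in careers)

def agencies_with_occupation(career, occupation):
    """Level 2: which agencies have this occupation within the career?"""
    return sorted(a for a, careers in agency_occupations.items()
                  if occupation in careers.get(career, set()))

print(agencies_with_career("Intelligence Analysis"))                  # all three
print(agencies_with_occupation("Engineering", "Electrical Engineer")) # a subset
```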

The next step was more inferential. Once our baseline had been established, a two-step process was followed. First, experienced AIR staff members made informed inferences about likely occupations in the other IC agencies. These inferences were based on many years of working in the IC on job analysis and other projects, and constituted a deliberately conservative assessment of the careers and occupations thought to exist in other agencies. The inferential process was also aided by meetings with agency representatives and an examination of the Community agencies' web sites. Next, these inferences were "validated" through a survey of Community representatives (often Human Resources representatives). Specifically, representatives reviewed the occupational and career information on the web site and determined whether the information was relevant to their specific agency. AIR research scientists assisted the representatives during the survey process by responding to questions and clarifying information, as needed. AIR is currently receiving updated surveys from the IC members. In addition, new IC members, such as the Department of Homeland Security (DHS), are being surveyed. An essential element of the survey process is the representatives' approval of the career and occupation descriptions on the web site.

Next Steps


As we move forward with integrating the information from the Community, we are adding functionality to the web site to improve its utility to the general public. Some of the planned website enhancements include the following:

• Enhanced descriptions of core business functions on the website (e.g., analysis and collection);
• An Interest-Occupation cross-referencing guide;
• Identification of "Hot Jobs" within the Community; and
• Enhanced graphic design and marketing of the site.

CONCLUSION

In this paper we outlined the background and rationale for the development of the IC website www.intelligence.gov. The site was developed with a baseline set of data from three large IC members and is being updated as new members are established and new occupations are identified in current organizations. In addition, the functionality of the website is being enhanced to provide tools and information that will improve the quality and quantity of employment-related information about the IC available to members of the general public. In the future, the public will be able to cross-reference their academic backgrounds and other interests with the occupations and careers that exist in the IC. The site provides a one-stop information source about all members of the IC, offering the general public carefully developed information that accurately portrays the work of the professionals who comprise the United States Intelligence Community.


REFERENCES

General Accounting Office (1994). Improving mission performance through strategic information management and technology: Learning from leading organizations (GAO/AIMD-94-115). Washington, DC: Author.

General Accounting Office (GAO)/Comptroller General of the United States (1996). Effectively implementing the Government Performance and Results Act (GAO/GGD-96-118). Washington, DC: Author.

Peterson, N. G., Mumford, M. D., Borman, W. C., Jeanneret, P. R., & Fleishman, E. A. (1999). An occupational information system for the 21st century: The development of O*NET. Washington, DC: American Psychological Association.



ENCAPS – Using Non-Cognitive Measures for Navy Selection and Classification

William L. Farmer, Ph.D.
Ronald M. Bearden, M.S.
Navy Personnel Research, Studies, & Technology Department (PERS-1)
Navy Personnel Command
Millington, TN 38055-1300

Walter C. Borman, Ph.D.
Jerry W. Hedge, Ph.D.
Janis S. Houston, M.A.
Kerri L. Ferstl, Ph.D.
Robert J. Schneider, Ph.D.
Personnel Decisions Research Institute
Minneapolis, MN 55414

As we enter the 21st century, the military is in the middle of a major transition. Due to changes in mission and technology, the organizational and occupational characteristics that have come to define the present-day military are being overhauled. As a result of this process, it is imperative to develop a full understanding of the role that enlisted personnel play in the "new military." This role includes interfacing with systems, equipment, and personnel that may look quite different than in the past. What individual requirements will these players need to accomplish their mission? How will this translate to personal readiness? How will performance be defined and measured?

In addition to individual requirements for successful performance, a number of<br />

other factors play important roles in the success of selection and classification in the<br />

military. The military uses results from large-scale aptitude testing as its primary basis<br />

for making selections into the service. Following this initial selection, testing results are<br />

further utilized in making classification decisions. The mechanism for making a<br />

personnel classification is a somewhat complicated process that involves using a<br />

combination of individual ability, obtained from the aforementioned testing program, the<br />

needs of the Navy (regarding jobs that need to be filled and gender and minority quota<br />

requirements), and individual interest. Historically, interest has received the least amount<br />

of weight.<br />


Following classification into a job, an individual goes through basic training, then proceeds through the training school pipeline prescribed for the assigned career path. After finishing the initial training pipeline, an individual will be put on the job, complete the first term of enlistment, and then reenlist or not. A number of other factors, in addition to the things an individual brings into the service, play a crucial role in how that individual perceives a military career and whether the choice to reenlist is made. Organizational variables have typically received little or no attention in the military services when evaluating reasons for attrition or retention.

Historically, the preponderance of military predictive validation work has centered on measuring success in basic and technical training. Job performance in the first term of enlistment has been included as a criterion measure only sporadically. However, because finding and training a 21st-century sailor will be much more complex and costly than it is today, success on the job beyond the first term of enlistment in the Navy will be increasingly important. The prediction of such long-term behavior as reenlistment and promotion rates will require new sets of predictor variables, such as measures of personality, motivation, and interest. To use these variables effectively to predict long-term performance, it will be crucial to better understand the work context of the future Navy, including its environmental, social, and group structural characteristics. Ultimately, combining the personal and organizational characteristics should lead to improved personnel selection models that go beyond the usual vocational and aptitude relations, encouraging a closer look at theories of person-organization (P-O) fit (see Borman, Hanson, and Hedge, 1997).

Advances in the last decade or so have shown that we can reliably measure personality, motivational, and interest facets of human behavior, and that under certain conditions these can add substantially to our ability to predict attrition, retention, and school and job performance. The reemergence of personality and related volitional constructs as predictors is a positive sign, in that this trend should result in a more complete mapping of the KSAO requirements of jobs and organizations, beyond general cognitive ability. One particularly promising approach to measuring individual differences in the interpersonal and personality areas is the situational judgment test (SJT). These tests are based on the premise that there are important and often subtle differences between the behavior of effective and ineffective persons as they respond to problems or dilemmas confronted in the course of carrying out their job responsibilities, and that such differences are reflected in their responses to similar situations presented in written form.

Research has demonstrated that short-term, technical performance criteria, particularly overall school grades, are best predicted by general intelligence, while longer-term, more differentiated criteria, such as non-technical job performance, retention, and promotion rates, are better predicted by other measures, including personality, interest, and motivation instruments. In order to select and retain the best possible applicants, it would seem critical to understand, develop, and evaluate multiple measures of short- and long-term performance, as well as other indicators of organizational effectiveness such as attrition/retention.

In general, then, when one considers what attributes are most relevant to effective performance in any given job, there are many from which to choose. The type of person characteristic viewed as important to success may vary from situation to situation. For one job or set of jobs, one may be most interested in choosing persons with high cognitive ability and care much less about their personality or interest patterns; in other situations the reverse may be true. For optimal assignment, it is necessary to link the attributes to how necessary they are for effective performance in specific jobs or job types, and as attempts are made to expand the predictor and criterion space, it will be important to extend one's perspective to broader implementation issues that involve thinking about classification and person-organization (P-O) fit. As organizational flexibility in effectively utilizing employees increasingly becomes an issue (e.g., workers are more often moved from job to job within the organization), the P-O model may become more relevant than the traditional person-job match approach.


Current Research Program

Determining the individual requirements for successful performance will require a thorough understanding of predictors of success and their relationships with key criteria. This effort will lead to the development of new measures of aptitude, personality, and other cognitive and non-cognitive instruments. Prior to this, however, it is necessary to develop a nomological net linking the critical constructs that will define successful performance with current selection instruments. The knowledge gained from this effort will provide a foundation for future developmental endeavors and contribute a much-needed component to the scientific literature.

With this said, the focus of efforts for the first year of the current research program (FY-2002) was a detailed illumination of the current state of the science relevant to the changing nature of jobs in the Navy and how this will impact selection and classification. A major component of this effort is a thorough literature review of predictors and criteria and the relationships between them. This includes (but is in no way limited to): (a) extending work accomplished as part of the Army's Project A; (b) review of current models of job performance; (c) review of the literature on cognitive and non-cognitive predictor measures; (d) investigation of promising areas (e.g., the role of situational judgment) for increasing predictive ability and objectifying measurement; (e) the role of organizational and attitudinal variables; and (f) person-organization and person-job match.

The focus of the past year (FY-2003) has been the development of a comprehensive, computer-administered, non-cognitive, personality-based assessment tool. This tool, the Enlisted Navy Computer Adaptive Personality Scales (ENCAPS), promises to improve the predictive ability of the current ASVAB-based testing program. The addition of a personality assessment tool will aid greatly in the selection and classification of enlisted personnel as we continue to redefine individual performance and expand the base of organizationally important criteria.

The ENCAPS constructs selected for initial measurement are intended to represent the most important traits from a broader taxonomy of personality constructs that will ultimately be measured. We used the following inclusion criteria to identify traits in that broader taxonomy:


1. Unidimensionality. ENCAPS is based on an IRT measurement model. Though recent work in multidimensional modeling has been promising, it is generally advisable that constructs be measured unidimensionally. Item covariance is a requirement for every basic personality trait, so this is not a very restrictive criterion. However, it does preclude measurement of compound traits, such as integrity and service orientation, that may be comprised of non-covarying personality facet variables linked to important criteria. (A simple screen for this property is sketched after this list.)

2. Temporal stability. The personality traits measured by ENCAPS will be used to select and/or classify Navy enlisted personnel into positions they will occupy over a significant period of time. It is therefore important that people's rank ordering on such traits be preserved over time.

3. Appropriate level of specificity. As with cognitive ability, personality traits can be represented hierarchically. The appropriate level to measure in a personality hierarchy has generated significant debate. For our purposes, however, the key is to measure personality traits at a level broad enough to provide efficient measurement, but narrow enough not to (1) obscure meaningful distinctions or (2) preclude measurement of specific variance that would increment the validity associated with the common variance measured by the broader trait. Traits included in our personality taxonomy will be selected to optimize this tradeoff.

4. Prediction of important job performance criteria. There must be a rational or empirical basis for believing that a personality construct will be predictive of one or more important job performance dimensions in at least some Navy enlisted ratings. Further, the traits in the overall personality taxonomy operationalized by ENCAPS must, collectively, account for the majority of non-cognitive variance in job performance dimensions across all Navy enlisted ratings.


5. Well understood, with a history of successful measurement. Although we do not necessarily wish to exclude experimental personality constructs from our broader taxonomy, the majority of the constructs should be represented in most of the major comprehensive personality taxonomies and should have been successfully measured by the instruments operationalizing those taxonomies.
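The unidimensionality criterion above lends itself to a quick empirical check. The following is a minimal Python sketch of one common screen, comparing the first two eigenvalues of the inter-item correlation matrix; it illustrates the idea only, is not the procedure the ENCAPS developers used, and its threshold is an arbitrary illustrative default.

    import numpy as np

    def unidimensionality_screen(responses, ratio_threshold=3.0):
        # Crude check on an item-response matrix (rows = examinees,
        # columns = items): a large first-to-second eigenvalue ratio of
        # the inter-item correlation matrix is consistent with a single
        # dominant factor.  Threshold is illustrative, not from ENCAPS.
        corr = np.corrcoef(responses, rowvar=False)
        eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
        ratio = eigvals[0] / eigvals[1]
        return ratio, ratio >= ratio_threshold

    # Example: 500 simulated examinees, 20 items driven by one trait.
    rng = np.random.default_rng(0)
    theta = rng.normal(size=(500, 1))
    loadings = rng.uniform(0.5, 0.9, size=(1, 20))
    responses = theta @ loadings + rng.normal(scale=0.6, size=(500, 20))
    ratio, ok = unidimensionality_screen(responses)
    print(f"eigenvalue ratio = {ratio:.1f}, unidimensional? {ok}")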

For the initial prototype, three constructs (Achievement Motivation, Social Orientation, Stress Tolerance) were chosen. Approximately 300 items, evenly divided among the three constructs, were written to be presented in a paired-comparison format. Items were rated via SME judgment for the level of the trait in question that they represented and for their level of apparent social desirability. These items formed the initial pool for the computer adaptive algorithm. Recent pilot testing results are currently being analyzed and will be used to inform development of successive versions of ENCAPS.
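The paper does not describe the computer adaptive algorithm itself, so the sketch below substitutes a generic maximum-information selection loop under a standard 2PL model; the pool parameters, the trait-update step, and the ten-item test length are all invented stand-ins for whatever pairwise-preference model and stopping rule ENCAPS actually uses.

    import numpy as np

    def p_2pl(theta, a, b):
        # 2PL response probability; a stand-in for the (unspecified)
        # paired-comparison IRT model underlying ENCAPS.
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def item_information(theta, a, b):
        p = p_2pl(theta, a, b)
        return a ** 2 * p * (1 - p)

    def next_item(theta_hat, a, b, administered):
        # Core CAT step: pick the unadministered item with maximum
        # Fisher information at the current trait estimate.
        info = item_information(theta_hat, a, b)
        info[list(administered)] = -np.inf
        return int(np.argmax(info))

    rng = np.random.default_rng(1)
    a = rng.uniform(0.8, 2.0, size=100)   # hypothetical discriminations
    b = rng.normal(size=100)              # hypothetical difficulties
    administered, theta_hat = set(), 0.0
    for _ in range(10):                   # ten-item adaptive test
        j = next_item(theta_hat, a, b, administered)
        administered.add(j)
        answered_high = rng.random() < p_2pl(0.5, a[j], b[j])  # simulated true theta = 0.5
        theta_hat += 0.3 if answered_high else -0.3            # toy update, not a real MLE step
    print(sorted(administered), round(theta_hat, 2))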

References

Borman, W.C., Hanson, M.A., & Hedge, J.W. (1997). Personnel selection. In Spence, J.T., Darley, J.M., & Foss, D.J. (Eds.), Annual review of psychology, Vol. 48 (pp. 299-337). Palo Alto, CA: Annual Reviews.

Borman, W.C., Hedge, J.W., Ferstl, K.L., Kaufman, J.D., Farmer, W.L., & Bearden, R.M. (2003). Current directions and issues in personnel selection and classification. In Martocchio, J.J., & Ferris, G.R. (Eds.), Research in personnel and human resources management, Vol. 22 (pp. 287-355). Amsterdam: Elsevier.

Ferstl, K.L., Schneider, R.J., Hedge, J.W., Houston, J.S., Borman, W.C., & Farmer, W.L. (In press). Following the roadmap: Evaluating potential predictors for Navy selection and classification (NPRST-TN-03- ). Millington, TN: Navy Personnel Research, Studies, & Technology.



Pilot Selection in the Australian Defence Force: AUSBAT Validation

Dr. Alan Twomey
Major Damian O'Keefe
Psychology Research and Technology Group
Department of Defence, Australia

Abstract

Using a concurrent validation design, this study applied regression techniques to develop a statistical model for improving the prediction of outcomes at Basic Flying Training in the Australian Defence Force (ADF). In particular, it sought to evaluate the relative contribution made by the Australian Basic Abilities Tests (AUSBAT) battery, compared with the existing selection battery, relevant biographical variables, some alternative trial tests, and a flight screening program, in predicting basic pilot training outcomes. Hierarchical regression analysis found that 46% of the variance in overall sortie average rating scores could be predicted, with flight screening accounting for most of this variance, followed by one of the AUSBAT tests (Pursuit B). Cutoffs identified from the regression equation enabled three groupings of trainees to be developed, with failure rates on the order of 33%, 19%, and 0%, respectively.

Introduction

This paper describes an attempt to improve the prediction of training outcomes at the Australian Defence Force (ADF) Basic Flying Training School. In the process, it reports the latest phase in the development and evaluation of the Australian Basic Abilities Tests (AUSBAT) battery as a tool for military pilot selection.

AUSBAT

The AUSBAT tests are computer-generated tests delivered via a standard desktop PC utilising specialised joysticks. The theoretical underpinnings, early development, and descriptions of the tests have been reported by Bongers and Pei (2001).

Since its conversion from a DOS environment to a Windows platform in 1999-2000, AUSBAT has undergone systematic evaluation. Completed phases include the following:

a. specification of standardised delivery parameters for each of the tests (Pei, 2002), now incorporated into technical documentation;

b. development of standard scoring systems for the tests (O'Keefe, 2002a);

c. investigation of the construct validity of the tests (O'Keefe, 2002a; Pei, 2003); and

d. initial concurrent validity studies against basic training outcomes (O'Keefe, 2002b).

Today, the AUSBAT battery comprises nine discrete tasks grouped, in various combinations, orientations, and difficulty levels, into fourteen tests yielding twenty-eight measures. These measures comprise accuracy scores (e.g., time spent in target area, number of correct answers) and error scores (e.g., distance away from target area, orientation error of test objects in relation to reference targets, and number wrong). Standard deviation scores are also calculated for 11 psychomotor accuracy and error scores.


Construct validation studies (noted above) suggest that most tests assess the following six ability areas: psychomotor, perceptual adjustment, working memory, time sharing, spatial ability, and visual memory. They further indicate that there is some overlap with existing pilot selection measures in the domains of psychomotor ability, working memory/numeracy, spatial ability, and visual memory/perceptual speed, but not so much as to make the tests redundant.

Current studies are focussing on developing norms for the ADF officer aircrew applicant population, evaluating the tests as selection aids for military pilots and a range of other ADF occupations, and exploring their utility as tools for assessing the impacts of physiological variables in studies of human cognitive performance.

Initial analyses of the relationships between AUSBAT scores and training outcomes for trainees participating in, or recently completing, Basic Flying Training (BFTS) showed significant correlations ranging up to 0.34 for some of the tests and/or test factors. Combined with the index generated by the current pilot test battery, one of the coordination tests predicted 19% of the variance in BFTS outcome scores.

ADF Pilot Selection

Pilot selection and initial training in the ADF is currently undertaken on a tri-Service basis. Excluding medical assessments, the selection process for civilian applicants involves six major steps:

Step 1: Achievement of test cutoffs in the Officer Test Battery (OTB);

Step 2: Achievement of test/pilot index cutoffs in the current Aircrew Test Battery (ATB);

Step 3: Positive recommendations following separate interviews by a psychologist and a uniformed recruiting officer;

Step 4: Successful selection for flight screening following a file review and rating process;

Step 5: Successful completion of, and positive recommendation following, a two-week Flight Screening Program (FSP)*; and

Step 6: Successful selection following an Officer Selection Board (OSB) recommendation and rating process.

A recent review of this process (Pei, in press) suggests that from an initial applicant pool of about 2,189 over a two-year period (2001-02), only 196 (approximately 9%) were recommended for BFTS following FSP. This figure is only approximate, as it is influenced by an unknown but small number of Army applicants who did not attend FSP, limits on the places available at FSP, and the practice of selecting only those rated at the top of the applicant pool awaiting FSP selection. Nevertheless, it highlights the magnitude of the screen-out rate in the selection process.

The ADF Basic Flying Training School (BFTS) has the capacity to train about 100 pilots per annum. Each intake comprises a mix of Australian Defence Force Academy (ADFA) graduates and Direct Entry Officers (DEOs) from each of the three Services. ADFA graduates will have already completed about three years of tertiary studies prior to commencing, while DEOs will have completed between 17 and 72 weeks of military training (depending on type of entry and Service) at single-Service officer training schools. While most will have been assessed for pilot aptitude as part of their selection for the ADF, some will have been assessed as part of an in-service occupational transfer process.

* Applicants with 20 or more previous flying hours (PFH) undergo an advanced version of the FSP, while those with fewer than 20 PFH undergo a basic version. Some Army applicants proceed directly to the Officer Selection Board without undertaking Flight Screening.



Royal Australian Navy (RAN) and Royal Australian Air Force (RAAF) trainees who successfully complete BFTS proceed to advanced training at the ADF No 2 Flying Training School (2FTS). Those who are successful progress to specialised training as transport or fast jet pilots (RAAF) or rotary wing pilots (RAN). Successful Army trainees currently proceed directly to specialised training following BFTS.

Given the substantial investment in each applicant and the costs of subsequent pilot training, it is important that members reaching this stage have a good chance of successfully completing their courses. For the period since it recommenced as a distinct entity in 1999, the pass rate at BFTS has been about 67% (Pei, in press).

Aim

This study has three aims:

a. to test, with a more complete data set, previous indications that at least one of the AUSBAT tests adds to existing selection tools in predicting training outcomes at the ADF Basic Flying Training School (BFTS);

b. to extend previous studies by including other trial tests and the Flight Screening Program in the evaluation; and

c. to trial a new statistical model for ADF pilot selection incorporating the findings of this study.

Method

Because of small numbers, all pilot trainee data have been grouped together. While this approach may obscure differences between the various subgroups undertaking pilot training, it does recognise extant decision-making processes that are common to all forms of entry, and it should detect whether any AUSBAT or other tests add value to these processes.

Sample

AUSBAT results were obtained from pilot trainees (N = 194) undertaking or recently completing training at BFTS (65%) and 2FTS (23%). Some RAN (6%) and ARA (6%) trainees had moved on to specialist conversion training.

Variables

The independent variables included in the analysis fall into six categories.

a. Biographical: Age at testing and previous flying hours (PFH) have, in previous studies, been shown to be related to success during pilot training.

b. OTB: The primary instrument in the Officer Test Battery, the Army General Classification (AGC) test, is a measure of general cognitive ability. Because it is administered in two forms, the standard score it produces, the General Ability Scale (GAS), was used in the analysis. The GAS is a 1-19 point normalised standard scoring scale, with each unit representing a quarter of a standard deviation.

c. ATB: The Aircrew Test Battery (ATB) comprises three tests - Instrument Comprehension (IC), Instruments-A (INSA), and the COORD, an electromechanical device used for assessing psychomotor skills. Stanine scores are calculated for each and combined to form a Pilot Index (PI) stanine which, together with cutoffs on individual tests, comprises the primary current pilot selection instrument.

d. Trial Tests: Data are being collected from applicants on a number of trial tests that have previously been shown to be associated with flying experience. These are: Aviation Reasoning (AVR), Visual Manoeuvres (VM), and Numerical Approximations (NA). Raw number-correct scores were used for the analysis.

e. AUSBAT: Because of their extreme diversity and the difficulty of direct interpretation and comparison, AUSBAT test raw scores were converted to T scores, with all error scores recoded such that a high score means less error (a minimal sketch of this conversion follows this list).

f. FSP: Flight Screening Program. A two-week program in which applicants are assessed for their flying ability. Applicants are split into two groups based primarily on their number of previous flying hours. The Raw Mean Score (RMS) represents the average of sortie ratings received in both basic and advanced programs. It was preferred over the standardised scores currently obtained for each program, as it retains differential performance between the two groups that is lost in the standardisation process. Although the programs vary slightly, both assess flying ability, and the ratings used were considered sufficiently similar to warrant inclusion as a common scale.
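Item (e) describes a routine transformation; a minimal sketch, assuming the sample itself serves as the norm group:

    import numpy as np

    def to_t_scores(raw, is_error_score):
        # Convert raw scores to T scores (mean 50, SD 10), reversing
        # error scores first so that a high score always means better
        # performance, as described for the AUSBAT variables.
        x = np.asarray(raw, dtype=float)
        if is_error_score:
            x = -x
        z = (x - x.mean()) / x.std(ddof=1)
        return 50 + 10 * z

    print(to_t_scores([3, 8, 5, 12], is_error_score=True).round(1))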

The dependent variables record BFTS outcomes. They include an overall weighted mean score ranging from 1-5, based on average sortie ratings, which in turn are calculated from sequence ratings within each sortie. The number of sequences per sortie may vary, but the number of sorties per course is about 70. This variable was used for the regression and correlational analyses. A second outcome variable was end-of-course status, in which students are grouped according to whether they Self-Suspend, are Back-Classed, Fail for Air Work, or obtain a Pass, Credit, or Distinction. Only the latter four groupings were used for this analysis, as reasons for back-classing and self-suspension are often unrelated to flying ability, and the numbers involved were also small (N = 3 for each).

Analyses

In the first phase of the analysis, relationships between the test variables and the criterion variable, and with each other, were explored using correlational techniques.

In the second phase, variables showing close relationships with the criterion but not with each other, together with those deemed important for process or theoretical reasons, were analysed using a mix of hierarchical and stepwise multiple regression techniques. Despite leading to a marked reduction in degrees of freedom, listwise deletion was preferred, as it ensured that only cases with complete data on all relevant variables were included in the analysis. The regression equation arising from the above analysis was used to determine a new statistical model for use in guiding selection following FSP.

The effectiveness of the new statistical model was then assessed using cross-tabulation analyses that compared predicted training outcomes against actual BFTS pass and fail rates.
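To make the second-phase procedure concrete, the sketch below fits two hierarchical regression blocks with listwise deletion on simulated data; the predictor names mirror those in this study, but every value is invented for illustration.

    import numpy as np

    def r_squared(X, y):
        # R-squared from an ordinary least-squares fit with intercept.
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        resid = y - X1 @ beta
        return 1.0 - resid.var() / y.var()

    rng = np.random.default_rng(2)
    n = 150
    fsp = rng.normal(3.0, 0.5, n)        # flight screening raw mean score
    pursuit_b = rng.normal(50, 10, n)    # AUSBAT Pursuit B T score
    avr = rng.normal(20, 5, n)           # Aviation Reasoning raw score
    sortie_avg = 0.6 * fsp + 0.02 * pursuit_b + rng.normal(0, 0.4, n)
    avr[rng.choice(n, 10, replace=False)] = np.nan   # some missing trial-test data

    complete = ~np.isnan(avr)            # listwise deletion: complete cases only
    step1 = np.column_stack([fsp])[complete]
    step2 = np.column_stack([fsp, pursuit_b, avr])[complete]
    for name, X in (("FSP only", step1), ("FSP + Pursuit B + AVR", step2)):
        print(name, f"R^2 = {r_squared(X, sortie_avg[complete]):.3f}")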

Results

Although age and PFH were found not to be significantly correlated with BFTS sortie averages, previous studies have indicated they are important predictors of training success, and hence they were included in the regression analysis. Not unexpectedly, the OTB test did not show up as significant and was not included in the regression analysis. Of the trial tests, only the Aviation Reasoning (AVR) test showed a statistically significant correlation (r = .22) and hence was included in the regression analysis. Similarly, although none of the individual measures comprising the ATB Pilot Index (PI) correlated significantly, the PI did (r = .20) and hence, for both practical and statistical reasons, was included. The FSP raw mean score showed an especially high correlation (r = .50) and was also included in the regression analysis.

Of the AUSBAT measures, two psychomotor measures (Pursuit B, r = .31, and Perceptual Adjustment Test - Horizontal error score, r = .20) and three working memory measures (Number Product - number right, r = .17; Number Product - number wrong, r = .19; Divided Attention test - Number Product wrong, r = .19) showed up as significant. However, as the working memory measures intercorrelate highly (> 0.4) with each other, only one (Number Product - number wrong) was included in the regression analysis. It was preferred over its equivalent in the Divided Attention test because it is more efficient to administer in a selection scenario (the Divided Attention test requires administration of both component tests independently prior to its administration) and because it contributed similarly to the DA version in predicting the variance of the dependent variable. On the other hand, although there is some overlap, both conceptually and empirically, between the two psychomotor measures, it was not considered sufficiently high (r



the latter two groups respectively. Translated to a selection scenario, applicants falling into the 'High' risk category have a high risk (33%) of failing BFTS, while those in the 'Medium' and 'Low' categories have progressively lower risks of failing (19% and 0%, respectively).
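Operationally, applying the cutoffs amounts to a simple three-way banding rule. Below is a minimal sketch with placeholder cut scores, since the paper reports the resulting failure rates but not the cut values themselves.

    def risk_band(predicted_score, low_cut, high_cut):
        # Band a regression-predicted sortie average into a selection
        # category.  The cut scores are placeholders, not study values.
        if predicted_score < low_cut:
            return "High risk"      # observed failure rate ~33% in this study
        if predicted_score < high_cut:
            return "Medium risk"    # ~19%
        return "Low risk"           # 0% in this sample

    for score in (2.4, 2.9, 3.6):
        print(score, "->", risk_band(score, low_cut=2.7, high_cut=3.2))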

Table 2 summarises the results of this analysis.

Table 2 - BFTS Outcomes by Cut-off Group

                          N                      Percentage
Outcome          Low    Med   High         Low     Med     High
FAW                7      3      0        33.3%   18.7%     0%
PASS              14     12     60        66.6%   75.0%   72.3%
CREDIT             0      1     17          0%     6.3%   20.5%
DISTINCTION        0      0      6          0%      0%     7.2%
Total             21     16     83        100%    100%    100%

Conclusion

The results may seem surprising in that neither age nor previous flying experience met the statistical criteria for inclusion in the model, despite previous research showing they are related to pilot training outcomes. Both, however, are thought to be important considerations in the subjective assessment components of the selection process: younger applicants and those with more flying experience are often preferred. Range restriction, therefore, may be leading to an underestimation of the likely contribution of these variables to predicting BFTS outcomes. The same applies to the Pilot Index, which is the main determinant of who is likely to progress to more advanced levels in the selection process.

The utility of the Aviation Reasoning Test is of interest. It is similar to the predominantly knowledge-based tests that have been included previously in the pilot selection battery. There is some suggestion it may be tapping into motivation as much as skill or ability. This study suggests that its inclusion in the selection model may prove useful, at least for selection at the basic flying training level.

Of the AUSBAT measures, Pursuit B especially was predictive. Together with one of the Perceptual Adjustment measures, it accounted for 13.6% of the variance in this study. This confirms previous research supporting its utility in helping predict success at this level of training. As some of the AUSBAT tests were designed to predict outcomes at more advanced flying training levels, it is perhaps not surprising that many do not show up as statistically significant at the basic level. Finally, the study provides strong evidence supporting the role of the flight screening raw mean score in predicting BFTS outcomes over and above that provided by all the selection tests.

The primary drawback of the study concerns the small sample size, which precludes analyses of subgroups (e.g., Service-specific, or Basic versus Advanced FSP program) for which distinctive patterns of variable interrelationships may be evident. The range restriction associated with some of the variables also suggests that their full contribution is likely to be underestimated.

Overall, however, the model accounts for a substantial proportion of the variance in BFTS sortie averages. Given that some of the trainees in the sample did not commence BFTS training until three years after they were tested and screened, the strength of the findings in this study is encouraging. It suggests the approach is likely to have utility as a tool for predicting BFTS outcomes for ADF pilot applicants.



References

Bongers, S., & Pei, J. (2001). The Australian Basic Abilities Tests (AUSBAT). Presentation at IMTA, Canberra, ACT, Australia, 23-25 October 2001.

O'Keefe, D. (2002a). Principal component analysis and development of scale scores for the Australian Basic Abilities Tests (AUSBAT). Psychology Research and Technology Group Research Report 05/2003. Canberra: Defence Force Psychology Organisation.

O'Keefe, D. (2002b). Concurrent validation of the Australian Basic Abilities Tests (AUSBAT) against Basic Flight Training School (BFTS) performance. Psychology Research and Technology Group Research Report 06/2003. Canberra: Defence Force Psychology Organisation.

Pei, J. (2002). Determination of the test parameters for the AUSBAT tests. Psychology Research Group Technical Brief 15/2002. Canberra: Defence Force Psychology Organisation.

Pei, J. (2003). The construct validity of AUSBAT. Presentation at the Australian Psychological Society Industrial & Organisational Psychology Conference, Melbourne, Australia, 27-30 June 2003.

Pei, J. (In press). Overview of the selection rates at different stages of pilot selection and graduation rates in Basic Flight Training. Psychology Research and Technology Group Technical Brief 12/2003. Canberra: Defence Force Psychology Organisation.


Development and Validation of a Revised ASVAB CEP Interest Inventory

Jane S. Styer, Ph.D.
Department of Defense Personnel Testing Division
DoD Center - Monterey Bay
styerjs@osd.pentagon.mil

Abstract

The Armed Services Vocational Aptitude Battery Career Exploration Program is a comprehensive career exploration and planning program that includes aptitude and interest inventory assessments. Currently, the program includes a 240-item, Holland-based, self-scored, paper-and-pencil interest inventory. This inventory has not been updated since it was implemented in 1995.

The outcome of the current study will be a 90-item interest inventory administered via the Internet and by paper-and-pencil. The study will be conducted in two phases. The item and form development phase is currently nearing completion; the validation phase will commence in January 2004.

In the item and form development phase of the study, out-of-date items were eliminated and new items were written to reflect the considerable changes in technology and the world of work. From a pool of over 1,000 new and old items, 600 items were selected and assembled into two 300-item forms. These forms will be piloted with approximately 600 local high school students. Item performance statistics (e.g., endorsement rates and item-to-scale correlations) will be reviewed to identify good-performing items. These items will be used to assemble two experimental forms to be administered in the second phase of the study.

A development and validation study will be conducted in the spring semester of 2004 with a nationally representative sample of approximately 6,000 eleventh- and twelfth-grade students in 50 to 55 high schools. Each student will complete one of the two experimental forms and the Strong Interest Inventory in a counterbalanced design. A subset of schools will be used to collect test-retest data.



The Armed Services Vocational Aptitude Battery (ASVAB) Career Exploration Program (CEP) annually serves approximately 800,000 students in over 14,000 schools nationwide. The Program provides 10th-, 11th-, 12th-grade, and post-secondary students with a comprehensive vocational assessment program at no cost.

The program includes two assessment components: the ASVAB (Defense Manpower Data Center, 1995) and the Interest-Finder (IF; Wall, Wise, & Baker, 1996). These components provide students with information about their academic abilities and career interests. The current IF is a paper-and-pencil, 240-item interest inventory based on John Holland's (1985, 1997) widely accepted theory of career choice. Students indicate their preference (i.e., Like or Dislike) for various Activity, Education and Training, and Occupation items. The IF is a self-administered and self-scored inventory that yields both raw scores and gender-based percentiles for the six scales. Students use their two or three highest Interest Codes to explore potentially satisfying occupations.
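As an illustration of how such an inventory is scored, the sketch below tallies raw RIASEC scale scores and forms an Interest Code from the highest scales. The Like = 1 / Dislike = 0 weighting is an assumption for illustration; the paper does not reproduce the IF scoring key.

    from collections import Counter

    WEIGHTS = {"L": 1, "D": 0}   # assumed weights for the two-option IF format

    def score_inventory(responses):
        # responses: (scale, answer) pairs keyed to the six RIASEC scales.
        raw = Counter()
        for scale, answer in responses:
            raw[scale] += WEIGHTS[answer]
        # The two or three highest scales form the student's Interest Code.
        code = "".join(scale for scale, _ in raw.most_common(3))
        return dict(raw), code

    responses = [("R", "L"), ("I", "L"), ("A", "D"), ("S", "L"),
                 ("E", "D"), ("C", "D"), ("R", "L"), ("I", "D")]
    print(score_inventory(responses))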

There are a number of reasons for developing a new interest inventory. The items in the IF were developed in the early 1990s. Apart from the datedness of some items, other items have shown a decrement in their item performance statistics over time. Also, the current items do not reflect the significant changes in technology that have occurred in the past decade.

Finally, there is a need for a shortened interest inventory. Typically, it takes approximately 20 to 30 minutes to take and self-score the IF. A 90-item interest inventory could be completed and scored in 12-15 minutes, thus freeing up more time in a typical 50-minute interpretation session to focus on aptitude test score interpretation, interest inventory results, and a discussion of career exploration and planning activities.

The outcome of this study will be a 90-item interest inventory. The study will be conducted in two phases, focusing on item and form development and on validation of the inventory, respectively. Phase two will not commence until January 2004. The remainder of this paper addresses only the first phase of the study.

ITEM AND FORM DEVELOPMENT

Developing a new inventory, as opposed to updating and shortening an existing inventory, allowed us to make more substantive changes. We found that some of the IF scales are narrowly defined and narrowly sampled. Subsequent to a review of the literature, we revised the scale definitions. For example, in the IF, Realistic is defined as: "People in REALISTIC occupations are interested in work that is practical and useful, that involves using machines or working outdoors, and that may require using physical labor or one's hands." The revised definition reads: "Realistic people prefer work activities that include practical, hands-on problems and solutions, such as designing, building, and repairing machinery. They tend to enjoy working outside, with tools and machinery, or with plants and animals. Realistic types generally prefer to work with things rather than people."

In the IF, items are organized by scale and scale definitions are provided. Students read the scale definition and use two response options (i.e., Like and Dislike) to indicate their preference for the Activity, Training and Education, and Occupation items. We have changed the response scale to Like, Indifferent, and Dislike. This response scale allows respondents to indicate their indifference to items and is more consistent with the scale used in other interest inventories (Harmon, Hansen, Borgen, & Hammer, 1994; Swaney, 1995). We eliminated the Training and Education and Occupation items; the new inventory will consist of only Activity items. We also eliminated the scale definitions and spiraled the items. Attachment A shows the directions for the new inventory and the response scale, and illustrates the spiraling of items.

Item Development

We began item development by reviewing the item performance statistics of the current IF items, based on two studies conducted in the early to mid 1990s during the development of the IF (Wall, Wise, & Baker, 1996). We also reviewed item performance statistics from data collected in the National Longitudinal Survey of Youth 1997 (Moore, Pedlow, Kishnamurty, & Walter, 2000). Attachment B shows the item selection screens employed to identify and categorize items into three groups: Best, Good, and Marginal. The quality of items within the six RIASEC domains varied. As a result, we identified domains where more new items would be needed relative to other domains. Items selected for inclusion in the two experimental forms were drawn primarily from the Best category; however, when necessary, items from the Good category were also selected.

Two groups of people were asked to write new items. Item writing workshops were conducted with five item writers from the Personnel Testing Division (PTD) and volunteers from the staff at the Defense Manpower Data Center. Fifteen individuals submitted 1,019 items. The PTD item writers reviewed and edited the items and eliminated poor items and duplicates. This resulted in 487 new items.

Four professionals with extensive experience and expertise with the RIASEC typology independently reviewed and assigned the 487 items to RIASEC domains. Items that they were unable to assign to one RIASEC domain were edited so that the assignment could be made. Once the items were categorized, three of the experts sorted the items into RIASEC types, reviewed the item content, and assessed the coverage within the six domains. For example, the experts identified the Investigative domain as having too few items pertaining to the social sciences. The experts wrote items for the areas not covered, or not sufficiently covered, by the pool of new items. The experts wrote a total of 89 new items.

Form Development

Two 300-item experimental forms are currently being developed. In assembling these forms, we will select the best-performing items on each of the six scales from the current IF and divide them between the two forms. We will include the items that were consistently assigned to the same RIASEC domain by the experts, along with the items written by the experts to broaden the sampling of content within the domains. Final selections will be made from items that were edited by the experts to facilitate assignment to a single domain.

Once developed, the two experimental forms will be reviewed by a group of junior- and senior-level English teachers to evaluate the understandability and reading level of the items. The two forms will be spiraled and administered to approximately 600 juniors and seniors at a local high school. Item endorsement rates, item-to-scale corrected correlations, and item-to-all-scales corrected correlations will be reviewed to identify poorly performing items. These items will be replaced by other new or old items.



Attachment A

Career Exploration Program
Interest Inventory

An interest inventory can help you identify your work-related interests. Knowing your work-related interests can help you determine career fields or occupations that would be potentially satisfying.

Directions: Interest inventories are not like other tests. There are no right or wrong answers. For each of the activities listed in the inventory, ask yourself if you would like or dislike doing that activity. In answering, don't be concerned with how well you would do the activity or whether you have the experience or training to do it. Just think about how much you would like or dislike doing the activity and select an answer from the following responses:

Darken L for Like. (I would like to do this activity.)
Darken I for Indifferent. (I don't care one way or the other.)
Darken D for Dislike. (I would not like to do this activity.)

In deciding, it is best to go with the first answer that comes to mind and not to think too much. If you don't have a strong preference one way or the other, you can answer Indifferent (I don't care one way or the other). Try to answer as many questions as possible with Like or Dislike. Answer all of the questions, even when you are unsure. Mark only one response for each activity. It will take about 40 minutes to complete the inventory, but there is no time limit. Don't rush. Take your time and enjoy yourself.

1. Connect a DVD player
2. Study the solar system
3. Act in a play
4. Work as a camp counselor
5. Manage a restaurant
6. Balance a checkbook
7. Feed and bathe zoo animals
8. Analyze evidence of a crime
9. Develop landscape designs
10. Teach a class
11. Campaign for a political cause
12. Take minutes for a meeting



Attachment B

ITEM SELECTION SCREENS

First Screen

1. Endorsement rate (er) equal to or greater than .1 for all subgroups
2. Item-to-scale correlation of .4 and above
3. Item-to-scale correlation larger than that of the same item with adjacent scales

Second Screen

Categorize and sort items into three categories.

Best Items
1. Absolute difference of endorsement rates across all subgroups no greater than .15
2. Item-to-scale correlation of .5 and above
3. Strong hexagonal correlation pattern

Good Items
1. Absolute difference of endorsement rates across all subgroups no greater than .18
2. Item-to-scale correlation of .4 to .49
3. Correlation pattern approximates the hexagon

Marginal Items
The item meets at least two of the three Good-item criteria. (A code sketch applying the First and Second Screens follows this attachment.)

Third Screen
1. Evaluate the coverage of each scale domain by the Best and Good performing items. Identify gaps and redundancy.
2. Evaluate the balance of each scale from the perspective of each subgroup to ensure balance.
3. Consider the possibility of using Marginal items.
4. Consider each Training and Education and Occupation item to see whether it can be rewritten as an Activity item.

Fourth Screen
1. Compare individual judgments and come to consensus.
2. Identify gaps and weaknesses in domain coverage.
3. Conduct analyses of new interim scales.
4. Estimate the number of new items needed by scale.
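The First and Second Screens translate directly into code. The sketch below computes a corrected item-to-scale correlation and applies the screening thresholds to one item; the hexagonal-pattern and adjacent-scale checks are omitted for brevity (and an item with r above .5 but endorsement spread between .15 and .18 is treated as Good here for simplicity), while the simulated responses are illustrative only.

    import numpy as np

    def corrected_item_scale_r(items, target_col):
        # Correlate one item with the sum of the other items on its
        # scale: the "corrected" item-to-scale correlation above.
        rest = np.delete(items, target_col, axis=1).sum(axis=1)
        return np.corrcoef(items[:, target_col], rest)[0, 1]

    def screen_item(er_by_group, item_scale_r):
        # Thresholds come straight from Attachment B.
        if min(er_by_group) < 0.10 or item_scale_r < 0.40:
            return "reject"
        spread = max(er_by_group) - min(er_by_group)
        if spread <= 0.15 and item_scale_r >= 0.50:
            return "Best"
        if spread <= 0.18:
            return "Good"
        return "Marginal"

    rng = np.random.default_rng(3)
    scale_items = (rng.random((600, 10)) < 0.4).astype(float)  # 600 students, 10 items
    r = corrected_item_scale_r(scale_items, target_col=0)
    print(screen_item(er_by_group=[0.35, 0.42], item_scale_r=r))  # random data -> "reject"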



References

Harmon, L. W., Hansen, J. C., Borgen, F. H., & Hammer, A. L. (1994). Strong Interest Inventory applications and technical guide. Palo Alto, CA: Consulting Psychologists Press.

Holland, J. L. (1985). Making vocational choices: A theory of vocational personalities and work environments (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.

Holland, J. L. (1997). Making vocational choices: A theory of vocational personalities and work environments (3rd ed.). Odessa, FL: Psychological Assessment Resources.

Moore, W., Pedlow, S., Kishnamurty, P., & Walter, K. (2000). National Longitudinal Survey of Youth 1997 (NLSY97). Chicago, IL: National Opinion Research Center.

Swaney, K. (1995). Technical manual: Revised unisex edition of the ACT Interest Inventory (UNIACT). Iowa City, IA: ACT, Inc.

Wall, J. E., Wise, L. L., & Baker, H. E. (1996). Development of the Interest-Finder: A new RIASEC-based interest inventory. Measurement and Evaluation in Counseling and Development, 29, 134-152.


JOB AND OCCUPATIONAL INTEREST IN THE NAVY

Stephen E. Watson, Ph.D.
Director, Navy Selection and Classification
Bureau of Naval Personnel
Washington, DC 20370-5000
Stephen.E.Watson@Navy.Mil

INTRODUCTION

The United States Navy is currently taking an innovative approach to improving the match between enlisted personnel (Sailors) and their jobs (Ratings). The Rating Identification Engine (RIDE) is the Navy's decision support system designed to help enlisted classifiers provide initial guidance counseling to Applicant-Sailors (i.e., potential enlistees) and to re-assign Sailors to new Ratings during their careers. During accessions, RIDE provides a rank-ordered list of recommended Navy Ratings to the Classifier and Applicant-Sailor and presents a wide variety of educational and career information about these jobs (i.e., enlistment periods, bonuses, promotion rates, and civilian education and job equivalents). Current inputs to RIDE include an ability component, which utilizes scores on the Armed Services Vocational Aptitude Battery, and a Navy Need input providing available training opportunities and emphasizing critical Navy Ratings. As the Navy strives to improve the Sailor-Rating match, a collection of independent variable measurements is being investigated as additional inputs to RIDE, including measures of personality (Farmer, Bearden, Borman & Hedge, 2002), general time sharing ability (Watson, Heggestad, & Alderton, 1999), and vocational interest. The most mature of these additional measures is the job interest measure known as Job and Occupational Interest in the Navy (JOIN).

In the development of JOIN, reviews suggested that traditional approaches and existing inventories were insufficient for the Navy's intent (Lightfoot, Alley, Schultz, Heggestad, & Watson, 2000). In general, these approaches failed to provide sufficient differentiation between Navy jobs. Navy Ratings (and military specialties in general) tend to fall into technical and scientific interest domains, rather than being more evenly distributed across the Holland (1999) domains. Likewise, the interests of military personnel should follow this limited distribution pattern as compared with college or college-bound populations (see Lightfoot et al., 2000, for a complete discussion of these issues).

For specificity, rather than attempting to tap into established interest factors or personality traits that may indicate potential job satisfaction in a broad collection of jobs (Holland, 1999), the author decided to focus on a componential classification (or taxonomy) indexing approach. In the approach presented here, preferred jobs are retrieved through collections, or profiles, of indices, rather than by matching a person's "type" to a broad job area and then retrieving jobs in that area. Campbell, Boese, Neeb, and Pass (1983) have made compelling arguments for similar taxonomic approaches.

The creation of an interest taxonomy requires the creation and categorization of some elements of job interest. Primoff (1975) defined a "job element" as a worker characteristic that influences success in a job, and he included interest as a job element. Unfortunately, there is no consensus definition of the phrase "vocational interest" (Hindelang, Michael, & Watson, 2002; Dawis, 1991), or perhaps even of the construct being measured by vocational interest inventories (Holland, 1999). For the purposes of this effort, we assume that we can measure job interest elements representing: (a) work the individual would prefer (specifically and generally), and (b) preferred work environments.

We also assume that these measured vocational interest elements are a valid source for defining some portion of the vocational interest preferences of the individual. Given this approach, it is also necessarily assumed that a complementary set of classifications of Navy Ratings, relying on vocational interest elements, is possible.

An additional and very important challenge in building an operational tool is to create something that is not only valid within the artificial constraints of laboratory and research settings, but also operational and maintainable, without a compromise in established validity.

Addressing the technological, operational, and theoretical concerns described above, the author decided that utilizing a universally available spreadsheet application to capture, apply, and maintain a taxonomy-based grid representing a Navy interest model was the most parsimonious approach. Two-dimensional data arrays and spreadsheets are common elements in contemporary operational office environments. These spreadsheet formats are easy to process and understand, and are well supported across a variety of computing platforms.

Admittedly, the idea of allowing a technology and its associated constraints to so clearly influence the development of a model of human cognition is not without reproach. Although this is a substantive issue meriting consideration, the current discussion will focus on the development of the model intended to drive JOIN, and allow the establishment of validity to answer the important question of approach effectiveness. The JOIN application generalizes the model for use in working environments and will eventually provide the greatest test of robust validity. Without constraining the application of this model to be easily applied to a variety of situations, eventual tests of validity would instead be quite limited and prolonged.

In summary, the production of the current componential model of Navy Ratings is a job classification task, focused on job interest elements, using a taxonomic grid (or spreadsheet) approach. Recommended Navy Ratings will be retrievable through patterns of indices, based on measured vocational interest responses. Finally, it should be noted that the development of this componential model, or "classification indexing of interest," is heavily influenced, explicitly and implicitly, by Fleishman and Quaintance (1984) and their summary and interpretation of the work cited therein.
cited therein.<br />

METHOD

Previous attempts at developing a model and items sufficient for use in JOIN (Lightfoot et al., 2000; Alley, Crowson & Fedak, in press) were not appropriate for the current approach for a variety of reasons. For example, because the level of description of job elements varied (see Lightfoot et al., 2000, p. 17; but also Alley, Crowson & Fedak, in press), detail and description in a taxonomy became problematic. Additionally, since the context for interpreting statements had been procedurally stripped away, interpretation and conversion to more generic or more specific statements, or into job interest elements, was not possible. Finally, while some of the task lists, statements, and items developed were linked to specific Ratings, most were not, and as such could not be used to build descriptions of the Ratings from which they were derived.


Phase I: Initial Job Interest Elements Production

To create a completely new set of data, all available job descriptions for all Navy enlisted Ratings were downloaded from the Enlisted Community Manager's website. The author printed, compiled, and reviewed each of these job descriptions, highlighting words reflecting process (action verbs, e.g., make, maintain), content (direct-object nouns, e.g., electronics, documents), environment (e.g., indoor vs. outdoor), and community (e.g., surface, submarine, aviation) to construct job interest elements (see also Hindelang, Michael & Watson, 2002, for a complementary description of methods). From the collections of highlighted extracts, the author developed common statements for each of the Ratings containing: (1) between one and five Process-Content (PC) pairings; (2) for each PC pairing, between one and five parenthesized examples; (3) at least one community statement; (4) at least one environment statement; and (5) at least one 'style' statement. An example of these statements is shown in Figure 1.

Rating  Community  Process   Content                                                    Work Style
ABE     Surface    Maintain  mechanical (hydraulic systems, steam systems, pneumatic)   outdoor
ABE                Operate   mechanical (hydraulic systems, steam systems, pneumatic)   physical
ABE                Operate   heavy equipment (crane, tractor)
ABE                Respond   emergency (fire, crash, flood)
ABE     Aviation   Direct    aircraft

Figure 1. Example of job interest elements for the Aviation Boatswain's Mate - Launch/Recovery (ABE) rating, built from analysis of ECM documents and revised in iterative SME interviews.
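To make the statement schema above concrete, the sketch below renders the Figure 1 rows as a small data structure. This is a minimal illustration only; the class and field names are hypothetical conveniences, not part of JOIN, and the one-to-five cardinalities simply mirror the rules listed above.

```python
# A minimal sketch of the statement schema, assuming hypothetical field names;
# the values are taken from the ABE rows shown in Figure 1.
from dataclasses import dataclass, field

@dataclass
class InterestElement:
    process: str                      # action verb, e.g., "Maintain"
    content: str                      # direct-object noun, e.g., "mechanical"
    examples: list[str] = field(default_factory=list)  # 1-5 parenthesized examples

@dataclass
class RatingDescription:
    rating: str
    pc_pairs: list[InterestElement]   # 1-5 Process-Content pairings
    community: list[str]              # at least one community statement
    environment: list[str]            # at least one environment statement
    style: list[str]                  # at least one 'style' statement

abe = RatingDescription(
    rating="ABE",
    pc_pairs=[InterestElement("Maintain", "mechanical",
                              ["hydraulic systems", "steam systems", "pneumatic"]),
              InterestElement("Operate", "heavy equipment", ["crane", "tractor"])],
    community=["Surface", "Aviation"],
    environment=["outdoor"],
    style=["physical"],
)
print(abe.rating, [f"{e.process}-{e.content}" for e in abe.pc_pairs])
```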

Phase II: Job Interest Element Revision

The collection of job interest elements (see Figure 2) developed in Phase I was used in semi-structured interviews with Subject Matter Experts (SMEs). These SMEs were Enlisted Community Managers (ECMs; typically the officer supervisors of enlisted personnel, with over 10 years of experience) and Enlisted Community Technical Advisors (TECHADS; typically enlisted personnel with 10-30 years of experience in the interviewed rating). In these small-group interviews (two to four participants), the author systematically presented elements from the relevant pre-conceived descriptions, and interviewees were asked to assess and comment on the appropriateness of elements, and to modify elements as necessary. Interviewees were also allowed to add words from the "precoordinated" (Fleishman & Quaintance, 1984) list of terms, which had been created by pooling all terms developed in Phase I. Natural-language terms that the interviewer considered key to a description, that occurred frequently, and that were not represented in the precoordinated vocabulary were also added to the pool.

The development of the final interest element collection for each rating continued from this point as a process of iterative interviews, refining and modifying the collection of job interest elements. Upon completion, all Ratings were described by some collection of interest elements, and all interest elements occurred for multiple (but not all) Ratings; the elements could be used to successfully differentiate among all but 7 Navy Ratings. Ultimately, 27 PC interest elements, 7 community interest elements, and 8 work style interest elements were identified as sufficient to represent the 79 entry-level Navy enlisted Ratings. These interest elements and Ratings were transformed into a taxonomic grid, in spreadsheet format (see Figure 2), commonly referred to as the "Rating DNA."

Column order within each block - Community: aviation, healthcare, special programs, submarine, surface; Process-Content: analyze-comms, analyze-data, analyze-docs, direct-aircraft, direct-emerg resp, maintain-docs, maintain-elec equip; Work Style: work indep, outdoor, indoor, industrial, office, physical, mental.

JOB GROUP             RATING   COMMUNITY    PROCESS-CONTENT   WORK STYLE
Aviation Mechanical   ABE      1 0 0 0 1    0 0 0 0 0 0 1     0 1 0 1 0 1 0
Aviation Mechanical   AD       1 0 0 0 0    0 0 0 1 0 0 0     0 1 0 1 0 1 0
Health Care           DT       0 1 0 0 0    0 0 1 0 0 1 0     0 0 1 0 1 0 1
Submarine Personnel   ST       0 0 0 1 0    0 1 1 0 0 1 0     1 0 1 1 0 0 1
Applicant                      1 0 0 0 1    0 0 0 0 0 0 1     0 1 0 1 0 1 0

Figure 2. An abbreviated and simplified example of the "Rating DNA" and a single applicant's collection of responses.

Phase III: Linking Pictures to Job Interest Elements

Digital images from a variety of sources were collected to represent job interest elements. Using a series of iterative workshops with the same population of SMEs as in Phases I and II, nine images were identified to represent each PC interest element, three images for each community interest element, and three images for each work style element. Each multiple-image collection was selected for a variety of aesthetic qualities (e.g., image clarity), but also to include a distribution of demographic makeup and task representations.

Phase IV: Instantiation in JOIN

PC pairs were presented three times, so that psychometric properties of items could be tested. Figure 3 illustrates the framework and design of the JOIN software. Responses from each of the job interest element areas (Process-Content, Community, and Work Style) were collected and used to develop an interest profile for the individual. The pattern of responses can be interpreted through relationships between rating structure and responses, as shown in Figure 3. Each applicant's response pattern is represented as a single-row matrix (see Figure 2), with each job interest element reflected as a continuous variable between 0 and 1 (the user actually responds on a 0-100% scale in the JOIN interface). This methodology and organization for the collection of data lends itself to a variety of potential indexing and retrieval schemes, allowing us to explore a variety of approaches to matching Ratings to patterns of responses; one such scheme is sketched below.
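As one concrete, purely illustrative possibility for such a retrieval scheme, the sketch below scores the Figure 2 applicant profile against each rating's "DNA" row with cosine similarity and ranks the ratings. The paper does not specify the operational JOIN algorithm, so the similarity metric here is an assumption.

```python
# A minimal sketch of one possible retrieval scheme: cosine similarity between
# the applicant's response row and each rating's binary "DNA" row. Profiles
# are the abbreviated Figure 2 values; the metric is an illustrative choice.
import numpy as np

rating_dna = {
    "ABE": [1,0,0,0,1, 0,0,0,0,0,0,1, 0,1,0,1,0,1,0],
    "AD":  [1,0,0,0,0, 0,0,0,1,0,0,0, 0,1,0,1,0,1,0],
    "DT":  [0,1,0,0,0, 0,0,1,0,0,1,0, 0,0,1,0,1,0,1],
    "ST":  [0,0,0,1,0, 0,1,1,0,0,1,0, 1,0,1,1,0,0,1],
}
applicant = np.array([1,0,0,0,1, 0,0,0,0,0,0,1, 0,1,0,1,0,1,0], dtype=float)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

scores = {r: cosine(applicant, np.array(p, dtype=float))
          for r, p in rating_dna.items()}
for rating, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{rating}: {score:.2f}")   # ABE matches this applicant perfectly (1.00)
```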

Phase V: System Test

The JOIN system was tested by 300 new recruits attending Recruit Training Center, Great Lakes, IL (Michael, Hindelang, & Watson, 2002). Recruits also participated in small-group discussions and filled out questionnaires regarding the software.

RESULTS AND DISCUSSION

Two sets of analyses are critical in the validation of the current approach. First, Hindelang, Michael, and Watson (2002) presented results from a principal components analysis indicating a convergence of the current Process-Content defined structure with previous factor-analytic representations of Navy jobs. Labels for 7 of the 9 components (accounting for 92% of the variance) were readily available and intuitive (e.g., Technical Mechanical Activities, Administrative Activities, Intelligence), while the remaining 2 components (accounting for 8% of the variance) were not.

Second, Michael, Hindelang, and Watson (2002) report findings from the RTC Great Lakes test of the JOIN system. In general, recruits found the JOIN system easy to use, intuitive, and appealing. All PC pairs, communities, and work styles were responded to with some level of interest, and there were substantial and reliable individual differences in response patterns. The (alpha) reliability estimates for each PC job interest element were very good, ranging from .83 (Operate Mechanical Equipment) to .95 (Make Facilities), and .91 overall.
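For readers unfamiliar with the reliability index cited here, the sketch below computes coefficient (Cronbach's) alpha for a small, simulated respondents-by-items matrix. The data are synthetic stand-ins for the three alternate versions of a single PC item, not the actual JOIN dataset.

```python
# A minimal sketch of the coefficient-alpha computation, on simulated data.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a respondents x items score matrix."""
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Three alternate versions of one PC item, rated 0-100 by five respondents.
rng = np.random.default_rng(0)
base = rng.uniform(0, 100, size=(5, 1))            # each person's true interest
scale = np.clip(base + rng.normal(0, 10, size=(5, 3)), 0, 100)
print(f"alpha = {cronbach_alpha(scale):.2f}")
```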

Taken as a whole, results from the described efforts suggest that developing a job interest measure from job interest elements derived by SME evaluation can be an effective approach for organizations offering training and careers in some subset of all possible jobs. Future research will test the predictive validity of the system with respect to training and career success.

The approach described here is efficient with respect to establishing rating descriptions, and it readily accommodates the creation of new jobs as well as the modification and merging of existing job structures. Additionally, as technical requirements for jobs change, job interest elements can be readily added to and subtracted from a job profile, and entirely new job interest elements can be added to the system based on SME input. The intuitive and simple approach described here should result in a functional, accurate, and easily maintained vocational interest measurement and matching system, JOIN.



Figure 3. Illustration of possible relationships between JOIN items, job interest elements, and existing Ratings and Rating groups. [Figure: multiple-response items (E 1.1 through E 3.3) map onto job interest elements (PC 1 ... PC n; Community 1 ... Community n; Work Style 1 ... Work Style n), which in turn map onto Ratings and Job Groups through potential mappings labeled 3-1, many-many, and many-few.]

REFERENCES

Alley, W.E., Crowson, J.J., & Fedak, G.E. (in press). JOIN item content and syntax templates (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Cunningham, J.W., Boese, R.R., Neeb, R.W., & Pass, J.J. (1983). Systematically derived work dimensions: Factor analyses of the Occupation Analysis Inventory. Journal of Applied Psychology, 68, 232-252.

Dawis, R. (1991). Vocational interests, values, and preferences. In M.D. Dunnette & L.M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2, pp. 271-326). Palo Alto, CA: Consulting Psychologists Press.

Farmer, W.L., Bearden, R.M., Borman, W.C., & Hedge, J.W. (2002). Navy Psychometrics of Measures: NAPOMS. Proceedings of the 44th Annual Conference of the International Military Testing Association.

Fleishman, E.A., & Quaintance, M.K. (1984). Taxonomies of human performance. Orlando, FL: Academic Press.

Hindelang, R.L., Michael, P.G., & Watson, S.E. (2002). Considering applicant interests in initial classification: The development of Job and Occupational Interest in the Navy (JOIN). Proceedings of the 70th Military Operations Research Society Symposium.

Holland, J.L. (1999). Why vocational interest inventories are also personality inventories. In M.L. Savickas & A.R. Spokane (Eds.), Vocational interests: Meaning, measurement, and counseling use. Palo Alto, CA: Davies-Black Publishing/Consulting Psychologists Press.

Lightfoot, M.A., McBride, J.R., Heggestad, E.D., Alley, W.E., Harmon, L.W., & Rounds, J. (1999). Navy interest inventory: Approach development (FR-WATSD-99-13). San Diego, CA: Space and Naval Warfare Systems Center.

Lightfoot, M.A., Alley, W.E., Schultz, S.R., & Watson, S.E. (2000). The development of a Navy-specific vocational interest model. Alexandria, VA: Human Resources Research Organization.

Michael, P.G., Hindelang, R.L., & Watson, S.E. (2002). JOIN: Job and Occupational Interest in the Navy. Proceedings of the 44th Annual Conference of the International Military Testing Association.

Primoff, E.S. (1975). How to prepare and conduct job element examinations (Tech. Study 75-1). Washington, DC: Government Printing Office.

Savickas, M.L. (1999). The psychology of interests. In M.L. Savickas & A.R. Spokane (Eds.), Vocational interests: Meaning, measurement, and counseling use. Palo Alto, CA: Davies-Black Publishing/Consulting Psychologists Press.

Watson, S.E. (2001). Considering applicant interests in initial classification: The Rating Identification Engine using Job and Occupational Interests in the Navy (RIDE/JOIN). Proceedings of the 69th Symposium of the Military Operations Research Society.

Watson, S.E., Heggestad, E.D., & Alderton, D.L. (1999). General time-sharing ability: Nonexistent or just rare? Proceedings of the 41st Annual Conference of the International Military Testing Association.



Vocational Interest Measurement in the Navy - JOIN

William L. Farmer, Ph.D. and David L. Alderton, Ph.D.
Navy Personnel Research, Studies, and Technology Department (PERS-1)
Navy Personnel Command
5720 Integrity Drive, Millington, TN, USA 38055-1000
William.L.Farmer@navy.mil

Job and Occupational Interest in the Navy (JOIN) has been developed for use in conjunction with the Rating Identification Engine (RIDE) to help provide a better match between a recruit's abilities and interests and specific occupations (i.e., ratings). JOIN measures interest in specific work activities and environments, as well as providing recruits with Navy job information.

JOIN evolved from a more or less typical interest inventory, comprised of a number of very specific work activity statements, into a picture-based instrument that solicits interest ratings on a series of generalizable work component items. There were four objectives in the design of JOIN. First, it had to differentiate among the 79 entry-level Navy jobs (known as "ratings"), something civilian and other military interest measures could not do. Second, it had to be "model based," so that it could be quickly adapted to Navy job mergers, changes, or additions. Third, JOIN needed to be usable by naïve enlisted applicants; naïve in terms of knowledge of Navy jobs and the technical terms used to describe them. Finally, JOIN needed to be short and engaging to encourage acceptance by Navy applicants and those who process them. The development of the JOIN model and tool is described in a series of NPRST reports (Alley, Crowson, & Fedak, in press; Lightfoot, Alley, Schultz, Heggestad, Watson, Crowson, & Fedak, in press; Lightfoot, McBride, Heggestad, Alley, Harmon, & Rounds, in press; Michael, Hindelang, Watson, Farmer, & Alderton, in press).

JOIN SOFTWARE TESTING

Usability Testing I

The first test phase occurred during August of 2002 at the Recruit Training Center (RTC) Great Lakes, and was conducted with a sample of 300 new recruits. Participants were presented with JOIN and its displays, images, question content, and other general presentation features in order to determine general test performance, item reliability, clarity of instructions and intent, and appropriateness for a new-recruit population in terms of overt interpretability, required test time, and software performance. The initial results from the usability testing were very promising on several levels. First, feedback from participants provided researchers with an overall positive evaluation of the quality of the computer-administered interest inventory. Second, descriptive statistical analyses of the JOIN items indicated that there was adequate variance across individual responses; in other words, participants differed in their level of interest in various items. Finally, the statistical reliability of the work activity items was assessed, and the developed items were very consistent in measuring participant interest in the individual enlisted rating job tasks. The results from this initial data collection effort were used to improve the instrument prior to subsequent usability and validity testing (Michael, Hindelang, Watson, Farmer, & Alderton, in press).


Instrument Refinement

Based on the results of the initial usability study, a number of changes were made, with three criteria in mind. First, we wanted to improve the interface from the perspective of the test taker. Second, it was imperative that testing time be shortened. Though this modification does contribute to the "user-friendliness" of the tool, the initial impetus was a very real operational constraint: representatives from the Navy Recruiting Command (CNRC) directed that the instrument take no more than ten to fifteen minutes to complete. Finally, if at all possible, the technical/psychometric properties of the instrument had to be maintained, if not enhanced.

Though the initial usability testing was favorable overall, one concern was voiced on a fairly consistent basis: respondents stated that there was an apparent redundancy in the items presented. This redundancy was most often characterized as, "It seems like I keep seeing the same items one right after another."

One explicit feature targeted during initial development was the use of a set of generalizable process and content statements. For instance, the process "Maintain" is utilized in nine different PC pair combinations, and a number of the content areas are used in as many as three PC pairs. Because this was a deliberate design feature, it was decided that it would not be revised.

Also contributing to the apparent redundancy was the fact that three versions of each PC item were presented, yielding a total of 72 PC items administered to each respondent. This feature had been established as a way of ensuring psychometric reliability. With a keen eye toward maintaining technical standards, the number of items was cut by one-third, yielding a total of 56 PC items in the next iteration of the JOIN tool.

Finally, the original algorithm had specified that all items be presented randomly. Though the likelihood of receiving alternate versions of a PC pair item one right after the other was low, we decided to place a "blocking constraint" in the algorithm, whereby an individual receives blocks of one version of all 26 PC pairs presented randomly. With the number of PC pair presentations constrained to two, each participant receives two blocks of 26 items; a sketch of this presentation scheme follows.
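A minimal sketch of the blocked presentation scheme, assuming hypothetical item labels: each version of the 26 PC pairs forms a block, order is shuffled within a block, and the blocks are presented one after the other, so alternate versions of a pair are usually far apart.

```python
# Illustrative blocked randomization; labels and seed are placeholders.
import random

PC_PAIRS = [f"PC{i:02d}" for i in range(1, 27)]   # the 26 PC pairs
N_VERSIONS = 2                                    # two versions per pair

def presentation_order(seed=None):
    rng = random.Random(seed)
    order = []
    for version in range(1, N_VERSIONS + 1):
        block = [f"{pair}-v{version}" for pair in PC_PAIRS]
        rng.shuffle(block)            # random order *within* the block
        order.extend(block)           # blocks shown sequentially
    return order

sequence = presentation_order(seed=42)
print(len(sequence), sequence[:4])    # 52 items; first few items of block 1
```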

As users had been pleased with the other major features of the interface, no refinements were made beyond those mentioned. The reduction in testing time was assumed to follow from the deletion of PC item pairs. Decisions to delete items were made using a combination of rational and technical/psychometric criteria. As stated earlier, initial item statistics had been favorable: internal consistencies within 3-item PC scales were good (mean α = 0.90), and sufficient variation across scale endorsement indicated that individuals were actually making differential preference judgments. Items were deleted if they contributed little (in comparison to other items in the scale) to PC scale internal consistency or possessed response distributions markedly different from those of alternate versions of the same item. In lieu of definitive empirical information, items were also deleted if they appeared to present redundant visual information (as judged by trained raters). The resulting 2-item PC scales demonstrated good internal consistency (mean α = 0.88). Additional modifications were made that enhanced item data interpretation and allowed for the future measurement of item response time.



Usability Testing II

The second phase of testing occurred over a three-and-a-half-month period in the spring of 2003 at RTC Great Lakes. A group of approximately 4,500 participants completed the refined JOIN (1.0e) instrument. The group was 82% male, 65% white, 20% black, and 15% other.

From a usability perspective, 93.2% of all respondents rated JOIN "good" or "very good." Regarding the PC descriptors, 90.4% of respondents felt that the pictures did a "good" or "very good" job of conveying the information presented in the descriptors, and 80.5% stated that the items did a "good" or "very good" job of conveying Navy-relevant job information to new recruits. In terms of psychometric quality, the average PC scale α was 0.87. Descriptive statistics indicate that participants provided differential responses across and within work activity scales. The average testing time decreased from 24 minutes (for the original version) to 13 minutes. The average time spent per item ranges from 8 to 10 seconds (except for special operations items, at 21 seconds). Special Programs and Aviation are the preferred communities, with working outside and in a team as the work environment and style of choice. As in the initial pilot test, the most desirable work activity has been operating weapons.

Criterion-Related Validity Testing

The data collected in the most recent round of testing are also being used to establish the criterion-related validity of the JOIN instrument. As those who completed the instrument lack prior experience or knowledge of the Navy or Navy ratings, they are an ideal group for establishing the predictive validity of the tool. Criterion measures (e.g., A-school success) will be collected as participants progress through technical training and those data become available. Participants' social security numbers (SSNs) were collected to link interest measures to longitudinal data, including the multiple-survey 1st Watch source data. Additional measures will include attrition prior to End of Active Obligated Service (EAOS), measures of satisfaction (on the job and in the Navy), propensity to leave the Navy, and desire to re-enlist. Additionally, JOIN results will be linked with performance criteria.

JOIN MODEL ENHANCEMENT

In addition to the establishment of criterion-related validity, current efforts are focused on establishing and enhancing the construct validity of the SME model upon which the JOIN framework rests. As mentioned previously, the tool was developed using the judgments of enlisted community managers. In addition to decomposing rating descriptions into core elements and matching photographs with these elements, this group also established the initial scoring weights, which were limited in the first iteration to unit weights. At present, NPRST researchers are conducting focus groups with Navy detailers, classifiers, and A-school instructors to derive SME-determined numerical weights that establish an empirical link between JOIN components and all existing Navy ratings available to first-term sailors. These weights will be used to enhance the scoring algorithm that provides an individual preference score for each Navy rating. A rank ordering (based on preference scores) of all Navy ratings is provided for each potential recruit.
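A minimal sketch of that scoring step, under the assumption (not stated in the paper) that the preference score is a weighted sum of component responses. The component names reuse Figure 2 labels from the companion paper, and the weights are invented placeholders for the SME-derived values; the first operational iteration used unit weights.

```python
# Illustrative weighted preference scoring and rank ordering of ratings.
import numpy as np

components = ["maintain-elec equip", "outdoor", "physical", "aviation"]
weights = {                       # rows: rating; columns: components (hypothetical)
    "ABE": np.array([0.5, 1.0, 1.0, 1.0]),
    "AD":  np.array([1.0, 0.5, 0.5, 1.0]),
    "DT":  np.array([0.0, 0.0, 0.5, 0.0]),
}
applicant = np.array([0.9, 0.8, 0.7, 1.0])   # 0-1 interest per component

scores = {rating: float(w @ applicant) for rating, w in weights.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)             # rank ordering by preference score
```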


FUTURE DIRECTIONS

Plans include linking JOIN results with other measures, including the Enlisted Navy Computer Adaptive Personality Scales (ENCAPS) and other individual-difference measures currently being developed at NPRST. Establishing a measurable relationship between job preference and constructs such as individual temperament, social intelligence, teamwork ability, and complex cognitive functioning will greatly advance the Navy's efforts to select and classify sailors and ensure the quality of the Fleet into the future.

References

Alley, W.E., Crowson, J.J., & Fedak, G.E. (in press). JOIN item content and syntax templates (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Lightfoot, M.A., Alley, W.E., Schultz, S.R., Heggestad, E.D., Watson, S.E., Crowson, J.J., & Fedak, G.E. (in press). The development of a Navy-job specific vocational interest model (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Lightfoot, M.A., McBride, J.R., Heggestad, E.D., Alley, W.E., Harmon, L.W., & Rounds, J. (in press). Navy interest inventory: Approach development (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.

Michael, P.G., Hindelang, R.L., Watson, S.E., Farmer, W.L., & Alderton, D.L. (in press). JOIN: I. Interest inventory development and pilot testing (NPRST-TN-03). Millington, TN: Navy Personnel Research, Studies, & Technology.



THE ARMY VOCATIONAL INTEREST CAREER EXAMINATION

Mary Ann Hanson
President
Center for Career and Community Research
402 Dayton Avenue #3, St. Paul, MN 55102-1772
maryann.hanson@ccc-research.com

Cheryl J. Paullin and Kenneth T. Bruskiewicz
Senior Research Scientist / Senior Research Associate
Personnel Decisions Research Institutes

Leonard A. White
Senior Research Psychologist
U.S. Army Research Institute for the Behavioral and Social Sciences

INTRODUCTION

The Army Vocational Interest Career Examination (AVOICE) is a vocational interest inventory developed as part of the U.S. Army's Project A. It is designed to measure a wide variety of interests relevant to Army jobs. AVOICE items describe a variety of occupational titles, work tasks, leisure activities, and desired learning experiences, and respondents are asked to indicate their degree of interest in each. Data collected as part of Project A and the later Building the Career Force project provide a great deal of information about the psychometric characteristics and correlates of AVOICE scores, and more generally about how interests relate to important military criteria. This paper describes the development of the AVOICE, the Project A/Career Force data collections that have included the AVOICE, and currently available analysis results. Finally, we provide suggestions for further research and applications using the AVOICE in the recruitment and classification process.

AVOICE DEVELOPMENT

The AVOICE is based on an interest inventory developed by the U.S. Air Force called the Vocational Interest Career Examination (VOICE; Alley & Matthews, 1982). The VOICE item pool was developed rationally to cover interest constructs that appeared relevant for Air Force enlisted personnel. The items were grouped into scales based on content similarity, and data were used to improve the internal consistency of the scales. The VOICE was administered as part of the first large-scale Project A predictor data collection (the Preliminary Battery). Based on the results, some items were dropped, new items were added, and the response format was changed from a three-point to a five-point scale. This initial version of the AVOICE was administered during the field test of Project A predictor and criterion measures, revised based on field test data, administered to the Concurrent Validation (CV) sample, refined further based on these data, and finally administered to the Longitudinal Validation (LV) sample. Hough, Barge, and Kamp (2001) provide more details concerning AVOICE development. The current version of the AVOICE contains 182 items grouped into 22 homogeneous basic scales. Based on the available literature, each AVOICE scale has been linked to one of the constructs in Holland's hexagonal model of interests (Realistic, Investigative, Artistic, Social, Enterprising, or Conventional). Table 1 shows some example AVOICE scales and the associated Holland themes. The AVOICE scales emphasize the Realistic theme, reflecting the fact that much of the work performed by enlisted Army soldiers is Realistic in nature.

Table 1. AVOICE Composites, Holland Theme(s), and Examples of Scales Included

Composite            Example Scale(s)                    Holland Theme(s)
Rugged/Outdoors      Combat; Rugged Individualism        Realistic
Audiovisual Arts     Drafting; Audiographics             Realistic; Artistic
Interpersonal        Medical Services; Leadership        Investigative; Social
Skilled/Technical    Computers; Mathematics              Investigative; Realistic
Administrative       Clerical/Administrative             Conventional
Food Service         Food Service - Professional         Conventional
Protective Services  Fire Protection; Law Enforcement    Realistic
Structural/Machines  Mechanics; Vehicle Operator         Realistic

PROJECT A/CAREER FORCE DATA COLLECTIONS AND CRITERIA

Project A/Career Force included a comprehensive set of criterion measures. There were multiple job performance measures, including hands-on performance tests, written job knowledge tests, supervisory role-play simulations (second-tour soldiers only), and self-report measures of personnel actions (e.g., awards). In addition, job performance ratings were collected from peers and supervisors using specially developed behaviorally anchored rating scales. Soldiers also completed a satisfaction questionnaire that assessed satisfaction with eight different dimensions of the Army and their jobs, as well as overall satisfaction. As with the AVOICE, the criterion measures were continually refined and revised over the course of the project.

After the field test, Project A data collection efforts focused on two cohorts: the Concurrent Validation (CV) cohort and the Longitudinal Validation (LV) cohort. The CV cohort entered the military in 1983 or 1984, and its members were administered the Project A predictor and first-tour criterion measures concurrently in 1985, during their first tour of duty. The LV cohort entered the Army in 1986 or 1987. Each soldier in the LV cohort was administered the Project A predictor measures during his or her first three days in the Army. During 1988 and 1989, the first-tour performance measures were collected for these soldiers, and during 1990 and 1991 the second-tour performance measures were collected. By this time, most soldiers in the LV cohort were in their second tour of duty and had moved into leadership roles (e.g., squad leader).

Scores from all of the performance measures were used to model the structure of first- and second-tour soldier job performance (Campbell & Knapp, 2001). These performance models were then used to group scores on the performance measures into criterion composite scores. The first-tour criterion composites used in the Project A basic validation analyses were: (1) Core Technical Proficiency (CTP), (2) General Soldiering Proficiency (GSP), (3) Effort and Leadership (ELS), (4) Maintaining Personal Discipline (MPD), and (5) Physical Fitness and Military Bearing (PFB). The second-tour criterion composites are similar, the primary difference being the addition of a sixth composite, Leadership (LDR); the ELS composite was also revised and relabeled Achievement and Effort (AE) for the second tour. Finally, attrition data were collected from Army archives. Attrition analyses in this paper focused on soldiers who left the Army for avoidable reasons (e.g., failure to meet minimal performance or behavioral criteria) during their first tour of duty. Because the majority of this avoidable attrition occurs during the first 12 months, attrition was defined as leaving the military for avoidable reasons during the first 12 months of service.

BASIC VALIDATION AND EMPIRICAL SCORING PROCEDURES

For the basic validation analyses, AVOICE item-level scores were summed to create the 22 interest scales. Based on a principal components analysis, these scale scores were grouped into eight summary composites (Campbell & Knapp, 2001; see Table 1). AVOICE validity was assessed by computing multiple correlations between these eight AVOICE composites and the performance criteria within each MOS, statistically adjusting the correlations for shrinkage (Rozeboom, 1978), correcting them for range restriction, and then averaging across MOS.
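For readers who want the general shape of such a shrinkage adjustment, a standard correction of the Wherry family is shown below; note this is the generic form, not necessarily the exact Rozeboom (1978) estimator used in these analyses:

$$\hat{\rho}^{2} = 1 - \left(1 - R^{2}\right)\frac{n-1}{n-k-1},$$

where $R$ is the sample multiple correlation, $n$ is the within-MOS sample size, and $k$ is the number of predictors (here, the eight AVOICE composites).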

Paullin, Bruskiewicz, Hanson, Logan, and Fellows (1995) conducted preliminary analyses to determine whether empirical scoring procedures have the potential to enhance the validity of the AVOICE beyond the levels obtained in the basic validation analyses. This work focused on five MOS, selected to include both combat and noncombat MOS, some MOS that are similar to each other and some that are quite different in terms of the tasks involved, and some MOS that have a relatively large percentage of female soldiers. The MOS selected were Infantryman (11B), Cannon Crewman (13B), Light Wheel Vehicle Mechanic (63B), Administrative Specialist (71L), and Medical Specialist (91A). Only a subset of the criterion variables was included in this work: Core Technical Proficiency (CTP), because it is quite well predicted by the AVOICE (see results below); attrition, because it is reasonable to expect that soldiers whose interests do not match their jobs will be more likely to leave; and MOS membership, because it has been one of the most widely studied criteria in past research on interests. Several different empirical keying procedures were explored for CTP for the 13Bs and for occupational membership for the 91As. Based on the results of these analyses, only the most promising empirical keying strategies were evaluated for the remaining MOS and criteria.

Empirical keys that focus on response-option-level data do not assume a linear relationship between item scores and the criterion of interest. Two of the most common response-option-level approaches were tried: vertical percent and correlational. For the vertical percent method, contrasting groups of 13Bs were formed based on the criterion variable (CTP): soldiers whose CTP scores fell in the top 30 percent and those whose scores fell in the bottom 30 percent. The differences between the percentages of soldiers in the two contrasting groups choosing each response option were then used to assign "net weights" to each response option (Strong, 1943). This essentially gives differences greater weight if they occur at either extreme of the response distribution. Two vertical percent keys were developed: one including only those items that showed at least a 5-point difference across groups and one including items that showed at least a 10-point difference. Similar procedures were followed for the MOS membership criterion for the 91As, but all items were included in these keys. A second, "correlational" method was also explored for CTP. Each dichotomous response-option score was correlated with the continuously scored CTP criterion measure, and unit weights were assigned to response options with a significant point-biserial correlation (p < .05). A sketch of the vertical percent method follows.
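The sketch below illustrates the vertical percent method on simulated data: form top-30%/bottom-30% CTP groups, take the difference in the percentage of each group endorsing each response option as the "net weight", and retain options passing a cut (the 10-point key is shown). The data and the exact bookkeeping are illustrative assumptions, not the Project A analyses.

```python
# Illustrative vertical-percent keying on one 5-point item and a toy criterion.
import numpy as np

rng = np.random.default_rng(1)
n = 400
responses = rng.integers(1, 6, size=n)        # one item, 5-point scale
ctp = responses * 2.0 + rng.normal(0, 4, n)   # criterion loosely tied to item

hi = responses[ctp >= np.quantile(ctp, 0.70)]  # top 30 percent on CTP
lo = responses[ctp <= np.quantile(ctp, 0.30)]  # bottom 30 percent on CTP

for option in range(1, 6):
    pct_hi = 100 * np.mean(hi == option)       # % of high group choosing option
    pct_lo = 100 * np.mean(lo == option)       # % of low group choosing option
    net = pct_hi - pct_lo                      # "net weight" for this option
    keep = abs(net) >= 10                      # e.g., the 10-point key
    print(f"option {option}: net weight {net:+5.1f}  {'kept' if keep else 'dropped'}")
```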

Empirical scoring procedures focused on item-level data were also developed, by computing correlations between AVOICE item-level scores and CTP. The items with the largest correlations were then selected for unit-weighted short (6-item), medium (12-item), and long keys. Each successive key contained all of the items from the shorter keys, and the long key included all of the items that had a statistically significant relationship with the criterion variable (p < .05). Item-level empirical keys to predict MOS membership were developed using similar procedures. Differences between the mean AVOICE item-level score for members of the target MOS and members of all remaining MOS (i.e., the "general population") were computed for each item. Items for which this mean difference was significantly different from zero (p < .01) were then included in a unit-weighted key, with a negative weight if the general population received the higher score and a positive weight if the target MOS received the higher score.

All analyses for CTP were conducted within MOS, because interests are expected to have different relationships with performance (i.e., CTP) in different jobs. For MOS that included women, empirical keys were developed separately for men and women. Attrition analyses were conducted for the 13Bs and 91A men only, and again analyses were conducted within MOS. All samples were divided randomly into two subsamples; empirical keys were then developed in one sample and cross-validated in the other. For the analyses focused on MOS membership, empirical keys were developed in the CV sample and applied and evaluated in the LV sample.

RESULTS FOR JOB PERFORMANCE

A clear finding from the Project A research is that AVOICE interest scales predict job performance. For example, in the LV sample (N = 4,220), the mean multiple correlation across nine MOS between the AVOICE composites and CTP was .38 (corrected for range restriction and adjusted for shrinkage). Multiple correlations for the other first-tour performance composites were .37 with GSP, .17 with ELS, .05 with MPD, and .05 with PFB. AVOICE validities for second-tour performance (N = 1,217) were similar; the multiple correlations were .41 with CTP, .29 with GSP, .09 with AE, .06 with MPD, .09 with PFB, and .35 with LDR. It is interesting that the prediction of CTP is slightly better for second-tour soldiers, while the prediction of GSP is somewhat worse. Also, AVOICE scores predicted leadership performance quite well, even though there was not a strong a priori rationale for expecting this relationship.

Table 2. Comparison of the Validity of AVOICE Composites with Empirical Keys for First-Tour Core Technical Proficiency (CTP)

                        Cross-Validation   Multiple         Empirical Keys
                        Sample Size        Correlation 1    12-Item     All-Sig.
CV Median across MOS    176                .20              .24         .25
CV Range across MOS     92-243             .00-.44          .14-.43     .16-.46
LV Median across MOS    240                .15              .24         .23
LV Range across MOS     113-365            .00-.17          .17-.28     .17-.25

1 Multiple correlation of eight rational composites adjusted for shrinkage using Rozeboom (1978).

Regarding empirical keys for CTP, the various item- and response-option-level empirical keying procedures did not yield appreciably different cross-validated results. Response-option-level keys generally yielded higher validities in the development sample, but greater shrinkage in the cross-validation sample. Since conclusions based on the various empirical keying approaches are virtually identical, this paper presents results for the item-level keys only. The short (i.e., 6-item) scales were uniformly less valid than the medium and long scales, and results are presented here for the latter scales only. Table 2 summarizes the results across six groups (11Bs, 13Bs, 63B men, 71L men, 71L women, and 91A men) for the CV and LV cohorts. In general, empirically developed AVOICE scales provided somewhat better prediction of CTP than the multiple correlations based on the eight rationally developed composites.

RESULTS FOR SATISFACTION, ATTRITION, AND MOS MEMBERSHIP

Regarding job satisfaction, Carter (1991) examined multiple correlations between the AVOICE composites and satisfaction scales (adjusted for shrinkage). Sample-size-weighted means (across MOS) ranged from .04 to .13 across the different dimensions of satisfaction, with a median of .10. These correlations are lower than those typically reported for military samples (e.g., Hough et al., 2001), but correlations between interests and satisfaction in the non-military literature have been inconsistent and often quite low (e.g., Dawis, 1991). The AVOICE did not predict attrition well for the 13B and 91A MOS. The median multiple correlation between the eight composites and attrition (adjusted for shrinkage) was only .06. Empirical keying did not improve these results; in fact, the median cross-validity for the empirical attrition keys was only .02. AVOICE scores are, however, substantially related to soldiers' occupational (i.e., MOS) membership. Table 3 summarizes the results across seven groups: 11Bs, 13Bs, 63B men, 71L men, 71L women, 91A men, and 91A women. Soldiers generally score considerably higher on the occupational key developed for their own MOS than do soldiers from other MOS, with a median effect size of more than a full standard deviation.

Table 3. Mean 12-Item Occupational Scale Scores in the Longitudinal Validation Sample

                     Target MOS                 General Population 1
                     N      Mean    SD          N       Mean    SD       Effect Size 2
Median across MOS    486    61.24   8.45        5,889   49.96   9.43     1.24
Minimum              74     53.88   6.22        529     48.24   8.81     .40
Maximum              764    65.24   10.60       6,545   56.29   11.52    1.69

1 General population for each scale consists of members of all other Batch A and Batch Z MOS in that sample.
2 Effect sizes in this table are relative to the target MOS (i.e., a positive effect size indicates that the target MOS scores higher than the general population). All effect sizes are significant at the p < .01 level.
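As a rough check on the reported median effect size, assuming a pooled-standard-deviation standardized mean difference (the table does not state the exact formula used):

$$d = \frac{\bar{X}_{\text{target}} - \bar{X}_{\text{gen}}}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^{2} + (n_2 - 1)s_2^{2}}{n_1 + n_2 - 2}} = \sqrt{\frac{485(8.45)^2 + 5888(9.43)^2}{6373}} \approx 9.36,$$

giving $d \approx (61.24 - 49.96)/9.36 \approx 1.21$, in line with the tabled median of 1.24 (the small gap presumably reflects a different pooling convention).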

DISCUSSION AND RECOMMENDATIONS

AVOICE scores predict the MOS-specific aspects of performance quite well, and they are also related to MOS membership. Interests are not strongly related to attrition, so there is apparently a good deal of self-selection into the occupational specialties that match soldiers' interests. The fact that interests are related to performance even within this self-selected group suggests that the Army could benefit from more systematically placing recruits in MOS that match their interests. For example, recruits could simply be told which MOS best match their interests and the benefits of entering an MOS that more closely matches their interests. Because people tend to gravitate toward occupations that are consistent with their vocational interests, the AVOICE may also provide a good recruiting tool, as it could be used to identify the MOS that are likely to be most appealing to potential recruits. Alternatively, AVOICE scores could be used to make classification decisions. Further analyses are needed to determine the extent to which AVOICE scores can contribute to performance prediction beyond the current ASVAB composites.

Empirical keying has good potential for improving the validity of the AVOICE. It is worth noting that the analyses presented here actually provide a fairly conservative test of the validity of empirical approaches, as the multiple correlation using the eight relatively homogeneous AVOICE composites also capitalizes on the relevant predictor-criterion relationships. Empirical keys are likely to provide even more of an advantage when compared with purely rational keys. Large datasets that include interest scores and the criteria of interest are needed for empirical keying, and the Project A/Career Force data provide a valuable source of this relatively difficult-to-obtain item-level validity information. Empirical keying has been explored for only a subset of criteria and MOS, and based on the results, further research is warranted.

REFERENCES

Alley, W.E., & Matthews, M.D. (1982). The Vocational Interest Career Examination: A description of the instrument and possible applications. Journal of Psychology, 112, 169-193.

Campbell, J.P., & Knapp, D.J. (Eds.). (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates.

Carter, G.W. (1991). A study of relationships between measures of individual differences and job satisfaction among U.S. Army personnel. Unpublished doctoral dissertation, University of Minnesota, Minneapolis.

Dawis, R.V. (1991). Vocational interests, values, and preferences. In M.D. Dunnette & L.M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2, pp. 833-871). Palo Alto, CA: Consulting Psychologists Press.

Hough, L., Barge, B., & Kamp, J. (2001). Assessment of personality, temperament, vocational interests, and work outcome preferences. In J.P. Campbell & D.J. Knapp (Eds.), Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates.

Paullin, C., Bruskiewicz, K.T., Hanson, M.A., Logan, K., & Fellows, M. (1995). Development and evaluation of AVOICE empirical keys, scales and composites. In J.P. Campbell & L.M. Zook (Eds.), Building and retaining the career force: New procedures for accessing and assigning Army enlisted personnel - Final report (ARI Technical Report). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Rozeboom, W.W. (1978). Estimation of cross-validated multiple correlation: A clarification. Psychological Bulletin, 85, 1348-1351.

Strong, E.K. (1943). Vocational interests of men and women. Stanford, CA: Stanford University Press.



DEVELOPING MEASURES OF OCCUPATIONAL INTERESTS AND VALUES FOR SELECTION 39

Dan J. Putka, Ph.D.
Human Resources Research Organization
66 Canal Center Plaza, Suite 400
Alexandria, VA 22314
dputka@humrro.org

Christopher E. Sager, Ph.D.
Human Resources Research Organization
66 Canal Center Plaza, Suite 400
Alexandria, VA 22314
csager@humrro.org

Chad H. Van Iddekinge, Ph.D.
Human Resources Research Organization
66 Canal Center Plaza, Suite 400
Alexandria, VA 22314
cvaniddekinge@humrro.org

INTRODUCTION

Historically, personnel selection has concerned the development of predictor measures that assess knowledges, skills, and attributes (KSAs) deemed critical to successful job performance (e.g., Campbell, 1990; Schmitt & Chan, 1998). However, job performance is not the only criterion that the U.S. Army desires to affect through its selection and classification systems. Most notably, attrition is often a key criterion of interest. Unfortunately, traditional KSA-based predictor development strategies fall short in identifying predictors of non-performance criteria. Fortunately, the academic literature provides alternative strategies for developing predictors that are well grounded in theory and highly relevant to the prediction of alternative criteria. For example, measures of occupational interests and values have a long tradition in the vocational/career counseling literature, and have been found to be predictive of both employee turnover and several work-related attitudes believed to underlie turnover (e.g., job satisfaction and commitment; Dawis, 1991). Although such measures are common in the vocational counseling arena, several challenges arise when they are considered for use in personnel selection, where intentional response distortion among respondents becomes more likely. This paper discusses challenges that have arisen in our efforts to develop such measures for the Army's Select21 project.

INTEREST AND VALUE MEASURES IN SELECT21

The goal of the Select21 project (sponsored by the U.S. Army Research Institute for the Behavioral and Social Sciences) is to develop and validate measures that will help the Army select and retain Soldiers with the characteristics needed to succeed in the future Army. A key element of predictor development has been to develop measures of person-environment (P-E) fit (Kristof, 1996), with particular focus on the match between prospective Soldiers' work interests and values and those supplied by the Army environment, both now and in the future. Our strategy for developing these measures for Select21 arose out of work done in two relatively distinct bodies of literature, which we briefly summarize in the sections that follow.

39 This paper is part of a symposium titled Occupational Interest Measurement: Where Are the Services Headed? presented at the 2003 International Military Testing Association Conference in Pensacola, FL (M.G. Rumsey, Chair). The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.

Theory of Work Adjustment

Our first strategy for developing fit measures was derived from the Dawis, England, and Lofquist (1964) Theory of Work Adjustment (TWA). Although the TWA is a broad theory, we focused on the part that concerns the correspondence between individuals' needs (in this case, recruits' needs) and what the organization or job (in this case, the Army) supplies. Specifically, TWA suggests that a Soldier's level of work satisfaction is a function of the correspondence between that Soldier's preference for various occupational reinforcers and the degree to which the Army (or job) provides those reinforcers. An occupational reinforcer is a characteristic of the work environment associated with an individual's work values (e.g., having a chance to work independently, being paid well, having good relationships with co-workers). Within the P-E fit literature, correspondence between a Soldier's needs and what the organization or job supplies in terms of those needs is referred to as "needs-supplies fit" (Edwards, 1991).

terms of those needs is referred to as “needs-supplies fit” (Edwards, 1991).<br />

Although TWA focuses on occupational reinforcers and work values, other theories have<br />

postulated similar relationships between needs-supplies fit and work-related attitudes with<br />

different needs-supplies content. For example, Holland’s congruence theory (Holland, 1985)<br />

would suggest that a Soldier’s work satisfaction is, in part, a function of the congruence between<br />

that Soldier’s vocational interests (i.e., standing on RIASEC 40 interest dimensions) and the<br />

interests supported by the Army/job environment. Vocational interests are often indicated by<br />

individuals’ preferences for various generalized work activities, work contexts, and leisure<br />

activities. For Select21, we drew on both TWA work value and RIASEC interest content when<br />

developing the interest and value measures.<br />

Realistic Job Previews<br />

The other strategy we adopted for developing predictor measures was derived from the<br />

literature on realistic job previews (RJPs; e.g., Wanous, 1992). RJPs are hypothesized to bring<br />

applicants’ pre-entry expectations more in line with reality, thus serving to reduce later negative<br />

effects (e.g., dissatisfaction and turnover) of unmet expectations (e.g., Hom, Griffeth, Palich, &<br />

Bracker, 1999). RJPs are not typically thought of as predictors in the selection context; rather<br />

they reflect information provided to an applicant. As such, traditional pre-entry RJPs take the<br />

selection decision out of the hands of the organization and put it into the hands of the applicant<br />

(i.e., self-selection). In a lean recruiting environment, loss of such power on the Army’s part<br />

would be undesirable. Despite their value, this characteristic of traditional RJPs might explain<br />

their lack of use in the Army (Brose, 1999).<br />

40 Holland discusses six dimensions of vocational interest: realistic, investigative, artistic, social, enterprising, and conventional (RIASEC; Holland, 1985).



For Select21 we conceived of a novel way to capitalize on the benefits of pre-entry RJPs yet put the decision in the hands of the Army. We are seeking to achieve this by presenting RJP information in the form of a pre-entry "knowledge of the Army" test. For example, we developed

measures that ask prospective Soldiers the extent to which they believe the occupational<br />

reinforcers and interests assessed in the Select21 needs-supplies fit measures are characteristic of<br />

the Army. We refer to correspondence between recruits’ expectations and the reality of Army<br />

life as “expectations-reality fit.” Thus, based on content from the needs-supplies fit measures, we<br />

also constructed expectations-reality fit measures for Select21.<br />

Our decision to go beyond traditional needs-supplies fit measures, and include<br />

expectations-reality fit measures for Select21 stems from our belief that these two types of<br />

measures will interact to predict the criteria of interest (i.e., attrition and its attitudinal<br />

precursors). For example, based on expectancy theory (Vroom, 1964), we expect that misfit<br />

between the applicant and Army for any given occupational reinforcer (e.g., degree of autonomy)<br />

depends on (a) how important the reinforcer is to the applicant (need), (b) how much the<br />

applicant expects the Army to provide the reinforcer (expectation), and (c) the extent to which<br />

the Army actually offers the reinforcer (supply).<br />

For example, consider two applicants—one who values autonomy and expects the Army<br />

will provide it, and a second who values autonomy, but does not expect the Army to provide it. If<br />

we assume the Army does not provide autonomy, it is likely that the second applicant will be<br />

more satisfied with the Army than the first. Although both applicants value autonomy (indicating<br />

a needs-supplies misfit), the fact that the first applicant expects autonomy and does not receive it<br />

may result in greater dissatisfaction for the first applicant. Thus, a hypothesis we plan to test is<br />

that Soldiers will be more dissatisfied and more likely to attrit if they have unmet needs<br />

regarding interests and values that they expected the Army to support.<br />
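To make this hypothesized interaction concrete, the following minimal sketch expresses it for a single reinforcer. The 0-1 scaling, functional form, and all values are illustrative assumptions, not the Select21 scoring model.

```python
# A minimal sketch, not the Select21 scoring model: one way to express the
# hypothesized need x expectation x supply interaction for one reinforcer.
def misfit(need: float, expectation: float, supply: float) -> float:
    """All inputs on an assumed common 0-1 scale for one reinforcer."""
    unmet_need = max(need - supply, 0.0)                # needs-supplies misfit
    unmet_expectation = max(expectation - supply, 0.0)  # expectations-reality misfit
    # Hypothesis from the text: unmet needs hurt most when they were expected
    return unmet_need * (1.0 + unmet_expectation)

# The two autonomy applicants from the example (Army supplies little autonomy)
first = misfit(need=0.9, expectation=0.9, supply=0.2)   # expected autonomy
second = misfit(need=0.9, expectation=0.2, supply=0.2)  # did not expect it
print(first > second)  # True: greater predicted dissatisfaction for the first
```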

GENERAL DEVELOPMENT CONSIDERATIONS<br />

The previous section summarized the two types of measures we developed for Select21 and the theoretical bases for them. In the remaining sections, we elaborate on several issues we considered during the course of their development.

WHAT TO ASSESS?<br />

A key characteristic that differentiates the strategies for developing needs-supplies and<br />

expectations-reality fit measures from the development of traditional KSA-based predictor<br />

measures is the determination of what constructs to assess. As noted earlier, selection measures<br />

are generally designed to assess critical KSAs identified by a job analysis. When developing<br />

needs-supplies and expectations-reality fit measures, however, this approach makes little sense<br />

because of the need to understand the range of an applicant’s needs or expectations, particularly<br />

those needs that the organization or job environment cannot satisfy (which can lead to<br />

dissatisfaction). Thus, the constructs that are critical to assess may vary by applicant (e.g., it<br />


depends on what values or interests the individual applicant finds most desirable) instead of<br />

being a fixed entity (as with critical KSAs).<br />

It is therefore important to work from a broad taxonomy when developing needs-supplies<br />

and expectations-reality fit measures to ensure adequate coverage of the values and interests that<br />

the applicant population might possess. Given the prominence and breadth of the Dawis and<br />

Lofquist (1984) taxonomy of occupational reinforcers and Holland’s RIASEC dimensions, we<br />

adopted these frameworks as a basis for developing the fit measures. We also took steps to<br />

expand the taxonomy of occupational reinforcers in light of our review of recent work on (a)<br />

general work values (e.g., Schwartz, 1994), (b) values of American youth (Sackett & Mavor,<br />

2002), and (c) values of Army recruits (Ramsberger, Wetzel, Sipes, & Tiggle, 1999), to ensure<br />

the resulting set of occupational reinforcers is comprehensive.<br />

RECONCILING DIFFERENT TYPES OF FIT<br />

During the development process, we recognized a distinction should be made between<br />

two types of fit—fit with the Army and fit with one’s military occupational specialty (MOS).<br />

Person-Army fit refers to the correspondence between Soldiers’ KSAs, values, interests, and<br />

expectations and those required/supported by the Army in general. Person-MOS fit refers to the<br />

correspondence between Soldiers’ KSAs, values, interests, and expectations and those that are<br />

required/supported by the specific MOS to which the Soldier is assigned.<br />

We geared the Select21 occupational reinforcer-based measures towards assessing<br />

Person-Army fit. This is because the occupational reinforcers we used reflect Army-wide<br />

conditions, and are not specific to individual MOS. For example, the Army’s supply of the<br />

occupational reinforcer relating to pay is fairly stable across MOS within pay grade. For the<br />

RIASEC-based measures, elements of both Army-wide and MOS-specific fit are considered.<br />

This is because interests are often tied to job tasks and opportunities for training that vary by job.<br />

As such, there likely exists an Army-wide RIASEC profile that taps the common tasks and<br />

training opportunities offered by Army jobs in general, as well as an MOS-specific RIASEC<br />

profile that taps the MOS-specific tasks and training opportunities.<br />

An issue we will confront when validating the RIASEC-based fit measures is how Army-wide and MOS-specific fit will interact to predict the criteria of interest. For example, a Soldier's interests may match the profile of his/her MOS but differ from the Army-wide profile. Empirical

examination of the interaction between person-organization and person-job fit has only recently<br />

begun to appear in the civilian literature (Kristof-Brown, Jansen, & Colbert, 2002). As such,<br />

when conducting validation analyses, care will be taken to explore the unique and joint<br />

contribution of these types of fit.<br />

RESPONSE DISTORTION<br />

Response distortion becomes a prominent issue when attempting to use needs-supplies fit<br />

measures in an operational selection context (Hough, 1998; Rosse, Stecher, Miller, & Levin,<br />



1998). For example, nearly all of the occupational reinforcers discussed in the TWA are socially<br />

desirable. Thus, a Likert-type work values measure is not likely to yield useful information. On<br />

the other hand, not all of the work activities, work contexts, leisure activities, or learning<br />

experiences used in vocational interest inventories are socially desirable. These differences<br />

indicate that the best methods for managing response distortion on these measures may differ<br />

depending on whether one is assessing work values or vocational interests.<br />

Work Values<br />

One promising way to deal with content that is socially desirable is to use a forced-choice<br />

format. For Select21, we constructed a novel forced-choice occupational reinforcer-based needs<br />

measure to assess work values. The purpose of using a forced-choice format was to reduce its<br />

susceptibility to response distortion in the operational selection context (Jackson, Wrobleski, &<br />

Ashton, 2000).<br />

However, the forced-choice format is not without its problems. For example, forced-choice measures result in ipsative or partially ipsative data (Hicks, 1970). Ipsative data indicate whether an individual is higher on one construct than another (e.g., whether a prospective Soldier has a greater need for work autonomy than for a supportive supervisor). However, the selection context requires normative data that compare individuals to each other on a single construct (e.g., that order prospective Soldiers on their need for work autonomy). We are taking several steps to help maximize the ability to normatively scale recruits based on their responses to our forced-choice measure (see Van Iddekinge, Putka, & Sager, 2003). Another potential problem that may arise from using a forced-choice format to assess work values is that one value in a pair may sound more like the Army than the other. In such cases, an applicant desiring to be selected into the Army may endorse that statement regardless of whether he/she values it. We took steps to construct the forced-choice work values measure in ways that make such impression management tactics more difficult (see Van Iddekinge et al., 2003).
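The ipsative property of forced-choice data can be illustrated with a small sketch. The work-value labels and pairings below are invented for illustration; they are not Select21 items.

```python
# A simplified sketch of why forced-choice scoring yields ipsative data.
from collections import Counter
from itertools import combinations

VALUES = ["autonomy", "pay", "coworkers", "supervision"]
PAIRS = list(combinations(VALUES, 2))  # each item pits two values against each other

def score(choices):
    """choices[i] is the value the respondent picked from PAIRS[i]."""
    return Counter(choices)

# One respondent's picks, in PAIRS order
resp = score(["autonomy", "autonomy", "autonomy", "pay", "supervision", "coworkers"])
print(resp.most_common())  # within-person ordering of values
print(sum(resp.values()))  # always len(PAIRS): because totals are fixed, scores
                           # rank constructs within a person (ipsative) rather
                           # than rank persons on one construct (normative)
```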

Interests<br />

Unlike measures of work values, assessing applicants' vocational interests with a forced-choice measure may be less feasible. For example, on many interest measures, items relating to military-type activities (e.g., I like to shoot guns) are included as indicators of realistic interests. Inclusion of such items is problematic in a selection context. That is, an applicant who strongly desires to get into the Army and is willing to distort his/her responses to do so will indicate a strong liking for such items regardless of whether the activities actually interest him/her. Given the specific nature of common interest items, imposing a forced-choice format by pairing these items with other interest items would not likely resolve this type of response distortion.

Another factor that limits the potential benefit of a forced-choice interest measure is the<br />

number of items often used to measure occupational interests. For example, a diverse array of<br />

work activities from different occupations may be required to accurately measure investigative<br />

interests. One drawback of this is that it makes it more difficult to use a forced-choice format<br />


because there are too many indicators of interests to compare, making for an overly long<br />

measure. Thus, the key to managing response distortion on interest inventories may be to balance<br />

item content. For example, this might be achieved by writing items for each RIASEC dimension<br />

that are either (a) all descriptive of the Army or (b) all Army-neutral (i.e., items that do not sound like the Army). Alternatively, one interesting possibility would be to include both types of items on an

interest inventory to examine differences in RIASEC profiles based on items that are intended to<br />

sound like the Army versus items that are Army neutral. If the RIASEC profile based on Army-like<br />

items is higher across all dimensions, it might indicate that individuals are distorting their<br />

responses.<br />
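The following sketch illustrates one way such a comparison might be operationalized. It is an illustrative screen under assumed scales and an arbitrary margin, not a validated faking detector.

```python
# Illustrative check: flag a respondent whose RIASEC profile from Army-like
# items sits above the profile from Army-neutral items on every dimension.
RIASEC = ["R", "I", "A", "S", "E", "C"]

def flag_possible_distortion(army_like, army_neutral, margin=0.5):
    """Both dicts map RIASEC dimension -> mean item endorsement (e.g., 1-5).
    The margin is an arbitrary assumption for illustration."""
    return all(army_like[d] - army_neutral[d] > margin for d in RIASEC)

army_like = dict(zip(RIASEC, [4.8, 4.5, 4.2, 4.6, 4.4, 4.7]))
army_neutral = dict(zip(RIASEC, [3.1, 3.0, 2.8, 3.2, 2.9, 3.0]))
print(flag_possible_distortion(army_like, army_neutral))  # True
```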

SCORING CONSIDERATIONS<br />

In assessing the fit between what the applicant needs (or expects) and what the<br />

organization or job supplies, commensurate measures are typically administered to applicants<br />

and incumbent subject matter experts (SMEs). The role of SMEs is to provide an organization- or job-side profile against which the "fit" of an applicant's profile can be assessed. The degree of fit is typically indexed using profile similarity indices (PSIs; Edwards, 1991). Past research in the TWA literature suggests that the PSI most correlated with work satisfaction is a simple rank-order correlation of profiles (e.g., Rounds, Dawis, & Lofquist, 1987). Nevertheless, such PSIs

have been criticized because of their conceptual problems. Most notably, they mask the<br />

relationship between individual dimensions of fit (e.g., Artistic interests) and the criteria. Past P-<br />

E fit research indicates that individual dimensions of fit may have different relationships with the<br />

same criteria, and that aggregating them into profile similarity indices makes identification of<br />

(and capitalization on) differential relationships problematic (Edwards, 1994).<br />
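As an illustration, a rank-order PSI of this kind can be computed as a Spearman correlation between commensurate applicant and SME profiles. The reinforcer labels and ratings below are invented.

```python
# A minimal sketch of the rank-order PSI discussed above.
from scipy.stats import spearmanr

reinforcers = ["autonomy", "pay", "coworkers", "variety", "security", "advancement"]
applicant_needs = [4.5, 2.0, 3.5, 1.0, 5.0, 2.5]   # applicant importance ratings
army_supplies   = [3.0, 2.5, 4.0, 1.5, 4.5, 2.0]   # SME mean supply ratings

fit, _ = spearmanr(applicant_needs, army_supplies)
print(f"rank-order needs-supplies fit: {fit:.2f}")  # higher = closer profiles
```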

A potential alternative to using a profile similarity index to score fit measures is<br />

polynomial regression (Edwards, 1994). This approach involves modeling the criterion variable<br />

of interest as a function of applicants’ need and a job’s supply of any given dimension of fit (e.g.,<br />

Artistic interests). Unfortunately, this method is most applicable in studies of person-job fit<br />

where participants hold a variety of jobs (thus allowing variation in supply measures on the job side of the equation). However, one can still capitalize on certain aspects of this approach when

assessing P-E fit within a single organization. For example, in Select21 a regression-based<br />

method would allow us to model the criteria of interest as a function of (a) Soldiers’ level of<br />

preference for a reinforcer/interest, (b) an indicator of where Soldiers’ level of preference falls<br />

relative to what the Army provides (e.g., above/below), and (c) hypothesized moderating<br />

variables (e.g., Soldiers’ expectation regarding the given reinforcer/interest). Relying on<br />

aggregate profile similarity indices as indicators of fit would not support this type of modeling,

which in turn could lead to lower criterion-related validity estimates for the fit measures<br />

(Edwards, 1994).<br />
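The following sketch illustrates the regression-based alternative under these assumptions; the simulated data, the fixed Army supply point, and the functional form are for illustration only.

```python
# A hedged sketch of the regression-based alternative within one organization:
# model a criterion from (a) preference level, (b) an above-supply indicator,
# and (c) expectation as a hypothesized moderator.
import numpy as np

rng = np.random.default_rng(0)
n = 500
preference  = rng.uniform(1, 5, n)   # Soldier's preference for a reinforcer
expectation = rng.uniform(1, 5, n)   # expected Army supply of the reinforcer
ARMY_SUPPLY = 3.0                    # assumed fixed within a single organization
above = (preference > ARMY_SUPPLY).astype(float)

# Simulated satisfaction consistent with the expectations-reality hypothesis
satisfaction = (4.0
                - 1.2 * above * (preference - ARMY_SUPPLY)
                - 0.5 * above * (expectation - ARMY_SUPPLY)
                + rng.normal(0.0, 0.5, n))

X = np.column_stack([np.ones(n), preference, above, expectation, above * expectation])
coef, *_ = np.linalg.lstsq(X, satisfaction, rcond=None)
for name, b in zip(["const", "pref", "above", "expect", "above_x_expect"], coef):
    print(f"{name:>14}: {b:+.2f}")
```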

SUMMARY<br />

In this paper we discussed a number of issues to consider when developing measures of<br />

values and interests for use in personnel selection. Although many challenges are present, the<br />



potential benefit of including such measures in selection batteries may be substantial.<br />

Specifically, based on theory, such measures are more likely to predict valued alternative criteria<br />

such as attrition and its attitudinal precursors than traditional KSA-based selection instruments.<br />

Therefore, we feel that making efforts to construct measures of values and interests for use in<br />

personnel selection is a worthy pursuit.<br />

REFERENCES<br />

Brose, G. D. (1999). Could realistic job previews reduce first-tour attrition? Unpublished master's thesis, Naval Postgraduate School, Monterey, CA.

Campbell, J. P. (1990). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 687-732). Palo Alto, CA: Consulting Psychologists Press.

Dawis, R. V. (1991). Vocational interests, values, and preferences. In M.D. Dunnette & L. M.<br />

Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2,<br />

pp. 833-871). Palo Alto: Consulting Psychologists Press.<br />

Dawis, R. V., England, G. W., & Lofquist, L. H. (1964). A theory of work adjustment. Minnesota<br />

Studies in Vocational Rehabilitation, XV. Minneapolis: University of Minnesota.<br />

Dawis, R. V., & Lofquist, L. H. (1984). A psychological theory of work adjustment.<br />

Minneapolis: University of Minnesota Press.<br />

Edwards, J. R. (1991). Person-job fit: A conceptual integration, literature review and methodological critique. International review of industrial and organizational psychology (Vol. 6, pp. 283-357). London: Wiley.

Edwards, J. R. (1994). The study of congruence in organizational behavior research: Critique and<br />

proposed alternative. Organizational Behavior and Human Decision Processes, 58, 51-100.<br />

Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative<br />

measures. Psychological Bulletin, 74, 167-184.<br />

Holland, J. L. (1985). Manual for self-directed search. Odessa, FL: Psychological Assessment<br />

Resources.<br />

Hom, P. W., Griffeth, R. W., Palich, L. E., & Bracker, J. S. (1999). Revisiting met expectations<br />

as a reason why realistic job previews work. Personnel Psychology, 52, 97-112.<br />

Hough, L. M. (1998). Effects of intentional distortion in personality measurement and evaluation<br />

of suggested palliatives. Human Performance, 11, 209-244.<br />


Jackson, D. N., Wrobleski, V. R., & Ashton, M. C. (2000). The impact of faking on employment<br />

tests: Does forced-choice offer a solution? Human Performance, 13, 371-388.<br />

Kristof, A. L. (1996). Person-organization fit: An integrative review of its conceptualizations, measurements, and implications. Personnel Psychology, 49, 1-49.

Kristof-Brown, A.L., Jansen, K.J., & Colbert, A. (2002). A policy-capturing study of the<br />

simultaneous effects of fit with jobs, groups, and organizations. Journal of Applied<br />

Psychology, 87(5), 985-993.<br />

Ramsberger, P. F., Wetzel, E. S., Sipes, D. E., & Tiggle, R. B. (1999). An assessment of the<br />

values of new recruits (FR-WATSD-99-16). Alexandria, VA: Human Resources<br />

Research Organization.<br />

Rosse, J.G., Stecher, M.D., Miller, J.L., & Levin, R. (1998). The impact of response distortion on<br />

pre-employment personality testing and hiring decisions. Journal of Applied Psychology,<br />

83, 634-644.<br />

Rounds, J.B., Dawis, R.V., & Lofquist, L.H. (1987). Measurement of person-environment fit and<br />

prediction of satisfaction in the theory of work adjustment. Journal of Vocational<br />

Behavior, 31, 297-318.<br />

Sackett, P. R., & Mavor, A. (Eds.). (2002). Attitudes, aptitudes, and aspirations of American youth: Implications for military recruitment. Washington, DC: National Academies Press.

Schmitt, N., & Chan, D. (1998). Personnel selection: A theoretical approach. Thousand Oaks,<br />

CA: Sage Publications.<br />

Schwartz, S. H. (1994). Are there universal aspects in the structure and contents of human<br />

values? Journal of Social Issues, 50, 19-45.<br />

Van Iddekinge, C. H., Putka, D. J., & Sager, C. E. (2003, November). Assessing person-environment (P-E) fit with the future Army. In D. J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Paper presented at the 2003 International Military Testing Association Conference, Pensacola, FL.

Vroom, V. (1964). Work and motivation. New York: John Wiley.<br />

Wanous, J. P. (1992). Organizational entry (2nd ed.). Reading, MA: Addison-Wesley.



OCCUPATIONAL SURVEY SUPPORT OF<br />

AIR AND SPACE EXPEDITIONARY FORCE (AEF) REQUIREMENTS<br />

Mr. Robert E. Boerstler, Civilian, Chief, Leadership Development Section<br />

Mr. John L. Kammrath, Civilian, Chief, Occupational Analysis Flight<br />

Air Force Occupational Measurement Squadron, Randolph AFB, TX 78150-4449<br />

robert.boerstler@randolph.af.mil john.kammrath@randolph.af.mil<br />

ABSTRACT<br />

This paper highlights the Air Force Occupational Measurement Squadron's (AFOMS) involvement in the Air Education and Training Command (AETC) initiative to determine 3-skill-level deployment requirements in support of the Air and Space Expeditionary Forces (AEFs). In order to provide current AF leadership with a strategy for conducting initial skills training focused on deployment tasks, AFOMS was challenged to devise a means of surveying members currently deployed and members who had returned from deployments within the past 12 months. Ten sample specialties with known deployment requirements were selected to participate in this survey. Results of this effort provided insight into the importance of targeted task training in initial skills training courses. The potential for AETC to change from garrison-based initial skills training to deployment task training will require a paradigm shift for many of the US Air Force functional communities to "train as they fight" in support of the AEF structure.

INTRODUCTION<br />

AEFs were invented in the 1990s to solve chronic deployment problems. More than anything<br />

else, the Air Force hoped to provide a measure of stability and predictability for its airmen, who<br />

were constantly being dispatched overseas on one short-notice contingency assignment after<br />

another. It was not apparent at the time what a big difference this change was going to make.<br />

The AEFs have become a new way of life for the Air Force.<br />

Airmen are still assigned to their regular units at their home stations. But most likely they also<br />

belong to an AEF, and for 3 months out of every 15, the AEF governs where they will be and<br />

what they will do. About half of the airmen and officers in the active duty force are already in an<br />

AEF, and the number is rising. Guard and Reserve participation is so high that a fourth of the<br />

deployed forces come from the Air Reserve Components.<br />

The Air Force has grouped its power projection forces and the forces that support them into 10<br />

"buckets of capability," each called an AEF. (The other abbreviation, "EAF"--for Expeditionary<br />

Air and Space Force--refers to the concept and organization.)<br />

Secretary of the Air Force James G. Roche told Congress in February that "a nominal AEF has<br />

about 12,600 people supporting 90 multirole combat aircraft, 31 intratheater airlift and air<br />

refueling aircraft, and 13 critical enablers. The enablers provide command, control,<br />


communications, intelligence, surveillance, and reconnaissance, as well as combat search and<br />

rescue."<br />

Increasingly, the Air Force describes itself operationally in terms of AEFs rather than wings or<br />

wing equivalents. A full AEF rotation cycle is 15 months. It is divided into five 3-month<br />

periods, and during each of these, two of the AEFs are vulnerable to deployment.<br />

In August 2002, Chief of Staff of the Air Force (CSAF), General John P. Jumper, issued a Sight<br />

Picture entitled “The Culture of our Air and Space Expeditionary Force and the Value of Air<br />

Force Doctrine.” General Jumper’s comments included: “Concerning what I call “The Culture<br />

of the Air and Space Expeditionary Force,” everyone in the Air Force must understand that the<br />

day-to-day operation of the Air Force is absolutely set to the rhythm of the deploying AEF force<br />

packages. Essential to this cultural change is our universal understanding that the natural state<br />

of our Air Force when we are “doing business” is not home station operations but deployed<br />

operations. The AEF cycle is designed to provide a rhythm for the entire business of our Air<br />

Force, from assignment cycles to training cycles and leave cycles. That process needs to be the<br />

focus of our daily operational business. We must particularly work to change processes within<br />

our own Air Force that reach in and drive requirements not tuned to the deployment rhythm of<br />

the AEF. That means that when the 90-day vulnerability window begins, the people in that<br />

particular AEF force package are trained, packed, administered, and are either deploying or<br />

sitting by the phone expecting to be deployed. There should be no surprises when that phone<br />

does ring, and no reclamas that they are not ready. More important, there should be no reclamas<br />

because someone other than the AEF Center tasked people in the AEF for non-AEF duties.”<br />

Operational commanders at all levels have found it difficult to maintain enough qualified airmen<br />

to meet personnel deployment demands for the Unit Type Code (UTC) requirements that have<br />

been levied on their units.<br />

This problem was elevated to HQ AETC, and the Director of Operations (DO) hosted a<br />

conference with Air Force Career Field Managers (AFCFMs) of a selected sample of Air Force<br />

specialties to determine if apprentice (3-skill-level) airmen could be task-certified to meet some deployment requirements identified for journeyman (5-skill-level) airmen.

In preparation for this conference, HQ AETC/DO requested AFOMS assistance in developing a<br />

survey to determine if 3-skill-level personnel could be used for some AEF UTC requirements.<br />

AFOMS is responsible for conducting occupational analyses for every enlisted career field<br />

within the Air Force and for selected officer utilization fields. AFOMS is an operational<br />

scientific organization that is often in contact with the senior enlisted and officer career field

managers through Utilization and Training Workshops (U&TWs). Occupational surveys<br />

generally provide information in terms of the percentage of members performing jobs or tasks,<br />

the relative percentage of time spent performing tasks, equipment used, task difficulty, training<br />

emphasis, testing importance (for enlisted specialties only), and the skills necessary to perform<br />



tasks. The structure of jobs within an occupation should serve as a guide for refining the Air<br />

Force military classification system or personnel utilization policies for that occupation.<br />

With these capabilities, AFOMS was engaged to assist in providing empirical data that could be<br />

used to identify what tasks are performed in a deployed environment.<br />

METHOD<br />

Survey Development Process<br />

An occupational survey begins with a job inventory (JI) -- a list of all the tasks performed by<br />

members of a given Air Force Specialty Code (AFSC) as part of their actual career field work<br />

(that is, additional duties and the like are not included). We include every function that career<br />

field members perform by working with technical training personnel and operational subject-matter

experts (SMEs) to produce a task list that is complete and understandable to the typical<br />

job incumbent. The SMEs write each task to the same level of specificity across duty areas, and<br />

no task is duplicated in the task list. The JIs used for this project were the most current for each<br />

AFSC as compiled through our ongoing 3-year cyclical survey process.<br />

Survey Administration<br />

This survey was administered to 3-, 5-, and 7-skill-level personnel who are either currently<br />

deployed or have been deployed within the past 12 months in support of contingency operations.<br />

A list of personnel who met these requirements was provided by the Air Force Personnel Center<br />

(AFPC). A web-based survey was developed which included the Job Inventory for each of the<br />

12 AFSCs selected. As individuals responded to certain background questions, they were branched within the survey to their appropriate JI section.
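The branching logic might be sketched as follows; the structure is illustrative only and is not the fielded survey's code.

```python
# A toy sketch of the branching just described: background answers route each
# respondent to the JI for their AFSC and to the rating task for their skill level.
def route_respondent(afsc: str, skill_level: int) -> dict:
    if skill_level in (3, 5):
        task = "mark tasks performed while deployed, then rate preparation (1-9)"
    elif skill_level == 7:
        task = "mark tasks important for subordinates, then rate importance (1-9)"
    else:
        raise ValueError("survey covered 3-, 5-, and 7-skill-level members only")
    return {"ji_section": f"JI-{afsc}", "instructions": task}

print(route_respondent("2A3X1", 5))
```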

All 3- and 5-skill-level personnel were branched to their AFSC's JI and instructed to mark only<br />

those tasks they performed while deployed to support a contingency operation. Once they<br />

completed marking all tasks they performed while deployed, they were presented with only those<br />

tasks marked and asked to rate each task on a scale of 1-9 on how well-trained or prepared they<br />

felt they were to perform the task upon arrival at the deployed location. The following rating<br />

scale was used for this data collection:<br />

1 Extremely low preparation/training<br />

2 Very low preparation/training<br />

3 Low preparation/training<br />

4 Below average preparation/training<br />

5 Average preparation/training<br />

6 Above average preparation/training<br />

7 High preparation/training<br />

8 Very high preparation/training<br />

9 Extremely high preparation/training<br />

All 7-skill-level personnel were also branched to their AFSC's JI and instructed to mark those<br />

tasks they felt were important for personnel they supervised to be able to perform upon arrival at<br />


a deployed location to support a contingency operation. Once they completed marking all tasks<br />

they felt were important to be performed while deployed, they were presented with only those<br />

tasks and asked to rate each task on a scale of 1-9 on how important each task was for 3- and 5-skill-level personnel to perform upon arrival at the deployed location. The following rating scale was

used for this data collection:<br />

1 Extremely low importance<br />

2 Very low importance<br />

3 Low importance<br />

4 Below average importance<br />

5 Average importance<br />

6 Above average importance<br />

7 High importance<br />

8 Very high importance<br />

9 Extremely high importance<br />

The data collected were then formatted and loaded into the Comprehensive Occupational Data<br />

Analysis Programs (CODAP) and sorted to enable analysis and display of the data by various skill-level groups. The JI tasks were matched to the current Specialty Training

Standard (STS) for each AFSC to depict the tasks currently coded in the STS as core tasks.<br />

The definition for a core task varies among CFMs, but for the purpose of this analysis, we<br />

wanted to display the feasibility of defining a core task as “a task that is core to the specialty and<br />

performed in a deployed environment.” By employing this definition, tasks can be easily<br />

identified for initial skills training and certified prior to deployment.<br />
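Applied to survey output, this definition reduces to a simple filter. The task records and the percent-performing threshold below are illustrative.

```python
# A minimal sketch of the proposed rule: a deployment core task is coded core
# in the STS and is actually performed in the deployed environment.
def deployment_core_tasks(tasks, min_pct_performing=30.0):
    """tasks: iterable of dicts with keys 'name', 'sts_core', 'pct_deployed'."""
    return [t["name"] for t in tasks
            if t["sts_core"] and t["pct_deployed"] >= min_pct_performing]

tasks = [
    {"name": "Troubleshoot aircraft wiring",  "sts_core": True,  "pct_deployed": 86},
    {"name": "Close CAMS maintenance events", "sts_core": False, "pct_deployed": 72},
]
print(deployment_core_tasks(tasks))  # ['Troubleshoot aircraft wiring']
```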

Results<br />

The data were presented to the CFMs at the HQ AETC/DO conference and depicted the top tasks performed by each sample AFSC while deployed. A sample of these data tables is presented below, sorted by the Supervisor's Importance rating for each task. The percentage of 3- and 5-skill-level members performing each task is also presented.

This information was very well-received by both the functional and training communities to help<br />

identify deployment requirements. AFOMS is now incorporating these deployment survey<br />

techniques into our cyclical analysis process to provide deployment data for every AFSC.<br />



AFSC 2A3X1 (A-10, F-15, U-2 AVIONIC SYSTEMS)
TASKS WITH HIGHEST SUPERVISORY IMPORTANCE EMPHASIS RATINGS

                                                            PERCENT MEMBERS PERFORMING
                                                     CORE   All 5-Lvl   All 3-Lvl   SUPV IMPORT
TASKS                                                TASK   (N=44)      (N=47)      (N=35)
A0038 Troubleshoot aircraft wiring                    *       86          85          6.51
D0287 Code mode 4 crypto equipment                            50          43          6.26
D0288 Code secure voice crypto equipment                      39          36          6.20
A0031 Repair aircraft wiring                          *       77          70          6.14
F0451 Close CAMS maintenance events                           86          72          6.09
A0037 Trace wiring, system, or interface diagrams     *       80          74          6.09

SI MEAN = 2.07; S.D. = 1.62; HIGH SI = >3.69

AFSC 3C0X1 (COMM/COMPUTER SYSTEMS)
TASKS WITH HIGHEST SUPERVISORY IMPORTANCE EMPHASIS RATINGS

                                                                                   PERCENT MEMBERS PERFORMING
                                                                            CORE   All 5-Lvl   All 3-Lvl   SUPV IMPORT
TASKS                                                                       TASK   (N=120)     (N=38)      (N=172)
A0024 Install computer hardware for end users                                        73          55          6.14
A0009 Assist users in resolving computer software malfunctions or problems           78          79          5.98
A0025 Install standalone and network computer operating systems,
      such as Windows or UNIX                                                        73          61          5.78
A0023 Install application software, such as information protection or
      special systems software                                                       76          61          5.50
B0042 Answer trouble calls from end users dealing with network outages               74          66          5.43
A0012 Configure operating systems, such as UNIX or NT Server                 *       70          55          5.35

SI MEAN = 1.03; S.D. = 1.17; HIGH SI = >2.20

CONCLUSION<br />

The US Air Force is focused on the AEF construct as its normal mode of operation. AFOMS was enlisted to assist in identifying the task requirements through occupational analysis and stands poised to develop a comprehensive support base for future decisions.

REFERENCES<br />

Correll, J. T. (2002, July). The EAF in peace and war. Air Force Magazine, 85(7).

Jumper, J. P. (2002, August). The Culture of our Air and Space Expeditionary Force and the Value of Air Force Doctrine (Chief's Sight Picture).



Occupational Analytics<br />

Paul L. Jones<br />

Navy Manpower Analysis Center<br />

Jill Strange and Holly Osburn<br />

SkillsNET Corporation<br />

The U.S. Navy established the Navy Occupational Task Analysis Program (NOTAP)<br />

in 1976. Initially, the NOTAP used the Comprehensive Occupational Data Analysis Programs (CODAP) as the analytical software to produce occupational standards and to analyze the results from fleet surveys. Rising production costs, Navy personnel downsizing, the increased sophistication of technology, and requirements for more rapid production forced the Navy to identify alternative methods for maintaining its occupational structure. In 1988, the NOTAP replaced CODAP with Raosoft, Inc. software because it enabled the Navy to collect data using diskettes (heretofore, printed booklets had been used). In 2001, the NOTAP replaced Raosoft with the SkillsNET on-line methodology, which permits our occupational classification structure needs to be met from a web-enabled environment.

NOTAP was challenged to rework the classification structure to provide increased<br />

characterizations of the work and to have "the ability to conduct what-if scenarios for a changing littoral operation." Adoption of the SkillsNET methodology provides greater

flexibility in occupational structures, while moving the Navy into a multifaceted<br />

environment where the analytical possibilities are virtually unlimited. This paper focuses<br />

on several analytical vistas available to Navy decision makers and leadership.<br />

Similarity of Jobs<br />

The SkillObject, a cluster of similar tasks, becomes the focus around which the similarity of jobs – their tasks, knowledge, skills, and abilities – is determined. Knowledge of job similarity allows the Navy to realize cost savings and common training delivery, and it eliminates redundancy. Similarity is calculated using scaled skills, scaled abilities, and the percentage of overlap for tool and knowledge categories.
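As an illustration only (SkillsNET's actual formula is not reproduced here), these ingredients might be combined as follows. The equal weighting and the Jaccard-style overlap are our assumptions.

```python
# An illustrative composite of the ingredients named above: scaled skill and
# ability profile similarity plus tool and knowledge category overlap.
import numpy as np

def profile_corr(a, b):
    """Correlation of two equally scaled rating profiles."""
    return float(np.corrcoef(a, b)[0, 1])

def overlap(a, b):
    """Proportion of shared categories between two jobs (Jaccard)."""
    return len(a & b) / len(a | b)

def job_similarity(skills, abilities, tools, knowledge):
    """Each argument is a (job_a, job_b) pair: profiles as lists, categories as sets."""
    return float(np.mean([profile_corr(*skills), profile_corr(*abilities),
                          overlap(*tools), overlap(*knowledge)]))

sim = job_similarity(skills=([4, 2, 5], [4, 3, 5]),
                     abilities=([3, 4, 2], [3, 5, 2]),
                     tools=({"multimeter", "wrench"}, {"multimeter"}),
                     knowledge=({"electronics", "safety"}, {"electronics", "hydraulics"}))
print(f"{sim:.2f}")
```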

Job Transfer<br />

The Navy has a major problem with sea-shore rotation of personnel. It is difficult to place individuals in a shore billet where their sea skill requirements are maintained or strengthened. The reality is that such skills have historically eroded.

Transferability measures enable us to dissect the job at sea and place individuals in shore<br />

billets where a portion of their skills are maintained. This enables us to eliminate<br />

expensive retooling.<br />

Transferability is a function of criticality, consequence of error between the targeted shore job and the sea requirement, and importance.


General Difficulty of External Recruiting for a Job<br />

The objective is to determine the level of difficulty associated with recruiting<br />

individuals. Recruiting is a difficult and expensive proposition, particularly when the

need for highly skilled Sailors is rising. Thus, we have the capability to look at demand<br />

and pay ratios to meet this requirement.<br />

Analytically, we are looking at the skillObjects within a job and the skills associated with the tasks within each skillObject. From these we calculate the Average Proficiency

for the job.<br />

We then calculate the average industry pay using Department of Labor standards and<br />

compare it with Navy pay and benefits.<br />

This metric becomes extremely valuable when recruiting for military-specific<br />

technical jobs.<br />
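A rough sketch of such a metric follows; the multiplicative form and all figures are illustrative assumptions, not the Navy's actual computation.

```python
# Average required proficiency across a job's skillObjects, scaled by the
# ratio of civilian industry pay to Navy pay and benefits.
def average_proficiency(skillobjects):
    """skillobjects: list of per-skillObject lists of required proficiency ratings."""
    ratings = [r for so in skillobjects for r in so]
    return sum(ratings) / len(ratings)

def recruiting_difficulty(skillobjects, industry_pay, navy_pay_and_benefits):
    # Higher required proficiency and a larger civilian pay premium -> harder to recruit
    return average_proficiency(skillobjects) * (industry_pay / navy_pay_and_benefits)

print(round(recruiting_difficulty([[4, 5], [3, 4, 5]], industry_pay=72_000,
                                  navy_pay_and_benefits=58_000), 2))
```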

Number of Sailors Required for the Mission<br />

The objective is to determine the number of people needed to successfully perform the<br />

mission. Requirements determination for the fleet is a major concern with our changing<br />

world and changing missions. Historically, we have relied on industrial engineering<br />

techniques that are time and labor intensive. Our goal is to develop the analytics that will<br />

enable us to meet the requirements determination challenge with equal accuracy using<br />

data collected from our occupational structures.<br />

This analytic, which draws on mission identification software, staffing requirements, criticality, and risk factors, has potential. It is not yet operational, but it shows promise.

OTHER ANALYTICS<br />

We are working on several other initiatives that provide the predictability required in a<br />

changing military environment. Some of these include jobs that require teamwork, the average level of expertise required for a job, outsourcing, and depth of training. With the characterizations available to the Navy through the insertion of the SkillsNET methodology into NOTAP, we have opened numerous vistas for occupational analytics.



Anticipating the Future for First-Tour Soldiers<br />

Tonia S. Heffner and Trueman Tremble<br />

U.S. Army Research Institute for the Behavioral and Social Sciences<br />

Alexandria, VA USA<br />

Roy Campbell and Christopher Sager<br />

Human Resources Research Organization<br />

Alexandria, VA USA<br />

As the U.S. Army moves through transformation, the focus is on the impact of current and future technology on the nature of war and on training preparation. This transformation

will involve development and fielding of Future Combat Systems (FCSs) to achieve full<br />

spectrum dominance through a force that is responsive, deployable, agile, versatile, lethal, and<br />

fully survivable and sustainable under all anticipated future combat conditions. Army leadership<br />

recognizes the critical importance of Soldiers to the effectiveness of transformation. Although<br />

the Army’s Objective Force 2015 White Paper (Department of the Army, 2002) demonstrates<br />

recent thinking regarding the way transformation will impact the individual Soldier and the<br />

personnel system, this work is in its infancy. It is assumed that enlisted soldiers will require<br />

considerably greater personal, social, and technological sophistication, but this assumption has<br />

received limited empirical investigation.<br />

Anticipating the need for solid research on Soldiers in the future, the U.S. Army Research Institute for the Behavioral and Social Sciences (ARI) initiated research to examine the

future first-term Soldier, New Predictors for Selecting and Assigning Future Force Soldiers<br />

(Select21; Sager, Russell, Campbell, & Ford, 2003). Select21 research focuses on the assumption

that future entry-level Soldiers will require different capabilities than today’s soldiers. The<br />

research seeks to understand what those capabilities might be and to determine if the Army’s<br />

procedures for selecting and assigning new Soldiers to future jobs would benefit from personnel<br />

tests that measure the capabilities not currently assessed as part of the Army’s current<br />

accessioning process. The Army’s selection and classification process now relies on<br />

measurement of cognitive capabilities through the Armed Services Vocational Aptitude Battery<br />

(ASVAB). Thus, Select21 basically tests the hypothesis that performance prediction for future entry-level jobs is increased, over ASVAB scores, by the inclusion of measures of knowledges, skills, and other personnel attributes (KSAs) important to the performance demands of future jobs.

Figure 1 shows the overall design of the Select21 project. Most of this research is being conducted with the support of the Human Resources Research Organization (HumRRO). The

presentations by HumRRO researchers provide more detail on individual aspects of the project.<br />

Here, we discuss research challenges and solutions that together shaped the Select21 design.<br />

Research following from that design has produced a clearer vision of the conditions under which<br />

future Soldiers will perform and that the Select21 research needs to recognize.<br />


CHALLENGES AND SOLUTIONS<br />

Operational Utilization<br />

Prospective enlistees of all U.S. military services take the Armed Services Vocational Aptitude Battery (ASVAB) and typically are assessed at the same testing stations. The implications of this multi-service system are (1) the large number of Soldiers who are assessed, (2) the compatibility of the Army's evaluation procedures with the overall system supporting all services, and (3) the cost-effectiveness of introducing new measures. Select21 researchers are attending to the obstacles these implications pose. As the research shows progress, we anticipate

seeking greater cross-service involvement.<br />

[Figure 1. Select21 research design. The figure depicts a timeline from Jan 2002 through Jan 2005 in which an Army-wide job analysis and cluster/MOS-specific job analyses feed the development of predictors and criteria, followed by a field test, concurrent criterion-related validation, and the development of recommendations.]

Future Focus<br />

Select21 is intended to examine personnel decisions, but not for current personnel. Instead, the focus is on selecting and classifying personnel now who will meet the demands of a future for which the groundwork is only now being laid. Indeed, it is anticipated that the Army

will be engaged in a continual cycle of transformation. Moreover, features of the current Army<br />

will remain as innovation takes hold and becomes fully characteristic. To support selection and<br />

assignment for the transformation, we had to perform an analysis of future Soldier jobs. Research<br />

on future jobs challenges the traditional job analysis approaches used to identify personnel<br />

requirements. Traditionally, personnel requirements are determined by interviewing subject<br />

matter experts (SMEs) and job incumbents to identify the work environment, the tasks<br />

performed, and the associated KSAs necessary for successful job performance. The most critical,<br />

yet most challenging, aspect of this job analysis activity is to determine how the future will differ<br />

from current experience. For Select21 research, however, all of these aspects have had to be<br />

projected – neither the environment, the equipment, nor those experienced in working future jobs<br />



now exist. Therefore, we had to derive a unique methodology to depict the future working<br />

environment.<br />

Select21 has had the advantage of recent ARI research on future noncommissioned officer requirements, Maximizing 21st Century NCO Performance (NCO21; Ford, Campbell, Campbell, Knapp, & Walker, 2000; Knapp et al., 2002; Knapp, McCloy, & Heffner, 2003). The

overarching goal of NCO21 was to identify personnel who are best suited for entry into and<br />

progression through the NCO corps, despite future changes. The NCO21 research included a<br />

futuristic job analysis that projected future job performance demands and critical KSAs.<br />

Fortunately, the NCO21 analysis specifically included first-term soldiers. Select21 has been able<br />

to use and to update the NCO21 products based on the recent views of the future captured under<br />

the notion of transformation.<br />

Validating projections of future KSAs is also a formidable challenge because of the<br />

absence of performance measures. NCO21 faced this challenge, with results showing the<br />

difficulty of obtaining scores that differentiate current performance from projections of future<br />

performance. The NCO21 research, thus, emphasized the importance to Select21 of a broad<br />

performance model, multiple measures reflecting the model, and including a focus on jobs that<br />

both exist today and are likely to be characteristic of the future.<br />

Select21 also considers more than job performance. The research takes into account the<br />

total system change of the transformation that includes organizational operations and the overall<br />

organizational lives of Soldiers. In addition to changes in job performance demands, system<br />

change could add to requirements for increased personal and social skills and motivation.<br />

Select21 seeks to provide a database capability for research on KSAs likely to influence<br />

individual fit into the future Army and decisions to remain in the Army, using a person-environment fit model.

As subsequent papers will emphasize, expert judgment has been critical to formulations<br />

of the future and to decisions about job performance dimensions, KSAs, and the measurement<br />

process. Expert panels have ensured that the project tracks with transformation plans. Panels<br />

have also provided direct reviews and judgments about task analysis products. We anticipate that<br />

expert input will help guide recommendations for and actual product utilization.<br />

Selection and Job Classification<br />

The Select21 project capitalizes on the knowledge gleaned from the NCO21 research, but<br />

it seeks to advance findings by investigating the possibility of providing measures useful for<br />

assigning soldiers to jobs, as well as screening them for suitability for organizational entry. The<br />

challenge here is to create KSA measures that are excellent predictors of the criteria and capable of differential prediction across the well over 150 different jobs or military occupational

specialties (MOSs) to which Soldiers are assigned. To deal with this challenge, the research<br />

sought to group MOSs viable for the future into clusters based on a principle of likely job<br />

demand homogeneity.<br />
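One illustrative way to form such clusters (not necessarily the project's actual method) is to hierarchically cluster MOS profiles of rated job demands. The MOS codes and profiles below are invented.

```python
# Cluster MOSs on profiles of rated job-demand/KSA importance so that MOSs
# with homogeneous demands fall into the same cluster.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

mos = ["11B", "31B", "68W", "25B"]      # illustrative MOS codes
profiles = np.array([[5, 2, 3],         # rows: MOS; cols: rated demands
                     [4, 3, 3],
                     [2, 5, 4],
                     [1, 4, 5]], dtype=float)

Z = linkage(profiles, method="average")           # average-linkage clustering
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into two clusters
print(dict(zip(mos, labels)))                     # e.g., {'11B': 1, '31B': 1, ...}
```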


ANTICIPATED FUTURE CONDITIONS<br />

At the initiation of the Select21 research effort (Sager et al., 2003), we reviewed the

anticipated conditions of the future that had been derived for NCO21. The NCO21 conditions<br />

represented themes concerning: self-direction, information processing, computer familiarity,<br />

increased technical skills, increased leadership skills, understanding systems, managing multiple<br />

functions, stamina, and adaptability. It was apparent that the future conditions for NCOs would<br />

need to be adapted because projections and realities had changed since the NCOs were<br />

investigated. Further, the requirements for NCOs are inherently broader than those for first-tour<br />

Soldiers. Once again, current writings and projections were reviewed and integrated with lessons learned from the NCO investigation to generate possible future conditions. We then

interviewed SMEs who reviewed and revised the list of anticipated Army-wide conditions for<br />

first-tour Soldiers (see Table 1).<br />

Table 1. Anticipated Army-wide Conditions for First-tour Soldiers

Learning Environment: Greater requirement for continuous learning and the need to independently maintain/increase proficiency on assigned tasks.

Disciplined Initiative: Less reliance on supervisors and/or peers to perform assigned tasks within their realm of defined responsibilities.

Communication Method and Frequency: Greater need to rely on digitized communications, understanding of the common operational picture, and increased situational awareness.

Individual Pace and Intensity: Greater need for mental and physiological stamina, understanding of personal status, and adaptability.

Self-Management: Greater emphasis on ensuring Soldiers balance and manage their personal matters and well-being.

Survivability: Improved protective systems, transportation, communications, and medical care will result in improvement in personal safety.

Expectation of change was the impetus for Select21, and the themes in Table 1<br />

summarize the types of changes creating future job demands on first-tour Soldiers and the KSAs<br />

important to successful performance of the demands. Based on inspection, they include<br />

requirements to cope successfully with change (Learning Environment and Individual Pace and<br />

Intensity). The themes also point to at least two system changes (Communication Method and<br />

Frequency; Survivability). The theme of “survivability” highlights a condition of work already<br />

existing in the Army but perhaps having new implications. Finally, certain themes (Learning<br />

Environment, Disciplined Initiative, Individual Pace and Intensity, and Self-Management)<br />

spotlight personal attributes that include “learning orientation,” “independence,” “disciplined<br />

self-reliance,” and a combination of physical and psychological states enabling behavioral<br />

“adaptation.” Importantly, the condition of Self-Management recognizes needs for balancing<br />

work with personal matters, to include well-being. A potential conflict in these future conditions<br />



perhaps deserves note. This is the juxtaposition of the rugged and relatively energetic attributes<br />

having to do with a learning orientation, independence, and self-reliance with the attributes that<br />

could promote an emphasis on personal safety and balance among competing life roles.<br />

There is no totally "clean" way to compare the Select21 and NCO21 conditions. Difficulty arises from the development process: Select21 conditions were derived in part from the NCO21 list. The lists were also intended to serve different purposes, with the Select21 and NCO21 lists intended to serve job description/analysis of first-tour Soldiers and NCOs, respectively. Table 2 was nevertheless generated with these difficulties in mind and provides a view of the overlaps in the two sets of conditions.

Table 2. Summary of Comparability of Select21 and NCO21 Conditions for First-Tour Soldiers

Relatively Overlapping Conditions

Select21 Condition                         NCO21 Condition
Learning Environment                       Adaptability
Disciplined Initiative                     Self-Direction
Communication Method and Frequency         Computer Familiarity/Information Processing
Individual Pace and Intensity              Stamina
Self-Management                            (no NCO21 counterpart)
Survivability                              (no NCO21 counterpart)

Relatively Non-Overlapping Conditions (NCO21 only): Increased Technical Skills; Increased Leadership Skills; Understanding Systems; Manage Multiple Functions

Even though the view in Table 2 is open to disagreement, it is suggestive. Without doubt, the non-overlapping conditions from the NCO21 list reflect the purpose of that list as setting conditions for NCO performance. Thus, Understanding Systems and Manage Multiple Functions (NCO21) are exclusive to NCOs. These may reflect a change in our understanding, but more likely reflect the different ranks. Although technologies exist to provide this information to all Soldiers and to help NCOs gain these skills (e.g., Force XXI Battle Command, Brigade-and-Below [FBCB2]), these are considered functions of leading Soldiers.

Also interesting are the other trends suggested by the comparison. That is, the comparison suggests that views of the future associated with Army transformation are somewhat more consolidated than the views during the era of NCO21. This consolidation may be an outcome of the systems planning the Army has undertaken for the transformation. Thus, some conditions have broader but more specified and more integrated implications. A good example is the comparison of Individual Pace and Intensity (Select21) with Stamina (NCO21). Both refer to increased needs for physical and mental superiority. A prime difference is the degree to which the Select21 condition specified the aspects of fitness. Further, the Select21 condition integrated the notions of understanding personal status and adaptability. The NCO21 list had portrayed Stamina and Adaptability as distinct dimensions. Another example is the Select21 condition of Self-Management. Nothing comparable to this concept appears in the NCO21 list. Self-Management probably implies taking control of job activities, but it more explicitly refers to taking care of personal matters and balancing work and personal issues. Thus, Self-Management may reflect a stronger integration of Soldier well-being into its picture of the future. As mentioned earlier, this more recent picture also includes an emphasis on personal safety.

THE WAY FORWARD

Select21 research has used the anticipated future conditions as the context for accomplishing its objectives. These include: (a) identification of job clusters of similar first-tour Soldier MOS, (b) selection of clusters and representative MOS for viability and potential for differential prediction, (c) projection of future conditions for the representative MOS, (d) job analysis of the representative MOS, and (e) construction of predictor measures and criterion measures.

To the extent that the Select21 future conditions represent consolidation of the transformation planning process, they raise optimism about the likely usefulness of Select21 products. The principal products will be predictor measures and recommendations to the Army for the selection and classification of future first-term Soldiers. The research now concentrates on development of the predictor and criterion measures for subsequent use in a test of concurrent validity.
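In schematic terms, the concurrent validity test will ask how strongly scores on the new predictor measures correlate with scores on the criterion measures collected from current Soldiers. A minimal sketch with synthetic data (all numbers and names are illustrative; none come from the project):

    import numpy as np

    # Synthetic data standing in for a concurrent validation sample: the
    # predictor and criterion are collected from current Soldiers at
    # roughly the same time.
    rng = np.random.default_rng(0)
    n = 300
    predictor = rng.normal(size=n)                    # e.g., a new KSA measure
    criterion = 0.4 * predictor + rng.normal(size=n)  # e.g., a criterion composite

    # Concurrent criterion-related validity: the predictor-criterion correlation.
    validity = np.corrcoef(predictor, criterion)[0, 1]
    print(f"estimated concurrent validity: r = {validity:.2f}")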

REFERENCES

Department of the Army (2002). Army Training and Leader Development Panel (NCO). Ft. Leavenworth, KS: Author.

Department of the Army (2002). Objective Force 2015. Arlington, VA: Author.

Ford, L. A., Campbell, R. C., Campbell, J. P., Knapp, D. J., & Walker, C. B. (2000). 21st century Soldiers and noncommissioned officers: Critical predictors of performance (Technical Report 1102). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Halal, W. E., Kull, M. D., & Leffmann, A. (1997, November-December). Emerging technologies: What's ahead for 2001-2030. The Futurist, 20-28.

Knapp, D. J., Burnfield, J. L., Sager, C. E., Waugh, G. W., Campbell, J. P., Reeve, C. L., Campbell, R. C., White, L. A., & Heffner, T. S. (2002). Development of predictor and criterion measures for the NCO21 research program (Technical Report 1128). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., McCloy, R., & Heffner, T. S. (2003). Validation of measures designed to maximize 21st century Army NCO performance (Contractor Report). Alexandria, VA: Human Resources Research Organization.

Sager, C. E., Russell, T. L., Campbell, R. C., & Ford, L. A. (2003). Future Soldiers: Analysis of entry-level performance requirements and their predictors. Alexandria, VA: Human Resources Research Organization.


FUTURE-ORIENTED JOB ANALYSIS FOR FIRST-TOUR SOLDIERS 41

Christopher E. Sager, Ph.D., and Teresa L. Russell, Ph.D.
Human Resources Research Organization
Alexandria, VA, USA
csager@humrro.org

INTRODUCTION

The Select21 project was undertaken to help the U.S. Army ensure that it acquires Soldiers with the knowledges, skills, and attributes (KSAs) needed for performing the types of tasks envisioned in a transformed Army. Army leadership recognizes the importance of its Soldiers to the effectiveness of transformation. In this context, the ultimate objectives of the project are to (a) develop and validate measures of critical KSAs needed for successful execution of the Future Army's missions and (b) propose use of these measures as a foundation for an entry-level selection and classification system adapted to the demands of the 21st century. The purpose of this first stage of the project was to conduct a future-oriented job analysis to support the development and validation effort.

APPROACH

In this section we briefly describe the concepts underlying our approach, the challenges we faced, and the strategies we used to complete the future-oriented job analysis.

Underlying Concepts

The Select21 project focuses on the period of transformation to the Future Army—a transition that is envisioned to take on the order of 30 years to complete (Institute for Land Warfare, 2000). This conceptualization of the transformation implies that the next several years will include elements of (a) the Army in its current state, (b) transitional systems, and (c) combat systems characteristic of the fully transformed Future Army. Our goal is to develop measures of KSAs that will be useful in the not-too-distant future and remain so for many years. Therefore, we decided to focus on the time period during which all of these elements will be present simultaneously. This transformation will affect first-tour Soldier requirements in at least two ways: (a) the types of missions for which Soldiers need to prepare will grow in number and complexity, and (b) the tools and equipment Soldiers will be using to perform these missions are undergoing significant changes (U.S. Army, 2001; 2002).

41 This paper is part of a symposium titled Selecting Soldiers for the Future Force: The Army's Select21 Project, presented at the 2003 International Military Testing Association Conference in Pensacola, FL (D. J. Knapp, Chair). The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.



The primary goal of selection/classification, and of many other human resources interventions, is to positively affect the job performance of individuals/Soldiers. Consistent with this goal, this project began by developing a description of job performance. Here job performance is defined as "…actions or behaviors that are relevant to the organization's goals and that can be scaled (measured) in terms of each individual's proficiency (that is, level of contribution)" (Campbell, McCloy, Oppler, & Sager, 1993, p. 40). Based on these descriptions of job performance, we make inferences about the KSAs Soldiers need to perform the behaviors that make up their performance requirements.

A model of job performance that links future-oriented performance requirements and KSAs aids these inferences. It hypothesizes that job performance is a function of the individual's declarative knowledge (DK), procedural knowledge and skill (PKS), and motivation (M; Campbell et al., 1993). Different aspects of job performance can have different DKs and PKSs associated with them. The individual's abilities, personality/temperament, interests, education, training, and experience are antecedents to DK, PKS, and M. In terms of the Select21 project, this means that performance on a given performance requirement is a function of DK, PKS, and M, where the KSAs are the antecedents.
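A minimal sketch of this chain of inference in code form; the multiplicative combination and the specific weights are illustrative placeholders (one common reading of the model), not functional forms specified by this paper or by Campbell et al. (1993):

    # Performance on a requirement as a function of DK, PKS, and M.
    def performance(dk: float, pks: float, m: float) -> float:
        # Simple reading: no declarative knowledge, no procedural
        # knowledge/skill, or no motivation yields no performance.
        return dk * pks * m

    # KSAs are antecedents of DK, PKS, and M; attribute names are drawn
    # from the Select21 KSA list, but the weights are placeholders.
    def determinants(ksa: dict) -> tuple:
        dk = 0.7 * ksa["general_cognitive_aptitude"] + 0.3 * ksa["education"]
        pks = 0.6 * ksa["basic_computer_skill"] + 0.4 * ksa["training"]
        m = 0.5 * ksa["dependability"] + 0.5 * ksa["achievement_motivation"]
        return dk, pks, m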

Consistent with this model, our job analysis approach was driven by future-oriented performance requirements. We defined the performance requirements and then identified a master list of future KSAs—including salient individual-difference attributes identified in prior research. In turn, we linked the two—identifying the KSAs likely to predict various performance requirements.

Challenges and Strategies

The goal of a future-oriented job analysis is to take the broad, dynamic plans for future directions, identify trends in Army jobs over time, and describe future jobs at a level that is specific enough to guide predictor and criterion development. This goal created several challenges.

Army-Wide Performance Requirements

Select21's measure development needs and the level of detail at which the Future Army is currently being described led us to describe Army-wide future performance requirements via three products. We refer to the first as the Select21 Army-Wide Performance Dimensions for First-Tour Soldiers. 42 These dimensions are 19 general components of performance that are expected to be critical to the future. They are conceptually consistent with the job performance dimensions developed for related past Army projects that serve as building blocks for this effort (e.g., Project A [Campbell & Knapp, 2001] and NCO21 [Ford, R. C. Campbell, J. P. Campbell, Knapp, & Walker, 2000]). These dimensions are future-oriented and supported our needs for developing important criterion measures (e.g., job performance ratings and a situational judgment test [SJT]).

To support development of criteria that need more specific job analysis information (e.g., multiple-choice tests), we also developed the Select21 Army-Wide Common Tasks for First-Tour Soldiers: 59 individual technical tasks, conceptually consistent with the Army's current list of common tasks that, according to Army doctrine, all first-tour Soldiers should be able to perform (U.S. Department of the Army, April 2003).

42 Here we define "first-tour" Soldiers as those who have 18 to 36 months of time in service.

The performance dimensions (a) provide a description of the critical dimensions of performance in the Future Army, (b) are helpful for developing some criteria, and (c) assist in identifying relevant KSAs. The Select21 common tasks have provided enough technical detail to facilitate the development of future-oriented multiple-choice performance questions and to inform thinking about hands-on tests and simulations. The remaining challenge was that the future does not look considerably different from the present at the level of the performance dimensions; the same is true for tasks, given the level of stability and detail that forecasts of the future can currently achieve. To support the development of expected future performance measures that are as future-oriented as possible, we developed a third Army-wide job analysis product that we refer to as the Anticipated Army-Wide Conditions in the 21st Century for First-Tour Soldiers. These anticipated conditions focus on how the future Army will place new and different requirements on first-tour Soldiers.

Cluster/MOS-Specific Performance Requirements

The primary reason for collecting Military Occupational Specialty-specific (MOS-specific) job analysis information is to show how performance requirements differ across MOS and to guide the identification of pre-enlistment KSAs that differ in relevance across MOS. Such a discovery would in turn facilitate the development of predictor measures that could improve the classification efficiency of the current system. However, we were concerned that the transformation to the Future Army would result in changes to the content of current MOS and to the MOS structure as a whole. Based on this premise, we decided on a somewhat more general unit of analysis (i.e., job clusters) that we believed would be more stable into the future than individual MOS. After identifying 16 future job clusters, we selected two clusters to focus on for the cluster-specific portion of our job analysis and measure development efforts. Following this logic, we attempted to collect job analysis information at the cluster level. Because the portion of a particular MOS that is not Army-wide primarily focuses on technical tasks, we aimed our initial efforts at cluster-specific tasks. Finally, for data collection and sampling purposes, we identified three current MOS to represent each cluster.

At first, we tried to develop task lists that (a) applied to all three target MOS in each target cluster, (b) were sufficiently detailed to support the development of measures of current job performance (e.g., multiple-choice tests, hands-on tests, and ratings), and (c) were future-oriented enough to support the development of measures of expected future performance. This approach did not work. We found that cluster-level task descriptions were simply too confusing for SMEs who are entrenched in a specific MOS. Additionally, cluster-level tasks were, by necessity, broader than MOS-specific ones—making cluster-level tasks less useful for development of criterion measures. While we retained the clusters for sampling jobs for inclusion in this effort and for summarizing results across MOS, we created MOS-specific task lists to support cluster/MOS-specific measure development. Similar to the Army-wide analysis, we also developed cluster/MOS-specific anticipated future conditions.



Knowledges, Skills, and Attributes

We started by developing a list of Army-wide KSAs with two important characteristics. The first is a pre-enlistment focus. Since the goal of this project is to develop measures that can be used for selection and classification, we determined that the KSA list should focus on characteristics that Soldiers are likely to have before enlistment (i.e., before they are trained on "common tasks" or on tasks specific to their MOS). The second characteristic is comprehensiveness. The list includes 48 KSAs offering complete coverage of the measurable and potentially relevant individual difference constructs across a number of domains (i.e., cognitive, personality/temperament, physical, psychomotor, and sensory). Accordingly, this single list was used in both the Army-wide and the cluster/MOS-specific job analyses.

Method

This analysis included a literature review in three areas—future Army literature, research literature on jobs (particularly Army MOS), and literature on human attributes. The future Army literature provided information that allowed us to make initial inferences about (a) Future Army missions; (b) the functions and roles that Soldiers will play in those missions and the KSAs those Soldiers will need; (c) new technology such as weaponry, vehicles, and communication devices, and the effect of technological change on personnel requirements; and (d) likely changes in the force structure in the future (e.g., Unit of Action Maneuver Battle Lab, 2002; U.S. Army, 2001; 2002; U.S. Army Training and Doctrine Command, 2002). Research on jobs provided information about task taxonomies, KSAs, and tasks generated in other Army projects. Literature on human attributes told us what KSAs have been identified and measured reliably in the major domains of human performance—including the cognitive, personality, psychomotor, physical, skill, and interest domains (e.g., Fleishman, Costanza, & Marshall-Mies, 1999; Ford et al., 2000; Campbell & Knapp, 2001).

In addition to reviewing the relevant literature, we relied heavily on meetings, briefings, and workshops with subject matter experts (SMEs). Their contributions included (a) providing feedback on the quality and practicality of research plans, (b) revising performance requirements and KSAs, (c) evaluating the importance of performance requirements and KSAs, and (d) developing anticipated future conditions. A number of the SMEs participating in this project were organized into three panels—a Scientific Review Panel (SRP), an Army Steering Committee (ASC), and an Army Subject Matter Expert Panel (SMEP). The SRP is composed of scientists knowledgeable in the areas addressed by this research. The ASC is a policy advisory group that includes senior representatives from a number of Army organizations concerned with transformation. The SMEP is composed of personnel who are expert in particular MOS or in specific Future Army planning activities. Finally, other Soldiers and Non-commissioned Officers (NCOs) participated in additional workshops and data collections.


RESULTS

As discussed above, the Army-wide future performance requirements were described in three ways: (a) performance dimensions, (b) common tasks, and (c) anticipated future conditions. Fig. 1 shows the 19 Army-wide performance dimensions. The common tasks are not shown here due to space considerations. A detailed description of the anticipated Army-wide conditions is provided by Heffner, Tremble, Campbell, and Sager (2003).

Performs Common Tasks
Solves Problems and Makes Decisions
Exhibits Safety Consciousness
Adapts to Changing Situations
Communicates in Writing
Communicates Orally
Uses Computers
Manages Information
Exhibits Cultural Tolerance
Exhibits Effort and Initiative on the Job
Demonstrates Teamwork
Follows Instructions and Rules
Exhibits Integrity and Discipline on the Job
Demonstrates Physical Fitness
Demonstrates Military Presence
Relates to and Supports Peers
Exhibits Selfless Service Orientation
Exhibits Self-Management
Exhibits Self-Directed Learning

Figure 1. Select21 Army-Wide Performance Dimensions for First-Tour Soldiers.

To identify target clusters for the cluster/MOS-specific portion of this project, we first needed a useful way of organizing all entry-level Army jobs into a smaller group of clusters. The final list of 16 clusters included the full domain of likely future entry-level Army jobs. The target clusters identified for focused study, and their representative MOS, were:

• Close Combat
− 11B Infantryman
− 19D Cavalry Scout
− 19K M1 Armor Crewman
• Surveillance, Intelligence, and Communications (SINC)
− 31U Signal Support Systems Specialist
− 74B Information Systems Operator/Analyst
− 96B Intelligence Analyst

MOS-specific tasks organized into task categories were developed for each of these six MOS. For example, the Infantryman list includes tasks in 11 task categories. One of these categories is Performs Tactical Operations. This category contains tasks like (a) Move as a member of a fire team, (b) Select hasty firing positions during an urban operation, and (c) Sustain and camouflage fighting positions. In addition to these tasks, we developed descriptions of anticipated future conditions applying to each target MOS and cluster. Fig. 2 shows excerpts from the anticipated future conditions applicable to Infantryman.

• For the foreseeable future, infantry will continue to operate as mechanized infantry and light infantry. They will be delivered to the battle area by air, helicopter, ground vehicles, and walking.
• All infantry will see improvements in communication and location capability when in dismounted mode. This will include a GPS-integrated navigation system.
• Individual weapon (e.g., rifle) improvements will include thermal sights, daylight sights, close combat optics, lasers, and weapon systems connected to a digital reporting/recording network.
• Infantrymen will experience better individual protection through (a) integrated combat identification systems, (b) full-time chemical/biological clothing, (c) interceptor body armor, and (d) laser eye protection.
• Long term, possibilities include target detection and engagement without exposure (i.e., individual non-line-of-sight fire).
• Full C4I capability and situational awareness (SA) interconnectivity is dependent on the future development of lightweight, multiday power sources (i.e., batteries) that are rechargeable and logistically supportable.
• Changes in infantry technology will occur incrementally. Overall there will be no major mid-term changes to infantry organizations, formations, employment, or tactics.

Figure 2. Selected Anticipated Conditions in the 21st Century Relevant to First-Tour Infantrymen.

The Select21 list of pre-enlistment KSAs is presented in Fig. 3. Direct ratings of importance and linkages of KSAs to Army-wide performance dimensions and cluster/MOS-specific task categories provided information about the relative importance of KSAs to job performance. Important Army-wide KSAs included General Cognitive Aptitude, Dependability, Oral and Nonverbal Comprehension, and Emotional Stability. When comparing the target MOS across the two clusters, Close Combat favored psychomotor and physical attributes and Team Orientation, while SINC favored Basic Computer Skill and Reading Skill and Comprehension.
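To illustrate how direct importance ratings of this kind can be summarized, the sketch below uses made-up SME ratings on an arbitrary 1-5 metric; the KSA names appear in Fig. 3, but the numbers and the aggregation rule are ours, not the project's:

    import numpy as np

    # Hypothetical SME importance ratings: rows = SMEs, columns = KSAs.
    ksas = ["General Cognitive Aptitude", "Dependability",
            "Oral and Nonverbal Comprehension", "Emotional Stability"]
    ratings = np.array([[5, 4, 4, 3],
                        [4, 5, 3, 4],
                        [5, 4, 4, 4]])

    # Average across SMEs and list KSAs from most to least important.
    mean_importance = ratings.mean(axis=0)
    for i in np.argsort(mean_importance)[::-1]:
        print(f"{ksas[i]}: {mean_importance[i]:.2f}")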

SUMMARY

This analysis generated performance requirements describing the nature of work for entry-level Soldiers during the transformation. These requirements guided the identification and prioritization of KSAs that are being used to develop new predictor measures that could be useful for recruit selection and MOS assignment. These requirements are also being used to develop job performance measures that will serve as criteria for evaluating the predictors in an eventual concurrent criterion-related validation effort.


Cognitive Attributes: Oral Communication Skill; Oral and Nonverbal Comprehension; Written Communication Skill; Reading Skill/Comprehension; Basic Math Facility; General Cognitive Aptitude; Spatial Relations Aptitude; Vigilance; Working Memory; Pattern Recognition; Selective Attention; Perceptual Speed and Accuracy

Temperament Attributes: Team Orientation; Agreeableness; Cultural Tolerance; Social Perceptiveness; Achievement Motivation; Self-Reliance; Affiliation; Potency; Dependability; Locus of Control; Intellectance; Emotional Stability

Physical Attributes: Static Strength; Explosive Strength; Dynamic Strength; Trunk Strength; Stamina; Extent Flexibility; Dynamic Flexibility; Gross Body Coordination; Gross Body Equilibrium

Sensory Attributes: Visual Ability; Auditory Ability

Psychomotor Attributes: Multilimb Coordination; Rate Control; Control Precision; Manual Dexterity; Arm-Hand Steadiness; Wrist, Finger Speed; Hand-Eye Coordination

Procedural Knowledge and Skill: Basic Computer Skill; Basic Electronics Knowledge; Basic Mechanical Knowledge; Self-Management Skill; Self-Directed Learning and Development Skill; Sound Judgment

Figure 3. Select21 Knowledge, Skills, and Attributes.



REFERENCES

Campbell, J. P., & Knapp, D. J. (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates.

Campbell, J. P., McCloy, R. A., Oppler, S. H., & Sager, C. E. (1993). A theory of performance. In N. Schmitt & W. C. Borman (Eds.), Personnel selection in organizations (pp. 35-70). San Francisco: Jossey-Bass.

Fleishman, E. A., Costanza, D. P., & Marshall-Mies, J. (1999). Abilities. In N. G. Peterson, M. D. Mumford, W. C. Borman, P. R. Jeanneret, & E. A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET (pp. 175-195). Washington, DC: American Psychological Association.

Ford, L. A., Campbell, R. C., Campbell, J. P., Knapp, D. J., & Walker, C. B. (2000). 21st century soldiers and noncommissioned officers: Critical predictors of performance (Technical Report 1102). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Heffner, T. S., Tremble, T., Campbell, R. C., & Sager, C. E. (2003, November). Anticipating the future for first-tour soldiers. In D. J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium presented at the 45th Annual Conference of the International Military Testing Association, Pensacola, FL.

Institute for Land Warfare. (2000, October 17). Soldiers on point for the nation, Army transformation. Briefing presented to the Army Transformation Panel at the AUSA Annual Meeting, Washington, DC.

Unit of Action Maneuver Battle Lab. (2002). Operational requirements document for the future combat systems. Ft. Knox, KY: Author.

U.S. Army. (2001). Concepts for the Objective Force, United States Army white paper. Online at: http://www.army.mil/features/WhitePaper/default.htm.

U.S. Army. (2002, December). Objective Force in 2015 white paper. Arlington, VA: Department of the Army, Objective Force Task Force.

U.S. Army Training and Doctrine Command. (2002, July). The United States Army Objective Force: Operational and organizational plan for maneuver unit of action (Pamphlet 525-3-90/O&O). Fort Monroe, VA: Author.

U.S. Department of the Army. (2003, April). STP 21-1-SMCT, soldier's manual of common tasks, skill level 1. Washington, DC: Author.


PERFORMANCE CRITERIA FOR THE SELECT21 PROJECT 43

Patricia A. Keenan, Ph.D., David A. Katkowski, Maggie M. Collins, Karen O. Moriarty, and Lori B. Schantz
Human Resources Research Organization
Alexandria, VA, USA
pkeenan@humrro.org

INTRODUCTION

The U.S. Army is undertaking fundamental changes to transform into the Future Force—a transition envisioned to take approximately 30 years to complete. The time frame of interest extends to approximately 2025. The overall goal of the Select21 project is to ensure the Army selects and classifies Soldiers with the knowledge, skills, and attributes (KSAs) needed for performing the types of tasks envisioned in a transformed Army.

One of the central tasks of the Select21 research program is to develop performance measures to support criterion-related validation efforts. The development effort comprises several criterion measures, which reflect both "can-do" and "will-do" constructs. Can-do measures include performance-oriented job knowledge tests; will-do measures include observed current performance ratings, expected future performance ratings, and archival/self-report information. Our approach to solving the future-orientation problem is to develop criterion measures that reflect existing job performance on tasks that will remain virtually the same in the future, as well as measures that simulate future conditions under which the job would be performed.

This paper describes development of (a) supervisor and peer ratings for current and expected future performance, (b) a job knowledge test that has both Army-wide and Military Occupational Specialty-specific (MOS-specific) components, and (c) a self-report measure that includes archival information (e.g., evaluations, awards, education). Other criterion measures developed for Select21 include a criterion situational judgment test (Waugh & Russell, 2003) and a measure of person-environment fit (Van Iddekinge, Putka, & Sager, 2003).

Instrument Development

The rating scales and job knowledge tests have been developed following the same general procedures. HumRRO project staff used available information to draft materials to be reviewed and refined by subject matter experts (SMEs). These SMEs were most often instructors in Advanced Individual Training (AIT) or One Station Unit Training (OSUT) who teach technical material to new Soldiers to prepare them for their first duty position. We asked the SMEs to think of Soldiers they supervised, use the rating scales to assess their performance, and then give us feedback on how they used the scales and any problems they had with them. NCOs also helped to write test items and review items written by HumRRO project staff. Students in AIT/OSUT also completed the job knowledge items and rating scales. We had planned to conduct a pilot test for both measures; however, most of the intended participants became unavailable due to the deployments associated with Operation Iraqi Freedom. The field test scheduled for late spring or summer of 2004 will provide the first large-scale chance to administer the measures to intended users.

43 In D. J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL. The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.

PERFORMANCE RATING SCALES

We are developing two types of rating scales—Observed Performance Scales (OPS) and Future Expected (FX) scales. The OPS are ratings from target Soldiers' supervisors and peers on the Soldier's current performance. We are developing versions of these scales for the Army-wide and six target MOS samples. The FX scales will ask raters to assess the Soldier's expected effectiveness under conditions we anticipate will exist in the future (Sager & Russell, 2003). The anticipated future conditions developed during the job analysis include both cluster- and MOS-level information. We will incorporate both levels of information in the scales for the Army-wide and cluster samples.

Observed Performance Rating Scales

The Observed Performance Scales (OPS; both Army-wide and MOS-specific) were developed with input from SMEs (primarily training instructors) who took part in a series of workshops in the first half of 2003. In the validation data collections, many raters will be asked to rate more than one Soldier using as many as four different scales, which could be quite a burden for them. In an effort to reduce the load on raters, one of the first steps we took was to review the performance requirements to see if they could be combined to reduce the number of scales in each rating instrument. We combined the 19 Army-wide performance dimensions into 11 scales. For example, we combined the performance dimensions "Follows Instructions and Rules," "Exhibits Integrity and Discipline on the Job," and "Exhibits a Selfless Service Orientation" into a rating dimension named "Demonstrates Professionalism and Personal Discipline on the Job." We also included an "Overall Effectiveness" scale. The organization of the Army-wide and MOS-specific scales is shown in Fig. 1. We followed the same procedure for the MOS-specific scales. For example, the 11B performance requirements included several tasks related to maintaining personal and antitank weapons and grenades/rocket launchers. Subject matter experts approved combining these into one scale. Our goal was to reduce the number of scales while still differentiating between performance areas.

OPS Format

Much of our development work with the rating scales has focused on improving rater accuracy, including the extent to which the resulting ratings differentiate an individual Soldier's strengths and weaknesses and differentiate between Soldiers. We have a good deal of experience in training raters to avoid typical rating problems, including evaluation errors (e.g., stereotyping), response tendency errors (e.g., halo, leniency), and comparing Soldiers to one another. Our goal is to develop scales and rater training that will encourage raters to use the scales as a standard against which to measure performance. This is a continuing challenge.
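As one statistical complement to rater training, response tendency errors can also be screened for after the fact. The sketch below is a generic illustration with synthetic data and arbitrary cutoffs, not a procedure described by the project: a rater whose mean rating sits far above the scale midpoint may be lenient, and a rater who barely differentiates dimensions within a ratee may be showing halo.

    import numpy as np

    # ratings[rater, ratee, dimension] on a 7-point metric (synthetic data).
    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 8, size=(10, 5, 11)).astype(float)

    rater_means = ratings.mean(axis=(1, 2))             # leniency/severity signal
    within_ratee_sd = ratings.std(axis=2).mean(axis=1)  # halo signal

    for r, (mean, sd) in enumerate(zip(rater_means, within_ratee_sd)):
        flags = []
        if mean > 4.0 + 1.5:   # 4 = scale midpoint; cutoff is illustrative
            flags.append("possible leniency")
        if sd < 0.5:           # little differentiation across dimensions
            flags.append("possible halo")
        if flags:
            print(f"rater {r}: {', '.join(flags)}")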

Common Task Performance
  Performs Army-wide Common Tasks
  Exhibits Safety Consciousness
MOS-Specific Task Performance
  Performs MOS-Specific Technical Tasks
  Uses Computer
Communication Performance
  Communicates in Writing
  Communicates Orally
Information Management Performance
  Manages Information
Problem Solving and Decision Making Performance
  Solves Problems and Makes Decisions
Exhibits Tolerance
  Exhibits Cultural Tolerance
Supports Peers
  Relates to and Supports Peers
  Demonstrates Teamwork
Adaptation to Changes in Missions/Locations, Assignments, Situations
  Adapts to Changing Situations
Exhibits Level of Effort and Initiative on the Job
  Exhibits Effort and Initiative on the Job
Demonstrates Professionalism and Personal Discipline on the Job
  Adheres to Regulations, Policies, and Procedures
  Exhibits Integrity and Discipline on the Job
  Exhibits a Selfless Service Orientation
Personal and Professional Development
  Exhibits Self-Management
  Exhibits Self-Directed Learning
Demonstrates Physical Fitness
  Demonstrates Physical Fitness
Demonstrates Military Presence

Figure 1. Organization of performance dimensions for Army-wide observed performance scales.

In their final versions, each of the OPS has four sections: (a) the title of the dimension, (b) three summary performance paragraphs, (c) behavioral examples, and (d) a 7-point rating scale. The summary paragraphs (anchors) provide a snapshot description of Soldiers' behavior representing three levels of performance: Exceeds Expectations, Meets Expectations, and Fails to Meet Expectations. The behavioral examples are designed to provide additional pieces of information about Soldiers' behavior at the various levels of effective performance to improve rater accuracy. Fig. 2 provides an example of a rating scale. The 7-point rating scale allows raters to differentiate between rating levels to provide ratings that are an accurate reflection of performance.

The scales looked rather different at the beginning of the project. Keeping in mind our goal to have raters rely on the scales as the measurement standard, we revised the format of the scales based on the results of practice rating exercises done with SMEs. The original scales contained a title, a lead-in question about how effective the Soldier is at performing in that dimension, and designations of High, Moderate, or Low performance for the three columns. We discovered from discussion with the SMEs that some of them just read the question and decided on a rating, while others used the High-Low designations along with the question in deciding on a rating. So, to try to force raters to read the scale and use it as the measure of performance, we eliminated the question and the High-Low designations.

LEVEL OF EFFORT AND INITIATIVE ON THE JOB

Fails to Meet Expectations:
  Shows little effort or initiative in accomplishing even simple tasks
  − Frequently fails to meet deadlines
  − Refuses or ignores opportunities to take additional responsibilities

Meets Expectations:
  Demonstrates sufficient effort in accomplishing most tasks; puts forth extra effort when necessary
  − Is usually reliable about completing assignments on time
  − Accepts additional responsibilities; may occasionally seek out challenging assignments

Exceeds Expectations:
  Consistently demonstrates initiative and often puts forth extra effort to accomplish tasks effectively, even under difficult conditions
  − Almost always completes assignments on time
  − Seeks out and enthusiastically takes on challenging assignments and additional responsibilities

1 2 3 4 5 6 7

Figure 2. Example of a Select21 observed performance rating scale.

We also tried another exercise to encourage raters to think about the relative strengths and weaknesses of the Soldiers they rated. We asked raters in our early site visits to rank Soldiers on the Army-wide dimensions prior to rating them. We provided them with a set of cards on which the rating scale dimensions and anchors were printed, and asked them to sort the cards in the order that reflected the performance level of the Soldiers they were rating. After sorting the cards, we instructed the SMEs to record their rankings on a separate sheet and then to complete the Army-wide OPS. In the field test, we plan to simplify the task by asking raters to sort the cards into three piles (Exceeds Expectations, Meets Expectations, and Fails to Meet Expectations). They will then record the categorization and make their ratings. We think this will accomplish the same result as ranking performance on all 12 dimensions and reduce the frustration caused by "ties" in rankings. We will re-examine this issue after another review to determine whether the added accuracy is useful and makes the task worth the projected resource costs. If this exercise helps raters to differentiate their ratings, we will use it for the Army-wide scales with the expectation that the lesson learned will carry over to the other scales.

Future Expected (FX) Performance Scales

The FX scales ask raters to predict how well the ratee might be expected to perform under conditions we believe will exist for the Future Army. Both the Army-wide and cluster-specific FX scales will be based on anticipated future conditions generated in the job analysis phase of the project (Sager, Russell, Campbell, & Ford, 2003). The Army-wide future conditions to be rated are:

• Learning Environment
• Disciplined Initiative
• Communication Method & Frequency
• Individual Pace & Intensity

The Army-wide FX scales will incorporate descriptions of the anticipated future conditions listed above. Raters will read each description and rate how effectively they think the Soldier would perform under that condition. They will also provide a rating of how confident they are in those ratings. The cluster-specific scales will be based on scenarios much like those generated for the SJT-X in the NCO21 project (Knapp et al., 2002). These scenarios are more detailed and are specific to the job cluster. Again, raters will indicate how well they think the Soldiers would perform in that scenario. We anticipate developing 5-6 scenarios for the Close Combat cluster, where the anticipated future conditions are much the same for the three MOS, and 10-11 scenarios for the Surveillance, Intelligence, and Communications cluster, where there is less overlap among the MOS. Some of these scenarios may be applicable to only one or two MOS in the cluster. This process will not be as good as addressing independent themes or constructs, but it will do well at sampling the relevant content for the MOS.

We will collect separate ratings for the Army-wide and cluster-specific FX scales and conduct statistical analyses to determine whether there is dimensionality in the ratings. However, because (a) each scenario is likely to involve multiple "dimensions" of performance and (b) a single dimension of performance is likely to be relevant to more than one scenario, we believe that dimensionality is very unlikely here. Therefore, for both Army-wide and cluster FX ratings, we plan to aggregate the ratings into an overall Army-wide rating and an overall cluster rating, respectively.
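A minimal sketch of both steps with synthetic data; an eigenvalue screen on the inter-scenario correlation matrix is one common way to check dimensionality, though the paper does not specify the analysis it will use:

    import numpy as np

    # scenario_ratings[soldier, scenario]: cluster FX ratings (synthetic).
    rng = np.random.default_rng(1)
    scenario_ratings = rng.integers(1, 8, size=(200, 6)).astype(float)

    # One dominant eigenvalue of the inter-scenario correlation matrix
    # would suggest the ratings behave as a single overall dimension.
    corr = np.corrcoef(scenario_ratings, rowvar=False)
    print("eigenvalues:", np.round(np.linalg.eigvalsh(corr)[::-1], 2))

    # Aggregate to one overall cluster FX score per Soldier, as planned above.
    overall_cluster_fx = scenario_ratings.mean(axis=1)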

Rater Training/Process

Effective rater training is key to getting raters to use the rating scales as intended. We have considerable experience in developing rater training that has focused on evaluation errors (e.g., first impression, stereotyping) and response tendency errors (e.g., halo effect, central tendency). This experience has shown that reducing or eliminating rating error is quite difficult.

Our goal with Select21 rater training is to focus raters more clearly on reading and using the scales accurately. For all raters, training will emphasize the importance of making accurate ratings and thinking about a Soldier's relative strengths and weaknesses. To this end, we will stress the importance of accurate performance measures to the overall success of the project. In past work we have found that stressing the fact that the ratings are "for research purposes only" helps to overcome problems, such as leniency, that are common in operational performance ratings. We will focus on the importance of reading the anchors, thinking about a Soldier's relative strengths and weaknesses, and applying that insight to the ratings. The ranking exercise described previously should help with this focus, as will the format of the rating scales. We will also address response tendency and evaluation errors. In addition, while raters are working, we will have facilitators move about the room to keep an eye out for raters who seem to be falling prey to these visible errors.

We expect that we will collect ratings from most of the raters in a face-to-face setting. We also expect that a fairly large proportion of raters will not be available during the data collection period. Identifying raters and collecting their ratings has been a challenge in past efforts such as this, and we expect that we will encounter the same situation in this project. It is fairly easy to identify first-level supervisors. We will have the names of each Soldier's direct supervisor, and they will be asked to attend the data collection, so getting ratings from them should be fairly straightforward. However, we have found that it is beneficial to collect ratings from multiple raters. For Select21, we would like to identify at least two supervisors and several peers to rate a Soldier. The second supervisor may be an NCO or might be another Soldier in the target Soldier's unit who has seniority over the target Soldier. When Soldiers come in for testing, we will ask them to identify a second supervisor and several peers who could provide ratings for them. If Soldiers come in a group, as we expect will happen with the higher-density MOS such as infantry, we can identify peer raters from within that group. However, for the lower-density MOS, we expect that Soldiers will identify peers who can rate them. We will also ask supervisors to provide the names of potential raters for their Soldiers. We will leave rating materials for those raters to complete after the on-site data collection is over.

Collecting "Distance" Ratings

Collecting ratings from absentee raters is a very different problem than that encountered in a face-to-face session. The problem is two-fold: (a) persuading them to make ratings at all and (b) getting them to make accurate ratings. First, whether or not to complete the ratings is highly discretionary for them. Although we have been advised that getting buy-in from their supervisors will gain their cooperation, this process has not been an overwhelming success in the past. Second, they will not hear and see the training message, nor will they be able to ask questions. They will read as much or as little of the instructions as they want, so it is likely they will not fully understand why their ratings are important.

We will collect data from a small number of field NCOs in January. At that time, we will talk with them about (a) the relative feasibility of using paper-and-pencil leave-behind packets or asking Soldiers to access the Select21 website to make ratings for the 2004 field test and (b) ways to increase the response rate for Soldiers who are unable to attend the regular rating sessions. We have used paper-and-pencil leave-behind packets in other projects and know that the response rate is not tremendously high. For Select21, we have the capability of either allowing raters to access a website to make ratings or sending them rating forms via email. One of the topics we will discuss in the January meetings is whether we can assume that Soldiers at all installations will have access to the Internet and/or email in such a way that makes electronic ratings feasible. We will incorporate their feedback into our plans for the field test. The field test in 2004 will provide an important opportunity to try out multiple strategies for handling "distance" ratings. We could evaluate the two options by seeing which one most raters prefer and by comparing the quality of data obtained through each.
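Once the field test data are in, the two options could be compared on simple indices; the sketch below uses placeholder numbers only (the January meetings and the field test would supply real ones), and the quality index shown is just one crude possibility:

    # Placeholder comparison of "distance" rating options on response rate
    # and a crude data-quality index (share of rating items left blank).
    options = {
        "paper leave-behind": {"sent": 120, "returned": 40, "blank_rate": 0.12},
        "web/email form":     {"sent": 120, "returned": 65, "blank_rate": 0.05},
    }
    for name, s in options.items():
        print(f"{name}: response rate {s['returned'] / s['sent']:.0%}, "
              f"blank-item rate {s['blank_rate']:.0%}")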

JOB KNOWLEDGE TEST

The purpose of the job knowledge criterion exam is to obtain an indicator of the job performance of first-tour Soldiers in the Future Force by measuring job-related knowledge at the Army-wide and MOS-specific levels. The job knowledge criterion exam is a "can-do" measure of first-tour Soldier performance.


The original Select21 research plan called for development of both hands-on and computer-based work sample simulations. It became clear early on, however, that the simulation capabilities offered by the Perception® software being used for the job knowledge tests obviated the need for development of expensive, high-fidelity computer-based simulations. Moreover, there were no obvious Army-wide testing requirements that would be reasonably met with computer-based simulations, and it would not be economically feasible to even consider developing simulations for all six target MOS. The reality of developing criterion measures for six MOS instead of two job clusters also led the project team to step back from the expectation that hands-on tests would be developed for the Army-wide and the six MOS samples. The Perception® testing software allows for a far more realistic presentation of materials (e.g., via graphics such as illustrations, photographs, and video clips) than traditional paper-and-pencil multiple-choice tests.

Perception® allows the use of a variety of question types (e.g., matching, ranking, true/false, Likert-type scales) as well as standard multiple-choice. The use of graphics, illustrations, photographs, and video clips also reduces the reading level required to take the exam and, consequently, reduces the risk of adverse impact. Additionally, it allows the test items (traditionally a measure of textbook knowledge) to be more performance-oriented.

Developing the Test

The content of these tests is driven by the performance requirements identified by the future-oriented job analysis of first-tour Soldiers. The test blueprints (i.e., content specifications, including the degree to which each content area is reflected in the test) are composed of the performance requirements that are easily captured in a written test (e.g., knowledge of first aid procedures is more easily tested by this method than oral communication skill). Although the test blueprints are composed of tasks, it is the knowledge required to perform each task that is captured by the test questions; thus, the instruments are referred to as knowledge tests. HumRRO project staff developed draft blueprints that were reviewed and revised by AIT/OSUT instructors, drill sergeants, and other SMEs.
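A toy sketch of a blueprint as a simple data structure: the content areas and weights below are hypothetical, not the Select21 blueprints; only the allocation arithmetic is illustrated.

    # Hypothetical blueprint: content areas and the share of the test
    # devoted to each.
    blueprint = {
        "First aid procedures": 0.30,
        "Land navigation": 0.25,
        "Weapons maintenance": 0.25,
        "NBC protection": 0.20,
    }
    TEST_LENGTH = 40  # an MOS-specific test will have 40-60 questions

    # Weights should cover the whole test; allocate items proportionally.
    assert abs(sum(blueprint.values()) - 1.0) < 1e-9
    for area, weight in blueprint.items():
        print(f"{area}: {round(weight * TEST_LENGTH)} items")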

Developing an Item Bank

The final Army-wide test will have 60-80 test questions and the final MOS-specific instruments will have 40-60 test questions. Because many questions are dropped during the review process, the goal is to write at least twice as many questions as required per category. The test questions were written primarily by HumRRO staff, as well as by training instructors during data collection visits. Other questions were imported from the Project A item bank (Campbell & Knapp, 2001). Multiple sources were used for item content, including the Soldiers' Manual of Common Tasks and the field manuals and study guides available online (e.g., www.adtdl.army.mil, www.usapa.army.mil, and www.armystudyguide.com). These references are also useful sources of pictures and graphics. For content where there are no existing pictures or video clips, graphics are being collected at the schools.
or video clips, graphics are being collected at the schools.<br />

All items go through an iterative review process. HumRRO project staff and Army<br />

instructors developed new questions and identified relevant Project A questions. In-house test<br />

experts and staff familiar with Army performance requirements reviewed all test questions. These<br />

45 th Annual Conference of the <strong>International</strong> <strong>Military</strong> <strong>Testing</strong> <strong>Association</strong><br />

Pensacola, Florida, 3-6 November <strong>2003</strong>


eviewers considered the currency of item content (e.g., in terms of technology and/or procedures)<br />

and how well each item adheres to the task requirement. Once items have passed the in-house<br />

review, they are presented to focus groups of school instructors and/or drill sergeants. Again, these<br />

reviewers consider the currency and relevance of item content. They also make revisions as<br />

necessary. HumRRO project staff implement revisions and update item banks, writing additional<br />

questions as necessary to replace dropped questions.<br />

SELECT21 PERSONNEL FILE FORM (S21-PFF)

The Select21 Personnel File Form (S21-PFF) will serve as a self-report criterion measure for use in the Select21 validation effort. The S21-PFF will closely parallel the content of the Army NCO Promotion Point Worksheet (PPW) and the Personnel File Forms used in past research (e.g., NCO21, Project A). The S21-PFF will contain sections that assess Soldiers' standing on (a) Awards, Certificates, and Military Achievements; (b) Military Education; (c) Civilian Education; and (d) Military Training. Points for these areas are allocated by the personnel system based on Soldiers' records.

Questions that assess promotion board points (available on the Project A PFF) and commander's evaluation points are also being considered for inclusion in the S21-PFF. The S21-PFF will also ask Soldiers to indicate the number of Article 15s and Flag Actions they have received. Data on these disciplinary actions will be particularly useful as criteria for the temperament and P-E fit predictors. We will also generate a list of operational tests, training experiences, and other potential criterion indicators that do not appear on previous PFFs but that might yield useful information. This list will be used as a source of additional items, some of which are likely to be MOS-specific. Project staff will then review and comment on this initial measure. Lists of awards, military education, and civilian experiences will be carefully scrutinized to ensure that they are all covered by the current PPW and to determine whether any awards or experiences are missing. Based on feedback from project staff, appropriate additions, deletions, or modifications will be made prior to the field test. Although not collected via self-report, we will also calculate an archival measure of promotion rate. This variable was used successfully in Project A as a supplemental job performance indicator.

NEXT STEPS

Due to deployments, there has been limited opportunity to collect data on the criterion measures. The next opportunity for large-scale administration of the criterion measures will be the 2004 field test, which is likely to occur in late spring or summer of 2004. All the criterion measures will be finalized for administration in the concurrent validation at that time.

In January, we will ask small groups of field NCOs to review the OPS and FX scales. This will be the first exposure of the scales to NCOs who are not instructors, so their point of view is extremely useful, as are their ideas about how best to conduct “distance ratings.” After this mini pilot test, we will finalize the scales for the 2004 field test. The field test will be the first large-scale administration of the rating scales. This will provide us the opportunity to see how well our efforts to focus raters on using the scales as a standard have worked. It will also allow us to compare response rates and ratings from paper-and-pencil leave-behind packets with ratings administered via the Internet.

Job knowledge test development is continuing at several installations in the fall/winter of 2003. We will incorporate these new and revised items into the item bank. In January we will finalize the instruments for field testing and develop a background information questionnaire. This questionnaire will ask Soldiers whether they have been trained on each of the tasks and how recently they have performed each task. In addition, it will ask Soldiers the unit to which they are assigned and what equipment they use. This information is critical to allow us to tailor the tests for question tracking during the MOS portion of the exam. We will administer overlength exams in the field test, which will allow us to gather data on all the items in the test bank. We will use the field test data to revise the tests for the concurrent validation, selecting the best set of items for each test.

REFERENCES

Campbell, J.P., & Knapp, D.J. (Eds.) (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum Associates.

Knapp, D.J., Burnfield, J.L., Sager, C.E., Waugh, G.W., Campbell, J.P., Reeve, C.L., Campbell, R.C., White, L.A., & Heffner, T.S. (2002). Development of predictor and criterion measures for the NCO21 research program (Technical Report 1128). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Sager, C.E., Russell, T.L., Campbell, R.C., & Ford, L.A. (2003). Future Soldiers: Analysis of entry-level performance requirements and their predictors (Draft Technical Report). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Sager, C.E., & Russell, T.L. (2003). Future-oriented job analysis for first-tour soldiers. In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL.

Van Iddekinge, C., Putka, D., & Sager, C.E. (2003). Assessing person-environment (P-E) fit with the future Army. In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL.

Waugh, G.W., & Russell, T.L. (2003). Scoring both judgment and personality in a situational judgment test. In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL.



DEVELOPING OPERATIONAL PERSONALITY ASSESSMENTS: STRATEGIES FOR FORCED-CHOICE AND BIODATA-BASED MEASURES 44

Rodney A. McCloy, Ph.D., Dan J. Putka, Ph.D., and Chad H. Van Iddekinge, Ph.D.
Human Resources Research Organization (HumRRO)
Alexandria, VA, USA
rmccloy@humrro.org

Robert N. Kilcullen, Ph.D.
U.S. Army Research Institute for the Behavioral and Social Sciences
Alexandria, VA, USA

BACKGROUND

The U.S. Army is undertaking fundamental changes to transform into the Future Force. The Select21 project concerns the selection of future entry-level Soldiers, with the goal of ensuring the Army selects and classifies Soldiers with the knowledge, skills, and attributes (KSAs) needed to perform the types of tasks envisioned in a transformed Army. The ultimate objectives of the project are to (a) develop and validate measures of critical attributes needed for successful execution of Future Force missions and (b) propose use of the measures as a foundation for an entry-level selection and classification system adapted to the demands of the 21st century. The Select21 project focuses on the period of transformation to the Future Force, a transition envisioned to take approximately 30 years to complete. The time frame of interest extends to approximately 2025.

The major elements of our approach to this project are (a) future-oriented job analysis, (b) development of KSA/predictor measures, (c) development of criterion measures, and (d) concurrent criterion-related validation. The future-oriented job analysis provides the foundation for development of new tests that could be used for recruit selection or Military Occupational Specialty (MOS) assignment (i.e., predictors) and for development of job performance measures that will serve as criteria for evaluating the predictors. After field testing the predictor and criterion instruments, we will evaluate the potential usefulness of the predictors by comparing Soldiers' scores on the predictor measures to their scores on the criterion performance measures in a concurrent criterion-related validation effort.

44 In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL. The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.



The Select21 job analysis team reviewed multiple sources to identify relevant KSAs, including the Basic Combat Training list, Project A KSAs, NCO21 KSAs, Soldier21, and several other sources. This activity resulted in a list of 48 KSAs relevant to the performance of first-tour Soldiers in the Future Force (Sager & Russell, 2003). Twelve of these entry-level KSAs fall under the heading of temperament and serve as the target constructs for the Select21 temperament measures:

• Team Orientation
• Agreeableness
• Cultural Tolerance
• Social Perceptiveness
• Achievement Motivation
• Self-Reliance
• Affiliation
• Potency
• Dependability
• Locus of Control
• Intellectance
• Emotional Stability



In this paper, we focus on two measures: the Person-Organization-Personality (POP) Hybrid (also known as the Work Suitability Inventory) and the Rational Biodata Inventory (RBI). The discussion highlights design characteristics of the POP Hybrid and RBI that we believe will preserve their utility in operational selection settings.

PERSON-ORGANIZATION-PERSONALITY (POP) HYBRID

Researchers generally agree that people can fake self-report personality assessments (Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Ones, Viswesvaran, & Korbin, 1995) and that many will do so in operational selection settings (Hough, 1996, 1997, 1998; Rosse, Stecher, Miller, & Levin, 1998). Researchers disagree, however, regarding the extent to which faking affects the criterion-related validity of these assessments. Although many researchers have found that faking has little or no effect on criterion-related validity estimates (e.g., Barrick & Mount, 1996; Hough et al., 1990; Ones, Viswesvaran, & Reiss, 1996), other evidence suggests faking does change the rank order of applicants in the upper tail of the distribution and results in the selection of individuals with lower-than-expected performance scores (Mueller-Hanson, Heggestad, & Thornton, 2003; Zickar, 2000). Given our experience with the Army's Assessment of Individual Motivation (AIM; Knapp, Waters, & Heggestad, 2002), we believe that response distortion poses a dauntingly high hurdle to the personnel selection specialist interested in using temperament measures in an operational setting.

Recent efforts to mitigate response distortion have centered on forced-choice formats. Although forced-choice formats have demonstrated capacity to reduce the effects of faking (Jackson, Wrobleski, & Ashton, 2000; White & Young, 1998; Wright & Miederhoff, 1999), they carry the stigma of ipsative response data (Hicks, 1970). One approach to reducing the ipsativity of a forced-choice measure involves introducing foil (i.e., dummy) constructs: constructs we do not wish to score. This approach reduces ipsativity in the responses because one can now score relatively high or relatively low on all keyed constructs (when they are paired only with dummy constructs). Some ipsativity remains, however, because the forced-choice response depends upon the respondent's standing on the keyed and dummy traits in each pair. Thus, although ipsativity fades, it does not exit the stage entirely. Furthermore, we hypothesize that one will likely attain better approximations of normative trait standings to the extent that one more fully samples the content space of interest (here, personality traits). We designed the POP Hybrid as a forced-choice measure with these characteristics, which we hypothesize will enhance its ability to provide estimates of respondents' normative standing on the targeted traits.

Development

The POP Hybrid comprises 16 statements (stems) that describe temperament-related work requirements (e.g., work that requires showing a cooperative attitude). The statements are based on the Work Styles portion of the O*NET content model (Borman, Kubisiak, & Schneider, 1999), although we have simplified their wording to make them more accessible to entry-level Soldiers. Given that the O*NET Work Styles taxonomy was designed to cover the entire domain of personality, it provides good coverage of the Select21 temperament-related KSAs, with the exception of Locus of Control and Cultural Tolerance, which are not typically included in personality taxonomies (see Sager & Russell, 2003, for a review of Select21 KSAs).



Several factors led to the decision to base the POP Hybrid content on the O*NET Work Styles. First, the taxonomy lends itself to the formation of commensurate measures for assessing person-environment (P-E) fit. For example, the content allows one to construct both person-side (ability) and Army-side (demand) measures. Second, the breadth of the taxonomy helps ensure coverage of the range of work-related personality traits/characteristics an applicant might have, which is an important characteristic of P-E fit measures (Putka, Van Iddekinge, & Sager, 2003). Third, working from the O*NET Work Styles model provides the POP Hybrid with a defensible taxonomic base upon which to argue that the stems for target traits appear with an appropriate set of dummy traits, which, as we note above, is arguably an important characteristic of forced-choice measures. Finally, as a deterrent to response distortion, all stems are socially desirable.

The POP Hybrid attempts to distract Soldiers from thinking about how best to game their answers to a temperament assessment by redirecting their thoughts toward P-E fit. The initial version of the POP Hybrid contained more than 100 paired-comparison items. Respondents selected the one statement out of each pair that described the type of work they believed they “would be more successful at.” Not surprisingly, Soldiers reacted quite negatively to the redundancy of the measure and the sheer drudgery of the exercise. In addition, the measure required an inordinate amount of administration time (approximately 45 minutes). We therefore put this version aside in favor of an alternative response format.

The POP Hybrid now takes the form of a card-sorting task, with each of the 16 cards containing one of the work characteristic statements (Figure 1 presents 3 of the 16 statements). The instructions direct respondents to “sort the 16 cards in terms of how well you think you would perform the type of work described by the cards. Cards containing types of work that you think you would perform best should be ranked highest; cards containing types of work that you think you would perform worst should be ranked lowest.”

Work that requires… showing a cooperative and friendly attitude towards others I dislike or disagree with. (Agreeableness)

Work that requires… setting challenging goals and working continuously to attain them. (Achievement Motivation)

Work that requires… interacting with people of different cultures and backgrounds, and appreciating differences in their values, opinions, and beliefs. (Cultural Tolerance)

Note. The target Select21 KSA appears in parentheses following each POP Hybrid stem.

Figure 1. Sample POP Hybrid stems.

Scoring

One benefit of the POP Hybrid is that it can be scored in several ways, depending on whether we want to use it for traditional personality assessment applications or for P-E fit applications. For example, two options we are considering are described below (an illustrative scoring sketch follows the second option):

(1) Scoring target constructs only (traditional personality assessment): Here, we score only those statements selected as target constructs; the remaining statements serve as “foil” (dummy) constructs (i.e., constructs we choose not to score). Considering only these statements reduces score ipsativity, with the reduction inversely proportional to the number of target constructs. Scores for the target constructs are a function of their ranking relative to the foil constructs only, not relative to each other. Therefore, respondents can receive equal scores on the target constructs (e.g., two target constructs are each ranked higher than any of the foils). Were the data totally ipsative, different traits could not receive the same score; thus, the data are only partially ipsative, thereby improving their statistical characteristics.

(2) Scoring all constructs (P-E fit applications): As noted earlier, we can also use this information to assess P-E fit between the temperament-related characteristics applicants possess and the temperament-related demands of the Army. For example, if we administer an instrument similar to the POP Hybrid to Army SMEs with different instructions (e.g., rank the statements in terms of how descriptive they are of work that is critical to performing well as a first-tour Soldier), we would have an Army-side profile to which we could compare applicant responses. With an applicant profile and an Army-side profile, numerous scoring options might be investigated. For example, applicant rankings could be correlated with the Army-side rank ordering to generate profile similarity indices.
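To make these two options concrete, the following sketch (Python, with purely hypothetical data and statement names) illustrates one way each might be implemented. It is a minimal illustration under stated assumptions, not the project's actual scoring procedure: a rank of 1 denotes the top-ranked card, a keyed construct is scored here as the number of foils ranked below it (one plausible instantiation of scoring targets relative to foils), and the P-E fit index is a Spearman rank-order correlation between applicant and Army-side rankings.

# Minimal sketch of the two POP Hybrid scoring options (hypothetical data;
# not the operational Select21 scoring procedure).
from scipy.stats import spearmanr

# Applicant's card sort: statement -> rank (1 = would perform best).
# Toy example using 6 stand-ins for the 16 statements.
applicant_ranks = {"Achievement Motivation": 1, "Dependability": 2,
                   "Energy": 3, "Agreeableness": 4, "Stress Tolerance": 5,
                   "Innovation": 6}

def score_targets(ranks, targets):
    # Option 1: score each keyed construct relative to the foils only.
    # A target's score is the number of foils ranked below it, so two
    # targets can receive equal scores (partially ipsative data).
    foils = [c for c in ranks if c not in targets]
    return {t: sum(ranks[t] < ranks[f] for f in foils) for t in targets}

print(score_targets(applicant_ranks,
                    targets={"Achievement Motivation", "Dependability"}))

# Option 2: P-E fit. A hypothetical Army-side profile built from SME rankings
# of how critical each type of work is for first-tour Soldiers.
army_ranks = {"Dependability": 1, "Achievement Motivation": 2,
              "Agreeableness": 3, "Energy": 4, "Innovation": 5,
              "Stress Tolerance": 6}
stems = sorted(applicant_ranks)
fit, _ = spearmanr([applicant_ranks[s] for s in stems],
                   [army_ranks[s] for s in stems])
print(f"Profile similarity (rank-order correlation): {fit:.2f}")

Because different criteria can key different constructs, the same card sort can be rescored simply by passing a different set of targets to score_targets.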

Other Development Considerations

In addition to the steps described above to make the POP Hybrid more resistant to faking, other characteristics may help the measure maintain its validity in an operational setting. For example, to the extent respondents try to distort the rank ordering of stems to match the ideal personality for the Army, such distortion may not detract from the criterion-related validity of the resulting score. Indeed, this particular form of distortion would indicate familiarity with the requirements of the Army and realistic expectations about what Army work requires. The literature on realistic job previews suggests that familiarity with the job (or, in this case, the Army) and realistic expectations would contribute to criterion-related validity when predicting alternative criteria such as job satisfaction and attrition (Wanous, 1992). Thus, although this type of response distortion represents a source of contamination in POP Hybrid scores, it is criterion-related contamination and could therefore enhance criterion-related validity.

In addition, the design of the POP Hybrid allows us to select which constructs to key and which to treat as foils, depending on the criteria of interest. Thus, for criterion Y1, Achievement/Effort, Energy, and Leadership Orientation might serve as the keyed traits, with the other 13 traits serving as dummies. Criterion Y2, on the other hand, might require Innovation, Analytic Thinking, Stress Tolerance, and Energy as the keyed traits. This flexibility in how we treat the constructs contained on the POP Hybrid has great value for two additional reasons. First, the Army often desires to use the same instrument to predict a variety of criteria (e.g., using AIM to predict NCO performance, recruiter performance, and first-tour attrition). Second, to the extent that we can convince respondents completing the POP Hybrid that the Army will use the results for a variety of purposes (thus another reason for covering the domain of personality), it may prevent them from attempting to fake toward a given profile or in a certain direction.

RATIONAL BIODATA INVENTORY (RBI)

During initial project meetings, there was a recommendation to use the Test of Adaptable Personality (TAP), a 20-minute biodata assessment. Because the TAP has demonstrated criterion-related validity against can-do criteria in operational use with Special Forces Soldiers (Kilcullen, Goodwin, Chen, Wisecarver, & Sanders, 2002), we hypothesized that the TAP (or another measure like it) might hold substantial promise as a selection measure for entry-level Soldiers. Over time, we selected only certain TAP scales, supplementing them with scales from the Biographical Information Questionnaire (BIQ), a conglomerate biodata instrument administered during the NCO21 project (cf. Putka, Kilcullen, & White, 2003) that comprises ARI's Assessment of Right Conduct (ARC). The resulting measure, the Rational Biodata Inventory (RBI), provides a largely biodata-driven assessment that we believe holds promise for operational use.

We continue to work on the RBI. We anticipate that it will potentially assess all 12 Select21 temperament-related KSAs (Sager & Russell, 2003). In many ways the RBI will complement the POP Hybrid, either assessing traits that the POP Hybrid does not or assessing different facets of the traits the POP Hybrid does assess. For example, the KSA Locus of Control does not lend itself to assessment through the POP Hybrid method, as the trait concerns internal attributions rather than characteristics of work environments. The RBI, however, can readily assess an individual's standing on this construct. Other constructs, such as Dependability, are arguably too broad to assess completely with a single statement. The Dependability stem on the POP Hybrid concerns the degree to which the respondent meets obligations and completes duties on time. We plan for the RBI to round out our assessment of this heterogeneous construct by incorporating scales such as Hostility to Authority (loading negatively on the facet of respect for authority) and Social Maturity (loading on the facets of conformity and compliance).

NEXT STEPS

The Select21 temperament measures offer substantial promise for predicting the typical job performance of Future Force Soldiers. To realize that promise, however, we must satisfactorily address several issues.

Given that the Select21 data collection will incorporate concurrent validation in a research-only setting, assessing the performance of the personality measures in an operational setting appears of paramount importance. Our past research has demonstrated that responses obtained under faking instructions might not approximate well those obtained during operational use of a measure (Knapp et al., 2002); the nature of dissimulation varied from research to operational settings. If at all possible, some form of operational tryout of the personality measure(s) seems imperative.

Failing an operational tryout, we believe a carefully designed investigation of faking/coaching that closely mirrors the operational selection setting would be critical to understanding how applicants might alter their responses compared with research participants. Therefore, we propose to conduct such an investigation during the pilot test of the POP Hybrid and RBI, linking performance on the measures to some set of desired outcomes (exactly what these will be remains unclear). Again, we believe such a tryout to have special import for the RBI, given its prior use in operational settings with experienced Soldiers.

We are considering several options for simulating an operational selection setting. One option entails administering the measures with incentives for “correct” responding (i.e., responses that look like those of the ideal candidate without looking overly suspect). We would compare statistics for the measures from this condition with statistics from an honest-responding condition. Although we would have no criterion data, any changes in respondents' rank order would raise flags about use of the measure(s) in an operational setting.



REFERENCES

Barrick, M.R., & Mount, M.K. (1996). Effects of impression management and self-deception on the predictive validity of personality constructs. Journal of Applied Psychology, 81, 261-272.

Borman, W.C., Kubisiak, U.C., & Schneider, R.J. (1999). Work styles. In N.G. Peterson, M.D. Mumford, W.C. Borman, P.R. Jeanneret, & E.A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET (pp. 213-226). Washington, DC: American Psychological Association.

Hicks, L.E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74, 167-184.

Hough, L.M. (1996). Personality measurement and personnel selection: Implementation issues. Paper presented at the 11th annual meeting of the Society for Industrial and Organizational Psychology, San Diego, CA.

Hough, L.M. (1997). Issues and evidence: Use of personality variables for predicting job performance. Paper presented at the 12th annual meeting of the Society for Industrial and Organizational Psychology, St. Louis, MO.

Hough, L.M. (1998). Effects of intentional distortion in personality measurement and evaluation of suggested palliatives. Human Performance, 11, 209-244.

Hough, L.M., Eaton, N.K., Dunnette, M.D., Kamp, J.D., & McCloy, R.A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581-595.

Jackson, D.N., Wrobleski, V.R., & Ashton, M.C. (2000). The impact of faking on employment tests: Does forced-choice offer a solution? Human Performance, 13, 371-388.

Kilcullen, R., Goodwin, J., Chen, G., Wisecarver, M., & Sanders, M. (2002). Identifying agile and versatile officers to serve in the Objective Force. Paper presented at the 23rd Annual Army Science Conference, Orlando, FL.

Knapp, D.J., Waters, B.K., & Heggestad, E.D. (Eds.) (2002). Investigations related to the implementation of the Assessment of Individual Motivation (AIM) (Study Note 2002-02). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Mueller-Hanson, R., Heggestad, E.D., & Thornton, G.C., III (2003). Faking and selection: Considering the use of personality from a select-in and a select-out perspective. Journal of Applied Psychology, 88, 348-355.

Ones, D.S., Viswesvaran, C., & Reiss, A.D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660-679.

Putka, D.J., Kilcullen, R.N., & White, L.A. (2003). Temperament inventories. In D.J. Knapp, R.A. McCloy, & T.S. Heffner (Eds.), Validation of measures designed to maximize 21st-century Army NCO performance (Interim Report) (pp. 8-1 – 8-42). Alexandria, VA: Human Resources Research Organization.

Putka, D.J., Van Iddekinge, C.H., & Sager, C.E. (2003, November). Developing measures of occupational interests and values for selection. In M.G. Rumsey (Chair), Occupational Interest Measurement: Where Are the Services Headed? Paper presented at the 2003 International Military Testing Association Conference, Pensacola, FL.

Rosse, J.G., Stecher, M.D., Miller, J.L., & Levin, R. (1998). The impact of response distortion on pre-employment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634-644.

Sager, C.E., & Russell, T.L. (2003, November). Future-oriented job analysis for first-tour soldiers. In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Paper presented at the 2003 International Military Testing Association Conference, Pensacola, FL.

Wanous, J.P. (1992). Organizational entry (2nd ed.). Reading, MA: Addison-Wesley.

White, L.A., & Young, M.C. (1998). Development and validation of the Assessment of Individual Motivation (AIM). Paper presented at the annual meeting of the American Psychological Association, San Francisco, CA.

Wright, S.S., & Miederhoff, P.A. (1999). Selecting students with personal characteristics relevant to pharmaceutical care. American Journal of Pharmaceutical Education, 63, 132-138.

Zickar, M.J. (2000). Modeling faking on personality tests. In D. Ilgen & C.L. Hulin (Eds.), Computational modeling of behavioral processes in organizations (pp. 95-108). Washington, DC: American Psychological Association.



SCORING BOTH JUDGMENT AND PERSONALITY IN A SITUATIONAL JUDGMENT TEST 45

Gordon W. Waugh, Ph.D., and Teresa L. Russell, Ph.D.
Human Resources Research Organization
Alexandria, VA, USA
gwaugh@humrro.org

INTRODUCTION

Although personality measures are good predictors of performance in a research setting (Tett, Jackson, & Rothstein, 1991), there are problems with their use in operational settings (Knapp, Waters, & Heggestad, 2002). There is substantial research showing that personality tests can be faked (Hough, Eaton, Dunnette, Kamp, & McCloy, 1990; Ones, Viswesvaran, & Korbin, 1995). Several recent studies show that they probably are faked when used for personnel selection (Hough, 1996, 1997, 1998; Rosse, Stecher, Miller, & Levin, 1998). Faking changes the rank order of applicants and results in the selection of individuals with lower-than-expected performance scores (Mueller-Hanson, Heggestad, & Thornton, 2003; Zickar, 2000). Thus, there is much interest in developing a faking-resistant personality measure.

This presentation describes the development of a situational judgment test (SJT) for selection into the U.S. Army. A situational judgment test item consists of a description of a problem situation followed by several possible actions. An examinee answers the item by judging the effectiveness of the actions. In some SJTs, the examinee indicates the best and worst actions; in others, including this one, the examinee rates the effectiveness of each action.

A criterion SJT targeting Soldiers who had been in the Army between 18 and 36 months was developed simultaneously. It was developed in the same manner as the predictor SJT described in this paper, with two exceptions: it includes only military scenarios, and it does not use trait scoring. The criterion SJT, along with other criterion measures, will be used to collect validity data on other predictor measures. The predictor and criterion SJTs were developed for Select21, a project sponsored by the U.S. Army Research Institute for the Behavioral and Social Sciences. The objective of Select21 is to develop and validate selection measures that will help the Army select, classify, and retain enlisted Soldiers with the characteristics needed to succeed in the future Army.

The SJT format has two characteristics that might make it possible to develop faking-resistant personality tests. First, in contrast to traditional personality tests, SJT examinees are not asked to divulge anything about themselves. Rather, examinees are asked to analyze each situation and evaluate the effectiveness of each action. Thus, the SJT would be a subtle measure of personality. Second, the same responses would be used to generate both an effectiveness score and personality scores. Examinees who try to maximize their personality scores would risk lowering their effectiveness scores. That is, examinees cannot ignore the effectiveness of any actions when answering test items. Thus, an SJT that produces both an effectiveness score and personality scale scores might be able to eliminate, or considerably reduce, faking.

45 In D.J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL. The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.

Several personality traits were identified as relevant to performing well as a Soldier in the U.S. Army (Sager & Russell, 2003). We had developed a situational judgment test in recent previous research for the Army (Knapp et al., 2002). That SJT's score correlated significantly not only with performance measures but also with several personality scales. Unfortunately, attempts to develop trait scales based on item scores were unsuccessful. This was not surprising, considering that SJTs tend to be heterogeneous even at the item level. Thus, in the current approach, we are developing personality scales based on the scores for individual actions.

This effort has two notable aspects. First, we generated parallel civilian situations from military situations. Second, the test simultaneously measures both the respondent's judgment and several of the respondent's personality traits. We developed descriptions of military situations that a Soldier might experience during the first few months in the Army. For each situation, we developed response options comprising actions that Soldiers might take. We feared, however, that a test consisting of military situations might not be appropriate for civilian applicants. Many applicants might not understand the situations, and those who had some military knowledge might have an unfair advantage on the test. Therefore, we developed a parallel civilian situation for most military situations. The remainder of this paper provides more detailed descriptions of the development of the civilian items and trait scales. The results of a recent pilot test are also discussed.

DEVELOPMENT OF THE MILITARY AND CIVILIAN SJTS

To develop the military SJT, we asked new Soldiers and the NCOs (Non-Commissioned Officers) who train them to write descriptions of problem situations relevant to new Soldiers (during the first few months in the Army). These could be actual situations they had observed or hypothetical situations. Other Soldiers and NCOs wrote descriptions of actions that Soldiers might take in these situations. We edited and dropped actions until there were no more than about nine actions per situation. We asked Soldiers and NCOs to target each situation to one of the specific predictor constructs we wanted the SJT to measure.

At this point, we asked NCOs and Soldiers to write parallel civilian situations based on the military situations. After editing these situations, we picked the best parallel civilian situation for each military situation. Then we asked NCOs and Soldiers to write actions for the civilian situations. We edited these actions and reduced the number of actions per situation to about nine.

We developed an initial scoring key using 10 NCOs (drill instructors and other trainers of new Soldiers). Then we gave the draft test items (military and civilian) to Soldiers in training. Each item was answered by about 12 Soldiers. Based on these two data collections, some options were dropped and a few others were edited or added. An option was dropped if the NCOs disagreed substantially about its effectiveness; typically, options with standard deviations above 2.00 (on a 7-point scale) were dropped. Options were also dropped if there was too much agreement among the Soldiers in training. Finally, we narrowed the options down to about seven per situation. Where possible, the set of options for a situation was selected so that there was a wide range of keyed effectiveness values (computed as the mean of the NCO ratings).

DEVELOPMENT OF THE TRAIT SCORING

The use of the SJT to measure both traits and judgment rests upon the following model. When an examinee judges the effectiveness of an action, that judgment is determined by both the examinee's personality and his/her knowledge, training, and experience relevant to the situation. The traditional SJT score taps the examinee's knowledge, training, and experience, whereas the trait scores tap part of the examinee's personality.

As mentioned above, SJTs are heterogeneous. Therefore, we decided to measure traits at the lowest level possible: the individual option. Nineteen individuals with graduate degrees in industrial-organizational psychology were recruited to rate the traitedness of each response option, with each option rated by five to seven psychologists. For each trait-option combination, participants rated the degree to which the action and trait were related; inverse relationships were given negative ratings. Each point on the rating scale represented a range of correlations. The mean rating (across psychologists) represented the traitedness of that option for that trait.

PILOT TEST RESULTS: JUDGMENT SCORES

Eight draft test forms were given to 319 Soldiers in U.S. Army reception battalions. These Soldiers had just entered the Army but had not yet been assigned to training; therefore, they would be similar to applicants. Each Soldier completed one civilian SJT form (A–D) and one military SJT form (1–4). There were four pairings of forms: A-1, B-2, C-3, and D-4. Within each form-pair, the order was randomized; that is, half of the Soldiers got the military form first and the other half got the civilian form first. Most items had seven response options. The civilian forms had 14–16 items; the military forms had 11–13 items. There was no attempt to put a military item and its parallel civilian item within the same form-pair.

The Soldiers responded by rating the effectiveness of each option on a 7-point scale (where higher numbers represent greater effectiveness). The judgment score for an option was computed as shown in Equation 1 below. The absolute difference between the rating and the keyed effectiveness value is subtracted from 6 so that higher values represent better scores. The judgment score for an entire test form is simply the mean of the option scores.

optionEffectivenessScore = 6 – |SoldiersRating – keyedEffectiveness|     (1)
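As a concrete illustration, the following sketch (Python, with hypothetical ratings and keyed values) applies Equation 1 to one examinee's responses; only the formula itself comes from the paper.

# Sketch of Equation 1 applied to one test form (hypothetical data).
import numpy as np

def judgment_score(ratings, key):
    # Per-option score: 6 - |rating - keyed effectiveness|; the form-level
    # judgment score is the mean across all options. Higher is better.
    ratings, key = np.asarray(ratings, float), np.asarray(key, float)
    return np.mean(6.0 - np.abs(ratings - key))

# One examinee's effectiveness ratings for five options (1-7 scale) and the
# keyed effectiveness values (means of the NCO ratings).
ratings = [4, 6, 2, 5, 3]
key = [3.8, 5.5, 2.9, 4.1, 3.3]
print(round(judgment_score(ratings, key), 2))   # -> 5.44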

The reliability of the judgment scores was estimated via coefficient alpha. Table 1 shows these values for each of the eight forms; the reliability estimates are around .90. Table 1 also shows that the judgment score measures essentially the same thing on the civilian and military forms: the correlations between forms are almost as high as the reliability estimates. The correlation rc estimates the correlation between the constructs measured by the two forms.



Table 1. Correlations between Civilian and Military Forms and Reliability Estimates for Judgment Scores

                                    Coefficient Alpha
Form Pair (Civ, Mil)   r     rc      Civ     Mil
A, 1                  .76   .83      .92     .92
B, 2                  .85   .95      .91     .88
C, 3                  .70   .83      .87     .83
D, 4                  .82   .90      .91     .90

Note. N = 79. Each Soldier completed only one form pair. rc is corrected for attenuation due to unreliability. All correlations are significant at p < .0001.
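The rc values appear to follow the standard correction for attenuation, rc = r / sqrt(alphaCiv × alphaMil). The paper does not show this computation, but the following check (Python) reproduces the tabled values within rounding of the reported inputs.

# Disattenuate the civilian-military correlations in Table 1 (assumed to use
# the standard correction for attenuation; the paper does not show its work).
from math import sqrt

table1 = {  # form pair: (r, alpha_civ, alpha_mil)
    "A,1": (.76, .92, .92),
    "B,2": (.85, .91, .88),
    "C,3": (.70, .87, .83),
    "D,4": (.82, .91, .90),
}
for pair, (r, a_civ, a_mil) in table1.items():
    print(pair, round(r / sqrt(a_civ * a_mil), 2))
# Prints .83, .95, .82, .91 -- matching Table 1's rc values to within .01,
# consistent with rounding of the reported r and alpha values.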

Table 2. Correlations between Judgment Score and SD of Ratings

                        Form-Pair
Scale             A,1     B,2     C,3     D,4
Military Forms   -.69    -.77    -.60    -.58
Civilian Forms   -.66    -.77    -.61    -.58

Note. N = 79. Each Soldier completed only one form-pair. All correlations are significant at p < .0001. Each value is the correlation between the judgment scores and the within-examinee standard deviation of his/her ratings.

Some other SJTs use the judgment scoring algorithm used in this research. Table 2 shows a possible danger with this algorithm: the within-examinee variability of the ratings is highly negatively correlated with the judgment scores. This relationship exists when the keyed effectiveness values tend to be near the middle of the rating scale, as is the case with this SJT. An examinee can get a fairly good score simply by rating every option a 4 (the middle of the rating scale). This problem can be eliminated by either (a) designing a test with a uniform distribution of keyed effectiveness values or (b) asking examinees to pick or rank rather than rate responses. The problem can also be reduced during the computation of the scores by moving values near the top and bottom of the scale toward the center (for both the examinee's ratings and the effectiveness key). For example, rating values below 2.5 could be changed to 2.5.
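A minimal sketch of that score-computation remedy follows (Python). The 2.5 floor comes from the text; the 5.5 ceiling is an assumed mirror-image bound on the 7-point scale.

# Compress extreme values toward the center of the 7-point scale before
# applying Equation 1. The 2.5 floor is from the text; the symmetric 5.5
# ceiling is an assumption for illustration.
import numpy as np

def compress_extremes(values, lo=2.5, hi=5.5):
    return np.clip(np.asarray(values, float), lo, hi)

ratings = compress_extremes([1, 4, 7, 2, 6])            # -> [2.5, 4., 5.5, 2.5, 5.5]
key     = compress_extremes([2.1, 3.8, 6.4, 2.9, 5.0])

Compressing both the ratings and the key shrinks the penalty for using the ends of the scale, and with it the advantage of uniformly rating every option a 4.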

PILOT TEST RESULTS: TRAIT SCORES

The score on a trait for a specific option was computed as shown in Equation 2 below. As shown, the keyed effectiveness value is subtracted from the Soldier's rating to set the scale's metric so that a Soldier receives a trait score of zero on an option if his/her rating equals the keyed effectiveness value. Trait scores can be positive or negative. The trait score for an entire form is the mean of the trait scores among the options linked to the trait. Because Trait 8, Intellectance, was linked to very few options, it was dropped from the analyses.

optionTraitScore = (SoldiersRating – keyedEffectiveness) × traitedness     (2)
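The sketch below (Python, hypothetical data) applies Equation 2 to one trait; treating options with nonzero traitedness as the options "linked" to the trait is our reading of the text.

# Sketch of Equation 2 for a single trait (hypothetical data).
import numpy as np

def trait_score(ratings, key, traitedness):
    # Signed per-option score: (rating - keyed effectiveness) * traitedness,
    # averaged over the options linked to the trait (traitedness != 0).
    ratings, key, w = (np.asarray(a, float) for a in (ratings, key, traitedness))
    linked = w != 0
    return np.mean((ratings[linked] - key[linked]) * w[linked])

ratings     = [6, 3, 5, 2, 7]
key         = [4.2, 3.5, 4.8, 2.4, 5.9]
traitedness = [0.6, -0.4, 0.0, 0.0, 0.3]   # mean psychologist ratings (assumed)
print(round(trait_score(ratings, key, traitedness), 2))   # -> 0.54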


Table 3 shows the reliability estimates by form and trait. Part of the variability in these estimates is due to the wide range in the number of options per trait, which runs from 3 to 35. Table 4 provides an easier way to compare traits by basing each reliability estimate on a hypothetical 20-option trait scale (a sketch of this projection follows this paragraph). There is considerable variability in the reliability estimates across forms: the values range from .27 to .88. Thus, it appears possible to construct an SJT that measures some traits reliably. For example, in Form D, the reliability estimate is above .80 for five of the seven traits.
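The 20-option estimates in Table 4 are consistent with the Spearman-Brown prophecy formula applied to the Table 3 values; the check below (Python) reproduces two entries using the option counts given in the Table 3 note.

# Project coefficient alpha to a 20-option scale via the Spearman-Brown
# prophecy formula (the paper does not name its method; the results match).
def spearman_brown(alpha, k_actual, k_target=20):
    n = k_target / k_actual
    return (n * alpha) / (1 + (n - 1) * alpha)

# Dependability, Form 1: alpha = .77 with 35 options.
print(round(spearman_brown(.77, 35), 2))   # -> 0.66, matching Table 4
# Agreeableness, Form 3: alpha = .24 with 3 options.
print(round(spearman_brown(.24, 3), 2))    # -> 0.68, matching Table 4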

Table 3. Reliability Estimates of Trait Scales

                              Military Forms           Civilian Forms
Trait Name                  1    2    3    4         A    B    C    D
1 Achievement Orientation  .60  .63  .65  .78       .77  .60  .84  .88
2 Self-Reliance            .42  .55  .77  .67       .63  .60  .74  .85
3 Dependability            .77  .76  .75  .72       .74  .76  .87  .88
4 Sociability              .74  .35  .74  .48       .81  .28  .75  .84
5 Agreeableness            .47  .53  .24  .46       .73  .27  .64  .61
6 Social Perceptiveness    .67  .30  .44  .45       .52  .34  .66  .35
7 Team Orientation         .82  .38  .52  .74       .79  .54  .81  .81

Note. N = 79. Each Soldier completed only one form-pair. The number of options per scale ranges from 3 (Agreeableness, Form 3) to 35 (Dependability, Form 1).

It appears that Agreeableness and Social Perceptiveness are not measured as reliably as the other traits. This is due partly to a lack of options linked to these traits, but Table 4 shows that these two traits are not measured quite as reliably even when their scales are projected to 20 options.

The SJT was not administered with other instruments. Thus, its construct validity could not be assessed by examining its relationships with performance or personality measures. Instead, construct validity was examined by looking at the relationships among the trait scales.

Table 4. Reliability Estimates for Hypothetical 20-Option Trait Scales

                              Military Forms           Civilian Forms
Trait Name                  1    2    3    4         A    B    C    D
1 Achievement Orientation  .54  .60  .77  .75       .73  .60  .76  .82
2 Self-Reliance            .41  .53  .83  .86       .59  .65  .65  .78
3 Dependability            .66  .68  .71  .70       .75  .68  .80  .81
4 Sociability              .83  .64  .85  .65       .83  .50  .73  .81
5 Agreeableness            .64  .82  .68  .66       .67  .28  .54  .60
6 Social Perceptiveness    .79  .68  .66  .70       .63  .60  .65  .36
7 Team Orientation         .73  .48  .63  .76       .74  .59  .77  .71

Note. N = 79. Each Soldier completed only one form-pair.



The latent structure underlying the seven trait scales was examined using exploratory factor analysis, with both oblique and orthogonal rotations. Parallel analysis clearly indicated a three-factor solution: the scree plot of the real data (the correlation matrix of the seven traits with squared multiple correlations in the diagonal) was compared with the scree plots of 100 random datasets, and in the vast majority of cases the plots crossed between the third and fourth factors. The orthogonal model was more interpretable than the oblique model. Table 5 shows the factor loadings for the orthogonal model.
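For readers unfamiliar with the technique, the sketch below (Python) implements a parallel analysis of this kind; details beyond the text, such as comparing against the mean random eigenvalues rather than overlaid scree plots, are assumptions.

# Parallel analysis sketch: eigenvalues of the reduced correlation matrix
# (squared multiple correlations in the diagonal) are compared against
# those from random data of the same dimensions.
import numpy as np

def reduced_eigenvalues(x):
    r = np.corrcoef(x, rowvar=False)
    smc = 1 - 1 / np.diag(np.linalg.inv(r))   # squared multiple correlations
    r_reduced = r.copy()
    np.fill_diagonal(r_reduced, smc)
    return np.sort(np.linalg.eigvalsh(r_reduced))[::-1]

def parallel_analysis(data, n_random=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = data.shape
    real = reduced_eigenvalues(data)
    random_mean = np.mean([reduced_eigenvalues(rng.standard_normal((n, p)))
                           for _ in range(n_random)], axis=0)
    # Retain factors whose real eigenvalues exceed those from random data.
    return int(np.sum(real > random_mean))

# data would be the 319 x 7 matrix of trait scores (test form ignored).
# n_factors = parallel_analysis(data)   # expected here: 3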

The traits loading highly on the first factor are related to accomplishing tasks independently. Factors 2 and 3 are related to interacting with people: Factor 2 appears to involve working with other people to accomplish tasks, and Factor 3 appears to be almost equivalent to the Agreeableness trait, whose definition centers on likeability and pleasantness.

Two traits have large loadings on two factors. Team Orientation loads almost equally on Factors 2 and 3. The definition of Team Orientation has two facets, team member cohesion/bonding and working together as a team, and this definition is consistent with the factor loadings. Dependability is a bit more difficult to explain; its loading on Factor 2 was somewhat surprising. One could argue, however, that a team can perform well only when its members can depend upon one another.

Finally, the relationship between judgment scores and trait scores was examined. Table 6 shows that the judgment scale correlates moderately with some trait scales in some forms. The vast majority of the correlations are negative. An examinee whose ratings stay within the middle of the rating scale cannot achieve high trait scores; this is the reverse of what we saw with the judgment scores, so these negative correlations are likely related to the same phenomenon. They would likely be eliminated or reduced dramatically if the examinees were to rank (or pick) rather than rate the options. Even using the present scoring method, Table 6 shows that the trait scores measure something different from the judgment score.



Table 5. Trait Score Factor Loadings for Three-Factor Model

Trait Name                  Factor 1   Factor 2   Factor 3
1 Achievement Orientation     0.93       0.15       0.26
2 Self-Reliance               0.92      -0.16       0.08
3 Dependability               0.53       0.67       0.35
6 Social Perceptiveness      -0.19       0.70      -0.07
4 Sociability                 0.06       0.86       0.39
7 Team Orientation            0.33       0.66       0.63
5 Agreeableness               0.18       0.12       0.97

Note. N = 319. To obtain an acceptable sample size, test form was ignored. In the original table, loadings above .49 were boldfaced, loadings between .30 and .49 were italicized, and loadings above .29 that were not consistent with the interpretation of the factors were shown in red. The three factors were interpreted as follows:
Factor 1: Motivation/skill to accomplish tasks while working independently.
Factor 2: Motivation/skill to accomplish tasks while working with other people.
Factor 3: Agreeableness (pleasantness and likeability).

Table 6. Correlations between Judgment and Trait Scores

                              Military Forms            Civilian Forms
Trait Name                  1     2     3     4       A     B     C     D
1 Achievement Orientation  -.29  -.14  -.07  -.27    -.21  -.29  -.14  -.24
2 Self-Reliance            -.30  -.33  -.31  -.35    -.05  -.31  -.14  -.31
3 Dependability            -.29  -.24  -.16  -.21     .03  -.25  -.05  -.14
4 Sociability              -.41  -.20  -.25   .11    -.13  -.27  -.10  -.16
5 Agreeableness            -.37   .16  -.21   .13     .01  -.03   .09  -.03
6 Social Perceptiveness    -.36  -.30   .25   .07    -.09   .02   .14   .03
7 Team Orientation         -.28  -.21  -.20  -.02    -.04  -.09   .03  -.09

Note. N = 79. Each Soldier completed only one form-pair. Correlations with |r| > .22 are significant at p < .05 (boldfaced in the original table).

CONCLUSIONS

The results of this research show that a situational judgment test can be designed to measure personality traits reliably. Although the factor analysis provided some evidence of construct validity, additional research is planned to obtain stronger evidence: the SJT will be administered with personality measures in the near future, and later with other personality measures as well as performance measures. The strength of using an SJT is that, in theory, it is resistant to faking; further research is needed to confirm this.



There are many ways to score a situational judgment test and a few ways for examinees to respond to items. Many of these have been compared with respect to judgment scores, but not with respect to the types of trait scores developed during this research effort. In particular, ranking and rating should be compared.

The high correlations between the civilian and military test forms are reassuring. On the one hand, one could argue that civilian forms do not need to be developed because the military forms measure essentially the same thing. On the other hand, a few potentially good Soldiers might be screened out because they knew little about the military (things they would learn soon after joining); in that case, one could argue that civilian forms should be used.

REFERENCES

Hough, L. M. (1996). Personality measurement and personnel selection: Implementation issues. Paper presented at the 11th annual meeting of the Society for Industrial and Organizational Psychology, San Diego, CA.

Hough, L. M. (1997). Issues and evidence: Use of personality variables for predicting job performance. Paper presented at the 12th annual meeting of the Society for Industrial and Organizational Psychology, St. Louis, MO.

Hough, L. M. (1998). Effects of intentional distortion in personality measurement and evaluation of suggested palliatives. Human Performance, 11, 209-244.

Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581-595.

Knapp, D. J., Burnfield, J. L., Sager, C. E., Waugh, G. W., Campbell, J. P., Reeve, C. L., Campbell, R. C., White, L. A., & Heffner, T. S. (2002). Development of predictor and criterion measures for the NCO21 research program (Technical Report 1128). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., Waters, B. K., & Heggestad, E. D. (Eds.) (2002). Investigations related to the implementation of the Assessment of Individual Motivation (AIM) (Study Note 2002-02). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Mueller-Hanson, R., Heggestad, E. D., & Thornton, G. C., III (2003). Faking and selection: Considering the use of personality from a select-in and a select-out perspective. Journal of Applied Psychology, 88, 348-355.

Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in personality testing for personnel selection: The red herring. Journal of Applied Psychology, 81, 660-679.

Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. (1998). The impact of response distortion on pre-employment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634-644.


Sager, C. E., & Russell, T. L. (2003, November). Future-oriented job analysis for first-tour Soldiers. In D. J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 45th Annual Conference of the International Military Testing Association, Pensacola, FL.

Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703-742.

Zickar, M. J. (2000). Modeling faking on personality tests. In D. Ilgen & C. L. Hulin (Eds.), Computational modeling of behavioral processes in organizations (pp. 95-108). Washington, DC: American Psychological Association.

ASSESSING PERSON-ENVIRONMENT (P-E) FIT WITH THE FUTURE ARMY 46

Chad H. Van Iddekinge, Ph.D., Dan J. Putka, Ph.D., and Christopher E. Sager, Ph.D.
Human Resources Research Organization
Alexandria, VA, USA
cvaniddekinge@humrro.org

INTRODUCTION

Personnel selection measures are typically designed to assess the knowledges, skills, and attributes (KSAs) critical to performance in the job of interest. Although important, job performance is not the only criterion of concern to most organizations. For example, organizations like the U.S. Army are interested in reducing personnel attrition through their selection and classification systems. Traditional KSA-based measures, however, seldom predict both performance and alternative criteria such as attrition.

In recent years, personnel researchers have turned to measures of person-environment (P-E) fit to predict outcomes other than job performance. Recent studies indicate that scores on such measures are related to various work-related intentions and attitudes (e.g., job satisfaction, organizational commitment, turnover intentions), as well as to actual behaviors such as absenteeism and turnover (e.g., Cable & DeRue, 2002; Saks & Ashforth, 1997; Verquer, Beehr, & Wagner, 2001). Although there is widespread interest in P-E fit within the civilian selection literature, the use of fit measures has yet to be extensively reported in the military literature.

This paper attempts to address this gap by describing the development of P-E fit measures for Select21, a project sponsored by the U.S. Army Research Institute for the Behavioral and Social Sciences. The objective of Select21 is to develop and validate selection measures that will help the Army select, classify, and retain enlisted Soldiers with the characteristics needed to succeed in the future Army. The Select21 P-E fit measures we developed are intended to assess the match between the work-related values and interests of prospective Soldiers and the values/interests the Army provides first-tour Soldiers now and in the future. In this paper, we describe a novel approach to developing P-E fit measures to predict Soldiers' attitudes and career decisions. We begin by describing the constructs measured in these instruments.

46 In D. J. Knapp (Chair), Selecting Soldiers for the Future Force: The Army's Select21 Project. Symposium conducted at the 2003 International Military Testing Association (IMTA) Conference, Pensacola, FL. The views, opinions, and/or findings contained in this paper are those of the authors and should not be construed as an official U.S. Department of the Army position, policy, or decision.


CONSTRUCTS ASSESSED

As mentioned, selection measures are typically designed to assess KSAs critical to job performance. When developing measures of P-E fit, however, the focus is on the needs, interests, and/or expectations of job applicants. Given this, it is important to use a comprehensive taxonomy to ensure these instruments measure the range of values and interests that the applicant population might possess. The fit measures we are developing are designed to assess two sets of constructs: work values and occupational interests. Work values are the constructs most commonly assessed in P-E fit measures (Verquer et al., 2001). The values we assessed were derived from the Theory of Work Adjustment (TWA; Dawis, England, & Lofquist, 1964). According to the TWA, job satisfaction is a function of the correspondence between workers' preferences for certain work-related values (e.g., having a chance to work independently, being paid well, having good relations with co-workers) and the degree to which the job or organization supports those values.

The work interests we focused on came from Holland's (1978, 1996) congruence theory. As with the TWA, this theory suggests that job satisfaction is a function of the congruence between an individual's work interests and the interests supported by his or her job (or organization). According to Holland, vocational interests are expressions of personality that can be used to categorize individuals and work environments into six types: realistic, investigative, artistic, social, enterprising, and conventional (RIASEC). Holland's model has been widely validated and is the prevailing taxonomy in vocational psychology (Barrick, Mount, & Gupta, 2003).

We developed two sets of P-E fit measures. The first set of measures assessed the congruence between Soldiers' needs for certain values/interests and the values/interests the Army supplies first-tour Soldiers. In the P-E fit literature, this is referred to as needs-supplies fit (Edwards, 1991; Kristof, 1996). The second set of measures assessed the congruence between the values/interests Soldiers expect the Army to provide and the values/interests the Army actually provides. We refer to this as expectations-reality fit. As discussed later, we believe there is a subtle yet important difference between the values/interests Soldiers prefer and the values/interests they expect the Army to support. Because we used a similar process to develop the values and interests measures, we limit our discussion to the work values instruments.

ASSESSING NEEDS-SUPPLIES FIT

Two measures of work values were developed to assess needs-supplies fit. The Army Description Inventory (ADI) was designed to determine the extent to which the Army environment supports several work-related values. Thus, we refer to this as a "supplies-side" measure. The Work Values Inventory (WVI), in contrast, assesses the extent to which Soldiers (and eventually prospective recruits) desire each of these values. We refer to this as a "needs-side" measure. The development of these measures is described in turn.

To develop the ADI, we first identified a set of values to assess. The initial measure included 42 values. Of these, 21 values were from the Dawis et al. (1964) taxonomy. The remaining 21 values were developed based on a review of several source materials, including previous studies of the values of Army recruits (e.g., Ramsberger, Wetzel, Sipes, & Tiggle, 1999), research on the values of American youth (Sackett & Mavor, 2002), and results of the Select21 job analysis. We then asked 70 Army non-commissioned officers (NCOs) to indicate the degree to which the Army provides opportunities for first-tour Soldiers who possess each of the 42 values. Based on these data, we identified 27 values to assess in the WVI. The decision to use only 27 of the original 42 values was due to concerns about testing time and redundancy among the initial set of values. The final 27 values were classified into three categories. The "high" category included nine values that NCOs indicated the Army offers first-tour Soldiers. The "low" category consisted of nine values that NCOs believed the Army does not offer first-tour Soldiers. The "middle" category included the nine values that fell between the high and low categories.
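As a minimal sketch of this three-way classification, assuming only a table of mean NCO supplies ratings is available (the value names and numbers below are invented placeholders, not the Select21 data):

```python
# Sketch: split the 27 retained values into high/middle/low tertiles by
# their mean NCO rating of how much the Army supplies each value to
# first-tour Soldiers. Ratings here are random placeholders.
import random

random.seed(1)
mean_nco_rating = {f"value_{i:02d}": round(random.uniform(1.0, 5.0), 2)
                   for i in range(27)}

ranked = sorted(mean_nco_rating, key=mean_nco_rating.get, reverse=True)
high, middle, low = ranked[:9], ranked[9:18], ranked[18:]
print(high)  # the nine values NCOs say the Army most clearly offers
```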

Given that most of the 27 work values are socially desirable, having applicants rate them with a Likert-type scale would probably not produce enough variability in responses (i.e., applicants would indicate that all values are important to them). Indeed, applicant response distortion is a concern whenever non-cognitive measures such as this are used in an operational selection setting (Rosse, Stecher, Miller, & Levin, 1998). One way to help minimize the effects of response distortion is to present test items in a forced-choice format (Jackson, Wrobleski, & Ashton, 2000). We adopted this approach with the WVI. The instrument consists of 81 triads, each of which includes one value from each of the three categories described above (i.e., high, low, middle). Respondents are asked to identify the value that would be most and the value that would be least important to them in their ideal job. The WVI is constructed so that no two values are paired together more than once, and values from the same category are never paired together (e.g., values within the low category are paired only with high and middle category values). An example item from the WVI is provided in the Appendix.
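These pairing constraints admit a simple combinatorial construction. The sketch below is one hypothetical way to generate 81 triads that satisfy them; it illustrates the constraints only and is not necessarily how the operational WVI was assembled.

```python
from itertools import combinations

# Nine placeholder labels per category (the real WVI values are not listed here).
high   = [f"H{i}" for i in range(9)]
middle = [f"M{i}" for i in range(9)]
low    = [f"L{i}" for i in range(9)]

# Choosing the low value as k = (i + j) % 9 yields 81 triads in which every
# high-middle, high-low, and middle-low pair occurs exactly once, so no two
# values are ever paired together more than once.
triads = [(high[i], middle[j], low[(i + j) % 9])
          for i in range(9) for j in range(9)]

# Verify the no-repeated-pair constraint.
seen = set()
for triad in triads:
    for pair in combinations(sorted(triad), 2):
        assert pair not in seen, "a pair of values was repeated"
        seen.add(pair)
print(len(triads))  # 81
```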

Using a forced-choice measure will not decrease response distortion unless items within each triad are similarly attractive. For example, if one value in a triad sounds more like the Army than the other two values, applicants may indicate that this value is most like them regardless of whether they truly value it. We attempted to address this issue in the WVI by comparing values that the Army provides with values that may appear like something the Army could satisfy but, in fact, are not characteristic of the Army environment. Nevertheless, even if prospective recruits are able to correctly identify the values the Army provides (and distort their responses in a way that is consistent with these values), this type of response pattern might not decrease criterion-related validity. That is, this form of distortion would indicate that the respondent has realistic expectations about what the Army is like. Met expectations can, in turn, lead to higher satisfaction and performance once in the Army (Wanous, 1992).

The forced-choice format of the WVI also gives us several options for scoring the measure. For example, we could create composite scores for the high and low categories by summing the number of times respondents choose values in the high category over values in the low and middle categories to form a high category composite. Likewise, we could sum the number of times respondents choose values in the low category over values in the high and middle categories to form a low category composite. Such composites would indicate the degree to which applicants prefer values that the Army does and does not supply, respectively. A high score on the high category composite would indicate high needs-supplies fit, whereas a high score on the low category composite would indicate poor needs-supplies fit. That is, applicants with high scores on the low category composite would tend to value things the Army does not generally offer first-tour Soldiers.
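A minimal sketch of this composite scoring follows, under one plausible reading of "number of times chosen over": a value marked most important is preferred over both other triad values, a value marked least is preferred over neither, and the unmarked value is preferred over the "least" one. The data structures and names are illustrative only, not the operational scoring.

```python
# Each WVI response: (high_value, middle_value, low_value, most, least),
# where most/least are the values the respondent marked. Illustrative data.
responses = [
    ("H0", "M0", "L0", "H0", "L0"),
    ("H1", "M1", "L1", "M1", "H1"),
    ("H2", "M2", "L2", "L2", "M2"),
]

def times_preferred(value, most, least):
    # Pairwise preferences implied by a most/least choice within a triad.
    if value == most:
        return 2        # preferred over both other values
    if value == least:
        return 0        # preferred over neither
    return 1            # preferred over the "least" value only

high_composite = sum(times_preferred(h, most, least)
                     for h, _, _, most, least in responses)
low_composite  = sum(times_preferred(l, most, least)
                     for _, _, l, most, least in responses)
print(high_composite, low_composite)  # 2 + 0 + 1 = 3 vs. 0 + 1 + 2 = 3
```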

Despite the potential advantages of a forced-choice instrument, such measures can also present challenges when used for selection. For example, forced-choice measures result in ipsative or partially ipsative data, which can make it difficult to obtain the normative information that is critical for making between-person comparisons (Hicks, 1970). However, the ipsativity of a forced-choice measure can be reduced by the way it is constructed and scored. For example, assessing most or all of the constructs within the domain of interest (e.g., all work-related values of American youth) can increase the degree to which the measure provides normative information. This is because in a forced-choice instrument, an applicant's score on a given construct (e.g., the value of autonomy) depends on the constructs with which it is compared. For instance, autonomy at work could be most important to an individual when compared to values A and B, but not when compared to values C and D. Thus, comparing a construct to every other construct within the domain of interest (rather than to a limited number of constructs) can result in more accurate approximations of normative trait standings. We attempt to do this in the WVI by assessing a large number of work values that we think prospective Army recruits could possess.

Another way to reduce the ipsativity of a forced-choice measure is to not score all of the constructs assessed in the instrument. This approach can minimize ipsativity because it allows applicants to score high (or low) on all constructs of interest when those constructs are paired only with non-relevant constructs. In the WVI, values that the Army supports (high category values) are compared only to values it does not support (low and middle category values), and not to other values supported by the Army. Although this will reduce ipsativity, some will remain because scores on the supported values depend, in part, on the restricted set of (unsupported) values with which they are compared.

ASSESSING EXPECTATIONS-REALITY FIT

We also developed measures of expectations-reality fit. These measures are designed to assess individuals' knowledge about the work values and interests that the Army actually supports. We developed these instruments because we believe that needs-supplies fit and expectations-reality fit may interact to predict attrition and its attitudinal precursors (e.g., job satisfaction). Based on expectancy theory (Vroom, 1964), we believe that misfit between the applicant and the Army for a given work value or interest depends on (a) how important the value/interest is to the applicant, (b) how much the applicant expects the Army to provide opportunities to satisfy the value/interest, and (c) the extent to which the Army actually offers the value/interest. For example, consider two applicants: one who values autonomy and expects the Army to provide it, and a second who values autonomy but does not expect the Army to provide it. If the Army does not provide autonomy, it is likely that the second applicant will be more satisfied in the Army than the first. That is, although both applicants value autonomy (which in this case indicates a lack of needs-supplies fit), the fact that the first applicant expects autonomy and does not receive it is likely to result in greater dissatisfaction.

The Army Beliefs Survey (ABS) was designed to assess expectations-reality fit with regard to work values. The ABS includes the 27 values measured in the WVI, but uses a Likert-type rating scale and a different set of instructions (see Appendix). Specifically, rather than asking prospective recruits to indicate their preference for the values, the ABS assesses their knowledge about which values the Army does (and does not) support. Thus, it is not in the best interest of respondents to indicate that the Army offers all of the values. Because the ABS is essentially a knowledge test, response distortion is less likely to be an issue, and thus we did not develop a forced-choice version of the measure. As with the WVI, the data we collected from NCOs (via the ADI) will provide the supplies-side (i.e., "reality") information against which to compare the expectations data from the ABS. The greater the correspondence between the values applicants think the Army does (and does not) provide first-tour Soldiers and the values the Army actually provides (based on NCO ratings), the better the P-E fit should be.
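The paper does not spell out how this correspondence is computed; one plausible scoring, sketched below under that assumption, is to correlate an applicant's ABS ratings with the mean NCO supplies ratings from the ADI across the 27 values. All names and numbers are invented for illustration.

```python
# Illustrative expectations-reality fit score: agreement between what an
# applicant believes the Army supplies (ABS: 1 = few, 2 = some, 3 = most
# Soldiers will experience it) and mean NCO supplies ratings from the ADI.
from statistics import mean

abs_beliefs = {"lead others": 3, "try own ideas": 2, "flexible schedule": 1}
adi_means   = {"lead others": 2.8, "try own ideas": 1.9, "flexible schedule": 1.2}

def fit_score(beliefs, reality):
    # Pearson correlation across values: higher = more realistic expectations.
    xs = [beliefs[v] for v in beliefs]
    ys = [reality[v] for v in beliefs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sdx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sdy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sdx * sdy)

print(round(fit_score(abs_beliefs, adi_means), 2))  # near 1.0 here
```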

SUMMARY

Recent research suggests that measures of P-E fit can predict valued criteria such as job satisfaction, organizational commitment, and attrition. However, relatively few P-E fit studies have been published in the military selection research literature. In this paper, we described a unique approach to developing fit measures to help select, classify, and retain enlisted Soldiers for the future Army. Although there are challenges with assessing P-E fit in an operational context, we believe such measures have the potential to provide substantial utility to the U.S. Army and other military organizations.

REFERENCES

Barrick, M. R., Mount, M. K., & Gupta, R. (2003). Meta-analysis of the relationship between the five-factor model of personality and Holland's occupational types. Personnel Psychology, 56, 45-74.

Cable, D. M., & DeRue, S. D. (2002). The convergent and discriminant validity of subjective fit perceptions. Journal of Applied Psychology, 87, 875-884.

Dawis, R. V., England, G. W., & Lofquist, L. H. (1964). A theory of work adjustment. Minnesota Studies in Vocational Rehabilitation, XV. Minneapolis: University of Minnesota.

Edwards, J. R. (1991). Person-job fit: A conceptual integration, literature review and methodological critique. International review of industrial/organizational psychology (pp. 283-357). London: Wiley.


Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74, 167-184.

Holland, J. L. (1978). Manual for the vocational preference inventory. Palo Alto, CA: Consulting Psychologists Press.

Holland, J. L. (1996). Exploring careers with a typology: What we have learned and some new directions. American Psychologist, 51, 397-406.

Jackson, D. N., Wrobleski, V. R., & Ashton, M. C. (2000). The impact of faking on employment tests: Does forced-choice offer a solution? Human Performance, 13, 371-388.

Kristof, A. L. (1996). Person-organization fit: An integrative review of its conceptualizations, measurements, and implications. Personnel Psychology, 49, 1-49.

Ramsberger, P. F., Wetzel, E. S., Sipes, D. E., & Tiggle, R. B. (1999). An assessment of the values of new recruits (FR-WATSD-99-16). Alexandria, VA: Human Resources Research Organization.

Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. (1998). The impact of response distortion on pre-employment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634-644.

Sackett, P. R., & Mavor, A. (Eds.) (2002). Attitudes, aptitudes, and aspirations of American youth: Implications for military recruitment. Washington, DC: National Academies Press.

Saks, A. M., & Ashforth, B. E. (1997). A longitudinal investigation of the relationships between job information sources, applicant perceptions of fit, and work outcomes. Personnel Psychology, 50, 395-426.

Verquer, M. L., Beehr, T. A., & Wagner, S. H. (2001, April). A meta-analytic review of relations between person-organization fit and work attitudes. Paper presented at the 16th Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Vroom, V. (1964). Work and motivation. New York: John Wiley.

Wanous, J. P. (1992). Organizational entry (2nd ed.). Reading, MA: Addison-Wesley.

APPENDIX

Example Item from the Work Values Inventory (WVI)

Indicate which statement is most important to you in your ideal job, and which statement is least important to you in your ideal job.

On my ideal job, I would...
A. have opportunities to lead others.
B. try out my own ideas.
C. have a flexible work schedule.

Example Items from the Army Beliefs Survey (ABS)

Few Will Experience: This statement describes an experience few Soldiers will have during their first enlistment.
Some Will Experience: This statement describes an experience some, but not most, Soldiers will have during their first enlistment.
Most Will Experience: This statement describes an experience most Soldiers will have during their first enlistment.

Using the rating scale above, indicate which category you believe best describes each statement.

Soldiers will...                         Few    Some   Most
1. have opportunities to lead others.    ____   ____   ____
2. try out their own ideas.              ____   ____   ____
3. have a flexible work schedule.        ____   ____   ____


Competency Testing for the U.S. Army Noncommissioned Officer (NCO) Corps

Tonia S. Heffner
U.S. Army Research Institute for the Behavioral and Social Sciences
Alexandria, VA USA

Roy Campbell
Human Resources Research Organization
Radcliff, KY USA

Deirdre J. Knapp
Human Resources Research Organization
Alexandria, VA USA

Peter Greenston
U.S. Army Research Institute for the Behavioral and Social Sciences
Alexandria, VA USA

In 1991, the U.S. Army discontinued its Soldier proficiency-testing program, the Skill Qualification Test (SQT). The Army's previous experience with job performance testing spanned a period of 40 years but was a mixture of successes, frustrations, and adjustments. Although the SQT program played an important role in determining Noncommissioned Officer (NCO) promotions, the associated costs of preparing and administering over 200,000 tests, in Soldiers' and administrators' time as well as financial resources, made the program no longer viable (Campbell, 1994). However, recent events have prompted an examination of the feasibility of reinstituting competency testing for the NCO corps. First, in a survey of one-third of the NCO corps, the Army Training and Leader Development Panel (ATLDP) found that NCOs overwhelmingly want objective testing to demonstrate their accomplishments and provide feedback on technical, tactical, and leadership skills. Second, the Sergeant Major of the Army (SMA) has made reinstituting competency testing a priority. Finally, competency testing reinforces self-development, one of the three pillars of the Army's NCO educational philosophy.

The U.S. Army Research Institute for the Behavioral and Social Sciences (ARI) has embarked on a three-phase research effort to examine the feasibility of reinstituting competency testing. This project began, and continues, as a research effort to develop criterion measures to validate our selection and classification tests, but the nature of the project has expanded to reflect the needs identified by the ATLDP. The first phase is a reinforcing dual-track approach. Track 1 is a detailed examination of the issues that influence testing feasibility. Track 2 is the initiation of a demonstration competency assessment program (DCAP) that mimics the critical aspects of the development, implementation, and administration processes. The DCAP is designed to provide an experience-based sense of current issues that will impact feasibility. The second phase of the research is an evaluation of the DCAP and the initiation of five additional prototype competency assessments targeted toward specific Army jobs (military occupational specialties [MOS]). The third phase is an evaluation of the five competency assessments and overall recommendations. We are currently completing the first phase of this research effort.

FEASIBILITY ISSUES

To achieve wide-scale implementation, Phase I addresses four general categories of questions to determine the feasibility of competency testing. First are utilization strategies: how will the competency assessments be used for personnel management, career development, training readiness, retention, and recruitment? Second are operational strategies: what are the boundaries and requirements for an assessment program? Third is implementation: what are the phases for implementing, maintaining, and growing the assessment program? Finally, there are external considerations: what other programs or initiatives affect, or are affected by, a competency program? Although ARI can inform policymakers based on our research, these questions must ultimately be answered by senior Army leadership.

Many factors contribute to the feasibility of competency assessment for the Army. Our approach has been to identify these factors from a variety of perspectives. The issues arising from the Army's testing history include equity for promotion across the Army, threats of test compromise, equity of test content within each job type (i.e., MOS), intensive resource demands, and multiple test uses. Reviews of other military and civilian assessment and certification programs revealed a variety of testing approaches, including one-time certification (e.g., nursing boards, computer certification) as well as multi-level certification (e.g., National Institute for Automotive Service Excellence [ASE]). Testing approaches used by sister services also offer variety in terms of test preparation approaches and resources, test administration windows, test delivery options, and test development structure and support. Automation technology (i.e., the internet and computerized test development software) offers new ways of developing and delivering assessments, but poses new challenges such as computer availability and increased test security concerns. We are also examining alternative performance assessment systems, such as instituting testing at the completion of required NCO development courses.

In addition to examining the feasibility factors, two other activities were planned as part of Phase I. These activities - an intensive needs analysis and the DCAP - were clarified by the SMA, who recommended that the competency assessment:

• be used for promotion,
• be administered via the internet,
• include the active and reserve components of the Army,
• assess Specialists/Corporals (E4) through Sergeants First Class (E7),
• use multiple-choice items, including situation-based items, and
• assess the content areas of basic Soldier skills, leadership, conducting training, Army history, and Army values.

ARMY TESTING PROGRAM ADVISORY TEAM

The Army's interest in a competency assessment program, specifically our research effort, and our need for direct experience and guidance prompted the researchers to form an Army Testing Program Advisory Team (ATPAT). The team consists of 24 senior NCOs representing 11 commands from the Active Army as well as commands from the Army Reserve and Army National Guard. This group also acted as our primary source of needs analysis information and content subject matter experts. Originally conceived as a one-time council, the ATPAT, largely through its own initiative, has developed into a crucial feature and resource for the research project. It has provided guidance, assistance, review, and an up-to-date reflection of the operational implications of the current and future Army posture. The ATPAT now meets quarterly, and we anticipate that its role and importance will continue into the follow-on phases.

DCAP DEVELOPMENT

The DCAP was originally designed to be a small-scale trial of the assessment development and delivery systems for one MOS. The purpose was to identify currently available opportunities and potential current and future obstacles. Based on guidance from the SMA and the ATPAT, the DCAP has been reconceptualized as a prototype of an Army-wide test geared to Soldiers eligible for promotion into the NCO corps. DCAP development is a multi-step process.

First, we identified the test topics based on job analysis and objectives identified by the SMA and the ATPAT. Although the SMA identified general test topics, this stage presented some challenges. The first question we had to address was whether the assessment should solely address knowledge that Soldiers are expected to have acquired through training and doctrine, or whether it should also assess potential to perform at the next higher grade. After extensive discussion, the ATPAT members decided that the majority (51%) of the prototype assessment would cover basic Soldiering skills, Army history, and Army values that E4 Specialists and Corporals are required by training and doctrine to know. The remainder of the prototype assessment would cover Soldiering skills, leadership, and training required at the next grade level, but would be limited to the most essential tasks. This was a particularly important decision for the prototype assessment, because doctrinal training for leadership and training skills does not begin until the Soldier is promoted to Sergeant. We also had to decide the breadth of materials to be covered in the assessment. The ATPAT members decided to limit the resources to the key publications in each topic area, including the Soldier's Manual of Common Tasks (STP 21-1 and STP 21-24 SMCT), the Soldier's Guide (FM 7-21.13), the Noncommissioned Officer's Guide (FM 7-22.7), Army Leadership: Be, Know, Do (FM 22-100), Training the Force (FM 7-0), and Battle Focused Training (FM 7-1).

In the second step, we developed the test blueprint. We began by generating a list of the possible topic tasks and skills to be included, based on the field manuals. We presented this list to the ATPAT members who, through multiple iterations, determined the relative importance of the topic tasks, skills, and knowledge areas. They also assigned percentages to each area; these percentages determine the number of items to be written for each topic, as sketched below.
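To make the blueprint-to-item-count step concrete, here is a minimal sketch that uses largest-remainder rounding so the counts sum to the intended test length. The topic names, percentages, and test length are invented placeholders, not the actual DCAP blueprint.

```python
# Hypothetical blueprint: topic -> percentage of the test (sums to 100).
blueprint = {"basic Soldier skills": 30, "Army values": 12, "Army history": 9,
             "leadership": 20, "conducting training": 17, "situational judgment": 12}
test_length = 150  # illustrative total item count

def allocate_items(blueprint, n_items):
    # Largest-remainder rounding: floor each topic's exact share, then give
    # the remaining items to the topics with the largest fractional parts.
    exact = {t: pct * n_items / 100 for t, pct in blueprint.items()}
    counts = {t: int(share) for t, share in exact.items()}
    leftover = n_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts

print(allocate_items(blueprint, test_length))
```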

Next, we identified the format and test content specific to the selected prototype. The format was set by the SMA's guidance for a multiple-choice test. Although there was discussion of using hands-on assessment, either with specifically developed tests or by capitalizing on the currently used Common Task Test, these options were deemed impractical at this time because of financial and system constraints.

To address the SMA's requirement for situation-based items, we are including a situational judgment test (SJT). The 24-item SJT was developed to assess the knowledge, skills, and attributes involved in directing, monitoring, and supervising individual subordinates; training others; team leadership; concern for Soldiers' quality of life; cultural tolerance; motivating, leading, and supporting individual subordinates; relating to and supporting peers; and problem solving/decision making. The items consist of a brief problem scenario and four possible responses. Soldiers are asked to identify the most and least effective responses to the scenario. The SJT was validated against job performance for junior NCOs (Knapp et al., 2002; Knapp, McCloy, & Heffner, 2003).

Fourth, we are developing the test prototype. The items are being prepared using a software program compatible with internet distribution, which allows graphics to be used in a multiple-choice format. The prototype will be reviewed by groups of NCOs who develop or teach primary leader development courses and basic Soldiering skills for NCOs. It will also be reviewed by the ATPAT members before completion at the end of the calendar year.

Finally, we will prepare for self-assessment. The ATPAT recommended that the competency assessment not stand alone, but be supported by a self-assessment as well. This self-assessment is not intended to be a pre-test; rather, it is designed to provide opportunities for learning and development. Although this portion of the project is still in the planning stages, we expect the self-assessment items to be similar to the DCAP items, but each item will provide feedback on the correct response, the logic behind that response, and resources for learning more about the topic addressed by the item.

DCAP ADMINISTRATION

Phase II of the project will unfold over the next 15 months, starting in January 2004. The SMA has required that the DCAP administration be internet-based. The Army has an extensive network of distributed learning facilities that we plan to incorporate into our Phase II activities. Using the existing Army Training Support Center's secure network, a portal will be established to allow Soldiers to register with the system and take the assessment. The assessment administration will be proctored to ensure test security. Our goal is to administer the DCAP to about 600 to 1,000 Soldiers worldwide, including a sizable sample of Reserve Component and Army National Guard Soldiers.

LESSONS LEARNED

• The ATPAT has been a highly successful venture and contributes significantly to the progress and success of the research effort. Concerted efforts should be made to include an advisory panel of varied personnel to advise and guide continued research efforts.
• Early and consistent involvement of the Reserve Components in decisions is essential to the success of the program, because almost 60% of the Army force structure is in the Reserve Components.
• Administrative and policy issues, ranging from study materials to test reporting, are as important to the program as test development issues, if not more so.
• Many critical and central issues are yet to be addressed. Examples include the designation of Army infrastructure and organizational testing entities, long-term needs, and the means for sustaining and maintaining a viable test development and administration program.
• Enthusiasm and support for a competency testing program remain high within the Army and within the NCO corps.

REFERENCES

Campbell, R. C. (1994). The Army Skill Qualification Test (SQT) program: A synopsis (Interim Report IR-PRD-94-05). Alexandria, VA: Human Resources Research Organization.

Department of the Army (2003). Army Leadership: Be, Know, Do (Field Manual 22-100). Washington, DC: Author.

Department of the Army (2003). Battle Focused Training (Field Manual 7-1). Washington, DC: Author.

Department of the Army (2003). Noncommissioned Officer's Guide (Field Manual 7-22.7). Washington, DC: Author.

Department of the Army (2003). The Soldier's Guide (Draft) (Field Manual 7-21.13). Ft. Bliss, TX: Author.

Department of the Army (2003). Soldier's Manual of Common Tasks (Soldier Training Publication 21-1). Washington, DC: Author.

Department of the Army (2003). Soldier's Manual of Common Tasks (Soldier Training Publication 21-24 SMCT). Washington, DC: Author.

Department of the Army (2003). Training the Force (Field Manual 7-0). Washington, DC: Author.

Knapp, D. J., Burnfield, J. L., Sager, C. E., Waugh, G. W., Campbell, J. P., Reeve, C. L., Campbell, R. C., White, L. A., & Heffner, T. S. (2002). Development of predictor and criterion measures for the NCO21 research program (Technical Report 1128). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., McCloy, R., & Heffner, T. S. (2003). Validation of measures designed to maximize 21st-century Army NCO performance (Contractor Report). Alexandria, VA: Human Resources Research Organization.

1ST WATCH: ASSESSMENT OF COPING STRATEGIES EMPLOYED BY NEW SAILORS

Marian E. Lane, M.S.
Jacqueline A. Mottern, Ph.D.
Michael A. White, Ph.D.
Marta E. Brown, M.S.
Erica M. Boyce

U.S. Navy Personnel Research, Studies, and Technology, PERS-13
Millington, TN 38055-1300
marian.lane@persnet.navy.mil

This paper examines the relationship between coping strategies and the success of new Sailors during the first term of enlistment. Surveys were administered to Navy recruits (N = 47,708) upon entry into Recruit Training Command, graduation from Recruit Training Command, graduation from "A"/Apprentice School, and/or exit from the Navy. These surveys included demographic items; a 32-item stress coping scale adapted from the Ways of Coping Checklist (WCCL; Vitaliano, Russo, Carr, Maiuro, & Becker, 1985), designed to assess coping strategies employed by people when faced with stressful situations; and a 24-item reasons-for-joining scale designed to assess the influence of various factors on the decision to join the Navy. Data were analyzed to determine the reliability and factor structure of the stress coping scale, as well as to examine possible relationships between the type of coping strategy utilized and other factors, including outcomes and demographic characteristics. Results indicate that recruits who successfully completed training were more likely to utilize different types of coping strategies than recruits who exited the Navy prior to the completion of training. Coping strategies were significantly related to type of education credential and weakly related to reasons for joining the Navy. Results also indicate gender differences in the frequency of use of various types of coping strategies, but these differences were not related to attrition rates within gender. The results suggest that an assessment of coping strategies may be useful for recruiting and selection purposes, as well as in preparation for and during military training.

BACKGROUND

Overall, the 1st Watch project is based on the idea that increasing the Navy's knowledge base regarding the factors that contribute to Sailors' success during the first term of enlistment, as well as subsequent terms for those who choose to make the Navy a career, will assist in retaining qualified Sailors. The 1st Watch project began an investigation into factors related to outcomes such as performance, satisfaction, morale, and stress at various points during this first term. This paper focuses on the last of these outcomes, stress, and the ways new recruits handle the potentially stressful situations they inevitably encounter.


Coping Strategies

Lazarus and Folkman (1984) define coping as "constantly changing cognitive and behavioral efforts to manage specific external and/or internal demands that are appraised as taxing or exceeding the resources of the person" (p. 141). These specific demands often produce anxiety about a given situation and, if not dealt with effectively and in a timely manner through some coping strategy, can build into stress that may either enhance or diminish performance in the situation. For instance, Edwards and Trimble (1992) examined the relationship between anxiety and performance, as influenced by the coping strategy employed in the situation, and found that task-oriented coping responses (such as problem focus) were positively related to performance on a task, while emotion-oriented coping responses (such as avoidance and distancing) were negatively related to performance on the task. Because coping has been related to such outcomes, an examination of this factor should be of great interest to the military as it competes with the civilian workplace to attract, recruit, hire, and retain the most qualified individuals.

The beginning of a Navy career may be, and most likely is, a very stressful time for most recruits. Many of them are leaving home for the first time to travel far from family and friends to a new environment in which they know no one and have limited knowledge of what is about to happen in their lives. An assessment of the ways in which these recruits handle the stressful situations they face at the beginning of their Navy careers may shed some light on how likely they are to make the Navy a life-long career. The coping strategies used to handle potentially stressful, anxiety-producing situations may vary both across and within individuals and/or groups of individuals, and knowledge of these differences may provide valuable information regarding the choices that these recruits make concerning their military careers.

As the demands of recruit training at Recruit Training Command (RTC) introduce novel situations and tasks for new recruits to face each day, they must either figure out ways to adapt to and deal with these situations or face the prospect of ending their careers early and returning to their civilian lives. More often than not, the latter choice is likely beneficial for neither the Navy nor the exiting recruits. Therefore, a priori information about how recruits cope with the challenges they face may be helpful in designing and implementing ways to prevent unwanted attrition from training due to situational stress and anxiety.

METHOD

Sample

The sample for the current study was composed of new recruits (N = 47,708) who had recently joined the Navy and were embarking upon initial recruit training at the Great Lakes Naval Training Center, Great Lakes, IL, from the beginning of data collection in April 2002 to August 2003.

Survey

New Sailor Survey. The first of four questionnaires administered during the course of the first term of enlistment is the New Sailor Survey, which is composed of questions designed to assess individuals' personal values, their experiences with recruiting and classification, their reasons for joining the Navy, their expectations of training, their stress coping skills, their fit with the Navy, and their demographic information.

Measure

Stress Coping Scale. In addition to finding themselves in a completely novel situation, new recruits are faced with additional potentially stressful daily events within this unfamiliar context, such as daily physical fitness exercises and rigorous inspections conducted by their Recruit Division Commanders (RDCs). These events likely add to the stress already building within these recruits. As mentioned previously, a portion of the New Sailor Survey is devoted to the assessment of stress coping skills and techniques employed by the recruits. The scale consists of 32 items adapted from the revised Ways of Coping Checklist (WCCL) developed by Vitaliano, Russo, Carr, Maiuro, and Becker (1985). It was hypothesized that the use of certain types of coping skills would be more closely related to successful completion of recruit training than the use of other types, and that identification of the skills most closely related to training success could inform future recruit training efforts.

Procedure

Data for this study were collected from recruits as they traveled by bus from O'Hare airport in Chicago to RTC in Great Lakes. These recruits were given the New Sailor Survey, which requires approximately 45 minutes to complete (the duration of the trip from O'Hare to RTC). Upon completion and arrival at Great Lakes, the questionnaires were collected by a Navy Petty Officer and periodically shipped to our data processing center, where they were electronically scanned in preparation for data analysis.

RESULTS

Factor structure

Data were analyzed to determine the factor structure of the stress coping strategies measure. The revised measure administered by Vitaliano et al. (1985) consisted of 42 items and factored into five distinct subscales, corresponding to five different styles of coping with stressful situations: Problem-focused, Blamed self, Wishful thinking, Seeks social support, and Avoidance. At pre-test, this scale factored into the same five factors found by Vitaliano et al., but 10 of the 42 items were eliminated due to low factor loadings scattered across the five primary factors. The remaining 32 items were included on the New Sailor Survey as the Sailor Stress Coping Scale.

The principal components analysis resulted in five factors with eigenvalues (λ) greater than 1, which corresponded to the five factors indicated by Vitaliano et al. Factor loadings ranged from .24 to .86. Two items from the Seeks social support subscale and one item from the Avoidance subscale had slightly higher loadings on other factors, but had greater theoretical and practical significance as interpreted on the original scales. Therefore, the subscales indicated by Vitaliano et al. and observed in the pre-test phase of the current project were retained for subsequent analyses.
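For readers unfamiliar with this retention rule, the sketch below illustrates extracting principal components from an item correlation matrix and counting those with eigenvalues above 1 (the Kaiser criterion). It runs on random placeholder data, not the survey responses.

```python
# Kaiser criterion sketch: how many principal components of a 32-item
# correlation matrix have eigenvalue > 1? Random data stand in for the
# Sailor Stress Coping Scale responses.
import numpy as np

rng = np.random.default_rng(0)
items = rng.normal(size=(500, 32))              # 500 respondents x 32 items

corr = np.corrcoef(items, rowvar=False)         # 32 x 32 correlation matrix
eigenvalues = np.linalg.eigvalsh(corr)[::-1]    # sorted largest first

n_components = int((eigenvalues > 1).sum())
print(n_components, eigenvalues[:5].round(2))
```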


Reliability analysis

Internal consistency reliability analyses indicated moderate to high reliabilities for all five subscales (Problem-focused, α = .88; Blamed self, α = .82; Wishful thinking, α = .86; Seeks social support, α = .62; and Avoidance, α = .83).
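These coefficients are Cronbach's alphas; for reference, here is a minimal sketch of the computation on placeholder item scores (not the survey data):

```python
# Cronbach's alpha for one subscale:
#   alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
import numpy as np

rng = np.random.default_rng(1)
subscale = rng.integers(1, 6, size=(500, 8)).astype(float)  # 8 items, 1-5 scale

def cronbach_alpha(items):
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

print(round(cronbach_alpha(subscale), 2))  # near 0 for uncorrelated random data
```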

Graduates vs. attrites

A disposition report including the outcome (graduate or attrite prior to graduation) for the recruits in the sample was obtained from RTC at Great Lakes. Results of a multivariate analysis of variance indicated a significant difference between these groups on the set of variables [F(5, 23164) = 16.947, Wilks' λ = .996, p < .001], the set being the coping strategies assessed by the Ways of Coping Checklist as modified for this study. Further examination revealed differences between these groups on the Avoidance and Wishful thinking subscales, such that recruits who attrited prior to graduation were significantly more likely to use these types of coping strategies than those who successfully completed recruit training.
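The sketch below reproduces the shape of this analysis, a one-way MANOVA of the five coping subscale scores on outcome group, using simulated placeholder data rather than the RTC disposition data; statsmodels' MANOVA reports Wilks' lambda among its test statistics.

```python
# One-way MANOVA sketch: five coping subscale scores by outcome group.
# Simulated data only; the F, lambda, and N reported above are from the paper.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame(rng.normal(size=(n, 5)),
                  columns=["problem", "blame", "wishful", "support", "avoid"])
df["outcome"] = rng.choice(["graduate", "attrite"], size=n)
# Build in an effect like the one reported: attrites score higher on the
# Avoidance and Wishful thinking subscales.
df.loc[df["outcome"] == "attrite", ["avoid", "wishful"]] += 0.4

manova = MANOVA.from_formula(
    "problem + blame + wishful + support + avoid ~ outcome", data=df)
print(manova.mv_test())  # includes Wilks' lambda for the outcome effect
```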

Demographic characteristics

Gender. Results of a multivariate analysis of variance indicated a significant difference between males and females on the set of variables comprising the stress coping scales used in this study [F(5, 40574) = 192.329, Wilks' λ = .977, p < .001]. Further examination revealed differences between males and females on all coping subscales except Problem-focused, such that males were significantly more likely than females to use the Avoidance, Wishful thinking, and Blamed self strategies, and females were significantly more likely than males to use the Seeks social support strategy. However, these differences did not appear to be related to attrition rates between male and female recruits.

Reasons for joining. Few strong relationships were observed between reason for joining the Navy and type of coping strategy employed by these recruits. The strongest relationships were between the Problem-focused strategy subscale and such reasons for joining as 'wanted to test myself in a demanding situation', 'challenging or interesting work', 'desire to serve my country', and 'personal growth'. Weaker relationships were observed between the Avoidance and Wishful thinking strategy subscales and reasons such as 'get away from family or personal situations', 'get away from hometown', and 'time to figure out what I want to do'. The Seeks social support strategy subscale was related, though not as strongly, to the same reasons for joining as the Problem-focused strategy, as well as to reasons relating to benefits and skills to be acquired in specific occupations.

Type of education credential. Results of a multivariate analysis of variance indicated a significant difference among types of education credential earned on this set of variables [F(45, 177264) = 7.292, Wilks' λ = .992, p < .001]. Further examination revealed differences on all coping subscales. All groups reported being most likely to use the Problem-focused strategy and least likely to use the Avoidance strategy. Within each type of strategy, though, the groups (by type of education credential earned) differed in how likely they were to use the strategy. Within the Problem-focused strategy, those earning a diploma from an adult school reported being most likely to use this strategy, while those receiving a diploma issued by parents or tutors for home schooling reported being least likely to use it. Within the Avoidance strategy, those receiving a diploma from a vocational or technical school reported being most likely to use this strategy, while those receiving a diploma issued by parents or tutors for home schooling reported being least likely to use it. Within the Wishful thinking strategy, those receiving a diploma from a correspondence school reported being most likely to use this strategy, while those receiving a diploma issued by parents or tutors for home schooling reported being least likely to use it. Within the Blamed self strategy, those earning college credit to turn a GED into a diploma reported being most likely to use this strategy, while those receiving a diploma issued by an association or other school for home schooling reported being least likely to use it. Within the Seeks social support strategy, those earning a diploma from a correspondence school reported being most likely to use this strategy, while those receiving a diploma issued by parents or tutors for home schooling reported being least likely to use it.

DISCUSSION

The 1st Watch project already has provided much information regarding new recruits as they enter their first term of enlistment. The exploration of measures not incorporated in previous research has contributed significantly to the existing body of knowledge regarding the factors that influence success at the beginning of this critical first term. Examination of the strategies employed by these recruits for coping with stressful situations reveals possible applications to future recruit training to enhance the chances of success during this critical first term.

The results of the current study reveal that the measure of coping and its factor structure are reproducible in this context from the original work, and reliability analyses indicate that the items composing the subscales hold together adequately to support measurement of the intended constructs. The difference found between graduates and those who attrite prior to graduation on two of the coping strategy subscales, Avoidance and Wishful thinking, makes some intuitive sense. Vitaliano et al. (1985) found these strategies to be related to negative outcomes for the individual, including depression and anxiety; therefore, these strategies may be perceived as more 'negative' or 'unproductive' approaches to coping with stressful situations as compared to other, more positive, productive strategies, such as Problem-focused and Seeks social support. The nonsignificant findings between these groups on the other strategy subscales may suggest that other factors related to coping strategies, and not merely the likelihood of using a specific type of strategy across situations, affect the probability of success of new recruits. Future research may explore possibilities for these related factors, such as coping adaptability, in addition to coping strategies themselves.

Results also indicate that although males and females report differences in the frequency of use of some types of coping strategies, such that males are more likely than females to use the more 'negative' coping strategies of Avoidance and Wishful thinking, these differences did not appear to be related to attrition rates in either group, and differences did not emerge on any other strategies. Significant but weak relationships were observed between types of coping strategy employed and reasons for joining the Navy, indicating that more positive, productive coping strategies such as Problem-focused and Seeks social support are related to more constructive reasons for joining such as 'wanted to test myself in a demanding situation', 'challenging or interesting work', 'desire to serve my country', and 'personal growth', while more 'unproductive' strategies such as Avoidance and Wishful thinking are related to more escapist reasons such as 'get away from family or personal situations', 'get away from hometown', and 'time to figure out what I want to do'. Although significant, these relationships were relatively weak, and further examination of the relationship between the use of coping strategies and reasons for joining the Navy, and any potentially related factors, should be conducted prior to basing recruiting and/or selection decisions on these results.

Recruits who earned different types of education credentials differed on the types of coping strategies used more often, although all groups reported using the Problem-focused strategy most often and the Avoidance strategy least often. The group of recruits who received a diploma from parents or tutors for home schooling appeared least likely among the groups to use any one type of coping strategy most often, which may indicate that these individuals are more likely to use a variety of coping strategies at different times, depending on the situation at hand.

The results of the current study shed new light on the construct of coping strategies and their relationship to the success of new recruits during initial recruit training and throughout the first term of enlistment. These results lend support to the hypothesis that the employment of different coping strategies may be important to this success. They also indicate that there may be additional factors related to the use of coping strategies that will further contribute to knowledge of the most salient issues bearing on the success of new recruits in the initial stages of training, as well as throughout the first term of enlistment.

References

Edwards, J. M., & Trimble, K. (1992). Anxiety, coping and academic performance. Anxiety, Stress, and Coping, 5, 337-350.

Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. New York: Springer.

Mottern, J. A., White, M. A., & Alderton, D. L. (2002). 1st Watch on the first term of enlistment. Paper presented at the 44th Annual Conference of the International Military Testing Association, Ottawa, Canada.

Vitaliano, P. P., Russo, J., Carr, J. E., Maiuro, R. D., & Becker, J. (1985). The Ways of Coping Checklist: Revision and psychometric properties. Multivariate Behavioral Research, 20, 3-26.

1st WATCH: THE NAVY FIT SCALE

Marta E. Brown, M.S.
Jacqueline A. Mottern, Ph.D.
Michael A. White, Ph.D.
Marian E. Lane, M.S.
Erica M. Boyce

U.S. Navy Personnel Research, Studies, and Technology, PERS-13
Millington, TN 38055-1300
marta.brown@navy.mil

The purpose of this study was to extend existing research by determining the relationship between person-organization (P-O) fit and attitudinal and behavioral outcomes in the Navy setting. This study utilized data from a longitudinal study designed to examine Navy recruits (N = 47,708) during their first term of enlistment. Surveys were administered upon graduation from Recruit Training Command, graduation from "A"/Apprentice School, and/or exit from the Navy. The surveys included demographic characteristics; the Navy Fit Scale, an 8-item measure of perceived P-O fit designed for use with Navy personnel; the Navy Commitment Scale, a 14-item scale of commitment to the Navy; and additional outcome variables including morale, Navy expectations, level of stress, and attrition. P-O fit was found to have a moderate positive relationship with morale and organizational commitment. In addition, P-O fit had a weak positive relationship with Navy expectations and a weak negative relationship with stress and attrition. The impact of setting on P-O fit and the implications for generalization are discussed. The results support the importance of understanding the association between perceived fit with the Navy and career outcomes.

BACKGROUND

This study utilized data from the "1st Watch on the First Term of Enlistment" study, which was designed to examine Navy recruits during their first term of enlistment (Mottern, White, & Alderton, 2002). The main goals of the 1st Watch project were to gather information pertaining to the career progress of Sailors during the first term of enlistment and to use that information to help develop highly qualified and well-prepared Sailors in the future. To accomplish these goals, the model developed for the project employed person-organization (P-O) fit theory modified for use in the Navy setting.

P-O fit is commonly defined as "the compatibility between people and organizations that occurs when: (a) at least one entity provides what the other needs, or (b) they share similar fundamental characteristics, or (c) both" (Kristof, 1996). In the current literature, P-O fit is frequently operationalized as the congruence between individual and organizational values, and research has focused on workers in civilian occupations. Chatman (1991) argued for value congruence as a measure of person-organization fit because values are fundamental and relatively enduring, and because individual and organizational values can be directly compared. An indirect method to assess objective fit was developed for use in this study, which involves comparing individual ratings of personal characteristics with a profile of Navy values. This research extends the P-O fit literature by examining the controlled environment of military training and military life.

Individuals develop a sense of fit during their career within an organization, which impacts their attitudes, decisions, and behavior. Cable and DeRue (2002) explain that when P-O fit exists, there is a match between the employees' values and the organization's values. This fosters a sense of involvement and creates a strong bond, which results in greater identification with the organization, a positive perception of organizational support, and the decision to stay in the organization. This is congruent with Gade (2003), who described commitment in terms of service members as "a person who is strongly attached to his or her military service as an organization and to his or her unit as a part of that organization." Although several studies have found that P-O fit is related to job satisfaction (e.g., Kristof-Brown, Jansen, & Colbert, 2002) and organizational commitment (e.g., O'Reilly, Chatman, & Caldwell, 1991), there is less evidence for the relationship between P-O fit and morale and accuracy of expectations. Thus, in this study the following hypothesis is proposed:

Hypothesis 1: Person-organization fit is positively related to morale, organizational commitment, and expectations.

When people do not fit with their environment, they experience more negative affect, such as feelings of incompetence and anxiety (Chatman, 1991). Past research indicates that P-O fit is related to perceived stress (e.g., Lovelace & Rosen, 1996) and attrition (e.g., Saks & Ashforth, 1997). The following hypotheses are based on these findings and are intended to extend the research by improving generalizability:

Hypothesis 2: Person-organization fit is negatively related to stress and attrition.

Hypothesis 3: Recruits who graduate will have higher person-organization fit than those who attrite.

METHOD

Upon entering the Navy, recruits complete 8 weeks of training with a training division at Recruit Training Command (RTC), Great Lakes Naval Training Center. Once the training requirements are completed, trainees graduate from RTC and transition to the next phase of training. The graduates either continue with advanced training at an "A"/Advanced School or attend a 2-3 week Apprentice School. Questionnaires were administered to trainees at these two milestones (RTC Grad Survey and "A"/Apprentice School Survey) and, in the case of attrition, upon separation from the Navy (Exit Survey). Data collection began in April 2002 and concluded in August 2003.

Sample

Navy recruits in training at the Great Lakes Naval Training Center were tracked from the beginning of Recruit Training Command to graduation from "A"/Apprentice School (N = 47,708).

Surveys

RTC Grad Survey. This questionnaire was administered by Petty Officers to trainees (N = 31,331) who were identified for graduation from Recruit Training Command. The questionnaire included the Navy Fit Scale, the Navy Commitment Scale, morale, Navy expectations, and level of stress. Eighty-three percent of respondents were men.

"A"/Apprentice School Survey. This questionnaire was distributed by a student class leader to trainees who were identified for graduation from "A"/Apprentice School (N = 9,323). The Navy Fit Scale, the Navy Commitment Scale, morale, Navy expectations, and level of stress were included in the questionnaire. Seventy-nine percent of respondents were men.

Exit Survey. This questionnaire was administered at the separation barracks to all trainees who attrited during training (N = 2,592). The questionnaire contained the Navy Fit Scale, morale, Navy expectations, and level of stress. Eighty-three percent of respondents were men.

Measures

Navy Fit Scale. An 8-item measure of perceived person-organization fit designed for use in the Navy setting. Items were developed from the Navy's Evaluation Report & Counseling Record (E1-E6) to represent each of the domains on which Sailors are rated annually. These domains are personal characteristics that together form a profile of Navy values. Respondents were instructed to respond as to how their Recruit Division Commander would rate them compared to other Sailors on the set of personal characteristics, using a 5-point Likert scale ("Far better than the average recruit" to "Far worse than the average recruit"). This indirect method assesses objective fit by comparing individual ratings with the Navy profile. The reliability of the scale was α = .88 (RTC), α = .90 ("A"/Apprentice School), and α = .93 (Exit).

Navy Commitment Scale. Organizational commitment was measured using adapted items from Meyer and Allen's (1987) scale and additional items specific to the Navy. The reliability of the 14-item scale was α = .85 (RTC) and α = .88 ("A"/Apprentice School).

Outcome variables. Other variables, including morale, Navy expectations, and level of stress during the recent training period, were measured by specific questions on the surveys. Participants rated these items on Likert-type scales. Attrition was collected from administrative records.

Demographic variables. Demographic characteristics including gender, current paygrade, and highest education level achieved were also collected.

RESULTS

Tables 1, 2, and 3 present the means, standard deviations, and correlations for the study variables on each respective survey. Hypothesis 1 predicted that person-organization fit would be positively related to morale, organizational commitment, and accuracy of expectations. The moderate positive correlations between P-O fit and commitment on the two questionnaires that included the Commitment Scale indicate that a high degree of similarity between individual characteristics and the Navy's desired personal characteristics is positively related to a strong attachment to the military (r_RTC = .28 and r_AS = .35). Also, the moderate positive correlations between P-O fit and morale point to the relationship between a high level of P-O fit and positive psychological well-being (r_RTC = .29, r_AS = .33, r_Exit = .43). Finally, the hypothesized relationship between P-O fit and accuracy of expectations was supported by weak to moderate positive correlations (r_RTC = .13, r_AS = .21, r_Exit = .33). This relationship between reporting that expectations for training were correct and having a high degree of fit with the Navy warrants further investigation.

Hypothesis 2 predicted that person-organization fit is negatively related to stress and attrition. The small to moderate negative correlations between P-O fit and perceived stress on all three questionnaires support this hypothesis (r_RTC = -.15, r_AS = -.15, r_Exit = -.38). However, the strength of this relationship may have been diminished at RTC and "A"/Apprentice School by the general stress experienced by the majority of trainees, which is a standard component of the training design. P-O fit also had a weak negative relationship with attrition (r = -.19), where the turnover decision was coded as 1 = graduate (RTC, "A"/Apprentice School, or both) and 2 = exit. This correlation may not be an accurate representation of the relationship due to the variety of reasons for which trainees attrite from the Navy (e.g., existing medical problems or injury that occurred during training).
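For readers reproducing correlations like those in Tables 1-3, the sketch below shows the computation in Python; the file and column names are assumptions, not the study's materials. Because attrition is dichotomous (1 = graduate, 2 = exit), the Pearson formula applied to that coding yields the point-biserial correlation.

    import pandas as pd
    from scipy import stats

    # Hypothetical layout: one row per respondent, complete cases assumed.
    rtc = pd.read_csv("rtc_grad.csv")

    for outcome in ["commitment", "morale", "expectations", "stress"]:
        r, p = stats.pearsonr(rtc["po_fit"], rtc[outcome])
        print(f"P-O fit vs {outcome}: r = {r:.2f}, p = {p:.3g}")

    # Point-biserial correlation with the turnover decision (coded 1 or 2).
    r, p = stats.pearsonr(rtc["po_fit"], rtc["turnover"])
    print(f"P-O fit vs attrition: r = {r:.2f}")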

Table 1
Means, Standard Deviations, and Correlations for RTC Grad Variables

Variable                     M      SD      1     2     3     4    5
1. Person-organization fit   31.70  4.71    —
2. Commitment                56.23  8.03    .28   —
3. Morale                     3.65  0.84    .29   .35   —
4. Navy expectations          3.43  1.03    .13   .38   .30   —
5. Perceived stress           3.11  1.03   -.15  -.16  -.16  -.20  —

Note. All correlations are significant at p < .001.

Table 2
Means, Standard Deviations, and Correlations for "A"/Apprentice School Variables

Variable                     M      SD      1     2     3     4    5
1. Person-organization fit   31.62  5.13    —
2. Commitment                48.76  8.60    .35   —
3. Morale                     3.64  0.89    .33   .40   —
4. Navy expectations          3.38  1.05    .21   .47   .40   —
5. Perceived stress           2.96  1.09   -.15  -.18  -.20  -.24  —

Note. All correlations are significant at p < .001.

Table 3
Means, Standard Deviations, and Correlations for Exit Variables

Variable                     M      SD      1     2     3    4
1. Person-organization fit   27.92  6.80    —
2. Morale                     3.12  1.13    .43   —
3. Navy expectations          2.74  1.22    .33   .44   —
4. Perceived stress           3.77  1.15   -.38  -.37  -.40  —

Note. All correlations are significant at p < .001.

Hypothesis 3 stated that recruits who graduate will have higher P-O fit than those who attrite. To test this hypothesis, an independent samples t-test was conducted comparing graduates of either RTC or "A"/Apprentice School to those who attrited. If an individual completed both RTC and "A"/Apprentice School, the P-O fit score was the average of the two scores. The results of this analysis indicate that graduates have significantly higher P-O fit than those who attrite, t(35889) = 35.88, p < .001.
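A minimal sketch of this comparison, assuming a data frame with one averaged fit score per trainee and a graduate/attrite status column (both names illustrative):

    import pandas as pd
    from scipy import stats

    df = pd.read_csv("fit_scores.csv")  # hypothetical file

    grads = df.loc[df["status"] == "graduate", "po_fit"].dropna()
    attrites = df.loc[df["status"] == "attrite", "po_fit"].dropna()

    t, p = stats.ttest_ind(grads, attrites)  # independent samples t-test
    print(f"t({len(grads) + len(attrites) - 2}) = {t:.2f}, p = {p:.3g}")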

To further investigate the construct of P-O fit, variation in P-O fit by demographic information was analyzed. An analysis of gender differences on P-O fit scores indicated that males had significantly higher P-O fit scores than females for RTC graduates [t(28778) = 19.44, p < .001], "A"/Apprentice School graduates [t(8618) = 9.56, p < .001], and attrites [t(2294) = 3.48, p < .01]. Three one-way analyses of variance (ANOVA) with Bonferroni post hoc tests were used to investigate P-O fit at current levels of paygrade (E1, E2, and E3). The results indicated that trainees at higher paygrades had significantly higher P-O fit scores for RTC graduates [F(2, 28759) = 329.07, p < .001], "A"/Apprentice School graduates [F(2, 8112) = 45.03, p < .001], and attrites [F(2, 2288) = 9.90, p < .001].
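The omnibus test and the Bonferroni post hoc comparisons could be reproduced along the following lines; the file and column names are assumptions for illustration. The Bonferroni step simply divides the nominal α of .05 by the number of pairwise comparisons, three for paygrades E1-E3.

    from itertools import combinations

    import pandas as pd
    from scipy import stats

    df = pd.read_csv("rtc_grad.csv")  # hypothetical: one of the three groups
    groups = {pg: g["po_fit"].dropna() for pg, g in df.groupby("paygrade")}

    # Omnibus one-way ANOVA across the E1, E2, and E3 paygrades.
    F, p = stats.f_oneway(*groups.values())
    print(f"F = {F:.2f}, p = {p:.3g}")

    # Bonferroni post hoc: pairwise t-tests at a corrected alpha.
    pairs = list(combinations(groups, 2))
    alpha = 0.05 / len(pairs)
    for a, b in pairs:
        t, p = stats.ttest_ind(groups[a], groups[b])
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.3g}, significant: {p < alpha}")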

Three additional ANOVAs with Bonferroni post hoc tests were used to investigate P-O fit and highest level of education achieved (10th grade or less, 11th grade, 12th grade, one or more years of college or technical school, and Bachelor's degree). The results indicated that for RTC graduates, those trainees with one or more years of college or technical school and those with a Bachelor's degree had significantly higher P-O fit than all lower levels of education, F(4, 20130) = 53.69, p < .001. For "A"/Apprentice School graduates, those trainees with one or more years of college or technical school and those with a Bachelor's degree had significantly higher P-O fit than those who had completed 10th grade and those who had completed 12th grade, F(4, 5100) = 17.46, p < .001. Finally, there were no significant differences between levels of education on P-O fit for attrites.

DISCUSSION

This study investigated the relationship between the similarity of individuals' self-perceptions of personal characteristics and the Navy's desired personal characteristics (P-O fit) and attitudinal and behavioral outcomes in the Navy setting. The results suggest that a high degree of similarity between individual ratings of personal characteristics and a profile of Navy values was related to a strong attachment to the military, positive psychological well-being, correct training expectations, lower perceived stress, and retention. Research aimed at better understanding this relationship between perceived fit with the Navy and career outcomes can be applied in the development of highly qualified and well-prepared Sailors. The differences found pertaining to demographic characteristics highlight specific segments of the population that could benefit from further investigation.


REFERENCES

Bretz, R. D., Jr., & Judge, T. A. (1994). Person-organization fit and the theory of work adjustment: Implications for satisfaction, tenure, and career success. Journal of Vocational Behavior, 44(1), 32-54.

Cable, D. M., & DeRue, D. S. (2002). The convergent and discriminant validity of subjective fit perceptions. Journal of Applied Psychology, 87(5), 875-884.

Chatman, J. (1991). Matching people and organizations: Selection and socialization in public accounting firms. Administrative Science Quarterly, 36, 459-484.

Gade, P. A. (2003). Organizational commitment in the military: An overview. Military Psychology, 15(3), 163-166.

Kristof, A. L. (1996). Person-organization fit: An integrative review of its conceptualizations, measurement, and implications. Personnel Psychology, 49(1), 1-50.

Kristof-Brown, A. L., Jansen, K. J., & Colbert, A. E. (2002). A policy-capturing study of the simultaneous effects of fit with jobs, groups, and organizations. Journal of Applied Psychology, 87(5), 985-993.

Lovelace, K., & Rosen, B. (1996). Differences in achieving person-organization fit among diverse groups of managers. Journal of Management, 22(5), 703-723.

Meyer, J. P., & Allen, N. J. (1987). Organizational commitment: Toward a three-component model (Research Bulletin No. 660). London, Ontario: The University of Western Ontario, Department of Psychology.

Mottern, J. A., White, M. A., & Alderton, D. L. (2002, October). 1st Watch on the first term of enlistment. Paper presented at the International Military Testing Association 44th Annual Conference, Ottawa, Ontario.

O'Reilly, C. A., III, Chatman, J., & Caldwell, D. F. (1991). People and organizational culture: A profile comparison approach to assessing person-organization fit. Academy of Management Journal, 34(3), 487.

Saks, A. M., & Ashforth, B. E. (1997). A longitudinal investigation of the relationships between job information sources, applicant perceptions of fit, and work outcomes. Personnel Psychology, 50(2), 395-426.

USING RESULTS FROM ATTITUDE AND OPINION SURVEYS 1

Dr. Alma G. Steinberg and Dr. Susann Nourizadeh
U.S. Army Research Institute for the Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333-5600
steinberg@ari.army.mil

This paper addresses the utilization of attitude and opinion surveys from three perspectives: the respondents to the survey, the sponsors/proponents of the survey, and the researchers who conduct the survey. It shows how understanding the perspectives of each can maximize the potential use of survey results.

Background

Survey results can be used in many different ways. Some examples are:

(a) Provision of scientifically sound and timely information to decision-makers
(b) Monitoring of Soldier issues
(c) Conducting program or policy assessments
(d) Determination of the validity of anecdotal information or opinions
(e) Tracking of trends on a wide number and variety of issues
(f) Identification of emerging issues
(g) Assessment of the impact of unexpected events (by comparing to baselines)

In spite of the many ways survey results can be used, it is common for people to be skeptical about their use. The discussion below indicates some of the reasons for this.

Discussion

Respondents typically want to know the results of the surveys they take. Yet the results are usually not provided to them personally, often for very practical reasons. Results may be available in public forums (e.g., reports, newspaper articles, the Web) at some later date, but respondents may not be aware of them or may not associate them with the surveys they took. Respondents also want to know how the results will be and have been used. Yet, when the results are used, the user does not always specify the source. Further, the results are often used in conjunction with other sources of input and thus are not easily recognizable. As a result, respondents often make false assumptions. The first assumption is that most people responded just the way they did. The second is that, if they cannot identify an impact from the survey, the results have not been used.

The sponsors or proponents of the survey also want to know the results. In addition, they typically want to know the context for interpreting the results, such as: (a) comparisons with other populations/subpopulations of interest, (b) trends indicating changes over time, and (c) the reasons behind the responses. Finally, sponsors often want concrete recommendations to address the issues highlighted by the results.

Sponsors often mistakenly assume that since the survey was conducted for them, there is no compelling need to address respondent concerns about the results. Thus, they may severely limit distribution of all or part of the results. They also may not be aware of the importance of telling respondents that the results have been used and of providing examples of their use.

Finally, the researchers who conduct the survey want to know the results. Often, the researcher's interests include furthering the science. This may involve: (a) developing scales/approaches to measure constructs, (b) testing hypotheses, and (c) building theories and models. Researchers should not take it for granted that sponsors share that interest in furthering the science. Also, they should not assume that sponsors will want to know (or will understand) results presented in the language and format of scientific journals.

Conclusions

Respondents, sponsors, and researchers are all stakeholders in surveys. All want to know the survey results. However, each may have unique, need-driven expectations for how results are analyzed, reported, and utilized. To maximize utilization of results, researchers need to recognize these differing needs by tailoring analyses and report formats accordingly.

1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the U.S. Army Research Institute or the Department of the Army.

Using Survey and Interview Data: An Example 1

Dr. Susann Nourizadeh and Dr. Alma G. Steinberg
U.S. Army Research Institute for the Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333-5600
nourizadehs@ari.army.mil

Introduction

In general, surveys are useful for providing quantitative data from large samples across a wide geographical area. They allow the same questions to be asked of all respondents in the same way. Supplementing surveys with focus group interviews adds qualitative data that provide a context for interpreting survey results. Focus groups also facilitate an in-depth examination of issues that are difficult to examine with surveys (e.g., why individuals responded a certain way to the survey) and aid in finding solutions to problems. The purpose of this paper is to present an example that uses survey and interview data to address an applied concern related to mentoring.

Currently, the U.S. Army is examining issues related to mentoring, including its definition, its role in leader development, and ways to increase its occurrence. However, there is no shared understanding about the meaning of mentoring and how it should be implemented in the Army. In the literature, the mentoring relationship is described as a "developmental relationship between senior and junior individuals in organizations" (McManus & Russell, 1997, p. 145). Thus, mentors are considered to be individuals who are superior in both rank and experience (e.g., Bagnal, Pence, & Meriwether, 1985; McManus & Russell, 1997). In the U.S. Army, there is currently confusion over the differentiation between leadership and mentoring. The problem arises because similar behaviors appear to apply to both (e.g., teaching job skills, giving feedback on job performance, providing support and encouragement).

In addition to definitional concerns, the Army is looking at how mentoring can be implemented as part of leader development and whether the amount of mentoring can be increased by tapping additional sources of mentors. Since Soldiers often learn much from the first non-commissioned officers (NCOs) with whom they work (e.g., their platoon sergeant) and also from their peers, the Army decided to examine whether these two groups might be considered additional sources of mentors.

Approach

The survey data collection instrument was the Fall 2001 Sample Survey of Military Personnel (SSMP). The SSMP is a semi-annual omnibus survey conducted by the U.S. Army Research Institute. It is sent to an Army-wide random sample of Active component commissioned officers and enlisted personnel. The survey addressed whether Soldiers felt they ever had a mentor; who the mentor was (e.g., their rater, senior rater, someone else who was higher in rank other than the rater or senior rater, a peer, a subordinate); and the kind of behaviors their mentor exhibited.

The focus group interviews addressed why Soldiers perceived certain people, and not others, to be their mentors; the behaviors they were looking for from mentors; and the barriers that prevented individuals from being seen as mentors. In addition, the focus groups addressed participants' views on how to increase the amount of mentoring in the Army.

Results

The survey results from 2,802 Army officers and 4,022 enlisted Soldiers are shown in Table 1. The percentages in this table are for the set of respondents who said that they have a mentor now or have had one in the past. Of these, most (92% of the officers and 86% of the enlisted) reported that the mentor was someone higher in rank than them. For both officers and enlisted Soldiers, the mentor was most likely to be someone higher in rank who was not their rater or their senior rater. Also, as can be seen from the table, only 12% of officers and 9% of enlisted Soldiers reported that their senior rater was their mentor.

Very few officers and enlisted Soldiers reported that their mentor was a peer at their same rank (3% and 5%, respectively), and very few said that their mentor was a person lower in rank (3% of officers and less than 1% of enlisted Soldiers). Few officers and enlisted Soldiers (2% and 9%, respectively) said their mentor was not in the military at the time the mentoring was provided.

Table 2 shows the percentage of individuals who said their mentors exhibited various behaviors and that each of these mentoring behaviors was very/extremely helpful. The table also shows that when mentors who are higher in rank than the mentee (raters, senior raters, and others at higher ranks) exhibit some behaviors, these behaviors are somewhat more likely to be seen as helpful than when mentors who are lower in rank exhibit the same behaviors. Examples include teaching job skills, helping to develop skills for future assignments, providing support and encouragement, assigning challenging tasks, providing sponsorship/contacts to advance careers, and assisting in obtaining future assignments. In addition, senior raters are seen as more helpful than raters when they exhibit some behaviors, such as advising on organizational politics, providing personal and social guidance, providing sponsorship/contacts to advance careers, and assisting in obtaining future assignments.

Focus group participants strongly advocated that mentoring, as part of the leader development process, remain voluntary and not be mandated or assigned. In addition, Soldiers clarified issues surrounding the perceived overlap between leadership and mentoring. It appears that both share many behaviors in common; however, mentoring is seen as a more individualized, one-on-one relationship wherein mentors exhibit a broad range of mentoring behaviors (as opposed to a few selected ones).

Conclusions and Recommendations

This paper shows how survey and focus group methodologies complement one another and result in richer findings, thereby increasing the potential for utilization. From the survey it was found that a surprisingly low number of senior raters were seen as mentors, and the interviews revealed the obstacles that stood in the way. A major obstacle is Soldier reluctance to seek help from senior raters because of their role in determining the final performance rating. On the other hand, superiors not in the rating chain were seen as a good source of mentors since they are in a far less threatening role. The focus group interviews also helped to explain why so few survey respondents saw peers and subordinates as mentors. Although Soldiers do learn from peers and subordinates, they do not experience the full range of mentoring behaviors in their relationships with them.

The above findings led to the following recommendations, which were provided to the Army:

• Encourage mentoring, but keep it voluntary.

• Encourage senior raters to exhibit a wider range of mentoring behaviors.

• Make senior raters aware of the barriers to their being seen as mentors (i.e., limited contact, their role in providing the final performance appraisal rating). Highlight possible ways of overcoming these barriers.

• Encourage more non-raters who are senior in rank to mentor, and educate them on the importance of exhibiting a wider range of mentoring behaviors.

• Do not rely on increasing the number of mentors by encouraging peers and subordinates to mentor. This is not likely to increase mentoring because few of these individuals are typically viewed as mentors, and they are less likely to exhibit the whole range of mentoring behaviors.

Table 1
Percent of officers and enlisted Soldiers who said their mentor is/was: a

                                                          Officers   Enlisted Soldiers
Their rater                                                  35%            23%
Their senior rater                                           12%             9%
A person who is/was higher in rank than them,
  but not their rater or their senior rater                  45%            54%
A person who is/was at their same rank                        3%             5%
A person who is/was lower in rank than them                   3%            <1%
A person who is not or was not in the military at
  the time the mentoring was provided                         2%             9%

a These percentages are for the set of respondents who said they have a mentor now or have had one in the past.


Table 2
Percent of officers and enlisted Soldiers who said their mentors (who were senior, peer, lower ranked, or not in the Army) exhibited these behaviors and these behaviors were very/extremely helpful: a

                                                       Mentor's relative position to mentee
                                                              Higher rank,
                                                    Senior    not rater/     Same   Lower   Not in
Behavior                                     Rater  rater     senior rater   rank   rank    military
Demonstrates trust                            93%    91%          94%         93%    93%      87%
Gives feedback on your job performance        90%    90%          88%         88%    81%      73%
Acts as a role model                          89%    89%          92%         88%    90%      83%
Helps develop your skills/competencies
  for future assignments                      88%    90%          88%         90%    77%      81%
Assigns challenging tasks                     88%    90%          85%         81%    55%      78%
Provides support and encouragement            88%    87%          91%         93%    75%      87%
Instills Army values                          87%    90%          89%         88%    84%      67%
Provides career guidance                      86%    89%          89%         88%    77%      81%
Provides moral/ethical guidance               86%    88%          88%         91%    87%      83%
Teaches job skills                            84%    84%          85%         80%    76%      72%
Protects you                                  82%    85%          81%         75%    79%      79%
Invites you to observe activities at
  his/her level                               82%    78%          81%         87%    73%      79%
Teaches/advises on organizational politics    81%    87%          84%         80%    73%      74%
Provides personal and social guidance         77%    83%          85%         93%    74%      85%
Provides sponsorship/contacts to advance
  your career                                 75%    82%          78%         86%    65%      70%
Assists in obtaining future assignments       71%    80%          74%         74%    62%      72%

a These percentages are for the set of respondents who said they have a mentor now or have had one in the past.


References

Bagnal, C. W., Pence, E. C., & Meriwether, T. N. (1985). Leaders as mentors. Military Review, 65(7).

McManus, S. E., & Russell, J. E. A. (1997). New directions for mentoring research: An examination of related constructs. Journal of Vocational Behavior, 51.

1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the U.S. Army Research Institute or the Department of the Army.

UTILIZING SURVEY RESULTS OF THE NAVY EQUAL OPPORTUNITY/SEXUAL HARASSMENT SURVEY 47

Paul Rosenfeld, Ph.D., and Carol E. Newell, M.A.
Navy Personnel Research, Studies, and Technology Department
5720 Integrity Drive, Millington, TN, USA 38055
paul.rosenfeld@navy.mil  carol.newell@navy.mil

Commander Leanne Braddock
Navy Equal Opportunity Office
Navy Personnel Command
7736 Kittyhawk, Millington, TN, USA 38055
leanne.braddock@navy.mil

In 1988, the Chief of Naval Operations Study Group (CNO, 1988) conducted a wide-ranging assessment of equal opportunity (EO) issues in the Navy. The Study Group found that no Navy-wide instrument existed to accurately measure the EO climate of the Navy and tasked a "comprehensive Navy-wide biennial EO climate survey to indicate the extent and form of racial discrimination within the Navy" (p. 2-18). The previous year, the Progress of Women in the Navy Report (Secretary of the Navy, 1987) had similarly found that no instrument existed to determine the extent of sexual harassment (SH) in the Navy and recommended that a Navy-wide SH survey be conducted. These two recommendations for Navy-wide surveys were implemented in 1989, when the first Navy Equal Opportunity/Sexual Harassment (NEOSH) Survey was administered. Since 1989, the NEOSH Survey has been administered every other year, with the results briefed to senior Navy policymakers. The results of the NEOSH Surveys have provided Navy leaders with an accurate portrait of the state of EO, SH, and related issues such as racial/ethnic, religious, and gender discrimination.

Although the NEOSH Survey results are generally widely distributed to top Navy policymakers, questions are periodically raised about how the data are used and what impact they have. The present paper describes how the NEOSH Survey results have been utilized by the Navy and offers recommendations for how they could be better utilized.

INTERNAL USES OF NEOSH SURVEY RESULTS

Once analyzed, the NEOSH results are typically briefed to top Navy policymakers, including the Chief of Naval Personnel. Afterwards, the results are released, usually accompanied by a Navy-wide message that summarizes the main findings and recommends continued vigilance and efforts to reduce the remaining racial and gender-related issues found on the survey. Additionally, the NEOSH Survey results have been incorporated into standard Navy EO and SH training, provided norms for command-level climate assessment surveys, led to supplemental studies into selected EO/SH topics, and been used to justify a recent Navy-wide strategic diversity effort.

47 The opinions expressed are those of the authors. They are not official and do not represent the views of the U.S. Navy.

Following the Tailhook SH episode in 1991, the entire Navy was required to participate in an eight-hour SH training stand-down. As part of the standard Navy training package for the stand-down, SH results from the NEOSH Survey were included. More recently, some of the NEOSH Survey results have been included within the Navy's annual required General Military Training (GMT). In the current GMT, Unit 3, Topic 1 deals with SH and EO. Within that module, the SH results of the 1999-2000 NEOSH Survey are summarized.

Another internal Navy use of the NEOSH Survey results has been through Navy-wide norms that are used in conjunction with command-level EO climate assessments. For the past decade, the Navy has had a software program called CATSYS or CATWIN that allows commands to administer and analyze a command EO survey online without the need for additional software or specialized training. The standard command EO survey is essentially a "mini-NEOSH" of around 40-50 items that commands can administer and modify as needed. Early users of the command-level system requested comparison data so they could see how their local performance compared to that of the Navy as a whole. To address this need, Navy-wide norms generated from the NEOSH Survey have been calculated and posted on a Navy website, allowing local commands to compare their survey findings to Navy-wide averages.

Results from the NEOSH Survey have also led to follow-on research studies to better explore issues raised in the survey. These special studies were conducted to gain a more in-depth understanding of an issue. Topics that have been addressed include focus groups with African-American women on their experience in the Navy (Bureau of Naval Personnel, 1990) and reasons for survey non-response on the NEOSH Survey (Newell, Rosenfeld, Harris, & Hindelang, 2003). The focus group study sought to determine the reasons for African-American women's low EO climate scores on the NEOSH Survey. This study was conducted in 1990 and again in 1994 (Moore & Webb, 1998), as the finding that African-American women were the least satisfied of any group was still obtained on the survey. The non-response study was conducted as a result of the declining response rates evident on the NEOSH and other Navy-wide surveys (Newell et al., 2003). As a result of this study, the 2002 NEOSH Survey was shortened and steps were taken to provide better feedback, as these were common complaints among those completing the non-response survey (Newell et al., 2003).

More recently, the Navy has developed a strategic diversity initiative that seeks to move the Navy from an older EO/compliance framework to a model that values differences and seeks to leverage diversity to maximize performance and increase the Navy's readiness. In justifying the need for the Navy to move beyond its traditional EO programs, the project leaders used a long-term NEOSH finding: although improvements in EO climate have occurred since the first NEOSH Survey administration, racial and gender gaps still remained in many of the areas assessed by the survey. This notion of "improvement but gaps remain" provided a compelling argument that if improvement were to continue with the gaps being reduced, a new paradigm would be required, in this case, the one offered by a strategic diversity effort.

EXTERNAL USE OF NEOSH SURVEY RESULTS

Externally, the NEOSH results have been used to respond to taskers from the Department of Defense and have been included in Congressional testimony by top Navy leadership. Perhaps due to the interest in SH issues in the Navy in the aftermath of Tailhook, the Congressional testimony has typically focused on the SH side of the NEOSH Survey and documented the reduction in SH rates that occurred in the Navy in the years following the 1991 Tailhook episode. On February 4, 1997, Secretary of the Navy John H. Dalton cited NEOSH Survey results in testimony before the Senate Armed Services Committee. He said, "The Navy Equal Opportunity and Sexual Harassment, and the Marine Corps Equal Opportunity Survey are key assessment tools, providing us biennially with a comprehensive look at our equal opportunity climate. A comparison of the NEOSH survey results conducted in 1989, 1991, 1993, and 1995 indicates that the Navy has made steady progress in the communication, prevention, training, and handling of sexual harassment complaints." In Congressional testimony before the House National Security Committee Subcommittee on Personnel on March 13, 1997, Vice Admiral Dan Oliver, Chief of Naval Personnel, noted that "both our 1995 Navy Equal Opportunity/Sexual Harassment (NEOSH) Survey and a 1996 DoD-wide survey revealed a generally improving trend in elimination of sexual harassment in the Navy, and Service leadership in training, communication and reporting sexual harassment." In testimony before the Senate Armed Services Personnel Subcommittee on March 24, 1999, the Honorable Carolyn H. Becraft, the Assistant Secretary of the Navy (Manpower and Reserve Affairs), echoed similar thoughts: "…we monitor our progress in preventing sexual harassment, fraternization, and related behaviors through a number of assessment tools. The Navy and Marine Corps each conduct their own biennial surveys—the Navy Equal Opportunity Sexual Harassment (NEOSH) and the Marine Corps Equal Opportunity Survey (MCEOS)—that provide a climate assessment on various equal opportunity issues. Current results indicate that we have made progress and are moving in the right direction, but that we must not relax our resolve to rid our Services of sexual harassment and other unacceptable behaviors." At that same Senate hearing, VADM Dan Oliver, the Chief of Naval Personnel, noted the downward trend in SH rates reported on the 1997 NEOSH Survey, with the largest decreases being in hostile environment forms of SH. Similarly, in testimony before the Senate Armed Services Committee Subcommittee on Personnel on March 9, 2000, Vice Admiral Norb Ryan, the Chief of Naval Personnel, noted that the NEOSH Survey was part of the Navy's program of SH prevention. In sum, the NEOSH Survey has been regularly used to inform Congressional committees that the Navy is actively monitoring its EO climate and SH rates, and also to demonstrate through these survey results that progress has been made.

ARE NEOSH RESULTS UNDERUTILIZED?

Although the NEOSH results have had both internal and external uses, it is probably accurate to say that they have been underutilized. Indeed, in a study that contacted individuals who had not responded to the 1999 NEOSH Survey, Newell et al. (2003) found that one of the major reasons individuals cited for not responding to surveys like the NEOSH was that they felt no changes would result or that their responses didn't matter. As noted in Alma Steinberg's and Susann Nourizadeh's paper in this symposium, part of this dissatisfaction may be the result of a divergence in expectations between survey respondents and survey sponsors and policymakers about how survey data are to be utilized. Survey respondents tend to personalize the uses of survey data and often see utilization in terms of tangible changes that may impact them. While this may occur in small organizations following climate assessments, this sort of dramatic survey-driven change is rare in large institutions such as the military. One notable exception is the Marine Corps. In 1993, the Marine Corps conducted a service-wide Quality of Life Domain Survey that indicated dissatisfaction with housing. Based on these results, the Marine Corps was able to get additional funding for housing. A follow-up survey in 1998 indicated a significant positive trend in perceptions of housing satisfaction, providing some evidence that the increase in funding for housing was successful.

More commonly, military organizations use large-scale surveys such as the NEOSH as a benchmark for how they are doing and to determine whether they have improved compared to the past. It is rarer to make large-scale changes such as those that followed the Marine Corps Quality of Life Survey and then use a future survey administration to evaluate the efficacy of the change. Since respondents expect change but leaders utilize surveys for purposes other than organizational change, we recommend that steps be taken to reduce this divergence. Specifically, we recommend better feedback to survey respondents about survey results and uses, and some limited but targeted actions based on survey results.

BETTER UTILIZATION OF NEOSH SURVEY RESULTS

Feedback to Respondents

Although the NEOSH Survey results have been used, one obvious limitation is that individuals within the sample have not been given feedback on what the results were or how they have been used. The Navy has recognized this limitation and is currently requiring that all approved personnel surveys include a plan through which survey respondents will be provided feedback about the results. This typically occurs through follow-up letters to all who were in the original sample, providing them with a summary of the results or directing them to a website where the results are available. While the letters typically have provided a summary of the major findings, there is no reason why future efforts could not also include a description of what actions the Navy was planning based on the findings. As the Navy moves its personnel surveys to the Internet, this feedback function can increasingly be done electronically, thus shortening the lag time between administration and results. The feedback letter would also inform respondents of how their responses have been used. While specific organizational changes may not always occur, telling respondents that their results have better informed Navy, DoD, or Congressional leadership, or have been used to update Navy-wide training, would certainly go a long way toward closing the gap between respondent expectations and the actual organizational uses of the data.

Targeted Actions

While the Navy has not typically used NEOSH Survey results to attempt large-scale organizational changes, this has occasionally occurred in a more limited fashion. For example, the results of the 1999/2000 NEOSH Survey indicated that only about half of respondents had seen or heard of the Navy EO/SH adviceline. The Navy had instituted the adviceline following Tailhook to provide impartial information to callers on EO and SH issues. After the results were briefed, the NEOSH Survey sponsors made a concerted and coordinated effort to increase the visibility of the adviceline through posters and other media efforts. The results of the 2002 NEOSH Survey demonstrated that these efforts had been successful: while in 1999 just over half of officers and enlisted personnel had heard of the Navy EO/SH adviceline, the percentage who said they had heard of it had jumped to over two-thirds on the 2002 survey.

A more systematic attempt at targeted actions based on NEOSH results is currently being proposed in conjunction with the communications plan for the release of the 2002 NEOSH Survey results. That plan would target another long-standing NEOSH finding relating to racial and gender discrimination: that the most common occurrences of racial and gender discrimination are in the area of “offensive comments and jokes”, still reported by about one-third of enlisted minorities and women. The proposed action would target offensive jokes and comments because they are both the most common forms of reported discrimination and because Sailors can take simple actions to end these forms of discrimination. This message of “simple actions to end offensive jokes and comments” will be conveyed through various Navy media, including websites, wire stories, and internal Navy television news stories and commercials. As currently proposed, the success of the targeted efforts at reducing these forms of discrimination will be assessed in future surveys, either through a Navy-wide survey administration or through a more limited but scientific Internet quick poll that would focus on the specific behaviors being targeted. This proposed, coordinated media and communications strategy followed by planned follow-up assessments has rarely if ever been used to attempt to effect and assess change following Navy surveys. Thus, the effort should be viewed as a pilot project that, if successful, may serve as a model for future efforts to better utilize the results of Navy surveys.


REFERENCES

Bureau of Naval Personnel (1990, November). Black women in the Navy study group report. Washington, DC: Author.

CNO Study Group (1988). CNO study group’s report on equal opportunity in the Navy. Washington, DC: Department of the Navy.

Moore, B.L., & Webb, S.C. (1998). Equal opportunity in the U.S. Navy: Perceptions of African-American women. Gender Issues, 16(3), 99-119.

Newell, C.N., Rosenfeld, P., Harris, R.L., & Hindelang, R.N. (2003). Reasons for nonresponse on U.S. Navy surveys: A closer look. Manuscript submitted for publication, Military Psychology.

Secretary of the Navy. (1987, December 5). Navy study group report on progress of women in the Navy. Washington, DC: Department of the Navy.



THE U.S. ARMY'S PERSONNEL REPLACEMENT SYSTEM

Raymond O. Waldköetter, Ed.D.
Educational and Selection Programs
Greenwood, IN 46143 U.S.A.

Alex T. Arlington, M.P.A.
U.S. Army Soldier Support Institute
Fort Jackson, SC 29207 U.S.A.

Staff2@officeassistanceandsupplies.com

The views expressed in this paper are those of the authors and do not necessarily reflect the views of the U.S. Army Soldier Support Institute, Department of the Army, or Department of Defense.

Replacements and Manning

The principal goal of the Army's personnel replacement system is to fill needs or requirements, putting trained and confident soldiers onto the battlefield as quickly as possible. Replacements are crucial to our national security, and the stress on the active force of the U.S. Army is greater now that it numbers fewer than 500,000 personnel. During war, personnel replacements are needed to fill combat, combat support, and combat service support positions. Should a protracted war develop, it would not be long before commanders asked for needed replacements, and the ability of the replacement system to fill requirements on a timely basis would be essential (Arlington, 1998).

This paper examines personnel replacement operations and the automated systems used to ensure that the right soldiers with the right skills get to the battlefield at the right time. Knowing the number of soldiers required for anticipated missions and estimating casualties are critical factors in determining personnel replacement needs.

Personnel replacements fall into one of two categories. The first category is called the filler requisition shelf; these personnel fill the gap between peacetime authorized strength and wartime required strength. This shelf is updated at least once a year and reflects any changes to the Table of Organization and Equipment (TOE) and the Table of Distribution and Allowances (TDA). The second category of replacements is based on the anticipated number of casualties (AR 600-8-111, 1996).

Casualty estimation and casualty stratification are extremely important to the Army and are two of the keys to successful replacement operations (Arlington & Waldköetter, 1994). These procedures are also controversial in that some experts believe we should base our casualty estimation and stratification on historical rates, while others believe we should use rates generated from computer simulation models. The Army formerly used a combination of the two procedures to estimate casualties. However, in 1997 Major Army Commands (MACOMs) were given a new method to estimate casualties. The former approach used five levels to describe combat intensity: intense, heavy, moderate, light, and none. The new methodology no longer uses these static definitions, because research has shown that casualties often occur in pulses; under the new approach, personnel planners choose rate patterns instead of combat intensity levels, with the chosen rate pattern based on the type of combat mission (Kuhn, 1998).
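To make the difference concrete, the short Python sketch below projects expected casualties under a flat intensity level versus a pulsed rate pattern. The force size, rates, and pattern shape are invented for illustration; they are not planning factors from Kuhn (1998).

    # Illustrative only: projecting expected casualties from a chosen rate
    # pattern rather than a static combat-intensity level. All numbers are
    # invented for the example, not figures from Kuhn (1998).

    def expected_casualties(strength, daily_rates):
        """Sum expected casualties over a mission, given one rate per day
        (casualties per 1,000 soldiers per day)."""
        return sum(strength * r / 1000.0 for r in daily_rates)

    strength = 5000                       # soldiers in the supported force
    static_heavy = [6.0] * 10             # old approach: one "heavy" level, 10 days
    pulsed = [1.0, 1.0, 9.0, 12.0, 9.0,   # new approach: casualties arrive in a
              2.0, 1.0, 8.0, 10.0, 2.0]   # pulse pattern tied to the mission type

    print(expected_casualties(strength, static_heavy))  # 300.0
    print(expected_casualties(strength, pulsed))        # 275.0

The totals can be similar; what changes is the timing of demand on the replacement pipeline, which is exactly what pulse-based planning is meant to capture.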


To plan properly for future missions, the commander needs to know the number of soldiers he will have and the skills they possess. This concept is called manning and is one of the most critical elements of war. The ultimate aim of a replacement system is to continue to man the force. The procedures used to achieve this aim have varied throughout history (FM 12-6, 1994).

Differing Operational Approaches

The German divisions during World War II (WW II) were affiliated with a military district, with the regiments within a division affiliated with a region within the military district. Replacement battalions were located within each region so that regimental replacements would come from the same German region. This was very effective during the early stages of the war, but as the number of German divisions grew it became necessary to have replacement battalions provide replacements for divisions instead of regiments. The replacement battalions received the draftees and provided about two months of combat training; after this initial training the replacements were either sent forward to the division's replacement battalion or retained to receive additional training.

The German division and regimental commanders were in charge of their replacement system rather than the Army command. This decentralized system fostered esprit de corps and great unit devotion; however, when certain units suffered mass casualties they had to be taken off the front lines, since not enough replacements were available from the affiliated region or military district. When this occurred the unit was sent to a recovery area behind the front lines, remaining there until enough sick and wounded soldiers returned or region and district draftees were assigned to the unit. Another approach used by the Germans was to take the remnants of a division and create new battalions. These battalions would then become part of another division, with the regional integrity of the battalions kept intact. Along with reducing the number of divisional battalions from the original nine to seven, these approaches allowed Germany to maintain nearly 300 divisions until the later stages of WW II.

The British replacement system during WW II was similar to the German system in that it tried to keep regional ties whenever possible. The U.S. Army took a differing, if not opposite, approach to replacement operations, and the considerations behind that choice are worth noting. There was fear that if a particular unit lost a large number of men, it would have a dramatically negative impact on regional morale. Another difference in replacement philosophy was the decision to keep the number of divisions relatively low. During WW II Henry L. Stimson, then Secretary of War, wanted the Army to have 200 divisions, whereas General George C. Marshall, the leading U.S. Army proponent, insisted on keeping the number much lower so that there would be an adequate replacement flow. Secretary Stimson ultimately gave way to General Marshall, and they agreed that 90 divisions would be a manageable number for the Army. It was also thought that a centralized system would be more efficient, filling needs as quickly as possible as they occurred, without concern for trying to keep regional integrity. This policy was so strictly adhered to that many of the replacements assigned to U.S. National Guard units were not even from the related state (Wray, 1987).

Centralized Replacement Operations

Today, the centralized philosophy of replacement operations is intact. Headquarters, Department of the Army (HQDA), Deputy Chief of Staff for Personnel (DCSPER) is the Army's functional proponent for the replacement management system. After the Office of the Deputy Chief of Staff for Operations and Plans (ODCSOPS) determines that replacements will be necessary, it notifies the DCSPER. The DCSPER then relays this information to the Training and Doctrine Command (TRADOC), the executive agent for replacement centers, identifying the necessary replacements to be processed by the designated replacement centers. The Continental U.S. (CONUS) Replacement Centers (CRCs) are located at predesignated Army installations. Operations begin 10 days before the first replacements are expected to arrive, when the respective CRCs assume responsibility for receiving and in-processing replacement personnel and subsequently coordinating the training, equipping, and transportation of replacement soldiers (AR 600-8-111, 1996; FM 12-6, 1994).

The CRC consists of a Replacement Battalion and two to six Replacement Companies. The Battalion Headquarters is responsible for providing command and control for the battalion and its Replacement Companies; it is commanded by a Lieutenant Colonel and has 38 personnel, excluding the companies. Each company, commanded by a Captain, has 25 personnel, with up to four platoons per company and up to 100 replacements per platoon. The goal of the Replacement Company is to have 100 replacements ready to depart each day, allowing five days of total processing time for each CRC replacement (FM 12-6, 1994).
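A quick capacity sketch follows, using only the FM 12-6 figures quoted above (100 departures per company per day, five days in process, two to six companies per CRC); the Little's Law framing is our own illustration, not the manual's.

    # Back-of-the-envelope CRC throughput sketch from the FM 12-6 figures
    # quoted above; the calculation itself is our illustration.

    goal_departures_per_company_per_day = 100
    processing_days = 5
    companies_min, companies_max = 2, 6   # a CRC has two to six companies

    # By Little's Law, meeting the daily goal keeps about 100 x 5 = 500
    # replacements in process per company at any one time.
    in_process_per_company = goal_departures_per_company_per_day * processing_days

    daily_min = companies_min * goal_departures_per_company_per_day   # 200
    daily_max = companies_max * goal_departures_per_company_per_day   # 600
    print(in_process_per_company, daily_min, daily_max)  # 500 200 600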

The Replacement Operations Automated Management System (ROAMS) is the computer program used by the U.S. Total Army Personnel Command (PERSCOM) to track the flow of CRC replacements to the theater of operation; PERSCOM uses ROAMS both to project and to manage replacements (AR 600-8-111, 1996).
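The cited sources do not describe ROAMS's internal design, but the kind of projection-and-tracking bookkeeping attributed to it can be sketched with a toy record structure; every field name and status value below is invented for illustration.

    # Toy sketch of projection-and-tracking bookkeeping like that attributed
    # to ROAMS above; nothing here reflects the real system's design.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Replacement:
        soldier_id: str
        mos: str                       # military occupational specialty
        crc: str                       # CRC processing the soldier
        status: str                    # e.g. "in-processing", "training", "departed"
        theater: Optional[str] = None  # destination once shipped

    def departed_toward(pipeline, theater):
        """Count replacements already departed toward a given theater."""
        return sum(1 for r in pipeline
                   if r.status == "departed" and r.theater == theater)

    pipeline = [
        Replacement("S001", "11B", "Fort Jackson", "departed", "Theater A"),
        Replacement("S002", "91W", "Fort Jackson", "training"),
    ]
    print(departed_toward(pipeline, "Theater A"))  # 1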

Challenges or Issues for Personnel Replacement Systems

The replacement system will most probably face at least three specific problems in the future, the first being our reliance on technology. One of the reasons we have been able to reduce the size of our force is the superiority we possess in military technology. Technology is great when everything works properly, but if a system fails and there is no backup system we are vulnerable. The enemy also knows we rely on technology and will use whatever means are available to degrade our systems. Immediate dangers that we face are computer viruses, computer hackers, and terrorism. Additional problems we may face in the near future are threats of long-range precision bombs and smart munitions. Further in the future is the potential for destruction of our computer systems and satellites through use of electro-magnetic pulse (EMP). Our reliance on technology will not diminish; therefore, it is imperative that we have hardened systems with backup modules, computer firewalls, and antivirus software.

The second problem facing the replacement system is the lack of reliable models for casualty estimation and stratification. There is currently no validated computer simulation model that generates casualties for Combat Support (CS) and Combat Service Support (CSS). We are now a force projection Army, and there is more reliance on CS and CSS than ever before, while our enemies are constantly striving to find ways to reduce our advantages by disrupting CS and CSS. The emphasis the Center for Army Analysis (CAA) places on casualty estimation and stratification for combat personnel must also be applied to CS and CSS personnel.

Another shortcoming of casualty estimation is the restrictive nature of the models. The models currently run simulations to estimate casualties only at the division level or higher, yet our force is tending to become smaller, more lethal, and more maneuverable. As we have experienced in the 2003 Iraqi War, many battles will be fought at the maneuver brigade level and below, so we must be able to estimate casualties at those levels. Furthermore, current models do not have sufficient capability to simulate weapons of mass destruction (WMD), operations other than war (OOTW), or special operations, and concentrate primarily on conventional losses in conventional operations. Since we live in an unpredictable rather than a conventional world, casualty estimation models must be flexible enough to allow planners to simulate casualties in many different environments.

The third problem now facing the Army's replacement system is the lack of personnel, with something less than 500,000 personnel on active duty. Congress has been informed by the Department of Defense and senior Army leadership that everything is fine, as this number is deemed acceptable during a qualified "peacetime." However, in a protracted conflict or in two nearly simultaneous theaters of war there are currently insufficient personnel. It would then be necessary to activate the reserve components, where we have over half of our combat forces. Fully activating the National Guard and Army Reserve would require the President to convince Congress and the American people that our way of life and security are drastically threatened. This problem actually goes beyond current numbers, as presented below in Table 1. If Congress directed the Army to set peacetime active duty numbers at 700,000 personnel, the Army would likely fall short, since present recruiting goals are barely met (Arlington, 1998).

Table 1
U.S. Army Total Force 2004

Active Component (AC)   Army National Guard (ARNG)   Army Reserve (AR)
480,000              +  350,000                   +  205,000
(Force Structure 1,035,000)*

*Army Divisions: 18 (10 AC, 8 ARNG)



The Army has over 180,000 soldiers in more than 80 countries; many are part of a conventional force, with many others involved in special operations and fighting the global war on terrorism. To maintain our current advantages the Army is constantly searching for ways to modernize. According to the Army Modernization Plan (Annex B, 2003), 98% of the $10.76 billion Science and Technology (S&T) funding is specifically targeted at future forces (DCS, 2003).

In conclusion, the Army cannot do anything immediately about the lack of personnel, but something can be done about the preceding two challenges: the constraints on its technology and systems, and its models for casualty estimation and stratification. Improving, protecting, and safeguarding the automated systems used to call up replacement personnel should be the number one priority. The second priority should be to improve the casualty estimation and stratification models. There is little doubt that the Army replacement system works effectively in a relative peacetime situation of limited combat. Hopefully the options now proposed for improving personnel replacement operations and systems will never have to be tested in a real scenario of global multi-front conflict.

References

Arlington, A.T. (1998, November). The Army's personnel replacement system. Unpublished manuscript. Fort Belvoir, VA: U.S. Army Management Staff College.

Arlington, A.T., & Waldköetter, R.O. (1994). A method for estimating Army battle casualties and predicting personnel replacements. Paper presented at the 36th Annual Conference of the International Military Testing Association, Rotterdam, the Netherlands.

Deputy Chief of Staff (G-8). (2003, February). Army modernization plan (Letter). Washington, DC: Headquarters, Department of the Army.

Kuhn, G.W. (1998, January). Battle casualty rate patterns for conventional ground forces, rate planners guide. Washington, DC: Logistics Management Institute.

Personnel Doctrine (Field Manual 12-6). (1994). Washington, DC: Headquarters, Department of the Army.

The Army Modernization Plan, Annex B. (2003). On point for readiness today, transforming for security tomorrow. Retrieved from http://www.army.mil/features/MODPLAN/2003/default.htm

Wartime Replacement Operations (Army Regulation 600-8-111). (1996). Washington, DC: Headquarters, Department of the Army.

Wray, J.D. (1987, May). Replacements back on the road at last. Military Review. Retrieved from http://leav-err.army.mil-cgi-bin/cgcqi


Team Effectiveness and Boundary Management:
The Four Roles Preconized by Ancona Revisited

Prof. Dr MSc Jacques Mylle
Psychology Department
Royal Military Academy
B-1000 Brussels, Belgium
jacques.mylle@rma.ac.be

Background

Many tasks during peace support operations (PSO) have to be performed by small military units (say from 4 to 10 people). It is often contended that these tasks are characterized by the necessity of teamwork and a high level of autonomy, due among other things to the large distances between the (sub)unit executing a task and its superior on the one hand, and the potentially quickly evolving situation on the other.

The issue addressed in this paper is situated in the framework of a study aimed at measuring the team effectiveness of small executive teams, more specifically the contribution of autonomy and boundary management in turbulent situations.

By teams we mean groups of people working together on a task with a high interdependency of team members in fulfilling their job, and who work together for a longer time period in their real work environment. Thus we do not consider one-time groups in laboratory situations.

The perspective taken on the subject is a so-called external or ecological perspective, because the team is considered as a living system that, on the one hand, adapts to the demands of its environment but, on the other hand, causes changes in the environment too.

Until the mid-eighties researchers took an internal perspective and focused on what was happening inside the group; e.g., how cohesion evolves.

The seminal work of Gladstein (1984) was the start of a paradigm shift: studying team behaviors directed outward from the team; among others, towards other parts of the organization and other groups evolving in the same setting. The core question relates to how the team deals with external influences on its performance and how it (tries to) influence the outer world itself. In other words, we are looking at what happens on the boundaries of the team.

Gladstein observed in interviews that the subjects, as members of sales teams, frequently spoke about the importance of their interactions with other teams of the company, such as the installation teams and repair teams.

Another important finding was that they did not distinguish between the classic task-related processes and team-related behaviors, but instead between internally and externally oriented ones. Externally oriented behaviors are, for example, seeking information or molding external opinion.



Since 1990 a lot of work has been done by Deborah Ancona (often in collaboration with D. Caldwell). Her initial longitudinal study led to four key findings:

1. teams develop a distinct set of externally oriented activities and strategies;
2. these activities are positively and significantly related to managerial performance ratings;
3. there exists a complex interaction between the internal and external processes, and it changes over time;
4. there is a pattern in this external dynamic just as there is one in the internal dynamic.

An exploratory factor analysis of data she collected in consulting and production teams revealed the existence of three major “styles”.

1. The Ambassador style, which includes both buffering and representing. Buffering means protecting the team from external influences or absorbing them, while representing refers to persuading others to support the team in its efforts. In later work (for a review see Yan and Louis, 1999) buffering became a fourth separate role, namely guarding. Communication is thus bottom-up oriented and deals with how to gain access to power and how to manage the vertical dependence.

2. The Task co-ordination style deals with workflow and work structure, and refers to how to handle technical or design issues through seeking feedback or negotiating. In this case communication is lateral: how to manage the horizontal dependence.

3. The Scouting style refers to scanning the environment for information and ideas about relevant aspects of the “environment”; e.g., available resources (and the competition for them), technologies, etc.

Furthermore, Ancona defined four strategies, which rely on the styles described above.

1. The ambassadorial strategy relies on the ambassadorial style only, while the others are neglected.
2. The technical scouting strategy encompasses the scouting style and task co-ordination but not the ambassadorial style.
3. The isolationist strategy refers to the absence of any style. The team lives more or less on its own, as on an island.
4. The comprehensive strategy is a combination of the ambassadorial style and task co-ordination, with a minimum of scouting.

She also showed that the comprehensive strategy is the only effective one in turbulent situations. It goes without saying that boundary management is an issue for leaders too, even if all or some of its behaviors are shown by team members acting as boundary spanners.

Research question

Ancona tested her hypothesized structure of boundary management with data from consulting teams and new production teams in a commercial setting. It is known that boundary management in more or less quickly changing situations is positively related to performance in civilian settings, under the condition that the team uses the right “mix” of styles. Given that we arrived tentatively at the same conclusions in an operational context (Mylle, 2001; Mylle, Sips, Callaert & Bouwen, 2002), we wanted to verify whether the factor structure preconized by Ancona can be generalized to other types of teams and to other settings. In casu, do effective military teams in a peace support operations environment indeed use the four “roles” described by Ancona?

Method

Subjects

Data have been collected in a civilian sample and a military sample. The latter consists of about 200 soldiers out of 800 (in rounded figures) who were part of a Belgian Task Force in Kosovo. The military sample encompasses a subsample of “combat troops” (infantry and armor troops, n=143) and a subsample of “support troops” (pioneer, logistics, and medical support, n=47).

Instrument

Based on a simple (causal) model, a questionnaire was elaborated by the research team to measure several facets of team functioning, among others boundary management. For this aspect, the questionnaire of Ancona (1993) was taken as such and translated into Dutch and French. It is composed of four scales totaling 25 items: the Ambassador, Coordinator, Scout, and Guard scales have respectively 12, five, four, and four items. An example of an item belonging to each scale is given below; the core of all statements can be found in Table 1.

Ambassador: We try to persuade others to support the team’s decisions
Coordinator: We try to resolve problems together with other teams
Scout: We negotiate deadlines with external people
Guard: We keep information in the team secret from others

To get better insight into what kind of external activities teams get involved in and why they did what they did, we asked a number of questions about “communication with people outside the team” and “motives of external people for contacting the team”. For example:

[We contact people outside the team] to take corrective actions such as changing work procedures or processes

[People outside the team contact us] to discuss or to have an extended exchange of ideas

The questionnaire was administered after an intense training period, 14 days before deployment as part of the task force Belukos VI, which was deployed in Kosovo from the beginning of April 2001 until the beginning of August 2001.



Results

The internal consistency of the questionnaire as a whole is very good (.90); at the scale level it ranges from good to poor, i.e., .86 for Ambassador, .50 for Coordinator, .65 for Scout, and only .48 for Guard.
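For readers who want to reproduce this kind of reliability coefficient, a minimal Cronbach's alpha sketch follows; the random data merely stand in for the item responses, which are not reproduced in this paper.

    # Minimal Cronbach's alpha sketch; random placeholder data, not the
    # actual questionnaire responses.
    import numpy as np

    def cronbach_alpha(items):
        """items: 2-D array, rows = respondents, columns = scale items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                           # number of items
        item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
        total_var = items.sum(axis=1).var(ddof=1)    # variance of scale totals
        return (k / (k - 1)) * (1 - item_vars / total_var)

    rng = np.random.default_rng(0)
    fake_guard_scale = rng.integers(1, 6, size=(190, 4))  # 190 soldiers, 4 items
    print(round(cronbach_alpha(fake_guard_scale), 2))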

First we checked the appropriateness of our data for factor analysis. A Kaiser-Meyer-Olkin index of .88 and Bartlett's sphericity index of 1496 (p=.000) show that our data are suited for factor analysis. We used a principal component analysis, extracting all factors with an eigenvalue over 1, followed by a varimax rotation to obtain a structure that is easy to interpret.
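The extraction-and-rotation step can be sketched generically in a few lines; the following illustrates the Kaiser criterion (eigenvalues over 1) followed by a varimax rotation, and is not the authors' actual analysis script.

    # Generic PCA-with-varimax sketch (Kaiser criterion); illustration only.
    import numpy as np

    def varimax(loadings, n_iter=100, tol=1e-6):
        """Orthogonal varimax rotation of a factor loading matrix."""
        p, k = loadings.shape
        rotation = np.eye(k)
        var_sum = 0.0
        for _ in range(n_iter):
            rotated = loadings @ rotation
            u, s, vt = np.linalg.svd(
                loadings.T @ (rotated ** 3
                              - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
            rotation = u @ vt
            if s.sum() < var_sum * (1 + tol):
                break
            var_sum = s.sum()
        return loadings @ rotation

    def pca_kaiser_varimax(data):
        """PCA on the correlation matrix, keep eigenvalues > 1, then rotate."""
        corr = np.corrcoef(data, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(corr)
        order = np.argsort(eigvals)[::-1]            # largest eigenvalues first
        keep = eigvals[order] > 1.0                  # Kaiser criterion
        loadings = eigvecs[:, order][:, keep] * np.sqrt(eigvals[order][keep])
        return varimax(loadings)

    rng = np.random.default_rng(1)
    fake_items = rng.normal(size=(190, 25))          # 190 respondents, 25 items
    print(pca_kaiser_varimax(fake_items).shape)      # (25, n_retained_factors)

A forced four-factor extraction, as used for the military subsample below, simply replaces the Kaiser mask with the first four components.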

The analysis based on the total sample revealed a four-factor structure whose factors could easily be labeled in the same way as Ancona's, although with a somewhat different distribution of the items over the factors. The analysis of the military subsample resulted in a seven-factor solution, which we will not discuss for obvious reasons. A second analysis, with a forced extraction of four factors, yielded the following results.

Table 1. Factor structure and factor loadings > .40 based on our own factor analysis

[The full 25-item loading matrix of Table 1 did not survive conversion from the original layout: the item numbers, keywords, and loadings were dumped as separate columns and can no longer be reliably realigned. What is recoverable: the four factors are labeled Scout (F I), Ambassador (F II), Coordinator (F III), and Guard (F IV); displayed loadings range from about .40 to .80; and the item keywords include scanning the environment for threats, for technology, and for marketing; finding out whether others support or oppose; procuring things from others; collecting technical info; reporting progress; controlling the release of info; searching for info about the company's strategy; absorbing outside pressure; negotiating with others; persuading others to support decisions; acquiring resources; reviewing product design; keeping news secret until the appropriate time; avoiding releasing info to protect the team's image; talking up the team; keeping others informed; co-ordinating activities with external groups; resolving design problems; finding out what others do in similar projects; preventing info overload; keeping team information secret; and protecting the team from interference.]

The variance explained by these four factors totals 52% (respectively 19, 14, 11, and 9%). Factor I totals 10 items, and a content analysis shows that the six highest-loading items can be associated with scouting. Seven items load on Factor II, and five of them refer to ambassadorial activities. Factor III, with five items, is more ambiguous: only two of them are clearly associated with coordination, although with high loadings. Finally, the three items of Factor IV refer clearly to the guard role.

The factor structure together with the factor loadings above .40 is given in Table 1. Based on item content we can label these factors in the same way as Ancona did, i.e., Ambassador, Coordinator, Scout, and Guard, but the composition of the scales differs between the two solutions. The cross-tabulation in Table 2 shows which items are kept in the same scale and which items moved to which scale.

Table 2. Comparison of the Ancona factor structure and our own structure

[Table 2 cross-tabulates Ancona's scale memberships against the factor assignments in our own solution (F I Scout, F II Ambassador, F III Coordinator, F IV Guard). In Ancona's structure the Ambassador scale comprises items 1, 3, 7, 9, 13, 14, 15, 17, 19, 22, 23, and 24; the Coordinator scale items 2, 8, 10, 11, and 16; the Scout scale items 6, 12, 18, and 21; and the Guard scale items 4, 5, 20, and 25. The cell markings showing which items kept their scale and which moved could not be reliably recovered from the source layout.]

A factor analysis, based on the total sample, of the set of items encompassing Ancona's questionnaire together with the communication and motives scales shows a six-factor structure. The items of the Communication scale and the Motives scale are not distributed over the four Boundary Management scales. Thus, they are not instantiations of one of the four functions but measure separate entities. There exists nevertheless a significant correlation between them: r(A, C) = .524, r(A, M) = .458, and r(M, C) = .411, where A stands for the complete boundary management scale elaborated by Ancona, C for our Communication scale, and M for our scale of motives for contacts by external people.

Conclusion

The concept of boundary management is a necessary and fruitful approach to understanding team behaviors in turbulent situations and to explaining what makes the difference between effective and ineffective teams. These behaviors can be grouped into four categories, each serving a particular purpose: searching for information and means (Scout), promoting the team (Ambassador), protecting the team (Guard), and coordinating with other teams (Coordinator).

As a result of our data analysis we can conclude that the concept of boundary management and its basic structure is validated, but that the scales as elaborated by Ancona are not. Finally, we need to refine our questionnaire to remove the ambiguities in three of the four scales.

References

Ancona, D. (1990). Outward bound: Strategies for team survival in an organization. Academy of Management Journal, 33(2), 334-365.

Ancona, D. (1993). The classic and the contemporary: A new blend of small group theory. In K. Murnighan (Ed.), Social psychology in organizations: Advances in theory and research. Englewood Cliffs, NJ: Prentice Hall.


Gladstein, D. (1984). Groups in context: A model of task group effectiveness. Administrative Science Quarterly, 29, 499-517.

Mylle, J. (2001). Perceived team effectiveness in peace support operations: A cross-sectional analysis in a Belgian task force. Proceedings of the 43rd Annual Meeting of the International Military Testing Association, Canberra.

Mylle, J., Sips, K., Callaert, J., & Bouwen, R. (2002). Perceived team effectiveness: What makes the difference? Proceedings of the 38th Annual International Applied Military Psychology Symposium, Amsterdam.

Yan, A., & Louis, M.R. (1999). The migration of organizational functions to the work unit level: Buffering, spanning and bringing up boundaries. Human Relations, 52(1), 25-47.



Mental Health Literacy in the Australian Defence Force

Colonel A.J. Cotton
Director of Mental Health
Australian Defence Force, Canberra, Australia
Anthony.Cotton@defence.gov.au

Ms Emma Gorney
Directorate of Strategic Personnel Planning and Research
Department of Defence, Canberra, Australia
Emma.Gorney@defence.gov.au

Abstract

Cotton (2002) [49] reported on the implementation of the Australian Defence Force (ADF) Mental Health Strategy (MHS). One of the key principles underlying this strategy was a focus on mental health promotion. Mental health literacy is one of the key components of mental health promotion and can be defined as the level of understanding of mental health issues, including treatment options, in a population. This paper reports on the ADF's first attempts to measure mental health literacy in its population. In particular, it highlights what ADF members perceive to be the mental health issues facing them and compares this with the development of initiatives in the ADF MHS.

INTRODUCTION

Mental health is considered a major health issue in Australia and is one of the country's top five National Health Priority Areas [50]. Quality of life surveys, in Australia as well as in most other western societies, routinely show that mental health rates highly, if not highest, among the issues of greatest concern to people. The Australian Defence Force (ADF) is not immune to the pressures that face the general community, so it is reasonable to assume that mental health is an issue for the ADF. Add to this the unique demands of service life and a level of operational tempo that has been steadily increasing over the past decade, and it is reasonable to conclude that mental health is a major issue for the ADF.

The cost of the defined burden of mental health problems on the ADF is estimated to be around $20m per annum (ADF, 2002) [51]. The undefined burden, being the impact on people other than those directly affected, is more difficult to measure, but is characterised by ongoing family problems for the member, reduced job performance (including discipline and morale problems in the member's unit), and possible separation from the ADF. The hidden cost of mental health problems is defined as the increase in mental health problems that occurs as the result of individuals not seeking adequate or early support for their problems due to the stigma attached to mental health problems; again, calculating this for the ADF is very difficult.

[49] Cotton, A.J. (2002). The Australian Defence Force Mental Health Strategy. Paper presented at the 44th Annual Meeting of the International Military Testing Association.
[50] Australian Institute of Health and Welfare, Australia's Health 1998, pp. 103-108.
[51] Australian Defence Force (2002). Health Status Report, unpublished.

Cotton (2002) [52] reported on the implementation of the Australian Defence Force (ADF) Mental Health Strategy (MHS). The development of this comprehensive strategy for the delivery of mental health services in the ADF was a response to the ADF Health Status Report 2000. Among the requirements identified in this report was the need for appropriate mental health indicators to help guide the provision of appropriate mental health promotion activities and service delivery programs.

Mental health service delivery has typically focussed on risk factors, or those things that predispose an individual to experience a mental health problem. Protective factors, however, are in many cases more important for prevention interventions. Protective factors derive from all domains of life: from the individual, family, community, and wider environment. Some are internal, such as a person's temperament or intelligence, while others are related to social, economic, and environmental supports. Protective factors enable individuals to maintain their emotional and social wellbeing and cope with life experiences and adversity. They can provide a buffer against stress as well as a set of resources to draw upon in dealing with stress.

The National Mental Health Strategy (2000) presents protective factors that can reduce the likelihood of mental health problems and mental disorders and mitigate the potentially negative effects of risk factors. These are categorized as individual, family/social, school/education, life events and situations, and community and cultural factors. Protective factors improve a person's response to an environmental hazard, resulting in an adaptive outcome; one of the major protective factors consistently identified in the literature is the building of resilience in individuals (Rutter, 1987) [53].

The concept of resilience is central to most empirically based prevention programs. Resilience describes the capacities within a person that promote positive outcomes, such as mental health and wellbeing, and provide protection from factors that might otherwise place that person at risk of adverse health outcomes. Factors that contribute to resilience include personal coping skills and strategies for dealing with adversity, such as problem-solving, good communication and social skills, optimistic thinking, and help-seeking.

A key element in developing resilience is to enhance the mental health literacy of individuals. Here, mental health literacy is defined as: 'the ability to recognise specific disorders; knowing how to seek mental health information; knowledge of risk factors and causes, of self-treatments and of professional help available; and attitudes that promote recognition and appropriate help-seeking' (Jorm et al., 1997) [54]. Not only is this of use in a preventative sense, it has a key role to play in early intervention with mental health problems. Robinson (1994) [55] notes the role of education as a key strategy in assisting people to recognise stress and trauma in themselves as well as in others.

[52] Op. cit.
[53] Rutter, M. (1987). Psychosocial resilience and protective mechanisms. American Journal of Orthopsychiatry, vol. 57, pp. 316-331.

While the effectiveness of many mental health promotion and prevention strategies has not been comprehensively demonstrated, interventions that improve mental health literacy, coping skills, and social support appear to be helpful (Graham et al., 2000) [56]. The evaluation of mental health promotion programs must continue, and a key element of this is the development of appropriate indicators of wellbeing and mental health promotion benchmarks (National Mental Health Strategy, 2000) [57]. To establish an evidence-based mental health promotion program in the ADF, and in particular to develop an effective mental health literacy program, the ADF needs to develop indicators of mental health literacy in the ADF community. Many such measures are implemented through broad surveys that access a range of members of the community; the ADF has such a vehicle in the Defence Attitude Survey.

AIM

The aim of this paper is to report the initial development and first administration results of a set of measures of mental health literacy in the ADF, gathered through the Defence Attitude Survey.

THE DEFENCE ATTITUDE SURVEY

The Defence Attitude Survey was first administered in 1999. It replaced the existing single-Service attitude surveys, drawing on content from each of the RAN Employee Attitude Survey (RANEAS), the RAAF General Attitude Survey (RGAS), the Soldier Attitude and Opinion Survey (SAOS), and the Officer Attitude and Opinion Survey (OAOS). The amalgamation of these surveys has facilitated comparison and benchmarking of attitudes across the three Services whilst maintaining a measure of single-Service attitudes.

The Directorate of Strategic Personnel Planning and Research (DSPPR) re-administered the survey to 30% of Defence personnel in April 2001. The results were widely used throughout the organisation. Consequently, to maintain the provision of this information, the Your Say Survey was developed, taking a number of key items from the Attitude Survey to be administered more regularly in order to gather trend data on the organisation. The Your Say Survey is administered to a 10% sample of Defence members twice a year and, while it provides useful information, the sample size is not extensive enough to allow detailed breakdowns of the data.

[54] Jorm, A.F., Korten, A.E., Jacomb, P.A., Christensen, H., Rogers, B., & Pollitt, P. (1997). Mental health literacy: A survey of the public's ability to recognise mental disorders and their beliefs about the effectiveness of treatment. Medical Journal of Australia, vol. 166, pp. 182-186.
[55] Robinson, R. (1994). Developing psychological support programs in emergency service agencies. In Watts, R., & de L. Horne, D. (Eds.), Coping with Trauma: The Victim and the Helper. Academic, Brisbane.
[56] Graham, A., Reser, J., Scuderi, C., Zubrick, S., Smith, M., & Turley, B. (2000). Suicide: An Australian Psychological Society discussion paper. Australian Psychologist, 35(1), pp. 1-28.
[57] National Mental Health Strategy (2000). Promotion, Prevention and Early Intervention for Mental Health. Commonwealth Department of Health and Aged Care, Canberra.

It was determined by the Defence Committee [58] in May 2002 that the Defence Attitude Survey would be administered annually to a 30% sample of Defence personnel, allowing for more comprehensive data analysis. The Committee also directed that an Attitude Survey Review Panel (ASRP) be established, with representatives from all Defence Groups, to review and refine the content of the Attitude Survey.

The final survey was the result of thorough consultation through the ASRP. The item selection both maintained questions from previous surveys to gather trend data and incorporated new questions to feed into Balanced Scorecard and other Group requirements. Service forms are identical, with the only variation being Service-specific terminology. The Civilian form excludes ADF-specific items and includes a number of items relevant to APS personnel only.

The aims of the Defence Attitude Survey are to:

• inform personnel policy and planning, both centrally and for the single Services/APS;
• provide Defence Groups with a picture of organisational climate; and
• provide ongoing measurement in relation to the Defence Matters scorecard.

Questionnaire

The survey consisted of four parallel questionnaires, one for each Service and one for Civilians. The Civilian form excludes ADF-specific items and includes a number of items relevant to APS personnel only. Terminology in each form was Service-specific.

Each survey contained a range of personal details/demographic items, including gender, age, rank, information on deployments, specialisation, branch, Group, years of Service, education level, postings/promotion, and family status (44 items for Navy, 40 for Army and Air Force, 35 for Civilians). Navy personnel received additional questions regarding sea service. The survey forms contained 133 attitudinal items (some broken into parts) for Service personnel and 122 for Civilians. As in previous iterations, respondents were given the opportunity to provide written comments at the end of the survey.

As directed by the Defence Committee, a number of changes were made to the survey items through discussion within the Attitude Survey Review Panel. This refinement process attempted to balance maintaining sufficient items for gathering trend data against reducing the number of items to decrease the length of the survey. While a number of items were excluded because they were no longer relevant or appeared to duplicate other questions, further items were added to address issues previously excluded. The new additions included items on Wellbeing, Internal Communication, Security, Occupational Health and Safety, and Equity and Diversity (which had been included in the 1999 iteration of the survey). Further demographic items were also added regarding work hours (predictability of, and requirement to be, on-call) as well as awareness of Organisational Renewal and the Defence Strategy Map. The additions resulted in more items being included in the 2002 survey than in the 2001 version; however, total numbers were still lower than in the original 1999 questionnaire.

[58] The Defence Committee is the senior decision-making committee in the ADF; its membership includes the Chief of the Defence Force and all three Service Chiefs.
numbers were still lower than the original 1999 questionnaire.<br />

Mental Health Literacy Items

Attitudinal items were given response options on a five-point scale where one equalled 'Strongly Disagree' and five equalled 'Strongly Agree' (a number of the items were rated on satisfaction or importance scales rather than the more common agreement scale).

A pool of potential mental health literacy items was identified from a variety of sources in the literature. These were then workshopped among subject matter experts, and the final form of the items was agreed by the ASRP. The final set of items was:

• How would you rate your knowledge of mental health issues?
• Do you think mental health is an issue Defence should address?
• How would you rate your own mental health?
• If you thought or felt you were mentally unwell, where would you seek help?
• Alcohol abuse is a problem within Defence.
• Drug abuse (including steroids) is a problem within Defence.
• My social support network is satisfactory should I need to ask for help or talk about personal problems.

Sampling

The sample for the Defence Attitude Survey is typically stratified by rank; however, Group Heads had raised concerns that Groups were not being representatively sampled. Thus, for the 2002 sample, the thirty percent representation of the Organisation was stratified by both rank and Group. Recruits and Officer-Cadets were not included in the sample, as per the 2001 administration. Upon request, the whole of the Inspector General's Department was surveyed to provide sufficient numbers for reporting on this small Group.
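For illustration only, stratification by both rank and Group amounts to proportional sampling within each rank-by-Group cell. The sketch below is not the actual DSPPR procedure; it assumes a hypothetical personnel table with 'rank' and 'group' columns and draws the thirty percent sample from each cell.

```python
import pandas as pd

def stratified_sample(personnel: pd.DataFrame, fraction: float = 0.30,
                      seed: int = 1) -> pd.DataFrame:
    """Draw a proportional sample within each rank-by-Group stratum.

    Column names 'rank' and 'group' are hypothetical. Recruits and
    Officer-Cadets are excluded before sampling, as in the 2002 design.
    """
    eligible = personnel[~personnel["rank"].isin(["Recruit", "Officer-Cadet"])]
    return (eligible
            .groupby(["rank", "group"], group_keys=False)
            .sample(frac=fraction, random_state=seed))
```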

Administration

The survey was administered as a ‘paper and pencil’ scannable form and employed a ‘mail-out, mail-back’ methodology. For a selection of personnel in the Canberra region for whom correct e-mail addresses could be identified, the survey was sent out electronically. This methodology allowed the survey to be completed and submitted on-line, or printed out and mailed back in the ‘paper and pencil’ format.

Due to declining response rates for surveys and inaccuracies encountered in address information, additional attempts were made to ensure that survey respondents received their surveys and were encouraged to complete them. Surveys were grouped into batches for delivery to individual units. In coordination with representatives from each of the Service personnel areas, units were identified and surveys were sent to unit COs/OCs for distribution to sampled personnel, accompanied by a covering letter from the Service Chiefs. A number of issues were encountered in this process, including the fact that some COs are responsible for vast numbers of personnel (for example, HMAS Cerberus), and the process entailed double-handling of the survey forms.

Surveys were sent out directly from DSPPR in Canberra, with Civilian forms delivered via regional shopfronts as specified by pay locations. Completed questionnaires were returned via pre-addressed return envelopes directly to DSPPR.

Table 1 below outlines the response rate⁵⁹ by Service/APS. The response rate from 2001 is also included; the decline indicates that delivery via unit COs/OCs was not an improved methodology, and also reflects the operational commitments of personnel, particularly those in Navy.

⁵⁹ The response rate is calculated as the number of useable returns divided by the number of surveys mailed out less the number of surveys returned to sender.

Table 1. Response Rates by Service/APS

                        Navy     Army   Air Force     APS    Total
Sent                    4640     6841        3461    5625    20567
Return to Sender         489      265         179     312     1245
Useable Returns         1532     2669        1808    3504     9513
Response Rate          36.9%    40.6%       55.1%   66.0%    49.2%
2001 Response Rate     52.0%    50.7%       60.9%   56.2%    54.5%
2001-2002 Difference  -15.1%   -10.1%       -5.8%   +9.8%    -5.3%
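As a quick arithmetic check, the footnoted formula reproduces the 2002 rates in Table 1. A minimal sketch in Python, using only the figures from the table:

```python
# Response rate = useable returns / (surveys sent - returns to sender),
# per the footnoted definition; figures are taken from Table 1.
sent    = {"Navy": 4640, "Army": 6841, "Air Force": 3461, "APS": 5625}
rts     = {"Navy": 489,  "Army": 265,  "Air Force": 179,  "APS": 312}
useable = {"Navy": 1532, "Army": 2669, "Air Force": 1808, "APS": 3504}

for group in sent:
    rate = useable[group] / (sent[group] - rts[group])
    print(f"{group}: {rate:.1%}")  # 36.9%, 40.6%, 55.1%, 66.0%
```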

Demographics

A randomly selected sample of 1,525 cases was taken from this data set to provide the analysis for this paper. Because the focus of this paper is on the delivery of services to the ADF (i.e., uniformed members), the civilian component of the sample was removed for clarity, leaving a sample of 990 respondents. Demographic data for the sample are:


• Gender – 87.6% male, 12.4% female.
• Age – mean 33 years, median 32 years.
• Service – RAN 24.6%, Army 45.3%, RAAF 30.1%.
• Length of Service – mean 11.96 years, median 12 years.
• Time in Current Posting – 76.6% two years or less.
• Proportion having served on operations – 48.3%.
• Proportion in a recognised relationship – 64.3%.
• Proportion with at least Year 12 education – 71.1%.

RESULTS

Examination of the initial basic frequencies yielded the following observations:

• Less than 45% of the ADO believe that they have an understanding of mental health issues that is better than fair.
• More than 90% of the sample agree that mental health is an issue that the ADF should address.
• Over one third of the ADO rate their mental health as fair or worse; one in ten rate their mental health as poor or very poor.
• One in ten respondents said that they would not seek help if they felt that they were mentally unwell. One in six said that they would seek support from outside the ADF if they felt they were mentally unwell.
• Nearly half of the ADF were either uncertain about, or disagreed with, the statement that alcohol abuse was a problem within Defence.
• Two thirds of the ADF were either uncertain or disagreed that drug abuse (including steroids) was a problem for Defence.
• In terms of general protective factors, nearly 40% were uncertain whether their social support systems were adequate, or felt that they were inadequate, should they need to talk to someone or seek support.

HOW ATTITUDES VARY BY RANK

To gain a broader picture of levels of understanding of mental health issues within the ADF, responses to the mental health items were compared across rank levels (Other Ranks, Non-Commissioned Officers, Officers). Only five items showed significant differences across rank, and an analysis of the residuals revealed some identifiable trends:

• More NCOs than expected felt that their knowledge of mental health was poor, while more officers than expected felt that their knowledge of mental health was good.
• More officers than expected felt that their mental health was good, while fewer NCOs than expected felt that their mental health was good.


• There were few clear results from the analysis of residuals on attitudes towards alcohol as a problem; however, junior members tended to feel more strongly than more senior ranks that it was a problem for Defence.
• Attitudes on other drugs (including steroids) varied across rank, with Other Ranks being more extreme: more than expected responded in both of the extreme categories. NCOs generally felt that drugs were a problem, while officers were less inclined to describe drugs as a problem.
• Attitudes towards the adequacy of social support networks also varied across ranks, but with no clearly discernible pattern to the residuals, with the exception that more officers were positive about their ability to access these networks.
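The paper does not name the test behind these comparisons, but the "more/fewer than expected" language is consistent with a chi-square test of independence followed by inspection of standardized residuals. A minimal sketch on a hypothetical rank-by-response table:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = Other Ranks, NCOs, Officers;
# columns = self-rated knowledge of mental health (Poor, Fair, Good).
observed = np.array([[120, 260, 180],
                     [ 90, 240, 150],
                     [ 40, 150, 170]])

chi2, p, dof, expected = chi2_contingency(observed)

# Standardized (Pearson) residuals: cells with |residual| > ~2 indicate
# more or fewer responses than expected under independence.
residuals = (observed - expected) / np.sqrt(expected)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}")
print(residuals.round(2))
```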

HOW ATTITUDES VARY BY SERVICE

It is widely recognised in the ADF that each of the individual Services possesses a distinct and unique culture that significantly influences the attitudes and values of its members. Given this, it is reasonable to expect that attitudes to mental health will vary across the Services. Comparison of the items across Service showed the following differences:

• There were no differences in perceived levels of understanding of mental health problems, in attitudes towards whether mental health was an issue that the ADF should address, or in levels of mental health.
• Alcohol was perceived as much more of an issue by the Navy, whereas the Air Force felt that it was less of an issue requiring addressing.
• There were clear differences in attitudes about drug use as a problem, with the Air Force less likely to see drugs as a problem than either Army or Navy.
• There were no differences in perceptions of the adequacy of social support networks.

DISCUSSION

The results presented above provide only basic descriptive and some very basic inferential statistics from the DAS data set. Nevertheless, they provide good guidance for the development of a mental health literacy program, which has been identified as a key component of the ADF Mental Health Strategy.

The group statistics indicate clearly that ADF members feel that mental health is an important issue, and a small but significant proportion feel that they have poor mental health. ADF members feel that they have an inadequate understanding of mental health issues, and a small but also significant proportion claim that they would not seek support if they felt that they were mentally unwell. This is a significant concern for commanders, particularly in its implications for the health of the deployable force.

On specific issues, the majority of members do not feel that either alcohol or drugs (including steroids) are a problem for the ADF. Given the extant data on alcohol use rates in the ADF⁶⁰ and community drug use rates⁶¹, this is of concern for the provision of drug and alcohol education to the ADF, and a clear indicator of a need for enhanced education in this area.

Comparison across ranks indicates that there are clear differences in attitudes depending on the rank of the member. In particular, there would appear to be a need to improve education and mental health services targeting NCOs. Attitudes towards alcohol and drugs vary across ranks and support anecdotal evidence about differences between these groups that reflect age (or generational) effects. The implication is that there is a clear need to vary the mental health literacy programs provided to these groups, perhaps with specific effort targeted towards NCOs.

Given the widely held view of significant cultural differences between the three Services, the comparison across Services is surprising in that there are no differences in any of the items except attitudes towards alcohol and other drugs. The differences in attitudes towards alcohol might be explained by the fact that the Navy has had an alcohol management program for some time, whereas neither the Army nor the Air Force has. The differences in attitudes towards illicit drugs are more difficult to explain and require more investigation, particularly given the group data showing a relative lack of concern about illicit drugs as a problem for the ADF. These data would support the notion of a tri-Service mental health literacy program with enhanced education for Army and Air Force members.

The data reported here address only a small portion of the DAS, which has 330 variables measuring a very wide range of organisational behaviour markers. Even this initial analysis, however, indicates that the tool provides an excellent opportunity to measure mental health literacy on a continuous basis, and hence an excellent means of evaluating the mental health literacy programs that will be guided by its results. Future analyses need to explore the descriptive information in the data set more fully, as well as some of the inferences that might be drawn from these data.

CONCLUSION

The provision of mental health services is a key part of the management of the health of the ADF. The ADF Mental Health Strategy has identified the development of indicators of mental health literacy as a key component in building the health promotion and prevention programs needed to build resilience in ADF members.

The DAS provides the ADF with an excellent tool for developing such indicators because it is comprehensive and enduring in nature. It allows the monitoring of mental health literacy over time and the tailoring of mental health literacy programs to meet the evolving mental health literacy of the ADF.

⁶⁰ Current data suggest that around 30% of the ADF drink at hazardous levels.
⁶¹ There are no ADF data on drug use, due to a lack of reporting, but Australian population data indicate that more than half of Australian school leavers have tried cannabis, and that 40.4% of 20-29 year old men in Australia have recently used an illicit substance.


DEFENCE ETHICS SURVEY: THE IMPACT OF SITUATIONAL MORAL INTENSITY ON ETHICAL DECISION MAKING

Sanela Dursun, MA, and Major Rob Morrow, MA
Director Human Resources Research and Evaluation
Ottawa, Canada
Dursun.S@forces.gc.ca & Morrow.RO@forces.gc.ca

BACKGROUND

In 1998, the Directorate for Human Resources Research and Evaluation (DHRRE) was approached by the Defence Ethics Program (DEP) to conduct a comprehensive assessment of the ethical climate of the Canadian Forces and Department of National Defence (CF/DND) and of the values used by members to make ethical decisions. A model of ethical decision making applicable to the Department of National Defence was developed (Figure 1; Catano, Kelloway & Adams-Roy, 1999).

Figure 1. A Model of Ethical Decision Making Behaviour (diagram showing predictors leading to ethical decision making; not reproduced here).

An instrument based upon the model was constructed to describe the ethical climate in the organization, to derive an understanding of the ethical values of respondents, and to gauge individual levels of moral reasoning and systematic approaches to ethical decision-making. The results indicated that the components of the model were successful in accounting for ethical decision making, with the exceptions of situational moral development and situational moral intensity (Catano et al., 1999).



The survey was re-administered in the summer of 2003. A review of the original ethics instrument was conducted to further refine the DND/CF ethical decision-making model (Dursun and Morrow, 2003). A new approach to measuring moral intensity represented the most significant change to the model and the measurement instrument. This paper illustrates how moral intensity was measured and presents preliminary results from the moral intensity component of the 2003 survey re-administration.

MEASURING MORAL INTENSITY

Perceived moral intensity deals with the individual's perception of the specific characteristics of the moral/ethical issue and directly influences whether the individual believes that the issue contains a moral or ethical dilemma. If the moral intensity of a situation is perceived to be weak, individuals will not perceive an ethical problem in the issue.

While ethical perception is concerned with the individual's recognition of a moral issue (Jones, 1991) and drives the entire ethical decision making process (Hunt & Vitell, 1986), ethical intention is making a decision to act on the basis of moral judgments (Jones, 1991). The moral intensity dimensions should influence all stages of the ethical decision making process, from recognition that an issue represents an ethical dilemma to deciding whether to engage in a particular action.

Moral Intensity

Jones (1991) describes six dimensions of moral intensity: magnitude of consequences (MC), social consensus (SC), probability of effect (PE), temporal immediacy (TI), proximity (PX), and concentration of effect (CE).

Magnitude of consequences refers to the sum of harms (or benefits) resulting from the moral act in question. Jones illustrates this construct as follows: an act that causes 1,000 people to suffer an injury is of greater magnitude of consequences than an act that causes 10 people to suffer the same injury (Jones, 1991).

Social consensus refers to the degree of social agreement that a proposed act is ethical or unethical. Individuals in a social group may share values and standards, which influence their perception of ethical behaviour. A high degree of social consensus reduces the level of ambiguity one faces in ethical dilemmas. An act that most people feel is wrong has greater moral intensity than an act about which people's opinions vary.

Probability of effect refers both to the probability that the act in question will happen and to the probability that the act will actually cause the harm predicted. The more likely an act is to cause harm, the greater the propensity of an individual to view the act as unethical. For example, Jones (1991) suggested that selling a gun to a known criminal has a greater probability of harm than selling a gun to a law-abiding citizen.


Temporal immediacy refers to the length of time between an act and the consequences resulting from the act. In other words, an act that will have negative consequences tomorrow is more morally intense than an act that will have negative consequences in ten years.

Proximity refers to the feelings of nearness that the moral agent holds for the target of the moral act. There are four aspects of proximity: social, cultural, psychological, and physical. As an example, Jones states that the sale of a dangerous pesticide in the U.S. has greater moral intensity for U.S. citizens than the sale of the same pesticide in another country would have on them.

Concentration of effect refers to the impact of a given magnitude of harm in relation to the number of people affected. Jones provides as an example that cheating an individual or a small group of individuals of a given sum has a more concentrated effect than cheating a large corporation of the same sum.

Instead of assessing moral intensity by manipulating the severity of ethical scenarios (Catano et al., 1999), this CF/DND study examined the relationship between the perceived moral intensity dimensions and three stages of the ethical decision making process. Scenarios of an arguably ethical nature were used to stimulate participants' perception of the moral intensity of each vignette, and to examine participants' ethical perception, moral intention, and judgement of the decision made in each vignette. The utilization of scenarios is considered a "positive solution in improving the quality of data from questionnaire" (Paolillo & Vitell, 2002), and Singer (1998) emphasized that, compared with other approaches to ethics research, scenarios are less susceptible to social desirability bias. A standardized stimulus for all respondents is also important because it makes the decision making process more realistic.

METHODOLOGY

Perceived moral intensity, recognition of an ethical issue, ethical intention, and ethical judgement were measured using an instrument consisting of four scenarios involving ethical situations for civilian personnel and five for military personnel. The military version of the questionnaire contained one additional scenario in order to assess the effect of moral intensity on ethical decision-making in an operational environment. All scenarios were adapted from the compiled findings of focus groups (a study conducted in 2001) in which CF members and DND employees identified the ethical issues to which they were exposed. An initial selection of ten scenarios was pilot tested to ensure the salience of the stimulus for both civilians and military personnel. In an effort to reduce the potential for a social desirability response bias, scenarios were written in the third person rather than having the participant be the decision maker (Butterfield et al., 2000). To reduce the potential for gender bias, the gender of the actors was not specified.

Perceived moral intensity

The perceived moral intensity scale developed by Singhapakdi et al. (1996) was adapted for the purposes of the CF/DND study. A single statement was used for each component of perceived moral intensity, rated on a seven-point Likert-type scale. As moral intensity is a situation-specific construct, it was measured separately for each of the five scenarios.

This study examined the effect of all dimensions of moral intensity except concentration of effect. Most studies have not found support for this dimension, and Chia & Mee (2000) suggested that it should be deleted from the moral intensity construct; Jones (1991) himself admitted that he included concentration of effect "for the sake of completeness."

The interpretation of scores differs for one of the five remaining dimensions of moral intensity. For magnitude of consequences, temporal immediacy, social consensus, and probability of effect, a high score indicates a high level of perceived moral intensity, while for proximity a high score indicates a low level of moral intensity.
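Because proximity is keyed in the opposite direction, it must be reverse-scored before the five dimensions can be compared or combined. A minimal sketch of that step, with hypothetical 1-7 ratings for one scenario:

```python
# Hypothetical 1-7 ratings for one scenario, one item per dimension:
# MC = magnitude of consequences, SC = social consensus,
# PE = probability of effect, TI = temporal immediacy, PX = proximity.
responses = {"MC": 6, "SC": 5, "PE": 6, "TI": 4, "PX": 2}

def score_moral_intensity(raw: dict) -> dict:
    scored = dict(raw)
    # Reverse-key proximity: on a 1-7 scale, a high raw PX = low intensity.
    scored["PX"] = 8 - raw["PX"]
    return scored

print(score_moral_intensity(responses))
# {'MC': 6, 'SC': 5, 'PE': 6, 'TI': 4, 'PX': 6}
```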

Recognition of moral issue

Respondents began by reading each scenario; their ethical perception was then measured with a single item, "Do you believe that there is a moral or ethical issue involved in the above action/decision?" (Barnett, 2001), rated on a 7-point scale ranging from 1 (completely agree) to 7 (completely disagree). Lower scores indicated that participants agreed that the action/decision had a moral or ethical component.

Ethical intention

Respondents' ethical intentions were measured by asking them to indicate the likelihood "that you would make the same decision described in the scenario" on a 7-point Likert scale, with 1 representing "Definitely would" and 7 representing "Definitely would not".

Ethical judgement

Respondents' judgements about the morality of the actions in each scenario were assessed with a 7-point, eight-item semantic-differential measure developed by Reidenbach and Robin (1988, 1990). The ethical judgment scale has been used in several empirical studies and has demonstrated acceptable psychometric properties, with reliability coefficients in the .70 to .90 range (Barnett et al., 1998; Robin et al., 1996).

To assess the effectiveness of the moral intensity constructs in explaining ethical decision-making, regression analyses were conducted. Specifically, within each scenario, assessments of each of the five components of moral intensity were used to predict ethical decision-making. A similar set of regression analyses was conducted to assess the impact of moral intensity on moral intent: within each scenario, the moral intensity assessments were used to predict moral intent. Finally, the moral intensity components were used to predict moral awareness, that is, the recognition of a moral issue in the scenarios.
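The paper does not specify the software used for these regressions; the sketch below, on synthetic data, shows only the shape of one per-scenario analysis: five moral intensity scores predicting a single ethical judgement outcome.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for one scenario: five moral intensity scores
# (MC, SC, PE, TI, PX reverse-keyed) per respondent, each rated 1-7.
X = rng.integers(1, 8, size=(200, 5)).astype(float)
# Hypothetical outcome loading mainly on MC, SC and PE, as in the results.
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 1, 200)

model = LinearRegression().fit(X, y)
print("R^2:", round(model.score(X, y), 3))
print("coefficients (MC, SC, PE, TI, PX):", model.coef_.round(2))
```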


RESULTS

The results demonstrate strong support for three of the five components as predictors of ethical decision making. In all five scenarios, social consensus, magnitude of consequences, and probability of effect significantly predicted ethical decision-making. There was only partial support for temporal immediacy (scenarios one and three) and for proximity (scenario one).

The results for predicting moral intent were similar to the ethical decision making results. Strong support was evident for the same three components of moral intensity (social consensus, magnitude of consequences, and probability of effect) as predictors of moral intent. Weaker support was shown for temporal immediacy in scenarios one and three and for proximity in scenarios one and four.

While some components of moral intensity predicted moral awareness, none of them were consistent predictors, and the overall regressions accounted for small amounts of variance (< 5%, except in scenario three).

DISCUSSION

These results mirror those produced by other authors. Like Singer (1996, 1998) and colleagues (Singer et al., 1998; Singer & Singer, 1997), who found support for social consensus and magnitude of consequences, this study found that these components were strongly associated with ethical decision making. The one difference is that this study also found consistent support for probability of effect, which was not found in previous studies. When predicting moral intent, the results mirrored those of Barnett (2001), Butterfield et al. (2000), Chia & Mee (2000), and Frey (2000a, 2000b), who also found that magnitude of consequences, social consensus, and probability of effect were relatively strong predictors.

Quite clearly, the results for recognizing a moral issue did not identify strong predictors. However, a closer look at the scenarios reveals that they were all rather complex, in the sense that each of them contained potential dilemmas. Without any scenarios in which there was no ethical dilemma, or ones in which the dilemma was unambiguous, it is not surprising that there was very little variability in participants' assessments. This restriction of range likely contributed to the results.

IMPLICATIONS

An additional benefit of these results, over and above their substantiation of other researchers' findings, is their implications for policy makers in DND. Social consensus, magnitude of consequences, and probability of effect were all strong predictors of ethical decision making and moral intent, and they are also factors over which the organization has significant control. Social consensus refers to the degree to which people agree that a particular act is ethical or not; magnitude of consequences refers to the total harm resulting from the moral act in question. Policy formulations that clearly outline unacceptable behaviours and the consequences of those behaviours help to develop consensus among CF personnel and DND employees about what is ethical and what is unethical. That same emphasis can shape the extent to which personnel view the magnitude or seriousness of unethical behaviour. In other words, the more that people agree that an act or behaviour is unethical, the more likely it is to be generally viewed as unethical. At the same time, the more likely people are to perceive that severe harm will result from the act in question, the more likely they are to view it as unethical. The results of this research demonstrate that addressing social consensus and magnitude of consequences should help people to understand more clearly what constitutes unethical behaviour.

References

Barnett, T. (2001). Dimensions of moral intensity and ethical decision making: An empirical study. Journal of Applied Social Psychology, 31, 1038-1057.

Butterfield, K.D., Trevino, L.K., & Weaver, G.R. (2000). Moral awareness in business organizations: Influences of issue-related and social context factors. Human Relations, 53, 981-1018.

Catano, V.M., Kelloway, E.K., & Adams-Roy, J.E. (1999). Measuring ethical values in the Department of National Defence: Results of the 1999 research. Sponsor Research Report 00-1. Ottawa: Director Human Resources Research and Evaluation.

Chia, A., & Mee, L.S. (2000). The effects of issue characteristics on the recognition of moral issues. Journal of Business Ethics, 27, 255-269.

Dursun, S., & Morrow, R.O. (2003). Ethical decision making in the Canadian Forces: Revision of the Defence Ethics Questionnaire. Paper presented at the International Conference on Social Sciences, Honolulu, Hawaii, USA, 12-15 June 2003.

Frey, B.F. (2000a). The impact of moral intensity on decision making in a business context. Journal of Business Ethics, 26, 181-195.

Frey, B.F. (2000b). Investigating moral intensity with the world-wide web: A look at participant reactions and a comparison of methods. Behavior Research Methods, Instruments, & Computers, 32, 423-431.

Hunt, S.D., & Vitell, S. (1986). A general theory of marketing ethics. Journal of Macromarketing, 6, 5-16.

Jones, T.M. (1991). Ethical decision making by individuals in organizations: An issue-contingent model. Academy of Management Review, 16, 366-395.

Kelloway, E.K., Barling, J., Harvey, S., & Adams-Roy, J.E. (1999). Ethical decision-making in DND: The development of a measuring instrument. Sponsor Research Report 99-14. Ottawa: Canadian Forces Director Human Resources Research and Evaluation.

Paolillo, J.G.P., & Vitell, S.J. (2002). An empirical investigation of the influence of selected personal, organizational and moral intensity factors on ethical decision making. Journal of Business Ethics, 35, 65-74.

Reidenbach, R.E., & Robin, D.P. (1988). Some initial steps toward improving the measurement of ethical evaluations of marketing activities. Journal of Business Ethics, 7, 871-879.

Reidenbach, R.E., & Robin, D.P. (1990). Toward the development of a multidimensional scale for improving evaluations of business ethics. Journal of Business Ethics, 9, 639-653.

Rest, J.R. (1986). Moral development: Advances in research and theory. New York: Praeger.

Singer, M.S. (1996). The role of moral intensity and fairness perception in judgments of ethicality: A comparison of managerial professionals and the general public. Journal of Business Ethics, 15, 469-474.

Singer, M.S. (1998). The role of subjective concerns and characteristics of the moral issue in moral considerations. British Journal of Psychology, 89, 663-679.

Singer, M., Mitchell, S., & Turner, J. (1998). Consideration of moral intensity in ethicality judgements: Its relationship with whistle-blowing and need-for-cognition. Journal of Business Ethics, 17, 527-541.

Singer, M.S., & Singer, A.E. (1997). Observer judgements about moral agents' ethical decisions: The role of scope of justice and moral intensity. Journal of Business Ethics, 16, 473-484.

Singhapakdi, A., Vitell, S.J., & Kraft, K.L. (1996). Moral intensity and ethical decision-making of marketing professionals. Journal of Business Research, 36, 245-255.

Singhapakdi, A., Vitell, S.J., & Franke, G.R. (1999). Antecedents, consequences, and mediating effects of perceived moral intensity and personal moral philosophies. Journal of the Academy of Marketing Science, 27, 19-36.



ADAPTING OCCUPATIONAL ANALYSIS METHODOLOGIES TO ACHIEVE OPTIMAL OCCUPATIONAL STRUCTURES

Brian R. Thompson
MOSART Project, Chief of Analysis
Ottawa, Ontario, Canada
Thompson.BR@forces.gc.ca

INTRODUCTION

In the Canadian Forces (CF), job requirements are fundamentally obtained through an occupational analysis (OA), in which the structure and content of one or a series of CF occupations are evaluated and military specifications are drafted. The CF, through the Military Occupational Structure Analysis, Redesign and Tailoring (MOSART) project, is currently engaged in a strategic initiative to reorganize its occupational structure: the way in which CF members are grouped and managed from recruitment to training to career progression to release/separation. To do so, the normal CF OA process has been modified to apply to a broader career field concept, based largely on relating separate, or "stovepiped", occupations within functional areas of employment⁶². This paper briefly describes how CF OA methodology was adapted to analyze job requirements, and shares lessons learned through the conduct of different analysis projects.

BACKGROUND

In 1968, several hundred occupations in the Royal Canadian Navy, Army, and Air Force were unified into one common Military Occupational Structure (MOS) called the CF. Since that time, technology, downsizing, operational effectiveness, and the need to attract and retain qualified military personnel have led to the need to revise the MOS. In a great many cases, CF members have had to change their primary skill-sets to keep abreast of changing job requirements. In the CF, job performance requirements are normally obtained through an OA in which the structure of one or a series of military occupations is evaluated and occupational specifications are in turn drafted. These specifications provide the basis upon which training is developed and career progression is defined. At present, under the MOSART Project, the CF is reviewing the number and type of occupations that currently exist, with the ultimate aim of increasing operational effectiveness. In this respect, MOSART is also making a concerted effort to include the roles of the Reserve Force in any MOS modernization plans. For the purposes of this paper, the Information Management (IM) and Human Resources (HR) career field analysis projects will be discussed. Both projects identified a need to be structured in a fashion that will effectively enable succession planning and the development of future military leaders in their particular domain of work. In order to develop these structures effectively, both projects required new approaches to OA.

⁶² Career Fields are formally described as a grouping of Military Occupations and/or generic jobs, used for the purpose of both enhancing operational effectiveness and broadening individual career development to meet Environmental and CF requirements. Institutional Career Fields (not yet decided on as desirable entities) refer to potential Career Fields serving mainly corporate, or HQ-level, functional areas, such as strategic HRM.

OCCUPATIONAL STRUCTURE DEVELOPMENT

It is the ability to discriminate between jobs using statistics and scientific process, instead of intuition and personal experience, that defines the work of the occupational analyst. In the CF OA process, once job typing has been performed, the focus usually shifts to occupational structure. Based on the similarity of job performance requirements, the analyst considers whether the jobs in question are best grouped into a single occupation or into two or more occupations, performed by specialty-trained members of another occupation, or structured in some other way. It is important to note that the analyst's mandate is to determine what work is done, not to question the necessity of the work. In addition to describing work, career/employment patterns for the occupation(s) or function(s) and their related training requirements are proposed.

Three types of occupational structure are usually developed for the occupation(s) under study. First, a Mobilization structure is developed based on the results of the job typing process; it provides a framework of occupations with a narrow scope of work to facilitate rapid recruitment/manning in times of emergency with minimum training. A Primary Reserve occupational structure is then developed to augment the Regular Force structure, which is constructed last. These three occupational structures, taken together, constitute the MOS that provides the basis by which members are recruited, selected, trained, paid, and so on. Beyond meeting Canadian needs in peace and war, an integrated MOS must harmonize Regular Force personnel working in close cooperation with their counterparts in the Primary Reserve.

MOS PRINCIPLES

The design of the MOS is largely dependent upon occupational analysis and is guided by the application of the principles of operational effectiveness, economy of training/professional development, career field management, and rationalized assignment. Since maximizing one of these principles may diminish another, several criteria guide the development and evaluation of occupational structure options. For example, in a career employment model, key developmental jobs are represented at several periods and may require occupational training that all occupation members receive, or specialty training for those posted to specific jobs. While this career model specifies at what points training is to be provided, and for which job or set of jobs the trainee is to be prepared, it does not prescribe the methods of training delivery. Nonetheless, proposals may create whole new courses and/or delete existing ones, or may only fine-tune existing courses by adding, deleting, and/or shifting specific content areas based on the job requirements identified during the job typing activity.

Since the goal of the MOS is to ensure and enhance operational effectiveness, projected personnel shortages must be addressed prior to implementing a new MOS. This highlights the need to create occupational structures that enable the necessary rotation of personnel between "ship/shore" and/or "field/garrison" positions. Arguably, recruiting, retention, and training are the key HR activities having a bearing on operational effectiveness. If current rates of retention are not improved for the long term, the future of the CF is in jeopardy. It is posited that Career Fields will permit more efficient management of military personnel within a career progression framework. The MOS can play an important role through the design of occupations and career fields that offer excellent training and career opportunities. Attractive employment, fair and competitive pay, employee benefits, and clearly defined opportunities for career advancement are key objectives of many MOS renewal activities.

DATA ANALYSIS

A CF OA study typically uses an occupational survey to gather empirical data. Some unique characteristics of the CF, such as two official languages, equal opportunity for women in all occupations/roles, and a unified military, create certain challenges when stratified random sampling techniques are considered. Gatewood and Field's (1998) finding that questionnaires are resource-efficient for large samples supports the MOSART project's use of surveys tailored to functional domains of work that cut across several occupations. From a theoretical perspective, the CF Job Analysis model recognizes E.J. McCormick's (1979) "World of Work" reference in job/occupational analysis. Accordingly, OA organizes work in hierarchically descending order of career field, occupation, sub-occupation, jobs, duties, and finally tasks, skills, knowledge, and other abilities. Although occupational structuring has often come under attack in the CF (McCutcheon, 1997), the empirical rationale of task-based requirements is still considered the best model for accurately describing work.

In the course of a normal OA study, case-, task-, and knowledge-focused job data are analyzed via the Comprehensive Occupational Data Analysis Programs (CODAP). These programs, also used by the United States Air Force and the Australian Defence Organization (Mitchell and Driskall, 1996), cluster personnel based upon the similarity of time spent on similar tasks. One challenge the analyst faces with multiple-occupation analyses is that the length of the survey and the CODAP inventory limits (a maximum of 7,000 cases/individuals that can be clustered and a maximum of 3,000 tasks) dictate the level of discrimination used when formulating task and knowledge statements. Furthermore, in a technical modification of the standard process, when clustering knowledge data via CODAP the level of knowledge required by job incumbents must be used in place of percent time spent. Nonetheless, as with case and task clustering, the knowledge sequence is graphically presented in a hierarchical "tree-like" diagram. In order to fully utilize knowledge clustering, the knowledge inventories must be developed to the same level of specificity as the task inventory. This proves problematic in multi-occupational surveys, given the already lengthy period of time required by the sample population to answer the survey instruments. Despite this, knowledge clustering is a tool that the analyst will rely upon when structuring career fields that no longer fit within existing occupational boundaries.
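CODAP itself is a specialized suite, but its core move of clustering incumbents by the similarity of percent-time-spent profiles can be approximated with ordinary hierarchical clustering. A minimal sketch on hypothetical respondent-by-task data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)

# Hypothetical data: 30 respondents x 12 tasks; each row is a
# percent-time-spent profile summing to 100 (CODAP-style input, simplified).
raw = rng.random((30, 12))
profiles = raw / raw.sum(axis=1, keepdims=True) * 100

# Average-linkage clustering of the profiles; the resulting tree parallels
# CODAP's hierarchical "tree-like" diagram of cases.
tree = linkage(profiles, method="average", metric="euclidean")
job_groups = fcluster(tree, t=3, criterion="maxclust")  # cut into 3 groups
print(job_groups)
```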

Information Management Functional Analysis (IMFA). Events such as the September 11th terrorist attacks and CF involvement in Afghanistan and Iraq have underscored the essential need for a highly organized, interoperable system of managing information. As a result, the CF has commenced a transition to an Enterprise Model for IM, including the professional development and career planning of individual CF members and civilian Department of National Defence (DND) employees. The aim of the IMFA is to describe the IM work done by 10 military occupations (officer and non-commissioned) and two civilian classifications, as well as to identify the way ahead for the training, development, and career paths required to produce and maintain the DND IM workforce. The survey will be administered to approximately 7,000 CF members/civilians who occupy IM positions in the DND. While this survey will likely be presented in a paper-and-pencil format, efforts are being made to develop a web-based survey system, since the task count for a survey of this magnitude is considerably greater than in a normal paper-and-pencil survey. Concepts of interoperability and the ability to function effectively in joint operations are key to the development of an effective IM occupational structure. One challenge in implementing an IM career field will lie in defining the functional authority for IM and establishing the roles and relationships with each particular environment.

Human Resource Functional Analysis (HR FA). Although a unified CF MOS has been maintained, there have been philosophical differences in the approach to HR management between the Sea, Land, and Air environments. Despite this, there is general agreement that an overall system of control and HR management is required in the CF. A CODAP data analysis technique was used to interpret the HR data by grouping tasks into modules without the restrictions normally associated with non-overlapping duty areas (Thew and Weissmuller, 1978). This technique was used to better understand HR jobs and the inter-relationships existing within this domain of work. Once HR jobs and their associated tasks were defined, task co-performance modules were examined to determine the overlap of competencies across the survey population. Finally, those positions that can be filled by persons from one or more career fields and/or stand-alone occupations are assigned for succession planning purposes.
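The overlap step can be pictured as comparing sets of co-performed tasks across jobs. A minimal sketch using Jaccard overlap on hypothetical task modules (not the actual CODAP co-performance procedure):

```python
# Hypothetical task modules: sets of task IDs co-performed in each HR job.
modules = {
    "job_A": {"t01", "t02", "t03", "t07"},
    "job_B": {"t02", "t03", "t04", "t08"},
    "job_C": {"t05", "t06", "t07"},
}

def jaccard(a: set, b: set) -> float:
    """Proportion of tasks shared between two modules."""
    return len(a & b) / len(a | b)

jobs = sorted(modules)
for i, x in enumerate(jobs):
    for y in jobs[i + 1:]:
        print(x, y, round(jaccard(modules[x], modules[y]), 2))
```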

CONCLUSIONS

While the current phase of the project is not complete, our experience suggests that these types of analyses are feasible for multiple-occupation studies. The normal CF OA methodology has been adapted to deal with career fields by expanding surveys to include several occupations (some including both military and civilian respondents). In addition, the concept of knowledge clustering has been incorporated into normal OA practice. As survey methodology only defines present-day work requirements, it is incumbent upon the analyst to engage sponsors and subject-matter experts (SMEs) throughout the OA process. In fact, it cannot be overstressed that a thorough process of consultation with stakeholders must occur prior to effecting any MOS change. Respondents have generally been positive about these large-scale OA surveys; many have commented that the surveys effectively captured their collective competencies and should assist in building a MOS that will provide professionally and technically competent personnel to perform all future CF roles and missions. As the CF enters a phase in which it must effectively structure its workforce within a work/job-oriented focus, it is hoped that CF OA methodology will be further adapted to embrace personal competencies.

REFERENCES

Gatewood, R.D., & Field, H.S. (1998). Human resource selection (4th ed.). Fort Worth: Harcourt Brace.

McCormick, E.J. (1979). Job analysis: Methods and applications. New York: Amacom.

McCutcheon, J.M. (1997). "Competencies" and "task inventory" occupational analysis: Can they both sing from the same music sheet? 10th International Occupational Analyst Workshop, San Antonio, TX.

Mitchell, J.L., & Driskall, W.E. (1996). Military job analysis: A historical perspective. Military Psychology, 8(3), 119-142.

Thew, M.C., & Weissmuller, J.J. (1978). CODAP: A new modular approach to occupational analysis. Proceedings of the 20th Annual Conference of the Military Testing Association (pp. 362-372), Oklahoma City, OK.


Whom Among Us? Preliminary Research on Position and Personnel Selection Criteria for MALE UAV Sensor Operators

Captain Glen A. Smith
Canadian Forces Experimentation Centre
Ottawa, Ontario, Canada K1A 0K2
gsmith42@uwo.ca

Abstract

Net-centric warfare and interoperability are fast becoming basic tenets of modern military strategic thought. The Canadian Forces and its NATO allies are currently conducting research into the effective use of current and emerging technologies, such as airborne sensors and uninhabited aerospace vehicles (UAVs), to enhance their intelligence, surveillance, and reconnaissance (ISR) capabilities. Effective sensor operation is critical to the successful contribution of UAVs to Canada's joint and combined net-centric warfare capability. The selection, training, and employment of Canadian Forces personnel as sensor operators will depend upon an accurate analysis of this position's requirements and upon the determination of whom among us has the appropriate training and experience to competently fill this vital ISR position. To that end, Canadian Forces UAV experimentation is developing an understanding of the generic task and knowledge requirements of the Medium Altitude, Long Endurance (MALE) UAV Sensor Operator position. This paper discusses the methods and techniques used over the course of three major research events to determine the position and personnel selection criteria for MALE UAV Sensor Operators, and presents preliminary results from Canadian Forces research to date.

Introduction

At the turn of the millennium, a fundamental shift appears to be pervading current military strategic thought. Ongoing research into the effective use and practical application of secure information technology and information management techniques to improve C4ISR capabilities between tactical, operational, and strategic units, in order to exploit opportunities and increase mission success, is leading to the common development of net-centric warfare principles and procedures. In a related area, HR reviews have been conducted among the components of the United States Department of Defense, within the Australian Defence Force, and under the Canadian Forces' Military Occupational Structure Analysis, Redesign, and Tailoring (MOSART) Project, in order to assess each force's capability to meet expected human resources demands to the year 2020. Assessing both the capital and human resource assets of the Canadian Forces in light of the common strategic thought of incorporating net-centric warfare serves to focus both national objectives and international commitments, and to synergize interoperability between allied nations.


Inevitably, new technology will be introduced into the Canadian Forces over the coming years. Medium Altitude, Long Endurance (MALE) UAVs have shown promising potential for use in Canadian Forces initiatives to further enhance ISR capability to the year 2020. The ability of this technology to minimize the potential threat to, and loss of, aircrew in domestic activities such as coastal and fishery patrols, as well as in international commitments such as peacekeeping operations, is enticing. Further, this technology's ability to remain aloft for up to 40 hours and provide detailed ISR imagery at a ceiling of up to 30,000 feet, from over 400 nautical miles away from its ground control station (GCS), makes MALE UAVs a desirable ISR asset. When these features are coupled with the fact that a host of Canada's allies are either contemplating or have already introduced MALE UAVs into their air inventories, the Department of National Defence is further spurred to investigate incorporating this technology into the Canadian Forces, thereby aligning our human and capital resources, as well as our interoperability, with the nations with which we are presently, or may in the future be, involved in combined operations such as peacekeeping.

Method

Collecting data for research involving emerging technology such as MALE UAVs provided a unique and interesting opportunity. Occupational data collection and analysis is routinely reported through the process of occupational analysis within the Canadian Forces, as it has been for the past 20-25 years. This extensive history of analyzing occupations, through a method that requires all members of an occupation under study to complete an extensive inventory of the tasks, knowledge, and skills performed or required within their occupation, provides the Canadian Forces with a good understanding of its workforce's capabilities. However, until recently little was known about the job requirements associated with the various positions necessary to provide real-time intelligence, surveillance, and reconnaissance (ISR) information via MALE UAVs. Even though knowledge gained from research conducted by allied nations in this area is available, it is difficult to determine whether military personnel employed with MALE UAVs in other countries have skill sets, training, and experience similar to those of potential candidates from Canadian Forces occupations.

Opportunities to observe Canadian Forces personnel in actual MALE UAV positions were authorized through a series of research events designed to demonstrate the operational efficiency and effectiveness of UAV technology concurrent with joint military exercises and operations. UAV participation in Exercise Robust Ram, Operation Grizzly, and the Pacific Littoral Intelligence, Surveillance, and Reconnaissance Experiment (PLIX) provided venues in which manufacturers, soldiers, and research scientists could converge at a central location to demonstrate products, train with and operate potential future assets, and collect data on a variety of aspects associated with the future direction of Canadian Forces operations and procedures. For researchers interested in the person-job fit between the 105 occupations within the Canadian Forces and the various positions within a MALE GCS, devising research designs and developing a research methodology based on the lessons learned from these research events was seen as essential. Collecting job requirements in terms of tasks, knowledge, and skills, and determining the overlap between these job attributes and those formally contained in the occupational specifications of the 105 occupations within the Canadian Forces, was viewed as the most pragmatic and objective way of eventually recommending occupations for selection, training, and employment in these positions. Over the course of these research events, instruments were developed to measure the tasks and knowledge involved in UAV GCS positions, as well as the human factors and physiological factors involved. These instruments are described after a short summary of the three research events involving UAVs that have occurred to date.
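To make the overlap analysis concrete, the following is a minimal Python sketch of how the correspondence between a position's observed task and knowledge statements and an occupation's formal specification might be scored. The statement lists, occupation names, and the simple proportion-of-overlap metric are hypothetical illustrations; the paper does not specify a scoring rule.

    def overlap_score(position_stmts, occupation_spec):
        # Proportion of the position's observed statements that also
        # appear in the occupation's formal specification (hypothetical metric).
        position = set(position_stmts)
        return len(position & set(occupation_spec)) / len(position)

    # Hypothetical statements observed for a GCS position.
    sensor_op_stmts = [
        "interpret radar display information",
        "conduct FLIR searches",
        "maintain internal communications",
    ]

    # Hypothetical excerpts from two occupational specifications.
    occupation_specs = {
        "Airborne Electronic Sensor Operator": [
            "interpret radar display information",
            "conduct FLIR searches",
            "prepare post-flight reports",
        ],
        "Naval Combat Information Operator": [
            "interpret radar display information",
            "maintain internal communications",
        ],
    }

    # Compare occupations by how much of the position's requirements they cover.
    for occ, spec in occupation_specs.items():
        print(f"{occ}: {overlap_score(sensor_op_stmts, spec):.2f}")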

Robust Ram

The initial field experiment involving UAVs was held at Suffield Range, Alberta, in April 2002. One mini UAV (the Pointer, manufactured by AeroVironment) and two MALE UAVs (Bombardier's CL-327 Guardian and General Atomics' I-Gnat) were demonstrated by manufacturer representatives during military land exercises.

Job requirements associated with the tasks, knowledge, and skills, crew composition, command and control, ground maintenance, and communications involved in integrating information technology and information management for net-centric warfare development were the primary concerns from a human resource perspective during this initial field work. Since Robust Ram was the first research event focusing on the potential of UAVs in the Canadian Forces, field observations and notes were gathered from discussions with manufacturer representatives and with the Canadian Forces personnel employed with and supporting each UAV. Task, knowledge, and skill statements were compiled from the 105 occupations within the Canadian Forces military occupational structure; these served as a checklist and guide during field observations of Canadian Forces personnel interacting with equipment and performing duties associated with MALE GCS positions during mission scenarios. Human resource information gathered during Robust Ram provided a baseline understanding of common GCS positions and their potential knowledge and task requirements.

Operation Grizzly (OP GRIZZLY)

A joint ISR operation between the Chief of Land Staff (CLS) and the Chief of Air Staff (CAS) provided ground and air support to the June 2002 G-8 summit in Kananaskis, Alberta. General Atomics' I-Gnat was employed, with Canadian Forces personnel in each of the GCS positions except that of the UAV Operator (Pilot), due to contract obligations. OP GRIZZLY represented the first operational use of UAVs within Canada. Canadian Forces ISR support to the summit was successful and commended, due in part to MALE UAV involvement.

OP GRIZZLY also provided the first opportunity to collect consistent data on the tasks, knowledge, skills, and environmental demands associated with UAV GCS positions filled by Canadian Forces personnel, through the development of a structured interview questionnaire. Canadian Forces personnel employed in the various UAV GCS positions met individually with a Canadian Forces Personnel Selection Officer, who briefed them on the purpose of the 45-minute structured interview. Interviewees were then asked a series of questions, and their responses were recorded verbatim on a laptop computer. All responses were then compiled, analyzed, and reported in an experimentation report. The structured interview schedule continues to be used to collect data from Canadian Forces personnel employed in MALE UAV positions.

The Pacific Littoral Intelligence, Surveillance, and Reconnaissance Experiment (PLIX)

In the summer of 2003, a field experiment was conducted to assess the utility of a multi-sensor MALE UAV in supporting the construction of the recognized maritime picture (RMP) within a specific littoral operations area. Construction of the RMP is a fundamental activity of littoral ISR. Its efficiency and effectiveness are sometimes suspect due to the limitations of current technology in providing an accurate, detailed assessment of maritime activity within a specific area of interest. This experiment predicted that if a multi-sensor MALE UAV patrolled a designated littoral operations area, then all surface contacts would be detected, continuously tracked, and positively identified in the recognized maritime picture of the operations area before the end of each patrol. Using conventional methods, fewer than ten targets were identified and tracked within the specified area of operations; employing just one MALE UAV roughly tripled the number of targets identified and tracked.

Concurrent with the overarching objective of this experiment, a richer understanding of GCS position requirements and of the potential suitability of Canadian Forces occupational personnel was gained using previously employed field techniques (e.g., observations, notes, and structured interviews). A computer-based survey was also developed and administered as a pilot project to Canadian Forces personnel employed as MALE UAV Sensor Operators during PLIX. The survey was developed using Microsoft Access 2000 and provided participants with a simple 'point and click' navigation system through four areas of interest with respect to the sensor operator position. The first two areas were the task and knowledge statements contained within each member's occupational specifications that were associated with the UAV sensor operator position. Two other menus permitted members employed as sensor operators to add and rate additional tasks performed and knowledge required in the sensor operator position that were not contained within their occupational specifications.
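The paper describes the instrument as a Microsoft Access 2000 application; as an illustration only, here is a minimal Python sketch of the kind of record such a four-menu survey might store per rating. The field names and the rating scale are our assumptions, not details given in the paper.

    from dataclasses import dataclass

    @dataclass
    class SurveyResponse:
        # One rating captured by the pilot survey (hypothetical schema).
        # 'area' reflects the four survey menus: task or knowledge statements
        # drawn from the member's occupational specification, or additional
        # tasks/knowledge the member reports that the specification lacks.
        member_id: str
        occupation: str    # e.g., "Naval Electronic Sensor Operator"
        area: str          # "spec_task", "spec_knowledge", "added_task", or "added_knowledge"
        statement: str     # the task or knowledge statement text
        performed_or_required: bool
        rating: int        # importance/frequency rating; the scale is assumed

    resp = SurveyResponse(
        member_id="CF-001",
        occupation="Naval Electronic Sensor Operator",
        area="added_task",
        statement="direct the UAV Operator to keep the sensor on target",
        performed_or_required=True,
        rating=4,
    )
    print(resp)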

The MALE UAV sensor operators' electronic survey asked participants to identify and rate those task and knowledge statements contained in their occupational specifications that were performed or required in the GCS position they filled. The electronic survey was well received and provided a rich source of information about the MALE UAV sensor operator position. Participants found the survey easy to navigate and easy to complete. Their completion of the additional tasks and additional knowledge sections showed a willingness to provide supplemental data on task and knowledge statements that were not included in their occupational specifications but were performed in the sensor operator position. This additional information, along with the task and knowledge data from their occupational specifications, can also be used to develop job inventories for analyzing this position in future experiments involving MALE UAVs.

Participation by personnel from three distinct Canadian Forces occupations further expanded the task and knowledge requirements associated with the UAV sensor operator position. Analysis of the distribution of tasks and knowledge associated with this position, displayed in the figures below, suggests that the human resource requirements will include training and experience in sensor selection and manipulation, radar tracking, and competencies involving the organization and management of information. Figures 1 and 2 below describe the distribution of knowledge and tasks associated with the MALE UAV sensor operator position to date. A further comparison of the task and knowledge distributions is provided in Figure 3, followed by a summary of the information obtained from structured interviews conducted with the Canadian Forces personnel who participated in the three MALE UAV research events to date.
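The duty-area percentages reported in Figures 1 and 2 below are, in effect, proportions of endorsed statements per duty area. A minimal Python sketch of that tabulation follows, using a few hypothetical tagged statements; the underlying statement counts are not published in the paper.

    from collections import Counter

    # Hypothetical endorsed statements, each tagged with its duty area.
    endorsed_statements = [
        ("organize combat information for higher command", "Combat Information Organization"),
        ("construct contact probability areas", "Operations"),
        ("interpret radar display information", "Radar"),
        ("maintain voice procedures", "Communications"),
        ("organize combat information from UAV sensors", "Combat Information Organization"),
    ]

    # Tally statements by duty area and convert counts to percentages.
    counts = Counter(area for _, area in endorsed_statements)
    total = sum(counts.values())
    for area, n in counts.most_common():
        print(f"{area}: {100 * n / total:.0f}%")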

Figure 1 – Distribution of Knowledge for MALE UAV Sensor Operators. [Pie chart: Combat Information Organization 24%; Operations 13%; Communications 8%; General Aircrew 8%; Radar 8%; Administration 7%; Electronic Warfare 7%; Navigation 7%; Information Systems Management 6%; Tactics 6%; Electronics 3%; Infrared 2%.]

Figure 1 displays the distribution of knowledge required for MALE UAV sensor operators during PLIX. The largest proportion of knowledge involves organizing combat information (24%) obtained through UAV sensors for use by higher commanders. Comprehensive knowledge of air and surface radar equipment, their capabilities and limitations, along with radar controls and procedures, is fundamental to this position. Rules of engagement, constructing contact probability areas, intelligence and evidence gathering techniques, collection requirements, and team duties and responsibilities detail the operational knowledge (13%) component needed by sensor operators to support future MALE UAV missions. Knowledge items such as intercom systems and operating procedures, voice procedures, infrared radar interpretation, and radar display interpretation are examples of the communications (8%) and radar (8%) knowledge requirements. General aircrew (8%), administration (7%), and navigation (7%) knowledge suggest that there are environmental as well as flight support responsibilities involved in this position. Flight safety procedures and regulations, air traffic control organization and procedures, and heading and altitude reference and operating systems describe the sensor operator's general aircrew knowledge requirements (8%), while principles of true navigation, dead reckoning procedures, and knowledge of trigonometry, algebra, and logarithms exemplify the navigational knowledge (7%) required. Tactical knowledge (6%) was indicated as well; here, the data suggest that general aircrew and navigational knowledge, combined with the tactical knowledge gained through training and experience in operational military occupations, is involved in this position. Six percent of the knowledge required by MALE UAV sensor operators involves information systems management, which includes computer hardware, software, and network utilities and procedures, as well as data compression and extraction techniques. Capabilities and characteristics of electronic jammers, microwave systems, and principles of electromagnetic compatibility and interference are components of the electronics knowledge (3%) requirements. Weather and obstacle avoidance procedures, understanding infrared imagery modes, and comprehending infrared imagery analysis are examples of the infrared knowledge (2%) involved.

This initial understanding of the knowledge distribution for MALE UAV sensor operators provides direction for further exploration and research. Gathering information through electronic airborne means suggests that there may be a basic requirement for military personnel selected, trained, and employed in this position to have a prerequisite operational and tactical understanding of basic aircraft or surface radar and sensor capabilities, operations, and limitations. Operational training and experience are considered an asset, both in this particular position and in its role within the eventual crew composition. The knowledge distribution also suggests that general aircrew or shipborne experience may facilitate a more efficient and effective person-job fit. The essential responsibilities of MALE UAV sensor operators will be to gather, extract, and compile radar and sensor imagery as efficiently as possible. Knowledge gained from working in an environment that requires candidates to manage combat information and support electronic warfare objectives through information systems management would further ensure that information from UAV missions is competently obtained.


Figure 2 – Distribution of Tasks for MALE UAV Sensor Operators. [Pie chart: Radar 25%; Infrared 14%; Navigation 11%; Operations 10%; Tactics 10%; Combat Information Organization 9%; General Aircrew 6%; Communications 5%; Information Systems Management 4%; Administration 2%; Operations Support 2%; Electronics 2%.]

Figure 2 displays the distribution of tasks performed by MALE UAV sensor operators during PLIX. Seventy percent of the tasks performed during this research event involved radar (25%), infrared (14%), navigation (11%), operations (10%), and tactics (10%). Typical radar tasks consisted of identifying and classifying radar contacts, estimating the size of radar contacts, determining radar search altitudes, and conducting radar area searches. Conducting forward-looking infrared (FLIR) camera searches, detecting and identifying FLIR contacts, and assessing FLIR intelligence identification runs are examples of infrared tasks. Navigational tasks performed included determining the UAV's position using visual and radar fixing procedures, as well as interpreting meteorological charts, reports, and forecasts. Detecting and classifying radar contacts, conducting intelligence and evidence gathering, and interpreting, displaying, and coordinating the display of intelligence information provide insight into the tasks associated with operations. Tactical tasks included selecting and localizing UAV sensors and assessing the tactical significance of contacts.

The remaining thirty percent of MALE UAV sensor operator tasks fell into duty areas that each accounted for less than ten percent of the total. Combat information organization (9%) tasks included interpreting radar display information, as well as coordinating and maintaining the production of the recognized maritime picture. General aircrew tasks accounted for six percent of the tasks performed by MALE UAV sensor operators during PLIX; managing individual tactical displays, visually identifying and classifying contacts, and preparing post-flight reports and messages are indicative of these. Communications accounted for five percent of the tasks performed in this position during PLIX, including maintaining internal communications and configuring communication equipment. Smaller percentages of tasks were performed in the information systems management (4%), administration (2%), operations support (2%), and electronics (2%) duty areas as well.


Figure 3 – Tasks and Knowledge Requirements for MALE UAV Sensor Operators. [Bar chart comparing knowledge and task counts (absolute values) by duty area: A. Combat Information Organization; B. Operations; C. Communications; D. General Aircrew; E. Administration; F. Information System Management; G. Electronics; H. Tactics; I. Navigation; J. Infrared; K. Radar.]

As may be the case with many jobs, the relationship between knowledge and tasks is not uniform across duty areas. Many duties require greater cognitive resources and abilities, whereas other duties are more task-specific, requiring less knowledge, theory, and understanding to perform or complete. The very nature of the MALE UAV sensor operator position suggests that, although tasks and knowledge are not distributed uniformly across individual duty areas, there is a sense of uniformity within the job as a whole: supporting both the UAV crew and its chain of command through the core provision of real-time information and imagery on areas of interest. Figure 3 shows that duty areas A through G are predominantly knowledge-specific and will require future MALE UAV sensor operators to know a constellation of theories, principles, and procedures in order to, for example, organize combat information and support land, sea, joint, and combined operations with proper internal and external communications. Duty areas H through K, on the other hand, are more task-specific, requiring future sensor operators to support the piloting and navigation of MALE UAVs, more often than not through direct instructions to the UAV Operator (Pilot) on the maneuvers, circuits, and tactics required to maintain images of objects of interest while using radar, infrared, or other sensor capabilities. This relationship between the knowledge required and the tasks performed by future MALE UAV sensor operators suggests that considerable challenge, and considerable task, knowledge, and interpersonal skill, may be involved on their part in order to feed this essential information into Canada's net-centric warfare matrix.

Structured Interviews. Chief of Air Staff and Chief of Maritime Staff sensor operator and support personnel were employed in the MALE UAV sensor operator position during the research events involving UAVs. Occupationally, personnel from the Airborne Electronic Sensor Operator occupation (Chief of Air Staff) and from the Naval Electronic Sensor Operator and Naval Combat Information Operator occupations (Chief of Maritime Staff) have participated in these research events. Participants averaged 19 years of military service, and all but one were senior non-commissioned members. One Airborne Electronic Sensor Operator participated in Robust Ram experimentation with the I-Gnat MALE UAV. Three Airborne Electronic Sensor Operators were employed in the MALE UAV sensor operator position during OP GRIZZLY. Four CF members from three CF occupations were employed as UAV sensor operators within the Tofino UAV GCS during PLIX: two Airborne Electronic Sensor Operators, one Naval Combat Information Operator, and one Naval Electronic Sensor Operator. The personnel from the Airborne Electronic Sensor Operator occupation reported that the sensor operator position required good air sense and aircrew experience gained from at least one operational tour; they suggested that air sense, spatial and situational awareness, and aircrew experience were necessary. Personnel from the Airborne Electronic Sensor Operator and Naval Electronic Sensor Operator occupations also identified knowledge, skill, and experience in operating infrared (IR) and electro-optical (EO) sensors as essential.

Canadian Forces personnel employed as sensor operators during UAV experimentation suggested that a standard tour or posting of three to five years would be suitable. They suggested that the challenges involved in this position would come from operating new technology as a member of a new crew and from acquiring the motor skills and hand-eye coordination required to establish and maintain sensors on target. Further, they reported that the GCS work environment was charged with activity, requiring constant communication and a proclivity for quick thinking, planning, and preparation. Experimental MALE UAV sensor operators also found that directing the UAV Operator (Pilot) in order to maintain the sensors on a specific target was challenging, more so for Chief of Maritime Staff personnel employed in this capacity, due perhaps to occupational and environmental cultural restrictions and the etiquette assumed between non-commissioned members and officers.

The environmental demands associated thus far with MALE UAV field experimentation were reported to be those inherent in air operations. Canadian Forces personnel employed as sensor operators suggested that comprehending the present location of the UAV in relation to its altitude, direction, distance from the GCS, and angle of approach to the target may involve a significant learning curve for non-air operations personnel. For these reasons, training efficiency may dictate that personnel selection criteria favor air operations personnel and occupations.

Tasks associated with the MALE UAV sensor operator position include reviewing pre-flight documentation and attending pre-flight briefs, conducting pre-flight functional checks on the sensor package, and planning and recommending the appropriate sensor package for the mission. In-flight tasks include locating and observing targets and selecting appropriate sensor zoom, elevation, angle, depression, and rates of movement to increase accuracy and confidence in target identification. The position also involves performing first-level intelligence analysis on possible targets, as well as communicating with and directing the UAV Operator to maintain sensor equipment on target. Knowledge required for this position included airborne tactics, sensor technology, the UAV's limitations, establishing and maintaining data links, and airspace regulations.

The perceived task demands associated with this position ranged from relaxed, when conducting sensor sweeps to locate possible targets, to intense concentration during target detection and when first-level intelligence analysis was being performed. The intense concentration required to maintain the sensor on target during target detection was described as quite demanding, resulting in eyestrain and mental fatigue. Vigilance and focus were also required to maintain position on targets and to continually reference them in the creation of the RMP. Chief of Air Staff personnel employed in this position suggested that these tasks were comparable to those currently performed by their occupation on maritime patrol aircraft.

Chief of Air Staff personnel involved in the PLIX field research were also employed as senior sensor operators, responsible for the supervision and performance of either the Naval Combat Information Operator or the Naval Electronic Sensor Operator on their respective crews. This supervisory and management activity further enhanced the job satisfaction they derived from the position. Other supervisory and managerial tasks associated with this position included maintaining the RMP and ensuring efficient and effective use of the sensor to provide still and video data to the chain of command.

CF personnel involved in research concerning the MALE UAV sensor operator position felt that their operational experience, training, and qualifications were enhanced through their experience in this position. Chief of Air Staff personnel in particular suggested that the tasks, knowledge, and skills associated with their primary employment on shipborne or maritime patrol aircraft were reinforced by their involvement in these experiments.

Job Satisfaction. Personnel employed in the MALE UAV sensor operator position reported a high level of job satisfaction. Operating sensors on UAVs to detect and identify targets presented a novel and efficient means of gathering ISR data, and Canadian Forces members employed in this position appreciated the opportunity to participate in the experimentation. Canadian Forces members employed as MALE UAV sensor operators from naval occupations expressed some reservation about the viability of their occupations' future involvement in this position, on the grounds that they might be disadvantaged relative to their cohort for advanced training, qualifications, and promotions. These concerns were not expressed by Chief of Air Staff personnel employed as MALE UAV sensor operators; in fact, they suggested this employment opportunity would be welcomed by their occupation as a means to further enhance their sensor training and expertise. Clearly, from an organizational as well as an occupational and individual worker perspective, these intrinsic selection issues must also be considered in the final decision as to who among us is best suited to fill this ISR position.

Summary

In just a few short years, the Canadian Forces Experimentation Centre has created opportunities for the Canadian Forces and its leaders to become familiar with the potential and benefits of MALE UAVs as an ISR asset within the larger context of the developing strategy surrounding net-centric warfare. In 2005, the Canadian Forces is expected to make a multi-million dollar investment to incorporate UAV technology into its inventory. Concurrent with this timeline, the Canadian Forces Experimentation Centre has endeavored to study the human resource requirements associated with MALE GCS positions so that sound, objective recommendations can be made on the effective and efficient operation, support, and maintenance of this technology. Field research events associated with UAV experimentation have provided progressive opportunities to develop observational and data collection techniques for matching the position and personnel requirements associated with GCS positions. Personnel selection criteria, training development, and potential employment patterns within the existing military occupational structure of the Canadian Forces are becoming clearer with each research event involving this promising technology.

Our position and personnel selection criteria investigations to date are based upon subject matter expertise. We concede that the potential for significant variation remains to be explained through more precise measurement in laboratory settings and simulations. Plans to conduct a formal, objective, and independent task analysis of the sensor operator position, and indeed of all positions common to the MALE UAV GCS, are being readied for the near future. Determining person-job fit should be viewed as an iterative process; this is especially true when research is conducted on the design of future jobs arising from the introduction of emerging technology to organizations and their workforces. Although we cannot definitively say at this time what the exact selection criteria, training, and employment for this future position will be, we look forward to further developing our understanding of these requirements through independent research as well as combined research collaborations with our allies, in time for the introduction of this technology into the Canadian Forces inventory in 2005.



Transformational Leadership: Relations to the Five Factor Model and Team Performance in Typical and Maximum Contexts

Beng-Chong Lim
University of Maryland and Applied Behavioral Sciences Department, Ministry of Defense, Singapore

Robert E. Ployhart
George Mason University

Abstract

This study examines the Five Factor Model of personality, transformational leadership, and team performance under conditions similar to typical and maximum performance contexts. Data were collected from 39 combat teams in an Asian military sample (n = 276). Results showed that neuroticism and agreeableness were negatively related to transformational leadership ratings. Team performance ratings correlated only .18 across the typical and maximum contexts. Furthermore, transformational leadership related more strongly to team performance in the maximum than in the typical context. Finally, transformational leadership fully mediated the relationship between leader personality and team performance in the maximum context, but only partially mediated it in the typical context. The Discussion focuses on how these findings, while interesting, need to be replicated with different designs, contexts, and measures.


Over the last 20 years, transformational leadership has become one of the dominant leadership theories in the organizational sciences (Judge & Bono, 2000). Although there are several reasons for this, perhaps the most important is that transformational leadership appears to be extremely important for modern work. For example, the growing number of mergers and acquisitions, globalization, and stock market uncertainty require leaders not only to exhibit confidence and direction, but also to instill motivation and commitment to organizational objectives. Numerous studies have found that followers' commitment, loyalty, satisfaction, and attachment are related to transformational leadership (Becker & Billings, 1993; Conger & Kanungo, 1988; Fullagar, McCoy, & Shull, 1992; Niehoff, Enz, & Grover, 1990; Pitman, 1993). Indeed, this has led researchers such as Bass (1998) to conclude that "transformational leadership at the top of the organization is likely to be needed for commitment to extend to the organization as a whole" (p. 19).

Despite the importance of transformational leadership in practice and the wealth of research on the topic, many questions remain about the antecedents and consequences of transformational leadership. For example, only two studies have examined the dispositional basis of transformational leadership using the Five Factor Model (e.g., Judge & Bono, 2000; Ployhart, Lim, & Chan, 2001), and more research is needed to understand how personality is manifested in transformational leadership behaviors. Similarly, previous research examining the consequences of transformational leadership has focused almost exclusively on the individual level (i.e., leader effectiveness). However, many have argued that leadership may have its most important consequences for teams, and thus a focus on the team level is also important (Bass, Avolio, Jung, & Berson, 2003; Dvir, Eden, Avolio, & Shamir, 2002; Hogan, Curphy, & Hogan, 1994; Judge, Bono, Ilies, & Gerhardt, 2002). Further, research by Ployhart et al. (2001) suggests that transformational leadership may be most important in maximum rather than typical performance contexts.

The purpose of this study is to examine these neglected antecedents and consequences of transformational leadership. First, we examine how leader personality, based on the Five-Factor Model (FFM), relates to subordinate ratings of the leader's transformational behaviors. Second, we examine how transformational leadership relates to team performance assessed under typical and maximum performance contexts. Third, we assess whether transformational leadership fully or partially mediates the relationship between the FFM of personality and team performance. This study thus contributes to research on transformational leadership by examining the FFM determinants of transformational leadership, examining how transformational leadership predicts team criteria, and testing whether the strength of prediction differs across typical and maximum performance contexts. It therefore integrates and simultaneously tests the findings of Judge and Bono (2000) and Ployhart et al. (2001) by assessing the FFM determinants and consequences of transformational leadership. Figure 1 provides an overview of the relationships examined in this study.

In the following section, we discuss the FFM antecedents of transformational leadership. Next, we examine the consequences of transformational leadership for team performance in typical and maximum contexts.

Transformational Leadership and the FFM of Personality

Much progress has been made in the field of leadership research. From the early work on a one-dimensional model of leadership (Katz, Maccoby, & Morse, 1950), to the two-dimensional model of initiating structure and consideration (Stogdill & Coons, 1957), to recent transformational/charismatic leadership theory (e.g., Bass & Avolio, 1993; Conger & Kanungo, 1988; Shamir, House, & Arthur, 1993), the field has witnessed significant advances in theory development and empirical work. Despite the existence of numerous leadership theories and paradigms, it is safe to say that, for the past two decades, transformational leadership theory has captured much of the research attention (Judge & Bono, 2000). The concept of transformational leadership can be traced back to Burns' (1978) qualitative classification of transactional and transformational political leaders, although it was the conceptual work of House (1977) and Bass (1981) that brought transformational leadership to the forefront of leadership research. Transformational leadership is often contrasted with transactional leadership. Transactional leadership is often depicted as contingent reinforcement: leader-subordinate relationships are based on a series of exchanges or bargains between them (Howell & Avolio, 1993). Transformational leaders, on the other hand, rise above the exchange relationships typical of transactional leadership by developing, intellectually stimulating, and inspiring subordinates to transcend their own self-interests for a higher collective purpose, mission, or vision (Howell & Avolio, 1993). Notice that one consequence of this perspective is a focus on unit-level interests, beyond those of the individual person.

Transformational leadership comprises four constructs (Bass, 1998): charisma or idealized influence, inspirational motivation, intellectual stimulation, and individualized consideration. A leader is charismatic if his or her followers seek to identify with and emulate the leader. Transformational leaders motivate and inspire their followers by providing meaning and challenge in their work. Intellectually stimulating leadership aims to expand followers' use of their potential and abilities. Finally, individually considerate leaders are attentive to their followers' needs for achievement and growth; these leaders act not only as superiors but also as coaches and mentors to their subordinates. In short, transformational leaders concentrate their efforts on longer-term goals, emphasize their vision (and inspire subordinates to achieve the shared vision), and encourage subordinates to take on greater responsibility for both their own development and the development of others (Avolio, Bass, & Jung, 1997; Bass, 1985; Bycio, Hackett, & Allen, 1995; Howell & Avolio, 1993). They are also receptive to innovation and are likely to promote creativity in their subordinates (Avolio et al., 1997; Bass, 1985). Finally, they are more likely than transactional leaders to cater to individual followers' needs and competencies (Bycio et al., 1995; Howell et al., 1993). In contrast, transactional leaders tend to focus on short-term goals and the needs of their followers, since they operate predominantly through an economic exchange model, as exemplified by path-goal theory (Koh, Steers, & Terborg, 1995).

Since the inception of transformational leadership theory two decades ago, considerable empirical evidence has accumulated in support of the theory (Kirkpatrick & Locke, 1996). Despite this empirical support, questions remain as to what determines or predicts transformational leadership, and surprisingly little empirical evidence exists to help answer them. While much theoretical work has been done linking personality to transformational leadership (e.g., Bass, 1998; Hogan et al., 1994; Stogdill, 1974), most past research has used so many different types of traits that the relationships obtained are difficult to comprehend or integrate (see Bass, 1998). Organizing these findings around the FFM of personality, however, gives researchers a common platform for examining the relationships between personality and transformational leadership. To the best of our knowledge, only one study thus far has directly linked the FFM of personality to transformational leadership. Judge and Bono (2000) found that extroversion (corrected r = .28) and agreeableness (corrected r = .32) positively predicted transformational leadership. Although openness to new experience was correlated positively with transformational leadership, its effect was attenuated once the influence of the other traits was controlled. Despite the small to moderate relationships found, Judge and Bono (2000) provided preliminary evidence that certain FFM traits may be related to transformational leadership. Clearly, more empirical research is necessary to refine a theory linking the FFM of personality to transformational leadership.

Consistent with the results of Judge and Bono (2000), we predict that extroversion, agreeableness, and openness to new experience will be positively related to transformational leadership. For example, extroversion should be related because of the dominance and expressive components of the trait; agreeableness should be related because individual consideration requires empathy (a key component of agreeableness); and openness should be related because of the need for creativity to intellectually stimulate subordinates (see Judge & Bono, 2000, for more detail). Although Judge and Bono (2000) failed to find the hypothesized negative relationship between neuroticism and transformational leadership, given our military sample we believe that neuroticism should be negatively related to transformational leadership. As neuroticism is often associated with anxiousness, nervousness, low self-confidence, and low self-esteem (McCrae & Costa, 1991), neurotic military leaders would be unlikely to exhibit transformational leadership given the nature of the military environment (i.e., a context inherently hazardous and often life-threatening to both leaders and subordinates, and thus requiring a strong command structure and leadership). Such a finding is consistent with Ployhart et al. (2001), who found that more neurotic leaders performed worse on leadership exercises. Finally, like Judge and Bono (2000) and Ployhart et al. (2001), we do not expect conscientiousness to be related to transformational leadership.


Hypothesis 1: Extroversion will be positively related to transformational leadership behavior.

Hypothesis 2: Openness to new experience will be positively related to transformational leadership behavior.

Hypothesis 3: Agreeableness will be positively related to transformational leadership behavior.

Hypothesis 4: Neuroticism will be negatively related to transformational leadership behavior.

Transformational Leadership and Team Performance Under Typical and Maximum Contexts

One reason for the interest in transformational leadership is that it predicts a variety of important criteria. For example, a meta-analysis by Lowe, Kroeck, and Sivasubramaniam (1996) found that transformational leadership, aggregated across the four dimensions, was related to objective (corrected r = .30) and subjective (corrected r = .73) measures of leadership effectiveness. These relationships generalized across low-level (corrected r = .62) and high-level leaders (corrected r = .63), and across organizations in both the private (corrected r = .53) and public sectors (corrected r = .67). Another meta-analysis found that transformational leadership correlates with leader effectiveness even when the two are independently measured (corrected r = .34; Fuller, Patterson, Hester, & Stringer, 1996).

However, nearly all of the conceptual development and empirical work in transformational leadership research has been directed toward individual-level outcomes (e.g., individual satisfaction and performance). Little attention has been paid to the influence of a leader on group or organizational processes and outcomes (Conger, 1999; Yukl, 1999). In fact, a recent meta-analysis by Judge et al. (2002) did not find a single leadership study that had used group performance as the leadership effectiveness measure. Since then, only two empirical studies have linked transformational leadership to unit-level performance criteria: Bass et al. (2003) found that transformational leadership predicted unit performance in infantry teams, and Dvir et al. (2002) found that transformational leadership training resulted in better unit performance relative to groups that did not receive the training. Thus, while many argue that leadership effectiveness should be assessed in terms of team or organizational effectiveness (e.g., Hogan et al., 1994), in reality most studies evaluate leadership effectiveness in terms of ratings provided by superiors, peers, or subordinates (Judge et al., 2002).

Obviously, this is a critical void in the leadership literature, given the clear implications of transformational leadership for team-level outcomes. For example, the theory predicts that transformational leaders will inspire followers to transcend their own self-interests for a higher collective purpose (Howell et al., 1993). Likewise, Bass (1998) hypothesizes that transformational leadership fosters "a greater sense of a collective identity and collective efficacy" (p. 25). Transformational leaders are also instrumental in the development of important team processes such as unit cohesion and team potency (Bass et al., 2003; Dvir et al., 2002; Guzzo, Yost, Campbell, & Shea, 1993; Sivasubramaniam, Murry, Avolio, & Jung, 2002; Sosik, Avolio, Kahai, & Jung, 1998; Sparks & Schenk, 2001). Given the instrumental role of transformational leadership in the development of these team processes, it would hardly be surprising if teams with transformational leaders outperformed teams without such leaders (e.g., Dvir et al., 2002).

Theory and research must demonstrate links between transformational leadership and unit-level performance because, without such empirical research, we are forced to rely on findings at the individual level. This can potentially be a dangerous practice, as research on levels of analysis (e.g., Klein, Dansereau, & Hall, 1994; Kozlowski & Klein, 2000; Rousseau, 1985) has shown that findings at one level of analysis cannot automatically be assumed to exist at a higher level. Similarly, in practice leaders are expected to influence collective outcomes such as team performance and organizational effectiveness, and they are often held accountable for accomplishing such outcomes (Yammarino, Dansereau, & Kennedy, 2001). Clearly, for both theoretical and practical reasons, it is critical that transformational leadership be linked to team performance.

The present study provides some preliminary data on this issue by linking transformational leadership to team performance. Based on prior theory (e.g., Bass, 1985; Kozlowski, Gully, McHugh, Salas, & Cannon-Bowers, 1996) and previous empirical findings (Bass et al., 2003; Dvir et al., 2002), we expect a positive relationship to exist.

Beyond this simple relationship, we examine the relationship between transformational leadership and team performance assessed under typical and maximum performance contexts. As noted by Sackett, Zedeck, and Fogli (1988), maximum performance contexts occur when the following conditions are satisfied: (a) the person is aware that he or she is being evaluated, (b) the instructions to perform maximally on the task are accepted, and (c) the task is of relatively short duration, so the person can maximize effort. An important necessary condition for comparing typical and maximum measures is that only the performance context changes; the content of the performance domain must remain the same.

Sackett and colleagues demonstrated the importance of this distinction at the individual level by showing that typical and maximum performance are different constructs with different antecedents (DuBois, Sackett, Zedeck, & Fogli, 1993; Sackett et al., 1988; see also Ployhart et al., 2001). In this study, we do not claim to have direct measures of typical and maximum performance constructs; rather, we assess team performance under typical and maximum performance contexts. As such, we propose that teams face maximum performance contexts when the conditions of short time span, awareness of being evaluated, and acceptance of instructions to exert maximum effort are present (see Kozlowski et al., 1996). Common examples include SWAT teams, small-unit combat teams, and even project teams responding to crises.

One implication of distinguishing between the two performance contexts is that the determinants and consequences of transformational leadership may likewise differ. Preliminary evidence supports this assertion: Ployhart et al. (2001) found that the criterion-related validities of the FFM differed for typical and maximum leadership performance measures in a military sample. Openness to new experience was predictive of transformational leadership performance in a maximum performance condition, neuroticism was most predictive of transformational leadership in a typical performance condition (having an adverse effect on performance), and extroversion was predictive of both. Importantly, they found that the effect sizes tended to be stronger for maximum performance. However, they did not directly assess transformational leadership; they used ratings of transformational behaviors at the individual level. Thus, it is not known whether and how transformational leadership might relate differently to team performance in typical and maximum settings.

In this study, we extend these findings to propose that transformational leadership will be more predictive of team performance in maximum than in typical performance contexts. This expectation is consistent with theory, as many of the situations offered as requiring transformational leadership are inherently "maximum performance" unit-level phenomena (e.g., maintaining unit performance during a merger, or military units in combat). For example, Bass (1985, 1988, 1998) has repeatedly argued the importance of transformational leadership to groups and organizations during periods of stress, crisis, instability, and turmoil. Indeed, transformational leadership makes a difference in these situations. First, transformational leaders, using inspirational motivation and individualized consideration behaviors, are able to reduce the stress experienced by followers by instilling a sense of optimism and collective efficacy (Bass, 1998). Second, transformational leaders, using idealized influence behaviors, can direct followers' attention to a superordinate goal and lead followers toward the resolution of the crisis (Bass, 1998). Third, transformational leaders, using intellectual stimulation behaviors, are able to break out of old rules and mindsets and encourage their followers to do likewise, by promoting an effective decision-making process whereby different ideas, opinions, and alternatives are freely articulated before a decision is reached (Atwater & Bass, 1994; Bass, 1990). Based on this theoretical reasoning, we propose:

Hypothesis 5: Transformational leadership will be more predictive of team performance in maximum than in typical performance contexts.
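One simple way to examine a hypothesis like this is to compare the leadership-performance correlation across the two contexts using Fisher's r-to-z transformation, sketched below in Python with hypothetical correlation values. Because the same 39 teams are rated in both contexts, the two correlations are dependent, and a full analysis would need a test that accounts for that dependence; the independent-samples comparison here is illustrative only.

    import math

    def fisher_z(r):
        # Fisher r-to-z transformation.
        return 0.5 * math.log((1 + r) / (1 - r))

    def compare_independent_r(r1, n1, r2, n2):
        # z statistic for the difference between two independent correlations.
        se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
        return (fisher_z(r1) - fisher_z(r2)) / se

    # Hypothetical values: r between leadership and team performance in the
    # maximum vs. typical context, with 39 teams in each condition.
    z = compare_independent_r(0.50, 39, 0.20, 39)
    print(f"z = {z:.2f}")  # |z| > 1.96 suggests a reliable difference at alpha = .05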

Transformational Leadership as a Mediator Between the FFM and Team Performance

Thus far, we have discussed the antecedents and consequences of transformational leadership in a bivariate fashion. Yet as Figure 1 shows, the FFM, transformational leadership, and team performance are theoretically expected to relate to one another in a mediated multivariate model. Such a model is consistent with recent suggestions to develop process models linking personality to work outcomes (e.g., Barrick, Mount, & Judge, 2001).

Extending the logic outlined in the previous sections, we propose that transformational leadership will fully mediate the relationship between the FFM and team performance in the maximum performance context, but only partially mediate that relationship in the typical context. We base these hypotheses on several lines of evidence. First, as noted previously, transformational leadership is expected to be most important in times of extreme time pressure, stress, and instability, that is, under maximum performance conditions (e.g., Bass, 1988, 1998; Ployhart et al., 2001); in such conditions, transformational leadership should be the primary determinant of team performance. Second, transformational leadership will still be important under typical performance contexts, but to a lesser extent than in maximum performance contexts, and the more "mundane" nature of typical performance will allow personality to also be important. This is based on previous theory and research arguing that personality is a stronger predictor of typical performance because the personality-based behaviors of effort and choice are more constrained in maximum performance contexts. In contrast, the long time periods involved in typical contexts allow individual differences in effort and choice to manifest themselves more strongly, and thus personality will determine performance (e.g., Cronbach, 1949, 1960; DuBois et al., 1993; Ployhart et al., 2001; Sackett et al., 1988).

Hypothesis 6: Transformational leadership will fully mediate the relationship between leader personality (in terms of the FFM) and team performance in maximum contexts.

Hypothesis 7: Transformational leadership will partially mediate the relationship between leader personality (in terms of the FFM) and team performance in typical contexts.
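Mediation hypotheses of this form are often tested with a series of regressions in the spirit of Baron and Kenny's classic procedure, a standard reference we supply here; the paper's own analytic choices are described in the Method and Results. The following is a minimal Python sketch with simulated data, illustrating the general logic rather than this study's analysis: full mediation is suggested when the direct effect of personality on performance shrinks toward zero once transformational leadership is controlled.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 39  # one observation per team, matching this study's team-level design

    # Simulated team-level data under an assumed mediation structure:
    # extroversion -> transformational leadership (TFL) -> team performance.
    extroversion = rng.normal(size=n)
    tfl = 0.5 * extroversion + rng.normal(scale=0.8, size=n)
    performance = 0.6 * tfl + rng.normal(scale=0.8, size=n)

    def ols(y, *predictors):
        # OLS coefficients (intercept first) via least squares.
        X = np.column_stack([np.ones(len(y))] + list(predictors))
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return beta

    c = ols(performance, extroversion)[1]                  # total effect of X on Y
    a = ols(tfl, extroversion)[1]                          # effect of X on the mediator M
    b, c_prime = ols(performance, tfl, extroversion)[1:3]  # M on Y, and X on Y controlling M

    print(f"total c = {c:.2f}, a = {a:.2f}, b = {b:.2f}, direct c' = {c_prime:.2f}")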

Method

Sample

The sample comprised participants from the Singapore Armed Forces: (a) 39 team leaders, (b) 202 followers, (c) 20 superiors of these combat teams, and (d) 15 assessment center assessors. In total, 276 military personnel participated in the study. The team leaders and soldiers constituted 39 combat teams. These were intact teams going through training; team leaders and team members had originally been randomly assigned to these teams according to standard military practice. The combat teams had been training together for nearly three months prior to the commencement of the study. Team size varied from four to seven members, with a mean of five. The participants were all males enlisted for compulsory National Service, ranging in age from 18 to 23 years (M = 19.3, SD = 1.04). The racial composition of the sample mirrored the general population, which is predominantly Chinese.

Team performance was measured by ratings from various sources (superiors and assessment center assessors) under maximum and typical performance contexts. Team performance measures under the typical performance context were obtained via supervisory ratings near the end of the team training. As described by Sackett et al. (1988), these performance measures are similar to performance appraisal ratings in organizations, in that they assess performance over a longer time period. On the other hand, team performance measures under the maximum performance context were obtained during a one-day assessment center conducted to evaluate the combat proficiency of the team. Note that there was no overlap between raters providing performance ratings across the two conditions. Further, no raters would have known team members, because the raters came from other military units (a brief survey administered post hoc to six assessors and four supervisors supported these expectations). While we cannot definitively equate these two sets of performance measures with latent typical performance and maximum performance constructs (a point we return to in the Limitations section), the fact that the ratings were obtained under two very different measurement contexts is consistent with the requirements for typical and maximum performance conditions (e.g., Sackett et al., 1988). That is, participants were fully aware that the assessment center was an evaluative context, they were given explicit instructions to maximize their performance, and the assessment center took place over a short period of time (i.e., one day).

Procedure

Participants were team members of intact military teams undergoing military training. Leaders and team members were originally randomly assigned to form these teams by the unit commanders. About 10 weeks into the training, leaders completed a measure of the FFM of personality, while their subordinates' ratings of the leader's transformational leadership were obtained through a survey administered by one of the primary researchers and several assistants. Given the highly intensive and interactive time subordinates spent with their leaders, followers should have had sufficient opportunity to observe, and thus provide accurate ratings of, transformational leadership. About three weeks later, supervisors' ratings of the team's training performance were collected. These ratings of the team over the 3-month training course are reflective of performance under more typical conditions. The teams were trained to perform basic military tasks such as capturing an enemy observation post or laying an ambush. At about the same time, an assessment center, designed to evaluate the combat proficiency of the combat team, was used to obtain measures of the team's performance in maximum performance contexts. Different sets of evaluators were used to provide the typical and maximum performance measures. Given that different sources completed the various measures, same-source bias was less of an issue in this study, although this does not eliminate other potential sources of shared contamination between the ratings (an issue we address more fully in the Limitations section). Prior to the data collection, we checked with the unit commanders to ensure these combat teams were being trained and evaluated in accordance with the stipulated training doctrine.


Measures

Leader Personality. The personality of the leaders was measured using the International Personality Item Pool (IPIP) developed by Goldberg (1998, 1999). The IPIP is a broad-bandwidth, public-domain personality inventory that directly measures the FFM. It was developed as part of an international development program (e.g., Hofstee, de Raad, & Goldberg, 1992). Although items were also developed to measure facets, we did not collect these data because Judge and Bono (2000) found that the specific facets of the FFM predicted transformational leadership less well than the general factors. The IPIP instrument is a 50-item measure with 10 items for each factor of the FFM (i.e., extroversion, agreeableness, conscientiousness, neuroticism, and openness to new experience). In this study, we found the following alpha reliabilities: .77, .74, .72, .82, and .80, respectively. These reliabilities are similar to those reported by Ployhart et al. (2001): .80, .67, .75, .83, and .77. Each item was assessed using a 5-point scale ranging from 1 (very inaccurate) to 5 (very accurate), and each factor was scored such that higher numbers indicate greater quantities of the trait.
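As a concrete illustration of this kind of scoring, the sketch below scores a 50-item FFM inventory and computes coefficient alpha for a factor's items. The keying dictionary is hypothetical (the actual IPIP scoring key is published at ipip.ori.org); the reverse-keying rule (6 - x on a 1-5 scale) and the alpha formula are standard.

```python
# Minimal sketch of scoring a 50-item FFM inventory and checking internal
# consistency. The keying passed in is hypothetical, not the real IPIP key.
import numpy as np

def score_ffm(responses, keys):
    """responses: (50,) array of 1-5 ratings for one leader.
    keys: factor name -> list of (item_index, reverse_keyed) pairs."""
    scores = {}
    for factor, items in keys.items():
        # Reverse-keyed items are flipped so higher always means more of the trait.
        vals = [6 - responses[i] if rev else responses[i] for i, rev in items]
        scores[factor] = float(np.mean(vals))
    return scores

def cronbach_alpha(items):
    """items: (n_respondents, k_items) matrix holding one factor's 10 items."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)
```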

Transformational leadership. Transformational leadership of the military leaders was measured using the 36-item Multifactor Leadership Questionnaire (MLQ Form 5X; Avolio, Bass, & Jung, 1999). Followers described their leader using a frequency scale ranging from 1 (not at all) to 5 (frequently, if not always). Note that the MLQ Form 5X uses a 0-4 rating scale; we used a 1-5 scale in this study to be consistent with existing military answer sheets. However, the items and anchors for our rating scale are identical to those from the MLQ, so the change in scale is a straightforward linear transformation. Furthermore, raters should have used the rating scales in an equivalent manner, because considerable research suggests it is rater training, and not the rating format, that most influences rating variance (see Landy & Farr, 1980; Murphy & Cleveland, 1995, for reviews). The five scales used to measure transformational leadership were charisma/idealized influence (attributed), charisma/idealized influence (behavior), inspirational motivation, intellectual stimulation, and individualized consideration. Like previous research (Judge & Bono, 2000), we combined these dimensions into an overall measure of transformational leadership. The internal consistency reliability of the overall transformational leadership scale was .88. To justify aggregation, we calculated intraclass correlation coefficients (ICC(1); Ostroff & Schmitt, 1993). In the present study, the ICC(1) was .22 (p < .05). Past research has used ICC(1) levels ranging from .12 (James, 1982) to .20 (Ostroff & Schmitt, 1993) to justify aggregation. Hence, given the high level of ICC(1), aggregating followers' transformational leadership scores to reflect the transformational leadership of the team leader is statistically justified.
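To make the aggregation statistic concrete, here is a minimal sketch of one-way random-effects ICCs computed from followers' ratings grouped by team. This is the standard mean-squares formulation, offered as an assumption since the paper cites Ostroff and Schmitt (1993) without giving formulas; ICC(1) estimates the reliability of a single follower's rating, and ICC(2), used later for the supervisor ratings, estimates the reliability of the group mean.

```python
# Sketch of one-way random-effects ICC(1) and ICC(2). Group sizes may be
# unequal, so the average group size is used as an approximation.
import numpy as np

def icc1_icc2(groups):
    """groups: list of 1-D arrays, one array of follower ratings per team."""
    n_groups = len(groups)
    k = np.mean([len(g) for g in groups])          # average group size
    grand = np.concatenate(groups).mean()
    # Between-team and within-team mean squares from a one-way ANOVA.
    msb = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / (n_groups - 1)
    msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (
        sum(len(g) for g in groups) - n_groups
    )
    icc1 = (msb - msw) / (msb + (k - 1) * msw)     # single-rater reliability
    icc2 = (msb - msw) / msb                       # group-mean reliability
    return icc1, icc2
```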

Team Performance in Typical Contexts. Supervisors' ratings of team training performance were obtained near the end of the team training. Five supervisors provided performance ratings for each team. As these superiors were directly involved in the training of these teams, they had ample opportunity to observe the teams in action. Supervisors were asked to rate each team's performance on two dimensions: the efficiency of the team actions and the quality of the team actions. That is, supervisors rated the team's effectiveness and efficiency in learning and practicing for the military exercises that were later evaluated in the assessment center (i.e., the maximum performance condition). Supervisors were instructed not to base their assessment on the team's performance in day-to-day garrison activities (e.g., guard duty, physical fitness training, administration), but rather to focus only on the behaviors associated with the training program (e.g., actions taken to secure a critical road junction for friendly forces). Therefore, the performance measures across the two contexts tapped the same performance domains at the same level of specificity (e.g., Sackett et al., 1988). Each of the two items was rated on a 5-point Likert scale, where higher scores reflect higher efficiency or quality of team actions. As these two scores were highly correlated (r = .72), we averaged them to form a composite team performance score. Given that ICC(1) was .35 (p < .05) and ICC(2) was .85, we averaged the scores across raters to form an overall typical team performance rating for each team.

Team Performance in Maximum Contexts. These ratings were collected during a one-day assessment center conducted at the end of the team training to evaluate the combat proficiency of the team. One external assessor was randomly assigned to evaluate the performance of the team over a series of six military tasks (e.g., the team might be tasked by HQ to evacuate a casualty from one place to another); these tasks comprehensively summarized the types of tasks performed as part of the team training. The assessor used a 5-point Likert scale to evaluate the efficiency and the quality of team actions on each of the tasks. As there was only one assessor per team, inter-rater reliability was not available; however, the inter-task reliability was .90 for the efficiency measure and .87 for the quality of team actions. As with the ratings collected under the typical performance context, these two measures were highly correlated (r = .67, p < .01), and so a composite team performance measure was created.

Content Equivalence of the Performance Ratings. To ensure that the content and specificity of the performance measures were sufficiently similar across the two performance contexts, we asked 10 subject matter experts (SMEs) with extensive experience in training and evaluating this type of combat team to respond to a 10-item survey. The survey sought their opinions about the overlap of the performance domains being assessed by the supervisors and the assessors under the two performance conditions. As shown in Table 1, the responses from these SMEs demonstrate that the performance domains assessed by the supervisors and the assessors were highly similar in content. The mean response from these SMEs on the 10-item survey was 5.2 on a six-point scale, indicating these raters believed the overlap between the two performance measures was present to at least "a great extent." This information, coupled with the fact that raters were instructed to consider only team behaviors associated with the content of the training, and that the same performance dimensions and response scales were used for both conditions, suggests that the performance ratings assessed the same performance domain with the same degree of specificity. Thus, the ratings obtained under the two contexts differ in terms of the performance demands placed on the team (typical or maximum), and perhaps also the knowledge raters had about the teams and leaders (a point we return to in the Limitations section).

Results

Power Analysis and Data

In contrast to research at the individual level of analysis, the difficulty of collecting data from large samples of intact teams usually results in smaller sample sizes. For instance, Liden, Wayne, Judge, Sparrowe, Kraimer, and Franz (1999) analyzed 41 workgroups, while Marks, Sabella, Burke, and Zaccaro (2002) analyzed 45 teams. Indeed, Cohen and Bailey (1997) report that the number of teams in project team research (the type of teams most similar to those in the present study) averages only 45 per study. Such is the case in this study, making careful consideration of power, p-values, and effect size important.

A power analysis shows that given a sample size of 39 teams, there is only a 59% chance of detecting moderate effects at p < .05 (one-tailed) (Cohen, 1988). However, with a one-tailed test and p < .10, power becomes 73%. Hence, we considered values with p < .10 (one-tailed) to be statistically significant instead of p < .05 (one-tailed) because of the low statistical power arising from the small sample size (however, when values reach p < .05, we report the lower p value). Note that because we hypothesize a specific direction for our hypothesis tests, a one-tailed test is appropriate (Cohen & Cohen, 1983). In light of recent recommendations (e.g., Wilkinson et al., 1999), and to help better interpret the magnitude of these effects, we also report 90% confidence intervals around each of our hypothesis tests. Thus, by presenting effect sizes, p-values, and confidence intervals, readers should be best able to determine the "importance" of the effects and tests we report.
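The reported power figures can be approximately reproduced with a Fisher z approximation for the test of a single correlation; the sketch below is one standard way to do the calculation, not necessarily the tabled method from Cohen (1988) that the authors used.

```python
# Approximate power of a one-tailed test of H0: rho = 0 via the Fisher z
# transform; r = .30 is Cohen's "moderate" correlation.
import numpy as np
from scipy.stats import norm

def power_one_tailed_r(r, n, alpha):
    ncp = np.arctanh(r) * np.sqrt(n - 3)           # noncentrality of the z statistic
    return 1 - norm.cdf(norm.ppf(1 - alpha) - ncp)

print(round(power_one_tailed_r(0.30, 39, 0.05), 2))  # ~0.58-0.59
print(round(power_one_tailed_r(0.30, 39, 0.10), 2))  # ~0.72-0.73
```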

A different concern, perhaps more important with smaller sample sizes, is the presence of outliers and extreme cases. To ensure the results were not biased by extreme cases, we examined the distributions of the variables in terms of skewness and kurtosis (zero represents a perfectly normal distribution; absolute skewness greater than 3 and absolute kurtosis greater than 7 are indicative of nonnormal distributions; see West, Finch, & Curran, 1995). None of the measures were nonnormal; openness to new experiences showed the largest deviation from normality (skewness = -1.27; kurtosis = 4.37) but still fell well within the acceptable range.
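A sketch of this screening step using the West et al. (1995) cutoffs follows; note that scipy's kurtosis defaults to excess kurtosis, which matches the convention of zero under normality.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def normality_screen(x, name):
    """Flag a variable as nonnormal using the West et al. (1995) cutoffs."""
    s = skew(np.asarray(x))
    k = kurtosis(np.asarray(x))       # excess kurtosis: 0 under normality
    print(f"{name}: skewness = {s:.2f}, kurtosis = {k:.2f}, "
          f"nonnormal = {abs(s) > 3 or abs(k) > 7}")
```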

Performance Ratings Across Typical and Maximum Contexts

Table 2 shows the means, standard deviations, and correlations for all measures. As reflected in Table 2, the low correlation between the team performance measures across the typical and maximum performance contexts (r = .18, ns) suggests the ratings from these contexts were not interchangeable.

Hypotheses

Hypotheses 1 through 4 predicted that extroversion (Hypothesis 1), openness to new experiences (Hypothesis 2), and agreeableness (Hypothesis 3) would be positively related to transformational leadership, while neuroticism (Hypothesis 4) would be negatively related. As shown in the last row of Table 2, transformational leadership is positively related to extroversion (r = .31, p < .05, [.04; .53]), but negatively related to both neuroticism (r = -.39, p < .05, [-.59; -.14]) and agreeableness (r = -.29, p < .05, [-.52; -.03]). Transformational leadership is not significantly related to openness to experience or conscientiousness. Hence, Hypotheses 1 and 4 were supported while Hypotheses 2 and 3 were not. While the relationship between transformational leadership and agreeableness was significant, it was in the direction opposite to that hypothesized.
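The bracketed intervals can be reproduced with the usual Fisher z construction; a minimal sketch, assuming that is how the 90% confidence intervals were computed:

```python
# 90% confidence interval for a correlation via the Fisher z transform.
import numpy as np
from scipy.stats import norm

def r_ci(r, n, level=0.90):
    half = norm.ppf(1 - (1 - level) / 2) / np.sqrt(n - 3)
    z = np.arctanh(r)
    return np.tanh(z - half), np.tanh(z + half)

print(r_ci(0.31, 39))  # roughly (.04, .53), matching the reported interval
```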

In line with Murphy's (1996) recommendation that personality should be examined using a multivariate framework, we also conducted a multiple regression analysis in which transformational leadership was regressed on all of the FFM constructs. As shown in Table 3, the overall model comprising the five personality factors was significant, explaining 28% of the variance in transformational leadership ratings (F[5, 33] = 2.59, p < .05). However, only neuroticism (β = -.29, p < .10, [-.57; -.01]) and agreeableness (β = -.30, p < .10, [-.58; -.02]) were significant predictors at p < .10 (one-tailed).
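A sketch of this multivariate step with standardized variables, so the coefficients read as betas; the function and data-loading names are placeholders, and statsmodels is an illustrative choice rather than whatever package the authors used.

```python
# OLS of transformational leadership on the five standardized FFM scores.
import numpy as np
import statsmodels.api as sm

def regress_tl_on_ffm(ffm, tl):
    """ffm: (39, 5) array of factor scores; tl: (39,) aggregated MLQ scores."""
    Z = (ffm - ffm.mean(axis=0)) / ffm.std(axis=0, ddof=1)
    y = (tl - tl.mean()) / tl.std(ddof=1)
    fit = sm.OLS(y, sm.add_constant(Z)).fit()
    print(fit.rsquared)                  # ~.28 in Table 3
    print(fit.conf_int(alpha=0.10))      # 90% intervals around each beta
    return fit
```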

Next, Hypothesis 5 predicted that transformational leadership would be more predictive of team performance in maximum than in typical performance contexts. As Table 2 shows, transformational leadership was significantly related to team performance in both typical contexts (r = .32, p < .05, [.06; .54]) and maximum contexts (r = .60, p < .05, [.40; .75]). The formula proposed by Williams (1959) and Steiger (1980) was used to test the difference between two non-independent correlations. We found these correlations to be significantly different, t(36) = 1.63, p < .10 (one-tailed), although the confidence intervals overlapped slightly. Thus, we concluded that the relationship between transformational leadership and team performance was significantly stronger in the maximum context than in the typical context, supporting Hypothesis 5.
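The dependent-correlation test is easy to reproduce from the three correlations in Table 2; below is a sketch of the Williams (1959) t as presented by Steiger (1980), with transformational leadership as the shared variable.

```python
# Williams' t for two dependent correlations r13 and r23 that share variable 3
# (here, 1 = maximum performance, 2 = typical performance, 3 = leadership).
import numpy as np

def williams_t(r13, r23, r12, n):
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23  # |R| of the 3x3 matrix
    rbar = (r13 + r23) / 2
    t = (r13 - r23) * np.sqrt(
        (n - 1) * (1 + r12)
        / (2 * ((n - 1) / (n - 3)) * det + rbar**2 * (1 - r12) ** 3)
    )
    return t, n - 3

print(williams_t(0.60, 0.32, 0.18, 39))  # ~ (1.63, 36), as reported
```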

Mediation Hypotheses

We examined transformational leadership as a mediator of the relationship between leader personality (i.e., the FFM) and team performance using the procedure outlined by Baron and Kenny (1986). Hypothesis 6 predicted that transformational leadership would fully mediate the relationship between the FFM and team performance in maximum contexts, while Hypothesis 7 predicted that transformational leadership would partially mediate the relationship between the FFM and team performance in typical contexts. To test transformational leadership as a mediator, we first examined whether the FFM accounted for significant variance in transformational leadership and in both team performance measures, and whether transformational leadership was related to both team performance measures. If these regression models were statistically significant, we could then examine the effects of the FFM on both team performance measures after controlling for transformational leadership.
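A sketch of the final step of this hierarchical test, assuming standard OLS and the textbook R-squared-change F statistic (statsmodels is again an illustrative choice, and the degrees of freedom follow the textbook formula, which may differ slightly from the values printed below):

```python
# Does the FFM add variance in team performance beyond transformational
# leadership? A nonsignificant increment is consistent with full mediation.
import numpy as np
import statsmodels.api as sm

def ffm_increment_over_tl(ffm, tl, perf):
    """ffm: (n, 5); tl, perf: (n,) arrays for the 39 teams."""
    reduced = sm.OLS(perf, sm.add_constant(tl)).fit()
    full = sm.OLS(perf, sm.add_constant(np.column_stack([tl, ffm]))).fit()
    d_r2 = full.rsquared - reduced.rsquared
    n, p_full, added = len(perf), 6, 5          # 6 predictors in full model, 5 added
    d_f = (d_r2 / added) / ((1 - full.rsquared) / (n - p_full - 1))
    return d_r2, d_f
```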

Results from the multiple regression analyses show that the FFM explained significant variability in transformational leadership (R² = .28; F[5, 33] = 2.59, p < .05, one-tailed), team performance in typical contexts (R² = .41; F[5, 33] = 4.50, p < .05, one-tailed), and team performance in maximum contexts (R² = .20; F[5, 33] = 1.62, p < .10, one-tailed). These findings indicate that the FFM is associated with transformational leadership and with both team performance measures.

To test for mediation, we entered the FFM into the regression equation after controlling for transformational leadership. The FFM did not produce a significant increment in variance for predicting team performance in maximum contexts (∆R² = .12; ∆F[5, 31] = 1.42, ns). On the other hand, the FFM accounted for significant incremental variance in predicting team performance in typical contexts after controlling for transformational leadership (∆R² = .34; ∆F[5, 31] = 3.87, p < .05). Hence, both Hypotheses 6 and 7 were supported. That is, transformational leadership fully mediated the relationship between the FFM and team performance in the maximum performance context, but only partially mediated the relationship between the FFM and team performance in the typical performance context. Keep in mind that this finding is primarily due to the fact that the relationship between the FFM and performance in the typical context was about twice as strong as it was in the maximum context, while the relationship between transformational leadership and performance was about twice as strong in the maximum context as it was in the typical context.

Discussion

The purpose of this study was to examine the antecedents and consequences of transformational leadership. Four of the FFM personality constructs were hypothesized as antecedents of transformational leadership, and the consequences of transformational leadership were expected to occur for team performance across typical and maximum performance contexts, but more strongly for the latter. The results suggest that transformational leadership is positively related to extroversion and negatively related to agreeableness and neuroticism, although in a multiple regression only neuroticism and agreeableness were predictive. The results also show that transformational leadership has important consequences for team performance, but the magnitude of these relationships depends on whether performance is assessed in typical or maximum contexts. In particular, transformational leadership appears to be more predictive of team performance in maximum contexts. In addition, we found that transformational leadership fully mediated the relationship between the FFM and team performance in maximum contexts, while it only partially mediated the relationship between the FFM and team performance in typical contexts. We now describe these major findings in more detail.

Major Findings

Consistent with our bivariate hypotheses, extroversion was positively related to transformational leadership and neuroticism was negatively related. Contrary to our hypotheses, openness was unrelated to transformational leadership, and the significant effect for agreeableness was negative, opposite to our prediction. The multivariate analyses found that only neuroticism and agreeableness were significantly related to transformational leadership.

Comparing our multivariate effects to the multivariate findings of Judge and Bono (2000) shows both similarities and differences (we compare the multivariate models rather than the bivariate effects because, in practice, these various traits influence behavior in combination; Murphy, 1996). Both studies found effects for agreeableness, and no effects for conscientiousness and openness to new experience. Yet this study found the effect for agreeableness was negative, whereas Judge and Bono (2000) found it was positive. In other words, the more agreeable military leaders were rated as less transformational by their followers. Another difference is that this study found a significant effect for neuroticism, while Judge and Bono (2000) found a significant effect for extroversion.

The differences between our study and Judge and Bono (2000) may be due to the nature of our sample, which was primarily young and entirely male (a point we return to shortly). Alternatively, they might suggest the existence of moderators of the relationship between personality and transformational leadership, specifically setting (military versus civilian). For example, military samples were used in both Ployhart et al. (2001) and the present study, and both found that neuroticism and agreeableness had an adverse effect on transformational leadership. In contrast, the Judge and Bono (2000) sample comprised business leaders, and that study found no effect for neuroticism and a positive effect for agreeableness. Compared to business leaders, military personnel often have to work in hazardous and life-threatening situations; hence, the ability to remain calm, secure, and non-anxious is critical. Followers will often look to leaders for direction and leadership in these critical times; perhaps under such conditions being agreeable does not contribute to perceptions of effective leadership (e.g., followers may want direction in crisis situations). Future research will be necessary to determine whether context truly acts as a moderator of the personality-transformational leadership relationship.

With respect to the team performance measures, we found the correlation between team performance assessed in typical and maximum contexts was small and nonsignificant, a finding consistent with research conducted at the individual level (e.g., DuBois et al., 1993; Ployhart et al., 2001; Sackett et al., 1988). The consequence of this distinction at the team level can be seen when examining relations to leadership, as transformational leadership was more predictive of team performance when it was assessed in maximum performance contexts. While it may be too early to draw definitive conclusions given the small sample size and potential limitations of the design (discussed below), future research linking transformational leadership to team performance might consider this distinction. Previous research has found that transformational leaders are capable of developing important team processes (e.g., unit cohesion, team potency, collective efficacy, organizational trust and commitment, a sense of higher purpose or shared vision; Bass et al., 2003; Shamir et al., 1993); we speculate the consequences of these team processes may matter most in maximum performance conditions. More empirical research using tighter designs is clearly needed to test this hypothesis.



The finding that transformational leadership fully mediated the relationship between the FFM and team performance assessed in maximum contexts, but only partially mediated the relationship between the FFM and team performance assessed in typical contexts, is consistent with the hypothesis that typical predictors (such as personality) are more strongly related to typical performance measures than to maximum performance measures (Cronbach, 1949, 1960; DuBois et al., 1993; Sackett et al., 1988). This difference arises because individual differences in personality primarily manifest themselves in typical performance contexts; in maximum performance contexts, effort and choice are more constant across people (Cronbach, 1949, 1960). This pattern appeared in our data: the relationship between the FFM and performance in the typical context was about twice as large as the relationship between the FFM and performance in the maximum context, while transformational leadership showed just the opposite pattern, with the effect size being greater in the maximum than in the typical context. These findings suggest transformational leadership may be most critical in maximum performance contexts, and they have several implications for both transformational leadership and personality research. For example, there may be a greater need to evaluate leadership performance in light of team performance. If effective leadership is hypothesized to ultimately improve team effectiveness, team performance across different contexts may provide an additional criterion construct for leader selection, training, and development.

Limitations and Directions for Future Research

Like any field study, this one involves a number of potential issues we could not control that may influence the interpretation of our findings. Readers should be mindful of these alternative explanations, because they must be carefully considered in the design of any future research. Indeed, future theory building will depend on researchers adopting more stringent methods and designs that address the limitations noted here. One of the most pressing issues will be for future research to eliminate the potential contamination present among the various ratings. Despite the fact that several sources of ratings were used (e.g., supervisors, followers), the ratings may be contaminated to various degrees by different or common sources of information, ultimately making it difficult to cleanly demonstrate causal relationships. These are important concerns, and we discuss them in some detail below.

First, it is impossible to definitively conclude that the team performance measures collected under the typical and maximum performance contexts represent typical and maximum performance constructs. While our results may be consistent with this hypothesis, an alternative explanation is that raters differed in the types and amount of information they had about each leader and team in each context. For example, assessors were only able to observe the teams and leaders during the one-day exercise, and thus could only use this on-task information to make their ratings. Alternatively, raters providing ratings in the typical condition were with the teams and leaders for several weeks, and could have been influenced (whether consciously or not) by non-task information (e.g., the personality of the leader, how well they got along with the team), despite the presence of rater training. Thus, as noted by an anonymous reviewer, an alternative explanation of these findings is that differences in raters' implicit theories of leadership, not differences due to typical and maximum performance, account for the findings. It is also possible that differences in the reliability of the two sets of ratings partly explain our findings. For these reasons, it is important to realize that our results and theoretical implications speak only to there being a distinction between the two performance conditions; they do not provide strong evidence as to why the differences exist.


This is clearly an important area for future research, as studies should measure the relevant explanatory constructs (e.g., liking) to rule out this alternative explanation. Such research would contribute to a better understanding of the features and conditions underlying typical and maximum performance constructs. Although the three conditions proposed by Sackett et al. (1988) have helped stimulate research on this topic, they may be in need of further refinement. Research must begin to assess what are essentially "manipulation checks" to distinguish between the two conditions (e.g., assess perceptions of effort, instruction acceptance, and duration, as well as assessor biases and perceptions). With that said, the present study may still have implications for practice, as finding a difference between the two contexts has consequences for how organizations should use the two types of performance measures. Regardless of whether the difference is explained by differences in the performance construct or by differences in raters' implicit theories of performance, organizations that assess performance in typical and maximum conditions must realize these measures may not be interchangeable. For example, the practical question facing the organization examined in this study is which type of rating is most useful for different administrative purposes (e.g., assessment/validation, performance appraisal, development, promotion). Additionally, theoretical and empirical work on typical/maximum performance should be conducted at the team level of analysis, given the increasing prevalence of teams in organizations (Devine, Clayton, Philips, Dunford, & Melner, 1999).

A second and equally important issue relating to contamination concerns the followers' ratings of transformational leadership. As noted by an anonymous reviewer, follower ratings of transformational leadership may have been affected by the team's performance in stressful training exercises held during the first few weeks of training, and this may account for the large (.60) relationship between transformational leadership and team performance in maximum contexts. That is, success of the team in challenging contexts may have led followers to rate the leader as more transformational, creating a shared source of variance contributing to the relationship between transformational leadership and team performance. To the extent this contamination exists, it decreases our ability to make inferences about how much transformational leadership causes (or at least predicts) team performance.

A related concern is the potential for informal communication among the various sources, which would contribute common variance to the correlations. This is not method bias in the traditional sense, but rather the potential for common information about the quality of the unit to be known by assessors, supervisors, and followers. We noted earlier that assessors and followers were from different units, assessors were not aware of how the teams were performing in the training, and supervisors were not familiar with how their teams performed in the assessment exercises. Our post-hoc survey of supervisors and assessors supported these expectations, but the fact remains that some informal communication of team quality may have occurred.

Such concerns with contamination among the ratings must be addressed in future research for stronger causal relationships to be theoretically supported. For example, Kozlowski, Chao, and Morrison (1998) and Murphy and Cleveland (1995) review evidence showing that the provision of ratings for administrative purposes is largely a political process. In contrast, ratings collected for research-only purposes may often show better psychometric properties. Future research might therefore implement a formal research-only design in which the ratings are completely anonymous and confidential. Likewise, ratings may be supplemented with other sources of performance information (e.g., objective or administrative indices) to help understand the construct validity of the ratings. Finally, quasi-experimental designs such as that conducted by Dvir et al. (2002) would be most helpful in establishing stronger inferences of causality.



A third limitation is the relatively small sample size. Although 276 military personnel participated in this study, the unit of analysis was the team and its leader, which reduced the effective sample size to 39 teams and leaders. To some extent, this is the reality of team research in the field (e.g., Cohen & Bailey, 1997). However, our sample was also unique in that the participants were primarily young and entirely male. One implication of these realities is that effect sizes may differ in other contexts and settings. An important avenue for future research will be to replicate and extend this study using different samples (military/business), cultures, measures, designs, and contexts.

A fourth limitation is that we did not examine any team process variables in this study, and thus there is no way to determine why and how transformational leadership was related to team performance across the two contexts. Future research should include team process variables so that important mediators of the relationship between transformational leadership and team performance can be identified (e.g., Bass et al., 2003). Such mediated models also help establish stronger inferences about the causal sequence of psychological processes. Our prediction is that the mediators may be somewhat different across typical and maximum performance contexts, or at least that the effect sizes of these mediating processes will differ. An interesting question for future research is whether, although transformational leaders are capable of developing important team processes, the incremental value of these processes is only brought to bear when teams must perform in maximum conditions.

A final potential limitation regarding transformational leadership is that while we used the group mean as an indicator of the team leader's transformational leadership, the dispersion of team members' leadership ratings (operationalized in terms of the standard deviation or rwg) may be, in and of itself, an important reflection of team alignment. Indeed, Bliese and Halverson (1998) found that group consensus about leadership was related to psychological well-being. Future research should explore this issue further, perhaps in conjunction with mediating processes.

Conclusions

In today's dynamic workplace, organizations must increasingly contend with varying degrees of uncertainty stemming from mergers and acquisitions, global competition, and changes in the economy and stock market. It is in such times that transformational leadership is critically needed to lead these organizations out of uncertainty. This study attempts to fill several important voids in the transformational leadership literature by examining the potential dispositional antecedents of transformational leadership, and the consequences of transformational leadership for collective performance under typical and maximum performance contexts. We found that transformational leadership appears to be more critical for team performance under a maximum performance context than under a typical performance context. Future research should address the limitations present in this field study to help build theories linking transformational leadership to collective performance in typical and maximum contexts. A quote from Bass (1998) elegantly captures the essence of transformational leadership and our findings: "To be effective in crisis conditions, the leaders must be transformational (p. 42)...transforming crises into challenges (p. 45)." Future leadership research will likewise need to transform our understanding of individual-level processes in typical performance contexts into an understanding of multilevel processes in typical and maximum performance contexts.


References

Atwater, D.C., & Bass, B.M. (1994). Transformational leadership in teams. In B.M. Bass & B.J. Avolio (Eds.), Improving organizational effectiveness through transformational leadership (pp. 48-83). Thousand Oaks, CA: Sage.

Avolio, B.J., Bass, B.M., & Jung, D.I. (1999). Reexamining the components of transformational and transactional leadership using the Multifactor Leadership Questionnaire. Journal of Occupational and Organizational Psychology, 72, 441-462.

Barrick, M.R., & Mount, M.K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26.

Barrick, M.R., Mount, M.K., & Judge, T.A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9, 9-30.

Bass, B.M. (1981). From leaderless group discussions to the cross-national assessment of managers. Journal of Management, 7, 63-76.

Bass, B.M. (1985). Leadership and performance beyond expectations. New York: Free Press.

Bass, B.M. (1988). The inspirational processes of leadership. Journal of Management Development, 7, 21-31.

Bass, B.M. (1990). Bass and Stogdill's handbook of leadership: Theory, research, and managerial applications (3rd ed.). New York: Free Press.

Bass, B.M. (1998). Transformational leadership: Industrial, military, and educational impact. Mahwah, NJ: Erlbaum.

Bass, B.M., & Avolio, B.J. (1993). Full range leadership development: Manual for the Multifactor Leadership Questionnaire. Palo Alto, CA: Mind Garden.

Bass, B.M., Avolio, B.J., Jung, D.I., & Berson, Y. (2003). Predicting unit performance by assessing transformational and transactional leadership. Journal of Applied Psychology, 88, 207-218.

Becker, T.E., & Billings, R.S. (1993). Profiles of commitment: An empirical test. Journal of Organizational Behavior, 14, 177-190.

Burns, J.M. (1978). Leadership. New York: Harper & Row.

Bycio, P., Hackett, R.D., & Allen, J.S. (1995). Further assessments of Bass's (1985) conceptualization of transactional and transformational leadership. Journal of Applied Psychology, 80, 468-478.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Cohen, S.G., & Bailey, D.E. (1997). What makes teams work: Group effectiveness research from the shop floor to the executive suite. Journal of Management, 23, 239-290.

Conger, J.A. (1999). Charismatic and transformational leadership in organizations: An insider's perspective on these developing streams of research. Leadership Quarterly, 10, 145-179.

Conger, J.A., & Kanungo, R.N. (1988). Toward a behavioral theory of charismatic leadership. In J.A. Conger & R.N. Kanungo (Eds.), Charismatic leadership: The elusive factor in organizational effectiveness (pp. 78-97). San Francisco, CA: Jossey-Bass.

Cronbach, L.J. (1949). Essentials of psychological testing. New York: Harper.

Cronbach, L.J. (1960). Essentials of psychological testing (2nd ed.). New York: Harper.

Devine, D.J., Clayton, L.D., Philips, J.L., Dunford, B.B., & Melner, S.B. (1999). Teams in organizations: Prevalence, characteristics, and effectiveness. Small Group Research, 30, 678-711.

DuBois, C.L.Z., Sackett, P.R., Zedeck, S., & Fogli, L. (1993). Further exploration of typical and maximum performance criteria: Definitional issues, prediction, and white-black differences. Journal of Applied Psychology, 78, 205-211.

Dvir, T., Eden, D., Avolio, B.J., & Shamir, B. (2002). Impact of transformational leadership on follower development and performance: A field experiment. Academy of Management Journal, 45, 735-744.

Fullagar, C., McCoy, D., & Shull, C. (1992). The socialization of union loyalty. Journal of Organizational Behavior, 13, 13-26.

Fuller, B.J., Patterson, C.E.P., Hester, K., & Stringer, D.Y. (1996). A quantitative review of research on charismatic leadership. Psychological Reports, 78, 271-287.

Goldberg, L.R. (1990). An alternative "description of personality": The Big-Five factor structure. Journal of Personality and Social Psychology, 59, 1216-1229.

Goldberg, L.R. (1998). International Personality Item Pool: A scientific collaboratory for the development of advanced measures of personality and other individual differences [On-line]. Available: http://ipip.ori.org/ipip/ipip.html

Goldberg, L.R. (1999). A broad-bandwidth, public-domain personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7-28). Tilburg, The Netherlands: Tilburg University Press.

Guzzo, R.A., Yost, P.R., Campbell, R.J., & Shea, G.P. (1993). Potency in groups: Articulating a construct. British Journal of Social Psychology, 32, 87-106.

Hofstee, W.K., de Raad, B., & Goldberg, L.R. (1992). Integration of the Big Five and circumplex approaches to trait structure. Journal of Personality and Social Psychology, 63, 146-163.

Hogan, R., Curphy, G.J., & Hogan, J. (1994). What we know about leadership: Effectiveness and personality. American Psychologist, 49, 493-504.

House, R.J. (1977). A 1976 theory of charismatic leadership. In J.G. Hunt & L.L. Larson (Eds.), Leadership: The cutting edge (pp. 189-207). Carbondale, IL: Southern Illinois University Press.

Howell, J.M., & Avolio, B.J. (1993). Transformational leadership, transactional leadership, locus of control, and support for innovation: Key predictors of consolidated-business-unit performance. Journal of Applied Psychology, 78, 891-902.

James, L.R., Demaree, R.G., & Wolf, G. (1984). Estimating within-group interrater reliability with and without response bias. Journal of Applied Psychology, 69, 85-98.

Janis, I.L., & Mann, L. (1977). Decision making: A psychological analysis of conflict, choice, and commitment. New York: Free Press.

Judge, T.A., & Bono, J.E. (2000). Five-factor model of personality and transformational leadership. Journal of Applied Psychology, 85, 751-765.

Judge, T.A., Bono, J.E., Ilies, R., & Gerhardt, M.W. (2002). Personality and leadership: A qualitative and quantitative review. Journal of Applied Psychology, 87, 765-780.

Katz, D., Maccoby, N., & Morse, N.C. (1950). Productivity, supervision, and morale in an office situation. Part 1. Ann Arbor, MI: Institute for Social Research, University of Michigan.

Kirkpatrick, S.A., & Locke, E.A. (1996). Direct and indirect effects of three core charismatic leadership components on performance and attitudes. Journal of Applied Psychology, 81, 36-51.

Klein, K.J., Dansereau, F., & Hall, R.J. (1994). Levels issues in theory development, data collection, and analysis. Academy of Management Review, 19, 195-229.

Koh, W.L., Steers, R.M., & Terborg, J.R. (1995). The effects of transformational leadership on teacher attitudes and student performance in Singapore. Journal of Organizational Behavior, 16, 319-333.

Kozlowski, S.W.J., Chao, G.T., & Morrison, R.F. (1998). Games raters play: Politics, strategies, and impression management in performance appraisal. In J.W. Smither (Ed.), Performance appraisal: State of the art in practice (pp. 163-205). San Francisco, CA: Jossey-Bass.

Kozlowski, S.W.J., Gully, S.M., McHugh, P.P., Salas, E., & Cannon-Bowers, J.A. (1996). A dynamic theory of leadership and team effectiveness: Developmental and task contingent leader roles. Research in Personnel and Human Resources Management, 14, 253-305.

Kozlowski, S.W., & Hattrup, K. (1992). A disagreement about within-group agreement: Disentangling issues of consistency versus consensus. Journal of Applied Psychology, 77, 161-167.

Kozlowski, S.W.J., & Klein, K.J. (2000). A multilevel approach to theory and research in organizations: Contextual, temporal, and emergent processes. In K.J. Klein & S.W.J. Kozlowski (Eds.), Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions (pp. 3-90). San Francisco, CA: Jossey-Bass.

Landy, F.J., & Farr, J.L. (1980). A process model of performance rating. Psychological Bulletin, 87, 72-107.

Liden, R.C., Wayne, S.J., Judge, T.A., Sparrowe, R.T., Kraimer, M.L., & Franz, T.M. (1999). Management of poor performance: A comparison of manager, group member, and group disciplinary decisions. Journal of Applied Psychology, 84, 835-850.

Lowe, K.B., Kroeck, K.G., & Sivasubramaniam, N. (1996). Effectiveness correlates of transformational and transactional leadership: A meta-analytic review of the MLQ literature. Leadership Quarterly, 7, 385-425.

Marks, M.A., Sabella, M.J., Burke, C.S., & Zaccaro, S.J. (2002). The impact of cross-training on team effectiveness. Journal of Applied Psychology, 87, 3-13.

McCrae, R.R., & Costa, P.T., Jr. (1987). Validation of the five-factor model of personality across instruments and observers. Journal of Personality and Social Psychology, 52, 81-90.

McCrae, R.R., & Costa, P.T. (1991). Adding Liebe und Arbeit: The full five-factor model and well-being. Personality and Social Psychology Bulletin, 17, 227-232.

Murphy, K.R. (1996). Individual differences and behavior in organizations: Much more than g. In K.R. Murphy (Ed.), Individual differences and behavior in organizations (pp. 3-30). San Francisco, CA: Jossey-Bass.

Murphy, K.R., & Cleveland, J.N. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage.

Niehoff, B.P., Enz, C.A., & Grover, R.A. (1990). The impact of top-management actions on employee attitudes. Group & Organization Management, 15, 337-352.

Ostroff, C., & Schmitt, N. (1993). Configurations of organizational effectiveness and efficiency. Academy of Management Journal, 36, 1345-1361.

Pitman, B. (1993). The relationship between charismatic leadership behaviors and organizational commitment among white-collar workers. Dissertation Abstracts International, 54, 1013.

Ployhart, R.E., Lim, B.C., & Chan, K.Y. (2001). Exploring relations between typical and maximum performance ratings and the five factor model of personality. Personnel Psychology, 54, 809-843.

Rousseau, D. (1985). Issues of level in organizational research: Multi-level and cross-level perspectives. In L.L. Cummings & B.M. Staw (Eds.), Research in Organizational Behavior (Vol. 7, pp. 1-37). Greenwich, CT: JAI Press.

Sackett, P.R., Zedeck, S., & Fogli, L. (1988). Relations between measures of typical and maximum job performance. Journal of Applied Psychology, 73, 482-486.

Shamir, B., House, R.J., & Arthur, M.B. (1993). The motivational effects of charismatic leadership: A self-concept based theory. Organization Science, 4, 577-594.

Sivasubramaniam, N., Murry, W.D., Avolio, B.J., & Jung, D.I. (2002). A longitudinal model of the effects of team leadership and group potency on group performance. Group & Organization Management, 27, 66-96.

Sosik, J.J., Avolio, B.J., Kahai, S.S., & Jung, D.I. (1998). Computer-supported work group potency and effectiveness: The role of transformational leadership, anonymity and task interdependence. Computers in Human Behavior, 14, 491-511.

Sparks, J.R., & Schenk, J.A. (2001). Explaining the effects of transformational leadership: An investigation of the effects of higher-order motives in multilevel marketing organizations. Journal of Organizational Behavior, 22, 849-869.

Steiger, J.H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245-251.

Stogdill, R.M. (1974). Handbook of leadership: A survey of theory and research. New York: Free Press.

Stogdill, R.M., & Coons, A.E. (1957). Leader behavior: Its description and measurement. Columbus, OH: Bureau of Business Research, Ohio State University.

West, S.G., Finch, J.F., & Curran, P.J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R.H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 56-75). Thousand Oaks, CA: Sage.

Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

Williams, E.J. (1959). The comparison of regression variables. Journal of the Royal Statistical Society, Series B, 21, 396-399.

Yammarino, F.J., Dansereau, F., & Kennedy, C.J. (2001). A multiple-level multidimensional approach to leadership: Viewing leadership through an elephant's eye. Organizational Dynamics, 29, 149-163.

Yukl, G. (1999). An evaluation of conceptual weaknesses in transformational and charismatic leadership theories. Leadership Quarterly, 10, 285-305.


Author Note

We would like to thank the Associate Editor and two anonymous reviewers for their help and suggestions. We also appreciate the help and support of the Singapore Ministry of Defense. The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the Singapore Ministry of Defense.

Correspondence concerning this article should be sent to Beng-Chong Lim, Applied Behavioral Sciences Dept, 5 Depot Road, #16-01, Tower B, Defense Technology Towers, Singapore, 109681; email: lim_b_c@yahoo.com.



Table 1: Responses from 10 Subject Matter Experts on the Performance Content Survey

Response scale:
1 To no extent
2 To a limited extent
3 To some extent
4 To a considerable extent
5 To a great extent
6 Perfectly

To what extent:                                                                     M      SD
1. ...does the Assessment Center reflect the knowledge, skills, abilities,
   and tasks acquired during the team training?                                     5.3    .48
2. ...does team performance (rated at the end of the team training phase)
   reflect the knowledge, skills, abilities, and tasks required during the
   team training?                                                                   5.0    .47
3. ...is the content of the Assessment Center similar to the content of the
   team training?                                                                   5.3    .67
4. ...does the team performance rated in the team training phase tap the same
   performance domain as the team performance rated in the Assessment Center?      5.2    .63
5. ...do the ratings from the Assessment Center and the ratings from the team
   training phase tap the same dimensions of performance?                          5.1    .88
6. ...are the team training objectives similar to the performance criteria
   used in the Assessment Center?                                                  4.7    .67
7. ...are the team training objectives similar to the performance ratings
   used in the team training phase?                                                5.3    .48
8. ...are the team tasks (e.g., quick attack) performed in the Assessment
   Center similar to the team tasks learned in team training?                      5.5    .53
9. ...are the behaviors (e.g., fire and movement) exhibited in the Assessment
   Center similar to the behaviors exhibited during team training?                 5.3    .48
10. Yes or no: Does the Assessment Center measure the same content, tasks,
    knowledge, skills, abilities, and other characteristics as the team
    training?                                                                      YES: 10, NO: 0

Note: n = 10. These experts do not use the term "performance ratings in typical contexts"; rather, in the language of this organization such ratings are known as "team performance rated in the team training phase." We therefore used the language familiar to these experts to refer to the performance measures across both contexts.


Table 2: Leader and Team Descriptive Statistics for All Measures

Measures                           M     SD     1      2      3      4      5      6      7     8
Personality
1. Extroversion                   2.97   .70    -
2. Conscientiousness              3.55   .58    .15    -
3. Neuroticism                    2.97   .65   -.63*  -.04    -
4. Openness to Experience         3.35   .62    .42*   .45*  -.32*   -
5. Agreeableness                  3.84   .62    .22*   .21*  -.04    .58*   -
6. Transformational Leadership    2.35   .55    .31*  -.09   -.39*  -.08   -.29*   -
7. Maximum-Like Team Performance  3.87  1.04    .11   -.05   -.13   -.31*  -.21*   .60*   -
8. Typical-Like Team Performance  3.52   .46    .50*   .18   -.56*   .37*   .28*   .32*  .18   -

Note. N = 39 leaders and teams. * p < .10 (one-tailed).


Table 3: Regression for FFM on Transformational Leadership

                            Transformational Leadership
Variables                   β         Overall R²
Extroversion                 .24
Neuroticism                 -.29*
Conscientiousness           -.04
Openness to Experience      -.07
Agreeableness               -.30*
                                        .28**

Note. N = 39 leaders and teams. * p < .10 (one-tailed); ** p < .05 (one-tailed).
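As a reading aid only, the following minimal sketch (Python, with simulated stand-in data; all variable names are hypothetical, not the study's actual measures) shows how a regression of this form, standardized so the slopes are beta weights like those in Table 3, could be reproduced:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical stand-in data: one row per leader (the paper had N = 39).
    rng = np.random.default_rng(42)
    ffm_cols = ["extroversion", "conscientiousness", "neuroticism",
                "openness", "agreeableness"]
    df = pd.DataFrame(rng.normal(size=(39, 5)), columns=ffm_cols)
    # Simulated criterion; the real study used rated transformational leadership.
    df["transformational"] = (0.3 * df["extroversion"]
                              - 0.3 * df["agreeableness"]
                              + rng.normal(scale=0.8, size=39))

    # Z-score everything so the fitted slopes are standardized beta weights.
    z = (df - df.mean()) / df.std()
    fit = sm.OLS(z["transformational"], sm.add_constant(z[ffm_cols])).fit()
    print(fit.params)     # beta weights (cf. the β column of Table 3)
    print(fit.rsquared)   # overall R² (cf. .28 in Table 3)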

Figure 1: Proposed relationships among the variables. The dashed line indicates a posited weak relationship. [Figure boxes: Five Factor Model of Personality; Transformational Leadership; Team Performance in Typical Contexts; Team Performance in Maximum Contexts.]


ARMY LEADERSHIP COMPETENCIES: OLD WINE IN NEW<br />

BOTTLES?<br />

Brian Cronin, M.A., Ray Morath, Ph.D., and Jason Smith, M.A.<br />

Caliber Associates<br />

10530 Rosehaven St., Suite 400<br />

Fairfax, VA<br />

croninb@calib.com<br />

This paper will compare the current Army leadership competency model (FM 22-<br />

100) published in 1999, which is used as a guideline for Army leader development, to<br />

previous versions of FM 22-100 used by the Army. The purpose of the project is to<br />

identify the attributes and competencies that were prevalent in the past and remain<br />

prevalent today as well as to identify new competencies that have emerged over time.<br />

Specifically, do attributes or competencies such as duty, cultural awareness, and<br />

conceptual skills make unique and notable contributions to our understanding of new<br />

leader requirements above and beyond those of earlier Army leadership models?<br />

HISTORY OF LEADERSHIP MODELS<br />

Colonel Paparone (2001) explains that the development of military staff organization and procedure models can be traced back to 2000 B.C., beginning with the armies of early Egypt. However, James D. Hittle, a historian of military staffs, states that the modern military staff model did not emerge until the late 1800s (Paparone, 2001).

Hittle proposes that the modern staff system has certain distinguishable features:<br />

- A regular education system for training staff officers
- Delegation of authority from the commander
- Supervised execution of orders issued by or through the staff
- A set method of procedure by which each part performs specific duties.

These aspects of the modern models guide leaders in their duties and provide consistency<br />

to the larger organization.<br />

Although modern military staff models had emerged in Europe in the 1800s, the United States did not have a published modern US Army staff doctrine until after World War I, when, in 1924, the US published its first document providing leaders with formal requirements. This document was entitled 'Field Service Regulation (FSR)' (Paparone, 2001). However, FSR lacked specific detail and therefore could not provide the guidance that soldiers needed in the field.

To alleviate this situation, the Staff Officers’ Field Manual was introduced in<br />

1932. This manual provided significantly more information to Army leaders and was a<br />

modest success. It was an improvement over FSR because it provided leaders with<br />

principles, data, and decision-making tools to guide their operation of units and<br />

commands during peace and war rather than a simple set of rules that were to be blindly<br />

followed (Paparone, 2001). However, this manual also fell short because it could not<br />

adapt to the Army expansion that preceded World War II.<br />

As the Army began to expand for World War II, the scale and complexity of military planning and decision-making became increasingly intricate; thus, Army doctrine was forced to expand. The goal of this doctrine expansion was to create a



comprehensive guide that the Army could use to develop and guide its leaders across situations. In 1940, an expanded guide, US Army Field Manual (FM) 101-5, Staff Officers' Field Manual: The Staff and Combat Orders, was published. This document increased the scope and depth of the Army's doctrine well beyond

the 1932 version and allowed the Army to focus on more specific aspects of officer<br />

training. In 1948, the Army published its first manual focusing specifically on leadership,<br />

DA Pam 22-1. While this manual was only a pamphlet, the notion of having one guide to<br />

develop leadership within the Army was a concept that was quickly embraced. Since DA<br />

Pam 22-1’s creation, the Army has updated this field manual numerous times and has<br />

continued to use it as a building block for all subsequent leader development manuals.<br />

The current paper investigates the similarities and differences of these iterations over<br />

time.<br />

FM 22-100 COMPARISONS<br />

The Army's field manual has undergone nine iterations since its formation in 1948 (FM 22-10, 1951; FM 22-100, 1953; FM 22-100, 1955; FM 22-100, 1958; FM 22-100, 1961; FM 22-100, 1965; FM 22-100, 1973; FM 22-100, 1983; FM 22-100, 1990).

As Major Smidt (1998) comments, "From a humble start as a 1948 pamphlet titled Leadership, the doctrine has evolved into a comprehensive electronic treatise published on the World Wide Web." However, the questions to be answered are whether the content of this document has changed significantly and whether the evolutions in information presentation have aided leader comprehension of the material.

In general, the rest of this paper will describe the identification of past and present<br />

leader competencies that were identified by the Army Leadership manuals over time and<br />

our attempts to develop a crosswalk of these competencies—highlighting the emergence<br />

and expiration of Army leader competencies, including the documentation of<br />

competencies that have remained critical over time (even if their labels have been<br />

changed). To accomplish these goals, the current paper will use three versions of FM 22-<br />

100 as exemplars of the evolution of this document, FM 22-100, 1958; FM 22-100, 1973;<br />

and FM 22-100, 1999.<br />

FM 22-100 METHOD COMPARISON<br />

Army Leadership, Field Manual 22-100, has evolved over time and has used various methods for deriving and presenting leadership guidance. Early versions of the manual relied on the experience of military leaders such as General Omar N. Bradley (DA Pam 22-1, 1948) and General J. Lawton Collins (FM 22-10, 1951) to record their insights and experiences in order to teach other Army leaders. Their opinions, combined with pieces of supporting behavioral science research such as Maslow's hierarchy of needs (Maslow, 1954), were used as the foundation of these early documents.

The manual received its first considerable increase in content in 1953. This was<br />

the first publication of the Manual under the FM 22-100 title, which is still used today,<br />

and it was the first attempt at a comprehensive guide for Army leaders. Throughout the<br />

1950s, the manual was continually updated (1953, 1955, 1958). The 1958 version, the

last manual of that decade, highlights the vast improvements of the document since its<br />

early days as a humble pamphlet.<br />


The principles and techniques presented in the 1958 version were "the result of an analysis of outstanding leadership displayed by both military and civilian leaders" (FM 22-100, 1953). The result of this analysis was a description of 14 traits and 11 principles,

which provided the attributes a leader must have to succeed. This model was thorough,<br />

easily understood, and remained a fixture in Army leadership for the next 32 years until<br />

1990. Between 1958 and 1990, there were several updates to the manual (1961, 1965,<br />

1973, and 1983). Each of these 'improvements' provided leaders with more or less detail than the previous version (depending upon the trait or principle being presented) regarding how to develop themselves and their subordinates. For example, the 1958

version included, in addition to operational descriptions/definitions of each trait, lists of activities and methods leaders could use to develop these traits. The 1973 version did not include these lists of methods/activities, yet it offered situational studies to help leaders relate the material in the manual to the day-to-day issues that a leader might face in the field. Meanwhile, both versions offered 'Indications of Leadership' as benchmarks leaders could use to determine whether or not they were successful in their roles.

The 1990 manual, however, marked a significant departure from the trait/principle approach of earlier versions. The 1990 version of FM 22-100 used a factor analysis of leadership survey responses to establish the following nine leadership competencies: communications, supervision, teaching and counseling, Soldier team development, technical and tactical proficiency, decision making, planning, use of available systems, and professional ethics. In concept this set has many similarities to those of other versions, but it was presented in a different format.

The most recent version (1999) of FM 22-100 used quite a different approach to<br />

establish a framework of leadership than any of the earlier versions. This version<br />

presents 39 labels that specify what a leader of character and competence must be, know, and do. Within this framework are "be" dimensions consisting of values (7), attributes (3), and sub-attributes (13); "know" dimensions consisting of skills (4); and "do" dimensions consisting of actions (3) and sub-actions (9). Among the approaches used to derive these labels was borrowing from the notions of a best-selling job-hunting book (Bolles, 1992) that identified people, things, and ideas as critical to job success. This framework transposed these issues into interpersonal, technical, and conceptual skill areas and added an additional skill labeled 'tactical' to lend an Army flavor to the list.

This version also differed from the 1958 and 1973 models in that it no longer included<br />

lists of activities or methods that leaders could use to develop their leadership skills and<br />

abilities.<br />

In summary, the Army Leadership doctrine has used a variety of methods to<br />

derive very different types of leadership frameworks. While there may be no single<br />

correct method for establishing Army leadership requirements, there are several<br />

important considerations when attempting to develop a framework or model that<br />

prescribes those requirements. Among these are methodological rigor in development,<br />

comprehensiveness of the framework/model, consistency of the dimensions within the<br />

framework/model, ability to communicate the model/framework to the audience, and<br />

endurance of the framework/model over time.<br />



FRAMEWORK COMPARISON: 1958; 1973; 1999<br />

The comparison of the 1958, 1973, and 1999 manuals was revealing. Our research team created a crosswalk that compared the leadership requirements proposed by each of the three documents. The most noticeable finding to emerge from this exercise was that the 1958 and 1973 versions were highly similar in both their structure and their straightforward (i.e., handbook-like) presentation of their models. Meanwhile, the 1999 version offered a very different, if not more complex and encompassing, framework. One differentiating factor of the 1999 version was that it presented leader requirements in terms of values, attributes, skills, and actions (some of these dimensions also included sub-dimensions) organized in the Be-Know-Do model, while the earlier versions presented leader requirements in the somewhat more economical terms of traits and principles. In addition, the 1999 version provided 33 total requirements (24 values, attributes, and skills, plus 9 actions), while the 1958 and 1973 versions offered 25 requirements (14 traits and 11 principles).

While the 1999 version certainly differed from past models, our team noticed considerable overlap between the old and updated versions. For instance, our crosswalk of the three manuals indicated that 29 of the 33 requirements presented in the 1999 model were directly addressed in the earlier versions. Only 9 of the 33 requirements were found to have one-to-one correspondence in their labels across versions (i.e., the same label, such as loyalty, was used). Twenty of the 1999 requirements were mapped to earlier requirements with different labels but the same or highly similar definitions/descriptions of the requirement. For example, the value labeled "duty" from the 1999 version was linked to the trait labeled "dependability" from the 1958 and 1973 versions because the definitions/descriptions were highly similar:

Duty (1999): Duty begins with everything required of you by law, regulation, and<br />

orders; but it includes much more than that. Professionals do their work not just to<br />

the minimum standard, but to the very best of their ability.<br />

Dependability (1973): The certainty of proper performance of duty; To carry out<br />

any activity with willing effort; To continually put forth one’s best effort in an<br />

attempt to achieve the highest standards of performance and to subordinate<br />

personal interests to military requirements.<br />

Dependability (1958): The certainty of proper performance of duty. A constant<br />

and continuous effort to give the best a leader has in him. Duty demands the<br />

sacrifice of personal interests in favor of military demands, rules and regulations,<br />

orders and procedures, and the welfare of subordinates.<br />

Thus, our review suggests that although the terminology and labels for particular<br />

requirements changed over time, the actual content of those leader requirements remained<br />

relatively stable.<br />

Our review also revealed that the definitions/descriptions from the 1958 and 1973 versions often appeared to be more straightforward and concrete than those of the 1999 version. The earlier frameworks typically described, in more explicit terms than the 1999 model, the particular trait, how it differed from other similar traits, how it was manifested or demonstrated in a task or activity, and how it could be developed. Meanwhile,


the requirements in the 1999 model were often described in a more implicit manner. It used more general terms that could be applied across civilian and military settings and appeared to be directed at a more advanced audience in terms of their general knowledge of behavior and cognition. This version placed less emphasis on defining the operational parameters of each requirement (i.e., what it was and was not) and more emphasis on describing its importance to leadership.

Other requirements from the 1999 version, such as Honor, Will, Self-Discipline, Self-Control, and Balance, whose labels did not directly correspond with those of earlier versions, were nonetheless clearly linked to the definitions/descriptions of requirements in previous manuals. Requirements from the 1999 model that were not linked to requirements from earlier models included Self-Confidence, Intelligence, Cultural Awareness, and Health Fitness. These new requirements may represent vital additions in the face of the new missions that the Army faces in today's world. Additionally, one requirement from the earlier models was not linked to any of the 33 requirements of the 1999 model: the trait Enthusiasm was not directly addressed in the requirement definitions of the 1999 version. It is possible that Enthusiasm may no longer be required, or it may be that the authors misread the requirement definitions of the 1999 model, in which particular phrases may have implied some characteristic of enthusiasm (or a like characteristic) on the part of the leader.

DISCUSSION<br />

This review has provided a unique understanding of the development of FM 22-100. Each version has built on the previous one and provided more information to help leaders grow and succeed. This commitment to improvement on the part of the Army has resulted in a 1999 version that has more detail than previous models. As General Patch (1999) indicates, "the (1999) manual takes a qualitative step forward by:

- Thoroughly discussing character-based leadership.
- Clarifying values.
- Establishing attributes as part of character.
- Focusing on improving people and organizations for the long term.
- Outlining three levels of leadership – direct, organizational and strategic.
- Identifying four skill domains that apply at all levels.
- Specifying leadership actions for each level.

... (Further), more than 60 vignettes and stories illustrate historical and contemporary examples of leaders who made a difference. The (1999) manual captures many of our shared experiences, ideas gleaned from combat, training, mentoring, scholarship and personal reflection."

Our analysis led us to conclude that the content of each manual was similar but that the 1999 version implements a completely new framework for presenting this material. The vast majority of leader requirements, and the competencies underlying those requirements, have remained stable even though the labels for these requirements and the complexity of the leader requirements models have evolved over time.



Our review of the evolution of the leader requirements models found that the earlier models were practical and straightforward in format and had the look and feel of handbooks. While these models provided clear descriptions of the requirements, they also contained fewer requirements. These models were practical and utilitarian, and their strengths lay in their parsimony and explicit descriptions, but they were relatively limited in terms of their theoretical underpinnings. For example, all competency requirements related to the individual leader (values, characteristics, abilities, skills) were placed under the single category heading of traits even though some of these requirements (e.g., knowledge) are clearly not traits as typically defined by behavioral science.

The newer 1999 version was found to be more specific in terms of the various<br />

levels or strata of leader requirement dimensions within the framework. This version was<br />

also more sophisticated in its specification of model components and subcomponents and<br />

their interrelationships with one another—thus providing greater opportunities for testing<br />

and validation of the model and its components. It attempted to disentangle the single<br />

category of leader traits into more appropriate categories of values, attributes, and skills<br />

and described the differences in these categories of requirements. This model also<br />

replaced the Leadership Principles found in earlier models with actions and sub-actions<br />

that support the performance of these Principles and described how values, attributes,<br />

skills, and actions are maintained within the Be-Know-Do framework.<br />

However, this most recent leadership framework is not without its shortcomings. The 1999 manual was of considerably greater length (almost twice as many pages as the 1973 version) and complexity than previous versions. This version also appeared to be less precise in helping leaders identify particular activities with which to develop their leadership skills, possibly because of its greater focus on the specification of the various dimensions and categories of leadership. Due largely to these factors, the 1999 version may be more difficult for leaders (junior leaders in particular) to quickly grasp and, as a result, may be less easily applied by Army leaders. With these issues in mind, the authors of future iterations of FM 22-100 may wish to evaluate both the strengths and weaknesses of recent evolutions and determine whether there are ways to present, describe, and advance complex leadership models without sacrificing practicality, parsimony, and ease of comprehension for the audience of future leaders.


References

Bolles, R.N. (1992). What color is your parachute? Berkeley, CA: Ten Speed Press.

Fitton, R.A. (1993). Development of strategic level leaders. Washington, DC: The Industrial College of the Armed Forces, Fort McNair.

FM 22-10, Leadership (March 1951)
FM 22-100, Military Leadership (August 1999)
FM 22-100, Military Leadership (July 1990)
FM 22-100, Military Leadership (June 1973)
FM 22-100, Military Leadership (November 1965)
FM 22-100, Military Leadership (June 1961)
FM 22-100, Military Leadership (December 1958)
FM 22-100, Command and Leadership for the Small Unit Leader (February 1953)

Maslow, A.H. (1954). Motivation and personality. New York: Harper & Row.

Paparone, C.R. (2001). US Army decision-making: Past, present and future. Military Review, Fort Leavenworth, Kansas.



DEVELOPING APPROPRIATE METRICS FOR PROCESS AND<br />

OUTCOME MEASURES<br />

Amy K. Holtzman, David P. Baker, and Robert F. Calderón
American Institutes for Research
1000 Thomas Jefferson St., NW
Washington, DC, 20007-3835, USA
aholtzman@air.org
dbaker@air.org

Kimberly Smith-Jentsch
University of Central Florida
4000 Central Florida Blvd.
P.O. Box 161390
Orlando, FL, 32816-1390, USA
kjentsch@mail.ucf.edu

Paul Radtke
NAVAIR Orlando TSD
12350 Research Parkway
Orlando, FL, 32826-3275, USA
paul.radtke@navy.mil

INTRODUCTION

Scenario-based training is a systematic process of linking all aspects of scenario<br />

design, development, implementation, and analysis (Oser, Cannon-Bowers, Salas, &<br />

Dwyer, 1999). An exercise or scenario serves as the curriculum and provides trainees the<br />

opportunity to learn and practice skills. In the military, scenario-based training exercises<br />

are often used to evaluate whether individuals or teams have attained the necessary skills<br />

for specific missions and can apply them in real-world situations. To determine if the<br />

objectives of training have been met, performance measures can be used to assess<br />

individual and team performance within a given scenario.<br />

Performance measures can vary by the level of analysis, as performance can be<br />

measured at the individual, team, and multi-team level. They can also vary by the type of<br />

measures (i.e., outcomes or processes) and the overall purpose of the training.<br />

Performance outcomes are the results of an individual or team’s performance, whereas<br />

process measures “describe the steps, strategies, or procedures used to accomplish a task”<br />

(Smith-Jentsch, Johnston, & Payne, 1998, p. 62). Examples of purposes of training

include diagnosing root causes of performance problems, providing feedback, or<br />

evaluating levels of proficiency or readiness for a task.<br />


Much of the research conducted on performance measurement has been done in<br />

the civilian performance appraisal arena, in which a supervisor evaluates a subordinate’s<br />

performance on the job. The focus of the early research was to assess ways to improve<br />

instruments used to evaluate performance (Arvey & Murphy, 1998; Bretz, Milkovich, &<br />

Read, 1992; Landy & Farr, 1980). Most of this research targeted rating scales, which are<br />

numerical or descriptive judgments of how well a task was performed. Research has been<br />

conducted on graphic rating scales, on behaviorally anchored rating scales (BARS), on behavioral summary scales (BSS), and on the strengths and weaknesses of each (Cascio, 1991; Murphy & Cleveland, 1995; Borman, Hough, & Dunnette, 1976). However, the

performance appraisal research lacks studies that compare multiple rating formats. The<br />

civilian team performance and training literature has also addressed checklist and<br />

frequency count formats. Checklists consist of items or actions that have dichotomous<br />

answers such as yes/no, right/wrong, or performed action versus failed to perform action,<br />

whereas frequency counts provide an indication of the number of times that a behavior,<br />

action, or error occurs.<br />

However, the literature on scenario-based training lacks research on measurement<br />

methods, on the type of data that can be collected from each, and on how measurement<br />

purpose influences measurement method. In sum, the civilian research on rating formats<br />

has not been conducted in the scenario-based training area, nor has it been translated into<br />

the military arena.<br />

<strong>Military</strong> instructors have primarily used the checklist format, which is rarely used<br />

in the civilian sector. The reason for this difference may be that the criteria for<br />

evaluating performance in the military may be better defined than are the criteria for<br />

civilian jobs. That is, in the military, successful performance may be more amenable to<br />

yes/no judgments. Furthermore, little civilian or military research has been conducted on<br />

other rating formats, such as distance and discrepancies (D&D), which are numerical<br />

indices of how actual performance on the task differs from optimum performance.<br />

Moreover, after 1980, when Landy and Farr declared that further research on rating<br />

formats was not needed, little additional research addressed this topic at all. Thus, when<br />

to use a certain format for evaluating scenario-based training in the military and what<br />

factors drive that decision are necessary topics of research.<br />
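To make the four formats concrete, here is a minimal sketch (Python; the scenario events, names, and thresholds are hypothetical illustrations, not data from the study) of how the same observed performance might be scored under each format:

    # Hypothetical observation: a damage-control team took 95 s to extinguish
    # a fire; the trained standard (optimum) is 60 s, and hose-handling errors
    # were logged three times. All values here are illustrative only.
    observed_seconds = 95.0
    optimum_seconds = 60.0
    error_events = ["kink in hose", "kink in hose", "nozzle dropped"]

    # Checklist format: a dichotomous yes/no judgment per action.
    checklist = {"extinguished fire": True,
                 "followed hose-handling procedure": False}

    # Frequency count format: how many times a behavior or error occurred.
    error_frequency = len(error_events)  # -> 3

    # Distance and discrepancy (D&D) format: a numerical index of how far
    # actual performance fell from optimum performance.
    discrepancy_seconds = observed_seconds - optimum_seconds  # -> 35.0

    # Rating scale format: a judged score on an anchored numerical scale,
    # e.g., 1 (well below standard) to 5 (well above standard).
    rating = 2  # instructor judgment, guided by behavioral anchors

    print(checklist, error_frequency, discrepancy_seconds, rating)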

To address this need, we conducted a study to provide guidance on identifying<br />

and developing appropriate metrics for measuring human performance in military<br />

settings. To gather information about how best to measure processes and outcomes, we<br />

conducted brief interviews with ten experts in human performance measurement. The<br />

literature identified a number of outcome and process measures, but we selected the ones<br />

most relevant to scenario-based training in the military, as this domain was the focus of<br />

the study. We used accuracy, timeliness, productivity, efficiency, safety, and effects as<br />

our outcome measures and procedural and non-procedural taskwork and teamwork as<br />

process measures (see Table 1 for definitions and examples).



METHOD<br />

Participants<br />

Participants were ten experts with extensive experience in human performance<br />

measurement and training. Many also had experience working for the Navy or other<br />

military branches. All had PhDs in various areas of Psychology, with the majority having<br />

a Ph.D. in Industrial/Organizational Psychology.<br />

Participants’ collective experience in the human performance measurement arena<br />

included test development and assessment, performance model development, job<br />

analysis, and performance appraisal measure development. Their collective training<br />

experience included developing training evaluation measures, using assessment centers<br />

for development, developing training programs, and facilitating training. In addition,<br />

several had evaluated training programs and developed competency models for the Navy.<br />

Measures<br />

The four rating formats included checklist, frequency, distance and discrepancy,<br />

and rating scale. Participants were given definitions and examples of processes and<br />

outcomes and told to rank their first, second, and third choice of format for measuring<br />

each process and outcome (Interview guide is available from the author.) If they felt<br />

other formats were necessary, they were instructed to add them and explain their reasons<br />

for doing so. In addition, participants explained their rationale for choosing their first<br />

choices for each process and outcome. Finally, they provided demographic information<br />

on themselves.<br />


Table 1. Definitions and Examples of Outcomes and Processes

Outcomes
- Accuracy: Precision with which a task is performed. Example: identifying whether a bomb hit the target.
- Timeliness: Length of time in which actions are performed. Example: assessing how long it took the damage control team to extinguish a fire.
- Productivity: Rate at which actions are performed or tasks are accomplished within a given situation. Example: examining the number of planes that were launched or refueled during a particular mission.
- Efficiency: Ratio of resources required to those expended to accomplish a given task. Example: examining the amount of fuel that was burned on deck compared to the amount planned.
- Safety: Degree to which a task is accomplished in a way that does not unduly jeopardize human and capital resources. Example: number of injuries per month.
- Effects: Degree to which the desired effect was achieved. Example: keeping enemy air forces grounded.

Processes
- Procedural Taskwork: Requirements specific to a position that follow a step-by-step process. Example: completing the detect-to-engage sequence.
- Non-procedural Taskwork: Requirements specific to a position that do not follow a step-by-step process. Example: developing a plan to clear mines from the Straits of Hormuz.
- Teamwork: Processes individuals use to coordinate activities. Examples: information exchange, communication, supporting behavior, initiative, and leadership.



RESULTS<br />

Results are broken out below by type of measure. Techniques were chosen based on the<br />

definition and examples of the measures listed above.<br />

Outcome Measures<br />

Accuracy<br />

As Table 2 demonstrates, the majority of participants chose distance and<br />

discrepancy as their first choice for measuring accuracy. The main reason participants<br />

gave for this choice was that it allows for greater precision than do the other techniques.<br />

Participants felt D&D allows evaluators to determine how far from optimum the<br />

individual has performed and compare the outcome with the specified goal; furthermore,<br />

it is also the most direct measure of ratio level data.<br />

A few individuals chose rating scale as their first choice because, in their opinion,<br />

it allows for finely tuned judgments. This format was “the most flexible tool and can be<br />

written to accomplish the same objectives as the other techniques,” according to one<br />

individual. Their judgment was that rating scales can be easier to use than the other<br />

techniques and can be effectively used to assess low-base rate outcomes, such as hitting<br />

targets. This format was also judged to be useful when the criteria are individually<br />

determined and circumstance-specific. In fact, one individual chose rating scales for all<br />

process and outcome measures for these reasons.<br />

One individual felt that the frequency format was best for measuring accuracy because it effectively measures the percentage of hits, which may be more important than how close the misses were (the information that D&D provides). As shown in Table 3, the most popular second choices for accuracy included the checklist, frequency, and rating formats.

Timeliness<br />

Table 2 reveals that D&D is the most popular method for measuring timeliness<br />

because participants felt that D&D allows for precise measurement and can best measure<br />

how late a person is. Furthermore, they felt that D&D allows for valid and reliable<br />

measurement and is the most appropriate measure for ratio-level data.<br />

Other techniques that participants chose include frequency and rating formats.<br />

Participants’ comments were that frequencies could be used if no comparison is needed<br />

and time requirements are the only necessary information. On the other hand, according<br />

to participants, rating formats allow for measuring the extent to which the action is<br />

perceived as timely.<br />

As depicted in Table 3, rating scales and checklist formats were the top second<br />

choices for measuring timeliness, whereas checklist was the most popular third choice.<br />


Productivity<br />

According to Table 2, almost everyone felt that a frequency format would most<br />

effectively measure productivity, because frequency allows one to count specific actions.<br />

If the number of occurrences of an activity is important, participants felt frequency was<br />

the best technique to use. D&D and rating formats were also chosen as possible<br />

techniques. Participants said these techniques would be appropriate when measuring<br />

productivity in comparison to a standard.<br />

As shown in Table 3, over half of participants chose checklist as a second choice,<br />

and another one-third chose the rating scale format second. D&D was a popular third<br />

choice.<br />

Efficiency<br />

Table 2 illustrates the most common first choice for measuring efficiency: D&D.<br />

Participants felt this technique allowed for the most direct measure of efficiency,<br />

including ratio measures in comparison to a specified goal. Frequency and rating format<br />

techniques were also chosen. Participants commented that frequency allows for

counting the number of resources expended, whereas the rating format allows for more<br />

flexibility and judgment.<br />

Table 3 demonstrates that well over half chose the rating scale format as their<br />

second choice. Finally, frequency and checklist formats were the most common third<br />

choices.<br />

Safety<br />

Table 2 reveals that frequency and rating scales were the top first choices for<br />

measuring safety. Part of the choice may depend on how safety is measured. According<br />

to several participants, frequency allowed for concrete information to be gathered on the<br />

number of observable occurrences, such as the number of accidents. On the other hand,<br />

rating scales allowed for scoring unsafe behaviors that may be precursors to accidents and for determining the extent to which a goal is met. D&D could also be used, according to

participants.<br />

Table 3 shows that checklist was the second choice for most participants. Finally,<br />

rating scale was third for several participants.<br />

Effects<br />

Table 2 shows that slightly over half of participants chose rating scales as the best<br />

way to measure effects because they can measure the extent to which a goal was met.<br />

The checklist technique was also a popular first choice because it showed whether the<br />

effect was achieved and allowed for a yes or no format.<br />



As depicted in Table 3, half of participants chose D&D as their second choice, whereas frequency was a common third choice. Rating scale was a less common third choice.

Table 2. First Choice Rating Formats for Outcome Measures

Outcomes       Checklist   Frequency   Distance and Discrepancy   Rating Scale
Accuracy          10%         10%               60%                   20%
Timeliness         -          11%               67%                   22%
Productivity       -          80%               10%                   10%
Efficiency         -          10%               70%                   20%
Safety             -          50%               10%                   40%
Effects           33%          -                11%                   56%

Table 3. Second Choice Rating Formats for Outcome Measures

Outcomes       Checklist   Frequency   Distance and Discrepancy   Rating Scale
Accuracy          45%         22%               11%                   22%
Timeliness        30%         10%               20%                   40%
Productivity      56%          -                11%                   33%
Efficiency        13%         13%               13%                   61%
Safety            62%         25%               13%                    -
Effects           12%         25%               50%                   13%

Process Measures<br />

Procedural Taskwork<br />

As highlighted in Table 4, nearly every respondent chose checklist as the most<br />

appropriate way to measure procedural taskwork because it follows a step-by-step<br />

process. Participants felt that the steps lend themselves to a checklist format, which<br />

allows for actions to be measured by using dichotomous variables, such as yes or no. In<br />

addition, they felt that it allows for determining whether or not multiple outcomes were<br />

accomplished. Paper and pencil job knowledge tests and hands-on performance tests<br />

were also recommended for both procedural and non-procedural taskwork, given that<br />

these tests effectively measure the expertise and skill sets of individuals.<br />


Table 5 shows that nearly half of participants chose the rating scale format as their<br />

second choice, whereas few chose it as their third choice. In addition, using a hands-on<br />

performance task was a second choice for measuring all three processes.<br />

Non-procedural Taskwork<br />

Table 4 demonstrates that the majority of participants felt that rating scales most

effectively measure non-procedural taskwork. Rating scales “with appropriate<br />

benchmarks permit reliable measurement of complex tasks,” according to one individual.<br />

Participants felt that this format allows for subjective judgments and offers flexibility in<br />

assessing various aspects of the task under scrutiny.<br />

A few chose the checklist format as their first choice because it allows for<br />

measuring whether or not a task was accomplished. The checklist format was a more<br />

popular second choice, as one-half of participants chose it for second place, whereas<br />

rating format was second for only one-quarter of participants (Refer to Table 5). The<br />

frequency format was the most popular third choice.<br />

Teamwork<br />

Not surprisingly, rating scale was also the most popular choice for measuring<br />

teamwork, as evidenced in Table 4. Participants’ opinions were that rating scales allow<br />

for measuring subjective items and variance in the responses. According to one<br />

individual, rating scales were “the best method for capturing complex behavior associated<br />

with teamwork.” Several felt that this approach allowed for greater flexibility.<br />

According to participants, if the teamwork components were countable behaviors,<br />

frequency would be a suitable method. The most common answers for second and third<br />

choices were the rating scale, checklist, and frequency formats (Refer to Table 5 for<br />

second choice). Peer evaluations and performance tasks were also recommended as<br />

second choices for measuring teamwork.<br />

Table 4. First Choice Rating Formats for Process Measures

Processes                 Checklist   Frequency   Distance and Discrepancy   Rating Scale   Job Knowledge Tests
Procedural Taskwork          80%          -                 -                    10%               10%
Non-procedural Taskwork      22%          -                 -                    67%               10%
Teamwork                      -          10%                -                    90%                -

Table 5. Second Choice Rating Formats for Process Measures

Processes                 Checklist   Frequency   Distance and Discrepancy   Rating Scale   Performance Tasks   Peer Evaluations
Procedural Taskwork          11%         33%                -                    45%              11%                  -
Non-procedural Taskwork      50%         13%                -                    25%              12%                  -
Teamwork                     38%         13%               13%                   12%              12%                 12%

CONCLUSION<br />

In summary, we found that D&D was the preferred choice for many outcome measures because D&D allowed the rater to measure performance against a standard. For example, D&D allowed the rater to evaluate a trainee's timeliness or efficiency. On the other hand, the safety and effects outcomes lent themselves more readily to frequency counts and checklists, as they are often measured in terms of the number of actions performed.

For measuring the step-by-step components of procedural taskwork, the checklist format was judged to be most suitable. However, according to participants, non-procedural taskwork and teamwork can best be measured using rating scales, given that the activities can vary.

This study was a beginning step in providing guidance on linking process and outcome measures with appropriate rating formats. It is important to note that these results were based on data from ten individuals; further research is needed to validate these findings. To collect more data on choosing an appropriate measurement method and determining how the choice may vary based on training purposes, an additional independent survey of measurement experts is planned. Based on those results, we will map particular rating formats to particular measures and develop the business rules, or guidelines, for an automated tool for military instructors. This tool will help military instructors identify, develop, and assess specific measures of human performance during scenario-based training.
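As an illustration of what such business rules might look like (a hypothetical sketch only, not the planned tool itself), the first-choice results in Tables 2 and 4 could be encoded as a simple lookup:

    # Hypothetical encoding of the first-choice results (Tables 2 and 4) as
    # business rules: measure -> recommended rating format. Illustrative only;
    # the actual tool's rules were still to be derived from further surveys.
    PREFERRED_FORMAT = {
        # Outcome measures
        "accuracy": "distance and discrepancy",
        "timeliness": "distance and discrepancy",
        "productivity": "frequency",
        "efficiency": "distance and discrepancy",
        "safety": "frequency",          # rating scale was a close second
        "effects": "rating scale",
        # Process measures
        "procedural taskwork": "checklist",
        "non-procedural taskwork": "rating scale",
        "teamwork": "rating scale",
    }

    def recommend_format(measure: str) -> str:
        """Return the expert-preferred rating format for a measure."""
        return PREFERRED_FORMAT.get(measure.lower(), "rating scale")

    print(recommend_format("Timeliness"))  # -> distance and discrepancy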


REFERENCES

Arvey, R.D., & Murphy, K.R. (1998). Performance evaluation in work settings. Annual Review of Psychology, 49, 141-168.

Borman, W.C., Hough, L.M., & Dunnette, M.D. (1976). Development of behaviorally based rating scales for evaluating U.S. Navy recruiters (Technical Report TR-76-31). San Diego, CA: Navy Personnel Research and Development Center.

Bretz, R.D., Jr., Milkovich, G.T., & Read, W. (1992). The current state of performance appraisal research and practice: Concerns, directions, and implications. Journal of Management, 18(2), 321-352.

Cascio, W.F. (1991). Applied psychology in personnel management (4th ed.). Englewood Cliffs, NJ: Prentice Hall.

Landy, F.J., & Farr, J.L. (1980). Performance rating. Psychological Bulletin, 87, 72-107.

Murphy, K., & Cleveland, J. (1995). Understanding performance appraisal: Social, organizational, and goal-based perspectives. Thousand Oaks, CA: Sage.

Oser, R.L., Cannon-Bowers, J.A., Salas, E., & Dwyer, D.J. (1999). Enhancing human performance in technology-rich environments: Guidelines for scenario-based training. In E. Salas (Ed.), Human/technology interaction in complex systems (Vol. 9, pp. 175-202). Stamford, CT: JAI Press.

Smith-Jentsch, K.A., Johnston, J.H., & Payne, S.C. (1998). Measuring team-related expertise in complex environments. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 61-87). Washington, DC: American Psychological Association.

AUTHOR NOTES<br />

Amy K. Holtzman, American Institutes for Research; David P. Baker, American<br />

Institutes for Research; Robert F. Calderón, American Institutes for Research (now at<br />

Caliber Associates, Inc.); Kimberly Smith-Jentsch, University of Central Florida; Paul<br />

Radtke, NAVAIR Orlando.<br />

Funding for this research project was provided by the Navy (contract N61339-02-<br />

C-0016). The opinions expressed in this article are those of the authors and are not necessarily representative of Navy policies.

Correspondence concerning this article should be addressed to Amy K. Holtzman,<br />

American Institutes for Research, 1000 Thomas Jefferson St., NW, Washington, DC<br />

20007-3835. E-mail: aholtzman@air.org.<br />



SOFTWARE SUPPORT OF HUMAN PERFORMANCE ANALYSIS<br />

Ian Douglas<br />

Learning Systems Institute, Florida State University,<br />

320A, 2000 Levy Avenue Innovation Park, Tallahassee, Florida, 32310, USA.<br />

idouglas@lsi.fsu.edu<br />

INTRODUCTION<br />

This paper briefly describes the outcomes of the object-oriented performance analysis (AOOPA) project, a three-year research project carried out in collaboration with the Army Training Information Systems Directorate and the Coast Guard Human Performance Technology Center. The project has two main goals: to develop a framework for optimal methods of front-end analysis to precede the development of human performance support systems (Douglas and Schaffer, 2002) and to develop a model for a new generation of software tools to support the framework. A framework is a set of guidelines for creating efficient methodologies; a methodology is a more detailed process prescription. There are two key foundations for the framework. First, everything should be driven by an understanding of performance within an organizational system, not by solutions. Second, the output of performance analysis should be digitized in the form of standard packages of analysis knowledge that can be shared and reused.

In addition, the following principles are recommended within the framework:

- Visual modeling
- Collaborative analysis that includes end-users
- Rationale management
- Automated support for analysis

It is important to stress that the framework is not tied to any particular solution type. It should not be interpreted as needs analysis for training or any other solution type. Although training has been the dominant solution by which organizations seek to enhance the performance of their personnel, the knowledge and skill requirements for operations are expanding and changing at such a rate that other solutions are required for attaining optimal performance. Automation, process re-engineering, providing job aids, and just-in-time learning are among the many solution types that can be blended into performance support systems. Encouraging consideration of creative, non-traditional solutions is also important in performance improvement; a perfect example was the use of playing cards in Iraq to facilitate facial recognition. The OOPA framework is founded on general systems theory (Weinberg, 2001). It also incorporates the common analytic approach of stepwise refinement from a general problem domain to more specific components of the problem.


An important part of the framework is that it addresses the need for greater cost-efficiency in the development of performance support solutions. It does this in two ways. First, it identifies measurable performance goals during analysis against which the improvements brought about by different performance support mechanisms can be measured; in this regard it adopts the lessons of studies in the field of human performance technology (Gilbert, 1996; Robinson & Robinson, 1995; Rossett, 1999). Second, it incorporates the growing trend toward encouraging and facilitating the reuse and sharing of digital assets. This trend has so far been confined predominantly to learning content (Douglas, 2001; Gibbons et al., 2000); in the military it is associated with the Sharable Content Object Reference Model (SCORM) from the Advanced Distributed Learning (ADL) initiative (see www.adlnet.org). In the framework, we extend reuse thinking to the reuse of problem analysis knowledge.

Figure 1: Model for enterprise software system based on performance analysis and evaluation

By successfully defining a technology-supported framework for reusable performance analysis knowledge, solutions to organizational performance problems or new performance requirements can be specified more clearly and opened to wider scrutiny via the internet.

The AOOPA framework is one part of a more general framework that incorporates analysis, change intervention, and evaluation. The AOOPA software prototype is one part of a model for an enterprise information technology system to support the more general framework (see Figure 1). In the enterprise IT system, analysis and evaluation knowledge and support solutions are organized into digitized components and shared among web-enabled communities of stakeholders. Performance analysis sets the baseline by determining what roles exist in an organization, what goals they must achieve, and how the achievement of those goals can be measured. The packaging of this knowledge into digital components that can be accessed online will help reduce the replication of effort that occurs when disparate groups look at similar problems at different times and are unaware of existing knowledge. Part of the reason for this situation is that there are no centralized stores of such knowledge, and it is usually communicated in the form of large integrated documents in non-standard formats.


SOFTWARE SUPPORT

A working model (proof-of-concept prototype) for a new generation of software tools to support the performance analysis framework has been constructed. It can be accessed at the web site of the knowledge communities research group, http://www.lpg.fsu.edu/OOPA/. The prototype is entirely web-based and incorporates all of the features of the framework outlined in the introduction. An important concept embedded in the design of the prototype is configurability: tools should not be fixed to a particular methodology, but should be adaptable to the specific methodologies (and terminology) used in different organizations and groups. The intention is to create a set of configurable tools and methods that have a shared underlying representation of performance analysis knowledge. The system architecture is based on the emerging paradigm of service-oriented systems (Chung, Lin, & Mathieu, 2003). Service-oriented systems are constructed from web-based software components that each offer a distinct service; they differ from traditional systems, which package a number of services into an integrated application. The model enables custom front-ends to be created to a continuously refined shared repository of knowledge. Each version of the AOOPA system will have core component categories (see Figure 2), but the specific version of each component will vary from organization to organization. In the current version, a third-party collaboration tool called Collabra has been used for the collaboration component. If a different organization used a different collaboration tool, it would be 'plugged in' in place of Collabra. Likewise, if different data types were collected in another organization's methodology (or different terminology used), different data entry templates could appear. The user support component can likewise be tailored to the specific methodology employed by an organization.
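To make the plug-in concept concrete, the following minimal sketch (ours, not drawn from the AOOPA prototype; the class and method names are hypothetical) shows how a component interface could let an organization substitute its own collaboration tool for Collabra without touching the rest of the toolset:

```python
# Hypothetical sketch of the plug-in component model described above.
# The component names follow the paper; the code itself is illustrative
# and is not the actual AOOPA implementation.

class CollaborationComponent:
    """Interface that every collaboration plug-in must satisfy."""
    def post_comment(self, analysis_id: str, author: str, text: str) -> None:
        raise NotImplementedError

class CollabraAdapter(CollaborationComponent):
    """Wraps the Collabra tool used in the current version."""
    def post_comment(self, analysis_id, author, text):
        print(f"[Collabra] {author} on {analysis_id}: {text}")

class OtherToolAdapter(CollaborationComponent):
    """A different organization's tool, 'plugged in' in place of Collabra."""
    def post_comment(self, analysis_id, author, text):
        print(f"[OtherTool] {author} on {analysis_id}: {text}")

class AOOPAToolset:
    """Core toolset; the collaboration component is configurable."""
    def __init__(self, collaboration: CollaborationComponent):
        self.collaboration = collaboration

# Each organization configures the toolset with its own component.
toolset = AOOPAToolset(collaboration=CollabraAdapter())
toolset.collaboration.post_comment("case-17", "analyst1", "Gap confirmed at task 3.")
```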

Figure 2: Architecture for the performance analysis support software. [The figure shows analysis-system-specific interfaces (visual modelling, performance data entry, rationale management, user support, collaboration, user management, search) linked through enterprise-wide data handling to performance analysis project databases, with an enterprise gatekeeper controlling entry to the reusable analysis repository and the reusable solution repository.]


Figure 3: Screen shot from the current version of the model showing performance case modeling

The components of an analysis (models, data, and rationale) are stored in a project-specific database from which analysts and stakeholders can retrieve, view, and comment on the content. Some organizations may wish to have a gatekeeper function that filters quality-controlled analysis components into a central repository. An integral part of the tool is an automated search of this repository: as soon as an analysis team on a new project begins to enter data, it is matched against existing data in the analysis repository to alert the user to possible sources of existing knowledge.
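A minimal sketch of how such repository matching could work, assuming a simple keyword-overlap measure; the prototype's actual matching method is not described here, and all names and data below are illustrative:

```python
# Minimal sketch: alert analysts to existing repository entries whose
# descriptions overlap with newly entered data. Keyword overlap is an
# assumption made for illustration, not the prototype's algorithm.

def keywords(text: str) -> set[str]:
    return {w.lower().strip(".,") for w in text.split() if len(w) > 3}

def find_related(new_entry: str, repository: dict[str, str], threshold: float = 0.3):
    """Return repository entries sharing enough keywords with the new entry."""
    new_kw = keywords(new_entry)
    hits = []
    for entry_id, description in repository.items():
        overlap = len(new_kw & keywords(description)) / max(len(new_kw), 1)
        if overlap >= threshold:
            hits.append((entry_id, overlap))
    return sorted(hits, key=lambda h: -h[1])

repo = {"A-042": "radio operator communication accuracy under time pressure",
        "B-007": "vehicle inspector checklist completion rates"}
print(find_related("communication accuracy of radio operators", repo))
```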


Figure 3 illustrates the prototype that has been constructed to demonstrate one version conforming to the framework and the architecture illustrated in Figure 2. The modeling component is a key focal point and provides a shared reference and navigation model throughout a project. The current prototype uses performance case modeling, an adaptation of unified modeling language (UML) use case notation (Cockburn, 1997). UML is widely used in object-oriented software systems analysis and has been adapted for more general systems analysis (Marshall, 2000). Performance case notation provides a simple means, understandable by end users, of defining a problem space. A performance diagram is a graphic that illustrates what performers do on the job and how they interact with other performers to reach performance goals. A role is a function that someone has as part of an organizational process (e.g., mission commander, radio operator, vehicle inspector). A primary role is the focus of the project. Secondary roles, which interact with the primary role, may be included when looking at team performance. The primary role is likely to pursue several performance goals; a mission commander, for example, would have to successfully plan, brief, execute, and conduct an after action review. High-level performance goals decompose into lower-level diagrams containing sub-goals. Performance goals represent desired performance at an individual level.
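As a rough illustration of the structure just described, the sketch below (our own encoding, not the prototype's schema) represents roles pursuing performance goals that decompose into sub-goals:

```python
# Illustrative data model for performance case modeling: roles pursue
# performance goals, and high-level goals decompose into sub-goals.
# Names and fields are our own assumptions, not the prototype's schema.
from dataclasses import dataclass, field

@dataclass
class PerformanceGoal:
    name: str
    measure: str = ""                       # how achievement is measured
    subgoals: list["PerformanceGoal"] = field(default_factory=list)

@dataclass
class Role:
    title: str                              # e.g., "mission commander"
    primary: bool = False                   # the primary role is the project focus
    goals: list[PerformanceGoal] = field(default_factory=list)

plan = PerformanceGoal("plan mission", subgoals=[PerformanceGoal("select route")])
commander = Role("mission commander", primary=True,
                 goals=[plan, PerformanceGoal("brief crew"),
                        PerformanceGoal("execute mission"),
                        PerformanceGoal("conduct after action review")])
print(commander.title, "->", [g.name for g in commander.goals])
```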

Figure 4: Screen shot from the current version of the model showing gap analysis for one performance case

Facilitated by the groupware component, an analysis team works collaboratively to create and edit the performance diagram. The team uses the diagram to develop a shared understanding of a domain and to identify performance cases where there is a gap between desired and current on-the-job performance. It allows the organization to pinpoint a specific performance discrepancy that could be costing time, money, and other resources. Those performance cases are then subject to more detailed analysis.


A variety of data collection templates can be attached to a performance case to assist in detailed analysis. The current version of the AOOPA model uses a gap analysis template (see Figure 4) in which data are collected about current and desired performance on the tasks carried out in pursuit of a performance goal. Where a gap is found (for example, if 100% accuracy is required on a task and only 60% of those assigned to the task are able to achieve this), a cause and solution analysis is initiated. In a cause analysis, stakeholders review gap data, brainstorm possible causes, put them into cause categories, rate them by user-defined criteria, and select which ones to pursue. The AOOPA prototype allows users to categorize causes so the recommended solutions are more likely to address the underlying causes. The specific process used in this version is described in more detail in Douglas et al. (2003).
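The gap analysis logic just described can be illustrated with a small sketch; the field names and the data are our own assumptions, with the 100%/60% case taken from the example above:

```python
# Sketch of the gap analysis template logic: compare desired and current
# performance on each task and flag gaps for cause and solution analysis.
# The task names and figures are illustrative assumptions.

tasks = [
    {"task": "identify contact type", "desired_pct": 100, "current_pct": 60},
    {"task": "log contact report",    "desired_pct": 95,  "current_pct": 97},
]

for t in tasks:
    gap = t["desired_pct"] - t["current_pct"]
    if gap > 0:
        # In the framework, a gap triggers cause and solution analysis, in
        # which stakeholders brainstorm, categorize, and rate possible causes.
        print(f"GAP: {t['task']} ({gap} points) -> initiate cause analysis")
    else:
        print(f"OK:  {t['task']} meets the standard")
```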

Figure 5: Transferring data between different organizations. [The figure shows organization- and process-specific versions of analysis data (organizations X and Y) exchanged via a process-independent representation.]

FUTURE WORK

An important concept embedded in the design of the prototype is configurability (Cameron, 2002). As noted in the introduction, a framework is meant to provide a structure for a variety of approaches that can be tailored to specific groups or situations, rather than a set of rules for a single correct way of developing systems. The philosophy is that no "one size fits all" methodology will be effective; methodologies evolve to fit organizations, situations, and new technologies. The same is true for software tools, which are of limited use when fixed to a particular methodology. Given that different organizations will adopt different methods to suit different circumstances, software support should be adaptable. The vision is of a system of configurable tools and methods that have a shared underlying representation of performance analysis knowledge. This will allow custom interfaces to a continuously refined shared repository of knowledge on human performance (see Figure 5).

The software architecture allows different components to be plugged in, so that a different set of components can be configured for each methodology that conforms to the framework. Having completed the version of the AOOPA toolset described in the previous section, a second version is being constructed based on the methodology used by the Coast Guard. The two versions will be used to begin the development and testing of mechanisms that enable the exchange of performance analysis data across different organizations. If such mechanisms prove feasible, they have the potential not only to reduce the replication of effort that currently occurs across service and unit boundaries, but also to greatly increase performance levels. Performance analysis may then develop into a continuous improvement process throughout the military, rather than a discrete activity prior to new systems development.

REFERENCES

Cameron, J. (2002). Configurable development processes. Communications of the ACM, 45(3), 72-77.

Chung, J.C., Lin, K.J., & Mathieu, R.G. (2003). Web services computing: Advancing software interoperability. IEEE Computer, 36(10), 35-37.

Cockburn, A. (1997). Structuring use cases with goals. Journal of Object-Oriented Programming, 10(7), 35-40.

Douglas, I. (2001). Instructional design based on reusable learning objects: Applying lessons of object-oriented software engineering to learning systems design. Proceedings of the IEEE Frontiers in Education Conference, F4E, 1-5. Reno, Nevada, October.

Douglas, I., & Schaffer, S. (2002). Object-oriented performance improvement. Performance Improvement Quarterly, 15(3), 81-93.

Douglas, I., Nowicki, C., Butler, J., & Schaffer, S. (2003). Web-based collaborative analysis, reuse and sharing of human performance knowledge. To appear in the Proceedings of the Interservice/Industry Training, Simulation and Education Conference (I/ITSEC), Orlando, Florida, December.

Gibbons, A.S., Nelson, J., & Richards, R. (2000). The nature and origin of instructional objects. In D.A. Wiley (Ed.), The instructional use of learning objects: Online version. Retrieved from http://reusability.org/read/chapters/gibbons.doc

Gilbert, T. (1996). Human competence: Engineering worthy performance. Amherst, MA: HRD Press.

Marshall, C. (2000). Enterprise modeling with UML. Reading, MA: Addison-Wesley.

Robinson, D., & Robinson, J.C. (1995). Performance consulting: Moving beyond training. San Francisco: Berrett-Koehler.

Rossett, A. (1999). First things fast: A handbook for performance analysis. San Francisco: Jossey-Bass Pfeiffer.

Weinberg, G. (2001). An introduction to general systems thinking. New York: Dorset House.


HOW MILITARY RESEARCH CAN IMPROVE TEAM TRAINING EFFECTIVENESS IN OTHER HIGH-RISK INDUSTRIES

Jeffrey M. Beaubien, Ph.D.
Senior Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
jbeaubien@air.org

David P. Baker, Ph.D.
Principal Research Scientist
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
dbaker@air.org

Amy K. Holtzman, M.A.
Research Associate
American Institutes for Research
1000 Thomas Jefferson Street, NW
Washington, DC 20007-3835
aholtzman@air.org

INTRODUCTION

For over 30 years, military-sponsored research has advanced the state of the science by defining the essential components of teamwork (Salas, Bowers, & Cannon-Bowers, 1995); developing theoretical models of team dynamics (Salas, Dickinson, Converse, & Tannenbaum, 1992); measuring team inputs, processes, and outputs (Cannon-Bowers, Tannenbaum, Salas, & Volpe, 1995); and developing training programs to improve team performance (Smith-Jentsch, Zeisig, Acton, & McPherson, 1998). Although similar lines of research have been undertaken in other high-risk industries, such as aviation and healthcare, researchers in these domains have rarely built upon the lessons learned from military team training research to any significant degree (Salas, Rhodenizer, & Bowers, 2000).

The primary purpose of this paper is to illustrate how military-sponsored research can be leveraged to advance the practice of team training in other high-risk industries. Specifically, we identify two areas – the specification of critical team knowledge, skill, and attitude competencies, and the development of effective training strategies – that have the greatest potential for transitioning military research findings to non-military settings. Finally, we comment on several possible reasons why advancements in the military have not transitioned, and provide suggestions for disseminating critical military research findings on team performance.


CRITICAL TEAMWORK COMPETENCIES

Team training refers to a set of instructional strategies that apply well-tested tools (e.g., simulation, lectures, behavioral models) to improve the knowledge, skills, and attitudes required for effective team performance. Unfortunately, the published literature on teamwork competencies contains numerous inconsistencies in both the competency labels and their associated definitions. In this section, we describe recent efforts to clarify this body of research and the implications of this work for improving team training effectiveness.

Team Knowledge Competencies

Team knowledge competencies are defined as the facts, principles, and concepts that help team members form appropriate interaction strategies, coordinate with one another, and achieve maximum team performance. For example, to function effectively, team members must know what team skills are required, when particular team behaviors are appropriate, and how these skills should be utilized. Team members must also be familiar with the team's mission, and should understand one another's roles in achieving that mission (Cannon-Bowers et al., 1995).

Team Skill Competencies

Team skill competencies are defined as the learned capacity of team members to interact with one another in pursuit of a common goal. Unlike knowledge competencies, which involve the mastery of factual knowledge, team skill competencies involve the application of knowledge to perform specific behaviors. Recent research suggests that team skill competencies can be classified into eight major categories: adaptability, situation awareness, performance monitoring/feedback, leadership, interpersonal relations, coordination, communication, and decision-making. Moreover, several research studies have shown that these skills are directly related to team performance (cf. Salas et al., 1995).

Team Attitude Competencies

Team attitude competencies are defined as internal states that influence team members' decisions to act in a particular way. Previous research suggests that team attitudes can have a significant effect on how teamwork skills are actually put into practice. For example, Driskell and Salas (1992) reported that collectively-oriented individuals performed significantly better than individually-oriented team members, because collectively-oriented individuals tended to take advantage of the benefits offered by teamwork.

Factors That Influence Team Competency Requirements

Tannenbaum and his colleagues suggest that team performance cannot be understood independently of the team's organizational, work, and task environment (Tannenbaum, Beard, & Salas, 1992). The authors define "organizational characteristics" – such as reward systems, policies, supervisory control, and resources – as features that define the task and, by extension, the competencies required to perform that task. They define "work characteristics" as structural and normative variables – such as formal rank or leadership hierarchies, and the extent to which team members are geographically dispersed – that determine how tasks are assigned and shared by various team members. Finally, they define "task characteristics" – such as task complexity, task organization, and task type – as factors that determine the extent to which coordination is necessary for successful team performance.

Building on Tannenbaum and colleagues' work, Cannon-Bowers and her colleagues (1995) developed a 2x2 typology of team training requirements. Quadrant I depicts teams that perform a relatively stable set of tasks with a relatively stable set of teammates. These teams are hypothesized to require team-specific and task-specific competencies – such as task organization, mutual performance monitoring, and shared problem-model development – that are "context-driven." Examples of teams that require context-driven competencies include combat teams and sports teams. Quadrant II depicts teams whose tasks vary considerably over time but are performed with a relatively stable set of teammates. These teams are proposed to require team-specific and task-generic competencies – such as conflict resolution, motivating others, and information exchange – that are "team-contingent." Examples of teams that require team-contingent competencies include self-managing work teams, management teams, and quality circles. Quadrant III depicts teams that perform a stable set of tasks with sets of individuals that vary. These teams are expected to require task-specific and team-generic competencies – such as task structuring, mission analysis, and mutual performance monitoring – that are "task-contingent." Examples of teams that require task-contingent competencies include medical teams, aircrews, and some fire-fighting teams. Finally, Quadrant IV depicts teams that perform tasks that vary over time with team members who also vary. These teams are predicted to require team-generic and task-generic competencies – such as morale building, consulting with others, and assertiveness – that are "transportable." Examples of teams that require transportable competencies include task forces, project action teams, and project teams. Practitioners can use this typology to identify the most important team competencies for their particular team type (a simple encoding is sketched below).
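The typology can be read as a lookup table from (task stability, membership stability) to the required competency type. The sketch below encodes the quadrant contents given in the text; the encoding itself is our own illustration:

```python
# The Cannon-Bowers et al. (1995) 2x2 typology as a lookup table:
# (task stability, membership stability) -> required competency type.
# The quadrant contents follow the text; the code is an illustration.

TYPOLOGY = {
    ("stable tasks",   "stable members"):   ("context-driven",  ["combat teams", "sports teams"]),
    ("variable tasks", "stable members"):   ("team-contingent", ["self-managing work teams", "management teams"]),
    ("stable tasks",   "variable members"): ("task-contingent", ["medical teams", "aircrews"]),
    ("variable tasks", "variable members"): ("transportable",   ["task forces", "project teams"]),
}

competency, examples = TYPOLOGY[("stable tasks", "variable members")]
print(f"Required competencies: {competency}; example teams: {', '.join(examples)}")
```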

The Measurement of Team Competencies

Many researchers have found it difficult to measure more than four distinct competencies at a time, for example during scenario-based training. Smith-Jentsch and her colleagues (1998) identified four teamwork skill competencies that could be reliably and accurately measured during Navy combat information center (CIC) team training scenarios: information exchange, supporting behavior, team feedback skill, and flexibility. Information exchange is defined as passing relevant data to team members who need it, before they need it, and ensuring that sent messages are understood as intended. Supporting behavior is defined as offering and requesting assistance in an effective manner both inside and outside of the team. Team feedback skill is defined as communicating one's observations, concerns, suggestions, and requests clearly without becoming hostile or defensive. Finally, flexibility is defined as adapting team performance strategies quickly and appropriately to changing task demands.

Conclusions

As our discussion demonstrates, military-sponsored research has led the way in defining team competencies, specifying the core training requirements for various team types, and assessing team competencies during simulation-based training. Sadly, the aviation and healthcare domains are still plagued by inconsistent terminology and definitions for important team competencies, and have made substantially less progress in measuring team competencies during training. In the next section, we identify recent advances in training strategies that have the potential to improve team training effectiveness in other high-risk industries.

TRAINING STRATEGIES

The military has led the way in developing effective strategies for training team competencies. The watershed for much of this research was the accidental shootdown of an Iranian Airbus by the USS Vincennes in the Persian Gulf in 1988. In response to the incident, the Navy began a multi-year, multi-million-dollar research program to identify effective team training interventions. The program, called Tactical Decision Making Under Stress (TADMUS), began in 1990 and led to numerous breakthroughs in the science and practice of team training, such as the development of cross-training, mental model training, and team self-correction training. Following the Navy's lead, the U.S. Air Force and U.S. Army also supported applied research into team training during the 1990s. Both programs led to improved team training in these two branches of the military (cf. Spiker, Silverman, Tourville, & Nullmeyer, 1998). In the sections that follow, we describe some of the accomplishments in team training that have the greatest potential for application in other high-risk industries.

Simulator-Based Training

Simulators have been used widely to train teams in the military, in aviation, and, most recently, in healthcare. Simulator-based training rests on the logic that the fidelity of the training environment is essential to ensure the transfer of trained skills. Training-environment fidelity comprises stimulus fidelity (i.e., trainees experience the same "behavioral triggers" that they will experience on the job); response fidelity (i.e., trainees perform the same behaviors that they will perform on the job); and equipment fidelity (i.e., trainees use the same materials and equipment that they will use on the job) (Salas et al., 1992).

Even though there have been tremendous advances in the extent to which simulations can reproduce realistic conditions of a team's environment, military research has demonstrated that a realistic simulation by itself is not a panacea for ensuring effective team training. Other factors, in particular the design of the training, are equally if not more important than simulator fidelity. For example, Oser and colleagues define scenario-based training as a systematic process of linking all aspects of scenario design, development, implementation, and analysis (Oser, Cannon-Bowers, Salas, & Dwyer, 1999). Scenario-based training involves a six-step process: (1) reviewing skill inventories and/or historical performance data; (2) developing learning objectives and competencies; (3) selecting scenario events; (4) identifying performance measures and standards; (5) diagnosing performance strengths and weaknesses; and (6) delivering feedback to the trainees. Scenario-based training differs from traditional classroom training in that a scenario or exercise serves as the curriculum, with the overall goal of providing specific opportunities for trainees to develop critical competencies through practice and feedback.



Team-Coordination Training

Another technique widely used for military team training is team-coordination training (TCT). TCT concentrates on teaching team members about the basic processes underlying teamwork. It typically targets several team competencies needed for successful performance in a particular environment. TCT is usually delivered through a combination of lecture, demonstration (e.g., video examples), and practice-based methods (e.g., role plays) over two to five days. Research supports its effectiveness in terms of positive reactions, enhanced learning, and behavioral change. Like simulator-based training, TCT has been widely applied in aviation and has recently been introduced in healthcare. In aviation, TCT is referred to as Crew Resource Management (CRM) training (Salas, Fowlkes, Stout, Milanovich, & Prince, 1999).

Team Self-Correction Training

The last three training methods noted here – team self-correction training, cross-training, and stress exposure training – were developed under the TADMUS project and have been applied in the military, but have not been utilized in other high-risk industries. Team self-correction is the naturally occurring tendency of effective teams to debrief themselves by reviewing their past performance, identifying and diagnosing errors, discussing remedial strategies, and planning for the future. Self-correction training is delivered through a combination of lecture, demonstration, practice, and feedback. Team members learn to observe their performance, to categorize their effective and ineffective behaviors into a structured format, and to use this information to give each other feedback (Cannon-Bowers & Salas, 1998). When guided by a competent instructor, this method of team training has been demonstrated to improve team performance.

Cross-Training

Cross-training exposes team members to the basic tasks, duties, and responsibilities of the positions held by other members of the team; the purpose is to promote coordination, communication, and team performance. Ideally, this training alleviates the decline in performance that is likely to follow personnel changes; it also increases implicit coordination (i.e., being able to coordinate without the need to communicate explicitly). The training comprises sharing cross-role information (teammates, task, equipment, situation); enhancing team members' understanding of interdependencies, roles, and responsibilities; and providing cross-role practice and feedback. Research has demonstrated that, compared with their counterparts who were not cross-trained, cross-trained teams better anticipate the information needs of their teammates, commit fewer errors, and exhibit more effective teamwork behaviors (Cannon-Bowers & Salas, 1998).

Stress Exposure Training

Stress can exert a significant negative influence on an individual's or a team's ability to perform effectively, especially in high-stress environments characterized by ambiguous situations and severe time pressure (e.g., military operational environments, hospital emergency departments). Stress exposure training (SET) reduces stress through a three-phase program designed to provide trainees with information, skills training, and practice. SET improves performance by providing team members with experience in the stressful environment, thereby helping them learn what to expect. Practice takes place under graduated exposure to stressors. Documented outcomes of SET include reduced anxiety in stressful situations, increased confidence, and improved cognitive and psychomotor performance under stress (Driskell & Johnston, 1998).

Conclusions

As our discussion demonstrates, the military has led the way in developing a set of tools, methods, and content focused on enhancing teamwork. Some of these strategies have been adapted in aviation and healthcare; more accurately, those industries have developed their own similar approaches. With that in mind, we now turn to the cases of aviation and healthcare, briefly examine current practices in each of these industries, and then highlight several areas where we see great opportunities to transition findings from military research.

CASE STUDIES

Aviation

Current Practices. For over thirty years, team performance has been a central focus of commercial aircrew training. This training, known as Crew Resource Management (CRM) training, initially focused on changing pilot attitudes through in-class lectures, demonstrations, and discussion among aircrew members (Helmreich, Merritt, & Wilhelm, 2000). Over the years, CRM training has evolved into its current form under the Federal Aviation Administration's (FAA) Advanced Qualification Program (AQP). Unlike traditional pilot training under 14 CFR Part 121, AQP integrates CRM principles with technical skills training throughout the entire training curriculum. Team training under AQP relies primarily on two strategies: team coordination training (TCT) and simulator-based training. In fact, under AQP aircrews are actually evaluated on their CRM and technical skills in the simulator during an end-of-training Line Operational Evaluation (LOE), which is used to certify their airworthiness (Federal Aviation Administration, 1990).

Possible Transitions. Unlike other high-risk industries, aviation has been one of the leaders in attempting to understand and enhance team performance (Helmreich et al., 2000). However, as we have noted throughout, these efforts have occurred in a vacuum and have "reinvented the wheel" by not transitioning findings and lessons learned from the military. Based on the current status of military research, we believe that although the science of team performance is advanced in aviation, aviation could benefit significantly by leveraging lower-cost team training strategies from the military to reduce its over-reliance on expensive simulator training. Granted, high-fidelity simulations will be required when training advanced technical skills; however, similar levels of fidelity are not required for training CRM skills. Numerous lessons learned have come out of the TADMUS research (e.g., team self-correction training, cross-training) that could transition directly and should yield improved team performance on the part of aircrews at significantly reduced cost.

Healthcare

Current Practice. It is only within the last few years that the healthcare industry has placed a significant emphasis on the relationship between teamwork and patient safety. This new focus was prompted by the publication of To Err Is Human, a detailed treatise on the unacceptable levels of system failures within healthcare (Kohn, Corrigan, & Donaldson, 1999). Since its publication, several team training interventions have been introduced. For example, MedTeams™ (Morey, Simon, Jay, Wears, et al., 2002), a lecture- and discussion-based curriculum, and Anesthesia Crisis Resource Management (ACRM; Gaba, Howard, Fish, Smith, & Sowb, 2001), a simulator-based curriculum, have been implemented in a number of private, public, and military hospitals.

Possible Transitions. We believe that, relative to aviation, there are significantly more opportunities to transition military findings to healthcare because of the early stage of development of medical team training. Interestingly, however, the healthcare domain has not developed its existing approaches in a vacuum; rather, it has looked to aviation, not the military, for guidance. Although this has led to some important transitions from aviation, we believe that the more relevant and most useful information resides in the accomplishments made by the military. Specifically, we believe that healthcare could benefit greatly by examining the work of Cannon-Bowers et al. (1995) on how team knowledge, skill, and attitude requirements vary by task and team characteristics. Such research should be used as a basis for identifying medical team competency requirements and how those requirements might vary by medical specialty. Second, we believe that many of the training strategies developed under the TADMUS program could be directly transitioned to healthcare. Team self-correction training seems particularly promising because of its reliance on team members to observe, assess, and debrief their own performance; it would be of great benefit in medicine, where time and cost constraints require practical approaches to addressing teamwork.

IMPEDIMENTS AND RECOMMENDATIONS

With the continuous reduction in Federal funding for basic and applied research, we believe that now more than ever it is important for multiple industries to coordinate their efforts to understand important human performance problems such as error, safety, and team performance. Typically, the impediments to joint efforts are the unique contextual factors that characterize the teams under investigation; military teams differ from aircrews and medical teams. However, we believe that teams in high-risk environments likely have more characteristics in common than not. For example, in all cases the consequences of error are great, and time pressure and high workload are likely.


To offset the traditional stovepipes of research, we recommend joint industry workshops on team training and performance. An annual conference would allow researchers to disseminate findings and tools that could be directly transitioned into other industries; this approach would also be far quicker than the typical publication process. Furthermore, such open forums would promote the coordination of future research efforts and maximize the use of available resources, which are extremely limited in today's environment. Finally, we recognize that numerous findings are in fact transitioned between the military and other high-risk public and private industries. We simply believe that research on team performance is particularly ripe for such transitions, and that the science of teams and teamwork could do better in promoting this approach.

REFERENCES

Cannon-Bowers, J.A., & Salas, E. (1998). Individual and team decision making under stress: Theoretical underpinnings. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 17-38). Washington, DC: American Psychological Association.

Cannon-Bowers, J.A., Tannenbaum, S.I., Salas, E., & Volpe, C.E. (1995). Defining competencies and establishing team training requirements. In R.A. Guzzo, E. Salas, & Associates (Eds.), Team effectiveness and decision-making in organizations (pp. 333-380). San Francisco: Jossey-Bass.

Driskell, J.E., & Johnston, J.H. (1998). Stress exposure training. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 191-217). Washington, DC: American Psychological Association.

Driskell, J.E., & Salas, E. (1992). Collective behavior and team performance. Human Factors, 34, 277-288.

Federal Aviation Administration. (1990). Line operational simulations: Line oriented flight training, special purpose operational training, line oriented evaluation (Advisory Circular 120-35B). Washington, DC: Author.

Gaba, D.M., Howard, S.K., Fish, K.J., Smith, B.E., & Sowb, Y.A. (2001). Simulation-based training in anesthesia crisis resource management (ACRM): A decade of experience. Simulation & Gaming, 32, 175-193.

Helmreich, R.L., Merritt, A.C., & Wilhelm, J.A. (2000). The evolution of crew resource management training in commercial aviation. International Journal of Aviation Psychology, 9, 19-32.

Keesling, W., Ford, P., & Harrison, K. (1994). Application of the principles of training in armor and mechanized infantry units. In R.F. Holz, J.H. Hiller, et al. (Eds.), Determinants of effective unit performance: Research on measuring and managing unit training readiness (pp. 137-178). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Kohn, L.T., Corrigan, J.M., & Donaldson, M.S. (1999). To err is human. Washington, DC: National Academy Press.

Morey, J.C., Simon, R., Jay, G.D., Wears, R., Salisbury, M., Dukes, K.A., et al. (2002). Error reduction and performance improvement in the emergency department through formal teamwork training: Evaluation results of the MedTeams project. Health Services Research, 37, 1553-1581.

Oser, R.L., Cannon-Bowers, J.A., Salas, E., & Dwyer, D.J. (1999). Enhancing human performance in technology-rich environments: Guidelines for scenario-based training. In E. Salas (Ed.), Human/technology interaction in complex systems (Vol. 9, pp. 175-202). Stamford, CT: JAI Press.

Salas, E., Bowers, C.A., & Cannon-Bowers, J.A. (1995). Military team research: 10 years of progress. Military Psychology, 7, 55-75.

Salas, E., Dickinson, T.L., Converse, S.A., & Tannenbaum, S.I. (1992). Toward an understanding of team performance and training. In R.W. Swezey & E. Salas (Eds.), Teams: Their training and performance (pp. 3-29). Norwood, NJ: Ablex.

Salas, E., Fowlkes, J.E., Stout, R.J., Milanovich, D.M., & Prince, C. (1999). Does CRM training improve teamwork skills in the cockpit? Two evaluation studies. Human Factors, 41, 326-343.

Salas, E., Rhodenizer, L., & Bowers, C.A. (2000). The design and delivery of crew resource management training: Exploiting available resources. Human Factors, 42, 490-511.

Smith-Jentsch, K.A., Zeisig, R.L., Acton, B., & McPherson, J.A. (1998). Team dimensional training: A strategy for guided team self-correction. In J.A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 271-297). Washington, DC: American Psychological Association.

Spiker, V.A., Silverman, D.R., Tourville, S.J., & Nullmeyer, R.T. (1998). Tactical team resource management effects on combat mission training performance (Report No. AL-HR-TR-1997-0137). Brooks Air Force Base, TX: U.S. Air Force Systems/Materiel Command.

Tannenbaum, S.I., Beard, R.L., & Salas, E. (1992). Team building and its influence on team effectiveness: An examination of conceptual and empirical developments. In K. Kelly (Ed.), Issues, theory, and research in industrial/organizational psychology (pp. 117-153). New York: Elsevier.


DEVELOPING MEASURES OF HUMAN PERFORMANCE: AN APPROACH AND INITIAL REACTIONS

Dana Milanovich Costar, David P. Baker, Amy Holtzman
American Institutes for Research

Kimberly A. Smith-Jentsch
University of Central Florida

Paul Radtke
NAVAIR Orlando TSD

Even with the tremendous emphasis on training throughout the U.S. Navy, the development of reliable and valid performance rating tools for assessing trainee performance has represented a significant challenge to Navy instructors. Such tasks are typically a collateral duty, and instructors often have no background in performance measurement techniques. As a result, instructors tend to use measures that are familiar to them and easy to use (e.g., checklists) to assess trainee performance, whereas studies appearing in the performance measurement and training literatures have employed a wider variety of measurement methods. These include frequency counts (e.g., Goodman & Garber, 1988; Stout, Cannon-Bowers, Salas, & Milanovich, 1999), behavioral checklists (e.g., Fowlkes, Lane, Salas, Franz, & Oser, 1994; Salas, Fowlkes, Stout, Milanovich, & Prince, 1999), distance/discrepancy scores (e.g., Joslyn & Hunt, 1998; Smith-Jentsch, Johnston, & Payne, 1998), and rating scales (e.g., Hollenbeck, Ilgen, Tuttle, & Sego, 1995; Marks, Zaccaro, & Mathieu, 2000).

To provide assistance in the area of individual and team performance measurement, a training workshop was developed and delivered to Navy instructors, civil service employees, and government contractors involved in Navy training. The overall objective of the workshop was to demonstrate the process of identifying training objectives for measurement, selecting an appropriate method to assess performance on that objective, and tailoring the measure with operationally specific content. By attending the workshop, it was expected that participants would: (1) be able to identify and craft good training objectives, (2) understand the importance of collecting data on performance outcomes and processes, (3) understand the pros and cons associated with various types of performance measurement methods, and (4) recognize and develop effective performance measures.

The one-day workshop included both a morning and an afternoon session. The morning portion of the workshop consisted of a briefing on individual and team performance measurement. The briefing provided an overview of performance measurement and then presented a seven-step framework for developing reliable and valid measures of trainee performance. The seven steps were: (1) consider the level of analysis; (2) identify measurement objectives; (3) clarify the purpose for measuring performance; (4) decide whether you need to assess outcomes, processes, or both; (5) make sure objectives are measurable; (6) select a method for each process and/or outcome; and (7) tailor the measure with the appropriate content. In addition to discussing each of the seven steps, tips and guidelines related to each step were presented, and attendees participated in informal class exercises to demonstrate the utility of the approach.
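For illustration, the seven steps can be encoded as a simple checklist that a measure developer could walk through; the step wording follows the paper, while the encoding is our own:

```python
# The workshop's seven-step measure-development framework as a checklist.
# The steps come from the paper; this encoding is only an illustration.

SEVEN_STEPS = [
    "Consider the level of analysis (e.g., individual vs. team)",
    "Identify measurement objectives",
    "Clarify the purpose for measuring performance",
    "Decide whether to assess outcomes, processes, or both",
    "Make sure objectives are measurable",
    "Select a method for each process and/or outcome",
    "Tailor the measure with the appropriate content",
]

for i, step in enumerate(SEVEN_STEPS, start=1):
    print(f"Step {i}: {step}")
```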

The afternoon session provided participants with hands-on practice using the seven-step framework presented in the morning session. Small groups of participants were formed, with each group developing measures for a performance measurement objective of interest. In the workshop invitation, participants had been asked to bring their own objectives to work on during the afternoon hands-on practice session; these objectives were used as the basis for the afternoon session. Two facilitators from the research team were assigned to each group to guide participants through the development of their measures.

WORKSHOP EVALUATION

Evaluation forms were developed to assess participant reactions to the morning briefing and the afternoon hands-on practice. The morning evaluation form asked participants to rate the performance measurement briefing on four criteria: its ability to prepare them to (1) develop good measurement objectives, (2) distinguish between outcomes and processes, (3) select an appropriate measurement method, and (4) develop effective measures. Ratings were made on a 5-point Likert-type scale (5 = strongly agree, 1 = strongly disagree). Four open-ended questions were then presented to assess whether participants' expectations for the morning session had been met, what they had found most useful about the briefing, what additional information should be added to the briefing, and who could benefit from attending this type of briefing. A similar evaluation form was developed for the afternoon session: it asked participants to rate the hands-on practice on the same four criteria used to assess the morning briefing, and then presented two open-ended questions to assess what participants found most useful about the hands-on practice session and how the afternoon session could be improved.

WORKSHOP 1

Morning Session Participants

Forty-four individuals attended the morning briefing on individual and team performance measurement. Thirty-five of the morning participants (73%) completed the evaluation form. Of those who completed the form, 11 were active-duty military personnel, 12 were civil service employees, and 12 were contractors.

Reactions to Morning Session

Overall, participant ratings of the morning session were extremely positive. Eighty-six percent of the attendees felt that the briefing had prepared them to develop good measurement objectives. Seventy-nine percent reported that the morning session helped them to distinguish outcomes from processes. Eighty-six percent of participants indicated that the briefing had successfully prepared them to select an appropriate measurement method, and 80% reported that they felt prepared to develop effective performance measures. These percentages are based on the number of attendees who either agreed or strongly agreed with the objective statements.


Furthermore, ratings were very positive regardless of whether participants were active-duty military personnel, civil service employees, or government contractors.
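For clarity, the computation behind these percentages is simply the share of respondents choosing 4 (agree) or 5 (strongly agree) on the 5-point scale; the ratings below are invented for illustration, and only the computation rule comes from the paper:

```python
# Percentage agreement on one evaluation item: the share of respondents
# who agreed (4) or strongly agreed (5). The ratings are hypothetical.

ratings = [5, 4, 4, 3, 5, 4, 2, 5, 4, 4]   # example responses to one item

pct_agree = 100 * sum(1 for r in ratings if r >= 4) / len(ratings)
print(f"{pct_agree:.0f}% agreed or strongly agreed")   # -> 80%
```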

In examining the responses to the open-ended questions related to the morning session, almost all participants reported that the briefing had met their expectations because it had provided a specific approach for developing reliable and valid measures of human performance and presented a good overview of the various issues associated with developing measures. As the most useful part of the morning session, participants consistently identified the step-by-step framework that was presented for developing measures of human performance.

Regarding suggestions for how the morning session could be improved, the majority of participants reported that they would have liked more information about the relationship between individual and team performance measurement and the use of the Naval Mission Essential Task List (NMETL). Naval Mission Essential Tasks represent all tasks that have been identified as necessary, indispensable, or critical to the success of a mission; the associated conditions and standards within which tasks must be performed are also specified. While the content of the current workshop was consistent with NMETLs, the focus was on performance measurement rather than on NMETLs per se. Lastly, participants suggested that a host of individuals could benefit from attending the workshop, including supervisors and trainers.

Afternoon Session Participants

Twenty-two individuals attended the afternoon hands-on practice session. Eleven individuals (50%) completed the evaluation form. Of these 11 individuals, 5 were active-duty military personnel and 6 were contractors.

Reactions to Afternoon Session

Overall, ratings of the hands-on practice session were also positive. When evaluating the afternoon session against our four criteria, the majority of participants agreed that the session was effective in preparing them to develop good measurement objectives and to select an appropriate measurement method. Approximately half of the respondents felt that the session successfully prepared them to distinguish performance outcomes from processes and to develop effective measures.

When responding to the two open-ended questions about the hands-on practice, the majority of participants felt that the most useful parts of the afternoon session were (a) the unique perspective brought to the table by each of the different participants and (b) the discussion within the groups. Regarding how the afternoon session could be improved, participants suggested that it might be beneficial to present attendees with a standardized task for which to develop measures, rather than allowing participants to choose their own.

Workshop Revisions

Based on the feedback obtained from participants, the hands-on practice session was slightly modified for the second workshop. Specifically, we pre-selected an objective (i.e., Positively Identify Friendly Forces) from the Naval Mission Essential Task List to serve as the basis for measurement development. The objective we selected consisted of providing the means, procedures, and equipment to positively identify friendly forces and distinguish them from unknown, neutral, or enemy forces. This task included positively distinguishing friendly from enemy forces through various methods, which may include procedural, visual, electronic, and acoustic means, in addition to providing information to the force commander to aid in the identification of unknown contacts. It was anticipated that groups of participants would develop performance measures for the pre-selected objective at different levels of analysis (e.g., individual, team).

WORKSHOP 2

Morning Session Participants

Forty-nine participants attended the morning briefing on individual and team performance measurement. Thirty-one participants (63%) completed an evaluation form. Of those who completed the form, 21 were active-duty military personnel, 4 were civil service employees, and 6 were contractors.

Reactions to Morning Session

Consistent with the first workshop, participants were asked to rate the morning briefing against our four criteria using the 5-point Likert-type scale. Seventy-six percent of participants felt that the briefing had prepared them to develop good measurement objectives. Sixty-six percent felt that they were better able to distinguish outcomes from processes as a result of the briefing. Sixty-nine percent of the respondents indicated that the morning session had prepared them to select an appropriate measurement method. Forty-three percent felt that the morning session had prepared them to develop effective performance measures, but an equal number were neutral on the subject. This result is most likely attributable to the fact that a number of participants recognized that learning how to accomplish this task successfully would require practice.

In examining the responses to the four open-ended questions on the morning evaluation, attendees felt that their expectations had been met in that "[the session] provided clear, understandable methods for developing measures of performance" and "helped to develop a calculated way to measure human performance." In terms of the most beneficial part of the morning session, participants cited the performance measurement steps, the background information about human performance, and the examples and discussion. When asked how the briefing could be improved, participants indicated that they would have liked more information on the history of the Universal Task List, NMETLs, and how they drive training and readiness assessments. Although there could have been greater discussion of the NMETL process, it was assumed that participants already had some level of familiarity with NMETLs prior to attending the workshop. Finally, attendees reported that supervisors, individuals who observe and assess performance, trainers, and anyone working with NMETLs could benefit from attending this type of briefing.


Afternoon Session Participants

Although 24 participants attended the afternoon session, only 13 individuals (54%) completed the evaluation form. Of those who completed the evaluation, 9 were active-duty military personnel and 4 were contractors.

Reactions to Afternoon Session

Although the ratings for the afternoon session were slightly lower than for the morning session, the data were still positive. When evaluating the hands-on practice session against our four criteria, slightly less than two-thirds of participants felt that the afternoon session had prepared them to distinguish outcomes from processes, while about half felt that it was helpful in developing good measurement objectives, selecting an appropriate measurement method, and developing effective measures.

In responding to the two open-ended questions about the hands-on practice, participants indicated that the most useful part of the afternoon was gaining actual experience developing measures. In addition, participants felt that the prepared examples were useful in understanding and working through the process. In general, participants provided few suggestions for how the afternoon session could be improved. Those who did provide input mentioned that greater detail could have been provided on some of the handouts (e.g., more information related to performance conditions and measures).

SUMMARY AND IMPLICATIONS

In summary, a workshop was developed on human performance measurement and delivered on two separate occasions to personnel involved in Navy training. Overall, reactions to these workshops were positive. Our analysis of participants' reactions indicated that the workshop: (1) met its stated objectives; (2) provided participants with useful information about human performance measurement; (3) provided participants with a 7-step process for developing reliable and valid measures of human performance; and (4) provided participants with experience developing a performance measure.

Several important lessons have been learned as a result of the workshops conducted. First, the discussions that took place during the workshops and the comments provided on the evaluation forms indicated that there is a great deal of interest in human performance measurement. Second, the individuals who attended the workshops varied greatly in their knowledge about performance measurement: some knew very little while others knew a good deal more. Third, participants liked the step-by-step framework for developing human performance measures that was presented, reporting that it was logical and relatively easy to follow. Finally, it appears that there is a need for tools and job aids that help guide instructors in developing valid and reliable metrics on the job.



As a result, we are currently developing a performance measurement authoring tool (PMAT) that will be delivered to the Navy. Based on the positive feedback obtained from workshop participants, the tool will include the same 7-step framework that was presented in the workshops. In addition, the authoring tool will include a great deal of guidance (i.e., definitions, background information, tips) and probing questions so that instructors of varying levels of experience can benefit from the tool. Specifically, the tool will be programmed to incorporate a wizard (i.e., tutor) that will guide the user through the 7-step framework just as the team of facilitators did in the afternoon portion of the workshops. Studies are currently underway to test the decision rules that will be programmed into the authoring tool. Once completed, a prototype of the tool will be developed and a series of usability tests will be conducted. Finally, the effectiveness of PMAT will be tested by asking two groups of instructors to develop measures of human performance. One group of instructors will develop these measures on their own; the second group will develop their measures with the assistance of PMAT. Participants will then use the measures that they have developed to assess performance during a scenario-based training exercise. System effectiveness will be demonstrated by comparing the reliability and validity of ratings developed with and without the PMAT system. The amount of time that it takes each group to develop its measures will also be examined.
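
To make the planned reliability comparison concrete, the following is a minimal sketch (in Python, with entirely hypothetical data; it is not the PMAT evaluation code itself) of one way such a comparison could be computed, using mean pairwise inter-rater correlation as the reliability index:

    import numpy as np
    from itertools import combinations

    def mean_pairwise_r(ratings):
        # Mean Pearson correlation across all rater pairs.
        # ratings: (n_raters, n_trainees) array of scores from one group.
        pairs = combinations(range(ratings.shape[0]), 2)
        rs = [np.corrcoef(ratings[i], ratings[j])[0, 1] for i, j in pairs]
        return float(np.mean(rs))

    rng = np.random.default_rng(0)
    true_perf = rng.normal(50, 10, size=30)            # 30 trainees in one exercise (assumed)
    unaided = true_perf + rng.normal(0, 8, (5, 30))    # 5 raters using unaided measures
    with_pmat = true_perf + rng.normal(0, 4, (5, 30))  # assumes PMAT measures reduce rating noise

    print(f"unaided:   r = {mean_pairwise_r(unaided):.2f}")
    print(f"with PMAT: r = {mean_pairwise_r(with_pmat):.2f}")

Under these assumptions, a higher mean pairwise correlation for the PMAT group would indicate more reliable ratings; assessing validity would additionally require an external criterion, which this sketch does not model.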


PSYCHOLOGICAL IMPLICATIONS OF DEPLOYMENTS FOR THE MEMBERS OF THE SOUTH AFRICAN NATIONAL DEFENCE FORCE (SANDF)

MAJOR CHARLES KENNY M. MAKGATI
RESEARCH PSYCHOLOGIST, MILITARY PSYCHOLOGICAL INSTITUTE, SANDF, SOUTH AFRICA
Kennymakgati@hotmail.com

Introduction

The African continent has been emerging for some time now. This process has been accompanied by poverty, bloodshed, migration, refugees and, at times, death. At the same time, the democratisation of South Africa meant that the South African government assumed a different role in terms of involving the South African National Defence Force (SANDF) in international missions, a focus different from the one these forces were used to. Traditionally, the SANDF focused primarily on aspects such as border control, crime prevention and peace enforcement on a national level. However, the integration of all military forces (the South African Defence Force and non-statutory forces such as Umkhonto we Sizwe and the Azanian People's Liberation Army), the removal of sanctions and changes in international relations implied greater political involvement by the South African government in Southern Africa. As a result, the principles, roles and practices of the SANDF had to change accordingly. Clearly, these changes could not proceed without some difficulty.

This paper focuses mainly on the involvement of the South African National Defence Force through the deployment of soldiers on the African continent. Specific attention is paid to the question of how South African troops cope with international deployments, which at times include the United Nations. In an attempt to address this, the researcher begins by providing a brief background on the conflict in Africa and later addresses the psychological implications of these deployments.

The Conflict in Africa and the legacies of colonialism

Over the past decades there have been numerous attempts to resolve intra-state conflict in Africa through mediation. Most of these efforts have failed, with one or more of the parties spurning negotiations, being unwilling or unable to reach a settlement in the course of mediation, or subsequently violating agreements that have been concluded. The factors that may account for the lack of success in each case include the history, nature and causes of the conflict; demographic, cultural and socio-economic conditions; the goals and conduct of the disputant parties; the role of external actors; and the style and methods of the mediator.

History postulates that colonialism stunted Africa's political, economic, and social development. It has been argued that it was during the nineteenth century's scramble for Africa that the European powers partitioned the continent into arbitrary territorial units. The colonies that emerged lacked internal cohesiveness, and differences and antagonisms among various indigenous groups were frequently exploited and exacerbated. Africans were given virtually no voice in political affairs. Designed to support the needs of the colonial powers, colonial economies required largely unskilled labour, and education was neglected. Generally, colonial powers did not prepare African countries for statehood, which most achieved during the 1960s.



It was not surprising that decolonisation created a new set of challenges which the first generation of African statesmen was ill-equipped to handle. Most transitions to independence were bloody. An added problem was the poor definition of borders as a result of the pragmatic decision taken by the Organisation of African Unity (OAU) to accept colonially defined borders. This in itself led to continuous conflicts as resources became increasingly scarce. Unable to come to terms with the ethnic, linguistic, and religious diversity within the preordained borders, individual African States have found it difficult to build the national identities that are crucial to creating stability.

In addition, the cold war had profound effects on African Governments and security. Both the Soviet Union and the United States courted the newly independent African States (as well as liberation movements) in an effort to win converts to their respective causes. As a result, they often supported authoritarian, corrupt, and oppressive Governments. With the end of the superpower rivalry, many African leaders could no longer rely on the accustomed backing of an outside power to lend much-needed political legitimacy and financial and military support to their regimes. This led to disgruntled and oppressed groups openly and forcefully challenging the legitimacy of these leaders, and the weakened regimes became increasingly susceptible to domestic unrest and violence.

At the same time, it may be stated that today's crises in Africa were also brought about by the leaders themselves. The style of government pervasive on the continent has not been conducive to development, democracy, and peace. Many leaders of the newly independent African countries tried to impose national unity by consolidating political and economic power in the State. This impacted badly on governance, with inefficient bureaucracies and corruption rampant and tolerated.

The economic and fiscal policies of many African States have failed, and largely Western-imposed solutions have created new problems. After the prices of many of their exports slumped in the 1970s, African States borrowed heavily to maintain Government expenditure. Initially, Western States and institutions readily lent money on the shared expectation that commodity prices would recover. By and large, African countries did not invest the borrowed funds prudently, and their debts mounted. Waste and corruption exacerbated the situation. Subsequently, the international financial institutions restricted access to international loans. As a result, many African States are still servicing their debts, and this has become their preoccupation. Social responsibilities that were once the purview of the State have been substantially ignored or subcontracted to others with varying degrees of success. States are also finding it difficult to provide for their own security. Many African militaries do not possess the human and material resources, or the discipline and inclination, to defend the State. To establish and maintain order, some African States have called upon private security firms (or corporate mercenaries). This in itself has serious repercussions for the peace, sovereignty and self-determination of nations and their people.

The political, economic, social, and military challenges to the State have been so enormous that some writers have suggested that parts of Africa be re-colonised for humanitarian purposes until such time as the state is prepared to govern effectively and humanely.

The proliferation of rebel movements, small arms, and refugees all adversely affects a State's ability to govern, and it also threatens regional security. Intra-state conflicts usually spill over national borders and frequently assume regional dimensions. Whereas States have historically supported, or denied support for, insurgencies in other countries as a means of retaining or gaining influence, their ability to control rebel movements has diminished. Some of these groups are sufficiently independent that they have themselves reportedly contracted mercenaries. It is reported that one-fifth of the global diamond market is supplied by African rebel groups, who at times collaborate with one another independently of State patrons.

Vast quantities of weapons, especially small arms, used to fight wars of independence, civil wars, and insurgencies remain in circulation and help fuel present conflicts. Many African Governments simply cannot monitor the movement of small arms in their countries or across their borders, although some are endeavouring to develop such a capacity. Other African Governments lack the political will to do so.

Like arms flows, the movement of people will continue to have profound repercussions for African security. Countries often have insufficient infrastructure to deal with influxes and migrations of people, and conflicts over scarce resources frequently arise. The fact that many refugee camps are situated near borders makes it easy for rebels to use them as bases from which to launch attacks and regroup, thus exacerbating the situation.

It therefore appears that the challenges to African peace and security defy easy solutions. Many conflicts are multifaceted and deeply entrenched, and they require sustained diplomatic and military engagement to move towards resolution. Mediating between the conflicting groups will feature prominently in the whole process of peacekeeping in Africa.

Modern peacekeeping has developed beyond the mere monitoring of a cease-fire. Fifty years of UN peacekeeping thus bring various disciplines (humanitarian relief, human rights monitoring and education, the protection of refugees, peacemaking, peace-building, and so on) together in one holistic mission plan. Modern multidimensional peacekeeping thus includes elements such as support for voluntary disarmament and demobilisation; programmes to rehabilitate child soldiers and re-introduce ex-combatants into civil life; de-mining; support for national reconciliation; rebuilding the judicial system; repatriation of refugees; re-introduction of civilian administration; training a new police force; and so on.

The involvement of South Africa

Until recently South Africa resisted considerable international pressure to contribute to peacekeeping operations in Africa, focusing instead on consolidating the transformation process in the SANDF. At the same time, however, the government realised that South Africa needed to prepare for a peacekeeping role in Africa. It therefore sent several officers and diplomats all over the world on peacekeeping courses; introduced peacekeeping training in its staff courses and at several other levels; and prepared two infantry battalions and other specialised units for peacekeeping operations.

Simultaneously the government, in close consultation with interest groups in civil society, developed a White Paper to guide South Africa's participation in international peacekeeping missions. These and various other efforts culminated in Exercise Blue Crane, a SADC brigade-size peacekeeping exercise that took place in April 1999. The exercise reinforced South Africa's confidence in its ability to take up a peace missions role. However, the country remained caught in the dilemma of what the term peacekeeping means at both the operational and the contextual level.

This was because the concept of contemporary peacekeeping is replete with doctrinal ambiguities and defies a straightforward definition. The term in its present form has become synonymous with any number of international activities designed to resolve or attenuate a conflict. The end of the cold war has seen the tenets of traditional peacekeeping eroded and the scope of peacekeeping and its activities expanded significantly. In practice, the temporal boundaries between peacekeeping and peace-building are not always apparent. The once-clear distinction between peacekeeping operations and enforcement actions has also become blurred. This has also led to difficulties, as the UN is no longer the only actor in this regard.

Efforts to clarify the terminology have not kept up with the rapid pace of development on the ground. In an effort to make sense of the changing security environment, the then Secretary-General of the United Nations, Boutros Boutros-Ghali, tried to provide a definition for the integrally related concepts of preventive diplomacy, peacekeeping, peacemaking and peace-building. This definition gained wide currency, but its value has recently declined with the expansion of peacekeeping following the end of the cold war. Currently, most commentators speak of successive generations of United Nations operations. In some circles, the terms peace operations and peace support operations are used interchangeably with the term peacekeeping operations to encompass a broad spectrum of conflict management and resolution techniques. The South African Department of Defence, for example, recently identified and defined nine overlapping terms: peace missions, peace support operations, preventive diplomacy, peacemaking, peacekeeping operations, peace enforcement, peace-building and humanitarian assistance.

To further complicate matters, different countries and organisations ascribe different meanings to the same terms. Many defence scholars broadly use the term peacekeeping to denote a military or police force deployed at the request of a Government or of a representative group of political and military actors that enjoys wide international recognition. This process places much greater restraint on the use of force than pure enforcement actions do.

The Psychological Implications

The researcher conducted two studies in the Democratic Republic of the Congo and Burundi, where South African troops are currently deployed on peace missions. The results identified the psychological concepts most critical in deployments: communication, pre-deployment preparation, and the role of the government and the organisation.

Communication

Gibson, Ivancevich and Donnelly (1994) define communication as the transmission of information and understanding through the use of common symbols, verbal and/or nonverbal. However, South African troops experience communication problems on three distinct levels: a) between the member and his/her family in the Republic of South Africa; b) between the member and the commanding staff within the mission area and in the Republic; and c) between the member and other members from other countries, or the same country, within the deployment area.

a) The member and his/her family in the Republic of South Africa.

The results revealed that 40% of the deployed South African soldiers find it difficult to communicate with their families at home, and they expressed this as a very real need. In the absence of continual communication with family members, deploying troops stated that they realised that they were "far away from home", a realisation that made them feel "emotionally paralysed". At the same time, their families are concerned about how "safe our loved one is within the deployment". Deployed members tend to become psychologically dysfunctional if they are deprived of communication with family members. They aspire to return home, and as a result "lame" excuses or faked diseases or ailments begin to surface. The group loses its cohesion because members want to go back home. On average, the results revealed that this tends to be the pattern after two to three months of deployment. On the other hand, participants were also of the opinion that communication must be restricted or filtered by family members. Ten percent indicated that reasons for doing so may include situations where one would be told about a death, sickness or financial difficulties at home.

Among its functions, communication serves a controlling and a motivating function, clarifying what needs to be done and how and when a task is to be done. Communication also plays a role in the formulation of goals, in feedback on progress and in the reinforcement of desired behaviour. It further serves as a release mechanism for emotional expression, enabling individuals to show their frustrations.

In the absence of adequate communication, and if they are not given relief from this situation, participants tend to view themselves as worthless and are therefore more prone to danger. Leisure time utilisation then becomes critical. However, in Burundi, members cannot utilise their leisure time as or when they want to; they need to comply with set rules aimed at ensuring their safety. Consequently, high levels of alcohol consumption and unprotected sex are reported. This would confirm the UN report on peace missions that "at any given moment around 5% of the preselected peacekeeping force may be experiencing increased psychological problems and up to 50% report increased high risk behaviour" (in Burgess, 2001).

b) The member and the commanding staff within the mission area and in the Republic.

The information-giving function of communication is also critical within this context. Participants expressed dissatisfaction with the fact that the commanding staff receives a great deal of information that is not relayed to them. This ranges from intelligence reports to information from their home units in South Africa. This information is censored to such an extent that it becomes relatively "worthless" to subordinates. Furthermore, the precise nature and role of the deployed members is not clearly understood by everyone, and interpretation seems to be quite varied. The history, origin and necessity of their deployment are not clearly communicated to all levels. As a result, miscommunication and misinterpretation occur, which add to the level of stress and frustration experienced by members.

These barriers to effective communication, which include filtering (the sender manipulates information so that it will be seen favourably), are perceived as being deliberate (Makgati, 2001). They are not interpreted as safety measures but as "ethnocentric". The argument supporting this interpretation is based on the perception that management and subordinates are delineated along racial lines. Management is seen as not relaying information sufficiently, and this is considered to be racially motivated. Subordinates view it as an attempt to make one group (Whites) feel superior to the other (Blacks) (Makgati, Mokhoka & Naele, 2002). This ultimately impacts negatively on group functioning and cohesion.

These dynamics elicit defensive behaviour in members, which may be detrimental to the effective execution of military objectives. Ashforth and Lee (1990) noted that when employees feel threatened, they tend to react in ways that reduce their ability to achieve mutual understanding. Also, when individuals interpret others' messages as threatening, they often respond in ways that retard effective communication. As a result, the commanding staff is perceived as being oppressive and as not having the interests of the employees at heart. The latter view is supported by Johnson and Johnson (2000), who state that the higher the level of bureaucracy, implying more vertical levels, the greater the opportunities for filtering. Whether or not the perceptions and interpretations that members have are correct, they do have a negative impact on the effectiveness of their operational functioning.

c) The member and other members from other countries or the same country within the deployment area.

It is a well-established fact that cultural differences may lead to uncertainty about human behaviour (Cox, 1993). Individuals also selectively see and hear based on their needs, motivation, experiences and other personal characteristics. However, within an international deployment context, language differences act as the greatest barrier to effective communication (Ursin and Olff, 1995). Some members feel substantially depressed by the fact that they are unable to communicate with their counterparts from other countries. Their inability to communicate leads to a sense of alienation, stress and anxiety, and increases the need to "go home" (Makgati, 2001).

Pre-deployment preparation

Pre-deployment preparation does not begin when the deploying member actually reports at the mobilisation area to receive briefings and to complete the administration required in order to deploy. It commences from the moment that the member is informed that he or she has been selected for deployment. Conflict and confirmation become the two primary aspects that members face when they have to inform their loved ones about this.

At this time, members begin to build the defences and coping mechanisms that enable them to deal better with the deployment. Ursin and Olff (1995) theorise that it is possible to treat defence as distorted stimulus expectancies within information processing theory. From the time members hear the news, they begin coding and analysing, and they consequently attempt to make projections into the future. In support of the latter, Lazarus and Folkman (1984) postulate that we need to regard defence as part of coping strategies.

South African deploying members argue that they have been neither fully prepared nor helped to create the defences and coping mechanisms that are essential for their deployment. They argue that the deployment is treated as an ordinary daily work experience at the home unit. For instance, members may be concerned about "who will take care of the house" while they are deployed. Those responsible are often individuals with different interests, and this may cause disputes upon the member's return.

The Role of the Government and the Organisation

The SANDF is mandated and tasked by the government. However, the organisation is also responsible for acting in the interests of its members. As a result, these three parties are co-dependent on one another. If one of them fails to deliver, the victim is the image of the government, through the deployed member. Both the organisation and the government play a role in the planning and organisation of all deploying forces and equipment, and agreements are entered into long before the actual deployment takes place (Makgati, 2000).

Many deployed members of the SANDF tend to blame the government for not ensuring sufficient racial representivity during deployments. The organisation, on the other hand, is blamed for not considering the needs of the deploying members. This includes ensuring that there are adequate telephone lines for members to maintain contact with their families. The organisation is also blamed for not ensuring that appropriate and serviceable equipment is available to members. Other complaints centre on the calculation of allowances, the fact that members are not told the exact dates of rotations, and the absence of health personnel, for example psychologists and social workers, on the ground. All these discomforts and irritants tend to lead to disputes that not only affect members and the organisation, but also have repercussions for the government.

CONCLUSION

The discussion in the present paper highlights critical psychological aspects that deployed South African soldiers tend to experience. These aspects are not necessarily applicable only to South African soldiers, but also to military deployments internationally. Although often ignored, they remain common and demand to be addressed. It is evident that the ignorance regarding all these aspects is the result of a lack of effective communication. To many, this may not be explicit, but it remains a reality.

Lastly, Africa bears enormous pressure from the economic, social and political problems that it faces in the current epoch, and its history points to the fact that these problems are longstanding. In this regard, the SANDF needs to continue to play a meaningful role in effecting political stability on the African continent. Interventions need to be well-planned and coordinated in order to have a positive and decisive impact on the realisation of the African rebirth as envisioned by the president of the Republic of South Africa, Mr Thabo Mbeki. Ensuring effective communication with deploying soldiers, thereby empowering them to facilitate the realisation of this vision, is therefore critically important.

REFERENCES

ASHFORTH, B.E. & LEE, R.T. (1990). Defensive behaviors in organizations: A preliminary model. Human Relations, 43(7), 621-648.

BARTONE, P.T. & ADLER, A.B. (1994). A model for soldier adaptation in peacekeeping operations. Paper presented at the 36th Annual Conference of the International Military Testing Association, Rotterdam, The Netherlands, October 1994.

BARTONE, P.T. (1996). American IFOR experience: Psychological stressors in the early deployment period. Proceedings of the 32nd International Applied Military Psychology Symposium, Brussels, Belgium, May 1996.

BURGESS, W.B.H. (2001). Second psychological report on Operation Mistral. Unpublished report, South African Military Health Services, Military Psychological Institute, Pretoria, South Africa.

COX, T. (1993). Cultural diversity in organizations: Theory, research & practice. San Francisco: Berrett-Koehler.

DE CONING, C. & MNQIBISA, K. (2000). Lessons learned from Exercise Blue Crane. ACCORD, KwaZulu-Natal.

DEPARTMENT OF DEFENCE (1996). White paper on national defence for the Republic of South Africa. Pretoria: DOD Policy Publication Database, South African National Defence Force.

DEPARTMENT OF FOREIGN AFFAIRS (1999). White paper on South African participation in international peace missions. Pretoria: DOD Policy Publication Database, South African National Defence Force.

DU PLESSIS, L. (1997). Historical roles of sub-Saharan armed forces. Paper presented at the Congress of the South African Political Studies Association, Mmabatho, South Africa, pp. 2-20.

GAL, R. & MANGELSDORFF, A.D. (1991). Handbook of military psychology. Wiley and Sons Ltd.

GIBSON, J.L., IVANCEVICH, J.M. & DONNELLY, J.H. (1994). Organizations: Behavior, structure, process (8th ed.). Boston: Irwin.

JOHNSON, D.W. & JOHNSON, F.P. (2000). Joining together: Group theory and group skills (7th ed.). Boston: Allyn and Bacon.

LAZARUS, R.S. & FOLKMAN, S. (1984). Stress, appraisal and coping. New York: Springer.

MAKGATI, C.K.M. (2001). Pilot report on the deploying members of the South African National Defence Force to the Democratic Republic of Congo. Unpublished report, South African Military Health Services, Military Psychological Institute, Pretoria, South Africa.

MAKGATI, C.K.M. (2001). On-site report on the deploying members of the South African National Defence Force to the Democratic Republic of Congo. Unpublished report, South African Military Health Services, Military Psychological Institute, Pretoria, South Africa.

MAKGATI, C.K.M., MOKHOKA, M.D. & NAELE, A. (2002). Staff paper to the GOC SAPSD Burundi on the psychological stressors experienced by deployed members of the South African contingent. South African Military Health Services Head Office, Pretoria, South Africa.

NOY, S. (1991). Handbook of military psychology. Wiley and Sons Ltd.

URSIN, H. & OLFF, M. (1995). Aggression, defense, and coping in humans. Aggressive Behavior, 21, 13-19.


The Psychological Impact of Deployments

Colonel A.J. Cotton
Director of Mental Health
Australian Defence Force, Canberra, Australia
Anthony.Cotton@defence.gov.au

Abstract

Johnston (2000) 63 reported on initial data taken from Australian Defence Force (ADF) troops returning from deployment in East Timor. This paper will present data from subsequent East Timor deployments and provide some links to available strategic personnel indicators. It will also examine the strategic personnel management issues related to the impacts of deployments and how these are starting to be addressed in the ADF.

INTRODUCTION

The Australian Defence Force (ADF) has had a program of screening service personnel on their return from operations since the early 1990s; this has been documented previously (Cotton, 2002) 64. The development (or selection) of appropriate instruments to support this has been an ongoing activity since the commencement of the program.

The current screening process is based on conducting screening immediately prior to, or immediately on, return to Australia (the Return to Australia Psychological Screen; RtAPS), followed by a subsequent screen three to six months after the member returns to Australia (the Post Operational Psychological Screen; POPS). The process for both is similar, involving some education, the completion of a number of screening tools, and an individual screening interview.

The selection and development of the screening instruments has been documented by Deans (2002) 65. The development of this battery is ongoing and has changed a little since Deans' (2002) report; however, the core elements of the screens have remained, meaning that it is possible to make some comparisons with these earlier results. Deans (2002) made a number of recommendations; these were:

a. Future RtA forms need to include an area for personnel to indicate whether they are reservists on full-time service, or full-time members.

63 Johnston, I. (2000). The psychological impact of peacekeeping deployment. Presentation to the 42nd Annual Conference of the International Military Testing Association.

64 Cotton, A.J. (2002). Screening for adjustment difficulties after peacekeeping operations. Presentation to the 44th Annual Conference of the International Military Testing Association.

65 Deans, C. (2002). The psychological impact of peacekeeping deployments: Analysis of questionnaire data 1999-2001. Research Report 6/2002, Psychology Technology and Research Group, Canberra, Australia.



b. Screening and monitoring of personnel deployed overseas continues to occur.

c. The Defence Force Psychology Organisation (DFPO), together with the Defence Health Service (DHS), establish an appropriate process for the coordination and effective utilisation of mental health data, and that ADF members be informed that data from mental health questionnaires will be collected for research purposes.

d. Norming of the mental health of ADF personnel, both non-deployed and deployed, is recommended. Development of appropriate norms should be followed by benchmarking research.

e. The modified version of the GHQ12 in the MHS be replaced by an original version of the GHQ.

f. A more systematic approach to the use of psychological screening instruments within the ADF should occur. In determining the appropriate instruments, all relevant stakeholders (Navy, Army, and RAAF representatives, 1 Psychology Unit, DFPO, and DHS) should be involved.

g. Future end-of-deployment paperwork include reference to the main location of deployment within the country of deployment.

Of these, recommendations a. and b. have been adopted, and a more comprehensive approach to the use of mental health data is now being considered in both a research and a surveillance sense (recommendation c). While the norming of the instruments has yet to be conducted (recommendation d), that too is being considered. The modified GHQ12 has been replaced (recommendation e), although not with the original GHQ; this is addressed later in this paper. Policy providing a more systematic approach to screening instruments has been produced (Health Bulletins 9/2003 66 and 11/2003 67; recommendation f), but the inclusion of details on the main location within the country of deployment has proved impractical to include on the screening paperwork (recommendation g).

Given the progress that has been made on Deans' (2002) recommendations, it seems timely to review the impact of these changes on the data available from these deployments.

AIM

The aim of this paper is to analyse the psychological impacts of deployments on ADF members returning from operations in East Timor, in order to determine the impact of changes made to the screening process since Deans' (2002) analysis.

66 Defence Health Service, Health Bulletin 9/2003, Australian Defence Force Mental Health Screen, Canberra, Australia.

67 Defence Health Service, Health Bulletin 11/2003, Mental Health Support to Operationally Deployed Forces, Canberra, Australia.


METHOD

Instruments

Deans (2002) recommended the replacement of the modified GHQ12 with the original form. However, other quarters within the ADF had also expressed concerns about the GHQ12, and a broader search for a screening instrument occurred. This identified the Kessler Psychological Distress Scale-10 (K10) as a suitable replacement. The K10 has been widely used in epidemiological studies both overseas and in Australia, has been shown to correlate well with a Composite International Diagnostic Interview (CIDI) diagnosis of anxiety or affective disorder, and has been shown to have better discriminatory power than the GHQ12 (HB 9/2003).

Two other changes to the RtAPS instruments also occurred: both the Acute Stress Disorder Scale (ASDS) 68 and the Alcohol Use Disorders Identification Test (AUDIT) 69 were removed from the screen. The ASDS was removed because it was designed to measure the impact of a specific traumatic incident on the individual. This made it difficult to employ in the RtAPS context, where the bulk of respondents had not experienced a single overwhelming traumatic incident but had more likely been involved in a number of potentially stressful events outside their normal range of experience. Similarly, the AUDIT is inappropriate in the RtAPS context, where many ADF personnel will have had either no access to alcohol, only very limited access, or unrestricted access. In all of these cases the behaviour of the individual will be atypical, making the AUDIT of limited value.

As a result, the RtAPS consists of the following:

a. Personal Details,
b. Deployment Details,
c. K10,
d. Traumatic Stress Exposure Scale – Revised (TSES-R),
e. Posttraumatic Stress Disorder Checklist (PCL), and
f. Major Stressors Scale.

68 Bryant, R.A., Moulds, M.L., & Guthrie, R.M. (2000). Acute Stress Disorder Scale: A self-report measure of acute stress disorder. Psychological Assessment, 12, 61-68.

69 Saunders, J.B., Aasland, O.G., Babor, T.F., de la Fuente, J.R., & Grant, M. (1993). Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO collaborative project on early detection of persons with harmful alcohol consumption: II. Addiction, 88, 791-804.



The TSES-R is a 13-item scale of potentially traumatic stressors that have been found to be commonly experienced by ADF personnel serving on operations. The Major Stressors Scale is a 36-item scale that covers a range of more general stressors that have also been found to be commonly experienced by ADF members on operations.

Sample

All ADF personnel who had been through the RtAPS process in 2002 and 2003 (the latest entry was September 2003) and whose data had been entered into the database were included in the sample. This resulted in a total sample of 1,657 cases. Basic demographic data about the sample are:

a. Gender – male 93.4%, female 6.6%
b. Age – mean 27.44 years, median 26 years
c. Marital status – married 49%, partnered 9.6%, separated or divorced 3.7%, single 37.6%
d. Previous deployments – none 55.9%, one 30.4%, more than one 13.7%
e. Average length of service – 7.64 years

Analyses

Analyses were conducted on a number of levels:

a. Descriptive analyses of the K10 and PCL, in particular a consideration of the numbers of individuals meeting clinical cutoffs.
b. Rank ordering of stressors from both the TSES-R and the Major Stressors Scale (illustrated in the sketch after this list).
c. Comparison of career intentions pre- and post-deployment.
d. Comparisons of clinical scores across key stressors (from the TSES-R and Major Stressors Scale) and key personal details (e.g., number of previous deployments).
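
As an illustration of analysis step b, the following is a minimal sketch (in Python, using randomly generated placeholder responses; the item labels are paraphrased from the results reported below, and the 0/1 exposure coding is an assumption) that rank orders stressor items by the proportion of respondents endorsing them:

    import numpy as np

    items = ["danger of injury", "danger of death", "human degradation",
             "dead bodies", "disease/toxic exposure"]
    # responses: (n_respondents, n_items); 1 = stressor reported (hypothetical coding)
    responses = np.random.default_rng(1).integers(0, 2, size=(1657, 5))
    prevalence = responses.mean(axis=0)
    # list the items from most to least prevalent
    for rank, idx in enumerate(np.argsort(prevalence)[::-1], start=1):
        print(f"{rank}. {items[idx]} ({prevalence[idx]:.0%})")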

RESULTS

Clinical Scales

The K10 offers three clinical score bands: 10-15, low risk (78% of the general population); 16-29, medium level of psychological distress; and 30-50, high-level risk of psychological distress 70. Cutoffs for the PCL are less clear, but a score of 50 has been shown to be a good predictor of PTSD diagnosis in a population of Vietnam combat veterans 71.

Analysis of the K10 scores produced the following results for this sample: 73.3% low risk, 14.9% medium risk, and 1.8% (19 cases) high risk. Analysis of the PCL showed only five cases reaching the clinical cutoff score.
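
The banding logic is simple enough to express directly. The following is a minimal sketch (hypothetical helper code, not part of the RtAPS system) applying the cutoffs quoted above: K10 bands of 10-15, 16-29 and 30-50, and a PCL clinical cutoff of 50:

    from collections import Counter

    def k10_band(score):
        # Map a K10 total (range 10-50) onto the three risk bands quoted above.
        if not 10 <= score <= 50:
            raise ValueError("K10 totals range from 10 to 50")
        if score <= 15:
            return "low"
        if score <= 29:
            return "medium"
        return "high"

    def pcl_flag(score, cutoff=50):
        # True when a PCL total reaches the clinical cutoff.
        return score >= cutoff

    k10_scores = [12, 14, 22, 31, 11]   # hypothetical K10 totals
    print(Counter(k10_band(s) for s in k10_scores))  # Counter({'low': 3, 'medium': 1, 'high': 1})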

A total of 43 personnel were referred for follow-up as a result of their RtAPS; in 17 cases this was recorded as being related to the deployment, while the remainder were recorded as being for other reasons. There was no relationship between the reason for referral and the referral source (i.e., interviewer, self, unit).

Stressors

Rank ordering the TSES-R items resulted in the following being rated as the five most prevalent stressors:

a. Being in danger of being injured.
b. Being in danger of being killed.
c. Witnessing human degradation on a large scale.
d. Seeing dead bodies.
e. Fearing that one had been exposed to a contagious disease or toxic agent.

Of these, the threats of being killed or injured were the most stressful at the time, and none were causing any significant distress at the time the screen was completed.

Rank ordering the Major Stressors Scale resulted in the following being rated as the five most stressful:

a. Double standards.
b. Leadership.
c. The military hierarchy.
d. Risk of vehicle accidents.
e. Separation from family and friends.

Career Intentions

Respondents were asked to record their career intentions prior to deployment and currently. The proportions of each of these responses are given in Table 1 below.

70 Defence Health Service, Health Bulletin 9/2003, Australian Defence Force Mental Health Screen.
71 Op cit.

Table 1
Career Intentions

Career Intention              Prior to Deployment (%)    Current (%)
Long term service career      61.4                       53.7
Serve out current engagement  14.7                       13.9
Seek Corps/Branch transfer    12.8                       13.8
Discharge within one year      9.6                       10.3
Discharge immediately          1.5                        8.3

Initial examination of this table suggests that there has been a shift in career intentions towards a change in career. Further examination showed that 130 respondents (7.8%) changed their career intentions, and that the bulk of these (112) changed in a "negative" direction, i.e., towards a change in career. When the direction of career change was compared with current career intention, this change proved to be significant (chi-square = 77.3; df = 4).
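
For readers who want to reproduce this kind of test, the following is a minimal sketch (in Python, using made-up counts rather than the paper's raw data) of a chi-square test on a 2 x 5 table of change direction by current career intention, which yields df = 4 as reported:

    import numpy as np
    from scipy.stats import chi2_contingency

    # rows: changed "negatively" vs changed otherwise (hypothetical counts)
    # columns: the five current career-intention categories from Table 1
    observed = np.array([
        [10, 15, 20, 30, 37],
        [ 8,  4,  3,  2,  1],
    ])
    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.4f}")  # df = (2-1)*(5-1) = 4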

Comparisons

Comparisons of K10 and PCL scores across categories of the five most prevalent traumatic events (as measured by the TSES-R) all showed significant differences, with K10 or PCL scores increasing with the frequency of occurrence of the event. Similar comparisons across the most stressful general stressors (from the Major Stressors Scale) also showed significant differences, with scores on the clinical scales increasing as the reported stress of the event increased.

Comparison of K10, PCL, and composite TSES-R scores across current career intentions showed significant results for both the K10 and PCL, but not for the TSES-R score. For both the K10 and PCL, scores increased as the category of career intention became more oriented towards changing career.

Finally, comparison of K10 and PCL scores across the number of deployments that the member had prior to the current deployment showed no significant differences. When current career intentions were compared across the number of prior deployments, however, this did produce a significant difference. Examination of cell residuals showed that those with several prior deployments were less likely to be seeking a change of occupation within the military, and those with one prior deployment were less likely to serve out their current engagement.
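
The "cell residuals" referred to here can be computed as standardized residuals, (observed - expected) / sqrt(expected). The following is a hedged sketch (the counts are illustrative placeholders, not the study data) for a table of prior deployments by current career intention:

    import numpy as np
    from scipy.stats import chi2_contingency

    # rows: 0, 1, 2+ prior deployments; columns: 5 career-intention categories
    observed = np.array([
        [420, 110, 100, 80, 60],
        [250,  90,  70, 50, 40],
        [120,  40,  10, 20, 15],
    ])
    _, _, _, expected = chi2_contingency(observed)
    std_resid = (observed - expected) / np.sqrt(expected)
    # cells with |residual| greater than about 2 drive a significant overall result
    print(np.round(std_resid, 2))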

DISCUSSION

Analysis of scores on the clinical scales of the RtAPS showed that rates of psychological distress are slightly elevated compared to the general population, while reported levels of PTSD symptoms are low. The level of psychological distress is understandable given the recency of the operational experience, and could be expected to diminish over time. The levels of PTSD symptoms, on the other hand, might be expected to increase over time, and therefore need to be monitored. These are important markers for the future health of the individual and, hence, for the personnel component of ADF capability.


The relationship of both K10 and PCL scores with career intentions certainly needs further investigation. In particular, the possibility of causality between psychological symptoms and career intentions should be investigated. The relationship between deployment stressors (traumatic or otherwise) and K10 and PCL scores is as expected, and provides some guidance on events that may require more immediate clinical attention when they occur. Finally, the lack of a relationship between the number of previous deployments and K10 or PCL scores requires further investigation, particularly in terms of the inoculation effects of previous deployments.

The effect of deployment on the career intentions of the member supports earlier findings and certainly requires more investigation. There are many theories about the effect of deployments on members' career intentions; certainly, the results here suggest that deployments do have an impact on the long-term stability of careers. Given the current level of operational tempo and the challenges that many nations are facing in recruiting, this is a cause for considerable concern and a clear argument for further research into the organisational impacts of deployment.

These organisational impacts are of particular importance and need further discussion. The purpose of the RtAPS is to provide mental health screening for individuals who have been involved in a military operation, because we believe that such involvement may have an adverse impact on the individual concerned. Screening helps command to exercise its duty of care in an effective and efficient manner by attempting to identify those who might be in more need of assistance. It is in effect a "triage" system, in that it allows command to allocate resources more effectively. This is an individual intervention.

The results presented here, although very preliminary, suggest that there might<br />

also be an organisational “triage” that can be effected through this by identifying adverse<br />

effects for the organisation that occur as a result of participation in a military operation.<br />

We use many objective measures for the costs of an operation and have tended to assume<br />

away the subjective (or people) cost of an operation. Perhaps there is a more objective,<br />

people cost, that is in fact measurable; i.e., the number of individuals who leave the<br />

military, or opt for a reduced career in the military, as a direct result of their involvement<br />

in an operation.<br />

Limitations to the Study

There are several clear limitations to this study that need to be considered before any strong statements are made:

a. The analysis presented is very superficial and needs to be conducted in significantly more depth.

b. The measures used here reflect stated intention rather than actual behaviour.

c. The measures used don’t provide any indication of causality.



d. The sample is based on peace operations and so should be broadened to incorporate warlike operations.[72]

Having identified the limitations of the study, there is certainly the capacity within the current data set to address some of these, and there is scope within the RtAPS process to address the remainder.

CONCLUSION

The RtAPS process has been modified to take into account a number of the recommendations made by Deans (2002),[73] while others have yet to be incorporated and some are simply too difficult to implement. The process clearly provides very useful information for the organisation and is a useful part of the provision of individual mental health support to the ADF.

There is also, however, broad organisational information available from the RtAPS process that can contribute to the wellbeing of the organisation, as opposed to just the individual. In particular, the ability to link mental health data to career intentions, and to gain some indication of the impact of the operation on the individual’s career intentions, could provide a useful organisational “triage”. To do this, the items in the current RtAPS should be retained, a comparison should be made across warlike and non-warlike operations, and attempts should be made to examine the causes of changes in members’ career intentions.

[72] Anecdotal evidence would suggest that the psychological impacts of warlike service are different.

[73] Op. cit.


THE LEADERS CALIBRATION SCALE

Karen J. Brown, Captain
Canadian Forces, Directorate Human Resources Research and Evaluation,
Ottawa, Canada, Brown.KJ3@forces.gc.ca

“Under a good General, there are no bad soldiers.” – Chinese proverb

Leaders’ ability to directly and indirectly influence all dimensions of climate, such as morale and cohesion, has been well established. Further, the enhancement of these important dimensions of climate is well recognized as a key method of improving group performance and combat effectiveness. Thus, as leaders influence all dimensions of climate, it is vital that leaders be able to assess climate accurately in order to maximize group effectiveness. In day-to-day operations, leaders at all levels attempt to informally gauge organizational climate by judging subordinates’ attitudes. However, previous research has demonstrated that leaders tend to be overly optimistic in their assessments of climate (e.g., Stouffer et al., 1949).

The “Leadership Calibration Scale” (LCS), previously named the Officer Calibration Scale, was developed to assess the degree to which Canadian Army officers are capable of accurately judging their subordinates’ perceptions of morale, cohesion, and confidence in leadership (Brown & Johnston, 2003). The goals of this instrument also included measuring leaders’ confidence in their assessments and assisting leaders in re-calibrating any perceptual discrepancies. The aim of this paper is to present the LCS and test hypotheses relevant to its success at measuring and reducing discrepancies between leaders and subordinates. The psychometric properties of the LCS will also be presented.

PREVIOUS RESEARCH

Briefly, research in a number of militaries across many years has demonstrated discrepancies between leaders’ ratings of subordinates’ attitudes towards climate and subordinates’ own perceptions of climate. For a more thorough review of the related literature, refer to Brown and Johnston (2002). Stouffer et al. (1949) initiated this avenue of research with their seminal work on “The American Soldier”, where they found that “officers tended to believe that their men were more favourably disposed on any given point” (p. 392) than the men actually were. This disparity between officers and enlisted men was noted in officers’ overestimation of levels of job satisfaction, confidence in leadership, and pride as soldiers, and underestimation of aggression towards the military (Stouffer et al., 1949). Again in the US, in 1985, Gabriel (as cited in Eyres, 1998) reported that soldiers perceived “that officers are of poor quality . . . in sharp contrast to the perceptions of the officers themselves, who, in general, believe that they are doing an adequate job of establishing a bond with their men” (p. 22).

Korpi (1965), working with Swedish conscripts, found that leaders tended to overestimate their subordinates’ responses on morale-related questions with a fairly substantial error rate (22-25%). Interestingly, the degree of positive bias in perceptions increased with leaders’ rank. Eyres (1998) concluded that leaders in the Canadian Army were “not having the positive leadership effect on their subordinates that they think they do” (p. 21) when: (1) non-commissioned members rated junior officers’ leadership and management skills significantly lower than officers rated them, and (2) senior officers rated their own leadership ability significantly higher than did their subordinates.

Considerable divergence has also been found when reviewing leaders’ and subordinates’ ratings of leadership behaviour. Baril, Ayman, and Palmiter’s (1994) review of articles comparing self- and subordinate descriptions reported that correlations ranged between zero and .23 and were often non-significant. A similar study found that up to 85% of subordinates did not agree with their leaders’ self-appraisal (Karlins & Hargis, 1988). Although leaders’ self-ratings are only one dimension of climate measured with the LCS, these results provide support for the lack of congruence between leaders’ and subordinates’ perceptions.

LEADER CALIBRATION SCALE

In view of leaders’ tendency to over-estimate climate, the Leader Calibration Scale (LCS) was developed to assess the extent of discrepancy between leader and subordinate perceptions, to measure confidence in assessments, and, ultimately, to assist officers in re-calibrating any perceptual discrepancies (Brown & Johnston, 2002). The organizational climate dimensions to be measured were based on those measured within the Unit Climate Profile (UCP), a 47-item attitudinal scale administered to members of the Canadian Army holding the rank of Sergeant and below. Both instruments measure the following 11 climate dimensions: morale/social cohesion, task cohesion, military ethos, professional morale, perceptions of immediate supervisor, as well as confidence in six different levels of leadership. Definitions of each climate dimension were developed based on the items used to measure each construct on the UCP, thereby increasing the likelihood that leaders and subordinates would be responding to similar constructs. Within the LCS, each climate dimension definition preceded a question on that dimension. Leaders were first asked to respond to the instruction “Estimate how the majority of the soldiers under your command would respond to the following statement” for each climate dimension (e.g., “Morale is very high in my unit”) using a 5-point Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly agree). The following hypothesis was posited:

Hypothesis 1. Leaders’ estimates of their subordinates’ attitudes toward climate dimensions would be significantly higher than subordinates’ actual ratings.

Kozlowski and Doherty (1989) stressed the need to assess perceptions of organizational climate and leadership at a unit level rather than a global level, because the direct and mediating effects of local leaders are likely to have large impacts on the processes and events within the unit. As such, the following hypotheses were tested:

Hypothesis 2a. Perceptions of climate would vary significantly as a function of company and leader/subordinate membership.

Hypothesis 2b. Perceptions of climate would vary significantly as a function of platoon and leader/subordinate membership.

The inability to judge soldiers’ attitudes has been attributed to officers’ overconfidence in their judgments (Korpi, 1965; Farley, 2002). Cognitive and sensory judgment research has established a relationship between the accuracy and confidence of judgements: individuals are often overconfident in their judgments, especially when the judgments in question are difficult to make (Baranski & Petrusic, 1999). Overconfidence found in judgments of sensory tasks has also been found in cognitive judgement and intellectual knowledge tasks (Baranski & Petrusic, 1995, 1999). Based on these observations, confidence items were developed and included in the LCS. Immediately following each dimension rating, leaders were asked to respond to the statement “Indicate how confident you are in the accuracy of your rating” using a 4-point Likert-type scale ranging from 1 (not at all confident) to 4 (highly confident). It was further hypothesized that:

Hypothesis 3. Confidence ratings of the leaders would be negatively correlated with the accuracy of ratings on each climate dimension.


Research has shown that making leaders aware of these discrepancies can facilitate their success (e.g., Becker, Ayman, & Korabik, 2002). Examining the agreement between subordinate ratings and self-ratings of leaders in an actual upward feedback session, London and Wohlers (1991) found that leaders’ ability to accurately judge subordinates’ attitudes improved over time; one year after the initial feedback session, which included results from subordinates, agreement in ratings increased significantly, although not dramatically, from Time 1 (r² = .28) to Time 2 (r² = .32). This supports the premise that discrepancies identified with the LCS could decrease over time if leaders were provided feedback on their accuracy. As the ultimate goal is to facilitate leaders’ ability to accurately judge the attitudes of their soldiers, the following hypotheses are offered:

Hypothesis 4a. Discrepancies between leaders’ ratings of subordinates’ attitudes toward climate dimensions and subordinates’ ratings would reduce significantly across the four phases of administration.

Hypothesis 4b. Confidence levels would vary significantly over the phases of administration. At Times 2 and 3, confidence levels would initially be lowered (i.e., Phase 2) until leaders re-calibrate their assessments of climate, after which confidence levels would rise (e.g., Phase 4).

RESULTS

The Human Dimensions of Operations (HDO) survey was administered throughout the course of an operational tour: a predeployment phase and three in-theatre phases. The HDO-W version, including the LCS, was administered to 552 leaders (warrant officers and above) and, concurrently, the HDO-S version, including the UCP, was administered to 2,064 subordinates (sergeants and below). Results of the UCP were averaged for each climate dimension (i.e., appropriate items were summed and averaged for each construct). Participants’ resultant scores for each climate dimension were merged with the LCS results. Demographic information such as company, platoon, and rank was available for both data sets.

Hypothesis 1. Previous findings that leaders overrated soldiers’ perceptions of climate (Brown & Johnston, 2002) were confirmed with eleven separate one-way ANOVAs (Tabachnick & Fidell, 2001). Results revealed significant differences between leaders and subordinates on all of the climate dimensions (refer to Figure 1), thereby supporting Hypothesis 1.
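The per-dimension comparison reported here is a one-way ANOVA of leader versus subordinate ratings. The following sketch is illustrative only; the synthetic ratings and column names are assumptions, not the survey’s actual fields.

import numpy as np
import pandas as pd
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
# Synthetic data in which leaders rate climate higher on average.
df = pd.DataFrame({
    "group": ["leader"] * 50 + ["subordinate"] * 200,
    "task_cohesion": np.r_[rng.normal(4.0, 0.5, 50),
                           rng.normal(3.4, 0.7, 200)],
})

# In the study this loop would run over all eleven climate dimensions.
for dim in ["task_cohesion"]:
    f, p = f_oneway(df.loc[df.group == "leader", dim],
                    df.loc[df.group == "subordinate", dim])
    print(f"{dim}: F = {f:.2f}, p = {p:.4f}")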

Figure 1 - Mean Perception Differences on Climate Dimensions (leaders versus subordinates on Morale/Social Cohesion, Task Cohesion, Professional Morale, Military Ethos, Leadership Skills, and confidence in Pl Comd, Pl WO, Coy Comd, Sec Comd, CSM, and CO; 5-point scale; figure not reproduced)

Hypothesis 2a. Significant interactions were found between company and leader/subordinate membership for the following dimensions of climate: Task Cohesion, Morale/Social Cohesion, Leadership Skills, and Confidence in the CO, Coy Comd, Pl Comd, and Sec Comd. Significant differences were observed between leaders’ and subordinates’ perceptions of climate for most companies.

Hypothesis 2b. Similar results were found in significant interactions between platoon and leader/subordinate membership for Task Cohesion, Morale/Social Cohesion, Military Ethos, and Confidence in the CO, Pl Comd, and Pl WO. Here, too, the nature of the discrepancies differed across the various platoons.

Hypothesis 3. A review of the confidence ratings indicated that the means ranged from 3.39 to 3.59 on a 4-point Likert-type scale, indicating confidence in assessments. Correlations between leaders’ perceptions of climate and their confidence in their ratings revealed significant, positive relationships (refer to Table 2). To further test this hypothesis, a difference score was calculated for each leader by subtracting the mean of subordinates’ ratings on each dimension within their respective companies from the leader’s rating for each phase. Thus, a positive difference score indicated an over-estimation of that dimension. Significant and positive correlations were found between the difference scores and confidence in ratings for all climate dimensions with the exception of Professional Morale. As all but one dimension of climate demonstrated increased confidence as leaders’ accuracy decreased, Hypothesis 3 was supported.
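The difference-score construction just described reduces to a group-by and a merge. A minimal pandas sketch follows, assuming hypothetical field names (company, phase, dimension, rating) and miniature data sets.

import pandas as pd

# Hypothetical miniature data sets; the field names are assumptions.
leaders = pd.DataFrame({
    "company": ["A", "A", "B"], "phase": [1, 1, 1],
    "dimension": ["task_cohesion"] * 3, "rating": [4.0, 4.5, 3.5],
})
subs = pd.DataFrame({
    "company": ["A"] * 4 + ["B"] * 3, "phase": [1] * 7,
    "dimension": ["task_cohesion"] * 7,
    "rating": [3.0, 3.5, 3.0, 4.0, 3.0, 2.5, 3.5],
})

# Mean subordinate rating per company, phase, and dimension ...
sub_means = (subs.groupby(["company", "phase", "dimension"])["rating"]
                 .mean().rename("sub_mean").reset_index())

# ... subtracted from each leader's rating: positive = over-estimation.
scores = leaders.merge(sub_means, on=["company", "phase", "dimension"])
scores["difference"] = scores["rating"] - scores["sub_mean"]
print(scores)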


Table 2 - Summary of Correlations Between Perceptions, Confidence, and Difference Scores

Topic                      Perceptions/Confidence   Difference Score/Confidence
Military Ethos             .24**                    .24**
Task Cohesion              .32**                    .29**
Morale/Social Cohesion     .22**                    .17**
Professional Morale        .27**                    .25**
Leadership Skills          .14**                    .11*
Sec Comd                   .29**                    .22**
Pl WO                      .29**                    .26**
Pl Comd                    .11*                     .01 NS
Coy Comd                   .18**                    .22**
CSM                        .27**                    .24**
CO                         –                        –

Note: * p < .05 and ** p < .01.


Figure 2 – Difference Score Means Across the Phases (Pre, Phase 1, Phase 2, Phase 3) for Task Cohesion, Morale/Social Cohesion, Professional Morale, and Pl Comd (figure not reproduced)

Hypothesis 4b. Similarly, another eleven one-way ANOVAs were conducted to assess changes in reported levels of confidence in assessments (refer to Figure 3). Significant differences across phases were found for Professional Morale, F(3, 536) = 2.76, p < .05, Leadership Skills, F(3, 532) = 4.13, p < .01, Pl Comd, F(3, 398) = 3.09, p < .05, and Pl WO, F(3, 388) = 3.81, p < .01. Confidence in rating perceptions of Professional Morale, Leadership Skills, and Pl WO was significantly higher at Phase 2 than at predeployment. Confidence in rating Pl Comd was significantly lower at the predeployment phase than at Phases 2 and 3. These results did not support Hypothesis 4b: the predicted drop in confidence after an initial assessment was not found; instead, confidence levels either increased or did not change across administrations.


Figure 3 – Confidence Differences Across Phases (Pre, Phase 1, Phase 2, Phase 3) for Professional Morale, Leadership Skills, Pl Comd, and Pl WO (figure not reproduced)

Psychometric Analyses

Several psychometric properties of the LCS were examined. Initially, a Principal Components Analysis (PCA) was conducted on all 22 items in the LCS. A varimax rotation yielded a five-component solution that accounted for 73.7% of the variance. Table 3 reveals the five components as: (1) Perceptions of Direct Leaders and their Confidence Ratings, including Sec Comd, Pl WO, and Pl Comd; (2) Perceptions of Indirect Leaders and the respective Confidence Ratings, including Coy Comd and CSM; (3) Perceptions of Climate; (4) Confidence in Climate Perceptions; and (5) Perceptions of the CO and its Confidence Rating. The results of these analyses will be used in future administrations of the HDO.
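For readers wishing to run this kind of analysis, a PCA with varimax rotation is available in the third-party factor_analyzer package. The sketch below is illustrative under stated assumptions: the file name and item columns are placeholders, not the actual LCS data.

import pandas as pd
from factor_analyzer import FactorAnalyzer  # pip install factor_analyzer

X = pd.read_csv("lcs_items.csv")  # placeholder: 22 LCS item columns

# Principal components extraction with a varimax rotation, five components.
fa = FactorAnalyzer(n_factors=5, rotation="varimax", method="principal")
fa.fit(X)

loadings = pd.DataFrame(fa.loadings_, index=X.columns)
communalities = pd.Series(fa.get_communalities(), index=X.columns)
variance, proportion, cumulative = fa.get_factor_variance()
print(f"total variance explained: {cumulative[-1]:.1%}")  # paper: 73.7%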

The internal consistency (reliability) of the LCS was tested with Cronbach’s coefficient alpha. Initially, reliability was tested for the original item structure of all items measuring soldiers’ perceptions of climate (α = .79) and confidence ratings (α = .79); both subscales demonstrated good internal consistency. High reliability was found for the dimensions based on the PCA results: Direct Leadership (α = .97), Indirect Leadership (α = .93), Climate (α = .76), Confidence in Climate Ratings (α = .80), and CO (α = .99).
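Cronbach’s coefficient alpha is straightforward to compute from a respondents-by-items matrix; a minimal sketch with synthetic data (not the LCS responses) follows.

import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

rng = np.random.default_rng(1)
base = rng.normal(0, 1, (100, 1))              # shared underlying trait
items = base + rng.normal(0, 0.7, (100, 5))    # five correlated items
print(round(cronbach_alpha(items), 2))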

DISCUSSION

Significant discrepancies between leaders’ and subordinates’ ratings suggest that leaders in operational roles may not be accurately assessing their subordinates’ perceptions of unit climate. A number of factors complicate the explanation of this phenomenon. First, fluctuating group membership across the phases, a result of voluntary participation, likely decreased the number of significant differences between climate dimensions and some groups (e.g., company and platoon). Second, group representation varies from administration to administration, adding another source of individual differences that cannot be controlled, due to the confidential nature of the study.

Consistent with the literature, inaccurate judgements were positively related to confidence in assessments. This study, however, provided no evidence that leaders re-calibrate their assessments of subordinates’ attitudes. Once again, this could be due to the data collection method, which makes it difficult to ensure that the same leaders are participating at each phase. Moreover, participation in the survey is anonymous to provide a secure environment for candid responses. It is, therefore, not possible to identify leaders in order to conduct a direct comparison of each leader’s responses at the various phases and determine the nature of changes (e.g., reduced confidence, increased accuracy) at the individual level. Consequently, it cannot be determined whether leaders are: (a) receiving feedback on the discrepancies in perceptions, and (b) altering their confidence and climate ratings; thus, it cannot be established whether re-calibration has actually occurred.

Further, this study provides general evidence that confidence in assessments altered across the phases of the tour. Although this finding was not supported across all climate dimensions, it does indicate that leaders may be adjusting their confidence levels. As discrepancies did not decrease over the phases, it is unknown whether the changes in confidence ratings were due to individual differences or to incongruent re-calibration of confidence. Regardless of the cause, leaders’ changes in confidence levels did not result in higher levels of accuracy in judging subordinates’ attitudes about climate.

Finally, the survey results may be influenced by the survey itself. Leaders may not have been receptive to negative feedback about their ability to accurately judge subordinates’ attitudes. They may also be sceptical about the utility of the results or their potential application to performance appraisal. “Survey fatigue” and frustration with the repetitive nature of the queries in the HDO may also have taken their toll on respondents. Moreover, many participants in this survey, especially subordinates, believe that the results will “fall on deaf ears”. As a result, responses on perceptions of leaders and climate may be inaccurate, either under- or over-rated.

FUTURE DIRECTIONS

Research. Ideally, a tightly controlled experiment across several groups would permit a direct comparison of individual leaders’ ratings of climate and confidence between different administrations and would ensure that feedback is provided. Additionally, research should be conducted to determine how leaders react to, interpret, and apply the feedback they receive from LCS results (London & Wohlers, 1991). Further, the responses of subordinates warrant deeper exploration; subordinates might respond differently (e.g., more frankly) if the purpose of the survey varied (e.g., performance, developmental, or research). London and Wohlers (1991), for example, found that a substantial percentage of subordinates (34%) stated they would have responded differently had they known the results were for performance appraisal.

Professional Development. There are many possible explanations for the discrepancy between leaders’ and subordinates’ perceptions of climate, such as: power and status differentials, physical distance, size of the unit, and personal contact (Korpi, 1965); self-protection mechanisms (e.g., denial, self-promotion, and defense mechanisms; Korpi, 1965; London & Wohlers, 1991); self-awareness and situational factors (Becker et al., 2002); different frames of reference for assessment (Baril et al., 1994); and projection of leaders’ own attitudes, either negative or positive, onto the soldiers (Stouffer et al., 1949).

Regardless of the cause of the discrepancies, it is likely that different forms of education and/or training can improve leaders’ accuracy. Professional development programs can be developed to enhance self-awareness of factors such as self-monitoring and defensive mechanisms. Furthermore, merely providing leaders with the results of the LCS may increase self-awareness and result in re-calibration of their assessments and confidence, which could greatly reduce these attitudinal discrepancies. This upward form of feedback has been linked with reduced divergence in perceptions between leaders and subordinates (London & Wohlers, 1991). In addition, formal leadership training programs could focus on the importance of having an accurate appraisal of one’s group (e.g., platoon), together with assessment training, practice, and evaluation of leaders’ ability to accurately assess the climate dimensions of their troops. Naturally, research should precede any alterations to training or professional development programs.

The ability to accurately judge unit climate will provide officers with an additional skill with which to maintain and/or improve morale, cohesion, confidence in leadership, and military ethos, which in turn will improve combat effectiveness. In addition, results from this study can also be used to develop pre-deployment and leadership training that nurtures this ability. The ultimate goal of the LCS is to improve leadership effectiveness and mission success.



REFERENCES

Baranski, J. V., & Petrusic, W. M. (1995). On the calibration of knowledge and perception. Canadian Journal of Experimental Psychology, 49(3), 397-407.

Baranski, J. V., & Petrusic, W. M. (1999). Realism of confidence in sensory discrimination. Perception and Psychophysics, 61(7), 1369-1383.

Baril, G. I., Ayman, R., & Palmiter, D. J. (1994). Measuring leadership behavior: Moderators of discrepant self and subordinate descriptions. Journal of Social Psychology, 24(1), 82-94.

Becker, J., Ayman, R., & Korabik, K. (2002). Discrepancies in self/subordinates’ perceptions of leadership behaviour. Group and Organization Management, 27(2), 226-244.

Brown, K. J., & Johnston, B. F. (2002). The Officer Calibration Scale: Towards improving officers’ ability to judge unit climate in the Canadian Army. Presented at the 39th International Applied Military Psychology Symposium, Brussels, Belgium.

Eyres, S. A. T. (1998). Measures to assess perceptions of leadership and military justice in the Canadian Army: Results from the 1997 Personnel Survey. Sponsor Research Report 98-5. Director Human Resources Research and Evaluation, National Defence Headquarters, Ottawa, Ontario, Canada.

Farley, K. M. J. (2002). A model of unit climate and stress for Canadian soldiers on operations. Unpublished dissertation. Department of Psychology, Carleton University, Ottawa, Ontario.

Karlins, M., & Hargis, E. (1988). Inaccurate self-perceptions as a limiting factor in managerial effectiveness. Perceptual and Motor Skills, 66, 665-666.

Korpi, W. (1965). A note on the ability of military leaders to assess opinions in their units. Acta Sociologica, 8, 293-303.

Kozlowski, S. W. J., & Doherty, M. L. (1989). Integration of climate and leadership: Examination of a neglected issue. Journal of Applied Psychology, 74(4), 546-553.

London, M., & Wohlers, A. J. (1991). Agreement between subordinate and self-ratings in upward feedback. Personnel Psychology, 44, 375-390.

Stouffer, S. A., Lumsdaine, A. A., Lumsdaine, M. H., Williams, R. M., Jr., Smith, M. B., Janis, I. L., Star, S. A., & Cottrell, L. S., Jr. (1949). Studies in social psychology in World War II. The American Soldier: Combat and its aftermath. Princeton, NJ: Princeton University Press.

Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Needham Heights, MA: Allyn & Bacon.


Table 3 - Principal Components Analysis

Item                           Communality   Loading   Component
Sec Comd                       .92           .93       Direct Leadership
CR of Sec Comd                 .92           .93       Direct Leadership
Pl WO                          .81           .89       Direct Leadership
CR of Pl WO                    .81           .89       Direct Leadership
CR of Pl Comd                  .86           .84       Direct Leadership
Pl Comd                        .86           .84       Direct Leadership
CR of CSM                      .90           .84       Indirect Leadership
CSM                            .90           .83       Indirect Leadership
CR of Coy Comd                 .89           .76       Indirect Leadership
Coy Comd                       .88           .75       Indirect Leadership
CR of Task Cohesion            .63           .79       Confidence in Climate Ratings
CR of Prof Morale              .60           .75       Confidence in Climate Ratings
CR of Morale/Social Cohesion   .55           .74       Confidence in Climate Ratings
CR of Leadership Skills        .51           .68       Confidence in Climate Ratings
CR of Military Ethos           .44           .65       Confidence in Climate Ratings
Morale/Social Cohesion         .63           .78       Perceptions of Climate
Prof Morale                    .60           .74       Perceptions of Climate
Task Cohesion                  .56           .69       Perceptions of Climate
Military Ethos                 .52           .69       Perceptions of Climate
Leadership Skills              .53           .68       Perceptions of Climate
CO                             .97           .97       CO
CR of CO                       .97           .97       CO

Component                        Eigenvalue   Percent Variance
Direct Leadership                7.16         32.56
Indirect Leadership              3.67         16.70
Confidence in Climate Ratings    2.31         10.52
Perceptions of Climate           1.81          8.22
CO                               1.26          5.71

Note: CR is the Confidence Rating the leader gave of his assessment of soldiers’ perceptions of that dimension.


Leadership Competencies: Are We All Saying the Same Thing?

Jeffrey D. Horey
Caliber Associates
49 Yawl Dr.
Cocoa Beach, FL 32931
horeyj@calib.com

Jon J. Fallesen, Ph.D.
Army Research Institute
Ft. Leavenworth, KS
jon.fallesen@leavenworth.army.mil

In the course of developing an Army leadership competency framework focused on the Future Force (up to the year 2025), the authors examined several existing U.S. military and civilian leadership competency frameworks. We attempt to link the core constructs across the frameworks and identify similarities and differences in terms of their content and structures. We conclude that leadership competency modeling is an inexact science and that many frameworks present competencies that mix functions and characteristics, have structural inconsistencies, and may be confusing to potential end users. Recommendations are provided to improve the methods and outcomes of leadership competency modeling for the future.

Table 1 presents many of the traits and characteristics commonly found in leadership competency frameworks. At first glance it may appear to be a comprehensive framework for leaders. It includes values (principled, integrity), cognitive skills (inquiring, thinking), interpersonal skills (caring, enthusiastic, communicating), diversity components (tolerance, respect, empathetic), and change orientation (open-minded, risk taking).

Table 1
Sample Leadership Competencies

Inquiring      Thinking      Communicating   Risk Taking   Principled
Caring         Open-Minded   Well Balanced   Reflective    Committed
Confident      Cooperative   Creative        Curious       Empathetic
Enthusiastic   Independent   Integrity       Respect       Tolerance

Surprisingly, this is not an established leadership framework but rather a list taken from a 4th-grade student profile guide. While a simplistic example, it illustrates both the universality of the competency concept and the potential for confusion when associating a simple list of traits and processes with leadership.


WHAT IS LEADERSHIP?

This, of course, is the $64,000 question (maybe it’s now the Who Wants to Be a Millionaire question?). As the Armed Forces face a rapidly evolving and complex future threat environment, it is crucial that leadership in these organizations be well defined, described, and inculcated. Part of this challenge includes establishing a common language for discussing leadership concepts and ensuring that consistent assessment, development, reinforcement, and feedback processes are in place for maintaining leadership across our forces.

So, again, what is leadership? Apparently, decades of research, dozens of theories, and countless dollars haven’t completely answered this question. If they had, we wouldn’t have vastly different visions of leadership and leadership competency across similar organizations. Or would we?

An acceptable definition of leadership might be ‘influencing, motivating, and inspiring others through direct and indirect means to accomplish organizational objectives.’ Defining leadership is an important first step toward establishing how it should be conducted within an organization. However, a simple definition is insufficient for describing the nature, boundaries, contexts, and desirable manifestations of leadership. Enter the evolution of competencies.

WHAT IS THE PURPOSE OF COMPETENCIES?

Behavioral scientists and organizational development professionals seek to improve individual and group work processes through the application of systematic procedures and research-based principles. Job analysis techniques, and to a lesser extent competency modeling, have long been used to establish the requirements of jobs and positions throughout organizations and to provide input to selection, training, and management practices. Knowledges, skills, abilities, and other characteristics (KSAOs), tasks and functions, and more recently competencies have become the building blocks of leadership selection and development processes. Competencies have become a more prevalent method than job or task analysis techniques for identifying the requirements of supervisory, managerial, and leadership positions because they provide a more general description of the responsibilities associated with these positions (Briscoe and Hall, 1999).

Employees want information about what they are required to do (or confirmation of what they think they are supposed to do) in their jobs or positions. The operative word here is ‘do’. They typically do not want to know what they are supposed to ‘be’. This simple representation of leadership requirements helps us establish a context for evaluating leadership competencies and frameworks/models. Those that are stated only in trait, characteristic, or attribute terms are, in our estimation, less valuable than those that are stated in task, function, and behavioral terms. However, models that address both aspects of leadership may prove to be more valuable to more individuals.



The purpose of establishing competencies for leaders should be to better define what functions leaders must perform to make themselves and others in their organizations effective. Many competency definitions include reference to clusters of knowledges, skills, abilities, and traits that lead to successful performance (Newsome, Catano, & Day, 2003). Yet competency labels are typically expressed in either process or functional terms. This can lead to confusion as to what competencies actually represent for leadership and organizations. Competency frameworks or models should serve as the roadmap to individual and organizational leader success. The value of competencies lies in providing specific, or at least sample, actions and behaviors that demonstrate what leaders do that makes them successful. Therefore, the end goal of all frameworks or models should be to provide measurable actions and behaviors associated with leadership functions. Functions are a step removed from this goal, while KSAOs, traits, and attributes are yet another step removed.

Leadership competency modeling has been in vogue for several decades, but the methods for developing these models, and their content, are as varied as the organizations for which they have been developed. Briscoe and Hall (1999) identify four principal methods for developing competencies, and Newsome, Catano, and Day (2003) present summaries of competency definitions and the factors affecting their outcomes.

COMPONENTS OF COMPETENCIES

The components of competency frameworks are seemingly as varied as the competencies themselves. Competencies are generally no more than labels that require additional detail to communicate how they relate to leadership and behavior. This detail may come in the form of definitions, elements or subcomponents of the competencies, and behaviors, actions, or other indicators of manifesting the competency or its elements. More detailed frameworks may include hierarchies of competencies or elements based on levels of leadership or other distinctions. In some cases, it’s unclear what the higher-order labels (e.g., Leading Change, Performance) should be called.
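To make the distinction between labels, definitions, elements, and behavioral indicators concrete, here is a small illustrative data model. It is our own sketch, not drawn from any of the frameworks reviewed.

from dataclasses import dataclass, field
from enum import Enum

class ComponentKind(Enum):
    PROCESS = "process"                  # e.g., decision making
    FUNCTION = "function"                # e.g., financial management
    CHARACTERISTIC = "characteristic"    # e.g., integrity

@dataclass
class Competency:
    label: str
    definition: str
    kind: ComponentKind
    behaviors: list[str] = field(default_factory=list)          # observable indicators
    elements: list["Competency"] = field(default_factory=list)  # sub-components

# A framework is then a named hierarchy of such components:
example = Competency(
    label="Decision Making",
    definition="Selecting the line of action most favorable to the mission.",
    kind=ComponentKind.PROCESS,
    behaviors=["Employ sound judgment and logical reasoning."],
)
print(example.label, example.kind.value)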

We must also preface our discussion by admitting that it is not completely fair to judge any framework by a high-level, surface comparison of the labels and definitions/descriptions of its competencies and components. We used as much of the definitions and descriptions of the framework components as possible in making our comparisons. A more accurate analysis of these frameworks would involve an elemental analysis of each framework construct, which is beyond the scope of this paper. However, it is this high-level aspect of the framework that, in some sense, sets the stage for the acceptance and comprehension of the framework by its intended audience.

NOW, ON TO THE LEADERSHIP FRAMEWORKS

We wish to thank the Center for Strategic Leadership Studies at the Air War College for inspiring this paper with their extensive presentation of military and civilian leadership issues. If you are not familiar with their website (http://leadership.au.af.mil/index.htm), we encourage you to explore it.

We chose to review leadership frameworks from the four major services, the Coast Guard, and the Executive Core Qualifications that apply to senior civilian leaders within the federal government. Table 2 presents overview information for the frameworks, including the service entity, the sources for the frameworks, and the components that we investigated. Initially, we sought to determine the similarity of constructs across the frameworks. In the course of this comparison we also recognized variation in the types of constructs represented within a particular framework, overlap among the components, and different levels of detail across the frameworks. We discuss each of these as well.

Table 2
Overview of Competency Frameworks

Service: Coast Guard
Source: COMDTINST 5351.1
Framework components: 3 categories, 21 competencies

Service: Army
Source: Field Manual 22-100
Framework components: Be, Know, Do: 7 values, 3 attributes, 4 skills, 12 different actions at 3 levels of leadership (Direct, Organizational, Strategic), performance indicators

Service: Marine Corps
Source: USMC Proving Grounds
Framework components: 11 principles, 14 traits

Service: Air Force
Source: AF Senior Level Management Office
Framework components: 3 main areas, 24 competencies at 3 levels of leadership (Tactical, Operational, Strategic)

Service: Executive Core Qualifications
Source: Office of Personnel Management
Framework components: 5 areas, 27 competencies

Service: Navy*
Source: ereservist.net; Naval Leadership Training Unit
Framework components: 4 guiding principles, 5 areas, 25 competencies

* The Navy leadership competency framework is currently in revision and a copy of the most recent version was not available at the time of publication. Four guiding principles are highlighted, two of which are also considered main areas.

Definitions of leadership or leadership competency for the frameworks we investigated are as follows:

Coast Guard – Leadership competencies are measurable patterns of behavior essential to leading. The Coast Guard has identified 21 competencies consistent with our missions, work force, and core values of Honor, Respect, and Devotion to Duty. (COMDTINST 5351.1)

Army – Influencing people – by providing purpose, direction, and motivation – while operating to accomplish the mission and improving the organization. Leaders of character and competence act to achieve excellence by developing a force that can fight and win the nation’s wars and serve the common defense of the United States. (FM 22-100, 1999)

Marine Corps – No definition found; seemingly defined by the principles and traits.

Air Force – Leadership is the art of influencing and directing people to accomplish the mission. (AFP 35-49, 1 Sep 85)

Navy – No definition found; can be inferred from the four guiding principles: professionalism, integrity, creativity, and effectiveness.

Civilians – No definition of leadership found for the ECQs. All core qualifications have definitions.

At the most basic level, the frameworks can be compared on the sheer number of components and structures that comprise them. While hardly an exact or enlightening comparison, they nonetheless vary from the 24 components of the Coast Guard framework to the 34 components of the Navy framework. The Coast Guard, Air Force, ECQ, and Navy frameworks present essentially two levels of framework components, although the Navy seems also to be considering 4 guiding principles in its conceptualization. The Army and Marine Corps presentations are not technically competency-based frameworks, but they are still appropriate for comparison with the others. The Army and Air Force frameworks also provide specific guidance related to level of leadership and the application of components.

In Table 3 we attempt to link similar constructs across the six frameworks. This table presents a more detailed treatment of similarities and differences across the services. Again, we used the definitions and descriptions in making our links, but in many cases the complexity of the definition or description made it difficult to completely represent how a component is related to, or distinguished from, others in this table. We reiterate that the goal of this comparison is to show, at a relatively broad level of abstraction, how these frameworks compare to one another.

In the original table, bold text represented the main competencies or the highest level of each framework for those that clearly included such a distinction (Coast Guard, Air Force, Navy, and ECQs). Across rows, we attempt to group similar constructs among the frameworks for comparison. In several cells within the same framework, we have grouped constructs that we feel are similar enough to consider them part of the same construct. The most prevalent example of this is related to the value construct. Therefore, while there are 41 rows in our table, this doesn’t necessarily equate to 41 unique constructs of leadership across the six models.

The constructs that appear to have the greatest concurrence across the six models (represented in four or more frameworks) are: performing/executing/accomplishing mission; vision/planning/preparing; problem solving/decision making; human resource management; process/continuous improvement; motivating/leading people; influencing/negotiating; communicating; team work/building; building/developing partnerships; interpersonal skills; accountability/service motivation; values; learning (including components of adaptability, flexibility, and awareness); and technical proficiency. Other constructs that are common across three of the frameworks are: driving transformation/leading change; strategic thinking; diversity management; mentoring/developing people (distinct from team building); and physical/health/endurance.

There were fifteen additional constructs represented in two of the frameworks, though the authors caution that much of the agreement between these constructs is due to the extreme similarities in the Navy and ECQ models (overlap on 6/14). These constructs are: external awareness; political savvy/working across boundaries; customer service/focus; conflict management; resource stewardship; financial management; tactical/translating strategy (the same construct?); leveraging technology/technology management; looking out for others; developing responsibility/inspiring/empowering/exercising authority; leading courageously/combat/crisis leadership; assessing/assessing self; personal conduct/responsibility; demonstrating tenacity/resilience; and creativity and innovation. Unique constructs, at least on the surface of the models, appear to be: entrepreneurship (defined in terms of risk taking); integrating systems (akin to systems thinking); emotional (an attribute); inspiring trust; enthusiasm; and followership.



Table 3
Leadership Competency Components Compared

(CG = Coast Guard; MC = Marine Corps; AF = Air Force; ECQ = Executive Core Qualifications. Each row groups similar constructs across the frameworks; frameworks without a corresponding component are omitted from that row.)

CG: Performance | Army: Executing; Operating | MC: Ensure assigned tasks are understood, supervised, and accomplished | AF: Leading the Institution; Driving Execution | Navy: Accomplishing Mission; Effectiveness | ECQ: Results Driven
CG: Vision Development and Implementation | Army: Planning/Preparing | AF: Creating and Demonstrating Vision | Navy: Vision | ECQ: Vision
Navy: External Awareness; Political Awareness | ECQ: External Awareness
Navy: Thinking/Working Across Boundaries | ECQ: Political Savvy
Navy: Customer Focus | ECQ: Customer Service
AF: Driving Transformation | Navy: Leading Change | ECQ: Leading Change
CG: Decision-Making and Problem-Solving | Army: Mental; Conceptual; Decision Making | MC: Make sound and timely decisions; Decisiveness; Judgment | AF: Commanding; Exercising Sound Judgment | Navy: Decisiveness/Risk Management; Problem Solving | ECQ: Problem Solving; Decisiveness
Navy: Conflict Management | ECQ: Conflict Management
AF: Applying Resource Stewardship | Navy: Resource Stewardship
Navy: Financial Management | ECQ: Financial Management
CG: Workforce Management Systems; Performance Appraisal | AF: Attracting, Developing, and Retaining Talent | Navy: Human Resource Management | ECQ: Human Resource Management
AF: Shaping Strategy | Navy: Strategic Thinking | ECQ: Strategic Thinking
Army: Tactical | AF: Translating Strategy
CG: Management and Process Improvement | Army: Improving | MC: Initiative | AF: Driving Continuous Improvement | Navy: Continuous Improvement
CG: Working with Others | Army: Motivating | MC: Employ your command in accordance with its capabilities
Navy: Leveraging Technology | ECQ: Technology Management
ECQ: Entrepreneurship (Risk Taking)
Navy: Integrating Systems
AF: Leading People and Teams | Navy: Leading People; Working with People | ECQ: Leading People
CG: Influencing Others | Army: Influencing | AF: Influencing and Negotiating | Navy: Influencing and Negotiating | ECQ: Influencing and Negotiating
CG: Respect for Others and Diversity Management | Navy: Leveraging Diversity | ECQ: Leveraging Diversity
CG: Looking out for Others | MC: Know your Marines and look out for their welfare
CG: Effective Communication | Army: Communicating | MC: Keep your Marines informed | AF: Fostering Effective Communications | Navy: Oral Communication; Written Communication | ECQ: Oral Communication; Written Communication
CG: Group Dynamics | MC: Train your Marines as a team | AF: Fostering Teamwork and Collaboration | Navy: Team Building | ECQ: Team Building
MC: Develop a sense of responsibility among your subordinates | Navy: Inspiring, Empowering, and Exercising Authority
CG: Mentoring | AF: Mentoring | Navy: Developing People
AF: Leading Courageously | Navy: Combat/Crisis Leadership
Army: Emotional
Army: Building; Developing | AF: Building Relationships | Navy: Partnering | ECQ: Building Coalitions/Communication; Partnering
CG: Self | Army: Interpersonal | MC: Tact | AF: Personal Leadership | Navy: Professionalism | ECQ: Interpersonal Skills
CG: Accountability and Responsibility | MC: Dependability | Navy: Responsibility, Accountability, Authority; Service Motivation | ECQ: Service Motivation; Accountability
CG: Aligning Values | Army: Loyalty; Duty; Respect; Selfless Service; Honor; Integrity; Personal Courage | MC: Bearing; Courage; Integrity; Justice; Unselfishness; Loyalty; Set the example | AF: Leading by Example | Navy: Integrity | ECQ: Integrity and Honesty
CG: Followership
CG: Health and Well Being | Army: Physical | MC: Endurance
CG: Personal Conduct | MC: Seek responsibility and take responsibility for your actions
CG: Self Awareness and Learning; Leadership Theory | Army: Learning | MC: Know yourself and seek improvement | AF: Adapting | Navy: Flexibility | ECQ: Flexibility; Continual Learning
CG: Technical Proficiency | Army: Technical | MC: Be technically and tactically proficient; Knowledge
MC: Enthusiasm
AF: Inspiring Trust
Army: Assessing | AF: Assessing Self
Navy: Technical Credibility | ECQ: Technical Credibility
AF: Demonstrating Tenacity | Navy: Resilience
Navy: Creativity and Innovation | ECQ: Creativity and Innovation


In answer to ‘Are we all saying the same thing?’ we respond with a simple mathematical exercise. Among the 41 constructs represented in Table 3, 20 are included in three or more frameworks, 15 are included in two, and six are unique to a single framework. Too close to call? In about half the cases the frameworks appear to be saying the same thing, but there are also significant differences in terms of what is included, or at least the level at which it is included, in each leadership framework. There are also some very obvious differences in the labels of leadership constructs, as indicated by the within-row groupings in Table 3.
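This tally is easy to reproduce once each row of Table 3 is mapped to the set of frameworks that populate it. The sketch below shows only three example rows; the full dictionary would contain one entry per row.

from collections import Counter

# Each Table 3 row mapped to the frameworks with a corresponding component;
# three example rows shown, of the 41 in the full table.
construct_frameworks = {
    "communicating": {"CG", "Army", "MC", "AF", "Navy", "ECQ"},
    "conflict management": {"Navy", "ECQ"},
    "followership": {"CG"},
}

coverage = Counter()
for frameworks in construct_frameworks.values():
    n = len(frameworks)
    coverage["3+" if n >= 3 else "2" if n == 2 else "unique"] += 1

print(coverage)  # over all 41 rows the paper reports 20 / 15 / 6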

CRITIQUE OF THE FRAMEWORKS

The true value of our efforts is to point out aspects of each of the frameworks that could be improved. While each of the organizations included in this analysis is unique, we believe that the nature and purposes of these organizations are similar enough that there should be great similarities in how leadership is defined, described, and displayed within them.

The first test to which we submitted the frameworks was whether or not they used a consistent representation of the labels of their components across all those components. Only the Air Force and Army models passed this test. The Coast Guard, Navy, and ECQ frameworks mix processes (decision making, influencing and negotiating, problem solving), functions (mentoring, management and process improvement, financial management), and characteristics (health and well being, flexibility, integrity and honesty). The Marine Corps principles and traits were more difficult to evaluate, but one could argue that several traits are actually KSAs (decisiveness, judgment, knowledge).
judgment, knowledge).<br />

The second test was one of independence of components within a framework. The Coast Guard framework includes performance appraisal and workforce management systems (certainly related), as well as self awareness/learning and leadership theory (defined in terms of learning about leadership). The Army framework includes mental and conceptual aspects on the attribute and skill dimensions, respectively. There also appears to be some overlap among the twelve skill dimensions (developing/building/improving; executing/operating). The Air Force framework may overlap on commanding and exercising sound judgment, and many of its other identified components seem closely related to one another (inspiring trust and influencing/negotiating; building relationships/mentoring). The Navy and ECQ frameworks had similar overlap within them (problem solving/decisiveness; leading people/working with people). Several Marine Corps principles and traits overlap (make sound and timely decisions/decisiveness; seek responsibility and take responsibility for actions/initiative).

The most common confounding in the frameworks is the mixing of processes or<br />

techniques to perform work and the functional areas of that work. For example, all organizations<br />

include decision making, problem solving, or judgment at some level in their frameworks. With<br />

the exception of the Army and Marine Corps, they also include functional areas such as<br />

workforce management, financial management, and conflict management that obviously require<br />

these processes or techniques to perform them.<br />

Next we examined the extent to which each of the frameworks provides behavioral examples or actions associated with the competency or components. As an illustration of the variety of definition and behavior content and detail, we provide information from each competency framework relevant to the construct of decision making/decisiveness/sound judgment in Table 4. The results indicate the different ways the services say the same thing.

Table 4
Competency Framework Detail for the Construct of Decision Making/Decisiveness/Sound Judgment

Air Force | Exercising Sound Judgment
Definition/Description: Developing and applying broad knowledge and expertise in a disciplined manner when addressing complex issues; identifying interrelationships among issues and implications for other parts of the Air Force; and taking all critical information into account when making decisions.
Behaviors: None found.

Army | Decision Making
Definition/Description: Involves selecting the line of action intended to be followed as the one most favorable to the successful accomplishment of the mission. This involves using sound judgment, reasoning logically, and managing resources wisely.
Behaviors (partial list of performance indicators): Employ sound judgment and logical reasoning. Gather and analyze relevant information about changing situations to recognize and define emerging problems. Make logical assumptions in the absence of facts. Uncover critical issues to use as a guide in both making decisions and taking advantage of opportunities. Keep informed about developments and policy changes inside and outside the organization. Recognize and generate innovative solutions.

Coast Guard | Decision Making and Problem Solving
Definition/Description: None found.
Behaviors: Learn to identify and analyze problems under normal and extreme conditions. Learn to consider and assess risks and alternatives. Use facts, input from systems, input from others, and sound judgment to reach conclusions. Learn to lead effectively in crisis, keeping focus on key information and decision points. Commit to action; be as decisive as a situation demands. Involve others in decisions that affect them. Evaluate the impact of your decisions.

ECQ | Decisiveness
Definition/Description: Exercises good judgment by making sound and well-informed decisions; perceives the impact and implications of decisions; makes effective and timely decisions, even when data is limited or solutions produce unpleasant consequences; is proactive and achievement oriented.
Behaviors: Embedded in example qualification and capability narratives.

Marine Corps | Decisiveness
Definition/Description: Decisiveness means that you are able to make good decisions without delay. Get all the facts and weigh them against each other. By acting calmly and quickly, you should arrive at a sound decision. You announce your decisions in a clear, firm, professional manner.
Behaviors (suggestion for improvement): Practice being positive in your actions instead of acting half-heartedly or changing your mind on an issue.

Navy | Decisiveness/Risk Management
Definition/Description: Exercises good judgment by making sound and well-informed decisions; perceives the impact and implications of decisions; makes effective and timely decisions, even when data are limited or solutions produce unpleasant consequences; is proactive and achievement oriented. (Identical to ECQ.)
Behaviors: None found.

Competency models/frameworks are intended to establish what leaders should be or do to achieve organizational goals. Decisiveness means little to leaders without accompanying information about what decisiveness accomplishes, how it is enacted, and why it leads to organizational goals. Most of the frameworks provide definitions of competencies and components to further understanding. Simply defining decisiveness, much like defining leadership, does little other than to provide an alternative set of words for the label. What is truly valuable is the description of how decisiveness is manifested in the organization. The more concrete and concise the description of actions and behavior associated with competencies, the more likely these competencies will be accepted, understood, and demonstrated.

FINAL WORDS

The most important considerations in developing and establishing leadership competencies should be how they will be used to influence leadership assessment, selection, development, and performance management processes. Even the best leadership framework has no value if it is not used productively by the organization. Redundancy, missing components, buzzwords, and inaccurate descriptions of effective behavior in doctrine are insignificant if the doctrine is not used. Well developed, comprehensive, prescriptive models of organizational leadership will be wasted unless leaders understand, embrace, and apply the features of the framework/model and organizations integrate them into succession planning, training and development, and multi-rater feedback systems.

Shippmann et al. (2000) conducted a review of competency modeling procedures compared with job analysis procedures. In general, competency modeling procedures were rated as less rigorous than job analysis procedures. However, competency modeling was felt to provide more direct information related to business goals and strategies. Competencies may also be more appropriate for describing successful leadership behaviors in future terms. This could be a critical factor for the organizations studied, as future threats and environments remain dynamic and uncertain. These strengths should be exploited by these organizations and not lost to confusing framework structures, unexplained redundancy in components, and incomplete examples of how competencies are manifested for success.

There are many sources of recommendations on how to implement or improve sound competency modeling procedures (Cooper, 2000; Lucia and Lepsinger, 1999). We would like to highlight a few of their suggestions based on our findings.


1. Define leadership and establish the boundaries on what is and isn’t considered in your organization’s leadership framework.

2. Use a consistent representation of the tasks, functions, actions, and behaviors that leaders perform.

3. Seek to eliminate redundancy in competencies and elements, and clearly indicate how actions and behaviors are linked to competencies or elements.

4. Involve behavioral scientists as well as leaders at all levels of the organization in the development and vetting of the model/framework.

5. Seek to validate competencies through organizational results.

We would also like to point out that some of the frameworks we investigated are undergoing change. We were not able to gather the pertinent information on where each service stands in refining, updating, or extending its framework, but we do know there are efforts under way in the Army and Navy to modify their leadership frameworks and models.

Looking back to our elementary school student profile, perhaps we can take solace in the recognition that our current students are our future leaders. Providing them with a roadmap for student success serves to assist them in their development and gives us a method for tracking their progress. Communicating the meaning of the competencies labeled in Table 1 will help them determine how they should behave, and help the rest of us assess, develop, and reinforce those behaviors. Reducing the redundancy, improving the detail, and providing behavioral examples of the competencies will assist in this effort.

REFERENCES

Air Force Leadership Development Model. Retrieved October 6, 2003 from http://leadership.au.af.mil/af/afldm.htm.

Army Leadership: Be, Know, Do. (1999). Field Manual 22-100. Headquarters, Department of the Army, Washington, DC.

Briscoe, J., & Hall, D. (1999). Grooming and picking leaders using competency frameworks: Do they work? An alternative approach and new guidelines for practice. Organizational Dynamics, 28, 37-52.

Coast Guard Leadership Development Program. (1997). Commandant Instruction 5351.1. United States Coast Guard, Washington, DC.

Cooper, K. (2000). Effective competency modeling and reporting: A step by step guide for improving individual and organizational performance. AMACOM.

Executive Core Qualifications. Retrieved October 6, 2003 from http://www.opm.gov/ses/handbook.htm.

Lucia, A., & Lepsinger, R. (1999). The art and science of competency models: Pinpointing critical success factors in organizations, Vol. 1. John Wiley & Sons.

Marine Corps Leadership Principles and Traits. Retrieved October 6, 2003 from http://www.infantryment.net/MarineCorpsLeadership.htm.

Navy Leadership Competencies. Retrieved October 6, 2003 from http://www.ereservist.net/SPRAG/Leadership_Competencies.htm.

Newsome, S., Catano, V., & Day, A. (2003). Leader competencies: Proposing a research framework. Research paper prepared for the Canadian Forces Leadership Institute. Available online at http://www.cda-acd.forces.gc.ca/cffi/engraph/research/pdf/50.pdf.

Shippmann, J., Ash, R., Battista, M., Carr, L., Eyde, L., Hesketh, B., Kehoe, J., Pearlman, K., & Prien, E. (2000). The practice of competency modeling. Personnel Psychology, 53, 703-740.


NEW DIRECTIONS IN FOREIGN LANGUAGE APTITUDE TESTING

Dr. John Lett, Mr. John Thain, Dr. Ward Keesling, Ms. Marzenna Krol
Defense Language Institute Foreign Language Center
597 Lawton Road, Suite 17
Monterey, CA 93944-5006
john.lett@monterey.army.mil

Military personnel are selected for foreign language training at the Defense Language Institute Foreign Language Center (DLIFLC) via a two-tiered system. 75 Those who pass their service’s ASVAB composite for a language-requiring career field are permitted to take the Defense Language Aptitude Battery (DLAB). Over the past thirty years, the DLAB has proven to be a valid and reliable tool, contributing predictive variance over and above the Armed Services Vocational Aptitude Battery (ASVAB). However, several factors converge to stimulate efforts to reexamine the DLAB and possibly create a new test to replace it. These factors include test security concerns (there is only one form), datedness (instructional methods have changed since the test was developed in the early 1970s, and the DLAB may not predict as well as it could if tailored to the new instructional environment), and customer demand (we need more, and more highly proficient, language specialists than ever before). In response to these factors, DLIFLC has launched several initiatives. First, we created two scrambled versions of the original test to protect against possible compromise of the original version. Second, we have programmed the test for computer-based administration and have initiated dialog with the accessions community regarding its implementation. Third, we have launched a contractor-supported project to obtain the opinions of leading applied linguists and cognitive psychologists regarding the advisability of exploring new approaches and item types in the development of a new DLAB. In this paper we discuss the background factors alluded to above, describe the initiatives completed and presently under way, and present the preliminary recommendations that have emerged from the most recent project.

BACKGROUND

The Defense Language Aptitude Battery (DLAB) is an instrument primarily used, in conjunction with the Armed Services Vocational Aptitude Battery (ASVAB), for selecting and assigning military recruits to careers within military intelligence that require foreign language skills. In order for individuals to enter a military intelligence career field, they must first attain acceptable scores on the ASVAB. If that career field requires that a recruit be trained in a foreign language, the individual must also attain an acceptable score on the DLAB.

75 The views expressed in this document are those of the authors and do not necessarily reflect the views of the Defense Language Institute Foreign Language Center or the Department of the Army.


All potential military recruits must take the multi-aptitude test battery known as the Armed Services Vocational Aptitude Battery (ASVAB), either in a high school setting, at a Military Entrance Processing Station (MEPS), or at a Military Entrance Testing Site (METS). The ASVAB contains eight individual tests: General Science, Arithmetic Reasoning, Word Knowledge, Paragraph Comprehension, Mathematics Knowledge, Electronics Information, Auto and Shop Information, and Mechanical Comprehension. Each ASVAB subtest is timed, and the entire battery takes about three hours. Scores are reported for individual subtests and for various combinations of subtests, known as composites.

ASVAB data are used for both accession and classification purposes, and each service has its own preferred composite of ASVAB subtests that qualifies recruits for further testing to determine whether they can become military linguists. Recruits who qualify on their service’s ASVAB composite may then take the DLAB, which is one of several special tests administered in the MEPS. Whether the DLAB is actually administered to any particular recruit who has the appropriate ASVAB composite score depends on several factors. First, the potential recruit must be willing to take the DLAB. Second, classifiers in the MEPS may offer the recruit a different post-ASVAB instrument if other jobs are more pressing at the time. Third, time or other circumstances may prevent the DLAB from being administered to the potential recruit.

When the DLAB is administered to a recruit, it, too, serves both selection and assignment purposes. A recruit becomes eligible for assignment to language training at DLIFLC in preparation for a career as a military linguist by scoring at or above his/her service’s DLAB cut score. Eligibility to study a given language is determined by comparing the recruit’s DLAB score to cut scores for the various language difficulty categories. Within those constraints, the actual language assigned depends upon the needs of the service, and sometimes upon the desires of the recruit. Regardless of the language studied, DLIFLC students are expected to attain the Institute’s stated level of proficiency at the end of the basic program.

The nature of the DLAB and the context in which it is used

The DLAB is a multiple-choice test that takes about two hours to administer. It consists of four parts: a brief biographical inventory, a test of the ability to perceive and discriminate among spoken stress patterns, a multi-part section that tests the ability to apply the explicitly stated rules of an artificial language, and a final section that tests the ability to infer linguistic patterns illustrated by examples in an artificial language and to apply the induced patterns to new artificial language samples.

The learning environment

The DLAB was developed in the 1970s following the then-current approach to aptitude testing, which tended to emphasize the strength of the empirical relationships between various predictors and a criterion. Thus, the DLAB is not primarily an attempt to flesh out a theoretical model of language learning aptitude. Nevertheless, any foreign language aptitude test such as the DLAB embodies an implicit view that there is an underlying construct of “aptitude to learn a foreign language” over and above the general intelligence needed for academic learning. 76

It should be stressed that the foreign language learning that the DLAB is to predict takes place in an intensive classroom-based language-teaching environment in the U.S. This type of learning environment is likely to be unfamiliar to most students because of its intensive nature and its duration. Students are in class for six hours a day, five days a week, for up to 63 weeks. Authentic materials are used in the classroom as soon as possible and throughout the whole course.

At the policy level, the DLIFLC espouses a communicative, proficiency-oriented approach to language teaching and learning; students must be able to use their language to perform specific kinds of tasks when they reach their post-DLIFLC job stations. In contrast, the DLAB was developed in the immediate aftermath of the “audio-lingual” era, one in which language teaching was heavily influenced by the habit-formation theories of behavioral psychologists such as B. F. Skinner.

Criteria for success

The optimal definition of success at DLIFLC is that the student completes the course on time and demonstrates the language proficiency required to receive a graduation diploma. Proficiency is demonstrated via scores on the Defense Language Proficiency Test (DLPT). The DLPT is administered at the end of the course of study at the DLIFLC and also is taken annually by military linguists throughout their careers. It uses a multiple-choice format to assess foreign language proficiency in listening and reading, and a face-to-face performance-based interview to assess speaking proficiency. Scores on the DLPT are interpreted in terms of the proficiency levels of the government’s Interagency Language Roundtable (ILR) Skill Level Descriptions, 77 which range from 0 to 5, where ‘5’ represents the proficiency of the educated native speaker. 78 To graduate, a DLIFLC student must demonstrate ILR proficiency levels of at least 2 in listening and reading and at least 1+ in speaking, regardless of the difficulty category of the language studied. It should be noted that the DLAB is not intended to predict success in the job after language training. Its purpose is to predict successful completion of basic language training, where success is defined in terms of satisfactory performance on the DLPT.

76 The development of the DLAB is described in Petersen & Al-Haik (1976).

77 The complete text of the Descriptions and a synopsis of their history are available at http://govtilr.org/.

78 The DLPT listening and reading tests measure only to level 3; the oral proficiency interview can measure the full range of the scale.

Why a new DLAB is needed

In general, data support the position that the current DLAB “works,” and that this has been true for some time.

• Data from the Language Skill Change Project (LSCP), conducted in the mid-1980s to early 1990s by the Army Research Institute (ARI) and DLIFLC in coordination with the US Army Intelligence Center and School (USAICS), indicated that DLAB scores added meaningfully to the prediction of language learning outcomes over and above that contributed by ASVAB scores. 79

• In a separate study conducted in the late 1980s (White & Park, 1987), ARI analyzed over 5,000 cases of ASVAB and DLAB data to explore the relationship between them. The study showed that the ASVAB Scientific and Technical (ST) composite (the composite used by the Army as a gateway to military intelligence career fields) and the DLAB were positively correlated (r = .51), and pointed out that raising the ST cut score would reduce the number of DLAB testing hours required to obtain a given number of qualifying DLAB scores. For example, among those with ST scores of 104 and below, only 1.3% made a DLAB score of 100 or more, compared with 46% of recruits scoring 130 or better on the ST. However, the data also showed that high STs did not guarantee high DLABs: even among those scoring 130 or more on ST, over half failed to reach DLAB 100, and thus would not qualify for the more difficult languages. (A rough sketch of this arithmetic follows the list below.)

• At the DLIFLC’s Annual Program Review, the Institute reports how many students reach success as related to their DLAB scores; generally, the higher the DLAB score, the more probable a successful outcome. Of course, high aptitude scores do not guarantee success for any given student; failures can and do occur. However, the likelihood of failure is greater among low-aptitude students than among high-aptitude students.
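As an illustration of the testing-hours argument, consider the following sketch. It uses only the two pass rates reported by White & Park; the candidate volumes and the two-hour administration time are assumptions made for illustration.

```python
# Illustrative arithmetic for the testing-hours argument (White & Park, 1987).
# The pass rates come from the study; candidate volumes and the two-hour
# administration time are assumptions made for this sketch.
DLAB_HOURS = 2.0  # assumed DLAB administration time per candidate

def yield_and_cost(candidates: int, pass_rate: float) -> tuple[float, float]:
    """Return (expected DLAB qualifiers, total testing hours expended)."""
    return candidates * pass_rate, candidates * DLAB_HOURS

for label, rate in [("ST <= 104", 0.013), ("ST >= 130", 0.46)]:
    qualifiers, hours = yield_and_cost(1000, rate)
    print(f"{label}: {qualifiers:.0f} qualifiers, "
          f"{hours / qualifiers:.0f} testing hours per qualifier")
# ST <= 104: 13 qualifiers, ~154 testing hours per qualifier
# ST >= 130: 460 qualifiers, ~4 testing hours per qualifier
```

Under these assumptions, testing only high-ST candidates yields roughly forty times as many qualifiers per testing hour, which is the force of the cut-score argument.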

So why is the Institute contemplating the development of a new DLAB? There are several reasons.

• There is only one form of the DLAB. Its complete compromise would be catastrophic, and its age leads to speculation that it may have been partially compromised many times already.

• There have been substantial changes in the philosophy and practice of foreign language education between the 1960s-70s and the present time.

79 The LSCP, conducted by ARI and DLIFLC, tracked an entering pool of 1903 US Army students of Spanish, German, Russian, and Korean from their arrival at DLIFLC until approximately the end of their first enlistment period. The data referred to here are described in Lett & O’Mara (1990).

• The current standardized criterion measures (i.e., the DLPT) were not available when the DLAB was developed.

• There are always desires to improve the efficiency of large training systems, and the need to produce language specialists of higher proficiency than ever leads to greater expectations for selection and assignment systems.

• Issues of face validity and flexibility argue for a modernized, computer-based test.

RECENT AND CURRENT INITIATIVES

DLAB-scramble

In an effort to guard against one form of test compromise, DLIFLC produced two versions of the existing test in which the order of items was carefully altered within test sections so as to make the original answer key invalid. Care was taken to ensure that relevant parameters remained constant, such as the relative difficulty of adjacent items and the time allowed between adjacent items. All test materials were integrated into modern media; e.g., graphics were scanned, text was retyped into Word for Windows files, and original recordings were transferred to compact discs (CD). The result was two parallel forms, each containing the original items but with answer keys that differed from each other and from the original. These materials were made available to Army Personnel Testing (APT) in [date], and the original DLAB was withdrawn from service.
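A minimal sketch of this kind of constrained reordering follows. The per-item difficulty values, the bin size, and the binning rule are assumptions for illustration; the actual scrambling procedure and its parameters are not described beyond the constraints above.

```python
import random

# Hypothetical items as (item_id, difficulty) pairs, ordered by difficulty;
# real DLAB item parameters are not published, so these are placeholders.
items = [(i, round(0.2 + 0.6 * i / 19, 2)) for i in range(20)]

def scramble_within_bins(items, bin_size=4, seed=1):
    """Shuffle items only within consecutive difficulty bins, so the
    relative difficulty of adjacent items stays roughly constant while
    the original answer key becomes invalid."""
    rng = random.Random(seed)
    out = []
    for start in range(0, len(items), bin_size):
        chunk = items[start:start + bin_size]
        rng.shuffle(chunk)
        out.extend(chunk)
    return out

form_b = scramble_within_bins(items)
# Sanity check: most items should no longer sit at their original position.
moved = sum(1 for pos, (item_id, _) in enumerate(form_b) if item_id != pos)
print(f"{moved} of {len(items)} items changed position")
```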

Computer-delivered DLAB

One way to get more high-aptitude language students into the training pipeline would be to administer the DLAB to larger numbers of students. One approach that has been proposed to test more recruits is to use a computer-based DLAB. In order to address those issues, work has proceeded along two fronts. One step was to establish liaison with MEPCOM to ask whether the MEPS would be able to make use of a programmed DLAB if they had one. Through the courtesy of the neighboring Defense Manpower Data Center (DMDC), we were able to meet with MEPCOM representatives and others at the October 2002 meeting of the Military Accessions Policy Working Group (MAPWG). Discussions led to an agreement that the new computers being procured to move the computer-adaptive ASVAB to a Windows platform would be allowed to have audio capability, to enhance the possibility that they could be used to administer a computerized DLAB when not being used for their primary purpose. At present, a study is being conducted by SY Coleman, Inc., to identify in some detail the technological and infrastructure issues which must be addressed in order for MEPS and post-MEPS locations to administer an automated DLAB.

Meanwhile, we have taken steps to position ourselves for feasibility studies regarding the use of computer-based DLABs in the MEPS. We took the electronic item files that had been developed for the scrambled DLAB project, and a DLIFLC programmer produced a working model of a computer-based DLAB. This product has undergone extensive beta testing within DLIFLC and is being revised per user feedback. It is now being used by SY Coleman to investigate whether it or a similar product could be administered on the Win-CAT-ASVAB workstations.

The “new mousetrap” project

Making use of FY 2002 funding provided by the US Navy, we launched the first portion of a multi-phase effort to design, develop, validate, and field a new test to provide a replacement for or an enhancement to the existing DLAB. The first step of the design phase was to review recent research on the subject of language aptitude and to query experts in this field to obtain insights and recommendations to inform decisions for revising or replacing the DLAB. To this end, and with contractor assistance, 80 we contacted several expert consultants 81 who familiarized themselves with our issues, prepared position papers, and participated in a two-day workshop at DLIFLC in October 2003. The three overarching general questions which we posed to the experts were these:

• What do theoretical developments in educational and cognitive psychology with regard to classroom-based, second language learning imply for the improvement of the DLAB? Are there new constructs that relate to acquisition of a second language that can be used in a predictive battery?

• What do current theories and practices in assessing aptitude for foreign language learning among adults, or for adult learning in general, imply for the improvement of the DLAB? Have there been significant developments in conceptualizing the assessment of aptitude for foreign language learning since the era when the DLAB and the Modern Language Aptitude Test (MLAT) were first developed?

• Could personnel selection and classification for language specialist career fields be improved by relying more heavily on measures of general aptitude for learning rather than seeking to refine measures of aptitude for learning specific kinds of things, such as languages in general, or language families, or specific languages or language skills? What mix of these approaches would yield the best predictions of success at DLIFLC?

80 The services of Perot Systems Government Services, then Soza & Co., Ltd., were obtained through the auspices of the OPM Training Management Assistance program. Perot Systems engaged the Center for Applied Linguistics (CAL) as the project’s subcontractor.

81 The consultants were William J. Strickland, Vice President, Human Resources Research Organization (HumRRO); Peter J. Robinson, Professor of Linguistics, Department of English, Aoyama Gakuin University, Tokyo, Japan; and Daniel J. Reed, Language Assessment Specialist, Program in TESOL and Applied Linguistics, Indiana University. An expert participant from within the US Government was Madeline E. Ehrman, Director, Research, Evaluation and Development, Foreign Service Institute (FSI) and Subject Matter Expert, Center for the Advanced Study of Language (CASL). The workshop was facilitated by three persons from the Center for Applied Linguistics, Washington, DC: Dorry Kenyon, Director, Language Testing Division; David MacGregor, Research Assistant; and Paula M. Winke, Test Development Coordinator, Language Testing Division, and Ph.D. candidate, Applied Linguistics, Georgetown University. Some of the material in this paper is based on materials developed during this project.

At the conclusion of two days of presentations and small-group working sessions, the participants were asked to synthesize their opinions and to indicate what kinds of item types should be involved in a revised test, what existing tests or test parts should be retained, etc. After the workshop, the expert consultants provided written summaries of their recommendations, with justifications.

Preliminary synthesis of recommendations

The synthesis presented here is only preliminary because we want to have all of the materials generated in the workshop reviewed by a prominent cognitive psychologist before we consolidate them into a final set of recommendations. With that caveat, several key findings can be stated.

• Most or all of the existing DLAB should be retained.

• Certain DLAB parts should be expanded or replaced by items like those on other existing language aptitude tests.

• New subtests should be developed to measure constructs that are not now being measured. Among others, these should include tests of perceptual speed, working memory, phonological discrimination, and the ability to listen (to one’s native language) under less than ideal acoustic conditions.

• Consideration should be given to a two-tiered approach to language aptitude assessment: one test to be given before arrival at DLIFLC and another to be given post-arrival. The former would serve as a gatekeeper for language training in general, and the latter would be used to make more informed assignments of recruits to languages or even to specific kinds of instructional environments.

• We should investigate the validity of alternative scoring strategies for both the current and the proposed system, e.g., by exploiting scores on DLAB parts in a manner similar to the way ASVAB subtests are grouped into composites for particular screening purposes. Similarly, we might consider a compensatory model for selection rather than today’s “multiple hurdles” approach. Such a system might allow the minimum DLAB score to vary based on appropriate ASVAB scores, or waive the DLAB altogether for extremely high scorers on certain ASVAB subtests or composites (see the sketch after this list).
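The contrast between the “multiple hurdles” rule and a compensatory rule can be made concrete with a small sketch. The cut scores, the waiver threshold, and the trade-off slope below are hypothetical values chosen for illustration, not the services’ actual figures.

```python
# Hypothetical cut scores for illustration only.
ASVAB_CUT = 100   # service composite cut score (hypothetical)
DLAB_CUT = 95     # DLAB cut score (hypothetical)

def multiple_hurdles(asvab: float, dlab: float) -> bool:
    """Current approach: each score must clear its own cut independently."""
    return asvab >= ASVAB_CUT and dlab >= DLAB_CUT

def compensatory(asvab: float, dlab: float) -> bool:
    """Sketch of a compensatory rule: a high ASVAB lowers the DLAB minimum,
    and an extremely high ASVAB waives the DLAB requirement entirely."""
    if asvab >= 130:                     # hypothetical waiver threshold
        return True
    required_dlab = DLAB_CUT - 0.5 * max(asvab - ASVAB_CUT, 0)
    return asvab >= ASVAB_CUT and dlab >= required_dlab

candidate = (118, 89)  # clears the ASVAB cut, misses the fixed DLAB cut
print(multiple_hurdles(*candidate))  # False: fails the DLAB hurdle
print(compensatory(*candidate))      # True: ASVAB 118 lowers the DLAB minimum to 86
```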

NEXT STEPS

As a first priority, we will complete the “New Mousetrap” project currently under way and will continue to explore infrastructure issues regarding the possible implementation of a computer-based DLAB in the MEPS. Simultaneously, we will be designing proposals for studies which should be done in the short term to address selected aspects of the various recommendations synthesized above. We will also be developing a longer-range research agenda and submitting proposals to appropriate research centers within the Government.

REFERENCES

Lett, J. A., & O’Mara, F. E. (1990). Predictors of success in an intensive foreign language learning context: Correlates of language learning at the Defense Language Institute Foreign Language Center. In C. Stansfield & T. A. Parry (Eds.), Language aptitude reconsidered. Englewood Cliffs, NJ: Prentice Hall Regents.

Petersen, C. R., & Al-Haik, A. R. (1976). The development of the Defense Language Aptitude Battery (DLAB). Educational and Psychological Measurement, 36, 369-380.

White, L. A., & Park, K. (1987). An examination of relationships between the Defense Language Aptitude Battery and the Armed Services Vocational Aptitude Battery (Selection and Classification Technical Area Working Paper RS-WP-87-09). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.


THE STRUCTURE & ANTECEDENTS OF ORGANISATIONAL COMMITMENT IN THE SINGAPORE ARMY

Maj Don Willis
Applied Behavioural Sciences Department
Ministry of Defence, Singapore

Correspondence:
Applied Behavioral Sciences Department,
Tower B, #16-01,
Depot Road,
Singapore 109681.
Tel: 65-3731574
Fax: 65-3731577

Author’s note:
The opinions expressed in this paper are those of the author and not the official position of the Singapore Armed Forces or the Ministry of Defence, Singapore.

ABSTRACT

Using Meyer and Allen’s (1991) 3-Component Model of Organisational Commitment (Affective, Continuance, and Normative) as the theoretical framework, this study showed that commitment in the Singapore Army can be described in the 3 dimensions postulated by Meyer and Allen (1991). Using both exploratory and confirmatory factor analyses of the 18-item scale developed by Meyer et al (1993), this study adds to the evidence for the applicability of this model in an Asian context. In addition, ‘Job Satisfaction/Meaningfulness’ was found to be the most important predictor of all 3 approaches to commitment. ‘Relationship with peers and superiors’ was also a significant predictor of Affective Commitment, while ‘Promotion opportunities’ and ‘Support provided by the organisation’ predicted Normative Commitment. These findings, while providing cross-cultural support for the model, also extended its applicability to a military culture, in spite of the latter’s collectivistic nature (Kagitçibasi, 1997; Triandis, 1995), in which hierarchy, control, orderliness, teamwork, group loyalty, and collective goals are valued over equality, individual expression, and self-interest (Soh, 2000).



INTRODUCTION

This study investigated the structure and antecedents of employee commitment in the Singapore Army. On the modern battlefield, technology has been a proven force multiplier. However, even the best weapon systems can be employed optimally only in the hands of talented and committed individuals. An understanding of the structure and antecedents of Organizational Commitment in the Army would provide a source of insight for the formulation of strategic career development as well as recruitment and retention policies. Not only will this provide the Army with a potential edge in the highly competitive Singapore labour market, but it will also assist in developing processes that imbue commitment, something paramount to an organization entrusted with the sacred responsibility of the country’s defence.

METHOD

A cross-sectional survey design was employed. Data were collected via self-administered questionnaires. Fifteen battalions were randomly selected for the survey, and a total of 621 regular Army personnel completed and returned the questionnaire. Organizational Commitment was measured using the revised 18-item scale developed by Meyer et al (1993). This scale has been used extensively, and a review by Allen and Meyer (1996) of the evidence relevant to its reliability and construct validity provided strong support for its use in substantive research. A second section comprised 48 items pertaining to employees’ perception of various aspects of working in an organization, e.g. work relations, work environment, rewards, etc. These items were derived and used by Lim (2001) in his study of organizational commitment in a Singapore public sector undertaking. The generic nature of the items made them applicable to the Army, and hence they were used in this study as possible antecedents of Organizational Commitment as postulated by Meyer et al (1993).

ANALYSES & RESULTS

Exploratory Factor Structure of the 18-item Scale

To ascertain the construct validity of the Meyer and Allen scale, Exploratory Factor Analysis using SPSS 10.0 for Windows was conducted on the 18 items. Principal axis factoring extraction and Varimax rotation were used to mirror the approach adopted by Meyer et al (1993). An examination of the scree-plot suggested that 3 factors might be retained for further analysis. Consistent with the scree-plot, the unrotated factor solution produced 3 factors with eigenvalues above 1 that accounted for almost 54% of the variance. Eventually, following the scale reliability analysis, a 3-factor structure was obtained, as shown in Table 1 below:

Table 1. 3-Factor Solution Based on Exploratory Factor Analysis of the Revised 18-item Scale (Principal Axis Factor Extraction with Varimax Rotation) following Scale Reliability Analysis (Cronbach Alpha)

Factor I (Alpha = .8376): AC1, AC3, AC4, AC5, AC7, AC8
Factor II (Alpha = .8527): NC10, NC11, NC12, NC13, NC14
Factor III (Alpha = .7280): CC2, CC3, CC5, CC6, CC7

It can be seen that the first factor comprised all 6 of the Affective Commitment (AC) items as postulated by Meyer et al (1993). Factor II comprised 5 of the 6 Normative Commitment (NC) items, while Factor III comprised 5 of the 6 Continuance Commitment (CC) items. By virtue of the orthogonal rotation, the results therefore indicated the presence of 3 distinguishable factors, generally replicating the factor structure of the 18-item scale. Given this, it became pertinent to carry out Confirmatory Factor Analysis to assess whether the data did indeed fit the 3-factor model depicted by Meyer et al (1993) in the revised 18-item scale.
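For readers who wish to replicate the extraction step, a minimal sketch follows. It uses scikit-learn’s maximum-likelihood FactorAnalysis with varimax rotation rather than SPSS’s principal axis factoring, so loadings will differ slightly; the item data and labels are placeholders standing in for the 18-item responses.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Placeholder data: rows are respondents, columns the 18 commitment items
# (the AC, CC, and NC items in the paper's labelling). Substitute the real
# survey responses here.
rng = np.random.default_rng(0)
X = rng.normal(size=(621, 18))

# Scree-style check: eigenvalues of the correlation matrix above 1 suggest
# how many factors to retain (three in the paper).
eigenvalues = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
print("Eigenvalues > 1:", (eigenvalues > 1).sum())

# Extract three factors with varimax rotation (ML, not principal axis).
fa = FactorAnalysis(n_components=3, rotation="varimax", random_state=0)
fa.fit(X)
loadings = fa.components_.T          # 18 x 3 matrix of item loadings
print(np.round(loadings[:5], 2))     # inspect the first few items
```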

Confirmatory Factor Analysis of the 18-item Scale

The covariance matrices derived from the data were used as the inputs, and a maximum likelihood solution was obtained using LISREL 8.30 (Jöreskog & Sörbom, 1993). The factor on which each item was proposed to load was based on the loadings used by Meyer et al (1993). Following the recommendations of Bollen and Long (1993), multiple indices of fit were used to evaluate the model’s fit to the data. Table 2 below summarises the findings.

Table 2. Summary Goodness of Fit Indices and χ² Tests of the 18-item Commitment Scale

Statistic | Desired Fit Indices | Obtained Fit Indices
χ² | - | 362.18 (df = 132), p = 0.00
Root Mean Square Error of Approximation (RMSEA) | < 0.05 (good); 0.05-0.08 (reasonable) | 0.066
Standardised Root Mean Square Residual (RMR) | < 0.05 (acceptable) | 0.065
Goodness of Fit Index (GFI) | > 0.90 (acceptable) | 0.91
Comparative Fit Index (CFI) | > 0.90 (acceptable) | 0.92

The above-mentioned indices were selected, as recommended by Diamantopoulos and Siguaw (2000), to make an informed decision. It is noted that the χ² value was significant, suggesting a poor fit. However, this was to be expected, as χ² is sensitive to sample sizes exceeding 200. Based on the other indices, it can be concluded that the data from the 18 items produced an acceptable fit to the model.
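For reference, the conventional formula behind the RMSEA value reported above is shown below; the exact figure depends on the estimator and on whether the software divides by N or N - 1 in its sample-size convention.

```latex
\mathrm{RMSEA} = \sqrt{\frac{\max\!\left(\chi^{2} - df,\; 0\right)}{df\,(N - 1)}}
```

Because the denominator grows with sample size, a significant χ² can coexist with an acceptable RMSEA in a sample as large as 621, which is the pattern seen in Table 2.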

Exploratory Factor Structure of the 48-item Work Variables

To facilitate the meaningful interpretation of the work variables surveyed, an exploratory factor analysis was conducted on the 48 items. The work items retained after the EFA and scale reliability analysis were reduced from 48 to 45 items, which loaded onto 8 factors. These factors, together with their summary statistics, are described in Table 3 below. Given their acceptable Cronbach Alpha coefficients, they were subsequently used as the predictor variables for the 3 Commitment scales.

Table 3. 8-Factor Solution based on Exploratory Factor Analysis (Principal Axis Factor Extraction, Varimax Rotation) of 48-item Work Variables following Scale Reliability Analysis

Factor | Alpha | Description | Mean* | SD*
Factor I | .8829 | Relationship with peers and superiors | 5.0879 | 1.2634
Factor II | .8896 | Satisfaction with financial remuneration | 3.6566 | 1.3628
Factor III | .8687 | Career and personal development opportunities | 4.2537 | 1.2534
Factor IV | .8836 | Perception and satisfaction with issues pertaining to promotions and advancements | 3.6719 | 1.2574
Factor V | .8813 | Meaningfulness of and satisfaction with the job | 5.2620 | 1.3425
Factor VI | .8876 | Support provided by the organisation | 4.1999 | 1.5351
Factor VII | .7720 | Workload | 4.9660 | 1.2022
Factor VIII | .7334 | Comparisons with the Private Sector | 4.1574 | 1.2515

* Factor scores were computed based on the arithmetic means of the items that loaded on the respective factor.
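The Cronbach Alpha coefficients in Tables 1 and 3 can be reproduced with a few lines of code; a minimal sketch follows, with item responses given as a respondents-by-items array and placeholder data standing in for a 5-item factor.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Example with placeholder data for a 5-item factor: items share a common
# latent component, so they are correlated and alpha is well above zero.
rng = np.random.default_rng(0)
latent = rng.normal(size=(621, 1))
responses = latent + rng.normal(scale=0.8, size=(621, 5))
print(round(cronbach_alpha(responses), 4))
```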

Relationship Between Commitment & Work Variables

The next step involved using Stepwise Multiple Regression to analyse the relationship between the work factors and each Organisational Commitment dimension. All 8 work factors were regressed onto each of the Organisational Commitment dimensions in turn. The order in which the variables entered the regression equation each time was determined by the strength of the correlation between each variable and the Organisational Commitment dimension concerned.
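A sketch of this selection logic follows, using a simple greedy forward routine over the eight factor scores (statsmodels OLS); the data and the entry criterion are placeholders, and SPSS’s stepwise procedure also applies removal tests not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X: np.ndarray, y: np.ndarray, alpha: float = 0.05):
    """Greedy forward selection: at each step, try the remaining predictor
    most correlated with y; keep it only if its coefficient is significant."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        best = max(remaining, key=lambda j: abs(np.corrcoef(X[:, j], y)[0, 1]))
        trial = selected + [best]
        model = sm.OLS(y, sm.add_constant(X[:, trial])).fit()
        if model.pvalues[-1] < alpha:   # p-value of the newly added predictor
            selected.append(best)
            remaining.remove(best)
        else:
            break
    return selected, sm.OLS(y, sm.add_constant(X[:, selected])).fit()

# Placeholder data standing in for the 8 work factors and an AC scale score.
rng = np.random.default_rng(0)
X = rng.normal(size=(621, 8))
y = 0.8 * X[:, 4] + 0.2 * X[:, 0] + rng.normal(size=621)  # factors V and I matter
chosen, fit = forward_stepwise(X, y)
print(chosen, round(fit.rsquared, 2))
```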

‘Meaningfulness/Satisfaction with job’ and ‘Relationship with peers and superiors’ were found to predict AC significantly, accounting for about 48% of the variance. On the other hand, only ‘Meaningfulness/Satisfaction with job’ predicted CC, accounting for only 7% of the variance. There were 3 predictors of NC: ‘Meaningfulness/Satisfaction with the job’, ‘Promotion opportunities’, and ‘Support provided by the organisation’. Together, these accounted for almost 52% of the variance.


DISCUSSION

Analysis of the 18-item Scale

The factor structure obtained via EFA generally replicated the factor structure of the 18-item scale. Due to the orthogonal nature of the rotation, the results therefore indicated the presence of 3 distinguishable factors. The Cronbach Alpha coefficients obtained were also consistent with those reported by Meyer et al (1993). Using CFA, the reasonable fit of the data to the 18-item model confirms the findings of Meyer et al (1993) and Dunham et al (1994). Given this, the results provide support not only for the 3-dimensional construct definition of Organisational Commitment, but also for its cross-cultural applicability to an Asian as well as a military sample.

Organisational Variables & Work Variables

Only ‘Meaningfulness/Satisfaction with Job’ and ‘Relationship with peers and superiors’ were found to predict AC significantly, accounting for about 48% of the variance. A comparison of the standardised betas showed that the former is a considerably more important factor (by a ratio of about 4.5) than the latter in predicting Affective Commitment. This finding is consistent with those of Allen & Meyer (1993, 1997) and Cramer (1996), who found that employees with a high level of emotional attachment to the organisation are also more likely to find their work in the organisation meaningful and relevant, and at the same time to enjoy a positive relationship with their superiors and peers.

Only ‘Meaningfulness/Satisfaction with Job’ predicted Continuance Commitment, accounting for only 7% of the variance, an indication that there are other factors not captured by the model. This finding is consistent with Meyer and Allen’s (1996) postulation that the antecedents of CC include job status and benefits accruing from long years in service, retirement benefits, opportunities for employment elsewhere, as well as the perceived transferability of work skills: factors that were not part of our work variables.

‘Meaningfulness/Satisfaction with the job’, ‘Promotion opportunities’, and ‘Support provided by the organisation’ predicted NC, accounting for almost 52% of the variance. The standardised betas showed that ‘Meaningfulness/Satisfaction with the job’ is a more important predictor than the other 2 variables. This finding is consistent with Allen and Meyer’s (1996) postulation that NC may arise from experiences which make employees feel that their organisation is providing them with more than they can reciprocate, thus obliging them to continue membership with their organisation.

CONCLUSION

The present research, while providing cross-cultural support for the 3-factor model, also extended the applicability of the scale to a military context, in spite of the latter’s collectivistic culture (Kagitçibasi, 1997; Triandis, 1995), in which hierarchy, control, orderliness, teamwork, group loyalty, and collective goals are valued over equality, individual expression, and self-interests (Soh, 2000), thus providing strong evidence for the model’s generalisability across occupational groups. The findings, although exploratory in nature, are encouraging. Future work will focus on determining other antecedents, especially for CC, as well as the nature and components of these antecedents. These will provide the necessary framework to serve as a platform for modeling and more robust confirmatory testing, so as to better help the Army appreciate and develop processes that imbue commitment.

REFERENCES

Bollen, K., & Long, J. S. (1993). Testing structural equation models. Newbury Park, CA: Sage.

Cramer, D. (1996). Job satisfaction and organizational continuance commitment: A two-wave panel study. Journal of Organizational Behaviour, 17, 389-400.

Diamantopoulos, A., & Siguaw, J. A. (2000). Introducing LISREL. London: Sage Publications.

Jöreskog, K., & Sörbom, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Chicago, IL: Scientific Software.

Kagitçibasi, C. (1997). Individualism and collectivism. In J. W. Berry, M. H. Segall, & C. Kagitçibasi (Eds.), Handbook of cross-cultural psychology: Social behavior and applications (2nd ed.), 3, 1-49. Needham Heights, MA: Allyn & Bacon.

Lim, B. (2001). The structure and nature of organisational commitment in a Singaporean work context. Unpublished doctoral dissertation, University of Manchester Institute of Science & Technology.

Meyer, J. P., Allen, N. J., & Smith, C. A. (1993). Commitment to organizations and occupations: Extension and test of a three-component conceptualization. Journal of Applied Psychology, 78(4), 538-551.

Soh, S. (2000). Organizational socialization of newcomers: A longitudinal study of organizational enculturation processes and outcomes. Manuscript submitted for publication.

Triandis, H. C. (1995). Individualism and collectivism. Boulder, CO: Westview Press.


FURTHER UNDERSTANDING OF ATTITUDES TOWARDS NATIONAL DEFENCE AND MILITARY SERVICE IN SINGAPORE

Charissa Tan, Lt-Col Star Soh, Ph.D., and Major Beng Chong Lim, Ph.D.
Applied Behavioural Sciences Department, Ministry of Defence
Defence Technology Tower B, 5 Depot Road,
#16-01, Singapore 109681
charissa@starnet.gov.sg

Authors’ note:
The opinions expressed in this paper are those of the authors and are not the official position of the Singapore Armed Forces or the Ministry of Defence, Singapore.

ABSTRACT

As the Singapore Armed Forces is a citizen’s army, it is important to understand the attitudes of active and reserve servicemen towards national defence and military service. The present study examines a model of inter-relationships among six constructs and across three samples: full-time national servicemen, Regulars, and Reservists. The constructs of interest were Support for National Service, Commitment to Defend the Country, Sense of Belonging, Perceived Security of the Country, Defensibility of the Country, and Confidence in the Armed Forces. Face-to-face interviews were conducted from February to April 2003 with a random selection of about 400 NSFs, 400 Regulars, and 400 Reservists. The hypothesized relationships among the constructs were tested using structural equation modeling. The findings are presented and discussed.

INTRODUCTION

In countries with conscript armed forces, citizens’ support for military service is crucial. Without adequate support, the armed forces cannot build and develop itself into a formidable force that can be entrusted to fulfill its functions. This study focuses on understanding attitudes toward national defence and support for military service among full-time conscripts, Regular servicemen, and reserves (National Servicemen).

When Singapore gained independence from Malaysia in 1965, the Singapore government needed to quickly build a strong military force to provide Singapore with the foundation for nation building (Huxley, 2000). In March 1967, Parliament passed the National Service (Amendment) Bill to require every male citizen of Singapore to perform full-time military service. Today, most Singaporean males enter full-time national service (NSF) between the ages of eighteen and twenty, for two and a half years. Upon completion of NSF service, the males become operationally-ready national servicemen (NSmen), who form the reserve force. Most NSmen’s training cycles last for thirteen years. NSmen are usually ‘called up’ for military service up to a maximum of 40 days per year. The NSFs and Regulars together form a standing Army of about 50,000, with the ability to mobilize approximately 300,000 NSmen reserves.

National service is therefore a necessary part of every Singaporean male's life, as well as an integral part of daily life for all citizens in Singapore. Given that the sovereignty and progress of the nation depend on the security provided by the system of national service, the study of military servicemen's attitudes towards national service and their willingness to fight for the country becomes imperative.

In this study, the key constructs of Support for National Service and Commitment to National Defence, and their inter-relationships with various key antecedent variables, were examined in a six-factor model. The model is based on a similar study by Soh, Tan, and Ong (2002), which found that, among a sample of military servicemen, Support for National Service and Commitment to National Defence were strongly related to Sense of Belonging to the country, and that the model fit the data from a second sample drawn from the public in a cross-validation analysis. However, that sample of military servicemen combined NSFs, Regulars, and NSmen, and so could have missed important differences attributable to the type of service. The paper also recognized that one of the constructs had been poorly operationalised.

The present study uses constructs similar to those of the 2002 study. The definitions of the constructs are as follows:

Support for National Service (SPNS): The favorable or positive attitude toward military conscription as a policy and as a worthwhile investment of personal time and resources.

Commitment to National Defence (CMND): The willingness to risk one's life to fight for the country in times of war.

Sense of Belonging (SOB): The willingness to stay in the country under general contexts (i.e., a war scenario is not specified), and feelings of national pride and belonging to the country.

Perceived Security (PS): The confidence that the country will enjoy peace and stability and will prosper over the short term (i.e., the next five years).

Defensibility of the Country (DS): The belief and confidence that the country can be defended.

Confidence in the Armed Forces (SAF): The confidence that the Armed Forces has the capability to defend the country.


The following relationships, based largely on the 2002 study, were hypothesized (see Figure 1):

H1: Commitment to National Defence will be strongly and positively related to Support for National Service. Should one be willing to fight for the country, one would support national service as a means of being involved in, and showing one's support for, national defence.

H2: Sense of Belonging will be strongly and positively related to Commitment to National Defence. One would be committed to fight for a country toward which one feels strong national pride and a sense of belonging.

H3: Perceived Security of Singapore and Defensibility of Singapore will be weakly and positively related to one's Sense of Belonging. One's sense of belonging is not purely affect-based, but also has a cognitive, evaluative component. It is proposed that knowing the country is safe and can be defended enhances one's sense of belonging to that country.

H4: Defensibility of Singapore will be weakly and positively related to Support for National Service. One would be more willing to support national service if one believes and is confident that the country can be defended and that one's efforts toward this end are worthwhile.

H5: Confidence in the SAF will be weakly and positively related to the Perceived future Security of the Country, and moderately and positively related to the Defensibility of Singapore. Confidence in a strong armed forces should lead to a greater sense of security and to the belief that Singapore can be defended. However, Confidence in the Armed Forces is expected to have a stronger relationship with Defensibility of Singapore, as the contexts of invasion and defence are more salient there than in Perceived Security, which reflects a more general outlook of peace and prosperity over the next five years.

Figure 1. Hypothesized Model of Inter-relationships. [Path diagram linking Confidence in Armed Forces, Perceived Security of Singapore, Defensibility of the Country, Sense of Belonging, Commitment to National Defence, and Support for National Service. + = weak, positive relationship; ++ = moderate-to-strong, positive relationship.]


To test the robustness of the hypothesized relationships, the findings from the sample of full-time national servicemen (NSFs) were cross-validated against a sample of Regulars and a third, separate sample of NSmen (the reserves).

METHOD

Sample

Data were obtained from three samples: 392 NSFs, 402 Regulars, and 494 NSmen from the Singapore Armed Forces. Proportionate random sampling was used to select the participants for the survey, so as to ensure that the sample was representative by type of service (Regulars, NSFs, NSmen), rank, and service (Army, Navy, Air Force). The sample sizes yield a precision that meets the statistical criterion of a ±5% error margin at the 95% confidence level.
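As a check on this claim, the following is a minimal sketch (ours, not part of the original study) of the standard margin-of-error calculation for a proportion under simple random sampling, evaluated at the conservative worst case p = .5:

```python
# Margin of error for an estimated proportion under simple random
# sampling: E = z * sqrt(p * (1 - p) / n); the worst case is p = 0.5.
# Sample sizes are those reported above.
from math import sqrt

Z_95 = 1.96  # two-sided z critical value for 95% confidence

def margin_of_error(n: int, p: float = 0.5) -> float:
    """Half-width of the 95% confidence interval for a proportion."""
    return Z_95 * sqrt(p * (1.0 - p) / n)

for label, n in [("NSFs", 392), ("Regulars", 402), ("NSmen", 494)]:
    print(f"{label}: n = {n}, margin of error = {margin_of_error(n):.3f}")
# Each margin is at or below .05, consistent with the +/-5% criterion.
```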

Procedure

The data for this study were extracted from an annual survey of the Regulars, NSFs, and NSmen of the Singapore Armed Forces on their perceptions of, and attitudes toward, various defence-related issues. Data were collected through face-to-face interviews from February to April 2003.

Measures

All the items in the survey instrument were self-developed. There were at least two items measuring each construct, resulting in a total of 21 items for analysis. The constructs, example survey items, and response options are presented in Table 1 below.

Table 1. List of constructs, example questions, and response options.

1. Support for NS
   (1) National Service is necessary for the defence of Singapore.
   (2) NS provides the security needed for Singapore to develop and prosper.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

2. Commitment to National Defence
   (1) If war should come, I would risk my life to fight for Singapore.
   (2) (R) If war should come to Singapore, I would try to leave the country.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

3. Sense of Belonging
   (1) I am proud to be a Singaporean.
   (2) Singapore is where I belong.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

4. Perceived Security
   (1) I am confident that Singapore will enjoy peace and stability over the next five years.
   (2) I am confident that Singapore will prosper over the next five years.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

5. Defensibility of Singapore
   (1) Singapore has enough resources to defend itself.
   (2) Singapore can be defended even if no country helps us.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

6. Confidence in Armed Forces
   (1) If there is a war now, I am confident that the SAF will have a quick and decisive victory.
   Response options: Strongly Disagree, Disagree, Agree, Strongly Agree

Note. (R) indicates that responses are reverse-scored.

RESULTS

Scale reliabilities and inter-correlations. The scale reliabilities (internal-consistency Cronbach alphas) and inter-correlations for the scales are presented in Tables 2.1 to 2.3 below.

Table 2.1. Scale Reliabilities and Inter-Correlations for the NSF Sample (N = 392)

Scales (no. of items in parentheses)      Alpha   1     2     3     4     5     6
1. Support for NS (5)                     .80    1.00
2. Commitment to National Defence (4)     .86    .61   1.00
3. Sense of Belonging (4)                 .74    .58   .61   1.00
4. Perceived Security (2)                 .67    .25   .23   .34   1.00
5. Defensibility of the Country (3)       .73    .40   .37   .34   .22   1.00
6. Confidence in the Armed Forces (3)

Note. p


Table 2.3. Scale Reliabilities and Inter-Correlations for the NSmen Sample (N = 494)

Scales                                    Alpha   1     2     3     4     5     6
1. Support for NS                         .80    1.00
2. Commitment to National Defence         .86    .66   1.00
3. Sense of Belonging                     .79    .62   .64   1.00
4. Perceived Security                     .67    .33   .30   .41   1.00
5. Defensibility of the Country           .72    .38   .35   .34   .31   1.00
6. Confidence in the Armed Forces

Note. p


Table 3 (continued). Tests of measurement and structural invariance across the three samples.

Models                                                 df    χ²        ∆χ²       RMSEA  SRMR   GFI   NNFI  CFI
M5: Strong factorial invariance (all LY, LX,
    BE, and GA paths invariant)                        590   1474.50             0.060  0.062  0.90  0.92  0.92
M5 - M4                                                14              26.23 *
M6: All LY, LX, GA, and BE paths invariant,
    except BE(2,3) kept free                           588   1467.05             0.060  0.066  0.90  0.92  0.92
M6 - M4                                                12              18.78 ns

Note. * ∆χ² is significant at p = .05. ns ∆χ² is not significant at p = .05.

As shown in Table 3 above, full measurement equivalence was obtained across the three samples (∆χ²(30) = 31.86 for M2 - M1 was not significant). The results also indicate that the basic structural model was applicable across the three samples (∆χ²(30) = 35.06 for M4 - M3 was not significant), suggesting weak factorial equivalence among the three samples. However, constraining all the paths to be equal proved too restrictive (∆χ²(14) = 26.23 for M5 - M4 was significant), and hence one path (β(2,3)) had to be freely estimated across the three samples so that the structural model was applicable to all three (∆χ²(12) = 18.78 for M6 - M4 was not significant). The final models for the three samples are illustrated in Figures 2, 3, and 4 below, with their respective fit indices presented in Table 4. Fit indices of root mean square error of approximation (RMSEA = .060), standardized root mean square residual (SRMR = .066), goodness-of-fit index (GFI = .90), non-normed fit index (NNFI = .92), and comparative fit index (CFI = .94) all indicate acceptable fit by conventional standards.
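The nested-model comparisons reported above can be reproduced with a chi-square difference test; the following is a minimal sketch (ours) using SciPy, with the ∆χ² values and degrees of freedom taken from the text:

```python
# Chi-square difference test for nested SEM models: the difference in
# model chi-squares is itself chi-square distributed, with df equal to
# the difference in model degrees of freedom.
from scipy.stats import chi2

comparisons = {
    "M5 - M4": (26.23, 14),  # all paths constrained to be equal
    "M6 - M4": (18.78, 12),  # BE(2,3) freely estimated
}

for label, (delta_chi2, delta_df) in comparisons.items():
    p = chi2.sf(delta_chi2, delta_df)  # survival function = upper-tail p
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: delta-chi2({delta_df}) = {delta_chi2}, p = {p:.3f} ({verdict})")
```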

Figure 2. Model for NSFs. [Path diagram with standardized path coefficients and squared multiple correlations (R²) for each construct; asterisks indicate paths that could be constrained to be equal across the models.]

Figure 3. Model for Regulars. [Path diagram; same conventions as Figure 2.]

Figure 4. Model for NSmen. [Path diagram; same conventions as Figure 2.]


Table 4. Fit indices of the three models.

Models               df    χ²       RMSEA  SRMR   GFI   NNFI  CFI
Model for NSFs       182   475.23   0.061  0.063  0.90  0.90  0.92
Model for Regulars   182   414.36   0.059  0.050  0.91  0.92  0.93
Model for NSmen      182   535.48   0.063  0.055  0.91  0.91  0.92

DISCUSSION

The findings supported the hypothesized relationships. In each of the groups (NSFs, Regulars, and NSmen), respondents' commitment to defend Singapore was strongly and positively related to their support for National Service. This suggests that respondents could have perceived National Service as a tangible means of expressing their commitment to the country's defence. Commitment to defend Singapore in times of war was in turn strongly and positively related to one's sense of belonging, indicating that one who is emotionally attached to the country would be committed to defending it in times of war.

In all groups, sense of belonging was also found to be moderately and positively related to perceived security and defensibility of the country, with defensibility of the country more strongly related than perceived security. This reinforces the importance of a strong perception that the country is defensible, as sense of belonging appears to be not only affect-based but also cognitively based. As hypothesized, there was a weak, positive relationship between defensibility of the country and support for National Service. Interpreted through Vroom's (1964) expectancy theory of motivation, these findings suggest that support for military conscription tends to be pragmatic: support for national service is positively influenced by the perception that the country can be defended and that one's efforts would not be in vain.

In all groups, confidence in the SAF was directly related to the perceived security of the country and the defensibility of the country. The stronger correlation between confidence in the armed forces and defensibility of the country, as compared with perceived security of the country, suggests that, given their personal involvement in the defence of the country, the servicemen could have been more likely to perceive the armed forces as having a greater role in the security and defence of Singapore.

In conclusion, this study reinforces the importance of building on one's sense of belonging to the country and perceptions of security and defensibility, so as to achieve the population's commitment to national defence and support for national service.



REFERENCES

Huxley, T. (2000). Defending the lion city: The Armed Forces of Singapore. NSW, Australia: Allen & Unwin.

Jöreskog, K., & Sörbom, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Chicago, IL: Scientific Software.

Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525-543.

Reise, S. P., Widaman, K. F., & Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychological Bulletin, 114, 552-566.

Soh, S., Tan, C., & Ong, K. C. (2002). Attitudes towards national defence and military service in Singapore. Paper presented at the 44th International Military Testing Association Conference, Ottawa, Canada.

Vroom, V. H. (1964). Work and motivation. New York: Wiley.


MEASURING MILITARY PROFESSIONALISM

Lieutenant-Colonel Peter Bradley, Dr. Danielle Charbonneau, and Lieutenant (Navy) Sarah Campbell
Military Psychology and Leadership Department, Royal Military College of Canada
P.O. Box 17000, Station Forces, Kingston, Ontario, Canada K7K 7B4
Email: charbonneau-d@rmc.ca

This paper reports our progress in developing a measure of military professionalism. We begin with the Huntington (1957) model, the traditionally accepted model of military professionalism in North America. The essence of Huntington's model is that the military officer corps is a profession, like the medical or legal professions, because it embodies the three professional criteria of expertise, responsibility, and corporateness. Where Huntington focuses on the organizational level, "analyzing the character of the modern officer corps" (p. 24), we investigate professionalism at the individual level of analysis by examining the professional attitudes of individual officers, noncommissioned officers, and soldiers.

Expertise. Huntington viewed expertise as specialized knowledge held by the professional practitioner and gained through extensive study of the profession. In the present study we expand Huntington's definition of expertise to include the continuous upgrading of this specialized knowledge.

Responsibility. Huntington conceptualized responsibility as social responsibility, reflecting the extent to which the professional organization provides a service essential to society. Also included in Huntington's definition of responsibility are the requirement for the profession to regulate its members by enforcing professional codes of ethics, and the need for the individual professional to be intrinsically motivated by "love of his craft" and committed to the state by a "sense of social obligation to utilize this craft for the benefit of society" (p. 31).

Corporateness. Central to Huntington's definition of corporateness is the "sense of organic unity and consciousness of themselves [i.e., the professionals] as a group apart from laymen" (p. 26). Huntington refers to corporate structures such as schools, associations, and journals, which serve to develop and regulate the conduct of military professionals. Here it seems that Huntington permits conceptual overlap between responsibility and corporateness, as his definition of each concept makes reference to professional standards and ethics.

The Canadian Forces doctrinal manual on military professionalism (entitled Duty with Honour) emphasizes the pride of the military professional in serving Canada's citizens and institutions. A central element of professionalism outlined in Canadian Army doctrine is the obligation of soldiers "to carry out duties and tasks without regard to fear or danger, and ultimately, to be willing to risk their lives" (Canada's Army, p. 33). For these reasons, we included measures of national pride and risk acceptance in our professionalism measure.

_____________________________________________________________________________
The opinions in this paper are those of the authors and not the Department of National Defence.



Another important influence on our study of professionalism is the work of Hall (1968), who developed a measure of professional attitudes encompassing five dimensions. His first dimension, use of the professional organization as a major referent, reflects the extent to which the individual is influenced by the values, beliefs, and identity of the organization; this dimension may not have a counterpart in the Huntington model. Hall's second dimension, belief in public service, reflects a commitment to benefit society and is therefore similar to Huntington's responsibility. His third dimension, belief in self-regulation, reflects endorsement of the idea that only other professionals are qualified to evaluate a professional's performance; this is similar to aspects of Huntington's corporateness. Hall's fourth dimension, sense of calling, reflects a strong level of intrinsic motivation and is akin to Huntington's responsibility. Hall's fifth dimension, autonomy, refers to the professional's freedom to make decisions about his or her work without external pressures, and is unlike any of Huntington's professional criteria.

An important part of developing a measure of professional attitudes is to examine the relations between professionalism dimensions and important organizational outcomes. We hypothesized that our professionalism scales should be related to attitudinal outcome measures such as organizational citizenship behaviour (OCB), satisfaction, and commitment to the Army.

Method

Overview

We developed five professionalism scales comprising 43 items, by adapting items from Hall's (1968) professionalism inventory and Snizek's (1972) evaluation of the Hall measure, and by writing additional items for the dimensions described below. All items and scales were measured on a 5-point scale, and the survey was administered to Canadian Army personnel at four installations.

Participants

The research sample included 333 personnel, from the rank of private to major. All ranks were represented in the sample, along with most of the occupations in the Army. Females comprised 16% of the sample, and 12% of the sample were officers.

Measures

Expertise. The expertise scale contained 8 items measuring two dimensions. The first reflects the extent to which respondents possess unique knowledge that provides an important contribution to society (Item 1: I think that most members of the Army have unique skills and knowledge that make an important contribution to the Canadian Forces and to society). The second reflects the extent to which they strive to keep this knowledge up to date (Item 5: I keep up-to-date with new developments in the profession of arms).

Responsibility. Measured by an 11-item scale, responsibility is conceptualized as having three dimensions. First, the profession must perform a service to society (Item 10: I always use my skills and knowledge in the best interest of Canadians). Second, individual members of the profession have an obligation to adhere to professional standards in their daily work (Item 15: I would comply with unethical assignments or rules if I were ordered to do so; reverse scored). Third, the profession is a "calling" rather than a job (Item 17: People in the military have a real "sense of calling" for their work).

Corporateness. This 13-item scale focuses on the regulatory practices within the profession which ensure members' competence and ethical behaviour. There are three dimensions to the corporateness construct. First, members must be familiar with and understand the standards of competence and ethical conduct (Item 21: I am aware of the criteria that define competence in the profession of arms). Second, a peer-regulatory system for monitoring competence and conduct must be in place and must be effective (Item 26: It is my duty to take action when I observe another unit member commit unprofessional actions). Lastly, members must be given the autonomy to exercise their professional judgment (Item 30: I don't have much opportunity to exercise my own professional judgment; reverse scored).

National pride. Measured with a 3-item scale, national pride reflects the extent to which military professionals are proud of their nation (Item 34: I am proud of Canadian society and Canadian culture) and proud to be serving their nation (Item 35: I am proud to be a member of Canada's military).

Risk acceptance. Risk acceptance was measured by 8 items, such as Item 36 (I am prepared to put my life at risk to defend Canadian territory) and Item 40 (I am prepared to put my life at risk in peace support operations, e.g., peacekeeping, peace making).

Outcome measures. We measured OCB (i.e., extra-role behaviours) with items adapted from Van Dyne, Graham, and Dienesch (1994) and Podsakoff, MacKenzie, Moorman, and Fetter (1990). We developed 2-item measures of satisfaction with the Army, the unit, and the occupation, along the lines of the satisfaction measure employed by Cotton (1979). We measured commitment with a 6-item measure of Meyer and Allen's (1991) affective commitment, the extent to which individuals identify with their organization because of emotional attachment to it. All outcome items and scales were measured on a 5-point rating scale.

Results and Discussion

Overview

Our analyses focused on two research questions: (a) To what extent are our rationally derived scales supported by psychometric analyses (i.e., scale internal-consistency indices and principal components analyses)? (b) To what extent are the dimensions of professionalism related to important attitudinal outcomes?

Reliability of Professionalism Scales

We calculated Cronbach alpha coefficients for each of our professionalism scales and sub-scales. As shown in Table 1, some of the coefficients are low, indicating that the dimension in question is multidimensional or requires additional items. The responsibility scale is the most problematic in this regard: it has a very low Cronbach alpha for a scale of 11 items (.28), but it also comprises three subordinate dimensions, suggesting that it is properly classified as a multidimensional construct.
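For reference, the following is a minimal sketch (ours, using made-up data in place of the actual survey responses) of the Cronbach alpha computation behind Table 1:

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / variance
# of the total score). The data below are random placeholders standing in
# for the 333 respondents' item responses; only the formula is the point.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
fake_scale = rng.integers(1, 6, size=(333, 8))  # e.g., the 8 expertise items
print(f"alpha = {cronbach_alpha(fake_scale):.2f}")  # near 0 for random data
```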

Table 1
Internal Consistency of Professionalism Scales

Scale / Sub-scale                      Items                   Cronbach Alpha
Expertise                              1-8 (8 items)           .61
  Unique knowledge                     1, 2                    .67
  Maintain knowledge                   3, 4, 5, 6, 7, 8        .56
Responsibility                         9-19 (11 items)         .28
  Service to society                   9, 10, 11, 12, 13       .25
  Adhere to professional standards     14, 15, 16              .22
  Sense of calling                     17, 18, 19              .57
Corporateness                          20-32 (13 items)        .67
  Understand standards of conduct      20, 21, 24, 29          .59
  System of monitoring conduct         22, 23, 25, 26, 27, 28  .66
  Autonomy                             30, 31, 32              .47
National Pride                         33-35 (3 items)         .56
Risk Acceptance                        36-43 (8 items)         .87
Professionalism                        1-43 (43 items)         .78

Structure of the Professionalism Measure

We conducted principal components analyses (PCA), with varimax rotation, on the 43 professionalism items and found that national pride and risk acceptance were the only dimensions in which all items clustered as expected. We determined that a 3-component solution, accounting for 28.7% of the variance, was the most interpretable. As the results in Table 2 show, expertise and responsibility items loaded on three different components. Most of the expertise items relating to maintenance of professional knowledge clustered on Component 2; other expertise items loaded on all components, indicating that we should review our sub-dimensions of expertise and develop additional items for each. Responsibility Items 17, 18, and 19, all relating to professional commitment (i.e., a sense of calling), clustered on Component 2, and responsibility Items 14 and 15, relating to professional standards of ethical behaviour, clustered on Component 3. Responsibility Items 9 and 10, relating to public service, loaded on several components, suggesting that we should re-examine this element of responsibility and possibly develop several additional items for it; the professional standards and sense of calling sub-dimensions might also benefit from additional items. Corporateness items loaded on Components 2 and 3. Except for Items 20 and 21, items relating to competence and ethical behaviour loaded on Component 3, along with the above-mentioned responsibility "ethics" Items 14 and 15. Items 30 and 31 from the autonomy sub-dimension of corporateness clustered on Component 2, but the sole remaining item of this sub-dimension, Item 32, did not.
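The extraction-and-rotation step can be sketched as follows (ours, with a random placeholder in place of the actual 333 x 43 response matrix); the rotation routine is the standard iterative varimax algorithm:

```python
# Principal components analysis with varimax rotation, as used to obtain
# the 3-component solution in Table 2. X is a placeholder for the item
# responses; in the study it would be the 333 x 43 data matrix.
import numpy as np

def varimax(loadings: np.ndarray, tol: float = 1e-6, max_iter: int = 100) -> np.ndarray:
    """Standard varimax rotation of a loading matrix (items x components)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3 - rotated * (rotated**2).sum(axis=0) / p)
        )
        rotation = u @ vt
        new_var = s.sum()
        if new_var < var * (1 + tol):
            break
        var = new_var
    return loadings @ rotation

rng = np.random.default_rng(0)
X = rng.normal(size=(333, 43))            # placeholder responses
R = np.corrcoef(X, rowvar=False)          # 43 x 43 item correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:3]     # retain the 3 largest components
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
rotated = varimax(loadings)
print(np.round(rotated[:5], 3))           # loadings for the first 5 items
```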

Table 2
Professionalism Scales: 3-Component Solution

Item (Construct): Rotated loadings
1 (Exp): .496
2 (Exp): .341
3 (Exp): .398
4 (Exp): .306
5 (Exp): .367
6 (Exp): -
7 (Exp): .326
8 (Exp): .437
9 (Resp): .329, .364, .344
10 (Resp): .345, .365
11 (Resp): -
12 (Resp): -
13 (Resp): -
14 (Resp): .307
15 (Resp): .329
16 (Resp): -
17 (Resp): .622
18 (Resp): .699
19 (Resp): .371
20 (Corp): .429
21 (Corp): .347, .313
22 (Corp): .599
23 (Corp): .599
24 (Corp): .481
25 (Corp): .524
26 (Corp): .566
27 (Corp): .359
28 (Corp): .507
29 (Corp): -
30 (Corp): .371
31 (Corp): .486
32 (Corp): -
33 (Pride): .586
34 (Pride): .489
35 (Pride): .302, .405
36 (Risk Acc): .646
37 (Risk Acc): .651
38 (Risk Acc): .776
39 (Risk Acc): .730
40 (Risk Acc): .748
41 (Risk Acc): .729
42 (Risk Acc): .723
43 (Risk Acc): .710

Note. Exp = expertise, Resp = responsibility, Corp = corporateness, Pride = national pride, and Risk Acc = risk acceptance. A dash indicates that no loading was reported for the item.

Overall, the results depicted in Tables 1 and 2 show that risk acceptance has the strongest psychometric properties of all our professionalism scales, and responsibility the weakest. The conceptual definitions and items representing expertise, responsibility, and corporateness need to be reviewed, and these scales likely require additional items and further psychometric evaluation.

Professionalism-Outcome Relations

As shown in Table 3, we found many positive, statistically significant correlations between our professionalism scales and the attitudinal outcome measures. For example, professionalism (operationalized as the sum of expertise, responsibility, corporateness, national pride, and risk acceptance) correlated .59 with OCB, .37 with overall satisfaction, and .37 with commitment. Risk acceptance had lower, albeit positive and statistically significant, correlations with the attitudinal outcome measures; the correlations between national pride and the outcome measures were stronger.
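For completeness, a minimal sketch (ours, with placeholder scale scores) of how an intercorrelation matrix such as Table 3 is produced once scale scores have been computed:

```python
# Pearson intercorrelations among scale scores, as in Table 3. The
# DataFrame holds placeholder scores for 333 respondents; in the actual
# study these would be sums or means of the items in each scale.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
scores = pd.DataFrame(
    rng.normal(size=(333, 4)),
    columns=["expertise", "responsibility", "corporateness", "risk_acceptance"],
)
# Professionalism operationalized as the sum of its component scales.
scores["professionalism"] = scores.sum(axis=1)

print(scores.corr().round(2))  # correlation matrix of all scale scores
```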



Table 3
Intercorrelations Among Professionalism and Outcome Measures

                    1     2     3     4     5     6     7     8     9     10    11
1  Professionalism
2  Expertise        .70
3  Responsibility   .71   .37
4  Corporateness    .78   .47   .45
5  Risk Acceptance  .20   .28   .16   ns
6  National Pride   .75   .28   .32   .48   ns
7  OCB              .59   .59   .36   .53   .27   .33
8  Satisfaction     .37   .28   .27   .29   .11   .28   .42
9  Sat Army         .32   .30   .22   .22   .31   .21   .46   .67
10 Sat Unit         .23   .15   .23   .15   ns    .18   .23   .80   .28
11 Sat MOC          .32   .24   .18   .31   ns    .28   .33   .85   .38   .53
12 Commitment       .33   .31   .19   .32   .26   .19   .51   .44   .58   .21   .29

Mean                3.52  3.61  3.40  3.52  4.01  3.57  3.69  3.48  3.73  3.33  3.37  3.41
Standard Dev.       .35   .45   .44   .41   .71   .59   .40   .83   .92   1.11  1.15  .73

Note. ns = not statistically significant; all other coefficients are significant at p < .001.

Conclusion

This paper presents the early stages of our work on measuring professional military attitudes. Some of our scales have sound psychometric properties, and all scales correlate with meaningful attitudinal outcome measures. Future research should focus on expanding the professionalism model to include other variables (such as Hall's [1968] use of the professional organization as a major referent) and on improving the psychometric quality of the existing scales with additional items.

References

Canada's Army. Canadian Forces Publication B-GL-300-000/FP-000. 1 April 1998.

Cotton, C. A. (1979). Military attitudes and values of the army in Canada (Report 79-5). Willowdale, Ontario: Canadian Forces Personnel Applied Research Unit.

Duty with Honour. Canadian Forces Publication A-PA-005-000/AP-001. 2003.

Hall, R. H. (1968). Professionalization and bureaucratization. American Sociological Review, 33, 92-104.

Huntington, S. P. (1957). Officership as a profession. In S. P. Huntington, The soldier and the state. Cambridge, MA: The Belknap Press of Harvard University Press.

Meyer, J. P., & Allen, N. J. (1991). A three-component conceptualization of organizational commitment. Human Resource Management Review, 1, 61-98.

Podsakoff, P. M., MacKenzie, S. B., Moorman, R. H., & Fetter, R. (1990). Transformational leader behaviours, and their effects on followers' trust in leader, satisfaction, and organizational citizenship behaviours. Leadership Quarterly, 1, 107-142.

Snizek, W. E. (1972). Hall's professionalism scale: An empirical reassessment. American Sociological Review, 37, 109-114.

Van Dyne, L., Graham, J. W., & Dienesch, R. M. (1994). Organisational citizenship behaviour: Construct redefinition, measurement and validation. Academy of Management Journal, 37, 765-802.


DEOCS: A New and Improved MEOCS

Stephen A. Truhon
Department of Social Sciences
Winston-Salem State University
Winston-Salem, NC 27110
truhons@wssu.edu

ABSTRACT

The Defense Equal Opportunity Management Institute (DEOMI) has studied equal opportunity through the use of the Military Equal Opportunity Climate Survey (MEOCS) for more than a decade. In the process of updating the MEOCS, a new version called the DEOCS (DEOMI Equal Opportunity Climate Survey) has been developed, which uses items from the MEOCS-EEO (Equal Employment Opportunity version) that have been neutralized (i.e., direct references to a majority and minority race and to male and female gender have been removed). A three-step process was used to compare the DEOCS items with their counterparts in the MEOCS-EEO. Item response theory (IRT) analyses using the MULTILOG program were performed to calculate difficulty and discrimination parameters. These parameters were matched to a common scale through the use of the EQUATE program. Differential item functioning (DIF) analyses were then performed with the DFIT and SIBTEST programs to detect any item bias. Results showed that few items displayed item bias (i.e., DIF), usually when those items had been extensively reworded for the DEOCS. In most cases where items displayed DIF, the DEOCS version had superior psychometric properties compared with the MEOCS-EEO version.

INTRODUCTION

A major research project for the Defense Equal Opportunity Management Institute (DEOMI) has been the development and testing of the Military Equal Opportunity Climate Survey (MEOCS; Landis, Dansby, & Faley, 1993). This project includes revising the MEOCS and keeping it up to date.

Suggested revisions to the MEOCS have included shortening it and making its items more neutral (i.e., replacing references to "majority," "minority," "men," and "women" with the more general terms "race" and "gender," and then using demographic information to determine the respondent's specific race and gender). Various methods for shortening the MEOCS have been examined, including confirmatory factor analysis (McIntyre, 1999), cluster analysis (Truhon, 1999), and item response theory (IRT; Truhon, 2000, 2002).

Fifty-one items from the MEOCS-EEO have been rewritten to be more neutral in the DEOCS. The purpose of the current study was to compare the revised items with their original versions through the use of IRT and DIF.



METHOD

PARTICIPANTS

The DEOCS had been administered to 522 participants at the time of the current study. A random sample of 522 respondents to the MEOCS-EEO was selected for comparison.

MATERIALS

Items from 14 scales of the MEOCS-EEO had been revised for the DEOCS: Sexual Harassment and Discrimination, Differential Command Behavior toward Minorities and Women, Positive Equal Opportunity (EO) Behavior, Racist Behavior, Religious Discrimination, Disability Discrimination, Age Discrimination, Commitment, Trust in the Organization, Effectiveness, Work Group Cohesion, Leadership Cohesion, Satisfaction, and General EO Climate. Two to five items were chosen from each scale, items that previous research had shown to have good psychometric qualities (i.e., item-total correlations, reliability, and discriminability).

PROCEDURE

A three-step process was followed in these analyses. First, Thissen's (1991, 2003) MULTILOG program was used to obtain item discrimination and difficulty parameters (the a's and b's) for the MEOCS-EEO and the DEOCS. Because these parameters for the two versions were calculated separately, a common metric was needed, so Baker's (1995) EQUATE program was then used to link the two versions. For each of the scales, the parameters from the revised form (the DEOCS) were equated to those of the MEOCS-EEO; the transformation constants (A and K) were obtained in this step. Finally, following the transformation, DIF analyses were performed using Raju, van der Linden, and Fleer's (1995) DFIT program adapted for polytomous items (Flowers, Oshima, & Raju, 1999; Raju, 2001) and Shealy and Stout's (1993) SIBTEST program adapted for polytomous items (Chang, Mazzeo, & Roussos, 1996).
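The linking step can be illustrated with the mean/sigma method, one common way of obtaining the transformation constants A and K from the item difficulties of two separate calibrations. The sketch below is ours, with placeholder parameter values; EQUATE implements its own procedures, so this is illustrative only:

```python
# Mean/sigma linking of two IRT calibrations. If theta* = A*theta + K,
# then difficulties transform as b* = A*b + K and discriminations as
# a* = a / A. A and K are chosen so the transformed DEOCS difficulties
# match the mean and spread of the MEOCS-EEO difficulties.
import numpy as np

# Placeholder difficulty/discrimination estimates for the same items.
b_meocs = np.array([-1.20, -0.40, 0.35, 1.10])   # reference metric
b_deocs = np.array([-1.05, -0.30, 0.50, 1.30])   # metric to be transformed
a_deocs = np.array([1.40, 1.10, 0.95, 1.25])

A = b_meocs.std(ddof=1) / b_deocs.std(ddof=1)
K = b_meocs.mean() - A * b_deocs.mean()

b_linked = A * b_deocs + K   # difficulties on the MEOCS-EEO metric
a_linked = a_deocs / A       # discriminations on the MEOCS-EEO metric
print(f"A = {A:.3f}, K = {K:.3f}")
print("linked b:", np.round(b_linked, 3))
```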

RESULTS

A summary of the results can be seen in Table 1, which lists the scales, the number of items in each scale, whether any items were reworded for the DEOCS, and whether DIF was detected by DFIT or by SIBTEST.

SIBTEST appears to be more sensitive to possible DIF than DFIT. Examination of BRFs in this report suggests that SIBTEST is overly sensitive, as also appeared to be true in previous research (Truhon, 2002). Chang et al. (1996) have reported that their polytomous adaptation of SIBTEST is more likely to exhibit Type I error when there is nonuniform DIF, which occurs with many items. This would help to explain the seeming contradiction with Bolt's (2002) finding that SIBTEST had less power than DFIT.

Whether one uses the stricter criteria for DIF in DFIT or the looser criteria in SIBTEST, it is noteworthy that there is more DIF among the reworded items than among the items whose wording was left unchanged. The reworded items allow respondents a different interpretation compared with the original versions; for example, in the reworded version, sexual harassment can involve women harassing men, and racist behavior can involve nonwhites discriminating against whites. Overall, this suggests that the DEOCS has kept the essential qualities of the MEOCS-EEO and in many cases improved upon them.
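To make the DFIT logic concrete, the following is a rough sketch (ours, not the DFIT program itself) of a noncompensatory DIF (NCDIF) index for a single graded-response item: the mean squared difference between the item's expected scores under the focal-group and reference-group parameter estimates, averaged over the focal group's trait distribution. All parameter values are placeholders:

```python
# Simplified NCDIF-style index for a graded response model (GRM) item.
# The expected item score under each group's parameters is compared
# across the focal group's theta values.
import numpy as np

def grm_expected_score(theta, a, b):
    """Expected score of a GRM item; b holds the ordered thresholds."""
    theta = np.asarray(theta)[:, None]
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b))))  # P(X >= k)
    return p_star.sum(axis=1)  # sum of step probabilities = expected score

rng = np.random.default_rng(1)
theta_focal = rng.normal(size=5000)  # focal group's trait distribution

# Placeholder item parameters (after linking to a common metric).
es_ref = grm_expected_score(theta_focal, a=1.2, b=[-1.0, 0.0, 1.0])
es_foc = grm_expected_score(theta_focal, a=1.2, b=[-0.7, 0.3, 1.3])

ncdif = np.mean((es_foc - es_ref) ** 2)
print(f"NCDIF-style index = {ncdif:.4f}")  # larger values suggest DIF
```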

TABLE 1
Summary of Differential Item Functioning Results for 14 Scales

Scale                                   No. of Items  Reworded   DIF (DFIT)      DIF (SIBTEST)
Sexual Harassment and Discrimination    4             All        None            DEOCS 13, 14, 15
Differential Command Behavior toward
  Minorities and Women                  4             All        DEOCS 4         DEOCS 4, 6
Positive EO Behavior                    4             All        None            DEOCS 2, 8
Racist Behavior                         3             DEOCS 1    DEOCS 12        DEOCS 1, 11, 12
Religious Discrimination                3             None       None            DEOCS 16
Age Discrimination                      3             None       None            None
Disability Discrimination               3             None       None            None
Commitment                              5             None       None            DEOCS 25, 28
Trust in the Organization               3             None       None            None
Perceived Work Group Effectiveness      4             DEOCS 35   None            DEOCS 34
Work Group Cohesion                     4             None       None            DEOCS 38, 39
Leadership Cohesion                     4             None       None            None
Job Satisfaction                        5             None       None            DEOCS 48
Overall EO Climate                      2             None       Not determined  Not determined

DISCUSSION

While equating was used in the current study to complete the DIF analyses, equating can also be done for tests as a whole. Most equating work, however, has been limited to dichotomous tests (Kolen & Brennan, 1995); these procedures should be extended to polytomous tests like the MEOCS. Equating tests would allow researchers to make use of the large database collected on earlier versions while developing newer versions such as the DEOCS.

There is a link between test construction and test equating, as illustrated in a statement by Mislevy (1992): "Test construction and equating are inseparable. When they are applied in concert, equated scores from parallel test forms provide virtually exchangeable evidence about students' behavior on the same general domain of tasks, under the same specified standardized conditions. When equating works, it is because of the way the tests are constructed…" (italics in original, p. 37; cited in Kolen & Brennan, 1995, p. 246).

Dodd, De Ayala, and Koch (1995) have described a procedure that uses category response functions (CRFs) to calculate item information functions (IIFs). These IIFs can be added together to produce test information functions (TIFs). Wu (2000) developed a model management system that uses TIFs to construct comparable tests. Such equating procedures could be used to develop comparable forms of the DEOCS; in this way, IRT can be used to build alternate forms of the MEOCS and DEOCS (see Cortina, 2001), rather than comparing forms after the fact, as was done in this study.
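The additivity idea can be sketched briefly; the example below (ours) uses the dichotomous two-parameter model for compactness, though the DEOCS items are polytomous and would sum category information in the same way. Parameter values are placeholders:

```python
# Test information as the sum of item information functions. For a 2PL
# item, I(theta) = a^2 * P(theta) * (1 - P(theta)). Forms with matched
# TIFs support the assembly of comparable tests.
import numpy as np

def item_information(theta, a, b):
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

theta = np.linspace(-3, 3, 121)
items = [(1.4, -0.5), (1.0, 0.0), (1.2, 0.8)]  # placeholder (a, b) pairs

tif = sum(item_information(theta, a, b) for a, b in items)
print(f"peak information {tif.max():.2f} at theta = {theta[tif.argmax()]:.2f}")
```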

The kinds of analyses described above set future directions for the DEOCS; they are frequently used in computerized adaptive testing (CAT). Most CAT work has been done in the area of ability testing, but CAT has also been successfully applied to attitude testing (Koch & Dodd, 1990). A CAT version of the DEOCS could establish a person's response level on the different scales of the DEOCS with a minimal number of items.

Finally, the DEOCS has begun to go online. DIF analyses should be used to compare online responses to the DEOCS with paper-and-pencil responses. While previous DIF research suggests that administration format does not make a difference for attitude scales (Donovan, Drasgow, & Probst, 2000) or evaluations (Penny, 2003), it would be useful to verify this.

REFERENCES

Baker, F. B. (1995). EQUATE 2.1: A computer program for equating two metrics in item response theory [Computer program]. Madison: University of Wisconsin, Laboratory of Experimental Design.

Bolt, D. M. (2002). A Monte Carlo comparison of parametric and nonparametric polytomous DIF detection methods. Applied Measurement in Education, 15, 113-141.

Chang, H., Mazzeo, J., & Roussos, L. (1996). Detecting DIF for polytomously scored items: An adaptation of the SIBTEST procedure. Journal of Educational Measurement, 33, 333-353.

Cortina, L. M. (2001). Assessing sexual harassment among Latinas: Development of an instrument. Cultural Diversity and Ethnic Minority Psychology, 7, 164-181.

Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19, 5-22.

Donovan, M. A., Drasgow, F., & Probst, T. M. (2000). Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight. Journal of Applied Psychology, 85, 305-313.

Flowers, C. P., Oshima, T. C., & Raju, N. S. (1999). A description and demonstration of the polytomous-DFIT framework. Applied Psychological Measurement, 23, 309-326.

Koch, W. R., & Dodd, B. G. (1990). Computerized adaptive measurements of attitudes. Measurement and Evaluation in Counseling and Development, 23, 20-30.

Kolen, M. J., & Brennan, R. L. (1995). Test equating: Methods and practices. New York: Springer.

Landis, D., Dansby, M. R., & Faley, R. H. (1993). The Military Equal Opportunity Climate Survey: An example of surveying in organizations. In P. Rosenfeld, J. E. Edwards, & M. D. Thomas (Eds.), Improving organizational surveys: New directions, methods, and applications (pp. 122-142). Newbury Park, CA: Sage.

McIntyre, R. M. (1999). A confirmatory factor analysis of the Military Equal Opportunity Climate Survey, Version 2.3 (DEOMI Research Series Pamphlet 99-5). Patrick AFB, FL: Defense Equal Opportunity Management Institute.

Mislevy, R. J. (1992). Linking educational assessments: Concepts, issues, methods, and prospects. Princeton, NJ: ETS Policy Information Center.

Penny, J. A. (2003). Exploring differential item functioning in a 360-degree assessment: Rater source and method of delivery. Organizational Research Methods, 6, 61-79.

Raju, N. S. (2001). DFITPS6: A Fortran program for calculating polytomous DIF/DTF [Computer program]. Chicago: Illinois Institute of Technology.

Raju, N. S., van der Linden, W. J., & Fleer, P. F. (1995). IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19, 353-368.

Shealy, R., & Stout, W. F. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159-164.

Thissen, D. (1991). MULTILOG user's guide (Version 6.0). Lincolnwood, IL: Scientific Software.

Thissen, D. (2003). MULTILOG 7.03 for Windows [Computer program]. Lincolnwood, IL: Scientific Software International.

Truhon, S. A. (1999). Updating the MEOCS using cluster analysis and reliability (DEOMI Research Series Pamphlet 99-8). Patrick AFB, FL: Defense Equal Opportunity Management Institute.

Truhon, S. A. (2000). Shortening the MEOCS using item response theory (DEOMI Research Series Pamphlet 00-8). Patrick AFB, FL: Defense Equal Opportunity Management Institute.

Truhon, S. A. (2002). Comparing two versions of the MEOCS using differential item functioning (DEOMI Research Series Pamphlet 02-7). Patrick AFB, FL: Defense Equal Opportunity Management Institute.

Wu, I.-L. (2000). Model management system for IRT-based test construction decision support system. Decision Support Systems, 27, 443-458.


ADDRESSING PSYCHOLOGICAL STATE AND WORKPLACE BEHAVIORS OF DOWNSIZING SURVIVORS

Rita Shaw Rone
Naval Education and Training Command Headquarters
Pensacola, Florida, USA
rsrone1@mchsi.com

According to Burke and Nelson (1998), 10.8 million people lost their jobs in the United States between 1981 and 1988. In the Department of Defense alone, thousands of civilian employees left federal service during the 1990s due to base closures and the realignment of functions. During this period, a great deal of research focused on the study of displaced workers, whose circumstances certainly render them deserving of attention. Just as important, however, are the employees left behind, i.e., the organizational "survivors." If organizational leaders expect the dramatic changes they have made to organizational structures to yield the efficiencies they hope for, they must ensure that they have adequately assessed the impact of downsizing on these survivors and developed strategies for addressing it.

Organizational Reality

The organizational changes spawning recent downsizing have been characterized as revolutionary in nature. Hamel (2000) maintains that organizations are on the threshold of a new age and that such a "revolution" (p. 4) brings with it an anxiety clearly felt by those left behind. Gowing, Kraft, and Quick (1998) assert that it is a "dramatic, systemic revolution" similar to America's agricultural revolution, which began in the late 18th century and ended in the early 20th century. These new realities are reflected in the new employment contract, also known as the psychological contract, which comprises the "individual beliefs, shaped by the organization, regarding terms of an exchange between individuals and their organizations" (Rousseau, 1995, p. 9). Bunker (1997) indicates that the old contract, "founded on the exchange of hard work and loyalty for lifetime employment, has been repeatedly violated and is probably permanently undermined" (p. 122).

Impact on Survivors

In a comprehensive meta-analysis of the effects of downsizing on survivors, West (2000) posits that downsizing has a harmful impact on survivors, affecting both the survivor's psychological state and, through it, his or her behavior on the job. Layoffs, according to West, have the clear potential "to affect survivors' psychological states, which, in turn, have the potential to influence a variety of work behaviors and attitudes" (p. 8). The term "layoff survivor sickness," coined by Noer (1998), describes the feelings, attitudes, and behaviors observed in research on this subject. Noer indicates that certain feelings are typical of the survivor: (a) fear, insecurity, and uncertainty; (b) frustration, resentment, and anger; (c) sadness, depression, and guilt; and (d) unfairness, betrayal, and distrust.

Another feeling often expressed by survivors is a perceived loss of control. According to Spreitzer and Mishra (1997), survivors may seem helpless and alienated and may feel somewhat disconnected from current management officials, identifying more closely with co-workers and friends who have left the organization. Another concern some employees face is the loss of identity. If, for example, one's status in the organization changes, this could trigger extreme feelings of loss, as some employees feel defined by the work they do or the position they hold (Stearns, 1995).

According to Noer (1998), survivors in a downsized environment often cope via nonproductive behaviors on the job: (a) aversion to risk and a more rigid approach to work; (b) a drop in productivity due to stress, with a longer-term effect continuing due to a loss of "work spirit" (p. 213); (c) a demand for information; (d) a tendency to blame others; and (e) denial that the change causing the downsizing is real. Evidence of these behaviors is echoed in other research in this area: Parker, Chmiel, and Wall (1997) cite several studies supporting the notion that survivors become less committed, experience greater strain, enjoy their jobs less, and are absent more often.

Challenges to Organizational Leaders

It is not surprising to find high levels of job dissatisfaction and insecurity among organizational survivors. In a 1995 survey of 4,300 American workers, only 57 percent of those in downsizing companies indicated that they were satisfied with their jobs (Woodruff, 1995), and a negative correlation has been found between downsizing and morale and productivity variables (Duron, 1994). There is good news, however: research indicates that specific actions by organizational leaders can make a difference. Wagner (1998) asserts that "management methods in implementing downsizing may tend to positively or negatively affect the impact of downsizing on productivity, morale, and organizational perception" (p. 3).

First, managers can involve employees in the decision-making process and attempt to make the work as interesting as possible, actions that help combat feelings of helplessness and apathy. Second, managers should communicate expectations to employees; as simplistic as it may sound, making sure employees know what to do and how to do it is imperative in creating a feeling of competence. A third recommendation from Woodruff (1995) is that managers reduce as many administrative irritants as possible, since anxious and mistrustful survivors may perceive seemingly unimportant situations as significant and threatening. Managers can also encourage teamwork: the changed landscape of the workplace may have broken up old alliances as valued friends retired or were separated from employment, and working groups based on new alliances may serve as catalysts for a healing process. Finally, organizational management must treat survivors as individuals; one-on-one interaction between supervisory personnel and employees is essential to employees' feeling valued.

Parker, Chmiel, and Wall (1997), in a longitudinal study of strategic downsizing, found that specific actions to improve work characteristics could have a long-term positive effect on survivors, even when work demands increase. This apparently results from increases in control, clarity, and participation, all states associated with improved well-being. Managers should explore the kinds of initiatives that might yield such improvements and implement them.

The importance of and need for better communication is a recurring theme in the downsizing literature; this is a critical responsibility of management. In such times, the need for information becomes an almost frantic quest. Organizational leadership often tends to "hold back" information during reorganization out of concern that telling employees too much will somehow be detrimental; such action simply creates more concern and fear. Managers should also solicit feedback, give it legitimate consideration, and act upon it (Kotter, 1996). When organizational leaders do this, they foster a sense of well-being in the survivors and, as a bonus, receive information that can be useful in the change effort.

A less obvious, but still important, step managers can take is leading by example. Kotter (1996) indicates that nothing undermines the communication of a change vision so thoroughly as key managers behaving inconsistently with the new organizational vision. Managers must act responsibly, communicate both up and down, and foster effective teamwork, behaviors for their workers to emulate. They must work toward a "positive emotional perspective toward the work," endeavoring to keep people optimistic (Hiam, 2002); in doing so, they can enhance problem solving, listening, and conflict resolution. Bunker (1997) also indicates that managers must assess and accept their own emotional responses and vulnerability and model healthy coping behaviors, actions that can enhance both the organization's and the survivor's recovery.

A key problem for organizational leadership in times of downsizing is employee motivation. There are actions that can move survivors toward a more positive motivational climate (Woodruff, 1995). One key focus should be intrinsic motivation, i.e., motivation that, according to Richter (2001), occurs because an employee is passionate about his or her task. Such motivation is brought about by fostering a climate that enables employees' intrinsic motivation. Richter indicates that such an environment is one in which employees feel competent, in control, and emotionally tied to others, all feelings that seem in direct conflict with what survivors typically experience. Similar to intrinsic motivation is "internal commitment," as discussed by Carter (1999, p. 105), a behavior evident when employees are committed to a particular project, person, or program for their own reasons; again, their motivation comes from within. According to Carter, this type of commitment is closely tied to empowerment, and to realize it managers must involve employees in defining work objectives and planning how to achieve them. Intrinsic motivation also has a strong connection to empowerment. Spreitzer and Mishra (1997) indicate that empowerment is an effective way for managers to rebuild survivors' trust by displaying their own trust in the survivors. Such empowerment is important for other reasons as well: it is a prerequisite to risk-taking, a behavior typically abandoned by survivors, and it "reflects a proactive orientation to one's work" as well as "a sense of meaning, competence, self-determination, and impact" (Spreitzer & Mishra, 1997, p. 6). Managers must also commit themselves to other behaviors associated with intrinsic motivation: they must attempt to provide employees a sense of choice, competence, meaningfulness, and progress (Richter, 2001). This could generate a lasting effect, as it relates well to the control, clarity, and participation cited in the longitudinal study of Parker, Chmiel, and Wall (1997). According to the self-determination theory of Deci and Ryan (1985), individuals have a universal need to be autonomous and competent; any actions managers can take to move employees toward these states should foster intrinsic motivation and, eventually, benefit the goals of the organization.

A focus on motivation should not totally exclude external motivation, typically viewed as occurring when an employee performs a task in response to an external force, either positive (e.g., expectation of a bonus) or negative (e.g., fear of censure). In the aftermath of a downsizing effort, managers can and should use reward systems to recognize outstanding performance, when appropriate, and should also provide positive reinforcement in less formal ways. Such recognition, however small, may be the "lift" the survivor needs to feel worthy and competent again (Woodruff, 1995).

Conclusion

Until organizational survivors' psychological needs are met, they may operate within their organizations in states of ill health, disarray, and helplessness. Organizational leaders and managers must assess and attend to the impact of downsizing on these survivors. They must face the fact that survivors of organizational downsizing typically experience numerous, specific psychological changes affecting their feelings and, thus, their behavior in the workplace. Research describes the general patterns of behavior common to a downsizing environment, patterns that can be addressed through management action. Such actions, e.g., empowerment, the rebuilding of trust, leading by example, and continuous, truthful two-way communication, can often mitigate the psychological damage of the downsizing experience, helping the survivors move back into a healthy, productive state.

Managers will find the challenge of motivating these employees daunting, primarily because a post-downsizing environment typically produces thought patterns opposite to those found in an environment that fosters intrinsic motivation. Nevertheless, managers must rise to this challenge and do the difficult work of sharing power, learning what motivates individual employees and acting on it, and listening better. They must also embrace the concept of positive reinforcement and use it appropriately. Often the primary focus of downsizing is gaining efficiencies through cost reduction, and those in charge may see employees who have survived downsizing as the fortunate ones; as such, they may feel that initiatives addressing how these particular employees "feel" should be relegated to the background. In such instances, an approach more clearly connected to organizational goals may be necessary: leaders may be more receptive to suggestions for initiatives aimed at alleviating survivors' pain and discontent if they perceive an authentic connection between those initiatives and the motivation, performance, and success of the survivors in helping the organization realize its future goals. Continued research in this important area, and the communication of its results to organizational leaders, is the right course of action to ensure that survivors do not become the forgotten victims of downsizing.



Reference List

Bunker, K.A. (1997). The power of vulnerability in contemporary leadership. Consulting Psychology Journal, 49, 122-136.

Burke, R.J., & Nelson, D. (1998). Mergers and acquisitions, downsizing, and privatization: A North American perspective. In M.K. Gowing, J.D. Kraft, & J.C. Quick (Eds.), The new organizational reality: Downsizing, restructuring, and revitalization. Washington, D.C.: American Psychological Association.

Carter, T. (1999). The aftermath of reengineering: Downsizing and corporate performance. New York: The Haworth Press.

Deci, E.L., & Ryan, R.M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum.

Duron, S.A. (1994). The reality of downsizing: What are the productivity outcomes? (Doctoral dissertation, Golden State University, 1994). Dissertation Abstracts International, 54, 4953.

Gowing, M.K., Kraft, J.D., & Quick, J.C. (1998). The new organizational reality. Washington, D.C.: American Psychological Association.

Hamel, G. (2000). Leading the revolution. Boston: Harvard Business School Press.

Heckscher, C. (1995). White-collar blues: Management loyalties in an age of corporate restructuring. New York: Basic Books.

Hiam, A. (1999). Motivating & rewarding employees: New and better ways to inspire your people. Holbrook, MA: Adams Media Corporation.

Hiam, A. (2002, October). Motivation: Recognition tips. Transaction World, II. Retrieved February 25, 2003, from http://www.transactionworld.com/articles/2002/october/motivation1.asp

Kotter, J.P. (1996). Leading change. Boston: Harvard Business School Press.

Noer, D. (1998). Layoff survivor sickness: What it is and what to do about it. In M.K. Gowing, J.D. Kraft, & J.C. Quick (Eds.), The new organizational reality: Downsizing, restructuring, and revitalization. Washington, D.C.: American Psychological Association.

Noer, D. (1999). Helping organizations change: Coping with downsizing, mergers, reengineering, and reorganizations. In A.I. Kraut & A.K. Korman (Eds.), Evolving practices in human resource management. San Francisco: Jossey-Bass Publishers.

Parker, S.K., Chmiel, N., & Wall, T.D. (1997). Work characteristics and employee well-being within a context of strategic downsizing. Journal of Occupational Health Psychology, 2, 289-303.

Peters, T. (1987). Thriving on chaos. New York: Harper & Row.

Richter, M.S. (2001). Creating intrinsically motivating environments: A motivation system. StoryNet. Retrieved February 25, 2003, from http://www.thestorynet.com/articles_essays/motivation.htm

Rousseau, D.M. (1995). Psychological contracts in organizations: Understanding written and unwritten agreements. Thousand Oaks, CA: Sage Publications.

Schweiger, D.M., Ivancevich, J.M., & Power, F.R. (1987). Executive actions for managing human resources before and after acquisition. Academy of Management Executive, 1, 127-138.

Stearns, A.K. (1995). Living through job loss. New York: Simon & Schuster.

Wagner, J. (1998). Downsizing effects on organizational development capabilities at an electric utility. Journal of Industrial Technology, 15, 2-7. Retrieved February 20, 2003, from http://www.nait.org/jit/Articles/wagn1198.pdf

West, G.B. (2000). The effects of downsizing on survivors: A meta-analysis (Doctoral dissertation, Virginia Polytechnic Institute and State University, 2000). Retrieved January 25, 2003, from http://scholar.lib.vt.edu/theses/available/etd-04202000-14520000/unrestricted/new-etd.pdf

Woodruff, D.W. (1995). Motivation: After the downsizing. Hydrocarbon Processing, 74, 131-135.



VALIDATION OF THE BELGIAN MILITARY PILOT SELECTION TEST BATTERY

Yves A. Devriendt, psychologist, and Cathérine A. Levaux, psychologist
Belgian Department of Defence
General Directorate Human Resources
Kwartier Koningin Astrid
Bruynstraat
1120 Neder-Over-Heembeek, Brussels, Belgium
yves.devriendt@mil.be

PROBLEM DEFINITION

Pilot selection is important for several reasons: fast aircraft are difficult to handle, resources such as fuel can be wasted, and human error can cause serious damage and accidents. Over the last decade, budgetary restrictions have made avoiding the waste of resources even more important.

In the Belgian Defence Forces most of the pilot tests are computer-based, and according to Hilton and Dolgin (1991) the realism of the test environment with regard to the working environment can be a catalyst for test predictivity.

Nevertheless, increasing failure rates have recently called the effectiveness of the Belgian military pilot selection system into question. Psychologists were asked to conduct validity studies in order to replace poorly performing tests. When training wastage is observed, it is not unusual to ask specialists to conduct validation studies, as Caretta and Ree (1997) report.

Little is known about the validity of the current Military Pilot Selection Battery (MPSB), due to a lack of interest in applied research on the part of policy makers and a shortage of large data sets. Until now there has been no real psychotechnical tradition in the domain of pilot selection, but with recent reorganisations and the creation of a General Directorate of Human Resources, including a Research & Technology Department, things may change for the better.

In addition to the psychotechnical aspect, there are technical and practical reasons to replace at least part of the MPSB. The battery runs in an outdated computer environment: spare parts are becoming rare and expensive, and the programming languages will have to be adapted to meet modern standards. Furthermore, the pilot test battery will have to be relocated in 2006, owing to the policy of centralising all selection activities.

The authors give an overview of the procedures used to validate the MPSB and the results obtained. Procedures, results, and areas of future research are discussed.

METHOD

Participants

The participants in the studies were Flemish and Walloon auxiliary pilots, both male and female, who took part in the selection procedures between 1996 and 2000. Auxiliary pilots do not start an academic education like the candidate pilots entering the Royal Military Academy; instead they receive an adapted theoretical and mainly practically oriented training. Their contracts are limited in time, but it is possible to obtain a lifelong contract by succeeding in selection procedures later in their careers. Data were available on applicants (n = 2044) and pilot trainees (n = 129). Applicants were administered the MPSB; some passed the test, others did not. Pilot trainees were selected by means of their MPSB scores, and all of them had passed the test battery.

Measures

Independent variables or predictors

The MPSB comprised a Psychomotor Battery (PM), a Pilot Skills Battery (PS), academic tests, physical tests, a psychological interview, and a professional interview.

The PM consisted of three tests. First, there was a co-ordination test with two parts, C_COORD and T_COORD. In the C_COORD (parts 1 and 2) the candidate must keep a moving ball within the limits of a square; in the T_COORD (parts 1 and 2) the candidate must either react or not when a triangle appears. Second, there was a discrimination test producing three scores: G_DISCR (number of good responses, parts 1 and 2), TR_DISCR (reaction time), and RI_DISCR (number of errors). During this test the candidate must discriminate between coloured squares and react in an appropriate way. Third, there was the number of good responses on the RS_506, a test measuring reasoning, spatial orientation, and visual representation.

Test name | Kind of test | Description
Reaction time test | Reaction | Number of good responses as an answer to different stimuli
Reaction time test | Reaction time | Time needed to respond as an answer to different stimuli
ABL 304 | Visual memory, spatial orientation, organisation of the learning process | Number of good responses in memorising a geographic map
ABL 17 | Reasoning | Number of good responses in completing series of stimuli
ABL 152 | Visual memory, concentration, learning method | Number of good responses in making associations between stimuli
Cubes test | Three dimensional visualisation and reasoning | Number of good responses in representing three dimensional figures
Digit recall | Short term memory | Number of good responses in reproducing series of digits
Arithmetic | Numerical reasoning | Number of good responses in detecting principles in series of numbers
Manikin | Lateral orientation | Number of good responses in localising positions
Spiral | Motor co-ordination | Time needed to track a wire without touching the wire

Figure 1. Composition of the Pilot Skills Test Battery

The PS Battery contained the nine subtests described in Figure 1. It should be noted that the ABL 152 comprised four distinct timed series, each given with the same instructions, so that an applicant's learning progression could be observed; digit recall consisted of two parts. The academic part of the MPSB was conceived to measure skills in mathematics, physics, sciences, and languages (mother tongue and English as a second language). Furthermore, candidates had to pass physical tests (swimming, shuttle run, pull-ups, sit-ups, jumping, and throwing). The psychological interview was conducted by a psychologist to form an impression of the candidate's personality characteristics. Finally, the professional interview was conducted by a pilot to assess the candidate's job knowledge and motivation.

Dependent variable or criterion

The criterion for the regressions was the evaluation score at the end of the twelve flights of the initial training stage, called the General Flight Double (GFD12). During these flights the instructor flies with the trainee and can correct the trainee's mistakes. A flight without mistakes is rewarded with a blue card (an excellent flight) or a green card (a satisfactory flight); for a weak flight the trainee gets a yellow card, and for an unsatisfactory flight a red card.

The criterion was operationalised in a continuous way (the summed number of green and blue cards, with a minimum of zero and a maximum of twelve) and in a dichotomous way (passed or failed the training).
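For concreteness, this scoring can be sketched in a few lines of Python; the function and variable names below are illustrative, not part of the original study:

    # Sketch of the two GFD12 operationalisations described above.
    def gfd12_continuous(cards):
        """Summed number of blue (excellent) and green (satisfactory) cards, 0 to 12."""
        return sum(1 for card in cards if card in ("blue", "green"))

    def gfd12_dichotomous(passed_training):
        """1 if the trainee passed the initial training stage, 0 otherwise."""
        return 1 if passed_training else 0

    # Example: ten satisfactory-or-better flights out of twelve give a score of 10.
    cards = ["green"] * 8 + ["blue"] * 2 + ["yellow", "red"]
    assert gfd12_continuous(cards) == 10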

Procedures

A Principal Components Analysis (PCA) was conducted on the PM and PS data sets using the applicant population.
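A minimal sketch of such a pipeline is shown below, assuming scikit-learn and the third-party factor_analyzer package (whose Rotator offers a quartimax criterion); the data matrix is a random placeholder rather than the study's data:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler
    from factor_analyzer.rotator import Rotator

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2044, 23))  # placeholder applicant x test-score matrix

    Z = StandardScaler().fit_transform(X)   # standardise each test score
    pca = PCA(n_components=5).fit(Z)        # retain five components
    # Unrotated loadings: eigenvector columns scaled by sqrt(eigenvalues).
    loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
    rotated = Rotator(method="quartimax").fit_transform(loadings)
    salient = np.abs(rotated) > 0.70        # flag marked loadings, as in Table 1

Standardising first ensures that subtests scored on different scales contribute equally to the components.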

Linear and logistic regressions were performed on the academic, physical, interview, and psychotechnical data subsets; each of these analyses was performed on the pilot trainee population. There were violations of the normality assumptions for the interview data and for some distributions of psychotechnical test scores. A multicollinearity problem arose in the analysis of the PM test scores: there proved to be redundancy (r = -.97) between RM_DISCR (number of omitted responses in the discrimination test) and G1_DISCR (number of good responses on the first part of the discrimination test). The variable RM_DISCR was therefore removed from further analyses.
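A redundancy screen of this kind can be expressed compactly. The sketch below, assuming pandas and illustrative names, drops one member of any predictor pair correlating beyond a chosen threshold, as was done with RM_DISCR:

    import pandas as pd

    def drop_redundant(predictors: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
        """Drop one member of every predictor pair with |r| above the threshold."""
        corr = predictors.corr().abs()
        cols = list(corr.columns)
        to_drop = set()
        for i in range(len(cols)):
            for j in range(i + 1, len(cols)):
                if corr.iloc[i, j] > threshold and cols[j] not in to_drop:
                    to_drop.add(cols[j])  # keep the first-listed member of the pair
        return predictors.drop(columns=sorted(to_drop))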

RESULTS

Principal Components Analysis

Table 1 shows the outcome of a PCA on the PM and PS results in the applicant population. A good and interpretable solution was found with a normalized Quartimax rotation; marked loadings are > .70. The following five components could be distinguished: memory-learning method (Component 1), discrimination (Component 2), co-ordination (Component 3), spatial-mathematical reasoning (Component 4), and reaction (Component 5). The last two rows of the table give the explained variance and the total proportion of explained variance.

Although the variables G1_DISCR and G2_DISCR, on the one hand, and TR_DISCR and RI_DISCR, on the other, belong to the same component, they are negatively correlated. The reason is probably that they measure the same construct in different ways: the former two count good responses (the more, the better), whereas the latter two measure reaction time and errors (the less, the better).

Table 1. Principal Components Analysis of PM and PS test results in the applicant population of auxiliary pilots

Variable | Loading (* = marked loading, > .70)
C1_COORD | .914249*
T1_COORD | .806199*
C2_COORD | .919167*
T2_COORD | .811132*
G1_DISCR | .770113*
G2_DISCR | .742764*
TR_DISCR | -.539144
RI_DISCR | -.681685
RS_506 | .580136
Reaction time (responses) | .425515
Reaction time (time) | .289643
ABL 304 | .446364
ABL 17 | .590505
ABL 152 Series 1 | .869106*
ABL 152 Series 2 | .943024*
ABL 152 Series 3 | .913303*
ABL 152 Series 4 | .865682*
Cubes test | .661133
Digit recall 1 | .435885
Digit recall 2 | .583022
Arithmetic | .547164
Manikin | .324255
Spiral time | .161590

Expl. Var | Component 1: 3.443161 | Component 2: 2.003158 | Component 3: 1.787692 | Component 4: 2.791848 | Component 5: 1.574403
Prp. Total | Component 1: .149703 | Component 2: .087094 | Component 3: .077726 | Component 4: .121385 | Component 5: .068452

Regressions

PM battery and physical predictors

No statistically significant results were found.

PS battery

A multiple linear regression was conducted on the pilot trainee population, using the continuous criterion GFD12. The forward and backward stepwise procedures resulted in the same subset of predictors: ABL 152 Series 4, Arithmetic, Digit recall 1, Manikin, ABL 304, and ABL 152 Series 2 (R = .41; R² = .17; adjusted R² = .12; p < .00318).
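A forward stepwise selection of this kind could be sketched as follows, assuming statsmodels and a DataFrame X of PS subtest scores with the continuous GFD12 in y; the names are illustrative and the original analyses were not necessarily run this way:

    import statsmodels.api as sm

    def forward_stepwise(X, y, alpha=0.05):
        """Greedy forward selection: repeatedly add the candidate predictor with the
        smallest p-value until no remaining candidate clears the alpha threshold."""
        selected, remaining = [], list(X.columns)
        while remaining:
            pvals = {}
            for cand in remaining:
                fit = sm.OLS(y, sm.add_constant(X[selected + [cand]])).fit()
                pvals[cand] = fit.pvalues[cand]
            best = min(pvals, key=pvals.get)
            if pvals[best] >= alpha:
                break
            selected.append(best)
            remaining.remove(best)
        # final.rsquared and final.rsquared_adj correspond to R² and adjusted R².
        return sm.OLS(y, sm.add_constant(X[selected])).fit()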

Academic tests

The continuous GFD12 was regressed on the academic test results (R = .61; R² = .38; adjusted R² = .34; p < .00051). The subset of predictors giving the best predictions comprised Physics and the first language.


Interviews

The dichotomous GFD was regressed on the psychological and professional interview scores together, and on each of these variables separately. Four estimation methods were used for the logistic regressions: the Quasi-Newton method, the Simplex procedure, the Hooke-Jeeves Pattern Move, and the Rosenbrock Pattern Search.

There was no fit for the psychological and professional scores together, nor for the professional interview score alone. A one-variable logistic model containing the psychological interview score, in contrast, gave a good fit (χ² = 4.34; p = .03715).
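The reported one-variable model can be illustrated with the sketch below; statsmodels is assumed (its default Newton-type optimiser stands in for the four search methods named above), and the data are simulated placeholders. The likelihood-ratio chi-square of the fitted model corresponds to the kind of χ² statistic quoted:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    psych = rng.normal(50.0, 10.0, size=129)  # placeholder interview scores
    passed = (psych + rng.normal(0.0, 15.0, size=129) > 50).astype(int)  # placeholder outcome

    fit = sm.Logit(passed, sm.add_constant(psych)).fit(disp=0)
    print(fit.llr, fit.llr_pvalue)  # likelihood-ratio chi-square and its p-value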

CONCLUSIONS AND DISCUSSION

The PCA yielded a five-component structure. The results furthermore gave evidence of predictive validity for some tests in the PS battery and for the academic tests, and the psychological interview score also proved promising as a predictor. All of these findings will have to be cross-validated.

The GFD12 was chosen as the criterion because of its statistical and practical qualities. First, criterion data were available in both a continuous and a dichotomous form. Using the blue and green cards, the variable showed a spread of values ranging from zero to twelve, and its distribution was smooth and approached the bell curve; operationalised in this way, the criterion could therefore be treated as a continuous variable. A continuous variable offers advantages over a dichotomised criterion (Hunter & Burke, 1995; Caretta & Ree, 2003). Nevertheless, criterion data were also available in dichotomised form (passed or failed the GFD), which proved very useful where the assumptions of the linear techniques were violated with regard to the predictors.

Second, the criterion was an initial or intermediate criterion, not an ultimate one. Dover (1991) remarks that the more distant the criterion, the lower the validity. Hilton and Dolgin (1991) favour initial training as a criterion for validity and argue that an ultimate criterion is less cost-effective; others, like Helmreich (see Hilton & Dolgin, 1991), hold that job performance is a more realistic criterion than initial training.

No evidence was found for the predictivity of the PM battery. One reason could be the censoring of the population, or restriction in range (Caretta & Ree, 2003): applicants were selected mainly on the PM composite score.

There was no direct selection on the basis of the PS score; the results of the PS battery were used only as an indicator by the psychologist during the psychological interview. In the present study no corrections for range restriction were applied. To obtain a better view of the predictive value, the results should be recalculated and corrected for restriction in range.
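For direct selection on a predictor, Thorndike's Case II correction is one standard way such a recalculation could be done; the sketch below implements this textbook formula and is not taken from the paper itself:

    import math

    def thorndike_case2(r: float, sd_unrestricted: float, sd_restricted: float) -> float:
        """Correct an observed validity r for direct range restriction on the predictor."""
        u = sd_unrestricted / sd_restricted  # ratio of applicant SD to selected-group SD
        return (r * u) / math.sqrt(1.0 - r**2 + (r**2) * (u**2))

    # Example: an observed r of .20 rises to about .32 when the selected group's
    # predictor SD is 6 against an applicant SD of 10.
    print(round(thorndike_case2(0.20, 10.0, 6.0), 3))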

An important issue in the Belgian context could be the existence of differences in scores between Flemish and Walloon trainees or applicants and, of course, between male and female candidates. The study of gender and ethnic group differences is important in evaluating measurement instruments; Caretta (1997) explains why. In the Belgian context there is a shortage of data on female trainees. Research on differences between Flemish and Walloon trainees is possible, but difficult, because of the small groups of trainees in both linguistic systems.

Recent developments in pilot research indicate that it is important to test for multi-tasking capability. In addition, personal and non-technical skills will become more important because of the increasing role of teamwork in small teams in modern military aviation (Damitz, Manzey, Kleinmann & Severin, 2003; Hanson, Hedge, Logan, Bruskiewicz, Borman, & Siem, 1996). In the near future a test for multi-tasking will be added to the Belgian MPSB, and from next year onward candidate auxiliary pilots will be examined in the assessment centre procedure currently used by the Belgian Department of Defence. Devriendt (1999) gave an overview of some typical assessment techniques used by the Belgian Department of Defence.

REFERENCES

Caretta, T.R. (1997). Sex Differences on U.S. Air Force Pilot Selection Tests. Proceedings of the Ninth International Symposium on Aviation Psychology, Columbus, OH, 1292-1297.

Caretta, T.R., & Ree, M.J. (2003). Pilot Selection Methods. In P. Tsang & M. Vidulich (Eds.), Principles and Practice of Aviation Psychology (pp. 357-396). New Jersey: Lawrence Erlbaum Associates.

Damitz, M., Manzey, D., Kleinmann, M., & Severin, K. (2003). Assessment Center for Pilot Selection: Construct and Criterion Validity and the Impact of Assessor Type. Applied Psychology: An International Review, 52, 193-212.

Devriendt, Y. (2000). The Officer Selection in the Belgian Armed Forces. Paper presented at the RTO Human Factors and Medicine Panel (HFM) Workshop held in Monterey, USA, 9-11 November 1999, and published in RTO Proceedings 55.

Dover, S.H. (1991). Selection Research and Application. In R. Gal & A. Mangelsdorff (Eds.), Handbook of Military Psychology (pp. 131-148). Chichester: John Wiley & Sons.

Hanson, M.A., Hedge, J.W., Logan, K.K., Bruskiewicz, K.T., Borman, W.C., & Siem, F.M. (1996). Development of a Computerized Pilot Selection Test. http://www.ijoa.org/imta96/paper59.html

Hilton, T.F., & Dolgin, D.L. (1991). Pilot Selection in the Military of the Free World. In R. Gal & A. Mangelsdorff (Eds.), Handbook of Military Psychology (pp. 81-101). Chichester: John Wiley & Sons.

Hunter, D.R., & Burke, E.F. (1995). Handbook of Pilot Selection. Aldershot: Avebury Aviation, Ashgate Publishing Ltd.



INDEX OF AUTHORS

Acromite, M. 70
Alderton, D.L. 62, 199, 481
Annen, H. 13
Arlington, A.T. 587
Arnold, R.D. 129
Baker, D.P. 661, 679, 688
Balog, J. 431
Barrow, D. 431
Bearden, R.M. 62, 455
Beaubien, J.M. 271, 679
Bilgic, R. 49
Boerstler, R.E. 499
Borman, W.C. 398, 455
Bowles, S. 205, 398
Boyce, E.M. 561, 567
Braddock, L. 581
Bradley, P. 760
Brown, K.J. 710
Brown, M.E. 561, 567
Brugger, C. 44
Bruskiewicz, K.T. 485
Burns, J.J. 116
Calderón, R.F. 661
Campbell, R. 507, 556
Campbell, S. 760
Carriot, J. 91
Caster, C.H. 448
Castro, C.A. 177
Charbonneau, D. 760
Chen, H. 62
Chernyshenko, O.S. 317, 323
Cian, C. 91
Cippico, I. 305
Collins, M.M. 522
Costar, D.M. 688
Cotton, A.J. 358, 599, 702
Cowan, J.D. 103
Crawford, K.S. 283, 297
Cronin, B. 156, 654
Debač, N. 305
Devriendt, Y.A. 779
Douglas, I. 671
Drasgow, F. 310, 317, 323
Dressel, J.D. 111, 140
Dukalskis, L. 438
Dursun, S. 608
Edgar, E. 438
Eller, E.D. 62
Elliott-Mabey, N.L. 6
Erol, T. 49
Fallesen, J. 171, 721
Farmer, W.L. 62, 455, 481
Fatolitis, P. 129
Ferstl, K.L. 455
Filjak, T. 305
Fischer, L.F. 289
Fitzgerald, L.F. 237
Ford, K.A. 252
Giebenrath, J. 116
Gorney, E. 358, 599
Gramlich, A. 171
Greenston, P. 556
Gutknecht, S.P. 22
Hanson, M.A. 485
Harris, R.N. 199
Hawthorne, J. 431
Hedge, J.W. 455
Heffner, T.S. 507, 556
Heil, M. 156
Helm, W.R. 123
Hendriks, B. 344
Hession, P. 116
Heuer, R.J., Jr. 297
Hindelang, R.L. 62
Holtzman, A.K. 661, 679, 688
Horey, J.D. 721
Horgen, K.E. 398
Houston, J.S. 455
Howell, L.M. 158, 208
Huffman, A.H. 177
Iddekinge, C.H. 491, 531
Irvine, J.H. 96
Janega, J.B. 150, 330
Johnson, R.S. 62
Jones, P.L. 505
Kamer, B. 13
Kammrath, J.L. 499
Katkowski, D.A. 522
Keenan, P.A. 522
Keeney, M.J. 271
Keesling, W. 734
Keller-Glaze, H. 171
Kilcullen, R.N. 531
Klein, R.M. 158, 208
Klion, R.E. 418
Knapp, D.J. 556
Kolen, J. 70
Kramer, L.A. 297
Krol, M. 734
Krouse, S.L. 96
Kubisiak 398
Lancaster, A.R. 208
Lane, M.E. 561, 567
Lappin, B.M. 158
Lawson, A.K. 237
Lee, W.C. 310
Lescreve, F.J. 336
Lett, J. 734
Levaux, C.A. 779
Lim, B.C. 631, 750
Lipari, R.N. 158, 208
Lofaro, R.J. 1
Luster, L. 30
Makgati, C.K.M. 694
Maliko-Abraham, H. 1
Marsh-Ayers, N. 448
McCloy, R.A. 531
Michael, P.G. 62
Mitchell, D. 171
Morath, R. 156, 654
Moriarty, K.O. 522
Morrow, R. 608
Mottern, J.A. 199, 561, 567
Mylle, J. 404, 592
Nayak, A. 62
Nederhof, F.V.F. 336
Newell, C.E. 581
Nourizadeh, S. 573, 575
O'Connell, B.J. 271, 448
O'Keefe, D. 461
Olmsted, M.G. 150, 330
Ormerod, A.J. 219
Oropeza, T. 431
Osburn, H. 505
Paullin, C.J. 485
Peck, J.F. 278
Pfenninger, D.T. 418
Phillips, H.L. 129
Ployhart, R.E. 631
Putka, D.J. 491, 531, 549
Radtke, P. 661, 688
Raphela, C. 91
Reid, J.D. 123
Richardson, J. 167
Rittman, A.L. 140
Rone, R.S. 772
Rosenfeld, P. 581
Russell, T.L. 514, 540
Sabol, M.A. 140
Sager, C.E. 491, 507, 514, 549
Sapp, R. 412
Schaab, B.B. 111, 140
Schantz, L.B. 522
Schneider, R.J. 455
Schreurs, B. 381
Schultz, K. 412
Seilhymer, J. 431
Smith, G.A. 620
Smith, J. 654
Smith-Jentsch, K.A. 661, 688
Snooks, S. 30
Soh, S. 750
Stam, D. 344
Stark, S.E. 317, 323
Steinberg, A.G. 573, 575
Stetz, T.A. 271
Still, D.L. 70
Strange, J. 505
Styer, J.S. 468
Sumer, H.C. 49
Sumer, N. 49
Tan, C. 750
Temme, L.A. 70
Thain, J. 734
Thompson, B.R. 615
Tišlarić, G. 305
Tremble, T. 507
Truhon, S.A. 766
Twomey, A. 461
van de Ven, C. 344
Van Iddekinge, C.H. 531
Waldköetter, R. 587
Watson, S.E. 62, 474
Waugh, G.W. 540
Wenzel, M.U. 418
Weston, K. 438
White, L.A. 398, 485
White, M.A. 199, 561, 567
Whittam, K. 62
Willers, L. 412
Willis, D. 742
Wiskoff, M.F. 300
Wood, S. 283
Wright, C.V. 219
Youngcourt, S.S. 177
Zarola, A. 438
Zebec, K. 305

