- Paperback
Other customers were also interested in
- Creating Training Development, 85,99 €
- Robert Bruce Shaw: All In, 24,99 €
- Marcia Hughes: Team Emotional & Social Intelligence Survey, 24,99 €
- Patricia Pulliam Phillips: ROI in Action Casebook, 68,99 €
- Patrick J. Aspell: The Enneagram Personality Portraits, Trainer's Guide, 49,99 €
- Michael Milano: Designing Powerful Training, 83,99 €
- Dale Brethower: Performance-Based Instruction, Includes a Microsoft Word Diskette, 65,99 €
This is the third edition of the classic and comprehensive reference guide to the theory and practice of competency testing in organizations and professions. Criterion-Referenced Test Development has been thoroughly revised and updated to address the most recent issues in certification and qualification testing. The book is designed specifically for training professionals who need to better understand how to develop criterion-referenced tests (CRTs). This important resource offers step-by-step guidance on how to make and defend Level 2 testing decisions, how to write test questions and performance scales that match jobs, and how to show that those certified as "masters" are truly masters. A comprehensive guide to the development and use of CRTs, the book covers a variety of topics, including methods of test interpretation, test construction, item formats, test scoring, reliability and validation methods, test administration, and score reporting, as well as the legal and liability issues surrounding testing.
New to this edition:
- Illustrative real-world examples
- Coverage of test security issues
- Advice on the use of test-creation software
- Expanded sections on performance testing
- Single-administration techniques for calculating reliability
- Updated legal and compliance guidelines
The authors have created a very accessible guide with information that is easily grasped and implemented. In addition, the book is filled with relevant exercises that require active responses and reinforce mastery of the principles and procedures.
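As a purely illustrative aside (this sketch is not taken from the book), the kind of decision-consistency check the reliability chapters describe, an agreement coefficient and kappa for master/non-master classifications across two administrations of a criterion-referenced test, can be computed in a few lines of Python; the scores, the cut-off of 80, and all function names below are invented for the example.

```python
# Minimal sketch (assumed data, not from the book): classification consistency
# for a criterion-referenced test given two administrations and a cut-off score.

def classify(scores, cutoff):
    """Return 1 (master) if a score meets the cut-off, else 0 (non-master)."""
    return [1 if s >= cutoff else 0 for s in scores]

def agreement_and_kappa(form_a, form_b, cutoff):
    """Observed agreement p0 and chance-corrected kappa for mastery decisions."""
    a, b = classify(form_a, cutoff), classify(form_b, cutoff)
    n = len(a)
    p0 = sum(1 for x, y in zip(a, b) if x == y) / n   # proportion of consistent decisions
    pa, pb = sum(a) / n, sum(b) / n                   # proportion classified master on each form
    pc = pa * pb + (1 - pa) * (1 - pb)                # agreement expected by chance
    kappa = (p0 - pc) / (1 - pc) if pc < 1 else 1.0
    return p0, kappa

# Example: ten test-takers, two parallel forms, cut-off score of 80
form_a = [92, 85, 78, 88, 60, 95, 81, 70, 83, 90]
form_b = [90, 82, 80, 86, 65, 93, 79, 72, 85, 88]
p0, kappa = agreement_and_kappa(form_a, form_b, cutoff=80)
print(f"Agreement coefficient p0 = {p0:.2f}, kappa = {kappa:.2f}")
```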
Note: This item can only be shipped to a German delivery address.
Product details
- Publisher: Wiley
- 3rd Revised edition
- Number of pages: 494
- Publication date: 17 September 2007
- Language: English
- Dimensions: 229 mm x 152 mm x 27 mm
- Weight: 709 g
- ISBN-13: 9781118943403
- ISBN-10: 1118943406
- Article no.: 41312592
- Manufacturer information
- Libri GmbH
- Europaallee 1
- 36244 Bad Hersfeld
- 06621 890
Sharon Shrock is professor of Instructional Design and Technology at Southern Illinois University, Carbondale, where she coordinates graduate programs in ID/IT. She is the former co-director of the Hewlett-Packard World Wide Test Development Center. She is a past president of the Association for Educational Communications and Technology's Division of Instructional Development and has served on the editorial boards of most of the major academic journals in the instructional design field.
Bill Coscarelli is professor in the Instructional Design specialization at Southern Illinois University Carbondale's Department of Curriculum & Instruction and the former co-director of the Hewlett-Packard World Wide Test Development Center. Bill has been elected president of the International Society for Performance Improvement and of the Association for Educational Communications and Technology's Division for Instructional Development. He was the founding editor of Performance Improvement Quarterly and ISPI's first vice president of Publications.
List of Figures, Tables, and Sidebars xxiii Introduction: A Little Knowledge Is Dangerous 1 Why Test? 1 Why Read This Book? 2 A Confusing State of Affairs 3 Misleading Familiarity 3 Inaccessible Technology 4 Procedural Confusion 4 Testing and Kirkpatrick's Levels of Evaluation 5 Certification in the Corporate World 7 Corporate Testing Enters the New Millennium 10 What Is to Come. . . 11 Part I: Background: The Fundamentals 13 1 Test Theory 15 What Is Testing? 15 What Does a Test Score Mean? 17 Reliability and Validity: A Primer 18 Reliability 18 Equivalence Reliability 19 Test-Retest Reliability 19 Inter-Rater Reliability 19 Validity 20 Face Validity 23 Context Validity 23 Concurrent Validity 23 Predictive Validity 24 Concluding Comment 24 2 Types of Tests 25 Criterion-Referenced Versus Norm-Referenced Tests 25 Frequency Distributions 25 Criterion-Referenced Test Interpretation 28 Six Purposes for Tests in Training Settings 30 Three Methods of Test Construction (One of Which You Should Never Use) 32 Topic-Based Test Construction 32 Statistically Based Test Construction 33 Objectives-Based Test Construction 34 Part II: Overview: The CRTD Model and Process 37 3 The CRTD Model and Process 39 Relationship to the Instructional Design Process 39 The CRTD Process 43 Plan Documentation 44 Analyze Job Content 44 Establish Content Validity of Objectives 46 Create Items 46 Create Cognitive Items 46 Create Rating Instruments 47 Establish Content Validity of Items and Instruments 47 Conduct Initial Test Pilot 47 Perform Item Analysis 48 Difficulty Index 48 Distractor Pattern 48 Point-Biserial 48 Create Parallel Forms or Item Banks 49 Establish Cut-Off Scores 49 Informed Judgment 50 Angoff 50 Contrasting Groups 50 Determine Reliability 50 Determine Reliability of Cognitive Tests 50 Equivalence Reliability 51 Test-Retest Reliability 51 Determine Reliability of Performance Tests 52 Report Scores 52 Summary 53 Part III: The CRTD Process: Planning and Creating the Test 55 4 Plan Documentation 57 Why Document? 57 What to Document 63 The Documentation 64 5 Analyze Job Content 75 Job Analysis 75 Job Analysis Models 77 Summary of the Job Analysis Process 78 DACUM 79 Hierarchies 87 Hierarchical Analysis of Tasks 87 Matching the Hierarchy to the Type of Test 88 Prerequisite Test 89 Entry Test 89 Diagnostic Test 89 Posttest 89 Equivalency Test 90 Certification Test 90 Using Learning Task Analysis to Validate a Hierarchy 91 Bloom's Original Taxonomy 91 Knowledge Level 92 Comprehension Level 93 Application Level 93 Analysis Level 93 Synthesis Level 93 Evaluation Level 94 Using Bloom's Original Taxonomy to Validate a Hierarchy 94 Bloom's Revised Taxonomy 95 Gagné's Learned Capabilities 96 Intellectual Skills 96 Cognitive Strategies 97 Verbal Information 97 Motor Skill 97 Attitudes 97 Using Gagné's Intellectual Skills to Validate a Hierarchy 97 Merrill's Component Design Theory 98 The Task Dimension 99 Types of Learning 99 Using Merrill's Component Design Theory to Validate a Hierarchy 99 Data-Based Methods for Hierarchy Validation 100 Who Killed Cock Robin? 
102 6 Content Validity of Objectives 105 Overview of the Process 105 The Role of Objectives in Item Writing 106 Characteristics of Good Objectives 107 Behavior Component 107 Conditions Component 108 Standards Component 108 A Word from the Legal Department About Objectives 109 The Certification Suite 109 Certification Levels in the Suite 110 Level A-Real World 110 Level B-High-Fidelity Simulation 111 Level C-Scenarios 111 Quasi-Certification 112 Level D-Memorization 112 Level E-Attendance 112 Level F-Affiliation 113 How to Use the Certification Suite 113 Finding a Common Understanding 113 Making a Professional Decision 114 The correct level to match the job 114 The operationally correct level 114 The consequences of lower fidelity 115 Converting Job-Task Statements to Objectives 116 In Conclusion 119 7 Create Cognitive Items 121 What Are Cognitive Items? 121 Classification Schemes for Objectives 122 Bloom's Cognitive Classifications 123 Types of Test Items 129 Newer Computer-Based Item Types 129 The Six Most Common Item Types 130 True/False Items 131 Matching Items 132 Multiple-Choice Items 132 Fill-In Items 147 Short Answer Items 147 Essay Items 148 The Key to Writing Items That Match Jobs 149 The Single Most Useful Improvement You Can Make in Test Development 149 Intensional Versus Extensional Items 150 Show Versus Tell 152 The Certification Suite 155 Guidelines for Writing Test Items 158 Guidelines for Writing the Most Common Item Types 159 How Many Items Should Be on a Test? 166 Test Reliability and Test Length 166 Criticality of Decisions and Test Length 167 Resources and Test Length 168 Domain Size of Objectives and Test Length 168 Homogeneity of Objectives and Test Length 169 Research on Test Length 170 Summary of Determinants of Test Length 170 A Cookbook for the SME 172 Deciding Among Scoring Systems 174 Hand Scoring 175 Optical Scanning 175 Computer-Based Testing 176 Computerized Adaptive Testing 180 8 Create Rating Instruments 183 What Are Performance Tests? 183 Product Versus Process in Performance Testing 187 Four Types of Rating Scales for Use in Performance Tests (Two of Which You Should Never Use) 187 Numerical Scales 188 Descriptive Scales 188 Behaviorally Anchored Rating Scales 188 Checklists 190 Open Skill Testing 192 9 Establish Content Validity of Items and Instruments 195 The Process 195 Establishing Content Validity-The Single Most Important Step 196 Face Validity 196 Content Validity 197 Two Other Types of Validity 202 Concurrent Validity 202 Predictive Validity 208 Summary Comment About Validity 209 10 Initial Test Pilot 211 Why Pilot a Test?
211 Six Steps in the Pilot Process 212 Determine the Sample 212 Orient the Participants 213 Give the Test 214 Analyze the Test 214 Interview the Test-Takers 215 Synthesize the Results 216 Preparing to Collect Pilot Test Data 217 Before You Administer the Test 217 Sequencing Test Items 217 Test Directions 218 Test Readability Levels 219 Lexile Measure 220 Formatting the Test 220 Setting Time Limits-Power, Speed, and Organizational Culture 221 When You Administer the Test 222 Physical Factors 222 Psychological Factors 222 Giving and Monitoring the Test 223 Special Considerations for Performance Tests 225 Honesty and Integrity in Testing 231 Security During the Training-Testing Sequence 234 Organization-Wide Policies Regarding Test Security 236 11 Statistical Pilot 241 Standard Deviation and Test Distributions 241 The Meaning of Standard Deviation 241 The Five Most Common Test Distributions 244 Problems with Standard Deviations and Mastery Distributions 247 Item Statistics and Item Analysis 248 Item Statistics 248 Difficulty Index 248 P-Value 249 Distractor Pattern 249 Point-Biserial Correlation 250 Item Analysis for Criterion-Referenced Tests 251 The Upper-Lower Index 253 Phi 255 Choosing Item Statistics and Item Analysis Techniques 255 Garbage In-Garbage Out 257 12 Parallel Forms 259 Paper-and-Pencil Tests 260 Computerized Item Banks 262 Reusable Learning Objects 264 13 Cut-Off Scores 265 Determining the Standard for Mastery 265 The Outcomes of a Criterion-Referenced Test 266 The Necessity of Human Judgment in Setting a Cut-Off Score 267 Consequences of Misclassification 267 Stakeholders 268 Revisability 268 Performance Data 268 Three Procedures for Setting the Cut-Off Score 269 The Issue of Substitutability 269 Informed Judgment 270 A Conjectural Approach, the Angoff Method 272 Contrasting Groups Method 278 Borderline Decisions 282 The Meaning of Standard Error of Measurement 282 Reducing Misclassification Errors at the Borderline 284 Problems with Correction-for-Guessing 285 The Problem of the Saltatory Cut-Off Score 287 14 Reliability of Cognitive Tests 289 The Concepts of Reliability, Validity, and Correlation 289 Correlation 290 Types of Reliability 293 Single-Test-Administration Reliability Techniques 294 Internal Consistency 294 Squared-Error Loss 296 Threshold-Loss 296 Calculating Reliability for Single-Test Administration Techniques 297 Livingston's Coefficient kappa (κ²) 297 The Index Sc 297 Outcomes of Using the Single-Test-Administration Reliability Techniques 298 Two-Test-Administration Reliability Techniques 299 Equivalence Reliability 299 Test-Retest Reliability 300 Calculating Reliability for Two-Test Administration Techniques 301 The Phi Coefficient 302 Description of Phi 302 Calculating Phi 302 How High Should Phi Be? 304 The Agreement Coefficient 306 Description of the Agreement Coefficient 306 Calculating the Agreement Coefficient 307 How High Should the Agreement Coefficient Be? 308 The Kappa Coefficient 308 Description of Kappa 308 Calculating the Kappa Coefficient 309 How High Should the Kappa Coefficient Be? 311 Comparison of φ, p0, and κ 313 The Logistics of Establishing Test Reliability 314 Choosing Items 314 Sample Test-Takers 315 Testing Conditions 316 Recommendations for Choosing a Reliability Technique 316 Summary Comments 317 15 Reliability of Performance Tests 319 Reliability and Validity of Performance Tests 319 Types of Rating Errors 320 Error of Standards 320 Halo Error 321 Logic Error 321 Similarity Error 321 Central Tendency Error 321 Leniency Error 322 Inter-Rater Reliability 322 Calculating and Interpreting Kappa (κ) 323 Calculating and Interpreting Phi (φ) 335 Repeated Performance and Consecutive Success 344 Procedures for Training Raters 347 What If a Rater Passes Everyone Regardless of Performance? 349 What Should You Do? 352 What If You Get a High Percentage of Agreement Among Raters But a Negative Phi Coefficient? 353 16 Report Scores 357 CRT Versus NRT Reporting 358 Summing Subscores 358 What Should You Report to a Manager? 361 Is There a Legal Reason to Archive the Tests? 362 A Final Thought About Testing and Teaching 362 Part IV: Legal Issues in Criterion-Referenced Testing 365 17 Criterion-Referenced Testing and Employment Selection Laws 367 What Do We Mean by Employment Selection Laws? 368 Who May Bring a Claim? 368 A Short History of the Uniform Guidelines on Employee Selection Procedures 370 Purpose and Scope 371 Legal Challenges to Testing and the Uniform Guidelines 373 Reasonable Reconsideration 376 In Conclusion 376 Balancing CRTs with Employment Discrimination Laws 376 Watch Out for Blanket Exclusions in the Name of Business Necessity 378 Adverse Impact, the Bottom Line, and Affirmative Action 380 Adverse Impact 380 The Bottom Line 383 Affirmative Action 385 Record-Keeping of Adverse Impact and Job-Relatedness of Tests 387 Accommodating Test-Takers with Special Needs 387 Testing, Assessment, and Evaluation for Disabled Candidates 390 Test Validation Criteria: General Guidelines 394 Test Validation: A Step-by-Step Guide 397 1. Obtain Professional Guidance 397 2. Select a Legally Acceptable Validation Strategy for Your Particular Test 397 3. Understand and Employ Standards for Content-Valid Tests 398 4. Evaluate the Overall Test Circumstances to Assure Equality of Opportunity 399 Keys to Maintaining Effective and Legally Defensible Documentation 400 Why Document? 400 What Is Documentation? 401 Why Is Documentation an Ally in Defending Against Claims? 401 How Is Documentation Used? 402 Compliance Documentation 402 Documentation to Avoid Regulatory Penalties or Lawsuits 404 Use of Documentation in Court 404 Documentation to Refresh Memory 404 Documentation to Attack Credibility 404 Disclosure and Production of Documentation 405 Pay Attention to Document Retention Policies and Protocols 407 Use Effective Word Management in Your Documentation 409 Use Objective Terms to Describe Events and Compliance 412 Avoid Inflammatory and Off-the-Cuff Commentary 412 Develop and Enforce Effective Document Retention Policies 413 Make Sure Your Documentation Is Complete 414 Make Sure Your Documentation Is Capable of "Authentication" 415 In Conclusion 415 Is Your Criterion-Referenced Testing Legally Defensible? A Checklist 416 A Final Thought 419 Epilogue: CRTD as Organizational Transformation 421 References 425 Index 433 About the Authors 453