APPLIED MATHEMATICS, LOCATING INFORMATION AND READING FOR INFORMATION OF THE WORKKEYS ASSESSMENTS: COMPARISON OF SCORES BY AGE, RACE, AND GENDER

Except where reference is made to the work of others, the work described in this dissertation is my own or was done in collaboration with my advisory committee. This dissertation does not include proprietary or classified information.

__________________________________
Deborah E. Stone

Certificate of Approval:

___________________________
Margaret E. Ross
Associate Professor
Educational Foundations, Leadership and Technology

___________________________
James E. Witte, Chair
Associate Professor
Educational Foundations, Leadership and Technology

___________________________
Maria Martinez Witte
Associate Professor
Educational Foundations, Leadership and Technology

___________________________
George T. Flowers
Interim Dean
Graduate School

APPLIED MATHEMATICS, LOCATING INFORMATION AND READING FOR INFORMATION OF THE WORKKEYS ASSESSMENTS: COMPARISON OF SCORES BY AGE, RACE, AND GENDER

Deborah E. Stone

A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Education

Auburn, Alabama
December 17, 2007

APPLIED MATHEMATICS, LOCATING INFORMATION AND READING FOR INFORMATION OF THE WORKKEYS ASSESSMENTS: COMPARISON OF SCORES BY AGE, RACE, AND GENDER

Deborah E. Stone

Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon request of individuals or institutions and at their expense. The author reserves all publication rights.

___________________________
Signature of Author

___________________________
Date of Graduation

VITA

Deborah Stone was born October 10, 1957 in Montgomery, Alabama, the daughter of Frances Crabtree Stone and the late Raymond Davis Stone, Sr. She grew up in Elmore County, Alabama and graduated from Alabama Christian High School in 1975. She graduated from David Lipscomb University with a Bachelor of Arts degree in American History in 1979. She received a Master of Business Administration from Auburn University at Montgomery in 1983. After working as an instructor in the schools of business at Faulkner University and Auburn University at Montgomery, she spent 20 years as a training professional in manufacturing and industry. In the summer of 2003, she entered graduate school at Auburn University while working as the Training Manager at Neptune Technology Group Inc. in Tallassee, Alabama.

DISSERTATION ABSTRACT

APPLIED MATHEMATICS, LOCATING INFORMATION AND READING FOR INFORMATION OF THE WORKKEYS ASSESSMENTS: COMPARISON OF SCORES BY AGE, RACE, AND GENDER

Deborah E. Stone
Doctor of Education, December 17, 2007
(M.B.A., Auburn University at Montgomery, 1983)
(B.A., David Lipscomb University, 1979)
127 typed pages
Directed by James E. Witte

As American businesses have competed in the global economy, the need for skilled workers has become more acute (Friedman, 2005). The American educational system has struggled to provide businesses and industry with needed skilled workers, with mixed results. These results have propelled businesses to seek ways to measure the skills of potential employees before hiring. One of the most commonly used methods of skill determination has been pre-employment testing (Agard, 2003). There are many types of pre-employment assessments, including interviews, presentations, simulations, and tests.
For this research, three tests from the ACT WorkKeys battery of tests, Applied Mathematics, Locating Information, and Reading for Information, were chosen as the focus of this study. These three assessments have been used commonly in testing for industry and are the basis of many state Career Readiness Certificates, including that of the state of Alabama (CRC Consortium, 2007). The sample for this study comprised 6,962 sets of scores, with self-reported demographics, from one WorkKeys testing center in Alabama. The sample consisted of test takers, aged 19 and older, who were technical school students, technical school program applicants, job applicants for multiple employers, and incumbent employees of multiple employers. The study found statistically significant differences in the scores of all the WorkKeys assessments on the basis of racial group and age, and mixed results in the scores of the assessments between males and females.

ACKNOWLEDGEMENTS

I wish to acknowledge and express my sincere gratitude to my committee members Dr. James Witte, Dr. Maria Martinez Witte, and Dr. Margaret Ross for their continued support throughout my graduate studies. Dr. James Witte, thank you for using your knowledge and experience to keep me going. Dr. Maria Witte, thank you for providing encouragement and support throughout my doctoral program. Dr. Ross, thank you for your kindness and assistance with analyzing the data collected for this study. I also thank Dr. William Sauser for serving as my outside reader.

I would like to thank Wallace Community College and the staff of the WorkKeys testing center, past and present: Janet Ormond, Sharla Harris, Trill Ausley and Tami Roper. I also want to thank ACT for answering questions as they arose. I would like to thank my co-workers at Neptune Technology Group for pushing me to work towards my dream. Especially to Hank Golden, Bob Forrester, Larry Russo, and Lydia Ledbetter, I give my unending thanks for supporting me and not letting me give up. I would like to thank Dr. Mona Lazenby and the faculty of the AUM School of Nursing for the support and comfort they provided. I would like to thank my brother and sister and their families for their love and support. Finally, to my friend and fellow scholar, Arlene Morris, I am so thankful that we were able to go through this process together.

This dissertation is dedicated to my mother, Frances Crabtree Stone. She brought the joy of learning to me, my brother and sister very early in our lives. As an educator for many years, she taught me to value and honor education and the professionalism of excellent teachers. She is an exceptional Christian woman who has provided me a wonderful example for living my life.

Style manual or journal used: Publication Manual of the American Psychological Association, 5th Edition.
Computer software used: SPSS 13, Windows XP, and Microsoft Word 2003

TABLE OF CONTENTS

LIST OF TABLES

CHAPTER
I. INTRODUCTION
     Problem Statement
     Purpose of the Study
     Importance of the Research
     Research Questions
     Assumptions
     Limitations of the Study
     Definition of Terms
     Organization of the Study

II. LITERATURE REVIEW
     Introduction
     Purpose of the Study
     Research Questions
     Workplace Skills
     Standardized Testing
     Quality in Standardized Testing
     Employment Regulations in U.S.
     WorkKeys
     WorkKeys Users
     Job Profiling
     Assessment
     Training
     Prior WorkKeys Research
     Career Readiness Certificates
     Summary

III. METHODS
     Introduction
     Research Questions
     Participants
     Variables of Interest
     Research Design
     Sampling
     Instrument
     Reliability
     Validity
     Procedures
     Analysis
     Summary

IV. RESULTS
     Purpose of the Study
     Research Questions
     Introduction
     Participants
     Research Questions
          Research Questions 1-9
          Research Questions 10-12
     Summary

V. SUMMARY, DISCUSSION, IMPLICATIONS AND RECOMMENDATIONS
     Purpose of the Study
     Research Questions
     Summary
     Discussion
     Implications
     Recommendations

REFERENCES

APPENDICES
     APPENDIX A. INSTITUTIONAL REVIEW BOARD (IRB)
     APPENDIX B. DATA PERMISSION LETTER
     APPENDIX C. SKILL LEVEL DESCRIPTIONS - AM
     APPENDIX D. SKILL LEVEL DESCRIPTIONS - LI
     APPENDIX E. SKILL LEVEL DESCRIPTIONS - RI
     APPENDIX F. SAMPLE QUESTIONS - AM
     APPENDIX G. SAMPLE QUESTIONS - LI
     APPENDIX H. SAMPLE QUESTIONS - RI

LIST OF TABLES

1. Predicted Classification Consistency
2. Scores of Applied Mathematics by Age Group – Ranks
3. Scores of Applied Mathematics by Age Group – Test Statistics
4. Scores of Applied Mathematics by Gender – Ranks
5. Scores of Applied Mathematics by Gender – Test Statistics
6. Scores of Applied Mathematics by Race – Ranks
7. Scores of Applied Mathematics by Race – Test Statistics
8. Scores of Locating Information by Age Group – Ranks
9. Scores of Locating Information by Age Group – Test Statistics
10. Scores of Locating Information by Gender – Ranks
11. Scores of Locating Information by Gender – Test Statistics
12. Scores of Locating Information by Race – Ranks
13. Scores of Locating Information by Race – Test Statistics
14. Scores of Reading for Information by Age Group – Ranks
15. Scores of Reading for Information by Age Group – Test Statistics
16. Scores of Reading for Information by Gender – Ranks
17. Scores of Reading for Information by Gender – Test Statistics
18. Scores of Reading for Information by Race – Ranks
19. Scores of Reading for Information by Race – Test Statistics
20. Age of participants when taking the assessments – Totals
21. Age of participants when taking the assessments – Frequencies
22. Spearman's Rho (rs) for Applied Mathematics and Age
23. Spearman's Rho (rs) for Locating Information and Age
24. Spearman's Rho (rs) for Reading for Information and Age
25. Significance levels and correlation coefficients

I. INTRODUCTION

Businesses in the world today have a need for higher-skilled workers as employees. The need for highly skilled employees has been brought about by changing technology, global competition, and the need for flexibility and speed in the workplace. Friedman (2005) indicated that Americans are not adequately prepared to deal with the globalization of the world economic systems. Challenger (2003) revealed that "in America and around the world, there are a combination of factors, including a growing number of retirees, declining fertility rates, and a labor force that simply does not possess the right skills to meet employers' needs" (p. 28).

In an effort to ensure that employees have the required skills, many employers have chosen to use pre-employment testing as a part of the hiring process. According to Agard (2003), "the best way to increase the chances of finding the ideal employee is to test the applicant for the required skills before conducting the interview" (p. 7). Patel (2002) stated, "like it or not, testing remains the only objective measure by which employers can assess the potential performance of future employees" (p. 112).

Companies are testing skill levels in many different attributes, based on the specific skills, and the levels of those skills, required for the available jobs. Legal mandates regarding the workplace dictate hiring requirements; therefore, companies must be careful to choose an assessment system that is fair to all potential employees. Several different assessments have been developed for use by different companies and industries. Specific types of companies have used specialized assessments to meet identified testing needs. For example, electric utilities, such as Alabama Power and Georgia Power, typically use the assessments developed by the Edison Electric Institute (Southern Company, 2006). The pulp and paper industry often uses the Nowlin selection system, which includes five assessments and a structured interview (Hardcastle & Mann, 2005). Other assessment tools have been developed for use by various employers. Examples of these other assessments include the Adult Measure of Essential Skills (AMES), the Assessments in Career Education (ACE), the Career Portfolio Assessment (CPA), the Career-Technical Assessment Program (C-TAP), and the Comprehensive Adult Student Assessment System – Employability Competency System (CASAS-ECS). Additional assessments are the National Occupational Competency Testing Institute (NOCTI) Job Ready Tests, the Oklahoma Vo-Tech, the Vocational-Technical Education Consortium of the States (V-TECS), WorkKeys, and the Workplace Success Skills System (A Comparison of Career-related Assessment Tools/Models, 1999). One assessment system that is designed to provide skills assessments for more than a single industry is the WorkKeys system.
WorkKeys is a group of assessments developed by ACT in Iowa City, Iowa, during the early 1990s. Applegate (1999) described WorkKeys as a way for employers "to identify the skills employees need to be successful and to determine where additional training will help build a higher performance workforce" (p. 52). Each of the eight initial assessments was designed to rate candidates' skill levels in a specific skill area. Assessments for applied mathematics, reading for information, locating information, teamwork, applied technology, observation, listening and writing have been available since the mid-1990s (ACT, 1999).

Individual states were also highly interested in providing businesses with a skilled workforce, using a qualified labor pool as a key tool for recruiting new industry. These states were usually led in this effort by their industrial development boards. By 2005, several states had begun a progression towards some type of worker certification: Kentucky, Indiana, Louisiana, Missouri, and Virginia had developed certification programs. States that were developing worker certification systems included North Carolina, Oklahoma, North Dakota, South Carolina, Tennessee, Alabama, Wyoming, Washington D.C., Georgia, and West Virginia. Fifteen other states had declared interest in installing some type of worker certification program: New Mexico, Colorado, Michigan, Kansas, California, Idaho, Delaware, Maryland, Rhode Island, Minnesota, Illinois, Hawaii, Montana, Washington, and Oregon (Bolin, 2005). Several of the programs instituted by 2005 used the same three WorkKeys assessments as the basis for the certifications: Applied Mathematics, Locating Information, and Reading for Information. In August of 2006, the State of Alabama Office of Workforce Development announced that it was installing a three-tiered worker certification program. These gold, silver, and bronze level certificates are issued by the State of Alabama to workers based on the scoring levels of the same three WorkKeys assessments (Alabama Office of Workforce Development, n.d.a).

Problem Statement

Businesses and industries have needed employees with certain skill sets at specific levels to perform required jobs. One way to ensure workers possessed the needed skills was to test skill levels prior to hiring new workers. One available testing battery, composed of nine skill assessments, was WorkKeys by ACT. Three of the WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, have been used as the basis of workforce readiness certificates in several states and have been used as pre-employment testing instruments (Bolin, 2005). Although data exist regarding differences in scores by race and gender, no studies to date have analyzed differences in scores by age or age groups.

Purpose of the Study

The purpose of this research was to determine if there were any statistically significant differences in the scores of the WorkKeys Applied Mathematics, Locating Information, and Reading for Information assessments based on three demographic independent variables. These independent variables were age, gender and race.

Importance of the Research

WorkKeys assessments are now used widely in the United States as pre-employment testing instruments and as the basis of worker credentialing programs in many states (Bolin, 2005). There is very little independently published evaluative research of demographic variables and WorkKeys.
This study was designed to determine what differences, if any, existed between different groups of the sample population. Special emphasis was given to employment legislation and its potential ramifications for the hiring process. Several groups are protected by federal legislation, including groups determined by age, race, and gender. Previous research has considered WorkKeys assessments in regard to differences by race and gender, but this study included the variable of age in relation to the three assessments being used by many worker credentialing programs.

Research Questions

The following research questions were used in this study:

1. What are the differences in the WorkKeys Applied Mathematics test scores based on age?
2. What are the differences in the WorkKeys Applied Mathematics test scores based on gender?
3. What are the differences in the WorkKeys Applied Mathematics test scores based on race?
4. What are the differences in the WorkKeys Locating Information test scores based on age?
5. What are the differences in the WorkKeys Locating Information test scores based on gender?
6. What are the differences in the WorkKeys Locating Information test scores based on race?
7. What are the differences in the WorkKeys Reading for Information test scores based on age?
8. What are the differences in the WorkKeys Reading for Information test scores based on gender?
9. What are the differences in the WorkKeys Reading for Information test scores based on race?
10. Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?
11. Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?
12. Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?

Assumptions

For the purpose of this study, the following assumptions were made:

1. Test takers accurately completed the demographic information on the assessment.
2. Test takers performed at their skill levels on the assessments.
3. The tests were given properly, according to ACT testing guidelines.
4. The WorkKeys instruments (tests) were assessed for and produced appropriate measures of reliability and validity.
5. The sample is representative of the population in the geographic region.

Limitations

For the purpose of this study, the following limitations were identified:

1. This study used existing data, previously recorded by a community college testing center.
2. This research used WorkKeys scores from an ACT testing center located at a community college in southeast Alabama to examine whether there were any statistically significant differences in the scores of the sample as differentiated by age, race or gender.
3. The ex post facto sample had limited demographic information available for each test taker.
4. Social Security numbers were used as individual identifying numbers and were not made available to the researcher, in order to maintain the confidentiality of test takers.
5. The information collected for each individual in the sample was: birth date, date of test, gender, race, individual identifying number, and test scores.
6. All of these are attribute variables, which often place people into legally protected groups.
7. Attribute variables cannot be manipulated by the researcher and are critical in determining whether one group is favored over another in the assessment process.
Definition of Terms

These definitions provide meaning for the following terms as used in this study:

Age is the self-identified chronological age of participants in the study.

Career Readiness Certificates (also known as workforce readiness certificates) are transportable documents designed to give workers a method of providing proof to employers that they have a certain level of skill within certain tested workplace skill sets.

Gender is male or female as self-identified by participants.

High stakes testing is an assessment given in circumstances where the outcome of the test is of great consequence to the test takers (e.g., graduation exams, college entrance exams, pre-employment tests) (Sackett, Schmitt, Ellingson, & Kabin, 2001).

Job analysis is a systematic process used to identify the tasks, duties, responsibilities, and working conditions associated with a job and the knowledge, skills, abilities, and other characteristics required to perform that job (U.S. Dept. of Labor, 2000, p. 3-8).

Legally protected classes are groups of individuals protected by law against certain hiring and employment practices.

Mann-Whitney U test is "the nonparametric equivalent of the independent t test" (Cronk, 2006, p. 90) in which "scores for subjects are converted into ranks, and the analyses compares the mean ranks in each group" (Munro, 2005, p. 123).

Pre-employment test is an assessment to determine the skill levels of job applicants, to ensure they are capable of doing the job for which they are applying.

Qualified workers are people with the skills required for the job performance they are hired to do.

Race is self-identified by participants as African-American, Caucasian, or others.

Spearman correlation coefficient "determines the strength of the relationship between two variables" (Cronk, 2006, p. 42). This method is used with ordinal data (Ary, Jacobs, & Razavieh, 2002). (Both this statistic and the Mann-Whitney U test are illustrated in the sketch at the end of this chapter.)

Test is a generic term referring to any instrument or procedure that measures samples of behavior or performance (U.S. Dept. of Labor, 2000, p. 1-2).

Worker skill levels are different levels of proficiency in a certain skill set.

WorkKeys is a system of job analysis, assessments, and skill gap analysis developed by ACT, Inc. in the early 1990s for workplace skill assessment.

Workplace skills, also known as employability skills, "are transferable core skill groups that represent essential function and enabling knowledge, skills, and attitudes required by the 21st century workplace" (Overtoom, 2000, p. 1).

Organization of the Study

Chapter I introduces the study, presenting the problem, purpose, research questions, assumptions, limitations and definitions of terms. Chapter II is a review of related literature concerning qualified workers, workplace skills, worker skill levels, WorkKeys, ACT, standardized testing, workforce readiness certificates, legally protected classes, and high stakes testing in the United States. Chapter III reports the methods utilized in this study, including the population and sample, instrumentation, data collection and the data analysis. The findings of the study are presented in Chapter IV. Chapter V includes a summary of the study, conclusions, implications and recommendations for further practice and research.
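The Mann-Whitney U test and the Spearman correlation coefficient defined above are the two procedures applied to the research questions. As a minimal illustration only (the study's actual analysis was run in SPSS 13, not in code), the following Python sketch applies both statistics to hypothetical score data; every value in it is invented for the example.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Hypothetical WorkKeys level scores (ordinal, roughly 3-7) for two groups.
    male_scores = rng.integers(3, 8, size=120)
    female_scores = rng.integers(3, 8, size=130)

    # Mann-Whitney U: nonparametric analogue of the independent t test;
    # scores are converted to ranks and the mean ranks are compared.
    u_stat, p_value = stats.mannwhitneyu(male_scores, female_scores,
                                         alternative="two-sided")
    print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3f}")

    # Spearman's rho: rank-based correlation, appropriate for ordinal data.
    ages = rng.integers(19, 65, size=250)
    scores = np.concatenate([male_scores, female_scores])
    rho, p_rho = stats.spearmanr(ages, scores)
    print(f"Spearman's rho = {rho:.3f}, p = {p_rho:.3f}")

In this sketch, the Mann-Whitney U test corresponds to the two-group gender comparisons, while Spearman's rho corresponds to the age-score correlations of research questions 10 through 12.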
II. LITERATURE REVIEW

Introduction

This chapter is a review of related literature concerning workplace skills, skill levels, standardized tests, pre-employment testing, WorkKeys, prior research with WorkKeys, career readiness certificates, and employment regulations in the United States. These topics relate to the issues surrounding high stakes testing for employment, and specifically the WorkKeys assessment system.

Purpose of the Study

The purpose of this research was to determine if there were any statistically significant differences in the scores of the WorkKeys Applied Mathematics, Locating Information, and Reading for Information assessments based on three demographic independent variables. These independent variables were age, gender and race.

Research Questions

The following research questions were used in this study:

1. What are the differences in the WorkKeys Applied Mathematics test scores based on age?
2. What are the differences in the WorkKeys Applied Mathematics test scores based on gender?
3. What are the differences in the WorkKeys Applied Mathematics test scores based on race?
4. What are the differences in the WorkKeys Locating Information test scores based on age?
5. What are the differences in the WorkKeys Locating Information test scores based on gender?
6. What are the differences in the WorkKeys Locating Information test scores based on race?
7. What are the differences in the WorkKeys Reading for Information test scores based on age?
8. What are the differences in the WorkKeys Reading for Information test scores based on gender?
9. What are the differences in the WorkKeys Reading for Information test scores based on race?
10. Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?
11. Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?
12. Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?

Workplace Skills

Workplace skills, also known as employability skills, "are transferable core skill groups that represent essential function and enabling knowledge, skills, and attitudes required by the 21st century workplace" (Overtoom, 2000, p. 1). These employability skills include basic level education information as well as job-specific knowledge. Findings from case studies have suggested production workers need more of the types of skills traditionally learned in school (e.g., math, reading and writing), as well as new skills not normally taught in conventional education curricula (Bailey, 1997). O'Neill, Allred, and Baker (1997, p. 23) stated, "we believe that students work-bound may need to learn the same competencies whether in high school or in college. What differs is their expected performance levels."

Reading, writing, and math skills have generally been considered to be the most basic of job skills. A group of researchers from the American Society for Training and Development (ASTD), with a grant from the U.S. Department of Labor, identified not only these skills but thirteen other skills employers desire for their employees (Carnevale, Gainer, & Meltzer, 1990). In 1991, the Department of Labor released the Secretary's Commission on Achieving Necessary Skills (SCANS) report. The report identified 36 skills as job or workplace competencies.
The skills list included: reading, writing, math, speaking, listening, decision making, creative thinking, problem solving, thinking logically, seeing with the mind's eye, integrity, sociability, self-management, self-esteem, individual responsibility, working on teams, acquiring/evaluating data, serving customers, managing time, managing money, managing materials, managing spaces, managing staff, leading work teams, negotiating with others, working with different cultures, teaching others, organizing/maintaining information, interpreting/communicating data, monitoring/correcting system performance, processing information with computers, working within social systems, working within technological systems, working within organizational systems, designing/improving systems, selecting equipment and tools, applying technology to specific tasks, and maintaining/troubleshooting technologies. SCANS clustered its list of 36 skills into three areas: basic skills, thinking skills, and personal qualities (Grubb, 1997; Mountain View Coll., 1996; McLarty & Vansickle, 1997; Overtoom, 2000).

The ASTD findings, along with the SCANS report by the Department of Labor, have been the basis for many initiatives since the early 1990s by employers and others to discover methods to assess workplace skills before hiring new employees (McLarty & Vansickle, 1997). The ASTD research identified six areas of skills, into which all 16 basic required skills were grouped. The first category was the foundation for all other skills: the ability to learn how to learn. The second area identified basic skills, which included reading, writing and mathematics. The third skill area was communication skills, which included listening and speaking skills. Adaptability, which included problem solving and creative thinking, was the fourth category. The fifth area used a classification the researchers called personal management; the skills included in this group were self-esteem, motivation/goal setting and employability/career development. The sixth group of skills, which included interpersonal skills, negotiation and teamwork, was called group effectiveness (Carnevale, Gainer, & Meltzer, 1990).

Standardized Testing

As identification of necessary skill sets for jobs began, companies desired methods to screen applicants for these skill sets. This would allow employers to hire only individuals who had the skills to be successful in the particular job. Pre-employment testing presented an attractive option. "Cognitively loaded tests of knowledge, skill, and ability are commonly used to help make employment, academic, licensure and certification decisions" (Sackett, Schmitt, Ellingson, & Kabin, 2001, p. 302). In 1998, an American Management Association (AMA) survey of over 1,000 companies found that 35% of the responding companies did some type of testing for basic skills (American Management Association, 1998). In a follow-up survey, the AMA found 43% of the responding companies were testing for at least basic math skills, and 60% of the responding companies mandated some type of specific job-skill testing of their applicants (Nicholson, 2000).

A quandary for companies was choosing the correct tests for their purposes from the many assessments available. The Buros Institute, at the University of Nebraska, publishes Tests in Print, which lists over 2,900 commercially available tests. Over 550 of these listed tests are vocational tests, while 650 are personality tests.
The Buros Institute also updates the Mental Measurements Yearbook every other year, which provides independent evaluations of over 400 assessment tests (Nicholson, 2000). Thus, the services of the Buros Institute can provide employers information regarding the type of test to select for the needed purpose, along with an evaluation by two independent reviewers.

The following assessments were designed as skill level determination, job placement and career development tools: the Adult Measure of Essential Skills (AMES), the Assessments in Career Education (ACE), the Career Portfolio Assessment (CPA), the Career-Technical Assessment Program (C-TAP), and the Comprehensive Adult Student Assessment System – Employability Competency System (CASAS-ECS). Further assessments included the National Occupational Competency Testing Institute (NOCTI) Job Ready Tests, the Oklahoma Vo-Tech, the Vocational-Technical Education Consortium of the States (V-TECS), WorkKeys, and the Workplace Success Skills System (A Comparison of Career-related Assessment Tools/Models, 1999).

The history of using testing as a part of the hiring process in the United States had its roots in the government employment system for federal employees. At the beginning of the United States federal government in 1789, those who won elections were able to have political allies hired into government jobs. In the 1830s, President Andrew Jackson publicly popularized the process by having many of his supporters installed as federal employees. By the 1860s, the spoils system was firmly entrenched in American government and politics. During the Civil War, Senator Charles Sumner introduced a bill that would have made competitive examinations the centerpiece of reform. The bill was so unpopular that it was not even assigned to a committee. However, over a twenty-year period the reform movement grew, and Congress was pressured to deal with the political patronage system. The resulting Pendleton Act of 1883 was a congressional mandate to change the way federal employees were to be chosen for jobs, and it initiated the civil service testing system (Theriault, 2003).

Some key events were necessary to bring about the reforms of the Pendleton Act. In the summer of 1881, President Garfield was shot by "a disappointed office-seeker" and died several weeks later (Williams & Bowman, 2007). In the mid-term elections of 1882, the Republicans lost their majority in Congress and came back to Washington to finish out their term. If reforms had to happen, the Republicans wanted an employment law passed before their appointees were ousted by the incoming majority (Theriault, 2003). Thus, the Pendleton Act was passed in 1883. The legislation provided for "testing the fitness of applicants for the public service ... [and that] such examinations will be practical in their character ... [and] will fairly test the relative capacity and fitness of the persons" (Pendleton Act of 1883, 1997). By 2007, pre-employment testing had been required for many federal government jobs for nearly 125 years. The use of competitive exams validated the notion that employees should be hired based on their abilities and qualifications, not patronage (Hendrick, 2006; Woodard, 2005).

In the United States, the largest organization to conduct testing for evaluation of applicants for job placement, or to screen applicants for needed skills, is the United States Department of Defense. Within the Department of Defense, the individual units (i.e.,
Army, Navy, Air Force) have tested, and continue to test, the aptitude and achievement of potential new recruits (Sticht, 1997). There have been four major series of assessments utilized by the service branches since World War I (WWI). In 1917–1918, a group of psychologists developed two forms of an intelligence test to administer to new recruits. Over 1.9 million men were given one of the two tests. These tests were to be used to assign men to different jobs within the military. The psychologists who developed the tests operated under the assumption that intelligence was innate; if persons were not literate or did not speak English, it did not mean the men were not intelligent. Therefore, two tests were developed: "the Army Alpha test of intelligence for literates and the Army Beta test of intelligence of illiterates and non-English speakers" (Sticht, 1997, p. 263). One major difference in the design of the tests was the method of giving instructions to the test takers. The Alpha had both verbal and written instructions, while the instructions for the Beta test were acted out in pantomime by the persons administering the examinations. Both tests measured cognitive skills, but the Alpha test required broad reading skills, while the Beta test required almost no language or reading skills (Sticht, 1997).

When World War II (WWII) began, the armed services were again faced with huge numbers of volunteers and recruits. Again, standardized mental ability tests were used to assist in military job placement. Job assignments were based on the aptitude and achievement scores of these new soldiers. However, the different branches of the military used different assessment tests due to the differing requirements of the military jobs. By the end of WWII, these mental ability assessments were used as a screening mechanism as well as a job placement tool. Men were not accepted into the military if minimal test standards were not met (Sticht, 1997).

After WWII, all military services began using the same assessment, which required a minimal level of literacy, called the Armed Forces Qualification Test (AFQT). This assessment was designed to screen out military applicants with low cognitive skills and not allow them into any of the armed services. The lowest 10% of the test takers could, by legislative decree, be turned down for military service. The AFQT was then combined with other instruments to determine job assignments for new recruits. These tests were used during the Korean conflict and the Vietnam War years (Sticht, 1997).

After the Armed Forces became an all-volunteer force in the mid-1970s, the Department of Defense introduced a new mental testing tool, the Armed Services Vocational Aptitude Battery (ASVAB). This tool was designed to incorporate the AFQT and several aptitude sub-tests, including word knowledge, arithmetic reasoning, mathematical knowledge, general science, and mechanical comprehension (Sticht, 1997).

As the armed services worked to define the needed skill levels of jobs in the military, they used two general methods to assess the cognitive skill levels required for the jobs. The first method used predictive validity as the basis of the research (Sticht, 1997). The military was able to evaluate the correlation between test scores and the actual job performance of the test takers. The Job Performance Measurement / Enlistment Standards Project (JPM) was conducted over a ten-year period in the 1980s and 1990s.
In this study, the military invested over 36 million dollars "developing measures of job performance as criterion indicators of job proficiency" (Sticht, 1997, p. 278). Over 15,000 members of all branches of the service participated in the study. In the summary of results, "the National Research Council states that the [job] measures provide a credible criterion against which to validate the ASVAB, and the ASVAB has been demonstrated to be a reasonably valid predictor of performance in entry-level military jobs" (Sticht, 1997, p. 279). Other predictive validity research was conducted to add to the knowledge base of job proficiency indicators. One study used general reading skills (literacy) assessments to predict on-the-job reading competency, and another sought to establish the reading level requirements for certain military jobs. These studies tried to predict how well a soldier would perform in his job based on his test scores.

The other method extensively used studied the job and task analyses of the specific requirements of certain jobs in the military. A list was developed of the tasks required to perform a particular job competently. Then each individual task was investigated for the component skills required to accomplish that particular task. Those skills were either basic skills or specialized skills. Once these requirements were determined, the military built training programs to develop these basic and specialized skills (Sticht, 1997). Both predictive validity and job task analysis studies continue to be important to the process of pre-employment testing.

Following WWII, many veterans became leaders in business and industry. The military practice of assessing applicants for aptitude and ability was carried over to civilian use. By the 1950s and early 1960s, employment testing was becoming widespread among U.S. businesses for pre-employment screening. As testing grew, opposition to testing also grew (Hendrick, 2006; Wonderlic, n.d.). By the 1970s, lawsuits and civil rights legislation began to deter the use of standardized tests by employers. Hendrick (2006) recounted a 1971 survey reporting fewer employers using pre-employment testing than in 1963. By the early 1980s, employers were bemoaning the lack of qualified applicants for their jobs. During the late 1980s and early 1990s, reports were issued concerning skill deficiencies in new employees and the decline in ranking of U.S. students compared with students from other industrialized nations (O'Neil Jr., Allred, & Baker, 1997; D., 2006). Businesses greatly desired pre-employment assessments that met legal requirements, to aid the selection process for new hires and to place new hires in jobs for which they were qualified.

Hall, Davis, Bolen, and Chia (1999) began writing on gender and racial differences in mathematics with the following: "Major concerns on the U.S. national agenda are the gender and racial gaps in math achievement. The research suggests that by the end of high school, such differences are close to one standard deviation" (p. 667). An Educational Testing Service study was conducted in the mid-1990s to investigate gender differences in assessment tests. Latham (1997-1998) summarized the data released from the study, relating a key finding that gaps were small at the 4th grade level, but "gender differences appear to grow by the time students reach 12th grade" (p. 88).
A group of ACT researchers published a report in 1999 detailing the racial/ethnic and gender differences in the ACT test scores of 6,000 students. The findings of these researchers suggested that test performance was "to a large extent, the result of differences in the type and quality of academic preparation, regardless of race/ethnicity or gender" (Noble, Davenport, Schiel, & Pommerich, 1999, p. ii).

Quality in Standardized Testing

In order to be an acceptable instrument for measuring knowledge, skills and abilities, a test must meet standards and have certain characteristics. "Test reliability and validity are two technical properties of a test that indicate the quality and usefulness of the test" (U.S. Dept. of Labor, 2000, p. 3-1). The Standards for Educational and Psychological Testing defines reliability as

the degree to which test scores for a group of test takers are consistent over repeated applications of a measurement procedure and hence are inferred to be dependable, and repeatable for an individual test taker; the degree to which some scores are free of errors of measurement for a given group (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 1999, p. 180).

When a person scores similarly on a test taken more than once, the test is said to be reliable; this is called test-retest reliability (Huck, 2004). Another method to assess the consistency or reliability of a test is to have two different forms of the instrument; this is called parallel forms reliability. A third method of reliability is called internal consistency. This method is used to determine if all the items on a test measure the same thing (Ary, Jacobs, & Razavieh, 2002; Huck, 2004). Standardized tests should report both reliability and validity coefficients in the test's technical manual (U.S. Dept. of Labor, 2000).

Validity is "the degree to which accumulated evidence and theory support specific interpretations of test scores entailed by proposed uses of a test" (American Educational Research Association, 1999, p. 184). Validity "refers to what the characteristic measures and how well the test measures that characteristic" (U.S. Dept. of Labor, 2000, p. 3-5). There are three different methods of performing validation studies: criterion-related, content-related, and construct-related. Criterion-related validity requires proof of some type of relationship between test scores and the desired job performance. Content-related validity shows that the test measures skills or behaviors that are related to the content of the job. Construct-related validity demonstrates that the test measures what it claims to measure (U.S. Dept. of Labor, 2000). Criterion-related validity involves a time factor. "If the criterion is obtained at the same time the test is given, it is called concurrent validity; if the criterion is obtained at a later time, it is called predictive validity" (U.S. Dept. of Labor, 2000, p. 3-7).

Both reliability and validity have coefficients that are used in the evaluation of the quality and effectiveness of the test. This is important because some standardized tests are used in decisions that directly affect the individual taking the assessment. Standardized tests that are used for decisions regarding academic admission, scholarship awards, graduation, certification or employment are considered to be high stakes tests (Heller & Shapiro, 2001; Sackett, Schmitt, Ellingson, & Kabin, 2001). According to the Standards for Educational and Psychological Testing (1999), "The higher the stakes associated with a given test use, the more important it is that test-based inferences are supported by strong evidence of technical quality" (p. 139).
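The reliability coefficients described above can be made concrete with a short sketch. The following Python example computes Cronbach's alpha, one common internal consistency coefficient, and a test-retest correlation. The data are invented for illustration; nothing here reproduces figures from the WorkKeys technical manual.

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for an (examinees x items) score matrix."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    rng = np.random.default_rng(1)

    # Hypothetical 0/1 item responses for 200 examinees on a 30-item test,
    # driven by a common "ability" factor so items correlate.
    ability = rng.normal(size=(200, 1))
    responses = (rng.normal(size=(200, 30)) < ability).astype(float)
    print(f"internal consistency (alpha) = {cronbach_alpha(responses):.2f}")

    # Test-retest reliability: correlation between two administrations,
    # here simulated as the first total score plus measurement noise.
    form1 = responses.sum(axis=1)
    form2 = form1 + rng.normal(scale=2.0, size=200)
    r = np.corrcoef(form1, form2)[0, 1]
    print(f"test-retest reliability (r) = {r:.2f}")

A higher alpha indicates the items measure the same thing, and a higher test-retest correlation indicates scores are consistent across administrations, which is exactly what the coefficients in a test's technical manual are meant to summarize.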
There are different types of standardized tests. Patterson (2000) categorized standardized tests into six types: general mental ability, workplace skills, honesty/integrity, personality, medical status, and physical ability. General ability tests measure cognitive activities such as analyzing, computing, reading, and communicating (U.S. Dept. of Labor, 2000). Workplace skills tests measure skill levels specific to a job or group of jobs, such as reading, computing, problem solving, or teamwork (Patterson, 2000). Honesty/integrity tests are a type of personality test that assesses the applicant's understanding of appropriate behavior (Patterson, 2000; U.S. Dept. of Labor, 2000). Personality inventories are often given to determine the fit between an applicant's personality and the requirements of the available job, and should be used in concert with other instruments to predict job success (Patterson, 2000; U.S. Dept. of Labor, 2000). Medical status tests are used to determine if the person can adequately and safely perform a specific job, and include drug tests (Patterson, 2000; U.S. Dept. of Labor, 2000). Physical ability tests are designed to measure physical traits, such as strength, endurance, and manual dexterity (Patterson, 2000).

Standardized tests, especially cognitive tests, have been shown to be good predictors of job performance (Gottfredson, 1994). Well-designed standardized tests that measure cognitive skills have been found repeatedly to exhibit three attributes. First, these tests are valid only if used to achieve the purpose for which they were designed. Second, racial group differences consistently show up in the results: Blacks normally score one standard deviation lower than Whites, and Hispanics usually score two-thirds of a standard deviation lower than Whites. Third, extensive research, in both education and employment, has shown that the tests do not generally exhibit any predictive bias. "In other words, standardized tests do not underpredict the performance of minority group members" (Sackett, Schmitt, Ellingson, & Kabin, 2001, p. 303).

Employment Regulations in the United States

A review of major employment legislation provides background for understanding the legal environment in which U.S. companies operate. The most significant laws concerning employee recruitment and selection include Title VII of the Civil Rights Act (CRA) of 1964 (which included the creation of the Equal Employment Opportunity Commission), the Age Discrimination in Employment Act (ADEA) of 1967, the Tower Amendment to Title VII in 1972, Title I of the Civil Rights Act (CRA) of 1991, and the Americans with Disabilities Act (ADA) of 1990. Each of these pieces of legislation provided legal protection to groups of people in the country's labor pool (Civil Rights Act, 1964; Civil Rights Act, 1991; U.S. Dept. of Labor, 2000). Title VII was the legislation that prohibited hiring discrimination on the basis of race, color, sex, religion or national origin. People in these categories were classified as members of protected groups. The ADEA and the ADA added people aged 40 and older and people with disabilities to the list of protected groups (Age Discrimination in Employment Act of 1967; Lieber, 2007; U.S. Dept. of Labor, 2000).
The Tower Amendment legalized the use of workplace tests in employment decisions, with the specification that the instruments not discriminate against any of the protected groups. The ADEA forbade employers from discriminating against applicants or employees aged 40 and over. Age could be a job requirement if it was a matter of business necessity and the employer could document the need (e.g., commercial airline pilots, firefighters, soldiers). The Equal Employment Opportunity Commission (EEOC), formed by the Civil Rights Act of 1964, was made responsible for enforcing all federal laws against employment discrimination (Equal Employment Opportunity Commission, n.d.).

The Uniform Guidelines on Employee Selection Procedures (Uniform Guidelines) were issued in 1978 by the EEOC, the Civil Service Commission, and the Departments of Labor and Justice. These guidelines were developed to give employers the information needed for compliance with federal employment laws. "The Guidelines incorporate a set of principles governing the use of employee selection procedures according to applicable laws. They provide a framework for employers and other organizations for determining the proper use of tests and other selection procedures" (U.S. Dept. of Labor, 2000, p. 2-3). The Uniform Guidelines specifically dealt with a concept called adverse impact.

One of the basic principles of the Uniform Guidelines is that it is unlawful to use a test or selection procedure that creates adverse impact, unless justified. Adverse impact occurs when there is a substantially different rate of selection in hiring, promotion, or other employment decisions that works to the disadvantage of members of a race, sex or ethnic group. (U.S. Dept. of Labor, 2000, p. 2-4)

Usually, the threshold used to determine potential for adverse impact is 80%: if the selection rate for one group is less than 80% of the rate for another, adverse impact is presumed (Flynn, 1999).

Another principle identified in the Uniform Guidelines was test fairness:

The Uniform Guidelines define biased or unfair assessment procedures as those assessment procedures on which one race, sex, or ethnic group characteristically obtains lower scores than members of another group and the differences in the scores are not reflected in differences in the job performance of members of the groups. (U.S. Dept. of Labor, 2000, p. 2-5)

Title I of the Civil Rights Act of 1991 upheld the concepts of Title VII but included a number of revisions. Employers were required to identify and demonstrate the job-relatedness and business necessity of any assessments or procedures that caused adverse impact. Title I also outlawed the practice of within-group norming, or race-norming, which had taken several different forms. Setting up different cut-off scores for different groups had been used by employers and government agencies to avoid adverse impact issues. Some systems put the scores of each racial group in a separate pool and then took a percentage from the top of each pool into the hiring process; others used separate cutoff scores for different racial groups. These systems were found to be unfair and were therefore outlawed in Title I (Sackett & Wilk, 1994). The other major Title I revision specified that employers proven to have intentionally discriminated against any of the protected groups were liable for damages (U.S. Dept. of Labor, 2000).
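The 80% threshold described above (often called the four-fifths rule) is a simple ratio check on selection rates. The following sketch, using hypothetical selection counts invented for the example, shows the arithmetic:

    def adverse_impact(selected_a: int, applicants_a: int,
                       selected_b: int, applicants_b: int) -> bool:
        """True if the lower selection rate is below 80% of the higher one."""
        rate_a = selected_a / applicants_a
        rate_b = selected_b / applicants_b
        low, high = sorted([rate_a, rate_b])
        return low < 0.80 * high

    # Example: group A hired 30 of 100 (30%); group B hired 20 of 100 (20%).
    # The ratio 20% / 30% is about 67%, below the 80% threshold.
    print(adverse_impact(30, 100, 20, 100))  # True -> potential adverse impact

Under the Uniform Guidelines, such a result signals potential adverse impact that the employer must then justify, for example through evidence of job-relatedness.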
The ADA of 1990 added the requirement that employers provide "reasonable accommodation" during the assessment process (U.S. Dept. of Labor, 2000, p. 2-7). Reasonable accommodations included administering the test in a different location or giving a learning-disabled person more time to complete a test (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 1999; U.S. Dept of Labor, 2000).

In addition to the laws that were passed, several professional associations and agencies worked to develop standardized practices for many types of testing. A partial list of the groups that participated in these efforts includes the American Association for Higher Education, the American Counseling Association, the American Educational Research Association, the Association for Assessment in Counseling, the Association of Test Publishers, the Equal Employment Advisory Council, the International Brotherhood of Electrical Workers, the Society for Human Resource Management, the National Council of State Boards of Nursing, the Army Research Institute, the U.S. Department of Justice, the U.S. Department of Labor, the U.S. Department of Education, the U.S. Equal Employment Opportunity Commission, the American Psychological Association, and the National Council on Measurement in Education. These groups collaborated to develop guidelines and standards for educational and psychological tests, as well as for other tools used in hiring and selection. The two most recognized and highly regarded publications are The Standards for Educational and Psychological Testing and The Principles for the Validation and Use of Personnel Selection Procedures (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 1999; U.S. Dept. of Labor, 2000). These publications provide "criteria for the evaluation of tests, testing practices, and the effects of test use" (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 1999, p. 2).

WorkKeys

WorkKeys is a job skills assessment tool designed by ACT (formerly American College Testing) in the early 1990s (ACT, 1999; McLarty & Vansickle, 1997; Patterson, 2000). WorkKeys was developed in response to a growing demand for testing that would meet EEOC guidelines and still be an adequate predictor of job success (ACT, 2003). ACT worked with employers, labor unions, and educators to develop a list of employability skills. These skills were general enough to be used by many different organizations, yet specific enough to identify measurable skill scale levels (ACT, 1997c). Eight skill sets were identified and became the first eight assessments developed. When WorkKeys was introduced, an ACT executive expressed the hope that employers would come to value WorkKeys as much as colleges valued the ACT test (Zehr, 1998).

WorkKeys Users

The four groups that most often promote and use the WorkKeys system are businesses, community colleges, economic developers, and adult educators (ACT, 2004b; Hendrick, 2006). The WorkKeys assessments were developed to measure an individual's skill levels on several identified workplace skill sets, and each group can use the information gathered from testing for different purposes. Businesses can use the assessments as a screening device in their hiring process.
Community colleges can use the results of the assessments as pre- and post-tests for their programs to determine student learning and preparation for entry into the workforce. Economic developers can use generic score summaries to attract industry to an area by documenting the skill levels of the labor pool. Adult educators can use WorkKeys assessments to identify skill gaps in their students and prepare them for higher paying jobs (ACT, 2004d; ACT, 2001b).

Job Profiling

The WorkKeys tool has three major components: job profiling, assessment, and training (ACT, 2004b). Each of these components is required for a complete match between an individual and the skill levels needed for successful performance of a specific job (ACT, 1999). The job profiling process consists of several parts. A job profiler, trained and certified by ACT, observes incumbent workers doing the actual job being profiled, records a list of activities or tasks, and develops an initial task list for the job. Once the task list is developed, the profiler meets with groups of workers who are currently in the job or have recently held it (ACT, 2000; ACT, 2001a). These groups of employees, with the profiler, create a job profile for their specific job.

The employees who participate in the profiling process are chosen by the company with direction from the profiler. Qualifications for selecting employees include average (or higher) competency in job performance and the respect of peers. The group of employees, called Subject Matter Experts (SMEs), should, as much as possible, be representative of the job classification's demographics. For example, if 90% of the workforce in that job is female, then 90% of the SMEs should be women (ACT, 2001a).

After explaining WorkKeys and the profiling process to the SMEs, the profiler begins by presenting the task list developed from observation of actual job performance. The SMEs add any tasks the profiler overlooked during the observation process, such as something that is done only once per week or once per month, and they delete any task not considered critical to the job. With the completed task list, the SMEs then rank the tasks by the amount of time spent on each task and by its importance, or criticality (ACT, 2000b).

Next, the profiler explains the skill levels for one WorkKeys assessment (e.g., Applied Mathematics) to the group. The SMEs review each job task to determine the skill level necessary to perform that specific task of the job being profiled. SMEs are asked to specify two skill levels for each task: entry level and effective performance. These levels are recorded by the profiler. After the group has been through the entire task list for one skill, the same task list is used to assess the next skill. If two groups of SMEs assign different skill levels to the same job task, members of each group are brought together to reconcile the differences (ACT, 2001a). At the completion of the profiling process, a company has a detailed task list, sorted by criticality, for each job profiled. The company also has two sets of skill levels identified for each job: entry level and effective performance. These skill levels are used as hiring criteria, for identifying skill gaps, and as training targets (ACT, 2001a; ACT, 2004a).

Assessment

Currently there are nine WorkKeys assessments that measure skills in three areas: communications, problem solving, and interpersonal skills (ACT, 2004a).
Assessments in the communications area include Business Writing, Listening, Reading for Information, and Writing. Assessments in the problem-solving area are Applied Mathematics, Applied Technology, Locating Information, and Observation. Teamwork is the sole assessment in the interpersonal skills area (ACT, 2004a). Business Writing was added to the communications area in 2004; the other eight WorkKeys assessments have been available for use since the mid-1990s (ACT, 1997a; ACT, 2004a). The Observation and Teamwork tests use videotape presentations in the assessments, and the Listening and Writing tests use audiotape presentations. The other assessments are given on the computer or taken with pencil and paper (ACT, 2000). Companies that use these assessments choose the tests most appropriate for their hiring process; very few, if any, companies use all nine tests (Patterson, 2000).

These assessments were developed in the late 1980s and early 1990s, during the same time frame in which the ASTD skills report and the Secretary's Commission on Achieving Necessary Skills (SCANS) report were researched and published (Carnevale, Gainer, & Meltzer, 1990; U.S. Department of Labor, 2000). The WorkKeys system was designed to provide "a metric which describes the skill requirements for individual jobs in terms of levels of proficiency" (McLarty & Vansickle, 1997, p. 294). The skill scales used to develop each WorkKeys assessment were also used in the corresponding job profiling process. This job profiling (job analysis) component differentiates the WorkKeys system from many other standardized assessment tools: WorkKeys uses the same set of skill scales both to determine the skill levels required by a particular job and to measure the skill levels possessed by individual test takers. Because the skill scales were so important to the process, the scaling system had to be developed before the assessments or the job profiling system. ACT considered options and initially chose the Guttman scaling technique, which had been developed in the 1950s. For the Guttman technique to work, each skill set had to be hierarchical in nature. The developers also designed content and cognitive strands to organize the tests. After the Guttman-based approach was used in initial test development, the Item Response Theory (IRT) method of scaling was determined to be a better predictive tool, and the scoring system was changed to its current state (McLarty & Vansickle, 1997).

All of the WorkKeys assessments are criterion-referenced, as opposed to norm-referenced. Criterion-referenced assessments are referenced to specific job skills, with scoring based on skill levels. A test taker is tested against pre-set skill levels rather than having scores compared against population norms to define success (ACT, 2001c).

Training

The third component of the WorkKeys system is training. This part of the system is customizable. A skill gap analysis is conducted first: the scores of an individual's assessments are compared to the scores required for the specific job by the job profiling process. If the individual does not meet a score established in profiling, a skill gap has been identified, and an individualized learning plan can then be developed for the training process. If the person meets or exceeds the scores identified in profiling, no skill gaps have been identified for that particular job (ACT, 2004c).
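The skill-gap comparison just described reduces to a level-by-level check of an individual's scores against the profiled requirements. The following is a minimal sketch, using hypothetical profile levels and assessment levels; it does not reflect ACT's actual data formats.

```python
# Hypothetical levels from a job profile (entry-level requirements).
required = {"Applied Mathematics": 5,
            "Locating Information": 4,
            "Reading for Information": 5}

# Hypothetical assessment levels earned by one individual.
candidate = {"Applied Mathematics": 4,
             "Locating Information": 5,
             "Reading for Information": 5}

# A gap exists wherever the assessed level falls below the profiled level.
gaps = {skill: required[skill] - level
        for skill, level in candidate.items()
        if level < required[skill]}

if gaps:
    print("Skill gaps identified:", gaps)  # basis for a learning plan
else:
    print("No skill gaps identified for this job.")
```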
ACT did not develop training materials for use by students, but it did provide a group of booklets called WorkKeys Targets for Instruction to give direction and insight to educators. These booklets have been available for the first eight assessments (ACT, 1995). Each booklet explains the WorkKeys system, discusses the specific skill set covered (e.g., Applied Mathematics), systematically describes each of the skill levels, and provides sample questions for each level (ACT, 1997b; ACT, 1997c; ACT, 1997d). Sample questions from ACT for each of the three assessments at each skill level are provided in Appendices F, G, and H (ACT, 2007a, 2007b, 2007c, 2007d, 2007e, 2007f, 2007g, 2007h, 2007i, 2007j, 2007k, 2007l, 2007m, 2007n). ACT licensed two companies to produce training materials specific to the WorkKeys system for use by individuals, educators, and organizations (ACT, 2007). These companies provide computer-based as well as written training materials. Both companies use the WorkKeys skill scales in their materials, so an individual could obtain a book to study, for example, Applied Mathematics at Level 4 or Locating Information at Level 5.

Prior WorkKeys Research

As noted in Hendrick's (2006) dissertation, WorkKeys is a relatively new instrument, and there has been limited independent research using it with adult populations. Recent studies have provided some interesting hypotheses and research using the WorkKeys assessments, including American College Testing Work Keys assessments and individual variables of one-year technical completers in a selected community college in Mississippi (Belton, 2000); A comparative study of the Tests of Adult Basic Education and Work Keys with an incarcerated population (Buchanan, 2000); Applied Mathematics and Reading for Information scores on the American College Testing (ACT) WorkKeys assessment: Comparing groups by race, gender, and educational level (Barnes, 2002); and Evaluating Work Keys profiling as a pre-employment assessment tool to increase employee retention (Hendrick, 2006).

The purpose of Hendrick's (2006) study was to investigate any effect on employee retention of using the WorkKeys set of assessments. Quantitative analysis of 757 applicants' test scores and qualitative analysis of interviews with 12 companies using the assessments for pre-employment testing were conducted. Findings of the study indicated that companies using WorkKeys were generally pleased with the quality of employees hired after testing, and that the retention rate of new employees was higher after the companies began using WorkKeys.

Barnes' (2002) research centered on searching for statistically significant differences among groups by race, gender, and educational level on two WorkKeys assessments: Applied Mathematics and Reading for Information. Over 3,000 high school students, technical and 2-year college students, and employees of industry provided the sample for this research. Barnes found statistically significant differences in Applied Mathematics and Reading for Information scores by racial group and by educational level. Specifically, the study found statistically significant differences between the scores of African-American and Caucasian test takers for both Applied Mathematics and Reading for Information, with Caucasian participants scoring higher than African-American participants.
Barnes (2002) found no statistically significant differences in the test results of males and females.

In 2000, Belton compared the WorkKeys scores of one-year technical school completers with those of two-year completers and evaluated them for differences. Relationships of WorkKeys scores with the variables of age, gender, hours worked per week, and request for employment information were evaluated to determine any differences or relationships between WorkKeys scores and length of educational preparation. The results of Belton's study indicated that two-year completers scored higher on the WorkKeys assessments than did the one-year completers.

Another study compared Tests of Adult Basic Education (TABE) scores with WorkKeys scores for the three assessments Applied Mathematics, Locating Information, and Reading for Information. The sample was comprised of incarcerated individuals, and the analysis involved the variables of age and employment status prior to incarceration (Buchanan, 2000). The TABE is most commonly used by Adult Basic Education (ABE) programs and has been helpful in determining specific strengths and weaknesses of test takers in both reading and math. The WorkKeys scaling system is much less specific, so comparing the scores of the two assessments would provide educators with important information while helping test takers overcome skill gaps.

Career Readiness Certificates

Workforce readiness certificates (also known as career readiness certificates) are transportable documents that give workers a way to demonstrate to employers a certified skill level on specified, tested workplace skill sets. The three WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, are the basis of many of the career readiness certificates (CRCs). Most of these documents are issued by individual states, although ACT has also begun to issue a national certificate (ACT, 2007). The State of Alabama awarded its first career readiness certificate in August of 2005 using the same three assessments (Alabama Office of Workforce Development, n.d.b; Hendrick, 2006; La. offers portable certificates, 2004).

Summary

The review of literature provided an overview of topics related to the research questions to be answered in this paper. Workplace skills, the history of standardized testing in the United States, standardized testing, the quality of standardized tests, employment regulations in the United States, and WorkKeys were presented to give context and perspective to the study. The work conducted by the military over the last 90 years has been transferred to business and industry. All large organizations need tools to help in the process of putting people in jobs for which they are qualified. If qualified people are not available, then job analyses provide excellent starting places for instruction.

III. METHODS

Introduction

The purpose of this study was to determine what, if any, statistically significant differences exist in the scores of three WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, based on the age, race, and/or gender of the participants in the sample.
The researcher selected three of the WorkKeys assessments for this research. Applied Mathematics, Locating Information, and Reading for Information were chosen because they are the assessments most commonly used by industry and are the basis of Career Readiness Certificates (CRCs) conferred by several states, including Alabama (Alabama Office of Workforce Development, n.d.b; Hendrick, 2006; La. offers portable certificates, 2004). This chapter describes the participants of the study, the variables that were investigated, the design of the research, the instrument used, the procedures used in the study, and the types of analyses that were conducted.

Research Questions

The following research questions were used in this study:

1. What are the differences in the WorkKeys Applied Mathematics test scores based on age?
2. What are the differences in the WorkKeys Applied Mathematics test scores based on gender?
3. What are the differences in the WorkKeys Applied Mathematics test scores based on race?
4. What are the differences in the WorkKeys Locating Information test scores based on age?
5. What are the differences in the WorkKeys Locating Information test scores based on gender?
6. What are the differences in the WorkKeys Locating Information test scores based on race?
7. What are the differences in the WorkKeys Reading for Information test scores based on age?
8. What are the differences in the WorkKeys Reading for Information test scores based on gender?
9. What are the differences in the WorkKeys Reading for Information test scores based on race?
10. Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?
11. Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?
12. Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?

Participants

The participants in the study were adults age 19 and older who completed any of the three WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, through the WorkKeys testing center at a southeastern Alabama community college from 1998 to mid-2005. These participants were technical school students, technical school program applicants, job applicants for multiple employers, and incumbent workers of multiple employers. There were over 7,500 sets of scores provided by the community college; of that sample, 6,962 sets of test scores, supplied with matching demographic data, were usable. Each research question has a different sample size based on the tests taken and the information provided, as some participants did not take all of the assessments under consideration in this study, nor did all provide complete demographic data.

Age at the time of testing was calculated for the 5,814 participants in the sample who self-reported a date of birth, by subtracting the date of birth from the date of testing. All participants self-reported gender. WorkKeys provided the following ten options for race in the demographic reporting section: African-American/Black (Non-Hispanic); American Indian/Alaskan Native; Asian-American or Pacific Islander; Caucasian/White (Non-Hispanic); Mexican-American/Chicano; Puerto Rican; Cuban; Other Hispanic/Latino; Other; and prefer not to respond. Of the 6,962 sets of data, 6,340 identified race, a 91% response rate for the race variable.
Only two categories were large enough to compare: 56.7% of the responding group self-reported as Caucasian, and 38.3% as African-American.

There are no data identifying the participants, so the same person could have taken the tests on more than one occasion, causing duplication in the results. The sample size should be sufficient to decrease the potential error such duplication would cause (Hair, Black, Babin, Anderson, & Tatham, 2006).

Variables of Interest

This research involved three independent variables and three dependent variables. The independent variables were age, race, and gender. The dependent variables were the participants' test scores on the following WorkKeys assessments: Applied Mathematics, Locating Information, and Reading for Information.

Research Design

The instruments used in this research were three assessments from the WorkKeys group of skills assessments: Applied Mathematics, Locating Information, and Reading for Information. Each of these assessments has its own skill scale incorporated into the assessment; none of the skill scales are the same. The scores assigned to each skill level are not at common intervals, requiring the scores to be treated as ordinal. Therefore, non-parametric statistical techniques were used to analyze the data (ACT, 2001e). The first method used was the Mann-Whitney U test, a non-parametric analysis, to determine whether differences exist based on age, race, and gender. The second method was the Spearman rank correlation, which was used only with the continuous independent variable, age.

Sampling

The sample contained 6,962 usable records. Only adults age 19 and over were used in the study. The sample consisted of test takers who were technical school students, technical school program applicants, job applicants for multiple employers, and incumbent employees of multiple employers. All of the data were collected by one WorkKeys testing center in southeast Alabama.

Instrument

Three WorkKeys assessments were the instruments used in this research: Applied Mathematics, Locating Information, and Reading for Information. These assessments were developed in the late 1980s and early 1990s, during the same time frame in which the ASTD skills report and the Secretary's Commission on Achieving Necessary Skills (SCANS) report were researched and published (Carnevale, Gainer, & Meltzer, 1990; U.S. Department of Labor, 2000). The WorkKeys system was designed to provide "a metric which describes the skill requirements for individual jobs in terms of levels of proficiency" (McLarty & Vansickle, 1997, p. 294). WorkKeys is not just a battery of tests; the skill scales used to develop each assessment were also used in the corresponding job profiling process. The job analysis component differentiates the WorkKeys system from many standardized assessment tools. WorkKeys uses the same skill scales both to determine the skill levels of a particular job and to measure the skill levels possessed by individual test takers. This critical interactivity required that the scaling system be developed before the assessments or the job profiling system. ACT considered options and initially chose the Guttman scaling technique developed in the 1950s. For the Guttman technique to work, each skill set had to be hierarchical in nature. The developers also designed content and cognitive strands to organize the tests (McLarty & Vansickle, 1997).
The WorkKeys system assessments were designed to meet the following criteria:

1. The way in which the generic skill is assessed is generally congruent with the way the skill is used in the workplace.
2. The lowest level assessed is at approximately the level for which an employer would be interested in setting a standard.
3. The highest level assessed is at approximately the level beyond which specialized training would be required.
4. The steps between the lowest and highest levels are large enough to be distinguished and small enough to have practical value in determining workplace skills.
5. The assessments are sufficiently reliable for high-stakes decision making.
6. The assessments can be validated against empirical criteria.
7. The assessments are feasible with respect to administration time and complexity, as well as cost. (McLarty & Vansickle, 1997, p. 300)

After the Guttman-based approach was used in initial test development, the Item Response Theory (IRT) method of scaling was determined to be a better predictive tool. Therefore, the scoring system was changed to its current state (McLarty & Vansickle, 1997).

All of the WorkKeys assessments are criterion-referenced, as opposed to norm-referenced. Criterion-referenced assessments are referenced to specific job skills, with scoring based on skill levels. A test taker is tested against pre-set skill levels rather than having scores compared against population norms to define success.

Unlike norm-referenced assessment scores, the WorkKeys assessments use only four to five level score points in the reporting scale. These level scores are ordinal in nature as they form a hierarchy. Therefore, it is not useful or meaningful to describe the score distributions with means, standard deviations, or standard errors. Instead, numbers and percents of the examinees in the sample at or above each skill level are used to report the score distributions for the sample in this section. (ACT, 2001e)

The Applied Mathematics assessment tests the skills "involved with the application of mathematical reasoning to work-related problems" (ACT, 2001c, p. 16). "The Locating Information skill involves using information taken from workplace graphics such as diagrams, floor plans, tables, forms, graphs, charts, and instrument gauges" and using that information to make decisions or answer questions (ACT, 2001c, p. 19). "Reading and understanding work-related instructions and policies" is the skill tested in Reading for Information (ACT, 2001c, p. 22). Detailed characteristics of the levels of each of these assessments are located in Appendices C, D, and E.

Reliability

Test reliability is "how dependably or consistently a test measures a characteristic" (U.S. Dept. of Labor, 2000, p. 3-2). The reliability of a test is typically indicated by a reliability coefficient, which is a number between 0 and 1. According to the WorkKeys Technical Manual, reliability coefficients have "limited meaning for WorkKeys tests, [as] WorkKeys tests are primarily classification tests" (ACT, 2001e, p. 36). ACT provides "information about the percentage of examinees that would be classified in the same way on two applications of the same form or alternate forms" in the WorkKeys Technical Handbook (ACT, 2001c, p. 36). This predicted classification consistency is shown in Table 1 for the three WorkKeys assessments in this study.
Table 1
Predicted Classification Consistency

Type of Classification*    Applied Mathematics    Locating Information    Reading for Information
Exact                      52                     59                      50
>3                         94                     89                      96
>4                         84                     78                      90
>5                         81                     88                      78
>6                         91                     100                     84
>7                         97                     --                      96

* Exact classifications specify a specific skill level for the examinee; ">" classifications specify whether the examinee is at or above the indicated level (ACT, 2001, p. 39).

Since 2001, ACT has conducted additional reliability and validity studies of the WorkKeys assessments, especially Applied Mathematics and Reading for Information (Hendrick, 2006). ACT has evaluated some WorkKeys test scores in three categories that reflect test reliability: internal consistency, generalizability, and classification consistency (ACT, 2005). ACT reports an internal-consistency reliability coefficient of +0.92 for two forms of Reading for Information and Applied Mathematics as tested in 2002 and 2003. These values are considered high for the 30-item tests administered and reflect good internal consistency (ACT, 2005, as cited in Hendrick, 2006, p. 66).

Validity

Validity is considered the most important attribute of an assessment. It refers to what characteristic is being measured and how well the test measures that characteristic (U.S. Dept. of Labor, 2000). Gay and Airasian (2000) discussed the changes in the meaning of validity that have taken place in educational research over the years, including the three classic approaches to test validity (construct-related validity, content-related validity, and criterion-related validity) and the evolution of validity's meaning. Within the context of pre-employment testing, "construct-related validity requires a demonstration that the test measures the construct or characteristic it claims to measure, and that this characteristic is important to successful performance on the job" (U.S. Dept of Labor, 2000, p. 3-7). Content-related validity concerns whether the content of the test actually covers what the test claims to measure. Criterion-related validity requires some proof of a statistical relationship between test performance and job performance (Huck, 2004; U.S. Dept of Labor, 2000).

Gay and Airasian (2000) recognized that all types of validity relate and are connected to each other; however, there are other concerns, specifically the "concern over the consequences that arise from use of tests and measures" (Gay & Airasian, 2000, p. 162). Messick (1994) contended that the traditional view of validity and reliability should be enlarged to include other concepts, suggesting the addition of "the value implications of the scores' meaning" and the "social consequences" of the scores' use (Messick, 1994, p. 5).

Linn (2001) reported on the efforts by states to conduct educational assessments and hold schools or school systems accountable for the results of these assessments. Linn (2001) characterized these assessment and accountability programs as "high-stakes" programs, suggesting validity is the most fundamental consideration in the evaluation of the uses and interpretations of any assessments. But how should validity be investigated and reported? Validity encompasses many components, and Linn commented that "it is clearly not appropriate to make an unqualified statement that an assessment is valid" because "validity is specific to particular uses and interpretations" (Linn, 2001, p. 23). These views mirror those of Messick (1994): the same assessment could have a great deal of validity when used in one way, but little, if any, validity when used in a different manner.
Linn (2001) added further to the concept of validity and reliability regarding what is termed high-stakes testing, indicating that the higher the stakes of the assessment scores, the more information about error and margins of error should be a part of the reporting process (Linn, 2001).

A major component of validity in the WorkKeys process is the fact that a job analysis (profile) is done for each job, and skill levels are assigned for each assessment prior to testing. This job profile provides the documentation needed to support criterion-related and content validity (ACT, 2001e). Hendrick (2006) cited an unpublished ACT technical manual to present new data on validity testing that ACT has conducted:

ACT has offered construct-related evidence of test validity in a study of over 120,000 samples (ACT, 2005). This study compared the ACT Applied Mathematics test with the ACT Mathematics Test, with a correlation coefficient of +0.81 between number-correct (NC) scores on the two tests and +0.75 between scale scores on the two tests (ACT, 2005). Similar comparisons between the ACT Reading for Information test and the ACT Reading and ACT English tests resulted in correlations between NC scores of +0.66 and +0.71, respectively, and scale score correlations of +0.62 and +0.66, respectively. This comparative study indicated that the constructs tested in the WorkKeys Applied Mathematics and Reading for Information tests significantly correlated with the constructs tested in the ACT Mathematics and English tests. (p. 69)

Procedures

A proposal for the study was sent to the Institutional Review Board for the Use of Human Subjects in Research (IRB). Approval was granted in August 2007 (see Appendix A). Permission to use archival data was obtained by the researcher from the WorkKeys testing center at a community college in southeast Alabama (see Appendix B). The data were provided to the researcher on a CD by the WorkKeys testing center in an Access database. The data were converted from Access to Excel, then from Excel into SPSS. During that process, all identifying information and all records that did not meet demographic requirements were deleted from the database. Data converted to SPSS format were saved to a flash drive and locked in a file cabinet in the researcher's committee chair's office. All data used in the research came from the WorkKeys testing center.

Analysis

The data were analyzed using the Statistical Package for the Social Sciences (SPSS) software program. Non-parametric statistical methods were used because the dependent variables (assessment scores) were ordinal. The Mann-Whitney U test was used to determine any statistically significant differences between the independent variable groupings. The researcher used a Spearman rank correlation (rs) to assess the relationship between the variable age and the test scores.

Summary

Chapter III discussed the participants of the study, the variables that were investigated, the design of the research, the instrument used, the procedures used in the study, and the types of analyses that were conducted. The design of the instrument was reviewed, as was a summary of validity in high-stakes testing. Chapter IV will discuss the findings of the research.
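For readers who want to reproduce the style of analysis described above outside of SPSS, the following is a minimal sketch in Python using SciPy, run on hypothetical ordinal level scores; the group sizes and scores are invented for illustration and are not the study's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr

rng = np.random.default_rng(0)

# Hypothetical WorkKeys level scores (ordinal, levels 3-7).
younger = rng.integers(3, 8, size=300)   # test takers aged 19-39
older = rng.integers(3, 8, size=100)     # test takers aged 40 and older

# Mann-Whitney U: rank-based comparison of two independent groups.
u_stat, p_value = mannwhitneyu(younger, older, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")

# Spearman rank-order correlation between age and level score.
ages = rng.integers(19, 73, size=400)
scores = rng.integers(3, 8, size=400)
rho, p_rho = spearmanr(ages, scores)
print(f"r_s = {rho:.3f}, p = {p_rho:.3f}")
```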
IV. RESULTS

Purpose of the Study

The purpose of this research was to determine whether there were any statistically significant differences in the scores of the WorkKeys Applied Mathematics, Locating Information, and Reading for Information assessments based on three demographic independent variables. These independent variables were age, gender, and race.

Research Questions

The following research questions were used in this study:

1. What are the differences in the WorkKeys Applied Mathematics test scores based on age?
2. What are the differences in the WorkKeys Applied Mathematics test scores based on gender?
3. What are the differences in the WorkKeys Applied Mathematics test scores based on race?
4. What are the differences in the WorkKeys Locating Information test scores based on age?
5. What are the differences in the WorkKeys Locating Information test scores based on gender?
6. What are the differences in the WorkKeys Locating Information test scores based on race?
7. What are the differences in the WorkKeys Reading for Information test scores based on age?
8. What are the differences in the WorkKeys Reading for Information test scores based on gender?
9. What are the differences in the WorkKeys Reading for Information test scores based on race?
10. Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?
11. Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?
12. Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?

Introduction

The WorkKeys test scores of participants were examined to determine statistically significant differences based on the independent variables of age, race, and gender. Three of the nine WorkKeys assessments were chosen for this study: Applied Mathematics, Locating Information, and Reading for Information. These assessments were designed to measure the skill levels of test takers on each assessment's particular skill scale. These ordinal scales were designed individually, without interval measurement; therefore, non-parametric statistics were used to determine statistical significance. The Mann-Whitney U test was used to determine the relationship of each dependent variable with the independent variables. In addition, the Spearman rank-order correlation was used to determine the relationship between age and each of the assessment scores, as age provided a continuous attribute for conducting the correlation.

Participants

The participants in the study were adults, age 19 and older, who completed any of the three WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, through the WorkKeys testing center at a southeastern Alabama community college from 1998 to mid-2005. These participants were technical school students, technical school program applicants, job applicants for multiple employers, and incumbent workers of multiple employers. Over 7,500 sets of scores were provided by the community college, of which 6,962 sets of test scores, supplied with matching demographic data, were usable. Each research question had a different sample size based on the tests taken and the information provided, as some participants did not take all of the assessments under consideration in this study, nor did all provide complete demographic data.

Research Questions 1-9

To examine the first research question, "What are the differences in the WorkKeys Applied Mathematics test scores based on age?"
the sample population was divided into two groups for comparison. Test takers aged 19 to 39 comprised the first group, while the second group was composed of all test takers aged 40 and older. The age of 40 was chosen as the dividing point because the Age Discrimination in Employment Act of 1967 put all employees and applicants aged 40 and older into a protected group (U.S. Dept of Labor, 2000).

A Mann-Whitney U test was used to examine the difference in the Applied Mathematics scores between two groups of test takers: one group included all test takers aged 19 to 39, and the other group included all respondents aged 40 and older. Older test takers scored statistically significantly lower (mean rank = 2072.15) than the younger test takers (mean rank = 2486.82; U = 1750134.0; p < .001). Of the total of 4,767 respondents, 3,585 were aged 19-39 and 1,182 were aged 40 or older. There is a statistically significant difference between the scores of those 40 or older and the scores of those 19 to 39 (p < .001). See Tables 2 and 3. There is a large difference in the sizes of the two groups, which might have some statistical impact.

The Mann-Whitney U test is "the nonparametric equivalent of the independent t test" (Cronk, 2006, p. 90). Rather than comparing means, as occurs in an independent t test, the Mann-Whitney test converts scores into ranks, and the mean ranks of the groups are then compared (Munro, 2005). The U statistic is computed from the sums of the ranks; in this study, with over 6,000 participants, U is necessarily very large, so comparing the mean ranks provides the key information.

Table 2
Scores of Applied Mathematics by Age Group - Ranks

Age in Two Groups    n       Mean Rank    Sum of Ranks
19-39 yrs old        3585    2486.82      8915241
40 yrs old +         1182    2072.15      2449287
Total                4767

Table 3
Scores of Applied Mathematics by Age Group - Test Statistics

                          Applied Mathematics
Mann-Whitney U            1750134.0
Asymp. Sig. (2-tailed)    <.001

Findings for research question two, "What are the differences in the WorkKeys Applied Mathematics test scores based on gender?", were examined using a Mann-Whitney U test to determine the difference between the scores of male and female test takers. Female test takers scored statistically significantly lower (mean rank = 2643.30) than the male test takers (mean rank = 3076.18; U = 2883657.5; p < .001). Of the total of 5,569 respondents, 1,823 were male and 3,746 were female. There is a statistically significant difference between the scores of men and women on the Applied Mathematics assessment (p < .001). (See Tables 4 and 5.)

Table 4
Scores of Applied Mathematics by Gender - Ranks

Gender    n       Mean Rank    Sum of Ranks
Male      1823    3076.18      5607876.50
Female    3746    2643.30      9901788.50
Total     5569

Table 5
Scores of Applied Mathematics by Gender - Test Statistics

                          Applied Mathematics
Mann-Whitney U            2883657.5
Asymp. Sig. (2-tailed)    <.001

Research question three examined, "What are the differences in the WorkKeys Applied Mathematics test scores based on race?" Only two groups of the sample population were large enough to compare. The African-American population (n = 1890) was compared to the Caucasian population (n = 3032). A Mann-Whitney U test was used to examine the difference in the Applied Mathematics scores between the two groups of test takers. The African-American group scored statistically significantly lower (mean rank = 1605.78) than the Caucasian test takers (mean rank = 2994.91; U = 1247936.5; p < .001). There is a statistically significant difference between the scores of the African-American group and the scores of the Caucasian group (p < .001). (See Tables 6 and 7.)
Table 6
Scores of Applied Mathematics by Race - Ranks

Race                n       Mean Rank    Sum of Ranks
African American    1890    1605.78      3034931.50
Caucasian           3032    2994.91      9080571.50
Total               4922

Table 7
Scores of Applied Mathematics by Race - Test Statistics

                          Applied Mathematics
Mann-Whitney U            1247936.5
Asymp. Sig. (2-tailed)    <.001

The fourth research question examined, "What are the differences in the WorkKeys Locating Information test scores based on age?" The sample population was divided into two groups for comparison: the first group was those test takers aged 19 to 39, and the second group was all test takers 40 and older. The age of 40 was chosen as the dividing point because the Age Discrimination in Employment Act of 1967 put all employees and applicants aged 40 and older into a protected group (U.S. Dept of Labor, 2000).

A Mann-Whitney U test was used to examine the difference in the Locating Information scores between the two groups of test takers: one group included all test takers aged 19 to 39, and the other group included all respondents aged 40 and older. Older test takers scored statistically significantly lower (mean rank = 1737.39) than the younger test takers (mean rank = 2230.50; U = 1201935; p < .001). Of the total of 4,238 respondents, 3,284 were aged 19-39 and 954 were aged 40 or older. There is a statistically significant difference between the scores of those 40 or older and the scores of those 19 to 39 (p < .001). (See Tables 8 and 9.)

Table 8
Scores of Locating Information by Age Group - Ranks

Age in Two Groups    n       Mean Rank    Sum of Ranks
19-39 yrs old        3284    2230.50      7324971.00
40 yrs old +         954     1737.39      1657470.00
Total                4238

Table 9
Scores of Locating Information by Age Group - Test Statistics

                          Locating Information
Mann-Whitney U            1201935
Asymp. Sig. (2-tailed)    <.001

Examining the findings for research question five, "What are the differences in the WorkKeys Locating Information test scores based on gender?", a Mann-Whitney U test was used to inspect the difference between the scores of male and female test takers. There was no statistically significant difference in the scores of the two groups on the Locating Information assessment (p = .601). The sample totaled 5,130 respondents, of whom 1,737 were male and 3,393 were female. Tables 10 and 11 provide additional information.

Table 10
Scores of Locating Information by Gender - Ranks

Gender    n       Mean Rank    Sum of Ranks
Male      1737    2579.45      4480496.50
Female    3393    2558.36      8680518.50
Total     5130

Table 11
Scores of Locating Information by Gender - Test Statistics

                          Locating Information
Mann-Whitney U            2922597.5
Asymp. Sig. (2-tailed)    .601

When examining research question six, "What are the differences in the WorkKeys Locating Information test scores based on race?", only two groups of the sample population were large enough to compare. The African-American population (n = 1766) was compared to the Caucasian population (n = 2729).

A Mann-Whitney U test was used to examine the difference in the Locating Information scores between the two groups of test takers. The African-American group scored statistically significantly lower (mean rank = 1649.60) than the Caucasian test takers (mean rank = 2635.24; U = 1352929.5; p < .001). There is a statistically significant difference between the scores of the African-American group and the scores of the Caucasian group (p < .001). (See Tables 12 and 13.)
Table 12
Scores of Locating Information by Race - Ranks

Race                n       Mean Rank    Sum of Ranks
African American    1766    1649.60      2913190.50
Caucasian           2729    2635.24      7191569.50
Total               4495

Table 13
Scores of Locating Information by Race - Test Statistics

                          Locating Information
Mann-Whitney U            1352929.5
Asymp. Sig. (2-tailed)    <.001

To examine the seventh research question, "What are the differences in the WorkKeys Reading for Information test scores based on age?", a Mann-Whitney U test was used to examine the difference in the Reading for Information scores between two groups of test takers: one group included all test takers aged 19 to 39, and the other group included all respondents aged 40 and older. Older test takers scored statistically significantly lower (mean rank = 2225.65) than the younger test takers (mean rank = 2562.07; U = 2036517.5; p < .001). Of the total of 4,948 respondents, 3,660 were aged 19-39 and 1,288 were aged 40 or older. There is a statistically significant difference between the scores of those 40 or older and the scores of those 19 to 39 (p < .001). (See Tables 14 and 15.)

Table 14
Scores of Reading for Information by Age Group - Ranks

Age in Two Groups    n       Mean Rank    Sum of Ranks
19-39 yrs old        3660    2562.07      9377192.50
40 yrs old +         1288    2225.65      2866633.50
Total                4948

Table 15
Scores of Reading for Information by Age Group - Test Statistics

                          Reading for Information
Mann-Whitney U            2036517.5
Asymp. Sig. (2-tailed)    <.001

To examine the findings for research question eight, "What are the differences in the WorkKeys Reading for Information test scores based on gender?", a Mann-Whitney U test was used to examine the difference between the scores of male and female test takers. Male test takers scored lower (mean rank = 2837.50) than the female test takers (mean rank = 2934.78) on the Reading for Information assessment. The difference in the scores was statistically significant at the .05 level, but not at the .001 level (U = 3553841.0; p = .032). Of the total of 5,806 respondents, 1,867 were male and 3,939 were female. Tables 16 and 17 provide additional information.

Table 16
Scores of Reading for Information by Gender - Ranks

Gender    n       Mean Rank    Sum of Ranks
Male      1867    2837.50      5297619.00
Female    3939    2934.78      11560102.00
Total     5806

Table 17
Scores of Reading for Information by Gender - Test Statistics

                          Reading for Information
Mann-Whitney U            3553841.0
Asymp. Sig. (2-tailed)    .032

When examining research question nine, "What are the differences in the WorkKeys Reading for Information test scores based on race?", only two groups of the sample population were large enough to compare. The African-American population (n = 1766) was compared to the Caucasian population (n = 2729). A Mann-Whitney U test was used to examine the difference in the Reading for Information scores between the two groups of test takers. The African-American group scored statistically significantly lower (mean rank = 1649.60) than the Caucasian test takers (mean rank = 2635.24; U = 1352929.5; p < .001). There is a statistically significant difference between the scores of the African-American group and the scores of the Caucasian group (p < .001). (See Tables 18 and 19.)

Table 18
Scores of Reading for Information by Race - Ranks

Race                n       Mean Rank    Sum of Ranks
African American    1766    1649.60      2913190.50
Caucasian           2729    2635.24      7191569.50
Total               4495

Table 19
Scores of Reading for Information by Race - Test Statistics

                          Reading for Information
Mann-Whitney U            1352929.5
Asymp. Sig. (2-tailed)    <.001

Research Questions 10-12

For research questions ten, eleven, and twelve, a Spearman rank-order correlation was conducted to determine whether there was a relationship between age and the scores of the three WorkKeys assessments.
The findings showed an inverse relationship between age and the scores of the assessments: scores tended to go down as age went up. To show the age range of the participants, Tables 20 and 21 are included in these results; they provide detailed frequency information about the sample used in this research.

Table 20
Age of participants when taking the assessments - Totals

N    Valid      5814
     Missing    1148
     Total      6962

Table 21
Age of participants when taking the assessments - Frequencies

Valid Age           Frequency    Percent    Valid Percent
19                  545          7.8        9.4
20                  403          5.8        6.9
21                  276          4.0        4.7
22                  279          4.0        4.8
23                  242          3.5        4.2
24                  213          3.1        3.7
25                  149          2.1        2.6
26                  179          2.6        3.1
27                  185          2.7        3.2
28                  167          2.4        2.9
29                  154          2.2        2.6
30                  179          2.6        3.1
31                  157          2.3        2.7
32                  179          2.6        3.1
33                  150          2.2        2.6
34                  145          2.1        2.5
35                  113          1.6        1.9
36                  146          2.1        2.5
37                  135          1.9        2.3
38                  105          1.5        1.8
39                  125          1.8        2.1
40                  121          1.7        2.1
41                  105          1.5        1.8
42                  125          1.8        2.1
43                  112          1.6        1.9
44                  123          1.8        2.1
45                  99           1.4        1.7
46                  93           1.3        1.6
47                  94           1.4        1.6
48                  90           1.3        1.5
49                  94           1.4        1.6
50                  70           1.0        1.2
51                  69           1.0        1.2
52                  57           .8         1.0
53                  51           .7         .9
54                  41           .6         .7
55                  49           .7         .8
56                  37           .5         .6
57                  29           .4         .5
58                  27           .4         .5
59                  27           .4         .5
60                  17           .2         .3
61                  14           .2         .2
62                  15           .2         .3
63                  11           .2         .2
64                  6            .1         .1
65                  4            .1         .1
66                  2            .0         .0
68                  2            .0         .0
69                  3            .0         .1
72                  1            .0         .0
Total               5814         83.5       100.0
Missing (System)    1148         16.5
Total               6962         100.0

Studying the findings for research question ten, "Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?", a Spearman rank-order correlation was used to examine the relationship between the scores of the Applied Mathematics test and the ages of the test takers. As noted in Table 22, there is an inverse relationship between test scores and age; that is, as test takers get older, they tend to score lower on the Applied Mathematics assessment. Spearman's rho for this comparison was rs = -.118. This coefficient shows an inverse relationship that is not particularly strong, but it does exist.

Table 22
Spearman's Rho (rs) for Applied Mathematics and Age

                                                      Age at Test    Applied Mathematics Score
Age at Test                  Correlation Coefficient  1.000          -.118
                             Sig. (2-tailed)                         .000
                             N                        5814           4768
Applied Mathematics Score    Correlation Coefficient  -.118          1.000
                             Sig. (2-tailed)          .000
                             N                        4768           5655

To examine the findings for research question eleven, "Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?", a Spearman rank-order correlation was used to determine the relationship between the scores of the Locating Information test and the ages of the test takers. As noted in Table 23, there is an inverse relationship between test scores and age; that is, as test takers get older, they tend to score lower on the Locating Information assessment. Spearman's rho for this comparison was rs = -.149. This coefficient shows an inverse relationship that is not particularly strong, but it does exist.

Table 23
Spearman's Rho (rs) for Locating Information and Age

                                                      Age at Test    Locating Information Score
Age at Test                   Correlation Coefficient 1.000          -.149
                              Sig. (2-tailed)                        <.001
                              N                       5814           4238
Locating Information Score    Correlation Coefficient -.149          1.000
                              Sig. (2-tailed)         <.001
                              N                       4238           5202
To examine the findings for research question twelve, "Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?", a Spearman rank-order correlation was used to examine the relationship between the scores of the Reading for Information test and the ages of the test takers. As noted in Table 24, there is an inverse relationship between test scores and age; that is, as test takers get older, they tend to score lower on the Reading for Information assessment. Spearman's rho for this comparison was rs = -.034. This coefficient shows an inverse relationship that is weak, but it does exist.

Table 24
Spearman's Rho (rs) for Reading for Information and Age

                                                         Age at Test    Reading for Information Score
Age at Test                      Correlation Coefficient 1.000          -.034
                                 Sig. (2-tailed)                        .016
                                 N                       5814           4949
Reading for Information Score    Correlation Coefficient -.034          1.000
                                 Sig. (2-tailed)         .016
                                 N                       4949           5906

Table 25 summarizes this information. It provides the significance levels for each of the independent variables with each of the dependent variables from the Mann-Whitney U tests, and it includes the Spearman rho correlations of age with each of the dependent variables. The data presented in Table 25 indicate statistically significant differences between the scores of each group for all assessments based on age and race. There is also a statistically significant difference in the scores of men and women on the Applied Mathematics assessment; there is no statistically significant difference between men and women in the Locating Information scores, and the gender difference in the Reading for Information scores is significant only at the .05 level. The Spearman rho correlations show an inverse relationship between age and scores on all of the assessments. These relationships are relatively mild.

Table 25
Significance levels and correlation coefficients

                                          Applied Mathematics    Locating Information    Reading for Information
Mann-Whitney U test significance level
  Age                                     p < .001               p < .001                p < .001
  Gender                                  p < .001               p = .601                p = .032
  Race                                    p < .001               p < .001                p < .001
Spearman rho correlation coefficient
  Age                                     rs = -.118             rs = -.149              rs = -.034

Note. All tests were non-parametric (two-tailed).

Summary

The results of the study are provided in Chapter IV. The study was a comparison of the test scores of three WorkKeys assessments, Applied Mathematics, Locating Information, and Reading for Information, by three independent demographic variables: age, race, and gender. The test scores of each of the WorkKeys assessments are ordinal, so non-parametric methods were used in the comparisons. Using the Mann-Whitney U test, there were statistically significant differences on all assessments by race and age. Gender results were mixed, with statistically significant differences only for the Applied Mathematics assessment. To determine the relationship between age and test scores, a Spearman rank-order correlation was run. The results showed a mild inverse relationship between age and test scores: there was a trend for people to score lower on the assessments as they got older, but the correlation was not strong.

V. SUMMARY, DISCUSSION, IMPLICATIONS AND RECOMMENDATIONS

Purpose of the Study

The purpose of this research was to determine if there were any statistically significant differences in the scores of the WorkKeys Applied Mathematics, Locating Information, and Reading for Information assessments based on three demographic independent variables. These independent variables were age, gender, and race.
Research Questions

The following research questions were used in this study:

1. What are the differences in the WorkKeys Applied Mathematics test scores based on age?
2. What are the differences in the WorkKeys Applied Mathematics test scores based on gender?
3. What are the differences in the WorkKeys Applied Mathematics test scores based on race?
4. What are the differences in the WorkKeys Locating Information test scores based on age?
5. What are the differences in the WorkKeys Locating Information test scores based on gender?
6. What are the differences in the WorkKeys Locating Information test scores based on race?
7. What are the differences in the WorkKeys Reading for Information test scores based on age?
8. What are the differences in the WorkKeys Reading for Information test scores based on gender?
9. What are the differences in the WorkKeys Reading for Information test scores based on race?
10. Is there any statistically significant correlation between age and WorkKeys Applied Mathematics test scores?
11. Is there any statistically significant correlation between age and WorkKeys Locating Information test scores?
12. Is there any statistically significant correlation between age and WorkKeys Reading for Information test scores?

Summary

As American businesses have competed in the global economy, the need for skilled workers has become more acute (Friedman, 2005). The American educational system has struggled to provide businesses and industry with needed skilled workers, with mixed results. These results have propelled businesses to seek ways to measure the skills of potential employees prior to hiring. One of the most commonly used methods of skill determination has been pre-employment testing (Agard, 2003). There are many types of pre-employment assessments, including interviews, presentations, simulations, and tests. For this research, three tests from the ACT WorkKeys battery, Applied Mathematics, Locating Information, and Reading for Information, were chosen as the focus of this study. These three assessments have been commonly used in testing for industry and are the basis of many state Career Readiness Certificates, including that of the State of Alabama (Alabama Office of Workforce Development, n.d.b; CRC Consortium, 2007).

In the early 1960s, civil rights legislation mandated the removal of discrimination from employment practices, including hiring, on the basis of race, color, sex, religion, or national origin. The Age Discrimination in Employment Act of 1967 added people aged 40 and older to the list of protected groups (U.S. Dept of Labor, 2000). Standardized testing has been used for over 100 years in the U.S. Armed Forces to assign new recruits to jobs where they could be productive and proficient. After WWII, many military methods moved to the private sector as veterans rejoined the civilian workforce. Pre-employment testing was conducted by many companies and government agencies until civil rights and employment legislation changed legal hiring practices to stop adverse impact on members of protected groups (i.e., women and minorities). A pattern of adverse impact is present in almost all standardized tests, not just pre-employment tests. Therefore, to legally use cognitive testing as a part of the hiring process, businesses had to demonstrate the job-relatedness of the assessment.
This study analyzed test scores to determine if any statistically significant differences existed between the identified groups, some of whose members belong to protected classes. The sample for this study used 6,962 sets of scores with self-reported demographics from one WorkKeys testing center in Alabama. The sample consisted of test takers aged 19 and older. The group included technical school students, technical school program applicants, job applicants for multiple employers, and incumbent employees of multiple employers. The results of the study found statistically significant differences in the scores of all the WorkKeys assessments on the basis of racial group and age, and mixed results in the scores of the assessments between males and females.

Discussion

This research compared three independent variables and three dependent variables. The independent variables were age, race, and gender. The dependent variables were the participants' test scores on the following WorkKeys assessments: Applied Mathematics, Locating Information, and Reading for Information. In the literature review process, the researcher found no independent research related to the age of test takers and the WorkKeys assessments. This became an area of focus for the research. A previous study investigated differences based on race, gender, and educational levels, using the scores of two assessments, Applied Mathematics and Reading for Information (Barnes, 2002). Other studies involved analyzing the results of standardized tests on the basis of race and gender (Sackett, Schmitt, Ellingson, & Kabin, 2001; Hall, Davis, Bolen, & Chia, 1999). For many years, researchers have been comparing the scores of standardized tests in the United States on the basis of racial identity. Sackett, Schmitt, Ellingson, and Kabin (2001) stated that "racial group differences are repeatedly observed in scores on standardized knowledge, skill, ability and achievement tests" (p. 302).

The fact that scores from all three assessments showed a significant difference based on age encouraged the researcher to analyze the data further. A correlation analysis revealed an inverse relationship between age and each of the sets of scores (i.e., increases in age related to lower scores), but the relationships were not very strong. The correlation coefficients were r_s = -.118 for Applied Mathematics, r_s = -.149 for Locating Information, and r_s = -.034 for Reading for Information.

The first nine research questions looked for statistically significant differences in the scores of each of the three assessments, Applied Mathematics, Locating Information, and Reading for Information, with each of the independent variables of age, race, and gender. The last three research questions looked at the level of correlation between age and the scores of the same three assessments.

Implications

As companies have closed, relocated, or experienced downsizing, older Americans have been required to apply for new jobs for the first time in many years. Many of the older applicants have never before been faced with high-stakes pre-employment testing. Employers who use pre-employment testing as a part of the hiring process may be required to study adverse impact issues or defend required skill levels (U.S. Dept of Labor, 2000). Job applicants aged 40 and older are in a legally protected class, and great consideration should be given to testing fairness for older test takers.
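One way employers study the adverse impact issues noted above is to compare selection rates across groups; a common screen associated with the Uniform Guidelines on Employee Selection Procedure (1978) is the four-fifths rule. The sketch below is a minimal, hypothetical illustration: the pass counts, group labels, and the selection_rate helper are invented for the example and are not drawn from this study's data.

    # Hypothetical illustration of an adverse impact screen based on the
    # four-fifths rule associated with the Uniform Guidelines (1978).
    # All counts and the selection_rate helper are invented for this example.

    def selection_rate(passed: int, tested: int) -> float:
        """Proportion of a group's test takers who met the cut score."""
        return passed / tested

    # Hypothetical pass counts for applicants under 40 and applicants aged 40+
    rate_under_40 = selection_rate(passed=160, tested=200)  # 0.80
    rate_40_plus = selection_rate(passed=90, tested=150)    # 0.60

    # Impact ratio: the lower group's rate relative to the highest group's rate
    impact_ratio = rate_40_plus / max(rate_under_40, rate_40_plus)

    # A ratio below 0.80 is commonly treated as evidence of possible
    # adverse impact that warrants further review
    if impact_ratio < 0.80:
        print(f"Impact ratio {impact_ratio:.2f}: possible adverse impact")
    else:
        print(f"Impact ratio {impact_ratio:.2f}: within the four-fifths guideline")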
Recommendations

Further research should be conducted to consider the effects of age on the scores of the WorkKeys assessments and to determine effective use of the assessments by companies and state governments. One possible accommodation for older workers would be to allow a longer amount of time for testing. A reasonable accommodation of extended time may prove to be an effective method of minimizing adverse impact. Such an accommodation would require that no one from the hiring company administer the testing or be able to identify one group from another. By law, the company should not consider the age of an applicant in determining whether someone meets the criteria for the job. Allowing all applicants over age 40 to test for a longer period could cause the following potential problems: (1) the company would immediately identify which members of the hiring pool were aged 40 or over, and (2) younger job applicants could perceive the older group as having an unfair advantage in the testing process.

There are differences in experience between 19- and 20-year-olds, who were recently exposed to many standardized tests (e.g., the Stanford Achievement Test, the Alabama Graduation Examination), and someone who graduated from high school 32 years ago. That 50-year-old worker whose plant recently closed did not have to take a graduation exam to obtain a diploma 32 years ago when she graduated. She has never seen a bubble sheet, and she is intimidated by computers. Is the testing taking place on a level playing field?

There are things adult educators and companies can attempt that might minimize the differences. One idea is to provide classes on test-taking skills. This would allow everyone to learn the language of testing, as well as become comfortable with the process of a timed test. Samson (1985) conducted a meta-analysis of training programs for test-taking skills. The research suggested "that programs of training in test-taking skills produce, on average, significant improvement in students' scores on achievement tests" (Samson, 1985, p. 265). Programs that lasted for periods of five weeks or longer had a significantly greater impact, which the researcher concluded was due to a larger number of contact hours (Samson, 1985).

Another option is to attempt to lower the test anxiety of test takers. There is a limited number of things that can be done in a timed test, but every effort should be made to make test takers comfortable during the process. Most, if not all, standardized tests are likely to cause adverse impact for one or more protected groups. Companies can protect themselves by doing thorough job analyses and identifying the skill levels required for their specific jobs. Further evaluation of standardized testing procedures is warranted.

ACT has developed an excellent tool for establishing the job-relatedness of pre-employment testing. If job profiles are conducted according to the standards provided by ACT, then it is a simple process to show that the WorkKeys scores required for any particular job match the skill scales created for that job by the job profiling procedure. The issue here is not whether WorkKeys is an effective pre-employment testing system; it is that older test takers may need some type of preparation before they are able to show what they really know in a timed testing environment.
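To make the profile-matching idea above concrete, the following is a minimal sketch of how an applicant's WorkKeys levels might be checked against the minimum levels identified by a job profile. The profile values, the score values, and the meets_profile helper are hypothetical illustrations only; they do not reproduce ACT's job profiling procedure.

    # Hypothetical sketch: checking an applicant's WorkKeys levels against the
    # minimum levels identified by a job profile. The profile, the scores, and
    # the meets_profile helper are invented; this is not ACT's procedure.

    job_profile = {  # minimum level required for each assessment (hypothetical)
        "Applied Mathematics": 5,
        "Locating Information": 4,
        "Reading for Information": 5,
    }

    applicant_scores = {  # levels earned by one applicant (hypothetical)
        "Applied Mathematics": 6,
        "Locating Information": 4,
        "Reading for Information": 4,
    }

    def meets_profile(profile: dict, scores: dict) -> bool:
        """True only if every profiled skill is at or above its required level."""
        return all(scores.get(skill, 0) >= level for skill, level in profile.items())

    for skill, required in job_profile.items():
        earned = applicant_scores.get(skill, 0)
        print(f"{skill}: required {required}, earned {earned}")

    print("Meets the job profile:", meets_profile(job_profile, applicant_scores))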
Recommendations for further research include conducting similar research in different geographic regions of the country, conducting similar research with other WorkKeys assessments, and conducting research on the impact of warm-up questions before the tests as well as the impact of test-taking skills training for older adults. A longitudinal study of scores in a specific geographic region would also be an area for further research.

REFERENCES

A comparison of career-related assessment tools/models. (1999). San Francisco: WestEd.
ACT. (1995). WorkKeys: Targets for instruction. Iowa City, IA: ACT, Inc.
ACT. (1997a). WorkKeys: The WorkKeys System. Iowa City, IA: ACT, Inc.
ACT. (1997b). WorkKeys targets for instruction: Applied mathematics. Iowa City, IA: ACT, Inc.
ACT. (1997c). WorkKeys targets for instruction: Reading for information. Iowa City, IA: ACT, Inc.
ACT. (1997d). WorkKeys targets for instruction: Locating information. Iowa City, IA: ACT, Inc.
ACT. (1999). Work Keys: Helping you build a winning workforce. Iowa City, IA: ACT, Inc.
ACT. (2000). WorkKeys: Job profiling and profiler qualifications. Iowa City, IA: ACT, Inc.
ACT. (2001a). WorkKeys technical handbook. Iowa City, IA: ACT, Inc.
ACT. (2001b). WorkKeys: Assessments. Iowa City, IA: ACT, Inc.
ACT. (2001e). WorkKeys technical handbook: Under revision. Iowa City, IA: ACT, Inc.
ACT. (2003). WorkKeys: EEOC compliance. Iowa City, IA: ACT, Inc.
ACT. (2004a). WorkKeys: The WorkKeys System. Iowa City, IA: ACT, Inc.
ACT. (2004b). WorkKeys: An overview. Iowa City, IA: ACT, Inc.
ACT. (2007). National Career Readiness Certificate. Retrieved July 10, 2007, from http://www.act.org/certificate/index.html
ACT. (2007a). Level 3 Applied Mathematics Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/math/sample3.html
ACT. (2007b). Level 4 Applied Mathematics Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/math/sample4.html
ACT. (2007c). Level 5 Applied Mathematics Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/math/sample5.html
ACT. (2007d). Level 6 Applied Mathematics Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/math/sample6.html
ACT. (2007e). Level 7 Applied Mathematics Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/math/sample7.html
ACT. (2007f). Level 3 Locating Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/locate/sample3.html
ACT. (2007g). Level 4 Locating Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/locate/sample4.html
ACT. (2007h). Level 5 Locating Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/locate/sample5.html
ACT. (2007i). Level 6 Locating Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/locate/sample6.html
ACT. (2007j). Level 3 Reading for Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/reading/sample3.html
ACT. (2007k). Level 4 Reading for Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/reading/sample4.html
ACT. (2007l). Level 5 Reading for Information Sample Item. Retrieved September 24, 2007, from http://www.act.org/workkeys/assess/reading/sample5.html
ACT. (2007m). Level 6 Reading for Information Sample Item.
Retrieved September 24, 2007 from http://www.act.org/workkeys/assess/reading/sample6.html. ACT. (2007n). Level 7 Reading for Information Sample Item. Retrieved September 24, 2007 from http://www.act.org/workkeys/assess/reading/sample7.html. Agard, A. (2003). Pre-employment skills testing: An important step in the hiring process. Supervision, 64(6), 7-8 Age Discrimination in Employment Act of 1967. Retrieved July 7, 2007 from Equal Employment Opportunity Commission website: http://www.eeoc.gov/policy/adea.html. Alabama Office of Workforce Development. (n.d. a) Alabama Worker Credentialing Summit! Retrieved July 5, 2007, from: http://www.owd.alabama.gov/CRC/Worker%20Credentialing%20Summit.htm Alabama Office of Workforce Development. (n.d. b) Alabama?s Career Readiness Certificate. Retrieved July 5, 2007, from: http://www.owd.alabama.gov/CRC/CRC.htm 82 American Educational Research Association, American Psychological Association & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, D.C: American Educational Research Association. American Management Association. (1998, October). Workplace testing and monitoring. Management Review. Applegate, T. (1999). The work keys workforce. Techniques: Making education & career connections. 74(1), 51-53. Ary, D., Jacobs, L.C., & Razavieh, A. (2002). Introduction to research in education (6 th ed.) Belmont, CA: Wadworth/Thomson Learning. Bailey, T. (1997). Changes in the nature of work: Implications for skills and assessment. In H. F. O?Neill, Jr. (Ed.), Workforce readiness: Competencies and assessment (pp. 293-325). Mahaw, NJ: Lawrence Erlbaum Associates. Barnes, S.L. (2002) Applied Mathematics and Reading for Information Scores on The American College Testing (ACT) WorkKeys Assessment: Comparing groups by race, gender, and educational level (Doctoral dissertation, Auburn University, 2002). ProQuest Information and Learning, (UMI No. 3057131) Belton, H. D. (2000). American College Testing Work Keys assessments and individual variables of one-year technical completers in a selected community college in Mississippi (Doctoral dissertation, The University of Southern Mississippi, 2000). ProQuest-CSA LLC. (I.D. No. 732066021) 83 Bolin, B. (2005). WorkKeys: Certification of important workplace skills - an idea whose time has come. Paper presented at the National WorkKeys Conference, Chicago. May 2005. Buchanan, B. L. (2000). A comparative study of the Tests of Adult Basic Education and Work Keys with an incarcerated population. (Doctoral dissertation, Texas A&M University, 2000). ProQuest-CSA LLC. (I.D. No. 728323461) CRC Consortiium (2007). The Career Readiness Certificate. Retrieved July 10, 2007 from http://www.crcconsortium.org/crcc.htm. Carnevale, A.P., Gainer, L.J, & Meltzer, A.S. (1990). Workplace basics: The essential skills employers want. San Francisco, CA: Josey-Bass Publishers Challenger, J. (2003). Solving the looming labor crisis. USA Today Magazine, 132(2698) 28-30. Civil Rights Act of 1964, (1997), Civil Rights Act of 1964. Retrieved June 27, 2007, from Academic Search Premier database. Civil Rights Act of 1991. (1997). Civil Rights Act of 1991, Retrieved June 27, 2007, from Academic Search Premier database. Cronk, B. C. (2006). How to use SPSS: A step-by-step guide to analysis and interpretation (4 th ed.). Glendale, CA: Pyrczak Publishing. D., J. (2006, August). WorkKeys now holds the key to hiring. Training, 43(8), 12-13. 
Retrieved July 12, 2007, from Business Source Premier database.
Equal Employment Opportunity Commission. (n.d.). EEOC pre 1965: Events leading to the creation of EEOC. Retrieved July 12, 2007, from http://www.eeoc.gov/abouteeoc/35th/pre1965/index.html
Flynn, G. (1999, July). Pre-employment testing can be unlawful. Workforce, 78, 82-83.
Friedman, T. L. (2005). The world is flat: A brief history of the twenty-first century. New York: Farrar, Straus and Giroux.
Gay, L. R., & Airasian, P. (2000). Educational research: Competencies for analysis and application (6th ed.). Upper Saddle River, NJ: Prentice-Hall, Inc.
Gottfredson, L. S. (1994, November). The science and politics of race-norming. American Psychologist, 49(11), 955-963.
Grubb, W. N. (1997). Dick and Jane at work: The new vocationalism and occupational literacy programs. In G. Hull (Ed.), Changing work, changing workers: Critical perspectives on language, literacy, and skills (pp. 159-188). Albany, NY: State University of New York Press.
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate data analysis (6th ed.). Upper Saddle River, NJ: Prentice-Hall.
Hall, C. W., Davis, N. B., Bolen, L. M., & Chia, R. (1999). Gender and racial differences in mathematical performance. The Journal of Social Psychology, 139(6), 677-689.
Hardcastle, A., & Mann, C. (2005). A survey of the pulp and paper industry in Washington and Oregon. Olympia, WA: Washington State University.
Heller, D. E., & Shapiro, D. T. (2001, April). Legal challenges to high-stakes testing: A case of disparate impact in Michigan? Paper presented at the Annual Meeting of the American Educational Research Association, Seattle, WA.
Hendrick, R. Z. (2006). Evaluating Work Keys profiling as a pre-employment assessment tool to increase employee retention (Doctoral dissertation, Old Dominion University, 2006).
Huck, S. W. (2004). Reading statistics and research (4th ed.). Boston: Pearson Education Inc.
Latham, A. S. (1997). Gender differences on assessments. Educational Leadership, 55(4), 88-89.
La. workforce program offers portable certificates: Uses ACT's 'WorkKeys' assessments to help certify workplace readiness levels. (2004). Vocational Training News, 35(1), 12-10.
Leiber, L. D. (2007, Spring). As average age of U.S. workforce increases, age-discrimination verdicts rise. Employment Relations Today (Wiley), 34(1), 105-110. Retrieved July 13, 2007, from Business Source Premier database.
Linn, R. L. (2001). The design and evaluation of educational assessment and accountability systems (CSE Technical Report).
McLarty, J. R., & Vansickle, T. R. (1997). Assessing employability skills: The work keys system. In H. F. O'Neill, Jr. (Ed.), Workforce readiness: Competencies and assessment (pp. 293-325). Mahwah, NJ: Lawrence Erlbaum Associates.
Messick, S. (1994). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning (ETS Research Report).
Mountain View Coll. (1996, September 1). SCANS: The Secretary's Commission on Achieving Necessary Skills/U.S. Department of Labor. A practical guide for identifying and using SCANS competencies in technical occupational programs. (ERIC Document Reproduction Service No. ED387590) Retrieved July 5, 2007, from ERIC database.
Munro, B. H. (2005). Statistical methods for health care research (5th ed.). Philadelphia, PA: Lippincott Williams & Wilkins.
Nicholson, G. (2000, October). Pre-employment screening.
Workforce, 79(10), 72. Retrieved July 7, 2007, from Academic Search Premier database.
Noble, J., Davenport, M., Schiel, J., Pommerich, M., & American Coll. Testing Program, I. (1999, October 1). High school academic and non-cognitive variables related to the ACT scores of racial/ethnic and gender groups. (ERIC Document Reproduction Service No. ED435669) Retrieved November 19, 2007, from ERIC database.
O'Neill, H. F., Jr., Allred, K., & Baker, E. L. (1997). Review of workforce readiness theoretical frameworks. In H. F. O'Neill, Jr. (Ed.), Workforce readiness: Competencies and assessment (pp. 293-325). Mahwah, NJ: Lawrence Erlbaum Associates.
Overtoom, C. (2000). Employability skills: An update (ERIC Digest No. 220). (ERIC Document Reproduction Service No. ED445236)
Patel, D. (2002, February). Testing, testing, testing. HR Magazine, 47(2), 112.
Patterson, M. (2000). Overcoming the hiring crunch: Tests deliver informed choices. Employment Relations Today, 27(3), 77-88.
Pendleton Act of 1883. (1997). Retrieved August 23, from Academic Search Premier database.
Sackett, P. R., Schmitt, N., Ellingson, J. E., & Kabin, M. B. (2001). High-stakes testing in employment, credentialing, and higher education: Prospects in a post-affirmative-action world. American Psychologist, 56(4), 302-318.
Sackett, P. R., & Wilk, S. L. (1994, November). Within-group norming and other forms of score adjustment in preemployment testing. American Psychologist, 49(11), 929-954.
Samson, G. (1985, May). Effects of training in test-taking skills on achievement test performance: A quantitative synthesis. Journal of Educational Research, 78(5). Retrieved November 17, 2007, from Academic Search Premier database.
Section 60-3, Uniform Guidelines on Employee Selection Procedure (1978); 43 FR 38295 (August 25, 1978).
Southern Company. Employment testing. Retrieved April 29, 2006, from http://www.southernco.com/careerinfo/employment_testing.asp?mnuOpco=&mnuType=&mnuItem=
Sticht, T. G. (1997). Assessing foundation skills for work. In H. F. O'Neill, Jr. (Ed.), Workforce readiness: Competencies and assessment (pp. 255-292). Mahwah, NJ: Lawrence Erlbaum Associates.
Theriault, S. (2003, February). Patronage, the Pendleton Act, and the power of the people. Journal of Politics, 65(1), 50-68. Retrieved August 23, 2007, from Academic Search Premier database.
U.S. Department of Labor, Employment and Training Administration. (2000). Testing and assessment: An employer's guide to good practices. Retrieved April 10, 2007, from http://www.onetcenter.org/dl_files/empTestAsse.pdf
Williams, R. L., & Bowman, J. S. (2007, Spring). Civil service reform, at-will employment, and George Santayana: Are we condemned to repeat the past? Public Personnel Management, 36(1), 65-77. Retrieved July 14, 2007, from Business Source Premier database.
Wonderlic, Inc. (n.d.). Our history. Retrieved July 13, 2007, from http://www.wonderlic.com/about/history.asp
Woodard, C. (2005, January). Merit by any other name - Reframing the civil service first principle. Public Administration Review, 65(1), 109-116. Retrieved July 14, 2007, from Business Source Premier database.
Zehr, M. A. (1998). Work Keys job-skills assessment finally catching on. Education Week, 17(25). Retrieved March 4, 2007, from Academic Search Premier database.

APPENDICES

APPENDIX A
Approval of Research by IRB

APPENDIX B
Data Permission Letter

APPENDIX C
Characteristics of the WorkKeys Assessments (ACT, 2004, pp. 2-3)
Applied Mathematics

Level 3
Characteristics of Items:
- Translate easily from a word problem to a math equation
- All needed information is presented in logical order
- No extra information
Skills:
- Solve problems that require a single type of mathematics operation (addition, subtraction, multiplication, and division) using whole numbers
- Add or subtract negative numbers
- Change numbers from one form to another using whole numbers, fractions, decimals, or percentages
- Convert simple money and time units (e.g., hours to minutes)

Level 4
Characteristics of Items:
- Information may be presented out of order
- May include extra, unnecessary information
- May include simple charts, diagrams, or graphs
Skills:
- Solve problems that require one or two operations
- Multiply negative numbers
- Calculate averages, simple ratios, simple proportions, or rates using whole numbers and decimals
- Add commonly known fractions, decimals, or percentages (e.g., 1/2, .75, 25%)
- Add three fractions that share a common denominator
- Multiply a mixed number by a whole number or decimal
- Put information in the right order before performing calculations

Level 5
Characteristics of Items:
- Problems require several steps of logic and calculation (e.g., a problem may involve completing an order form by totaling the order and then computing tax)
Skills:
- Decide what information, calculations, or unit conversions to use to solve the problem
- Look up a formula and perform single-step conversions within or between systems of measurement
- Calculate using mixed units (e.g., 3.5 hours and 4 hours and 30 minutes)
- Divide negative numbers
- Find the best deal using one- and two-step calculations and then comparing results
- Calculate perimeters and areas of basic shapes (rectangles and circles)
- Calculate percentage discounts or markups

Level 6
Characteristics of Items:
- May require considerable translation from verbal form to mathematical expression
- Generally require considerable setup and involve multiple-step calculations
Skills:
- Use fractions, negative numbers, ratios, percentages, or mixed numbers
- Rearrange a formula before solving a problem
- Use two formulas to change from one unit to another within the same system of measurement
- Use two formulas to change from one unit in one system of measurement to a unit in another system of measurement
- Find mistakes in items that belong at Levels 3, 4, and 5
- Find the best deal and use the results for another calculation
- Find areas of basic shapes when it may be necessary to rearrange the formula, convert units of measurement in the calculations, or use the result for further calculations
- Find the volume of rectangular solids
- Calculate multiple rates

Level 7
Characteristics of Items:
- Content or format may be unusual
- Information may be incomplete or implicit
- Problems often involve multiple steps of logic and calculation
Skills:
- Solve problems that include nonlinear functions and/or that involve more than one unknown
- Find mistakes in Level 6 items
- Convert between systems of measurement that involve fractions, mixed numbers, decimals, and/or percentages
- Calculate multiple areas and volumes of spheres, cylinders, or cones
- Set up and manipulate complex ratios or proportions
- Find the best deal when there are several choices
- Apply basic statistical concepts

APPENDIX D
Characteristics of the WorkKeys Assessments (ACT, 2004, p. 9)

Locating Information

Level 3
Characteristics of Graphics:
- Elementary workplace graphics such as simple order forms, bar graphs, tables, flowcharts, maps, instrument gauges, or floor plans
- One graphic used at a time
Skills:
- Find one or two pieces of information in a graphic
- Fill in one or two pieces of information that were missing from a graphic

Level 4
Characteristics of Graphics:
- Straightforward workplace graphics such as basic order forms, diagrams, line graphs, tables, flowcharts, instrument gauges, or maps
- One or more graphics are used at a time
Skills:
- Find several pieces of information in one or more graphics
- Understand how graphics are related to each other
- Summarize information from one or more straightforward graphics
- Identify trends shown in one or more straightforward graphics
- Compare information and trends shown in one or more straightforward graphics

Level 5
Characteristics of Graphics:
- Complicated workplace graphics, such as detailed forms, tables, graphs, diagrams, maps, or instrument gauges
- Graphics may have less common formats
- One or more graphics are used at a time
Skills:
- Sort through distracting information
- Summarize information from one or more detailed graphics
- Identify trends shown in one or more detailed or complicated graphics
- Compare information and trends from one or more complicated graphics

Level 6
Characteristics of Graphics:
- Very complicated and detailed graphs, charts, tables, forms, maps, and diagrams
- Graphics contain large amounts of information and may have challenging formats
- One or more graphics are used at a time
- Connections between graphics may be subtle
Skills:
- Draw conclusions based on one complicated graphic or several related graphics
- Apply information from one or more complicated graphics to specific situations
- Use the information to make decisions

APPENDIX E
Characteristics of the WorkKeys Assessments (ACT, 2004, pp. 12-13)

Reading for Information

Level 3
Characteristics of Reading Materials and Items:
- Reading materials include basic company policies, procedures, and announcements
- Reading materials are short and simple, with no extra information
- Reading materials tell readers what they should do
- All needed information is stated clearly and directly
- Items focus on the main points of the passages
- Wording of the questions and answers is similar or identical to the wording used in the reading materials
Skills:
- Identify main ideas and clearly stated details
- Choose the correct meaning of a word that is clearly defined in the reading
- Choose the correct meaning of common, everyday and workplace words
- Choose when to perform each step in a short series of steps
- Apply instructions to a situation that is the same as the one in the reading materials

Level 4
Characteristics of Reading Materials and Items:
- Reading materials include company policies, procedures, and notices
- Reading materials are straightforward, but have longer sentences and contain a number of details
- Reading materials use common words, but do have some harder words, too
- Reading materials describe procedures that include several steps
- When following the procedures, individuals must think about changing conditions that affect what they should do
- Questions and answers are often paraphrased from the passage
Skills:
- Identify important details that may not be clearly stated
- Use the reading material to figure out the meaning of words that are not defined
- Apply instructions with several steps to a situation that is the same as the situation in the reading materials
- Choose what to do when changing conditions call for a different action (follow directions that include "if-then" statements)

Level 5
Characteristics of Reading Materials and Items:
- Policies, procedures, and announcements include all of the information needed to finish a task
- Information is stated clearly and directly, but the materials have many details
- Materials also include jargon, technical terms, acronyms, or words that have several meanings
- Application of information given in the passage to a situation that is not specifically described in the passage
- There are several considerations to be taken into account in order to choose the correct actions
Skills:
- Figure out the correct meaning of a word based on how the word is used
- Identify the correct meaning of an acronym that is defined in the document
- Identify the paraphrased definition of a technical term or jargon that is defined in the document
- Apply technical terms and jargon and relate them to stated situations
- Apply straightforward instructions to a new situation that is similar to the one described in the material
- Apply complex instructions that include conditionals to situations described in the materials

Level 6
Characteristics of Reading Materials and Items:
- Reading materials include elaborate procedures, complicated information, and legal regulations found in all kinds of workplace documents
- Complicated sentences with difficult words, jargon, and technical terms
- Most of the information needed to answer the items is not clearly stated
Skills:
- Identify implied details
- Use technical terms and jargon in new situations
- Figure out the less common meaning of a word based on the context
- Apply complicated instructions to new situations
- Figure out the principles behind policies, rules, and procedures
- Apply general principles from the materials to similar and new situations
- Explain the rationale behind a procedure, policy, or communication

Level 7
Characteristics of Reading Materials and Items:
- Very complex reading materials
- Information includes a lot of details
- Complicated concepts
- Difficult vocabulary
- Unusual jargon and technical terms are used, but not defined
- Writing often lacks clarity and direction
- Readers must draw conclusions from some parts of the reading and apply them to other parts
Skills:
- Figure out the definitions of difficult, uncommon words based on how they are used
- Figure out the meaning of jargon or technical terms based on how they are used
- Figure out the general principles behind the policies and apply them to situations that are quite different from any described in the materials

APPENDIX F
Sample Questions for Applied Mathematics

Sample Question for Level 3 (ACT, 2007a)
Sample Question for Level 4 (ACT, 2007b)
Sample Question for Level 5 (ACT, 2007c)
Sample Question for Level 6 (ACT, 2007d)
Sample Question for Level 7 (ACT, 2007e)

APPENDIX G
Sample Questions for Locating Information

Sample Question for Level 3 (ACT, 2007f)
Sample Question for Level 4 (ACT, 2007g)
Sample Question for Level 5 (ACT, 2007h)
Sample Question for Level 6, Parts A and B (ACT, 2007i)

APPENDIX H
Sample Questions for Reading for Information

Sample Question for Level 3 (ACT, 2007j)
Sample Question for Level 4 (ACT, 2007k)
Sample Question for Level 5 (ACT, 2007l)
Sample Question for Level 6 (ACT, 2007m)
Sample Question for Level 7 (ACT, 2007n)