WWW.ELEARNINGGUILD.COM
August 2006

Usage and Value of Kirkpatrick's Four Levels of Training Evaluation
Research Report
Analysis and Commentary by Joe Pulichino
There is an old adage commonly cited in management circles: "If you can't measure it, you can't manage it." Most executives would likely agree that managing their organization's training function is an essential responsibility of the enterprise. Therefore, measuring and evaluating the effectiveness of that function must be a priority of the professionals charged with this responsibility. Yet, a major challenge faces these professionals: how best to perform such measurement and evaluation, and report the results in a timely, cost-effective, and useful manner. Where can they find a method or system to address this challenge?

© 2006 The eLearning Guild. All rights reserved. http://www.eLearningGuild.com
Many training professionals turn to Kirkpatrick's four levels because, over the course of forty-seven years in the literature, it has become an industry standard for evaluating training programs. First described by Donald Kirkpatrick in 1959, this standard provides a simple taxonomy comprising four criteria of evaluation (Kirkpatrick originally called them steps or segments, but over the years they have become known as levels). The structure of the taxonomy suggests that each level builds on the one before it. The first level measures the student's reaction to the training, and the second measures what the student learned. The third level measures change in on-the-job behavior due to the training, and the fourth measures results in terms of specific business and financial goals and objectives for the organization. Theoretically, one level of evaluation leads to the next.
Yet, despite its status as an industry standard, many studies, including one conducted by Kirkpatrick himself, have shown that the full taxonomy is not widely used beyond the first two levels. This pattern of usage means that training practitioners may not be fully measuring, and therefore not effectively managing, the impact that training and development has on two of the most important reasons for funding and providing resources for training in the first place: improvements in workplace performance and positive business or organizational results. Several important questions arise. Why aren't all the levels of the taxonomy, as described by Kirkpatrick, used more widely by training professionals? If the measurement of training is a critical task, and the industry boasts a standard for evaluation that is almost fifty years old, then why does so much important measurement remain undone? Certainly, one reason is that measurement at each succeeding "level" is more complex, more time consuming, and therefore more expensive. Kirkpatrick also proposes that another reason is the lack of expertise among practitioners to conduct these higher levels of evaluation. Nonetheless, as the results of this study show, many organizations do use Kirkpatrick's four levels and derive value from this practice, especially at Levels 3 and 4.

This report provides a detailed examination of the current usage of Kirkpatrick's four levels, especially Levels 3 and 4. It includes an assessment of the current frequency of usage of Kirkpatrick's four levels; the reasons why organizations do or do not use Kirkpatrick Levels 3 and 4; the value of the data that organizations obtain from usage of Kirkpatrick Levels 3 and 4; the challenges that must be overcome to implement Level 3 and 4 evaluations; and the characteristics and practices of those organizations that have wrestled with the challenges of implementing Levels 3 and 4.

The Guild would like to thank Guild Research Committee members Dr. David J. Brand of 3M Corporation, Dr. Warren Longmire of Apple Computer, Dr. Maggie Martinez of The Training Place, and Dr. Richard Smith of General Dynamics for their valuable contributions to this report.

Guide to the Report
Pages 3-4: Demographics (Qs 1 to 5)
Page 5: Respondents' Knowledge of Kirkpatrick's Four Levels of Training Evaluation (Q6)
Pages 7-8: Usage of Kirkpatrick's Four Levels of Training Evaluation (Q7)
Pages 9-15: Specific Usage and Value of Kirkpatrick Level 3 (Qs 8 to 11)
Pages 16-17: Why Organizations Do Not Use Kirkpatrick Level 3 (Q12)
Pages 18-24: Specific Usage and Value of Kirkpatrick Level 4 (Qs 13 to 16)
Pages 25-26: Why Organizations Do Not Use Kirkpatrick Level 4 (Q17)
Pages 27-28: Organizational Attributes That Influence Usage of Kirkpatrick Levels 3 and 4 (Qs 18 to 20)
Page 28: Summary
Page 29: To Learn More About This Subject
Page 30: About the Guild, About the Research Committee, About the Author
Demographics

We asked our respondents to identify themselves and their organizations by five attributes: their role in their organization, the size of their organization, the type of their organization, their organization's primary business focus, and the department they work for. This section presents the demographic data of our survey sample. This survey, like all other Guild surveys, was open to Guild Members and Associates as well as to occasional website visitors. These surveys are completed by accessing the survey link on the homepage of the Guild website. Naturally, Guild Members and Associates are more likely than non-members to participate, because each of the more than 22,100 Members and Associates receives an email notifying them of the survey and inviting them to participate. For this reason, we can classify this survey as a random sample because all Members have an opportunity to participate, and their participation is random.
Q1. What is your role in your organization? (Select only one)
Executive ("C" level and VPs): 4%
Management: 31%
Instructional Designer: 38%
Instructor, Teacher, or Professor: 9%
Course Developer: 7%
Other: 11%
A respondent to this survey is most likely to be working as an instructional designer (38%), although more than one-third are in executive or management roles (35%). There are almost as many instructors, teachers, or professors (9%) as course developers (7%), and those who selected "Other" (11%) are mostly individual consultants, students, and technical or other professional staff.

Q2. What is the size of your organization (based on number of employees)? (Select only one)
Under 100: 19%
101 to 500: 11%
501 to 2,500: 18%
2,501 to 10,000: 21%
10,001 to 50,000: 18%
50,001 or more: 13%

Our respondents work in organizations of all sizes. Organizations with 2,501 to 10,000 employees have the highest frequency (21%) and those with 101 to 500 employees have the lowest (11%), a range of only 10 percentage points across the six size categories.

Q3. What type of organization do you work for? (Select only one)
Corporation — Not a learning or e-Learning vendor: 54%
Corporation — Learning or e-Learning vendor: 16%
College or University: 8%
Government: 8%
Non-profit organization: 7%
Other: 5%
Membership association: 1%
Military: 1%

By a significant majority, our respondents work in corporate environments (70%), divided between e-Learning product or service providers (16%) and those corporations that are not in the e-Learning business (54%). Institutions of higher education and government each make up 8% of the sample.
The most frequently cited primary business focuses of our respondents' organizations are "Commercial Training or Education Services" (9%), "Technology (Hardware or Software)" (9%), and "Higher Education" (9%), followed by "Healthcare" (8%), "Other" (8%), "Financial Services" (7%), and "Insurance" (7%). Just under half of the respondents (43%) selected one of the remaining fifteen sectors.
Q4. What is the primary business focus of your organization? (Select only one)
Commercial Training or Education Services: 9%
Technology (Hardware or Software): 9%
Higher Education: 9%
Healthcare: 8%
Other: 8%
Financial Services: 7%
Insurance: 7%
Manufacturing: 6%
Government: 6%
Professional Business Services or Consulting: 5%
Pharmaceuticals or Biosciences: 4%
Telecommunications: 4%
Banking: 4%
Retail or Wholesale: 3%
Non-profit: 2%
Transportation: 2%
Military: 2%
Utilities: 2%
Hospitality, Travel, or Food Service: 1%
Publishing, Advertising, Media, or PR: 1%
K-12: 1%
Petroleum or Natural Resources: 0%
Q5. What department do you work in? (Select only one)
Training or Education: 63%
Other: 11%
Human Resources: 10%
Information Technology: 8%
Sales or Marketing: 2%
Engineering or Product Development: 2%
Customer Service: 2%
Research and Development: 2%

A majority of our respondents work in a "Training or Education" department (63%), followed at a distance by "Human Resources" (10%) and "Information Technology" (8%). Those who selected "Other" (11%) are mostly independent consultants, or those who work in small or non-traditional organizations that do not have these types of departmental structures.
Respondents' Knowledge of Kirkpatrick's Four Levels of Training Evaluation

We asked our respondents to rate on a scale of 1 to 5 their level of knowledge of Kirkpatrick's four levels of training evaluation.
Q6. Rate on a scale of 1 to 5 your level of knowledge of Kirkpatrick's four levels of training evaluation. (Select only one)
Average Rating: 3.71
5 = Highly knowledgeable: 28%
4 = Very knowledgeable: 34%
3 = Fairly knowledgeable: 26%
2 = Not very knowledgeable: 6%
1 = Not at all knowledgeable: 6%
Given that the eLearning Guild is a community of practice for e-Learning professionals, we expected that the respondents to this survey would be quite knowledgeable about an industry standard such as Kirkpatrick's four levels of training evaluation. Indeed, they are. Although this assessment of their knowledge is self-reported and subjective, well over half of the respondents (62%) claim that they are "Highly knowledgeable" (28%) or "Very knowledgeable" (34%). Only 12% report that they are "Not very knowledgeable" (6%) or "Not at all knowledgeable" (6%).
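The report does not state how the "Average Rating" figures are computed, but they are consistent with a frequency-weighted mean of the 1-to-5 responses. A minimal sketch, assuming the published (rounded) percentages are used as the weights:

```python
def average_rating(distribution):
    """Frequency-weighted mean of a 1-to-5 rating scale.

    distribution maps each rating (1-5) to the percentage of
    respondents who selected it.
    """
    total = sum(distribution.values())
    return sum(rating * pct for rating, pct in distribution.items()) / total

# Q6 self-reported knowledge: 28% rated 5, 34% rated 4, 26% rated 3,
# 6% rated 2, and 6% rated 1.
q6 = {5: 28, 4: 34, 3: 26, 2: 6, 1: 6}
print(round(average_rating(q6), 2))  # 3.72, vs. the reported 3.71
```

Applying the same calculation to the detailed Q7 distributions reproduces the reported averages (4.34, 3.57, 2.65, and 2.11) to within the rounding of the published percentages, which accounts for the small discrepancies.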
Background on Kirkpatrick's Four Levels of Training Evaluation

For those readers who may not be familiar with the history of Kirkpatrick's four levels, we offer the following background (Pulichino 2006):

The four-level approach to training evaluation had its genesis in Kirkpatrick's doctoral studies in the early 1950s. He was writing a dissertation on the evaluation of a supervisory training program, and in the course of completing this study, he concluded that a proper, comprehensive evaluation needed to include not only the measure of the trainees' reactions to the program and what they learned due to the training, but also "the extent of their change in behavior after they returned to their jobs, and any final results that were achieved by participants after they returned to work" (Kirkpatrick, 1996, p. 55). He limited the scope of his study, however, and chose not to include the measurement of the latter two criteria of evaluation. It is ironic that even in his initial brush with the idea of this taxonomy he himself chose not to use what later became known as Levels 3 and 4. Nonetheless, the concepts of these four classifications of training evaluation were born then and there in Kirkpatrick's mind.

The world of corporate and industrial training in the 1950s was quite different than it is today. Although centralized human resources and training departments sponsored and conducted most training, just as they still do in most organizations, training programs and courses were almost exclusively classroom-based and led by an instructor or subject matter expert. Computer-assisted self-study was still in its infancy, and the possibility of blending the classroom experience with pre-class and post-class asynchronous e-Learning was literally decades away.

In addition, human capital development as a strategy for competitive advantage did not enjoy the same level of acceptance that it does today, and there was far less need to provide employees with continuing education for professional development in order to maintain a knowledgeable and skilled workforce. As a result, there was much less job security in the training department. Finally, the task of training evaluation was an almost non-existent discipline, one that had never before been required, asked for, or practiced. As Kirkpatrick describes the environment he was working in at the time, "Training professionals were struggling with the word 'evaluation.' There was no common language and no easy way to communicate what evaluation meant and how to accomplish it" (Kirkpatrick & Kirkpatrick, 2005, p. 3).

It was in this environment that Kirkpatrick published a series of four articles in the Journal of the American Society of Training Directors (Kirkpatrick 1959a, 1959b, 1960a, 1960b), entitled Evaluating Training Programs. These articles are now out of print and unavailable; however, the January 1996 issue of Training & Development reprinted them in condensed form. The purpose of these articles was "... to stimulate training directors to increase their efforts in evaluating training programs" (Kirkpatrick, 1996, p. 55). Kirkpatrick recognized from his dissertation experience that training professionals at the time had very little practice in evaluation, and almost no tools or processes to help them with this task. He reasoned that with some guidance and structure he could at least give his industry peers a place to start.

Kirkpatrick offered three reasons for evaluating training. In addition to determining how to improve future programs and whether to continue existing programs, Kirkpatrick argues that the third reason is "... to justify the existence of the training department" (Kirkpatrick, 1994, p. 18). Therefore, one of Kirkpatrick's primary objectives was to give training professionals some guidelines and suggestions for showing their management that the efforts of the training department had value and were worth their cost.

In these articles, Kirkpatrick proposed that evaluating training is a four-step process, with each step leading to the next in succession from one to four. He named and defined the four steps or segments as (1) "reaction," or "how well trainees like a particular program"; (2) "learning," or "a measure of the knowledge acquired, skills improved, or attitudes changed due to training"; (3) "behavior," or "a measure of the extent to which participants change their on-the-job behavior because of training"; and (4) "results," or "a measure of the final results that occur due to training, including increased sales, higher productivity, bigger profits, reduced costs, less employee turnover, and improved quality" (Kirkpatrick, 1996, pp. 54-56).

Kirkpatrick describes these steps in quite general terms, yet he readily acknowledges that the level of work and expertise required by each successive step in the evaluation process is more complex and difficult than in its predecessor. Kirkpatrick concludes his presentation of the four steps with the hope that "... the training directors who have read and studied these articles are now clearly oriented on the problems and approaches in evaluating training" (Kirkpatrick, 1996, p. 59).

Kirkpatrick describes his four steps as an orientation, a way of breaking down a complex process involving many variables and data collection challenges into four clearly delineated and logically ordered parts, which are theoretically sequential in nature, but only loosely connected in practice. For example, he wants practitioners to see that they can get started with the evaluation process by completing the relatively simple task of measuring the students' reaction to a course. At the same time he recognizes that the information gleaned in the succeeding steps will be relatively more significant even as the steps become more difficult to design and implement. He suggested, "When training directors effectively measure participants' reactions and find them favorable, they can feel proud. But they should also feel humble; the evaluation has only just begun" (p. 55).

In anticipation of the criticism that was yet to come, Kirkpatrick points out quite clearly that a positive evaluation at one of the steps does not guarantee or even imply that there will be a positive evaluation at another step. In doing so, he admits that there may not be a correlation among the results of the four steps of evaluation, but as will be shown, he often implies that there should be, without offering a theoretical or researchable basis for such a claim. "Even though a training director may have done a masterful job measuring trainees' reactions, that's no assurance that any learning has taken place. Nor is that an indication that participants' behavior will change because of the training. Still farther away is any indication of results that one can attribute to the training" (p. 55).

Kirkpatrick also acknowledges that evaluation at steps 3 (behavior) and 4 (results) is more difficult than at steps 1 (reaction) and 2 (learning) because these steps require "... a more scientific approach and the consideration of many factors ..." (p. 58) such as motivation to improve, work environment, and opportunity to practice the newly acquired knowledge or skills. He refers to the problem of the "separation of variables," which raises the question of what other factors, in addition to the training, might have affected the behavior and results. These intervening variables certainly impact results at Levels 3 and 4, but are not necessarily within the purview or the range of experience of most training evaluation practitioners. Kirkpatrick is clear that "Eventually, we may be able to measure human relations training in terms of dollars and cents. But at the present time, our research techniques are not adequate" (p. 59).

These four articles lay the groundwork for a simple approach to evaluating training that Kirkpatrick hoped would be enough to get training professionals started. He did not know that this approach would become the de facto industry standard in the ensuing decades. His aim was more modest: "It's hoped that the training directors who have read and studied these articles are now clearly oriented on the problems and approaches in evaluating training. We training people should carefully analyze future articles to see whether we can borrow the techniques and procedures described" (p. 59). Kirkpatrick wanted to jump-start the industry with his four simple steps to evaluation in the hope that practitioners would work things out as they used this approach, buying time and resources as they evolved and refined the practice. The findings presented in this report provide a glimpse of how far today's training practitioners, as represented by Members and Associates of The eLearning Guild community, have evolved and refined the practice.
Usage of Kirkpatrick's Four Levels of Training Evaluation

In 1968, less than ten years after the publication of his original four articles in the ASTD Journal, Kirkpatrick and Ralph Catalanello, a graduate student at the University of Wisconsin, published the results of a research study meant "... to determine and analyze current techniques being used by business, industry, and government in the evaluation of their training programs" (Catalanello & Kirkpatrick, 1968, p. 2). In this article, they refer to an "evaluation process" that they used to conduct the analysis, which is described in Chapter 5 of the 1967 edition of the ASTD Training and Development Handbook. The process described in that chapter is the same "four-step approach" that Kirkpatrick first introduced in his 1959 publications and has been touting at industry conferences ever since. Kirkpatrick and Catalanello focused their study on the frequency of usage of each of the four steps among a sample population of 110 business enterprises. They found that 78% of the respondents attempted to measure trainee reaction (Level 1), but that less than half of the respondents were measuring learning (Level 2), behavior (Level 3), or results (Level 4). Based on this research, Kirkpatrick and Catalanello concluded that the practice of evaluation remained a nascent discipline among training practitioners, that trainee reaction was the most frequently measured criterion, and that in regard to the "more important and difficult steps" of evaluation (i.e., learning, behavior, and results) there was "... less and less being done, and many of these efforts are superficial and subjective" (p. 9). In conclusion, they express a hope "... that future surveys and research projects will find that the 'state of the art' is far advanced from where we find it today" (p. 9).
Q7. Summary of Average Ratings and Percentages of Usage of Kirkpatrick's Four Levels
7a. Level 1: "Reaction — How students react to the training" (Average Rating: 4.34; "Always" or "Frequently": 85%)
7b. Level 2: "Learning — The extent to which students change attitudes, improve knowledge, and/or increase skill as a result of the training" (Average Rating: 3.57; "Always" or "Frequently": 57%)
7c. Level 3: "Behavior — The extent to which on-the-job behavior or performance has changed and/or improved as a result of the training" (Average Rating: 2.65; "Always" or "Frequently": 20%)
7d. Level 4: "Results — The extent to which desired business and/or organizational results have occurred as a result of the training" (Average Rating: 2.11; "Always" or "Frequently": 13%)
These results are similar to those of many studies conducted in the years since Kirkpatrick's 1968 research, including several recent studies published by the Guild (e.g., Metrics: Learning Outcomes and Business Results and Metrics and Measurement 2005 Research Report). The data from such studies generally show that usage of the Kirkpatrick four levels declines with each succeeding level, and that usage of Levels 3 and 4 is consistently below 50%, in many cases at the lower levels reported in these findings. Note in chart 7c that Level 3 evaluations are "Never" or "Rarely" conducted by 47% of respondents' organizations, and in chart 7d that Level 4 evaluations are "Never" or "Rarely" conducted by an even larger 74%. Granted, there are other evaluation methods and systems, and some of these organizations may use them instead of Kirkpatrick. The point remains, however, that even after almost fifty years in practice, usage of Levels 3 and 4 has not grown as significantly as Kirkpatrick might have hoped.
Detailed Average Ratings and Percentages of Usage of Kirkpatrick's Four Levels

7a. Kirkpatrick Level 1: "Reaction — How students react to the training" (Average Rating: 4.34)
5 = Always: 62%
4 = Frequently: 23%
3 = Sometimes: 7%
2 = Rarely: 5%
1 = Never: 3%

7b. Kirkpatrick Level 2: "Learning — The extent to which students change attitudes, improve knowledge, and/or increase skill as a result of the training" (Average Rating: 3.57)
5 = Always: 19%
4 = Frequently: 38%
3 = Sometimes: 28%
2 = Rarely: 11%
1 = Never: 4%

7c. Kirkpatrick Level 3: "Behavior — The extent to which on-the-job behavior or performance has changed and/or improved as a result of the training" (Average Rating: 2.65)
5 = Always: 6%
4 = Frequently: 14%
3 = Sometimes: 33%
2 = Rarely: 34%
1 = Never: 13%

7d. Kirkpatrick Level 4: "Results — The extent to which desired business and/or organizational results have occurred as a result of the training" (Average Rating: 2.11)
5 = Always: 4%
4 = Frequently: 9%
3 = Sometimes: 13%
2 = Rarely: 41%
1 = Never: 33%
Specific Usage and Value of Kirkpatrick Level 3

Note: We asked respondents whose organizations "Never" or "Rarely" use Kirkpatrick Level 3 to skip Questions 8 through 11, because these questions pertain specifically to usage of Kirkpatrick Level 3. Therefore, only the responses of those respondents whose organizations "Sometimes," "Frequently," or "Always" use Kirkpatrick Level 3 are included in the data presented for Questions 8 through 11.
Question 8. The reasons why respondents’ organizations use Kirkpatrick Level 3. In regard to their organization’s use of Kirkpatrick Level 3, we asked our respondents to rate on a scale of 1 - 5 the importance of each of the following reasons why their organization uses Kirkpatrick Level 3 to evaluate training programs. We provided respondents with a selection of six reasons why organizations might use Kirkpatrick Level 3, including three reasons proposed by Kirkpatrick himself: to gain information on how to improve future programs, to decide whether to continue existing programs, and to justify the existence of the training department. To these three we added two reasons concerning measurement of the specific criteria of Level 3 (change in behavior or performance) and one reason concerning justification of the training budget.
Q8. Summary of Average Ratings and Percentages of the Reasons Why Respondents' Organizations Use Kirkpatrick Level 3 Evaluations
8a. To demonstrate the actual impact that training has on employee on-the-job performance (Average Rating: 4.17; "Highly Important" or "Very Important": 80%)
8b. To gain information on how to improve future training programs (Average Rating: 4.02; "Highly Important" or "Very Important": 78%)
8c. To determine that the desired change in employee on-the-job performance has been achieved (Average Rating: 4.01; "Highly Important" or "Very Important": 74%)
8d. To decide whether to continue or discontinue a training program (Average Rating: 3.22; "Highly Important" or "Very Important": 44%)
8e. To justify the budget allocated to the design and delivery of training (Average Rating: 3.18; "Highly Important" or "Very Important": 42%)
8f. To justify the existence of the training department by showing how it contributes to the organization's objectives and goals (Average Rating: 3.17; "Highly Important" or "Very Important": 44%)
Our respondents whose organizations use Level 3 indicate that the most important reason to do so is “To demonstrate the actual impact that training has on employee on-the-job performance.” This reason is followed closely by “To gain information on how to improve future training programs” and “To determine that the desired change in employee on-the-job performance has been achieved.” One of Kirkpatrick’s three reasons, “To justify the existence of the training department ...” is the least important. Perhaps these organizations are more sophisticated in their approach to employee development and, as such, the justification of the training is implicit, and the organization’s desire to measure and manage its impact on employee on-the-job performance is strong and well supported.
Detailed Average Ratings and Percentages of Reasons for Usage of Kirkpatrick Level 3

8a. To demonstrate the actual impact that training has on employee on-the-job performance (Average Rating: 4.17)
5 = Highly important: 42%
4 = Very important: 38%
3 = Fairly important: 17%
2 = Not very important: 2%
1 = Not at all important: 1%

8b. To gain information on how to improve future training programs (Average Rating: 4.02)
5 = Highly important: 30%
4 = Very important: 48%
3 = Fairly important: 16%
2 = Not very important: 5%
1 = Not at all important: 1%

8c. To determine that the desired change in employee on-the-job performance has been achieved (Average Rating: 4.01)
5 = Highly important: 35%
4 = Very important: 39%
3 = Fairly important: 20%
2 = Not very important: 5%
1 = Not at all important: 1%

8d. To decide whether to continue or discontinue a training program (Average Rating: 3.22)
5 = Highly important: 11%
4 = Very important: 33%
3 = Fairly important: 29%
2 = Not very important: 23%
1 = Not at all important: 4%

8e. To justify the budget allocated to the design and delivery of training (Average Rating: 3.18)
5 = Highly important: 13%
4 = Very important: 29%
3 = Fairly important: 31%
2 = Not very important: 18%
1 = Not at all important: 9%

8f. To justify the existence of the training department by showing how it contributes to the organization's objectives and goals (Average Rating: 3.17)
5 = Highly important: 17%
4 = Very important: 27%
3 = Fairly important: 23%
2 = Not very important: 21%
1 = Not at all important: 12%
Question 9. The Value of Level 3 Evaluation Data. We asked our respondents to rate on a scale of 1 to 5 the value to their organization of the data obtained from Kirkpatrick Level 3 evaluations in terms of measuring a) the effectiveness of training programs and b) the desired change in employee on-the-job performance.
Q9. Summary of Average Ratings and Percentages of the Value of Evaluation Data in Terms of Measuring Two Outcomes
9a. The desired change in employee on-the-job performance (Average Rating: 3.92; "Highly Valuable" or "Very Valuable": 72%)
9b. The effectiveness of training programs (Average Rating: 3.89; "Highly Valuable" or "Very Valuable": 68%)
Those respondents whose organizations use Kirkpatrick Level 3 evaluation report that the data they obtain are quite valuable, both in terms of measuring "The desired change in employee on-the-job performance" and "The effectiveness of training programs." Significantly, 0% of respondents report that these data have no value, and very few (3% to 5%) indicate that they are not very valuable. These high levels of data value for such a large group hint at several possibilities. First, our sample population of Level 3 practitioners must be following some best practices in order to obtain this quality of data and then to apply those data to the proper evaluation criteria. Second, these data and the best practices followed may be associated with the specific intervening variables measured during the process (see Question 10). Third, it would seem that, if done properly, Level 3 evaluation is well worth doing.

Detailed Average Ratings and Percentages of the Value of Evaluation Data in Terms of Measuring Two Outcomes
9a. The desired change in employee on-the-job performance (Average Rating: 3.92)
5 = Highly valuable: 27%
4 = Very valuable: 45%
3 = Fairly valuable: 23%
2 = Not very valuable: 5%
1 = Not at all valuable: 0%

9b. The effectiveness of training programs (Average Rating: 3.89)
5 = Highly valuable: 24%
4 = Very valuable: 44%
3 = Fairly valuable: 29%
2 = Not very valuable: 3%
1 = Not at all valuable: 0%
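Each reported average rating is simply the mean of the 1 - 5 responses, which can be recomputed from the percentage distributions above. As a sketch (the function and variable names are ours, not the report's), in Python:

```python
def average_rating(distribution):
    """Weighted mean of a 5-point scale.

    `distribution` maps each rating (1-5) to the percentage of
    respondents who chose it. Dividing by the sum of the percentages
    (rather than by 100) absorbs small rounding drift in the
    published figures.
    """
    total = sum(distribution.values())
    return sum(rating * pct for rating, pct in distribution.items()) / total

# Item 9b, "The effectiveness of training programs":
q9b = {5: 24, 4: 44, 3: 29, 2: 3, 1: 0}
print(round(average_rating(q9b), 2))  # 3.89, matching the reported average
```

Because the published percentages are rounded to whole numbers, a recomputed mean can differ from the reported average by a point or two in the second decimal place.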
Question 10. Consideration of Intervening Variables When Conducting Kirkpatrick Level 3 Evaluations. We asked our respondents to rate on a scale of 1 - 5 the extent to which their organizations' Kirkpatrick Level 3 evaluations include consideration of each of several intervening variables. One of the difficulties of evaluating the effectiveness of training programs at the level of "behavior" or "performance" is that so many different variables outside of the training program purview may affect achieving or not achieving the desired outcomes. In an attempt to determine the extent to which Level 3 practitioners consider some of these variables in the evaluation process, we provided respondents with a selection of five intervening variables.
Q10. Summary of Average Ratings and Percentages of Frequency of Consideration of Intervening Variables When Conducting Kirkpatrick Level 3 Evaluations

10a. Whether the student has learned successfully as a result of the training
Average Rating: 4.00; "Always" or "Frequently": 73%

10b. Whether the student has the opportunity to apply what was learned in practice and/or on-the-job situations
Average Rating: 3.99; "Always" or "Frequently": 71%

10c. Whether the student perceives that the training has satisfied his/her need for performance-related learning
Average Rating: 3.64; "Always" or "Frequently": 59%

10d. Whether the student is motivated to transfer learning to on-the-job performance
Average Rating: 3.55; "Always" or "Frequently": 54%

10e. Whether management supports the desired change in on-the-job performance
Average Rating: 3.52; "Always" or "Frequently": 51%
These findings show that while all five of the given variables are commonly measured as part of Level 3 evaluations (a point worth remembering in light of the high value of the data obtained; see Question 9), there are slight differences in frequency among them. "Successful learning" is the variable our respondents' organizations most often consider in the evaluation process; in other words, they consider the results of a Level 2 evaluation. Thus, demonstrating, rather than assuming, a correlation between Level 2 and Level 3 outcomes is a primary consideration for successful evaluation practitioners. However, we note that our respondents' organizations give the same level of attention to "Whether the student has the opportunity to apply what was learned in practice and/or on-the-job situations." By doing so, the evaluators are likely to make the connection between "learning" and "performing" by assessing whether sufficient practice time has been allowed outside the "classroom" for the student to reinforce and retain the learning in the arena of real performance.
Detailed Average Ratings and Percentages of Frequency of Consideration of Intervening Variables When Conducting Kirkpatrick Level 3 Evaluations
10a. Whether the student has learned successfully as a result of the training (Average Rating: 4.00)
5 = Always: 31%
4 = Frequently: 42%
3 = Sometimes: 22%
2 = Rarely: 4%
1 = Never: 1%

10b. Whether the student has the opportunity to apply what was learned in practice and/or on-the-job situations (Average Rating: 3.99)
5 = Always: 34%
4 = Frequently: 37%
3 = Sometimes: 23%
2 = Rarely: 6%
1 = Never: 0%

10c. Whether the student perceives that the training has satisfied his/her need for performance-related learning (Average Rating: 3.64)
5 = Always: 22%
4 = Frequently: 37%
3 = Sometimes: 27%
2 = Rarely: 12%
1 = Never: 2%

10d. Whether the student is motivated to transfer learning to on-the-job performance (Average Rating: 3.55)
5 = Always: 18%
4 = Frequently: 36%
3 = Sometimes: 33%
2 = Rarely: 11%
1 = Never: 2%

10e. Whether management supports the desired change in on-the-job performance (Average Rating: 3.52)
5 = Always: 21%
4 = Frequently: 30%
3 = Sometimes: 33%
2 = Rarely: 13%
1 = Never: 3%
Question 11. The Challenges of Implementing Kirkpatrick Level 3. We asked our respondents to rate on a scale of 1 - 5 the degree of challenge for each of several issues that their organization may have dealt with in order to use Kirkpatrick Level 3 evaluation. These issues are among those commonly cited in the literature by Kirkpatrick and others as obstacles to using Level 3.
Q11. Summary of Average Ratings and Percentages of the Challenges of Implementing Kirkpatrick Level 3

11a. The time required to conduct Level 3 evaluations
Average Rating: 3.60; "Highly Challenging" or "Very Challenging": 56%

11b. Gaining access to the data required to conduct a Level 3 evaluation
Average Rating: 3.46; "Highly Challenging" or "Very Challenging": 50%

11c. Making Level 3 evaluations a priority for HRD and training professionals
Average Rating: 3.37; "Highly Challenging" or "Very Challenging": 48%

11d. The expertise required to conduct Level 3 evaluations
Average Rating: 3.28; "Highly Challenging" or "Very Challenging": 43%

11e. Gaining management support for Level 3 evaluations
Average Rating: 3.16; "Highly Challenging" or "Very Challenging": 38%

11f. The cost of conducting Level 3 evaluations
Average Rating: 3.07; "Highly Challenging" or "Very Challenging": 35%

11g. Overcoming belief or opinion that Levels 1 and/or 2 evaluations are sufficient to determine the effectiveness of training
Average Rating: 3.00; "Highly Challenging" or "Very Challenging": 35%
If the findings presented for Questions 8 to 10 provide some indication that Level 3 evaluators find value in the results of their practice, and hint at some of the reasons why they derive this value, then it is worth examining what issues they had to deal with in honing their practice and achieving the results. As indicated by the low percentage of Level 3 usage (See Question 7), and the observations of training evaluation experts, including Kirkpatrick himself, Level 3 evaluation is not easy. These data give us some perspective on where the difficulties lie. We see that the average “challenge” rating for all of the issues faced falls somewhere between “fairly challenging” and “very challenging.” Relatively speaking, however, we note that “time required” and “access to the data required” stand out, and these two selections seem underscored by the fact that making Level 3 evaluation a priority for training professionals is also quite challenging.
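The "Highly Challenging" or "Very Challenging" percentage reported for each item is a top-two-box score: the share of respondents choosing the top two points of the 5-point scale. A minimal sketch (names are ours, not the report's), using the distribution reconstructed for item 11a:

```python
def top_two_box(distribution):
    """Percentage of respondents choosing the top two scale points (5 and 4)."""
    return distribution[5] + distribution[4]

# Item 11a, "The time required to conduct Level 3 evaluations":
q11a = {5: 16, 4: 40, 3: 33, 2: 10, 1: 1}
print(top_two_box(q11a))  # 56, i.e., the 56% reported for 11a
```

The same calculation reproduces the "Always" or "Frequently" and "Highly Important" or "Very Important" columns in the other summaries.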
Detailed Average Ratings and Percentages of the Challenges of Implementing Kirkpatrick Level 3
11a. The time required to conduct Level 3 evaluations (Average Rating: 3.60)
5 = Highly challenging: 16%
4 = Very challenging: 40%
3 = Fairly challenging: 33%
2 = Not very challenging: 10%
1 = Not at all challenging: 1%

11b. Gaining access to the data required to conduct a Level 3 evaluation (Average Rating: 3.46)
5 = Highly challenging: 15%
4 = Very challenging: 35%
3 = Fairly challenging: 34%
2 = Not very challenging: 15%
1 = Not at all challenging: 1%

11c. Making Level 3 evaluations a priority for HRD and training professionals (Average Rating: 3.37)
5 = Highly challenging: 21%
4 = Very challenging: 27%
3 = Fairly challenging: 27%
2 = Not very challenging: 18%
1 = Not at all challenging: 7%

11d. The expertise required to conduct Level 3 evaluations (Average Rating: 3.28)
5 = Highly challenging: 14%
4 = Very challenging: 29%
3 = Fairly challenging: 33%
2 = Not very challenging: 21%
1 = Not at all challenging: 3%

11e. Gaining management support for Level 3 evaluations (Average Rating: 3.16)
5 = Highly challenging: 15%
4 = Very challenging: 23%
3 = Fairly challenging: 31%
2 = Not very challenging: 24%
1 = Not at all challenging: 7%

11f. The cost of conducting Level 3 evaluations (Average Rating: 3.07)
5 = Highly challenging: 7%
4 = Very challenging: 28%
3 = Fairly challenging: 34%
2 = Not very challenging: 26%
1 = Not at all challenging: 5%

11g. Overcoming belief or opinion that Levels 1 and/or 2 evaluations are sufficient to determine the effectiveness of training (Average Rating: 3.00)
5 = Highly challenging: 11%
4 = Very challenging: 24%
3 = Fairly challenging: 28%
2 = Not very challenging: 29%
1 = Not at all challenging: 8%
Why Organizations Do Not Use Kirkpatrick Level 3

Question 12. The Reasons Why Organizations Do Not Use Kirkpatrick Level 3 Evaluations.

Note: We asked respondents whose organizations "Never" or "Rarely" use Kirkpatrick Level 3 to answer Question 12 because this question pertains specifically to non-usage of Kirkpatrick Level 3 evaluations. Respondents whose organizations "Sometimes," "Frequently," or "Always" use Kirkpatrick Level 3 evaluations did not answer Question 12.

We asked our respondents to rate on a scale of 1 - 5 the relative importance of each of several reasons why their organization never, or only rarely, uses Kirkpatrick Level 3. We provided respondents with seven reasons that their organizations might not use Level 3 evaluation. Note that these reasons relate directly to the challenging issues faced by those respondents who use Level 3 evaluations (See Question 11).
Q12. Summary of Average Ratings and Percentages of the Reasons Why Organizations Do Not Use Kirkpatrick Level 3 Evaluation

12a. Difficulty accessing the data required for a Level 3 evaluation
Average Rating: 3.79; "Highly Important" or "Very Important": 65%

12b. No management support to conduct Level 3 evaluation
Average Rating: 3.76; "Highly Important" or "Very Important": 63%

12c. Too time consuming to conduct Level 3 evaluation
Average Rating: 3.63; "Highly Important" or "Very Important": 57%

12d. Level 3 evaluation is not considered a relatively important or urgent priority for the training department
Average Rating: 3.27; "Highly Important" or "Very Important": 46%

12e. Too costly to conduct Level 3 evaluation
Average Rating: 3.11; "Highly Important" or "Very Important": 38%

12f. We do not have the required expertise to conduct Level 3 evaluation
Average Rating: 2.78; "Highly Important" or "Very Important": 30%

12g. Levels 1 and/or 2 evaluations are all that is needed to determine effectiveness of training programs
Average Rating: 2.11; "Highly Important" or "Very Important": 14%
The top two reasons for not using Kirkpatrick Level 3 evaluation reported by our respondents whose organizations do not use Level 3 are "Difficulty accessing the data required ..." and "No management support ..." The first reason corresponds to the second-rated challenge reported by respondents whose organizations do use Level 3 evaluation, "Gaining access to the data required" (See Question 11). The time required to conduct Level 3 evaluations seems to be a much more significant reason not to do so than the cost of conducting such evaluations. Again, this finding corresponds to the relative challenge of time and cost as issues faced by those who do use Level 3. One reason in particular does not seem to be much of a factor: few organizations forgo Level 3 evaluations because they believe "Levels 1 and/or 2 evaluations are all that is needed to determine effectiveness of training programs."
Detailed Average Ratings and Percentages of the Reasons Why Organizations Do Not Use Kirkpatrick Level 3 Evaluation
12a. Difficulty accessing the data required for a Level 3 evaluation (Average Rating: 3.79)
5 = Highly important: 34%
4 = Very important: 31%
3 = Fairly important: 21%
2 = Not very important: 8%
1 = Not at all important: 6%

12b. No management support to conduct Level 3 evaluation (Average Rating: 3.76)
5 = Highly important: 36%
4 = Very important: 27%
3 = Fairly important: 23%
2 = Not very important: 7%
1 = Not at all important: 7%

12c. Too time consuming to conduct Level 3 evaluation (Average Rating: 3.63)
5 = Highly important: 28%
4 = Very important: 29%
3 = Fairly important: 26%
2 = Not very important: 12%
1 = Not at all important: 5%

12d. Level 3 evaluation is not considered a relatively important or urgent priority for the training department (Average Rating: 3.27)
5 = Highly important: 24%
4 = Very important: 22%
3 = Fairly important: 22%
2 = Not very important: 20%
1 = Not at all important: 12%

12e. Too costly to conduct Level 3 evaluation (Average Rating: 3.11)
5 = Highly important: 12%
4 = Very important: 26%
3 = Fairly important: 29%
2 = Not very important: 26%
1 = Not at all important: 7%

12f. We do not have the required expertise to conduct Level 3 evaluation (Average Rating: 2.78)
5 = Highly important: 13%
4 = Very important: 17%
3 = Fairly important: 22%
2 = Not very important: 31%
1 = Not at all important: 17%

12g. Levels 1 and/or 2 evaluations are all that is needed to determine effectiveness of training programs (Average Rating: 2.11)
5 = Highly important: 6%
4 = Very important: 8%
3 = Fairly important: 19%
2 = Not very important: 26%
1 = Not at all important: 41%
Specific Usage and Value of Kirkpatrick Level 4

Note: We asked respondents whose organizations "Never" or "Rarely" use Kirkpatrick Level 4 to skip Questions 13 through 16 because these questions pertain specifically to usage of Kirkpatrick Level 4. Therefore, only the responses of those respondents whose organizations "Sometimes," "Often," or "Always" use Kirkpatrick Level 4 are included in the data presented for Questions 13 through 16.

Question 13. The Reasons Why Respondents' Organizations Use Kirkpatrick Level 4.
In regard to their organization’s use of Kirkpatrick Level 4, we asked our respondents to rate on a scale of 1 - 5 the importance of each of several reasons why their organization uses Kirkpatrick Level 4 to evaluate training programs. We provided respondents with a selection of six reasons why organizations might use Kirkpatrick Level 4, including three reasons proposed by Kirkpatrick himself: to gain information on how to improve future programs, to decide whether to continue existing programs, and to justify the existence of the training department. To these three we added two reasons concerning measurement of the specific criteria of Level 4 (business results) and one reason concerning justification of the training budget.
Q13. Summary of Average Ratings and Percentages of the Reasons Why Respondents' Organizations Use Kirkpatrick Level 4

13a. To demonstrate the actual impact that training has on business results
Average Rating: 4.10; "Highly Important" or "Very Important": 76%

13b. To determine that the desired change in business results has been achieved
Average Rating: 4.09; "Highly Important" or "Very Important": 80%

13c. To gain information on how to improve future training programs
Average Rating: 3.91; "Highly Important" or "Very Important": 71%

13d. To justify the budget allocated to the design and delivery of training
Average Rating: 3.50; "Highly Important" or "Very Important": 51%

13e. To decide whether to continue or discontinue a training program
Average Rating: 3.43; "Highly Important" or "Very Important": 52%

13f. To justify the existence of the training department by showing how it contributes to the organization's objectives and goals
Average Rating: 3.20; "Highly Important" or "Very Important": 43%
Our respondents whose organizations use Level 4 indicate that the most important reason to do so is "To demonstrate the actual impact that training has on business results." This reason is followed closely by "To determine that the desired change in business results has been achieved" and "To gain information on how to improve future training programs." One of Kirkpatrick's three reasons, "Justifying the existence of the training department," is the least important. These findings parallel those presented for Question 8, in which we asked respondents why their organizations use Level 3 evaluation. As we noted in that case, these organizations may be more sophisticated in their approach to employee development; as such, the justification of the training department is implicit, and the organization's desire to measure and manage its impact on business results is strong and well supported. We might conclude from the results of both Questions 8 and 13 that Kirkpatrick's three key reasons for conducting training evaluation are not the primary motivations for doing Level 3 or Level 4 evaluations. It would seem that unless an organization has a strong desire to measure the actual criteria of Levels 3 and 4 (employee performance and business results), the traditional Kirkpatrick rationale for evaluation might not be enough to drive usage of Level 3 and 4 evaluations. This possibility, strongly supported by these data, provides one explanation for the infrequent usage of Levels 3 and 4 relative to Levels 1 and 2.
Detailed Average Ratings and Percentages of the Reasons Why Respondents' Organizations Use Kirkpatrick Level 4
13a. To demonstrate the actual impact that training has on business results (Average Rating: 4.10)
5 = Highly important: 41%
4 = Very important: 35%
3 = Fairly important: 19%
2 = Not very important: 3%
1 = Not at all important: 2%

13b. To determine that the desired change in business results has been achieved (Average Rating: 4.09)
5 = Highly important: 36%
4 = Very important: 44%
3 = Fairly important: 14%
2 = Not very important: 6%
1 = Not at all important: 0%

13c. To gain information on how to improve future training programs (Average Rating: 3.91)
5 = Highly important: 25%
4 = Very important: 46%
3 = Fairly important: 25%
2 = Not very important: 3%
1 = Not at all important: 1%

13d. To justify the budget allocated to the design and delivery of training (Average Rating: 3.50)
5 = Highly important: 20%
4 = Very important: 31%
3 = Fairly important: 32%
2 = Not very important: 13%
1 = Not at all important: 4%

13e. To decide whether to continue or discontinue a training program (Average Rating: 3.43)
5 = Highly important: 13%
4 = Very important: 39%
3 = Fairly important: 32%
2 = Not very important: 11%
1 = Not at all important: 5%

13f. To justify the existence of the training department by showing how it contributes to the organization's objectives and goals (Average Rating: 3.20)
5 = Highly important: 15%
4 = Very important: 28%
3 = Fairly important: 24%
2 = Not very important: 27%
1 = Not at all important: 6%
Question 14. The Value of Level 4 Evaluation Data. We asked our respondents to rate on a scale of 1 - 5 the value to their organization of the data obtained from Kirkpatrick Level 4 evaluations in terms of measuring a) the effectiveness of training programs, and b) the desired business and/or organizational results.
Q14. Summary of Average Ratings and Percentages of the Value of Level 4 Evaluation Data in Terms of Measuring Two Outcomes

14a. The desired business and/or organizational results
Average Rating: 4.08; "Highly Valuable" or "Very Valuable": 74%

14b. The effectiveness of training programs
Average Rating: 3.97; "Highly Valuable" or "Very Valuable": 68%
Those respondents whose organizations use Kirkpatrick Level 4 evaluation report that the data they obtain are quite valuable, both in terms of measuring "The desired business and/or organizational results" and "The effectiveness of training programs." Significantly, only 1% of respondents report that these data have no value, and only 2% indicate that they are not very valuable. Such high ratings from such a large group hint at several possibilities. First, our sample population of Level 4 practitioners must be following some best practices in order to obtain data of this quality and to apply those data to the proper evaluation criteria. Second, these data, and the best practices followed, may be associated with the specific intervening variables measured during the process (See Question 15). Third, it would seem that, done properly, Level 4 evaluation is well worth doing.

Detailed Average Ratings and Percentages of the Value of Level 4 Evaluation Data
14a. The desired business and/or organizational results (Average Rating: 4.08)
5 = Highly valuable: 38%
4 = Very valuable: 36%
3 = Fairly valuable: 23%
2 = Not very valuable: 2%
1 = Not at all valuable: 1%

14b. The effectiveness of training programs (Average Rating: 3.97)
5 = Highly valuable: 33%
4 = Very valuable: 35%
3 = Fairly valuable: 29%
2 = Not very valuable: 2%
1 = Not at all valuable: 1%
Question 15. Consideration of Intervening Variables When Conducting Kirkpatrick Level 4 Evaluations. We asked our respondents to rate on a scale of 1 - 5 the extent to which their organization's Kirkpatrick Level 4 evaluations include consideration of each of several intervening variables. One of the difficulties of evaluating the effectiveness of training programs at the level of "business or organizational results" is that so many different variables outside of the training program purview may affect achieving or not achieving the desired outcomes. In an attempt to determine the extent to which Level 4 practitioners consider some of these variables in the evaluation process, we provided respondents with a selection of six intervening variables.
Q15. Summary of Average Ratings and Percentages of Frequency of Consideration of Intervening Variables When Conducting Kirkpatrick Level 4 Evaluations

15a. Alignment of training with desired business results
Average Rating: 4.11; "Always" or "Frequently": 76%

15b. Stakeholder support for achieving desired business results
Average Rating: 3.78; "Always" or "Frequently": 66%

15c. Impact of employee behavior or motivation on desired business results
Average Rating: 3.78; "Always" or "Frequently": 63%

15d. Organizational capability for achieving desired business results
Average Rating: 3.76; "Always" or "Frequently": 67%

15e. Impact of competitive climate on desired business results
Average Rating: 3.39; "Always" or "Frequently": 51%

15f. Impact of customer behavior or motivation on desired business results
Average Rating: 3.37; "Always" or "Frequently": 51%
These findings show that while all six of the given variables are commonly measured as part of Level 4 evaluations (a point worth remembering in light of the high value of the data obtained; see Question 14), there are slight differences in frequency among them. "Alignment of training with desired business results" is the variable our respondents' organizations most often consider in the Level 4 evaluation process; in other words, how well the design of a training program responds to the demands of the business itself. This result supports the notion that the most effective use of the four levels begins with consideration of the desired business results and works backwards to the training. However, we note that our respondents' organizations also give a high level of attention to factors clearly outside the purview of training. By doing so, these evaluators are likely to make a more realistic connection between "learning," "performing," and "results" by weighing other variables, such as stakeholder support, employee motivation, and the competitive climate, and judging their impact. Clearly, Level 4 evaluation requires the ability to evaluate factors that are outside of, yet work along with, the training program.
Detailed Average Ratings and Percentages of Frequency of Consideration of Intervening Variables When Conducting Kirkpatrick Level 4 Evaluations
15a. Alignment of training with desired business results (Average Rating: 4.11)
5 = Always: 41%
4 = Often: 35%
3 = Sometimes: 19%
2 = Rarely: 4%
1 = Never: 1%

15b. Stakeholder support for achieving desired business results (Average Rating: 3.78)
5 = Always: 25%
4 = Often: 41%
3 = Sometimes: 22%
2 = Rarely: 11%
1 = Never: 1%

15c. Impact of employee behavior or motivation on desired business results (Average Rating: 3.78)
5 = Always: 22%
4 = Often: 41%
3 = Sometimes: 32%
2 = Rarely: 3%
1 = Never: 2%

15d. Organizational capability for achieving desired business results (Average Rating: 3.76)
5 = Always: 20%
4 = Often: 47%
3 = Sometimes: 22%
2 = Rarely: 8%
1 = Never: 3%

15e. Impact of competitive climate on desired business results (Average Rating: 3.39)
5 = Always: 11%
4 = Often: 40%
3 = Sometimes: 29%
2 = Rarely: 18%
1 = Never: 2%

15f. Impact of customer behavior or motivation on desired business results (Average Rating: 3.37)
5 = Always: 12%
4 = Often: 39%
3 = Sometimes: 30%
2 = Rarely: 12%
1 = Never: 7%
Question 16. The Challenges of Implementing Kirkpatrick Level 4. We asked our respondents to rate on a scale of 1 - 5 the degree of challenge for each of several issues that their organization may have dealt with in order to use Kirkpatrick Level 4 evaluation. These issues are among those commonly cited in the literature by Kirkpatrick and others as obstacles to using Level 4.
Q16. Summary of Average Ratings and Percentages of the Challenges of Implementing Kirkpatrick Level 4

16a. Gaining access to the data required to conduct Level 4 evaluations
Average Rating: 3.77; "Highly Challenging" or "Very Challenging": 63%

16b. The time required to conduct Level 4 evaluations
Average Rating: 3.75; "Highly Challenging" or "Very Challenging": 63%

16c. The expertise required to conduct Level 4 evaluations
Average Rating: 3.49; "Highly Challenging" or "Very Challenging": 50%

16d. The cost of conducting Level 4 evaluations
Average Rating: 3.36; "Highly Challenging" or "Very Challenging": 47%

16e. Gaining management support for Level 4 evaluations
Average Rating: 3.10; "Highly Challenging" or "Very Challenging": 39%

16f. Making Level 4 evaluations a priority for HRD and training professionals
Average Rating: 3.09; "Highly Challenging" or "Very Challenging": 38%

16g. Overcoming the belief or opinion that Levels 1 and/or 2 evaluations are sufficient to determine the effectiveness of training
Average Rating: 2.72; "Highly Challenging" or "Very Challenging": 29%
If the findings presented for Questions 13 to 15 provide some indication that Level 4 evaluators find value in the results of their practice, and hint at some of the reasons why they derive this value, then it is worth examining what issues they had to deal with in honing their practice and achieving those results. As indicated by the low percentage of Level 4 usage (See Question 7), and by the observations of training evaluation experts, including Kirkpatrick himself, Level 4 evaluation is not easy. These data give us some perspective on where the difficulties lie. We see that the average "challenge" rating for all but one of the issues faced falls somewhere between "Fairly challenging" and "Very challenging." Relatively speaking, however, we note that "access to the data required" and "time required" stand out, as they did for Level 3 evaluations (See Question 11). We also note that "expertise required" is a more significant challenge for Level 4 evaluations (3.49; 50%) than for Level 3 (See Question 11: 3.28; 43%). In regard to the challenges for both Level 3 and Level 4 evaluations, we note that "Overcoming the belief or opinion that Levels 1 and/or 2 evaluations are sufficient to determine the effectiveness of training" rates among the least challenging issues on average. Many experts have criticized Kirkpatrick's four-level approach because many training practitioners assume that positive outcomes at Levels 1 and 2 imply positive outcomes at Levels 3 and 4, and that these "higher level" evaluations are therefore not necessary. These data show that for those evaluating at Levels 3 and 4, such assumptions are not much of an obstacle.
Detailed Average Ratings and Percentages of the Challenges of Implementing Kirkpatrick Level 4
16a. Gaining access to the data required to conduct Level 4 evaluations (Average Rating: 3.77)
5 = Highly challenging: 27%
4 = Very challenging: 36%
3 = Fairly challenging: 24%
2 = Not very challenging: 13%
1 = Not at all challenging: 0%

16b. The time required to conduct Level 4 evaluations (Average Rating: 3.75)
5 = Highly challenging: 28%
4 = Very challenging: 35%
3 = Fairly challenging: 24%
2 = Not very challenging: 10%
1 = Not at all challenging: 3%

16c. The expertise required to conduct Level 4 evaluations (Average Rating: 3.49)
5 = Highly challenging: 23%
4 = Very challenging: 27%
3 = Fairly challenging: 30%
2 = Not very challenging: 16%
1 = Not at all challenging: 4%

16d. The cost of conducting Level 4 evaluations (Average Rating: 3.36)
5 = Highly challenging: 19%
4 = Very challenging: 28%
3 = Fairly challenging: 27%
2 = Not very challenging: 22%
1 = Not at all challenging: 4%

16e. Gaining management support for Level 4 evaluations (Average Rating: 3.10)
5 = Highly challenging: 17%
4 = Very challenging: 22%
3 = Fairly challenging: 25%
2 = Not very challenging: 26%
1 = Not at all challenging: 10%

16f. Making Level 4 evaluations a priority for HRD and training professionals (Average Rating: 3.09)
5 = Highly challenging: 12%
4 = Very challenging: 26%
3 = Fairly challenging: 28%
2 = Not very challenging: 27%
1 = Not at all challenging: 7%

16g. Overcoming the belief or opinion that Levels 1 and/or 2 evaluations are sufficient to determine the effectiveness of training (Average Rating: 2.72)
5 = Highly challenging: 9%
4 = Very challenging: 20%
3 = Fairly challenging: 24%
2 = Not very challenging: 29%
1 = Not at all challenging: 18%
Why Organizations Do Not Use Kirkpatrick Level 4

Note: We asked respondents whose organizations "Never" or "Rarely" use Kirkpatrick Level 4 to answer Question 17 because this question pertains specifically to non-usage of Kirkpatrick Level 4 evaluations. Respondents whose organizations "Sometimes," "Frequently," or "Always" use Kirkpatrick Level 4 evaluations did not answer Question 17.

We asked our respondents to rate on a scale of 1 - 5 the relative importance of each of several reasons why their organization never, or only rarely, uses Kirkpatrick Level 4. We provided respondents with seven reasons that their organizations might not use Level 4 evaluation. Note that these reasons relate directly to the challenging issues faced by those respondents who use Level 4 evaluations (See Question 16).
Q17. Summary of Average Ratings and Percentages of The Reasons Why Organizations Do Not Use Kirkpatrick Level 4 Evaluation
Average Rating (Scale 1 - 5) and Percentage Rating “Highly Important” or “Very Important”

17a. Difficulty accessing the data required for a Level 4 evaluation: Average Rating 4.07; Highly or Very Important, 74%
17b. Too time consuming to conduct Level 4 evaluation: Average Rating 3.81; Highly or Very Important, 65%
17c. No management support to conduct Level 4 evaluation: Average Rating 3.63; Highly or Very Important, 59%
17d. Level 4 evaluation is not considered a relatively important or urgent priority for the training department: Average Rating 3.39; Highly or Very Important, 48%
17e. Too costly to conduct Level 4 evaluation: Average Rating 3.38; Highly or Very Important, 47%
17f. We do not have the required expertise to conduct Level 4 evaluation: Average Rating 3.11; Highly or Very Important, 42%
17g. Levels 1 and/or 2 evaluations are all that is needed to determine effectiveness of training programs: Average Rating 2.32; Highly or Very Important, 17%
The top two reasons for not using Kirkpatrick Level 4 evaluation, as reported by respondents whose organizations do not use Level 4, are “Difficulty accessing the data required...” and “Too time consuming to conduct...”. These correspond to the top two challenges reported by respondents whose organizations do use Level 4 evaluation, “Gaining access to the data required” and “The time required” (see Question 16). A lack of management support for Level 4 evaluation, as well as low urgency and prioritization by the training department, are also significant inhibitors to using Level 4 evaluations. One reason in particular does not seem to be much of a factor: few organizations skip Level 4 evaluations because they believe that “Levels 1 and/or 2 evaluations are all that is needed to determine effectiveness of training programs.”
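For readers who want to check the arithmetic behind these charts: each average rating is simply the mean of the 1 - 5 responses, which can be recovered from the published percentage distribution. The sketch below is our own illustrative code, not taken from the report; the function name and data layout are ours.

```python
def average_rating(distribution):
    """Weighted mean of a 1-5 rating scale.

    distribution maps each rating (1-5) to the percentage of
    respondents who chose it. We divide by the actual total rather
    than by 100, so rounded percentages that sum to 99 or 101
    still produce a sensible mean.
    """
    total = sum(distribution.values())
    return sum(rating * pct for rating, pct in distribution.items()) / total

# Question 17a, "Difficulty accessing the data required":
q17a = {5: 48, 4: 26, 3: 17, 2: 4, 1: 5}
print(round(average_rating(q17a), 2))  # 4.08
```

The result, 4.08, is within rounding error of the reported 4.07; the report’s averages were presumably computed from raw response counts before the percentages were rounded for publication.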
RESEARCH REPORT / August 2006
Why Organizations Do Not Use Kirkpatrick Level 4

Detailed Average Ratings and Percentages of The Reasons Why Organizations Do Not Use Kirkpatrick Level 4 Evaluation
17a. Difficulty accessing the data required for a Level 4 evaluation (Average Rating: 4.07)
5 = Highly important: 48%
4 = Very important: 26%
3 = Fairly important: 17%
2 = Not very important: 4%
1 = Not at all important: 5%

17b. Too time consuming to conduct Level 4 evaluation (Average Rating: 3.81)
5 = Highly important: 36%
4 = Very important: 29%
3 = Fairly important: 20%
2 = Not very important: 10%
1 = Not at all important: 5%
17c. No management support to conduct Level 4 evaluation (Average Rating: 3.63)
5 = Highly important: 33%
4 = Very important: 26%
3 = Fairly important: 21%
2 = Not very important: 12%
1 = Not at all important: 8%

17d. Level 4 evaluation is not considered a relatively important or urgent priority for the training department (Average Rating: 3.39)
5 = Highly important: 23%
4 = Very important: 25%
3 = Fairly important: 27%
2 = Not very important: 16%
1 = Not at all important: 9%
17e. Too costly to conduct Level 4 evaluation (Average Rating: 3.38)
5 = Highly important: 23%
4 = Very important: 24%
3 = Fairly important: 29%
2 = Not very important: 17%
1 = Not at all important: 7%

17f. We do not have the required expertise to conduct Level 4 evaluation (Average Rating: 3.11)
5 = Highly important: 18%
4 = Very important: 24%
3 = Fairly important: 23%
2 = Not very important: 22%
1 = Not at all important: 13%

17g. Levels 1 and/or 2 evaluation are all that is needed to determine effectiveness of training programs (Average Rating: 2.32)
5 = Highly important: 7%
4 = Very important: 10%
3 = Fairly important: 24%
2 = Not very important: 24%
1 = Not at all important: 35%
Organizational Attributes That Influence Usage of Kirkpatrick Levels 3 and 4

In addition to asking respondents to describe themselves as a sample population as presented in Questions 1 to 6, we asked them to provide specific information about their organizations’ expenditures on training and development. We particularly wanted to know the size of the training budget (Q18), the importance of competitive pressures as a factor in establishing the training budget (Q19), and the importance of maintaining a knowledgeable and skilled work force as a factor in establishing the training budget (Q20). These results are in the Charts Q18 to Q20. They show a wide range of training budget expenditures among our respondents’ organizations, yet a significant majority of these respondents state that competitive pressure and the need for a knowledgeable and skilled work force are important factors in determining the level of expenditure.

We conducted cross tabulations of these data against reported usage of Levels 3 and 4 (see Question 7). The results showed no significant relationship between the size of a training budget and usage of Levels 3 and 4. However, there is a significant relationship between a) competitive pressure, b) the need for a knowledgeable and skilled work force, and c) usage of Levels 3 and 4. It appears that the greater the stated importance of competitive pressures, and of maintaining a knowledgeable and skilled work force, as factors in establishing the training budget, the more likely an organization is to use Kirkpatrick Levels 3 and 4.

Q18. What is the annual budget for the management, development, and delivery of your organization’s training programs? Please include all internal costs (salaries, benefits, travel, office space, classrooms, etc.) as well as expenditures for products and services purchased from external vendors (courseware, technology, outsourced services, tuition reimbursement, etc.) (Select only one)

Under $100,000: 10%
$100,001 to $1,000,000: 18%
$1,000,001 to $2,500,000: 9%
$2,500,001 to $5,000,000: 7%
$5,000,001 to $10,000,000: 7%
Over $10,000,000: 15%
I do not know: 34%

Note: As we have reported in prior Guild Research Reports (e.g., The Buying e-Learning Research Report — June 2005), over one-third (34%) of the respondents do not know their organization’s annual budget for training and development. We removed the “I do not know” selections and re-calculated the percentages to produce the chart labeled Q18a.

Q18a. What is the annual budget for the management, development, and delivery of your organization’s training programs? Please include all internal costs (salaries, benefits, travel, office space, classrooms, etc.) as well as expenditures for products and services purchased from external vendors (courseware, technology, outsourced services, tuition reimbursement, etc.) (Select only one)

Under $100,000: 16%
$100,001 to $1,000,000: 27%
$1,000,001 to $2,500,000: 14%
$2,500,001 to $5,000,000: 11%
$5,000,001 to $10,000,000: 10%
Over $10,000,000: 22%

The range of level of expenditure is fairly well distributed among the six categories — 43% of respondents report annual expenditures of over $2,500,000 and 57% below that number. Although this is not quite an exact calculation of training expenditures by our respondents’ organizations, it certainly shows that in many organizations training expenditures are non-trivial.
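The Q18a recalculation described above is a simple renormalization: drop the “I do not know” responses and rescale the remaining categories to 100%. A minimal sketch of the idea, in our own illustrative code (the report does not publish its method):

```python
def renormalize(percentages, drop=("I do not know",)):
    """Drop excluded answer categories and rescale the rest to 100%."""
    kept = {k: v for k, v in percentages.items() if k not in drop}
    total = sum(kept.values())
    return {k: round(100 * v / total) for k, v in kept.items()}

q18 = {
    "Under $100,000": 10,
    "$100,001 to $1,000,000": 18,
    "$1,000,001 to $2,500,000": 9,
    "$2,500,001 to $5,000,000": 7,
    "$5,000,001 to $10,000,000": 7,
    "Over $10,000,000": 15,
    "I do not know": 34,
}
q18a = renormalize(q18)  # e.g. 18% of all respondents becomes 18/66 = 27%
```

Because the published Q18 percentages are themselves rounded, this reproduces the Q18a chart only approximately; the report’s own recalculation would have started from raw response counts.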
Q19. Rate on a scale of 1 to 5 the importance of competitive pressures in your organization’s market sector as a factor in establishing your organization’s level of expenditure on training for employees. (Average Rating: 3.27)

5 = Highly important: 16%
4 = Very important: 28%
3 = Fairly important: 30%
2 = Not very important: 19%
1 = Not at all important: 7%

Q20. Rate on a scale of 1 to 5 the importance of your organization’s need to maintain a knowledgeable and skilled work force as a factor in establishing your organization’s level of expenditure on training for employees. (Select only one) (Average Rating: 4.14)

5 = Highly important: 45%
4 = Very important: 33%
3 = Fairly important: 17%
2 = Not very important: 3%
1 = Not at all important: 2%
Does competitive pressure drive expenditure on training? For 74% of our respondents, this factor is at least fairly important in the funding process. In addition, the findings show that as this factor increases in importance for an organization, so too does its usage of Kirkpatrick Levels 3 and 4.
Does the need for a knowledgeable and skilled workforce drive expenditure on training? For 95% of our respondents, this factor is at least fairly important in the funding process. In addition, the findings show that as this factor increases in importance for an organization, so too does its usage of Kirkpatrick Levels 3 and 4.
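The cross tabulations behind these findings pair each respondent’s importance rating with the same respondent’s reported usage of Levels 3 and 4, then count the combinations. A minimal sketch follows; the field names and rows are illustrative stand-ins, not the survey data or the report’s actual procedure.

```python
from collections import Counter

def crosstab(responses):
    """Count (importance rating, uses Levels 3/4) combinations."""
    return Counter((r["importance"], r["uses_levels_3_4"]) for r in responses)

# Toy rows standing in for individual survey responses:
rows = [
    {"importance": 5, "uses_levels_3_4": True},
    {"importance": 5, "uses_levels_3_4": True},
    {"importance": 2, "uses_levels_3_4": False},
]
table = crosstab(rows)
print(table[(5, True)])  # 2
```

A positive association shows up as counts concentrated in the (high importance, uses Levels 3 and 4) and (low importance, does not use) cells; a chi-square test on such a table is the usual way to check significance.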
Summary
Most training professionals would likely agree that the practice of training evaluation has come a long way since Kirkpatrick first published on the topic in 1959 and gave the industry his four-step taxonomy, which, for better or worse, later became known as the four levels model. Yet, despite Kirkpatrick’s own hopes, use of this taxonomy often stops after the first two levels because of the many difficult challenges raised by Level 3 and 4 evaluation. Nonetheless, this research report shows that those organizations that do meet these challenges derive significant value from the data obtained from their Level 3 and 4 evaluation efforts, especially in terms of measuring the impact of training on employee on-the-job performance and desired business results. Significantly, these findings show that organizations that use Levels 3 and 4 are also likely to cite the importance of competitive pressures and their need for a knowledgeable and skilled workforce as driving factors in the funding of their training programs.
To Learn More About This Subject

To learn more about this subject, we encourage Guild Members to search the following pages on the Guild’s Website using the keywords “Kirkpatrick,” “evaluation,” “measurement,” and “metrics.”

The Resource Directory: http://www.e-LearningGuild.com/resources/resources/index.cfm?actions=viewcats
The e-Learning Developers’ Journal: http://www.e-LearningGuild.com/articles/abstracts/index.cfm?action=view
References:

Alliger, G. M., & Janak, E. A. (1989). Kirkpatrick’s levels of training criteria: thirty years later. Personnel Psychology, 42(2), 331-342.
Catalanello, R. F., & Kirkpatrick, D. L. (1968). Evaluating Training Programs — The State of the Art. Training and Development Journal, 22(5), 2-9.
Holton, E. F. (1996). The Flawed Four-Level Evaluation Model. Human Resource Development Quarterly, 7(1), 5-21.
Kirkpatrick, D. L. (1959). Techniques for evaluating training programs. Journal of ASTD, 13(11), 3-9.
Kirkpatrick, D. L. (1959). Techniques for evaluating training programs: Part 2 — Learning. Journal of ASTD, 13(12), 21-26.
Kirkpatrick, D. L. (1960). Techniques for evaluating training programs: Part 3 — Behavior. Journal of ASTD, 14(1), 13-18.
Kirkpatrick, D. L. (1960). Evaluating training programs: Part 4 — Results. Journal of ASTD, 14(2), 28-32.
Kirkpatrick, D. L. (1976). Evaluation of Training. In R. L. Craig (Ed.), Training & Development Handbook (Second ed., pp. 18-11:18-27). New York: McGraw-Hill Book Company.
Kirkpatrick, D. L. (1977). Evaluating training programs: evidence vs. proof. Training and Development Journal, 31(11), 9-12.
Kirkpatrick, D. L. (1994). Evaluating Training Programs: The Four Levels (First ed.). San Francisco: Berrett-Koehler.
Kirkpatrick, D. L. (1996). Great Ideas Revisited. Training & Development, 54-59.
Kirkpatrick, D. L. (1998). Evaluating Training Programs: The Four Levels (Second ed.). San Francisco: Berrett-Koehler Publishers, Inc.
Kirkpatrick, D. L., & Kirkpatrick, J. D. (2005). Transferring Learning to Behavior. San Francisco: Berrett-Koehler Publishers, Inc.
Newstrom, J. W. (1978). Catch-22: the problems of incomplete evaluation of training. Training and Development Journal, 32(11), 22-24.
O’Driscoll, T., Sugrue, B., & Vona, M. K. (2005). The C-Level and the Value of Learning. TD, 7.
Pulichino, J. (2004). Metrics: Learning Outcomes and Business Results Research Report. Santa Rosa: The eLearning Guild.
Pulichino, J. (2005). Metrics and Measurement 2005 Research Report. Santa Rosa: The eLearning Guild.
Pulichino, J. (2006). Usage and Value of Kirkpatrick’s Four Levels. Unpublished Dissertation, Pepperdine University, Malibu.
This survey generated responses from over 550 Members and Associates.
Newstrom, J. W. (1995). Evaluating Training Programs: The Four Levels. Human Resource Development Quarterly, 6(3), 317-320.
About the author

Joe Pulichino, Senior Research Analyst, The eLearning Guild

Joe Pulichino began his career in education as an English instructor at Rutgers University over 25 years ago. Since then he has held a number of senior management positions in the technology sector, where he was responsible for the development, delivery, and marketing of a wide range of corporate education programs and services. Most recently he has served as vice-president of education services at Sybase, vice-president of eLearning at Global Knowledge Network, and CEO of EduPoint. He is an adjunct faculty member of the Pepperdine University Graduate School of Education and Psychology, where he is completing his Ed.D. in Education Technology. The focus of his research is on informal and organizational learning. Joe is principal of the Athena Learning Group, a virtual network of consultants and academics working in the fields of learning, knowledge management, performance enhancement, and communities of practice.

The Research Committee Members

Ms. Dawn Adams, Content Manager, Microsoft Global e-Learning Services
Dr. David J. Brand, Learning Design & Technology, 3M Corporation
Ms. Paula Cancro, IT Training Specialist, IFMG, Inc.
Ms. Barbara Fillicaro, Writer, Training Media Review
Ms. Silke Fleischer, Product Manager, Adobe
Mr. Joe Ganci, CEO, Dazzle Technologies, Corp.
Dr. Nancy Grey, Director, Pharmaceutical Regulatory Education, Pfizer
Ms. Sheila Jagannathan, e-Learning Specialist, The World Bank Institute
Dr. Warren Longmire, Senior Instructional Designer, Apple Computer
Dr. Maggie Martinez, CEO, The Training Place
Mr. Frank Nyguen, Senior Learning Technologist, Intel
Dr. Richard Smith, Instructional Designer, General Dynamics Network Systems
Ms. Celisa Steele, Vice President of Operations, LearnSomething
Mr. Ernie Thor, Senior Instructional Designer, Cingular Wireless
About the Guild

The eLearning Guild is a global Community of Practice for designers, developers, and managers of e-Learning. Through this member-driven community, the Guild provides high-quality learning opportunities, networking services, resources, and publications. Guild members represent a diverse group of instructional designers, content developers, Web developers, project managers, contractors, consultants, and managers and directors of training and learning services — all of whom share a common interest in e-Learning design, development, and management. Members work for organizations in the corporate, government, academic, and K-12 sectors. They also are employees of e-Learning product and service providers, consultants, students, and self-employed professionals. More than 22,100 Members and Associates of this growing, worldwide community look to the Guild for timely, relevant, and objective information about e-Learning to increase their knowledge, improve their professional skills, and expand their personal networks. Learning Solutions Magazine, the premier weekly online publication of The eLearning Guild, offers practical strategies and techniques for designers, developers, and managers of e-Learning.
The eLearning Guild organizes a variety of industry events focused on participant learning. Check online for topics and dates: October 10 - 13, 2006, San Francisco (program TBA).
© 2006 The eLearning Guild. All rights reserved. http://www.eLearningGuild.com