BUS 680 Week Replies Needed
Nine Evaluation of Training
Learning Objectives After reading this chapter, you should be able to: (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i125#ch09un�ig01)
Describe the pros and cons of evaluation and indicate which way to go on the issue.
Explain what process evaluation is, and why it is important.
Describe the interrelationships among the various levels of outcome evaluation.
Describe the costs and benefits of evaluating training.
Differentiate between the two types of cost-effectiveness evaluation (cost savings and utility analysis).
Describe the various designs that are possible for evaluation and their advantages and disadvantages.
Define and explain the importance of internal and external validity (Appendix 9-1).
9.1 Case: Training Designed to Change Behavior and Attitudes (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_01) The city of Palm Desert, California, decided to provide training to improve employees’ attitudes toward their work and to provide them with the skills to be more effective on the job. The two-day seminar involved a number of teaching methods, including a lecture, films, role-plays, and group interaction. Among the topics covered were conflict control, listening, communicating, telephone etiquette, body language, delegation, and taking orders. Throughout the two days, the value of teamwork, creativity, and rational decision making was stressed and integrated into the training.
Before the training was instituted, all 55 nonmanagement employees completed a paper-and-pencil questionnaire to measure both their attitudes toward the job and their perception of their job behaviors. Supervisors also completed a questionnaire assessing each of their employees. All 55 employees were told that they would be receiving the same two-day seminar. The first set of 34 employees was chosen at random.
The 21 employees who did not take the training immediately became a comparison group for evaluating the training. While the first group of employees was sent to the training, the others were pulled off the job, ostensibly to receive training, but they simply took part in exercises not related to any training. Thus, both groups were treated similarly in every way except for the training. Both groups completed attitude surveys immediately after the trained group finished training. Six months later, both groups completed self-report surveys to measure changes in their job behavior. Their supervisors also were asked to complete a similar behavior measure at the six-month mark.
The data provided some revealing information. For the trained group, no changes in attitude or behavior were indicated, either by the self-report or by supervisor-reported surveys. This result was also true (but expected) for the group not trained.
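The logic of comparing a trained group with a comparison group can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the 1-to-5 scores are hypothetical and are not data from the Palm Desert case. The idea is that the effect attributed to training is the change in the trained group beyond whatever change the comparison group shows over the same period.

```python
from statistics import mean


def mean_change(pre, post):
    """Average post-minus-pre change across paired scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post scores must be paired")
    return mean(after - before for before, after in zip(pre, post))


def training_effect(trained_pre, trained_post, comparison_pre, comparison_post):
    """Change in the trained group beyond the change in the comparison group."""
    return (mean_change(trained_pre, trained_post)
            - mean_change(comparison_pre, comparison_post))


# Hypothetical 1-5 attitude scores, invented for illustration only.
trained_pre = [3, 4, 2, 3]
trained_post = [3, 4, 2, 3]
comparison_pre = [3, 3, 4, 2]
comparison_post = [3, 3, 4, 2]

effect = training_effect(trained_pre, trained_post,
                         comparison_pre, comparison_post)
print("Estimated training effect:", effect)
```

An effect near zero, as in these invented numbers, mirrors the pattern reported in the case: neither group changed, so no change can be attributed to training.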
Was training a failure in the Palm Desert case? Would the training manager be pleased with these results? Was the evaluation process flawed? These types of issues will be addressed in this chapter. We will refer back to the case from time to time to answer these and other questions.
9.2 Rationale for Evaluation Imagine a business that decided it would not look at its profitability, return on investment (ROI), or productivity. You are a supervisor with this company, but you never look at how well or poorly your subordinates are performing their jobs. This is what training is like when no evaluation is conducted. Good management practice dictates that organizational activities are routinely examined to ensure that they are occurring as planned and are producing the anticipated results. Otherwise, no corrective action can be taken to address people, processes, and products or services that stray “off track.”
Nonetheless, many rationalizations for not evaluating training continue to exist, and evaluation of training is often not done. A 1988 survey of 45 Fortune 500 companies indicated that all of them asked trainees how much they liked training, but only 30 percent assessed how much was learned, and just 15 percent examined behavioral change. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_02) Other evidence from that time suggested that only 1 company in 100 used an effective system for measuring the organizational effects and value of training. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_03) But this is changing. In a 1996 study, 70 percent assessed learning, 63 percent assessed behavior, and 25 percent assessed organizational results. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_04) Evaluation of training at all levels is becoming more common. Nevertheless, the evaluation of training is still not where it needs to be. A 2006 study of 140 businesses of all sizes and types shows that the things organizations view as the most important outcomes of training are still not being measured very often. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_05)
But, as noted, over the course of time, more organizations are evaluating training. The main reason for this is an increase in accountability. Top management is demanding evidence that training departments are contributing positively to the bottom line. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_06) Dave Palm, training director of LensCrafters, knows firsthand about this trend. A frantic regional manager called Dave and told him that executives were looking to improve the bottom line and could not find enough evidence that training programs were providing a quantifiable return on the company’s investment. Yes, they knew that trainees were satisfied with training, but was the company getting the bang for their buck? The conversation ended with the regional manager saying, “So, Dave, what are you going to do about it?” Dave got his wake-up call. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_07)
9.3 Resistance to Training Evaluation Training managers can come up with a surprising number of reasons for not evaluating training, including the following:
There is nothing to evaluate.
No one really cares about it.
Evaluation is a threat to my job.
There Is Nothing to Evaluate For some companies, training is simply a reward for good performance, or something that is mandated so everyone has to attend. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_08) The argument here is that training is not expected to accomplish anything, so there is nothing to evaluate.
The Counterargument The first thing we would question here is why the company is spending money on something that has no value. We would argue that even in cases where training is a reward, it is designed with some goals or objectives in mind. Some type of knowledge, skills, or attitude (KSA) change is expected from the participants even if they just feel more positive about their job or the company. Once this goal or objective is identified, it can be measured. Evaluation is simply measuring the degree to which objectives are achieved. Even when training is mandated, such as safety training, there are still objectives to be achieved in terms of learning, job behavior, and the organization.
No One Really Cares About Evaluating Training The most common rationale for not conducting training evaluations is that “formal evaluation procedures are too expensive and time-consuming, and no one really cares anyway.” This explanation usually means that no one specifically asked for, demanded, or otherwise indicated a need for assessment of training outcomes.
The Counterargument If an evaluation is not specifically required, this does not mean that training is not evaluated. Important organizational decisions (e.g., budget, staffing, and performance evaluations) are made with data when data exist, but will also be made if the data do not exist. If no formal evaluations of training have taken place, the decision makers will decide on the basis of informal impressions of training’s effectiveness. Even in good economic times, the competition for organizational budget allocations is strong. Departments that can document their contributions to the organization and the return on budget investment are more likely to be granted their budget requests. The question, then, is not whether training should be evaluated, but rather who will do it (training professionals or budget professionals), how it will be done (systematically and formally or through informal impressions), and what data will be used (empirical studies of results or hearsay and personal impressions).
Evaluation Is a Threat to My Job Think about it. According to the 2011 State of the Industry Report conducted by the American Society for Training & Development, training budgets in the United States totaled over $171.5 billion. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_09) Why wouldn’t human resource development (HRD) departments be evaluating their results? Fear of the result is one reason. Football coach Woody Hayes, back in the 1950s, once said that he never liked to throw the forward pass because three things could happen and two of them were bad. The same could be said for evaluation. If time and money are spent on training, and an evaluation determines that no learning occurred, or worse, job performance declined, this doesn’t reflect well on the training provided. Although most managers are not likely to admit this concern publicly, it can be a real problem. When we use the term evaluation, we too often think of a single final outcome at a particular point that represents success or failure, like a report card. This type of evaluation is called an outcome evaluation (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_001). When the focus is on this type of evaluation, managers naturally can be concerned about how documenting the failure of their programs will affect their careers. Consider Training in Action 9-1. It provides an example of an evaluation designed to provide feedback so that improvement (through training and practice) can take place. But when the focus shifted from “helping improve” (process evaluation (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_002)) to a “measurement of success or failure” (outcome evaluation), the desire to participate in the process disappeared, and the airline threatened to discontinue it.
9-1 Training in Action Evaluation: What It Is Used for Matters (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_11)
For 30 years, British Airways maintained a system in all its aircraft that monitors everything done by the aircraft and its pilots. This information is examined continuously to determine any faulty aircraft mechanisms and to constantly assess the skill level of the pilots. When a pilot is flagged as having done “steep climbs” or “hard” or “fast landings,” for example, the pilot is targeted for training to alleviate the skill deficiency. The training is used, therefore, as a developmental tool to continuously improve the performance of pilots. The evaluation is not used as a summative measure of performance upon which disciplinary measures might be taken. The result for British Airways, one of the largest airlines in the world, is one of the best safety records in the world.
In the past, one of the major ways of determining problems in the airline industry in North America was to wait until an accident occurred and then examine the black box to find the causes. The findings might indicate pilot error or some problem with the aircraft. This information was then sent to all the major airlines for their information. This form of summative evaluation met with disastrous results. Recently, six major American airlines began a program similar to the one at British Airways. After all, it makes sense to track incidents and make changes (in aircraft design or pilot skill level) as soon as a problem is noticed. In this way, major incidents are more likely to be avoided. In fact, airlines are using the evaluation information gathered as a feedback mechanism to ensure the continuous improvement of performance and not as a summative evaluation of “failure.”
This seemingly effective way of ensuring high performance threatened to come to an end in the United States. The Federal Aviation Administration (FAA) wanted to access this information for possible use as a way to evaluate pilots. The airlines feared that the information given to the FAA could be used to punish both pilots and the airlines. Fortunately, these regulations were never put into place and both the airlines and the FAA continue to use this cockpit information as a means of continuously improving safety and pilot performance by improving the training of pilots.
The Counterargument Can the airline in Training in Action 9-1 be blamed for wanting to opt out of the program? It is easy to understand why someone would not want to participate in a program where the information could be used against them. While outcome results are important in making business decisions, the day-to-day purpose of evaluation should be to serve as a feedback mechanism to guide efforts toward success. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_10) While trying to convince a client that the company’s training should be evaluated, one trainer decided not to use the term evaluation. Instead, he chose the term data tracking. He emphasized tracking attitudes and behaviors over time and supplying feedback based on the findings to the training designers and presenters. This feedback could then be used to modify training and organizational systems and processes to facilitate the training’s success. The term data tracking did not have the same connotation of finality as evaluation. Hence, managers saw it as a tool for improving the likelihood of a successful intervention rather than as a pass/fail grade.
Was the evaluation in the Palm Desert case outcome or process focused? It is difficult to say without actually talking to those involved. If it was used for continuous improvement, assessment of the training process, as well as how much the participants learned, it could be helpful in determining the reason that transfer did not take place. On the basis of this information, the city could design additional interventions to achieve desired outcomes.
9.4 So We Must Evaluate On the surface, the arguments for ignoring evaluation of training make some sense, but they are easily countered when more carefully analyzed. However, perhaps the biggest reason for abandoning the resistance to evaluation is its benefit, especially today, when more and more organizations are demanding accountability at all levels. Managers increasingly are demanding from HRD what they demand from other departments: Provide evidence of the value of your activities to the organization. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_12) Other factors that influence the need to evaluate training are competitive pressures on organizations requiring a higher focus on quality, continuous improvement, and organizational cost cutting. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_13)
Sometimes, the image of the training function, especially among line managers, is less than desirable because they see this as a “soft” area, not subject to the same requirements for accountability as their areas. By using the same accountability standards, it is possible to improve the image of training. Furthermore, the technology for evaluating and placing dollar amounts on the value of training has improved in the last several years. However, let us be clear. We do not advocate a comprehensive evaluation of every training program. The value of the information gained must be worth the cost. Sometimes, the cost of different components of an evaluation is simply too high relative to the information gained. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_14)
9.5 Types of Evaluation Data Collected Let’s go back to the evaluation phase figure at the beginning of the chapter. Recall from Chapter 5 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i68#ch05) that one of the outputs from the design phase is evaluation considerations. These considerations, or more specifically, what is determined important to evaluate, are inputs to the evaluation phase. Organizational constraints and design issues are also inputs to evaluation. Remember that evaluation processes and outcome measures should be developed soon after the design phase output is obtained. The two types of outputs from the evaluation phase are process and outcome evaluation. Process evaluation compares the developed training to what actually takes place in the training program. Outcome evaluation determines how well training has accomplished its objectives.
Process Data One of the authors has a cottage near a lake, and he often sees people trying unsuccessfully to start their outboard motors. In going to their assistance, he never starts by suggesting that they pull the plugs to check for ignition or disconnect the float to see whether gas is reaching the carburetor. Instead, he asks if the gas line is connected firmly, if the ball is pumped up, if the gear shift is in neutral (many will not start in gear), and if the throttle is at the correct position, all of which are process issues. He evaluates the “process” of starting the engine to see whether it was followed correctly. If he assumed that it was followed and tried to diagnose the “problem with the engine,” he might never find it.
It is the same with training. If learning objectives were not achieved, it is pointless to tear the training design apart in trying to fix it. It might simply be a process issue: the training was not set up or presented the way it was intended. By examining the entire training process, it is possible to see all the places where the training might have gone wrong. In the examination of the process, we suggest segmenting the process into two areas: process before training and process during training.
Process: Before Training Several steps are required in analyzing the processes used to develop training. Table 9-1 identifies questions to ask during the analysis of the training process. First, you can assess the effectiveness of the needs analysis from the documentation or report that was prepared. This report should indicate the various sources from which the data were gathered and the KSA deficiencies.
Table 9-1 Potential Questions for Analysis of Processes Prior to Delivery (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_15)
Were needs diagnosed correctly?
What data sources were used?
Was a knowledge/skill deficiency identified?
Were trainees assessed to determine their prerequisite KSAs?
Were needs correctly translated into training objectives?
Were all objectives identified?
Were the objectives written in a clear, appropriate manner?
Was an evaluation system designed to measure accomplishment of objectives?
Was the training program designed to meet all the training objectives?
Was previous learning that might either support or inhibit learning in training identified?
Were individual differences assessed and taken into consideration in training design?
Was trainee motivation to learn assessed?
What steps were taken to address trainee motivation to learn?
Were processes built into the training to facilitate recall and transfer?
What steps are included in the training to call attention to key learning events?
What steps are included in the training to aid trainees in symbolic coding and cognitive organization?
What opportunities are included in the training to provide symbolic and behavioral practice?
What actions are included in the training to ensure transfer of learning to the job?
Are the training techniques to be used appropriate for each of the learning objectives?
Next, you can assess the training objectives. Are they in line with the training needs? Were objectives developed at all levels: organizational, transfer, learning, and reaction? Are they written clearly and effectively to convey what must be done to demonstrate achievement of the objectives? It is important that you examine the proposed evaluation tools to be sure that they are relevant. On the basis of the needs assessment and resulting objectives, you can identify several tools for assessing the various levels of effectiveness. We discuss the development of these tools later in this chapter. Then evaluate the design of the training. For example, if trainees’ motivation to attend and learn is low, what procedures are included in the design to deal with this issue?
Would a process evaluation prove useful in the Palm Desert case? Yes. In that situation, as it stands, we recognize that training was not successful, but we do not know why. The process that leads to the design of training might provide the answer. Another place we might find the answer is in the training implementation.
Process: During Training If your outcome data show that you didn’t get the results you expected, then training implementation might be the reason. Was the training presented as it was designed to be? If the answer is yes, then the design must be changed. But, it is possible that the trainer or others in the organization made some ad hoc modifications. Such an analysis might prove useful in the Palm Desert case.
Imagine, for example, that the Palm Desert training had required the use of behavior modeling to provide practice in the skills that were taught. The evaluation of outcomes shows that learning of the new behaviors did not occur. If no process data were gathered, the conclusion could be that the behavior modeling approach was not effective. However, what if examination of the process revealed that trainees were threatened by the behavior modeling technique, and the trainer allowed them to spend time discussing behavior modeling, which left less time for doing the modeling? As a result, it is quite plausible that there are problems with both the design and the implementation of the training. Without the process evaluation, this information would remain unknown, and the inference might be that behavior modeling was not effective.
Examples of implementation issues to examine are depicted in Table 9-2. Here, it is up to the evaluator to determine whether all the techniques that were designed into the program were actually implemented. It is not good enough simply to determine that the amount of time allotted was spent on the topic or skill development. It must also be determined whether trainees actually were involved in the learning activities as prescribed by the design. As in the previous behavior modeling example, the time allotted might be used for something other than behavior modeling.
Table 9-2 Potential Questions for a Process Analysis of Training Delivery
Were the trainer, training techniques, and learning objectives well matched?
Were lecture portions of the training effective?
Was involvement encouraged or solicited?
Were questions used effectively?
Did the trainer conduct the various training methodologies (case, role-play, etc.) appropriately?
Were they explained well?
Did the trainer use the allotted time for activities?
Was enough time allotted?
Did trainees follow instructions?
Was there effective debriefing following the exercises?
Did the trainer follow the training design and lesson plans?
Was enough time given for each of the requirements?
Was time allowed for questions?
Putting It All Together Actual training is compared with the expected (as designed) training to provide an assessment of the effectiveness of the training implementation. Much of the necessary information for the expected training can be obtained from records and reports developed in the process of setting up the training program. A trainer’s manual would provide an excellent source of information about what should be covered in the training. Someone could monitor the training to determine what actually was covered. Another method is to ask trainees to complete evaluations of process issues for each module. Videotape, instructors’ notes, and surveys or interviews with trainees can also be used. Keep in mind that when you are gathering any data, the more methods you use to gather information, the better the evaluation will be.
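The comparison of expected and actual training can be sketched as a simple check of planned versus observed time per activity. This is a minimal illustration, not a prescribed method: the function name `process_gaps`, the 80 percent tolerance, and the minutes shown are all hypothetical assumptions, loosely modeled on the behavior modeling example discussed earlier. As that example stresses, time spent is only part of the picture; an observer would still need to confirm that trainees actually took part in the activities as designed.

```python
def process_gaps(designed, actual, tolerance=0.8):
    """Return activities that were skipped or received less than
    `tolerance` (a fraction) of their planned minutes."""
    gaps = {}
    for activity, planned in designed.items():
        observed = actual.get(activity, 0)  # 0 if the activity never ran
        if observed < planned * tolerance:
            gaps[activity] = {"planned": planned, "observed": observed}
    return gaps


# Hypothetical minutes per activity: the trainer's manual (designed)
# versus an observer's notes (actual).
designed = {"lecture": 60, "behavior modeling": 90, "debriefing": 30}
actual = {
    "lecture": 60,
    "behavior modeling": 40,
    "discussion of modeling": 50,
    "debriefing": 30,
}

gaps = process_gaps(designed, actual)
# Activities that took place but were not part of the design.
unplanned = sorted(set(actual) - set(designed))

print("Shortfalls:", gaps)
print("Unplanned activities:", unplanned)
```

In this invented case, the check flags that behavior modeling received far less time than planned and that an unplanned discussion took its place, which is the kind of ad hoc modification a process evaluation is meant to surface.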
When to Use It Table 9-3 depicts those interested in process data. Clearly, the training department is primarily concerned with this information to assess how they are doing. The customers of training (defined as anyone with a vested interest in the training department’s work) usually are more interested in outcome data than in process data.
Table 9-3 Who Is Interested in the Process Data (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_16)
Training Department
Trainer Yes, it helps determine what works well and what does not.
Other trainers Yes, to the extent the process is generalizable.
Training manager Only if training is not successful or if a problem is present with a particular trainer.
Customers of the Training Department
Trainees No
Trainees’ supervisor No
Upper management No
Providing some process data is important, even if it is only the trainer’s documentation and the trainees’ reactions. The trainer can use this information to assess what seems to work and what does not. Sometimes, more detailed process data will be required, such as when training will be used many times, or when the training outcomes have a significant effect on the bottom line. If, however, it is only a half-day seminar on the new computer software, collecting process information might not be worth the cost.
Once training and trainers are evaluated several times, the value of additional process evaluations decreases. If you are conducting training that has been done numerous times before, such as training new hires to work on a piece of equipment, and the trainer is one of your most experienced, then process analysis is probably not necessary. If the trainer was fairly new or had not previously conducted this particular session, it might be beneficial to gather process data through a senior trainer’s direct observation.
To be most effective, we believe that evaluations should be process oriented and focused on providing information to improve training, not just designed to determine whether training is successful. Some disagree with this approach and suggest that process evaluation ends when the training program is launched. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_17) We suggest, however, that the evaluation should always include process evaluation for the following reasons:
It removes the connotation of pass/fail, making evaluation more likely.
It puts the focus on improvement, a desirable goal even when training is deemed successful. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_18)
Outcome Data To determine how well the training met or is meeting its goals, it is necessary to examine various outcome measures. The four outcome measures that are probably the best known are reaction, learning, behavior, and organizational results. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_19) These outcomes are ordered as follows:
Reaction outcomes come first and will influence how much can be learned.
Learning outcomes influence how much behavior can change back on the job.
Behavior outcomes are the changes of behavior on the job that will influence organizational results.
Organizational results are the changes in organizational outcomes related to the reason for training in the first place, such as high grievance rate, low productivity, and so forth.
This description is a simplified version of what actually happens, and critics argue that little empirical evidence indicates that the relationships between these outcomes exist. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_20) We will discuss this in more detail later.
Reaction outcomes (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_003) are measures of the trainee’s perceptions, emotions, and subjective evaluations of the training experience. They represent the first level of evaluation and are important because favorable reactions create motivation to learn. Learning may occur even if the training is boring or, alternatively, may not occur even if it is interesting. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_21) However, if training is boring, it will be difficult to attend to what is being taught. As a result, the trainees might not learn as much as they would if they found the training interesting and exciting. High reaction scores from trainees, therefore, indicate that attention was most likely obtained and maintained, which, as you recall from social learning theory, is the first part of learning: getting their attention.
Learning outcomes (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_004) are measured by how well the learning objectives and purpose were achieved. The learning objectives for the training that were developed in the design phase specify the types of outcomes that will signify that training has been successful. Note the critical relationship between the needs analysis and evaluation. If the training process progressed according to the model presented in this book, the way to measure learning was determined during the training needs analysis (TNA). At that time, the employee’s KSAs were measured to determine whether they were adequate for job performance. The evaluation of learning should measure the same things in the same way as in the TNA. Thus, the needs analysis is actually the “pretest.” A similar measure at the end of training will show the “gain” in learning.
Job behavior outcomes (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_005) are measures of the degree to which the learned behavior has transferred to the job. During the TNA, performance gaps were identified and traced to areas in which employees were behaving in a manner that was creating the gap. The methods used for measuring job behavior in the TNA should be used in measuring job behavior after the completion of training. Once again, the link between needs analysis and evaluation is evident. The degree to which job behavior improves places a cap on how much the training will improve organizational results.
Organizational results (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_006) occupy the highest level in the hierarchy. They reflect the organizational performance gap (OPG) identified in the TNA. This OPG is often what triggers reactive (as opposed to proactive) training. Here are some examples:
High levels of scrap are being produced.
Employees are quitting in record numbers.
Sales figures dropped over the last two quarters.
Grievances are on the increase.
The number of rejects from quality control is rising.
Once again, if one of these OPGs triggered the training, it can be used as the baseline for assessing improvement after training. This process of integrating the TNA and evaluation streamlines both processes, thereby making the integration more cost-effective. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_22)
Putting it All Together If each level of the outcome hierarchy is evaluated, it is possible to have a better understanding of the full effects of training. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_23) Let’s examine one of the items in the preceding list—a high grievance rate—as it relates to the training process and the four levels of evaluation.
The needs analysis determines that the high grievance rate is a function of supervisors not managing conflict well. Their knowledge is adequate, but their skills are deficient. From the needs analysis, data are obtained from a behavioral test that measures conflict management skills for comparison with skill levels after training has been completed. Training is provided, and then participants fill out a reaction questionnaire. This tool measures the degree to which trainees feel positive about the time and effort that they have invested in the program and each of its components. Assume that the responses are favorable. Even though the trainees feel good about the training and believe that they learned valuable things, the trainer recognizes that the intended learning might not have occurred. The same behavioral test of conflict management skills is administered, and the results are compared with pretraining data. The results show that the trainees acquired the conflict management skills and can use them appropriately, so the learning objectives were achieved. Now the concern is whether these skills transferred to the job. We compare the behavior of the supervisors before and after training regarding their use of conflict management skills and discover that they are using the skills, so transfer to the job was successful. The next step is to examine the grievance rate. If it has declined, it is possible, with some level of confidence, to suggest that training is the cause of the decline. If it is determined that learning did not take place after training, it would not make sense to examine behavior or results, because learning is a prerequisite.
Let’s examine each of these four levels of evaluation more closely.
Reaction Questionnaire The data collected at this level are used to determine what the trainees thought about the training. Reaction questionnaires are often criticized, not because of their lack of value, but because they are often the only type of evaluation undertaken. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_24)
Affective and utility are two types of reaction questionnaire. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_25) An affective questionnaire (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_007) measures general feelings about training (“I found this training enjoyable”), whereas the utility questionnaire (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_008) reflects beliefs about the value of training (“This training was of practical value”). While both types are useful, we believe that specific utility statements on reaction questionnaires are more valuable for making changes.
Training reaction questionnaires do not assess learning but rather the trainees’ attitudes about and perceptions of the training. Categories to consider when developing a reaction questionnaire should include training relevance, training content, materials, exercises, trainer(s) behavior, and facilities.
Training Relevance Asking trainees about the relevance (utility) of the training they experienced provides the organization with a measure of the perceived value of the training. If most participants do not see any value in it, they will experience difficulty remaining interested (much less consider applying it back on the job). Furthermore, this perceived lack of value can contaminate the program’s image. Those who do not see its value will talk to others who have not yet attended training and will perhaps suggest that it is a waste of time. The self-fulfilling prophecy proposes that if you come to training believing that it will be a waste of time, it will be. Even if the training is of great importance to the organization, participants who do not believe that it is important are not likely to work to achieve its objectives.
Once trainees’ attitudes are known, you can take steps to change the beliefs, either through a socialization process or through a change in the training itself. Think about the Palm Desert case. What do you think the trainees’ reactions to the training were? Might this source of information help explain why no change in behavior occurred?
Training Materials and Exercises Any written materials, videos, exercises, and other instructional tools should be assessed along with an overall evaluation of the training experience. On the basis of responses from participants, you can change these to make them more relevant to participants. Making suggested modifications follows the organizational development principle of involving trainees in the process.
Reactions to the Trainer Reaction questionnaires also help determine how the trainees evaluated the trainer’s actions. Be sure to develop statements that specifically address what the trainer did. General statements tend to reflect trainees’ feelings about how friendly or entertaining the trainer was (halo error) rather than how well the training was carried out. Simply presenting an affective statement such as “The trainer was entertaining” would likely elicit a halo response. For this reason, it is useful to identify specific aspects of trainer behavior that need to be rated. If more than one trainer is involved, then trainee reactions need to be gathered for each trainer. Asking trainees to rate the trainers as a group will mask differences among trainers in terms of their effectiveness.
Asking about a number of factors important to effective instruction causes the trainees to consider how effective the instructor was in these areas. When the final question, “Overall, how effective was the instructor?” is asked, the trainees can draw upon their responses to a number of factors related to effective instruction. This consideration will result in a more accurate response as to the overall effectiveness of the instructor. There will be less halo error. Note that the questionnaire in Table 9-4 asks the trainee to consider several aspects of the trainer’s teaching behavior before asking a more general question regarding effectiveness.
Table 9-4 Reaction Questions About the Trainer
Please circle the number to the right of the following statements that reflects your degree of agreement or disagreement.
1 = Strongly disagree
2 = Disagree
3 = Neither agree nor disagree
4 = Agree
5 = Strongly agree
1. The trainer did a good job of stating the objectives at the beginning of training. 1 2 3 4 5
2. The trainer made good use of visual aids (easel, white board) when making the presentations. 1 2 3 4 5
3. The trainer was good at keeping everyone interested in the topics. 1 2 3 4 5
4. The trainer encouraged questions and participation from trainees. 1 2 3 4 5
5. The trainer made sure that everyone understood the concepts before moving on to the next topic. 1 2 3 4 5
6. The trainer summarized important concepts before moving to the next module. 1 2 3 4 5
7. Overall, how would you rate this trainer? (Check one)
_____ 1. Poor—I would not recommend this trainer to others.
_____ 2. Adequate—I would recommend this trainer only if no others were available.
_____ 3. Average
_____ 4. Good—I would recommend this trainer above most others.
_____ 5. Excellent—This trainer is among the best I’ve ever worked with.
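As a rough illustration of why ratings should be kept separate for each trainer, the following Python sketch averages the 1 to 5 ratings per trainer and per item and compares them with a pooled average. The trainer names, the responses, and the function name `mean_by_trainer_and_item` are hypothetical and are not taken from the chapter.

```python
from statistics import mean

# Hypothetical responses: one (trainer, item number, rating 1-5) per answer.
responses = [
    ("Trainer A", 1, 5), ("Trainer A", 1, 4), ("Trainer A", 4, 4),
    ("Trainer B", 1, 2), ("Trainer B", 1, 3), ("Trainer B", 4, 5),
]


def mean_by_trainer_and_item(rows):
    """Average the 1-5 ratings separately for each (trainer, item) pair."""
    grouped = {}
    for trainer, item, rating in rows:
        grouped.setdefault((trainer, item), []).append(rating)
    return {key: mean(values) for key, values in grouped.items()}


per_trainer = mean_by_trainer_and_item(responses)

# Pooling both trainers on item 1 hides the gap between them.
pooled_item_1 = mean(r for _, item, r in responses if item == 1)

print(per_trainer[("Trainer A", 1)], per_trainer[("Trainer B", 1)], pooled_item_1)
```

In this made-up data set the pooled average for item 1 sits midway between the two trainers, which is the masking effect described above.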
Facilities and Procedures The reaction questionnaire can also contain items related to the facilities and procedures to determine whether any element impeded the training process. Noise, temperature, seating arrangements, and even the freshness of the doughnuts are potential areas that can cause discontent. One way to approach these issues is to use open-ended questions, such as the following:
Please describe any aspects of the facility that enhanced the training or created problems for you during training (identify the problem and the aspect of the facility).
Please indicate how you felt about the following:
Refreshments provided
Ability to hear the trainer and other trainees clearly
Number and length of breaks
Facility questions are most appropriate if the results can be used to configure training facilities in the future. The more things are working in the trainer’s favor, the more effective training is likely to be.
The data from a reaction questionnaire provide important information that can be used to make the training more relevant, the trainers more sensitive to their strengths and shortcomings, and the facilities more conducive to a positive training atmosphere. The feedback the questionnaire provides is more immediate than with the other levels of evaluation; therefore, modifications to training can be made much sooner.
Timing of Reaction Assessment The timing and type of questions asked on a reaction questionnaire should be based on the information needed for evaluating and improving the training, the trainer(s), the processes, or the facility. Most reaction questionnaires are given to participants at the conclusion of training, while the training is still fresh and the audience is captive. However, a problem with giving them at this time is that the participant might be anxious to leave and might give incomplete or less-than-valid data. Also, trainees might not know whether the training is useful on the job until they go back to the job and try it.
An alternative is to send out a reaction questionnaire at some point after training. This delay gives the trainee time to see how training works in the actual job setting. However, the trainee might forget the specifics of the training. Also, there is no longer a captive audience, so the response rate may be poor.
Another approach is to provide reaction questionnaires after segments of a training program or after each day in a multiday training session. In such situations, it might be possible to modify training that is in progress on the basis of trainees’ responses. Of course, this system is more costly and requires a quicker turnaround time for analysis and feedback of the data.
Regardless of how often reaction evaluation takes place, the trainer should specify at the beginning that trainees will be asked to evaluate the training and state when this evaluation will occur. Doing so not only helps clarify trainee expectations about what will happen during training but also acknowledges the organization’s concern for how the trainees feel about the training. It is also important that the data gathered be used. Trainees and employees in the rest of the organization will quickly find out if the trainer is gathering data only to give the impression of concern about their reactions. Table 9-5 provides a list of steps to consider when developing a reaction questionnaire.
Table 9-5 Steps to Consider in Developing a Reaction Questionnaire
1. Determine what needs to be measured.
2. Develop a written set of questions to obtain the information.
3. Develop a scale to quantify respondents’ data.
4. Make forms anonymous so that participants feel free to respond honestly.
5. Ask for information that might be useful in determining differences in reactions by subgroups taking the training (e.g., young vs. old; minority vs. nonminority). This could be valuable in determining effectiveness of training by different cultures, for example, which might be lost in an overall assessment. Note: Care must be taken when asking for this information. If you ask too many questions about race, gender, age, tenure, and so on, participants will begin to feel that they can be identified without their name on the questionnaire.
6. Allow space for additional comments to allow participants the opportunity to mention things you did not consider.
7. Decide the best time to give the questionnaire to get the information you want.
A. If right after training, ask someone other than the instructor to administer and pick up the information.
B. If some time later, develop a mechanism for obtaining a high response rate (e.g., encourage the supervisor to allow trainees to complete the questionnaire on company time).
Caution in Using Reaction Measures A caution is in order regarding reaction questionnaires sent out to trainees sometime after training asking them about the amount of transfer of training that has occurred on the job. Trainees tend to indicate that transfer has occurred when other measures suggest it did not. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_26) Therefore, reaction measures should not be the only evaluation method used to determine transfer of training.
Reaction questionnaires are not meant to measure learning or transfer to the job. They do, however, provide the trainees with the opportunity to indicate how they felt about the learning. How interesting and relevant the training is found to be will affect their level of attention and motivation. What the trainees perceive the trainer to be doing well and not so well is also useful feedback for the trainer. The reaction information can be used to make informed decisions about modifications to the training program.
Learning Learning objectives are developed from the TNA. As we noted, training can focus on three types of learning outcomes: knowledge, skills, and attitudes (KSAs). The difference between the individual’s KSAs and the KSAs required for acceptable job performance defines the learning that must occur. The person analysis serves as the pretraining measure of the person’s KSAs. These results can be compared with a posttraining measure to determine whether learning has occurred and whether those changes can be attributed to training. The various ways of making such attributions will be discussed later in the chapter. Chapter 4 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i48#ch04) discussed the various ways in which KSAs can be measured. The work done in the Needs Analysis phase to identify what should be measured and how it will be measured determines what measures you will use in your evaluation. This just makes sense because your learning objectives should be based on the training needs you identified in the Needs Analysis phase. Unless you were extremely insightful or lucky, you probably measured a number of things that didn’t end up being training needs as well as things that did. It is only the KSAs that ended up being training needs that get evaluated at the end of training. For example, let’s say your needs analysis used a knowledge test that assessed the employees’ problem-solving knowledge. Your test had 50 items measuring various aspects of problem solving. The person analysis showed that 30 of these items were training needs. In the Design phase these 30 items would be the focus of the learning objectives you developed. A training program would then be created to address the learning objectives and, de facto, those 30 items. Your evaluation instrument should then assess whether knowledge of those 30 items has been learned by the trainees.
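The 50-item example can be sketched in Python to show how the pretest and posttest are scored over the training-need items only. This is a minimal illustration under stated assumptions: the item numbering, the answer patterns, and the function name `gain_on_training_needs` are hypothetical, and the sketch scores a single trainee.

```python
def gain_on_training_needs(pre, post, need_items):
    """Return (pre_score, post_score, gain) counted only over need_items.

    pre and post map item number -> True if the item was answered correctly.
    """
    pre_score = sum(pre[i] for i in need_items)
    post_score = sum(post[i] for i in need_items)
    return pre_score, post_score, post_score - pre_score


# Hypothetical data: a 50-item test on which items 1-30 were training needs.
need_items = range(1, 31)
pre = {i: i > 25 for i in range(1, 51)}   # pretest: only items 26-50 correct
post = {i: i > 5 for i in range(1, 51)}   # posttest: items 6-50 correct

pre_score, post_score, gain = gain_on_training_needs(pre, post, need_items)
print(pre_score, post_score, gain)  # prints: 5 25 20
```

Items 31 to 50 are ignored on purpose, because they were measured in the needs analysis but did not turn out to be training needs.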
Timing of Assessment of Learning Depending on the duration of training, it might be desirable to assess learning periodically to determine how trainees are progressing. Periodic assessment would allow training to be modified if learning is not progressing as expected.
Assessment should also take place at the conclusion of training. If learning is not evaluated until sometime later, it is impossible to know how much was learned and then forgotten.
In the Palm Desert case, the measures that they took six months after training created a dilemma. Was the behavior ever learned, learned but forgotten, or learned but not transferred to the job?
Transfer to the Job (Job Behavior) Once it is determined that learning took place, the next step is to determine whether the training transferred to the job. Assessment at this step is certainly more complex and is often ignored because of the difficulties of measurement. However, if you did your needs analysis correctly, you have already determined what behavior to measure and how to measure it.
Several methods may have been used to assess job behavior prior to training. These methods were covered in depth in the discussion of TNA in Chapter 4 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i48#ch04) . So, the method used when conducting the needs assessment should be used to evaluate whether or not the learned behavior transferred to the job.
Scripted Situations Some recent research indicates that scripted situations might provide a better format for evaluating transfer of training than the more traditional behavioral questionnaires. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_27) Scripted situations help the rater recall actual situations and the behaviors related to them rather than attempting to recall specific behaviors without the context provided. The rater is provided with several responses that might be elicited from the script and is asked to choose the one that describes the ratee’s behavior. Research suggests that this method is useful in decreasing rating errors and improving validity. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_28) An example of this method is depicted in Table 9-6.
Table 9-6 Scripted Situation Item for Evaluation of a School Superintendent
After receiving training and being back on the job for four months, a school superintendent is being rated by members of the staff. The following is an example of one of the scripted scenarios used for rating. The following is a scenario regarding a school superintendent. To rate your superintendent, read the scenario and place an X next to the behavior you believe your superintendent would follow.
The administrator receives a letter from a parent objecting to the content of the science section on reproduction. The parent strongly objects to his daughter being exposed to such materials and demands that something be done. The administrator would be most likely to: (check one)
____ Ask the teacher to provide handouts, materials, and curriculum content for review.
____ Check the science curriculum for the board-approved approach to reproduction, and compare school board guidelines with course content.
____ Ask the head of the science department for an opinion about the teacher’s lesson plan.
____ Check to see whether the parent has made similar complaints in the past.
Finally, the trainer who includes sit-ins as a later part of training can observe on-the-job performance of the trainee. As was discussed in Chapter 5 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i68#ch05) , these sit-ins facilitate transfer (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_29) and also help the trainer determine the effectiveness of the training in facilitating the transfer of training to the job.
Transfer of Attitudes If attitudinal change is a goal of training, then it becomes necessary to assess the success of transfer and duration of the attitudinal change once the trainee is back on the job. Whatever method was used to determine the need for a change in attitude should be used to measure how much the attitude has changed. As discussed in the needs analysis chapter, one way to assess changes in attitudes is by observing changes in behaviors. Attitudinal change can also be assessed through attitude surveys. Remember that if respondents’ anonymity is ensured in such surveys, responses are more likely to reflect true attitudes.
A study of steward training provides an example of the assessment of an attitude back on the job. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_30) Training was designed to make union stewards more accessible to the rank and file by teaching them listening skills and how to interact more with the rank and file. Results indicated that when factors such as tenure as a union official and age were controlled, stewards who received the training behaved in a more participative manner (changed behavior) and were more loyal to the union (attitude survey). For the union, loyalty is important because it translates into important behaviors that might not be measured directly, such as supporting the union’s political candidates and attending union functions. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_31)
Timing of Job Behavior Assessment The wait time for assessing transfer of training depends on the training objectives. If the objective is to learn how to complete certain forms, simply auditing the work on the job before and after training would determine whether transfer took place. This could be done soon after training was complete. When learning objectives are more complex, such as learning how to solve problems or resolve conflict, wait time before assessment should be longer. The trainee will first need to become comfortable enough with the new behavior to exhibit it on a regular basis; then it will take more time for others to notice that the behavior has changed.
To understand this point, consider a more concrete change. Jack loses 10 pounds. First, the weight loss is gradual and often goes unnoticed. Even after Jack has lost the weight, for some time people will say, “Gee, haven’t you lost weight?” or “What is it that’s different about you?” If this uncertainty about specific changes happens with a concrete visual stimulus, imagine what happens when the stimuli are less concrete and not consistent. Some types of behavioral change might take a long time to be noticed.
To help get employees to notice the change in behavior, you can ask them to assess whether certain behaviors have changed. In our example, if you were asked, “Did Jack lose weight?” and he had lost 10 pounds, you would more than likely notice it then, even if you did not notice it before.
Organizational Results Training objectives, whether proactive or reactive, are developed to solve an organizational problem: perhaps an expected increase in demand for new customer services in the proactive case, or too many grievances in the reactive case. The fact that a problem was identified (too many grievances) indicates a measurement of the “organizational result.” This measurement would be used to determine any change after the training was completed. If it was initially determined that too many defective parts were being produced, the measurement of the “number of defective parts per 100 produced” would be used again after training to assess whether training was successful. This assessment is your organizational result.
It is important to assess this final level, because it is the reason for doing the training in the first place. In one sense, it is easier to measure than is job behavior. Did the grievances decrease? Did quality improve? Did customer satisfaction increase? Did attitudes in the annual survey get more positive? Did subordinates’ satisfaction with supervision improve? Such questions are relatively easily answered. The difficult question is, “Are the changes a result of training?” Perhaps the grievance rate dropped because of recent successful negotiations and the signing of a contract the union liked. Or if attitudes toward supervision improved but everyone recently received a large bonus, the improvement might be a spillover from the bonus and not the training. These examples explain why it is so important to gather information on all levels of the evaluation.
The links among organizational results, job behavior, and trainee KSAs should be clearly articulated in the TNA. This creates a model that specifies that if certain KSAs are developed and the employees use them on the job, then certain organizational results will occur. The occurrence of these things validates the model and provides some confidence that training caused these results. Thus, the difficult task of specifying how training should affect the results of the organization is already delineated before evaluation begins. However, TNAs are not always as thorough as they should be; therefore, it often falls to the evaluator to clarify the relationship among training, learning, job behavior, and organizational outcomes. For this reason, it is probably best to focus on organizational results as close to the trainee’s work unit as possible. Results such as increased work unit productivity, quality, and decreased costs are more appropriate than increased organizational profitability, market share, and the like. Quantifying organizational results is not as onerous as it might seem at first glance.
Timing of Assessment of Organizational Results Consistent tracking of the organizational performance gaps such as high scrap, number of grievances, or poor quality should take place at intervals throughout the training and beyond. At some point after the behavior is transferred to the job, it is reasonable to expect improvement. Tracking performance indices over time allows you to assess whether the training resulted in the desired changes to organizational results. You will also need to track any other organizational changes that might be affecting those results. For example, a downturn in the economy might result in the necessity for temporary layoffs. This could trigger an increase in grievances, even though the grievance training for supervisors was very effective. This is one of the difficulties of linking training to organizational results. A multitude of factors, other than employees’ KSAs, determine those results.
Relationship Among Levels of Outcomes As suggested earlier, researchers have disagreed about the relationship among these four levels of evaluation. For example, some studies show reaction and learning outcomes to be strongly related to each other. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_32) Others indicate little correlation between results of reaction questionnaires and measures of learning. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_33) As noted earlier, a good response to the reaction questionnaire might mean only that the trainer had obtained the trainees’ attention. This factor is only one of many in the learning process. The findings also indicate that the more distant the outcome is from the actual training, the smaller the relationship is between higher- and lower-level outcomes. Figure 9-1 illustrates the hierarchical nature of the outcomes and the factors that can influence these outcomes.
Figure 9-1 Training Outcomes and Factors Influencing Them
The research showing no relationship between the levels makes sense if we remember that organizational outcomes generally are the result of multiple causes. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_34) For example, productivity is affected not only by the employees’ KSAs but also by the technology they work with, supplier reliability, interdependencies among work groups, and many other factors. Although improvements can occur in one area, declines can occur in another. When learning takes place but does not transfer to the job, the issues to be concerned with do not involve learning, but they do involve transfer. What structural constraints are being placed on trainees, so they do not behave properly? Beverly Geber, special projects editor for Training magazine, describes a situation in which training in communication skills at Hutchinson Technologies, a computer component manufacturer, was not transferring to the job for some of the employees. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_35) An examination of the issue (through worker focus groups) disclosed that some employees were required to work in cramped space with poor lighting. These conditions made them irritable and unhappy. Did this situation affect their ability to communicate with their customers in a pleasant and upbeat manner? “You bet,” said their human resource (HR) representative.
Despite all the reasons that a researcher might not find a relationship among the four levels of evaluation, research has begun to show the existence of these linkages. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_36) More research needs to be done, but there is evidence to show that reactions affect learning outcomes, and learning outcomes affect transfer to the job. Few studies have attempted to link transfer outcomes to organizational outcomes due to the significant problems of factoring out other variables related to those outcomes.
Evaluating the Costs and Benefits of Training Let’s say you are able to show that your training caused a decrease in the number of grievances. You have data to show that participants are engaging in the new behaviors, and they have the desired knowledge and skills. Your examination of all four levels of evaluation provides evidence of cause and effect, and your use of appropriate designs (see Appendix 9-1 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i140#ch09div23) ) enhances the level of confidence in all of these outcomes. You might think that your job is done, but many executives still might ask, “So what?” Looking at the outcomes of training is only half the battle in evaluating its effectiveness. The other half is determining whether the results were worth the cost.
Cost/Benefit and Cost-Effectiveness Evaluations Was the training cost worth the results? This question can be answered in either of the following two ways: (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_37)
Cost/benefit evaluation
Cost-effectiveness evaluation
Cost/Benefit Evaluation A cost/benefit evaluation (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_009) of training compares the monetary cost of training with the nonmonetary benefits. It is difficult to place a value on these benefits, which include attitudes and working relationships. The labor peace brought about by the reduction in grievances is difficult to assess, but it rates high in value compared with the cost of training. The conflict resolution skills learned by supervisors provide the nonmonetary benefit of better relationships between supervisors and union officials, and this is important. However, it is also possible to assess the reduction in grievances (for example) in a way that directly answers the cost-effectiveness question.
Cost-Effectiveness Evaluation A cost-effectiveness evaluation (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_010) compares the monetary costs of training with the financial benefits accrued from training. There are two approaches for assessing cost-effectiveness:
1. Cost savings, a calculation of the actual cost savings, based on the change in “results”
2. Utility analysis (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_011) , an examination of value of overall improvement in the performance of the trained employees. This method is complex and seldom used, and therefore is presented in Appendix 9-2 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i140#ch09app02) for those interested.
Cost Savings Analysis (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_012) (Results Focus)
The common types of costs associated with training programs were presented in Chapter 5, Table 5-4 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i71#ch05table04) . These costs are compared with the savings that can be attributed to training. Let’s look again at Table 5-4 on page 151.
Recall that the cost of training was $32,430. Now, determine how much is saved when training is completed. To perform this cost savings analysis, we must first determine the cost of the current situation (see Table 9-7). The company averaged 90 grievances per year. Seventy percent (63) of these go to the third step before settlement. The average time required by management (including HR managers, operational supervisors, etc.) to deal with a grievance that goes to the third step is 10 hours. The management wages ($50 per hour on average) add $500 to the cost of each grievance ($50 × 10). In addition, union representatives spend an average of 7.5 hours at $25 per hour, for a cost of $187.50 per grievance. The company bears this cost because the union representatives’ wages are considered paid time, as stipulated in the collective bargaining agreement. The total cost of wages to the company per grievance is $687.50. The total cost for those 63 grievances that go to the third step is $43,312.50. The cost of training is $32,430.00.
Table 9-7 Cost Savings for Grievance Reduction Training
Costs of Grievances | Pretraining | Posttraining
Management time (for those going to third step), 10 h per grievance | 10 h × 63 grievances = 630 h | 10 h × 8 grievances = 80 h
Union rep's time (paid by management), 7.5 h per grievance | 7.5 h × 63 grievances = 472.5 h | 7.5 h × 8 grievances = 60 h
Total Cost
Management time | 630 h × $50 per h = $31,500.00 | 80 h × $50 per h = $4,000.00
Union rep's time | 472.5 h × $25 per h = $11,812.50 | 60 h × $25 per h = $1,500.00
Total | $43,312.50 | $5,500.00
Cost Savings
Reduction in cost of grievances going to the third step | $43,312.50 - $5,500.00 = $37,812.50
Cost of training | $32,430.00
Cost saving for the 1st year | $37,812.50 - $32,430.00 = $5,382.50
Return on Investment
Calculating the ratio | $5,382.50 / $32,430.00 = 0.166
Percent ROI | 0.166 × 100 = 16.6%
The return on investment (ROI) is the return minus the cost of the investment.
For this example, then, the data show a $37,812.50 return on a $32,430 investment; ROI is therefore a $5,382.50 savings in the first year.
Many organizations are interested in the ratio of the return to the investment. For example, investors in the stock market might set a 10 percent ROI as a goal. That would mean that the investment returned the principal plus 10 percent. The ROI ratio is calculated by dividing the “return” by the “investment.” For training, this would translate to dividing the cost savings (return) by the training cost (investment).
To translate that ratio to a percentage you would multiply the ratio by 100.
In the grievances case, dividing the cost saving (total savings of $37,812.50 - cost of training of $32,430.00 = cost savings of $5,382.50) by the investment (cost of training, which was $32,430.00) produces an ROI ratio of 0.166. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_38) If the ratio were 0, the training would break even. If the ratio were a negative number, the costs would be more than the returns to the company.
Multiplying the ratio by 100 provides the percent ROI. In this case, there is a 16.6 percent ROI for the first year. Most companies would be delighted if all their investments achieved this level of return. In addition, the nonmonetary benefits described earlier are also realized. Presenting this type of data to the corporate decision makers at budget preparation time is certainly more compelling than stating, “Thirty supervisors were given a five-day grievance reduction workshop.”
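The grievance calculation can be retraced step by step in a short script. The following is a minimal sketch rather than a prescribed method: the figures are taken from the example above, while the function and constant names are illustrative assumptions introduced here.

```python
# Figures from the grievance-reduction example; names are illustrative only.
MGMT_HOURS_PER_GRIEVANCE = 10.0   # management hours per third-step grievance
MGMT_RATE = 50.0                  # average management wage, dollars per hour
UNION_HOURS_PER_GRIEVANCE = 7.5   # union representative hours per grievance
UNION_RATE = 25.0                 # union representative wage, dollars per hour
TRAINING_COST = 32_430.00         # cost of the training (the investment)


def grievance_cost(num_grievances):
    """Wage cost to the company of grievances that reach the third step."""
    per_grievance = (MGMT_HOURS_PER_GRIEVANCE * MGMT_RATE
                     + UNION_HOURS_PER_GRIEVANCE * UNION_RATE)
    return num_grievances * per_grievance


def cost_savings_roi(pre_grievances, post_grievances, training_cost):
    """Return (gross savings, first-year net savings, ROI ratio)."""
    gross = grievance_cost(pre_grievances) - grievance_cost(post_grievances)
    net = gross - training_cost          # return minus the investment
    return gross, net, net / training_cost


# 63 third-step grievances before training, 8 after.
gross, net, ratio = cost_savings_roi(63, 8, TRAINING_COST)
print(f"Reduction in grievance cost: ${gross:,.2f}")
print(f"First-year cost saving:      ${net:,.2f}")
print(f"ROI ratio: {ratio:.3f} ({ratio * 100:.1f} percent)")
```

Keeping the wage rates and hours as named constants makes it easy to rerun the analysis with estimates supplied by management, which is often how such figures are obtained in practice.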
Many training departments are beginning to see the importance of placing a monetary value on their training for several reasons: (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_39)
HRD budgets are more easily justified and even expanded when HR can demonstrate that it is contributing to the profit.
HRD specialists are more successful in containing costs.
The image of the training department is improved by showing dollar value for training.
Recall Dave Palm from LensCrafters. Top management told him to demonstrate what they were getting in the way of “bang for the buck.” Well, he did, and the result was that his training budget was doubled. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_40) Training in Action 9-2 is a similar example. Here, Alberta Bell demonstrated the value of the training that prompted management not only to restore funding for the original training, but also to consider increasing it.
9-2 Training in Action Reduction in Training Time: The Value of Demonstrating Value (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_41)
This case occurred some time ago, but still has a valuable lesson. Alberta Bell of Edmonton, Alberta, was looking for ways to reduce the cost of its operations. Downsizing and cost cutting were necessary to meet the competition. One cost cutting decision was to reduce the entry-level training program for its customer service representatives from two weeks to one week. This would save money by reducing the cost of training and getting service representatives out “earning their keep” sooner.
The manager of training decided to assess the value of this decision. Using data already available, he determined that the average time necessary to complete a service call for those who attended the two-week program was 11.4 minutes. Those in the one-week program took 14.5 minutes. This difference alone represented $50,000 in lost productivity for the first six weeks of work. He further analyzed the differences in increased errors, increased collectables, and service order errors. This difference was calculated at more than $50,000. The total loss exceeded $100,000.
Obviously, when he presented this information to upper management, the two-week training program was quickly put back in place.
Because of the time and effort required to calculate the value of training, many small business managers simply do not do it. However, assessing the value of training is not an exact science, and it can be done more easily by means of estimates. Table 9-8 provides a simplified approach for small business. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_42) As can be seen in the table, cost savings translates to revenue for the company. When estimates are necessary in completing this form, it is useful to obtain them from those who will receive the report (usually top management). If you use their estimates, it is more likely that your final report will be credible. Of course, larger organizations can also use this method.
Table 9-8 Training Investment Analysis Work Sheet
Objective: ________________________
Audience: ________________________
Returns measured over: ________________ One year _______________
Other: ________________________
Part 1: Calculating the Revenue Produced by Training
Option A—Itemized Analysis
Increased sales: _____ Additional sales per employee
× _____ Revenues (or margin) per sale
× _____ Number of employees
= _____ Revenue Produced by Training
Higher productivity: _____ Percent increase in productivity
× _____ Cost per employee (salary plus benefits plus overhead)
× _____ Number of employees
= _____ Revenue Produced by Training
Reduced errors: _____ Average cost per error
× _____ Number of errors avoided per employee
× _____ Number of employees
= _____ Revenue Produced by Training
Include as many areas of financial gain as you are able to determine (for example, employee retention, reduction in grievances, and so on).
Total Revenue Produced by Training (add all “Revenue Produced by Training” cells):
$ _______
Option B—Summary Analysis
____________ (Revenue After Training) - ____________ (Revenue Without Training) = ____________ (Revenue Produced by Training)
Part 2: Calculating the Return
____________ (Revenue Produced by Training) - ____________ (Cost of Training) = ____________ (Total Return on Training Investment)
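The worksheet arithmetic can be sketched in a few lines. This is an illustration under stated assumptions, not part of the original worksheet: the function names are invented here, all the numbers in the example are hypothetical, and the return in Part 2 is assumed to be revenue produced by training minus the cost of training, in line with the way return is described in the grievance example earlier in the chapter.

```python
def itemized_revenue(per_employee_gain, value_per_unit, num_employees):
    """One Option A line item: the three worksheet factors multiplied together."""
    return per_employee_gain * value_per_unit * num_employees


def training_return(revenue_items, cost_of_training):
    """Return (total revenue produced by training, total return on investment).

    Assumption: the return is revenue produced minus the cost of training.
    """
    total_revenue = sum(revenue_items)
    return total_revenue, total_revenue - cost_of_training


# Hypothetical estimates for a 20-employee audience (not from the text).
items = [
    itemized_revenue(2, 500.0, 20),       # 2 extra sales each at $500 margin
    itemized_revenue(0.05, 60_000.0, 20), # 5% productivity gain on $60,000 cost
    itemized_revenue(100.0, 3, 20),       # $100 per error, 3 errors avoided each
]
total_revenue, total_return = training_return(items, cost_of_training=40_000.0)
print(f"Total revenue produced by training: ${total_revenue:,.2f}")
print(f"Total return on training investment: ${total_return:,.2f}")
```

Because each line item is an estimate, it can be useful to rerun the calculation with the low and high values that management considers plausible.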
When and What Type of Evaluation to Use So, do we compute a comprehensive evaluation at all four levels in addition to a cost/benefit analysis for all training programs? No. To determine what evaluation should take place, ask the question, “Who is interested in these data?” The different levels of outcome evaluation are designed for different constituencies or customers. Note that in Table 9-9 the trainer is interested in the first three levels, because they reflect most directly on the training. Other trainers might also be interested in these data if the results show some relation to their training programs. Training managers are interested in all the information. Both reaction and learning data, when positive, can be used to evaluate the trainer and also promote the program to others. When the data are not positive, the training manager should be aware of this fact because it gives the trainer information to use to intervene and turn the program around. The training manager’s interest in the transfer of training is to evaluate the trainer’s ability to promote the transfer. Care must be taken in using this information because many other factors may be present and operating to prevent transfer. Also, if transfer is favorable, the information is valuable in promoting the training program. These generalizations are also true for the organizational results. If the training manager is able to demonstrate positive results affecting the financial health of the company, the training department will be seen as a worthy part of the organization.
Table 9-9 Who Is Interested in the Outcome Data (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_43)
Outcome Data | Reaction | Learning | Behavior | Results
Training Department
Trainer | Yes | Yes | Yes | No
Other trainers | Perhaps | Perhaps | Perhaps | No
Training manager | Yes | Yes | Yes | Yes
Customers
Trainees | Yes | Yes | Yes | Perhaps
Trainees’ supervisor | Not really | Only if no transfer | Yes | Yes
Upper management | No | No | Perhaps | Yes
Trainees are interested in knowing whether others felt the same as they did during training. They are also interested in feedback on what they accomplished (learning) and may be interested in how useful it is to all trainees back on the job (behavior). A trainee’s supervisor is interested in behavior and results. These are the supervisor’s main reasons for sending subordinates to training in the first place. Upper management is interested in organizational results, although in cases where the results may not be measurable, behavior may be the focus.
Does the interest in different levels of evaluation among different customers mean that you need to gather information at all levels every time? Not at all. First, a considerable amount of work is required to evaluate every program offered. As with process data, it makes sense to gather the outcome data in some situations and not in others.
Again, the obvious question to ask in this regard is “What customer (if any) is interested in the information?” Although one of the major arguments for gathering the outcome data is to demonstrate the worth of the training department, some organizations go beyond that idea. In an examination of “companies with the best training evaluation practices,” it was noted that none of them were evaluating training primarily to justify it or maintain a training budget. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_44) They evaluated (particularly at the behavior and results levels) when requested to do so by the customer (top management or the particular department). Jack Phillips, founder of ROI Institute, a consulting firm that specializes in evaluation, suggests that organizations only evaluate 5 to 10 percent of their training at the ROI level. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_45) Which ones? The ones that are high profile and/or are specifically requested by upper management. This selectivity is a function of the cost in developing such evaluations, because these types of evaluations (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_46)
need to be customized for each situation,
are costly and time consuming, and
require cooperation from the customer.
Motorola, for example, evaluates only at the behavioral level and not at the results level. Executives at Motorola are willing to assume that if the employee is exhibiting the appropriate behavior, the effect on the bottom line will be positive. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_47) Training in Action 9-3 shows how various companies are dealing with evaluation, particularly behavior and results.
9-3 Training in Action What Companies Are Doing for Evaluation (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_48)
After years of not evaluating their training, the U.S. Coast Guard decided to evaluate at the behavioral level, asking trainees and their supervisors three things: how well the trainees were able to perform the desired behaviors, how often they did those behaviors, and how important those behaviors were to being an effective employee. With the information provided in the evaluations, trainers were able to remove outdated training objectives and add job aids for some less frequent behaviors. Furthermore, the remaining training was refined, became more relevant, and provided more efficiency. This translated into a $3 million a year savings for the training department of the Coast Guard.
Texas Instruments noted that once trainees left training, it was difficult to obtain transfer of training information from them. It was generally ignored because of the time and expense of gathering this information. Then, an automated e-mail system was developed through which trainees, after being back on the job for 90 days, were contacted and asked to complete a survey related to transfer. This system increased the use of evaluations, reduced the time necessary to gather information, and provided a standardized process. Texas Instruments noted an improvement in the quantity and quality of participant feedback. It would seem easy enough to include an e-mail to the trainees’ supervisors for the same purpose.
Century 21 decided to evaluate their sales training at the results level. After training, trainees were tracked through a sales performance system that identified the number of sales, listings, and commissions for each graduate. This was cross-referenced to the place they worked and their instructor. Findings were surprising. Trainees from certain offices outperformed trainees from other offices even though they had the same instructor. Examination of these results showed that the high-performing offices provided help when needed, had access to ongoing training, and had better support. To respond to this, Century 21 had its trainers still deliver the training but, in addition, made them responsible for monitoring the environment in offices where trainees were sent. This monitoring was to see that every trainee was in an environment similar to that of the “high-performing trainees” identified earlier.
Booz Allen Hamilton, a consulting firm, recently decided to assess the ROI of its executive coaching program, which had been up and running for three years. The result? It was determined that the program’s ROI was about $3 million per year.
Certainly, all levels of data gathering are important at different times, and the training professional must be able to conduct an evaluation at every level. So, what and when should the trainer evaluate? The answer is that it depends on the organization and the attitudes and beliefs of upper management. If they perceive the training department as an effective tool of the organization and require only behavior-level evaluation, that is the evaluation to do.
However, this level still might require vigilance at the learning and reaction levels to ensure positive results. Darryl Jinkerson, director of evaluation services at Arthur Andersen, looks at the size and impact of the training before deciding how to evaluate it. Only those programs that are high profile, or for which the customer requests it, will be evaluated at the results level. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_49)
What if training is a one-time event and no desire is indicated to assess individual competence (e.g., a workshop on managing your career)? Such a situation provides simply no reason to evaluate. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_50)
Can You Show That Training Is the Reason Change Has Occurred? We have discussed in detail the types of measures you can use to help determine if training has been effective. However, it is not as simple as that, because change might have occurred for reasons not related to the training. This is where designing an appropriate evaluation becomes so important. Appendix 9-1 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i140#ch09div23) provides an examination of the various concerns in evaluation, such as those related to the validity (both internal and external) of the findings. It also provides several designs that can be useful to help assure you that your results are in fact valid.
9.6 Focus on Small Business For the small business owner, sending employees to training that is not effective could significantly affect the company’s financial health. Consider the owner who is constantly terminating employees because they are unable (or unwilling) to do the job properly. They all receive training, and most, but not all, turn out to be ineffective. Why? If training is not evaluated, it is not possible to know whether employees are lost because the training is not effective or because some other factor is blocking effective performance.
The small-business owner might think it is not necessary to evaluate training, because whether or not it was effective will be obvious by changes observed on the job after training. Actually, this assessment is probably true; in a small business, it would soon be evident if recently trained employees were performing at the expected level. However, if training is a significant cost to the owner, evaluating learning before and after training can still be of value. After all, the trainees might be learning on the job and the training may not be adding anything to their KSAs.
Much of the training in a small business is done on the job. In such cases, evaluation is often simply an assessment of the trainee’s ability to learn. Examining the training process is not considered. As we discussed in Chapter 6 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i84#ch06) , on-the-job training requires trainer skills just like any other training. Simply placing a new employee with an experienced employee and expecting the experienced one to train is not wise. It can be worthwhile to evaluate the process of training that goes on, in addition to evaluating the outcomes, especially if the position is at a lower level where, because of turnover or promotion, a rather high number of employees receives training.
Beyond the reasons for evaluation of training mentioned previously, there are external pressures as well. The movement to quality standards, such as ISO, creates a need for certification in several areas. As mentioned previously, one of the requirements is that the organization must maintain training records and periodically evaluate training. Training records can take the form of diplomas, certificates, licenses, experience records, resumes, and so on. There is no specific method for evaluating training effectiveness. A popular method is an annual review of training’s outcomes. Results of the review are recorded and are used as feedback for revising and updating the training program. Another method is a periodic assessment of individual employees.
David Alcock of Canadian Plastics Training Centre, in the Toronto area, says that even though few of the center’s clients request an evaluation of training, such requests are on the increase. Most of the center’s clients are small injection molding businesses. The need for certification seems to be the driving force behind the necessity to evaluate. Canadian Plastics Training conducts standardized injection molding training on its own site and provides a skill-based evaluation. A trainee who passes the skill-based test becomes certified as an injection molder. Generally, the company sends its employees for this training; however, some employees pay their own way to improve themselves.
One reason that these small companies do not evaluate is the cost. For in-house training done by Canadian Plastics Training, a late-1997 cost of evaluation for 20 employees to be trained to a higher-level classification was $25,000. This is similar to costs in the United States. The Center for Industrial Research and Service at Iowa State University suggests the cost to be $1,000 to $1,500 per employee. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_52) Many small companies simply do not have those resources. Another issue noted by Alcock involves what the evaluation would be used for. For example, suppose a unionized shop wants to upgrade the skills of the workforce. Sending them to training would carry with it the union’s blessing. Evaluating the learning, however, might be met with a great deal of resistance. The union leadership and rank and file might be concerned about the company knowing how well the employees did on a test. They might believe that the company’s goal is to get rid of some employees based on test results. Otherwise, why evaluate? Convincing the union that evaluation is a way of assessing the effectiveness of training might be difficult to do, depending on the relationship between union and management. Training in Action 9-4 shows what one small company is doing.
9-4 Training in Action Training and Evaluation at Scepter Manufacturing (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_51)
“ISO makes training mandatory,” says Don Villers, plant manager of the 160-employee Scepter Manufacturing plant in Scarborough, Ontario. “We train everyone from the shop floor to the front office.” The plant has been ISO certified for over a decade and since then has moved well beyond the ISO training requirements.
In the company’s rating system, supervisors are required to rate each of their employees on a scale from 1 to 10. An employee must reach 10 to be certified at that level and to be eligible for promotion. The rating system is not seen as punitive, but developmental. It is used as a needs analysis to identify skill deficiencies, then as a learning measure, and finally as a transfer of training measure.
What about results? According to Villers, “Defective parts dropped from 5 percent to 0.1 percent. Scrap also dropped 50 percent.” He attributes this success primarily to training. As a result of the success, the training budget is 10 times the $6,000 per year that the company spent three years ago.
Evaluation Beyond Learning The previous discussion focused primarily on learning. What about transfer of behavior and organizational results? In many ways, evaluation of transfer and organizational results is easier in a small company. After publishing the article on Scepter Manufacturing (see Training in Action 9-4), (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_53) Don Villers was asked how he knew that the drop in scrap and defective parts (results) was a function of training. His reply: “We are a small company, and it is the only thing that we changed.” He makes an important point related to the examination of results in small businesses. When a small business does training, evidence of the impact can be much clearer and faster. Also, it should be easier to rule out alternative explanations for the change, without the need for the more complex designs discussed in Appendix 9-1 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i140#ch09div23) .
Case: Palm Desert (Conclusion) The Palm Desert case at the beginning of the chapter provides an example of an effort to evaluate using a control group and a pre-/post design. Even here, however, problems arose in the way the evaluation was managed. One issue is that learning was not assessed. Only behavioral change was assessed six months after training. We know that the training did not transfer, but we do not know why. If it did not transfer because it was never learned in the first place, what was the reason? Perhaps there was just too much material to learn in a two-day seminar. Examining the process of developing the training might reveal this problem, and the training could be revised before being implemented. For a small organization, the training was obviously a major undertaking, and a more comprehensive training evaluation might be more advisable.
Summary We began this chapter by discussing the importance of a comprehensive evaluation. We end it by suggesting that a comprehensive evaluation is not always necessary. Understanding what to consider before evaluating makes such decisions more logical and useful.
Evaluation can be complex and, in many cases, costly. For this reason, we suggested throughout this chapter that evaluation is useful and important, but not necessary at all levels all the time. Furthermore, good detective work can, in some cases, replace complex designs in assessing the validity of evaluation.
Deciding what training should be evaluated, and at what levels, will be easier if the organization is proactive. By examining the strategic plan, it is possible to identify those areas of training that require evaluation and the extent to which evaluating is necessary. Without such direction, the training department will need to identify its mission and goals as best it can and work from there to determine the training that needs to be evaluated. Even for a large organization, it is simply not practical to evaluate everything. All organizations need to determine what training they want to evaluate and how they will do so. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i133#ch09sidebar05)
The Training Program (Fabrics, Inc.) We are now ready to examine the evaluation phase of the Fabrics, Inc., training. We presented the training, and it is time to do the evaluation. In the design phase of the training process, one of the outcomes was development of evaluation objectives. Although we developed and implemented the training, it is critical to remember that developing the tools for evaluation needs to be done concurrently with developing the training, not after it.
Examination of the output of the evaluation phase of training indicated two types of evaluation: process and outcome. The process evaluation will consist of the trainer, during training, documenting what she covered in each module and the time spent on it. These results will then be compared with what was expected to be covered in each module and the time spent.
For the outcome evaluation, four types are identified. The reaction questionnaire for trainers will model the one that was presented in Table 9-4 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i131#ch09table04) of the text. For the training itself, the reaction questionnaire is shown next in “Fabrics Reaction 1”.
For learning, we need to revisit the learning objectives to determine what is required. We need a paper-and-pencil test for measuring knowledge (objectives 1 and 2) and two behavioral tests to measure active listening and conflict resolution skills (objectives 3 and 4). More specifically, the first two learning objectives (and the others related to the training but not developed here) are accommodated using the paper-and-pencil test. The content of this test is partially represented in “Fabrics Paper-and-Pencil Test” on the next page. But first let’s look at the knowledge objectives.
Fabrics Reaction 1 Using the scale that follows, evaluate the training by circling the appropriate number to the right of the item.
1 = Strongly disagree
2 = Disagree
3 = Neither agree nor disagree
4 = Agree
5 = Strongly agree
Active Listening Skills
The training met the stated objectives. 1 2 3 4 5
The information provided was enough for me to understand the concepts being taught. 1 2 3 4 5
The practice sessions provided were sufficient to give me an idea of how to perform the skill. 1 2 3 4 5
The feedback provided was useful in helping me understand how to improve. 1 2 3 4 5
The knowledge and skills in this session were of value for my job. 1 2 3 4 5
Circle the response that reflects your feelings about the pace of the session just completed.
1. Way too fast
2. A bit fast
3. Just right
4. A bit slow
5. Way too slow
What did you like best about this part of the training?
What would you change?
Comments:
Note: A similar scale would be used for each of the other components of training that were taught.
The trainee will, with no errors, present in writing the four types of active listening, along with examples of each of the types, without using reference materials.
The trainee will, with 100 percent accuracy, provide in writing each step of the conflict resolution model, along with a relevant example, without help from any reference material.
After watching a role-play of an angry person and an employee using the conflict resolution model, the trainee will, without using reference materials, immediately provide feedback as to the effectiveness of the person using the conflict resolution model. The trainee must identify four of the six errors.
Fabrics Paper-and-Pencil Test
Evaluation of Learning No specific time limit is set for this test, but you should be able to finish in about one hour.
Answers to the questions should be written in the booklet provided.
Please read each question carefully. Some of the questions contain more than one part.
1. List four types of active listening, and provide an example for each.
2. List the steps in the conflict resolution model. After each step, provide a relevant example of a phrase that could be used to represent that step.
And so forth for as many questions as needed.
The next objective is partly related to skill development. Following are a number of standardized scenarios and guidelines to evaluate them. “Fabrics Scenario: Active Listening” is an example. But first, here is the objective.
When, in a role-play, the trainee is presented with an angry comment, the trainee will respond immediately using one of the appropriate active listening types. The trainee will then explain orally the technique used and why, with no help from reference material. The trainee will be presented with five of these situations and be expected to correctly respond and explain a minimum of four techniques.
Fabrics Scenario: Active Listening This is read to the trainee: The following set of scenarios is designed to determine how well you, the trainee, have learned the active listening skills. There are three roles here: initiator, active listener (you, the trainee), and evaluator. The initiator is a nontrainee who speaks a conflict-provoking statement to you (the active listener). You, the trainee, listen to the statement, and then respond using active listening skills. The evaluator, who is trained in evaluating active listening, listens to your response and evaluates it based on the use of effective active listening skills.
Note: The following forms (initiator’s role, active listener’s role, evaluator’s role) are given to the respective people, with the active listener’s role being given to you, the trainee.
The next sheet is for the person playing the initiator.
Initiator’s Role (The initiator is to be played by the same actor for all trainees.)
Instructions for the Initiator Beginning with scenario 1, read the sentence describing the scenario carefully; wait until the trainee is ready, and then read the comment in bold next to the Scenario in an angry manner.
Wait until you are told by the evaluator to move to the next scenario and follow the instructions above.
Test Scenario 1
You were just asked by your supervisor (the trainee) to serve on the same committee again. You are angry that they always ask you.
You start. Say angrily: “OH, NO YOU DON’T. I’VE BEEN ON THAT COMMITTEE THREE YEARS IN A ROW AND IT TAKES UP TOO MUCH TIME!”
Test Scenario 2
Your supervisor just talked to you about following procedures. You think, Why me? After all, no one follows procedures.
You start. Say angrily: “WHY ARE YOU PICKING ON ME ALL THE TIME? I’M NOT THE ONLY ONE WHO DOESN’T FOLLOW THESE STUPID PROCEDURES!”
Test Scenario 3
You were just asked by your supervisor for a second time today whether you will be attending the weekly meeting.
You say angrily: “I ALREADY TOLD YOU, I CAN’T ATTEND THE WEEKLY MEETING BECAUSE I HAVE TO COMPLETE THE STAFF REPORTS FOR TOMORROW!”
And so forth (for a total of 5).
The next sheet is for the trainee.
Trainee’s (Active Listener) Role Instructions for the trainee: This test will require you to respond to five different short scenarios in which you are a supervisor and you say something to a subordinate that elicits an angry response. You will be expected to respond using the skills of active listening. The description of each of the scenarios provides what you initially said to the subordinate. When you are ready for each of the scenarios to begin, nod your head to the initiator. At that time, the initiator will say something. You need to respond to the comment, and when complete, explain to the evaluator the rationale for your response.
Scenario 1 You asked a subordinate to continue working on a particular committee for another year. Listen; then respond using active listening. Nod your head when ready. . . .
Scenario 2 You just talked to a subordinate regarding the importance of following procedures. Listen; then respond using active listening. Nod your head when ready. . . .
Scenario 3 Today is the day of your weekly meeting. You asked if your subordinate would be attending the meeting; the answer was no. It is now time for the meeting and you call once more to check to see whether the subordinate can make the meeting. Listen; then respond using active listening. Nod your head when ready. . . .
And so forth (for a total of 5).
The next sheet is for the evaluator.
Evaluator’s Role Instructions to evaluator for scoring trainee responses: Trainee fails the scenario if the response is focused on the issue instead of reflecting what the initiator says. For example, a poor (fail) response to the first scenario would be one in which the trainee responds to the concern by dealing with the issue: “But you are my best person for the job” or “You have to do it; I have no one else” or “Look, I am asking you as a favor to me.”
Appropriate responses reflect what the person is saying, as in the first scenario: “So, you’re saying that being on the committee interferes with your doing your job” or “You feel you have done your share regarding committee work.”
It is also important that the response does not sound like a mimic of what the person said. Although at this time we do not expect perfection regarding responses, the responses must, at a minimum, sound sincere. Refer to the tape recordings provided to understand the difference between what we consider mimicking and acceptable.
For each of the five scenarios, there is an example of a poor (fail) response and an acceptable response. When the trainee explains his or her response, we expect the trainee to be able to identify the type of active listening response used (paraphrasing, decode and feedback, summarizing) and why it was chosen. Answers to why it was chosen are intended to show that the trainee understands the different methods, and thus any answer that does this is acceptable.
Scenario 1 The supervisor (trainee being tested) asked the subordinate to continue working on a particular committee for another year, and the subordinate responds. Listen to the supervisor’s response and grade according to guidelines.
Unacceptable response:
“I am willing to talk about reducing the work you have to do if you will be on it.”
Acceptable response:
“You don’t want to be on that committee again because it interferes with your work and you feel you have done your share.”
Scenario 2 The supervisor (trainee being tested) just talked to a subordinate regarding the importance of following procedures, and the subordinate responds. Listen to the supervisor’s response and grade according to guidelines.
Unacceptable response:
“You are not the only one I have talked to about this.”
Acceptable response:
“You believe that you’re the only one that I am singling out for not following procedures.”
Scenario 3 The supervisor (trainee being tested) called first thing in the morning and asked the subordinate if she would be attending the weekly meeting; the subordinate said, “No, I’m busy.” The supervisor just called again at meeting time to check to see whether the subordinate could make the meeting, and the subordinate responds. Listen to the supervisor’s response and grade.
Unacceptable response:
“The meeting will only be an hour.”
Acceptable response:
“You’re not able to attend the meeting because you are completing staff reports that are due tomorrow.”
And so forth (for a total of 5).
Note that we do not provide the test for determining the knowledge part of this objective, where the trainee is asked to explain his or her response orally.
The next objective is skill related and has to do with conflict resolution. See “Fabrics Role-Play Conflict Resolution” for an example of this. The objective is:
“In a role-play of an angry employee, the trainee will calm the person using the steps in the conflict resolution model, with help from a poster that lists the steps.”
Fabrics Role-Play Conflict Resolution Read the following to the trainee: The following role-play is designed to determine how well you, the trainee, have learned the conflict resolution skills. There are three roles here: initiator, active listener (you, the trainee), and evaluator. The initiator is a nontrainee who starts off very angry at something you did. You listen to what is said and respond using the conflict resolution model. The evaluator, who is trained in evaluating effective conflict resolution, listens to your response and evaluates it based on your effectiveness. The following forms (initiator’s role, active listener’s role, evaluator’s role) are given to the respective people, with the active listener’s role being given to you, the trainee.
The next sheet is for the person playing the initiator.
Initiator’s Role (The initiator is to be played by the same actor for all trainees.)
Instructions for the Initiator Read the role a couple of times and get in the mood suggested.
Be sure you understand the issues, so you can present them without referring to the role.
Once into the role, allow your own feelings to take over; if what the supervisor is saying makes you less angry, then act that way, and vice versa.
Do not refer back to the role after the role-play begins; simply act the way you normally would do in such circumstances.
Begin the role-play by presenting the points at the end of the role-play with anger.
To elicit an assertive response, interrupt the trainee at least once after the trainee begins to present his or her point of view. If the trainee allows the interruption, interrupt again until the trainee becomes assertive and asks you not to interrupt (maximum of four interruptions).
The Role of the Initiator Your name is Pat. You are the longest working machinist in the plant, with 25 years’ service. You taught many of those who are presently there, including most of those who were made supervisor recently. The company has been busy for the last number of years, and you have been called upon many times to provide the extra boost to get some projects out. You worked hard all your life and are starting to feel it in your bones. The work is getting harder and harder to complete, especially with the older lathes. With only three years to retirement, you are wishing you could afford to retire now. You are really worn out, that is, until you hear the news that the company just purchased one of those new computer-operated lathes. You feel confident that once you get to use the new machine you will be rejuvenated. In fact, the thought of getting to work on one of these new machines gives you goose bumps. You have not felt this excited in years. Actually, the thought of going back to school to learn about it is the most exciting thing, as it is making you feel young again. You are sorry that you missed today’s meeting at which they were going to talk about the new equipment, but your car would not start.
“Hey, did you hear the news?” your friend Bill called out.
“I don’t think so, what is it?” you replied.
“They just announced that Fred is going for training on the new computer-operated lathe. I guess he will be the one operating it.”
“Are you sure?” you ask.
“Yep, it was announced at the circle meeting this morning. He was selected to operate it and will be going for a two-week training course next week.”
You are furious. Fred was only just hired and is just a kid. You deserve first crack at the new machine, given your loyal service. Well, that is it. Your supervisor (the young guy you taught how to run a lathe before he got promoted) never did get along with you, and now this. Well, you are not going to take it. You walk into the supervisor’s office and in a loud voice start off by saying:
“What do you think you are doing? How can you give the new lathe to Fred, after all the years I have been here? This is not fair and I am not going to sit still for it.”
Be sure to continue the anger and bring up all the points mentioned in the role-play. Go over them again and again until the trainee calms you down.
The next sheet is for the trainee.
Trainee’s Role
Instructions for the Trainee Read your role a few times and be sure you understand the issues, so you can present them without referring to the role.
Do not refer back to the role after the role-play begins, but you can jot down a few points for reference.
Use the conflict resolution model to deal with the issue.
Nod at the initiator when you wish to begin.
The Role for the Trainee You are the supervisor of a manufacturing firm and have about 10 subordinates. They are all lathe operators, and you were also one until you recently got promoted. Your subordinates are all good people, and with the exception of Pat, who has been here for 25 years and is a few years away from retirement, all are fairly young and have at most 10 years’ service. Pat is a great machinist and knows more than everyone put together. He taught you the job when you had just started and, although you never really hit it off with him, you do respect his ability.
You are pretty excited these last few days, because the company just purchased a new computer-operated lathe. It is your understanding that you will be getting a new lathe each year until all are replaced. You are moving into the new age. Choosing only one of your machinists to go to training and be the first one on the new machine was a difficult decision. All were likely candidates, with the exception of Pat, who was too old to learn the new machine, computer stuff and all. Furthermore, why train Pat on a new machine when he will only be here a short time? It makes more sense to train those who will be able to use the new skills for the longest time. Anyway, Pat really knows how to operate the older machine better than anyone, so why move him? Finally, you came up with the perfect solution. The new guy, Fred, has not been trained on any machine yet, so training him on the new lathe would mean that no one else needed training for the time being. Putting anyone else on the new machine would mean training Fred on the old machine, then when they are phased out, retraining him on the computer-operated lathe. So you announced it today at your circle meeting. Everyone was pretty quiet, but they will get over it. Too bad Pat wasn’t there. Wonder if he is sick?
The next sheet is for the evaluator.
Evaluator’s Role
Instructions for the Evaluator The trainee fails the scenario if the initial response is focused on the issue instead of reflecting what the initiator says. For example, a poor (fail) response would be if the first comment to Pat was “I did not think you wanted it” or “It is probably too complicated for you” or “We value your contribution” or “You’re the best we’ve got on the old machine, and we need you there.”
Keys to successfully passing this exercise are to
actively listen to Pat (using the active listening skills) and
question to obtain as much information as possible before dealing with the issue.
To be successful, it is expected that the trainee will use active listening and questions at least four to six times (preferably more) before moving to the trainee’s point of view. The key is to note how much the initiator has calmed down.
Be sure the trainee indicates respect (must have at least one phrase such as “I can appreciate why you feel you should have the opportunity to receive the training. It makes sense that you believe after such long and loyal service you should receive some reward”).
Be assertive, not aggressive, if necessary to present points.
When interrupted, the trainee must use the proper assertive response to inhibit interruptions. The trainee is given four opportunities to be assertive, since the role requires interruptions until an assertive response is given (up to four). Note how that interruption is handled; the trainee needs to be assertive (for example, “I have carefully listened to everything you have had to say; I think it only fair that now you give me a chance to respond, okay?”).
Provide the supervisor’s points as “point of view,” not correct point of view.
The role-play will begin with the initiator being angry. Response can be a summary of these points, paraphrase of one of them, or decode and feedback regarding emotion expressed, but not anything dealing with the specific issue. Use the following form to assist in the evaluation of the trainee.
Evaluator Report Form Put a mark next to each of the responses in terms of their type. Try to jot down the words used in some of the cases to enable you to provide specific feedback.
Active Listening Nonverbal behavior
Say more responses
Paraphrase
Decode and feedback
Summarize
Indicate Respect
Use of active listening
Questioning
Show acceptance of other’s point of view
Be Assertive Needs to be phrased in terms of YOUR POINT OF VIEW
My perception is . . .
It seems to me that . . .
It is my belief that . . . and so forth.
Provide Information Use collaboration (problem solving) or compromise (negotiate). Note: Although this response is a part of the conflict resolution model, it is not part of the learning objectives for this training; therefore, it is not evaluated in this training program.
You will note that a standardized scoring key, examples of acceptable and unacceptable behavior of the trainee, and a checklist for different responses are provided for the evaluator.
The aforementioned are evaluations related to learning, but we still need to consider behavior (transfer of training) and organizational results. The owner in the Fabrics, Inc., case is not interested in doing any of this type of evaluation. Recall that we indicated that an evaluation using elaborate designs is nice but seldom happens in reality.
The owner in the Fabrics, Inc., case does not want us to assess any transfer of behaviors to the job. His argument is that his primary interest is in getting fewer complaints from employees and customers. He notes that in a small organization such as his, these changes (lowering of complaints) are proof enough that training was successful. We agree, so the evaluation will consist of gathering weekly archival information on complaints from customers and subordinates as a baseline (gathering it for two months prior to the training) and tracking it for six months after training is complete.
Key Terms Affective questionnaire
Control group *
Cost/benefit evaluation
Cost-effectiveness evaluation
Cost savings analysis
External validity *
History *
Initial group differences *
Instrumentation *
Internal referencing strategy (IRS) *
Internal validity *
Job behavior outcomes
Learning outcomes
Maturation *
Organizational results
Outcome evaluation
Process evaluation
Random assignment *
Reaction outcomes
Representative sampling *
Single-case design *
Statistical regression *
Utility analysis
Utility questionnaire
* These key terms appear only in appendices 9-1 and 9-2.
Questions for Review 1. What is the relationship among the four levels of evaluation? Would you argue for examining all four levels if your boss suggested that you should look only at the last one (results) and that if it improved, you would know that training had some effect?
2. What is the difference between cost/benefit evaluation and cost-effectiveness evaluation? When would you use each, and why?
3. What is the difference between cost-effectiveness evaluation and utility analysis? When, if ever, would you use utility rather than cost-effectiveness? Why?
4. Assume that you were the training manager in the Westcan case (in Chapter 4). How would you suggest evaluating the training, assuming they were about to conduct it as suggested in the case? Be as specific as you can.
5. Of all the designs presented in Appendix 9-1, which one would you consider to be most effective while also being practical enough to convince an organization to adopt it? If your design involved representative sampling, how would you accomplish it?
Exercises 1. Examine the reaction questionnaire that your school uses. Is it designed to rate the course content or the instructors? Does it meet the requirements of a sound reaction questionnaire? Why or why not? Explain how you would improve it (if possible).
2. Break into small groups, with each group containing at least one member who previously received some type of training in an organization. Interview that person on what the training was designed to teach and how it was evaluated. Did the evaluation cover all the levels of outcomes? How did the trainee feel about the evaluation? Devise your own methods for evaluating each of the levels based on the person’s description of the training.
3. Go to the role-play for active listening in the Fabrics, Inc., example. In groups of five or six, choose someone to be the initiator and someone to be the trainee. Have them go through the role-play while the rest evaluate the trainee’s response on a scale of 1 to 7 (1 being poor and 7 being excellent). Now share your scores. Were they all exactly the same? If not, how could you make the instrument more reliable? If they were all the same, why was that? Is there anything you would suggest to make the evaluation process easier?
Case Analysis You run Tricky Nicky’s Carpet Cleaning Co., which cleans carpets for businesses. On average, one carpet cleaner can clean six offices per eight-hour shift. Currently, 100 cleaners work for you, and they work 250 days per year. Supervisors inspect carpets when cleaners notify them that the carpet is done. Because of Nicky’s “Satisfaction Guarantee,” when a carpet does not meet the standard, it is redone immediately at no extra cost to the client. A recent analysis of the rework required found that, on average, one in every six carpets cleaned does not meet Nicky’s standards.
The profit averages $20 a cleaning. You pay your cleaners $15 per hour. When you re-clean a carpet, it is done on overtime and you lose, on average, $20 in labor costs. On average, your profit is gone. In addition, there is an average cost of materials and equipment of $2.00 per office.
Your training manager conducted a needs assessment regarding this issue at your request. He reported that half the employees are not reaching the standard one in nine times, and the other half are not meeting the standard two in nine times, for an overall average of one in six [(1/9 + 2/9)/2 = 1/6]. The needs assessment also indicated that the cause was a lack of KSAs in both cases.
The training manager proposes a training program that he estimates will reduce everyone’s errors to 1 carpet in 12 (half the current level). The training would take four hours and could handle 20 employees per session.
The following costs reflect delivery of five training sessions of 20 employees each and assume 250 working days in a year.
Developmental Costs
20 days of training manager’s time for design and development at $40,000 per year $3,200
Miscellaneous $800
Direct Costs
4 hours per session at $40,000 per year (trainer) $400
Training facility and equipment $500
Materials $2,000
Refreshments $600
Employee salaries at $20 per hour per employee (Nicky decides to do training on a Saturday and pay employees an extra $5 per hour as overtime) $8,000
Lost profit (none because training is done on overtime) $0
Indirect Costs
Evaluation of training; 10 days of training manager’s time at $40,000 per year $1,600
Material and equipment $600
Clerical support—20 hours at $10 per hour $200
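As a quick check on the figures above, the line items can be totaled by category. The short Python sketch below simply copies the amounts from the cost table; the dictionary keys and variable names are our own illustrative labels, not part of the case, and the daily rate assumes the 250-day work year stated in the case.

```python
# Amounts (in dollars) taken from the cost table above.
# Key names and variable names are illustrative labels, not part of the case.
WORKING_DAYS_PER_YEAR = 250      # stated in the case
MANAGER_SALARY = 40_000          # training manager's annual salary, from the table
daily_rate = MANAGER_SALARY / WORKING_DAYS_PER_YEAR  # 160 dollars per day

costs = {
    "developmental": {
        "design_and_development": 20 * daily_rate,  # 20 days of manager time
        "miscellaneous": 800,
    },
    "direct": {
        "trainer": 400,
        "facility_and_equipment": 500,
        "materials": 2_000,
        "refreshments": 600,
        "employee_salaries": 20 * 4 * 100,  # $20/hour x 4 hours x 100 employees
        "lost_profit": 0,
    },
    "indirect": {
        "evaluation": 10 * daily_rate,      # 10 days of manager time
        "material_and_equipment": 600,
        "clerical_support": 20 * 10,        # 20 hours at $10 per hour
    },
}

# Subtotal each category, then add the subtotals for the overall cost.
subtotals = {category: sum(items.values()) for category, items in costs.items()}
total_cost = sum(subtotals.values())

for category, amount in subtotals.items():
    print(f"{category:>14}: ${amount:,.0f}")
print(f"{'total':>14}: ${total_cost:,.0f}")
```

Under these figures, the developmental, direct, and indirect subtotals come to $4,000, $11,500, and $2,400, for a total of $17,900 when all 100 employees are trained.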
Case Questions 1. How much does the re-cleaning cost Nicky per year? Show all mathematical calculations.
2. If everyone is trained, how much will the training cost? How much will training cost if only the group with the most errors is trained? Show costs in a spreadsheet and all mathematical calculations.
3. If everyone is trained, what is the cost savings for the first year? If only the group with the highest re-cleaning requirements is trained, what is the cost savings for the first year? Show all mathematical calculations.
4. What is your recommendation for this training based on the expected return on investment? Should just the group with the most re-cleanings be trained or should both groups be trained? Provide a rationale for your recommendation that includes both the financial as well as other factors that may be important in making this decision. Show any mathematical calculations used.
5. Let’s back up and assume that employees had the KSAs needed to clean the offices effectively. What other factors might you look at as potential causes of the re-cleaning problem?
Web Research Conduct a search of the Internet to identify eight distinct reasons for conducting an evaluation of training. Document the source of these reasons, and compare the list with reasons cited in the chapter.
Web Sites of Interest Research Methods on the WWW—Questionnaires
http://www.slais.ubc.ca/resources/research_methods/questions.htm
http://www.socialresearchmethods.net/kb/index.php
Appendix 9-1
Evaluation: The Validity Issues Once it is decided to evaluate training, it is important to be reasonably sure that the findings on the effectiveness of training will be valid. After all, evaluation is both time-consuming and costly.
Let’s say that Sue is sent to a one-week training seminar on the operation of Windows. According to the needs analysis, she clearly did not know much about how to operate a computer in a Windows environment. After training, she is tested, and it is determined that she has learned a great deal. Training was effective. Perhaps—but several other factors could also result in her learning how to operate in a Windows environment. Her own interest in Windows might lead her to learn it on her own. The question is: “How certain is it that the improvement was a function of the training that you provided?” In other words, does the evaluation exhibit internal validity? Once internal validity is ensured, the next question is “Will the training be effective for other groups who go through the same training?” That is, does training show external validity? We will deal with internal and external validity separately. These “threats” are not specific to training evaluation but relate to evaluation in general. When we discuss each of the threats, we will indicate when it is not a serious threat in the training context.
Internal Validity Internal validity is the confidence that the results of the evaluation are in fact correct. Even when an improvement is demonstrated after training, the concern is that perhaps the change occurred for reasons other than training. To address this problem, it is necessary to examine factors that might compromise the findings; these are called threats to internal validity.
History History refers to events other than training that take place concurrently with the training program. The argument is that those other events caused learning to occur. Consider the example of Sue’s computer training. Sue is eager to learn about computers, so she buys some books and works extra hard at home, and attends the training. At the end of training, she demonstrates that she has learned a great deal, but is this learning a function of training? It might just as well be that all her hard work at home caused her to learn so much.
In a half-day training seminar, is history likely to be a concern? Not really. What about a one-day seminar or a one-week seminar? The more that training is spread across time, the more likely history could be a factor in the learning that takes place.
Maturation Maturation refers to changes that occur because of the passage of time (e.g., growing older, hungrier, fatigued, bored). If Sue’s one-week training program was so intense that she became tired, when it came time to take the posttraining test, her performance would not reflect how much she had learned. Making sure that the testing is done when trainees are fresh reduces this threat. Other maturation threats can usually be handled in a similar manner by being sure that training and testing are not so intense as to create physical or mental fatigue.
Testing Testing also has an influence on learning. Suppose the pretest and posttest of the knowledge, skills, and attitudes (KSAs) are the same test. The questions on the pretest could sensitize trainees to pay particular attention to certain issues. Furthermore, the questions might generate interest, and the trainees might later discuss many of them and work out the answers before or during training. Thus, learning demonstrated in the posttest may be a function not of the training, but of the pretest. In Sue’s case, the needs analysis that served as the pretest for evaluation got her thinking about all the material contained in the test. Then, she focused on these issues in training. This situation presents less of a validity problem if pretests are given in every case and if they are comprehensive enough to cover all of the material taught. Comprehensive testing will also make it difficult for trainees to recall specific questions.
Instrumentation Instrumentation is also a concern. The problem arises if the same test is used in the pretest and posttest, as was already noted. If a different but equivalent test is used, however, the question becomes “Is it really equivalent?” Differences in instrumentation used could cause differences in the two scores. Also, if the rating requires judgments, the differences between pre- and posttest scores could be a function of different people doing the rating.
For Sue, the posttest was more difficult than the pretest, and even though she learned a great deal in the computer training, her posttest score was actually lower than the pretest, suggesting that she did not learn anything. If the test items for both tests were chosen randomly from a large population of items, it would not be much of a concern. For behavioral tests where raters make subjective decisions, this discrepancy may be more of a concern, but careful criteria development can help to deal with it.
Statistical Regression Statistical regression is the tendency for those who score either very high or very low on a test to “regress to the middle” when taking the test again. This phenomenon, known as regression to the mean, occurs because no test is perfect and differences result as a function of measurement error. Those who are going to training will, by definition, score low for the KSAs to be covered in training and so will score low on their pretest. The tendency, therefore, will be for them to regress to the mean and improve their scores, irrespective of training. In the earlier example, Sue did not know much about computers. Imagine that she got all the questions on the pretest wrong. The likelihood of that happening twice is very low, so on another test she is bound to do better.
This threat to internal validity can be controlled through various evaluation designs that we will discuss later. In addition, the use of control groups and random assignment (when possible) goes a long way toward resolving statistical regression.
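Regression to the mean can be illustrated with a small simulation. The sketch below is only an illustration under assumptions we have chosen ourselves: every score is a stable "true" skill plus random measurement error, no training takes place between the two tests, and the score scale, error size, and 10 percent cutoff are all hypothetical. Even so, the lowest scorers on the first test tend to score higher, on average, on the second.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

N = 10_000  # hypothetical number of test takers

# Assumption: each person has a stable true skill level; no training occurs.
true_skill = [random.gauss(50, 10) for _ in range(N)]

def observed_score(skill):
    """One test administration: true skill plus random measurement error."""
    return skill + random.gauss(0, 10)

pretest = [observed_score(s) for s in true_skill]
retest = [observed_score(s) for s in true_skill]

# Select the people who scored in the lowest 10 percent on the pretest,
# much as low scorers are the ones typically sent to training.
cutoff = sorted(pretest)[N // 10]
low_scorers = [i for i in range(N) if pretest[i] < cutoff]

low_pretest_mean = sum(pretest[i] for i in low_scorers) / len(low_scorers)
low_retest_mean = sum(retest[i] for i in low_scorers) / len(low_scorers)

print(f"Low scorers, pretest mean: {low_pretest_mean:.1f}")
print(f"Low scorers, retest mean:  {low_retest_mean:.1f}")
```

Because nothing happened between the two tests, any rise in the low scorers' average is due to measurement error alone. A control group selected the same way would show the same rise, which is why comparing trained and control groups helps separate the training effect from regression.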
Initial Group Differences (Selection) Initial group differences can also be a concern. For example, in some cases, to provide an effective evaluation, a comparison is made between the trainees and a similar group of employees who were not trained, known as the control group. It is important that the control group be similar in every way to the training group. Otherwise, the inherent differences between the groups might be the cause of differences after the training. Suppose that those selected for training are the up-and-coming stars of the department. After training, they may in fact perform much better than those not considered up and coming, but the problem is that they were better from the start and more motivated to improve. Therefore, if Sue is one of the highly motivated trainees, as are all her cohorts in training, they would potentially perform better even without training.
This problem does not arise if everyone is to be trained. The solution is simply to mix the two types, so both the group to be trained and the control group contain both types.
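One way to mix the two types is to assign people at random within each type, so that each group receives the same share of both. The sketch below assumes a hypothetical department of 10 "up-and-coming" and 10 "typical" employees; the function name, employee labels, and seed are ours, chosen only for illustration.

```python
import random

def stratified_assignment(strata, seed=None):
    """Randomly split each type of employee evenly between two groups.

    strata maps a type label to the list of employees of that type.
    Returns (training_group, control_group).
    """
    rng = random.Random(seed)
    training_group, control_group = [], []
    for members in strata.values():
        shuffled = list(members)   # copy so the caller's list is not reordered
        rng.shuffle(shuffled)      # random order within this type
        half = len(shuffled) // 2
        training_group.extend(shuffled[:half])
        control_group.extend(shuffled[half:])
    return training_group, control_group

# Hypothetical department: names are placeholders, not data from the text.
strata = {
    "up_and_coming": [f"star_{i}" for i in range(10)],
    "typical": [f"typical_{i}" for i in range(10)],
}

training_group, control_group = stratified_assignment(strata, seed=42)
print("Training group:", sorted(training_group))
print("Control group: ", sorted(control_group))
```

With this split, each group contains five employees of each type, so a posttraining difference between the groups cannot be attributed to one group having started with more of the department's stars.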
Loss of Group Members (Mortality) In this situation, those who did poorly on the pretest are demoralized because of their low score and soon drop out of training. The control group remains intact. As a result, the trained group does better in the posttest than the control group, because the poorer-scoring members left the trained group, artificially raising the average score. The opposite could occur if, for some reason, members of the control group dropped out.
This situation becomes more of a problem when the groups are made up of volunteers. In an organizational setting, those who go to training are unlikely to drop out. Also, all department members who agree to be in the control group are a captive audience and are unlikely to refuse to take the posttest. Although some transfers and terminations do occur to affect the numbers of participants, they are usually not significant.
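A small numerical example shows how dropouts alone can create an apparent training effect. The scores and the dropout cutoff below are hypothetical, and we deliberately assume that training has no effect at all, so each person's posttest score equals his or her pretest score.

```python
from statistics import mean

# Hypothetical scores. Both groups start out identical, and we assume
# training has NO effect, so posttest scores equal pretest scores.
pretest = [40, 50, 60, 70, 80, 90]
control_posttest = list(pretest)
trained_posttest_if_all_stayed = list(pretest)

# Assumption: trainees who scored below 60 on the pretest become
# demoralized and drop out before the posttest.
DROPOUT_CUTOFF = 60
trained_posttest = [
    post
    for pre, post in zip(pretest, trained_posttest_if_all_stayed)
    if pre >= DROPOUT_CUTOFF
]

control_mean = mean(control_posttest)
trained_mean = mean(trained_posttest)
apparent_gain = trained_mean - control_mean

print(f"Control group posttest mean: {control_mean}")
print(f"Trained group posttest mean: {trained_mean}")
print(f"Apparent training effect:    {apparent_gain}")
```

Here the trained group averages 75 against the control group's 65, a 10-point "effect" produced entirely by the loss of the two lowest scorers.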
Diffusion of Training When trainees interact with the control group in the workplace, they may share the knowledge or skill they are learning. For example, when Sue is back in the office, she shows a few of the other administrative assistants what she has learned. They are in the control group. When the posttest is given, they do as well as the trained group, because they were exposed to much of what went on in training. In this case, training would be seen as ineffective, when in fact it was effective. This would be especially true if certain quotas of trainees were selected from each department. When such sharing of information reduces differences between the groups in this way, determining the effectiveness of the training could be difficult.
Compensating Treatments When the control group and training group come from different departments, administrators might be concerned that the control group is at an unfair disadvantage. Comments such as “Why do they receive the new training?” or “We are all expected to perform the same, but they get the help” would suggest that the control group feels slighted. To compensate for this inequity, the managers of the control group’s department might offer special assistance or make special arrangements to help their group. For example, let’s look at trainees who are learning how to install telephones more efficiently. Their productivity begins to rise, but because the supervisors of the control group feel sorry for the control group, they help its members to get the work done, thereby increasing the control group’s productivity. The evaluation would show no difference in productivity between the two groups after training is complete.
Compensatory Rivalry If the training is being given to one particular intact work group, the other intact work group might see this situation as a challenge and compete for higher productivity. Although the trained group is working smarter and improving its productivity, the control group works harder still and perhaps equals the productivity of the trainees. The result is that, although the training is effective, it will not show up in the evaluation.
Demoralized Control Group The control group could believe that it was made the control group because it was not as good as the training group. Rather than rivalry, the response could be to give up and actually reduce productivity. As a result, a difference between the two groups would be identified, but it would be a function of the drop in productivity and not the training. Even if training were effective, the test results would be exaggerated.
These threats to validity indicate the importance of tracking the process in the evaluation. Just as data are gathered about what is occurring in the training, it is also useful to gather data about what is going on with the control group.
External Validity The evaluation must be internally valid before it can be externally valid. If evaluation indicated that training was successful and threats to internal validity were minimal, you would believe that the training was successful for that particular group. The next question is, “Will the training be effective for the rest of the employees slated to attend training?” External validity (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_020) is the confidence that these findings will generalize to others who undergo the training. A number of factors threaten external validity.
Testing If the training is evaluated initially by means of pre- and posttests, and if future training does not use the pretest, it can be difficult to conclude that future training would be as effective. Perhaps those in the initial training focused on particular material, because it was highlighted in the pretest. If the pretest is then not used, other trainees will not have the same cues. The solution is simple: Pretest everyone taking the training. Remember that pretest data can be gathered during the needs analysis.
Selection Suppose that a particular program designed to teach communication skills is highly effective with middle-level managers, but when a program with the same design is given to shop-floor workers, it does not work. Why? It might be differences in motivation or in entering KSAs, but remember that you cannot be sure that a training program that was successful with one group of trainees will be successful with all groups. Once it is successful with middle managers, it can be assumed that it will be successful with other, similar middle managers. However, if it is to be used to train entry-level accountants, you could not say with confidence that it would be successful (that it had external validity) until it was evaluated.
One of the authors was hired to assist in providing team skills to a large number of employees in a large manufacturing plant. The first few sessions with managers went reasonably well; the managers seemed to be involved and learned a great deal. After about a month, training began for the blue-collar workers, using the identical processes, which included a fair amount of theory. It soon became evident that trainees were bored, confused, and uninterested. In a discussion about the problem, the project leader commented, “I’m not surprised. This program was designed for executives.” In retrospect, it is surprising that lower-level managers received the training so well, given that it was designed for executives.
Reaction to Evaluation In many situations, once the training is determined to be effective, further evaluation is deemed unnecessary. Thus, some of the trainees who went through the program were evaluated and some were not. The very nature of evaluation causes more attention to be given to those who are evaluated. Recall the Hawthorne Studies that indicated the power of evaluation in an intervention. The Hawthorne Effect is explained by the following:
The trainees perceived the training as a novelty;
The trainees felt themselves to be special because of being singled out for training;
The trainees received specific feedback on how they were doing;
The trainees knew they were being observed, so they wanted to perform to the best of their ability; and
The enthusiasm of the instructor inspired the trainees to perform at a high level.
Whatever the mechanism, those who receive more attention might respond better as a function of that attention. As with the other threats to external validity, when the way groups are treated is changed, the training’s external validity is jeopardized.
Multiple Techniques In clinical studies, a patient receives Dose A. It does not have an effect, so a month later she receives Dose B, which does not have an effect, so she receives Dose C and is cured. Did Dose C cure her? Perhaps, but it could also be that it was the combination of A, B, and C that resulted in the required effect. The use of multiple techniques could influence training when some component of the training is changed from one group to the next. For example, a group received one-on-one coaching and then video instruction. The members did poorly after receiving the coaching but excelled after receiving the video instruction, so video instruction became the method used to train future employees. It was not successful, however, because it was the combination of coaching and video instruction that resulted in the initial success.
What Does It All Mean? It is useful to understand the preceding issues to recognize why it is difficult to suggest with certainty that training or any other intervention is the cause of any improvement. We cannot be absolutely certain about the internal or external validity when measuring things such as learning, behavior, and organizational results. Careful consideration of these issues, however, and the use of well-thought-out designs for the evaluation can improve the likelihood that training, when shown to be effective, is in fact effective (internal validity) and will be effective in the future (external validity). This information is useful for assessing training and, equally important, for assessing evaluations done by outside vendors.
Evaluation Design Issues A number of texts provide excellent information on appropriate designs for conducting evaluations. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_55) Unfortunately, many of their recommended designs are impractical in most organizational settings. Finding the time or resources to create a control group is difficult at best. Getting approval to do pretests on control groups takes away from productive time and is difficult to justify.
Scientifically valid research designs are difficult to implement, so organizations often use evaluation designs that are generally not acceptable to the scientific community. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_56) However, it is still possible to have some confidence in the results with less rigorous designs. Some research designs are less than perfect, but it is possible to find ways of improving them. The two designs most often used, and most criticized by scientists, are the posttest-only and the pretest/posttest methods. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_57)
Basic Designs
Posttest Only The posttest-only method occurs when training is followed by a test of the KSAs. The posttest-only design is not appropriate in some instances. At other times, however, the method is completely acceptable. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_58) The two possible goals of evaluation are to determine
1. whether change took place and
2. whether a level of competence was reached.
If the goal of the training is the latter, a posttest-only design should suffice. If, for example, legal requirements state that everyone in the company who handles hazardous waste be trained to understand what to do in an emergency, then presumably this training needs only to provide a test at the end to confirm that all trainees reached the required level of knowledge. As more companies are required to be ISO 9000 (or equivalent) certified, it will be increasingly important to prove that employees possess the required skills. As a result, certification will become the goal of employee training, and in that case the posttest-only will suffice.
We frequently mention the value in doing a needs analysis. Conducting a needs analysis provides pretest data, making the posttest-only design moot. Giving the posttest automatically applies a pretest/posttest design. Furthermore, in the absence of a TNA, archival data may serve as the pretest. Performance appraisals, measures of quality, and the like might allow for some pre/post comparison. Although such historical data may not be ideal, it could provide some information as to the effectiveness of training. Alternatively, it is possible to identify an equivalent group and provide its members with the same posttest, thereby turning the design into a posttest only with control group. Suddenly, a much more meaningful design is created.
The posttest-only design as it stands is problematic for assessing change. A number of other competing causes could be responsible for the change such as history, maturation, instrumentation, selection and mortality. Nevertheless, we would agree with other professionals that any evaluation is better than none. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_59) Gathering any pretraining information that might suggest that the level of KSAs before training was lower than in the posttest would help to bolster the conclusion that training was effective.
Pretest/Posttest The pretest/posttest design is the other method organizations frequently use. Here, a pretest is given (T1), training is provided (×), and then a posttest is given (T2). This design is expressed as T1 × T2.
This design can demonstrate that change has occurred. But even though it can be demonstrated that KSAs have changed, it is not possible to say that training is responsible for those changes. There are several threats to internal validity (history, maturation, testing, instrumentation, and possibly regression). For example, you might have been training a group of machine operators to operate new drill-press machines. Pretesting the trainees revealed that none knew how to operate the machine. After a three-day training session, a posttest showed that, on average, the trainees could operate the machine correctly 85 percent of the time. A big success? Not if the supervisor of the work group says that the ones without training can operate the machines correctly 95 percent of the time by just reading the manuals and practicing on their own. Several different reasons might explain why those who did not go to training are performing better on the job. Perhaps they already knew how to operate the machine. Perhaps a manufacturer’s representative came and provided on-the-floor training to them. Or, it could be that your training somehow slowed down the learning process. And there is still the issue of external validity, where testing, selection, and possibly reaction to evaluation are cause for concern. Therefore, it would be useful to have a control group.
In many instances, using a control group is simply not an option. Does that mean that the trainer should not bother to do anything? Absolutely not! In fact, it is better to do something than nothing. We tend to focus on the negative aspects of the preexperimental designs rather than to examine ways of using them most effectively when other options do not exist. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_60) The pretest/posttest design with no control group at least establishes that changes did take place. History can be examined through some detective work. Recall that Sue had learned a great deal about operating in a Windows environment according to the pretest/posttest. Did she do extra reading at home? Did she practice on her own irrespective of training expectations? Did she get some help from someone at the office or elsewhere? Simply asking her might indicate that none of those factors occurred, suggesting that it was in fact the training. This process may be particularly relevant for the small business, where size makes it easier to identify potential threats.
Internal Referencing Strategy Another way of dealing with the lack of a control group is to use the internal referencing strategy (IRS) (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_021) . (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_61) With this method, include both relevant and nonrelevant test questions in the pre- and posttest. Here’s how it works.
Both pretests and posttests contain questions that deal with the training content and questions that deal with related content not in the training. In the pretest, trainees will do poorly on both sets of questions. In the posttest, if training is effective, improvement should only be shown for the trained items. The nonrelevant items serve as a control. In their research on the IRS, Haccoun and Hamtiaux noted that the results obtained from the IRS design were identical to those obtained when a control group was used. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_62) This method deals with many of the concerns that arise when a control group is used, and with several other concerns. Many of the threats to internal validity do not exist with the IRS because, with no control group to react in an inappropriate manner, issues such as diffusion of training, compensatory treatment, and compensatory rivalry are not a concern. The only threats are history, maturation, testing, statistical regression, and instrumentation.
As previously noted, history can be investigated through examination of the time frame in which training has occurred. Any events that potentially affected the trainees could be assessed as to their effect. Also, given that the relevant and nonrelevant items are similar in nature in the IRS, any historical event should affect both types of items in a similar manner. Maturation issues can be dealt with by ensuring that the training is designed to keep trainees interested and motivated, and to prevent them from becoming tired or fatigued. The reactive effect of testing can be dealt with if parallel tests are used. Parallel tests cover the same content but do not use identical questions. This technique does lead to another potential problem (instrumentation) that can be addressed. If all trainees receive a comprehensive pretest, then instrumentation is not an issue.
Instrumentation is a concern if two different tests are used. If a large pool of items is developed from which test items can be chosen at random, the result should be equivalent tests. Once again, it is important to note that in any evaluation, we can never be 100 percent sure that training has caused the improvement. We are not suggesting that this design take the place of more stringent designs when they are practical. It is appropriate, however, when the alternative is posttest-only or nothing. Again, some control is better than none at all.
One final note: The IRS design can be used to determine improvement in KSAs, but research indicates that it tends to show that training is not effective when, in fact, it is. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_63) In other words, the training must provide a substantial improvement from pretest to posttest for it to be detected by this design.
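The logic of the IRS can be sketched in a few lines: the gain on the trained (relevant) items is compared with the gain on the untrained (nonrelevant) items, which act as the control. This is only an illustration under simple assumptions; the function name `irs_effect` and all scores are hypothetical, and a real evaluation would also test whether the difference is statistically meaningful.

```python
from statistics import mean

def irs_effect(pre_relevant, post_relevant, pre_nonrelevant, post_nonrelevant):
    """Gain on trained items minus gain on untrained (control) items.
    A clearly positive value is consistent with a training effect."""
    relevant_gain = mean(post_relevant) - mean(pre_relevant)
    nonrelevant_gain = mean(post_nonrelevant) - mean(pre_nonrelevant)
    return relevant_gain - nonrelevant_gain

# Hypothetical item scores for four trainees (number correct per item set).
effect = irs_effect(
    pre_relevant=[4, 5, 3, 4],      # trained items, before training
    post_relevant=[8, 9, 7, 8],     # trained items, after training
    pre_nonrelevant=[4, 4, 5, 3],   # untrained items, before training
    post_nonrelevant=[5, 4, 5, 4],  # untrained items, after training
)
print(effect)
```

Any event that raises scores on both item sets (history or testing, for example) is subtracted out, which is why the nonrelevant items can stand in for a control group.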
More Complex Designs Two factors need to be considered when developing a sound evaluation design:
1. Control groups
2. Random assignment
The control group is a group of similar employees who do not receive the training. The control group is used to determine whether changes that take place in trainees also take place for those who do not receive training. If change occurs only in the trainees, it is probably a result of training. If it occurs in both trained and untrained groups, it is probably a result of some other factor.
Random assignment (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_022) is the placement of employees in either the control group or the training group by chance, to ensure that the groups are equivalent. Random assignment is more applicable to experimental laboratories than to applied settings (such as in training) for two reasons. First, given the small number of employees placed in one group or the other, the theory of randomness is not likely to hold true. When we split a group of 60 employees into two groups of 30, it is quite likely that real differences will be present within the two groups. Random assignment works well when multiple groups of 30 are used, or when the total number of subjects is quite large (e.g., 500).
Second, it is unlikely that the organization can afford the luxury of randomly assigning employees to each group. The work still needs to be done, and managers would want some control over who will be in training at a specific time. For this reason, finding the best match of employees is important so that the control group contains a sample representative of employees who are in the training group. Representative sampling (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_023) is matching employees in the control group and training group on factors such as age, tenure, and education to make the groups as equivalent as possible. The following discussion covers several designs that use control groups. We believe that assigning trainees through representative sampling is a more effective way of obtaining equivalent groups.
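One simple way to approximate representative sampling is to sort employees on a matching factor, pair up neighbors, and place one member of each pair in each group. The sketch below is a simplification that matches on a single factor (tenure); the function name `matched_assignment` and the employee data are hypothetical, and matching on several factors at once would need a more elaborate procedure.

```python
def matched_assignment(employees, key):
    """Sort employees on a matching variable, pair adjacent employees, and
    split each pair between the training and control groups. The direction
    alternates from pair to pair so neither group always gets the lower
    value. With an odd number of employees, the last one is left unassigned."""
    ordered = sorted(employees, key=key)
    training, control = [], []
    for pair_index, i in enumerate(range(0, len(ordered) - 1, 2)):
        first, second = ordered[i], ordered[i + 1]
        if pair_index % 2 == 0:
            training.append(first)
            control.append(second)
        else:
            training.append(second)
            control.append(first)
    return training, control

# Hypothetical employees as (name, years of tenure).
staff = [("A", 1), ("B", 2), ("C", 9), ("D", 10),
         ("E", 5), ("F", 6), ("G", 14), ("H", 15)]

training, control = matched_assignment(staff, key=lambda e: e[1])
print(training)
print(control)
```

With these invented figures, the two groups end up with the same total tenure, which a purely random split of eight people would not guarantee.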
Posttest Only with Control Group The following represents posttesting only with a control group:
Trained Group × T2
Control Group T2
This design and the following one are equivalent in that they deal effectively with all internal validity issues.
If for some reason a pretest was not conducted or if the trainer did not provide a pretest to a control group at the beginning of training, the trainees can be compared with a control group using a posttest-only design. Differences in test scores noted between the groups, if trainees do better, provide evidence of the success of the training. The tendency is to downplay the effectiveness of this design, because no pretest assessed the equivalence of the groups before training. But if representative sampling has resulted in the groups being equivalent, there is no need to have a pretest. Of course, there is greater confidence regarding the equivalence of the groups if there was a pretest.
Pretest/Posttest with Control Group The expression for pretest/posttest with a control group is as follows:
Trained Group T1 × T2
Control Group T1 T2
This design is one of the more favorable for eliminating threats to internal validity. Recall that we do not use random assignment in dividing the groups. So, how equivalent are they? A pretest can determine their level of equivalence. Equivalent pretests in both groups provide you with one more piece of evidence that the groups are equal, and posttest differences (if the trained group obtains higher scores) will suggest that training was successful.
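The two checks described here, pretest equivalence and a larger gain for the trained group, can be sketched as follows. The function name `training_effect` and all scores are hypothetical, and the sketch compares simple averages only; it does not include a significance test.

```python
from statistics import mean

def training_effect(trained_pre, trained_post, control_pre, control_post):
    """Return (pretest_gap, effect).
    pretest_gap: difference between the groups' pretest means; a value
                 near zero suggests the groups started out equivalent.
    effect:      the trained group's gain minus the control group's gain."""
    pretest_gap = mean(trained_pre) - mean(control_pre)
    trained_gain = mean(trained_post) - mean(trained_pre)
    control_gain = mean(control_post) - mean(control_pre)
    return pretest_gap, trained_gain - control_gain

# Hypothetical test scores for four employees per group.
gap, effect = training_effect(
    trained_pre=[50, 55, 60, 55],
    trained_post=[70, 75, 80, 75],
    control_pre=[52, 54, 58, 56],
    control_post=[55, 57, 61, 59],
)
print(gap, effect)
```

Subtracting the control group's gain removes change that would have occurred without training, such as the effect of taking the same test twice.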
Time Series Design The time series design is represented by:
This design uses a series of measurements before and after training. In this way, the likelihood of internal validity threats such as testing or regression to the mean is minimized. Also, when everyone attends training at the same time (a one-shot training program), this design can be used whether the number is large or small. In such a case it could still be argued that with no control group, there are alternative reasons for any change. But in an applied setting, the goal is to be as sure as possible about the results, given organizational constraints. If enough measures are taken pre- and posttraining to deal with fluctuations in performance, changes after training are certainly suggestive of learning. Remember that in an applied setting, there will never be absolute certainty regarding the impact of training, but taking care to use the best possible design (considering constraints) is still better than doing nothing at all.
To make this design more powerful, consider adding a control group, expressed by:
Multiple Baseline Design Multiple baseline design is represented by:
Trainee Group A T1 T2 T3 × T4 T5 T6 T7 T8 T9 T10
Trainee Group B T1 T2 T3 T4 T5 × T6 T7 T8 T9 T10
Trainee Group C T1 T2 T3 T4 T5 T6 T7 × T8 T9 T10
Trainee Group D T1 T2 T3 T4 T5 T6 T7 T8 T9 × T10
In this design, multiple measures are taken much as in time series, but each group receives the training at a different time. Each untrained group serves as a control for the trained groups. This approach deals with many of the concerns when no control group is used. Here the ability to say that changes measured by the test are a result of the training is strong. If each group improves after training, it is difficult to argue that something else caused the change.
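A short sketch can show how the multiple baseline data might be examined: each group's measures are split at its own training point, and the averages before and after are compared. The training points (after the 3rd, 5th, 7th, and 9th measures) follow the layout above; the function name `gain_after_training` and all scores are hypothetical.

```python
from statistics import mean

def gain_after_training(scores, measures_before_training):
    """Mean of the measures taken after training minus the mean of the
    measures taken before it, for one group's series of test scores."""
    before = scores[:measures_before_training]
    after = scores[measures_before_training:]
    return mean(after) - mean(before)

# Hypothetical series of 10 measures per group; each group is trained
# at a different point, as in the multiple baseline layout.
groups = {
    "A": ([50] * 3 + [70] * 7, 3),
    "B": ([50] * 5 + [70] * 5, 5),
    "C": ([50] * 7 + [70] * 3, 7),
    "D": ([50] * 9 + [70] * 1, 9),
}

gains = {name: gain_after_training(scores, n) for name, (scores, n) in groups.items()}
print(gains)
```

If every group's scores rise only after its own training point, as in these invented series, an outside event is an unlikely explanation, because such an event would have affected all groups at the same time.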
Choosing the Design to Use Determining the true effect of training requires an investigation into the validity of evaluation results. Several methods are available, and the more complex the design, the more valid the results. There are other considerations when you are deciding on an evaluation design. Innovation can provide good substitutes when the best is not possible. Consider the multiple baseline design. It is a powerful design and certainly is a possibility if several employees need to receive the training over time.
However, what if multiple measures are not possible? The following design would address many of the same concerns, and although it is not as elaborate, it certainly deals with many of the concerns regarding outside influences causing the change. If pretest scores are all comparable and posttest scores indicate an improvement, these results are a strong argument for showing that training was responsible.
Trainee Group A T1 × T2
Trainee Group B T1 × T2
Trainee Group C T1 × T2
Trainee Group D T1 × T2
We have already mentioned that most organizations do not evaluate all training at all levels. Furthermore, even when evaluating training, many organizations do not use pretest/posttest or control groups in a manner that would eliminate concerns about the validity of the results.
Dr. Dixon of George Washington University indicated that, of the companies she investigated in her article “New Routes to Evaluation,” only one used designs that would deal with many of the validity issues. Other companies, including IBM and Johnson Controls, follow such procedures only when asked by particular departments or higher-level management, or when they can defray some of the high cost of developing reliable and valid tests by marketing the final product to other organizations. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_64) The demand for certification in some skills (primarily because of ISO and others’ requirements) created a need for these types of tests.
When you are evaluating training, if using control groups or pretesting is not possible, remember that other investigative methods can be used for assessing the likelihood that factors other than training account for any change in KSAs.
What About Small Business?
Single Case Designs We noted in Chapter 9 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i125#ch09) that, for a small business, it is sometimes easier to infer cause and effect between training and outcomes. We also noted, however, that it is also useful at times to consider evaluation to ensure that training is having its effect on employee behavior. But traditional evaluation designs are very difficult to apply to a small business. So, is there an alternative? Consider the single-case design (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i177#glossch09_024) . It is often used to evaluate the training provided to professional counselors. But managers can also use this method when the number of employees is small. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_65)
The single-case design uses data from one individual and makes inferences based on that information. To increase confidence in the results, use the multiple baseline approach. Suppose that two supervisors need to be trained in active listening skills. Because the business is small, both cannot attend training at the same time. Using a predetermined checklist developed for evaluating the training, count the number of active listening phrases that each of them uses while talking to you. Take several measures over three or four weeks, then send one supervisor to training. Continue monitoring the active listening after the person returns. Did the number of active listening phrases increase for the trained supervisor and not the other supervisor? Now give the second supervisor training, and afterward, continue monitoring the conversations. If both employees improved after training, it can be inferred that the training was effective. Although this approach is suggested for the small business, it is also useful in any organization when only a few employees need to be trained.
Appendix 9-2 Utility Analysis In the example in Table 9-7 (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i131#ch09table07) using the cost-saving method of evaluation, training supervisors in grievance handling reduced the total number of grievances by 50 percent and the number going to the third step from 63 to 8. In this example, we calculated only the cost savings related to the change in third-step grievances. Utility analysis, however, permits us to estimate the overall value to the organization of the supervisors’ changes in behavior. In other words, if those trained are better performers, on average, and better performers are worth more in dollar terms, utility analysis allows us to estimate that increased worth. A general approach to utility is as follows: (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_66)
$$\Delta U = (N)(T)(D_T)(SD_Y) - C$$
where
ΔU = dollar value of improved performance
N = number of trainees
T = time the benefits will last
D_T = difference in performance between trained and untrained groups (in standard deviation units)
SD_Y = dollar value of one standard deviation of the untrained group’s performance
C = total cost of training the trained group
Some of the variables in the equation can be measured directly, whereas others must be estimated. For example, N, C, and D_T can be determined objectively. However, determining how long the benefits will last is really an estimate that will be more or less accurate, depending on the estimator’s experience with training and the types of employees involved. Calculating the dollar value of the untrained group’s performance falls somewhere in between. It is relatively easy to determine the compensation costs. However, it is often more difficult to translate their actual performance into dollar amounts. Recall our third-step grievance example. Even though we know what a third-step grievance costs in management labor compensation, we do not know the impact of those third-step grievances on the productivity of the work unit or the quality of the product/service. What to include in determining the dollar value of performance becomes a subjective decision. The final result will be an estimate of the value of the increased performance in dollars. Using the same example, an analysis of the possible utility is presented in Table 9-10.
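The arithmetic of the utility equation can be sketched as a small function. The values for N (30 trainees), T (1 year), and the dollar value of one standard deviation (40 percent of a $35,000 average salary) are the ones given in Table 9-10; the performance difference of 0.5 standard deviations and the training cost of $50,000 are invented purely to make the sketch runnable, so the result is not the figure reported in the table. The function name `utility` is hypothetical.

```python
def utility(n_trainees, years, d_t, sd_y, cost):
    """Dollar value of improved performance:
    N x T x D_T x SD_Y, minus the total cost of training (C)."""
    return n_trainees * years * d_t * sd_y - cost

# SD_Y from the 40 percent rule: 40 percent of the average salary.
sd_y = 0.40 * 35_000

estimate = utility(
    n_trainees=30,   # N, from Table 9-10
    years=1,         # T, from Table 9-10
    d_t=0.5,         # assumed for illustration; not from the source
    sd_y=sd_y,
    cost=50_000,     # assumed for illustration; not from the source
)
print(round(estimate))
```

Because T and SD_Y are estimates, it can be useful to rerun the calculation with a range of values to see how sensitive the result is to each assumption.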
Utility analysis is complex and beyond the scope of this text; what has been presented here is just a taste of that complexity. More complex models account for even more factors that might affect the true financial value of training outcomes. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_67) The purpose here is to demonstrate the difficulties of getting a true picture of the total financial benefits associated with training outcomes. However, these complexities exist for any area of the business when you try to determine the effects of change. By becoming more quantitative in the assessment and description of training outcomes, training managers can put themselves on an equal footing with other managers in the organization.
Although utility analysis has been around for quite some time, it does not seem to have caught on in industry. Lori Fairfield, Editor of Training magazine, notes that in their survey “Industry Reports,” she has yet to see utility analysis as a write-in where the survey asks for any method used for evaluation that was not an option in the survey. Furthermore, Jack Phillips of the ROI Institute indicated that the only time he has found this method to be used in an organization is when a PhD student is using it for a dissertation. Dr. Phillips also noted that when he talks to executives about evaluation and mentions utility analysis as one option, they often suggest that it looks like “funny money.” This latter comment may explain why some research has concluded that using utility analysis to bolster the claim as to the value of a project actually decreased managerial support for the project. (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_69) Until it is clear why this tendency is the case, it might not be wise to use this particular type of analysis to sell a project.
Table 9-10 Calculation of the Utility of the Grievance Training (http://content.thuzelearning.com/books/AUBUS680.16.1/sections/i176#ch09biblio_68)
Formula: $\Delta U = (N)(T)(D_T)(SD_Y) - C$
N = 30
T = 1 year (an overly conservative estimate)
SD = standard deviation of job performance for the untrained supervisors
ryy = reliability of job performance measure
D_T is a measure of the improvement (in standard deviation units) in performance that trained supervisors will exhibit. Although obtaining the data is time-consuming (collecting the performance appraisal data for supervisors, trained and untrained), the calculations can be done easily using a computer.
The equation assumes an average salary of $35,000. The 0.40 comes from the 40 percent rule, which is a calculation based on 40 percent of the average salary of trainees. This rule comes from the Schmidt and Hunter research. This and other methods to calculate SD_Y can be found in Cascio (1991). According to the preceding information, the utility of the training based on this formula is