Accident Investigation
NOT MEASUREMENT SENSITIVE
DOE‐HDBK‐1208‐2012 July 2012
DOE HANDBOOK
Accident and Operational Safety Analysis
Volume I: Accident Analysis Techniques
U.S. Department of Energy Washington, D.C. 20585
INTRODUCTION - HANDBOOK APPLICATION AND SCOPE
Accident Investigations (AI) and Operational Safety Reviews (OSR) are valuable for evaluating technical issues, safety management systems, human performance, and environmental conditions in order to prevent accidents through a process of continuous organizational learning. This Handbook brings together the strengths of the experience gained in conducting Department of Energy (DOE) accident investigations over many years. That experience encourages the analysis of lower-level events and near misses, and the Handbook adds insights from High Reliability Organizations (HRO), learning organizations, and Human Performance Improvement (HPI).
The recommended techniques apply equally well to DOE Federal-led accident investigations conducted under DOE Order (O) 225.1B, Accident Investigations, dated March 4, 2011; to contractor-led accident investigations or those conducted under DOE O 231.1A, Chg. 1, Environment, Safety and Health Reporting, dated June 3, 2004; and to Operational Safety Reviews conducted as an element of a “Contractor Assurance Program.” However, application of the techniques described in this Handbook is not mandatory, except as provided in, or referenced from, DOE O 225.1B for Federally-led investigations.
For contractor-led accident investigations or OSRs, application of the techniques described is entirely non-mandatory and is at the discretion of contractor line managers. Only a select few accidents, events, or management concerns may require the contractor’s line management to apply the level and depth of analysis described in this Handbook.
This Handbook is organized to follow the logical sequence in which the DOE “core analytical techniques” are applied when conducting a DOE Federal- or contractor-led Accident Investigation or an OSR in order to prevent accidents. The analysis techniques presented in this Handbook have been developed from, and informed by, academic research and validated through industry application and practice.
The techniques are intended for performance improvement and learning, and thus are applicable to both AI and OSR. This Handbook serves two primary purposes: 1) as the training manual for the DOE Accident Investigation course and the Operational Safety and Accident Analysis course, taught through the National Training Center (NTC); and 2) as the technical basis and guide for persons conducting accident investigations or operational safety analysis [i] in the field.
[i] The term operational safety analysis, for the purposes of this Handbook, should not be confused with the application of other DOE techniques contained within nuclear safety analysis directives or standards such as 10 CFR 830 Subpart B or DOE-STD-3009.
Volume I, Chapter 1 provides the functional technical basis and understanding of accident prevention and investigation principles and practice.
Volume I, Chapter 2 provides the practical application of accident investigation techniques as applicable to a DOE Federally-led Accident Investigation under DOE O 225.1B. This includes the process for organizing an accident investigation; selecting the team; assigning roles; collecting and recording information and evidence; organizing and analyzing the information; forming Conclusions (CON) and Judgments of Need (JON); and writing the final report. This chapter serves as a ready, easily available reference for Board Chairpersons and members during an investigation.
Volume II provides the adaptation of the above concepts and processes to an OSR, as an approach to go deeper within the contractor’s organization and prevent accidents by revealing organizational weaknesses before they result in an accident.
Simply defined, the process in this Handbook includes:
Determining What Happened;
Determining Why It Happened; and
Developing Conclusions and Judgments of Need to Prevent Recurrence.
To accomplish this, we use:
Event and Causal Factor Charting and Analysis.
And we apply the core analytical techniques of:
Barrier analysis,
Change analysis,
Root cause analysis, and
Verification analysis.
Each of these analyses includes the integration of tools to analyze DOE and contractor management systems, organizational weaknesses, and human performance. Other specific analyses beyond these core analytical techniques may be applied if needed and are also discussed in this Handbook.
ACKNOWLEDGEMENTS
This DOE Accident and Operational Safety Analysis Handbook was prepared under the sponsorship of the DOE Office of Health, Safety and Security (HSS), Office of Corporate Safety Programs, and the Energy Facility Contractors Operating Group (EFCOG), Industrial Hygiene and Safety Sub-group of the Environment, Safety and Health (ES&H) Working Group.
The preparers would like to gratefully acknowledge the authors whose works are referenced in this document, and the individuals who provided valuable technical insights and/or specific reviews of this document in its various stages of development:
Writing Team Co-Chairs:
David Pegram, DOE Office of Health, Safety and Security (HSS)
Richard DeBusk, Lawrence Berkeley National Laboratory (LBNL)
Writing Team Members:
Marcus Hayes, National Nuclear Security Administration (NNSA)
Jenny Mullins, DOE Oak Ridge Site Office (ORO)
Bill Wells, Lawrence Berkeley National Laboratory (LBNL)
Roger Kruse, Los Alamos National Laboratory (LANL)
Rick Hartley, Babcock and Wilcox Technical Services Pantex (BW-PTX)
Jeff Aggas, Savannah River Site (SRS)
Gary Hagan, Oak Ridge Y-12 National Security Complex (Y12)
Advisor:
Earl Carnes, DOE Office of Health, Safety and Security (HSS)
Technical Editors:
Susan Keffer, Project Enhancement Corporation
Erick Reynolds, Project Enhancement Corporation
Table of Contents
INTRODUCTION - HANDBOOK APPLICATION AND SCOPE
ACKNOWLEDGEMENTS
ACRONYMS
FOREWORD
CHAPTER 1. DOE’S ACCIDENT PREVENTION AND INVESTIGATION PROGRAM
1. Fundamentals
1.1 Definition of an Accident
1.2 The Contemporary Understanding of Accident Causation
1.3 Accident Models – A Basic Understanding
1.3.1 Sequence of Events Model
1.3.2 Epidemiological or Latent Failure Model
1.3.3 Systemic Model
1.4 Cause and Effect Relationships
1.4.1 Investigations Look Backwards
1.4.2 Cause and Effect are Inferred
1.4.3 Establishing a Cause and Effect Relationship
1.4.4 The Circular Argument for Cause
1.4.5 Counterfactuals
1.5 Human Performance Considerations
1.5.1 Bad Apples
1.5.2 Human Performance Modes – Cognitive Demands
1.5.3 Error Precursors
1.5.4 Optimization
1.5.5 Work Context
1.5.6 Accountability, Culpability and Just Culture
1.6 From Latent Conditions to Active Failures
1.7 Doing Work Safely - Safety Management Systems
1.7.1 The Function of Safety Barriers
1.7.2 Categorization of Barriers
1.8 Accident Types/ Individual and Systems
1.8.1 Individual Accidents
1.8.2 Preventing Individual Accidents
1.8.3 System Accident
1.8.4 How System Accidents Occur
1.8.5 Preventing System Accidents
1.9 Diagnosing and Preventing Organizational Drift
1.9.1 Level I: Employee Level Model for Examining Organizational Drift – Monitoring the Gap – “Work‐as‐Planned” vs. “Work‐as‐Done”
1.9.2 Level II: Mid‐Level Model for Examining Organizational Drift – Break‐the‐Chain
1.9.3 Level III: High Level Model for Examining Organizational Drift
1.10 Design of Accident Investigations
1.10.1 Primary Focus – Determine “What” Happened and “Why” It Happened
1.10.2 Determine Deeper Organizational Factors
1.10.3 Extent of Conditions and Cause
1.10.4 Latent Organizational Weaknesses
1.10.5 Organizational Culture
1.11 Experiential Lessons for Successful Event Analysis
CHAPTER 2. THE ACCIDENT INVESTIGATION PROCESS
2. THE ACCIDENT INVESTIGATION PROCESS
2.1 Establishing the Federally Led Accident Investigation Board and Its Authority
2.1.1 Accident Investigations’ Appointing Official
2.1.2 Appointing the Accident Investigation Board
2.1.3 Briefing the Board
2.2 Organizing the Accident Investigation
2.2.1 Planning
2.2.2 Collecting Initial Site Information
2.2.3 Determining Task Assignments
2.2.4 Preparing a Schedule
2.2.5 Acquiring Resources
2.2.6 Addressing Potential Conflicts of Interest
2.2.7 Establishing Information Access and Release Protocols
2.2.8 Controlling the Release of Information to the Public
2.3 Managing the Investigation Process
2.3.1 Taking Control of the Accident Scene
2.3.2 Initial Meeting of the Accident Investigation Board
2.3.3 Promoting Teamwork
2.3.4 Managing Evidence, Information Collection
2.3.5 Coordinating Internal and External Communication
2.3.6 Managing the Analysis
2.3.7 Managing Report Writing
2.3.8 Managing Onsite Closeout Activities
2.3.8.1 Preparing Closeout Briefings
2.3.8.2 Preparing Investigation Records for Permanent Retention
2.3.9 Managing Post‐Investigation Activities
2.3.9.1 Corrective Action Plans
2.3.9.2 Tracking and Verifying Corrective Actions
2.3.9.3 Establishing Lessons Learned
2.4 Controlling the Investigation
2.4.1 Monitoring Performance and Providing Feedback
2.4.2 Controlling Cost and Schedule
2.4.3 Assuring Quality
2.5 Investigate the Accident to Determine “What” Happened
2.5.1 Determining Facts
2.5.2 Collect and Catalog Physical Evidence
2.5.2.1 Document Physical Evidence
2.5.2.2 Sketch and Map Physical Evidence
2.5.2.3 Photograph and Video Physical Evidence
2.5.2.4 Inspect Physical Evidence
2.5.2.5 Remove Physical Evidence
2.5.3 Collect and Catalog Documentary Evidence
2.5.4 Electronic Files to Organize Evidence and Facilitate the Investigation
2.5.5 Collecting Human Evidence
2.5.6 Locating Witnesses
2.5.7 Conducting Interviews
2.5.7.1 Preparing for Interviews
2.5.7.2 Advantages and Disadvantages of Individual vs. Group Interviews
2.5.7.3 Interviewing Skills
2.5.7.4 Evaluating the Witness’s State of Mind
2.6 Analyze Accident to Determine “Why” It Happened
2.6.1 Fundamentals of Analysis
2.6.2 Core Analytical Tools ‐ Determining Cause of the Accident or Event
2.6.3 The Backbone of the Investigation – Events and Causal Factors Charting
2.6.3.1 ECF Charting Symbols
2.6.3.2 Events and Causal Factors Charting Process Steps
2.6.3.3 Events and Causal Factors Chart Example
2.6.4 Barrier Analysis
2.6.4.1 Analyzing Barriers
2.6.4.2 Examining Organizational Concerns, Management Systems, and Line Management Oversight
2.6.5 Human Performance, Safety Management Systems and Culture Analysis
2.6.6 Change Analysis
2.6.7 The Importance of Causal Factors
2.6.8 Causal Factors
2.6.8.1 Direct Cause
2.6.9 Contributing Causes
2.6.10 Root Causes
2.6.10.1 Root Cause Analysis
2.6.11 Compliance/Noncompliance
2.6.12 Automated Techniques
2.7 Developing Conclusions and Judgments of Need to “Prevent” Accidents in the Future
2.7.1 Conclusions
2.7.2 Judgments of Need
2.7.3 Minority Opinions
2.8 Reporting the Results
2.8.1 Writing the Report
2.8.2 Report Format and Content
2.8.3 Disclaimer
2.8.4 Appointing Official’s Statement of Report Acceptance
2.8.5 Acronyms and Initialisms
2.8.6 Prologue ‐ Interpretation of Significance
2.8.7 Executive Summary
2.8.8 Introduction
2.8.9 Facts and Analysis
2.8.10 Conclusions and Judgments of Need
2.8.11 Minority Report
2.8.12 Board Signatures
2.8.13 Board Members, Advisors, Consultants, and Staff
2.8.14 Appendices
2.9 Performing Verification Analysis, Quality Review and Validation of Conclusions
2.9.1 Structure and Format
2.9.2 Technical and Policy Issues
2.9.3 Verification Analysis
2.9.4 Classification and Privacy Review
2.9.5 Factual Accuracy Review
2.9.6 Review by the Chief Health, Safety and Security Officer
2.9.7 Document the Reviews in the Records
2.10 Submitting the Report
Appendix A. Glossary
Appendix B. References
Appendix C. Specific Administrative Needs
Appendix D. Forms
Attachment 1. ISM Crosswalk and Safety Culture Lines of Inquiry
Attachment 2. Bibliography
Table of Tables
Table 1‐1: Common Organizational Weaknesses
Table 2‐1: DOE Federal Officials and Board Member Responsibilities
Table 2‐2: DOE Federal Board Members Must Meet These Criteria
Table 2‐3: These Activities should be Included in an Accident Investigation Schedule
Table 2‐4: The Chairperson Establishes Protocols for Controlling Information
Table 2‐5: The Chairperson Should Use These Guidelines in Managing Information Collection Activities
Table 2‐6: Use Precautions when Handling Potential Blood Borne Pathogens
Table 2‐7: These Sources are Useful for Locating Witnesses
Table 2‐8: Group and Individual Interviews have Different Advantages
Table 2‐9: Guidelines for Conducting Witness Interviews
Table 2‐10: Benefits of Events and Causal Factors Charting
Table 2‐11: Common Human Error Precursor Matrix
Table 2‐12: Sample Barrier Analysis Worksheet
Table 2‐13: Typical Questions for Addressing the Seven Guiding Principles of Integrated Safety Management
Table 2‐14: Sample Change Analysis Worksheet
Table 2‐15: Case Study: Change Analysis Summary
Table 2‐16: Case Study Introduction
Table 2‐17: Compliance/Noncompliance Root Cause Model Categories
Table 2‐18: These Guidelines are Useful for Writing Judgments of Need
Table 2‐19: Case Study: Judgments of Need
Table 2‐20: Useful Strategies for Drafting the Investigation Report
Table 2‐21: The Accident Investigation Report Should Include these Items
Table 2‐22: Facts Differ from Analysis
Table of Figures
Figure 1‐1: IAEA‐TECDOC‐1329 – Safety Culture in Nuclear Installations
Figure 1‐2: Performance Modes
Figure 1‐3: Error Precursors
Figure 1‐4: Organizational Causes of Accidents
Figure 1‐5: Five Core Functions of DOE’s Integrated Safety Management System
Figure 1‐6: Barriers and Accident Dynamics – Simplistic Design
Figure 1‐7: Individual Accident
Figure 1‐8: System Accident
Figure 1‐9: How System Accidents Happen
Figure 1‐10: Prevent a System Accident
Figure 1‐11: Level I ‐ “Work‐as‐Done” Varies from “Work‐as‐Planned” at Employee Level
Figure 1‐12: Level II ‐ Physics‐Based Break‐the‐Chain Framework
Figure 1‐13: Level III ‐ High‐Level Model for Examining Organizational Drift
Figure 1‐14: Factors Contributing to Organizational Drift
Figure 1‐15: Assessing Organizational Culture
Figure 2‐1: Typical Schedule of Accident Investigation
Figure 2‐2: Example of Electronic File Records To Keep for the Investigation
Figure 2‐3: Analysis Process Overview
Figure 2‐4: Simplified Events and Causal Factors Chart for the July 1998 Idaho Fatality CO2 Release at the Test Reactor Area
Figure 2‐5: Sequence of Events and Actions Flowchart
Figure 2‐6: Decisions before Actions Flowchart
Figure 2‐7: Conditions and Context of Human Performance and Safety Management Systems Flowchart
Figure 2‐8: Context of Decisions Flowchart
Figure 2‐9: Racked Out Air Breaker
Figure 2‐10: Excerpt from the Accident ECF Chart
Figure 2‐11: Summary Results from a Barrier Analysis Reveal the Types of Barriers Involved
Figure 2‐12: The Change Analysis Process
Figure 2‐13: Determining Causal Factors
Figure 2‐14: Roll Up Conditions to Determine Causal Factors
Figure 2‐15: Grouping Root Causes on the Events and Causal Factors Chart
Figure 2‐16: Facts, Analyses, and Causal Factors are needed to Support Judgments of Need
ACRONYMS
AEC Atomic Energy Commission
AI Accident Investigation
AIB Accident Investigation Board
BAM Barrier Analysis Matrix
BTC Break-the-Chain
CAM Culture Attribute Matrix
CFA Causal Factors Analysis
CFR Code of Federal Regulations
CON Conclusions
CTL Comparative Timeline
DOE Department of Energy
DOE G DOE Guide
DOE M DOE Manual
DOE O DOE Order
DOE P DOE Policy
E1 Electrician 1
E2 Electrician 2
ECAQ Extraneous Conditions Adverse to Quality
ECFA Expanded Causal Factors Analysis
ECF Events and Causal Factors
EFCOG Energy Facility Contractors Operating Group
ERDA Energy Research and Development Administration
ES&H Environment, Safety and Health
FOIA Freedom of Information Act
FOM Field Office Manager
HPI Human Performance Improvement
HRO High Reliability Organization
HSS Office of Health, Safety and Security
IAEA International Atomic Energy Agency
INPO Institute of Nuclear Power Operations
ISM Integrated Safety Management
ISMS Integrated Safety Management System
IWD Integrated Work Document
LOW Latent Organizational Weakness Table
LOTO Lockout/Tagout
JON Judgment of Need
MOM Missed Opportunity Matrix
MORT Management Oversight and Risk Tree Analysis
NNSA National Nuclear Security Administration
NTC National Training Center
OPI Office of Primary Interest
ORPS Occurrence Reporting and Processing System
OSHA Occupational Safety and Health Administration
OSR Operational Safety Review
PM Preventive Maintenance
PPE Personal Protective Equipment
SSDC System Safety Development Center
TWIN Task, Work Environment, Individual Capabilities, Human Nature (TWIN) Analysis Matrix (Human Performance Error Precursors)
WAD “Work-as-Done”
WAP “Work-as-Planned”
FOREWORD
“The … (DOE) has exemplary programs for the control of accidents and fires, signified by numerous awards. Its work in such areas as reactors, radiation, weapons, and research has developed new methods of controlling unusual and exotic problems, including safe methods of utilizing new materials, energy sources, and processes.
Despite past accomplishments, human values and other values stimulate a continual desire to improve safety performance. Emerging concepts of systems analysis, accident causation, human factors, error reduction, and measurement of safety performance strongly suggest the practicality of developing a higher order of control over hazards.
Our concern for improved preventive methods, nevertheless, does not stem from any specific, describable failure of old methods as from a desire for greater success. Many employers attain a high degree of safety, but they seek further improvement. It is increasingly less plausible that the leading employers can make further progress by simply doing more, or better, in present program. Indeed, it seems unlikely that budget stringencies would permit simple program strengthening. And some scaling down in safety expenditures (in keeping with other budgets) may be necessary.
Consequently, the development of new and better approaches seems the only course likely to produce more safety for the same or less money. Further, a properly executed safety system approach should make a major contribution to the organization's attainment of broader performance goals.”
These words were written by W. G. Johnson in 1973, in The Management Oversight and Risk Tree – MORT, a report prepared for the U.S. Atomic Energy Commission (AEC). Although written almost 40 years ago, Johnson’s words, and the context in which the MORT innovation in accident prevention and investigation was developed, remain as vital today as then. [Johnson, 1973]1
The MORT approach described in the report was converted into the first accident investigation manual for the Energy Research and Development Administration (ERDA), the successor to AEC, in 1974. In 1985 the manual was revised. The introduction to that revision explained that:
“In the intervening years since that initial publication, methods and techniques that were new at that time have been further developed and proven, and Johnson's basic concepts and principles have been further defined and expanded. Experience in using the manual in conducting high quality, systematic investigations has identified areas for additional development and has generated need for yet higher levels of investigative excellence to meet today's safety and loss control needs.
This revision is intended to meet those needs through incorporating developments and advances in accident investigation technology that have taken place since Johnson’s first accident investigation manual was written.”
This new DOE Operational Safety and Accident Analysis Techniques Handbook was prepared in the tradition of Johnson’s original report and its subsequent revisions. It incorporates “developments and advances in accident investigation technology that have taken place since…” issuance of the DOE Accident Investigation Workbook, Rev. 2, 1999.
What are those developments that prompted issuance of a new Handbook? One researcher expresses the current situation thus:
“Accident models provide a conceptualisation of the characteristics of the accident, which typically show the relation between causes and effects. They explain why accidents occur, and are used as techniques for: risk assessment during system development, and post hoc accident analysis to study the causes of the occurrence of an accident.
The increasing complexity in highly technological systems such as aviation, maritime, air traffic control, telecommunications, nuclear power plants, space missions, chemical and petroleum industry, and healthcare and patient safety is leading to potentially disastrous failure modes and new kinds of safety issues. Traditional accident modelling approaches are not adequate to analyse accidents that occur in modern sociotechnical systems, where accident causation is not the result of an individual component failure or human error.” [Qureshi, 2007]2
In 1978, sociologist Barry Turner wrote a book called Man-Made Disasters, in which he examined 85 different accidents and found that they had in common a long incubation period with warning signs that were not taken seriously. Safety science today views serious accidents not as the result of individual acts of carelessness or mistakes; rather, they result from a confluence of influences that emerge over time and combine in unexpected ways, enabling dangerous alignments that are sometimes catastrophic. [Turner and Pidgeon, 1978]3
The accidents that stimulated the new safety science are now indelibly etched in the history of safety: Challenger and Columbia, Three Mile Island, Chernobyl, Bhopal, Davis-Besse, Piper Alpha, Texas City, and Deepwater Horizon. The list is long. These accidents have introduced new concepts and new vocabulary: normal accidents, systems accidents, practical drift, normal deviance, latent pathogens, the gambler’s dilemma, organizational factors, and safety culture. As explained by Roger Boisjoly in an article after the 1986 Challenger accident: “It is no longer the individual that is the locus of power and responsibility, but public and private institutions. Thus, it would seem, it is no longer the character and virtues of individuals that determine the standards of moral conduct, it is the policies and structures of the institutional settings within which they live and work.” [Ermann and Lundman, 1986]4
The work of Johnson and the System Safety Development Center at the Idaho National Engineering Laboratory was among the early contributions to a systems view. The accident at the Three Mile Island nuclear plant in 1979 prompted new directions in safety and organizational performance research, going beyond human actions and equipment as initiating events to examine the influence of organizational systems. Charles Perrow’s 1984 book, Normal Accidents, challenged long-held beliefs about safety and accident causation. Publication of his book was followed by the Bhopal chemical leak (1984), the Chernobyl disaster (1986), and the Challenger explosion (1986), which contributed additional urgency for rethinking conventional wisdom about safety and performance in complex systems. [Perrow, 1984]5
In 1987, the first research paper on what have come to be known as High Reliability Organizations (HRO), The Self-Designing High-Reliability Organization: Aircraft Carrier Flight Operations at Sea by Gene I. Rochlin, Todd R. La Porte, and Karlene H. Roberts, was published in the Autumn 1987 issue of Naval War College Review. HRO concepts were formally introduced to DOE through the Defense Nuclear Facilities Safety Board Tech 35, Safety Management of Complex, High-Hazard Organizations, December 2004, and subsequently adopted as design principles in the Department’s “Action Plan - Lessons Learned from the Columbia Space Shuttle Accident and Davis-Besse Reactor Pressure-Vessel Head Corrosion Event.” DOE’s adoption of Human Performance Improvement (derived from successful approaches in commercial nuclear power and aviation and from socio-technical system research) reinforced the findings of high reliability research with specific practices and techniques. [Rochlin, La Porte, Roberts, 1987]6
Early HRO studies were expanded to other hazardous domains over a period of some 20 years. The broad body of research revealed common characteristics among diverse-mission, high-hazard organizations that are able to accomplish their missions safely over long time periods with few adverse events. HRO research has been further expanded through the perspective of Resilience Engineering. This perspective counters the historical deterministic view that safety is an inherent property of well-designed technology and reveals how technology is nested in complex interrelationships of social, organizational, and human factors. Viewing safety through the lens of complexity theory illuminates an understanding that it is the ability of people in organizations to adapt to the unexpected that produces resilient systems, systems in which safety is continually created by human expertise and innovation under circumstances not foreseen or foreseeable by technology designers.
Erik Hollnagel, a pioneer of the Resilience Engineering perspective, has explained that accident investigation and risk assessment models focus on what goes wrong and the elimination of “error.” While this principle may work with machines, it does not work with humans. Variability in human performance is inevitable, even in the same tasks we repeat every day. According to Hollnagel, our need to identify a cause for any accident has colored all risk assessment thinking. Only simple technology and simple accidents may be said to be “caused.” For complex systems and complex accidents we don’t “find” causes; we “create” them. This is a social process which changes over time just as thinking and society change. After the Second World War and until the late 1970s, most accidents were seen as a result of technical failure. The Three Mile Island accident saw cause shift from technical to human failure. Finally, in the 1980s, with the Challenger disaster, cause was not solely technical or human but organizational. Hollnagel and other proponents of resilience thinking see the challenge not as finding cause. The challenge is to explain why most of the time we do things right, and to use this knowledge to shift accident investigation and prevention thinking away from cause identification and toward understanding and supporting human creativity, learning, and performance variability. In other words, understanding how we succeed gains us more than striving to recreate an unknowable history and prescribing fixes to only partially understood failures. [Hollnagel, 2006]7
It has been suggested that we are living in the fifth age of safety. The first was a technical age, the second a systems age, and the third a culture age. Metaphorically, the first may be characterized by engineering, the second by cybernetics and systems thinking, and the third by psychology and sociology. The fourth age, the “integration age,” builds on the first three ages, not abandoning them but blending them into a trans-disciplinary socio-technical paradigm, thus prompting more complex perspectives to develop and evolve. The fifth age is an “adaptive age.” It does not displace the former ages but rather transcends them by introducing the notion of complex adaptive systems in which the roles of expertise, professional practice, and naturalistic observation attain primacy in resolving the duality of “work-as-imagined” versus “work-as-done.” [Borys, Else, Leggett, October 2009]8
At present, we see mere glimpses of the implications of the adaptive age for how we think about “accident investigation.” How we may view accidents through a fourth age lens is somewhat clearer. Though still myopic, we do have examples of fourth age investigation reports, beginning with the Challenger accident. Diane Vaughan wrote, “The Challenger disaster was an accident, the result of a mistake. What is important to remember from this case is not that individuals in organizations make mistakes, but that mistakes themselves are socially organized and systematically produced. Contradicting the rational choice theory behind the hypothesis of managers as amoral calculators, the tragedy had systemic origins that transcended individuals, organization, time and geography. Its sources were neither extraordinary nor necessarily peculiar to NASA, as the amoral calculator hypothesis would lead us to believe. Instead, its origins were in routine and taken for granted aspects of organizational life that created a way of seeing that was simultaneously a way of not seeing.” [Vaughan, 1996]9
The U.S. Chemical Safety Board enhanced our fourth age vision by several diopters in its report on the British Petroleum Texas City Refinery accident. Organizational factors, human factors and safety culture were integrated to suggest new relationships that contributed to the nation’s most serious refinery accident. Investigations of the Royal Air Force Nimrod and the Buncefield accidents followed suit. More recent investigations of the 2009 Washington Metro crash and the Deepwater Horizon catastrophe were similarly inspired by the BP Texas City investigation and the related HRO framework.
This revision of DOE’s approach to accident investigation and organizational learning is by no means presented as an exemplar of fifth or even fourth age safety theory. But it was developed with awareness of the lessons of recent major accident investigations and what has been learned in safety science since the early 1990s. Still grounded in the fundamentals of sound engineering and technical knowledge, this version follows the fundamental recognition by Bill Johnson that technical factors alone explain little about accidents. While full understanding of the technology as designed is necessary, understanding the deterministic behavior of technology failure offers little to no understanding of the probabilistic, even chaotic, interrelationships of people, organization, and social environmental factors.
The Handbook describes the high level process that DOE and DOE contractor organizations should use to review accidents. The purpose of accident investigation is to learn from experience in order to better assure future success. As Johnson phrased it: “Reduction of the causes of failures at any level in the system is not only a contribution to safety, but also a moral obligation to serve associates with the information and methods needed for success.” We seek to develop an understanding of how the event unfolded and the factors that influenced the event. Classic investigation tools and enhanced versions of tools are presented that may be of use to investigators in making sense of the events and factors. Furthermore, an example is provided of how such tools may be used within an HRO framework to explore unexpected occurrences, so-called “information rich, low consequence, no consequence events,” to perform organizational diagnostics to better understand the “work-as-imagined” versus “work-as-performed” dichotomy and thus maintain reliable and resilient operations. [Johnson, 1973]1
This 2012 version of the Handbook retains much of the content from earlier versions. The most important contribution of this new version is the reminder that tools are only mechanisms for collecting and organizing data. More important is the framework, the theory derived from research and practice, that is used for interpreting the data.
Johnson’s 1973 report contained a scholarly treatment of the science and practice that underlay the techniques and recommendations presented. The material presented in this 2012 version rests similarly on extensive science and practice, and the reader is challenged to develop a sufficient knowledge of both as a precondition to applying the processes and techniques discussed. Johnson and his colleagues based the safety and accident prevention methodologies squarely on the understanding of psychology, human factors, sociology, and organizational theory. Citing from the original AEC report: “To say that an operator was inattentive, careless or impulsive is merely to say he is human” (quoting from Chapanis). “…each error at an operational level must be viewed as stemming from one or more planning or design errors at higher levels.”
This new document seeks to go a step beyond earlier versions in DOE’s pursuit of better ways to understand accidents and to promote the continuous creation of safety in our normal daily work. Fully grounded in the lessons and good practices of those who preceded us, the contributors to this document seek as did our predecessors to look toward the future. This Accident and Operational Safety Analysis Techniques Handbook challenges future investigators to apply analytical tools and sound technical judgment within a framework of contemporary safety science and organizational theory.
CHAPTER 1. DOE’S ACCIDENT PREVENTION AND INVESTIGATION PROGRAM
1. Fundamentals
This chapter discusses fundamental concepts of accident dynamics, accident prevention, and accident analysis. The purpose of this chapter is to emphasize that DOE accident investigators and improvement analysts need to understand the theoretical bases of safety management and accident analysis, and the practical application of the DOE Integrated Safety Management (ISM) framework. This understanding gives investigators the framework to get at the relevant facts, surmise the appropriate causal factors, and understand those organizational factors that leave the organization vulnerable to future events with potentially worse consequences.
1.1 Definition of an Accident
Accidents are unexpected events or occurrences that result in unwanted or undesirable outcomes. The unwanted outcomes can include harm or loss to personnel, property, production, or nearly anything that has some inherent value. These losses increase an organization’s operating cost through higher production costs, decreased efficiency, and the long-term effects of decreased employee morale and unfavorable public opinion.
How then may safety be defined? Dr. Karl Weick has noted that safety is a “dynamic non-event.” Dr. James Reason offers that “safety is noted more in its absence than its presence.” Scholars of safety science and organizational behavior argue, often to the chagrin of designers, that safety is not an inherent property of well-designed systems. To the contrary, Prof. Jens Rasmussen maintains that “the operator’s role is to make up for holes in designers’ work.” If the measurement of safety is that nothing happens, how does the analyst then understand how systems operate effectively to produce nothing? In other words, since accidents are probabilistic outcomes, the challenge is to determine from evidence whether the absence of accidents is due to good design or to lucky chance. Yet this is the job of accident investigators, safety scientists, and analysts.
1.2 The Contemporary Understanding of Accident Causation
The basis for conducting any occurrence investigation is to understand the organizational, cultural, or technical factors that, left unattended, could result in future accidents, unacceptable mission interruption, or quality concerns. Guiding concepts may be summarized as follows:
Within complex systems, human error does not emanate from the individual but is a byproduct or symptom of the ever-present latent conditions built into the complexity of organizational culture and strategic decision-making processes.
The triggering or initiating error that releases the hazard is only the last in a network of errors that often are only remotely related to the accident. Accident occurrences emerge from the organization’s complexity, taking many factors to overcome systems’ network of barriers and allowing a threat to initiate the hazard release.
Investigations require delving into the basic organizational processes (designing, constructing, operating, maintaining, communicating, selecting and training, supervising, and managing) that contain the kinds of latent conditions most likely to constitute a threat to the safety of the system.
The inherent nature of organizational culture and strategic decision-making means latent conditions are inevitable. Systems and organizational complexity means not all problems can be solved in one pass. Resources are always limited and safety is only one of many competing priorities. Therefore, event investigators should target the latent conditions most in need of urgent attention and make them visible to those who manage the organization so they can be corrected. [Hollnagel, 2004]10 [Dekker, 2011]11 [Reiman and Oedewald, 2009]12
1.3 Accident Models – A Basic Understanding
An accident model is the frame of reference, or stereotypical way of thinking about an accident, that is used in trying to understand how an accident happened. The frame of reference is often an unspoken but commonly held understanding of how accidents occur. The advantage is that communication and understanding become more efficient because some things (e.g., common terminology, common experiences, common points-of-reference, or typical sequences) can be taken for granted. The disadvantage is that it favors a single point of view and does not consider alternate explanations (i.e., the hypothesis model creates a recognized solution, causing the user to discard or ignore information inconsistent with the model). This is particularly important when addressing the human component, because preconceived ideas of how the accident occurred can influence the investigators’ assumptions about people’s roles and affect the line of questioning. [Hollnagel, 2004]10
What investigators look for when trying to understand and analyze an accident depends on how it is believed an accident happens. A model, whether formal or simply what you believe, is extremely helpful because it brings order to a confusing situation and suggests ways you can explain relationships. But the model is also constraining because it views the accident in a particular way, to the exclusion of other viewpoints. Accident models have evolved over time and can be characterized by the three models below. [Hollnagel, 2004]10
1.3.1 Sequence of Events Model
This is a simple, linear cause and effect model where accidents are seen as the natural culmination of a series of events or circumstances, which occur in a specific and recognizable order. The model is often represented by a chain with a weak link or a series of falling dominos. In this model, accidents are prevented by fixing or eliminating the weak link, by removing a domino, or by placing a barrier between two dominos to interrupt the series of events. The Domino Theory of Accident Causation developed by H.W. Heinrich in 1931 is an example of a sequence of events model. [Heinrich, 1931]13
The sequential model is not limited to a simple series and may utilize multiple sequences or hierarchies such as event trees, fault trees, or critical path models. Sequential models are attractive because they encourage thinking in causal series, which is easier to represent graphically and easier to understand. In this model, an unexpected event initiates a sequence of consequences culminating in the unwanted outcome. The unexpected event is typically taken to be an unsafe act, with human error as the predominant cause.
The sequential model is also limited because it requires strong cause and effect relationships that typically do not exist outside the technical or mechanistic aspect of the accident. In other words, true cause and effect relationships can be found when analyzing the equipment failures, but causal relationships are extremely weak when addressing the human or organizational aspect of the accident. For example: While it is easy to assert that “time pressure caused workers to take shortcuts,” it is also apparent that workers do not always take shortcuts when under time pressure. See Section 1.4, Cause and Effect Relationships.
In response to large-scale industrial accidents in the 1970s and 1980s, epidemiological models were developed that viewed an accident as the outcome of a combination of factors, some active and some latent, that existed together at the time of the accident. [Hollnagel, 2004]10
1.3.2 Epidemiological or Latent Failure Model
This is a complex, linear cause and effect model where accidents are seen as the result of a combination of active failures (unsafe acts) and latent conditions (unsafe conditions). These are often referred to as epidemiological models, using a medical metaphor that likens the latent conditions to pathogens in the human body that lie dormant until triggered by the unsafe act. In this model, accidents are prevented by strengthening barriers and defenses. The “Swiss Cheese” model developed by James Reason is an example of the epidemiological model. [Reason, 1997]14
This model views the accident to be the result of long standing deficiencies that are triggered by the active failures. The focus is on the organizational contributions to the failure and views the human error as an effect, instead of a cause.
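The barrier logic of this model can be illustrated with a minimal sketch. The `Barrier` class, the layer names, and the functions below are hypothetical and introduced only for illustration; they do not come from the Handbook or from Reason's work. The sketch assumes a deliberately simplified rule: the hazard reaches its target only when every defensive layer is defeated at the same time, whether by a latent condition or by an active failure.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Barrier:
    """One defensive layer (hypothetical structure, for illustration only)."""
    name: str
    latent_condition: bool = False  # dormant weakness present before the event
    active_failure: bool = False    # unsafe act that defeats the layer at the time

    def is_defeated(self) -> bool:
        return self.latent_condition or self.active_failure


def accident_occurs(barriers) -> bool:
    """The hazard reaches the target only if every layer is defeated at once."""
    return all(b.is_defeated() for b in barriers)


def strengthen(barriers, name):
    """Return new layers with the named layer's latent condition corrected."""
    return [replace(b, latent_condition=False) if b.name == name else b
            for b in barriers]


layers = [
    Barrier("work planning", latent_condition=True),
    Barrier("supervision", latent_condition=True),
    Barrier("lockout/tagout", active_failure=True),
]
print(accident_occurs(layers))                             # True
print(accident_occurs(strengthen(layers, "supervision")))  # False
```

In this toy example the active failure at the last layer is present in both runs; correcting a single latent condition upstream is enough to stop the trajectory, which is why the model directs attention to organizational contributions rather than to the triggering act alone.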
The epidemiological models differ from the sequential models on four main points:
Performance Deviation – The concept of unsafe acts shifted from being synonymous with human error to the notion of deviation from the expected performance.
Conditions – The model also considers the contributing factors that could lead to the performance deviation, which directs analysis upstream from the worker and process deviations.
Barriers – The consideration of barriers or defenses at all stages of the accident development.
Latent Conditions – The introduction of latent or dormant conditions that are present within the system well before there is any recognizable accident sequence.
The epidemiological model allows the investigator to think in terms other than causal series, offers the possibility of seeing some complex interaction, and focuses attention on the organizational issues. The model is still sequential, however, with a clear trajectory through the ordered defenses. Because it is linear, it tends to oversimplify the complex interactions between the multitude of active failures and latent conditions.
The limitation of epidemiological models is that they rely on “failures” up and down the organizational hierarchy, but do nothing to explain why these conditions or decisions were seen as normal or rational before the accident. The recently developed systemic models start to understand accidents as unexpected combinations of normal variability. [Hollnagel, 2004]10 [Dekker, 2006]15
1.3.3 Systemic Model
This is a complex, non-linear model where both accidents and successes are seen to emerge from unexpected combinations of normal variability in the system. In this model, accidents are triggered by unexpected combinations of normal actions, rather than action failures, which combine, or resonate, with other normal variability in the process to produce the necessary and jointly sufficient conditions for a failure to occur. Because of the complex, non-linear nature of this model, it is difficult to represent graphically. The Functional Resonance model from Erik Hollnagel uses a signal metaphor to visualize this model, with undetectable variabilities unexpectedly resonating to produce a detectable outcome.
The Jenga™ game is also an excellent metaphor for describing the complex, non-linear accident model. Every time a block is pulled from the stack, it has subtle interactions with the other blocks that cause them to loosen or tighten in the stack. The missing blocks represent the sources of variability in the process and are typically described as organizational weaknesses or latent conditions. Realistically, these labels are applied only retrospectively, after what was seen as normal before the accident is seen as having contributed to the event, and then only in combination with other factors. Often, the worker makes an error or takes an action that seems appropriate, but when combined with the other variables, brings the stack crashing down. The first response is to blame the worker because his action demonstrably led to the failure, but it must be recognized that without the other missing blocks, there would have been no consequence.
A major benefit of the systemic model is that it provides a more complete understanding of the subtle interactions that contributed to the event. Because the model views accidents as resulting from unexpected combinations of normal variability, it seeks an understanding of how normal variability combined to create the accident. From this understanding of contributing interactions, latent conditions or organizational weaknesses can be identified.
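The central idea, that individually normal variations can combine into an abnormal outcome, can be shown with a toy calculation. This is only a sketch under strong assumptions: variability is reduced to a single number per activity, the combination is a plain sum, and the limits are arbitrary. Real systemic interactions are non-linear, and the function name and values below are hypothetical rather than taken from the Handbook or from Hollnagel's Functional Resonance model.

```python
def classify_outcome(variabilities, individual_limit, system_tolerance):
    """Classify one work period from the deviation of each activity.

    variabilities    -- signed deviation of each activity from its plan
    individual_limit -- largest deviation still regarded as normal for one activity
    system_tolerance -- largest combined deviation the system can absorb
    """
    every_action_normal = all(abs(v) <= individual_limit for v in variabilities)
    combined = abs(sum(variabilities))
    if combined <= system_tolerance:
        return "success"
    if every_action_normal:
        return "accident from normal variability"
    return "accident with an out-of-range action"


routine_day = [2, -3, 1, -2]   # deviations partly cancel each other
unlucky_day = [3, 2, 3, 3]     # each deviation is normal, but all align

print(classify_outcome(routine_day, individual_limit=4, system_tolerance=8))
# success
print(classify_outcome(unlucky_day, individual_limit=4, system_tolerance=8))
# accident from normal variability
```

On the second day no single activity is outside its normal range, so an investigator searching for one failed action would find none. The outcome is explained only by how the deviations combined.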
1.4 Cause and Effect Relationships
Although generally accepted as the overarching purpose of the investigation, the identification of causes can be problematic. Causal analysis gives the appearance of rigor and the strenuous application of time-tested methodologies, but the problem is that causality (i.e., a cause-effect relationship) is often constructed where it does not really exist. To understand how this happens, we need to take a hard look at how accidents are investigated, how cause – effect relationships are determined, and the requirements for a true cause - effect relationship.
1.4.1 Investigations Look Backwards
The best metaphor for how accidents are investigated is a simple maze. If a group of people are asked to solve the maze as quickly as possible and ask the “winners” how they did it, invariably the answer will be that they worked it from the Finish to the Start. Most mazes are designed to be difficult working from the Start to the Finish, but are simple working from the Finish to the Start. Like a maze, accident investigations look backwards. What was uncertain for the people working forward through the maze becomes clear for the investigator looking backwards.
Because accident investigations look backwards, it is easy to oversimplify the search for causes. Investigators look backwards with the undesired outcome (effect) preceded by actions, which is opposite of how the people experienced it (actions followed by effects). When looking for cause - effect relationships (and there many actions taking place along the timeline), there are usually one or more actions or conditions before the effect (accident) that seem to be plausible candidates for the cause(s).
There are some common and mostly unavoidable problems when looking backwards to find causality. As humans, investigators have a strong tendency to draw conclusions that are not logically valid and which are based on educated guesses, intuitive judgment, “common sense”, or other heuristics, instead of valid rules of logic. The use of event timelines, while beneficial in understanding the event, creates sequential relationships that seem to imply causal relationships. A quick primer on cause and effect may help to clarify.
1.4.2 Cause and Effect are Inferred
Cause and effect relationships are normally inferred from observation, but are generally not something that can be observed directly.
Normally, the observer repeatedly observes Action A followed by Effect B and concludes that B was caused by A. It is the consistent and unwavering repeatability of the cause followed by the effect that actually establishes a true cause-effect relationship.
For example: Kink a garden hose (Action A), water flow stops (Effect B); the conclusion is that kinking the garden hose causes the water to stop flowing. This cause and effect relationship is so well established that the person will immediately look for a kink in the hose if the flow is interrupted.
Accident investigations, however, involve the notion of backward causality, i.e., reasoning backward from Effect to Action.
The investigator observes Effect B (the bad outcome), assumes that it was caused by something, and then tries to find out which preceding Action was the cause of it. Lacking the certainty of repeatability (unless the conditions are repeated), a causal relationship can only be assumed because it seems plausible. [Hollnagel, 2004]10
1.4.3 Establishing a Cause and Effect Relationship
A true cause and effect relationship must meet these requirements:
The cause must precede the effect (in time).
The cause and effect must have a necessary and constant connection between them, such that the same cause always has the same effect.
This second requirement is the one that invalidates most of the proposed causes identified in accident investigations. As an example, a cause statement such as “the accident was due to inadequate supervision” cannot be valid because the inadequate supervision does not cause accidents all the time. This type of cause statement is generally based on the simple “fact” that the supervisor failed to prevent the accident. There are generally some examples, such as not spending enough time observing workers, to support the conclusion, but these examples are cherry-picked to support the conclusion and are typically value judgments made after the fact. [Dekker, 2006]15
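The two requirements can be sketched as a simple check over a set of observed occasions. This is a minimal illustration only, not a method from this Handbook: the Observation record, the is_true_cause function and all of the data are hypothetical, and time values are arbitrary units.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """One observed occasion (hypothetical record). None means 'did not occur'."""
    cause_time: Optional[float]
    effect_time: Optional[float]


def is_true_cause(observations: List[Observation]) -> bool:
    """Apply both requirements to every occasion on which the candidate cause occurred."""
    with_cause = [o for o in observations if o.cause_time is not None]
    if not with_cause:
        return False  # no occasions with the candidate cause, so nothing is established
    for o in with_cause:
        if o.effect_time is None:
            return False  # cause without effect: no necessary and constant connection
        if o.cause_time >= o.effect_time:
            return False  # the cause did not precede the effect
    return True


# Assumed data: every time the hose is kinked, the flow stops shortly afterwards.
hose = [Observation(cause_time=t, effect_time=t + 1.0) for t in range(50)]

# Assumed data: a supervisor is absent on 200 shifts; an accident follows on one of them.
supervision = [Observation(cause_time=0.0, effect_time=None) for _ in range(199)]
supervision.append(Observation(cause_time=0.0, effect_time=3.0))

print(is_true_cause(hose))         # True
print(is_true_cause(supervision))  # False
```

Under these assumptions the kinked hose passes both requirements, while the supervisor's absence fails the second one because it is not followed by an accident every time.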
1.4.4 The Circular Argument for Cause
The example (inadequate supervision) above is what is generally termed a “circular argument.” The statement is made that the accident was caused by “inadequate XXX.” But when challenged as to why it was judged to be inadequate, the only evidence is that it must be inadequate because the accident happened. The circular argument is usually evidenced by the use of negative descriptors such as inadequate, insufficient, less than adequate, poor, etc. The Accident Investigation Board (AIB) needs to eliminate this type of judgmental language and simply state the facts. For example, the fact that a supervisor was not present at the time of the accident can be identified as a contributing factor, although it is clear that accidents do not happen every time a supervisor is absent.
True cause and effect relationships do exist, but they are almost always limited to the mechanistic or physics-based aspects of the event. In a complex socio-technical system involving people, processes and programs, the observed effects are usually emergent phenomena due to interactions within the system rather than resultant phenomena due to cause and effect.
With the exception of physical causes, such as a shorted electrical wire as the ignition source for a fire, causes are not found; they are constructed in the mind of the investigator. Since accidents do happen, there are obviously many factors that contribute to the undesired outcome and these factors need to be addressed. Although truly repeatable cause and effect relationships are almost impossible to find, many factors that seem to have contributed to the outcome can be identified. These factors are often identified by missed opportunities and missing barriers, which get mislabeled as causes. Because such a finding is really opinion, sufficient information needs to be assembled and presented in a form that makes the rationale of that opinion understandable to others reviewing it.
The investigation should focus on understanding the context of decisions and explaining the event. In order to understand human performance, do not limit yourself to the quest for causes. An explanation of why people did what they did provides a much better understanding and with understanding comes the ability to develop solutions that will improve operations.
1.4.5 Counterfactuals
Using the maze metaphor, what was complex, with multiple paths and unknown outcomes for the workers, becomes simple and obvious for the investigator. The investigator can easily retrace the workers’ path through the maze and see where they chose a path that led to the accident rather than one that avoided the accident. The result is a counterfactual (literally, counter to the facts) statement of what people should or could have done to avoid the accident. The counterfactual statements are easy to identify because they use common phrases like:
“they could have …”
“they did not …”
“they failed to …”
“if only they had …”
The problem with counterfactuals is that they state what people did not do and do not explain why the workers did what they did. Counterfactuals take place in an alternate reality that did not happen and basically represent a list of what the investigators wish had happened instead.
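As an aid when reviewing draft report text, the counterfactual phrases listed above, together with the negative descriptors named in Section 1.4.4, can be flagged automatically for a second look. The following is a rough sketch under stated assumptions: the function name flag_language, the phrase lists as coded, and the sample text are all hypothetical, and simple phrase matching will produce both false alarms and misses.

```python
import re

# Phrases taken from the lists in Sections 1.4.4 and 1.4.5 (shortened to their stems).
COUNTERFACTUAL_PHRASES = ["could have", "did not", "failed to", "if only"]
JUDGMENTAL_DESCRIPTORS = ["inadequate", "insufficient", "less than adequate", "poor"]


def flag_language(text):
    """Return (sentence, matched phrases) pairs for sentences that deserve a second look."""
    findings = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        hits = [
            phrase
            for phrase in COUNTERFACTUAL_PHRASES + JUDGMENTAL_DESCRIPTORS
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        ]
        if hits:
            findings.append((sentence, hits))
    return findings


draft = (
    "The supervisor failed to observe the work. "
    "The worker isolated the pump at 0900. "
    "Training was inadequate."
)
for sentence, hits in flag_language(draft):
    print(hits, "->", sentence)
```

A flagged sentence is not necessarily wrong; the flag is only a prompt to restate the fact (what happened, and why it made sense to the worker at the time) in place of the judgment.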
Discrepancies between a static procedure and actual work practices in a dynamic and ever changing workplace are common and are not especially unique to the circumstances involved in the accident. Discrepancies are discovered during the investigation simply because considerable effort was expended in looking for them, but they could also be found throughout the organization where an accident has not occurred. This does not mean that counterfactual statements should be discounted. They can be essential to understanding why the decisions the worker made and the actions (or no actions) that the worker took were seen as the best way to proceed. [Dekker, 2006]15
1.5 Human Performance Considerations
In order to understand human performance, do not limit yourself to the quest for causes. The investigation should focus on understanding the context of decisions and explaining the event. An explanation of why people did what they did provides a much richer understanding and with understanding comes the ability to develop solutions that will improve operations.
The safety culture maturity model from the International Atomic Energy Agency (IAEA) provides the basis for an improved understanding of the human performance aspect of the accident investigation. IAEA TECDOC 1329, Safety Culture in Nuclear Installations: Guidance for Use in the Enhancement of Safety Culture, was developed for use in IAEA’s Safety Culture Services to assist their Member States in their efforts to develop a sound safety culture. Although the emphasis is on the assessment and improvement of a safety culture, the introductory sections, which lay the groundwork for understanding safety culture maturity, provide a framework to understand the environment which forms the organization’s human performance.
Organizational Maturity
Rule Based: People who make mistakes are blamed for their failure to comply with rules.
Goal Based: Management’s response to mistakes is more controls, procedures, and training.
Improvement Based: Mistakes are seen as process variability, with emphasis on understanding what happened rather than finding someone to blame.
Figure 1-1: IAEA-TECDOC-1329 – Safety Culture in Nuclear Installations
The model (Figure 1-1) defines three levels of safety culture maturity and presents characteristics for each of the maturity levels based on the underlying beliefs and assumptions. The concept is illustrated below with the characteristics for how the organization responds to an accident.
Rule Based – Safety is based on rules and regulations. Workers who make mistakes are blamed for their failure to comply with the rules.
Goal Based – Safety becomes an organizational goal. Management’s response to mistakes is to pile on more broadly enforced controls, procedures and training with little or no performance rationale or basis for the changes.
Improvement Based – The concept of continuous improvement is applied to safety. Almost all mistakes are viewed in terms of process variability, with the emphasis placed on understanding what happened rather than finding someone to blame, and a targeted response to fix the underlying factors.
When an accident occurs that causes harm or has the potential to cause harm, a choice exists: to vector forward on the maturity model and learn from the accident or vector backwards by blaming the worker and increasing enforcement. In order to do no harm, accident investigations need to move from the rule based response, where workers are blamed, to the improvement based response where mistakes are seen as process variability needing improvement.
1.5.1 Bad Apples
The Bad Apple Theory is based on the belief that the system in which people work is basically safe and worker errors and mistakes are seen as the cause of the accident. An investigation based on this belief focuses on the workers’ bad decisions or inappropriate behavior and deviation from written guidance, with a conclusion that the workers failed to adhere to procedures. Because the supervisor’s role is seen as enforcing the rules, the investigation will often focus on supervisory activities and conclude that the supervisor failed to adequately monitor the worker’s performance and did not correct noncompliant behavior. [Dekker, 2002]16
From the investigation perspective, knowing what the outcome was creates a hindsight bias which makes it difficult to view the event from the perspective of the worker before the accident. It is easy to blame the worker and difficult to look for weaknesses within the organization or system in which they worked. The pressure to find an obvious cause and quickly finish the investigation can be overpowering.
1.5.2 Human Performance Modes – Cognitive Demands
People are fallible; even the best people make mistakes. This is the first principle of Human Performance Improvement, and accident investigators need to understand the nature of the error to determine the appropriate response to the error. Jens Rasmussen developed a classification of the different types of information processing involved in industrial tasks. Usually referred to as performance modes, these three classifications describe how the worker’s mind is processing information while performing the task (Figure 1-2). The three performance modes are:
Skill mode - Highly practiced actions in a familiar situation, usually executed from memory. Because the worker is highly familiar with the task, little attention is required and the worker can perform the task without significant conscious thought. This mode is very reliable, with infrequent errors on the order of 1 in every 10,000 iterations of the task.
Rule mode - Actions based on selection of written or stored rules derived from one’s recognition of the situation. The worker is familiar with the task and is taking actions in response to the changing situation. Errors are more frequent, on the order of 1 in 1,000, and are due to a misinterpretation of either the situation or the correct response.
Knowledge mode - Actions in response to an unfamiliar situation. This could be a new task or a previously familiar task that has changed in an unanticipated manner. Rather than using known rules, the worker is trying to reason or even guess their way through the situation. Errors can be as frequent as 1 in 2, literally a coin flip.
The performance modes refer to the amount of conscious control exercised by the individual doing the task, not the type of work itself. In other words, the skill performance mode does not imply work by crafts; rule mode does not imply supervision; and the knowledge mode does not imply work by professionals. This is a scale of the conscious thought required to react properly to a hazardous condition; from drilled automatic response, to conscious selection and compliance to proper rules, to needing to recognize there is a hazardous condition. The more unfamiliar the worker is with the work environment or situation, the more reliance there is on the individual’s alert awareness, rational reasoning and quick decision-making skills in the face of new hazards. Knowledge mode would be commonly relied on in typically simple, mundane, low hazard tasks. All work, whether performed by a carpenter or surgeon, can exist in any of the performance modes. In fact, the performance mode is always changing, based on the nature of the work at the time. [Reason and Hobbs, 2003]17
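The order-of-magnitude error rates quoted for the three modes can be compared with a little arithmetic. The sketch below is illustrative only: it treats the quoted figures (1 in 10,000, 1 in 1,000 and 1 in 2) as fixed rates, which is a simplifying assumption, and the function names are hypothetical.

```python
from fractions import Fraction

# Order-of-magnitude rates quoted above, held as exact fractions.
ERROR_RATES = {
    "skill": Fraction(1, 10_000),
    "rule": Fraction(1, 1_000),
    "knowledge": Fraction(1, 2),
}


def expected_errors(mode, repetitions):
    """Expected number of errors if the task is performed `repetitions` times in `mode`."""
    return ERROR_RATES[mode] * repetitions


def relative_to_skill(mode):
    """How many times more error-prone `mode` is than skill mode."""
    return ERROR_RATES[mode] / ERROR_RATES["skill"]


for mode in ERROR_RATES:
    print(mode, float(expected_errors(mode, 10_000)), float(relative_to_skill(mode)))
# skill 1.0 1.0
# rule 10.0 10.0
# knowledge 5000.0 5000.0
```

On these figures, a task pushed from skill mode into rule mode becomes roughly ten times more error-prone, and a task performed in knowledge mode roughly five thousand times more error-prone, than the same task performed in skill mode.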
Understanding the performance mode the worker was in when he/she made the error is essential to developing the response to the accident (Figure 1-2). Errors in the skill mode typically involve mental slips and lapses in attention or concentration. The error does not involve lack of knowledge or understanding and, therefore, training can often be inappropriate. The worker is literally the expert on their job and training is insulting to the worker and causes the organization to lose credibility. Likewise, changing the procedure or process in response to a single event is inappropriate. It effectively pushes the worker out of skill mode into rule mode until the new process can be assimilated. Because rule mode has a higher error rate, the result is usually an increase in errors (and accidents) until the workers assimilate the changes and return to skill mode. Training can be appropriate where the lapse is deemed due to a drift in skills competence, an out-of-date mindset, or the need for a drilled response without lapses.
Training might be appropriate for errors that occurred in rule mode because the error generally involved misinterpretation of either the situation or the correct response. In these instances, understanding requirements and knowing where and under what circumstance those requirements apply is cognitive in nature and must be learned or acquired in some way. Procedural changes are appropriate if the instructions were incorrect, unclear or misleading.
[Figure: error types plotted by Attention (to task), Low to High, against Familiarity (w/task), Low to High. Labels shown: Inaccurate Mental Picture, Misinterpretation, Inattention.]
Figure 1-2: Performance Modes
Training might also be appropriate for errors that occurred in the knowledge mode, if the workers’ understanding of the system was inadequate. However, the problem might have been issues like communication and problem-solving during the event, rather than inadequate knowledge.
1.5.3 Error Precursors
“Knowledge and error flow from the same mental sources, only success can tell the one from the other.” The idea of human error as “cause” in consequential accidents is one that has been debunked by safety science since the early work by Johnson and the System Safety Development Center (SSDC) team. As Perrow put it: “Formal accident investigations usually start with an assumption that the operator must have failed, and if this attribution can be made, that is the end of serious inquiry. Finding that faulty designs were responsible would entail enormous shutdown and retrofitting costs; finding that management was responsible would threaten those in charge, but finding that operators were responsible preserves the system, with some soporific injunctions about better training.” [Mach, 1976]18 [Perrow, 1984]5
In contemporary safety science, error is simply the occurrence of unintended results during human performance. Error is viewed as a mismatch between the human condition and environmental factors operative at a given moment or within a series of actions. Research has demonstrated that the presence of various factors in combination increases the potential for error; these factors may be referred to as error precursors. Anticipation and identification of such precursors is a distinguishing performance strategy of highly performing individuals and organizations. The following Task, Work Environment, Individual Capabilities and Human Nature (TWIN) model is a useful diagnostic tool for investigation (Figure 1-3).
TWIN Analysis Matrix
(Human Performance Error Precursors)
Task Demands:
Time pressure (in a hurry)
High workload (large memory)
Simultaneous, multiple actions
Repetitive actions / Monotony
Irreversible actions
Interpretation requirements
Unclear goals, roles, or responsibilities
Lack of or unclear standards
Individual Capabilities:
Unfamiliarity with task / First time
Lack of knowledge (faulty mental model)
New techniques not used before
Imprecise communication habits
Lack of proficiency / Inexperience
Indistinct problem-solving skills
Unsafe attitudes
Illness or fatigue; general poor health or injury
Work Environment:
Distractions / Interruptions
Changes / Departure from routine
Confusing displays or controls
Work-arounds
Hidden system / equipment response
Unexpected equipment conditions
Lack of alternative indication
Personality conflict
Human Nature:
Stress
Habit patterns
Assumptions
Complacency / Overconfidence
Mind-set (intentions)
Inaccurate risk perception
Mental shortcuts or biases
Limited short-term memory
Figure 1-3: Error Precursors
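For an investigation team that records its observations electronically, the TWIN matrix in Figure 1-3 can be held as a simple lookup so that the precursors noted during interviews are grouped by category. The sketch below is one possible arrangement, not a DOE tool: the function name classify_precursors, the "Unclassified" bucket and the sample observations are hypothetical.

```python
# The four TWIN categories and their precursors, transcribed from Figure 1-3.
TWIN = {
    "Task Demands": [
        "Time pressure (in a hurry)",
        "High workload (large memory)",
        "Simultaneous, multiple actions",
        "Repetitive actions / Monotony",
        "Irreversible actions",
        "Interpretation requirements",
        "Unclear goals, roles, or responsibilities",
        "Lack of or unclear standards",
    ],
    "Individual Capabilities": [
        "Unfamiliarity with task / First time",
        "Lack of knowledge (faulty mental model)",
        "New techniques not used before",
        "Imprecise communication habits",
        "Lack of proficiency / Inexperience",
        "Indistinct problem-solving skills",
        "Unsafe attitudes",
        "Illness or fatigue; general poor health or injury",
    ],
    "Work Environment": [
        "Distractions / Interruptions",
        "Changes / Departure from routine",
        "Confusing displays or controls",
        "Work-arounds",
        "Hidden system / equipment response",
        "Unexpected equipment conditions",
        "Lack of alternative indication",
        "Personality conflict",
    ],
    "Human Nature": [
        "Stress",
        "Habit patterns",
        "Assumptions",
        "Complacency / Overconfidence",
        "Mind-set (intentions)",
        "Inaccurate risk perception",
        "Mental shortcuts or biases",
        "Limited short-term memory",
    ],
}


def classify_precursors(observed):
    """Group observed precursor names by TWIN category (case-insensitive exact match)."""
    lookup = {
        item.lower(): category for category, items in TWIN.items() for item in items
    }
    grouped = {}
    for name in observed:
        category = lookup.get(name.strip().lower(), "Unclassified")
        grouped.setdefault(category, []).append(name)
    return grouped


# Hypothetical observations from an interview.
noted = ["Stress", "time pressure (in a hurry)", "Work-arounds", "Loud music"]
print(classify_precursors(noted))
```

Several precursors falling in different categories for the same task is consistent with the point made above that it is factors in combination that increase the potential for error.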
1.5.4 Optimization
Human performance is often summarized as the individual working within organizational systems to meet the expectations of leaders. Performance variability is all about meeting expectations and actions intended to produce a successful outcome.
To understand performance variability, an investigator must understand the nature of humans. Regardless of the task, whether at work or not, people constantly strive to optimize their performance by striking a balance between resources and demands. Both of these vary over time as people make a trade-off between thoroughness and efficiency. In simple terms, thoroughness represents the time and resources expended in preparation to do the work and efficiency is the time and resources expended in completing the work. To do both completely requires more time and resources than is available and people must choose between them. The immediate and certain reward for meeting schedule and production expectations easily overrides the delayed and uncertain consequence of insufficient preparation and people lean towards efficiency. They are as thorough as they believe is necessary, but without expending unnecessary effort or wasting time.
The result is a deviation from expectation and the reason is obvious: it saves time and effort, which is then available for more important or pressing activities. How the deviation is judged afterwards is a function of the outcome, not the decision. If organizational expectations are met without incident, the deviations are typically disregarded or may even be condoned and rewarded as process improvements. If the outcome was an accident, the same actions can be quickly judged as violations. This is the probabilistic nature of organizational decision-making, which is driven by the perceptions or misperceptions of risks. A deviation or violation is not the end of the investigation; it is the beginning, as the investigator tries to understand what perceptions were at work in the system that drove the choice to deviate. [Hollnagel, 2009]19
1.5.5 Work Context
Context matters and performance variability is driven by context. The simple sense – think – act model illustrates the role of context. Information comes to the worker, he makes a decision based on the context, and different actions are possible, based on the context.
The context of the decision relates to the goals, knowledge and focus of the worker. Successful completion of the immediate task is the obvious goal, but it takes place within the greater work environment where the need to optimize the use of time and resources is critical. Workers have knowledge, but the application of knowledge is not always straightforward because it needs to be accurate, complete and available at the time of the decision. Goals and knowledge combine to determine the worker’s focus. Because workers cannot know and see everything all the time, what they are trying to accomplish and what they know drives where they direct their attention.
All this combines to create decisions that vary based on the influences that are present at the time of the decision and the basic differences in people. These influences and differences include:
Organization - actions taken to meet management priorities and production expectations.
Knowledge - actions taken by knowledgeable workers with intent to produce a better outcome.
Social - actions taken to meet co-worker expectations and informal work standards.
Experience - actions based on past experience in an effort to repeat success and avoid failure.
Inherent variability - actions vary due to individual psychological and physiological differences.
Ingenuity and creativity - adaptability in overcoming constraints and underspecification.
The result is variable performance. From the safety perspective, this means that the reason workers sometimes trigger an accident is that the outcome of their action differs from what was intended. The actions, however, are taken in response to the variability of the context and conditions of the work. Conversely, successful performance and process improvement also arise from this same performance variability. Expressed another way, performance variability is not aberrant behavior; it is the probabilistic nature of decisions made by each individual in the organization that can result in both success and failure emerging from the same normal work sequence.
In accident investigations, performance variability needs to be acknowledged as a characteristic of the work, not as the cause of the accident. Rather than simply judging a decision as wrong in retrospect, the decision needs to be evaluated in the context in which it was made. The context or influences that drive the deviation need to be understood and addressed as contributing factors. Stopping with the worker’s deviation as the cause corrects nothing. The next worker, working in the same context, will eventually adapt and deviate from work-as-imagined until chance aligns the deviation with other organizational system weaknesses for a new accident.
Performance variability is not limited to just the worker who triggers the accident. People are involved in all aspects of the work, and the result is variability of all factors associated with the work. This can include variation in the actions of the co-workers, the expectations of the leaders, accuracy of the procedures, the effectiveness of the defenses and barriers, or even the basic policies of the organization. This is reflected in the complex, non-linear (non-Newtonian) accident model where unexpected combinations of normal variability can result in the accident.
1.5.6 Accountability, Culpability and Just Culture
“Name, blame, shame, retrain” is an oft-used phrase for older, ineffective paradigms of safety management and accident analysis. Dr. Rosabeth Moss Kanter of Harvard Business School phrased the situation this way: “Accountability is a favorite word to invoke when the lack of it has become so apparent.” [Kanter, 2009]20
The concepts of accountability, culpability and just culture are inextricably entwined. Accountability has been defined in various ways but in general with this characterization; “The expectation that an individual or an organization is answerable for results; to explain actions, or; the degree to which individuals accept responsibility for the consequences of their actions, including the rewards or sanctions.” As Dr. Kanter explains “The tools of accountability — data, details, metrics, measurement, analyses, charts, tests, assessments, performance evaluations — are neutral. What matters is their interpretation, the manner of their use, and the culture that surrounds them. In declining organizations, use of these tools signals that people are watched too closely, not trusted, about to be punished. In successful organizations, they are vital tools that high achievers use to understand and improve performance regularly and rapidly.”
Culpability is about considering whether the actions of an individual are blameworthy. The concept of culpability in safety is based largely on the work of Dr. James Reason as a function of creating a Just Culture. The purpose is to pursue a humane culture in which learning, both individual and collective, is valued and human fallibility is recognized as simply part of the human condition. Being human, however, is to be distinguished from being a malefactor. He explains: “The term ‘no-blame culture’ flourished in the 1990’s and still endures today. Compared to the largely punitive cultures that it sought to replace, it was clearly a step in the right direction. It acknowledged that a large proportion of unsafe acts were ‘honest errors’ (the kinds of slips, lapses and mistakes that even the best people can make) and were not truly blameworthy, nor was there much in the way of remedial or preventative benefit to be had by punishing their perpetrators. But the ‘no-blame’ concept had two serious weaknesses. First, it ignored – or at least, failed to confront – those individuals who willfully (and often repeatedly) engaged in dangerous behaviors that most observers would recognize as being likely to increase the risk of a bad outcome. Second, it did not properly address the crucial business of distinguishing between culpable and non-culpable unsafe acts.”
“…a safety culture depends critically on first negotiating where the line should be drawn between unacceptable behaviour and blameless unsafe acts. There will always be a grey area between these two extremes where the issue has to be decided on a case by case basis.”
“… the large majority of unsafe acts can be reported without fear of sanction. Once this crucial trust has been established, the organization begins to have a reporting culture, something that provides the system with an accessible memory, which, in turn, is the essential underpinning to a learning culture. There will, of course, be setbacks along the way. But engineering a just culture is the all-important early step; so much else depends upon it.” [GAIN Working Group E, 2004]21
Along the road to a Just Culture, organizations may benefit from explicit “amnesty” programs designed to persuade people to report their personal mistakes. In complex events, individual actions are never the sole causes. Thus, determination of individual culpability and of any personnel actions that might be warranted should be explicitly separated from the accident investigation. Failure to make such separation may result in reticence or even refusal of individuals involved to cooperate in the investigation, may skew recollections and testimony, may prevent investigators from obtaining important information, and may unfairly taint the reputations and credibility of well-intended individuals to whom no blame should be attached.
1.6 From Latent Conditions to Active Failures
An organizational event causal story developed by James Reason starts with the organizational factors: strategic decisions, generic organizational processes – forecasting, budgeting, allocating resources, planning, scheduling, communicating, managing, auditing, etc. These processes are colored and shaped by the corporate culture or the unspoken attitudes and unwritten rules concerning the way the organization carries out its business. [Reason, 1997]14
These factors result in biases in the management decision process that create “latent conditions” that are always present in complex systems. The quality of both production systems and protection systems is dependent upon the same underlying organizational decision processes; hence, latent conditions cannot be eliminated from the management systems, since they are an inevitable product of the cultural biases in strategic decisions. [Reason, p. 36, 1997]14
Figure 1-4 illustrates an example of latent conditions produced from the pressures of commitment to a heavy work load as an organizational factor at the base of the pyramid. This passes into the organization as a local work place factor in the form of stress in the work place. This is the latent condition that is a precursor or contributing factor to the worker cutting corners (the active failure of the safety system).
A distinction between active failures and latent conditions rests on two differences. The first difference is the time taken to have an adverse impact. Active failures usually have immediate and relatively short-lived effects. Latent conditions can lie dormant, doing no particular harm, until they interact with local circumstances to defeat the systems’ defenses. The second difference is the location within the organization of the human instigators. Active failures are committed by those at the human-system interface, the front-line activities, or the “sharp-end” personnel. Latent conditions, on the other hand, are spawned in the upper echelons of the organization and within related manufacturing, contracting, regulatory and governmental agencies that are not directly interfacing with the system failures.
The consequences of these latent conditions permeate throughout the organization to local workplaces—control rooms, work areas, maintenance facilities etc. —where they reveal themselves as workplace factors likely to promote unsafe acts (moving up the pyramid in Figure 1-4). These local workplace factors include undue time pressure, inadequate tools and equipment, poor human-machine interfaces, insufficient training, under-manning, poor supervisor-worker ratios, low pay, low morale, low status, macho culture, unworkable or ambiguous procedures, and poor communications.
Within the workplace, these local workplace factors can combine with natural human performance tendencies such as limited attention, habit patterns, assumptions, complacency, or mental shortcuts. These combinations produce unintentional errors and intentional violations (collectively termed “adaptive acts”) committed by individuals and teams at the “sharp end,” or the direct human-system interface (active error).
Large numbers of these adaptive acts will happen (small red arrows in Figure 1-4), but very few will align with the holes in the defenses (holes are created by the latent conditions deep within the organization). With defense-in-depth providing a multi-barrier defense, it takes multiple human performance errors to breach the multiple defenses. However, when defenses have become sufficiently flawed and organizational behavior consistently drifts from desired behavior, accidents can occur. In such events causes are multiple and only the most superficial analysis would suggest otherwise.
Figure 1-4: Organizational Causes of Accidents
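The effect of flawed defenses on a multi-barrier system can be illustrated with simple arithmetic. The sketch below rests on a strong simplifying assumption that is not made in this Handbook: each barrier is treated as independent, with a fixed chance that an adaptive act finds a hole in it. All probabilities and function names are hypothetical. Because holes in different barriers often stem from the same latent conditions, real barriers are unlikely to be independent, and this calculation would then understate the risk.

```python
import math


def breach_probability(hole_probabilities):
    """Chance that one adaptive act passes every barrier, ASSUMING independent barriers."""
    return math.prod(hole_probabilities)


def expected_breaches(adaptive_acts, hole_probabilities):
    """Expected number of adaptive acts that get through all barriers."""
    return adaptive_acts * breach_probability(hole_probabilities)


# Hypothetical figures: three barriers, each defeated 1 time in 10.
sound_defenses = [0.1, 0.1, 0.1]
# Hypothetical figures: the same barriers after drift, each defeated 1 time in 2.
flawed_defenses = [0.5, 0.5, 0.5]

print(expected_breaches(1000, sound_defenses))   # about 1 in 1,000 acts gets through
print(expected_breaches(1000, flawed_defenses))  # about 125 in 1,000 acts get through
```

Even in this simplified picture, a moderate weakening of each individual barrier multiplies through the layers, which is consistent with the observation above that accidents occur once defenses have become sufficiently flawed.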
1.7 Doing Work Safely - Safety Management Systems
Safety Management Systems (SMS) were developed to integrate safety as part of an organization’s management of mission performance. The benefits of process-based management systems are a well-established component of quality performance. As organizations and the technologies they employ became more complex and diverse, and as the pace of change in societal expectations, technical innovations, and competitiveness increased, sound management of the functions essential to safe operations became more important.
An SMS is essentially a quality management approach to controlling risk. It also provides the organizational framework to support a sound safety culture. Systems can be described in terms of integrated networks of people and other resources performing activities that accomplish some mission or goal in a prescribed environment. Management of the system’s activities involves planning, organizing, directing, and controlling these assets toward the organization’s goals. Several important characteristics of systems and their underlying processes are known as “process attributes” or “safety attributes” when they are applied to safety-related operational and support processes.
The SMS for DOE is the Integrated Safety Management System (ISMS), defined in Federal Acquisition Regulation and amplified through DOE directives and guidance. The ISMS is the overarching safety system used by DOE to ensure safety of the worker, the community and the environment. The DOE ISMS is characterized by seven principles and five core functions:
Seven Principles
Line management responsibility for safety: Line management is directly responsible for the protection of workers, the public and the environment.
Clear roles and responsibilities: Clear and unambiguous lines of authority and responsibility for ensuring safety are established and maintained at all organizational levels and for subcontractors.
Competence commensurate with responsibilities: Personnel are required to have the experience, knowledge, skills and capabilities necessary to discharge their responsibilities.
Balanced priorities: Managers must allocate resources to address safety, as well as programmatic and operational considerations. Protection of workers, the public and the environment is a priority whenever activities are planned and performed.
Identification of safety standards and requirements: Before work is performed, the associated hazards must be evaluated, and an agreed-upon set of safety standards and requirements must be established to provide adequate assurance that workers, the public and the environment are protected from adverse consequences.
Hazard controls tailored to work being performed: Administrative and engineering controls are tailored to the work being performed to prevent adverse effects and to mitigate hazards.
Operations authorization: The conditions and requirements to be satisfied before operations are initiated are clearly established and agreed upon.
Five Core Functions (Figure 1-5)
Define the scope of work: Missions are translated into work, expectations are set, tasks are identified and prioritized and resources are allocated.
Analyze the hazards: Hazards associated with the work are identified, analyzed and categorized.
Develop and implement hazard controls: Applicable standards, policies, procedures and requirements are identified and agreed upon; controls to prevent/mitigate hazards are identified; and controls are implemented.
Perform work within controls: Readiness is confirmed and work is performed safely.
Provide feedback and continuous improvement: Information on the adequacy of controls is gathered, opportunities for improving the definition and planning of work are identified, and line and independent oversight is conducted.
Figure 1-5: Five Core Functions of DOE’s Integrated Safety Management System
1.7.1 The Function of Safety Barriers
The use of controls or barriers to protect people from hazards is a core principle of safety. Barriers are employed to serve two purposes: to prevent release of hazardous energy and to mitigate harm in the event hazardous energy is released. Energy is defined broadly as used here, and includes multiple forms, for example: kinetic, biological, acoustical, chemical, electrical, mechanical, potential, electromagnetic, thermal, or radiation.ii
ii For a detailed discussion of barriers refer to “Barriers and Accident Prevention” by Erik Hollnagel, 2004.
The dynamics of accidents may be categorized into five basic components, illustrated in Figure 1-6: 1) the threat or triggering action or energy, 2) the prevention barrier between the threat and the hazard, 3) the hazard or energy potential, 4) the mitigation barrier to mitigate hazardous consequences towards the target, 5) the targets in the path of the potential hazard consequences. When these controls or barriers fail, they allow unwanted energy to flow resulting in an accident or other adverse consequence.
[Figure 1-6 diagram, titled “Preventing System Accidents”: initiating sources or threats (human error, attack or sabotage, natural forces, equipment failure) are separated from the hazards by a prevention barrier (e.g., spark inhibitors); a mitigation barrier (e.g., secondary containment) separates the hazards from the targets (workers, public, environment). Arrows show the undesired energy flow.]
Figure 1-6: Barriers and Accident Dynamics – Simplistic Design
The objective is to contain or isolate hazards through the use of protective barriers. Prevention barriers are intended to preclude release of hazards by human acts, equipment degradation, or natural phenomena. Mitigation barriers are used to shield, contain, divert or dissipate the hazardous energy if it is released, thus precluding negative consequences to the employees or the surrounding communities. Distance from the hazard is a common mitigating barrier.
Barrier analysis is based on the premise that hazards are associated with all accidents. Barriers are developed and integrated into a system or work process to protect personnel and equipment from hazards. For an accident to occur, either the design of technical systems did not provide adequate barriers, the work design did not specify use of appropriate barriers, or the barriers failed. Investigators use barrier analysis to identify hazards associated with an accident and the barriers that should or could have prevented it. Barrier analysis addresses:
Barriers that were in place and how they performed
Barriers that were in place but not used
Barriers that were not in place but were required
The barrier(s) that, if present or strengthened, would prevent the same or a similar accident from occurring in the future.
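The questions that barrier analysis addresses can be sketched as a small classification routine. This is a hypothetical illustration rather than a prescribed DOE worksheet: the `Barrier` fields, the `Finding` labels, and the example barriers are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class Finding(Enum):
    PERFORMED = "in place and performed as intended"
    FAILED = "in place and used, but failed"
    NOT_USED = "in place but not used"
    MISSING = "not in place but required"
    NOT_REQUIRED = "not in place and not required"


@dataclass
class Barrier:
    name: str
    in_place: bool
    used: bool = False
    worked: bool = False
    required: bool = True


def classify(barrier: Barrier) -> Finding:
    """Map the facts gathered about one barrier to a barrier-analysis finding."""
    if barrier.in_place:
        if not barrier.used:
            return Finding.NOT_USED
        return Finding.PERFORMED if barrier.worked else Finding.FAILED
    return Finding.MISSING if barrier.required else Finding.NOT_REQUIRED


# Hypothetical barriers for a fall from a multi-story ladder.
barriers = [
    Barrier("guard rail", in_place=True, used=True, worked=True),
    Barrier("fall-arrest harness", in_place=True, used=False),
    Barrier("ladder cage", in_place=False, required=True),
    Barrier("interlock", in_place=True, used=True, worked=False),
]
findings = {b.name: classify(b) for b in barriers}

# Candidates to add or strengthen so a similar accident is prevented.
to_strengthen = [name for name, f in findings.items()
                 if f in (Finding.FAILED, Finding.NOT_USED, Finding.MISSING)]
print(to_strengthen)
```

Keeping “not used” separate from “failed” preserves the distinction drawn in the list above, which usually points to different corrective actions.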
Not all barriers are the same; they differ significantly in how well they perform. The following are some of the general characteristics of barriers that need to be considered when selecting barriers to control hazards. When evaluating the performance of a barrier after an accident, these characteristics also suggest how well we would expect the barrier to have performed to control the hazard.
Effectiveness – how well it meets its intended purpose
Availability – assurance the barrier will function when needed
Assessment – how easy to determine whether barrier will work as intended
Interpretation – extent to which the barrier depends on interpretation by humans to achieve its purpose
1.7.2 Categorization of Barriers
Barriers may also be categorized according to a hierarchy of cost/reliability and according to barrier function. The barrier cost/reliability hierarchy includes:
Physical or engineered barriers – These are the structures that are built, or sometimes naturally exist, to prevent the flow of energy or personnel access to the hazards. These barriers require an investment to design and build and have a cost to maintain and update. Examples: Personnel cage around a multi-story ladder, a guard rail on a platform, or a barricade to prevent access.
Administrative or management policy barriers – These include rules, procedures, policies, training, work plans that describe the requirements to avoid hazards. These barriers require less capital investment but have a cost in the development, review, updating, training, communication, and enforcement to assure adequacy and compliance. Examples: Requirement to use harness and strap ties while climbing a multi-story ladder, a prescriptive process procedure sequence, or laws against trespassing.
Personal knowledge or skill barriers – These include human performance aspects of: fundamental lessons-learned, knowledge, common sense, life experiences, and education that contribute to the individuals’ survival instincts and decision-making ability. These barriers require little or no investment except in the screening and selection process for qualified personnel used in a task and providing supervision. Examples: The decision not to climb a ladder with a tool in one hand, the decision not to violate one of the administrative barriers, or recognizing a dangerous situation.
Another analysis system divides barriers into four categories that reflect the nature of the barriers’ performance function. These four categories can be useful in the barrier analysis for characterizing more precisely the purpose of the barrier and its type of weakness. Examples for each of the four categories are as follows:
Physical – physically prevents an action from being carried out or an event from happening
Containing or protecting - walls, fences, railings, containers, tanks
Restraining or preventing movement - safety belts, harnesses, cages
Separating or protecting – crumple zones, scrubbers, filters
Functional – impedes actions through the use of pre-conditions
Prevent movement/action (hard) – locks, interlocks, equipment alignment
Prevent movement/action (soft) – passwords, entry codes, palm readers
Impede actions – delays, distance (too far for single person to reach)
Dissipate energy/extinguish – air bags, sprinklers
Symbolic – requires an act of interpretation in order to achieve its purpose
Countering/preventing actions – demarcations, signs, labels, warnings
Regulating actions – instructions, procedures, dialogues (pre-job brief)
System status indications – signals, warnings, alarms
Permission/authorization – permits, work orders
Incorporeal – requires interpretation of knowledge in order to achieve its purpose
Process – rules, restrictions, guidelines, laws, training
Comply/conform – self-restraint, ethical norms, morals, social or group pressure
Within DOE organizations, there is typically a defense-in-depth policy for reducing the risks of a system failure or an accident due to the threats. This policy maintains a multiple-layered barrier system against the threats or hazards and requires that any weaknesses or failures identified in a single layer be corrected. Therefore, an accident involving such a protected system requires either a uniquely improbable simultaneous failure of multiple barriers, or poor barrier concepts or implementation, or a period of neglect allowing cascading deterioration of the barriers.iii
Defense-in-depth can comprise layers of any combination of these types of barriers. Obviously, it is much more difficult to overcome multiple layers of physical or engineered barriers. This is the most reliable and most costly defense. Risk management analysis determines the basis and justification for the level of barrier reliability and investment, based on the probability and consequence of a hazard release scenario. For low probability, low consequence events the level of risk often does not justify the investment in physical barriers. Cost- and schedule-conscious management may influence selection of non-physical barriers on all but the most likely and catastrophically hazardous conditions. Such choices place greater reliance on layers of the less reliable barriers dependent on human behavior. Adding multiple barrier layers can appear to add more confidence, but multiple layers may also lead to complacency and diminish the ability to use and maintain the individual barrier layers. Complex barrier systems and barrier philosophies place heightened importance on the context of organizational culture. Human performance becomes a major concern in the prevention of accidents as barrier systems become more complex and the functionality of individual barrier layers becomes less apparent.
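The probability-and-consequence screening described above can be sketched as a simple risk matrix. This is a minimal illustration under stated assumptions: the three-level ordinal scale, the score thresholds, and the example scenarios are hypothetical and are not taken from any DOE directive.

```python
# Hypothetical three-level ordinal scale; not taken from a DOE directive.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_rank(probability: str, consequence: str) -> str:
    """Return a coarse risk rank from ordinal probability and consequence."""
    score = LEVELS[probability] * LEVELS[consequence]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# Hypothetical hazard release scenarios: (name, probability, consequence).
scenarios = [
    ("small solvent spill", "medium", "low"),
    ("glovebox breach", "low", "high"),
    ("tank rupture", "medium", "high"),
]

# Highest risk first, so barrier investment is discussed in priority order.
ranked = sorted(scenarios,
                key=lambda s: LEVELS[s[1]] * LEVELS[s[2]],
                reverse=True)
for name, prob, cons in ranked:
    print(f"{name}: {risk_rank(prob, cons)}")
```

A rank is only a starting point for the justification of barrier investment. Whether a physical, administrative, or personal knowledge barrier is chosen remains a risk management decision.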
A cascading effect can occur in aging facilities. Engineered barriers can become out-of-date, fall into disrepair or wear out, or be removed as part of demolition activity. Management should transition to reliance on a substitute administrative barrier, but this need may not be recognized.iv For example, a fire protection system that is temporarily or permanently disabled is replaced by a fire watch until the protection system is restored or replaced, or the potential fire threat is removed. Administrative barriers may weaken due to inadequate updates to rules, inadequate communication and training, and inadequate monitoring and enforcement. This results in management’s often unintentional reliance on the personal knowledge barriers. Personal knowledge barriers can be weakened by inadequate screening for qualifications, inadequate assignment selections, or inadequate supervision.
An alignment of cascading weaknesses in barriers can result in an unqualified worker unintentionally violating an administrative control and defeating a worn out physical barrier to initiate an accident. Effective management of any of the barriers would have prevented the accident by breaking the chain of events. Therefore, investigating a failure of defense-in-depth requires probing a series of management and individual decisions that form the precursors and chain of actions that lead to the final triggering action.
iii A common use of “defense-in-depth” is the Lockout-Tagout (LOTO) Procedure. This procedure administratively requires that a hazardous energy be isolated by a primary physical barrier (e.g., valve or switch), a secondary physical barrier (a lock) that controls inadvertent defeat of the primary barrier, and a tertiary administrative barrier (tagging) that controls the removal of the physical barriers. It is understood that omitting any one of these barriers is a violation of the LOTO procedure.
iv An example of a cascading effect, related to LOTO, is the discovery that some old facilities have used the out-of-date practice of common neutrals in old electrical systems or that facility circuit diagrams and labeling were not maintained accurately. These latent conditions potentially defeat LOTO entirely, requiring an additional administrative barrier procedure to do de-energized-circuit verification prior to accessing old wiring systems. Latent conditions are explained further in section 1.4.
1.8 Accident Types/ Individual and Systems
There are two fundamental types of accidents that DOE seeks to avoid: individual and system accidents. Confusion between individual and system safety has frequently been cited as a causal factor in major accidents.v In the ISMS framework, individual accidents are most often associated with failures at the level of the five core functions. System accidents involve failures at the principles level involving decision making, resource allocation and culture factors that may shift the focus and resources of the organization away from doing work safely to a detrimental focus on cost or schedule.
1.8.1 Individual Accidents
An individual accident is one in which the worker is not protected from the hazards of an operation and is injured (e.g., radiation exposure, trips, slips, falls, industrial accident, etc.). The focus of preventing individual accidents is to protect the worker from hazards inherent in mission operation (Figure 1-7). The inherent challenge in investigating an individual accident is that the source of the human error and the victim or target of the accident are often the same individual. This can lead to a limited or contained analysis that fails to consider the larger organizational or systemic contributors to the accident. Investigations of accidents involving individual injuries can focus too heavily on the mitigating barriers or personal protective equipment (PPE) that avoid injuries and not consider the appropriate preventative barriers to prevent the actual accident.
v Texas City, Buncefield, Deepwater Horizon
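[Figure 1-7 diagram, titled “Preventing Individual Accidents”: initiating sources or threats (human error, equipment failure) are separated from the hazards by a prevention barrier (e.g., LOTO policy); a mitigation barrier (e.g., PPE) separates the hazards from the target (the individual worker). Arrows show the undesired energy flow.]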
Figure 1-7: Individual Accident
1.8.2 Preventing Individual Accidents
To prevent recurrence of individual injury accidents, corrective actions from accident investigations must identify what barriers failed and why [i.e., stop the source and the flow of energy from the hazards to the target (the worker)]. The mitigating barriers are important to reducing or eliminating the harm or consequences of the accident, but emphasis must be on barriers to prevent the accident from occurring. However, it is possible to find conditions where the threat is deemed acceptable if the consequence can be adequately mitigated.vi
vi An example of reliance on a mitigating barrier would be in the meat cutting process where chain-mail gloves protect hands from being cut. The threat or initiating energy is the knife moving towards the hand or vice-versa. The hazard energy is the cutting action of the blade. Since the glove does not prevent the knife from impacting the hand, the glove is a mitigation barrier that reduces the hazardous cutting consequence of the impact. Implementing a prevention barrier would require redesigning the process to block or eliminate the need for the hand to be in the cutting area. The absence of the prevention barrier is the result of a bias in the organizational decision-making process discussed later in this handbook.
1.8.3 System Accident
A system accident is an accident wherein the protective and mitigating systems collectively fail, allowing release of the hazard and adversely affecting many people, the community and potentially the environment. A system accident can be characterized as an “unanticipated interaction of multiple failures in a complex system. This complexity can either be technological or organizational, and often is both.” [Perrow, 1984]5
The focus of preventing system accidents is to maintain the physical integrity of operational barriers so that they prevent threats that may result from human error, malfunctions in equipment or operational processes, facility malfunctions, or natural disasters, or so that they mitigate the consequences of the event if prevention fails (Figure 1-8).
System hazards are typically managed from cradle to grave through risk management. Risk management processes identify the potential threats, weaknesses, and failures as risks to the design, construction, operations, maintenance, and disposition of the system. Risk management establishes and records the risk parameters (or basis) and the investment decisions, the control systems, and policies to mitigate these risks. Risk management, in a broad organizational sense, can include financial, political, cultural, and social risks. While not excluding the broader societal factors, the principal focus of this handbook is on socio-technical systems and related life-cycle management (design, build, operate, maintain, dispose) system risks.
It is important to recognize the distinction between individual accidents and system accidents, as it affects the way the accident is investigated, in particular the way the barriers are analyzed. The most likely differentiator, based on experience, is that individual accidents are likely to be influenced by work practices, plans and oversight, while system failures will most likely be influenced by the risk management process for design, operations, or maintenance. System accidents require a more in-depth investigation into the policies and management culture that drive risk management decision-making. Naturally, there is often an overlap that combines individual work hazard control practices and the system risk management policies as potential areas of investigation.
[Figure 1-8 text. System Accident: An accident wherein the system fails, allowing a threat to release the hazard, and as a result many* people are adversely affected. (* Workers, Enterprise, Environment, Country.) Focus: Protect the operations from the threats. The emphasis on the system accident in no way degrades the importance of individual safety; it is a prerequisite of system safety, but focus on individual safety is not enough.]
Figure 1-8: System Accident
1.8.4 How System Accidents Occur
In order to prevent system accidents and incidents, it is important to first understand (via a mental model) how they occur. Figure 1-9 represents a simple schematic of how system accidents (accidents with large consequences affecting many people) can occur.
As defined in this figure a threat can come from four sources:
Human error such as someone dropping high explosives resulting in detonation.
Failure of a piece of equipment, tooling or facility. For example, a piece of tooling with faulty bolts causes high explosive to drop on the floor, resulting in detonation.
A natural disaster such as an earthquake resulting in falling debris that could detonate high explosives.
“Other” threats, as yet undiscovered, included to accommodate future discoveries.
Based on this simplistic system accident scenario it is clear technical system integrity must be protected from deterioration from physical and human/social factors.
[Figure 1-9 diagram, titled “How System Accidents Happen (Consider all Threats)”: threats* (equipment/tooling/facilities, human error, natural disasters, other) drive an unwanted energy flow to the hazards (the hazard to protect and to minimize), leading to the consequence or system accident (the system accident to avoid). Unwanted energy flows as a result of the threat to a plant hazard, potentially resulting in a catastrophic consequence.]
* Categories of threats adapted from MORT, DOE G 231.1‐2 and TapRoot
Figure 1-9: How System Accidents Happen
1.8.5 Preventing System Accidents
Figure 1-10 provides a simplistic view of how to prevent a system accident. Hazards can be energy in the form of leaks, projectiles, explosions, venting, radiation, collapses, or other ways that produce harm to the work force, the surrounding community, or the environment. The idea is that one wants to isolate these hazards from those things that would threaten to release the unwanted energy or material, such as human errors, faulty equipment, sabotage, or natural disasters such as wind and lightning through the use of preventive barriers. If this is done, work can proceed safely (accidents are avoided).
DOE takes a system approach (ISMS) to preventing system accidents. The system is predicated on identifying hazards to protect, identifying threats to those hazards, implementing controls (barriers) to protect the hazard from the threats, and reliably performing work within the established safety envelope.
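[Figure 1-10 diagram: threats (equipment, tooling, or facility malfunctions; human errors; natural disasters) are separated from the hazard by a barrier (to prevent); a barrier (to mitigate) stands between the energy flow from the hazard and the consequence.]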
Figure 1-10: Prevent a System Accident
1.9 Diagnosing and Preventing Organizational Drift
Recognizing the hazards or risks and establishing and maintaining the barriers against accidents are continuous demands on organizations at all levels. Work, organizations, and human activity are dynamic, not static. This means conditions are always changing, even if only through aging, resource turnover, or creeping complacency to routine. Similar to the Second Law of Thermodynamics (the idea that everything in the created order tends to dissipate rather than to coalesce), organizations left untended trend in the direction of disorder. In the safety literature this phenomenon is referred to as organizational drift. Organizational drift, if not halted, will lead to weakened or missing barriers.
In order to recognize, diagnose and hopefully to prevent organizational drift from established safety systems (ISMS), models (mental pictures) are needed. Properly built models help investigators recognize aberrations by providing an accepted reference to compare against (i.e., a mental picture of how the organization is supposed to work). Models in combination with an understanding of organizational behavior also allow investigators to extrapolate individual events to a broader organizational perspective to determine if the problem is pervasive throughout the organization (deeper organizational issues).
Three levels of models are introduced in this section to aid investigators in putting their event into perspective.
Level I at the employee level,
Level II at the physics level - Break-the-Chain Framework (BTC),
Level III at the organization or system level.
1.9.1 Level I: Employee Level Model for Examining Organizational Drift -- Monitoring the Gap – “Work-as-Planned” vs. “Work-as-Done”
The Employee Level Model provides the most detailed examination of organizational drift by comparing “work-as-done” on the shop floor with how work was planned by management and process designers. At this level, the effect of organizational drift could result in an undesirable event because this is where the employees contact the hazards while performing work.
DOE organizations develop policies, procedures, training, etc. to provide a management system envelope of safety within which they want their people to work. This safety envelope is developed through the ISMS “Define the Scope of Work, Analyze the Hazards, and Develop and Implement Hazard Controls” and can be referred to as “work-as-planned.” The way work is actually accomplished under ISMS “Perform Work within Controls,” referred to as “work-as-done,” can be compared to the “work-as-planned.” Every organization’s goal is to have “work-as-done” equal “work-as-planned” (i.e., actual work performed within the established safety envelope – left side of Figure 1-11).
There will always be a work performance gap (ΔWg) between “work-as-planned” and “work-as-done” because of the variability in the execution of every human activity (right side of Figure 1-11). When the ΔWg becomes a problem because an accident or an information-rich, high-consequence, or recurring event occurs, a systematic investigative process helps to understand first “what” the variation is and second “why” the variation exists. Figure 1-11 illustrates the comparison of the ideal or desirable organizational work performance goal on the left side, with the more likely or realistic work performance gap on the right. Recognizing and reducing the gap is the objective of “Provide Feedback and Continuous Improvement” activities.
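The “what” half of a gap analysis can be made concrete by comparing the planned sequence of steps with the sequence actually performed. The following is a minimal sketch under stated assumptions: the handbook does not define ΔWg computationally, the `work_gap` function is hypothetical, the step names are invented, and each step name is assumed to appear only once in a sequence.

```python
def work_gap(planned, done):
    """Describe how work-as-done departs from work-as-planned.

    Assumes step names are unique within each sequence.
    """
    skipped = [s for s in planned if s not in done]
    added = [s for s in done if s not in planned]
    # Compare the relative order of the steps present in both sequences.
    common_planned = [s for s in planned if s in done]
    common_done = [s for s in done if s in planned]
    return {
        "skipped": skipped,
        "added": added,
        "reordered": common_planned != common_done,
    }


# Hypothetical lockout-tagout task.
planned = ["isolate energy", "apply lock", "apply tag",
           "verify zero energy", "perform maintenance"]
done = ["isolate energy", "apply tag", "perform maintenance",
        "borrow tool from adjacent bay"]

gap = work_gap(planned, done)
print(gap)
```

The output describes only what varied. Determining why each step was skipped or added still requires the investigative techniques described in this handbook.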
Within this handbook, the term “physics of safety” is used to represent the science and engineering principles and methods used to assure the barriers designed into the systems are effective against the nature of the threats and hazards. Only with a sound “physics of safety” basis behind the purpose of the barriers can management truly rely on a “work-as-planned” safety performance envelope. A typical gap analysis must explore weaknesses in both the “work-as-planned” and the “work-as-done.”
Because the “work-as-planned” truly represents the requisite safety/security/quality process that management wants their employees to follow, the investigative process reduces the gap ΔWg by systematically addressing the broadest picture of what went wrong, and focuses the Judgments of Need and Corrective Actions to reduce the gap.
[Figure 1-11 diagram, titled “Systematically Evaluate”: the left panel, “Organizational Goal” (where we want to be), shows work-as-planned and work-as-done coinciding; the right panel, “Organizational Reality” (where we probably are), shows work-as-done diverging from work-as-planned by ΔWg, with the questions “what” and “why.” Goal: Align, tighten, and sustain spectrum of performance to keep work-as-planned the same as work-as-done. ΔWg = gap in “work-as-done” vs. “as planned.”]
Figure 1-11: Level I - “Work-as-Done” Varies from “Work-as-Planned” at Employee Level
1.9.2 Level II: Mid-Level Model for Examining Organizational Drift – Break-the-Chain
The Mid-Level Model for examining organizational drift focuses on the Break-the-Chain (BTC) framework. Based on the simplistic representation displayed in Figure 1-12, the BTC framework provides a broader, more complete model to help organizations avoid the threat potential of catastrophic events posed by the significant hazards, dynamic tasks, time constraints, and complex technologies that are integral to ongoing missions. When an event does occur, it also provides a logical and systematic framework to diagnose the event and determine which step in the process broke down, so that corrective actions can focus only on the areas found deficient. The BTC model is designed to stop the system accident as shown in Figure 1-10 but can be applied equally to individual accidents. The BTC model is a logical, physics-based application of the ISM core functions. The six basic components of the BTC model are:
Step #1 – Focus on the System Accident (Pinnacle/Plateau Event) to Avoid: The first step focuses on the last link of the chain, the consequences of the system accident that the organization is trying to prevent. Once the catastrophic consequences have been identified, they should be listed in priority order. This prioritization is important for four reasons:
It serves as an important reminder to all employees of the potential catastrophic consequences they must strive to avoid each day.
It pinpoints where defensive barriers are most needed; as one would expect, the probability of an event and the severity of the consequences will drive the number and type of barriers selected.
It ensures that the defensive barriers associated with the highest priority consequences will receive top protection against degradation.
It encourages a constant review of resources against consequences focusing attention on making sure the most severe consequences are avoided at all times.
Prioritization is a critical organizational dynamic. Efforts to protect against catastrophic consequential events should be the first priority. Focus must be maintained on the priority system accidents to assure that the needed attention and resources are available to prevent them.
Step #2 – Recognize and Minimize Hazard: Identify and minimize the physical hazard, while maintaining production. After identifying the hazard, there are two approaches to minimize it. First, actions are taken to reduce the physical hazard that can be impacted by the threat (for example minimizing the amount of combustible material in facilities). Second, attempts are made to reduce the interactive complexity and tight coupling within the operation or, conversely, to increase the response time of the organization so an event can be recognized and responded to more quickly. The intent of these two approaches is to remove or reduce the hazard so that the consequences of an accident are minimized to the extent possible.
Step #3 – Recognize Threat Posed by Human Errors, Failed Equipment, Tooling or Facilities, Mother Nature (i.e., natural disasters) or Other as of yet Unknown Things: A key component of consequence avoidance is identifying and minimizing all significant knowable threats that could challenge the hazard (i.e., allow the flow of unwanted energy). Note the use of the word “all.” The intent is that if not all threats are identified and addressed, the organization is vulnerable to failure. Organizations should ensure the system event does not occur, not hope it does not occur (i.e., they prove operations safe). The categories of threats from human error and failed equipment, tooling or facilities, and natural disasters have been adapted from a combination of MORT, DOE Guide (G) 231.1-1 and TapRoot®.
Step #4 – Manage Defenses: Based on the threats identified, one must ensure the right barriers are identified to prevent or to reduce the probability of the flow of energy to the hazard (red, blue, brown, and purple barriers in Figure 1-12) or, if that fails, to mitigate the consequences of a system accident (shown by granite encasement around system event box in Figure 1-12). The type and number of barriers and the level of effort needed to protect them are dictated by the level of consequence and type of hazard associated with the operation. The decrease in the number of threats or probability of occurrence as a result of the application of various barriers or defenses is indicated in Figure 1-12 by the reduction in the number of colored arrows that can reach the hazard.
Step #5 – Foster a Culture of Reliability: Steps 1 through 4 make the operational hazard less vulnerable to threats. Executing these steps successfully and consistently, without observable signs of degradation or significant events, requires an army of trained and experienced personnel who conscientiously follow proven work practices. These workers must maintain their proficiency through continuous hands-on work and be trained so they can make judgment calls on the shop floor that reflect the shared organizational values. They also need the authority to make time-critical decisions when situations require this action. They must be part of an organization that has a strong culture of reliability.
Step #6 – Learn from Small Errors to Prevent Big Ones: Gaps between “work-as-planned” by the process designer and “work-as-done” by the employees exist in every operation and reflect the challenges an organization will face sustaining the BTC framework (Figure 1-12). The fact that these gaps exist should be no surprise; they exist in every organization. The problem occurs when the organization is unaware of the gaps or does not know the magnitude or extent of the gaps across the operation. Because of the importance of DOE sites remaining within the established safety basis (ISMS), the investigation process as described in this document places special emphasis on evaluating and closing the gap between “work-as-planned” and “work-as-done.”
BTC parallels and complements the ISMS functions. The level of formality or rigor with which the six process components (or process steps) are applied is proportional to the complexity and consequences of the operations (e.g., for nuclear operations where the potential consequences are severe, the full rigor of 10 Code of Federal Regulations (CFR) Part 830, Nuclear Safety Management, is employed). Detailed application of this process can be found in Volume II, Chapter 1.
[Figure 1-12 graphic not reproduced. Title: Break-the-Chain Framework to Prevent System Accidents. The figure shows threats (human error and human performance error precursors; equipment, tooling, and facilities; natural disasters; other) directed toward the hazard to protect and to minimize, and the system accident to avoid. The six steps are labeled: Step #1 Focus on the System Accident; Step #2 Recognize & Minimize Hazard; Step #3 Recognize Threats; Step #4 Manage Defenses; Step #5 Foster a Culture of Reliability; Step #6 Learn from Small Errors.]
Figure 1-12: Level II - Physics-Based Break-the-Chain Framework
1.9.3 Level III: High Level Model for Examining Organizational Drift
The High Level Model for examining organizational drift, shown in Figure 1-13, was adapted from work by the Institute of Nuclear Power Operations (INPO) [vii]. It is intended to represent a systematic view for analysis of both individual and system accidents. The model breaks down work into four sequences that one typically finds at DOE sites: 1) Organizational Processes & Values; 2) Job Site Conditions (“work-as-planned”); 3) Worker Behaviors (“work-as-done”); 4) Operational Results. An explanation of each category of work can be found in Figure 1-13.
Also shown in Figure 1-13 are the quality assurance checks (green ovals) that take place before transitioning from one sequence of work to the next. These process check points are additional examples of barriers put in place to ensure readiness to go to the next sequence of work. DOE uses many similar quality assurance readiness steps in both its high hazard nuclear operations and industrial operations.
vii INPO Human Performance Reference Manual, INPO-06-003.
The later in the work sequence the process barriers fall, noted by higher highlighted grey numbers, the more significant or important the barrier is in preventing the undesired event because it represents one of the last remaining barriers before a consequential event.
[Figure 1-13 graphic not reproduced. Title: A Systems View of Operating Performance. Modified from INPO Human Performance Reference Manual, INPO 06-003, 2006. The figure defines the four work sequences:
1. Organizational Processes & Values: Programs and processes to focus the organization to accomplish operational goals while avoiding the consequential accident.
2. Job-Site Conditions (“work as planned”): Products or results of processes (physical barriers) that create the right conditions for the worker to successfully and safely accomplish tasks (e.g., engineered barriers, safety systems, procedures, tools, readiness, etc.).
3. Worker Behavior (“work as done”): Actions or inactions (i.e., using or not using products of processes) by an individual worker during the performance of a task (procedure adherence, protect barriers, etc.).
4. Operational Results: Outcomes to the plant as a result of the worker’s behavior (e.g., events, TSR violations, unplanned LCO events, etc.).
Check points shown between the sequences include: Review Work, Qualifications, Assignments, Authorization, Readiness, Pre-job Brief, Job Site Walk Down, QC Hold Points, Independent Verification, Mgt. Oversight, Independent Oversight, Post-job Reviews, and Causal Factor Analysis. Values shown include: Leadership, High Standards, Courage & Integrity, Questioning Attitude, Healthy Relationships, and Open & Honest Communications.]
Figure 1-13: Level III - High-Level Model for Examining Organizational Drift
1.10 Design of Accident Investigations
The organizational basis for the causes of accidents requires the accident investigators to develop insights about organizational behavior, mental models and the factors that shape the environment in which the incident occurred. This develops a better understanding of “what” in the organizational system failed and “why” the organization allowed itself to degrade to the state that resulted in an undesired consequence. The investigation progresses through the events in the reverse of the order in which they occurred, as shown schematically in Figure 1-14.
[Figure 1-14 graphic not reproduced. Title: Investigations to Determine Organizational Weaknesses. Adapted from Reason, Managing the Risks of Organizational Accidents. The figure shows the event investigation working back from the event through failed defenses/barriers, unsafe acts (active failures), local workplace factors, and organizational factors (latent conditions), each with its precursors. Causal Factors Analysis starts with the low consequence, information-rich event and separates “What” happened from “Why” it happened. This allows us to drill down to find the:
1. Flawed defenses
2. Active failures (unsafe acts)
3. Human performance error precursors
4. Latent conditions (local workplace factors & organizational factors).]
Figure 1-14: Factors Contributing to Organizational Drift
1.10.1 Primary Focus – Determine “What” Happened and “Why” It Happened
The basic steps and processes used for the Accident Investigation are:
Define the Scope of the Investigation and Select the Review Team
Collect the Evidence
Investigate “what happened”
Analyze “why it happened”
Define and Report the Judgments of Need and Corrective Actions
The purpose of an accident investigation is to determine:
“What” went wrong, beginning with a comparison of “work-as-done” to “work-as-planned.” The purpose is to understand what was done, how it was planned, and to identify unanticipated or unforeseeable changes that may have intervened. Establishing “what” was done tends to result in a forward progression of the sequence of events that defines what barriers failed and how they failed.
“Why” things did not work according to plan. This comes from a cultural-based assessment of the organization to understand why the employees thought it was OK to do what they did at the time in question. Establishing the “why” tends to be a backwards regression identifying the assumptions, motives, impetus, changes and inertia within the organization that may reveal weaknesses and inadequacies of the barriers, barrier selection, and maintenance processes. The objective is to understand the latent organizational weaknesses and cultural factors that shaped unacceptable outcomes.
Investigative tools provided in this handbook are designed to determine the “what” and the “why.” These investigative tools allow investigation teams to systematically explore what failed in the systems used to ensure safety. Rooting out the deeper organizational issues reduces degradation of any system modification put in place.
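As a rough illustration of the “what” comparison described in this section, the sketch below lines up a planned sequence of steps against the steps actually performed and reports where they differ. It is not a tool from this Handbook: the function name `compare_work` and the step lists are hypothetical, and a real investigation relies on collected evidence and team judgment rather than text matching. It only shows the idea of making the gap between “work-as-planned” and “work-as-done” explicit.

```python
from difflib import SequenceMatcher


def compare_work(planned, done):
    """List where the performed steps differ from the planned steps.

    Each entry records the kind of difference ("delete" = planned step not
    performed, "insert" = step performed that was not planned, "replace" =
    a different step performed) and the steps involved.
    """
    gaps = []
    matcher = SequenceMatcher(a=planned, b=done, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # planned and performed steps agree here
        gaps.append({"kind": tag, "planned": planned[i1:i2], "done": done[j1:j2]})
    return gaps


# Hypothetical step lists, invented for illustration only.
planned = ["isolate energy source", "verify zero energy", "open panel", "replace fuse"]
done = ["isolate energy source", "open panel", "replace fuse", "restore power"]

gaps = compare_work(planned, done)
for gap in gaps:
    print(gap["kind"], gap["planned"], gap["done"])
```

In this invented example the comparison flags one planned step that was skipped and one performed step that was not in the plan. Each such gap is only a starting point for asking “why” it made sense to the workers at the time.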
1.10.2 Determine Deeper Organizational Factors
Having determined “what” went wrong, the investigation team must attempt to use the theory introduced in Chapter 1 to understand how extensive the issues discovered in the investigation are throughout the organization, how long they have been undetected and uncorrected, and why the culture of the organization allowed this to occur. To answer these questions, the team needs to determine the extent of conditions and causes, attempt to identify the Latent Organizational Weaknesses (those management decisions made in the past that are now starting to set employees up for errors) and attempt to identify underlying cultural issues that may have contributed to these.
A learning organization must determine “what” did not work by performing a compliance-based assessment and understand “why” the organization was allowed to get to this stage by performing a cultural-based assessment. In the Federally-led accident investigation, the compliance-based assessment is driven by DOE O 225.1B, which requires the team to investigate policies, standards, and requirements that were applicable to the accident being investigated and to investigate the safety management system that was to be in place to institutionalize the resulting work practices to allow safe work (DOE Policy (P) 450.4A, Safety Management System Policy). This is accomplished by reviewing work against the ISM Core Functions. The cultural-based assessment is accomplished by examining three principal culture-shaping factors (leadership, employee engagement, organizational learning), which are developed from the ISM Principles.
This output of the deeper organizational issues is much more subjective than previous sections because it is based on the team assimilating information and making educated judgments as to possible underlying organizational causes. The following sections are provided to frame the deeper organizational part of the investigation and the results should be used in conjunction with the organizational mental model introduced earlier (Figure 1-13: Level III - High-Level Model for Examining Organizational Drift).
1.10.3 Extent of Conditions and Cause
The team should determine how long conditions have existed without detection (hints that the organization’s assessment and oversight processes are not very effective) and how extensive the conditions are throughout the organization (hints which point to deeper management system issues, indicating a higher level corrective action needed).
As part of this effort, the team should also capture the missed opportunities to catch this event in its early stages such that the event being investigated would not have occurred. A learning organization should make every attempt to learn from previous mishaps or near misses (including external lessons learned) and have sufficiently robust processes to detect early when things are going wrong.
1.10.4 Latent Organizational Weaknesses
Latent organizational weaknesses are hidden deficiencies in management control processes (for example, strategy, policies, work control, training, and resource allocation) or values (shared beliefs, attitudes, norms, and assumptions) that create workplace conditions that can provoke error (i.e., precursors) and degrade the integrity of defenses (flawed defenses). [Reason, pp. 10-18, 1997]14
Table 1-1 is a guide to help identify latent organizational weaknesses: those factors in the management control processes or associated values that influence errors or degrade defenses. Consider work practices, resources, documentation, housekeeping, industrial safety, management effectiveness, material availability, oversight, program controls, radiation employee practices, security work practices, tools and equipment use, training and qualification, work planning and execution, and work scheduling. For an expanded list of examples, see Attachment 1, ISM Crosswalk and Safety Culture Lines of Inquiry.
Table 1-1: Common Organizational Weaknesses
Category / Weakness

Training
Effectiveness of training on task qualification requirement for skill-based tasks.
Focus is on lower level of cognitive knowledge.
Failure to involve management in training.
Training is inconsistent with company equipment, procedures, or process.

Communication
Reinforcement of use of the phonetic alphabet in critical steps to preclude misunderstanding of instructions.
Failure to reinforce use of 3-way communications.
Failure to use specific unit ID numbers in procedures.
Unclear priorities or expectations.
Unclear roles and responsibilities.

Planning and Scheduling
Provision for contingencies for failures.
Failure to consider that multiple components may be out of service.
Failure to provide required materials or procedures.
Over scheduling of resources.
Failure to consider incorrect operation or damage to adjacent equipment.
Specific type of work not performed.
Specific type of issue not addressed.
Inadequate resources assigned.

Design or Process Change
Involvement of users in design change implementation.
Inadequate training.
Inadequate contingencies in case a procedure goes wrong.

Values, Priorities, Policies
Management policies on line input into adequacy of procedures or safety features.
Too high a priority is placed on schedules.
Willingness to accept degraded conditions or performance.
Management failure to recognize the need for or importance of related program.

Procedure Development or Use
Consideration of human factors in procedural development and implementation.
Failure to perform procedural verification or validation.
Failure to reference procedure during task performance.
Assumptions made in lieu of procedural guidance.
Omission of necessary functions in procedures.

Supervisory Involvement
Performance of management observations and coaching.
Failure to correct poor performance or reinforce good performance.
Unassigned or fragmented responsibility and accountability.
Inadequate program oversight.

Organizational Interfaces
Interfaces for defining work priorities.
Lack of clear lines of communications between organizations.
Conflicting goals or requirements between programs.
Lack of self-assessment monitoring.
Lack of measurement tools for monitoring program performance.
Lack of interface between programs.

Work Practices
Reinforcement of the use of established error prevention tools and techniques (human performance tools).
1.10.5 Organizational Culture
Insights about safety culture may be inferred by considering aspects of leadership, employee engagement and organizational learning. Observations about culture should be captured, reviewed, and summarized to distill indicators of the most significant culture observations. These are phrased as positive culture challenges in the report.
An organization’s culture, if not properly aligned with safety requirements, could result in ignored safety requirements. A healthy culture exists when the “work-as-done” (culture artifacts and behavior) overlaps the “work-as-planned” (espoused beliefs and values), indicating an alignment with the underlying assumptions (those factors felt important to management). A misalignment between actual safety behavior and espoused safety beliefs indicates an unhealthy culture, one in which the employees are not buying into the established safety system, or one in which the true underlying assumptions of management are focused on something besides safety (Figure 1-15).
[Figure 1-15 graphic not reproduced. Source: Schein, Organizational Culture and Leadership, 2004. The figure relates Work-as-Imagined (Espoused Beliefs and Values) and Work-as-Done (Artifacts and Behaviors) to the Underlying Assumptions that lie below the surface. Underlying assumptions must be understood to properly interpret artifacts and to create change. Misalignment hints at deeper underlying assumptions keeping the organization from attaining its desired balance between production and safety.]
Figure 1-15: Assessing Organizational Culture
Safety culture factors offer important insights about event causation and prevention. Although in-depth safety culture evaluations are beyond the practical scope of most accident investigations, the DOE (with the help of the Energy Facility Contractors Group, EFCOG) has determined that examining three principal culture-shaping factors (leadership, employee engagement, organizational learning) will help to identify cultural issues that contributed to the event. These factors were developed from the ISM Principles by the EFCOG Safety Culture Working Group in 2007.
Leadership
Leadership and culture are two sides of the same coin; neither can be realized without the other. Leaders create and manage the safety culture in their organizations by maintaining safety as a priority, communicating their safety expectations to the workers, setting the standard for safety through actions not talk (walk the talk), leading needed change by defining the current state, establishing a vision, developing a plan, and implementing the plan effectively. Leaders cultivate trust to engender active participation in safety and to establish feedback on the effectiveness of their organization’s safety efforts.
Leaders assure plans integrate safety into all aspects of an organization’s activities considering the consequences of operational decisions for the entire life-cycle of operations and the safety impact on business processes, the organization, the public, and the environment.
Leaders understand their business and ensure the systems employed provide the requisite safety by identifying and minimizing hazards, proving the activity is safe, and not assuming it is safe before operations commence.
Leaders consider safety implications in the change management processes.
Leaders model, coach, mentor, and reinforce their expectations and behaviors to improve safe business performance.
Leaders value employee involvement, encourage individual questioning attitude, and instill trust to encourage raising issues without fear of retribution.
Leaders assure employees are trained, experienced and have the resources, the time, and the tools to complete their job safely.
Leaders hold personnel accountable for meeting standards and expectations to fulfill safety responsibilities.
Leaders insist on conservative decision making with respect to the proven safety system and recognize that production goals, if not properly considered and clearly communicated, can send mixed signals on the importance of safety.
Leaders recognize that humans make mistakes and take actions to mitigate this.
Leaders develop healthy, collaborative relationships within their own organization and between their organization and regulators, suppliers, customers and contractors.
Employee/Worker Engagement
Safety is everyone’s responsibility. As such, employees understand and embrace the organization’s safety behaviors, beliefs, and underlying assumptions. Employees understand and embrace their responsibilities, maintain their proficiency so that they speak from experience, challenge what is not right, help fix what is wrong, and police the system to ensure that they, their co-workers, the environment, and the public remain safe.
Individuals team with leaders to commit to safety, to understand safety expectations, and to meet expectations.
Individuals work with leaders to increase the level of trust and cooperation by holding each other accountable for their actions with success evident by the openness to raise and resolve issues in a timely fashion.
Everyone is personally responsible and accountable for safety; they learn their jobs, they know the safety systems and they actively engage in protecting themselves, their co-workers, the public and the environment.
Individuals develop healthy skepticism and constructively question deviations from the established safety system and actively work to avoid complacency or arrogance based on past successes.
Individuals make conservative decisions with regard to the proven safety system and consider the consequences of their decisions for the entire life-cycle of operations.
Individuals openly and promptly report errors and incidents and don’t rest until problems are fully resolved and solutions proven sustainable.
Individuals instill a high level of trust by treating each other with dignity and respect and avoiding harassment, intimidation, retaliation, and discrimination. Individuals welcome and consider a diversity of thought and opposing views.
Individuals help develop healthy collaborative relationships within their organization and between their organization and regulators, suppliers, customers and contractors.
Organizational Learning
The organization learns how to positively influence the desired behaviors, beliefs and assumptions of its healthy safety culture. The organization acknowledges that errors are a way to learn by rewarding those who report, sharing what is wrong, fixing what is broken and addressing the organizational setup factors that led to employee error. This requires focusing on reducing recurrences by correcting deeper, more systemic causal factors and systematically monitoring performance and interpreting results to generate decision-making information on the health of the system.
The organization establishes and cultivates a high level of trust; individuals are comfortable raising, discussing and resolving questions or concerns.
The organization provides various methods to raise safety issues without fear of retribution, harassment, intimidation, retaliation, or discrimination.
Leaders reward learning from minor problems to avoid more significant events.
Leaders promptly review, prioritize, and resolve problems, track long-term sustainability of solutions, and communicate results back to employees.
The organization avoids complacency by cultivating a continuous learning/improvement environment with the attitude that “it can happen here.”
Leaders systematically evaluate organizational performance using: workplace observations, employee discussions, issue reporting, performance indicators, trend analysis, incident investigations, benchmarking, assessments, and independent reviews.
The organization values learning from operational experience from both inside and outside the organization.
The organization willingly and openly engages in organizational learning activities.
1.11 Experiential Lessons for Successful Event Analysis
A fundamental shortcoming of some investigative techniques is that they do not address where the physics could fail, based on perceptions of improbability due to lack of recent evidence (it has happened before). People, equipment, and facilities only get hurt or damaged when energy flows to where it does not belong. Investigations must determine where the physics could fail in order to prevent potential bad consequences.
“System Optimism” is the belief that systems are well designed and well maintained, procedures are complete and correct, designers can foresee and anticipate every situation, and that people behave as they are expected to or as they were taught. This is the “work-as-imagined” by the organizational management culture. In this view, people are a liability and deviation from the “work-as-imagined” is seen as a threat to safety that needs to be eliminated. In other words, this is the perception that errors are caused by the individuals who made them; correct or remove the errant individual and the problem is fixed.
“System Reality” is the belief that things go right because people learn to overcome design flaws and functional glitches, adapt their performance to meet demands, interpret and apply procedures to match conditions, and can detect and correct when things go wrong. In this view, people are an asset and the deviation from the “work-as-imagined” is seen as how workers have to adapt to successfully complete the work within the time and resources constraints that exist for that task. In other words, if the worker is adapting incorrectly, the fault is in the conditions and methods available to adapt.
Rather than simply judging a decision as wrong in retrospect, the decision needs to be evaluated in the context of contributing factors that explain why the decision was made. If the investigation stops with the worker’s deviation as the cause, nothing is corrected. The next worker, working in the same context, will eventually adapt in a similar fashion and deviate from “work-as-imagined.” Performance variability is not limited to just the worker who triggers the accident. People are involved in all aspects of the work, and variation in the actions of co-workers, the expectations of leaders, the accuracy of procedures, the effectiveness of defenses and barriers, or even the basic policies of the organization can influence an outcome. This is reflected in the complex, non-linear accident model where unexpected combinations of normal variability can result in the accident. Failure to follow up with lessons-to-be-learned and validations of corrective actions and Judgments of Need can certainly lead to a recurrence of an event.
CHAPTER 2. THE ACCIDENT INVESTIGATION PROCESS
2.1 Establishing the Federally Led Accident Investigation Board and Its Authority
2.1.1 Accident Investigations’ Appointing Official
Section 2.1 primarily deals with the DOE Federal responsibilities under DOE O 225.1B. Upon notification of an accident requiring a DOE Federal investigation, the Appointing Official selects the AIB Chairperson. The Appointing Official, with the assistance of the Board Chairperson, selects three to six other Board members, one of whom must be a trained DOE accident investigator. All of the AIB members are DOE Federal employees. To minimize conflict-of-interest influences, the Chairperson and the accident investigator must be from a different duty station than the accident location. The Appointing Official for a Federal accident investigation is the Head of Program Element, unless this responsibility is delegated to the Chief Health, Safety and Security Officer (HS-1). The roles and responsibilities of the Appointing Official for Accident Investigations, the Heads of Program Elements for Accident Investigations, and the Heads of Field Elements for establishing and supporting AIBs are defined in Table 2-1.
Table 2-1: DOE Federal Officials and Board Member Responsibilities
Participants / Major Responsibilities

Appointing Official for Accident Investigations
Formally appoints the Accident Investigation Board in writing within three days of accident categorization
Establishes the scope of the Board’s authority, including the review of management systems, policy, and line management oversight processes as possible causal factors
Briefs Board members within three days of their appointment
Ensures that notification is made to other agencies, if required by memoranda of understanding, law, or regulation
Emphasizes the Board’s authority to investigate the causal roles of organizations, management systems, and line management oversight up to and beyond the level of the appointing official
Accepts the investigation report and the Board’s findings
Publishes and distributes the respective investigation report within seven calendar days of report acceptance
Develops lessons learned for dissemination throughout the Department or the organization for the OSRs
Closes the investigation after the actions in DOE O 225.1B, Paragraph 4d, are completed

Heads of Program Elements for Accident Investigations
Serve as Appointing Official for Federal accident investigations for programs, offices and facilities under their authority
Maintain a staff of trained and qualified personnel to serve in the capacity of Chairperson and DOE Accident Investigators for AIBs and, upon request, provide them to support other AIBs
Ensure that DOE and contractor organizations are prepared to effectively accomplish initial investigative actions and assist Accident Investigation Boards
Categorize the accident investigation in accordance with the criteria provided in Attachment 2 of DOE O 225.1B
Report accident categorization and initial actions taken by DOE site teams to the Office of Corporate Safety Programs (HS-23)
Serve as the appointing official for Federal accident investigations
Ensure that readiness teams and emergency management personnel coordinate their activities to facilitate an orderly transition of responsibilities for the accident scene
Develop lessons learned for Federal accident investigation
Require submittal of corrective action plans to address the Judgments of Need, approve the implementation of those plans, and track the effective implementation of those plans to closure
Distribute accident investigation reports to all Heads of Field Elements under their cognizance and direct that extent-of-condition reviews be conducted for issues identified during accident investigations that are applicable to work locations and operations

Heads of Field Elements for Accident Investigations
Maintain a state of readiness to conduct investigations throughout the field element, their operational facilities, and the DOE site teams
Ensure that sufficient numbers of site DOE and contractor staff understand and are trained to conduct or support investigations
Procure appropriate equipment to support investigations
Maintain a current site list of DOE and contractor staff trained in conducting or supporting investigations
Assist in coordinating investigation activities with accident mitigation measures taken by emergency response personnel
Communicate and transfer information on accidents to the head of the Headquarters program elements to whom they report
Communicate and transfer information to the Accident Investigation Board Chairperson before and after his/her arrival on site
Coordinate corrective action planning and follow-up with the head of the Headquarters program element and coordinate comment resolution by reviewing parties
Facilitate distribution of lessons learned identified from accident investigations
Serve as liaison to the HSS AI Program Manager on accident investigation matters
Develop or provide assistance in developing lessons learned for accident investigations
Require the submittal of contractor corrective action plans to address the Judgments of Need, approve the implementation of those plans, and track the effective implementation of those plans to closure
Conduct extent-of-condition reviews for specific issues resulting from accident investigations that might be applicable to work locations or activities under the Heads of Field Elements’ authority, and address applicable lessons learned from investigations conducted at other DOE sites
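The Board composition rules stated at the start of Section 2.1.1 (three to six members besides the Chairperson, at least one trained DOE accident investigator among them, all members DOE Federal employees, and the Chairperson and accident investigator drawn from a duty station other than the accident location) can be expressed as a simple checklist. The sketch below is illustrative only: the `Member` record, the `check_board` function, and the site names are hypothetical, and the reading that at least one trained investigator must come from another duty station is an assumption. DOE O 225.1B remains the governing source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    # Hypothetical record; field names are assumptions made for this sketch.
    name: str
    duty_station: str
    is_doe_federal: bool = True
    is_trained_investigator: bool = False


def check_board(chair: Member, others: List[Member], accident_site: str) -> List[str]:
    """Return the composition rules that are not met (empty list if none)."""
    problems = []
    if not 3 <= len(others) <= 6:
        problems.append("need three to six Board members besides the Chairperson")
    if not all(m.is_doe_federal for m in [chair, *others]):
        problems.append("all Board members must be DOE Federal employees")
    investigators = [m for m in others if m.is_trained_investigator]
    if not investigators:
        problems.append("no trained DOE accident investigator among the members")
    elif all(m.duty_station == accident_site for m in investigators):
        # Assumption: one investigator from another duty station is enough.
        problems.append("accident investigator is from the accident location")
    if chair.duty_station == accident_site:
        problems.append("Chairperson is from the accident location")
    return problems


# Hypothetical board, invented for illustration only.
chair = Member("Chairperson", "Site A")
others = [
    Member("Investigator", "Site B", is_trained_investigator=True),
    Member("Member 2", "Site C"),
    Member("Member 3", "Site C"),
]
print(check_board(chair, others, accident_site="Site C"))
```

Returning a list of unmet rules, rather than a single pass/fail value, lets the Appointing Official see every issue with a proposed Board at once.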
2.1.2 Appointing the Accident Investigation Board
A list of prospective Chairpersons who meet minimum qualifications is available from the HSS AI Program Manager, who also maintains a list of qualified Board members, consultants, advisors, and support staff, including particular areas of expertise for potential Board members or consultants/advisors. The Appointing Official, with the help of the HSS AI Program Manager and the selected AIB Chairperson, assesses the potential scope of the investigation and identifies other Board members needed to conduct the investigation. In selecting these individuals, the Chairperson and Appointing Official follow the criteria defined in DOE O 225.1B, which are shown in Table 2-2.
Table 2-2: DOE Federal Board Members Must Meet These Criteria
Role Qualifications
Chairperson
Senior DOE manager
Preferably a member of the Senior Executive Service or at a senior general service grade level deemed appropriate by the appointing official
Demonstrated managerial competence
Knowledgeable of DOE accident investigation techniques
Experienced in conducting accident investigations through participation in at least one Federal investigation, or equivalent experience
Board Members
DOE Federal employee
Subject matter expertise in areas related to the accident, including knowledge of the Department’s safety management system policy and integrated safety management system
Either the Chairperson or at least one Board member must be a DOE accident investigator who has participated in an accident investigation course sponsored by the Office of Corporate Safety Programs
Board Advisor/Consultant
Knowledgeable in evaluating management systems, the adequacy of policy and its implementation, and the execution of line management oversight
Industry working knowledge in the analytical techniques used to determine accident causal factors
DOE O 225.1B establishes some additional restrictions concerning the selection of Board members and Chairpersons. Members are not permitted to have:
A supervisor-subordinate relationship with another Board member
Any conflict of interest or direct or line management responsibility for day-to-day operation or oversight of the facility, area, or activity involved in the accident.
Both the Chairperson and the DOE Accident Investigator must be selected from a different duty station than the accident location.
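The selection restrictions above lend themselves to a simple screening check before the appointment memorandum is drafted. The following is a minimal sketch only, not a DOE tool: the `Candidate` record, its field names, and the `check_board` function are all hypothetical, and a real screening would rest on the judgment of the Appointing Official and the HSS AI Program Manager. The sketch is conservative in that it flags any trained accident investigator stationed at the accident location.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One prospective Board participant (all field names are illustrative)."""
    name: str
    role: str                           # "chairperson" or "member"
    duty_station: str
    supervisor: Optional[str] = None    # name of the person's direct supervisor
    accident_investigator: bool = False
    conflict_of_interest: bool = False  # includes line management responsibility


def check_board(board: List[Candidate], accident_location: str) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    names = {c.name for c in board}
    for c in board:
        # No supervisor-subordinate relationship with another Board member.
        if c.supervisor in names:
            problems.append(f"{c.name}: supervised by Board member {c.supervisor}")
        # No conflict of interest or line management responsibility.
        if c.conflict_of_interest:
            problems.append(f"{c.name}: conflict of interest or line management responsibility")
        # Chairperson and accident investigator come from a different duty station.
        if (c.role == "chairperson" or c.accident_investigator) \
                and c.duty_station == accident_location:
            problems.append(f"{c.name}: duty station is the accident location")
    # Table 2-2: the Chairperson or at least one member is a trained investigator.
    if not any(c.accident_investigator for c in board):
        problems.append("Board: no trained DOE accident investigator")
    return problems


# Example with made-up names and sites.
example_board = [
    Candidate("Chair A", "chairperson", "Site X", accident_investigator=True),
    Candidate("Member B", "member", "Site Y", supervisor="Chair A"),
]
for problem in check_board(example_board, accident_location="Site X"):
    print(problem)
```

Returning a list of problems rather than a single pass/fail value keeps every concern visible, so the Appointing Official can weigh each one individually.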
Consultants, advisors, and support staff can be assigned to assist the Board where necessary, particularly when DOE employees with necessary skills are not available. For example, advisory staff may be necessary to provide knowledge of management systems or organizational concerns or expertise on specific DOE policies. A dedicated and experienced administrative coordinator (see Appendix C) is recommended. The Program Manager can help identify appropriate personnel to support Accident Investigation Boards.
The appointing official appoints the Accident Investigation Board by issuing an appointment memorandum within three calendar days after the accident is categorized. The appointment memorandum establishes the Board’s authority and releases all members of the AIB from their normal responsibilities/duties for the period of time the Board is convened. The appointment memorandum also includes the scope of the investigation, the names of the individuals being appointed to the Board, a specified completion date for the final report (nominally 30 calendar days), and any special provisions deemed appropriate.
The appointment memorandum should specify the scope of the investigation which includes:
Gathering facts;
Analyzing causes;
Developing conclusions; and
Developing Judgments of Need related to DOE and contractor organizations and management systems that could or should have prevented the accident.
A Sample Appointment Memorandum may be found in Appendix D.
2.1.3 Briefing the Board
The appointing official is responsible for briefing all Board members as soon as possible (within three days) after their appointment to ensure that they clearly understand their roles and responsibilities. This briefing may be given via videoconference or teleconference. If it is impractical to brief the entire Board, at least the Board Chairperson should receive the briefing and then convey the contents of the briefing to the other Board members before starting the investigation. The briefing emphasizes:
The scope of the investigation;
The Board’s authority to examine DOE and contractor organizations and management systems, including line management oversight, as potential causes of an accident, up to and beyond the level of the appointing official;
The necessity for avoiding conflicts of interest;
Evaluation of the effectiveness of management systems, as defined by DOE P 450.4A;
Pertinent accident information and special concerns of the appointing official based on site accident patterns or other considerations.
2.2 Organizing the Accident Investigation
The accident investigation is a complex project that involves a significant workload, time constraints, sensitive issues, cooperation between team members, and dependence on others.
To finish the investigation within the time frame required, the AIB chairperson must exercise good project management skills and promote teamwork. The Chairperson’s initial decisions and actions will influence the tone, tempo, and degree of difficulty associated with the entire investigation. This section provides the Board Chairperson with techniques and tools for planning and organizing the investigation.
2.2.1 Planning
Project planning must occur early in the investigation. The Chairperson should begin developing a plan for the investigation immediately after his/her appointment. The plan should include a preliminary report outline, specific task assignments, and a schedule for completing the investigation. It should also address the resources, logistical requirements, and protocols that will be needed to conduct the investigation.
A tool for the Chairperson, the Accident Investigation Startup Activities List, is included in Appendix D. The Chairperson and administrative coordinator can use this list to organize the initial investigative activities.
2.2.2 Collecting Initial Site Information
Following appointment, the Chairperson is responsible for contacting the site/sponsoring organization to obtain as many details on the accident as possible. The sponsoring organization, which could include a DOE field program office and/or a contractor division point-of-contact, is usually designated as the liaison with the Board. The Chairperson needs the details of the accident to determine what resources, Board member expertise, and technical specialists will be required. Furthermore, the Chairperson should request background information, including site history, site maps, and organization charts. The Accident Investigation Information Request Form (provided in Appendix D) can be used to document and track these and other information requests throughout the investigation.
2.2.3 Determining Task Assignments
A useful strategy for determining and allocating tasks is to develop an outline of the accident investigation report, including content and format, and use it to establish tasks for each Board member. This outline helps to organize the investigation around important tasks and facilitates getting the report writing started as early as possible in the investigation process. Board members, advisors, and consultants are given specific assignments and responsibilities based on their expertise in areas such as management systems, work planning and control, occupational safety and health, training, and any other technical areas directly related to the accident. These assignments include specific tasks related to gathering and analyzing facts, conducting interviews, determining causal factors, developing Conclusions (CON) and JONs, and report writing. Assigning designated Board members specific responsibilities ensures consistency during the investigation.
2.2.4 Preparing a Schedule
The Chairperson also prepares a detailed schedule using the generic four-week accident investigation cycle and any specific direction from the appointing official. The Chairperson should establish significant milestones, working back from the appointing official’s designated completion date. Table 2-3 shows a list of typical activities to schedule.
Table 2-3: These Activities should be Included in an Accident Investigation Schedule
Interviews/Evidence Collection and Preliminary Analysis
Obtain needed site and/or facility/project background information, policies, procedures, and training records
Assign investigation tasks and writing responsibilities
Initiate and complete first draft of accident chronology and facts
Select analytical methods (preliminary)
Complete interviews
Complete first analyses of facts using selected analytical tools; determine whether additional tools are necessary
Obtain necessary photographs and complete illustrations for report
Internal Review Drafts
Complete first draft of report elements, up to and including facts and analysis section
Complete development and draft of direct, contributing, and root causes
Complete development and draft of Judgments of Need
Complete first draft of report for internal review
Complete draft analyses
Complete second draft of report for internal review
External Review Drafts
Complete Classification/Privacy Act reviews
Conduct factual accuracy review and revise report based on input
Complete report for Quality Assurance review by the HSS Office of Corporate Safety Programs prior to submission to the Appointing Official
Complete final draft of report
Prepare out‐brief materials
Brief relevant site/division and/or field office managers (depending on type of investigation) on findings
Leave site
Complete final production of report
The schedule developed by the Board Chairperson should include the activities to be conducted and milestones for their completion. A sample schedule is included as Figure 2-1. The Accident Investigation Day Planner: a Guide for Accident Investigation Board Chairpersons, available on the AI Program website, can assist in the development of this schedule. Activities cover nominally 30 days.
Figure 2-1: Typical Schedule of Accident Investigation
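Working back from the designated completion date, as described above, can be illustrated with a short date calculation. This is a minimal sketch under stated assumptions: the activity names are taken from Table 2-3, but the day offsets in `MILESTONE_OFFSETS` and the function name `back_plan` are hypothetical and are not prescribed by this Handbook or by DOE O 225.1B. A Chairperson would substitute offsets that reflect the appointing official's direction.

```python
from datetime import date, timedelta
from typing import List, Tuple

# Hypothetical offsets, in calendar days before the completion date.
# These values are illustrative assumptions, not Handbook requirements.
MILESTONE_OFFSETS = [
    ("Complete interviews", 20),
    ("Complete first draft of report for internal review", 14),
    ("Complete second draft of report for internal review", 10),
    ("Conduct factual accuracy review", 6),
    ("Brief site managers on findings", 2),
    ("Complete final production of report", 0),
]


def back_plan(completion: date, appointed: date) -> List[Tuple[date, str]]:
    """Compute milestone due dates by counting back from the completion date."""
    plan = []
    for activity, days_before in MILESTONE_OFFSETS:
        due = completion - timedelta(days=days_before)
        if due < appointed:
            # The window is too short for the assumed offsets.
            raise ValueError(f"'{activity}' would fall before the appointment date")
        plan.append((due, activity))
    return sorted(plan)  # chronological order


# Example: a nominal 30-calendar-day investigation with a made-up start date.
appointed = date(2012, 7, 2)
completion = appointed + timedelta(days=30)
for due, activity in back_plan(completion, appointed):
    print(due.isoformat(), activity)
```

Raising an error when a milestone would precede the appointment date makes an unworkable schedule visible immediately, so that the offsets or the completion date can be renegotiated with the appointing official.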
2.2.5 Acquiring Resources
From the first day, the Chairperson begins acquiring resources for the investigation. This includes securing office space, a conference room or “command center”, office supplies, and computers through the Field Office Manager (FOM), a secured area for document storage, tools, and personal protective equipment, if necessary. The site’s FOM should provide many of these resources. The Accident Investigation Equipment Checklist (see Appendix D) is designed to help identify resource needs and track resource status.
In addition, the Board Chairperson assures that contracting mechanisms exist and that funding is available for the advisors and consultants required to support the investigation. These activities are coordinated with the Appointing Official.
2.2.6 Addressing Potential Conflicts of Interest
The Board Chairperson is responsible for resolving potential conflicts of interest regarding Board members, advisors, and consultants. Each Board member, advisor, and consultant should certify that he or she has no conflicts of interest by signing the Accident Investigation Individual Conflict of Interest Certification Form (provided in Appendix D). If the Chairperson or any individual has a concern about the potential for, or appearance of, a conflict of interest, the Chairperson should inform the Appointing Official and seek legal counsel input, if necessary. The decision to allow the individual to participate in the investigation, and any restrictions on his or her participation, shall be documented in a memorandum signed by the Board Chairperson to the Appointing Official. If the Chairperson relies on the advice of legal counsel, the Chairperson shall seek appropriate legal counsel concurrence through the Appointing Official. The memorandum will become part of the Board’s permanent record.
2.2.7 Establishing Information Access and Release Protocols
The Chairperson is responsible for establishing protocols relating to information access and release. These protocols are listed in Table 2-4. Information access and other control protocols maintain the integrity of the investigation and preserve the privacy and confidentiality of interviewees and other parties.
The Freedom of Information Act (FOIA) and Privacy Act may apply to information generated or obtained during an investigation. These two laws dictate access to and release of government records. The Chairperson should obtain guidance from a legal advisor or the FOIA/Privacy Act contact person at the site, field office, or Headquarters regarding questions of disclosure or the applicability of the FOIA or Privacy Act. The FOIA provides access to Federal agency records except those protected from release by exemptions. Anyone can use the FOIA to request access to government records.
The Board must ensure that the information it generates is accurate, relevant, complete, and up-to-date. For this reason, court reporters may be used in more serious investigations to record interviews, and interviewees should be allowed to review and correct transcripts.
The Privacy Act protects government records on citizens and lawfully admitted permanent residents from release without the prior written consent of the individual to whom the records pertain.
Specifically, when the Privacy Act is applicable, the Board is responsible for:
Informing interviewees why information about them is being collected and how it will be used.
Ensuring that information subject to the Privacy Act is not disclosed without the consent of the individual, except under the conditions prescribed by law. Information that can normally not be disclosed includes name, present and past positions or “grade” (e.g., GS 13), annual salaries, duty station, and position description. Therefore, the Board should not request this information unless it is relevant to the investigation.
A Model Interview Opening Statement that addresses the provisions of both the FOIA and the Privacy Act and their pertinence to interviews for DOE accident investigations is provided in Appendix D. This statement should be read at the beginning of all applicable interviews. A brief explanatory Reference Copy of 18 USC Sec. 1001 for Information is provided for the interviewer in Appendix D, in the event the opening statement raises questions.
2.2.8 Controlling the Release of Information to the Public
The Chairperson should instruct Board members not to communicate with the press or other external organizations regarding the investigation. External communications are the responsibility of the Board Chairperson until the final report is released. The Board Chairperson should work closely with a person designated by the site to release other information, such as statements to site employees and the public.
Table 2-4: The Chairperson Establishes Protocols for Controlling Information
Protocol Considerations
Information Security
Keep all investigative evidence and documents locked in a secure area accessible only to Board members, advisors, and support staff.
Press Releases (if appropriate)
Board Chairpersons should coordinate with the official authorizing the investigation or their normal chain of command for authority/guidance on Press Releases.
Determine whether there is a designated contact to handle press releases; if so, work with that person.
The Board is not obligated to release any information. However, previous chairpersons have found that issuing an early press release can be helpful.
The initial press release usually contains a general description of the accident and the purpose of the investigation.
The Board chairperson should review and approve all press releases (in addition to whatever review process exists at the parent organization).
Lines of Communication
Establish liaison with field element management and/or with the operating contractor at the site, facility, or area involved in the accident to set up clear lines of communication and responsibility.
Format of Information Releases
Determine the amount and format of information to be released to the site contractor(s), union advisor, and local DOE office for internal purposes.
Never release verbatim interview transcripts or tapes due to the sensitivity of raw information.
Do not release preliminary results of analyses. These results can be taken out of context and lead to premature conclusions by the site and the media.
Consult with the appointing official before releasing any information.
Approvals for Information Releases
Assure that Board members, site contractors, and the local DOE office do not disseminate information concerning the Board’s activities, findings, or products before obtaining the Chairperson’s approval.
Brief the Board on what they can reveal to others.
2.3 Managing the Investigation Process
As an investigation proceeds, the Chairperson uses a variety of management techniques, including guiding and directing, monitoring performance, providing feedback on performance, and making decisions and changes required to meet the investigation’s objectives and schedule. Because these activities are crucial, the Chairperson may designate an individual to oversee management activities in case the Chairperson is not always immediately available.
2.3.1 Taking Control of the Accident Scene
Before arriving at the site, the Chairperson communicates with the point of contact or the appropriate DOE site designee to assure that the scene and evidence are properly secured, preserved, and documented and that preliminary witness information has been gathered. At the accident scene, the Chairperson should:
Obtain briefings from all persons involved in managing the accident response.
Obtain all information and evidence gathered by the DOE site team.
Make a decision about how secure the accident scene must remain during the initial phases of the investigation. If there are any concerns about loss or contamination of evidence, play it safe and keep the scene restricted from use.
Assume responsibility only for activities directly related to the accident and investigation. The Chairperson and Board members should not take responsibility for approving site activities or procedures, or for recovery, rehabilitation, or mitigation activities. These functions are the responsibility of line management.
2.3.2 Initial Meeting of the Accident Investigation Board
The Chairperson is responsible for ensuring that all Board members work as a team and share a common approach to the investigation. As one of the Board’s first onsite activities, the Chairperson typically holds a meeting to provide all Board members, advisors, consultants, and support staff with an opportunity to introduce themselves and to give the Chairperson an opportunity to brief the Board members on:
The scope of the investigation, including all levels of the organizations involved up to and beyond the level of the appointing official;
An overview of the accident investigation process, with emphasis on:
Streamlined process and limited time frame to conduct the investigation (if applicable);
The schedule and plan for completing the investigation; and
The need to apply the components of DOE’s integrated safety management system during the investigation as the means of evaluating management systems.
Potential analytical and testing techniques to be used;
The roles, responsibilities, and assignments for the Chairperson, the Board members, and other participants;
Information control and release protocols; and
Administrative processes and logistics.
At the meeting, the Chairperson clearly communicates expectations and provides direction and guidance for the investigation. In addition, the Chairperson should distribute copies of local phone directories and a list of phone and fax numbers pertinent to the investigation. The Board should also be briefed on procedures for:
Handling potential conflicts of interest resulting from using contractor-provided support and obtaining support from other sources;
Storing investigative materials in a secured location and disposing of unneeded yet sensitive materials;
Using logbooks, inventory, checkout lists, or other methods to maintain control and accountability of physical evidence, documents, photographs, and other material pertinent to the investigation;
Recording and tracking incoming and outgoing correspondence; and
Accessing the Board’s work area after hours.
2.3.3 Promoting Teamwork
The Board must work together as a team to finish the investigation within the time frame established by the appointing official. To make this happen, the Board Chairperson should ensure that strong-willed personalities do not dominate and influence the objectivity of the investigation and that all viewpoints are heard and analyzed.
The Chairperson must capitalize on the synergy of the team’s collective skills and talents (i.e., the team is likely to make better decisions and provide a higher quality investigation than the same group working individually), while allowing individual actions and decisions. It is important that the Chairperson set the ground rules and provide guidance to the Board members and other participants.
Friendship is not required, but poor relationships can impede the Board’s ability to conduct a high-quality investigation. The Chairperson can encourage positive relationships by focusing attention on each member’s strengths and downplaying weaknesses. The Chairperson can facilitate this by arranging time to allow team members to get to know one another and learn about each other’s credentials, strengths, and preferences. Effective interpersonal relationships can save time and promote high-quality performance.
It is the Chairperson’s responsibility to make sure that all members get a chance to speak and that no one member dominates conversations. The Chairperson should establish communication guidelines and serve as an effective role model in terms of the following:
Be clear and concise; minimize the tendency to think out loud or tell “war stories.”
Be direct and make your perspective clear.
Use active listening techniques, such as focusing attention on the speaker, paraphrasing, questioning, and refraining from interrupting.
Pay attention to non-verbal messages and attempt to verbalize what you observe.
Attempt to understand each speaker’s perspective.
Seek information and opinions from others, especially the less talkative members.
Consider all ideas and arguments.
Encourage diverse ideas and opinions.
Suggest ideas, approaches, and compromises.
Help keep discussions on track when they start to wander.
The Chairperson should gain agreement in advance regarding how particular decisions will be made. Decisions can be made by consensus, by vote, by the Chairperson, or by an expert. Each method has strengths and weaknesses, and the method used should be the one that makes the most sense for the particular decision and situation. Team members should be aware of which method will be used.
Team members should clearly understand both the formal and informal roles and responsibilities of each Board member, consultant, and support person. Clarifying these roles helps avoid duplication of effort and omission of critical tasks, and reduces power struggles and other conflicts. Board Chairpersons should avoid the temptation to reassign tasks when team members encounter problems.
For an effective investigation, group processes must be efficient. Time and energy may be needed to develop these processes. The Chairperson should pay attention to and note processes that seem to work well, and ask the group to suggest alternatives to processes that are unsatisfactory.
Teams are more effective than individuals, because team members have a clear purpose, capitalize on each other’s strengths, coordinate their efforts, and help each other. Teamwork promotes a higher quality investigation.
To control team dynamics, the Chairperson needs to be aware that groups go through predictable stages as they progress from meeting one another to becoming a high-performance team:
Forming: At this stage, team members get acquainted, understand their purposes, and define their roles and responsibilities. Members are typically very polite at this stage, and conflict is rare. Little work is accomplished during this stage, as the team is still in the planning phase. The Chairperson can speed this stage by formally organizing the group; by defining goals, roles, and responsibilities; and by encouraging members to become comfortable with one another.
Storming: Team members begin to realize the sheer amount of work to be done and may get into conflict regarding roles, planned tasks, and processes for accomplishing the work. There may be power struggles. The team focuses energy on redefining work processes. The Chairperson can speed this phase by encouraging open discussion of methods and responsibilities and promoting non-defensive, solution-focused communication.
Norming: The team develops norms about roles, planned tasks, and processes for working together. Power issues are settled. Team members start to become productive and assist one another. The Chairperson can speed this stage by formalizing new norms, methods, and responsibilities and by encouraging relationship development.
Performing: The team settles into clear roles, understands the strengths of different members, and begins to work together effectively. The Chairperson can help maintain this stage by encouraging open communication, a “learning from mistakes” philosophy, and recognizing progress.
Understanding the four typical stages of team development can help the Chairperson manage team interactions and promote team processes throughout the accident investigation.
The Chairperson sets the stage for effective teamwork at the very first Board meeting. At this meeting, the Chairperson should encourage the team to define their goals and tasks, clarify their roles and responsibilities, agree on team processes, and become acquainted with each other’s strengths.
Many Board members may have never worked on an effective team. The Chairperson needs to focus on effective team activities, because the members may not immediately see the value of teamwork or may be caught up in their own tasks to the exclusion of the team.
2.3.4 Managing Evidence, Information Collection
Upon arrival at the accident site, the Board begins to collect evidence and facts and to conduct interviews. Table 2-5 provides guidelines to assist the Chairperson in monitoring this process.
The Chairperson is responsible for:
Ensuring that in both internal and external communications (press conferences, briefings), the facts presented are sufficiently developed and validated, and that no speculation, hypotheses, or conjecture is expressed; consulting with the appointing official prior to disseminating any information about the investigation.
Notifying DOE and appropriate Federal, state, or local authorities of unlawful activities, or in the case of fraud, waste, or abuse, the DOE Office of the Inspector General.
Notifying the Office of Enforcement, the DOE Site Manager, and the contractor of any potential Price-Anderson enforcement concerns identified during the investigation as soon as practical (Table 2-5 provides additional detail).
Coordinating Board activities with all organizations having an interest in the accident (e.g., agencies notified by the appointing official or the Office of Corporate Safety Programs under DOE O 225.1B, Paragraph 4.b.).
Holding meetings that maximize efficiency, have a set length of time, and follow a well-planned and focused agenda.
2.3.5 Coordinating Internal and External Communication
The Board Chairperson is responsible for coordinating communication both internally (with the Appointing Official, Board members, advisors, consultants, and support staff) and externally (with relevant DOE Headquarters/DOE field office managers, site contractor[s], the media, and the public).
Maintaining effective communications includes:
Conducting daily Board meetings to:
Review and share the latest information and evidence;
Discuss how new information may contribute to analyses;
Review latest analytical findings and potential causal factors and discuss how new information may affect these analyses;
Note information gaps and prioritize directions to pursue; and
Serve as a checkpoint to ensure that Board members are completing their tasks, acting within scope, and not pursuing factual leads of limited potential value.
Obtaining regular verbal or written progress reports from Board members and identifying solutions to potential problems.
Using a centralized, visible location for posting assignments and progress reports to keep everyone informed and up-to-date.
Conducting meetings with site managers and contractor(s) to exchange information and to summarize investigation status.
Conducting conference calls with managers from Headquarters, the local field office, and contractors; calling the appointing official on a predetermined basis; and providing written status reports to the appointing official.
Providing daily status updates to the Appointing Official.
Coordinating external communications with the public and media through the field office public relations/media representative to ensure that the Department’s interests are not compromised.
Table 2-5: The Chairperson Should Use These Guidelines in Managing Information Collection Activities.
Review and organize witness statements, facts, and background information provided by the DOE site team or other sources and distribute these to the Board.
Organize a Board walk‐through of the accident scene, depicting events according to the best understanding of the accident chronology available at the time. This can help the Board visualize the events of the accident.
Assign an administrative coordinator to oversee the organization, filing, and security of collected facts and evidence.
Develop draft of objectives and topical areas to be covered in initial interviews and oversee development of a standardized list of initial interview questions to save interviewing time and promote effective and efficient interviews.
If deemed appropriate, issue a site or public announcement soliciting information concerning the accident.
Ensure that witnesses are identified and interviews scheduled.
Ensure that Board members preserve and document all evidence from the accident scene.
Make sure all Board members enlist the aid of technical experts when making decisions about handling or altering physical evidence.
Establish a protocol agreeable to the Board for analyzing and testing physical evidence.
Identify and initiate any necessary physical tests to be conducted on evidence.
Assess and reassess the need for documents, including medical records, training records, policies, and procedures, and direct their collection.
Use the Accident Investigation Information Request Form provided in Appendix D to document and track information requests.
Emphasize to Board members that to complete the investigation on schedule, they must prioritize and may not have time to pursue every factual lead of medium to low significance. The Board Chairperson must emphasize pursuits that will lead to the development of causal factors and Judgments of Need.
2.3.6 Managing the Analysis
The Chairperson is responsible for ensuring that events and causal factors charting and application of the core analytical techniques begin as soon as initial facts are available. The responsibility to conduct the analysis is that of the trained DOE Accident Investigator or Analyst.
This will help to identify information gaps early, drive the fact collection process, and identify questions for interviews. The use of accident investigation analysis software can be a helpful tool for identifying information gaps and organizing causal factors during the analyses. Another technique is to use multicolored adhesive notes on a wall to portray elements of the events and causal factors chart. A wall-size chart makes it easier for all Board members to observe progress, provide input, and make changes.
As the Board proceeds with the analyses, the Chairperson should monitor and discuss progress to ensure that:
Several Board members and/or advisors work collectively (not one person in isolation) to produce a quality result.
Analyses are iterative (i.e., analyses are repeated, each version producing results that approximate the end result more closely); several iterations of analyses will be needed as new information becomes available.
The analyses address organizational concerns, management systems, and line management oversight functions that may have contributed to the accident’s causes.
The causal factors, conclusions, and Judgments of Need are supported by the facts and analysis.
Significant facts and analyses do not result in a “dead end.” Instead, they are linked to causal factors and Judgments of Need.
Delegating responsibility for complex analyses to a single individual can produce inferior results. Analyses are strengthened by input from the entire Board and its advisors.
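The check that significant facts do not end in a "dead end" can be sketched as a small traceability pass. This is only an illustrative sketch in Python: the identifiers (F-1, CF-1, JON-1), the mapping structure, and the function name are assumptions made for the example and are not prescribed by this Handbook.

```python
def find_dead_ends(facts, fact_to_causes, cause_to_jons):
    """Return facts and causal factors that do not link forward.

    facts: iterable of fact IDs the Board judged significant.
    fact_to_causes: dict mapping fact ID -> set of causal factor IDs.
    cause_to_jons: dict mapping causal factor ID -> set of Judgment of Need IDs.
    """
    # A fact is a dead end if it supports no causal factor.
    unlinked_facts = sorted(f for f in facts if not fact_to_causes.get(f))

    # Collect every causal factor that at least one fact supports.
    all_causes = set()
    for causes in fact_to_causes.values():
        all_causes.update(causes)

    # A causal factor is a dead end if no Judgment of Need addresses it.
    unlinked_causes = sorted(c for c in all_causes if not cause_to_jons.get(c))
    return unlinked_facts, unlinked_causes


# Hypothetical example data.
facts = ["F-1", "F-2", "F-3"]
fact_to_causes = {"F-1": {"CF-1"}, "F-2": {"CF-1", "CF-2"}}
cause_to_jons = {"CF-1": {"JON-1"}}

unlinked_facts, unlinked_causes = find_dead_ends(facts, fact_to_causes, cause_to_jons)
print("Facts with no causal factor:", unlinked_facts)
print("Causal factors with no Judgment of Need:", unlinked_causes)
```

An item flagged here is not necessarily an error; it is a prompt for the Board to decide whether the fact is truly significant or whether a link is missing from the analysis.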
2.3.7 Managing Report Writing
Many investigation Boards have found report writing to be the most difficult part of the investigation, often requiring several iterations. Report quality is crucial, because the report is the official record of the investigation. Efforts to conduct a quality investigation lose integrity if the report is poorly written or fails to adequately convey a convincing set of supporting facts and clear conclusions. To manage the reporting process, the Chairperson should:
Develop a report outline as soon as possible to facilitate writing assignments and minimize overlap in content between sections;
Begin writing the accident chronology, background information, and facts as soon as information becomes available;
Continuously identify where sections should be added, moved, or deleted;
Adhere to required format guidelines and promote ongoing clarification of format, content, and writing styles;
Quickly identify strong and weak writers and pair them, when possible, to avoid report writing delays; and
Encourage authors to consult with one another frequently to become familiar with the content of each section and to reduce redundancy.
If possible, use a technical writer to evaluate grammar, format, technical content, and linkages among facts, analyses, causes, and Judgments of Need. This is important when several authors have contributed to the report. The technical writer focuses on producing a clear, concise, logical, and well-supported report and ensures that the report reads as if one person wrote it.
It is possible to have serious disagreements among Board members regarding the interpretation of facts, causal factors, conclusions, and Judgments of Need. The Board Chairperson should make a concerted effort to reach consensus among Board members on accident causes, conclusions, and Judgments of Need. When Board members cannot reach agreement and the Chairperson cannot resolve the difference, the dissenting Board member(s) may opt to produce a minority report.
2.3.8 Managing Onsite Closeout Activities
2.3.8.1 Preparing Closeout Briefings
The investigative portion of the process is considered complete and Board members are released when the Appointing Official formally accepts the final report.
The Chairperson is responsible for conducting the final accuracy review, final editing, and production of the report, with assistance from selected Board members and administrative support staff.
A briefing on the investigation’s outcome to the Appointing Official and field line management with cognizance over the site of the accident should be conducted. This briefing is conducted by the Board Chairperson and the Head of the Field Element of the site at which the accident occurred. Accident investigation participants (Chairperson, Board members, and any consultants and advisors deemed appropriate by the Chairperson) may attend the briefing. The briefing covers:
The scope of the investigation, as provided in the appointment letter,
The investigation’s participants, including any subject matter experts or other consultants,
A brief summary of the accident (what happened),
Causal factors (why it happened),
Judgments of Need (what needs to be corrected),
Organizations that should be responsible for corrective actions.
Other briefings may be provided by the Board Chairperson and Board members, as deemed appropriate by the Appointing Official. These may include briefing DOE and contractor line management at the site of the accident.
2.3.8.2 Preparing Investigation Records for Permanent Retention
The Chairperson is also responsible for ensuring that all information resulting from the investigation is carefully managed and controlled. To this end, the Chairperson takes the following actions:
Preparing investigation documents and evidence for long-term storage: One of the final activities of the Board is to prepare investigation documents and evidence for long-term storage. For Federal investigations, these materials are to be held in storage by the Appointing Official’s Program Manager as “permanent” records (75 years) in accordance with DOE O 225.1B. It is recommended that access restriction limitation be designated as "Agency Personnel."
All factual material and analysis products are included, such as logbooks, Board meeting minutes, field notes, sketches, witness statements (including interview tapes or electronic record files, if used), stenographer transcripts, photographs, location and custody of any physical evidence, analysis charts, and the various forms completed during the investigation. Original medical or personnel records subject to the Privacy Act may be returned to their original location.
Documentation showing that the report was subjected to reviews for classified and Privacy Act information shall be retained in the investigation file.
If the appointment of an AIB is delayed beyond three calendar days from the time of the categorization of the accident, the rationale for the delay must be documented and maintained in the accident investigation file.
Computers used during the accident investigation that are not to remain in control of the accident board should have all useful records transferred to a storage medium or another computer in the Board’s control. All accident investigation or analysis files on the relinquished computers should be purged prior to release from the investigation team. Electronic records should be purged or archived according to DOE CIO procedures.
If the Head of a Headquarters Element delegates the responsibility for an accident investigation to the Head of a Field Element, or to HSS, a copy of the memorandum of delegation shall be maintained in the accident investigation file.
The administrative coordinator arranges for boxing and for shipping materials to the storage facility identified by the Appointing Official’s Program Manager during the onsite phase of the investigation. In a well-maintained AI record system, records are logged, filed, and boxed throughout the investigation to allow quick closeout packaging and transfer. All permanent records should have been screened for classification and stamped accordingly.
Destroying non-record materials: Any non-record materials, such as extraneous information deemed not pertinent to the investigation, multiple reference copies, or extra drafts and incomplete notes, should be controlled until destroyed. Shredding machines or services should be arranged throughout the investigation to reduce closeout shredding time.
Archiving materials: One of the final activities of the Appointing Official’s Program Manager, when immediate reference access is no longer deemed likely after the Post-Investigation Activities, is to arrange for placing investigation permanent records boxes in an archive repository in accordance with 36 CFR 1225.14.
2.3.9 Managing Post-Investigation Activities
The Appointing Official is also responsible for ensuring post-investigation follow-through: corrective actions are defined and tracked, and lessons learned are documented. These responsibilities are explained below.
2.3.9.1 Corrective Action Plans
The final report is submitted by the Appointing Official to senior managers of organizations identified in the Judgments of Need in the report, with a request for the organizations to prepare corrective action plans. These plans contain actions for addressing Judgments of Need identified in the report and include milestones for completing the actions.
Corrective actions fall into four categories:
Immediate corrective actions that are taken by the organization managing the site where the accident occurred to prevent a second or related accident.
Corrective actions required to satisfy Judgments of Need identified by the Board in the final report. These corrective actions are developed by the Heads of Field Elements and/or contractors responsible for the activities resulting in the accident and are designed to prevent recurrence and correct system problems.
Corrective actions determined by the Appointing Official to be appropriate for DOE-wide application. The Appointing Official recommends these corrective actions when the report is distributed.
DOE Headquarters corrective actions that result from discussions with senior management. These actions usually address DOE policy.
2.3.9.2 Tracking and Verifying Corrective Actions
Corrective action plans are submitted to the Head of the Program Element, who reviews the plans and provides comments.
This review is done to determine the:
Adequacy of proposed corrective actions in meeting the deficiencies stated in the Judgments of Need.
Feasibility of the proposed corrective actions.
Timeliness of the proposed corrective actions.
Necessity for any interim actions to prevent further accidents, pending permanent corrective actions.
The Heads of Field Elements whose site, facility, operation, or area was involved in the accident are responsible for accepting the corrective actions, entering them into the appropriate database established by the Head of the Program Element, and implementing applicable corrective actions.
However, other DOE/National Nuclear Security Administration (NNSA) Field Elements may have responsibility for completing actions resulting from the investigation. In these cases, the organization(s) indicated in the corrective action plan as having responsibility for implementation is (are) accountable for completing the requisite actions.
The Heads of Headquarters Elements verify completion of approved corrective actions and satisfaction of Judgments of Need.
When corrective action plans are completed and corrective actions have been implemented, those Headquarters and field elements having responsibilities for corrective actions notify the Appointing Official, who closes the investigation. Copies of the notification to and closure by the Appointing Official are sent to the Program Manager.
2.3.9.3 Establishing Lessons Learned
Introduction. The purpose of conducting accident investigations is to determine the system deficiencies that allowed the accident to occur so that those deficiencies can be corrected and similar accidents can be prevented. Summaries of deficiencies and the recommended corrective actions are identified as “lessons learned.” In the interest of preventing recurrence of accidents, lessons learned are disseminated DOE-wide to ensure that the results of investigations have the greatest effect for continuous improvement in environment, safety, and health performance.
Responsibilities. The responsibility for developing and disseminating lessons learned arising from accident investigations resides with the Appointing Official as defined in DOE O 225.1B. For accident investigations, the Appointing Official is the Heads of Headquarters Elements. In the event that the responsibility for appointing an AIB is delegated to the Heads of Field Elements, the responsibility for developing and disseminating lessons learned from the accident investigation remains with the Heads of Headquarters Elements Quality Assurance Program.
Developing Lessons Learned. Lessons learned from accident investigations are developed in accordance with DOE O 210.2A, DOE Corporate Operating Experience Program, and/or other provisions that govern the DOE Lessons Learned Program. For accident investigations, the Head of the DOE/NNSA Program Element is responsible for developing and disseminating the lessons learned.
Disseminating Lessons Learned. Lessons learned from the accident investigation are developed and disseminated within 90 calendar days of acceptance of the investigation report by the Appointing Official. Methods for disseminating lessons learned include hard copy, electronic, and other methods for use both intra-site and across the DOE complex, such as reports, workshops, and newsletters. The DOE Lessons Learned Information System provides for electronic dissemination of lessons-learned information throughout the DOE complex.
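The 90-calendar-day window can be tracked with simple date arithmetic. The following is a minimal sketch in Python. It assumes the window is counted as the acceptance date plus 90 days, which is an interpretation for illustration only; the function names and the example dates are hypothetical.

```python
from datetime import date, timedelta

# Calendar days allowed after the Appointing Official accepts the report.
LESSONS_LEARNED_WINDOW_DAYS = 90


def lessons_learned_due(report_accepted: date) -> date:
    """Latest dissemination date, assuming day 0 is the acceptance date."""
    return report_accepted + timedelta(days=LESSONS_LEARNED_WINDOW_DAYS)


def days_remaining(report_accepted: date, today: date) -> int:
    """Calendar days left in the window (negative if overdue)."""
    return (lessons_learned_due(report_accepted) - today).days


# Hypothetical acceptance date.
accepted = date(2012, 7, 2)
print("Due:", lessons_learned_due(accepted).isoformat())
print("Days remaining on 2012-09-01:", days_remaining(accepted, date(2012, 9, 1)))
```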
2.4 Controlling the Investigation
Throughout the investigation, the Board Chairperson is responsible for controlling Board performance, cost, schedule, and quality of work. Techniques for implementing these controls are described below.
2.4.1 Monitoring Performance and Providing Feedback
The Chairperson uses daily meetings to monitor progress and to measure performance against the schedule of activity milestones. Board members are given specific functions or activities to perform and milestones for completion. The Chairperson assesses the progress and status of the investigation periodically by asking such questions as:
Is the investigation on schedule?
Is the investigation within scope?
Are Board members, advisors, consultants, and support staff focused and effective?
Are additional resources needed?
Are daily Board meetings still necessary and productive, or should the interval between them be increased?
The Chairperson must be informed on the status of the accident investigation and must be prepared to make decisions and provide timely feedback to Board members, site personnel, and other parties affected by the accident. Frequently, decisions must be made when there is not time to reach consensus among the Board members. When this occurs, the Chairperson informs the Board members of the decision and the reason for the urgency. Intermediate milestone revisions can then be made, if events or practical considerations so dictate.
2.4.2 Controlling Cost and Schedule
Cost and schedule must be controlled to ensure that planning and execution activities are within the established budget and milestones.
Cost Control: The Board Chairperson is responsible for operating within any budget prescribed for the investigation. The Chairperson should prepare a cost estimate for the activities to be conducted during the investigation if needed. If necessary, the Chairperson may issue a memo authorizing costs incurred by Board members, including additional travel expenses, hotel rates over per diem, and incidental expenses. Control can be exercised over costs by using advisors and consultants only when required and by limiting travel (such as trips home for the weekend) during the onsite investigation. A method for estimating costs should be agreed upon early in the investigation, and the estimate should be reviewed each week to ensure that the cost of the work is not exceeding the estimate, or that any cost growth is justified and can be funded.
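The weekly comparison of cost against the estimate can be sketched as a cumulative variance table. This Python sketch is illustrative only: the function name, the per-week breakdown of the estimate, and all dollar figures are assumptions, not a method prescribed by this Handbook.

```python
def weekly_cost_review(planned_by_week, actual_by_week):
    """Compare cumulative actual cost with the cumulative estimate, week by week."""
    rows = []
    planned_total = 0
    actual_total = 0
    for week, (planned, actual) in enumerate(zip(planned_by_week, actual_by_week), start=1):
        planned_total += planned
        actual_total += actual
        rows.append({
            "week": week,
            "planned": planned_total,
            "actual": actual_total,
            "variance": actual_total - planned_total,  # positive means over estimate
        })
    return rows


# Hypothetical figures in whole dollars.
planned = [20000, 20000, 15000]
actual = [18000, 25000, 15000]

review = weekly_cost_review(planned, actual)
for row in review:
    status = "OVER ESTIMATE" if row["variance"] > 0 else "within estimate"
    print(f"Week {row['week']}: planned {row['planned']}, actual {row['actual']}, {status}")
```

A positive variance does not by itself mean the investigation is off track; it signals that the cost growth needs to be justified and its funding confirmed.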
Schedule Control: Progress against the scheduled milestones can be assessed during daily progress meetings with the Board and its staff. As problems arise, the schedule may be adjusted or resources applied to offset variances. Because of the relatively short time frame involved, the Chairperson must identify and resolve problems immediately to maintain the schedule, or re-evaluate it with the Appointing Official as circumstances require.
2.4.3 Assuring Quality
Formal quality control measures are necessary because of the seriousness and sensitivity of the Accident Investigation Board’s work and because of the need for accuracy, thoroughness, and perspective. At a minimum, the Chairperson must ensure that the report is technically accurate, complete, and internally consistent. When analytical results are developed into conclusions, all verified facts, the results of analyses of those facts, and the resulting conclusions must be both consistent and logical.
When essential portions of the draft report are complete, the Chairperson conducts a verification analysis to ensure that the facts are consistent with the best information available, that all report sections are consistent, and that analyses, causes, and Judgments of Need logically flow from the facts. Section 2.8 provides further detail on assuring report quality.
Prior to submission of the report to the Appointing Official, the Board Chair, under DOE O 225.1B, needs to submit the report for a quality review to the HSS AI Program Manager.
2.5 Investigate the Accident to Determine “What” Happened
2.5.1 Determining Facts
Immediately following any accident, much of the available information may be conflicting and erroneous. The volume of data expands rapidly as witness statements are taken, emergency response actions are completed, evidence is collected, and the accident scene is observed by more individuals.
The principal challenge of the AI Board is to distinguish between accurate and erroneous information in order to focus on areas that will lead to identifying the accident’s causal factors.
This can be accomplished by:
Understanding the activity that was being performed at the time of the accident or event.
Personally conducting a walk-through of the accident scene or work location.
Testing or inspecting pertinent components to determine failure modes and physical evidence.
Obtaining testamentary evidence, and corroborating facts through interviews.
Challenging “facts” that are inconsistent with other evidence (e.g., physical).
2‐24
DOE‐HDBK‐1208‐2012
Reviewing policies, procedures, and work records to determine the level of compliance or implementation.
Prevention is at the heart of the entire investigation process. Therefore, any accident investigation must focus on fact-finding, not fault-finding.
Fact-finding begins during the collection of evidence. All sources of evidence (e.g., accident site walk-through, witness interviews, physical evidence, policy or procedure documentation) contain facts that, when linked, create a chronological depiction of the events leading to an accident. Facts are not hypotheses, opinions, analysis, or conjecture. However, not all facts can be determined with complete certainty, and such facts are referred to as assumptions. Assumptions should be reflected as such in the investigation report and in any closeout briefings.
Board members should immediately begin developing a chronology of events as facts and evidence are collected. Facts should be reviewed on an ongoing basis to ensure relevance and accuracy. Facts and evidence later determined to be irrelevant should be removed from the accident chronology but retained in the official investigation file for future consideration.
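One way to keep the chronology separate from the investigation file is to flag each fact and filter on the flags. The sketch below, in Python, is illustrative only: the Fact fields, the function name, and the example events are hypothetical and are not taken from any DOE form.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Fact:
    when: datetime
    description: str
    source: str
    is_assumption: bool = False  # could not be determined with complete certainty
    relevant: bool = True        # set to False if later judged irrelevant


def build_chronology(investigation_file):
    """Return relevant facts in time order; the file itself is left unchanged."""
    return sorted((f for f in investigation_file if f.relevant), key=lambda f: f.when)


# Hypothetical entries, listed in the order they were collected.
investigation_file = [
    Fact(datetime(2012, 7, 10, 9, 42), "Alarm sounded in the work area", "witness interview"),
    Fact(datetime(2012, 7, 10, 9, 15), "Work crew began the task", "work records"),
    Fact(datetime(2012, 7, 10, 9, 30), "Valve believed to be open", "inferred", is_assumption=True),
    Fact(datetime(2012, 7, 10, 8, 0), "Unrelated delivery arrived", "gate log", relevant=False),
]

chronology = build_chronology(investigation_file)
for fact in chronology:
    marker = " (assumption)" if fact.is_assumption else ""
    print(fact.when.strftime("%H:%M"), fact.description + marker)
```

The irrelevant entry stays in investigation_file, mirroring the guidance to retain such material in the official file while removing it from the chronology.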
Contradictory facts can be resolved in closed Board meetings, recognizing that the determination of significant facts is an iterative process that evolves as gaps in information are closed and questions resolved. The Board revisits the prescribed scope and depth of their investigation often during the fact-finding and analysis process. Doing so ensures that the investigation adheres to the parameters prescribed in the Board’s appointment memorandum.
Causal factors of an accident are identified after analyzing the facts. Judgments of Need, and the subsequent corrective actions, are based on the identified causes of the accident. Therefore, the facts are the foundation of all other parts of the investigative process.
Three key types of evidence are collected during the investigation:
Human or testamentary evidence includes witness statements and observations;
Physical evidence is matter related to the accident (e.g., equipment, parts, debris, hardware, and other physical items); and
Documentary evidence includes paper and electronic information, such as records, reports, procedures, and documentation. A Checklist of Documentary Evidence is found in Appendix D.
Collecting evidence can be a lengthy, time-consuming, and piecemeal process. Witnesses may provide sketchy or conflicting accounts of the accident. Physical evidence may be badly damaged or completely destroyed. Documentary evidence may be minimal or difficult to access. Thorough investigation requires that Board members be diligent in pursuing evidence and adequately explore leads, lines of inquiry, and potential causal factors until they gain a sufficiently complete understanding of the accident.
The process of collecting data is iterative. Preliminary analysis of the initial evidence identifies gaps that will direct subsequent data collection. Generally, many data collection and analysis iterations occur before the Board can be certain that all analyses can be finalized. The process of data collection also requires a tightly coordinated, interdependent set of activities on the part of several investigators.
The process of pursuing evidentiary material involves:
Collecting human evidence (locating and interviewing witnesses);
Collecting physical evidence (identifying, documenting, inspecting, and preserving relevant matter);
Collecting documentary evidence;
Examining organizational concerns, management systems, and line management oversight; and
Preserving and controlling evidence. (Examples of Physical Evidence Log Form and Evidence Sign-out Sheet are included in Appendix D.)
2.5.2 Collect and Catalog Physical Evidence
To ensure consistent documentation, control, and security, it may be useful to designate a single team member or the administrative coordinator to be in charge of handling evidence.
Following the leads and preliminary evidence provided by the initial findings of the DOE site team, the team proceeds in gathering, cataloging, and storing physical evidence from all sources as soon as it becomes available. The most obvious physical evidence related to an accident or accident scene often includes solids such as:
Equipment
Tools
Materials
Hardware
Operation facilities
Pre- and post-accident positions of accident-related elements
Scattered debris
Patterns, parts, and properties of physical items associated with the accident.
Less obvious but potentially important physical evidence includes fluids (liquids and gases). Many DOE facilities use a multitude of fluids, including chemicals, fuels, hydraulic control or actuating fluids, and lubricants. Analyzing such evidence can reveal much about the operability of equipment and other potentially relevant conditions or causal factors.
Care should be taken if there is the potential for pathogenic contamination of physical evidence (e.g., blood); such material may require autoclaving or other sterilization. Specialized technicians experienced in fluid sampling should be employed to help the team to collect and to analyze fluid evidence. If required, expert analysts can be requested to perform tests on the fluids and report results to the investigation team.
When handling potential blood-borne pathogens, universal precautions such as those listed in Table 2-6 should be observed to minimize potential exposure. All human blood and body fluids should be treated as if they are infectious. The precautions in Table 2-6 should be implemented for all potential exposures. Exposure is defined as reasonably anticipated skin, eye, mucous membrane, or parenteral contact with blood or other potentially infectious materials.
In addition to pathogens, any evidence may create a hazard for persons handling it, in ways too numerous to expand upon here. This aspect of any evidence should be considered and addressed before handling it.
Physical evidence should be systematically collected, protected, preserved, evaluated, and recorded to ultimately determine how and why failures occurred and whether use, abuse, misuse, or nonuse was a causal factor.
Significant physical evidence is often found in obscure and seemingly insignificant places, such as hinges and supports.
Table 2-6: Use Precautions when Handling Potential Blood Borne Pathogens
Personal protective equipment should be worn when exposure to blood borne pathogens is likely.
Hands and other skin should be washed with soap and water immediately or as soon as feasible after removal of gloves or other personal protective equipment.
Hand washing facilities should be provided that are readily accessible to employees.
When provision of hand washing facilities is not feasible, appropriate antiseptic hand cleanser in conjunction with clean cloth, paper towels, or antiseptic towelettes should be used. Hands should be washed with soap and water as soon as possible thereafter.
Mucous membranes should be flushed with water immediately or as soon as feasible following contact with blood or other potentially infectious materials.
Contaminated needles and other contaminated sharps shall not be bent, recapped, or removed except by approved techniques.
Immediately or as soon as possible after use, contaminated reusable sharps shall be placed in appropriate containers until properly reprocessed.
Eating, drinking, smoking, applying cosmetics or lip balm, and handling contact lenses are prohibited in work areas where there is a reasonable likelihood of occupational exposure.
Food and drink shall not be kept in refrigerators, freezers, shelves, cabinets, or on countertops or bench tops where blood or other potentially infectious materials are present.
All procedures involving blood or other potentially infectious materials shall be performed in such a manner as to minimize splashing, spraying, spattering, and generation of droplets of these substances.
Mouth pipetting or suctioning of blood or other potentially infectious materials is prohibited.
Specimens of blood or other potentially infectious materials shall be placed in a container to prevent leakage during collection, handling, processing, storage, transport, or shipping.
Equipment that may become contaminated with blood or other potentially infectious materials shall be examined prior to servicing or shipping and shall be decontaminated as necessary.
2.5.2.1 Document Physical Evidence
Evidence should be carefully documented at the time it is obtained or identified. The Physical Evidence Log Form (provided in Appendix D) can help investigators document and track the collection of physical evidence. Additional means of documenting physical evidence include sketches, maps, photographs, corporate files, and video files.
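An electronic evidence log with a sign-out trail can be sketched as follows. This Python sketch is an assumption-laden illustration: the field names, the method names, and the example item are hypothetical and do not reproduce the Physical Evidence Log Form or Evidence Sign-out Sheet in Appendix D.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class EvidenceItem:
    item_id: str
    description: str
    location_found: str
    collected_by: str
    collected_at: datetime
    custody: list = field(default_factory=list)  # (who, "out" or "in", when) tuples

    def holder(self):
        """Return who currently has the item signed out, or None if it is in storage."""
        if self.custody and self.custody[-1][1] == "out":
            return self.custody[-1][0]
        return None

    def sign_out(self, who, when):
        if self.holder() is not None:
            raise ValueError(f"{self.item_id} is already signed out to {self.holder()}")
        self.custody.append((who, "out", when))

    def sign_in(self, who, when):
        if self.holder() != who:
            raise ValueError(f"{self.item_id} is not signed out to {who}")
        self.custody.append((who, "in", when))


# Hypothetical item and custodian.
item = EvidenceItem(
    item_id="PE-001",
    description="Fractured hinge pin",
    location_found="Base of access door",
    collected_by="Board member A",
    collected_at=datetime(2012, 7, 10, 14, 5),
)
item.sign_out("J. Analyst", datetime(2012, 7, 11, 8, 30))
print(item.item_id, "currently held by:", item.holder())
```

Rejecting a second sign-out while an item is already out keeps the custody trail unambiguous, which supports the goal of consistent control and security of evidence.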
2.5.2.2 Sketch and Map Physical Evidence
Sketching and mapping the position of debris, equipment, tools, and injured persons may be initiated by the DOE site team and expanded on by the Accident Investigation Board. Position maps convey a visual representation of the scene immediately after an accident. Evidence may be inadvertently moved, removed, or destroyed, especially if the accident scene can only be partially secured. Therefore, sketching and mapping should be conducted immediately after recording initial witness statements.
Precise scale plotting of the position of elements can subsequently be examined to develop and test accident causal theories.
Computer programs or the Site Sketch, Position Mapping Form, and Sketch of Physical Evidence Locations and Orientations (provided in Appendix D) are useful for drawing sketches and maps and recording positions of objects.
2.5.2.3 Photograph and Video Physical Evidence
Photography and videography can be used in a variety of ways to emphasize areas or items of interest and display them for better understanding. These are best performed by specialists, but should be supervised and directed by an investigator.
Photography is a valuable and versatile tool in investigation. Photos or videos can identify, record, or preserve physical evidence that cannot be effectively conveyed by words or collected by any other means.
Photographic coverage should be detailed and complete, including standard references to help establish distance and perspective. Video should cover the overall accident scene, as well as specific locations or items of significance. A thorough video allows the Board to minimize trips to the accident scene. This may be important if the scene is difficult to access or if it presents hazards. The Photographic Log Sheet (provided in Appendix D) can be used to record photograph or video subjects, dates, times, and equipment settings and positions.
Good photographic coverage of the accident is essential, even if photographs or video stills will not be used in the investigation report. However, if not taken properly, photographs and videos can easily misrepresent a scene and lead to false conclusions or findings about an accident. Therefore, whenever possible, accident photography and video recording should be performed by professionals. Photographic techniques that avoid misrepresentation, such as the inclusion of rulers and particular lighting, may be unknown to amateurs but are common knowledge among professional photographers and videographers.
One of the first responsibilities of the team lead should be to acquire a technical photographer whose work will assist the team.
Five possible sources include:
DOE site’s photo lab, or digital print processor center,
Commercial photo, or digital print processor center,
Commercial photographers, such as industrial, medical, aerial, legal, portrait, and scientific photographers (forensic/legal or scientific photographers are often best suited to assist in accident investigation),
A member of the investigation team, or
Security personnel.
Even if photos are taken by a skilled photographer, the investigation team should be prepared to direct the photographer in capturing certain important perspectives or parts of the accident scene. Photographs of evidence and of the scene itself should be taken from many angles to illustrate the perspectives of witnesses and injured persons. In addition, team members may wish to take photos for their own reference. Digital photography facilitates incorporation of the photographs into the investigation report. As photos are taken, a log should be completed noting the scene/subject, date, time, direction, and orientation of photos, as well as the photographer’s name. The Photographic Log Sheet can be used for this purpose. The Sketch of Photography Locations and Orientations (provided in Appendix D) is helpful when reviewing photos and analyzing information.
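The log fields named above (scene/subject, date, time, direction, orientation, and photographer's name) map naturally onto a simple record that can be exported for the investigation file. The Python sketch below is illustrative only: the class, the frame number field, and the example entries are assumptions and do not reproduce the Photographic Log Sheet in Appendix D.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class PhotoLogEntry:
    frame: int         # sequence number (an assumed field, for cross-referencing)
    subject: str       # scene or subject photographed
    taken_at: str      # date and time, ISO 8601
    direction: str     # direction the camera was facing, e.g. "NE"
    orientation: str   # e.g. "landscape" or "portrait"
    photographer: str


def to_csv(entries):
    """Serialize log entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PhotoLogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Hypothetical entries.
log = [
    PhotoLogEntry(1, "Overall accident scene", "2012-07-10T14:20", "N", "landscape", "Site photographer"),
    PhotoLogEntry(2, "Hinge close-up with ruler", "2012-07-10T14:32", "E", "portrait", "Site photographer"),
]

csv_text = to_csv(log)
print(csv_text)
```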
2.5.2.4 Inspect Physical Evidence
Following initial mapping and photographic recording, a systematic inspection of physical evidence can begin. The inspection involves:
Surveying the involved equipment, vehicles, structures, etc., to ascertain whether there is any indication that component parts were missing or out of place before the accident;
Noting the absence of any parts of guards, controls, or operating indicators (instruments, position indicators, etc.) among the damaged or remaining parts at the scene;
Identifying as soon as possible any equipment or parts that must be cleaned prior to examination or testing and transferring them to a laboratory or to the care of an expert experienced in appropriate testing methodologies;
Noting the routing or movements of records that can later be traced to find missing components;
Preparing a checklist of complex equipment components to help ensure a thorough survey.
These observations should be recorded in notes and photographs so that investigators avoid relying on their memories. Some investigators find a small voice recorder useful in recording general descriptions of appearance and damage. However, the potential failure of a recorder, inadvertent file erasure, and limitations of verbal description suggest that verbal recorded descriptions should be used in combination with notes, sketches, and photographs.
2.5.2.5 Remove Physical Evidence
Following the initial inspection of the scene, investigators may need to remove items of physical evidence. To ensure the integrity of evidence for later examination, the extraction of parts must be controlled and methodical. The process may involve simply picking up components or pieces of damaged equipment, removing bolts and fittings, cutting through major structures, or even recovering evidence from beneath piles of debris. Before evidence is removed from the accident
scene, it should be carefully packaged and clearly identified. The readiness team or a pre-assembled investigator’s kit can provide general purpose cardboard tags or adhesive labels for this purpose.
Equipment or parts thought to be defective, damaged, or improperly assembled should be removed from the accident scene for technical examination. The removal should be documented using position maps and photos to display the part in its final, post-accident position and condition. If improper assembly is suspected, investigators should direct that the part or equipment be photographed and otherwise documented as each subassembly is removed.
Items that have been fractured or otherwise damaged should be packaged carefully to preserve surface detail. Delicate parts should be padded and boxed. Both the part and the outside of the package should be labeled. Greasy or dirty parts can be wrapped in foil and placed in polyethylene bags or other nonabsorbent materials for transport to a testing laboratory, command center, or evidence storage facility. If uncertainties arise, subject matter experts can advise the Board regarding effective methods for preserving and packaging evidence and specimens that must be transported for testing.
When preparing to remove physical evidence, these guidelines should be followed:
Normally, extraction should not start until witnesses have been interviewed, since visual reference to the accident site can stimulate one’s memory.
Extraction and removal or movement of parts should not be started until position records (measurements for maps, photographs and video) have been made.
Be aware that the accident site may be unsafe due to dangerous materials or weakened structures.
Locations of removed parts can be marked with orange spray paint or wire-staffed marking flags; the marking flags can be annotated to identify the part removed and to allow later measurement.
Care during extraction and preliminary examination is necessary to avoid defacing or distorting impact marks and fracture surfaces.
The team lead and investigators should concur when the parts extraction work can begin, in order to assure that team members have completed all observations requiring an intact accident site.
2.5.3 Collect and Catalog Documentary Evidence
Documentary evidence can provide important data (i.e., proof of “work-as-done”) and should be preserved and secured as methodically as physical evidence. This information might be in the form of documents, photos, video, or other electronic media, either at the site or in files at other locations (this information should not be confused with procedures and such).
Some work/process/system records are retained only for the workday or the week. Once an event has occurred, the team must work quickly to collect and preserve these records so they can be examined and considered in the analysis.
Investigation preplanning should include procedures for identifying records to be collected, as well as the people responsible for their collection. Because records are usually not located at the scene of the incident, they are often overlooked in the preliminary collection of evidence.
Documents often provide important evidence of “work-as-done” for identifying causal factors of an event. This evidence is useful for:
Indicating the attitudes and actions of people involved in the accident; and
Revealing evidence that generally is not established in verbal testimony.
Documentary evidence to determine “work-as-done” generally can be grouped into three categories:
Records that indicate past and present performance and status of the work activities, as well as the people, equipment, and materials involved (examples include log books, security access logs, calls to the operations center, etc.);
Reports that identify the content and results of special studies, analyses, audits, appraisals, inspections, inquiries, and investigations related to work activities (examples include occurrence reports, metrics, management and self assessments, etc.);
Follow-on documentation that describes actions taken in response to the other types of documentation (examples include corrective action tracking results, lessons learned, etc.).
Collectively, this evidence gives important clues to possible underlying causes of errors, malfunctions, and failures that led to the accident.
2.5.4 Electronic Files to Organize Evidence and Facilitate the Investigation
To organize the documentary evidence collected and to make it readily accessible to the investigation team, it is strongly recommended that electronic files be set up and populated as evidence becomes available. Examples of evidence to be collected could consist of:
Work orders, logbooks, training records (certifications/qualifications), forms, time sheets
Problem evaluation reports
Occurrence reports
Nonconformance reports
Closeout of Corrective Actions from similar events
Process metrics
Previous lessons learned
External reviews or assessments
Internal assessments (management and self assessments)
The team’s lead or the person in charge of collecting the data should organize all information in shared electronic files in pre-established folders as shown in Figure 2-2.
Investigation Electronic File Structure
Folders:
Assessments
Timeline
DOE’s Operational Experience – Lessons Learned
Report – Draft & Final
Extraneous Conditions Adverse to Quality
ORPS Reports
Performance Evaluation Requests – Action Tracking
Photographs
Procedures
Statements and Interviews
Training – Qualification
Barrier Analysis
Human Performance Error Precursors
Missed Opportunities
Causal Factors Charts
Extent of Conditions and Causes
JON – Corrective Actions
Lessons to be Learned
Evidence Files (log books, training, etc.)
Tasking Letter
Deep Organizational Issues (culture, etc.)
Housekeeping file for team members
Notes:
This file structure has been pre-established and populated with the applicable forms and matrices to facilitate data collection and compilation.
The applicable evidence should be collected and placed into the appropriate folder so the entire team has access to all information electronically.
Upon conclusion of the investigation, the electronic file will become part of the investigation record.
Additional folders can be added to adapt to the team and the investigation.
Figure 2-2: Example of Electronic File Records To Keep for the Investigation
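A pre-established folder structure such as the one in Figure 2-2 can be created with a short script. The sketch below is an illustration only: the function name `create_investigation_folders` is hypothetical, the folder names are simplified ASCII versions of the labels in Figure 2-2, and the split of the figure's labels into individual folders is an interpretation of the figure.

```python
import tempfile
from pathlib import Path

# Folder names adapted from Figure 2-2 (simplified; the exact split is an assumption).
FOLDERS = [
    "Assessments",
    "Timeline",
    "DOE Operational Experience - Lessons Learned",
    "Report - Draft and Final",
    "Extraneous Conditions Adverse to Quality",
    "ORPS Reports",
    "Performance Evaluation Requests - Action Tracking",
    "Photographs",
    "Procedures",
    "Statements and Interviews",
    "Training - Qualification",
    "Barrier Analysis",
    "Human Performance Error Precursors",
    "Missed Opportunities",
    "Causal Factors Charts",
    "Extent of Conditions and Causes",
    "JON - Corrective Actions",
    "Lessons to be Learned",
    "Evidence Files",
    "Tasking Letter",
    "Deep Organizational Issues",
    "Housekeeping",
]


def create_investigation_folders(root, extra=()):
    """Create the shared folders under `root`; `extra` adds team-specific folders."""
    root = Path(root)
    created = []
    for name in list(FOLDERS) + list(extra):
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


# Demonstration in a temporary directory; a real team would point this at its shared drive.
with tempfile.TemporaryDirectory() as tmp:
    made = create_investigation_folders(tmp, extra=["Interview Schedules"])
    print(len(made), "folders created")
```

The `extra` parameter reflects the note that additional folders can be added to adapt to the team and the investigation.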
2.5.5 Collecting Human Evidence
Human evidence is often the most insightful and also the most fragile. Witness recollection declines rapidly in the first 24 hours following an accident or traumatic event. Therefore, witnesses should be located and interviewed immediately and with high priority. As physical and documentary evidence is gathered and analyzed throughout the investigation, this new information will often prompt additional lines of questioning and the need for follow-up interviews or for interviews with persons not previously interviewed.
2.5.6 Locating Witnesses
Principal witnesses and eyewitnesses are identified and interviewed as soon as possible. Principal witnesses are persons who were actually involved in the accident; eyewitnesses are persons who directly observed the accident or the conditions immediately preceding or following the accident. General witnesses are those with knowledge about the activities prior to or immediately after the accident (the previous shift supervisor or work controller, for example). One responsibility of the DOE site and other initial responders is to identify witnesses, record initial statements, and provide this information to the investigation board upon its arrival. Prompt arrival by Board members and expeditious interviewing of witnesses help ensure that witness statements are as accurate, detailed, and authentic as possible.
Table 2-7 lists sources that investigators can use to locate witnesses.
Table 2-7: These Sources are Useful for Locating Witnesses
Site emergency response personnel can name the person who provided notification of the incident and those present on their arrival, as well as the most complete list available of witnesses and all involved parties.
Principal witnesses and eyewitnesses are the most intimately involved in the accident and may be able to help develop a list of others directly or indirectly involved in the accident.
First-line supervisors are often the first to arrive at an accident scene and may be able to recall precisely who was present at that time or immediately before the accident. Supervisors can also provide the names and phone numbers of safety representatives, facility designers, and others who may have pertinent information.
Local or state police, firefighters, or paramedics, if applicable.
Nurses or doctors at the site first aid center or medical care facility (if applicable).
Staff in nearby facilities (those who may have initially responded to the accident scene; staff at local medical facilities).
News media may have access to witness information and photographs or videos of the post-accident scene.
Maintenance and security personnel may have passed through the facility soon before or just after the accident.
2.5.7 Conducting Interviews
Witness testimony is an important element in determining facts that reveal causal factors. It is best to interview principal witnesses and eyewitnesses first, because they often provide the most useful details regarding what happened. If not questioned promptly, they may forget important details. Witnesses must be afforded the opportunity to have organized labor or legal representation with them, if they wish.
2.5.7.1 Preparing for Interviews
Much of the investigation’s fact-finding occurs in interviews. Therefore, to elicit the most useful information possible from interviewees, interviewers must be well prepared and have clear objectives for each interview. Interviews can be conducted after the board has established the topical areas to be covered in the interviews and after the board chairperson has reviewed with the board the objectives of the interviews and strategies for obtaining useful information.
People’s memories, as well as their willingness to assist an investigative Board, can be affected by the way they are questioned. Based on the availability of witnesses, Board members’ time, and the nature and complexity of the accident, the Board chairperson and members must determine who to interview, in what order, and what interviewing techniques to employ. The site’s point of contact for the Board is responsible for scheduling the selected witnesses, accommodating work shift schedules as necessary and union or legal representation accompanying the witness when requested. Some preparation methods that previous Accident Investigation Boards have found successful are described below.
Decide on the Interview Recording Method. Team note taking using an interviewer and a note taker is the most efficient and expedient method. A formal transcription is not required, but if a more thorough record is desired, a court reporter can be used. If court reporters are used for multiple witnesses, it may be necessary to have multiple court reporters “tag team” to meet the 48-hour maximum turnaround on delivery of the transcripts to the team. Electronic recording is discouraged because of delays in getting recordings transcribed and the complications of archiving the electronic record. Interview notes and transcripts should be reviewed by the witness for accuracy.
Transcripts - The written transcripts from the court reporter should be obtained as soon as possible after they are taken, considering the cost involved. Each witness should be given a reasonable amount of time to review their transcript for factual accuracy. A record of the accuracy review is made on a Transcript Review Statement form and tracked on the Transcript Receipt & Review Tracking table (examples provided in Appendix D). Any witness interviewed is afforded the opportunity to review any statements for accuracy and may request a copy of the transcript at the conclusion of the investigation. An example half-page Transcript Request form is provided in Appendix D.
Identify all interviewees using the Accident Investigation Preliminary Interview List (provided in Appendix D). Record each witness’ name, job title, reason for interview, phone, work schedule, and company affiliation; take a brief statement of his or her involvement in the accident.
Schedule an interview with each witness using the Accident Investigation Interview Schedule Form (provided in Appendix D). Designate one person, such as the administrative coordinator, to oversee this process.
Assign a lead interviewer from the board for each interviewee. Having a lead interviewer can help establish consistency in depth and focus of interviews.
Develop sketches and diagrams to pinpoint locations of witnesses, equipment, etc., based on the initial walk-through and DOE site team input.
Develop a standardized set of interview questions. Charts may be used to assist in developing questions. The AIB should develop a list of questions for each witness prior to the interview, based on the objectives for that interview. The Accident Investigation Witness Statement Form, the Accident Investigation Interview Form, or the Informal Personal or Telephone Interview Form (provided in Appendix D.2 - Forms for Witness Statements and Interviews) can aid in recording pertinent data.
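The transcript handling described in the preparation methods above (a 48-hour maximum turnaround on delivery, followed by a witness accuracy review) lends itself to simple tracking. The following is a minimal sketch, not the Transcript Receipt & Review Tracking table from Appendix D; the class `TranscriptRecord`, its fields, the status labels, and the sample data are hypothetical, and starting the 48-hour clock at the time of the interview is an assumption the text does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# 48 hours comes from the text; counting from the interview time is an assumption.
TURNAROUND = timedelta(hours=48)


@dataclass
class TranscriptRecord:
    witness: str
    interviewed_at: datetime
    received_at: Optional[datetime] = None   # transcript delivered to the team
    reviewed_at: Optional[datetime] = None   # witness completed the accuracy review

    @property
    def due_at(self) -> datetime:
        """Latest acceptable delivery time for the transcript."""
        return self.interviewed_at + TURNAROUND

    def status(self, now: datetime) -> str:
        """Return a hypothetical status label for tracking purposes."""
        if self.reviewed_at is not None:
            return "reviewed"
        if self.received_at is not None:
            return "awaiting witness review"
        return "overdue" if now > self.due_at else "awaiting transcript"


# Illustrative, made-up records.
records = [
    TranscriptRecord("Witness A", datetime(2012, 7, 10, 10, 0),
                     datetime(2012, 7, 11, 15, 0), datetime(2012, 7, 12, 9, 0)),
    TranscriptRecord("Witness B", datetime(2012, 7, 10, 13, 0),
                     datetime(2012, 7, 12, 8, 0)),
    TranscriptRecord("Witness C", datetime(2012, 7, 10, 14, 0)),
    TranscriptRecord("Witness D", datetime(2012, 7, 11, 9, 0)),
]
now = datetime(2012, 7, 12, 16, 0)
for record in records:
    print(f"{record.witness}: {record.status(now)} (due {record.due_at:%Y-%m-%d %H:%M})")
```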
2.5.7.2 Advantages and Disadvantages of Individual vs. Group Interviews
Depending on the specific circumstances and schedule of an accident investigation, investigators may choose to hold either individual or group interviews. Generally, principal witnesses and eyewitnesses are interviewed individually to gain independent accounts of the event.
However, a group interview may be beneficial in situations where a work crew was either involved in or witness to the accident. Moreover, time may not permit interviewing every witness individually, and the potential for gaining new information from every witness may be small.
Sometimes, group interviews can corroborate testimony given by an individual but not provide additional details. The Board should use its collective judgment to determine which technique is appropriate. Advantages and disadvantages of both techniques are listed in Table 2-8. These considerations should be weighed against the circumstances of the accident when determining which technique to use.
Table 2-8: Group and Individual Interviews have Different Advantages
Individual Interviews
Advantages:
Obtain independent stories
Obtain individual perceptions
Establish one-to-one rapport
Disadvantages:
More time-consuming
May be more difficult to schedule all witnesses
Group Interviews
Advantages:
More time-efficient
All interviewees supplement story; may get more complete picture
Other people serve as “memory joggers”
Disadvantages:
Interviewees will not have independent stories
More vocal members of the group will say more and thus may influence those who are quieter
“Group think” may develop; some individual details may get lost
Contradictions in accounts may not be revealed
2.5.7.3 Interviewing Skills
It is important to create a comfortable atmosphere in which interviewees are not rushed to recall their observations. Interviewees should be told that they are a part of the investigation effort and that their input will be used to prevent future accidents and not to assign blame.
Before and after questioning, interviewees should be notified that follow-up interviews are a normal part of the investigation process and that further interviews do not mean that their initial statements are suspect. Also, they should be encouraged to contact the Board whenever they can provide additional information or have any concerns. Keys to a good start are:
Identify witnesses as quickly as possible to obtain witness statements. Sources for locating witnesses include DOE site and emergency response personnel, principal witnesses, eyewitnesses, first line supervisors, police, firefighters, paramedics, nurses or doctors, news media, and maintenance and security personnel.
Promoting effective interviews includes careful preparation, creating a relaxed atmosphere, preparing the witness for the interview, recording the interview (preferably by using a court reporter to document the interview), asking open ended questions, and evaluating the witness’s state of mind.
While witnesses describe the accident, the investigator should not rush witnesses; should not be judgmental, hostile, or argumentative; should not display anger, suggest answers, threaten, intimidate, or blame the witness; should not make promises of confidentiality or use inflammatory words; and should not ask questions that suggest an answer or omit questions because the investigator presumes to know the answer.
While not making promises of confidentiality, the interviewer can inform the witness that the testimony is not released to site management and that the witness’ name is not included in the report.
Managers and supervisors are discouraged from attending witness interviews to avoid potential intimidation issues. However, it should be made clear during the scheduling stage that the witness is allowed to invite union or legal representatives to the interview.
Before each interview, interviewees should be apprised of FOIA and Privacy Act concerns as they pertain to their statements and identity. A Model Interview Opening Statement that addresses FOIA and Privacy Act provisions can be found in Appendix D. Interviewees should be aware that information provided during the investigation may not be precluded from release under FOIA or the Privacy Act. This model opening statement also addresses the caution against false statements and Appendix D includes a brief explanation in a Reference Copy of 18 USC Sec. 1001 for Information.
If any questions arise concerning the disclosure of accident investigation records or the applicability of the FOIA or the Privacy Act, guidance should be obtained from the FOIA/Privacy Act attorney at either Headquarters or the field. Most DOE sites have FOIA/Privacy Act specialists who can be consulted for further guidance.
Following the guidelines listed in Table 2-9 will help ensure that witness statements are provided freely and accurately, improving the quality and validity of the information obtained.
Table 2-9: Guidelines for Conducting Witness Interviews
Create a Relaxed Atmosphere
Conduct the interview in a neutral location that was not associated with the accident.
Introduce yourself and shake hands.
Be polite, patient, and friendly.
Treat witnesses with respect.
Prepare the Witness
Describe the investigation’s purpose: to prevent accidents, not to assign blame.
Explain that witnesses may be interviewed more than once.
Use the Model Opening Statement to address FOIA and Privacy Act concerns.
Use the Model Opening Statement to caution against false testimony and explain 18 U.S. Code 1001 concerns.
Stress how important the facts given during interviews are to the overall investigative process.
Record Information
Rely on a court reporter to provide a detailed record of the interview.
Note crucial information immediately in order to ask meaningful follow‐up questions.
Ask Questions
Establish a line of questioning and stay on track during the interview.
Ask the witness to describe the accident in full before asking a structured set of questions.
Let witnesses tell things in their own way; start the interview with a statement such as "Would you please tell me about...?"
Ask several witnesses similar questions to corroborate facts.
Aid the interviewee with reference points; e.g., "How did the lighting compare to the lighting in this room?"
Keep an open mind; ask questions that explore what has already been stated by others in addition to probing for missing information.
Use visual aids, such as photos, drawings, maps, and graphs to assist witnesses.
Be an active listener, and give the witness feedback; restate and rephrase key points.
Ask open‐ended questions that generally require more than a "yes" or "no" answer.
Observe and note how replies are conveyed (voice inflections, gestures, expressions, etc.).
Close the Interview
End on a positive note; thank the witness for his/her time and effort.
Allow the witness to read the interview transcript and comment if necessary.
Encourage the witness to contact the board with additional information or concerns.
Remind the witness that a follow‐up interview may be conducted.
2.5.7.4 Evaluating the Witness’s State of Mind
Occasionally, a witness's state of mind may affect the accuracy or validity of testimony provided. In conducting witness interviews, investigators should consider:
The amount of time between the accident and the interview. People normally forget 50 to 80 percent of the details in just 24 hours.
Contact between this witness and others who may have influenced how this witness recalls the events.
Signs of stress, shock, amnesia, or other trauma resulting from the accident. Details of unpleasant experiences are frequently blanked from one’s memory.
Investigators should note whether an interviewee displays any apparent mental or physical distress or unusual behavior; it may have a bearing on the interview results. These observations can be discussed and their impact assessed with other members of the Board.
Uncooperative witness. A witness who refuses to testify cannot be forced to do so. Emphasize that testimony is voluntary. Reemphasize that the purpose of the investigation is not to find fault with the individual but to uncover weaknesses in processes and systems. Offer to reschedule the interview if the witness is uncomfortable with anything, such as the time, location, or lack of representation. Ask whether the witness is willing to explain the reason for refusing to testify. Offer the witness contact information in case they change their mind. Then close the interview, noting possible state-of-mind issues.
2.6 Analyze Accident to Determine “Why” It Happened
2.6.1 Fundamentals of Analysis
Careful and complete analysis of the evidence and data collected following an accident is critical to the accurate determination of an accident’s causal factors. The results of comprehensive analyses provide the basis for corrective and preventive measures.
The analysis portion of the accident investigation is not a single, distinct part of the investigation. Instead, it is the central part of an iterative process that includes collecting facts, determining causal factors, and, most importantly, re-evaluating and updating the events and causal factors chart and the analysis the team creates.
Well chosen and carefully performed analytical methods are important for providing results that can aid investigators in developing an investigation report that has sound Judgments of Need.
Caution must be taken in applying analytic methods. First, no single method will provide all the analyses required to completely determine the multiple causal factors of an accident. Several techniques that can complement and cross-validate one another should be used to yield optimal results. Second, analytic techniques cannot be used mechanically and without thought. The best analytic tools can become cumbersome and ineffective if they are not applied to an accident’s specific circumstances and adapted accordingly.
Each AIB should use the core analytical techniques described in this Handbook and then determine which additional analytic techniques are appropriate, based on the accident’s complexity and severity. Alternative approaches and methods to those presented in this Handbook are acceptable, provided that they meet the requirements of DOE O 225.1B and are demonstrably equivalent.
Determining why an accident happened rests on the search for cause, but the AIB must be judicious in identifying causes. Identifying an inappropriate or incorrect cause can harm the organization by wasting resources on the wrong corrective actions, needlessly damaging its reputation, or leaving the actual causes unaddressed.
The causal analysis methodologies used in accident investigation are rigorous and logical, and they help in understanding the accident. The problem is that causality, a cause-effect relationship, can easily be constructed where it does not really exist.
To understand how this happens, investigators need to take a hard look at the accident models and how accidents are investigated; particularly, how the cause and effect relationships are determined and the requirements for a true cause and effect relationship.
Understanding of these concepts can make the difference between a thorough, professional investigation report and one that could best be described as malpractice.
2.6.2 Core Analytical Tools - Determining Cause of the Accident or Event
DOE Accident Investigation Boards need to use, at minimum, five techniques to analyze the information they have collected, to identify conditions and events that occurred before and immediately following an accident, and to determine an accident’s causal factors.
This section of the Handbook describes and provides instructions for using the five core analytic tools:
Event and Causal Factors Charting and Analysis
Barrier Analysis
Change Analysis
Root Cause Analysis
Verification Analysis
[Figure 2-3: Analysis Process Overview. Elements shown in the figure: Events and Causal Factors Chart; Factual Analysis; Barrier Analysis; Change Analysis; Updated Events and Causal Factors Chart; Root Cause Analysis; Verification Analysis.]
Accident Investigation Terminology
A causal factor is an event or condition in the accident sequence that contributes to the unwanted result. There are three types of causal factors: direct cause(s), the immediate event(s) or condition(s) that caused the accident; root cause(s), the causal factor(s) that, if corrected, would prevent recurrence of the accident; and contributing causal factors, which collectively with the other causes increase the likelihood of an accident but which did not cause the accident.
Event and causal factors analysis includes charting, which depicts the logical sequence of events and conditions (causal factors that allowed the accident to occur), and the use of deductive reasoning to determine the events or conditions that contributed to the accident.
The direct cause of an accident is the immediate event(s) or condition(s) that caused the accident.
Root causes are the causal factors that, if corrected, would prevent recurrence of the same or similar accidents. Root causes may be derived from or encompass several contributing causes. They are higher-order, fundamental causal factors that address classes of deficiencies, rather than single problems or faults.
Contributing causes are events or conditions that collectively with other causes increased the likelihood of an accident but that individually did not cause the accident. Contributing causes may be longstanding conditions or a series of prior events that, alone, were not sufficient to cause the accident, but were necessary for it to occur. Contributing causes are the events and conditions that “set the stage” for the event and, if allowed to persist or re-occur, increase the probability of future events or accidents.
Barrier analysis reviews the hazards, the targets (people or objects) of the hazards, and the controls or barriers that management systems put in place to separate the hazards from the targets. Barriers may be physical or administrative.
Change analysis is a systematic approach that examines planned or unplanned changes in a system that caused the undesirable results related to the accident.
Human Performance analysis is a method used to identify organizational and human performance factors that, combined with human actions, can precipitate undesirable outcomes.
Error precursor analysis identifies the specific error precursors that were in existence at the time of or prior to the accident. Error precursors are unfavorable factors or conditions embedded in the job environment that increase the chances of error during the performance of a specific task by a particular individual or group of individuals. Error precursors create an error-likely situation that typically exists when the demands of the task exceed the capabilities of the individual or when work conditions aggravate the limitations of human nature.
2.6.3 The Backbone of the Investigation – Events and Causal Factors Charting
Events and Causal Factors (ECF) Charting has been a core analytic tool since its development at the SSDC in the 1970s. The basic ECF Charting approach has been expanded by DOE and incorporates HPI through the inclusion of decision points and the associated context of each decision. The AIB must develop a sound ECF chart to be able to perform an adequate analysis of the facts and to reach sound conclusions and Judgments of Need.
Traditionally, worker error is often seen as the cause of the accident, and the focus is on what people should have done to avoid it. Simply blaming the worker for making a decision that is judged to be wrong in hindsight does not, however, explain why the worker took the actions that they did and why those actions made perfect sense to them at the time. Workers come to work with the intention to do a good job, and the decisions they make, without the benefit of hindsight, must be viewed within the context of the situation at the time.
This is generally referred to as the worker’s mindset, which includes the goals that they are trying to accomplish, the knowledge and information available to them at the time, and the resultant focus of their decision. What can seem like an unacceptable shortcut, in hindsight, is often the result of the worker trying to respond to conflicting demands to be efficient, yet thorough at the same time.
ECF charting provides a systematic method to capture the worker mindset by the inclusion of decision points prior to worker actions in the event sequence. Linked to the decision is information on the worker’s motivation, goals, knowledge, and focus at the time of the decision.
The ECF chart is a graphically displayed flow chart of the event with the events and decisions plotted on a timeline. As the event timeline is established, the related conditions or information and worker knowledge or focus are linked to the events and decisions. Understanding why workers did what they did and why their decisions and actions made sense to them is an essential goal of the accident investigation.
Unless the context of the decisions is understood, actions to prevent similar events will focus on what are perceived as aberrant worker actions rather than the underlying factors that influenced the decisions. The underlying factors are what need to be identified and addressed to improve the system and prevent similar events in the future.
Event Charting was developed to focus on the decisions and actions that were taken during the event. Instead of just identifying the actions that were taken, ECF Charting requires that the decision to take the action be addressed and information developed about the context of the decisions.
Key to successful use of the causal factors tools introduced in this section is the systematic collection and review of the event facts as captured in the Events and Causal Factors Chart (ECF). The ECF is the workhorse in an event investigation because it provides a systematic tool to separate events in time to allow events that may be critical to determining appropriate causal factors to be seen and acted upon.
The information in the ECF is used to support each follow-on tool available to the investigation team. The ECF collects important information related to human performance challenges, missed opportunities, organizational culture attributes, and potential latent organizational weaknesses. By collecting this important information for each time sequence, biases that the team members may have as they enter the investigation process are removed or at least minimized, resulting in a much more objective investigation.
Armed with the information compiled in the ECF, AIBs have numerous causal analysis tools at their disposal to analyze the factual information they have collected, to identify conditions and events that occurred before and immediately following an accident, and to determine the causal factors.
The purpose of any analytic technique in an investigation is to answer the question “WHY” the event happened. That is, why did the organization allow itself to degrade to such a state that the event in question happened? It is the job of the team to apply the appropriate techniques to help them determine the causal factors of an event or accident.
Accidents rarely result from a single cause because, ideally, many independent systems and barriers were put in place to ensure the catastrophic event did not occur. If an incident occurred, it had to be the result of a breakdown in multiple systems. Events and causal factors charting is useful in identifying the multiple causes and graphically depicting the triggering conditions and events necessary and sufficient for an incident to occur.
Events and causal factors charting is a graphical display of the event and is used primarily for compiling and organizing evidence to portray the sequence of the events and their causal factors that led to the incident. The other analytical techniques (e.g., ECF, process mapping, barrier analysis, and change analysis) are used to inform the team and to support the development of the events and causal factors chart. After the major event facts are fully identified, analysis is performed to identify the causal factors.
Events and causal factors charting is widely used in major event investigations, because it is relatively easy to develop and provides a clear depiction of the information generated by the team. By carefully tracing the events and conditions that allowed the incident to occur, team members can pinpoint specific events and conditions that, if addressed through corrective actions, would prevent a recurrence. The benefits of this technique are highlighted in Table 2-10.
Table 2-10: Benefits of Events and Causal Factors Charting
The benefits of events and causal factors charting include:
Illustrating and validating the sequence of events leading to the accident and the conditions affecting these events
Showing the relationship of immediately relevant events and conditions to those that are associated but less apparent, portraying the relationships of organizations and individuals involved in the accident
Directing the progression of additional data collection and analysis by identifying information gaps
Linking facts and causal factors to organizational issues and management systems
Validating the results of other analytic techniques
Providing a structured method for collecting, organizing, and integrating collected evidence
Conveying the possibility of multiple causes
Providing an ongoing method of organizing and presenting data to facilitate communication among the investigators
Clearly presenting information regarding the accident that can be used to guide report writing
Providing an effective visual aid that summarizes key information regarding the accident and its causes in the investigation report
Two types of event and causal factors charts will be introduced in this guide:
Events and Causal Factors Analysis Chart (ECF) and the
Expanded Events and Causal Factors Analysis (E-ECF) chart, which is an enhanced application of the ECF and may be more applicable to the accident prevention focus of an Operational Safety Review Team in Volume II, Chapter 1 in looking in much greater depth at organizational weaknesses and human performance. The ECF process is described in detail in Section 2.6.3.2.
The team should choose which tool suits their needs.
To identify causal factors, team members must have a clear understanding of the relationships among the events and the conditions that allowed the accident to occur. Events and causal factors charting provides a graphical representation of these relationships, giving the team a mental model of the event from which it can determine the causal factors and make intelligent recommendations.
After developing the “initial” ECF, the investigators apply, at minimum, the core analytic techniques of:
Events and causal factors charting and analysis,
Barrier Analysis,
Change Analysis,
Root Cause Analysis, and
Verification Analysis.
2.6.3.1 ECF Charting Symbols
The symbols used are as follows:
Event
Condition
Accident
Context
Assumed Event
Assumed Condition
Causal Factor
Connection between events
Connection from a condition
Transfer
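For teams that keep the chart in software, the symbol list above could be modeled as a small set of element and connector types. The following Python sketch is illustrative only: the class and member names are assumptions, not terms defined by this handbook, and treating "Transfer" as a connector type is likewise an assumption. The graphical shapes themselves are not reproduced.

```python
from enum import Enum


class ECFElement(Enum):
    """Node types on an ECF chart (hypothetical names mirroring the symbol list)."""
    EVENT = "Event"
    CONDITION = "Condition"
    ACCIDENT = "Accident"
    CONTEXT = "Context"
    ASSUMED_EVENT = "Assumed Event"
    ASSUMED_CONDITION = "Assumed Condition"
    CAUSAL_FACTOR = "Causal Factor"


class ECFLink(Enum):
    """Connector types on an ECF chart (hypothetical names)."""
    EVENT_TO_EVENT = "Connection between events"
    FROM_CONDITION = "Connection from a condition"
    TRANSFER = "Transfer"


def is_assumed(element: ECFElement) -> bool:
    """Return True for elements that are assumed rather than supported by evidence."""
    return element in (ECFElement.ASSUMED_EVENT, ECFElement.ASSUMED_CONDITION)


print(is_assumed(ECFElement.ASSUMED_CONDITION))
print(is_assumed(ECFElement.EVENT))
```

Keeping assumed elements as distinct types makes it straightforward to list, at any point in the investigation, which parts of the chart still lack supporting evidence.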
2.6.3.2 Events and Causal Factors Charting Process Steps
For purposes of this handbook, events and causal factors charting and events and causal factors analysis (see Section 2.6.8) are considered one technique. They are addressed separately because they are conducted at different stages of the investigation.
This section presents the typical approach to developing the ECF Chart for an accident, where the events have already occurred. Figure 2-4 presents a modified form of ECF Chart, which is suggested for use when conducting an Operational Safety Review of events for the purpose of preventing accidents.
Events and causal factors charting is a graphical display of the accident’s chronology and is used primarily for compiling and organizing evidence to portray the sequence of the accident’s events. It is a continuous process performed throughout the investigation. Events and causal factors analysis is the application of analysis to determine causal factors by identifying significant events and conditions that led to the accident. As the results of other analytical techniques (e.g., barrier analysis and change analysis) are completed, they are incorporated into the events and causal factors chart. After the chart is fully developed, the analysis is performed to identify causal factors.
Events and causal factors charting is possibly the most widely used analytic technique in DOE accident investigations, because the events and causal factors chart is easy to develop and provides a clear depiction of the data. By carefully tracing the events and conditions that allowed the accident to occur, board members can pinpoint specific events and conditions that, if addressed through corrective actions, would prevent a recurrence. The benefits of this technique are highlighted in Table 2-10.
To identify causal factors, Board members must have a clear understanding of the relationships among the events and the conditions, both human performance and management systems, which allowed the accident to occur. Events and causal factors charting provides a graphical representation of these relationships.
Constructing the Chart
Constructing the events and causal factors chart should begin immediately. However, the initial chart will be only a skeleton of the final product. Many events and conditions will be discovered in a short amount of time, and therefore, the chart should be updated almost daily throughout the investigative data collection phase. Keeping the chart up-to-date helps ensure that the investigation proceeds smoothly, that gaps in information are identified, and that the investigators have a clear representation of accident chronology for use in evidence collection and witness interviewing.
Investigators and analysts can construct an events and causal factors chart using either a manual or computerized method. Accident Investigation Boards often use both techniques during the course of the investigation, developing the initial chart manually and then transferring the resulting data into computer programs.
The benefits of events and causal factors charting include:
Illustrating and validating the sequence of events leading to the accident and the conditions affecting these events.
Showing the relationship of immediately relevant events and conditions to those that are associated but less apparent, portraying the relationships of organizations and individuals involved in the accident.
Directing the progression of additional data collection and analysis by identifying information gaps.
Linking facts and causal factors to organizational issues and management systems.
Validating the results of other analytic techniques.
Providing a structured method for collecting, organizing, and integrating collected evidence.
Conveying the possibility of multiple causes.
Providing an ongoing method of organizing and presenting data to facilitate communication among the investigators.
Clearly presenting information regarding the accident that can be used to guide report writing.
Providing an effective visual aid that summarizes key information regarding the accident and its causes in the investigation report.
The process begins by chronologically constructing, from left to right, the primary chain of events that led to an accident. Secondary and miscellaneous events are then added to the events and causal factors chart, inserted where appropriate in a line above the primary sequence line. Conditions that affect either the primary or secondary events are then placed above or below these events. A sample summary events and causal factors chart (Figure 2-4) illustrates the basic format using data from the case study accident. This chart shows how data may become available during an accident investigation, and how a chart would first be constructed and subsequently updated and expanded. Guidelines for constructing the chart are shown in Table 2-10.
[Figure 2-4 chart content, recovered from the original graphic]
Timeline: INEEL CO2 events, 1971, 1982, 1997, and July 1998
Entries without a clock time (the year each belongs to is not recoverable from the extracted text):
CO2 LOTO used for PM tasks in Bldg. 648 in Feb. and May
CO2 hazard not identified in work planning
Procedures changed to require LOTO of CO2 systems
Pressure switches & alarm feedback loop deleted from design
New digital fire panel installed in Bldg. 648
Timed entries (July 28):
4:50 p.m. Pre-job briefing completed
5:44 p.m. Fire panel “impaired”
6:00 p.m. Preparation for electrical work begins
6:10 p.m. Removal of 4160v power in Bldg. 648
6:11 p.m. CO2 system discharge w/o alarm
Figure 2-4: Simplified Events and Causal Factors Chart for the July 1998 Idaho Fatality CO2 Release at the Test Reactor Area
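The construction rule described above (primary events ordered chronologically from left to right, with conditions attached to the events they affect) can be sketched in Python as follows. This is a minimal illustration, not a DOE tool: the `ChartEvent` class and `primary_chain` function are hypothetical names, the date of July 28, 1998 is assumed from the figure's date label and caption, and attaching the work planning condition to the 6:00 p.m. event is an assumption made only to show the mechanism.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ChartEvent:
    when: datetime
    text: str
    primary: bool = True  # secondary events are drawn on a line above the primary chain
    conditions: list = field(default_factory=list)  # conditions placed above or below


def primary_chain(events):
    """Order the primary events chronologically, i.e. left to right on the chart."""
    return sorted((e for e in events if e.primary), key=lambda e: e.when)


def at(hour, minute):
    # Assumed date: July 28, 1998 (from the figure's date label and caption).
    return datetime(1998, 7, 28, hour, minute)


# Entered in the order evidence might arrive, not in chronological order.
events = [
    ChartEvent(at(18, 11), "CO2 system discharge w/o alarm"),
    ChartEvent(at(16, 50), "Pre-job briefing completed"),
    ChartEvent(at(18, 10), "Removal of 4160v power in Bldg. 648"),
    ChartEvent(at(17, 44), "Fire panel impaired"),
    ChartEvent(
        at(18, 0),
        "Preparation for electrical work begins",
        # Linking this condition to this event is an illustrative assumption.
        conditions=["CO2 hazard not identified in work planning"],
    ),
]

for e in primary_chain(events):
    print(e.when.strftime("%H:%M"), e.text, e.conditions)
```

Because the chain is re-sorted each time it is requested, new events can be appended as they are discovered, which matches the near-daily updating the handbook recommends.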
Depending on the complexity of the accident, the chart may result in a very large, complex sequence of events covering several walls. For the purpose of inclusion in the investigation report and closeout briefings, the chart is generally summarized. Note that “assumed conditions” appear in the final chart. These are conditions the Board believes affected the accident sequence, but the effect could not be substantiated with evidence.
The following steps summarize the construction of the ECF Chart. In practice, this is an iterative process with constant changes and expansion of the chart as information, including context, becomes available during the investigation.
Sequence of Events and Actions
First, to initiate the ECF Chart, the investigators begin with a chronological sequence of events leading up to the accident, followed by the relevant events immediately after the accident, such as how the emergency response proceeded. The sequence of events and decisions forms the starting point for reconstructing the accident. The events include observations, actions, and changes in the process or system.
[Flowchart boxes: Event, Event, Action, Accident Description]
Figure 2-5: Sequence of Events and Actions Flowchart
Decisions before Actions
For each event, consider the decisions (before the actions) to start to establish the mindset of the worker. The goal is to set the framework for how the workers’ goals, knowledge, and focus unfolded in parallel with the situation evolving around them.
[Flowchart boxes: Event, Event, Decision, Action, Accident Description]
Figure 2-6: Decisions before Actions Flowchart
Conditions and Context of Human Performance and Safety Management Systems
For each event, determine the conditions that existed from the context of the human performance decisions, the actions by individuals, the safety management system, the work environment, and the physical conditions that existed at that specific point of time. This step is about reconstructing the world as it unfolded around the worker. The purpose is to:
Determine how work was actually being performed;
Determine what information was available to the worker and decisions that were made; and
Determine how work was expected to be performed, e.g., procedures, plans, permits.
Reconstruct how the process was changing and how information about the changes was presented to the workers. Use the Human Error Precursor Matrix (Table 2-11), the ISM Seven Guiding Principles (Table 2-13), and the ISM Five Core Functions (Table 1-5) to help identify the context description involved. A more detailed discussion and list of Human Error Precursors will be found in Table 2-11.
[Flowchart boxes: Event, Event, Decision, Action, Accident Description, with Condition boxes and a Context (ISM/HPI) box linked to the sequence]
Figure 2-7: Conditions and Context of Human Performance and Safety Management Systems Flowchart
Context of Decisions
Next, determine the context in which workers formulated the decisions that led to their actions at that point in the event. Decisions are not made in a vacuum. They are the result of the factors that are influencing the worker at that point in time.
People have goals. Completion of the task is obvious, but there are other, often conflicting, goals present. These can include, but are not limited to:
Economic considerations, such as safety versus schedule
Subtle coercions (what boss wants, not what s/he says)
Response to previous situations (successes OR failures)
People have knowledge, but the application and availability of knowledge is not straightforward. Was it accurate, complete and available?
Goals & knowledge together determine their focus because:
Workers cannot know and see everything all the time.
What people are trying to accomplish and what they know drives where they direct their attention.
Re-constructing their focus of attention will help the investigation to understand the gap between available information and what they saw or used.
Figure 2-8: Context of Decisions Flowchart
ECF charting provides a graphical display of the event and guides the logic flow in trying to understand the event. The output, however, is not the chart, but the explanation of the event that results from the construction of the chart. In particular, it provides an explanation of what the workers did and why they did it. The explanation should address factors such as:
What was happening with the process?
What were the workers trying to accomplish and why?
What did they know at the time?
Where was their attention focused and why?
Why did what they did make sense to them at the time?
Table 2-11: Common Human Error Precursor Matrix
TASK DEMANDS (TD)
TD #1 Time pressure (in a hurry)
Urgency or excessive pace required to perform action or task
Manifested by shortcuts, being in a hurry, and an unwillingness to accept additional work or to help others
No spare time
TD #2 High workload (high memory requirements)
Mental demands on individual to maintain high levels of concentration; for example, scanning, interpreting, deciding, while requiring recall of excessive amounts of information (either from training or earlier in the task)
TD #3 Simultaneous, multiple tasks
Performance of two or more activities, either mentally or physically, that may result in divided attention, mental overload, or reduced vigilance on one or the other task
TD #4 Repetitive actions / Monotony
Inadequate level of mental activity resulting from performance of repeated actions; boring
Insufficient information exchange at the job site to help individual reach and maintain an acceptable level of alertness
TD #5 Irrecoverable acts
Action that, once taken, cannot be recovered without some significant delay
No obvious means of reversing an action
TD #6 Interpretation requirements
Situations requiring “in‐field” diagnosis, potentially leading to misunderstanding or application of wrong rule or procedure
TD #7 Unclear goals, roles, and responsibilities
Unclear work objectives or expectations
Uncertainty about the duties an individual is responsible for in a task that involves other individuals
Duties that are incompatible with other individuals
TD #8 Lack of or unclear standards
Ambiguity or misunderstanding about acceptable behaviors or results; if unspecified, standards default to those of the front‐line worker (good or bad)
WORK ENVIRONMENT (WE)
WE #1 Distractions / Interruptions
Conditions of either the task or work environment requiring the individual to stop and restart a task sequence, diverting attention to and from the task at hand
WE #2 Changes / Departure from routine
Departure from a well‐established routine
Unfamiliar or unforeseen task or job site conditions that potentially disturb an individual's understanding of a task or equipment status
WE #3 Confusing displays / control
Characteristics of installed displays and controls that could possibly confuse or exceed working memory capability of an individual
Examples:
missing or vague content (insufficient or irrelevant)
lack of indication of specific process parameter
illogical organization and/or layout
insufficient identification of displayed process information
controls placed close together without obvious ways to discriminate
conflicts between indications
WE #4 Work‐arounds / Out‐of‐Service instrumentation
Uncorrected equipment deficiency or programmatic defect requiring compensatory or non‐standard action to comply with a requirement; long‐term materiel condition problems that place a burden on the individual
WE #5 Hidden system response
System response invisible to individual after manipulation
Lack of information conveyed to individual that previous action had any influence on the equipment or system
WE #6 Unexpected equipment condition
System or equipment status not normally encountered creating an unfamiliar situation for the individual
WE #7 Lack of alternative indication
Inability to compare or confirm information about system or equipment state because of the absence of instrumentation
WE #8 Personality conflict
Incompatibility between two or more individuals working together on a task causing a distraction from the task because of preoccupation with personal differences
INDIVIDUAL CAPABILITIES (IC)
IC #1 Unfamiliarity with task / First time
Unawareness of task expectations or performance standards
First time to perform a task (not performed previously; a significant procedure change)
IC #2 Lack of knowledge (mental model)
Unawareness of factual information necessary for successful completion of task; lack of practical knowledge about the performance of a task
IC #4 New technique not used before
Lack of knowledge or skill with a specific work method required to perform a task
IC #5 Imprecise communication habits
Communication habits or means that do not enhance accurate understanding by all members involved in an exchange of information
IC #6 Lack of proficiency / Inexperience
Degradation of knowledge or skill with a task because of infrequent performance of the activity
IC #7 Indistinct problem‐solving skills
Unsystematic response to unfamiliar situations; inability to develop strategies to resolve problem scenarios without excessive use of trial‐and‐error or reliance on previously successful solutions
Unable to cope with changing facility conditions
IC #8 “Unsafe” attitude for critical tasks
Personal belief in prevailing importance of accomplishing the task (production) without consciously considering associated hazards
Perception of invulnerability while performing a particular task
Pride; heroic; fatalistic; summit fever; Pollyanna; bald tire
IC #9 Illness / Fatigue
Degradation of a person's physical or mental abilities caused by a sickness, disease, or debilitating injury
Lack of adequate physical rest to support acceptable mental alertness and function
HUMAN NATURE (HN)
HN #1 Stress
Mind's response to the perception of a threat to one's health, safety, self‐esteem, or livelihood if task is not performed to standard
Responses may involve anxiety, degradation in attention, reduction in working memory, poor decision‐making, transition from accurate to fast
Degree of stress reaction dependent on individual's experience with task
HN #2 Habit patterns
Ingrained or automated pattern of actions attributable to repetitive nature of a well‐practiced task
Inclination formed for particular train/unit because of similarity to past situations or recent work experience
HN #3 Assumptions
Suppositions made without verification of facts, usually based on perception of recent experience; provoked by inaccurate mental model
Believed to be fact
Stimulated by inability of human mind to perceive all facts pertinent to a decision
HN #4 Complacency / Overconfidence
A “Pollyanna” effect leading to a presumption that all is well in the world and that everything is ordered as expected
Self‐satisfaction or overconfidence, with a situation unaware of actual hazards or dangers; particularly evident after 7‐9 years on the job
Underestimating the difficulty or complexity of a task based upon past experiences
HN #5 Mindset
Tendency to “see” only what the mind is tuned to see (intention); preconceived idea
Information that does not fit a mind‐set may not be noticed and vice versa; may miss information that is not expected or may see something that is not really there; contributes to difficulty in detecting one's own error(s)
HN #6 Inaccurate risk perception
Personal appraisal of hazards and uncertainty based on either incomplete information or assumptions
Unrecognized or inaccurate understanding of a potential consequence or danger
Degree of risk‐taking behavior based on individual’s perception of possibility of error and understanding of consequences; more prevalent in males
HN #7 Mental shortcuts (biases)
Tendency to look for or see patterns in unfamiliar situations; application of thumb rules or “habits of mind” (heuristics) to explain unfamiliar situations:
confirmation bias
frequency bias
similarity bias
availability bias
HN #8 Limited short‐term memory
Forgetfulness; inability to accurately attend to more than 2 or 3 channels of information (or 5 to 9 bits of data) simultaneously
The mind’s “workbench” for problem‐solving and decision‐making; the temporary, attention‐ demanding storeroom we use to remember new information
[Pyszczynski, pp. 117 – 142, 2002]22
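When conditions on the ECF chart are tagged with precursor codes from Table 2-11, a simple lookup keeps the tagging consistent across the team. The Python sketch below shows one possible encoding. The code format (for example "TD1" for TD #1), the dictionary names, and the helper functions are assumptions for illustration, and only a few table entries are included.

```python
# Category prefixes as used in Table 2-11.
PRECURSOR_CATEGORIES = {
    "TD": "Task Demands",
    "WE": "Work Environment",
    "IC": "Individual Capabilities",
    "HN": "Human Nature",
}

# A small subset of Table 2-11 entries; the remaining rows follow the same pattern.
PRECURSORS = {
    "TD1": "Time pressure (in a hurry)",
    "TD3": "Simultaneous, multiple tasks",
    "WE2": "Changes / Departure from routine",
    "IC1": "Unfamiliarity with task / First time",
    "HN3": "Assumptions",
    "HN4": "Complacency / Overconfidence",
}


def describe(code):
    """Return 'Category: precursor name' for a code such as 'TD1'."""
    return f"{PRECURSOR_CATEGORIES[code[:2]]}: {PRECURSORS[code]}"


def group_by_category(codes):
    """Group precursor names under their category, preserving input order."""
    grouped = {}
    for code in codes:
        grouped.setdefault(PRECURSOR_CATEGORIES[code[:2]], []).append(PRECURSORS[code])
    return grouped


print(describe("TD1"))
print(group_by_category(["TD1", "HN3", "TD3"]))
```

Grouping the tagged precursors by category gives a quick view of whether the error-likely situation was driven mainly by the task, the environment, the individual, or human nature.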
2.6.3.3 Events and Causal Factors Chart Example
The Event
An electrician (E1), working within the basement of the facility was manipulating a stuck trip latch on a spring loaded secondary main air breaker. In order to gain access to the stuck trip latch, E1 decided to partially charge (compress) the large coil closing spring using the manual closing handle and reach into the breaker with his left hand from underneath. As he knelt in front of the breaker, his knee gave out, causing him to lose balance and strike the closing handle with his right hand. This caused the charged closing spring to release and slam the breaker closed, severing the tip of his left middle finger.
Background
The work involved a planned electrical outage for the facility in order to conduct preventive maintenance (PM) on the primary transformer. In order to perform the PM without impact to the facility and its tenants, the work and an outage were scheduled during the weekend. The resident electricians (E1 and E2) were supporting the PM activities by opening and closing seven secondary main breakers as well as several other load breakers. The electricians’ work was authorized by an Integrated Work Document (IWD). Per the IWD, their work scope was defined as “assisting the FC in the shutdown and start-up of equipment and to verify proper function.”
Air Breakers
The secondary main air breakers were General Electric Type AK-2-75. These breakers are rated for 600 volts and were installed during the construction of the facility in the 1950s and 1960s.
The normal process for closing air breakers is to close the breaker electronically. In this instance, the breaker is closed by turning the knob as shown in Figure 2-9. The breakers do not need to be racked out when closing electronically.
If the breaker does not close electronically, then it is closed manually. The breaker must be racked out and charged using the manual closing handle as shown in Figure 2-9. The air breakers are equipped with a coiled spring that drives the contacts closed. The closing springs are charged by operating the manual closing handle on the front of the breaker. The breaker releases during the 4th cycle of the closing handle.
Figure 2-9: Racked Out Air Breaker
Figure 2-10 shows an excerpt from the ECF Chart that addresses the electrician’s decision to reach into the breaker:
An explanation of this event might read:
E1 determined that the breakers would not close due to a stuck trip latch based on his previous experience as the foreman of the breaker maintenance crew and having encountered this problem before, including assisting with the repair of the trip latch on one of the other breakers at the facility two months prior. He also knew that these breakers had not been serviced in over 6 years.
E1 and E2 decided to repair the breakers in place rather than send the breakers back to the shop for maintenance. They knew that it could take up to a week to get the breakers serviced and the facility would not be able to reopen the following morning. E1 was motivated to complete the work so that the nuclear facility could reopen on schedule and based on his past experience, he felt he would be “rewarded” for restoring power and that there would be ramifications if the work was not completed by the end of the day.
E1 then decided to manipulate the trip latch based on his belief that the latch was stuck due to lack of maintenance that allowed the lubricant to congeal. He had done this before and had learned it from other electricians.
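The explanation above links a decision to the worker's goals, knowledge, and focus. One way to capture that linkage in a structured record is sketched below in Python. The `DecisionPoint` class and its field names are hypothetical, and the field values are paraphrased from the example narrative for illustration; the wording of the "focus" entry in particular is an interpretive summary, not a quotation.

```python
from dataclasses import dataclass


@dataclass
class DecisionPoint:
    worker: str
    decision: str
    goals: list
    knowledge: list
    focus: str

    def explain(self) -> str:
        """Render the decision and its context as a short text explanation."""
        lines = [
            f"{self.worker} decided to {self.decision}.",
            "Goals: " + "; ".join(self.goals),
            "Knowledge: " + "; ".join(self.knowledge),
            "Focus: " + self.focus,
        ]
        return "\n".join(lines)


# Values paraphrased from the example narrative (illustrative only).
e1_latch = DecisionPoint(
    worker="E1",
    decision="manipulate the stuck trip latch in place",
    goals=["reopen the facility on schedule"],
    knowledge=[
        "breakers had not been serviced in over 6 years",
        "shop servicing could take up to a week",
        "had encountered a stuck trip latch before",
    ],
    focus="completing the work by the end of the day",
)

print(e1_latch.explain())
```

Recording each decision this way keeps the context attached to the decision on the timeline, so the written explanation can be regenerated as the chart is revised.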
Figure 2-10: Excerpt from the Accident ECF Chart
2.6.4 Barrier Analysis
Once the “initial” ECF is constructed, the team may use the first analysis tool, “Barrier Analysis.”
2.6.4.1 Analyzing Barriers
Figure 2-11 shows a summary diagram of the barrier analysis result. As can be seen, there is the potential for a large list of barriers that either did or could have come into play between the hazard and the target. It is useful to the analysis if categories are used as much as possible to help recognize the nature of the barrier’s performance and the relationship to organizational conditions that either weaken or strengthen the barrier. Fundamental elements of the barrier analysis should identify whether the barrier prevents the initiation of an accident or mitigates the harm; whether the barrier can be passively defeated (ignored) or must be actively defeated (disabled); and what kinds of latent organizational conditions can influence the barrier’s reliability.
[Figure 2-11 content, recovered from the original graphic; grouping follows the order of the extracted text]
Target: Worker
Management Barriers:
Roles and responsibilities unclear
Work scope not documented
Hazard unknown
Hazard unanalyzed
Standards/requirements not identified
Workers uninformed
Reviews bypassed
Procedures incomplete
Training incomplete
Required authorizations not received
Procedures not followed
Supervision ineffective
Stop work not used
Oversight ineffective
No electrical safety program
Physical Barriers:
Design preliminary
No as‐built drawings
Electrical conduit breached
13.2 kV cable insulation breached
Personal protective equipment not used
Hazard: 13.2 kV energized electrical cable
Figure 2-11: Summary Results from a Barrier Analysis Reveal the Types of Barriers Involved
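The categorization of barrier findings shown in Figure 2-11 can be sketched as a small data structure. The Python example below is a hedged illustration: the `BarrierType` and `BarrierFinding` names and the `summarize` helper are hypothetical, and the assignment of each finding to the management or physical group follows the order of the text recovered from the figure, which is an assumption about the original layout.

```python
from dataclasses import dataclass
from enum import Enum


class BarrierType(Enum):
    MANAGEMENT = "management"
    PHYSICAL = "physical"


@dataclass(frozen=True)
class BarrierFinding:
    barrier_type: BarrierType
    finding: str


MANAGEMENT_FINDINGS = [
    "Roles and responsibilities unclear", "Work scope not documented",
    "Hazard unknown", "Hazard unanalyzed",
    "Standards/requirements not identified", "Workers uninformed",
    "Reviews bypassed", "Procedures incomplete", "Training incomplete",
    "Required authorizations not received", "Procedures not followed",
    "Supervision ineffective", "Stop work not used",
    "Oversight ineffective", "No electrical safety program",
]
PHYSICAL_FINDINGS = [
    "Design preliminary", "No as-built drawings",
    "Electrical conduit breached", "13.2 kV cable insulation breached",
    "Personal protective equipment not used",
]

# Grouping is assumed from the order of the recovered figure text.
findings = (
    [BarrierFinding(BarrierType.MANAGEMENT, f) for f in MANAGEMENT_FINDINGS]
    + [BarrierFinding(BarrierType.PHYSICAL, f) for f in PHYSICAL_FINDINGS]
)


def summarize(items):
    """Count findings per barrier type."""
    counts = {}
    for item in items:
        key = item.barrier_type.value
        counts[key] = counts.get(key, 0) + 1
    return counts


print(summarize(findings))
```

A fuller record could add fields for whether the barrier prevents or mitigates and whether it is defeated passively or actively, as the text describes; those attributes are omitted here because the figure does not supply them.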
When analyzing barriers, investigators should first consider how the hazard and target could come together and what was in place or was required to keep them apart. Obvious physical barriers are those placed directly on the hazard (e.g., a guard on a grinding wheel); those placed between a hazard and target (e.g., a railing on a second-story platform); or those located on the target (e.g., a welding helmet). Management system barriers may be less obvious, such as the exposure limits required to minimize harm to personnel or the role of supervision in ensuring that work is performed safely. The investigator must understand each barrier’s intended function and location, and how it failed to prevent the accident.
To analyze the performance of physical barriers, investigators may need several different types of data, including:
Plans and specifications for the equipment or system
Procurement and vendor technical documentation
Installation and testing records
Photographs or drawings
Maintenance histories.
To analyze management barriers, investigators may need to obtain information about barriers at the activity, facility, and institutional levels responsible for the work. At the activity level, the investigator will need information about the work planning and control processes that governed the work activity, as well as the relevant safety management systems. This information could include:
Organizational charts defining supervisory and contractor management roles and responsibilities for safety
Training and qualification records for those involved in the accident
Hazard analysis documentation
Hazard control plans
Work permits
The work package and procedures that were used during the activity.
At the facility level, the investigator may also need information about safety management systems. This kind of information might include:
The standards and requirements that applied to the work activity, such as occupational exposure limits or relevant Occupational Safety and Health Administration (OSHA) regulations
The facility technical safety requirements and safety analysis report
Safety management documentation that defines how work is to be planned and performed safely at the facility
The status of integrated safety management implementation.
At the institutional level, the investigator may need information about the safety management direction and oversight provided by senior line management organizations. This kind of information might include:
Policy, orders, and directives
Budgeting priorities
Resource commitments.
The investigator should use barrier analysis to ensure that all failed, unused, or uninstalled barriers are identified and that their impact on the accident is understood. However, the investigator must cross-validate the results with the results of other core analytic techniques to identify which barrier failures were contributory or root causes of the accident.
Constructing a Worksheet
A barrier analysis worksheet is a useful tool in conducting a barrier analysis. A blank Barrier Analysis Worksheet is provided in Appendix D.4 Analysis Worksheets. Table 2-12 illustrates a worksheet that was partially completed using data from the case study. Steps used for completing this worksheet are provided below.
Although a barrier analysis will identify the failures in an accident scenario, the failures may not all be causal factors. The barrier analysis results directly feed into the events and causal factors chart and subsequent causal factors determination.
Table 2-12: Sample Barrier Analysis Worksheet
Hazard: 13.2 kV electrical cable
Target: Acting pipefitter
| What were the barriers? | How did each barrier perform? | Why did the barrier fail? | How did the barrier affect the accident? | Context: HPI/ISM |
|---|---|---|---|---|
| Engineering drawings | Drawings were incomplete and did not identify electrical cable at sump location | Engineering drawings and construction specifications were not procured. Drawings used were preliminary. No as‐built drawings were used to identify location of utility lines | Existence of electrical cable unknown | HPI: HN #5 ‐ Inaccurate mental picture; HN #6 ‐ Inaccurate risk perception; IC #2 ‐ Limited perspective. ISM: GP #3 & 5 ‐ Hazard Identification |
| Indoor excavation permit | Indoor excavation permit was not obtained | Pipefitters and utility specialist were unaware of indoor excavation permit requirements | Opportunity to identify existence of cable missed | ISM: CF #1 ‐ Define scope of work; CF #2 ‐ Analyze hazards; CF #3 ‐ Control hazards |
HN ‐ Human Nature (see Table 2‐11); IC ‐ Individual Capabilities (see Table 2‐11); GP ‐ Guiding Principles of ISM (see Table 2‐13); CF ‐ Core Functions of ISM (see Table 1‐5)
Analyzing the Results
The results of barrier analysis are first derived and portrayed in tabular form, then summarized graphically to illustrate, in a linear manner, the barriers that were unused or that failed to prevent an accident. Results from this method can also reveal what barriers should have or could have prevented an accident.
In the tabular format, individual barriers and their purposes are defined. Each is considered for its effectiveness in isolating, shielding, and controlling an undesired path of energy.
Table 2-12 provides an example of a barrier analysis summary. This format is particularly useful for illustrating the results of the analysis in a clear and concise form.
Basic Barrier Analysis Steps
Step 1: Identify the hazard and the target. Record them at the top of the worksheet. “13.2 kV electrical cable. Acting pipefitter.”
Step 2: Identify each barrier. Record in column one. “Engineering drawings. Indoor excavation permit. Personal protective equipment.”
Step 3: Identify how the barrier performed (What was the barrier’s purpose? Was the barrier in place or not in place? Did the barrier fail? Was the barrier used if it was in place?) Record in column two. “Drawings were incomplete and did not identify electrical cable at sump location. Indoor excavation permit was not obtained. Personal protective equipment was not used.”
Step 4: Identify and consider probable causes of the barrier failure. Record in column three. “Engineering drawings and construction specifications were not procured. Drawings used were preliminary, etc.”
Step 5: Evaluate the consequences of the failure in this accident. Record evaluation in column four. “Existence of electrical cable unknown.”
Step 6: Evaluate the context of the consequences of the barrier in terms of both human performance (HPI) and the Integrated Safety Management System (ISMS). Use the Human Error Precursor Matrix (Table 2‐11), the Seven Guiding Principles Chart (Table 2‐13), and the ISM Five Core Functions (Table 1‐5). Record the evaluation in column five. A more detailed discussion and list of Human Error Precursors can be found in Table 2‐11.
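For teams that keep the worksheet electronically, the six steps above can be sketched as a small record structure. This is only an illustrative sketch, not a DOE tool: the class names `BarrierRow` and `BarrierWorksheet` and the helper `incomplete_rows` are hypothetical, and the sample entries are taken from the case study quotations in the steps above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BarrierRow:
    """One worksheet row (hypothetical layout); fields mirror columns one to five."""
    barrier: str                      # Step 2, column one
    performance: str                  # Step 3, column two
    failure_causes: list[str]         # Step 4, column three
    effect_on_accident: str           # Step 5, column four
    context: dict[str, list[str]] = field(default_factory=dict)  # Step 6, column five


@dataclass
class BarrierWorksheet:
    hazard: str                       # Step 1
    target: str                       # Step 1
    rows: list[BarrierRow] = field(default_factory=list)

    def incomplete_rows(self) -> list[str]:
        """Return the barriers whose row still has at least one empty column."""
        return [
            r.barrier for r in self.rows
            if not (r.performance and r.failure_causes
                    and r.effect_on_accident and r.context)
        ]


worksheet = BarrierWorksheet(
    hazard="13.2 kV electrical cable",
    target="Acting pipefitter",
)
worksheet.rows.append(BarrierRow(
    barrier="Engineering drawings",
    performance="Drawings were incomplete and did not identify "
                "electrical cable at sump location",
    failure_causes=[
        "Engineering drawings and construction specifications were not procured",
        "Drawings used were preliminary",
    ],
    effect_on_accident="Existence of electrical cable unknown",
    context={"HPI": ["HN #5", "HN #6", "IC #2"], "ISM": ["GP #3 & 5"]},
))
worksheet.rows.append(BarrierRow(
    barrier="Personal protective equipment",
    performance="Personal protective equipment was not used",
    failure_causes=[],                # Step 4 not yet completed
    effect_on_accident="",            # Step 5 not yet completed
))

# Lists the rows that still need work before the worksheet feeds the ECF chart.
print(worksheet.incomplete_rows())
```

A completeness check of this kind may help a Board confirm that every identified barrier has been carried through Step 6 before the results are entered into the events and causal factors chart.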
2.6.4.2 Examining Organizational Concerns, Management Systems, and Line Management Oversight
DOE O 225.1B requires that the investigation board “examine policies, standards, and requirements that are applicable to the accident being investigated, as well as management and safety systems at Headquarters and in the field that could have contributed to or prevented the accident.” Additionally, DOE O 225.1B requires the board to “evaluate the effectiveness of management systems, as defined by DOE P 450.4A, the adequacy of policy and policy implementation, and the effectiveness of line management oversight as they relate to the accident.”
Therefore, accident investigations must thoroughly examine organizational concerns, management systems, and line management oversight processes to determine whether deficiencies in these areas contributed to causes of the accident. The Board should consider the full range of management systems from the first-line supervisor level, up to and including site and Headquarters, as appropriate. It is important to note that this focus should not be directed toward individuals.
In determining sources and causes of management system inadequacies and the failure to anticipate and prevent the conditions leading to the accident, investigators should use the framework of DOE’s integrated safety management system established by the Department in DOE P 450.4A. This policy lists the objective, guiding principles, core functions, mechanisms, responsibilities, and implementation means of an effective safety management system.
The safety management system elements described in DOE P 450.4A should be considered when deciding who to interview, what questions to ask, what documents to collect, and what facts to consider pertinent to the investigation. Even more importantly, these elements should be considered when analyzing the facts to determine their significance to the causal factors of the accident.
There are several readily accessible sources of background information to be used in assessing the safety culture. The DOE maintains databases where accident occurrences, injuries, and lessons learned are recorded for analysis. Some of these databases are:
Occurrence Reporting and Processing System (ORPS)
Computerized Accident/Incident Reporting System (CAIRS)
Lessons Learned and Best Practices
Operating Experience Summaries
Electrical Safety
These information sources and onsite corrective action tracking systems can be very good methods for finding past similar incidents and their associated ISM categories. It is also useful to review past accident investigations of similar types of events to determine whether any of the past lessons learned or corrective actions from across the complex were recognized and implemented prior to the present incident. Often, the AI coordinator can be asked to run preliminary search reports from these Federal databases as part of the initial background information for the investigation.
In many accidents, deficiencies in implementing the five core safety management functions defined in DOE P 450.4A cause or contribute to the accident. The five core functions are: (1) define the scope of work; (2) identify and analyze the hazards associated with the work; (3) develop and implement hazard controls; (4) perform work safely within the controls; and (5) provide feedback on adequacy of the controls and continuous improvement in defining and planning the work.
Table 2-13 contains a list of typical questions board members may ask to determine whether line management deficiencies affected the accident. These questions are based on the seven guiding principles of DOE P 450.4A. They are not intended to be exhaustive. Board members should adapt these questions or develop new ones based on the specific characteristics of the accident. The answers to the questions may be used to determine the facts of the accident, which, along with the analytical tools described in Section 2.6.5, will enable the board to determine whether deficiencies found in management systems and line management oversight are causal factors for the accident.
Table 2-13: Typical Questions for Addressing the Seven Guiding Principles of Integrated Safety Management.
Guiding Principle #1: Line management is directly responsible for the protection of the public, workers, and the environment.
Did DOE assure, and contractor line management establish, documented safety policies and goals?
Was integrated safety management policy fully implemented down to the activity level at the time of the accident?
Was DOE line management proactive in assuring timely implementation of integrated safety management by line organizations, contractors, subcontractors, and workers?
Were environment, safety and health (ES&H) performance expectations for DOE and contractor organizations clearly communicated and understood?
Did line managers elicit and empower active participation by workers in safety management?
Guiding Principle #2: Clear lines of authority and responsibility for ensuring safety shall be established and maintained at all organizational levels within the Department and its contractors.
Did line management define and maintain clearly delineated roles and responsibilities for ES&H to effectively integrate safety into site‐wide operations?
Was a process established to ensure that safety responsibilities were assigned to each person (employees, subcontractors, temporary employees, visiting researchers, vendor representatives, lessees, etc.) performing work?
Did line management establish communication systems to inform the organization, other facilities, and the public of potential ES&H impacts of specific work processes?
Were managers and workers at all levels aware of their specific responsibilities and accountability for ensuring safe facility operations and work practices?
Were individuals held accountable for safety performance through performance objectives, appraisal systems, and visible and meaningful consequences?
Did DOE line management and oversight hold contractors and subcontractors accountable for ES&H through appropriate contractual and appraisal mechanisms?
Guiding Principle #3: Personnel shall possess the experience, knowledge, skills, and abilities that are necessary to discharge their responsibilities.
Did line managers demonstrate a high degree of technical competence and understanding of programs and facilities?
Did line management have a documented process for assuring that DOE personnel, contractors, and subcontractors were adequately trained and qualified on job tasks, hazards, risks, and Departmental and contractor policies and requirements?
Were mechanisms in place to assure that only qualified and competent personnel were assigned to specific work activities, commensurate with the associated hazards?
Were mechanisms in place to assure understanding, awareness, and competence in response to significant changes in procedures, hazards, system design, facility mission, or life cycle status?
Did line management establish and implement processes to ensure that ES&H training programs effectively measure and improve performance and identify training needs?
Was a process established to ensure that (1) training program elements were kept current and relevant to program needs, and (2) job proficiency was maintained?
Guiding Principle #4: Resources shall be effectively allocated to address safety, programmatic, and operational considerations. Protecting the public, the workers and the environment shall be a priority whenever activities are planned and performed.
Did line management demonstrate a commitment to ensuring that ES&H programs had sufficient resources and priority within the line organization?
Did line management clearly establish that integrated safety management was to be applied to all types of work and address all types of hazards?
Did line management institute a safety management system that provided for integration of ES&H management processes, procedures, and/or programs into site, facility, and work activities in accordance with the Department of Energy Acquisition Regulation (DEAR) ES&H clause (48 CFR 970.5204‐2)?
Were prioritization processes effective in balancing and reasonably limiting the negative impact of resource reductions and unanticipated events on ES&H funding?
Guiding Principle #5: Before work is performed, the associated hazards shall be evaluated and an agreed-upon set of safety standards shall be established that, if properly implemented, will provide adequate assurance that the public, the workers, and the environment are protected from adverse consequences.
Was there a process for managing requirements, including the translation of standards and requirements into policies, programs, and procedures, and the development of processes to tailor requirements to specific work activities?
Were requirements established commensurate with the hazards, vulnerabilities, and risks encountered in the current life cycle stage of the site and/or facility?
Were policies and procedures, consistent with current DOE policy, formally established and approved by appropriate authorities?
Did communication systems assure that managers and staff were cognizant of all standards and requirements applicable to their positions, work, and associated hazards?
Guiding Principle #6: Administrative and engineering controls to prevent and mitigate hazards shall be tailored to the work performed and associated hazards.
Were the hazards associated with the work activity identified, analyzed, and categorized so that appropriate administrative and engineering controls could be put in place to prevent or mitigate the hazards?
Were hazard controls established for all stages of work to be performed (e.g., normal operations, surveillance, maintenance, facility modifications, decontamination, and decommissioning)?
Were hazard controls established that were adequately protective and tailored to the type and magnitude of the work and hazards and related factors that impact the work environment?
Were processes established for ensuring that DOE contractors and subcontractors test, implement, manage, maintain, and revise controls as circumstances change?
Were personnel qualified and knowledgeable of their responsibilities as they relate to work controls and work performance for each activity?
Guiding Principle #7: The conditions and requirements to be satisfied for operations to be initiated and conducted shall be clearly established and agreed upon.
Were processes in place to assure the availability of safety systems and equipment necessary to respond to hazards, vulnerabilities, and risks present in the work environment?
Did DOE and contractor line management establish and agree upon conditions and requirements that must be satisfied for operations to be initiated?
Was a management process established to confirm that the scope and authorization documentation is adequately defined and directly corresponds to the scope and complexity of the operations being authorized?
Was a change control process established to assess, approve, and reauthorize any changes to the scope of operations ongoing at the time of the accident?
2.6.5 Human Performance, Safety Management Systems and Culture Analysis
In conducting the change and barrier analyses, consider the relationship of human performance and management systems to the conditions that existed along the event chain in the team’s ECF analysis. This section discusses the relationship between how the organization’s people and its management systems performed. The analysis is straightforward. For every condition, action a person took, or barrier that failed, evaluate it in the context of human performance and management systems/culture. At a minimum, the ISM framework and the Error Precursor Matrix, Figure 1-3, should be used to construct the analysis and statements. Some of these ISM/HPI conditions may later roll up into a causal factor statement.
2.6.6 Change Analysis
Once the Board has completed a barrier analysis, it proceeds to the next core analytical tool, “Change Analysis.”
Change is anything that disturbs the “balance” of a system operating as planned. Change is often the source of deviations in system operations. Change can be planned, anticipated, and desired, or it can be unintentional and unwanted. Workplace change can cause accidents, although change is an integral and necessary part of daily business.
For example, changes to standards or directives may require facility policies and procedures to change, or turnover/retirement of an aging workforce will change the workers who perform certain tasks. Change can be desirable, for example, to improve equipment reliability or to enhance the efficiency and safety of operations. Uncontrolled or inadequately analyzed change can have unintended consequences, however, and result in errors or accidents.
Change analysis is particularly useful in identifying obscure contributing causes of accidents that result from changes in a system.
Change analysis examines planned or unplanned changes that caused undesired outcomes. In an accident investigation, this technique is used to examine an accident by analyzing the difference between what occurred before or was expected and the actual sequence of events. The investigator performing the change analysis identifies specific differences between the accident-free situation and the accident scenario. These differences are evaluated to determine whether they caused or contributed to the accident. For example, why would a system that operates correctly 99 times out of 100 fail to operate as expected one time?
Change analysis is relatively simple to use. As illustrated in Figure 2-12, it consists of six steps. The last step, in which investigators combine the results of the change analysis with the results from other techniques, is critical to developing a comprehensive understanding of the accident.
When conducting a change analysis, investigators identify changes as well as the results of those changes. The distinction is important, because identifying only the results of change may not prompt investigators to identify all causal factors of an accident.
The results of a change analysis can stand alone, but are most useful when they are combined with results from other techniques. For example, entering change analysis results into the events and causal factors chart helps to identify potential causal factors.
Figure 2-12: The Change Analysis Process (Describe Accident Sequence; Describe Accident-Free Sequence; Compare; Identify Differences; Analyze Differences for Effect on Accident; Input Results into Events and Causal Factors Chart)
To conduct a change analysis, the analyst needs to have a baseline situation. This baseline situation can be:
The same situation but before the accident (e.g., previous shift, last week, or last month)
A model or ideal situation (i.e., as designed or engineered).
Generally, it is recommended that Boards compare the accident sequence to the same situation in an accident-free state (that is, the operation prior to the accident) to determine differences and thereby identify accident causal factors. For the comparison to be effective, investigators must have sufficient information regarding this baseline situation.
In change analysis, differing events and conditions are systematically reviewed and analyzed to determine potential causes.
Change analysis is most effective under these circumstances:
A prior “accident-free” or typical situation is already documented or can be reconstructed.
A well-defined ideal situation exists.
Work as described in procedures can be compared with work as actually done.
The following data sources can be a starting point for acquiring a good working knowledge of the system, facility, or process under study prior to the accident or event; however, the list of input requirements should be tailored to fit the specific circumstances and needs of the investigation:
Blueprints
Equipment description documents
Drawings
Schematics
Operating and maintenance procedures
Roles and responsibilities
Job/task descriptions
Personnel qualifications
Results of hazard analysis
Performance indicators
Personnel turnover statistics.
A sample Change Analysis Worksheet is presented in Appendix D for reference. This worksheet may be modified as necessary to meet specific requirements.
To develop the information needed to conduct a change analysis, it is useful for the Board to list any changes they identify from their information-gathering activities on a poster board set up in the Board’s common meeting room. At the beginning of the investigation, the Board members should simply note the changes they identify as they find them and not worry about analyzing the significance of the changes. Often, in the early stages of an investigation, there is insufficient information to determine whether a change is important or not.
As the investigation progresses, it will become clear that some of the changes noted on the poster board are insignificant and can be crossed off the list. The remaining changes that seem to be important for understanding the accident can then be organized by entering them into the change analysis worksheet.
Board members should first categorize the changes according to the questions shown in the left-hand column of the worksheet. For example, the Board should determine if the change pertained to a difference in:
What events, conditions, activities, or equipment were present in the accident situation that were not present in the baseline (accident-free, prior, or ideal) situation (or vice versa);
When an event or condition occurred or was detected in the accident situation versus the baseline situation;
Where an event or condition occurred in the accident situation versus where an event or condition occurred in the baseline situation;
Who was involved in planning, reviewing, authorizing, performing, and supervising the work activity in the accident versus the accident-free situation; and
How the work was managed and controlled in the accident versus the accident-free situation.
Reviewing the worksheet may also prompt the investigators to identify additional changes that were not originally listed.
To complete the remainder of the worksheet, first describe each event or condition of interest in the column labeled, “Accident Situation.” Then describe the related event or condition that occurred (or should have occurred) in the baseline situation in the column labeled, “Prior, Ideal, or Accident-Free Situation.” The difference between the events and conditions in the accident and the baseline situations should be briefly described in the column labeled, “Difference.” As a group, the Board should then discuss the effect that each change had on the accident and record the evaluation in the final column of the worksheet.
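The worksheet columns described above can be sketched as a simple record, which may be convenient when the Board tracks changes electronically. This is a minimal sketch under stated assumptions: the names `ChangeRow`, `FACTORS`, and `rows_needing_discussion` are hypothetical, and the sample entries are taken from the case study worksheet in Table 2-14.

```python
from __future__ import annotations

from dataclasses import dataclass

# Left-hand column categories of the change analysis worksheet.
FACTORS = ("WHAT", "WHEN", "WHERE", "WHO", "HOW", "OTHER")


@dataclass
class ChangeRow:
    factor: str               # one of FACTORS
    accident_situation: str   # column "Accident Situation"
    baseline_situation: str   # column "Prior, Ideal, or Accident-Free Situation"
    difference: str = ""      # column "Difference"
    evaluation: str = ""      # final column, agreed by the Board as a group

    def __post_init__(self) -> None:
        if self.factor not in FACTORS:
            raise ValueError(f"unknown factor: {self.factor!r}")


def rows_needing_discussion(rows: list[ChangeRow]) -> list[ChangeRow]:
    """Rows with a recorded difference whose effect has not been evaluated yet."""
    return [r for r in rows if r.difference and not r.evaluation]


rows = [
    ChangeRow(
        factor="WHERE",
        accident_situation="Sump location was placed above a 13.2 kV electrical line.",
        baseline_situation="Sump is placed in a non-hazardous location.",
        difference="Inadequate design allowed sump to be located above a 13.2 kV line.",
        evaluation="Sump location was placed above an electrical line, which was "
                   "contacted by a worker jackhammering in the area.",
    ),
    ChangeRow(
        factor="WHO",
        accident_situation="Environmental Group assumed line responsibility for project.",
        baseline_situation="Environmental Group serves as an oversight/support "
                           "organization to assist line management in project.",
        difference="Support organization took responsibility of line function "
                   "for project management.",
        # evaluation left blank here to show a row awaiting Board discussion
    ),
]

for row in rows_needing_discussion(rows):
    print(row.factor, "-", row.difference)
```

Keeping the evaluation column separate from the difference column reflects the sequence in the text: differences are described first, and their effect on the accident is recorded only after group discussion.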
Table 2-14 shows a partially completed change analysis worksheet containing information from the case study to demonstrate the change analysis approach. The worksheet allows the user to compare the “accident situation” with the “accident-free situation” and evaluate the differences to determine each item’s effect on the accident.
A change analysis summary, as shown in Table 2-15, is generally included in the accident investigation report. It contains a subset of the information listed in the change analysis worksheet. The differences or changes identified can generally be described as causal factors and should be noted on the events and causal factors chart and used in the root cause analysis, as appropriate.
A potential weakness of change analysis is that it does not consider the compounding effects of incremental change (for example, a change that was instituted several years earlier coupled with a more recent change). To overcome this weakness, investigators may choose more than one baseline situation against which to compare the accident scenario. For example, decreasing funding levels for safety training and equipment may incrementally erode safety. Comparing the accident scenario to more than one baseline situation (for example, one year ago and five years ago) and then comparing the one-year and five-year baselines with each other can help identify the compounding effects of changes.
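The multiple-baseline comparison can be illustrated with a short sketch. Everything here is hypothetical: the function `differences`, the item names, and the values are invented for illustration and are not drawn from the case study.

```python
def differences(earlier, later):
    """Return {item: (earlier_value, later_value)} for items whose value changed."""
    keys = sorted(set(earlier) | set(later))
    return {k: (earlier.get(k), later.get(k))
            for k in keys if earlier.get(k) != later.get(k)}


# Hypothetical snapshots of the same operation (illustrative values only).
five_years_ago = {"training_hours": 40, "esh_design_review": True, "permit_required": True}
one_year_ago = {"training_hours": 24, "esh_design_review": False, "permit_required": True}
accident = {"training_hours": 16, "esh_design_review": False, "permit_required": True}

recent = differences(one_year_ago, accident)        # accident vs. one-year baseline
long_term = differences(five_years_ago, accident)   # accident vs. five-year baseline
between = differences(five_years_ago, one_year_ago) # the two baselines vs. each other

# Changes that only the older baseline reveals: they had already become
# the normal state of affairs by the time of the one-year baseline.
hidden = {k: v for k, v in long_term.items() if k not in recent}

# Items that changed in both intervals, i.e., candidates for compounding change.
compounding = [k for k in between if k in recent]

print("hidden from one-year comparison:", hidden)
print("changed in both intervals:", compounding)
```

In this invented example, the one-year comparison alone would show only the most recent drop in training hours; the five-year baseline additionally surfaces the earlier loss of the design review and shows that training hours declined in both intervals.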
Table 2-14: Sample Change Analysis Worksheet
| Factors | Accident Situation | Prior, Ideal, or Accident‐Free Situation | Difference | Evaluation of Effect |
|---|---|---|---|---|
| WHAT (Conditions, occurrences, activities, equipment) | 1. Design and ES&H reviews were not performed. 2. Established review process was bypassed. 3. Hazards associated with the work being performed were not identified. No review of as‐built drawings. No excavation permit. No underground utility survey. | 1. Project design and ES&H review are performed by appropriate groups to ensure adequate review and the safety and health of employees. 2. Construction packages are approved by facilities project delivery group. 3. A preliminary hazard analysis is performed on all work. | 1. Environmental Group assumed design role and removed ES&H review from task. 2. Environmental Group approved work packages. 3. No preliminary hazard analysis was performed on construction task. | 1. Design and ES&H reviews were not performed, contributing to the accident. 2. Construction packages were not approved by facilities group. 3. Hazards were not identified, contributing to the accident. |
| WHEN (Occurred, identified, facility status, schedule) | | | | |
| WHERE (Physical location, environmental conditions) | Sump location was placed above a 13.2 kV electrical line. | Sump is placed in a non‐hazardous location. | Inadequate design allowed sump to be located above a 13.2 kV line. | Sump location was placed above an electrical line, which was contacted by a worker jackhammering in the area. |
| WHO (Staff involved, training, qualification, supervision) | Environmental Group assumed line responsibility for project. | Environmental Group serves as an oversight/support organization to assist line management in project. | Support organization took responsibility of line function for project management. | Lack of oversight on project. |
| HOW (Control chain, hazard analysis, monitoring) | Management allowed Environmental Group to oversee construction tasks. | Management assures that work is performed by qualified groups. | Hazards analysis was not conducted. | Hazards were not identified, contributing to the accident. |
| OTHER | | | | |
NOTE: The factors in this worksheet are only guidelines but are useful in directing lines of inquiry and analysis.
Table 2-15: Case Study: Change Analysis Summary
| Prior or Ideal Condition | Present Condition | Difference (Change) |
|---|---|---|
| Environmental Group serves as an oversight/support organization to assist line management in project. | Environmental Group assumed line responsibility for project. | Support organization takes responsibility for a line function. |
| Project design and ES&H reviews are performed by appropriate groups to ensure adequate review and the safety and health of employees. | Environmental Group assumed design role and removed ES&H review from task. | Design and ES&H reviews were not performed. |
| Work is stopped when unexpected conditions are found. | Work continued. | No opportunity to analyze and control hazards of different work conditions. |
| A preliminary hazard analysis is performed on all work. | No preliminary hazard analysis was performed on maintenance task. | Hazards associated with the work being performed were not identified. No review of as‐built drawings. No excavation permit. No underground utility survey. |
| Sump is placed in a nonhazardous designated location. | Sump was located above a 13.2 kV electrical line. | Inadequate design allowed sump to be located above a 13.2 kV line. |
2.6.7 The Importance of Causal Factors
The primary purpose of any event investigation is to help prevent recurrence of events/accidents by making worthwhile recommendations based on the event’s causal factors. The team is responsible for identifying the local causal factors that, if corrected, would prevent another accident from occurring when the same work activity is performed again. However, more is required than simply detecting and removing immediate hazards.
The Board is responsible for identifying and describing any failures, human performance, and/or management systems that caused the accident. The Board should determine the HPI factors, the ISM factors, or both, associated with each causal factor statement. This may be accomplished by reviewing and carrying forward the relevant HPI/ISM codes assigned when the barrier analysis was constructed.
Modern accident investigation theory indicates that generally the root causes of accidents are found in organizational system failures, not in the most directly related causal factor(s) in terms of time, location, and place.
Generally, the higher in the management and oversight levels a root cause is found, the broader its effect is on the scope of the organization’s activities. This broader impact translates to a larger potential to cause other accidents. Therefore, it is incumbent on a team to ensure that the investigation is not ended until the highest possible root causes are identified. If a team cannot identify root causes, this should be stated clearly in the investigation report, along with an explanation.
Figure 2-13: Determining Causal Factors (Ask questions to determine causal factors: why, how, what, and who. Events: Why did this event happen? Conditions: How did the conditions originate? Context HPI/ISM: Why did the system allow the conditions to exist? Conditions roll up into causal factors.)
Table 2-16: Case Study Introduction
CASE STUDY
This section on causal analysis begins with a case study of an electrical accident. It is selectively referenced throughout this and subsequent sections to illustrate the process of determining facts and the use of the analytic techniques commonly used in DOE accident investigations. In this workbook, particular emphasis is placed on these techniques because they can be used in most accident investigations. However, for extremely complex accidents, additional, more sophisticated techniques may be needed that require specialized training. Training for these techniques is beyond the scope of this workbook and can be obtained through government, private, and university sources.
EVENT DESCRIPTION
The accident occurred at approximately 9:34 a.m. on January 17, 1996, in Building XX, during the excavation of a sump pit in the floor of the building. Workers were attempting to correct a waste stream outfall deficiency. Two workers arrived at the job site at approximately 8:40 a.m. and resumed the excavation work begun the previous day. The workers were employed by WS, the primary subcontractor for construction and maintenance. They used a jackhammer, pry bar, and shovel to loosen and remove the rubble from the sump pit. At about 9:34 a.m., at a depth of 39 inches, Worker A, who was operating the jackhammer, pierced the conduit containing an energized 13.2 kV electrical cable. He was transported to the local medical center, where cardiac medications were administered.
EVENT FACTS
Using the case study accident, the following three factual statements were derived during the investigation:
The injured worker had not completed safety training prior to the accident, as required by WS Environment, Safety, and Health Manual Procedure 12340.
Design drawings for the project on which the injured employee was working did not comply with the requirements of DOE O 6430.1A, General Design Criteria, and did not show the location of the underground cable.
A standing work order system, without a safety review, was used for non‐routine, non‐repetitive tasks.
2.6.8 Causal Factors
The core analytical technique of Causal Factor Analysis is applied after the ECF chart has been constructed as completely as possible and, at a minimum, a change analysis and a barrier analysis have been conducted.
First, the AIB looks for all potential causal factors, then determines whether each is a direct, contributing, or root cause of the accident or event.
The process of determining causal factors seeks to answer two questions: what happened, and why did it happen?
Causal factors are the events and conditions that produced or contributed to the occurrence of the accident. There are three types of causal factors:
Direct cause;
Contributing causes; and
Root causes.
Figure 2-14: Roll Up Conditions to Determine Causal Factors
[Figure: a sequence of events (pre‐job briefing completed; fire panel “impaired”; CO2 system discharge w/o alarm) with the conditions attached to each event. The conditions are rolled up, together with their HPI/ISM context, into causal factors.]
2.6.8.1 Direct Cause
The direct cause of an accident consists of the immediate events or conditions that caused the accident. The direct cause should be stated in one sentence, as illustrated in the examples below. Typically, the direct cause can be derived from the events and conditions immediately adjacent to the accident on the ECF chart.
EXAMPLES EVENT DIRECT CAUSES
The direct cause of the accident was contact between the chisel bit of the air‐powered jackhammer and the 13.2 kV energized electrical cable in the sump pit being excavated.
The direct cause of the accident was the inadvertent activation of electrical circuits that initiated the release of CO2 in an occupied space.
2.6.9 Contributing Causes
Contributing causes are events or conditions that collectively with other causes increased the likelihood of an accident but that individually did not cause the accident. Contributing causes may be longstanding conditions or a series of prior events that, alone, were not sufficient to cause the accident, but were necessary for it to occur. Contributing causes are the events and conditions that “set the stage” for the event and, if allowed to persist or re-occur, increase the probability of future events or accidents.
EXAMPLES EVENT CONTRIBUTING CAUSES
Failure to implement safety procedures in effect for the project contributed to the accident.
Failure to erect barriers or post warning signs contributed to the accident.
The standing work order process was used by facility personnel as a convenient method of performing work without a job ticket and work package, allowing most work to be field‐directed.
Inadequate illumination in the area of the platform created visibility problems that contributed to the fall from the platform.
2.6.10 Root Causes
Root causes are the causal factors that, if corrected, would prevent recurrence of the same or similar accidents. Root causes may be derived from or encompass several contributing causes. They are higher-order, fundamental causal factors that address classes of deficiencies, rather than single problems or faults.
Correcting root causes would not only prevent the same accident from recurring, but would also solve line management, oversight, and management system deficiencies that could cause or contribute to other accidents.
In many cases, root causes are failures to properly implement the principles and core functions of integrated safety management.
For example, root causes can include failures in management systems to:
Define clear roles and responsibilities for safety
Ensure that staff are competent to perform their responsibilities
Ensure that resource use is balanced to meet critical mission and safety goals
Ensure that safety standards and requirements are known and applied to work activities
Ensure that hazard controls are tailored to the work being performed
Ensure that work is properly reviewed and authorized.
The AIB has an obligation to seek out and report all causal factors, including deficiencies in management, safety, or line management oversight systems.
Root cause statements, as shown in the examples below, should identify the DOE and contractor line organizations responsible for the safety management failures. Root cause statements should also identify the specific management system(s) that failed.
EXAMPLES ROOT CAUSES
Contractor management and the DOE field office failed to clearly define responsibilities for safety reviews of planned work. The lack of clarity in roles and responsibilities for safety reviews was a root cause of the accident.
Contractor management allowed the standing work order process, intended for routine work, to be used to accomplish non‐routine, complex modification and construction work. DOE field office oversight failed to detect and ensure correction of this practice. Misuse of the standing work order process was a root cause of the accident.
Contractor management systems were ineffective in translating lessons learned from past occurrences into safer day‐to‐day operations at the facility. The failure to implement lessons learned was a root cause of the accident.
Assessments performed by the DOE program office failed to identify that some safety standards were not addressed by contractor safety management systems. Implementation of these requirements would have prevented the accident.
2.6.10.1 Root Cause Analysis
Root causes, as defined in Section 2.6.10, are identified using root cause analysis.
Accidents are symptoms of larger problems within a safety management system. Although accidents generally stem from multiple causal factors, correcting only the local causes of an accident is analogous to treating only symptoms and ignoring the “disease.” To identify and treat the true ailments in a system, the root causes of an accident must be identified. Root cause analysis is any technique that identifies the underlying deficiencies in a safety management system that, if corrected, would prevent the same and similar accidents from occurring.
Root cause analysis is a systematic process that uses the facts and results of the core analytic techniques to determine the most important reasons for the accident. Root cause analysis is not an exact science and therefore requires a certain amount of judgment. The intent of the analysis is to identify and address only those root causes that can be controlled within the system being investigated, excluding events or conditions that cannot be reasonably anticipated and controlled, such as some natural disasters. The core analytic techniques (events and causal factors analysis, barrier analysis, and change analysis) provide answers to an investigator’s questions regarding what, when, where, who, and how. Root cause analysis is primarily performed to resolve the question, “Why?”
To initiate a root cause analysis, the facts surrounding the accident must be known. In addition, the facts must be analyzed using other analytic methods to ascertain an initial list of causal factors. A rather exhaustive list of causal factors must be developed prior to the application of root cause analysis to ensure that final root causes are accurate and comprehensive.
The board should examine the evidence collected from the accident scene, witness statements, interviews, and facility documents. It should then determine whether additional information will be needed for the particular root cause technique it is performing.
It is important that the Accident Investigation Board work together to determine the root causes of an accident. One of the board’s primary responsibilities is to identify an accident’s causal factors so that Judgments of Need can be prepared and appropriate corrective measures can be developed and implemented. Therefore, all board members must participate in the root cause analysis; it cannot be left solely to a single member of the board.
Root cause analysis can be performed using computerized or manual techniques. Regardless of the method, the intent is to use a systematic process for identifying root causes.
There may be more than one root cause of a particular accident, but probably not more than three or four. If more are thought to exist at the conclusion of the analysis, the board should reexamine the list of causal factors to determine which causes can be further combined to reflect more fundamental (root) causes. This section provides some examples of root cause analysis and discusses analytical tools that can help accident investigators determine the root causes of an accident.
Examples of Root Cause Statements
Contractor management and the DOE field office failed to clearly define responsibilities for safety reviews of planned work. The lack of clarity in roles and responsibilities for safety reviews was a root cause of the accident.
Contractor management allowed the standing work order process, intended for routine work, to be used to accomplish non-routine, complex modification and construction work. DOE field office oversight failed to detect and ensure correction of this practice. Misuse of the standing work order process was a root cause of the accident.
Contractor management systems were ineffective in translating lessons learned from past occurrences into safer day-to-day operations at the facility. The failure to implement lessons learned was a root cause of the accident.
Assessments performed by the DOE program office failed to identify that some safety standards were not addressed by contractor safety management systems. Implementation of these requirements would have prevented the accident.
Once several (or all) of the preliminary analytic techniques have been performed, the accident investigation team should have a mature understanding of the events and conditions, along with a fairly extensive list of suspected causal factors. A root cause analysis is performed to refine the list of causal factors and categorize each according to its significance and impact on the accident. This is done because resources are finite: the AI team wants to focus the JONs and the subsequent corrective actions on those causal factors (root causes) whose correction provides the biggest return on the resources invested.
Significance of the causal factors may be determined by the “Nominal Group Technique,” during which the team simply votes on the most significant causal factors. By this point in the investigation, the team should be knowledgeable about the event, and their instincts may provide a reliable source of accurate information. The team votes for the causal factors that they feel contributed the most to the event, and the causal factors receiving the most votes win.
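The voting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the handbook does not prescribe a ballot format, so the `tally_votes` function, the one-list-per-member ballot layout, the `top_n` cutoff, and the short factor labels (loosely based on the case study facts) are all hypothetical.

```python
from collections import Counter


def tally_votes(ballots, top_n=3):
    """Count votes per causal factor and return the top_n as (factor, votes) pairs.

    ballots: one list per team member naming the causal factors that member
    considers most significant (an assumed format, not a handbook requirement).
    """
    counts = Counter()
    for ballot in ballots:
        # Count each factor at most once per team member.
        counts.update(set(ballot))
    # Sort by descending vote count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical ballots from a three-member team.
ballots = [
    ["standing work order misuse", "incomplete safety training"],
    ["standing work order misuse", "drawings omitted cable"],
    ["drawings omitted cable", "standing work order misuse"],
]
top_factors = tally_votes(ballots, top_n=2)
for factor, votes in top_factors:
    print(f"{votes} vote(s): {factor}")
```

A tally like this only ranks the candidates; each highly ranked factor still has to be validated by the team before it is treated as significant.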
Validate each significant or key causal factor by asking the question, “If it was fixed, would it break the chain that caused the event?” Indicate significant causal factors on the CFA Chart using red boxes, and indicate key contributing causal factors with yellow boxes. Many significant factors and causes may be indicated, and each requires a Corrective Action.
Figure 2-15: Grouping Root Causes on the Events and Causal Factors Chart
2.6.11 Compliance/Noncompliance
The compliance/noncompliance technique is useful when investigators suspect noncompliance to be a causal factor. This technique compares evidence collected against three categories of noncompliance to determine the root cause of a noncompliance issue. As illustrated in Table 2-17, these are: “Don’t Know,” “Can’t Comply,” and “Won’t Comply.” Examining only these three areas limits the application of this technique; however, in some circumstances, an Accident Investigation Board may find the technique useful.
The basic steps for applying the compliance/noncompliance technique are:
Have a complete understanding of the facts relevant to the event
Broadly categorize the noncompliance event
Determine why the noncompliance occurred (i.e., the subcategory or underlying cause).
For example, investigators may use this technique to determine whether an injured worker was aware of particular safety requirements, and if not, why he or she was not (e.g., the worker didn’t know the requirements, forgot, or lacked experience). If the worker was aware but was not able to comply, a second line of questioning can be pursued. Perhaps the worker could not comply because the facility did not supply personal protective equipment. Perhaps the worker would not comply in that he or she refused to wear the safety equipment. Lines of inquiry are pursued until investigators are assured that a root cause is identified.
Lines of questioning pertaining to the three compliance/noncompliance categories follow. However, it should be noted that these are merely guides; an Accident Investigation Board should tailor the lines of inquiry to meet the specific needs and circumstances of the accident under investigation.
Don’t Know: Questions focus on whether an individual was aware of or had reason to be aware of certain procedures, policies, or requirements that were not complied with.
Can’t Comply: This category focuses on what the necessary resources are, where they come from, what it takes to get them, and whether personnel know what to do with the resources when they have them.
Won’t Comply: This line of inquiry focuses on conscious decisions to not follow specific guidance or not perform to a certain standard.
By reviewing collected evidence, such as procedures, witness statements, and interview transcripts, against these three categories, investigators can pursue suspected compliance/noncompliance issues as causal factors.
Although the compliance/noncompliance technique is limited in applicability, by systematically following these or similar lines of inquiry, investigators may identify causal factors and Judgments of Need.
Table 2-17: Compliance/Noncompliance Root Cause Model Categories
Don’t Know
Never Knew: This is often an indication of poor training or failure in a work system to disseminate guidance to the working level.
Forgot: This is usually a local, personal error. It does not reflect a systemic deficiency, but may indicate a need to increase frequency of training or to institute refresher training.
Tasks Implied: This is often a result of lack of experience or lack of detail in guidance.
Can’t Comply
Scarce Resources: Lack of funding is a common rebuttal to questions regarding noncompliance. However, resource allocation requires decision‐making and priority‐setting at some level of management. Boards should consider this line of inquiry when examining root causes pertaining to noncompliance issues.
Don’t Know How: This issue focuses on lack of knowledge (i.e., the know‐how to get a job done).
Impossibility: This issue requires investigators to determine whether a task can be executed. Given adequate resources, knowledge, and willingness, is a worker or group able to meet a certain requirement?
Won’t Comply
No Reward: An investigator may have to determine whether there is a benefit in complying with requirements or doing a job correctly. Perhaps there is no incentive to comply.
No Penalty: This issue focuses on whether sanctions can force compliance, if enforced.
Disagree: In some cases, individuals refuse to perform to a standard or comply with a requirement that they disagree with or think is impractical. Investigators will have to consider this in their collection of evidence and determination of root causes.
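The three categories and their subcategories can be sketched as a small lookup, with the broad categorization step expressed as two yes/no findings. This is a simplified illustration, not part of the handbook: the `categorize` and `lines_of_inquiry` functions and the reduction of the evidence to the two booleans `aware` and `able` are assumptions made for the example. A real investigation weighs procedures, witness statements, and interview transcripts rather than two flags.

```python
from enum import Enum


class Category(Enum):
    DONT_KNOW = "Don't Know"
    CANT_COMPLY = "Can't Comply"
    WONT_COMPLY = "Won't Comply"


# Subcategories as listed in Table 2-17.
SUBCATEGORIES = {
    Category.DONT_KNOW: ("Never Knew", "Forgot", "Tasks Implied"),
    Category.CANT_COMPLY: ("Scarce Resources", "Don't Know How", "Impossibility"),
    Category.WONT_COMPLY: ("No Reward", "No Penalty", "Disagree"),
}


def categorize(aware: bool, able: bool) -> Category:
    """Broadly categorize a noncompliance from two findings (assumed inputs).

    aware: the person knew, or had reason to know, the requirement.
    able:  the person had the resources and know-how to comply.
    """
    if not aware:
        return Category.DONT_KNOW
    if not able:
        return Category.CANT_COMPLY
    # Aware and able, yet did not comply: a conscious decision.
    return Category.WONT_COMPLY


def lines_of_inquiry(aware: bool, able: bool):
    """Return the broad category and the subcategories to examine next."""
    category = categorize(aware, able)
    return category, SUBCATEGORIES[category]


# Example: the worker knew the requirement, but no protective equipment was supplied.
category, subcategories = lines_of_inquiry(aware=True, able=False)
print(category.value, "->", ", ".join(subcategories))
```

The lookup only narrows the questioning; determining which subcategory actually applies, and why, remains the investigators' judgment.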
2.6.12 Automated Techniques
Several root cause analysis software packages are available for use in accident investigations. Generally, these methods prompt the investigator to systematically review investigation evidence and record data in the software package. These software packages use the entered data to construct a tree model of events and causes surrounding the accident. In comparison to the manual methods of root cause analysis and tree or other graphics construction, the computerized techniques are quite time-efficient. However, as with any software tool, the output is only as good as the input; therefore, a thorough understanding of the accident is required in order to use the software effectively.
Many of the software packages currently available run on both PC-based and Macintosh platforms. The Windows-based software packages contain pull-down menus and employ the same use of icons and symbols found in many other computer programs. In a step-by-step process, the investigator is prompted to collect and enter data in the templates provided by the software. For example, an investigator may be prompted to select whether a problem (accident or component of an accident) to be solved is an event or a condition that has existed over time. In selecting the “condition” option, he or she would be prompted through a series of questions designed to prevent a mishap occurrence; the “event” option would initiate a process of investigating an accident that has already occurred.
Analytical software packages can help the board:
Remain focused during the investigation
Identify interrelationships among data
Eliminate irrelevant data
Identify causal factors (most significantly, root causes).
The graphics design features of many of these software packages can also be quite useful to the Accident Investigation Board. With a little input, these software packages allow the user to construct preliminary trees or charts; when reviewed by investigators, these charts can illustrate gaps in information and guide them in collecting additional evidence.
It is worth underscoring the importance of solid facts collection. While useful, an analytic software package cannot replace the investigative efforts of the Board. The quality of the results obtained from a software package is highly dependent on the skill, knowledge, and input of the user.
2.7 Developing Conclusions and Judgments of Need to “Prevent” Accidents in the Future
Conclusions and Judgments of Need are key elements of the investigation that must be developed by the Board.
2.7.1 Conclusions
Conclusions are significant deductions derived from the investigation’s analytical results. They are derived from and must be supported by the facts plus the results of testing and the various analyses conducted.
Conclusions may:
Include concise statements of the causal factors of the accident determined by analysis of facts
Be statements that alleviate potential confusion on issues that were originally suspected causes
Address significant concerns arising out of the accident that are unsubstantiated or inconclusive
Be used to highlight positive aspects of performance revealed during the investigation, where appropriate.
When developing conclusions, the Board should:
Organize conclusions sequentially, preferably in chronological order, or in logical sets (e.g., hardware, procedures, people, organizations)
Base conclusions on the facts and the subsequent analysis of the facts
Include only substantive conclusions that bear directly on the accident, and that reiterate significant facts and pertinent analytical results leading to the accident’s causes
Keep conclusions as short as possible and, to the extent possible, limit reference citations (if used) to one per conclusion.
The process of determining conclusions seeks to answer the questions—what happened and why did it happen?
EXAMPLE: CONCLUSIONS
XYZ contractor failed to adequately implement a medical surveillance program, thereby allowing an individual with medical restrictions to work in violation of those restrictions. This was a contributing cause to the accident.
Welds did not fail during the steam line rupture.
Blood tests on the injured worker did not conclusively establish his blood alcohol content at the time of the accident.
The implementation of comprehensive response procedures prevented the fire from spreading to areas containing dispersible radioactive materials, averting a significant escalation in the consequences of the fire.
2.7.2 Judgments of Need
Judgments of Need are the managerial controls and safety measures determined by the Board to be necessary to prevent or minimize the probability or severity of a recurrence. Judgments of Need should be linked to causal factors and logically flow from the conclusions. They should be:
Stated in a clear, concise, and direct manner
Based on the facts/evidence
Stated so that they can be the basis for corrective action plans.
Judgments of Need:
Should not be prescriptive corrective action plans or recommendations, nor should they suggest punitive actions.
Should not include process issues (e.g., evidence control, preservation of the accident scene, readiness) unless these issues have a direct impact on the accident. These concerns should be noted in a separate memorandum to the appointing official, with a copy to site management and the Office of Corporate Safety Programs.
Board members should work together to derive Judgments of Need to assure that the merits and validity of each are openly discussed and that each one flows from the facts and analyses.
An interactive process is the preferred approach for generating Judgments of Need. That is, Board members should work together to review causal factors and then begin generating a list of Judgments of Need. These judgments should be linked directly to causal factors, which are derived from facts and analyses.
One method for ensuring that all significant facts and analytical results are addressed in the Judgments of Need is to develop displays linking Judgments of Need with facts, analyses, and causal factors. Previous Boards have found it useful to display these elements on the walls of the Board’s conference room. Figure 2-16 demonstrates how this information can be arranged to provide an ongoing assessment of linkages among the four elements. Using this diagrammed verification analysis approach, the Board can identify gaps in the data where a clear, logical flow among the four elements is missing. The Board can use this information to determine whether Judgments of Need are supported by linkages connecting the facts, results from analyses, and causal factors.
Figure 2-16: Facts, Analyses, and Causal Factors are needed to Support Judgments of Need
If a Judgment of Need cannot be clearly linked to causal factors derived from analysis of facts, exclude it from the report.
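The linkage check described above can also be expressed as a simple traceability test. The sketch below is illustrative only: the `unsupported_judgments` function, the dictionary layout, and the identifiers (`JON-1`, `CF-1`, `FACT-1`, and so on) are hypothetical, and the test covers only the links from Judgments of Need to causal factors to facts, not the quality of the underlying analyses.

```python
def unsupported_judgments(jon_links, factor_links):
    """Return the Judgments of Need that cannot be traced back to any fact.

    jon_links:    maps each Judgment of Need to the causal factors it addresses.
    factor_links: maps each causal factor to the facts that support it.
    Both layouts are assumptions made for this sketch.
    """
    unsupported = []
    for jon, factors in jon_links.items():
        # A JON is supported if at least one linked causal factor has
        # at least one supporting fact.
        if not any(factor_links.get(factor) for factor in factors):
            unsupported.append(jon)
    return unsupported


# Hypothetical linkage data.
jon_links = {
    "JON-1": ["CF-1"],  # linked to a factor that has supporting facts
    "JON-2": ["CF-2"],  # linked to a factor with no supporting facts
    "JON-3": [],        # not linked to any causal factor
}
factor_links = {
    "CF-1": ["FACT-1", "FACT-2"],
    "CF-2": [],
}
gaps = unsupported_judgments(jon_links, factor_links)
print("Judgments of Need lacking support:", gaps)
```

Any Judgment of Need flagged this way is a candidate either for further evidence collection or for exclusion from the report.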
Once the Board has identified the Judgments of Need derived from their investigation activities, the members can begin writing statements documenting these judgments. Table 2-18 presents guidance on writing these statements.
Table 2-18: These Guidelines are Useful for Writing Judgments of Need
Clearly identify organizations that need to implement actions to prevent recurrence of the accident. Where applicable, specify whether the judgment of need applies to a DOE Headquarters or field element, contractor, subcontractor, or some combination of these.
Avoid generic statements and focus on processes and systems, not individuals.
Focus on causal factors.
Be specific and concise; avoid vague, generalized, broad‐brush, sweeping solutions introduced by "should."
Do not tell management how to do something; simply identify the need.
Present Judgments of Need in a manner that allows a specific organization to translate them into corrective actions sufficient to prevent recurrence.
Table 2-19 provides samples of well-written Judgments of Need for the case study electrical accident. Information in this table demonstrates the relationships among significant facts, analysis, causal factors, and Judgments of Need.
Judgments of Need form the basis for corrective action plans, which are the responsibility of line management and should not be directed by the Board. If the Board finds a need to make specific recommendations, they should appear in a separate communication and not in the body of the report or in the transmittal letter to the appointing official.
Table 2-19: Case Study: Judgments of Need
Significant Fact: Safety training for the accident victim as required by XYZ ES&H Manual Procedure 1234 was not completed prior to the accident.
Causal Factor: Training implementation was informal and was not based on appropriate structured development and measurement of learning. This programmatic deficiency was a contributing cause to the accident.
Judgment of Need: XYZ management needs to evaluate the effectiveness of implementation of the training program by observing and measuring workplace performance.
Significant Fact: The standing work order system, normally used for routine, repetitive tasks, was used to authorize the work involved in the accident.
Causal Factor: Using the standing work order process, normally used for routine tasks, to accomplish non‐routine, complex modification and construction work, was a root cause of the accident.
Judgment of Need: XYZ management needs to assure that the standing work order system is used only on routine, repetitive, and noncomplex tasks where no significant risks or hazards have been identified or could reasonably be encountered.
2.7.3 Minority Opinions
During the process of identifying Judgments of Need, Board members may find that they disagree on the interpretation of facts, analytical results, causal factors, conclusions, or Judgments of Need. This disagreement can occur because the Board:
Has too few facts or has conflicting information from different sources;
Needs to evaluate the analyses conducted and consider using different analytical techniques; or
Disagrees on the linkages among facts, analyses and causal factors.
When this disagreement occurs, additional information may be needed to resolve these conflicts. Even when new facts are collected and new analyses are conducted, Board members may still strongly disagree on the interpretation of facts, the conclusions, or the Judgments of Need. Board members should make these differences known to the Chairperson as soon as they arise. Every effort should be made to resolve a Board member’s dissenting opinion by collecting additional facts, if possible, and conducting additional analyses.
When Board members still disagree, it is recommended that the Chairperson:
Obtain a detailed briefing from those not in agreement and consider the facts, analyses, causal factors, and conclusions that each used.
Monitor the differences between those not in agreement by holding meetings to discuss any new information collected or new analyses conducted; more common ground may be found as this information emerges.
Work with the Board to identify areas of mutual agreement and areas of disagreement as the end of the investigation approaches.
Openly discuss his or her position concerning the causal factors, conclusions, and Judgments of Need with the Board and achieve consensus. At this point, Board members who disagree with the consensus should describe their position and indicate whether there is a need to present a minority opinion in the accident investigation report.
Note that the Board is not required to reach consensus, but is encouraged to work diligently to resolve differences of opinion. However, if one or more Board members disagree with the interpretation of facts, causal factors, conclusions, or Judgments of Need endorsed by the remainder of the Board, the minority Board member or members should document their differences in a minority report. This report is described in Section 2.8.11.
2.8 Reporting the Results
The purpose of the investigation report is to clearly and concisely convey the results of the investigation. The content should help the reader understand what happened (the accident description and chronology), why it happened (the causal factors), and what can be done to prevent a recurrence (the Judgments of Need). Investigation results are reported without attributing individual fault or proposing punitive measures. The investigation report constitutes an accurate and objective record of the accident and provides complete and accurate details and explicit statements of:
The Board’s investigation process
Facts pertaining to the accident, including relevant management systems involved
Analytical methods used and their results
Conclusions of the Board, including the causal factors of the accident
Judgments of Need for corrective actions to prevent recurrence of the accident.
When completed, this report is submitted to the appointing official for acceptance and dissemination.
2.8.1 Writing the Report
The investigation report is the official record of the investigation. Its importance cannot be overemphasized. The quality of the investigation will be judged primarily by the report that provides the affected site and the DOE complex as a whole with the basis for developing the corrective actions and lessons learned necessary to prevent or minimize the severity of a recurrence.
Previous Boards have conducted thorough and competent accident investigations, yet failed to effectively communicate the results in the report. As a result, the conclusions, Judgments of Need and lessons learned can appear unsupported or are lost in a mass of detail.
The report writing process is interactive, but must maintain a focused objective. Guidelines for drafting a report, provided in Table 2-20, will help the Board work within the investigation cycle and schedule to maximize their efficiency and effectiveness in developing a useful report.
Senior DOE management is placing increasing emphasis on generating concise (nominally less than 50 pages) yet effectively thorough investigation reports. Conciseness requires Board members to communicate the significant facts, analyses, causal factors, conclusions, and Judgments of Need with as little extraneous narrative as possible. Effective thoroughness means that the report gives line managers helpful and useful information to assist them in enhancing their safety programs.
Table 2-20: Useful Strategies for Drafting the Investigation Report
Establish clear responsibilities for writing each section of the report.
Establish deadlines for writing, quality review, and production, working back from the scheduled final draft report due date.
Use an established format (as described in Section 2.8.2).
Devise a consistent method for referencing titles, acronyms, appendices, and footnotes to avoid last-minute production problems.
Use a single point of contact, such as the administrative coordinator, to control all electronic versions of the report, including editing input, and to coordinate overall report production.
Start writing as soon as possible. Write the facts as bulleted statements as they are documented.
Write the accident chronology as soon as possible to minimize the potential for forgetting the events and to save time when generating the first draft.
Begin developing illustrations and photograph captions early. These processes take more time than generally anticipated.
Allow time for regular editorial and Board member review and input. Don't wait until the last few days on site for the Board to review each other's writing and the entire draft report. This step is important for assuring that primary issues are addressed and the investigation remains focused and within scope.
Use a zip drive to save the report during text processing; the file is extremely large.
Use a technical writer or editor early in the process to edit the draft report for readability, grammar, content, logic, and flow.
Share information with other Board members.
Plan for several revisions.
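The strategy of setting deadlines by working back from the final draft due date can be illustrated with a short sketch. The phase names follow Table 2-20, but the durations, the due date, and the function name back_schedule are hypothetical and are not taken from this Handbook.

```python
from datetime import date, timedelta


def back_schedule(final_draft_due, phases):
    """Work backward from the final draft due date.

    phases: list of (name, days) in the order the work is performed.
    Returns a list of (name, start, deadline) in that same order.
    """
    schedule = []
    deadline = final_draft_due
    # Walk the phases last-to-first so each phase ends where the next begins.
    for name, days in reversed(phases):
        start = deadline - timedelta(days=days)
        schedule.append((name, start, deadline))
        deadline = start
    schedule.reverse()
    return schedule


# Hypothetical durations in calendar days (an assumption for illustration).
phases = [("writing", 10), ("quality review", 4), ("production", 2)]
plan = back_schedule(date(2012, 7, 31), phases)
for name, start, deadline in plan:
    print(f"{name}: {start} to {deadline}")
```

A Board could rerun such a calculation whenever the due date moves, so that each section author sees the revised writing deadline immediately.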
2.8.2 Report Format and Content
The investigation report should consist of the elements listed in Table 2-21. Although DOE O 225.1B does not specifically require some of these elements or prescribe any specific order of presentation within the report, a certain level of consistency in content and format among reports facilitates extraction and dissemination of facts, conclusions, Judgments of Need, and lessons learned.
In addition to a table of contents for the report body, a list of exhibits, figures, and tables and a list of appendices should be included. Typically, the table of contents lists the headings within the report down to the third level.
Table 2-21: The Accident Investigation Report Should Include these Items
EXAMPLE: TABLE OF CONTENTS
Table of Contents .... iii
Acronyms and Abbreviations .... v
Executive Summary .... vii
1.0 Introduction .... 1
1.1. Background .... 1
1.2. Facility Description .... 1
1.3. Scope, Conduct, and Methodology .... 2
2.0 The Accident .... 5
2.1. Background .... 5
2.2. Accident Description .... 5
2.3. Accident Response .... 8
2.4. Medical Report Summary .... 8
2.5. Event Chronology .... 9
3.0 Facts and Analysis .... 11
3.1. Emergency Response .... 11
3.2. Post-Event Accident Scene Preservation and Management Response .... 13
3.6. Assessment of Prior Events and Accident Precursors .... 13
3.7. Integrated Safety Management Analysis .... 15
3.3. Conduct of Operations, Work Planning and Controls .... 16
3.4. Supervision and Oversight of Work .... 17
3.5. 10 CFR Part 851 DOE Worker Safety and Health Program .... 18
3.8. Human Performance Analysis .... 19
3.9. Department of Energy Programs and Oversight .... 21
3.10. Summary of Causal Factor Analyses .... 22
3.11. Barrier Analysis .... 23
3.12. Change Analysis .... 24
3.13. Events and Causal Factors Analysis .... 25
4.0 Conclusions and Judgments of Need .... 28
5.0 Board Signatures .... 30
6.0 Board Members, Advisors, Consultants, and Staff .... 31
Appendix A: Board Letter of Appointment .... A-1
Appendix B: Barrier Analysis .... B-1
Appendix C: Change Analyses .... C-1
Appendix D: Events and Causal Factors Analysis .... D-1
Appendix E: Human Performance and Management Systems Analysis .... E-1
Appendix F: Detailed Summary of Causal Factors .... F-1
EXAMPLE: EXHIBITS, FIGURES AND TABLES
Exhibit 1-1 Area Enclosure .... 4
Exhibit 2-1 View Looking South .... 5
Figure 2-1 Summary Events Chart and Accident Chronology .... 10
Figure 2-2 Barrier Analysis Summary .... 23
Figure 2-3 Events and Causal Factors Chart .... 26
Table 3-1 Conclusions and Judgments of Need .... 29
The following are brief descriptions and acceptable examples of the elements of a typical accident investigation report.
2.8.3 Disclaimer
The accident investigation report disclaimer should appear on the back of the title page of the report. The disclaimer is a statement that the report neither determines nor implies liability. It should be worded exactly as the example below, with the substitution of the appointing official.
EXAMPLE: DISCLAIMER
This report is an independent product of the Federal Accident Investigation Board appointed by [Name], Chief Health, Safety and Security Officer.
The Board was appointed to perform a Federal investigation of this accident and to prepare an investigation report in accordance with DOE Order 225.1B, Accident Investigations. The discussion of facts, as determined by the Board, and the views expressed in the report do not assume and are not intended to establish the existence of any duty at law on the part of the U.S. Government, its employees or agents, contractors, their employees or agents, or subcontractors at any tier, or any other party.
This report neither determines nor implies liability.
2.8.4 Appointing Official’s Statement of Report Acceptance
After reviewing the draft final report, the appointing official signs and dates a statement indicating that the investigation has been completed in accordance with procedures specified in DOE O 225.1B and that the findings of the Accident Investigation Board have been accepted. An example of this statement is provided below.
EXAMPLE: APPOINTING OFFICIAL’S ACCEPTANCE STATEMENT
On [Date], I established a Federal Accident Investigation Board to investigate the [Fall] at the [Facility] at the [Site] that resulted in the [Fatality of a construction worker]. The Board’s responsibilities have been completed with respect to this investigation. The analyses, identification of direct, contributing, and root causes, and Judgments of Need reached during the investigation were performed in accordance with DOE Order 225.1B, Accident Investigations. I accept the findings of the Board and authorize the release of this report for general distribution.
Signed,
[Name]
[Title]
[Office]
2.8.5 Acronyms and Initialisms
The use of acronyms and initialisms is common among DOE staff and contractors. However, to a reader outside the Department, the use of such terms without adequate definition can be frustrating and hinder understanding. Acronyms and initialisms should be kept to a minimum, because their proliferation makes the report difficult to read and comprehend for those unfamiliar with the site, facility, or area involved. Acronyms or initialisms should not be used for organizational elements in the field or for position titles. This element of the report assists readers by identifying, in alphabetical order, the terms and acronyms used in the report (see example below). In addition, if necessary, a glossary of technical terms should follow this section.
EXAMPLE: ACRONYMS AND INITIALISMS
CFR Code of Federal Regulations
DOE U.S. Department of Energy
EM DOE Office of Environmental Management
ES&H Environment, Safety and Health
HSS Office of Health, Safety and Security
M&O Management and Operating
OSHA Occupational Safety and Health Administration
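As an aid to compiling this list, the sketch below shows one way an editor might find capitalized abbreviations in draft text, sort them alphabetically, and flag any that lack a definition. The regular expression is a heuristic, and the function names and sample text are illustrative assumptions, not part of this Handbook.

```python
import re


def find_acronyms(text):
    """Return the distinct all-capital terms (two or more letters), sorted."""
    # Heuristic: matches runs such as DOE, OSHA, ES&H, and M&O.
    pattern = r"\b[A-Z](?:&?[A-Z]){1,}\b"
    return sorted(set(re.findall(pattern, text)))


def undefined_acronyms(text, definitions):
    """List acronyms used in the text that have no entry in definitions."""
    return [a for a in find_acronyms(text) if a not in definitions]


# Illustrative draft sentence and a deliberately incomplete definition list.
draft = "The DOE contractor violated OSHA rules under 29 CFR 1926 and its ES&H plan."
definitions = {
    "CFR": "Code of Federal Regulations",
    "DOE": "U.S. Department of Energy",
    "OSHA": "Occupational Safety and Health Administration",
}
print(find_acronyms(draft))
print(undefined_acronyms(draft, definitions))
```

Because the pattern treats any all-capital word as an acronym, headings typed in capitals would need to be excluded or reviewed by hand.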
2.8.6 Prologue - Interpretation of Significance
The prologue is a one-page synopsis of the significance of the accident with respect to management concerns and the primary lessons learned from the accident.
The prologue should interpret the accident’s significance as it relates to the affected site, other relevant sites, field offices within the DOE complex, and DOE Headquarters.
EXAMPLE: PROLOGUE
INTERPRETATION OF SIGNIFICANCE
The fatality at the [Site] on [Date] resulted from failures of Department of Energy (DOE), contractor, and subcontractor management, and the fatally injured worker. The subcontractor, the employer of the fatally injured worker, had a poor record of serious safety deficiencies and had never accepted the higher levels of safety performance required by the Department’s safe work ethic. Although all the appropriate contractual and procedural requirements were in place, the subcontractor failed to implement them and continued to allow violations of Occupational Safety and Health Administration regulations invoked by DOE orders. These serious deficiencies were recognized by the prime contractor, which was instituting progressively stronger sanctions against the subcontractor. However, because of the subcontractor’s recalcitrance and the imminent danger conditions represented by the subcontractor’s frequent violations of fall protection requirements, more aggressive measures, such as contract cancellation, could have been taken earlier.
The prime contractor’s oversight was narrowly focused on selected aspects of the subcontractor’s safety performance and did not identify the subcontractor’s failure to implement its own procedures, or institute appropriate fall protection measures. Thus, the implications and frequency of imminent danger hazards were not fully appreciated. Departmental oversight focused on the subcontractor’s performance and did not identify the gaps in the prime contractor’s oversight focus. As a result, hazards were not identified and barriers were not in place to prevent the accident, which could have been avoided.
This fatality highlights the importance of a complete approach to safety that stresses individual and line management responsibility and accountability, implementation of requirements and procedures, and thorough and systematic oversight by contractor and Department line management. All levels of line management must be involved. Contractual requirements and procedures, implementation of these requirements, and line management oversight are all necessary to mitigate the dangers of hazards that arise in the workplace. Particular attention must be paid to individual performance and changes in the workplace. Sound judgment, constant vigilance, and attention to detail are necessary to deal with hazards of immediate concern. When serious performance deficiencies are identified, there must be strong, aggressive action to mitigate the hazards and re‐establish a safe working environment. Aggressive actions up to and including swift removal of organizations that exhibit truculence toward safety, are appropriate and should be taken.
2.8.7 Executive Summary
The purpose of the executive summary is to convey to the reader a reasonable understanding of the accident, its causes, and the actions necessary to prevent recurrence. Typical executive summaries are two to five pages, depending on the complexity of the accident.
The executive summary should include a brief account of:
Essential facts pertaining to the occurrence and major consequences (what happened)
Conclusions that identify the causal factors, including organizational, management systems, and line management oversight deficiencies, that allowed the accident to happen (why it happened)
Judgments of Need to prevent recurrence (what must be done to correct the problem and prevent it from recurring at the affected facility and elsewhere in the DOE complex).
The executive summary should be written for the senior manager or general reader who may be relatively unfamiliar with the subject matter. It should contain only information discussed in the report, but should not include the facts and analyses in their entirety.
The executive summary should not include a laundry list of all the facts, conclusions, and Judgments of Need. Rather, to be effective, it should summarize the important facts, causal factors, conclusions, and Judgments of Need. In other words, if this were the only part of the report that was read, what are the three or four most important things the reader should come away with?
EXAMPLE: EXECUTIVE SUMMARY
INTRODUCTION
A fatality was investigated in which a construction subcontractor fell from a temporary platform in the [Facility] at the [Site]. In conducting its investigation, the Accident Investigation Board used various analysis techniques, including events and causal factors charting and analysis, barrier analysis, change analysis, and root cause analysis. The Board inspected and videotaped the accident site, reviewed events surrounding the accident, conducted extensive interviews and document reviews, and performed analyses to determine the causal factors that contributed to the accident, including any management system deficiencies. Relevant management systems and factors that could have contributed to the accident were evaluated in accordance with the components of the Department's integrated safety management system, as described in DOE Policy 450.4A.
ACCIDENT DESCRIPTION
The accident occurred at approximately [Time] on [Date] at the [Facility], when a construction worker, employed by [Subcontractor], fell from a temporary platform. The platform had been installed to catch falling tools and parts, but it was also used as a work platform for personnel activities when 100 percent fall protection was used. The worker was transported by helicopter to the medical center, where he died at [Time] from severe head and neck injuries.
DIRECT AND ROOT CAUSES
The direct cause of the accident was the fall from an unprotected platform. The contributing causes of the accident were: (1) the absence of signs and barricades in the vicinity of the platform, (2) visibility problems created by poor illumination in the area of the platform, and (3) lack of implementation of job safety analysis, work controls, and the medical surveillance program. The root causes of the accident were: (1) failure by [Subcontractor] to implement requirements and procedures that would have mitigated the hazards, and (2) failure by [Subcontractor] to effectively implement components of the Department's integrated safety management policy mandating line management responsibility and accountability for safety performance.
CONCLUSIONS AND JUDGMENTS OF NEED
Conclusions of the Board and Judgments of Need as to managerial controls and safety measures necessary to prevent or mitigate the probability of a recurrence are summarized in Table 1.
Table 1. Conclusions and Judgments of Need
Conclusion: Comprehensive safety requirements existed, were contractually invoked, and were appropriate for the nature of [Facility] construction work.
Judgment of Need: None
Conclusion: [Subcontractor] failed to follow procedures required by its contract and by its ES&H Program Plan, including:
[Subcontractor] failed to adequately implement fall protection requirements contained in its ES&H Program Plan for the [Facility] project, including enforcement of a three-tiered approach to fall protection. The third tier (choice of last resort) requires anchor points, lanyards, shock absorbers, and full-body harness.
The worker was not wearing any fall protection equipment and did not obtain a direct reading dosimeter before entering the radiological control area.
Judgment of Need: [Subcontractor] line management and safety personnel need to implement existing safety requirements and procedures.
Conclusion: [Subcontractor] and [Contractor] did not fully implement the hazard inspection requirements of the [Facility] contract and [Subcontractor's] ES&H Program Plan, and therefore did not sufficiently identify or analyze hazards and institute protective measures necessary due to changing conditions.
Judgment of Need: [Subcontractor] and [Contractor] need to ensure that an adequate hazard analysis is performed prior to changes in work tasks that affect the safety and health of personnel.
2.8.8 Introduction
The Introduction section of the report, illustrated in the example that follows, normally contains three major subsections:
A brief background description of the accident and its results, and a statement regarding the authority to conduct the investigation.
A facility description defining the area or site and the principal organizations involved, to help the reader understand the context of the accident and the information that follows.
Descriptions of the scope of the investigation, its purpose, and the methodology employed in conducting the investigation.
Site and facility diagrams and organizational charts for relevant management systems may be appropriate in either the Introduction or the Facts and Analysis section. However, include this information only when it is needed to clarify the accident's context and the role of related organizations.
EXAMPLE:
1.0 INTRODUCTION
1.1 BACKGROUND
On [Date], at approximately [Time], a construction subcontractor working at the [Site] fell approximately 17 feet from a temporary platform. The platform was built to catch falling tools and parts in the [Facility]. The worker was transported by helicopter to the medical center, where he died from severe head and neck injuries.
On [Date], [Appointing Official Name and Title] appointed an Accident Investigation Board to investigate the accident, in accordance with DOE Order 225.1B, Accident Investigations.
1.2 FACILITY DESCRIPTION
Contractor activities at [Site] are managed by the DOE XXX Operations Office. The facility in which this accident occurred is under the programmatic direction of the Office of Environmental Management (EM).
[Provide a brief discussion of site, facility, or area operations and descriptive background that sheds light on the environment or location where the accident occurred.]
1.3 SCOPE, CONDUCT, AND METHODOLOGY
The Board commenced its investigation on [Date], completed the investigation on [Date], and submitted its findings to the Assistant Secretary for Environment, Safety and Health on [Date].
The scope of the Board's investigation was to review and analyze the circumstances to determine the accident's causes. During the investigation, the Board inspected and videotaped the accident site, reviewed events surrounding the accident, conducted interviews and document reviews, and performed analyses to determine causes.
The purposes of this investigation were to determine the nature, extent, and causation of the accident and any programmatic impact, and to assist in the improvement of policies and practices, with emphasis on safety management systems.
The Board conducted its investigation, focusing on management systems at all levels, using the following methodology:
Facts relevant to the accident were gathered.
Relevant management systems and factors that could have contributed to the accident were evaluated in accordance with the components of DOE's integrated safety management system, as described in DOE Policy 450.4A.
Events and causal factors charting and analysis, along with barrier analysis and change analysis, were used to provide supportive correlation and identification of the causes of the accident.
2.8.9 Facts and Analysis
The Facts and Analysis section of the report states the facts related to the accident and the analysis of those facts. It focuses on the events connected to the accident; the factors that allowed those events to occur; and the results of the various analytical techniques used to determine the direct, contributing, and root causes of the accident, including the role of management and safety system deficiencies. This section should logically lead the reader to the conclusions and Judgments of Need. Photographs, evidence position maps, and diagrams, which may provide perspectives that written narrative cannot capture, should be included in the Facts and Analysis section, as determined by the Board. The Facts and Analysis section includes subsections dealing with:
Accident description and chronology, including a description of the responses to the accident
Hazards, controls, and management systems pertinent to the accident
Brief descriptions of and results from the analyses that were conducted (e.g., barrier analysis, change analysis, events and causal factors analysis, and root cause analysis).
Accident Description and Chronology subsection. A subsection describing the accident and chronology of events should be first in the Facts and Analysis section of the report. This is typically one of the first sections written, as soon as evidence is collected and pertinent information is documented. It is reasonable for the Board to begin preparing a draft of the accident description and chronology during the first few days on site. As additional information is collected, new findings can be used to augment the initial writing. This section includes:
Background information about systems and any activities and events preceding the accident, including scheduled maintenance and system safety analysis
Chronological description of the events leading up to and including the accident itself
A summary events chart, identifying the major events from the events and causal factors chart.
Description and Analysis of Facts subsections. Subsections on the facts surrounding the accident, and the analysis of those facts, should follow the accident description and chronology subsection. These sections must provide the full basis for stating the accident’s causes and Judgments of Need.
In writing the report, it is important to clearly distinguish facts from analysis. Facts are objective statements that can be verified by physical evidence, by direct observation, through documentation, or from statements corroborated by at least one witness or interviewee other than the one making the statement. Analysis is a critical review and discussion of the implications of the facts, leading to a logical interpretation of those facts and supportable conclusions. The analysis should include a brief statement of the impact of the factual circumstances on the accident. Table 2-22 illustrates this distinction.
Following are some guidelines for developing this portion of the report:
The subsections should be organized logically according to relevant investigation topics, such as:
Physical hazards
Conduct of operations
Training
Work planning and control
Organizational concerns
Management systems
Maintenance
Personnel performance
Other topics specific and relevant to the investigation.
For each subsection, list relevant facts in the form of bulleted statements.
For each subsection, provide an analysis of what the facts mean in terms of their impact on the accident and its causes. This narrative should be as concise as possible and may reference the more detailed analyses discussed later in the report (e.g., barrier analysis, change analysis, events and causal factors charting and analysis, and root cause analysis). All facts included in the report should be addressed.
Generally, the facts are presented as short statements, and the analysis of the facts provides a direct link between the facts and causal factors. See the example that follows Table 2-22.
Table 2-22: Facts Differ from Analysis
Fact: At 9:30 a.m. the outside temperature was 36°F and the sky was clear.
Analysis: Meteorological conditions at the time of the accident did not contribute to the accident.
Fact: In September 1995, the Environmental Group implemented its own alternate work authorization process. This process did not include a job hazards analysis prior to construction activities.
Analysis: The alternate work authorization process was not adequate to assure worker safety.
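The guideline that all facts included in the report should be addressed by analysis can be checked mechanically while drafting. The following is a minimal sketch, assuming the Board tracks facts and analyses as simple records; the class names, identifiers such as F1, and the function unaddressed_facts are hypothetical, and the statements are abbreviated from Table 2-22.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    fact_id: str      # hypothetical identifier assigned by the Board
    statement: str    # objective, verifiable statement


@dataclass
class Analysis:
    statement: str                                  # interpretation of the facts
    fact_ids: list = field(default_factory=list)    # facts this analysis addresses


def unaddressed_facts(facts, analyses):
    """Return the ids of facts that no analysis statement refers to."""
    addressed = {fid for a in analyses for fid in a.fact_ids}
    return [f.fact_id for f in facts if f.fact_id not in addressed]


facts = [
    Fact("F1", "At 9:30 a.m. the outside temperature was 36 F and the sky was clear."),
    Fact("F2", "The alternate work authorization process did not include a job hazards analysis."),
]
# Only the first fact has been analyzed so far in this illustrative draft.
analyses = [
    Analysis("Meteorological conditions did not contribute to the accident.", ["F1"]),
]
print(unaddressed_facts(facts, analyses))
```

Keeping facts and analyses as separate records also reinforces the distinction drawn in Table 2-22, because an interpretive statement cannot be entered as a fact without being noticed.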
EXAMPLE: DESCRIPTION AND ANALYSIS OF FACTS
3.0 FACTS AND ANALYSIS
3.3 PHYSICAL HAZARDS, CONTROLS, AND RELATED FACTORS
3.3.1 Physical Barriers
Facts related to physical barriers on the day of the accident are as follows:
There were no general barriers, warning lines, or signs to alert personnel on top of the construction materials to the fall hazards in the area. There were no other safety barriers for the platform.
The platform was intended to catch falling tools or parts, but it was also used as a work platform for personnel with 100 percent fall protection.
There were no static lines or designated (i.e., engineered) anchor points for personnel to connect fall protection equipment in the vicinity of the platform.
Lighting in the area of the platform was measured at 2 foot-candles.
Following is the analysis of these facts.
Occupational Safety and Health Standards for the Construction Industry (29 CFR 1926) requires that, when working from an area greater than six feet in height or near unprotected edges or sides, personal protection in the form of a fall protection system be in place during all stages of active work. Violations of fall protection requirements usually constitute an imminent danger situation. Lighting in the area was less than the minimum of 5 foot-candles prescribed by the OSHA standards (29 CFR 1926.56). This level of illumination may have contributed to the accident, taking into consideration the visual adjustment when moving from a brighter area to a progressively darker area, as was the case in the area where the accident occurred. There were no permanently installed fall protection systems, barriers, or warnings; each sub-tier contractor was expected to identify the fall hazards and provide its own fall protection system as it saw fit. The combination of these circumstances was a contributing cause of the accident.
3.12 CHANGE ANALYSIS
Change analysis was performed to determine points where changes are needed to correct deficiencies in the safety management system and to pinpoint changes and differences that may have had an effect on the accident. Changes directly contributing to the accident were failure to execute established procedures for fall protection, signs and barricades, and Job Safety Analysis/Construction Safe Work Permit; unsafe use of the temporary platform; insufficient lighting in the platform area; and unenforced work restrictions for the construction worker. No job safety analysis was performed and/or Construction Safe Work Permit obtained for work on the platform, leading to a failure in the hazard analysis process and unidentified and uncorrected hazards in the workplace. Deficiencies in the management of the safety program within [Subcontractor] are also related to failures in the medical surveillance program. Changes brought about by [Subcontractor] management failures resulted in a deficient worker safety program. Management failed to implement the contractual safety requirements necessary to prevent the accident and avoid deficiencies in the worker safety program. [Contractor's] progressive approach to improving [Subcontractor's] compliance with safety requirements was successful to a degree, but failed to prevent recurrence of imminent danger situations.
EXAMPLE:
3.13 EVENTS AND CAUSAL FACTORS ANALYSIS
3.13.1 Direct Cause of the accident: fall from an unprotected platform. However, there were also contributing causes and root causes.
3.13.2 Contributing causes for the accident:
Job safety analysis, work controls, and medical surveillance program not implemented
Insufficient illumination in the area of the temporary platform
Failure to remove the temporary platform
Absence of warning signs and barricades.
Another possible contributing factor was impaired judgment of the worker who fell from the platform. This cause could not be substantiated.
3.13.3 Root Causes of the accident:
Failure by [Subcontractor] to implement requirements and procedures that would have mitigated the hazards. The implementation of comprehensive and appropriate requirements is part of the fifth of DOE's safety management principles. [Subcontractor] failed to implement its medical surveillance program and to enforce work restrictions for the worker. A hazard analysis, required by the Industrial Hygiene Program Plan, was not conducted; consequently, the hazards associated with the platform were not identified, and no countermeasures were implemented. The absence of fall protection, physical barriers, and warning signs in the vicinity of the platform, along with inadequate lighting, violated DOE requirements that invoke Federal safety standards. Finally, failure to ensure that comprehensive requirements are fully implemented represents a fundamental flaw in the safety management program of [Subcontractor] and exhibits failure to meet part of the management requisites for the fifth of DOE's safety management principles, which requires that comprehensive and appropriate requirements be established and effectively implemented to counteract hazards and assure safety.
Failure by [Subcontractor] to implement the principle of line management responsibility and accountability for safety. Line management responsibility and accountability for safety is the first of DOE's safety management principles. [Subcontractor] has clear safety policies and well-defined responsibilities and authorities for safety. However, [Subcontractor] line management failed to appropriately analyze and manage hazard mitigation and, when faced with adverse consequences for poor safety performance, has refused to accept accountability. [Subcontractor] consistently failed to implement the effective safety policies and practices required by 10 C.F.R. 851 and the ES&H Program Plan and reflected in DOE policies and industry standards. [Subcontractor] did not meet contractual requirements for safety and its own safety policy. Finally, [Subcontractor] failed to ensure that findings resulting from reviews, monitoring activities, and audits were resolved in a timely manner. [Subcontractor's] approach and numerous safety program failures reflect less than full commitment to safety and directly led to the accident.
2.8.10 Conclusions and Judgments of Need
The Conclusions and Judgments of Need section of the report lists the Board’s conclusions in the form of concise statements, as well as the Board’s Judgments of Need (discussed in Section 2.7). The conclusions can be listed using bulleted statements, tables, or diagrams with limited narrative, as long as the meaning is clear. Judgments of Need may be presented in the same manner.
Judgments of Need are identified actions required to prevent future accidents. Examples of well-written Judgments of Need are shown in the Example.
EXAMPLE:
4.0 CONCLUSIONS AND JUDGMENTS OF NEED
This section of the report identifies the conclusions and Judgments of Need determined by the Board, as a result of using the analysis methods described in Section 3.0. Conclusions of the Board consider significant facts, causal factors, and pertinent analytical results. Judgments of Need are managerial controls and safety measures believed necessary to prevent or mitigate the probability or severity of a recurrence. They flow from the causal factors and are directed at guiding managers in developing follow‐up actions. Table 4‐1 identifies the conclusions and the corresponding Judgments of Need identified by the Board.
Table 4‐1. Conclusions and Judgments of Need
CONCLUSIONS JUDGMENTS OF NEED
Comprehensive safety requirements existed, were contractually invoked, and were appropriate for the nature of construction work.
None
[Subcontractor] failed to follow procedures required by its contract and by its ES&H Program Plan, including:
[Subcontractor] failed to adequately implement fall protection requirements contained in its ES&H Program Plan for the project, including enforcement of a three‐tiered approach to fall protection. The third tier (choice of last resort) requires anchor points, lanyards, shock absorbers, and full‐body harness.
[Subcontractor] line management and safety personnel need to implement existing safety requirements and procedures.
A temporary platform, used as a work surface for personnel activities when employing 100 percent fall protection, did not have guardrails and was left in place without barriers or other warning devices.
[Subcontractor] and [Contractor] need to ensure that safety personnel inspect changing work conditions for previously unidentified safety and health hazards, and implement protective measures.
[Subcontractor] failed to post adequate warning signs and establish barriers on the stack to warn personnel that they were approaching within six feet of the edge of a fall hazard, as required by OSHA regulations and [Subcontractor's] ES&H Program Plan.
[Contractor] failed to recognize that warning signs and barriers were not in place in the work area near the platform.
2.8.11 Minority Report
If used, the Minority Report section contains the opinions of any Board member(s) that differ from the majority of the Board. The minority report should:
Address only those sections of the overall report that warrant the dissenting opinion
Follow the same format as the overall report, addressing only the points of variance
Not be a complete rewrite of the overall report.
2.8.12 Board Signatures
The Accident Investigation Board Chairperson and members must sign and date the report, even if there is a minority opinion. The signature page identifies the name and position of each Board member and the Accident Investigation Board Chairperson, as shown on the next page. This page also indicates whether each Board member is a DOE accident investigator.
EXAMPLE:
5.0 BOARD SIGNATURES
Signed Date Dated
[Name], Board Chairperson U.S. Department of Energy, HQ
Signed Date Dated
[Name], Board Member DOE Accident Investigator U.S. Department of Energy, Savannah River Site Office
Signed Date Dated
[Name], Board Member DOE Accident Investigator U.S. Department of Energy, Oak Ridge Operations Office
Signed Date Dated
[Name], Board Member Accident Investigator U.S. Department of Energy, Idaho Operations Office
Signed Date Dated
[Name], Board Member U.S. Department of Energy, Idaho Operations Office
2.8.13 Board Members, Advisors, Consultants, and Staff
The investigation team participants section lists the names of the Board members, advisors, and staff, indicating their employers and their positions with respect to the accident investigation.
EXAMPLE:
6.0 BOARD MEMBERS, ADVISORS, CONSULTANTS, AND STAFF
Chairperson [Name], DOE
Member [Name], DOE
Member [Name], DOE
Member [Name], DOE
Member [Name], DOE
Advisor [Name], DOE
Advisor [Name], DOE
Advisor [Name], DOE
Advisor [Name], DOE
Advisor [Name], Consultant
Medical Advisor [Name], M.D., Consultant
Legal Advisor [Name], DOE
Administrative Coordinator [Name], XYZ Corporation
Technical Writer [Name], XYZ Corporation
Technical Writer [Name], XYZ Corporation
Administrative Support [Name], DOE
2.8.14 Appendices
Appendices are added, as appropriate, to provide supporting information, such as the Accident Investigation Board’s appointment letter and results from detailed analyses conducted during the investigation. Generally, the amount of documentation in the appendices should be limited. If there is any doubt about the benefit of including material as an appendix, it should probably be omitted. All appendices should be referenced in the report.
2.9 Performing Verification Analysis, Quality Review and Validation of Conclusions
Before releasing the report outside the investigation team, the Board reviews it to ensure its technical accuracy, thoroughness, and consistency, and to ensure that organizational concerns, safety management systems, and line management oversight processes are properly analyzed as possible causes of the accident. The Board Chairperson should plan and schedule sufficient time for these reviews to maintain the appropriate investigation cycle. The following are further considerations for quality review of the report.
2.9.1 Structure and Format
The report should be reviewed to ensure that it follows the format and contains the information outlined in Section 2.8, which ensures compliance with the intent of Paragraph 4.c.(3) of DOE O 225.1B. Variation in the format is acceptable, as long as it does not affect the report’s quality or conflict with the requirements of the order.
2.9.2 Technical and Policy Issues
All technical requirements applicable to the investigation should be reviewed by appropriate subject matter experts to assure their accuracy. Likewise, a knowledgeable Board member or advisor should review whether policy, requirements, and procedures were followed. A Board member or advisor knowledgeable in such policy and requirements should also review the report to determine whether these requirements were adequately considered.
2.9.3 Verification Analysis
Verification analysis should be conducted on the draft report after all the analytical techniques are completed. This analysis ensures that all portions of the report are accurate and consistent, and verifies that the conclusions are consistent with the facts, analyses, and Judgments of Need. The verification analysis determines whether the flow from facts to analysis to causal factors to JON is logical. That is, the Judgments of Need are traced back to the supporting facts. The goal is to eliminate any material that is not based on facts.
One approach to verification analysis is to compare the facts, analysis, causal factors, and JON on an ECF wall chart and to validate the continuity of facts through the analysis and causal factors to the JON. This method also identifies any misplaced facts, insufficient analyses, and unsupported conclusions (CON) or JON.
If a clear, defensible linkage of a CON/JON cannot be supported by the facts and analysis from the ECF chart, consider re-working the CON/JON or dropping it from the report.
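The traceability check described above can also be sketched in code. The following Python example is illustrative only and is not part of the DOE method: the `Node` structure, the `supporting_facts` and `verify` functions, and the sample chart entries are all hypothetical names and data. It assumes each item on the ECF chart records which earlier items it rests on, then walks each CON/JON back to the facts and reports any CON/JON with no factual basis and any fact that no CON/JON relies on.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One item on a hypothetical ECF wall chart (names are assumptions)."""
    ident: str
    kind: str   # "fact", "analysis", "causal_factor", "con", or "jon"
    text: str
    supports: list[str] = field(default_factory=list)  # idents this item rests on


def supporting_facts(ident, chart, seen=None):
    """Return the idents of all facts reachable from `ident` via its supports."""
    seen = set() if seen is None else seen
    if ident in seen or ident not in chart:
        return set()
    seen.add(ident)
    node = chart[ident]
    if node.kind == "fact":
        return {ident}
    facts = set()
    for parent in node.supports:
        facts |= supporting_facts(parent, chart, seen)
    return facts


def verify(chart):
    """Return (CON/JON with no factual basis, facts that no CON/JON relies on)."""
    used, unsupported = set(), []
    for node in chart.values():
        if node.kind in ("con", "jon"):
            facts = supporting_facts(node.ident, chart)
            used |= facts
            if not facts:
                unsupported.append(node.ident)
    unused = [n.ident for n in chart.values()
              if n.kind == "fact" and n.ident not in used]
    return sorted(unsupported), sorted(unused)


# Hypothetical sample data, loosely modeled on the fall-from-platform example.
chart = {n.ident: n for n in [
    Node("F1", "fact", "No job safety analysis was performed"),
    Node("F2", "fact", "Temporary platform had no guardrails"),
    Node("F3", "fact", "Weather was clear on the day of the accident"),
    Node("A1", "analysis", "Barrier analysis of the platform", ["F1", "F2"]),
    Node("CF1", "causal_factor", "Hazards were not identified", ["A1"]),
    Node("JON1", "jon", "Implement existing safety requirements", ["CF1"]),
    Node("JON2", "jon", "Revise the training program", []),
]}

unsupported, unused = verify(chart)
print("CON/JON lacking factual support:", unsupported)
print("Facts not used by any CON/JON:", unused)
```

In this sample, JON2 has no path back to any fact and would be a candidate for re-working or dropping, while F3 supports nothing and may be a misplaced fact.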
2.9.4 Classification and Privacy Review
A review should be completed by an authorized derivative classifier to ensure that the report does not contain classified or unclassified controlled nuclear information. An attorney should also review the report for privacy concerns. These reviews are conducted before the report is distributed for the factual accuracy review.
Documentation that these reviews have been completed should be retained in the permanent investigation file.
2.9.5 Factual Accuracy Review
The facts presented in the Facts and Analysis section of the final draft report should be reviewed and validated for accuracy by the affected DOE and contractor line management before the final report is submitted to the appointing official for acceptance. Generally, only the “facts” portion should be distributed for this review, in order to protect the integrity of the investigation and prevent a premature reaction to preliminary analyses. However, other portions of the report may be provided at the discretion of the Board Chairperson. The review is important for ensuring an accurate report and verifying that all affected parties agree on the facts surrounding the accident. This open review of the facts is consistent with the focus on fixing system deficiencies, rather than fixing blame and is consistent with the DOE management philosophy of openness in the oversight process.
Some Boards have conducted this review in the Board’s dedicated conference room. This allows representatives of affected organizations to review the draft description of the facts and to ask follow-up questions of Board members, while ensuring that dissemination of the draft document remains closely controlled. Forms useful for implementing a Factual Accuracy Review, such as the Example Cover Sheet for Facts Section, Example Factual Accuracy Room Sign, and Example Sign-in Sheet for Factual Accuracy Review, are included in Appendix D.5, Factual Accuracy Review.
Comments and revisions from DOE and contractor management are incorporated into the draft final report, as appropriate.
2.9.6 Review by the Chief Health, Safety and Security Officer
DOE O 225.1B requires review of accident investigation reports by the Chief Health, Safety and Security Officer (HS-1). Federal accident investigation reports are reviewed prior to acceptance by the appointing official. Comments are provided to the appointing official for incorporation prior to report publication and distribution. Coordination for these reviews should be made with the HSS AI Program Manager.
2.9.7 Document the Reviews in the Records
Documentation that these reviews have been completed should be retained in the permanent investigation file.
2.10 Submitting the Report
Once the report has been finalized, the Accident Investigation Board Chairperson provides the draft final report to the appointing official for acceptance. If the appointing official determines that the Board has met its obligation to conduct a thorough investigation of the accident, that the report fully describes the accident and its causal factors, and that it provides JON sufficient to prevent recurrence, the report is formally accepted. The statement of report acceptance from the appointing official is included in the final report (see Section 2.8.4).
Appendix A. Glossary
Accident: An unwanted transfer of energy or an environmental condition that, due to the absence or failure of barriers or controls, produces injury to persons, damage to property, or reduction in process output.
Accident Investigation: The systematic appraisal of unwanted events for the purpose of determining causal factors, subsequent corrective actions, and preventive measures.
Accident or Emergency Response Team: A team or teams of emergency and accident response personnel for a particular site. This team may be composed of a number of teams from the site, such as local police and firefighter units, emergency medical personnel, and hazardous material teams.
Analysis: The use of methods and techniques for arranging data to: (a) assist in determining what additional data are required; (b) establish consistency, validity, and logic; (c) establish necessary and sufficient events for causes; and (d) guide and support inferences and judgments.
Analytical Tree: Graphical representation of an accident in a deductive approach (general to specific). The structure resembles a tree—that is, narrow at the top with a single event (accident) and then branching out as the tree is developed, and identifying root causes at the bottom branches.
Appointing Official: A designated authority responsible for assigning Accident Investigation Boards for investigations, with responsibilities as prescribed in DOE O 225.1B.
Barrier: Anything used to control, prevent, or impede energy flows. Common types of barriers include equipment, administrative procedures and processes, supervision/management, warning devices, knowledge and skills, and physical objects.
Barrier Analysis: An analytical technique used to identify energy sources and the failed or deficient barriers and controls that contributed to an accident.
Board Chairperson: The leader who manages the accident investigation process, represents DOE in all matters regarding the accident investigation, and reports to the appointing official for purposes of the accident investigation.
Board Members: A group of three to six DOE staff assigned to investigate an accident. This group reports to the Board Chairperson during the accident investigation.
Causal Factor: An event or condition in the accident sequence necessary and sufficient to produce or contribute to the unwanted result. Causal factors fall into three categories: direct cause, contributing cause, and root cause.
Cause: Anything that contributes to an accident or incident. In an investigation, the use of the word “cause” as a singular term should be avoided. It is preferable to use it in the plural sense, such as “causal factors,” rather than identifying “the cause.”
Chain of Custody: The process of documenting, controlling, securing, and accounting for physical possession of evidence, from initial collection through final disposition.
Change: Stress on a system that was previously in a state of equilibrium, or anything that disturbs the planned or normal functioning of a system.
Change Analysis: An analytical technique used for accident investigations, wherein accident-free reference bases are established, and changes relevant to accident causes and situations are systematically identified. In change analysis, all changes are considered, including those initially considered trivial or obscure.
Conclusions: Significant deductions derived from analytical results. Conclusions are derived from and must be supported by the facts, plus results from testing and analyses conducted. Conclusions are statements that answer two questions the accident investigation addresses: what happened and why did it happen? Conclusions include concise recapitulations of the causal factors (direct, contributing, and root causes) of the accident determined by analysis of facts.
Contributing Cause: An event or condition that collectively with other causes increases the likelihood of an accident but that individually did not cause the accident.
Controls: Those barriers used to control wanted energy flows, such as the insulation on an electrical cord, a stop sign, a procedure, or a safe work permit.
Critical Process Step: A step in the process where potential threats could interact with the hazard that could be released. For accident analysis, the absence of hazards or threats in a process step makes it a non-critical step.
Direct Cause: The immediate events or conditions that caused the accident.
DOE Accident Investigator: An individual who understands DOE accident investigation techniques and has experience in conducting investigations through participation in at least one Federal investigation. Effective October 1, 1998, DOE accident investigators must have attended an accident investigation course of instruction that is based on current materials developed by the Office of Corporate Safety Programs.
DOE Operations: Activities funded by DOE for which DOE has authority to enforce environmental protection, safety, and health protection requirements.
DOE Site: A tract either owned by DOE, leased, or otherwise made available to the Federal government under terms that afford DOE rights of access and control substantially equal to those it would possess if it held the fee (or pertinent interest therein) as agent of and on behalf of the government. One or more DOE operations/program activities are carried out within the boundaries of the described tract.
Energy: The capacity to do work and overcome resistance. Energy exists in many forms, including acoustic, potential, electrical, kinetic, thermal, biological, chemical, and radiation (both ionizing and non-ionizing).
Energy Flow: The transfer of energy from its source to some other point. There are two types of energy flows: wanted (controlled—able to do work) and unwanted (uncontrolled—able to do harm).
Event: An occurrence; something significant and real-time that happens. An accident involves a sequence of events occurring in the course of work activity and culminating in unintentional injury or damage.
Events and Causal Factors Chart: Graphical depiction of a logical series of events and related conditions that precede the accident.
Eyewitness: A person who directly observed the accident or the conditions immediately preceding or following the accident.
Fatal Injury: Any injury that results in death within 30 calendar days of the accident.
Field Element: A general term for all DOE sites (excluding individual duty stations) located outside the Washington, D.C., metropolitan area.
General Witness: A person with knowledge about the activities prior to or immediately after the accident (the previous shift supervisor or work controller, for example).
Hazard: The potential for energy flow(s) to result in an accident or otherwise adverse consequence.
Heads of Field Elements: First-tier field managers of the operations offices, the field offices, and the power marketing administrations (administrators).
Human Factors: The study of human interactions with products, equipment, facilities, procedures, and environments used in work and everyday living. The emphasis is on human beings and how the design of equipment influences people.
Investigation: A detailed, systematic search to uncover the “who, what, when, where, why, and how” of an occurrence and to determine what corrective actions are needed to prevent a recurrence.
Investigation Report: A clear and concise written account of the investigation results.
Judgments of Need: Managerial controls and safety measures necessary to prevent or minimize the probability or severity of a recurrence of an accident.
Lessons Learned: A “good work practice” or innovative approach that is captured and shared to promote its repeated application. A lesson learned may also be an adverse work practice or experience that is captured and shared to avoid recurrence.
Occurrence: An event or condition that adversely affects or may adversely affect DOE or contractor personnel, the public, property, the environment, or DOE mission.
Occurrence Reporting and Processing System (ORPS): The reporting system established and maintained for reporting occurrences related to the operation of DOE facilities.
Operational Safety Analysis (OSA): The application of analytical methods to understand the potential consequences to life, health, property, or the environment caused by failure, whether due to human performance or to an element of a safety management system, within an operational environment.
Point of Contact: A DOE field or site staff member who is assigned the role of liaison with the Accident Investigation Program Manager in the Office of Corporate Safety Programs (HS-23), who administers the accident investigation program. In this role, the point of contact ensures that DOE site teams are trained in collecting and maintaining initial accident investigation evidence and that their activities are coordinated with accident and emergency response teams.
Principal Witness: A person who was actually involved in the accident.
Root Cause: The causal factor(s) that, if corrected, would prevent recurrence of the accident.
Root Cause Analysis: Any methodology that identifies the causal factors that, if corrected, would prevent recurrence of the accident.
Socio-technical: The interrelatedness of social and technical aspects of an organization, based on the principle that the interaction of social and technical factors creates the conditions for successful (or unsuccessful) organizational performance.
Target: A person, object, or animal upon which an unwanted energy flow may act to cause damage, injury, or death.
Threat: An action or force from human error, equipment malfunctions, operational process malfunctions, facility malfunctions, or natural disasters that could cause or trigger a hazardous energy release.
Verification Analysis: A validation technique that determines whether the logical flow of data from analysis to conclusions and Judgments of Need is based on facts. This technique is conducted after all the analyses are completed.
Appendix B. References
DOE Order 210.2A, DOE Corporate Operating Experience Program, August 8, 2011.
DOE Order 225.1B, Accident Investigations, March 4, 2011.
DOE Order 232.2, Occurrence Reporting and Processing of Operations Information, August 30, 2011.
DOE Order 231.1B, Environment, Safety and Health Reporting, June 27, 2011.
DOE Order 360.1C, Federal Employee Training, July 8, 2011.
DOE Order 422.1, Conduct of Operations, June 29, 2010.
DOE Order 450.2, Integrated Safety Management, April 25, 2011.
DOE Guide 450.4-1C, Integrated Safety Management System Guide, September 29, 2011.
DOE Policy 450.4A, Integrated Safety Management Policy, April 25, 2011.
DOE Standard (STD)-HDBK-1028-2009, Human Performance Improvement Handbook, Volumes 1 and 2, June 2009.
DOE Accident Investigation Electronic Reference Tool, January 2011.
Accident Investigation Day Planner, a Guide for Accident Investigation Board Chairpersons.
DOE-STD-1146-2007, General Technical Qualification Standard, Section 5.1, December 2007.
DOE-STD-1160-2003, Occupational Safety Functional Area Qualification Standard Competencies, Section 1.4, March 2003.
Center for Chemical Process Safety, Guidelines for Investigating Chemical Process Incidents, American Institute of Chemical Engineers, New York, New York, March 2003.
Defense Nuclear Facilities Safety Board, Tech 35 – Safety Management of Complex, High-Hazard Organizations, December 2004.
Appendix C. Specific Administrative Needs
Roles and Responsibilities of the Administrative Coordinator
The onsite administrative coordinator assists the Board Chairperson and Board members in the day-to-day activities of the accident investigation. This includes serving as a central point of contact for the Board, making arrangements for office facilities and equipment, managing report production, and maintaining investigation records.
Generally, the administrative coordinator (working closely with the Board Chairperson) is responsible for:
Arranging for appropriate onsite office/work space and furnishings (including a large conference room that can be locked when not in use by the Board, several small, hard-walled offices for conducting interviews, a central area to locate a library of documents collected, and several lockable file cabinets)
Arranging for local court reporter(s)
Arranging for security badges/passes for Board members and property permits for personal equipment (cameras, computers, etc.)
Arranging for specific security, access, safety, and health training, as required
Arranging for telephone service and dedicated fax machine
Arranging for a dedicated, high-speed copy machine that has collating and stapling capability
Selecting a hotel and reserving a block of rooms
Obtaining office supplies and consumables for use by Board members and support staff
Arranging for after-hours access to the site and work space
Serving as the custodian for all keys provided by the site
Determining site/field office contact for administrative and logistical support
Preparing and maintaining interview schedules (if requested by Board Chairperson)
Creating and maintaining onsite accident investigation files
Maintaining chain of custody for evidence (if requested by Board Chairperson)
Attending daily Board meetings and taking notes to assist the Chairperson
Tracking action items and follow-up activities to completion
Coordinating report preparation and production activities on site and at Headquarters
Arranging for shipment of files and records to Headquarters for archiving at the end of the investigation.
Appendix D. Forms
Preparation for Forming the Board
Sample Appointment Memorandum
MEMORANDUM FOR [NAME]: BOARD CHAIRPERSON OFFICE OF XYZ
FROM: [NAME]: APPOINTING OFFICIAL OFFICE OF XYZ
SUBJECT: Accident Investigation into the Fall and Serious Injury of a Worker in A-Area at the ABC Site, MM/DD/YYYY.
In accordance with the requirements of DOE O 225.1B, Accident Investigations, I am establishing an Accident Investigation Board (AIB) to investigate the fall of a worker in A-Area at the ABC Site, which occurred on MM/DD/YYYY. I have determined the event meets the criteria of: [Insert the appropriate AI categorization language here]; item 2.a.(2) (any single accident that results in the hospitalization for more than five calendar days, commencing within seven calendar days of the accident) for the conduct of an accident investigation delineated in Appendix A, DOE Order 225.1B.
You are appointed as the Board Chairperson. The Board will be composed of the following members:
[Name] – Office of Environmental Management – Chairperson.
[Name] – Office of River Protection - Trained Accident Investigator – Human Performance.
[Name] – Y-12 National Security Site – Integrated Safety Management System.
[Name] – Idaho National Engineering Laboratory – Conduct of Operations/Work Planning and Hazard Controls.
[Name] – Savannah River Operations Office (SR) – 10 CFR 851 Worker Protection Programs.
[Name] – Accident Analyst/Consultant/Advisor
[Name] – Medical Advisor
[Name] – ABC site - Administrative support
All members of the AIB are, by this letter, released from their regular duty assignments to serve on the AIB during the period the AIB is convened.
The scope of the Board’s investigation is to include, but not be limited to, identifying all relevant facts, determining direct, contributing, and root causes of the event, developing conclusions, and determining the judgments of need to prevent recurrence.
The scope of the investigation is to include Department of Energy’s (DOE) programs and oversight activities.
The Board is expected to provide my office with periodic reports on the status of the investigation. Please submit draft copies of the factual portion of the investigation report to me, the Office of XYZ, the DOE ABC Site, and the affected contractor for factual accuracy review prior to finalization. The final report should be provided to me within 30 days of the date of this memorandum. Discussion of the investigation and copies of the draft report will be controlled until I authorize release of the final report.
If you have any further questions, please contact [Name], Deputy Assistant Secretary, Office of XYZ, at (202) 586-XXXX.
cc: DOE/NNSA HQ and Site Line Officials
HSS AI Program Manager
Accident Investigation Individual Conflict of Interest Certification Form
I certify that all work to be performed by me in support of the accident investigation identified as:
(include the accident site name and date)
has been reviewed and does not present a conflict of interest concern.
I have no past, present, or currently planned interests that either directly or indirectly may relate to the subject matter of the work to be performed that may diminish my capacity to give impartial, technically sound, objective assistance and advice. Additionally, I have performed no services that might bias my judgment in relation to the work to be performed, or which could be perceived to impair my objectivity in performing the subject work.
(Print name) (Signature)
BOARD POSITION: Member Advisor Consultant (Federal employee)
CONTRACT NO. (if applicable):
DATE:
The original of this form remains with the accident investigation files. One copy will be sent to the:
Appointing Official
For DOE‐led investigations, include the following distribution:
DOE Accident Investigation Program Manager
Office of Corporate Safety Programs
Office of Health, Safety and Security
Department of Energy
(phone) 301‐903‐9840
(fax) 301‐903‐
NOTE: Statements or entries generally: Whoever, in any matter within the jurisdiction of any department or agency of the United States knowingly and willfully falsifies, conceals, or covers up by any trick, scheme, or device a material fact, or makes any false, fictitious, or fraudulent statements or representations, or makes or uses any false writing or document knowing the same to contain any false, fictitious, or fraudulent statement or entry, shall be fined or imprisoned not more than five years, or both. (18 USC Section 1001)
Accident Investigation Startup Activities List
Description of Activity Name of Designated Lead
HQ Site Other
Board Chairperson Responsibilities:
Attend briefing by Appointing Official
Assist in selecting, notifying, and briefing Board members and consultants/advisors
Identify all appropriate site authorities
Obtain details of accident from DOE site team leader and other site parties
Ensure that adequate evidence preservation and collection activities were initiated
Begin identifying and collecting background and factual information
Ask the Program Manager to search for information about similar accidents
Review all forwarded site and Board member information
Reassign normal business commitments
Establish a preliminary accident investigation schedule, including milestones and deadlines
Contact selected Board members, consultants/advisors, and site personnel
Arrange travel for self and expedite Board travel arrangements
Establish administrative support
Determine that logistical support for the accident investigation is established
Travel to site
Administrative Coordinator Responsibilities:
Make hotel selection and reserve a block of rooms for the Accident Investigation Board (if needed)
Determine contractor or DOE site/field office points of contact for administrative and logistical support, as needed
For DOE‐led investigations: Arrange for local court reporter support for interviews
Arrange for office/work space and furnishings for the Accident Investigation Board
Arrange for a large, dedicated conference room that can be locked when not in use by the Accident Investigation Board
Arrange for several small, hard‐walled offices to be used when conducting interviews
Arrange for security badges/passes for members of the Accident Investigation Board
Arrange for property permits for personal equipment (cameras, laptops, etc.) for members of the Accident Investigation Board
Arrange for specific security, access, safety, and health training, as required
Arrange for dedicated telephone services and a fax machine
Arrange for a dedicated, high‐speed copy machine that has collating and stapling capability
Obtain office supplies and consumables for use by the Accident Investigation Board
Arrange after‐hours access to site and work space, and assume responsibility for all keys/cards provided by the site
Prepare and maintain the interview schedule
Create and maintain accident investigation files
Arrange for an area central to work space to locate documents, lockable file cabinets, high‐speed copy machine, large‐volume document shredder(s), and fax machine
Accident Investigation Equipment Checklist
( ) | Checklist | Notes
DOCUMENT PACKET
DOE Order 225.1B, Accident Investigations
Accident Investigation Preliminary Interview List
Witness Statement Form
Barrier Analysis Form
Change Analysis Form
Chairperson Day Planner
SITE DOCUMENTS
Organization charts
Facility maps
Applicable blueprints and as‐built drawings
Policies and procedures manuals
ES&H manuals
Training manuals
OFFICE SUPPLIES
Adhesive notes (assorted sizes & colors)
Adhesive flags (assorted colors)
Chart paper (1/4" grid)
2 boxes suspension folders
3 boxes 1/3 cut (3‐tab) file folders
12 letter‐size expandable files
1 box full‐page dividers
3 boxes pens, red
3 boxes pens, black
4 heavy black markers
1 box yellow highlighters
1 box pencils (hard)
12 boxes paper clips
12 boxes binder clips (assorted)
1 box rubber bands (assorted)
1 heavy‐duty stapler
1 box heavy‐duty staples
1 heavy‐duty staple remover
4 boxes staples
8 desk staplers
8 staple removers
8 tape dispensers/tape
4 scissors
2 three‐hole punches
2 clipboards
12 three‐ring binders ‐ (1", 2", 3")
Assorted file folder labels
Overnight mailing supplies
Assorted envelopes (9"x12", 5"x7", 10"x13")
DOE‐HQ memorandum letterhead
24 ruled notepads
12 steno pads
3" x 5" index cards
Return address labels
Packing boxes
5 boxes double‐pocket portfolio (assorted colors)
Nylon filament tape
OFFICE EQUIPMENT
Telephones with voice mail capability
Computers/software
High‐speed printers, preferably color with duplex capability
Fax machine
Cassette tape recorder, cassettes, and batteries
High‐speed photocopier (multifunction)
Document shredder
Electric pencil sharpener
TOOLS
Flashlight or lantern (explosion‐proof)
Spare batteries and bulb for flashlight
Steel tape measure ‐ 100‐foot
Scale ‐ 12‐inch ruler
Scissors (heavy‐duty)
Compass ‐ professional type (e.g., MILSPEC Lensatic or surveyor’s)
Magnifying glass
Inspection mirrors ‐ large & small dental
Toothbrush ‐ natural bristle
Twine ‐ 300‐ft package wrapping
Cardboard tags, string
Masking tape (2‐inch)
SPECIAL DEVICES
Engineer’s scale
Calculators
Calipers, inside and outside diameter
PERSONAL PROTECTION EQUIPMENT
Hard hats
First aid kit
Glasses, other eye protection
Gloves, leather or canvas
Ear plugs, other hearing protection
Vest, orange flag person’s vest
Steel‐toed boots or shoes
Dust masks, respirators
This list is not exhaustive or limiting. Use this checklist as a starting point and add or delete items as needed.
Forms for Witness Statements and Interviews
Accident Investigation Witness Statement Form
Name: Job Title:
Telephone No.: Supervisor:
Work Location:
Location of Accident:
Accident Time and Date:
Please fully describe the accident sequence from start to finish (use additional paper as needed):
Please fully describe the work and conditions in progress leading up to the accident (use additional paper as needed):
Note anything unusual you observed before or during the accident (sights, sounds, odors, etc.):
What was your role in the accident sequence?
What conditions influenced the accident (weather, time of day, equipment malfunctions, etc.)?
What do you think caused the accident?
How could the accident have been prevented?
Please list other possible witnesses
Additional comments/observations:
Signature: Date/Time:
Accident Investigation Preliminary Interview List
Interviewee/Title | Reason for Interview | Phone | Location/Shift/Company Affiliation | Notes
Accident Investigation Interview Form
Interviewee: Title/Position
Interviewer: Title/Position
Page ____ of ____
Others Present: Date: Time:
Initial Questions:
Follow‐up Questions:
Observations of Interviewee:
Notes:
Model Interview Opening Statement [To be recorded]
Let the record reflect that this interview has commenced at ________ (time), ________ (date), and ________ (place).
I am ________ (interviewer’s name) of ________ (employment affiliation).
With me are:
________ (name(s) and organization(s) of other Department personnel)
________ (name(s) and organization(s) of other Department personnel)
________ (name(s) and organization(s) of other Department personnel)
For the record, please state your full name, company affiliation, job title, or position.
Read into record the names and employment of any additional persons present (other than the recorder).
The Department has established an Accident Investigation Board to determine the facts that led to the ________ (accident date) accident at ________ (place of accident).
The principal purpose of this investigation is to determine the facts surrounding the accident so that proper remedial measures can be instituted to prevent the recurrence of accidents. We have authority to conduct this investigation under the Department of Energy Organization Act, which incorporates provisions of the Atomic Energy Act of 1954 authorizing investigations of this type.
Your appearance here to provide information is entirely voluntary, and you may stop testifying and leave at any time. However, you should understand that giving false testimony in this investigation would be a felony under 18 U.S. Code Section 1001. Do you understand that?
You have the right to be accompanied by an attorney or a union representative. (If the witness has an attorney or a union representative, put the name of such person into the record.) “Let the record reflect that Mr./Mrs./Ms. ________ is accompanied by ________ (as his/her attorney or union representative).”
We would like to record this interview to ensure an accurate record of your statements. A transcript of this discussion will be produced, and you will have an opportunity to review the transcript for factual accuracy and corrections. If you do not wish to have the session recorded, we will not do so. Do you have any objection to having the session recorded?
We will attempt to keep your testimony confidential but we cannot guarantee it. At a later date, we may have to release your testimony pursuant to a request made under the Freedom of Information Act, a court order, or in the course of litigation concerning the accident, should such litigation arise. Do you want your testimony to be considered confidential? (wait for answer--if answer to preceding question is affirmative).
Informal Personal or Telephone Interview Form
Date:
Time:
Personal or Telephone Interview?
Interviewee Name:
Telephone:
Pager:
Interviewee Title:
Interviewee Employer:
Board Interviewer(s) Name(s) (Print):
Interview Notes
Reference Copy of 18 USC Sec. 1001 for Information:
18 USC Sec. 1001 (01/05/2009)
TITLE 18 ‐ CRIMES AND CRIMINAL PROCEDURE
PART I ‐ CRIMES
CHAPTER 47 ‐ FRAUD AND FALSE STATEMENTS
Sec. 1001. Statements or entries generally
(a) Except as otherwise provided in this section, whoever, in any matter within the jurisdiction of the executive, legislative, or judicial branch of the Government of the United States, knowingly and willfully ‐
(1) falsifies, conceals, or covers up by any trick, scheme, or device a material fact;
(2) makes any materially false, fictitious, or fraudulent statement or representation; or
(3) makes or uses any false writing or document knowing the same to contain any materially false, fictitious, or fraudulent statement or entry;
shall be fined under this title, imprisoned not more than 5 years or, if the offense involves international or domestic terrorism (as defined in section 2331), imprisoned not more than 8 years, or both. If the matter relates to an offense under chapter 109A, 109B, 110, or 117, or section 1591, then the term of imprisonment imposed under this section shall be not more than 8 years.
(b) Subsection (a) does not apply to a party to a judicial proceeding, or that party's counsel, for statements, representations, writings or documents submitted by such party or counsel to a judge or magistrate in that proceeding.
(c) With respect to any matter within the jurisdiction of the legislative branch, subsection (a) shall apply only to ‐
(1) administrative matters, including a claim for payment, a matter related to the procurement of property or services, personnel or employment practices, or support services, or a document required by law, rule, or regulation to be submitted to the Congress or any office or officer within the legislative branch; or
(2) any investigation or review, conducted pursuant to the authority of any committee, subcommittee, commission or office of the Congress, consistent with applicable rules of the House or Senate.
Additional Information:
Even constitutionally explicit Fifth Amendment privileges do not exonerate affirmative false statements. United States v. Wong, 431 U.S. 174, 178, 52 L. Ed. 2d 231, 97 S. Ct. 1823 (1977). As the Court in Wong said, "Our legal system provides methods for challenging the Government's right to ask questions ‐‐ lying is not one of them." Id., at 180, quoting Bryson v. United States, 396 U.S. 64, 72, 24 L. Ed. 2d 264, 90 S. Ct. 355 (1969).
(In other words, in the unlikely circumstance where there is a potential for self‐incrimination, the witness is legally better off refusing to say anything without the advice of counsel than making a false statement to mislead investigators. The example opening statement addresses this circumstance.)
TRANSCRIPT REVIEW STATEMENT
Department of Energy Accident Investigation of:
[title]
I have reviewed, corrected, or added to and initialed and dated my changes to the transcript of my interview in reference to the subject above. I understand that my transcript will be protected against unauthorized disclosure by the Department of Energy Accident Investigation Board but may be released at a later date under the provisions of the Freedom of Information Act or a court order. The transcript is also subject to the Privacy Act of 1974 regarding personal information.
DATE:
PRINT NAME:
SIGNATURE:
COMPANY NAME:
TRANSCRIPT REQUEST
I hereby request a copy of my interview transcript. I understand that a copy will be provided for my personal records only.
SIGNATURE:
MAILING ADDRESS:
Transcript Receipt & Review Tracking (Updated MM/DD/YY, Time)
Interviewee Name | Company | Date Interviewed | Date CR Transcript Received | Date Scheduled for Transcript Review by Interviewee | Transcript Review Completed (Y/N)
Evidence Collecting
Accident Investigation Information Request Form
Requested From: | Requested By:
Contact Person: | Location:
Phone Number: | Phone Number:
Fax Number: | Fax Number:
Information Requested | How Transmitted | Date Received
Checklist of Documentary Evidence
As the investigation and interviews continue, the team will recognize the need for additional documentary evidence on which to base their understanding of how work was planned to be accomplished. There are many available sources for documentary evidence (paper and electronic information). Sources of documentary evidence and possible lines of questioning for this information include:
A. Work Records
Work orders: the history from the initial work request through the work order number should be available
Electronic Work Order approval (What does approval signature mean?)
Work orders, logbooks, training records (certifications/qualifications), forms, time sheets.
B. Active electronic records
Contract Documents, Directives, Manuals, Work Instructions, Forms (Review for possible changes in the process that may have been a setup factor)
C. Archived electronic records
Standards, Internal Operating Procedures, and Work Instructions
Prior issues of existing Directives, Manuals, Work Instructions
Review for possible changes in the process that may have been a setup factor
Problem Evaluation Reporting System
Main database to document most anomalies, assessment findings and weaknesses, occurrence reports, and Corrective Actions
D. Electronic Suspense Tracking and Routing System
Problem evaluation reports (provides auditable trail for tasks that have been entered). Closeout of Corrective Actions from similar events
E. Contractor Assessments
Contractor assessments become more useful sources of information as they mature. Are you doing self‐assessments in the area being investigated? Are Corrective Actions appropriate for the findings or weaknesses? Are there observations being documented without corresponding corrective action that collectively could have been an indication of a more systemic problem?
Closeout of Corrective Actions from similar events
F. Lessons Learned
Have similar events been experienced elsewhere? What did the organization do after receiving the lesson learned? Was it acted upon?
G. Occurrence Reporting System
Look for issues similar to the area investigated. Are Corrective Actions appropriate and meaningful for the event reported? Is there indication of a series of similar problems?
Nonconformance reports
H. Directives (orders, standards, guides)
Are there external drivers applicable to the investigation area? How did the company flow down and implement the requirement?
I. External reviews or assessments
Was this organization assessed or reviewed by an external agency in the area currently being investigated?
Were Corrective Actions appropriate for any findings, weaknesses, or observations? What evidence do you have to support this?
J. Working Conditions
Were excessive working hours a potential contributor to the event? Work with Human Resources to get the work history for affected personnel.
Evidence Sign-out Sheet
Evidence Number | Your Name | Date & Time Out | Check Here When Returned
Physical Evidence Log Form
Tag Number | Evidence Description | Original Location Reference | Storage Location | Inventoried & Tagged by: Name/Signature/Date/Time | Released by: Name/Signature/Date/Time | Received by: Name/Signature/Date/Time
Attach copy of Accident Investigation Sketch of Physical Evidence Locations
Site Sketch
Board Member: Title:
Date: Time:
Attach copy of Sketch of Position Mapping Form
Position Mapping Form
Team Member: Title:
Date: Time:
Code # | Object | Reference Point | Distance | Direction
Attach copy of Site Map and Site Sketch
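The Position Mapping Form records each object as a distance and direction from a fixed reference point. As a rough illustration only, the Python sketch below converts such entries into east/north coordinates that could be plotted on the site sketch. It assumes that Direction is recorded as a compass bearing in degrees clockwise from north and that Distance is in feet; the form does not prescribe either convention. The names MappedObject, offset_from_reference, and locate are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MappedObject:
    """One row of the Position Mapping Form (field names are illustrative)."""
    code: str              # Code #
    name: str              # Object
    reference_point: str   # Reference Point
    distance_ft: float     # Distance (assumed to be in feet)
    bearing_deg: float     # Direction (assumed: degrees clockwise from north)


def offset_from_reference(distance_ft, bearing_deg):
    """Convert a distance and compass bearing into (east, north) offsets."""
    theta = math.radians(bearing_deg)
    east = distance_ft * math.sin(theta)
    north = distance_ft * math.cos(theta)
    return east, north


def locate(obj, reference_points):
    """Return the (east, north) position of obj on the site grid.

    reference_points maps each reference point name to its own
    (east, north) coordinates on the same grid.
    """
    ref_east, ref_north = reference_points[obj.reference_point]
    d_east, d_north = offset_from_reference(obj.distance_ft, obj.bearing_deg)
    return ref_east + d_east, ref_north + d_north
```

For example, an object logged 10 ft due south (bearing 180) of a reference point at (100, 200) would be placed at approximately (100, 190). Keeping the reference points in one lookup table makes it easier to re-plot every object if a reference point is later re-surveyed.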
Sketch of Physical Evidence Locations and Orientations
Team Member: Title:
Date: Time:
Attach copy of Physical Evidence Log Form
Photographic Log Sheet
Photographer: Camera Type: Lighting Type:
Film Roll No.:
Location: Date: Time:
Photo No. | Scene/Subject | Date of Photo | Time of Photo | Lens f/# | Direction of Camera | Distance From Subject
Sketch of Photography Locations and Orientations
Team Member: Title:
Date: Time:
Attach copy of Investigation Position Mapping Form
Analysis Worksheets
Barrier Analysis Worksheet
Hazard: Target:
What were the barriers? | How did each barrier perform? | Why did the barrier fail? | How did the barrier affect the accident? | Context: ISM/HPI
Change Analysis Worksheet
Factors | Accident Situation | Prior, Ideal or Accident‐Free Situation | Difference | Evaluation of Effect
WHAT: Conditions, occurrences, activities, equipment
WHEN: Occurred, identified, facility status, schedule
WHERE: Physical location, environmental conditions
WHO: Staff involved, training, qualification, supervision
HOW: Control chain, hazard analysis monitoring
OTHER
Factual Accuracy Review
Example Cover Sheet for Facts Section
Accident Investigation for
[Title of Accident Investigation]
Factual Accuracy Review [Date]
[Time: from – to]
Control Copy Number:
Reviewer Name:
Reviewer Company:
Mark this copy with your factual accuracy comments. Please print!
If you believe one or more of the Board’s facts are not correct, you must submit documented evidence (e.g., report, training record, contract, etc.) to the Board Coordinator by 5:00 pm today to support your claim.
* This document may not be copied or removed from this room. Please return this copy to the Proctor by ________ am/pm.
Example Factual Accuracy Room Sign
ACCIDENT INVESTIGATION
FACTUAL ACCURACY REVIEW
[Date] [Time: from – to]
Example Sign-In Sheet for Factual Accuracy Review
Accident Investigation for
[Title of Accident Investigation]
Factual Accuracy Review [Date]
[Time: from – to]
Control Copy No. | Name (Please Print) | Company (Abbreviation) | Telephone | Time Copy Returned
Attachment 1. ISM Crosswalk and Safety Culture Lines of Inquiry
Crosswalk between ISM Core Functions and the Break-the-Chain Framework
ISM Core Function | Break‐the‐Chain Framework
CF #1: Define Scope
Work is clearly defined, including the boundaries, priority, resources required and expectations for completion. The level of detail required in the work scope is commensurate with the importance and complexity of the work and the potential risk, the associated hazards, and the controls needed to mitigate hazards.
BTC Step #1: Identify the Consequence to Avoid
Catastrophic consequences are listed in priority order to:
Remind everyone of the potential catastrophic consequences to avoid each day
Pinpoint where barriers are most needed; the severity of the consequences will drive the number and type of barriers selected
Ensure barriers protecting highest priority consequences receive top protection against degradation
Encourage constant review of resources against consequences, to ensure the most severe consequences are avoided at all times and at any cost.
Efforts to protect against catastrophic events should never be diluted by an organization’s efforts to prevent less‐consequential events. Focus must be maintained on system accidents to assure that the needed attention and resources are available to prevent them.
CF #2: Identify Hazards
Task‐level, or work planning control, identifies hazards tailored to the work performed. It identifies hazards with the potential to harm workers, the facilities or the environment. Pantex provides each worker with an awareness of their work place hazards.
BTC Step #2: Identify the Hazard to Protect and Minimize
Identify the hazard
Minimize the hazard
Reduce interactive complexity and tight coupling
CF #3: Develop and Implement Hazard Controls
Controls identified and tailored as appropriate to adequately address the hazards identified with the work. Provide each worker an awareness of the controls that protect their safety from identified hazards. Implement controls in a manner that is sufficient to ensure they sufficiently accomplish their intent.
BTC Step #3: Reduce Threats
Identify and reduce threats from human error, faulty equipment, tooling, facilities, and from natural
BTC Step #4: Manage Defenses
Manage Defenses to Reduce the Probability of the Systems Accident
Manage Defenses to Mitigate the Consequences of the System Accident
CF #4: Perform Work Within Controls
Supervisors evaluate work packages, before starting work, to ensure controls are in place to mitigate hazards. Work is performed in accordance with identified controls and evaluated to indicate how safely work is performed.
BTC Step #5: Reduce Vulnerability to the Hazard Through Strong Culture of Reliability
Understand concept of culture of reliability, how it is measured, enhanced and sustained
Demonstrate conservative operational decisions with regard to the selected safety system
CF #5: Feedback and Improvement
Mechanisms (including independent means) collect data and generate information to make improvements to all phases of planning and conducting the work safely.
BTC Step #6: Minimize Gap Between “work‐as‐imagined” & “work‐as‐done”
Ensure BTC framework effective at process start‐up
Encourage worker‐supervisor interactions
Track and trend performance indicators
Perform Causal Factors Analyses on “information‐rich” events
Learn from other people’s mistakes
Safety Culture Lines of Inquiry Addressing the Seven ISM Guiding Principles
(Developed by EFCOG Safety Culture Task Group)
LEADERSHIP
Line Management Responsibility for Safety
Leadership and culture are two sides of the same coin; neither can be realized without the other. Leaders create and manage the safety culture in their organizations by maintaining safety as a priority, communicating their safety expectations to the workers, setting the standard for safety through actions, not talk (walk the talk), and leading needed change by defining the current state, establishing a vision, developing a plan, and implementing the plan effectively. Leaders cultivate trust to engender active participation in safety and to establish feedback on the effectiveness of their organization’s safety efforts.
SAFETY CULTURE ATTRIBUTE SOURCE OF ATTRIBUTE FROM ISM GUIDE (DOE G 450.4-1C)
1. Leaders assure plans integrate 1.1 What type of evidence do you see that demonstrates line safety into all aspects of an managers understand and accept their safety responsibilities organization’s activities inherent in mission accomplishment by not depending on considering the consequences of supporting organizations to build safety into line management operational decisions for the work activities? entire life-cycle of operations and 1.2 What type of evidence do you see that demonstrates line the safety impact on business managers regularly and promptly communicate important processes, the organization, the operational decisions, their basis, expected outcomes, public, and the environment. potential problems, and planned contingencies?
2. Leaders understand their 2.1 What type of evidence do you see that demonstrates line business and ensure the managers have a clear understanding of their work activities systems employed provide the and their performance objectives, and how they will conduct requisite safety by identifying their work activities safely and accomplish their performance and minimizing hazards, proving objectives. the activity is safe, and not 2.2 What type of evidence do you see that demonstrates key assuming it is safe before technical managers are assigned for long terms of service to operations commence. provide institutional continuity and constancy regarding safety
requirements and expectations? Is organizational knowledge valued and efforts made to preserve it when key players move on? 2.3 What type of evidence do you see that demonstrates facilities are designed, constructed, operated, maintained, and decommissioned using consensus industry codes and standards, where available and applicable, to protect workers, the public, and the environment? 2.4 What type of evidence do you see that demonstrates applicable requirements from laws, statutes, rules and regulations are identified and captured so that compliance can be planned, expected, demonstrated, and verified? 2.5 What type of evidence do you see that demonstrates clear, concise technical safety directives are centrally developed, where necessary, and are based on sound engineering judgment and data? Are DOE directives and technical standards actively maintained up to date and accurate? 2.6 What type of evidence do you see that demonstrates a
1‐3
DOE‐HDBK‐1208‐2012
LEADERSHIP clearly-defined set of safety requirements and standards is invoked in management contracts, or similar agreements? Are accepted process used for identification of the appropriate set of requirements and standards? And is this set of requirements is comprehensive and do they include robust quality assurance, safety, and radiological and environmental protection requirements? 2.7 What type of evidence do you see that demonstrates implementing plans, procedures and protocols are in place to translate requirements into action by the implementing organization? 2.8 What type of evidence do you see that demonstrates technical and operational safety requirements clearly control the safe operating envelope? Is the safety envelope clearly specified and communicated to individuals performing operational tasks? 2.9 What type of evidence do you see that demonstrates exemptions from applicable technical safety requirements are both rare and specific, provide an equivalent level of safety, have a compelling technical basis, and are approved at an appropriate organizational level? 2.10 What type of evidence do you see that demonstrates compliance with applicable safety and technical requirements is expected and verified? 2.11 What type of evidence do you see that demonstrates willful violations of requirements are rare, and personnel and organizations are held strictly accountable in the context of a just culture? Are unintended failures to follow requirements are promptly reported, and personnel and organizations are given credit for self-identification and reporting of errors? How do you really know? 2.12 What type of evidence do you see that demonstrates the organization actively seeks continuous improvement to safety standards and requirements through identification and sharing of effective practices, lessons learned, and applicable safety research? 
What type of evidence do you see that demonstrates the organization committed to continuously rising standards of excellence? 2.13 What type of evidence do you see that demonstrates work hazards are identified and controlled to prevent or mitigate accidents, with particular attention to high consequence events with unacceptable consequences? Through your interviews and direct interactions, do the workers understand hazards and controls before beginning work activities? 2.14 What type of evidence do you see that demonstrates the selection of hazard controls considers the type of hazard, the magnitude of the hazard, the type of work being performed,
1‐4
DOE‐HDBK‐1208‐2012
LEADERSHIP and the life-cycle of the facility? Are these controls designed, implemented, and maintained commensurate with the inherent level and type of hazard? 2.15 What type of evidence do you see that demonstrates safety analyses identifying work hazards are comprehensive and based on sound engineering judgment and data? 2.16 What type of evidence do you see that demonstrates defense in depth is designed into highly-hazardous operations and activities, and includes independent, redundant, and diverse safety systems, which are not overly complex? Do defense in depth controls include engineering controls, administrative processes, and personnel staffing and capabilities? 2.17 What type of evidence do you see that demonstrates emphasis is placed on designing the work and/or controls to reduce or eliminate the hazards and to prevent accidents and unplanned releases and exposures? 2.18 What type of evidence do you see that demonstrates the following hierarchy of defense in depth is recognized and applied: (1) elimination or substitution of the hazards, (2) engineering controls, (3) work practices and administrative controls, and (4) personal protective equipment? Are inherently safe designs preferred over ones requiring engineering controls? Is prevention emphasized in design and operations to minimize the use of, and thereby possible exposure to, toxic or hazardous substances? 2.19 What type of evidence do you see that demonstrates equipment is consistently maintained so that it meets design requirements? 2.20 What type of evidence do you see that demonstrates safety margins are rigorously maintained? What type of evidence do you see that demonstrates design and operating margins are carefully guarded and changed only with great thought and care? Is special attention placed on maintaining defense-in-depth? 2.21 What type of evidence do you see that demonstrates organizations implement hazard controls in a consistent and reliable manner? 
Is safety embedded in processes and procedures through a functioning formal integrated safety management system? Are facility activities governed by comprehensive, efficient, high-quality processes and procedures? 2.22 What type of evidence do you see that demonstrates formal facility authorization agreements are in place and maintained between owner and operator? 2.23 What type of evidence do you see that demonstrates readiness at the facility level is verified before hazardous operations commence? Are pre-operational reviews used to confirm that controls are
1‐5
DOE‐HDBK‐1208‐2012
LEADERSHIP in place for known hazards? 2.24 What type of evidence do you see that demonstrates facility operations personnel maintain awareness of all facility activities to ensure compliance with the established safety envelope? 2.25 What type of evidence do you see that demonstrates work authorization is defined at the activity level? Does the work authorization process verify that adequate preparations have been completed so that work can be performed safely? Do these preparations include verifying that work methods and requirements are understood; verifying that work conditions will be as expected and not introduce unexpected hazards; and verifying that necessary controls are implemented? 2.26 What type of evidence do you see that demonstrates the extent of documentation and level of authority for work authorization is based on the complexity and hazards associated with the work?
3. Leaders consider safety 3.1 What type of evidence do you see that demonstrates line implications in the change managers maintain a strong focus on the safe conduct of work management processes. activities?
What type of evidence do you see that demonstrates line managers maintain awareness of key performance indicators related to safe work accomplishment, watch carefully for adverse trends or indications, and take prompt action to understand adverse trends and anomalies?
4. Leaders model, coach, mentor, and reinforce their expectations and behaviors to improve safe business performance.
4.1 What type of evidence do you see that demonstrates line managers are committed to safety? Are the top-level line managers the leading advocates of safety, and do they demonstrate their commitment in both word and action? Do line managers periodically take steps to reinforce safety, including personal visits and walkthroughs to verify that their expectations are being met?
4.2 What type of evidence do you see that demonstrates line managers spend time on the floor? Do line managers practice visible leadership in the field by placing “eyes on the problem,” coaching, mentoring, and reinforcing standards and positive behaviors? Are deviations from expectations corrected promptly and, when appropriate, analyzed to understand why the behaviors occurred?
5. Leaders value employee involvement, encourage individual questioning attitude, and instill trust to encourage raising issues without fear of retribution.
5.1 What type of evidence do you see that demonstrates line managers are skilled in responding to employee questions in an open, honest manner? Do line managers encourage and appreciate the reporting of safety issues and errors, and refrain from disciplining employees for reporting errors? Do line managers encourage a vigorous questioning attitude toward safety, and constructive dialogues and discussions on safety matters?
6. Leaders assure employees are trained, experienced and have the resources, the time, and the tools to complete their job safely.
6.1 What type of evidence do you see that demonstrates staffing levels and capabilities are consistent with the expectation of maintaining safe and reliable operations?
6.2 What type of evidence do you see that demonstrates the organizational staffing provides sufficient depth and redundancy to ensure that all important safety functions are adequately performed?
6.3 What evidence do you have that demonstrates to line managers the organization is able to build and sustain a flexible, robust technical staff and staffing capacity? Are pockets of resilience established through redundant resources so that adequate resources exist to address emergent issues? Does the organization develop sufficient resources to rapidly cope and respond to unexpected changes?
6.4 What type of evidence do you see that demonstrates adequate resources are allocated for safety upgrades and repairs to aging infrastructure? Are modern infrastructure and new facility construction pursued to improve safety and performance over the long term?
7. Leaders hold personnel accountable for meeting standards and expectations to fulfill safety responsibilities.
7.1 Are responsibility and authority for safety well defined and clearly understood as an integral part of performing work?
7.2 What type of evidence do you see that demonstrates organizational safety responsibilities are sufficiently comprehensive to address the work activities and hazards involved?
7.3 What type of evidence do you see that demonstrates the line of authority and responsibility for safety is defined from the Secretary to the individual contributor? Does each of these positions have clearly defined roles, responsibilities, and authorities, designated in writing and understood by the incumbent?
7.4 What type of evidence do you see that demonstrates ownership boundaries and authorities are clearly defined at the institutional, facility, and activity levels, and interface issues are actively managed?
7.5 Are organizational functions, responsibilities, and authorities documents maintained current and accurate?
7.6 Are reporting relationships, positional authority, staffing levels and capability, organizational processes and infrastructure, and financial resources commensurate with, and supportive of, fulfillment of assigned or delegated safety responsibilities?
7.7 What type of evidence do you see that demonstrates all personnel understand the importance of adherence to standards?
7.8 What type of evidence do you see that demonstrates line managers review the performance of assigned roles and responsibilities to reinforce expectations and ensure that key safety responsibilities and expectations are being met?
7.9 What type of evidence do you see that demonstrates personnel at all levels of the organization are held accountable for shortfalls in meeting standards and expectations related to fulfilling safety responsibilities? Is accountability demonstrated both by recognition of excellent safety performers as well as identification of less-than-adequate performers? In holding people accountable, in the context of a just culture, do managers consider individual intentions and the organizational factors that may have contributed?
8. Leaders insist on conservative decision making with respect to the proven safety system and recognize that production goals, if not properly considered and clearly communicated, can send mixed signals on the importance of safety.
8.1 Do organization managers frequently and consistently communicate the safety message, both as an integral part of the mission and as a stand-alone theme?
8.2 Do managers recognize that aggressive mission and production goals can appear to send mixed signals on the importance of safety? Are managers sensitive to detect and avoid these misunderstandings, or to deal with them effectively if they arise? What type of evidence supports your claim?
8.3 What type of evidence do you see that demonstrates the organization has a strong sense of mission and operational goals, including a commitment to highly reliable operations, both in production and safety? Are safety and productivity both highly valued?
8.4 What type of evidence do you see that demonstrates safety and productivity concerns both receive balanced consideration in funding allocations and schedule decisions? Are resource allocations adequate to address safety? If funding is not adequate to ensure safety, are operations discontinued?
9. Leadership recognizes that humans make mistakes and takes actions to mitigate this.
9.1 What type of evidence do you see that demonstrates hazard controls are designed with an understanding of the potential for human error? Are error-likely situations identified, eliminated, or mitigated? Is the existence of known error-likely situations communicated to workers prior to commencing work along with planned mechanisms to assure their safety? What is your proof?
10. Leaders develop healthy, collaborative relationships within their own organization and between their organization and regulators, suppliers, customers and contractors.
EMPLOYEE INVOLVEMENT
Individual Attitude and Responsibility for Safety
Safety is everyone’s responsibility. As such, employees understand and embrace the organization’s safety behaviors, beliefs, and underlying assumptions. Employees understand and embrace their responsibilities, maintain their proficiency so that they speak from experience, challenge what is not right, help fix what is wrong, and police the system to ensure they, their co-workers, the environment, and the public remain safe.
SAFETY CULTURE ATTRIBUTE
SOURCE OF ATTRIBUTE FROM ISM GUIDE (DOE G 450.4-1C)
1. Individuals team with leaders to commit to safety, to understand safety expectations, and to meet expectations.
1.1 What type of evidence do you see that demonstrates personnel at all levels of the organization are held accountable for shortfalls in meeting standards and expectations related to fulfilling safety responsibilities? Is accountability demonstrated both by recognition of excellent safety performers as well as identification of less-than-adequate performers? In holding people accountable, in the context of a just culture, do managers consider individual intentions and the organizational factors that may have contributed?
1.2 What type of evidence do you see that demonstrates individuals understand and demonstrate responsibility for safety? Are safety and its ownership apparent in everyone's actions and deeds? Are workers actively involved in identification, planning, and improvement of work and work practices? Do workers follow approved procedures? Can workers at any level stop unsafe work or work during unexpected conditions? Is there any evidence that they have stopped work?
2. Individuals work with leaders to increase the level of trust and cooperation by holding each other accountable for their actions with success evident by the openness to raise and resolve issues in a timely fashion.
2.1 What type of evidence do you see that demonstrates individuals promptly report errors and incidents? Do individuals feel safe from reprisal in reporting errors and incidents? Do they offer suggestions for improvements?
2.2 What type of evidence do you see that demonstrates individuals are systematic and rigorous in making informed decisions that support safe, reliable operations? Are workers expected and authorized to take conservative actions when faced with unexpected or uncertain conditions? Do you have any evidence that they have ever exercised this right? Do line managers support and reinforce conservative decisions based on available information and risks?
3. Everyone is personally responsible and accountable for safety; they learn their jobs, they know the safety systems, and they actively engage in protecting themselves, their co-workers, the public and the environment.
3.1 What type of evidence do you see that demonstrates personnel at all levels of the organization are held accountable for shortfalls in meeting standards and expectations related to fulfilling safety responsibilities? Is accountability demonstrated both by recognition of excellent safety performers as well as identification of less-than-adequate performers? In holding people accountable, in the context of a just culture, do managers consider individual intentions and the organizational factors that may have contributed?
3.2 What type of evidence do you see that demonstrates people and their professional capabilities, experiences, and values are regarded as the organization’s most valuable assets? Do organizational leaders place a high personal priority and time commitment on recruiting, selecting, and retaining an excellent technical staff?
3.3 Does the organization maintain a highly knowledgeable workforce to support a broad spectrum of operational and technical decisions? Is the right technical and safety expertise embedded in the organization and, when necessary, is outside expertise employed?
3.4 What type of evidence do you see that demonstrates individuals have in-depth understanding of safety and technical aspects of their jobs?
3.5 What type of evidence do you see that demonstrates the technical qualification standards are defined and personnel are trained accordingly? Do technical support personnel have expert-level technical understanding? Do managers have strong technical backgrounds in their area of expertise?
3.6 What type of evidence do you see that demonstrates assignments of safety responsibilities and delegations of associated authorities are made to individuals with the necessary technical experience and expertise? In rare cases, if this is not possible, are corrective and compensatory actions taken?
4. Individuals develop healthy skepticism and constructively question deviations to the established safety system and actively work to avoid complacency or arrogance based on past successes.
4.1 What type of evidence do you see that demonstrates individuals cultivate a constructive, questioning attitude and healthy skepticism when it comes to safety? Do individuals question deviations, and avoid complacency or arrogance based on past successes? Do team members support one another through both awareness of each other’s actions and constructive feedback when necessary?
4.2 What type of evidence do you see that demonstrates individuals are aware of and counteract human tendencies to simplify assumptions, expectations, and analysis? Are diversity of thought and opposing views welcomed and considered? Is intellectual curiosity encouraged?
4.3 What type of evidence do you see that demonstrates individuals are intolerant of conditions or behaviors that have the potential to reduce operating or design margins? Are anomalies thoroughly investigated, promptly mitigated, and periodically analyzed in the aggregate? Is the bias set on proving work activities are safe before proceeding, rather than proving them unsafe before halting? Do personnel refrain from proceeding, and not allow others to proceed, when safety is uncertain? Do you have any evidence that they ever have exercised this right?
4.4 What type of evidence do you see that demonstrates individuals outside of the organization (including subcontractors, temporary employees, visiting researchers, vendor representatives, etc.) understand their safety responsibilities?
5. Individuals make conservative decisions with regards to the proven safety system and consider the consequences of their decisions for the entire life-cycle of operations.
5.1 What type of evidence do you see that demonstrates individuals are mindful of the potential impact of equipment and process failures; they are sensitive to the potential of faulty assumptions and errors, and demonstrate constructive skepticism? Do they appreciate that mindfulness requires effort?
5.2 What type of evidence do you see that demonstrates individuals recognize that errors and imperfections are likely to happen? Do they recognize the limits of foresight and anticipation, and watch for things that have not been seen before? Do they appreciate that error-likely situations are predictable, manageable, and preventable, and seek to identify and eliminate latent conditions that give rise to human performance errors?
5.3 What type of evidence do you see that demonstrates individuals are systematic and rigorous in making informed decisions that support safe, reliable operations? Are workers expected and authorized to take conservative actions when faced with unexpected or uncertain conditions? How do you know? Do line managers support and reinforce conservative decisions based on available information and risk?
6. Individuals openly and promptly report errors and incidents and don’t rest until problems are fully resolved and solutions proven sustainable.
6.1 What type of evidence do you see that demonstrates individuals promptly report errors and incidents? Is there a sense that they feel safe from reprisal in reporting errors and incidents? Do they offer suggestions for improvements?
7. Individuals instill a high level of trust by treating each other with dignity and respect and avoiding harassment, intimidation, retaliation, and discrimination. Individuals welcome and consider a diversity of thought and opposing views.
7.1 What type of evidence do you see that demonstrates individuals cultivate a constructive, questioning attitude and healthy skepticism when it comes to safety? Do individuals question deviations, and avoid complacency or arrogance based on past successes? Do team members support one another through both awareness of each other’s actions and constructive feedback when necessary?
7.2 What type of evidence do you see that demonstrates individuals are aware of and counteract human tendencies to simplify assumptions, expectations, and analysis? Are diversity of thought and opposing views welcomed and considered? Is intellectual curiosity encouraged?
8. Individuals help develop healthy collaborative relationships within their organization and between their organization and regulators, suppliers, customers and contractors.
ORGANIZATIONAL LEARNING
Organizational Learning for Performance Improvement
The organization learns how to positively influence the desired behaviors, beliefs and assumptions of its healthy safety culture. The organization acknowledges that errors are a way to learn by rewarding those who report, sharing what is wrong, fixing what is broken and addressing the organizational setup factors that led to employee error. This requires focusing on reducing recurrences by correcting deeper, more systemic causal factors and systematically monitoring performance and interpreting results to generate decision-making information on the health of the system.
SAFETY CULTURE ATTRIBUTE
SOURCE OF ATTRIBUTE FROM ISM GUIDE (DOE G 450.4-1C)
1. The organization establishes and cultivates a high level of trust; individuals are comfortable raising, discussing and resolving questions or concerns.
1.1 What type of evidence do you see that demonstrates credibility and trust are present and continuously nurtured? Do line managers reinforce perishable values of trust, credibility, and attentiveness? Is the organization just, that is, do line managers demonstrate an understanding that humans are fallible and, when mistakes are made, does the organization seek first to learn as opposed to blame? Is the system of rewards and sanctions aligned with strong safety policies, and does it reinforce the desired behaviors and outcomes?
1.2 What type of evidence do you see that demonstrates open communications and teamwork are the norm? Are people comfortable raising and discussing questions or concerns? Are good news and bad news both valued and shared?
1.3 What type of evidence do you see that demonstrates a high level of trust is established in the organization? Is reporting of individual errors encouraged and valued? What methods are available for personnel to raise safety issues, without fear of retribution?
2. The organization provides various methods to raise safety issues without fear of retribution, harassment, intimidation, retaliation, or discrimination.
2.1 What type of evidence do you see that demonstrates systems of checks and balances are in place and effective at all levels of the organization to make sure that safety considerations are adequately weighed and prioritized?
2.2 Do safety and quality assurance positions have adequate organizational influence?
2.3 What type of evidence do you see that demonstrates processes are established to identify and resolve latent organizational weaknesses that can aggravate relatively minor events if not corrected? Are linkages among problems and organizational issues examined and communicated?
3. Leaders reward learning from minor problems to avoid more significant events.
3.1 What type of evidence do you see that demonstrates the organization actively and systematically monitors performance through multiple means, including leader walk-arounds, issue reporting, performance indicators, trend analysis, benchmarking, industry experience reviews, self-assessments, and performance assessments? Is feedback from various sources integrated to create a full understanding?
3.2 What type of evidence do you see that demonstrates organization members convene to swiftly uncover lessons and learn from mistakes?
3.3 Are frequent incident reviews conducted promptly after an incident to ensure data quality and to identify improvement opportunities?
3.4 What type of evidence do you see that demonstrates expertise in causal analysis is applied effectively to examine events and improve safe work performance? Is high-quality causal analysis the norm? Is causal analysis performed on a graded approach for major and minor incidents, and near-misses, to identify causes and follow-up actions? Are even small failures viewed as windows into the system that can spur learning?
3.5 Do performance improvement processes encourage workers to offer innovative ideas to improve performance and to solve problems?
3.6 What type of evidence do you see that demonstrates line managers are actively involved in all phases of performance monitoring, problem analysis, solution planning, and solution implementation to resolve safety issues?
4. Leaders promptly review, prioritize, and resolve problems, track long-term sustainability of solutions, and communicate results back to employees.
4.1 What type of evidence do you see that demonstrates line managers have a strong focus on the safe conduct of work activities? Do line managers maintain awareness of key performance indicators related to safe work accomplishment, watch carefully for adverse trends or indications, and take prompt action to understand adverse trends and anomalies?
4.2 What type of evidence do you see that demonstrates vigorous corrective and improvement action programs are in place and effective? Is there a rapid response to problems and closeout of issues to ensure that small issues do not become large ones? Are managers actively involved to balance priorities to achieve timely resolutions?
5. The organization avoids complacency by cultivating a continuous learning/improvement environment with the attitude that “it can happen here.”
5.1 What type of evidence do you see that demonstrates operational anomalies, even small ones, get prompt attention and evaluation, allowing early detection of problems so that necessary action is taken before problems grow?
5.2 Are candid dialogue and debate, and a healthy skepticism encouraged when safety issues are being evaluated? Are differing professional opinions welcomed and respected? Is it ever used? Are robust discussion and constructive conflict recognized as a natural result of diversity of expertise and experience?
5.3 What type of evidence do you see that demonstrates individuals are systematic and rigorous in making informed decisions that support safe, reliable operations? Are workers expected and authorized to take conservative actions when faced with unexpected or uncertain conditions? Do line managers support and reinforce conservative decisions based on available information and risks?
5.4 What type of evidence do you see that demonstrates operations personnel are held to high standards of both technical understanding and detailed task-oriented performance? Do operations personnel provide reliable and consistent responses to expected occurrences? Are flexible responses to unexpected occurrences based on continuous preparation and training? Are formality and discipline in operations valued?
5.5 What type of evidence do you see that demonstrates organizational systems and processes are designed to provide layers of defenses, recognizing that people are fallible? Are prevention and mitigation measures used to preclude errors from occurring or propagating? Are error-likely situations sought out and corrected, and recurrent errors carefully examined as indicators of latent organizational weaknesses? Do managers aggressively correct latent organizational weaknesses and measure the effectiveness of actions taken to close the gaps?
6. Leaders systematically evaluate organizational performance using: workplace observations, employee discussions, issue reporting, performance indicators, trend analysis, incident investigations, benchmarking, assessments, and independent reviews.
6.1 What type of evidence do you see that demonstrates line managers are in close contact with the front-line; they pay attention to real-time operational information? Is maintaining operational awareness a priority? How do you know? Do line managers identify critical performance elements and monitor them closely?
6.2 What type of evidence do you see that demonstrates organizations know the expertise of their personnel? What evidence do you have that line managers defer to qualified individuals with relevant expertise during operational upset conditions? Are qualified and capable people closest to the operational upset empowered to make important decisions, and are they held accountable justly?
6.3 What type of evidence do you see that demonstrates performance assurance consists of robust, frequent, and independent oversight, conducted at all levels of the organization? Does performance assurance include independent evaluation of performance indicators and trend analysis?
6.4 Are performance assurance programs guided by plans that ensure a base level of relevant areas are reviewed? Are assessments performed against established barriers and requirements?
6.5 What type of evidence do you see that demonstrates efficient redundancy in monitoring is valued; higher levels of redundancy are recognized as necessary for higher risk activities?
6.6 What type of evidence do you see that demonstrates performance assurance includes a diversity of independent “fresh looks” to ensure completeness and to avoid complacency? Does the mix of internal and external oversight reviews reflect an integrated and balanced approach? Is this balance periodically reviewed and adjusted as needed?
6.7 What type of evidence do you see that demonstrates the insights and fresh perspectives provided by performance assurance personnel are valued?
6.8 Is organizational feedback actively sought to make performance assurance activities more value-added?
6.9 Is complete, accurate, and forthright information provided to performance assurance organizations?
6.10 What type of evidence do you see that demonstrates results from performance assurance activities are effectively integrated into the performance improvement processes, such that they receive adequate and timely attention? Are linkages with other performance monitoring inputs examined, high-quality causal analyses conducted as needed, and corrective actions tracked to closure with effectiveness verified to prevent future occurrences?
6.11 What type of evidence do you see that demonstrates line managers throughout the organization set an example for safety through their direct involvement in oversight activities and associated performance improvement?
6.12 Are senior line managers periodically briefed on results of oversight group activities to gain insight into organizational performance and to direct needed corrective actions?
6.13 What type of evidence do you see that demonstrates periodic ISM reviews, assessments, and verifications are conducted and used as a basis for ISM program adjustments and implementation improvements?
7. The organization values learning from operational experience from both inside and outside the organization.
7.1 What type of evidence do you see that demonstrates operating experience is highly valued, and the capacity to learn from experience is well developed? Does the organization regularly examine and learn from operating experiences, both internal and in related industries?
8. The organization willingly and openly engages in organizational learning activities.
8.1 What type of evidence do you see that demonstrates line managers throughout the organization set an example for safety through their direct involvement in continuous learning by themselves and their followers on topics related to technical understanding and safety improvement?
8.2 What type of evidence do you see that demonstrates the organization values and practices continuous learning, and requires employees to participate in recurrent and relevant training and encourages educational experiences to improve knowledge, skills, and abilities? Are professional and technical growth formally supported and tracked to build organizational capability?
8.3 What type of evidence do you see that demonstrates training to broaden individual capabilities and to support organizational learning is available and encouraged: to appreciate the potential for unexpected conditions; to recognize and respond to a variety of problems and anomalies; to understand complex technologies and capabilities to respond to complex events; to develop flexibility at applying existing knowledge and skills in new situations; to improve communications; to learn from significant industry and DOE events?
8.4 Are mental models, practices, and procedures updated and refreshed based on new information and new understanding?
8.5 What type of evidence do you see that demonstrates training effectively upholds management’s standards and expectations? Beyond teaching knowledge and skills, are trainers adept at reinforcing requisite safety values and beliefs?
8.6 Do managers set an example for safety through their personal commitment to continuous learning and by their direct involvement in high-quality training that consistently reinforces expected worker behaviors?
8.7 Do managers encourage informal opinion leaders in the organization to model safe behavior and influence peers to meet high standards?
Attachment 2. Bibliography
BIBLIOGRAPHY
1 W.G. Johnson, The Management Oversight and Risk Tree – MORT, 1973
2 Zahid H. Qureshi, A Review of Accident Modelling Approaches for Complex Socio-Technical Systems, Defence and Systems Institute, University of South Australia, 2007
3 Barry Turner and Nick F. Pidgeon, Man-Made Disasters: the Failure of Foresight, 1978
4 M. David Ermann and Richard J. Lundman, Corporate and Governmental Deviance: Problems of Organizational Behavior in Contemporary Society (5th Edition), (1986), pp. 207-231.
5 Charles Perrow, Normal Accidents. New York: Basic Books, Inc.
6 Gene I. Rochlin, Todd R. LaPorte, and Karlene H. Roberts, The Self-Designing High-Reliability Organization: Aircraft Carrier Flight Operations at Sea, 1987
7 Erik Hollnagel, Resilience Engineering: Concepts and Precepts, 2006
8 David Borys, Dennis Else, Susan Leggett, The Fifth Age of Safety: the Adaptive Age, Volume 1 Issue 1, Journal of Health & Safety Research & Practice, October 2009
9 Diane Vaughan, The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA; University of Chicago Press, 1996, Chap X, p. 394
10 Erik Hollnagel, Barriers and Accident Prevention, 2004
11 Sidney Dekker, Drift into Failure, Ashgate, February 1, 2011
12 Teemu Reiman and Pia Oedewald, Evaluating Safety Critical Organizations – Emphasis on the Nuclear Industry, Report No.: 2009-12, VTT, Technical Research Centre of Finland, April 2009
13 H.W. Heinrich, The Domino Theory of Accident Causation, 1931
14 James Reason, Managing the Risks of Organizational Accidents, Ashgate Publishing, 1997
15 Sidney Dekker, The Field Guide to Understanding Human Error, 2006
16 Sidney Dekker, The Field Guide to Human Error Investigations, 2002
17 James Reason and Alan Hobbs, Managing Maintenance Error, a Practical Guide, 2003
18 Ernst Mach, Knowledge and Error, Sketches on the Psychology of Enquiry, p. 84, English edition, 1976, Netherlands: Dordrecht: Reidel.
19 Erik Hollnagel, The ETTO Principle: Efficiency-Thoroughness Trade-Off, 2009
20 Rosabeth Moss Kanter, http://blogs.hbr.org/kanter/2009/08/looking-in-the-mirror-of-accou.html, August 19, 2009
21 GAIN Working Group E, A Roadmap to a Just Culture: Enhancing the Safety Environment, Global Aviation Information Network, September 2004
22 Gilad Hirschberger, Victor Florian, Mario Mikulincer (Bar-Ilan University, Ramat Gan, Israel) and Jamie L. Goldenberg, Tom Pyszczynski (University of Colorado, Colorado Springs, Colorado), Gender Differences In The Willingness To Engage In Risky Behavior: A Terror Management Perspective, Death Studies, 26 (Feb 2002), pp. 117-142.