List of Figures
List of Tables
List of Acronyms
Background
The SOURCE(tm) Methodology (Seeking Out the Underlying Causes of Events)
Scope of the Handbook
Contents of the Handbook
1.1 The Need for Incident Investigation
1.1.1 Rationale for Taking a Structured Approach to Incident Investigation
1.1.2 Depths of Analysis
1.1.3 Structured Analysis Process
1.2 Selecting Incidents to Investigate
1.3 The Investigation Thought Process
1.3.1 Differences Between Traditional Problem Solving and Structured RCA
1.3.2 The Typical Investigator
1.3.3 A Structured Approach to the Analysis
1.4 RCA Within a Business Context
1.5 The Elements of an Incident
1.6 Causal Factors and Root Causes
1.7 The Goal of the Incident Investigation Process
1.8 Overview of the SOURCE(tm) Methodology (Seeking Out the Underlying Causes of Events)
1.9 The SOURCE(tm) Root Cause Analysis Process
1.9.1 Steps That Apply to Acute Incident Analyses
1.9.2 Steps That Apply to Chronic Incident Analyses
1.9.3 Steps That Apply When No Formal Analyses Are Performed
1.9.4 Steps That Apply to All Analyses
1.10 Levels of the Analysis: Root Cause Analysis and Apparent Cause Analysis
1.11 Definitions
1.12 Summary
2.1 Initiating the Investigation
2.2 Notification
2.3 Emergency Response Activities
2.4 Immediate Response Activities
2.5 Beginning the Investigation
2.6 Initial Incident Reports and Corrective Action Requests
2.6.1 Reasons to Generate an IIR or CAR
2.6.2 Typical Information Contained in an IIR or CAR
2.6.3 Using the IIR or CAR in the Incident Investigation Process
2.7 Incident Classification
2.8 Investigation Management Tasks
2.9 Assembling the Team
2.10 Briefing the Team
2.11 Restart
2.12 Gathering Investigation Resources
2.13 Summary
3.1 Introduction
3.2 General Data-gathering and Preservation Issues
3.2.1 Importance of Data-gathering
3.2.2 Types of Data
3.2.3 Prioritizing Data-gathering Efforts
3.2.3.1 People Data Fragility Issues
3.2.3.2 Electronic Data Fragility Issues
3.2.3.3 Physical/Position Data Fragility Issues
3.2.3.4 Paper Data Fragility Issues
3.3 Gathering Data
3.4 Gathering Data from People
3.4.1 Factors to Assess the Credibility of People Data
3.4.2 Initial Witness Statements
3.4.3 The Interview Process
3.4.3.1 Before the Interviews
3.4.3.2 Beginning the Interview
3.4.3.3 Conducting the Interview
3.4.3.4 Concluding the Interview
3.4.3.5 Follow-up Interviews
3.5 Physical Data
3.5.1 Sources of Physical Data
3.5.2 Types and Nature of Physical Data Analysis Questions
3.5.3 Basic Steps in Failure Analysis
3.5.4 Use of Physical Data Analysis Plans
3.5.5 Chain of Custody for Physical Data
3.5.6 Use of Outside Experts
3.6 Paper Data
3.7 Electronic Data
3.8 Position Data
3.8.1 Unique Aspects of Position Data
3.8.2 Collection of Position Data
3.8.3 Documentation of Photos and Videos
3.8.4 Alternative Sources of Position Data
3.9 Overall Data-collection Plan
3.10 Application to Apparent Cause Analyses and Root Cause Analyses
3.11 Summary
4.1 Introduction
4.2 Overview of Primary Techniques
4.3 Cause and Effect Tree Analysis
4.4 Timelines
4.5 Causal Factor Charts
4.6 Using Causal Factor Charts, Timelines, and Cause and Effect Trees Together During an Investigation
4.7 Application to Apparent Cause Analyses and Root Cause Analyses
4.8 Summary
5.1 Introduction
5.2 Root Cause Analysis Traps
5.2.1 Trap 1 — Equipment Issues
5.2.2 Trap 2 — Human Performance Issues
5.2.3 Trap 3 — External Event Issues
5.3 Procedure for Identifying Root Causes
5.4 ABS Consulting’s Root Cause Map
5.5 Observations About the Structure of the Root Cause Map
5.6 Using the Root Cause Map
5.6.1 The Five Steps
5.6.2 Multiple Coding
5.6.3 Incorporating Organizational Standards, Policies, and Administrative Controls
5.6.4 Using the Root Cause Map – Guidance During an Investigation
5.6.5 Typical Problems Encountered When Using the Root Cause Map
5.6.6 Advantages and Disadvantages of Using the Root Cause Map
5.7 Documenting the Root Cause Analysis Process
5.8 Application to Apparent Cause Analyses and Root Cause Analyses
5.9 Summary
6.1 Introduction
6.2 Timing of Recommendations
6.3 Levels of Recommendations
6.3.1 Level 1 — Address the Causal Factor
6.3.2 Level 2 — Address the Intermediate Causes of the Specific Problem
6.3.3 Level 3 — Fix Similar Problems
6.3.4 Level 4 — Correct the Process That Creates These Problems
6.4 Types of Recommendations
6.4.1 Eliminate the Hazard
6.4.2 Make the System Inherently Safer or More Reliable
6.4.3 Prevent Occurrence of the Incident
6.4.4 Detect and Mitigate the Loss
6.4.5 Implementing Multiple Types of Recommendations
6.5 Suggested Format for Recommendations
6.6 Special Recommendation Issues
6.7 Management Responsibilities
6.8 Examples of Reasons to Reject Recommendations
6.9 Assessing Benefit/Cost Ratios
6.9.1 Estimating the Benefits of Implementing a Recommendation
6.9.2 Estimating the Costs of Implementing a Recommendation
6.9.3 Benefit/Cost Ratios
6.10 Assessing Recommendation Effectiveness
6.11 Application to Apparent Cause Analyses and Root Cause Analyses
6.12 Summary
Section 7: Completing the Investigation
7.1 Introduction
7.2 Writing Investigation Reports
7.2.1 Typical Items to Be Included in an Investigation Report
7.2.2 Tips for Writing Reports
7.3 Communicating Investigation Results
7.3.1 Decide to Whom the Results Should Be Communicated
7.3.2 Decide How to Distribute the Report
7.3.3 Document the Communication
7.4 Resolving Recommendations and Communicating Resolutions
7.4.1 Tracking Recommendations
7.4.2 Report Resolution Phase and Closure of Files
7.5 Addressing Final Issues
7.5.1 Enter Trending Data
7.5.2 Evaluate the Investigation Process
7.6 Application to Apparent Cause Analyses and Root Cause Analyses
7.7 Summary
8.1 Introduction
8.2 Why Be Careful When Selecting Incidents for Investigation?
8.3 Some General Guidance
8.3.1 Incidents to Investigate (High Potential Learning Value)
8.3.2 Incidents to Trend (Low to Moderate Potential Learning Value)
8.3.3 No Investigation (Low Potential Learning Value)
8.4 Performing the Investigation
8.4.1 Incidents to Investigate Immediately (Acute Incidents)
8.4.2 Incidents to Trend (Potentially Chronic Incidents)
8.5 Near Misses
8.5.1 Factors to Consider When Defining Near Misses
8.5.2 Reasons Why Near Misses Should Be Investigated
8.5.3 Barriers to Getting Near Misses Reported
8.5.4 Overcoming the Barriers
8.6 Acute Analysis Versus Chronic Analysis
8.7 Identifying Chronic Incidents That Should Be Analyzed
8.7.1 Using Pareto Analysis for Environmental, Health, and Safety Incidents
8.7.1.1 Examples of Pareto Analysis
8.7.1.2 Weaknesses of Pareto Analysis
8.7.2 Chronic Analysis of Reliability Problems
8.7.2.1 Prioritizing the RCA Efforts
8.7.2.2 Repeating the Process
8.7.3 Chronic Analysis for Quality Incidents
8.7.3.1 Prioritizing the RCA Efforts
8.7.3.2 Repeating the Process
8.7.4 Other Data Analysis Tools
8.8 Summary
9.1 Introduction
9.2 Benefits of a Trending Program
9.3 Determining the Data to Collect
9.3.1 Deciding What Data to Collect
9.3.2 Defining the Data to Collect
9.3.3 Other Data-collection Guidance
9.4 Data Analysis
9.4.1 Interpreting Data Trends
9.5 Application to Apparent Cause Analyses and Root Cause Analyses
9.6 Summary
10.1 Introduction
10.2 Program Implementation Process
10.2.1 Design the Program
10.2.2 Develop the Program
10.2.3 Implement the Program
10.2.4 Monitor the Program’s Performance
10.2.5 Improve the Program
10.3 Key Considerations
10.3.1 Legal Considerations and Guidelines
10.3.2 Media Considerations
10.3.3 Some Regulatory Requirements and Industry Standards
10.3.4 Training
10.4 Management Influence on the Program
10.5 Common Investigation Problems and Solutions
10.5.1 There Is No Business Driver to Change
10.5.2 There Is No Organizational Champion for the Program
10.5.3 The Organization Never Leaves the Reactive Mode
10.5.4 The Organization Must Find an Individual to Blame
10.5.5 Personnel Are Unwilling to Critique Management Systems
10.5.6 Reward Implementation of Recommendations
10.5.7 The Organization Tries to Investigate Everything
10.5.8 The Organization Only Performs Incident Investigations on Large Incidents
10.5.9 Recommendations Are Never Implemented
10.6 Summary
11.1 Introduction
11.2 Resources Available on the Companion Downloads and at www.absconsulting.com/RCAHandbookResources
11.2.1 SOURCE(tm) Investigator’s Toolkit
11.2.2 Updates and Modifications to the Root Cause Map Guidance
11.2.3 Examples Specific to Handbook Sections
11.3 Download Instructions
APPENDICES
Appendix A: Glossary
Appendix B: Cause and Effect Tree Details
B.1 Introduction to Cause and Effect Tree Analysis
B.1.1 The Basic Structure of Cause and Effect Trees
B.2 Cause and Effect Tree Examples
B.2.1 Example 1: Spill from a Tank
B.2.2 Example 2: Lighting Failure
B.2.3 Example 3: Hand Injury During Sandblasting
B.3 Cause and Effect Tree Symbols
B.4 Using “AND” Gates
B.4.1 Multiple Elements Required
B.4.2 Multiple Pathways Required
B.4.3 Redundant Equipment Must Fail
B.4.4 Initial Event Combined with a Safeguard Failure
B.5 Using “OR” Gates
B.5.1 One or More of Multiple Elements Fail
B.5.2 Component Failures
B.5.3 Inadvertent Actuation of Safeguards
B.6 Example Cause and Effect Tree Structures
B.7 Procedure for Creating a Cause and Effect Tree
B.7.1 Step 1 — Define an Event of Interest as the Top Event of the Cause and Effect Tree
B.7.2 Step 2 — Define the Next Level of the Tree
B.7.3 Step 3 — Develop Questions to Examine the Credibility of Branches
B.7.4 Step 4 — Gather Data to Answer Questions
B.7.5 Step 5 — Determine Whether the Branch Is Credible
B.7.6 Step 6 — Determine Whether the Branch Is Sufficiently Developed
B.7.7 Step 7 — Stop Branch Development
B.7.8 Step 8 — Stop When the Scenario Model Is “Complete”
B.7.9 Step 9 — Identify Causal Factors
B.8 Drawing the Cause and Effect Tree
B.9 Additional Examples of Cause and Effect Trees
C.1 Introduction
C.2 Timeline Example
C.3 Overall Timeline Guidance
C.3.1 Use Different Colors of Post-it® Notes for Different Types of Data
C.3.2 Use a Simple, Flexible Format
C.3.3 Keep the Level of Detail Manageable
C.4 Rules for Building Blocks
C.4.1 Use Complete Sentences
C.4.2 Use Only One Idea Per Building Block
C.4.3 Be as Specific as Possible
C.4.4 Document the Source for Each Event and Condition
C.5 Rules for Questions
C.6 Timeline Construction
C.6.1 Step 1 — Identify the Loss Events
C.6.2 Step 2 — Identify the Actors
C.6.3 Step 3 — Develop Building Blocks and Add Them to the Timeline
C.6.4 Step 4 — Generate Questions and Identify Data Sources to Fill in Gaps
C.6.5 Step 5 — Gather Data
C.6.6 Step 6 — Add Additional Building Blocks to the Timeline
C.6.7 Step 7 — Determine Whether the Sequence of Events Is Complete
C.6.8 Step 8 — Identify Causal Factors and Items of Note
C.7 Example Timeline Development
C.7.1 Step 1 — Identify the Loss Events
C.7.2 Step 2 — Identify the Actors
C.7.3 Step 3 — Develop Building Blocks and Add Them to the Timeline
C.7.4 Step 4 — Generate Questions and Identify Data Sources to Fill in Gaps
C.7.5 Step 5 — Gather Data
C.7.6 Step 6 — Add Additional Building Blocks to the Timeline
C.7.7 Step 7 — Determine Whether the Sequence of Events Is Complete
C.7.8 Step 8 — Identify Causal Factors and Items of Note
Appendix D: Causal Factor Charting Details
D.1 Introduction
D.2 Causal Factor Chart Example
D.3 Overall Causal Factor Chart Guidance
D.3.1 Use Different Colors of Post-it® Notes for Different Types of Data
D.3.2 Use a Simple, Flexible Format
D.3.3 Keep the Level of Detail Manageable
D.4 Rules for Building Blocks
D.4.1 Use Complete Sentences
D.4.2 Use Only One Idea Per Building Block
D.4.3 Be as Specific as Possible
D.4.4 Document the Source for Each Event and Condition
D.5 Rules for Questions
D.6 Causal Factor Chart Construction
D.6.1 Step 1 — Identify the Loss Event(s)
D.6.2 Step 2 — Take a Small Step Back in Time and Add a Building Block to the Chart
D.6.3 Step 3 — Perform Sufficiency Testing
D.6.4 Step 4 — Gather Data to Answer Questions Developed in Step 3
D.6.5 Step 5 — Add Building Blocks to the Chart
D.6.6 Step 6 — Determine Whether the Sequence of Events Is Complete
D.6.7 Step 7 — Repeat Sufficiency Testing for All Items on the Chart
D.6.8 Step 8 — Perform Necessity Testing
D.6.9 Step 9 — Identify Causal Factors and Items of Note
D.7 Example Development of a Causal Factor Chart
D.7.1 Step 1 — Identify the Loss Event(s)
D.7.2 Step 2 — Take a Small Step Back in Time and Add a Building Block to the Chart
D.7.3 Step 3 — Perform Sufficiency Testing
D.7.4 Step 4 — Gather Data to Answer Questions Developed in Step 3
D.7.5 Step 5 — Add Building Blocks to the Chart
D.7.6 Step 6 — Determine Whether the Sequence of Events Is Complete
D.7.7 Step 7 — Repeat Sufficiency Testing for All Items on the Chart
D.7.8 Step 8 — Perform Necessity Testing
D.7.9 Step 9 — Identify Causal Factors and Items of Note
Appendix E: Root Cause Map Guidance
E.1 Instructions for Using This Appendix with the Root Cause Map
E.1.1 Types of Information Provided
E.1.2 Online Documentation
E.1.3 Working Your Way Through the Root Cause Map
E.1.4 Special Considerations
E.2 Clarifications and Updated Guidance
Table of Contents
Pocket Guide to Incident Investigation/Root Cause Analysis
Index of Incident Investigation Forms, Checklists, and Support Materials
Responsibilities of the Team Leader
Investigator’s Log
Simple Investigation Plan
Detailed Investigation Plan
Investigation Data Needs Form
Investigation Data Needs Checklist
Initial Incident Scene Tour Checklist
List of Contacts
List of Meeting Attendees
Interview Scheduling Form
Initial Witness Statement
Interview Preparation and Documentation Form
Interview Documentation Form
Physical Data Analysis Plan – Parts Analysis
Physical Data Analysis Plan – Sample/Chemical Analysis
Guidelines for Collecting Paper Chart Data
Photography Guidelines
Photographic Record
Position Data Form
Data Log Form
Data Correspondence Log
Data Tracking Form
Procedure for Creating a Cause and Effect Tree
Testing an OR Gate
Testing an AND Gate
Procedure for Creating a Timeline
Building a Timeline from Witness Statements
Procedure for Creating a Causal Factor Chart
Building a Causal Factor Chart from Witness Statements
Root Cause Map Causal Factor, Root Cause, and Recommendation Checklist
Root Cause Summary Table Form
Instructions for Completing the Incident Investigation Report Form
Incident Investigation Report Form
Report and Investigation Checklist
Open Issues Log
List of Figures by Section and Appendix
F.1: ABS Consulting’s SOURCE(tm) Incident Investigation Model
1.1: Task Triangle Showing Possible Depths of Analyses
1.2: Overlap of Multiple Task Triangles
1.3: Differences Between Traditional Problem Solving and Structured Root Cause Analysis
1.4: Relationship Among Proactive Analysis, Reactive Analysis, and Management Systems
1.5: Idealized Operation
1.6: Realistic Operation
1.7: Steps in the SOURCE(tm) Methodology
1.8: Steps That Apply to Acute Incident Analyses
1.9: Steps That Apply to Chronic Incident Analyses
1.10: Steps That Apply When No Formal Analyses Are Performed
1.11: Levels of Analysis
1.12: Connection Between Causal Factors and Root Causes
2.1: Initiating Investigations Within the Context of the Overall Incident Investigation Process
3.1: Gathering Data Within the Context of the Overall Incident Investigation Process
3.2: Types of Data Resources
3.3: Fragility of Data Types
3.4: Flowchart of Typical Interview Sequence
3.5: Basic Steps in Failure Analysis
4.1: Analyzing Data Within the Context of the Overall Incident Investigation Process
4.2: Example Cause and Effect Tree
4.3: How to “Read” the Cause and Effect Tree in Figure 4.2
4.4: Cause and Effect Tree Showing a Multiple-event Failure
4.5: Sandblasting Cause and Effect Tree Example
4.6: Cause and Effect Tree for Number 2 Compressor Crank Failure
4.7: Sandblasting Timeline Example
4.8: Sandblasting Causal Factor Chart Example
5.1: Identifying Root Causes Within the Context of the Overall Incident Investigation Process
5.2: Connection Between the Steps of the Investigation
5.3: Structure of ABS Consulting’s Root Cause Map
5.4: Levels of the Root Cause Map
5.5: Document Hierarchy
5.6: Explanation of the Root Cause Summary Table Structure
5.7: Root Cause Summary Table Form (First Example)
5.8: Root Cause Summary Table Form (Second Example)
5.9: Root Cause Summary Table Form (Third Example)
5.10: Root Cause Summary Table Form (Fourth Example)
5.11: Completing the Three-column Form
6.1: Developing Recommendations Within the Context of the Overall Incident Investigation Process
6.2: Connecting Root Causes and Recommendations
7.1: Completing the Investigation Within the Context of the Overall Incident Investigation Process
7.2: Tracking Recommendations
8.1: Selecting Incidents for Analysis Within the Context of the Overall Incident Investigation Process
8.2: Investigation Cycle if Too Many Investigations Are Performed
8.3: Hierarchy of Accidents, Near Misses, and Unsafe Acts/Unsafe Conditions
8.4: Pareto Charts Developed Using Two Different Attributes
8.5: Example Chronic Cause and Effect Tree #1 (Based on 40 Incidents)
8.6: Example Cause and Effect Tree #2 (Based on 23 Incidents)
8.7: Example Cause and Effect Tree #3 (Based on 143 Incidents)
9.1: Results Trending Within the Context of the Overall Incident Investigation Process
10.1: Overall Incident Investigation Process
A.1: Relationship Among Incident Investigation Terms
B.1: AND Gate Structure
B.2: OR Gate Structure
B.3: Example Tree with Multiple Levels
B.4: Cause and Effect Tree for a Tank Spill
B.5: Circuit Diagram
B.6: Cause and Effect Tree for a Lighting Failure
B.7: Cause and Effect Tree with Events A, B, and C Only
B.8: Cause and Effect Tree for Hand Injury During Sandblasting
B.9: Cause and Effect Tree Symbols
B.10: Example Cause and Effect Tree with Supporting Data Shown
B.11: Cause and Effect Tree for Master and Articulating Rod Failure Following Reassembly
B.12: Multiple Elements Required
B.13: Multiple Pathways Required – No Flow
B.14: Multiple Pathways Required – Misdirected Flow
B.15: Redundant Equipment Must Fail
B.16: Equipment Failure and Safeguards Failure
B.17: Human Error and Safeguards Failure
B.18: One or More of Multiple Elements Fail
B.19: Oil Tank Release
B.20: Inadvertent Actuation of Safeguards
B.21: Common-mode Failure
B.22: Human Error with Impact
B.23: Procedure for Creating a Cause and Effect Tree
B.24: Testing AND Gate Logic
B.25: Testing OR Gate Logic
B.26: Testing Credibility
B.27: Determining Branch Credibility
B.28: Determining Branch Development
B.29: Branch Development Results
C.1: Sandblasting Timeline Example
C.2: Process for Developing a Timeline
C.3: Step 1 — Identifying the Loss Event(s)
C.4: Step 2 — Identify the Actors
C.5: Step 3 — Develop Building Blocks and Add Them to the Timeline
C.6: Step 3 — Develop Building Blocks and Add Them to the Timeline
C.7: Step 4 — Generate Questions
C.8: Step 6 — Add Additional Building Blocks
C.9: Step 8 — Identify Causal Factors and Items of Note
D.1: Sandblasting Causal Factor Chart Example
D.2: Process for Developing a Causal Factor Chart
D.3: Step 1 — Identify the Loss Event(s)
D.4: Step 2 — Take a Step Backward
D.5: Step 3 — Sufficiency Testing – Questions 1 and 2
D.6: Step 3 — Sufficiency Testing – Question 3
D.7: Step 2 — Take a Small Step Back in Time
D.8: Step 3 — Sufficiency Testing – Questions 1 and 2
D.9: Step 3 — Sufficiency Testing – Question 3
D.10: Add Remaining Questions to Chart
D.11: Step 4 — Gather Data
D.12: Step 5 — Add Building Blocks to the Chart
D.13: Step 7 — Repeat Sufficiency Testing for All Items on the Chart
D.14: Step 8 — Perform Necessity Testing
D.15: Step 9 — Identify Causal Factors and Items of Note
E.1: Section of the Root Cause Map
E.2: Sample Root Cause Map Documentation Page
E.3: Navigation Box for Online Documentation
E.4: Root Cause Map Paths
Contact Rothstein Associates, Inc. to request a complimentary copy to evaluate for classroom use.