Fault Tree Analysis
Fault Tree Analysis (FTA) is a popular and productive hazard identification tool. It provides a
standardized discipline to evaluate and control hazards. The FTA process is used to solve a wide variety of
problems ranging from safety to management issues.
This tool is used by the professional safety and reliability community to both prevent and resolve hazards
and failures. Both qualitative and quantitative methods are used to identify areas in a system that are most
critical to safe operation. Either approach is effective. The output is a graphical presentation providing technical and administrative personnel with a map of "failure or hazard" paths. FTA symbols may be
found in Figure 8-5. The reviewer and the analyst must develop an insight into system behavior,
particularly those aspects that might lead to the hazard under investigation.
Qualitative FTAs are cost effective and invaluable safety engineering tools. The generation of a qualitative
fault tree is always the first step. Quantitative approaches multiply the usefulness of the FTA but are more
expensive and often very difficult to perform.
An FTA (similar to a logic diagram) is a "deductive" analytical tool used to study a specific undesired
event such as "engine failure." The "deductive" approach begins with a defined undesired event, usually a
postulated accident condition, and systematically considers all known events, faults, and occurrences that
could cause or contribute to the occurrence of the undesired event. Top level events may be identified
through any safety analysis approach, through operational experience, or through a "Could it happen?"
hypothesis. The procedural steps of performing an FTA are:
- Assume a system state, then identify and clearly document the top-level undesired event(s). This is often accomplished by using the Preliminary Hazard List (PHL) or Preliminary Hazard Analysis (PHA). Alternatively, design documentation such as schematics, flow diagrams, and level B & C documentation may be reviewed.
- Develop the upper levels of the tree via a top-down process. That is, determine the intermediate failures and combinations of failures or events that are the minimum needed to cause the next higher level event to occur. The logical relationships are shown graphically, as described below, using standardized FTA logic symbols.
- Continue the top-down process until the root causes for each branch are identified and/or until further decomposition is not considered necessary.
- Assign probabilities of failure to the lowest level event in each branch of the tree. This may be through predictions, allocations, or historical data.
- Establish a Boolean equation for the tree using Boolean logic and evaluate the probability of the undesired top level event.
- Compare to the system level requirement. If the requirement is not met, implement corrective action. Corrective actions vary from redesign to analysis refinement.
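The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed method: the class names, the probability values, and the requirement threshold are all hypothetical, and the arithmetic assumes that basic events are statistically independent and that no basic event appears in more than one branch. The tree mirrors the aircraft engine example discussed later in this section.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # hypothetical value, for illustration only


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def event_probability(node) -> float:
    """Probability of a node, assuming independent, non-repeated basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    inputs = [event_probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur.
        return prod(inputs)
    if node.kind == "OR":
        # At least one input occurs: 1 minus the chance that none occur.
        return 1.0 - prod(1.0 - p for p in inputs)
    raise ValueError(f"unknown gate type: {node.kind}")


# Steps 1-3: top event and its decomposition (structure assumed from the example).
fuel_flow = Gate("fuel flow failure", "OR", [
    BasicEvent("fuel pump fails", 1e-4),       # step 4: assumed probabilities
    BasicEvent("fuel filter blocked", 2e-4),
])
ignition = Gate("ignition failure", "AND", [
    BasicEvent("ignition system 1 fails", 1e-2),
    BasicEvent("ignition system 2 fails", 1e-2),
])
engine_failure = Gate("engine failure", "OR", [
    fuel_flow,
    BasicEvent("coolant failure", 3e-4),
    ignition,
])

# Steps 5-6: evaluate the top event and compare to a hypothetical requirement.
REQUIREMENT = 1e-3
top_probability = event_probability(engine_failure)
meets_requirement = top_probability <= REQUIREMENT
print(f"P(engine failure) = {top_probability:.3e}, meets requirement: {meets_requirement}")
```

If the comparison fails, the same model can be re-run after a design change or a refined probability estimate to see whether the corrective action is sufficient.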
The FTA is a graphical logic representation of fault events that may occur to a functional system. This
logical analysis must be a functional representation of the system and must include all combinations of
system fault events that can cause or contribute to the undesired event. Each contributing fault event
should be further analyzed to determine the logical relationships of underlying fault events that may cause
them. This tree of fault events is expanded until all "input" fault events are defined in terms of basic,
identifiable faults that may then be quantified for computation of probabilities, if desired. When the tree
has been completed, it becomes a logic gate network of fault paths, both singular and multiple, containing
combinations of events and conditions that include primary, secondary, and upstream inputs that may
influence or command the hazardous mode.

With minimal training, a non-technical person can determine from the fault tree the combinations and alternatives of events that may lead to failure or
a hazard. The figure above is a sample fault tree for an aircraft engine failure. In this sample there are three
possible causes of engine failure: fuel flow, coolant, or ignition failure. The alternatives and combinations
leading to any of these conditions may also be determined by inspection of the FTA.
Based on available data, probabilities of occurrences for each event can be assigned. Algebraic
expressions can be formulated to determine the probability of the top level event occurring. This can be
compared to acceptable thresholds and the necessity and direction of corrective action determined.
The FTA shows the logical connections between failure events and the top level hazard or event. "Event,"
the terminology used, is an occurrence of any kind. Hazards and normal or abnormal system operations are
examples. For example, both "engine overheats" and "frozen bearing" are abnormal events. Events are
shown as some combination of rectangles, circles, triangles, diamonds, and "houses." Rectangles represent
events that are a combination of lower level events. Circles represent events that require no further
expansion. Triangles reflect events that are dependent on lower level events where the analyst has chosen
to develop the fault tree further. Diamonds represent events that are not developed further, usually due to
insufficient information. Depending upon criticality, it may be necessary to develop these branches further.
In the aircraft engine example, a coolant pump failure may be caused by a seal failure. This level was not
further developed. The example does not include a "house." That symbol illustrates a normal (versus
failure) event. If the hazard were "unintentional stowing of the landing gear", a normal condition for the
hazard would be the presence of electrical power.
FTA symbols can depict all aspects of NAS events. The example reflects a hardware based problem. More
typically, software (incorrect assumptions or boundary conditions), human factors (inadequate displays),
and environment conditions (ice) are also included, as appropriate.
Events can be further broken down as primary and secondary. A primary event is a coolant pump failure
caused by a bad bearing. A secondary event would be a pump failure caused by ice through the omission
of antifreeze in the coolant on a cold day. The analyst may also distinguish between faults and failures. An
ignition turned off at the wrong time is a fault; an ignition switch that will not conduct current is an
example of a failure.
Events are linked together by "AND" and "OR" logic gates. The latter is used in the example for both fuel
flow and carburetor failures. For example, fuel flow failures can be caused by either a failed fuel pump or
a blocked fuel filter. An "AND" gate is used for the ignition failure illustrating that the ignition systems are
redundant. That is, both must fail for the engine to fail. These logic gates are called Boolean gates or
operators. Boolean algebra is used for the quantitative approach. The "AND" and "OR" gates are
numbered sequentially A# or O# respectively in the figure above.
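Because the gates are Boolean operators, the example's logic can be written directly as a Boolean function. The sketch below is illustrative only; the function and parameter names are hypothetical, and the coolant branch is treated as a single undeveloped input.

```python
def engine_fails(fuel_pump_failed: bool,
                 fuel_filter_blocked: bool,
                 coolant_failed: bool,
                 ignition_1_failed: bool,
                 ignition_2_failed: bool) -> bool:
    """Boolean model of the sample tree (structure assumed from the text)."""
    # OR gate: either input alone causes a fuel flow failure.
    fuel_flow_failure = fuel_pump_failed or fuel_filter_blocked
    # AND gate: the ignition systems are redundant, so both must fail.
    ignition_failure = ignition_1_failed and ignition_2_failed
    # Top-level OR gate: any one branch causes engine failure.
    return fuel_flow_failure or coolant_failed or ignition_failure


# A blocked filter alone causes the top event.
print(engine_fails(False, True, False, False, False))   # True
# Losing one of the two ignition systems does not.
print(engine_fails(False, False, False, True, False))   # False
```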
As previously stated, the FTA is built through a deductive "top down" process. It is a deductive process in
that it considers combinations of events in the "cause" path as opposed to the inductive approach, which
does not. The process is asking a series of logical questions such as "What could cause the engine to fail?"
When all causes are identified, the series of questions is repeated at the next lower level, i.e., "What would
prevent fuel flow?" Interdependent relationships are established in the same manner.
When a quantitative analysis is performed, probabilities of occurrences are assigned to each event. The
values are determined through analytical processes such as reliability predictions, engineering estimates, or
the reduction of field data (when available). A completed tree is called a Boolean model. The probability of
occurrence of the top level hazard is calculated by generating a Boolean equation. It expresses the chain of
events required for the hazard to occur. Such an equation may reflect several alternative paths. Boolean
equations rapidly become very complex for simple looking trees. They usually require computer modeling
for solution.
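One reason the equations become complex is that the same basic event can feed more than one branch, so gate-by-gate multiplication no longer gives the right answer. The sketch below uses a hypothetical three-event tree, TOP = (A AND B) OR (A AND C), with assumed probabilities and assumed independence between A, B, and C. It compares a naive gate-by-gate calculation against an exact evaluation that enumerates every combination of event states. The enumeration grows as 2^n with the number of basic events, which is why larger trees are normally handed to dedicated software.

```python
from itertools import product


def top_event(state: dict) -> bool:
    """Hypothetical tree: TOP = (A AND B) OR (A AND C). Event A is shared."""
    return (state["A"] and state["B"]) or (state["A"] and state["C"])


def exact_probability(structure, probabilities: dict) -> float:
    """Sum the probability of every state combination that triggers the top event."""
    names = list(probabilities)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if structure(state):
            weight = 1.0
            for name in names:
                p = probabilities[name]
                weight *= p if state[name] else 1.0 - p
            total += weight
    return total


probabilities = {"A": 0.1, "B": 0.1, "C": 0.1}  # assumed values

# Naive gate-by-gate result treats the two AND branches as independent.
branch = probabilities["A"] * probabilities["B"]
naive = 1.0 - (1.0 - branch) ** 2

exact = exact_probability(top_event, probabilities)
print(f"naive = {naive:.4f}, exact = {exact:.4f}")
```

In this example the naive figure is slightly too high, because both branches depend on the shared event A.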
In addition to evaluating the significance of a risk and the likelihood of occurrence, FTAs facilitate
presentations of the hazards, causes, and discussions of safety issues. They can contribute to the
generation of the Master Minimum Equipment List (MMEL).
The FTA's graphical format is superior to the tabular or matrix format in that the inter-relationships are
obvious. The FTA graphic format is a good tool for the analyst not knowledgeable of the system being
examined. The matrix format is still necessary for a hazard analysis to pick up severity, criticality, family
tree, probability of event, cause of event, and other information. Being a top-down approach, in contrast to
the fault hazard and FMECA, the FTA may miss some non-obvious top level hazards.
Evaluating a Fault Tree Analysis
FTA is a technique that can be used for any formal system safety program analysis (PHA, SSHA, O&SHA).
The FTA is one of several deductive logic model techniques, and is by far the most common. The FTA
begins with a stated top-level hazardous/undesired event and uses logic diagrams to identify single events
and combinations of events that could cause the top event. The logic diagram can then be analyzed to
identify single and multiple events that can cause the top event. Probability of occurrence values are
assigned to the lowest events in the tree. FTA utilizes Boolean Algebra to determine the probability of
occurrence of the top (and intermediate) events. When properly done, the FTA shows all the problem
areas and makes the critical areas stand out. The FTA has two drawbacks:
- Depending on the complexity of the system being analyzed, it can be time consuming, and therefore very expensive.
- It does not identify all system hazards; it only identifies failures associated with the predetermined top event being analyzed. For example, an FTA will not identify "ruptured tank" as a hazard in a home water heater, but it will show all failures that lead to that event. In other words, the analyst must use other means to identify the hazards that cannot be identified by use of a fault tree.
The graphic symbols used in an FTA are provided in the figure below.
The first area for evaluation (and probably the most difficult) is the top event. This top event should be
very carefully defined and stated. If it is too broad (e.g., aircraft crashes), the resulting FTA will be overly
large. On the other hand, if the top event is too narrow (e.g., aircraft crashes due to pitch-down caused by
broken bellcrank pin), then the time and expense for the FTA may not yield significant results. The top
event should specify the exact hazard and define the limits of the FTA. In this example, a good top event
would be "uncommanded aircraft pitch-down," which would center the fault tree around the aircraft flight
control system, but would draw in other factors, such as pilot inputs and engine failures. In some cases, a
broad top event may be useful to organize and tie together several fault trees.
Some fault trees do not lend themselves to quantification because the
factors that tie the occurrence of a second level event to the top event are normally outside the
control/influence of the operator (e.g., an aircraft that experiences loss of engine power may or may not
crash depending on altitude at which the loss occurs).
A quick evaluation of a fault tree may be possible by looking at the logic gates. Most fault trees will have
a substantial majority of OR gates. If fault trees have too many OR gates, every fault or event may lead
to the top event. This may not be the case, but a large majority of OR gates will certainly indicate this.
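A reviewer working from a machine-readable tree could make this quick check by counting gate types. The sketch below assumes a hypothetical nested-tuple representation, `(gate_type, [children])`, with plain strings for basic events. It reports only the counts and the OR fraction; what fraction is "too many" remains a matter of judgment.

```python
def count_gates(node, counts=None):
    """Count AND and OR gates in a tree of (gate_type, [children]) tuples."""
    if counts is None:
        counts = {"AND": 0, "OR": 0}
    if isinstance(node, str):  # basic event, not a gate
        return counts
    kind, children = node
    counts[kind] += 1
    for child in children:
        count_gates(child, counts)
    return counts


# Structure assumed from the aircraft engine example.
engine_tree = ("OR", [
    ("OR", ["fuel pump fails", "fuel filter blocked"]),
    "coolant failure",
    ("AND", ["ignition system 1 fails", "ignition system 2 fails"]),
])

gate_counts = count_gates(engine_tree)
or_fraction = gate_counts["OR"] / (gate_counts["OR"] + gate_counts["AND"])
print(gate_counts, f"OR fraction = {or_fraction:.2f}")
```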
An evaluator needs to be sure that logic symbols are well defined and understood. If nonstandard
symbols are used, they must not get mixed with other symbols.
Check for proper control of transfers. Transfers are reference numbers permitting linking between pages
of FTA graphics. Fault trees can be extremely large, requiring the use of many pages and clear interpage
references. Occasionally, a transfer number may be changed during fault tree construction. If the
corresponding sub-tree does not have the same transfer number, then improper logic will result.
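The transfer check lends itself to a simple cross-reference. The sketch below is a hypothetical helper, not a feature of any particular FTA tool. It assumes the reviewer has listed the transfer numbers referenced on parent pages and the transfer numbers that label sub-trees, and it reports any that do not pair up.

```python
def check_transfers(transfer_out, transfer_in):
    """Compare transfer numbers referenced on parent pages (transfer_out)
    with the numbers labeling sub-trees (transfer_in)."""
    out_set, in_set = set(transfer_out), set(transfer_in)
    return {
        # Referenced on a parent page, but no sub-tree carries that number.
        "missing_subtree": sorted(out_set - in_set),
        # A sub-tree exists, but nothing refers to it.
        "unreferenced_subtree": sorted(in_set - out_set),
        # The same number labels more than one sub-tree.
        "duplicate_subtree": sorted(
            {t for t in transfer_in if transfer_in.count(t) > 1}
        ),
    }


# Hypothetical example: sub-tree T3 was renumbered to T4 during construction.
report = check_transfers(["T1", "T2", "T3"], ["T1", "T2", "T4"])
print(report)
```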
Cut sets (minimum combinations of events that lead to the top event) need to be evaluated for
completeness and accuracy. Establishing the correct number of cut sets and their depth is a matter of
engineering judgment.
Each fault tree should include a list of minimum cut sets. Without this list, it is difficult to identify
critical faults or combinations of events. For large or complicated fault trees, a computer is necessary to
catch all of the cut sets; it is nearly impossible for a single individual to find all of the cut sets.
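To illustrate what such a computer search does, the sketch below derives minimum cut sets for a small tree by simple top-down expansion: an OR gate collects the cut sets of its inputs, and an AND gate combines one cut set from each input. The tree representation and event names are hypothetical, and this is a minimal sketch for small trees rather than the algorithm of any specific FTA package.

```python
def minimize(sets):
    """Drop any cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


def cut_sets(node):
    """Return the minimum cut sets of a node as a set of frozensets.

    A node is either a basic-event name (str) or a tuple
    (gate_type, [children]) with gate_type "AND" or "OR".
    """
    if isinstance(node, str):
        return {frozenset([node])}
    kind, children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":
        # Any cut set of any input is a cut set of the gate.
        result = set().union(*child_sets)
    elif kind == "AND":
        # Take one cut set from each input and merge them.
        result = {frozenset()}
        for sets in child_sets:
            result = {a | b for a in result for b in sets}
    else:
        raise ValueError(f"unknown gate type: {kind}")
    return minimize(result)


# Structure assumed from the aircraft engine example.
engine_tree = ("OR", [
    ("OR", ["fuel pump fails", "fuel filter blocked"]),
    "coolant failure",
    ("AND", ["ignition system 1 fails", "ignition system 2 fails"]),
])

engine_cut_sets = cut_sets(engine_tree)
for cut_set in sorted(engine_cut_sets, key=lambda s: (len(s), sorted(s))):
    print(sorted(cut_set))
```

Single-event cut sets, such as the blocked fuel filter here, are single points of failure; the two-event ignition cut set reflects the redundancy described earlier.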
For a large fault tree, it may be difficult to determine whether or not the failure paths were completely
developed. If the evaluator is not totally familiar with the system, the evaluator may need to rely upon
other means. A good indication is the shape of the symbols at the branch bottom. If the symbols are
primarily circles (primary failures), the tree is likely to be complete. On the other hand, if many symbols
are diamonds (secondary failures or areas needing development), then it is likely the fault tree needs
expansion.
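This rule of thumb can be applied mechanically when the bottom-level symbols are available as data. The sketch below assumes a hypothetical list of (event name, symbol) pairs. Only the undeveloped coolant pump seal failure comes from the example above; the other symbol assignments are assumptions for illustration.

```python
def leaf_symbol_summary(leaves):
    """Summarize the symbols at the branch bottoms.

    leaves: list of (event name, symbol) pairs, where symbol is
    "circle" (primary failure) or "diamond" (undeveloped event).
    """
    circles = sum(1 for _, symbol in leaves if symbol == "circle")
    diamonds = sum(1 for _, symbol in leaves if symbol == "diamond")
    total = len(leaves)
    return {
        "circle": circles,
        "diamond": diamonds,
        "diamond_fraction": diamonds / total if total else 0.0,
    }


leaves = [
    ("fuel pump fails", "circle"),
    ("fuel filter blocked", "circle"),
    ("coolant pump seal failure", "diamond"),
    ("ignition system 1 fails", "circle"),
    ("ignition system 2 fails", "circle"),
]
summary = leaf_symbol_summary(leaves)
print(summary)
```

A high diamond fraction only flags the tree for closer review; whether a given diamond needs development still depends on its criticality.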
Faulty logic is probably the most difficult area to evaluate, unless the faults lie within the gates, which are
relatively easy to spot. A gate-to-gate connection shows that the analyst might not completely understand
the workings of the system being evaluated. Each gate must lead to a clearly defined specific event, i.e.,
what is the event and when does it occur? If the event consists of any component failures that can directly
cause that event, an OR gate is needed to define the event. If the event does not consist of any component
failures, look for an AND gate.
When reviewing an FTA with quantitative hazard probabilities of occurrence, identify the events with
relatively large probability of occurrence. They should be discussed in the analysis summaries, probably
as primary cause factors.
A large fault tree performed manually is susceptible to errors and omissions. There are many advantages
of computer modeling relative to manual analysis (of complex systems):
- Logic errors and event (or branch) duplications can be quickly spotted.
- Cut sets (showing minimum combinations leading to the top event) can be listed.
- Numerical calculations (e.g., event probabilities) can be quickly done.
- A neat, readable, fault tree can be drawn.
Source: FAA System Safety Handbook, Ch. 9.