'Reliability is a broad term that focuses on the ability of a product to perform its intended function. Mathematically speaking, assuming that an item is performing its intended function at time equals zero, reliability can be defined as the probability that an item will continue to perform its intended function without failure for a specified period of time under stated conditions.' [1] One of the main methods of assessing the reliability of a system is fault tree analysis, in which the non-working state of the system is analysed using Boolean logic that combines all the lower-level events. It is mainly deployed to analyse the safety of the system. It was first developed by Bell Laboratories in 1962 and further improved by Boeing, and it is now an integral part of system safety analysis in many companies. Fault tree analysis covers events arising from component wear-out, material failure, malfunctions, or combinations of deterministic contributions to the event, and the top event can be quantified by assigning component or system failure rates to the branches or cut sets. The failure rates are usually expressed in terms of the mean time to failure of the components.
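As a minimal sketch of how failure data can feed a fault tree, the following Python snippet converts a mean time to failure into a failure probability and combines basic events through AND and OR gates. It assumes a constant failure rate (exponential lifetime) and statistically independent basic events; neither assumption is stated in the text, and the function names, the example system and the numbers are illustrative only.

```python
import math

def failure_probability(mttf_hours: float, mission_hours: float) -> float:
    """Probability of failure by mission_hours, assuming a constant failure rate 1/MTTF."""
    return 1.0 - math.exp(-mission_hours / mttf_hours)

def and_gate(probs):
    """All inputs must fail (independent events): product of the probabilities."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def or_gate(probs):
    """At least one input fails (independent events): 1 - product of survival probabilities."""
    survive = 1.0
    for p in probs:
        survive *= (1.0 - p)
    return 1.0 - survive

# Hypothetical system: two redundant pumps (AND) combined with one valve (OR).
p_pump = failure_probability(mttf_hours=10_000, mission_hours=1_000)
p_valve = failure_probability(mttf_hours=50_000, mission_hours=1_000)
p_top = or_gate([and_gate([p_pump, p_pump]), p_valve])
print(f"P(top event) = {p_top:.4f}")
```

The gate functions mirror the Boolean structure of the tree, so a larger tree can be quantified by nesting calls in the same way the gates are nested.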
For a complex system with a large number of components, fault tree analysis becomes extremely difficult and time consuming. It is carried out at the design stage, and because the design is subject to change, the analysis does not stay consistent with it. The analysis also depends on the engineer performing it, whose experience and knowledge play a very important role in its accuracy. Many safety studies that use fault trees have proved costly in terms of the time and money needed to construct them. Construction requires special skills beyond those of a typical process or design engineer, and the resulting fault trees can vary from person to person, so their accuracy is not consistent. To overcome this, automatic fault tree generation, which is independent of the person developing the tree, is useful. Generation can be automated given two inputs. The first is the system structure, which depicts the connections and relations between the components. The second is a component failure model, which gives a clear understanding of how each component functions and how its failure modes affect the functionality of the component and of the system as a whole. The research carried out in this work concentrates on understanding the different methods of automatic fault tree generation.
2. Automatic Fault Tree Methods
In the last decade a great deal of work has been carried out to automate fault tree construction, and many methods have evolved to meet this need. They include the Digraph Method, Decision Matrix, Component Model Methods, Computer Aided Fault Trees, Expert Systems, Macro Fault Trees and Disturbance Analysis.
2.1 Digraph Method
The digraph method of fault tree construction was first introduced in 1977 by Lapp & Powers [3]. Andrews and Brennan [4] adopted it to develop a fault tree automatically for a complex control configuration. The method has two steps. The first is the construction of the digraph itself, which represents the normal function of the system together with the effects of component failures and of deviations in the system inputs. The second is the use of an algorithm to convert the digraph into a completed fault tree. The fault tree is constructed by tracing the causes of the undesired deviation in the top event variable back through the system.
A digraph, also known as a directed graph, depicts the propagation of faults through a system. It comprises a set of nodes and edges. The nodes in a digraph used for fault tree construction represent process variables and failure modes, and the edges connecting the nodes carry the relations between them. If a deviation in one variable causes a deviation in a second variable, an edge is drawn from the cause deviation to the effect deviation. The edge is given a number that depends on the direction and magnitude of the second deviation relative to the first. If a moderate deviation in the first variable causes a moderate deviation in the second, the edge carries the value '1'. If the deviation in the second node is very large compared with the cause deviation, the edge carries '10'. Edges for deviations that the system cannot control usually carry a number of magnitude 10. When two nodes are not related, or when a deviation in the first has no effect or only a very small effect on the second, no edge connects them. The sign of the number represents the relative direction of the two deviations: it is positive if they are in the same direction and negative if they are in opposite directions. The numbers that can be assigned to an edge are therefore +10, -10, +1, -1 and 0. The number associated with an edge is called the gain, which is the partial derivative of the second (effect) variable with respect to the first (cause) variable.
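The edge conventions above can be captured in a small data structure. The following Python sketch is one possible representation, not the representation used in [3] or [4]; the class names, method names and the variables X1, X2 and X3 are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_GAINS = {+10, -10, +1, -1}  # a gain of 0 means that no edge is drawn

@dataclass(frozen=True)
class Edge:
    cause: str   # node whose deviation is the cause
    effect: str  # node whose deviation is the effect
    gain: int    # signed qualitative gain

class Digraph:
    def __init__(self):
        self.edges = []

    def add_edge(self, cause, effect, gain):
        if gain == 0:
            return  # unrelated or negligible effect: no edge connects the nodes
        if gain not in ALLOWED_GAINS:
            raise ValueError(f"gain must be one of {sorted(ALLOWED_GAINS)} or 0")
        self.edges.append(Edge(cause, effect, gain))

    def causes_of(self, effect):
        """Edges pointing into `effect`, i.e. its immediate causes."""
        return [e for e in self.edges if e.effect == effect]

g = Digraph()
g.add_edge("X1", "X2", -1)   # moderate response in the opposite direction
g.add_edge("X3", "X2", 0)    # negligible effect: ignored
```

Storing edges by cause and effect makes the later back-tracing step a simple lookup of the edges that point into a given variable.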
2.1.1 Digraph Construction
The digraph is developed so that a fault tree can be constructed for an undesired event. The first task is therefore to identify the process variable deviation that represents the top event. The digraph is produced backwards from the top event variable by considering both the actions of components that could cause deviations in the variable and the physical laws that relate the variables. The newly added local variables are connected to the top event variable by appropriate edges and are in turn developed in the same way. Development continues until all the undeveloped variables on the diagram correspond to inputs at the boundary of the system. The allowable deviations in these variables are also stated and may appear in the fault tree as basic events. The causes of these events are outside the scope of the analysis and so are not developed further.
As an example [4], consider an air-to-close control valve with spring action. The air pressure is represented by P3 and the flow rate of the fluid by M2. A positive deviation in P3 causes a negative deviation in M2, so the edge carries a negative sign with magnitude 1.
P3 --(-1)--> M2
If the valve has quick-closing characteristics, then a positive deviation in P3 causes a very large negative deviation in M2:
P3 --(-10)--> M2
The number associated with the edge can be written as a partial derivative, here of M2 with respect to P3, and is termed the gain between the variables.
Having covered the basics of developing a digraph, let us use the same example to illustrate the digraph method [3].
Figure 1: Air-to-close control valve [3]
Each variable label consists of a letter and a number. The letter represents the physical variable, such as pressure, temperature or mass flow. From Figure 1, the air pressure to be controlled is therefore P3 and the mass flow rate is M2. We can now draw the digraph of the system. A positive deviation in the pressure P3 causes a negative deviation in M2, so the edge carries -1. If the valve were a quick-closing type, such that a small deviation in P3 causes a large deviation in M2, the edge would carry -10. The gain of a slam-shut valve is thus +/-10.
Figure: Digraph representation of control valve [3]
As discussed in the example, the magnitude of the gain on an edge is 1 for a moderate response of the valve and 10 for a large response. The next step is to determine how disturbances propagate through the digraph. The deviation of the dependent variable, here M2, is calculated as the product of the deviation of the independent variable, here P3, and the gain. When the disturbance and the gain both have magnitude 10, the output disturbance magnitude is also 10.
To complete the digraph and arrive at the fault tree, the engineer must fully understand the system. To ensure that the correct gains and edges are added to the digraph, each component is considered separately and an input-output model is constructed for it.
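The propagation rule can be illustrated with a short Python sketch. It takes the product of the cause deviation and the gain and caps the magnitude at 10; the cap is an interpretation of the statement that a disturbance and a gain of magnitude 10 still give an output of magnitude 10, and the function name is hypothetical.

```python
def propagate(deviation: int, gain: int) -> int:
    """Deviation of the effect variable: product of cause deviation and gain,
    with the magnitude capped at 10 (the largest qualitative level)."""
    product = deviation * gain
    if product == 0:
        return 0
    sign = 1 if product > 0 else -1
    return sign * min(abs(product), 10)

print(propagate(+1, -1))    # moderate rise in P3, normal valve        -> -1
print(propagate(+1, -10))   # moderate rise in P3, quick-closing valve -> -10
print(propagate(+10, -10))  # large rise in P3, quick-closing valve    -> -10
```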
2.1.2 Fault Tree Construction
For simple systems that do not feature any control loops, fault trees can be constructed quickly by tracing the potential causes of a deviation in the top event variable back through the system digraph. The fault tree structure in this case is very simple: process deviations are combined by OR gates unless conditional edges are encountered. For a conditional edge, the output variable deviation occurs only when the correct input deviation occurs and the condition also exists. The main advantage of the digraph method is that it can be applied to systems that have control loops. Simpler systems do not require additional effort to develop intermediate steps. For complex systems, however, fault tree construction can proceed only once all control loops in the system have been identified and classified as negative feedback or negative feedforward loops.
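For the loop-free case described above, the back-tracing step might be sketched as follows. This is a simplified illustration rather than the Lapp & Powers algorithm: it tracks only the direction of each deviation, ignores conditional edges and control loops, and uses a hypothetical digraph in which M1 and P_supply are invented boundary variables.

```python
def build_tree(digraph, variable, sign):
    """Trace the causes of `variable` deviating in direction `sign` (+1 or -1).

    `digraph` maps each effect variable to a list of (cause, gain) pairs.
    Assumes the digraph has no loops and no conditional edges.
    """
    event = f"{variable}({'+' if sign > 0 else '-'})"
    causes = digraph.get(variable, [])
    if not causes:
        # Boundary variable: appears in the tree as a basic event.
        return {"event": event, "gate": None, "inputs": []}
    inputs = []
    for cause, gain in causes:
        # A negative gain means the cause must deviate in the opposite direction.
        cause_sign = sign if gain > 0 else -sign
        inputs.append(build_tree(digraph, cause, cause_sign))
    return {"event": event, "gate": "OR", "inputs": inputs}

# Hypothetical loop-free digraph: M2 depends on P3 and on an upstream flow M1.
digraph = {
    "M2": [("P3", -1), ("M1", +1)],
    "P3": [("P_supply", +1)],
}
tree = build_tree(digraph, "M2", +1)
print(tree)
```

Each variable with incoming edges becomes an OR gate over its immediate causes, and variables at the system boundary terminate the recursion as basic events.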
Negative Feedback loop
A negative feedback loop (NFBL) is a loop that can correct moderate disturbances in the sensed process variable. An NFBL can be identified in a digraph as a path that starts and ends at the same node and for which the product of the normal gains is negative. It has the following characteristics. An NFBL causes a disturbance only when its net gain is positive. It passes a disturbance that is too large or too fast for the control loop to correct. It also passes a correctable disturbance if one or more control devices on the loop are inactive, unless they halt the propagation of the disturbance.
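The identification rule (a closed path whose product of gains is negative) can be sketched in Python as a search for simple cycles. The edge list below is a hypothetical level control loop plus an unrelated positive loop, and the function name is illustrative; this is not the loop-finding procedure of [3].

```python
def find_nfbl(edges):
    """Return cycles (as lists of nodes) whose product of gains is negative.

    `edges` is a list of (cause, effect, gain) tuples.
    """
    adjacency = {}
    for cause, effect, gain in edges:
        adjacency.setdefault(cause, []).append((effect, gain))

    loops = []

    def walk(start, node, path, product):
        for nxt, gain in adjacency.get(node, []):
            if nxt == start:
                if product * gain < 0:
                    loops.append(path[:])
            elif nxt not in path and nxt > start:
                # Only visit nodes that sort after `start`, so each cycle
                # is reported once, from its smallest node.
                path.append(nxt)
                walk(start, nxt, path, product * gain)
                path.pop()

    for start in sorted(adjacency):
        walk(start, start, [start], 1)
    return loops

edges = [
    ("level", "sensor", +1),
    ("sensor", "controller", +1),
    ("controller", "valve", +1),
    ("valve", "outflow", +1),
    ("outflow", "level", -1),   # more outflow lowers the level
    ("a", "b", +1),
    ("b", "a", +1),             # positive loop: not an NFBL
]
nfbl_loops = find_nfbl(edges)
print(nfbl_loops)
```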
Figure: Negative feedback loop operator, n = number of nodes [3]
Negative Feedforward loop
A negative feedforward loop (NFFL) can correct a disturbance that is already present in the system and prevent it from propagating further. It does so by sensing a variable on the upstream path and manipulating a variable on the downstream path. An NFFL is identifiable on a digraph by the following features. There are two or more paths from one node to another node, and the sign of the product of the normal gains on one path is different from that on the others.
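The two identifying features can be checked mechanically. The Python sketch below enumerates simple paths between every pair of nodes and reports pairs that are connected by paths of opposite sign; the temperature control example and all names are hypothetical, and exhaustive path enumeration is only practical for small digraphs.

```python
def path_signs(adjacency, source, target, visited=None, sign=1):
    """Yield the sign (+1 or -1) of the gain product for each simple path source -> target."""
    visited = (visited or set()) | {source}
    for nxt, gain in adjacency.get(source, []):
        step = sign * (1 if gain > 0 else -1)
        if nxt == target:
            yield step
        elif nxt not in visited:
            yield from path_signs(adjacency, nxt, target, visited, step)

def find_nffl(edges):
    """Return (source, target) pairs joined by paths with both signs."""
    adjacency = {}
    nodes = set()
    for cause, effect, gain in edges:
        adjacency.setdefault(cause, []).append((effect, gain))
        nodes.update((cause, effect))
    result = []
    for source in sorted(nodes):
        for target in sorted(nodes):
            if source != target:
                if set(path_signs(adjacency, source, target)) == {1, -1}:
                    result.append((source, target))
    return result

edges = [
    ("T1", "T2", +1),              # hotter inlet gives hotter outlet
    ("T1", "sensor", +1),
    ("sensor", "controller", +1),
    ("controller", "coolant", +1),
    ("coolant", "T2", -1),         # more coolant lowers the outlet temperature
]
nffl_pairs = find_nffl(edges)
print(nffl_pairs)
```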
Figure: Negative feedforward loop operator [3]
In the development of automatic fault trees, the digraph method is considered a formalised approach because it incorporates functional information about the system. It is a multi-parameter, multi-state method. However, the digraph has to be developed manually, which requires the analyst to have expertise and proper knowledge of the system. Once the digraph is constructed, the fault tree can be developed automatically, starting from the digraph representation of the system.
2.2 Computer Aided Trees
Computer Aided Trees is an attempt to automate the construction of fault trees, started in 1976 by Salem et al. [7]. The method is built on a component-level approach, with component models that can be standardised for various component types. The behaviour of each component is represented by a logical decision matrix, which relates the input variables to the output variables while considering the internal state of the component. The system is modelled by defining the logical state of each component and propagating it through the components in sequence along the system diagram. The algorithm checks whether the boundary conditions are coherent in order to avoid repetition. Logical loops are in place to check the correct functionality of the system and components.
2.3 Taylor's Approach of Mini Trees
Taylor introduced the concept of the mini-tree in order to model components [8]. It is a component-level approach, as each component is modelled separately and its model constructed independently. The fault tree is developed by linking the relevant mini-trees, guided by analysis of the top event. The component models are developed from physical and mathematical models of the component behaviour. These relations are reduced to a decision matrix, which relates the inputs and outputs of the component to its state. The main advantage of this decision matrix is that it gives not only the logical relation between the input and output variables but also the qualitative gain. The information in the decision table is transformed into a mini-tree library, from which the fault tree is developed by expanding and linking different mini-trees to suit the top event. The process ends when all the bottom events are primary events.
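The linking step can be sketched as a recursive expansion over a mini-tree library. The Python example below is a simplified illustration, not Taylor's algorithm: the library contents, event names and function names are hypothetical, and qualitative gains are omitted.

```python
MINI_TREES = {
    # output event: (gate, input events); all names are hypothetical
    "no flow from valve": ("OR", ["valve stuck closed", "no flow into valve"]),
    "no flow into valve": ("OR", ["pump stopped", "pipe blocked"]),
    "pump stopped": ("OR", ["pump mechanical failure", "no power to pump"]),
}

def expand(event, library):
    """Recursively link mini-trees until only primary events remain."""
    if event not in library:
        return event  # primary event: there is no mini-tree to expand
    gate, inputs = library[event]
    return {"event": event, "gate": gate,
            "inputs": [expand(i, library) for i in inputs]}

def primary_events(tree):
    """Collect the primary events at the bottom of an expanded tree."""
    if isinstance(tree, str):
        return [tree]
    return [p for sub in tree["inputs"] for p in primary_events(sub)]

linked = expand("no flow from valve", MINI_TREES)
print(primary_events(linked))
```

Expansion stops exactly when an event has no entry in the library, which corresponds to the stopping condition that every bottom event is a primary event.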
2.4 Automated Fault Tree Generation Methodology
This is one of the last traditional approaches to automating fault tree generation [9]. It is based on graph theory and uses diagrammatic algorithms to determine the functional structures of the system. The analysis is mainly intended for electrical and electronic applications and proceeds as follows. The circuit is represented by a graph. Each component is modelled using the analytical transfer function for its functioning modes. The top event is localised and described quantitatively. The boundary conditions are defined, and the system is then decomposed using graph theory. Each sub-circuit produced by the decomposition must have at least one input and one output. The control loops and some current senses are recognised from the system diagram and the boundary conditions. The fault tree is developed by tracing back through the circuit diagram.
2.5 Macro Fault Trees
This method was proposed by Poucet and De Meester in 1981 [10]. It splits the system into macro components. A fault tree is developed for each macro component, and the relevant fault trees are connected together to give the overall system fault tree. The fault tree is built from a library that contains fault tree structures for the different failure modes of the macro components. Each node or gate is represented by a variable with three parts: the first carries the node name, the second the node type, and the third a table of pointers to the addresses of all the elements in that node. The fault tree is constructed in three steps. First, the system is represented as a functional flow diagram. Next, the macro fault trees are built for the macro components. Finally, the macro fault tree is developed further in terms of basic events using the description of each component.
2.6 Decision Matrix Method
The decision table, or decision matrix, method constructs fault trees by decomposing the system into its components [11]. These components may have the same or different failure modes. A single component may also have many failure modes, in which case it has several mini fault trees of its own. In this process, cause-and-effect models of the different components are built and stored in a library, from which they are drawn as required to construct fault trees for different target systems.
The relations between the inputs and outputs of a component are listed in a table called the decision table or matrix, which describes how each combination of input events results in a specific output event. One output may be associated with several causes in the decision matrix, while other outputs may have a single exclusive cause. These situations are represented by logic gates. An OR gate is used to connect all the causes of the same output value.
A cause may also be composed of several input values that produce the output value only when they all act together. In this case the input values are connected by an AND gate. By defining all the logical relations in this way, the fault tree is constructed from the table.
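The mapping from table rows to gates can be sketched as follows. In this hypothetical Python example each row of the table is one cause: the input values within a row are joined by an AND gate, and all rows that give the same output value are joined by an OR gate. The table contents and function name are invented for illustration.

```python
# Hypothetical decision table for a valve: each row maps a combination of
# input and state values to the resulting output value.
DECISION_TABLE = [
    ({"inlet flow": "none", "valve state": "normal"}, "no outlet flow"),
    ({"inlet flow": "normal", "valve state": "stuck closed"}, "no outlet flow"),
    ({"inlet flow": "normal", "valve state": "normal"}, "normal outlet flow"),
]

def tree_for_output(table, output):
    """OR together every row that yields `output`; AND the inputs within a row."""
    causes = []
    for inputs, result in table:
        if result != output:
            continue
        terms = [f"{name} = {value}" for name, value in inputs.items()]
        causes.append(terms[0] if len(terms) == 1
                      else {"gate": "AND", "inputs": terms})
    if not causes:
        return None  # the table has no row producing this output
    return causes[0] if len(causes) == 1 else {"gate": "OR", "inputs": causes}

no_flow_tree = tree_for_output(DECISION_TABLE, "no outlet flow")
print(no_flow_tree)
```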
2.7 Artificial Intelligence Inspired Methods (Expert Systems)
Many efforts have been made to use artificial intelligence (AI) techniques to automate fault tree generation. Among them, Expert Systems technology [12] has shown some promise. 'Expert Systems are computer programs which manipulate knowledge on a specific field of application, achieving a high degree of skill in intricate and conflicting heuristic areas'. [7] This methodology is applied to the automation of fault tree construction as follows. The system is represented in terms of its top events, and its different states are identified. The system must be fully represented in terms of its structure and its functions. The knowledge base holds all the information on component characteristics, component behaviour in the functioning and failed states, and the experience and heuristics used to build the logic model. The inference engine constructs the fault tree logic for the given system and top event, along with the quantification of the basic events. Component behaviour is modelled by the rules associated with each component. Many logical models, such as CAFTS, mini-trees and decision matrices, can be written as rules. Ernest, Express and STARS FT are among the expert systems that use AI to build fault trees automatically.
2.8 Three Valued Logic Approach
This is one of the more recent approaches to fault diagnosis, on which Yamashina et al. worked extensively [13]. It is based on a rule-based expert system that synthesises fault trees to perform fault diagnosis. It is a component-level approach combined with three-valued logic. A third logic value is added to conventional binary logic to take account of unknown situations that may arise. The approach also models the particular system as a semantic network built from standard operators, so the analyst retains the central role in describing the system behaviour. The system representation comprises a functional description followed by an intermediate system representation. The analyst defines the device functions, which enable the devices to be modelled during fault tree synthesis.
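The effect of a third logic value on gate evaluation can be illustrated with a short Python sketch. It assumes Kleene's strong three-valued logic, with None standing for 'unknown'; the source does not state which three-valued semantics Yamashina et al. use, so this is an illustrative choice, and the function names are hypothetical.

```python
from typing import Iterable, Optional

Tri = Optional[bool]  # True, False, or None for "unknown"

def and3(values: Iterable[Tri]) -> Tri:
    """Three-valued AND: any False decides the result; otherwise unknown dominates."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True

def or3(values: Iterable[Tri]) -> Tri:
    """Three-valued OR: any True decides the result; otherwise unknown dominates."""
    values = list(values)
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False

# One input known to have occurred, one not yet observed.
print(and3([True, None]))  # None: the AND gate output is still unknown
print(or3([True, None]))   # True: the OR gate output is already decided
```

Under these semantics a gate output stays unknown only while the known inputs are not sufficient to decide it, which is what allows diagnosis to proceed with incomplete observations.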
3. Conclusion
Fault tree analysis is one of the most widely used methodologies for fault diagnosis across industries, and it plays a predominant role as a modelling technique in safety and diagnostic analysis. It is a simple yet very effective way to break a system down into its components and carry out the analysis at component level. Organisations use it to measure the reliability performance of a system in terms of the functionality of its components. However, fault tree construction is very time consuming and extremely tedious for large and complex systems, and the result depends on the skill set of the analyst carrying out the analysis. As systems and their analysis have grown more complex, efforts have long been made to automate the construction of fault trees.
Various methods have been developed to address these shortcomings and to provide a consistent methodology for developing fault trees. This literature review has covered many of them, including the digraph method, the decision matrix, computer-assisted techniques and artificial intelligence methods, all of which aim to generate fault trees in less time and with comparable accuracy. The main challenge is to develop fault trees that stay as close as possible to the actual system and to keep up with the increasing complexity of system analysis.