What Is Failure Mode And Effect Analysis Information Technology Essay

Published: November 30, 2015

Failure Mode and Effect Analysis (FMEA) is a quality risk assessment technique that allows a company to identify the possible failures that may occur in a system, the likelihood of those failures occurring and the consequences for both the system and the company if they were to occur. "Failure Modes" refers to the problems that can arise in a product or system, while "Effect Analysis" concerns the consequences and impact of these failures.1 FMEA also defines the system and outlines its requirements in an attempt to avoid failures.

History of FMEA

FMEA was first set up in the 1940s under the title Failure Modes, Effects and Criticality Analysis (FMECA). It was set up by the U.S. military with the idea of being able to outline and determine failures in systems and equipment. In the 1960s FMEA was taken up by the aerospace industry with the aim of understanding how and why systems fail.2 Using FMEA within the aerospace industry was a new spin on the risk assessment, as up until then it had been applied to military systems and weapons. During the space race it was used primarily in the preservation of foods, identifying the critical control points in food systems and ensuring they would not cause problems for astronauts in space.3 It was then included as part of the HACCP system in the food production industry in the 1970s, following a large number of botulism (Clostridium botulinum) outbreaks associated with canned foods. HACCP stands for Hazard Analysis and Critical Control Points and, as the name suggests, it involves looking at possible risks (hazard analysis) and developing measures to control them. By outlining the possible risks and determining the critical control points, procedures can be established for a system to ensure the right corrective actions are taken if a problem occurs.

Outlining FMEA

If FMEA is outlined prior to the beginning of a project (i.e. within the design stages) it can be applied throughout the project, which can in turn help eliminate, reduce or even avoid failures and possible risks. FMEA is divided into two categories - product related and process related. Product related FMEA is often referred to as Design FMEA. It is an analytical technique used by design engineering teams to ensure that all potential failure modes, causes and effects have been looked at in terms of design. Process related FMEA is known as Process FMEA. It is an analytical technique usually used by manufacturing teams, whose job is to identify and outline the potential failure modes, causes and effects and ensure that they have been looked at from a process point of view (see Figure 1).

Design FMEA

The design FMEA process looks at what is required of the design. This involves looking at what is needed by the customer and what is wanted by the customer; often these two requirements are very different. Design FMEA does not usually include the possible failures that can occur in the manufacturing stages but looks at those that can occur due to problems with the design.4 These include problems with the materials used in the design of the system, incorrect calculations used in the design process and/or incorrect use of stresses, temperatures and standards.

Problems in design FMEA can range from "very minor" to "hazardous". "Hazardous" is grouped into two categories - with warning and without warning while "very minor" refers to a product or system not complying with its purpose.

Process FMEA

Process FMEA refers to the causes, effects and risk analysis of any problems that can occur in the manufacturing stages of the system. It identifies the consequences of these possible failures and how best to avoid them. Process FMEA usually works by outlining the main operations of the system in question and then looking at all possible failure modes for this system. Each failure mode is then looked at in detail with regard to its effects, causes, controls, detection methods, rate of occurrence and the severity of the consequences it will have on the system. In Process FMEA, the risk with the highest possibility of occurrence is examined first. Depending on its severity or detection numbers, the whole system may have to be redesigned, with new controls added to reduce the risk of failures. It is important to note that a failure in the process and manufacturing of a system can cause more than immediate problems: it can also have serious consequences for other aspects of the process down the line and even affect the customers.

Evaluation of process FMEA is often looked at in two ways - quantitative and qualitative.5 Quantitative refers to the numerical data recorded by those working on the systems. It is based on failures that may be seen in the design and process FMEA. Qualitative refers to the participation of the team involved in the project. It is based on their knowledge and experience of past projects and their ability to look at a process and see where possible failures might occur.

Steps in the FMEA Process

There are a number of steps outlining the FMEA process. The aim of these steps is to identify all possible failures, their causes and their effects and also possible control measures which will prevent or reduce the likelihood of these failures actually occurring.

Step One: this generally involves looking at the system and determining the start and end processes. This is one of the first steps as it allows the possible failures, their causes and effects to be approached head on. FMEA is most often carried out by a team, which usually comprises engineers, designers, developers and a team manager. Each team member will have their own individual strengths and will look at processes in a different way. This can help the success rate of FMEA, as the creativity of the individual members combines to work out the best approach to minimise failures. Time is given for the entire process to be explained to the team in detail; through this, possible failures can be picked out, as each team member brings different experience or expertise to the table.

Step Two: All possible failures are brainstormed by the team. By discussing the process and its possible failures, flow charts can be designed to compare the possible risks and causes. Using the flow charts, each possible failure can then be broken down to determine whether the problem lies in the design or in the process, and how best to fix the problem or prevent it from occurring at all (see Figure 2).

Step Three: involves outlining the areas on which the failure will have the most effect i.e. will it affect the customers or will it affect the company and stall the process? Part of step three includes not only working out where the most impact will be felt but also what effect, however minor, that a failure will have on the current or future process.

Step Four: involves assessing the criticality or severity of the risks to the system. The risks are charted in order of their severity, with the least severe recorded first: a risk with a severity of 1 would not have a large impact on the process, while a severity rating of 10 would be given to a failure that would have a major impact and would result in problems for the process and its customers, and possibly health issues for those working with the process. An example of how severity would be rated can be seen in an article by MJ O'Dwyer1, which notes that laptops with battery problems are among the possible risks that have previously prompted recalls. The possible risk of explosion would rank as a severity level 10, as it was a safety hazard not only to those working with the products but also to the customers who bought them.

Step Five: All potential causes of failure are identified by the team. The process is looked at from all aspects and anything that may have the potential to cause a problem within the system is identified and changed. This may involve looking at different parts of the process, e.g. the materials used, or even undertaking experiments and applying technical problem-solving skills. Technical problem solving generally involves defining a problem, finding out the facts about the problem and about possible solutions, finding a solution and evaluating the pros and cons of that solution.

Step Six: involves determining the occurrence rating. The occurrence rating expresses how probable it is that a failure will occur. Once the team has determined how often the failure could occur, they assign it a number from one to ten, as with the severity rating. A rating of one indicates a remote possibility of the failure occurring, whereas a rating of ten indicates a high probability of occurrence. If there is a very high probability of the failure occurring then the team must work to remove this possibility or at least minimise the possible rate of occurrence.

Step Seven: the causes of the possible failure are identified. The failures are looked at according to both their severity and occurrence ratings, and changes to the system are outlined to prevent the failures occurring. These changes may be systematic changes, which involve putting devices in place to detect failures before they occur, or mistake-proofing. Mistake-proofing involves using an instrument or process to prevent problems and malfunctions within a system.6 Essentially, mistake-proofing causes the system to eliminate choices that may lead to incorrect actions, signals and/or defects in a process.

Step Eight: involves determining the detection ranking, i.e. the probability that the current controls will be effective in their purpose. It involves ensuring that possible failures will be detected by the instruments put in place in step seven, so that they do not go unnoticed and cause further failures in other parts of the system, in both the current process and future processes.

Step Nine: calculating the Risk Priority Number (RPN). According to Pat Hammett (University of Michigan)7 the RPN outlines the areas that should be of the most concern in a system by combining the severity rating (step four), the occurrence rating (step six) and the detection rating (step eight). The three ratings are multiplied together to give the RPN. Failures with the highest RPN are considered the most problematic and are addressed before any other failures.
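The multiplication in step nine can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the one-to-ten range for the detection rating is an assumption, since the steps above state that range only for the severity and occurrence ratings.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Multiply the three ratings from steps four, six and eight."""
    ratings = {"severity": severity, "occurrence": occurrence, "detection": detection}
    for name, rating in ratings.items():
        # Assumption: every rating, including detection, uses a 1-10 scale.
        if not 1 <= rating <= 10:
            raise ValueError(f"{name} rating must be between 1 and 10, got {rating}")
    return severity * occurrence * detection


# Hypothetical ratings for a single failure mode.
rpn = risk_priority_number(severity=10, occurrence=3, detection=4)
print(rpn)  # 120
```

With three ratings of one to ten, the RPN can range from 1 to 1000.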

Step Ten: involves tackling the problems outlined in the previous steps. This is achieved using controls and corrective actions. As with all previous steps these controls are implemented in accordance with the severity of the failure i.e. RPN. The main aim of step ten is not only to reduce the risk of failure but also to introduce the changes in a controlled and traceable manner so that all future processes will have a reduced risk of problems.
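As a rough sketch of how corrective actions might be ordered by RPN, the Python below sorts a small worksheet so that the failure with the highest RPN is tackled first. The class, the function and all of the ratings are hypothetical and chosen purely for illustration; they are not taken from a real FMEA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    severity: int    # rating from step four
    occurrence: int  # rating from step six
    detection: int   # rating from step eight

    @property
    def rpn(self) -> int:
        # RPN as described in step nine: the product of the three ratings.
        return self.severity * self.occurrence * self.detection


def prioritise(modes):
    """Return the failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical worksheet entries.
worksheet = [
    FailureMode("label misprinted", severity=2, occurrence=6, detection=3),
    FailureMode("battery overheats", severity=10, occurrence=2, detection=5),
    FailureMode("seal leaks", severity=6, occurrence=4, detection=7),
]

ranked = prioritise(worksheet)
for mode in ranked:
    print(f"{mode.rpn:>4}  {mode.description}")
```

Keeping the ratings alongside each entry means the list can be re-sorted whenever a corrective action changes a rating, which fits the controlled and traceable approach that step ten calls for.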

Following this ten step process the system should be reviewed regularly. It should also be reassessed prior to any new changes being implemented and any new defects or problems should be recorded as soon as they are found along with the corrective measures used to fix them.

Why Use Failure Modes and Effect Analysis?

As seen above, quite a lot of work is required for a successful FMEA procedure to be implemented. A main limitation in using FMEA concerns the team's experience in dealing with failures. While it can be beneficial if the team is an experienced one with plenty of past processes to refer to in terms of failure, it is a limitation if the team has no such experience. Ranking problems by RPN may also cause a less serious issue to be looked at before a more pressing matter if its RPN is higher. A failure with a high severity rating can still have a lower RPN than a less severe failure whose occurrence and detection ratings are high, so the more severe failure may not be looked at in time.
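The ranking limitation can be made concrete with a small worked example in Python. The two failure modes and their ratings are invented for illustration, and the example assumes that a low detection rating means the failure is easy to detect.

```python
# Hypothetical ratings as (severity, occurrence, detection).
failure_modes = {
    "safety hazard, rare and easily detected": (10, 2, 2),
    "cosmetic defect, frequent and hard to detect": (3, 6, 6),
}

# RPN is the product of the three ratings.
rpn = {name: s * o * d for name, (s, o, d) in failure_modes.items()}

highest_rpn = max(rpn, key=rpn.get)
highest_severity = max(failure_modes, key=lambda name: failure_modes[name][0])

print(rpn)
print("Examined first by RPN:", highest_rpn)
print("Most severe failure:  ", highest_severity)
```

Here the cosmetic defect scores 108 against 40 for the safety hazard, so a ranking based on RPN alone would examine the less serious issue first.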

Though these two limitations are important and must be considered when undertaking an FMEA process the benefits of FMEA far outweigh the limitations. An FMEA that is effective and efficient allows the best possible quality standards to be reached as well as ensuring customer satisfaction and reliability.

Benefits of FMEA include:

Helping the team to design the best possible process and/or system to maximise reliability and customer satisfaction as well as a high manufacturing yield.

It allows the team to look at the possible failures and how they could affect both the process and the customer both in the short term and in the long run.

Documentation which is completed throughout an FMEA provides records for future processes and information on failures, causes and effects as well as corrective measures that can be used. This prevents the company losing out in the long term because of the same problem occurring.

New ideas are presented and put forward and can be recorded for future projects if not used. This ensures that the company also strives for improvement and more effective production levels.

Documentation of in line control checks and criteria for new models is outlined.

There is never just one FMEA process to follow. It is an evolving process which constantly strives towards improvement, new designs and expectations.

Conclusion:

FMEA is a step-by-step approach used to identify any possible failures in a process. It documents these failures allowing any and all corrective measures to be at hand for any future designs and system processes.

The concept of FMEA is to outline and apply the best changes when working on minimising failures. It is beneficial to systems as it helps in the identification of possible failures, effects and causes. FMEA has been used since the 1940s and has proven its worth through team contributions of better designs for products and processes, increased reliability resulting in increased customer satisfaction, and benefits for the company such as a safer work environment and reduced costs. Failure Modes and Effect Analysis has been used in conjunction with quality management systems such as Six Sigma, good manufacturing practices (GMPs) and various ISO standards to achieve the best possible standard of quality for both the company and its customers.

Figure

Figure 2: Sample FMEA Worksheet as used by teams when identifying failure causes, effects and corrective measures.