SMS Part 7: Using the SMS Hazard Log to Support Change Management

In our last article we began looking at the high-level strategies for selecting mitigations, or risk controls, to reduce the risks associated with aviation safety hazards. This month we will examine how to record Safety Management System (SMS) data in a hazard log, as well as one of the less-obvious benefits of an effective SMS: the potential to use the safety risk management records to support effective change management.

Aviation Maintenance Magazine has been publishing a series of articles explaining how to establish and use a safety risk management (SRM) system to identify aviation safety hazards and assess the associated risk. SRM is one of the four key components of a complete Safety Management System (SMS). This (seventh) article assumes that you have some familiarity with the basic concepts of SMS that were covered in those first six articles. If you do not, then we recommend that you go back and read the past six articles (you can find all six at www.avm-mag.com).

In the past articles on SMS, we have discussed how to identify a hazard, how to assign values to the hazard correlating to likelihood of harm and consequence of such harm, how to assess the total risk posed by the hazard, and how to mitigate the risk. These are all part of the SRM component of an SMS. A robust SRM allows the user to assess the risks associated with hazards, and rank those risks, with the aim of focusing limited resources first on the hazards that pose the greatest risk. Once the hazards with the highest-level risks have been mitigated, resources can be devoted to those with lower-level risks. This permits a risk-based approach to the development of a safety system, and it also encourages continuous evolution of the system that is used to manage safety.

SMS does more than merely help allocate limited resources. It also helps to document safety decisions, and it offers an opportunity to use those records to support elements of your safety system, including effective change management.

Recording the Results of Your SRM

One of the basic elements of SMS is documentation, and thus the system should document each of the four components of SMS, including SRM. The SRM documentation can be divided into two sets of records: (a) the records that describe the SRM processes, like the SMS manual, and (b) the records that are created as outputs of the SRM, like a hazard log. Note that this does not include the myriad records that are part of other systems, which nonetheless may be analyzed in the context of the SRM processes (like your existing component maintenance manuals used to support repair processes).

The hazard log is simply a compilation of the hazards that have been identified through the SMS, and the records concerning the way that each hazard was processed, including risk assessment data, identified/implemented mitigations, and the actual results of those mitigations.

I have a list of about 20 categories of fields that I recommend for capturing information in a complete hazard log. I don’t have enough space in this column to provide a full analysis of all fields, but a list of my preferred starting fields is being published as part of the scalability appendix in the next update to the SM-0001 Standard (expected late 2021). Pull up that standard if you want to see what I recommend. Here, I will just identify a few key fields that ought to be in your hazard log.

First you should identify the hazard and the details associated with it (this reflects multiple fields). Details can include information like scope: for example, if this hazard is only analyzed in a particular context, then that context should be identified. A missed-inspection hazard that arises in one repair, and also appears to arise in another repair, might have different consequences in each repair and therefore the hazards should be assessed as two different hazards, each with a different scope, because each has a different consequence.

As another example, proper calibration for ovens used to relieve hydrogen embrittlement is far more important than proper calibration for ovens in the break room (and the risk assessment for each will be different).

After identifying the hazard, you should record the risk assessment results. This typically means recording the likelihood, consequence/severity, and total risk (at a minimum). I typically like to record the risk assessment as it exists in at least four states:

• Risk assessment with no mitigations [as if there were no quality system at all; most existing businesses will already have some risk process controls in place before the SMS is created (such as processes required by the regulations), and it is important to recognize that those processes already mitigate risk and that without them the risk would be worse];

• Risk assessment with current (existing) mitigations [recognizing that there may be risk controls, or mitigations, already in place in an existing system; if the risks shown in the first and second assessments are the same, then this might be an indication that the current mitigations are not having any appreciable effect, or it might suggest that your risk assessment categories are too broad to capture differences in risk level];

• Risk assessment with proposed (new) mitigations [before implementation, to identify anticipated results; once again, if the risk assessment shows that the risk level is the same in the second and third assessments, then this could suggest that either the mitigation is inadequate, or the risk-measurement-scale is insufficiently precise];

• Risk assessment with new mitigations [following implementation, to identify actual results and compare them to anticipated results; if the achieved risk level does not match the anticipated risk level, then this could be a signal that the mitigation is inadequate or improperly implemented; note that the goal is typically to reduce the risk to an acceptable level, so there remains the possibility of residual risk].

Each of these risk assessments would be compared to the business’ safety goals to determine when the risks of the associated hazard are satisfactorily mitigated. Obviously, the risk assessments may be performed (and recorded) at different times to reflect the process flow of the business’ SMS.
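The four assessment states described above can be sketched as a small record type. This is only a minimal illustration, not a prescribed hazard log format: the 1-to-5 scales, the use of likelihood multiplied by consequence as the total risk, and all of the names (RiskAssessment, HazardRecord, warnings) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RiskAssessment:
    """One assessment of a hazard; the 1-5 scales are an assumption."""
    likelihood: int
    consequence: int

    @property
    def total_risk(self) -> int:
        # Assumed scoring rule: likelihood multiplied by consequence.
        return self.likelihood * self.consequence


@dataclass
class HazardRecord:
    hazard: str
    scope: str
    unmitigated: RiskAssessment                 # state 1: no mitigations
    current: RiskAssessment                     # state 2: existing mitigations
    proposed: Optional[RiskAssessment] = None   # state 3: anticipated result
    achieved: Optional[RiskAssessment] = None   # state 4: actual result

    def warnings(self) -> List[str]:
        """Flag the comparison signals discussed in the text."""
        notes = []
        if self.current.total_risk == self.unmitigated.total_risk:
            notes.append("current mitigations show no effect, or the scale is too coarse")
        if self.proposed and self.proposed.total_risk == self.current.total_risk:
            notes.append("proposed mitigation shows no effect, or the scale is too coarse")
        if (self.proposed and self.achieved
                and self.achieved.total_risk != self.proposed.total_risk):
            notes.append("achieved risk differs from anticipated risk")
        return notes


# Hypothetical values, chosen only to exercise the checks.
record = HazardRecord(
    hazard="missed inspection",
    scope="hypothetical repair A",
    unmitigated=RiskAssessment(4, 5),
    current=RiskAssessment(3, 5),
    proposed=RiskAssessment(1, 5),
    achieved=RiskAssessment(2, 5),
)
print(record.warnings())
```

In this made-up record the achieved risk is higher than the anticipated risk, so the only warning raised is the one that suggests the new mitigation may be inadequate or improperly implemented.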

The mitigations should be listed in the hazard log as well. I like to recommend that the hazard log be established as a relational database. This allows one hazard to have more than one mitigation (recognizing that this is often the case in modern real-world quality systems) but it also allows a single mitigation to address more than one hazard. For example, a decision to purchase an alternative PMA part to support a particular repair might have been intended to mitigate the hazard of short supply from the original source, but if the PMA part also incorporates modifications designed to improve reliability, then it might also be claimed as a mitigation to a reliability hazard identified in the next higher assembly. In such a case the mitigation might reasonably be associated with both hazards. The importance of this arrangement in the hazard log will become clearer as we discuss the Change Management topic, below.
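One way to picture the relational arrangement described above is a link table that sits between hazards and mitigations. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and the helper function are hypothetical, and a real hazard log would carry many more fields.

```python
import sqlite3

# In-memory database for illustration only; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hazard (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE mitigation (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
-- Link table: one row per (hazard, mitigation) pair, so the
-- relationship can be many-to-many in both directions.
CREATE TABLE hazard_mitigation (
    hazard_id INTEGER NOT NULL REFERENCES hazard(id),
    mitigation_id INTEGER NOT NULL REFERENCES mitigation(id),
    PRIMARY KEY (hazard_id, mitigation_id)
);
INSERT INTO hazard VALUES
    (1, 'short supply from the original source'),
    (2, 'reliability issue in the next higher assembly');
INSERT INTO mitigation VALUES (1, 'purchase alternative PMA part');
INSERT INTO hazard_mitigation VALUES (1, 1), (2, 1);
""")


def hazards_for_mitigation(mitigation_id):
    """Return the descriptions of every hazard linked to one mitigation."""
    rows = conn.execute(
        """SELECT h.description
           FROM hazard AS h
           JOIN hazard_mitigation AS hm ON hm.hazard_id = h.id
           WHERE hm.mitigation_id = ?
           ORDER BY h.id""",
        (mitigation_id,),
    )
    return [row[0] for row in rows]


print(hazards_for_mitigation(1))
```

Here the single PMA-part mitigation is linked to both the supply hazard and the reliability hazard, mirroring the example in the text.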

SRM and Change Management

I’ve spoken to many quality professionals who find the relationship between SMS and change management to be confusing. One reason for the confusion is that at the beginning of the SMS program, before a hazard log has been established, there appears to be no difference between a change management analysis and a typical SRM analysis. In each case you are identifying hazards and then analyzing them. This is frustrating to professionals who are seeking a systems-based approach to change management.

Simply applying SRM to the change carries two potential dangers within the system. There is a danger that the analysis will fail to predict a hazard associated with the proposed change. There is also a danger that the proposed change will lead to unintended consequences by impacting a mitigation that is associated with an unrelated hazard.

Luckily, a robust SMS can help to mitigate these two dangers: as the hazard log is populated with data, it becomes an important change management tool.

Remember that we are recording hazards, and their details, in the hazard log. If you are following my advice, then each hazard that needed to be mitigated is linked to one or more mitigations in the hazard log (e.g. through a relational database link). These are the mitigations that successfully reduce the risk to an acceptable level. If you look back at the recent article on risk mitigation selection strategies (https://www.avm-mag.com/sms-part-6-strategies-for-identifying-and-selecting-risk-controls/), you’ll see that there are multiple types of risk process controls and multiple strategies for implementing those risk process controls. These can range from written procedures, to training, to system design that drives safe behaviors. In each case, if you catalog those risk mitigations in your hazard log and link each one with the hazard(s) that it mitigates, then this will allow you to examine whether a change will impact risk mitigations (for example, a manual change that modifies the language of a procedure) and then you can identify the linked hazards. You can also examine how the mitigation affects those hazards. This permits you to begin your change management process by relying on analysis that has already been performed within the SMS. If you will change a risk mitigation, then examination of its connections in the hazard log allows you to identify the most likely consequences of that change (including the identification of unintended consequences).
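The lookup described above, from a changed mitigation back to the hazards it is linked to, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the links are held as plain (mitigation, hazard) pairs rather than in a database, and the entries and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical hazard log links, stored as (mitigation, hazard) pairs.
LINKS = [
    ("procedure: traveler sign-off", "skipped maintenance step"),
    ("procedure: traveler sign-off", "work handed over between shifts"),
    ("training: plating process", "hydrogen embrittlement"),
]


def build_index(links):
    """Map each mitigation to the set of hazards it is linked to."""
    index = defaultdict(set)
    for mitigation, hazard in links:
        index[mitigation].add(hazard)
    return index


def hazards_to_review(changed_mitigations, links):
    """For each mitigation touched by a change, list the linked hazards."""
    index = build_index(links)
    return {
        mitigation: sorted(index.get(mitigation, set()))
        for mitigation in changed_mitigations
    }


# A manual change that rewords the traveler sign-off procedure:
print(hazards_to_review(["procedure: traveler sign-off"], LINKS))
```

An empty list for a mitigation only means that the hazard log holds no links for it; it does not show that the change is free of hazards.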

This doesn’t take the place of a process that independently identifies likely hazards and performs safety assessment on each one, but it does provide a starting point, so that previously accomplished analysis can be reused, and so that known hazards can be assessed in the context of the change using the existing system information as a guide.

As the hazard log matures, it will capture the collected analyses of the past in a way that can directly support a systems-based approach to change management, allowing the safety department to identify likely consequences, and to develop new mitigations to ensure that previously identified hazards continue to be properly mitigated, particularly after a change.

Want to learn more? We have been teaching classes on SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Contact us if we can help you with your SMS questions.

SMS Part 6: Strategies for Identifying and Selecting Risk Controls

In this article we will begin to look at the high-level strategies for selecting mitigations, or risk controls, to reduce the risks associated with aviation safety hazards.

Aviation Maintenance Magazine has been publishing a series of articles explaining how to establish and use a safety risk management (SRM) system to identify aviation safety hazards and assess them for risk. The SRM is one of the key elements of a complete Safety Management System (SMS). This article assumes that you have some familiarity with the basic concepts of SMS that were covered in those articles. If you do not, then we recommend that you go back and read the past five articles (you can find all five on Aviation Maintenance Magazine’s website).

In the past articles on SMS, we have discussed how to identify a hazard, how to assign values to the hazard correlating to likelihood of harm and consequence of such harm, and how to assess the total risk posed by the hazard. The nature of this process is that you will be able to rank the risks so that the hazards that pose the greatest risk can be addressed first. This allows an aviation business to focus its limited resources on mitigating the most important risks first, while at the same time preserving the less important risks to be addressed at a later date.
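The ranking step described above can be illustrated with a short sketch. It assumes that total risk is scored as likelihood multiplied by consequence on 1-to-5 scales; the hazards and their values are invented for the example, and your own SMS may use a different scoring rule.

```python
# Hypothetical hazards with made-up likelihood and consequence values.
hazards = [
    {"name": "skipped maintenance step", "likelihood": 3, "consequence": 4},
    {"name": "paint fume inhalation", "likelihood": 2, "consequence": 3},
    {"name": "hydrogen embrittlement", "likelihood": 4, "consequence": 5},
]


def total_risk(hazard):
    # Assumed scoring rule: likelihood multiplied by consequence.
    return hazard["likelihood"] * hazard["consequence"]


def rank_by_risk(hazard_list):
    """Return the hazards ordered from greatest to least total risk."""
    return sorted(hazard_list, key=total_risk, reverse=True)


for hazard in rank_by_risk(hazards):
    print(total_risk(hazard), hazard["name"])
```

The hazards at the top of the sorted list are the ones that would receive attention first, while the rest remain recorded for later.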

But what do we mean when we say “address the risks”?

Two easy meta-strategies for mitigating the risk associated with a hazard are (1) to reduce the likelihood that the hazard will arise and (2) to reduce the consequences of the hazard if it arises. Remember that likelihood and consequence are the two metrics that we used to calculate total risk associated with each hazard. And these are both things that we do in aviation every day.

A typical hazard in a repair station is the possibility that the person performing maintenance will skip a step. This is a hazard that is mitigated in most repair stations through risk control processes aimed at both likelihood and consequence. For example, it is normal for the repair station to develop a “traveler” that describes the step-by-step process for the intended repair. This will typically be developed from the existing maintenance manual(s) for the article to be repaired. The mere existence of the traveler as a guide is a risk control to help mitigate the likelihood of missed steps in the repair. But that is not all we do. We also typically ask the person completing the processes to initial or stamp a check-box for each step to show that the step has been completed. This provides a visual cue to the maintenance technician that each step has been completed, and makes it obvious which step is next to be completed (this also mitigates other hazards, like the hazards posed by maintenance that spans over more than one shift). Each of these processes reduces the likelihood that the maintenance technician will skip a step during maintenance.

That is not all we do to mitigate the risk of skipped steps. We’ve all heard the adage that the work is not complete until the paperwork is complete. It is normal in repair stations for the traveler to be reviewed by an inspector before the work is considered to be complete. In such a review, if a step was skipped then the inspector will identify this as an issue that needs to be corrected before the article can be approved for release to service. This review is a process that mitigates a number of hazards, but one of the things that it does is mitigate the consequence of errors. This is because if an error was made (like a skipped step), then the consequences are less likely to escape from the system because of the review process. Thus, the safety consequences of skipped steps are mitigated to an insignificant level when the review process works correctly to identify when such steps may have been skipped.

Another way of looking at this particular mitigation (inspecting the work to ensure steps were not skipped) is that it limits the exposure of the hazard. By identifying the hazard in-house when it arises and preventing the affected article from leaving the quality system, the processes insulate the repair station’s customers from exposure to the risk. Exposure limitation can also arise in ways that are more attenuated from consequence mitigation, such as preventing access to areas in which hazards arise.

Modern technology is being used to reinforce these efforts. Computer-based travelers can be programmed to prevent an article from moving to the next step unless each step is confirmed to have been completed.

As you can see from these last few paragraphs, there are a number of ways to mitigate risks. While the meta-strategies are to reduce the likelihood or to reduce the consequences of the hazard, there are specific strategies that are commonly used to accomplish these meta-strategies.

Four common risk process control strategies – in order of their priority – are:

1. Design for minimum risk

2. Incorporate safety devices

3. Provide warning devices

4. Develop procedures and training

When you can design for minimum risk, that always helps to ensure that inherent hazards are mitigated. This can be true in the design of the article by the manufacturer, but it can also be true in the design of a repair station’s facility. For example, if an identified hazard is inhalation of paint fumes, then the risks associated with that hazard can be mitigated through a facility design that keeps painting separate from humans, and effectively exhausts the fumes through a mechanism that reduces their toxicity to acceptable levels.

When it is not possible to minimize risk through design approaches, then the next consideration should be incorporation of safety devices and mechanisms. Using the paint-shop inhalation hazard, appropriate respirators can be safety devices that help to mitigate the inhalation risks for those employees who must be potentially exposed to inhalation hazards.

Warning devices can also be risk mitigations. They are typically used to reduce the likelihood of harm from a hazard, because they warn the employees away from the hazard or provide advice on how to best mitigate the risk posed by the hazard. Warning devices are used throughout aviation, from signs warning unauthorized personnel away from a place with hazards, to “remove before flight” tags hung from access panels that must be closed at the conclusion of a maintenance operation.

Developing procedures and training is listed last. Ensuring that your colleagues have the right training and the right procedures is important, but if you rely solely on these then you are introducing human factors into the risk process controls, which means that there is a greater likelihood of failure in these controls. This doesn’t mean that procedures and training are not important. They might be the only way to reasonably control a risk. They are also useful as a supplement to other risk process controls. But when they are the only risk process controls in place then it is especially important to ensure that they are effective (techniques for accomplishing this include auditing and are covered in the Safety Assurance element of SMS).
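The priority order of the four strategies can be captured in a small enumeration, which then lets you sort candidate controls and flag a hazard that relies only on procedures and training. This is a sketch for illustration; the class name, the function names, and the candidate controls for the paint-fume hazard are all hypothetical.

```python
from enum import IntEnum


class ControlStrategy(IntEnum):
    """The four strategies; lower numbers are considered first."""
    DESIGN_FOR_MINIMUM_RISK = 1
    SAFETY_DEVICE = 2
    WARNING_DEVICE = 3
    PROCEDURES_AND_TRAINING = 4


def order_candidates(candidates):
    """Sort (description, strategy) pairs by strategy priority."""
    return sorted(candidates, key=lambda candidate: candidate[1])


def relies_only_on_procedures(candidates):
    """True when every control is a procedure or training control."""
    return bool(candidates) and all(
        strategy == ControlStrategy.PROCEDURES_AND_TRAINING
        for _, strategy in candidates
    )


# Hypothetical candidate controls for the paint-fume inhalation hazard.
paint_fume_controls = [
    ("respirator training", ControlStrategy.PROCEDURES_AND_TRAINING),
    ("warning sign at booth entrance", ControlStrategy.WARNING_DEVICE),
    ("separate paint booth with exhaust", ControlStrategy.DESIGN_FOR_MINIMUM_RISK),
    ("respirators", ControlStrategy.SAFETY_DEVICE),
]

for description, strategy in order_candidates(paint_fume_controls):
    print(strategy.name, "-", description)
```

A hazard for which relies_only_on_procedures returns True is a natural candidate for the extra effectiveness checks mentioned above.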

This article should not be used as a boundary. You should never hesitate to apply creative solutions to thorny problems. But if you are looking for a way to start the hazard-risk mitigation process, then using these categories as a guide can help you to begin identifying what sort of mitigation might yield the results that you want.

Want to learn more? We have been teaching classes in SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions.

SMS Part 5: The Relationship Between Risk Controls and Your Safety Risk Management System

In this article we will begin to look at how to use mitigations — or risk controls — to reduce the risk associated with aviation safety hazards.

Last year, Aviation Maintenance Magazine published a series of four articles explaining how to establish and use a safety risk management (SRM) system to identify aviation safety hazards and assess them for risk. The SRM is one of the key elements of a complete Safety Management System (SMS). This article assumes that you have some familiarity with the basic concepts of SMS that were covered in those articles. If you do not, then we recommend that you go back and read those four articles (you can find all four on Aviation Maintenance Magazine’s website).

This year, we will guide you through the next steps of implementing an SMS, and in this month’s article we will focus on basic concepts related to risk controls and how they relate to the work you did in recording your hazards and safety risk analyses.

Part of the SRM process for analyzing hazards (the process that we addressed in the past articles) involved assigning likelihood levels and consequence levels to each identified hazard. These help you to place risks on a likelihood-consequence matrix, which in turn helps you to identify which hazards need to have their risk levels reduced. Based on this matrix, there are two ways to reduce the risk associated with a hazard. You can reduce the likelihood that the hazard will occur, or you can reduce the consequence of the hazard in the event it occurs.
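A likelihood-consequence matrix can be sketched as a simple lookup table. The three-level scales, the level names, and the band placed in each cell below are assumptions for illustration only; a real matrix would use the categories and acceptance criteria defined in your own SMS.

```python
# Assumed three-level scales, ordered from lowest to highest.
LIKELIHOOD = ["rare", "occasional", "frequent"]
CONSEQUENCE = ["minor", "major", "catastrophic"]

# Rows follow LIKELIHOOD, columns follow CONSEQUENCE.
# The band in each cell is an assumption made for this example.
MATRIX = [
    # minor        major           catastrophic
    ["acceptable", "acceptable",   "review"],        # rare
    ["acceptable", "review",       "unacceptable"],  # occasional
    ["review",     "unacceptable", "unacceptable"],  # frequent
]


def risk_band(likelihood, consequence):
    """Look up the band for one likelihood/consequence pair."""
    row = LIKELIHOOD.index(likelihood)
    column = CONSEQUENCE.index(consequence)
    return MATRIX[row][column]


before = risk_band("frequent", "major")
after_lower_likelihood = risk_band("rare", "major")
after_lower_consequence = risk_band("frequent", "minor")
print(before, after_lower_likelihood, after_lower_consequence)
```

Moving up a row (lower likelihood) or left a column (lower consequence) are the two directions in which a risk control can shift a hazard on the matrix.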

These two concepts are not new to aviation. We’ve been using these concepts for years. For example, an air carrier’s required inspection items are items for which a second inspection is necessary before the work is complete. The second inspection provides a second opportunity for an independent inspector to look for flaws. This improves the likelihood that any existing flaws will be caught, which in turn decreases the likelihood that flaws remain in the work performed. This effort reduces the likelihood that the underlying hazard will occur (the hazard(s) for which the inspection was designed). Total risk, in this case, is reduced by reducing likelihood.

Another example can be found in the common practice of having duplicate or back-up systems where the systems are critical. Where there is an effective back-up system, the failure of the primary system will not lead to catastrophic results. Thus the consequence of a failure is mitigated through the design functions that permit a duplicate or back-up system to operate in the event of a primary system failure.

Note that where a system is critical and it is impractical to have a duplicate or back-up of the system, it is normal to impose life limits that are designed to remove parts that are subject to wear or degradation before they could reasonably fail. This effort to decrease likelihood of failure shows us that elements like practicality can be weighed to allow us to choose from more than one risk control, and we can sometimes choose from controls that improve our management of likelihood, consequence, or both in our efforts to reduce total risk.

Let’s apply these concepts to an example. Imagine a scenario where a repair station performs plating. One of the hazards associated with plating is hydrogen embrittlement. This should be recorded in the repair station’s database of hazards. Naturally, without any risk process controls, the likelihood of hydrogen embrittlement might be high. Hydrogen embrittlement can cause a component to fracture at stresses less than those typically associated with the expected strength of the metal. In other words, the metal is more brittle than expected which can lead to damage in the component. The potential safety consequence of such a hazard might be significant.

There are normal processes associated with common plating operations that are intended to reduce the likelihood of hydrogen embrittlement (such as heat treatment for thermal stress relief). The heat treatment adequately reduces the likelihood of the hydrogen embrittlement hazard, and this reduces the total risk associated with the hazard (typically reducing it to an acceptable level). Thus, heat treatment would be recorded as the risk control associated with the identified hazard of hydrogen embrittlement in your plating process.

Obviously, the risk control is valuable to prevent hydrogen embrittlement, but recording it in your hazard-risk-mitigation database has independent management value. If data shows later hydrogen embrittlement in plated components, this database allows you to focus on the risk controls that were intended to reduce that risk, and to analyze them for flaws.

It also allows you to use your hazard-risk-mitigation database to perform change management. For example, if the repair station decides to replace the ovens used for heat treatment with new ovens, then the hazard-risk-mitigation database should show where those ovens are being used as hazard mitigations, and should permit the change management reviewers to ensure that the new ovens will be adequate to mitigate each risk for which the old ovens had been identified.

By changing the likelihood level, consequence level, or both, the system can effectively reduce risk posed by hazards. As we will see in future articles, this helps to drive an effective audit schedule as well as becoming an effective and objective change management tool. How do we select process controls that will effectively reduce likelihood, consequence, or both? Read our next article where we will discuss strategies for identifying and selecting risk controls.

Want to learn more? We have been teaching classes in SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions.

Customer Bankruptcies: Protect Your Right to Get Paid

We are going to take a break from our series on how to construct an SMS system to look at ways to protect your right to get paid.

There are valuable strategies for protecting your right to get paid for your work. These strategies become especially important in tough economic times. Several air carriers filed for bankruptcy or insolvency protection at the beginning of the Covid-19 pandemic, and it looks like another wave of bankruptcies could be around the corner, especially among smaller airlines in certain markets.

This article examines bankruptcy priorities, and offers several strategies for increasing your potential to get paid when you are selling articles or providing services to a company that subsequently becomes insolvent. If you are intrigued by what you see here, then further investigation with an attorney may be appropriate.

What is Bankruptcy/Insolvency?

There are typically two main types of bankruptcy filings for aviation businesses – liquidation and reorganization. We will refer to the businesses that are seeking this sort of legal process as “firms.”

In the United States, liquidation is commonly known as a “Chapter 7” filing, based on the chapter in the U.S. Bankruptcy Code. Firms that are entering a liquidation will sell off their assets in order to pay creditors. The firm may continue operation for a short time if continued operation will benefit the creditors.

When a firm thinks it can emerge from bankruptcy with the court’s help, it can file for a reorganization, or Chapter 11. In such an event, the firm would use its bankruptcy trustee to help reorganize its debts. The trustee is an administrator who is appointed to protect the creditors’ interests, and who typically has powers to help liquidate or reorganize the firm. The firm may operate as a debtor-in-possession and essentially serve as its own trustee, under close court monitoring.

One of the key powers of a Bankruptcy Court in the United States is the power to decide whether a contract shall be assumed, rejected, or otherwise terminated (note that the U.S. Bankruptcy Code gives courts the power to continue executory contracts even when the contract says that it is terminated for bankruptcy). If you have an outstanding contractual obligation and your partner files for bankruptcy protection, then the court could terminate the contract or it could order you to continue performing under the contract.

Outside the United States, the bankruptcy proceeding is often called “Insolvency,” and it may vary from the United States norms that are described in this article. For instance, some countries permit liquidation but do not have a counterpart to the reorganization portion of the U.S. Bankruptcy Code.

Bankruptcy Priorities – Who Gets Paid?

When a firm files for bankruptcy protection, any efforts to collect on outstanding debt owed by the firm immediately cease and all claims against the firm must go through the Bankruptcy Court.

The outstanding debts of the firm are typically paid according to “priority.” The first priority is for the administrative expenses of the bankruptcy trustee. This encourages trustees to work actively for the firm, because the trustee knows that he or she will get paid. The second priority is for certain claims made by a Federal reserve bank related to certain loans. The third priority is for certain claims that arise in an involuntary filing (most aviation bankruptcy filings tend to be voluntary).

The fourth priority is for claims for employee wages and sales commissions, followed by the fifth, which is for contributions to an employee benefit plan.

In all, there are ten priorities that get paid before any secured creditors are paid. And the secured creditors will then be paid from their security before the unsecured creditors. The unsecured creditors are paid last, and they typically get a pro rata share of anything that is left (which can be pennies on the dollar or can be nothing). The difference between unsecured and secured creditors is explained in the next section, where we also explain how one can become “secured.”

There is usually a difference between debts from before the filing and debts incurred afterwards, especially in reorganization cases. In a reorganization case, the court wants to encourage companies to do business with the bankrupt firm in order to make the reorganization successful. As a consequence, it is normal for the court to order a priority for essential vendors. Those who are providing a good or service that is essential to continued operation will get paid for their post-filing transactions, and in some cases (where the good or service is sufficiently critical and cannot be obtained elsewhere) they may be able to negotiate the payment of pre-filing debt as a condition of continued business.

Normally, an independent repair station that performed work for an operator (that is now in bankruptcy) cannot find its way into the first ten priorities, but it can improve its ability to get paid in the event of a bankruptcy by positioning itself to reclaim unpaid property, or by seeking to become a secured creditor.

What Can I Do to Protect My Right to Get Paid?

One way to protect your right to get paid is to be able to assert ownership of an asset that appeared to be a part of the bankruptcy estate (but was not, because you owned it). If you sell parts to an air carrier, your contract could specify that they are placed in the air carrier’s inventory as a loan but that they are not purchased until they are paid for. This does create certain additional liabilities (including tort liabilities that may arise related to the goods that you own) and those liabilities need to be considered and addressed in a written document before this approach is used.

A related approach is one in which goods are sold and can then be reclaimed if they remain unpaid in the event of an insolvency (pre-filing). This is a short-lived right that arises under the Uniform Commercial Code: the right is only good for ten days, and demand for return has to be made (in writing) within ten days of delivery. Ten days is a very short time period, and most vendors will not be able to ascertain an insolvency within ten days of a delivery. If the insolvent customer misrepresented its solvency (in writing) during the three months before the delivery of the goods in question, though, then this waives the ten-day limit and you may be able to reclaim the goods more than ten days after delivery. With this in mind, it may make sense to ask some customers to make a written assertion of solvency: either on a periodic basis, or before certain key deliveries.

It is also possible in some cases to reclaim unpaid goods after a bankruptcy filing. The bankruptcy code establishes timelines for post-filing reclamation, and key dates arise at the 20-day and 45-day marks after filing, so this is something that you should investigate quickly if your customer files for bankruptcy while owing you money for deliveries.

Reclaiming unpaid goods is just one possible remedy when your customer is insolvent. Another option is to establish a security interest in the goods. A security interest doesn’t give you the right to reclaim the goods, but it does give you a priority during the bankruptcy that makes your claims superior to those of the unsecured creditors.

When you have a security interest in goods, then you get paid first out of the sales proceeds. Let’s say that you sell a serialized article worth $100,000 to customer X and secure the transaction with a security interest in the article. You are owed $100,000 and that amount is currently secured by the serialized article. If the bankruptcy trustee sells the article for $60,000 in an auction, then you would get the $60,000 and this would satisfy your secured interest. You would end up with $60,000 plus a $40,000 unsecured claim. While this is not as good as your original $100,000 expectation, it is better than a $100,000 unsecured claim that might yield only $1,000 after years of litigation.
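The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the split between the secured recovery and the leftover unsecured claim; the function name is hypothetical, and the sketch assumes a single secured creditor and ignores sale costs, fees, and any other adjustments that a real bankruptcy would involve.

```python
def split_secured_claim(amount_owed, sale_proceeds):
    """Split a debt into the part paid from the collateral's sale and the unsecured remainder.

    Simplifying assumption: one secured creditor, no sale costs or fees.
    """
    # The secured creditor is paid first, but never more than what is owed.
    secured_recovery = min(amount_owed, sale_proceeds)
    # Whatever the collateral did not cover becomes an ordinary unsecured claim.
    unsecured_claim = amount_owed - secured_recovery
    return secured_recovery, unsecured_claim


recovery, unsecured = split_secured_claim(100_000, 60_000)
print(recovery, unsecured)  # 60000 40000
```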

One of the issues with securing a transaction is that you typically have to plan for the security interest. This means that it is something that you ought to plan with your legal consultants before it becomes necessary.

For repair stations, there are typically two different ways to secure a transaction. The first is that you can secure an interest by “contract” using a security agreement and a financing statement. This is often the way that a sale is secured. This requires the buyer (debtor) to sign certain documents related to the transaction, so it usually requires up-front negotiation to effect this sort of relationship, and it also typically requires documents to be filed in order to be effective against third parties (this is known as “perfection” of an interest).

The second way to secure an interest is that you can rely on a law that offers a specific path to assert a lien against property. In the United States, the specifics of this process will vary based on state law, but there is often a mechanism that allows a repair station to assert a lien against an asset on which it has performed work. Some states have aviation-specific laws and some states have laws that more generally apply to all sorts of maintenance. A repair or alteration performed on an aircraft may permit the repair station to assert a lien against the entire aircraft.

Look at the laws in your state carefully! A common mistake is to try to rely on the “mechanics’ lien” law (which in many states applies to real estate contractors and not to aviation mechanics). Because these laws are different in every state, it is important for a repair station to work with a lawyer to examine its own state laws to assess (1) when such a lien may be asserted, (2) against what sort of assets it may be asserted (e.g. just aircraft or can you assert the lien against a component), and (3) how it must be asserted (what is the technical process to follow to make the lien enforceable).

Correlating Risk Consequence and Likelihood

This is the fourth article in a series about Safety Management Systems (SMS). In the first article (see page 48 of the January 2020 issue of Aviation Maintenance), we examined some hazard identification strategies (looking at ways to identify the things that could go wrong in our systems). In the second article (see page 48 of the May/June issue), we began looking at the process of using risk assessment to analyze identified hazards by explaining how to establish a “likelihood” scale that is relevant to your business needs, and how to calculate a “likelihood” for each identified hazard. In the third article (see page 48 of the July issue) we continued our examination of risk assessment by looking at the process of using “consequence” as a second metric for analyzing risk.

This month we will examine how to correlate “consequence” and “likelihood” together to get a product that represents the relative risk associated with the hazard. We can use this risk product to help make risk mitigation decisions, and also to measure the effectiveness of our mitigation efforts.

I strongly recommend that you go back and read the first three articles if you have not looked at them recently. They are each pretty short, and they lay a foundation that will make it much easier to understand what this article is talking about. These prior articles are available in the back issues found on the Aviation Maintenance magazine website.

To review, we have previously discussed the identification of hazards (in the January 2020 issue), and the correlative assignation of likelihood levels (from the May 2020 issue) and consequence levels (from the July 2020 issue) to those identified hazards. If you are looking for more detail on how to assign these values, then please look at the earlier articles.

Typically, these assignments are only relevant within the system in which they are assigned. For example, one system might assign a likelihood level of 3 and a consequence level of 4 to a particular hazard, and another system might assign very different numbers to the same hazard. This could be because the scales of the systems are defined differently, it could be because the likelihood and consequence levels are themselves defined differently, or it could be because the hazard has different actual and potential effects within a particular system. The important thing is to use the system that you defined so that you can assign values that provide a relative risk rating. Such a rating will be relative to risks of other hazards identified and enumerated by your system.

Let’s look at the two scales that we used as samples for a repair station SMS in the past two articles. First there is a scale for likelihood (Exhibit A):

exhibit a

And second there is a scale for consequence (Exhibit B):

exhibit b

We’ve assigned numbers to the different levels (one through five in each case). By multiplying the numbers we can get a product. The product of the two represents a risk rating. For example, if likelihood is 4 and consequence is 4 (hazardous), then the product of the two is 16. If the likelihood of another hazard is 2 and the consequence is also 2 (minor), then the product is 4. These numbers are not absolute, so they do not tell us anything when analyzed outside of our system; but within our system they tell us that the first hazard (with a 16 risk product) should be a higher priority for mitigation than the second hazard (with a 4 risk product). This allows the owners of the system to prioritize their hazard mitigation projects to focus on the hazards that pose the most significant risk.
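As a rough sketch, the multiplication and the resulting priority order could be expressed in Python as follows. The function name and the two hazard labels are placeholders invented for this illustration; the levels are the 1-to-5 values from the sample scales.

```python
def risk_product(likelihood, consequence):
    """Multiply a 1-5 likelihood level by a 1-5 consequence level."""
    for level in (likelihood, consequence):
        if not 1 <= level <= 5:
            raise ValueError("levels must be between 1 and 5")
    return likelihood * consequence


# Hypothetical hazard log entries: name -> (likelihood, consequence)
hazards = {
    "second hazard": (2, 2),
    "first hazard": (4, 4),
}

# Sort so that the highest risk product comes first in the mitigation queue.
priority_order = sorted(
    hazards, key=lambda name: risk_product(*hazards[name]), reverse=True
)
for name in priority_order:
    print(name, risk_product(*hazards[name]))
```

The numbers only rank hazards against each other inside one system; they are not meant to be compared with another company's values.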

The simple multiplicative comparison is not the only way to approach these figures. For example, if your system prioritizes consequence over likelihood, then you might consider developing risk products by a formula like [consequence x consequence x likelihood]. This approach squares the consequence value, which makes it a much greater influence on the final risk product number. For example, in a straight multiplicative model, a hazard with a consequence of 4 and a likelihood of 3 yields a risk product of 12; and a hazard with a consequence of 3 and a likelihood of 4 also yields a risk product of 12. They are weighted equally in such a model. But in the consequence-squared model, a hazard with a consequence of 4 and a likelihood of 3 yields a risk product of (4x4x3=) 48, while a hazard with a consequence of 3 and a likelihood of 4 yields a risk product of (3x3x4=) 36. Now the first hazard is prioritized over the second one for purposes of identifying an order in which to mitigate the hazards. Notice that our hypothetical hazards did not change, but only the way that we analyzed them changed.
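The two weighting models can be compared side by side in a short Python sketch. The function names are made up for this illustration, and the consequence-squared formula is just the example weighting described above, not a recommended standard.

```python
def straight_product(consequence, likelihood):
    """Straight multiplicative model: consequence x likelihood."""
    return consequence * likelihood


def consequence_squared_product(consequence, likelihood):
    """Consequence-weighted model: consequence x consequence x likelihood."""
    return consequence * consequence * likelihood


# (consequence, likelihood) pairs for the two hypothetical hazards
hazard_a = (4, 3)
hazard_b = (3, 4)

for model in (straight_product, consequence_squared_product):
    print(model.__name__, model(*hazard_a), model(*hazard_b))
# straight_product 12 12
# consequence_squared_product 48 36
```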

The products of the likelihood and consequence numbers can also be used to help us set mitigation targets. By examining the products, the SMS system-owner can determine which risk products are acceptable and which risk products are unacceptable. Those hazards that have risk products that are deemed to be unacceptable would need to be mitigated in order to reduce their risk products to an acceptable level. In the next article in this series, we will discuss risk mitigation strategies.

A matrix of acceptable/unacceptable risks might look like this (Exhibit C):

exhibit c

In this matrix, we have established that certain risk products are considered to be acceptable, and certain risk products are unacceptable. In our sample, there is also one risk product that is marked in yellow as “review”: this is for catastrophic hazards that would be unlikely to ever occur. When hazards in this yellow-review risk-product are identified, they will be subject to additional review in the system to determine whether to mitigate them (so such a hazard is neither acceptable nor unacceptable until the review has assessed it). When the system is newly-implemented, there may be many hazards that pose unacceptable risks. The numerical products found by multiplying the likelihood and consequence numbers can be used as a mechanism for prioritizing hazards in order to determine which ones to mitigate first.

The goal is, of course, to mitigate all of the hazards to an acceptable level. In our matrix, this means reducing the likelihood or consequence to a low enough level to move the risk product into the green. Eventually, a measured approach to SMS hazards should reduce the risk associated with the known hazards to acceptable levels. But this doesn’t mean that we are done!

We can also amend our risk product matrices as experience shows us that certain risk products need to be prioritized, and also as successful mitigations help to reduce total system risk.

Perhaps, after working in the system for two years, our hypothetical SMS-owner will feel that the business has successfully mitigated the risks posed by many of the identified hazards, and now the business is ready to begin mitigating the next round of hazards. The business might change the acceptable/unacceptable risk matrix by lowering the bar for mitigation so that the new matrix looks like this (Exhibit D):

Notice that the new matrix has changed some acceptable risks to unacceptable, which means that the business will develop new mitigations to further reduce the risk products of hazards in those categories to acceptable (green) levels. It is possible that some hazards that were mitigated from red-to-green in the prior matrix might need to be further mitigated after this change resets the concept of what is acceptable.

This approach allows the business to use its risk product acceptable/unacceptable matrix as a tool for continuous safety improvement, by moving the levels of acceptable safety to force constant improvement.
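One way to picture this is with a small Python sketch in which the matrix is reduced to a single cut-off value. Everything specific here is an assumption made for illustration: the cut-off values of 10 and 8, the function name, and the choice of likelihood 1 with consequence 5 as the "review" cell are hypothetical and are not taken from the exhibits. A real matrix may mark individual cells rather than apply a single cut-off.

```python
# Assumed "review" cell: (likelihood, consequence) for an unlikely but catastrophic hazard.
REVIEW_CELLS = {(1, 5)}


def classify(likelihood, consequence, max_acceptable):
    """Classify a hazard as 'review', 'acceptable', or 'unacceptable'.

    max_acceptable is a hypothetical cut-off: risk products at or below it
    are treated as acceptable (green), anything above as unacceptable (red).
    """
    if (likelihood, consequence) in REVIEW_CELLS:
        return "review"
    product = likelihood * consequence
    return "acceptable" if product <= max_acceptable else "unacceptable"


hazard = (3, 3)  # risk product of 9

print(classify(*hazard, max_acceptable=10))  # acceptable under the original bar
print(classify(*hazard, max_acceptable=8))   # unacceptable once the bar is lowered
```

The hazard itself did not change between the two calls; only the definition of what is acceptable moved.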

exhibit d

In the next issue, we will look at how to use mitigations in order to reduce likelihood levels and consequence levels of identified hazards. By changing the likelihood level, consequence level, or both, the system can effectively reduce risk posed by hazards. As we will see in future articles, this helps to drive an effective audit schedule as well as becoming an effective and objective change management tool. Want to learn more? We have been teaching classes in SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions.

SMS: Making It Useful Part Three Risk Consequence

This is the third article in a series about Safety Management Systems (SMS). In the first article (see Aviation Maintenance January 2020 issue, page 48), we examined some hazard identification strategies (looking at ways to identify the things that could go wrong in our systems). In the second article (see Aviation Maintenance May/June issue page 48), we began looking at the process of using risk assessment to analyze identified hazards by explaining how to establish a “likelihood” scale that is relevant to your business needs, and how to calculate a “likelihood” for each identified hazard. This month, we will continue examining risk assessment by looking at the process of using “consequence” as a second metric for analyzing risk. By adding “consequence” to “likelihood,” we can correlate the two in order to get a product that represents the relative risk associated with the hazard (we will use this risk product to help make decisions, and also to measure the effectiveness of our mitigation efforts).

I strongly recommend that you go back and read the first two articles if you have not looked at them recently. They are each pretty short, and they lay a foundation that will make it much easier to understand what this article is talking about.

Did you review the prior two SMS articles? Good! Then we are ready to continue.

Remember that there are a number of reasons for assessing risk. These reasons were described in some detail in the earlier SMS articles but, for review purposes, here is a summary of some basic reasons why one might want to assess the risk of hazards:

To better allocate limited resources by directing them to mitigating the most significant risks;

To have metrics that show when risks have been sufficiently mitigated;

To facilitate constant system improvement; and

To help judge your company’s progress on the safety continuum.

We typically assess risk in an SMS by assigning two values to each hazard. The first value is “likelihood,” (which we examined in the May article) and the second value is “consequence.” Together, they can provide a measure of the risk posed by a particular hazard.

Likelihood reflects the prospect that the hazard condition will manifest itself. Consequence reflects the potential harm if the hazard condition manifests. So, the likelihood that Earth will be struck by an asteroid is remote; but the consequence of such a strike would be devastating. On the other hand, the likelihood that the Earth will be struck by average-size meteors is high (it happens on a regular schedule as we pass through clouds of matter) but the consequence is low (they tend to burn up in the atmosphere before hitting the ground).

In the last article we relied on the FAA certification parameters as a baseline for establishing likelihood but then talked about ways to modify those descriptions to make them more relevant to a repair station environment. We are going to do something similar with “consequence.”

The reason that we assess different levels of consequences is so we can distinguish the hazards that cause significant danger from those that pose little danger. For example, the consequence of a hazard associated with a piece of interior decoration is typically quite low: a passenger might suffer a minor injury, or the hazard might cause an aesthetic issue. The consequence of an engine hazard that could cause an in-flight shutdown of an engine is dramatically higher. If we have two equally likely hazards, with vastly different consequences, then we typically want to first commit our resources to effectively mitigating the hazard that has a higher consequence.

Consequences are always judged based on the hypothesis that the consequence actually occurs, without a discount for likelihood. We have a separate parameter (likelihood) to account for the possibility of occurrence.

You have to define consequences, and levels of consequences, consistent with your own business needs and expectations. This also allows you to mitigate to the desired levels. You may change these levels as your safety management system develops, to continue to evolve the system to meet your safety needs. A good starting matrix for a repair station might look like this:

Notice that this matrix emphasizes consequences that adversely affect the repair station’s ability to support the customer’s aviation safety goals. The assumption in this matrix is that each consequence level is worse than the earlier levels.

As with likelihood, it is important to understand that even though we are assigning numbers to consequences, the measurement is really qualitative, rather than quantitative. This is because the numbers are assigned based on the SMS system that your company has developed. You typically cannot compare your numbers to the numbers generated by another company’s risk assessment.

Because different people can come up with differing opinions about reasonable consequences, a more objective standard can be valuable (so please do not assume that the consequence values in the above table reflect an ideal). When you are establishing consequence values and narratives, don’t be afraid to adjust them to suit the needs of your business (including the need to distinguish events that cause more harm to aviation safety from those that would have less impact to aviation safety). If you do adjust your values after you’ve started assessing risk, though, then you may need to re-analyze past risk assessments to update them to the new standard so you can compare hazards according to the same metrics.

The table includes five different levels of consequence. Your table may include more or fewer levels. The important thing is that the table you develop for your own system must distinguish among hazards in a way that is useful to your analysis of those hazards. For example, you do not want to make the levels of consequence so close together that it is impossible to tell in which category a hazard belongs.

In the next issue, we will look at how to use likelihood levels and consequence levels in order to arrive at a risk product that can be used to compare the relative risk of different hazards. Want to learn more? We have been teaching classes in SMS elements, and we have advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions. www.pmaparts.org.

SMS: Making It Useful Part Two Risk Assessment and Likelihood By Jason Dickstein

Safety Management Systems: they seem complicated. But Aviation Maintenance Magazine is aiming to make them simple to implement.

In the January issue, we examined some hazard identification strategies. Because of Covid-19, we diverged from our expected series to bring you news about Covid-19 legislation in the March issue. But now we’re back to SMS(!) and this month, we’ll begin looking at the process of using risk assessment to analyze our identified hazards. If you don’t remember how to identify hazards, then look back at the January issue to refresh your memory (it is available online).

The point of identifying hazards is to identify the things that could go wrong in your system. In the January issue, we suggested documenting hazards in a centralized and comprehensive hazard log. We specifically recommended using a database. A database will allow you to analyze trends in hazards, reference the mitigations associated with each hazard, and even serve as a tool for change management (we will address all of these in future articles). Before we can start tying our hazards to mitigations, though, we are going to first examine how to assess the risk posed by each hazard.

We assess risk for a number of reasons.

One reason for assessing risk is to better allocate limited resources. If you know that you have three hazards that you could mitigate, but you can only mitigate them one at a time, then having a mechanism for deciding which hazard is most important to address would help you to decide how to allocate your resources.

A second reason for assessing risk is to decide when you have done a sufficient job in reducing the risks posed by the hazard. By assessing risk, you can set a metric for when risks are considered to be adequately contained. This tells you, prima facie, when a mitigation is considered to be “good enough.”

A third reason for assessing risk is to permit the system to engage in constant improvement. If you assess the risk levels posed by a set of hazards, then you can mitigate the risks to the acceptable level that has been set by the company. Once the known hazards have all been mitigated to the acceptable level, the company can decide to pursue a higher level of safety by changing the acceptable level of risk! For example, if you create a system that assigns risk values to hazards, and you successfully build a system that mitigates all of the hazard-based risks to a value of 10 or less, then after achieving that goal, you might next seek to mitigate the risks valued at 9 and 10 to a value of 8 or less.

A fourth reason for assessing risk is to have a mechanism for judging your company’s progress on the safety continuum. By assessing and assigning numerical risk values to each hazard, you have an opportunity to record and assess the progress your company is making on its path toward safety. You can set risk-based goals (“performance indicators”) like reducing every risk below a certain metric or reducing the average of all risks in a system below a metric.
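As an illustration, risk-based performance indicators like these could be computed from a hazard log with a few lines of Python. The function name, the sample risk values, and the target values of 10 and 8 are hypothetical and are used only to show the idea.

```python
from statistics import mean


def performance_indicators(risk_values, target):
    """Summarize a list of numerical risk values against a target level."""
    return {
        "highest_risk": max(risk_values),
        "average_risk": mean(risk_values),
        # True once every hazard has been mitigated to the target or below.
        "all_at_or_below_target": all(v <= target for v in risk_values),
        # Risk values that still exceed the target and need further mitigation.
        "needing_mitigation": [v for v in risk_values if v > target],
    }


# Hypothetical risk values currently recorded in the hazard log
current_risks = [4, 9, 10, 6]

print(performance_indicators(current_risks, target=10))
print(performance_indicators(current_risks, target=8))
```

With a target of 10 every hazard in this sample passes; tightening the target to 8 flags the hazards valued at 9 and 10 for another round of mitigation.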

So what does it mean to assess risk?

We typically assess risk in an SMS by assigning two values to each hazard. The first value is “likelihood,” and the second value (which we’ll examine in next month’s article) is consequence. Together, they can provide a measure of the risk posed by a particular hazard.

Likelihood reflects the prospect that the hazard condition will manifest itself. The purpose of this assignment is to rank more likely occurrences higher than less likely occurrences. Therefore it is typically not an absolute measure of probability. The values used may vary based on the system, and the needs of the system. For example, in a manufacturing environment, you might assess likelihood values related to failures of the manufactured product based on probability of hazard occurrence per operational hour. In the FAA Certification system, a likelihood measured at one occurrence in less than 100,000 hours of operation is considered to be probable; while a likelihood measured at one occurrence in more than 1,000,000,000 hours of operation is deemed to be extremely improbable. These two metrics reflect the bookends of the likelihood range in an FAA certification project. The United States military uses safety management and deems a hazard to be probable if it will occur several times in a system, but has another value – frequent – which describes hazards that are likely to occur frequently in the life of a system. In other systems, the values may distinguish hazards that will certainly arise in the life of a system (100% chance) from those that are expected but may not arise 100% of the time, to those that are remote in the sense that they have not yet arisen but are nonetheless feasible.

The scale that you use should be tailored to the particular hazards in your system, and the best factors that will provide you with meaningful distinctions to permit useful differentiation among the hazards being analyzed. For example, FAA certification distinctions may not be appropriate for a repair station, because the repair station may want to identify hazards that happen every day and distinguish them from those that happen once per week and distinguish those from hazards that arise once per month. All three categories likely fall into the “probable” likelihood on the FAA certification scale but if they all fall into the same category then the likelihood metric is not being successfully used to distinguish them.

In a repair station environment, you will encounter hazards such as human factors issues that arise on a more regular basis than the basis described in the FAA Certification probabilities, so the FAA Certification range probably does not provide the appropriate metrics for judging the likelihood of hazards in a repair station. For purposes of this article, we shall use the rating scale in the chart above as our likelihood values.

Notice that these values are based on narrative descriptions, rather than hard numerical probabilities. This is because the typical repair station may be unable to classify its hazards based on strict numerical probabilities. A repair station will also have to consider the scope of the narrative descriptions (which may be based in part on the sources of hazard data). For example, if you are examining the failure of a particular OEM part, then the repair station’s experience may suggest it is a level 2 likelihood (“never has occurred but the hazard could reasonably occur”); but expanding the scope to include data from other repair stations might shift it to level 3 (“has occurred, and without mitigation, the hazard would probably occur less often than once per month OR never has occurred but the hazard is likely to occur in the future”).

Let’s say that the hazard in question is the release without final inspection of a unit that was subject to overhaul procedures. Let’s also say that this hazard is identified because it occurred in the facility. Because it actually happened, this automatically gives it a level 3, 4, or 5 likelihood (based on the definitions, above). It might be assigned a likelihood level based on past experience (if this has happened before, then the prior occurrence experience might help assign a likelihood level) or based on the intuition of the inspector responsible for the assignment. In this scenario, there is no precise answer, and therefore it makes sense to have one person or one group assessing the likelihood level in order to ensure that the assignments follow a reasonably standard pattern (so you do not have radically different assignments based upon different opinions of the narrative descriptions).

Because different people can come up with differing opinions about likelihood, a more objective standard can be valuable (so please do not assume that the likelihood values in the above table reflect an ideal). When you are establishing likelihood values and narratives, don’t be afraid to adjust them to suit the needs of your business (including the need to distinguish more-likely events from less-likely events). If you do adjust your values, though, then you may need to re-analyze past risk assessments to update them to the new standard so you can compare hazards according to the same metrics.

The table on the previous page includes four different levels. Your table may include more or fewer levels. The important thing is that the table you develop for your own system must distinguish among hazards in a way that is useful to your analysis of those hazards.

Your likelihood assessments should permit you to distinguish hazards based upon the difference in the likelihood. If likelihood was the only metric that you used, then this would permit you to focus first on the most likely hazards, and then save the less likely hazards to be mitigated later.

Likelihood is not the only metric we typically use to assess risk. Next month, we will examine the metric known as “consequence,” which will help us to distinguish the most damaging hazards from the less damaging hazards. Using likelihood and consequence together, we will be able to judge which hazards pose a greater risk to safety.

Part III of SMS Series Next Issue

In the next issue, we’ll look at the process of using “consequence” as part of our risk assessment, and we will look at how to evaluate our identified hazards in a risk assessment environment. Want to learn more? We’ve been teaching classes in SMS elements, and we’ve advised aviation companies in multiple sectors on the development of SMS processes and systems. Give us a call or send us an email if we can help you with your SMS questions.

New Paid Leave Laws, and a New Payroll Tax Credit

Congress has passed a number of new laws in response to Covid-19. One of them establishes two new forms of paid leave: paid parental leave and paid sick leave. This article explains how they apply and also explains the payroll tax credit that is used to offset the impact to businesses.

For those of you who’ve been following my series on implementing SMS in a repair station environment, don’t worry; I plan to return to that series in the next issue.

Parental Leave

The Families First Coronavirus Response Act established a new category of FMLA-protected leave (we will call it “Parental Leave”). This category of leave only applies to leave for a “qualifying need related to a public health emergency” that arises starting March 18. The statute explains what this really means:

“The term ‘qualifying need related to a public health emergency’, with respect to leave, means the employee is unable to work (or telework) due to a need for leave to care for the son or daughter under 18 years of age of such employee if the school or place of care has been closed, or the child care provider of such son or daughter is unavailable, due to a public health emergency.”

With schools closing across the nation, this means that anyone with children under 18 could be eligible for this sort of leave. To be clear: Parental Leave is for parents who have to care for their children because the children are at home. If the employee gets sick from Covid-19 then the employee is covered under the more traditional elements of FMLA.

Many aviation businesses are small businesses that haven’t had to comply with FMLA in the past because it excludes businesses with fewer than 50 employees. The new law includes those smaller businesses, so this will be new territory for many aviation companies.

Typically FMLA applies to a business that employs 50 or more employees for each working day during each of 20 or more calendar workweeks in the current or preceding calendar year. The way this is worded, an aviation business that has just gone from 60 employees to 6 employees because of the current crisis likely remains liable for FMLA through at least the end of this year (on the assumption that they had 50+ employees in 20+ weeks last year, but may not meet that threshold this year).

Parental Leave, on the other hand, applies to employers with fewer than 500 employees. So the largest companies are exempt from the new Parental Leave requirements, and it will instead only apply to small and medium-sized businesses.

For employers who have fewer than 25 employees (“Very Small Employers”), there are complicated exemption rules. If this applies to you, then you can see my blog (https://aviationsuppliers.wordpress.com) for more details about this.

Employees become eligible for Parental Leave under FMLA after only working for 30 days (so your newer employees will be eligible for Parental Leave, to care for their children, even though they are not yet eligible for other forms of FMLA protection).

This sort of Parental Leave may be paid leave. The pay, though, is offset by a payroll credit for the employer (see below). Parental Leave is unpaid for the first ten days of leave (two weeks for employees who normally work five days a week). But after those two weeks, the leave is paid leave! The pay must be at least two thirds of the employee’s regular rate of pay, and it must be paid for the number of hours the employee would normally be scheduled to work. There are also rules for calculating hours for employees with varying schedules.

There are upper limits. The maximum daily pay under this law is $200 per day and the maximum total pay is $10,000 in the aggregate. So if you are paying someone $200 per day under this law, then you would be obliged to pay up to 50 work days (a total of $10,000), or about ten weeks for an employee who normally works five days a week. You can pay more than this amount but anything more is not going to be subject to the payroll tax credit.
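The two-thirds rule and the two caps can be sketched in Python as follows. This is a simplified illustration, not payroll guidance: the function names are hypothetical, "regular daily pay" is assumed to mean the regular rate multiplied by the hours normally scheduled for that day, and the count of paid days is assumed to exclude the initial unpaid period.

```python
DAILY_CAP = 200.0         # maximum required pay per day
AGGREGATE_CAP = 10_000.0  # maximum required pay in total


def daily_parental_leave_pay(regular_daily_pay):
    """Two thirds of the regular daily pay, limited by the daily cap."""
    return min(regular_daily_pay * 2 / 3, DAILY_CAP)


def total_parental_leave_pay(regular_daily_pay, paid_days):
    """Required pay over a number of paid leave days, limited by the aggregate cap."""
    return min(daily_parental_leave_pay(regular_daily_pay) * paid_days, AGGREGATE_CAP)


# An employee who normally earns $300 per day hits the $200 daily cap.
print(daily_parental_leave_pay(300))       # 200.0
# Fifty paid days at the daily cap reaches the $10,000 aggregate cap.
print(total_parental_leave_pay(300, 50))   # 10000.0
```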

Paid Sick Leave

Starting no later than April 2, the new law also requires employers to provide 80 hours of paid Sick Leave to each employee. If your company already does this, then the law might not impose any new obligations on your business. If your company does not yet provide at least 80 hours of paid sick leave then you will need to make sure that you comply with the law.

There are six reasons that one is permitted to claim Paid Sick Leave under the new law (they are numbered because different categories receive different pay):

(1) The employee is subject to a Federal, State, or local quarantine or isolation order related to COVID–19.

(2) The employee has been advised by a health care provider to self-quarantine due to concerns related to COVID–19.

(3) The employee is experiencing symptoms of COVID–19 and seeking a medical diagnosis.

(4) The employee is caring for an individual covered under item (1) or (2), above.

(5) The employee is caring for his or her child, if the school or place of care of the child has been closed, or the child’s normal care provider is unavailable, due to COVID–19 precautions. This reason #5 may be subject to the paid Parental Leave protections as well.

(6) The employee is experiencing any other condition specified by the Secretary of Health and Human Services (so amendments are possible).

The new Sick Leave law becomes effective not later than April 2 (the wording of the Act is ambiguous so the Department of Labor could set an earlier effective date). It applies to any employer who is covered under the Fair Labor Standards Act, so it applies to almost everyone in aviation.

The law provides 80 hours of paid sick leave as of the effective date, so paid sick leave provided before the effective date probably does not count against the 80-hour obligation. For example, if your company normally provides 80 hours of sick leave per year, and you already paid an employee for 80 hours of sick leave that the employee used in January 2020, then that employee is likely entitled to another 80 hours of paid sick leave under the new law.

Paid Sick Leave is typically calculated based on the employee’s required compensation and the number of hours the employee would otherwise be normally scheduled to work, but this amount is also capped under the law. The maximum required dollar amount of Paid Sick Leave is limited, and the limits are based on the six categories shown above:

  • $511 per day and $5,110 in the aggregate for a use described in reasons (1), (2), or (3), above;
  • $200 per day and $2,000 in the aggregate for a use described in reasons (4), (5), or (6), above.
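
As a rough illustration of how the two sets of caps interact, here is a minimal Python sketch. The function name and its input (a list of the uncapped pay otherwise owed for each day of leave) are assumptions made for the example; the 80-hour limit and the rules for computing the uncapped amount are not modeled.

```python
# Caps from the article, in dollars: (maximum per day, maximum in aggregate).
CAP_REASONS_1_TO_3 = (511, 5110)
CAP_REASONS_4_TO_6 = (200, 2000)


def capped_sick_pay(reason, uncapped_daily_pay):
    """Total required Paid Sick Leave after the daily and aggregate caps.

    reason: the numbered reason (1 through 6) from the list above.
    uncapped_daily_pay: hypothetical input, one dollar amount per leave day.
    """
    if reason not in range(1, 7):
        raise ValueError("reason must be 1 through 6")
    daily_cap, aggregate_cap = (
        CAP_REASONS_1_TO_3 if reason <= 3 else CAP_REASONS_4_TO_6
    )
    total = 0
    for amount in uncapped_daily_pay:
        # Each day is limited by the daily cap and by what remains of the aggregate.
        total += min(amount, daily_cap, aggregate_cap - total)
    return total


# Ten days that would otherwise pay $600 per day, taken for reason (1):
print(capped_sick_pay(1, [600] * 10))
# The same ten days at $250 per day, taken for reason (4):
print(capped_sick_pay(4, [250] * 10))
```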

Many laws that protect employees require that information about the law be posted. Although parental leave is covered under the FMLA posting requirements, the new Sick Leave law requires a new poster, which can be downloaded from the Wage and Hours Division website (https://www.dol.gov/agencies/whd).

Payroll Tax Credit

If your business pays Sick Leave or Parental Leave under these provisions, then it can use the amount paid (subject to the limits described above) as a payroll credit. If you pay more than the statutory amounts, then you can only take a credit for the statutory amounts. If your payroll credit exceeds your payroll tax, then you can apply for an expedited refund from the IRS. If you take the payroll credit, then you will have to treat that money as (taxable) income.
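
The credit logic in this paragraph can be sketched as follows. This is a simplified illustration, not a description of any IRS form or procedure: the function and field names are hypothetical, and the statutory maximum is passed in as an input rather than computed.

```python
def payroll_credit(leave_paid, statutory_max, payroll_tax_due):
    """Split leave pay into the creditable amount and any refund to request.

    All three parameters are hypothetical dollar inputs for illustration.
    """
    # Only the statutory amount is creditable, even if the employer paid more.
    credit = min(leave_paid, statutory_max)
    # The credit first offsets payroll tax; any excess can be claimed as a refund.
    applied_to_tax = min(credit, payroll_tax_due)
    refund_claim = credit - applied_to_tax
    return {
        "credit": credit,
        "applied_to_tax": applied_to_tax,
        "refund_claim": refund_claim,
    }


# An employer paid $12,000 of leave against a $10,000 statutory limit
# and owes $7,500 in payroll tax.
print(payroll_credit(12_000, 10_000, 7_500))
```

In this example the extra $2,000 paid above the statutory limit earns no credit, and the credit itself still has to be treated as income.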

Both of these new forms of paid leave expire on December 31, 2020, so they will continue to apply through the end of the year. If you find yourself needing to provide Paid Parental Leave or Sick Leave under the new law, then please be sure to look at the law or get appropriate legal advice because there are additional details in the law that may apply to your specific fact pattern!

Traceability, Evidence, and Trust in the Aircraft Parts Industry

This is an article about complying with maintenance regulations. We rely much more on traceability and documentation, today, than we did twenty years ago. But as we rely more on traceability, it is important to reflect on why we rely on traceability, what is the purpose of the traceability, and based on these first two factors, what traceability should be acceptable.

Answering these three questions is an important exercise in identifying the right evidence to use in ascertaining an aircraft article’s airworthiness.

It is important to recognize that whether an article is airworthy is a simple binary function. It is or it isn’t. If we lose the documentation for the article, then this does not make the article unairworthy; but it may make it more difficult to prove the airworthiness. This simple binary function becomes more complicated when we consider the wide range of evidence used to demonstrate airworthiness. This use of evidence – this reliance on aviation documentation – is really an exercise in abstraction.

There are many areas of study that examine levels of abstraction. Philosophy of Language studies the difference between a word, the denotation of the word and the connotation of the word. Computer science languages are abstractions that translate into machine code written in ones and zeros. Numbers are abstractions that can stand for a quantity of physical objects. Even words are abstractions. The word “computer,” used earlier in this paragraph, is an abstraction that denotes the thing you use for accessing online parts databases. We know what the word “computer” means, and we also know that the word cannot be confused with an actual computer.

Aviation mechanics deal with abstractions on a daily basis. When an aviation mechanic installs an aircraft article, that mechanic must ensure that the article installed will return the aircraft to a condition at least equal to original condition. That means the installation has to be airworthy.

It is possible to ensure airworthiness through direct measurement. We can measure the dimensions, metallurgical properties, and other key airworthiness properties of an aircraft article to verify its airworthiness. We can also rely on system elements; for example, we can rely on the fact that a production approval holder is not allowed to release an aircraft article from its quality system unless the article is airworthy. And when we start to rely on system elements, and the traceability associated with those elements, we begin to engage in an exercise in abstraction, in which we rely on someone else’s documentation of a fact, rather than our own direct personal knowledge of that same fact.

How do we know that an aircraft article was released from a quality system? Typically, the installer has not seen the article released from the manufacturer’s quality system; instead, we rely on other evidence to demonstrate this fact. This sort of evidence can be based on someone with direct personal knowledge, like a production approval holder’s certification, or someone with indirect knowledge, like an air carrier that assured the airworthiness of the article at the time it was received by the carrier.

When we buy dish soap in the store, we typically don’t worry about whether it will work; we assume that the dish soap will function as expected. But airworthiness is so important that we have historically asked for evidence to support the allegations of airworthiness for aircraft articles.

The FAA’s rules specify a performance standard – namely that the installation must return the product to a condition at least equal to its original or properly altered condition; however, the FAA’s regulations do not specify what evidence is sufficient to prove the airworthiness of the article.

We can contrast this with the European aviation safety regulations managed by EASA. Those regulations specify the sort of evidence that must be received: in the European system, most aircraft articles need to be accompanied by an EASA Form 1. Other forms of evidence are thus typically insufficient (but implementations can vary).

In the United States system, there is no hard standard for what evidence is sufficient and what evidence is not sufficient. The FAA’s chief counsel’s office has made it clear that the evidence of airworthiness can be generated through test and analysis of an article (to make sure it meets the appropriate airworthiness standards) or through reliance on other persons. One source of evidence can be manufacturers. In recent years, the FAA has changed its rules to permit production approval holders (PAHs) to issue the FAA Form 8130-3 as a “birth record” for new articles. This is great for newer articles, but the authority did not exist until recently, and many older articles were not documented at birth with 8130-3 tags.

We can still rely on the regulatory structure that requires the PAH to ensure airworthiness of articles before they are released from the quality assurance system. This means if you buy an aircraft article direct from the PAH, then you know that it was airworthy. And if the chain of commerce suggests that the article was released by the PAH, then this also provides evidence of airworthiness. This can be accomplished using something other than back-to-birth traceability (back-to-birth is, of course, a commercial norm for life-limited parts but is usually not appropriate for other aviation articles).

FAA guidance has also suggested that other forms of PAH evidence may be acceptable to “provide evidence that an article was produced by a manufacturer holding an FAA-approved manufacturing process.” This includes PAH documents such as shipping tickets and invoices. It also includes unregulated PAH markings, like standard inspection stamps. The industry has also relied on commercial features like PAH packaging to provide evidence of source.

Aviation has a tradition of relying on evidence to demonstrate airworthiness, and a corollary tradition of relying on trusted sources, like certificate holders, to provide that evidence. This has meant that we rely on the accuracy of statements from certificate holders to support our airworthiness findings. This evidence can come from manufacturers, from repair stations, and even from air carriers. As certificate holders, we trust their statements. So, if an air carrier surpluses an article, and provides evidence that the article is new, and was produced by a particular PAH, then we have a tradition of trusting that evidence. This evidence has often taken the form of a statement from the air carrier, like a packing list identifying the identity and condition of the articles in a surplus lot. The industry’s trust is based, in part, on the fact that the government typically approves the air carrier’s receiving inspection system (the FAA has an entire advisory circular explaining what the system should look like).

How important is trust to our industry? It is so important that we reserve the most stringent punishments for those who violate that trust.

Falsehoods have traditionally reflected a disqualification to hold an FAA certificate. Even if this was not evident from case law, I would know it because Judge Geraghty of the NTSB administrative law court would remind me of this during our encounters – he always felt it important to remind the industry that inaccuracy and falsehood are the most terrible of sins in the aviation world. When I was a young lawyer, the Judge would explain to me that the FAA relies on documentation to perform its oversight functions. If that documentation is not accurate, or is misrepresented, then this undermines the essence of the FAA’s oversight role.

Violating this trust through fraud or misrepresentation yields severe penalties. It can lead to a lifetime ban from the industry, and criminal penalties can range from 15 years for merely misrepresenting the quality of an aircraft article to life in prison if that same article malfunctions.

Our aircraft articles system is based on this sort of trust. We trust that FAA-approved manufacturers will produce airworthy articles. In the United States, this trust is based on the FAA’s approval of the design (which verifies that the design meets FAA safety standards), and the correlative FAA approval of the production system (which verifies that the production quality system is sufficient to ensure that articles produced under the system will meet the FAA-approved design). We trust that certificate holders will provide accurate statements about the articles that were received into their systems.

Over the past twenty years, the distributor accreditation system has infused an added element of trust into the system. Accredited distributors typically pass along necessary elements of the evidence that they receive to the next partner in the chain of commerce; they also retain all received documents in their system in order to maintain the audit trail. Although traditionally the accredited distributor has not made airworthiness determinations, the FAA has started to nominate FAA designated airworthiness representatives (DARs) who are able to work in an accreditation environment, review the information, and issue an 8130-3 tag on behalf of the FAA when the evidence is sufficient to show that the article is airworthy.

The system for confirming the airworthiness of aircraft articles at the time of installation remains an evidence-based system. Sources like the FAA’s accreditation advisory circular provide guidance about what sort of evidence may be acceptable, but ultimately the installer must decide what sort of evidence is credible.

Next time you look at a traceability document, think about who you are trusting and why.


Disclosure: Jason Dickstein is the General Counsel of the Aviation Suppliers Association, and was a member of the EASA rulemaking committee described in this article.

New Part 145 Requirements for Supplier Evaluation

Is your Part 145 organization evaluating its suppliers? A new European rule now requires Part 145 organizations to engage in supplier evaluation.

The new standards offer a number of different strategies for accomplishing this evaluation, so it doesn’t have to be a burden; but it has become a requirement, so Part 145 organizations ought to ensure they are meeting the applicable requirements in all cases.

Many repair stations appear to have missed this change, because it was introduced without much fanfare. It starts with a minor change in the regulations that was published last summer. In August, the European Commission issued a new rule that required repair stations (EASA Part 145 organizations) to

“establish procedures for the acceptance of components, standard parts and materials for installation to ensure that components, standard parts and materials are in satisfactory condition and meet the applicable requirements” EASA 145.A.42(b)(i).

At first blush, this does not seem to foreshadow any change at all. Everyone has procedures for accepting components. But when you get into the official guidance material, the meaning of these words becomes more plain.

The official guidance material was issued in a March 28 EASA Executive Decision (ED Decision 2019/009/R), and it was made more widely available when it was published in the EASA Easy Access Regulations in April.

First, the guidance clarifies that

“[f]or the acceptance of components, standard parts and materials from suppliers, the [] procedures should include supplier evaluation procedures.” AMC1 145.A.42(b)(i), section (b).

If you look at this requirement without continuing to read the rest of the guidance, then this takes a small burden (acceptance procedures) and turns it into a seemingly huge burden (supplier auditing). But the EASA regulators are clever, and they provided mechanisms to protect aviation safety by using existing industry best practices.

How Did We Get Here?

Supplier evaluation matters because the supply chain can be a source of safety jeopardy. This was recognized globally in the 1980s and 1990s. In the 1990s, the United States FAA created the Voluntary Industry Distributor Accreditation Program (“Accreditation Program”). This voluntary accreditation program encouraged aircraft parts suppliers to adopt quality system standards that protected the airworthiness of parts, and that helped to support traceability. The program endorsed quality assurance standards and identified qualified audit organizations that could audit and confirm compliance to the voluntary standards. Suppliers that chose to implement the voluntary standards, and who received an independent audit confirming compliance, could be listed on an FAA database of accredited suppliers.

At about the same time, Europe’s Joint Aviation Authorities (JAA) recognized the value in inventory control and issued Temporary Guidance Letters (TGLs) addressing subjects like inspecting parts and controlling suppliers through effective evaluation. EASA effectively supplanted the JAA, but it took on some of the JAA initiatives including supplier control.

About ten years ago, EASA formed a rulemaking committee to investigate options for supplier control. The EASA committee independently created its own list of preferred quality system elements, and then compared them to the list found in the FAA’s Accreditation Program. The lists matched.

After significant research and discussion, the EASA committee concluded that the existing FAA Accreditation Program had become global in scope (and was used extensively in Europe), and they concluded that the Program adequately served the industry’s safety needs for supplier control (through supplier evaluation).

The long path from EASA recommendation to European Commission regulation resulted in the new regulation highlighted at the beginning of this column (EASA 145.A.42(b)(i)).

How Do I Comply?

As I said, the EASA Acceptable Means of Compliance (AMC) guidance appears to impose a huge new obligation on repair stations to evaluate suppliers. But EASA has offered an easy way to meet this evaluation obligation, by relying on the existing infrastructure for supplier evaluation.

EASA guidance material (GM3 145.A.42(b)(i)) explains how to evaluate suppliers. It explains that the 145 organization should ensure that the supplier’s quality system successfully incorporates certain specific elements. It explains that suppliers who meet those elements are considered acceptable, and that there are four standards that incorporate all of the necessary elements. The four standards are AC 00-56, ASA-100, AS/EN9120 and EASO 2012, and a supplier that is known to meet one of those standards is considered acceptable.

This means that a 145 organization can rely on the third-party evaluation of a supplier that was audited to such a standard, and the 145 organization does not have to expend resources on its own independent supplier evaluation.

The easiest of these standards to use is AC 00-56, because you can find the FAA’s list of accredited suppliers online at https://www.aviationsuppliers.org/FAA-AC-00-56B. This is the list that most commercial aviation installers use when looking for aircraft parts suppliers, and it is easy to use for searching and for identifying important information (like certificate expiration dates).

What do you do about suppliers who are not accredited to a recognized standard? You can encourage them to get accredited (this makes it easy for you) or you can evaluate them yourself. The EASA guidance material clarifies that a 145 organization’s level of supplier surveillance will vary based on the hazards and risks posed by the supplier. A supplier with little to no effect on safety could be given a simple paper checklist, but a supplier who provides safety-sensitive parts may need to be subject to periodic live audits. The burden associated with maintaining an external auditing force is one reason why most 145 organizations will opt to rely on third-party accreditation.
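
For organizations that track suppliers in software, the decision path described above might be sketched like this. It is a deliberately simplified illustration: the function name, the returned labels, and the two-level safety flag are all assumptions made for the example, and real surveillance levels would be graded against the hazards and risks each supplier poses.

```python
# The four standards named in the EASA guidance material, as listed in the article.
RECOGNIZED_STANDARDS = {"AC 00-56", "ASA-100", "AS/EN9120", "EASO 2012"}


def evaluation_plan(supplier_standards, safety_sensitive):
    """Suggest an evaluation approach for one supplier (hypothetical helper).

    supplier_standards: standards the supplier is known to be audited to.
    safety_sensitive: True if the supplier provides safety-sensitive parts.
    """
    if RECOGNIZED_STANDARDS & set(supplier_standards):
        # A third-party audit to a recognized standard can be relied upon.
        return "rely on third-party accreditation"
    if safety_sensitive:
        return "periodic live audit"
    return "paper checklist"


print(evaluation_plan(["ASA-100"], safety_sensitive=True))
print(evaluation_plan([], safety_sensitive=True))
print(evaluation_plan([], safety_sensitive=False))
```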

Disclosure: Jason Dickstein is the General Counsel of the Aviation Suppliers Association, and was a member of the EASA rulemaking committee described in this article.