Repairing fences instead of catching chickens

Failure affinity instead of failure aversion

Reliability problems and failures usually occur several times along the development chain. They are often played down, explained away or kept quiet: the tests were excessively hard, prototypes had to be used as test specimens, the test bench probably had a problem, and so on. Any of these may be true, of course. Such concerns must therefore be settled before the respective test activity starts, not after a failure has occurred. Otherwise the indicators are ignored, and the weak points migrate with the development process into the next generation and, after the start of production (SoP), to the customer, where they show up as serial defects.

Failures are indicators of a lack of understanding. They bring new weak points to light, reveal unexpected operational load cases, unintended control phenomena or inadequate quality assurance, or expose poorly coordinated development of hardware and software. Failures were and are the reason for investigating unknown damage mechanisms. They are therefore an efficient driver of increased reliability across the entire mechatronics industry. But cases of damage interfere with the proof of reliability: the goal of gaining knowledge competes for time with the goal of providing proof. This conflict must be dealt with, e.g. in the steering meetings of a development project, where failures and indicators of weak points should be in focus alongside the completed test hours.

All methods of reliability growth assume that the causes of failures are clarified and that the necessary changes are implemented quickly in the subsequent generation. In the planning of product validation, on the other hand, the volume of endurance runs is often used as the relevant evaluation parameter. In practice, however, the quality and speed of forensic work are the more important parameters, because you can't buy time:

At the end of the validation interval, a product goes into series production. If it has to, it does so together with the problems that could not be solved sustainably in that time. To make matters worse, recurring failures terminate the endurance runs prematurely, before further weak points can manifest themselves as failures. These hidden problems therefore go into series production as well. Quick-and-dirty measures can thus be understood as a method of generating unreliability. Of course, no one does this on purpose, and there are countermeasures.

If you want to understand a problem, you should tackle it with all methods.

A major hurdle for problem solving, the division of knowledge among specialist departments, is overcome by setting up a mixed problem-solving team. Its way of working is, analogous to agile development, interdisciplinary, non-hierarchical and result-driven. Covering all potential types of causes is central to the composition of such teams. In addition to design, simulation and experimental development, quality and production must therefore be represented. Problem solving should be controlled by a process, e.g. the 8D process, which is supported by our software Uptime SOLUTIONS.

Statistical analysis of the failure cases is a powerful tool. It usually provides a reliable diagnosis of whether the problem is one of quality or of lifetime, whether the cause is seasonal in nature, whether certain modes of operation are causing the damage, and so on.
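One common way to approach the quality-versus-lifetime question is a Weibull analysis of the failure times. The following is a minimal sketch, not a description of any particular tool: it assumes complete (uncensored) failure data, uses median rank regression with Benard's approximation, and the function names and the tolerance band around a shape parameter of 1 are illustrative assumptions. A shape parameter clearly below 1 points to early failures, which are typical of quality problems, while a value clearly above 1 points to wear-out, i.e. a lifetime problem.

```python
import math


def weibull_fit(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete data: every unit in the sample has failed.
    """
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for rank, t in enumerate(times, start=1):
        # Benard's approximation of the median rank of the rank-th failure
        f = (rank - 0.3) / (n + 0.4)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))

    # Ordinary least squares for y = beta * x + intercept
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    eta = math.exp(-intercept / beta)  # since intercept = -beta * ln(eta)
    return beta, eta


def classify(beta, tolerance=0.2):
    """Rough interpretation of the shape parameter (tolerance is an assumption)."""
    if beta < 1.0 - tolerance:
        return "early failures (quality problem suspected)"
    if beta > 1.0 + tolerance:
        return "wear-out (lifetime problem suspected)"
    return "random failures (no clear trend)"
```

With only a handful of failures the estimate of the shape parameter is very uncertain, so the result should be read as a hint for the forensic work and not as a verdict.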

In practice, the critical issue is that engineers and technicians with a high level of expertise are not available to the necessary extent. Bringing in external support solves this problem quickly and effectively. Even more important, however, is the view from the outside. It exposes the blind spots, i.e. aspects that have receded into the background through long preoccupation with the matter or that have not been sufficiently investigated for other reasons.

The results of damage analyses are primarily used to solve a specific problem sustainably. In addition, they can be used for future product development, for preventive plant operation or for optimized plant maintenance, provided that damage indicators are derived from the analyses. These indicators are used to detect precursor effects so that maintenance can be initiated before failure. They also provide the input for automated diagnostics. On this basis, the remaining service life of damaged components can finally be determined. This can be automated for risk-focused plant analysis in our Uptime HARVEST software.
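How a damage indicator can lead to a remaining-life estimate can be illustrated with a deliberately simple sketch. It is not the method used in Uptime HARVEST. It assumes that the indicator (for example a vibration level) grows roughly linearly with operating hours and that a failure threshold is known from earlier damage analyses. The function name, the parameters and the linear trend model are all illustrative assumptions.

```python
def remaining_life(hours, indicator, failure_threshold):
    """Extrapolate a linear indicator trend to the failure threshold.

    Returns the estimated remaining operating hours after the latest
    measurement, or None if the indicator shows no upward trend.
    """
    n = len(hours)
    mean_h = sum(hours) / n
    mean_v = sum(indicator) / n

    # Least-squares line: indicator = slope * hours + intercept
    shh = sum((h - mean_h) ** 2 for h in hours)
    shv = sum((h - mean_h) * (v - mean_v) for h, v in zip(hours, indicator))
    slope = shv / shh
    intercept = mean_v - slope * mean_h

    if slope <= 0:
        return None  # no growth, so no threshold crossing can be predicted

    hours_at_threshold = (failure_threshold - intercept) / slope
    # Clamp at zero: the threshold may already have been exceeded
    return max(0.0, hours_at_threshold - max(hours))
```

Real damage mechanisms are rarely linear, so a trend model of this kind would have to be replaced by one that matches the damage mechanism identified in the analysis.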

Damage analyses are therefore useful beyond the specific case for reliability throughout the entire product life cycle: for efficient development, preventive operation, early diagnostics and accurate forecasting.

So what to do?

The failure rate in product development and fleet operation increases sharply with the degree of innovation of a product. The degree of innovation should therefore be assessed, e.g. via the Technology Readiness Level (TRL) scale developed by NASA.

The capacity of dedicated problem-solving teams should be scaled accordingly, ideally supplemented by external partners and experts.

The validated results and consequences should be incorporated into a central knowledge base. They can be used in a variety of ways: for product development, for preventive plant operation, for system monitoring and for preventive maintenance.
