Reliability, one of the core disciplines in technical statistics, begins with the definition of product goals, evaluates the reliability evidence, and extends to forecasting lifetime.
Accordingly, there is a powerful methodological canon for these topics for use across the overall product life cycle. However, caution is advised when using these methods, simply because the result of a statistical transformation is “only” a function of the input. The scope, quality, and reliability of the available data are therefore critical for the quality of the results. Knowledge of the world outside of the data is missing in classical statistical procedures. This is why the results often do not fit so well with reality. But there are solutions for this too.
In the following, we will show statistical applications for the analysis of the past, present, and future (part 2). We will clear the path of methodological stumbling blocks or at least install warning signs to prevent belly flops.
About using statistics to understand the past and the present
Statistics is a data science. Data are unpleasant. They permanently cost money, they are expensive to keep, and without processing they are worthless. This is why statistics processes data: to distil information from them. This is exactly what we need as a basis for decisions.
In the field of reliability, we use statistical methods to create a quantitative picture of the present, i.e., the current state of reliability. This involves analysing databases containing failures in the field. Ideally, this data can be connected with other sources, e.g., maintenance documentation or spare parts management (CMMS).
Analyses of downtime events over time deliver valuable input for identifying quality or lifetime problems. The two problem types can be told apart by plotting the downtime events against operating time or performance: reliability problems increase over time, whereas the opposite is the case for quality problems. This distinction should prevent quality topics from being treated with reliability methods and vice versa. The clarification is important because the difference is often ignored in the problem-resolution process. Statistics must therefore put the team on the right track so that it is not sidetracked during problem resolution.
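One common way to make this distinction quantitative is the shape parameter of a Weibull distribution: a shape below 1 indicates a falling failure rate, a shape above 1 a rising one. The following is a minimal sketch, not a method prescribed in this article. It assumes complete (uncensored) failure times and uses median rank regression with Bernard's approximation; the function names, the tolerance band, and the sample data are illustrative assumptions.

```python
import math

def weibull_shape(failure_times):
    """Estimate the Weibull shape parameter by median rank regression.

    Assumes every unit in the sample has failed (no censoring).
    """
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        median_rank = (i - 0.3) / (n + 0.4)  # Bernard's approximation
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - median_rank)))
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx  # slope of the Weibull probability plot

def classify(shape, tolerance=0.1):
    """Map the shape parameter to a problem type (tolerance is an assumption)."""
    if shape < 1.0 - tolerance:
        return "falling failure rate: points to a quality problem"
    if shape > 1.0 + tolerance:
        return "rising failure rate: points to a reliability (wear-out) problem"
    return "roughly constant failure rate: random failures"

# Made-up operating hours to failure, for illustration only.
hours = [410, 620, 780, 905, 1040, 1190, 1370]
beta = weibull_shape(hours)
print(round(beta, 2), classify(beta))
```

With real field data the estimate should come with a confidence interval, and censored units need a method that accounts for them.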
If we want to learn from the past, then we expect “uncensored” data that represent the entire product lifetime. This case does exist but can almost only be found in textbooks. Field data are normally censored because customers tend not to service their products in contractual workshops once the warranty has expired, or the maintenance activities are not documented in detail, or the documentation is not suitable for reliability purposes. There are many causes. Still, these data represent the empirical basis of the downtime events, and relevant aspects of reliability can often be evaluated from them better than expected.
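To illustrate how censored field data can still be evaluated, here is a minimal sketch of the Kaplan-Meier estimator, a standard non-parametric technique for right-censored lifetimes. It is one possible approach, not necessarily the one used in practice here. The function name and the sample data are assumptions; a unit counts as censored when it leaves observation without a recorded failure, for example at the end of the warranty.

```python
def kaplan_meier(times, failed):
    """Return [(time, survival probability)] at each observed failure time.

    times  : time at which each unit failed or left observation
    failed : True if the unit failed, False if it was censored
    """
    data = sorted(zip(times, failed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Collect all units that fail or are censored at the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures > 0:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored units drop out without a step
    return curve

# Made-up example: years in service, two units censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for t, s in curve:
    print(t, round(s, 3))
```

Censored units still contribute information: they remain in the at-risk count until they leave observation, which is why discarding them would bias the result towards shorter lifetimes.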
First of all, it becomes evident which components suffered from failure or problems. They are not always the ones that are exchanged. Cheaper, quickly replaceable components do not attract attention even with high failure rates. Organisations carry quality problems as ballast in their maintenance programs for many years without noticing them. Statistics brings this to light with little effort, decides whether it is a quality or a reliability topic, and generates appropriate suggestions. These recommendations do not always correspond to conventional patterns of action. For example, the preventative exchange of (non-defective!) parts towards the end of their lives is not a simple task for all maintenance teams.
Lifetime analyses also show a series of other influences. Seasonal fluctuations in failure rates provide clues towards temperature effects or corrosion. Locations, operating styles, climate effects can all be measured.
Statistics uncovers the effect of maintenance on the failure rate, the relationship of maintenance to repair, and any driver influence; in short, every influence that has left a trace in the data.
Field data are analysed with statistical procedures for product development and the validation of reliability. This results in a use space that tells us in what respect and how strongly the use styles of a product vary. This also results in information describing which customer types or operating styles are critical for which failure risks.
This analysis also provides an answer as to whether the plethora of use styles can be grouped according to the load profiles. This would enable test variants to be selected that deliver information representative for the entire cluster.
The use space also provides the evaluation criteria for the reference load collective. We expect this collective to represent critical customers, so it is located towards the edges of the use space, whereas the average customer is located in the centre. This set of load cycles forms the reference for the evaluation of tests that meet the requirements.
These tests consist of representative and accelerated tests. Representative tests can take a seat adjacent to the average customer in the use space. The accelerated tests, however, can calmly ignore the limits of the use space and face the storms of overload on the horizon. We even accept that their samples are lost. That’s what we call failure-oriented testing. From a statistical point of view, it is very efficient.
The expected reliability represents a significantly better evaluation of reality.
Some limitations with statistics and how we deal with them
The classical statistical approach assumes that we know nothing of the world and only have data such as operational performance and failure rates over the course of product development. This can be used to calculate the proof of reliability, but the actual product reliability is considerably higher. The statistical confirmation is only a measure of the worst possible product reliability that is still compatible with the data used.
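A widely used instance of such a worst-case statement is the success-run relation: after n failure-free tests, the reliability demonstrated at confidence level C is R = (1 - C)^(1/n). The sketch below assumes independent pass/fail tests without failures, each covering one full target lifetime; the function names are illustrative and the article does not state which formula its proof of reliability uses.

```python
import math

def demonstrated_reliability(n_tests, confidence):
    """Lower reliability bound after n_tests failure-free tests."""
    return (1.0 - confidence) ** (1.0 / n_tests)

def required_tests(target_reliability, confidence):
    """Failure-free tests needed to demonstrate the target reliability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(target_reliability))

print(round(demonstrated_reliability(22, 0.90), 4))
print(required_tests(0.90, 0.90))
```

The bound depends only on the number of tests and the confidence level, which shows why it says little about how good the product actually is: a far more reliable product that passed the same tests would receive the same number.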
In fact, this result does not really fit in with reality since designers and developers actually know a lot about their product before it even exists. If we extend the statistical approach with this knowledge, then the evaluation of this “expected reliability” aligns much more closely with reality.
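One way to bring such prior knowledge into the calculation is a Bayesian update. The sketch below is an illustrative assumption, not the specific procedure behind the “expected reliability” mentioned here. It encodes prior knowledge as a Beta(a, 1) distribution on the reliability R, where a larger a expresses more confidence in the design, and combines it with n failure-free tests. The posterior is then Beta(a + n, 1), whose distribution function is R^(a + n), so the lower bound has a closed form. The parameter name prior_strength is hypothetical.

```python
def reliability_bound_with_prior(n_tests, confidence, prior_strength=1.0):
    """Lower credible bound on reliability after n_tests failure-free tests.

    prior_strength = 1.0 is a uniform prior (no prior knowledge).
    Larger values act like additional failure-free tests known in advance.
    """
    return (1.0 - confidence) ** (1.0 / (prior_strength + n_tests))

def posterior_mean(n_tests, prior_strength=1.0):
    """Mean of the Beta(prior_strength + n_tests, 1) posterior."""
    a = prior_strength + n_tests
    return a / (a + 1.0)

# Same 22 failure-free tests, without and with assumed prior knowledge.
print(round(reliability_bound_with_prior(22, 0.90), 4))
print(round(reliability_bound_with_prior(22, 0.90, prior_strength=11.0), 4))
```

The choice of prior has to be justified by actual engineering evidence such as predecessor products, simulations, or component tests; an optimistic prior without such backing simply moves the result upwards.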
This enables us to deliver a lot of useful information for difficult decisions, for example the required provisions for the warranty period, or the test program with the best cost-benefit ratio. We can determine the cost risk of an extended warranty and also the effort required to lower it.
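As a rough illustration of the warranty question, the sketch below estimates the provision as units sold times the expected fraction failing within the warranty period times the cost per claim. It assumes a Weibull lifetime model and at most one claim per unit; all function names, parameter values, and costs are made up for the example and are not taken from this article.

```python
import math

def weibull_failure_fraction(t, shape, scale):
    """Fraction of units expected to fail by time t under a Weibull model."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def warranty_provision(units_sold, warranty_time, shape, scale, cost_per_claim):
    """Expected warranty cost, assuming at most one claim per unit."""
    fraction = weibull_failure_fraction(warranty_time, shape, scale)
    return units_sold * fraction * cost_per_claim

# Hypothetical product: shape 2.0, characteristic life 12 years.
base = warranty_provision(10_000, 2.0, 2.0, 12.0, 150.0)
extended = warranty_provision(10_000, 3.0, 2.0, 12.0, 150.0)
print(round(base), round(extended), round(extended - base))
```

The difference between the two figures is the cost risk of the extended warranty under these assumptions; repeating the calculation with an improved lifetime model shows what a reliability improvement would be worth.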