Copyright @2020.
All Rights Reserved.

StatsToDo: Quality Statistics: Summary and Introduction


Quality control and its statistics are huge subjects. The pages on this site do not attempt to cover more than a small part of them, namely the methods most frequently used in health care.

This page lists some of the statistical tools from this site that can be used to support quality control.

Quality control itself encompasses many activities. In planning, it involves ensuring the resources and structure of the organisation, setting up proper policies and protocols, establishing lines of management control, providing training, selecting staff, and so on. In management, it involves detecting and analysing unexpected events, reviewing outcomes, and cultivating attitudes and culture.

Statistics in quality control consists of statistical tools that can be used to support these activities.

StatsToDo provides 4 sets of statistical tools that can be used in quality control.

  • How to establish bench marks
  • How to measure conformity to the bench mark
  • How to detect a drift away from the bench mark
  • How to determine a statistically significant departure from the bench mark

Bench Mark

The measurement of quality is often against a bench mark, and the important issue here is how this bench mark is established. One way is to define it arbitrarily, setting it against an ideal that one hopes to achieve. Examples are "50% of those on the waiting list will be seen within 2 weeks", "the Caesarean Section rate will be less than 25%", and so on. Unfortunately, arbitrarily set bench marks are often unrealistic and may be unachievable.

Another approach is to discover the current level of quality, and use this to decide on an appropriate level that is achievable. Most of the statistical tools designed for surveillance and discovery can be used, but this site provides the following:

  • Precision.php is an analysis of variance program often used to evaluate the precision of normally distributed measurements. For example, when establishing the expected 95% confidence interval for a measurement of a particular chemical
  • SSizSD.php provides sample size requirements and precision evaluation for assessing the Standard Deviation of a population. For example, to establish the variation in weight of a product packaged by a particular machine
  • SSizMean.php provides sample size requirements and precision evaluation for assessing the mean values of populations. For example, the volume of blood loss in a particular operation
  • SSizProp.php provides sample size requirements and precision evaluation for assessing the proportion of a population with a particular characteristic. For example, the proportion of patients with wound infections following a particular operation
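
The sample size idea behind the last two programs can be sketched with the usual normal-approximation formulas. This is a minimal illustration only and is not necessarily the calculation used by SSizMean.php or SSizProp.php. The function names and the example figures (a 5% infection rate estimated to within ±2%, and a blood loss SD of 100 estimated to within ±25) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected_prop, half_width, confidence=0.95):
    """Cases needed so that the confidence interval of a proportion is
    roughly expected_prop +/- half_width (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_prop * (1 - expected_prop) / half_width ** 2
    return math.ceil(n)


def sample_size_for_mean(sd, half_width, confidence=0.95):
    """Cases needed so that the confidence interval of a mean is
    roughly mean +/- half_width, given an assumed standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / half_width) ** 2)


# Hypothetical examples: wound infection rate and operative blood loss
n_prop = sample_size_for_proportion(0.05, 0.02)
n_mean = sample_size_for_mean(100, 25)
print(n_prop, n_mean)
```

A narrower half width or a higher confidence level increases the number of cases required, which is the trade-off to settle before a bench mark is surveyed.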


Once a bench mark is established, it is important to know whether an institution, a group, or an individual conforms to it. The idea is to sample the performance of the institution, group or individual, and the aim is to minimise the number of samples necessary so that a decision can be made as soon as possible.

Examples in industry are whether a batch of bullets delivered to the army has a defective rate below specification, or whether a batch of eggs delivered to the store conforms to the minimum size required. In medical care, it may be whether a newly introduced procedure exceeds a prescribed failure rate, or whether the blood loss in a particular operation exceeds that expected.

The need to sample an unknown batch, and to draw conclusions with confidence yet as quickly and as cheaply as possible, drove the development of quality control statistics. SeqSPRT.php contains 2 programs based on methods developed by Wald for the quality control of supplies during WWII.

  • The first program tests whether a batch of product conforms to a specified measurement. For example, whether a packing machine delivers packages of 1±0.05 kg of sugar
  • The second program tests whether the proportion of defective items conforms to that specified. For example, whether the proportion of bullets that fail to fire is less than 1 in a thousand
  • In both tests, cases are sampled sequentially until a decision can be made, and the tests are designed to reach a decision with the minimum number of samples
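
The sequential approach for proportions can be sketched as follows, using the textbook form of Wald's sequential probability ratio test. This is an assumption about the general technique and not a copy of the SeqSPRT.php program. The function name, the acceptable rate `p0`, the unacceptable rate `p1`, and the error rates `alpha` and `beta` are all illustrative.

```python
import math


def sprt_proportion(outcomes, p0, p1, alpha=0.05, beta=0.05):
    """Sequential test of H0: defect rate = p0 against H1: defect rate = p1,
    where p1 > p0. `outcomes` is a sequence of 0/1 values (1 = defective).
    Returns (decision, number of items examined)."""
    upper = math.log((1 - beta) / alpha)   # crossing this favours H1
    lower = math.log(beta / (1 - alpha))   # crossing this favours H0
    log_ratio = 0.0
    n = 0
    for n, defective in enumerate(outcomes, start=1):
        if defective:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= upper:
            return "reject batch", n
        if log_ratio <= lower:
            return "accept batch", n
    # Neither boundary was crossed with the data available so far
    return "continue sampling", n


# Hypothetical batch: acceptable rate 1%, unacceptable rate 5%
print(sprt_proportion([0] * 100, p0=0.01, p1=0.05))
print(sprt_proportion([1, 1], p0=0.01, p1=0.05))
```

Sampling stops as soon as either boundary is crossed, so a clearly good or clearly bad batch is decided after few items, while a borderline batch needs more.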

Detecting drifts

Once a bench mark is established and complied with, it is important to continuously monitor the outcomes, so that a departure from the bench mark can be identified.

In CUSUM charting, the sensitivity of detecting a departure from the bench mark is traded against the frequency of false alarms, so the user must set the level of sensitivity according to the needs of the situation. Excessive sensitivity results in frequent false alarms, which disrupt services and production and reduce the credibility of the monitoring system. Insufficient sensitivity, however, results in delayed investigation and remedial action.

The CUSUM model to use depends on the nature of the data. StatsToDo provides models for the Normal, Poisson, Bernoulli, Binomial, Inverse Gaussian and Exponential distributions (see CUSUM pages in the index). The most commonly used are CUSUM for changing means, CUSUM for changing proportions, and CUSUM for changing rates of events (counts).
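
As an illustration of CUSUM for a changing mean, the sketch below uses the common tabular form for normally distributed values. It is a simplified example under stated assumptions and not the StatsToDo program. The reference value `k` and decision interval `h` (both in standard deviation units) and the blood loss figures are hypothetical; a smaller `h` makes the chart more sensitive but produces more false alarms.

```python
def cusum_upper(values, target_mean, sd, k=0.5, h=4.0):
    """One-sided upper CUSUM for normally distributed values.
    Returns the CUSUM path and the index of the first alarm (or None)."""
    s = 0.0
    path = []
    alarm_at = None
    for i, x in enumerate(values):
        z = (x - target_mean) / sd      # standardised deviation from bench mark
        s = max(0.0, s + z - k)         # accumulate only excess beyond k
        path.append(s)
        if alarm_at is None and s > h:  # record the first crossing of h
            alarm_at = i
    return path, alarm_at


# Hypothetical blood loss values, bench mark mean 500 and SD 100
losses = [500, 520, 480, 650, 700, 680, 720, 690]
path, alarm_at = cusum_upper(losses, target_mean=500, sd=100)
print(alarm_at)
```

Small fluctuations around the bench mark are absorbed by `k` and reset the sum to zero, whereas a sustained upward shift accumulates until it crosses `h`.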

Another approach to continuous monitoring is the moving average, for which two programs are available. The conceptually simpler one uses the rolling average of a fixed number of consecutive samples. The other is the Exponentially Weighted Moving Average (EWMA), where each new measurement is weighted and combined with the existing average to reduce random variation, so that a departure from the nominated mean can be better detected.
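
The two averaging schemes can be sketched in a few lines. The function names, the window size and the weight are illustrative assumptions, and control limits are omitted for brevity.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    return [
        sum(values[i - window:i]) / window
        for i in range(window, len(values) + 1)
    ]


def ewma(values, start, weight=0.2):
    """Exponentially weighted moving average.
    Each step: new average = weight * value + (1 - weight) * old average."""
    current = start
    smoothed = []
    for x in values:
        current = weight * x + (1 - weight) * current
        smoothed.append(current)
    return smoothed


print(moving_average([1, 2, 3, 4, 5], window=3))
print(ewma([10, 10], start=0, weight=0.5))
```

A small weight smooths heavily and reacts slowly, while a weight close to 1 follows each new measurement closely and removes little random variation.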

Significant Departures

Much of quality control statistics is intended to assist, and not to judge, the user. The methods evaluate a situation, trigger alarms that may lead to investigation and remedy, and help ensure conformity to standards set by the user.

Every now and then, the situation is complicated by disputes. Individuals who performed poorly may argue that the data collected do not truly reflect their outcomes, or remedial measures may be so time consuming or expensive that there is a reluctance to accept that something needs to be done.

In these situations, it is important to have hypothesis testing methods that are statistically robust and reliable, and produce probability estimates that can be relied upon. The following methods are available from StatsToDo.

The Binomial Test provides a probability estimate of whether the number of positives (k) in a sample (n) conforms to a prescribed proportion (prop). A typical example is a cardiac surgeon who had 5 deaths in 8 consecutive operations (62.5%), when the expected death rate (bench mark) is 14%. Using the binomial test, the probability of observing 5 deaths in 8 cases if the true death rate were 14% is 0.002. This is very unlikely, so a confident decision can be made that this death rate is excessive and cannot be ignored.
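
The surgeon example can be checked with a short sketch based on the binomial distribution. The function names are illustrative and this is not the StatsToDo program. Both the probability of exactly 5 deaths and the probability of 5 or more deaths round to the 0.002 quoted above.

```python
from math import comb


def binomial_pmf(k, n, prop):
    """Probability of exactly k positives in n cases at proportion prop."""
    return comb(n, k) * prop ** k * (1 - prop) ** (n - k)


def binomial_upper_tail(k, n, prop):
    """Probability of k or more positives in n cases at proportion prop."""
    return sum(binomial_pmf(i, n, prop) for i in range(k, n + 1))


# 5 deaths in 8 operations against a bench mark death rate of 14%
p_exact = binomial_pmf(5, 8, 0.14)
p_tail = binomial_upper_tail(5, 8, 0.14)
print(round(p_exact, 4), round(p_tail, 4))
```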

The Poisson Test provides a probability estimate of whether the number of events in a defined environment conforms to a bench mark. A typical example is an increase of falls in an aged care institution, where the number of falls rises from the expected 4 per month to 6 following structural and staff changes. The Poisson test shows that the probability of observing 6 events when 4 are expected is 0.1, which is not statistically significant. A confident decision that no further action is necessary at this point can therefore be made. However, if the trend continues into the next month, so that there are 12 falls over a period when the bench mark is 8, the probability of observing 12 falls when 8 are expected is 0.048, which is now worth investigating and monitoring more closely.
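
The falls example can be explored with the sketch below, which uses the Poisson distribution directly. The function names are illustrative and this is not the StatsToDo program. The figures of 0.1 and 0.048 quoted above match the probability of observing exactly the stated count. The sketch also computes the upper-tail probability (that count or more), which a one-sided test would usually report and which is larger, at about 0.21 and 0.11 respectively. Which of the two the StatsToDo program reports is not stated here.

```python
from math import exp, factorial


def poisson_pmf(k, expected):
    """Probability of exactly k events when `expected` events are expected."""
    return exp(-expected) * expected ** k / factorial(k)


def poisson_upper_tail(k, expected):
    """Probability of k or more events when `expected` events are expected."""
    return 1.0 - sum(poisson_pmf(i, expected) for i in range(k))


# 6 falls against an expected 4, then 12 falls against an expected 8
for observed, expected in [(6, 4), (12, 8)]:
    print(
        observed,
        expected,
        round(poisson_pmf(observed, expected), 3),
        round(poisson_upper_tail(observed, expected), 3),
    )
```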

The Paired Difference Test provides a probability estimate of whether a sample mean, given its Standard Deviation, conforms to a bench mark mean. A typical example is when the average blood loss from a particular major operation is 500 ml, and a particular surgeon, evaluated over 10 operations, has an average blood loss of 700 ml, with a standard deviation of 100 ml. The difference is 200 ml (diff=700-500=200), to be compared with the null value of difference=0. The paired difference test shows that the 95% confidence interval of the difference is approximately 125 ml to 275 ml, which excludes the null value of 0. The conclusion that blood loss over these 10 operations significantly exceeded 500 ml can therefore be made.
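
The blood loss example can be reproduced approximately with the standard confidence interval for a mean based on Student's t distribution. This is a sketch of the textbook formula and not necessarily the calculation used by StatsToDo. The function name is illustrative, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from standard t tables. With the standard error calculated as SD/√n, the interval is roughly 128 to 272, slightly narrower than the one quoted above, which may reflect rounding or a different standard error formula. The conclusion is the same.

```python
import math

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (assumed from standard tables, for n = 10 observations)
T_CRIT_95_DF9 = 2.262


def mean_difference_ci(mean, benchmark, sd, n, t_crit):
    """Confidence interval of (mean - benchmark) from summary statistics."""
    diff = mean - benchmark
    se = sd / math.sqrt(n)   # standard error of the mean
    return diff - t_crit * se, diff + t_crit * se


# Surgeon: mean 700, SD 100, n = 10, against a bench mark of 500
low, high = mean_difference_ci(700, 500, 100, 10, T_CRIT_95_DF9)
print(round(low, 1), round(high, 1))
```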