CUSUM is a set of statistical procedures used in quality control. CUSUM stands for Cumulative Sum of Deviations.
In any ongoing process, be it manufacture or delivery of services and products, once the process is established and running, the outcome should be stable and within defined limits near a benchmark. The situation is then said to be In Control.
When things go wrong, the outcomes depart from the defined benchmark. The situation is then said to be Out of Control.
In some cases, things go catastrophically wrong, and the outcomes depart from the benchmark in a dramatic and obvious manner, so that investigation and remedy follow. For example, the gear in an engine may fracture, causing the machine to seize. An example in health care is the employment of an unqualified fraud as a surgeon, followed by a sudden and massive increase in mortality and morbidity.
Catastrophic departures from the benchmark are usually detected with the Shewhart Chart, which is not covered on this site. Usually, some statistically improbable outcome, such as 2 consecutive measurements outside 3 Standard Deviations, or 3 consecutive measurements outside 2 Standard Deviations, is used to trigger an alarm that all is not well.
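The kind of rule described above can be sketched in a few lines. This is only an illustration in Python, not a Shewhart Chart implementation and not code from this site. The function names are hypothetical, and the sketch assumes that a measurement counts as "outside" when it lies more than the stated number of Standard Deviations from the benchmark in either direction.

```python
def run_outside(values, mean, sd, n_sd, run_length):
    """Index at which `run_length` consecutive values have all been
    more than `n_sd` standard deviations from `mean`, or None."""
    streak = 0
    for i, v in enumerate(values):
        if abs(v - mean) > n_sd * sd:
            streak += 1
            if streak >= run_length:
                return i
        else:
            streak = 0  # the run is broken by a value inside the limit
    return None


def shewhart_style_alarm(values, mean, sd):
    """Earliest index at which either example rule fires, or None.
    Rules (taken from the text): 2 in a row outside 3 SD,
    or 3 in a row outside 2 SD."""
    hits = [
        run_outside(values, mean, sd, n_sd=3, run_length=2),
        run_outside(values, mean, sd, n_sd=2, run_length=3),
    ]
    hits = [h for h in hits if h is not None]
    return min(hits) if hits else None


# Made-up measurements with benchmark 100 and SD 10
measurements = [100, 103, 125, 98, 124, 126, 127]
print(shewhart_style_alarm(measurements, mean=100, sd=10))
```

With these made-up numbers the alarm fires at the last measurement, which completes a run of 3 values more than 2 Standard Deviations above the benchmark.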
In many instances, however, the departures from the outcome benchmark are gradual and small in scale, and these are difficult to detect. Examples are changes in the size and shape of products caused by progressive wearing out of machinery parts, reduced success rates over time when experienced staff are gradually replaced by novices in a work team, and increases in client complaints to a service department following a loss of adequate supervision.
CUSUM is a statistical process of sampling outcomes and summing their departures from the benchmark. When the situation is in control, the departures caused by random variations cancel each other numerically. In the out of control situation, departures from the benchmark tend to be unidirectional, so that the sum of departures accumulates until it becomes statistically identifiable.
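The cancelling and accumulating behaviour can be seen with a plain running sum of departures. The following is a minimal sketch in Python, used here purely for illustration; it is not the template R code provided by StatsToDo. The function name and the data are made up, and the sketch deliberately leaves out the adjustments a real CUSUM makes.

```python
from itertools import accumulate


def cumulative_deviations(values, benchmark):
    """Running sum of (value - benchmark), one entry per observation."""
    return list(accumulate(v - benchmark for v in values))


benchmark = 10
in_control = [11, 9, 10, 12, 8, 10, 9, 11]          # scatter around 10
out_of_control = [11, 10, 11, 12, 10, 11, 10, 12]   # drifted up to about 11

print(cumulative_deviations(in_control, benchmark))
# [1, 0, 0, 2, 0, 0, -1, 0]
print(cumulative_deviations(out_of_control, benchmark))
# [1, 1, 2, 4, 4, 5, 5, 7]
```

In the first series the sum keeps returning to zero, while in the second an average shift of less than one unit per observation builds steadily into a total that is easy to see.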
R Codes
In May 2020, most CUSUM programs in StatsToDo were replaced by template R Codes. An explanation of how to set up R, and some basic procedures, are provided in the R Explained Page for those unfamiliar with this language.
This section provides explanations for some of the terms that are used in CUSUM, based on descriptions from the textbook by Hawkins and Olwell.
Terms the user needs to attend to
In control describes the situation when everything is going according to plan, and the measurements being monitored are within the benchmark.
Out of control is the situation CUSUM is designed to detect, when the measurements drift outside of the benchmark.
Average run length (ARL) is the expected number of consecutive observations before a false alarm is triggered while the situation is in control. It is the inverse of the false positive rate, or Type I Error, per observation. A false positive rate of 1% (p=0.01) is the same as ARL=100.
The ARL is usually set as a balance between the need for investigation and intervention when things go wrong and the inconvenience and cost of a false alarm. For example, if the sampling rate is 5 a day, and the requirement is that a false alarm does not occur more frequently than every 20 days, then the ARL = 5 × 20 = 100.
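The arithmetic linking sampling rate, ARL and false positive rate can be written out as follows. This is a small Python sketch for illustration only; the function names are hypothetical and are not part of any StatsToDo program.

```python
def arl_from_schedule(samples_per_day, days_between_false_alarms):
    """ARL expressed as the number of observations between false alarms."""
    return samples_per_day * days_between_false_alarms


def false_positive_rate(arl):
    """False positive rate per observation implied by a given ARL."""
    return 1 / arl


arl = arl_from_schedule(samples_per_day=5, days_between_false_alarms=20)
print(arl)                       # 100
print(false_positive_rate(arl))  # 0.01
```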
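The CUSUM is designed to be a one-tailed algorithm, testing for departure from the benchmark upwards or downwards, but not both. Users who wish to have a two-tailed test for both at the same time need to use two CUSUMs, one for each tail. Because either CUSUM can raise a false alarm, the false alarm rate of each should be half of that required for the one-tailed situation, which means the ARL for each should be double. For example, two CUSUMs each with ARL = 200 together give a false alarm about once every 100 observations.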
Data is a vector (array) of values obtained during monitoring that are used to calculate the CUSUM.
Terms the user can control, but which are usually set to default values in StatsToDo
Model sets the initial value of CUSUM in a run, which determines how rapidly the out of control situation can be detected if it exists already. A more rapid response will of course carry a greater risk of a false alarm. The 3 options are
- F for Fast Initial Response (FIR), where the initial CUSUM value is set at half of the Decision Interval (h). This is the default option in StatsToDo, as recommended by Hawkins and Olwell's textbook
- Z for zero (0), where the initial CUSUM value is set to 0. This can be used if the user is certain that the situation is in control initially, and wishes to avoid an early false alarm
- S for steady state, used when the run continues from a previous CUSUM that has just ended, so that the initial value is set by the user. S is usually not offered in StatsToDo as this requires the user to alter the algorithm to set an initial value
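The three options amount to a choice of starting value, which can be sketched as follows. This is a Python illustration only, not the StatsToDo R code; the function name and the `steady_state_value` parameter are hypothetical.

```python
def initial_cusum(model, h, steady_state_value=None):
    """Starting CUSUM value for a run, given the model and Decision Interval h."""
    if model == "F":   # Fast Initial Response: start half way to the alarm
        return h / 2
    if model == "Z":   # Zero: assume the run starts in control
        return 0.0
    if model == "S":   # Steady state: carry over a value chosen by the user
        if steady_state_value is None:
            raise ValueError("model 'S' needs a steady_state_value")
        return steady_state_value
    raise ValueError(f"unknown model: {model!r}")


print(initial_cusum("F", h=4.0))                          # 2.0
print(initial_cusum("Z", h=4.0))                          # 0.0
print(initial_cusum("S", h=4.0, steady_state_value=1.5))  # 1.5
```

Starting at h / 2 means that a process which is already out of control needs to accumulate only half the usual amount before the alarm sounds, which is also why an early false alarm is more likely with this option.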
Winsorization is a statistical process whereby unexpected outliers with extreme values are modified before they are used for calculating CUSUM. Winsorization is not provided by StatsToDo, and users will need to manually modify extreme outlier values before analysis if they should choose to do so.
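One simple way to modify extreme values by hand is to pull them back to a chosen lower and upper limit. The sketch below, in Python for illustration, assumes that form of Winsorization; the limits are chosen by the user, and neither the function name nor the numbers come from StatsToDo or the textbook.

```python
def winsorize(values, lower, upper):
    """Replace values below `lower` with `lower` and above `upper` with `upper`."""
    return [min(max(v, lower), upper) for v in values]


raw = [9, 10, 42, 11, -30]   # made-up data with two extreme outliers
print(winsorize(raw, lower=4, upper=16))
# [9, 10, 16, 11, 4]
```

The outliers still register as departures in the right direction, but a single extreme value can no longer push the CUSUM past the alarm level on its own.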
Terms for results produced by the algorithm
Reference Value (k) is an allowance subtracted from each departure before it is added to the CUSUM, which stops small random variations from accumulating. It is used in all subsequent calculations, but need not be attended to by the user.
Decision Interval (h) is the value of the CUSUM which should trigger an alarm that the out of control situation has been detected.
CUSUM values are calculated from the data using the reference value (k). They are usually stored in a vector, and used for plotting.
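How the three results fit together can be sketched with the commonly used recursion for an upward CUSUM, in which each new value is the previous value plus the departure from the benchmark minus k, and is never allowed to fall below 0. This Python sketch is an illustration under stated assumptions, not the StatsToDo algorithm: k and h are supplied by hand rather than derived from the ARL, the alarm is assumed to fire when the CUSUM exceeds h, and the function name and data are made up.

```python
def upward_cusum(data, benchmark, k, h, start=0.0):
    """Return (cusum_values, alarm_index) for an upward one-tailed CUSUM.

    cusum_values holds one value per observation, ready for plotting.
    alarm_index is the first position where the CUSUM exceeds h, or None.
    """
    cusum_values = []
    alarm_index = None
    current = start
    for i, x in enumerate(data):
        # add the departure, subtract the reference value, floor at zero
        current = max(0.0, current + (x - benchmark) - k)
        cusum_values.append(current)
        if alarm_index is None and current > h:
            alarm_index = i
    return cusum_values, alarm_index


data = [10, 11, 9, 10, 12, 11, 12, 11, 12, 12]   # drifts upward from the 5th value
values, alarm_at = upward_cusum(data, benchmark=10, k=0.5, h=4.0)
print(values)    # [0.0, 0.5, 0.0, 0.0, 1.5, 2.0, 3.5, 4.0, 5.5, 7.0]
print(alarm_at)  # 8
```

The early values hover near 0 because k absorbs the small random departures. Once the data drift upwards, the CUSUM climbs and passes h at the 9th observation (index 8).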