Content Disclaimer
Copyright @2020.
All Rights Reserved.

StatsToDo: CUSUM for Poisson Distributed Counts

Links : Home Index (Subjects) Contact StatsToDo


Explanations R Code
We will use an example to demonstrate how to use R codes for CUSUM

The example is a made up one to demonstrate the numerical process, and the data is generated by the computer. It purports to be from a quality control exercise in a large age care facility looking after bedridden residents, where the quality of care is measured weekly by the number of residents suffering from bed sores.

  • From records in the past, we expect 3 bedsores per week, and this can be capped if the staff are well trained and motivated
  • With time however, experienced staff leave and replaced by the less experienced and trained. The standard of care would gradually deteriorate, resulting in an increase in the weekly bed sores numbers.
  • We would like to trigger an alarm and refresh the training of the staff when the weekly number of bed sores reached 5 or more.
  • As training is expensive and dusrupts services, we would like any false alarm to be no more frequent than once every 100 weeks, so the average run length ARL = 100

Step 1: Parameters

We enter the following parameters to the R Studio source panel
# Step 1: parameters and data
inControlCount = 3
outOfControlCount = 5
arl = 100 
theModel = "F"   #F for FIR,   Z for zero, S for steady state 
The first 3 lines sets the parameters required for the analysis.

The 4th line, the model has 3 options, which sets the first value of the CUSUM

  • F means Fast Initial Response, where the initial CUSUM value is set at half of the Decision Interval h. The rationale is that, if the situation is in control then CUSUM will gradually drift towards zero, but if the situation is already out of control, an alarm would be triggered early. The down side is that a false alarm is slightly more likely early on in the monitoring. As FIR is recommended by Hawkins, it is set as the default option
  • Z is for zero, and CUSUM starts at the baseline value of 0. This will lower the risk of false alarm in the early stages of monitoring, but will detect the out of control situation slower if it already exists at the begining.
  • S is for steady state, intended for when monitoring is already ongoing, and a new plot is being constructed. The CUSUM starts at the value when the previous chart ends.
  • Each model will make minor changes to how quickly a departure can be detected

Step 2: Calculate the Reference Value (k) and Decision Interval (h)

# Step 2: Calculate k and h
#install.packages("CUSUMdesign")   # if not already installed
library(CUSUMdesign)
result <- getH(distr=3, ICmean=inControlCount, OOCmean=outOfControlCount, ARL=arl, type=theModel)
k = result$ref
h = result$DI
if(outOfControlCount<inControlCount)
{
  h = -h
}
cat("Reference Value k=",k,"\tDecision Interval h=", h, "\n")
Step 2 performs that statistical calculations using the parameters entered. The package CUSUMdesign needs to be alrady installed, and the library activated each time the program is used.

result is the object that contains the results of the analysis. The result required for this program are the reference value (k) and decision interval h. Please note that h is calculated as a positive value. If the CUSUM is designed to detect a decrease from in control value, then h needs to be changed to a negative value.

The last line displays the results we need

Reference Value k= 3.9 	Decision Interval h= 5.6 
Step 3 is divided into 2 parts. Step 3a calculates the cusum vector, and 3b plots the vector and h in a graph.
# Step 3a: Calculate CUSUM
# dat is the vector of measurements
dat=c(3,3,2,5,3,5,1,5,1,3,5,4,3,8,6,1,8,2,5,7)
cusum <- vector()
cusumValue = 0
if(theModel=="F")
{
  cusumValue = h / 2
}
for(i in 1 : length(dat))
{
  cusumValue = cusumValue + dat[i] - k
  if(outOfControlCount>inControlCount) # Up
  {
    if(cusumValue<0)
    {
      cusumValue = 0
    }
  }
  else                               # down
  {
    if(cusumValue>0)
    {
      cusumValue = 0
    }
  }
  cusum[i] = cusumValue
}
cusum
The calculations produces a vector of cusum values that can be plotted
cusum
 [1]  0.0  0.0  0.0  1.1  0.2  1.3  0.0  1.1  0.0  0.0  1.1  1.2  0.3  4.4  6.5  3.6
[17]  7.7  5.8  6.9 10.0

Step 3b: CUSUM Plot

# Step 3b: Plot the cusum vector and h
plot(cusum,type="l")
abline(h=h)
In step 3b, the first line plots the cusum vector, and the second line the decision interval h. The result plot is shown to the right.

Step 4: Optional export of results

# Step 4: Optional export of results
#myDataFrame <- data.frame(dat,cusum)         #combine dat and cusum to dataframe
#myDataFrame                                  #display dataframe
#write.csv(myDataFrame, "CusumPoisson.csv")   # write dataframe to .csv file
Step 4 is optional, and in fact commented out, and included as a template only. Each line can be activated by removing the #

The first line places the two vectors, dat and cusum together into a dataframe

The second line displays the data, along with row numbers, in the console, which can then be copied and pasted into other applications for further processing

The third line saves the dataframe as a comma delimited .csv file. This is needed if the data is too large to handle by copy and paste from the console.

Javascript Plot