Copyright @2020.
All Rights Reserved.

StatsToDo: Combining n, mean, and Standard Deviation from Multiple Groups

The data consist of three numerical columns:
   Column 1 is the sample size n
   Column 2 is the mean
   Column 3 is the Standard Deviation (SD)

R Codes

The two algorithms for combining means and Standard Deviations from 2 or more groups are presented as R code, for those who wish to check the validity of the calculations, incorporate the algorithm into their own applications, or are merely interested.

The R code is designed to run from the source panel of RStudio.

The code is shown first, followed by its results. It is divided into 3 sections: section 1 contains data entry, section 2 the first option, using the algorithm in StatsToDo, and section 3 the second option, using Cochrane's formulation.

Data entry: the same data as in the Javascript program are used, and a data frame is created.

myDat = ("
n   mean   sd
10  11.8   2.4
20  15.3   3.2
15   8.4   4.1
") 
myDataFrame <- read.table(textConnection(myDat),header=TRUE)  # conversion to data frame    
#myDataFrame                                                  # optional check input

Algorithm 1: Decomposition of mean and SD to ex (Σx) and exx (Σx²)
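
In symbols, the decomposition performed by the code that follows can be sketched as below. The notation is introduced here for illustration only and is not used on the original page: n_i, \bar{x}_i and s_i are the sample size, mean and SD of group i, and N, T_x and T_{xx} correspond to the variables tn, tx and txx.

```latex
\begin{aligned}
(\Sigma x)_i   &= n_i \, \bar{x}_i \\
(\Sigma x^2)_i &= s_i^2 \, (n_i - 1) + \frac{(\Sigma x)_i^2}{n_i} \\
N &= \sum_i n_i, \qquad T_x = \sum_i (\Sigma x)_i, \qquad T_{xx} = \sum_i (\Sigma x^2)_i \\
\bar{x} &= \frac{T_x}{N}, \qquad s = \sqrt{\frac{T_{xx} - T_x^2 / N}{N - 1}}
\end{aligned}
```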

nr = nrow(myDataFrame)   # number of rows
ex <- rep(0,nr)          # array to contain Σx
exx <- rep(0,nr)         # array to contain Σx²
tn = 0                   # total n
tx = 0                   # total Σx 
txx = 0                  # total Σx²
for(i in 1:nr)
{
  ex[i] = myDataFrame$n[i] * myDataFrame$mean[i]
  exx[i] = myDataFrame$sd[i]^2 * (myDataFrame$n[i]-1) + ex[i]^2 / myDataFrame$n[i]
  tn = tn + myDataFrame$n[i]
  tx = tx + ex[i]
  txx = txx + exx[i]
}
# concatenate Σx and Σx² to data frame
myDataFrame$ex <- ex
myDataFrame$exx <- exx
myDataFrame             # show data frame
# Calculate combined values
tMean = tx / tn
tSD = sqrt((txx-tx^2/tn)/(tn-1))
print("Combined n, mean, and SD")
print(c(tn,tMean,tSD))

The results are as follows

> myDataFrame
   n mean  sd  ex     exx
1 10 11.8 2.4 118 1444.24
2 20 15.3 3.2 306 4876.36
3 15  8.4 4.1 126 1293.74
> 
[1] "Combined n, mean, and SD"
[1] 45.000000 12.222222  4.502822
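
The same calculation can also be written without an explicit loop. The following is a minimal sketch, assuming the three columns are passed as plain numeric vectors; the function name combine_by_sums and its argument names are hypothetical and are not part of the StatsToDo code.

```r
# Hypothetical helper (not from the original page): vectorised form of Algorithm 1.
# n, m, s are numeric vectors of group sample sizes, means and SDs.
combine_by_sums <- function(n, m, s) {
  ex  <- n * m                       # per-group sum of x
  exx <- s^2 * (n - 1) + ex^2 / n    # per-group sum of x squared
  tn  <- sum(n)                      # total n
  tx  <- sum(ex)                     # total sum of x
  txx <- sum(exx)                    # total sum of x squared
  c(n = tn, mean = tx / tn, sd = sqrt((txx - tx^2 / tn) / (tn - 1)))
}

# Same three groups as in the data entry section
res <- combine_by_sums(c(10, 20, 15), c(11.8, 15.3, 8.4), c(2.4, 3.2, 4.1))
print(res)
```

With the example data this should reproduce the combined n, mean, and SD printed above.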

Algorithm 2: Cochrane's formula
Ref: https://handbook-5-1.cochrane.org/chapter_7/table_7_7_a_formulae_for_combining_groups.htm
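
For two groups, the formula applied at each step of the code that follows can be written as below. The subscripts 1 and 2 are notation chosen here for illustration: group 1 is the running combination of all earlier rows, and group 2 is the current row.

```latex
\begin{aligned}
N &= n_1 + n_2 \\
\bar{x} &= \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \\
s &= \sqrt{\frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2
      + \dfrac{n_1 n_2}{n_1 + n_2}\left(\bar{x}_1^2 + \bar{x}_2^2 - 2\,\bar{x}_1 \bar{x}_2\right)}
      {n_1 + n_2 - 1}}
\end{aligned}
```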

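This uses the same data as the Javascript program and Algorithm 1. The results shown below assume the data entry section has been re-run first, so that the data frame holds only the original n, mean, and sd columns.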

nr = nrow(myDataFrame)   # number of rows
newN <- rep(0,nr)        # array for combined n of this and previous group
newMean <- rep(0,nr)     # array for combined mean of this and previous group
newSD <- rep(0,nr)       # array for combined Standard Deviations of this and previous group
# Prime the first row by copying from data frame
newN[1] = myDataFrame$n[1]
newMean[1]  = myDataFrame$mean[1]
newSD[1] = myDataFrame$sd[1]
# designate values of current row as "old"
oldN = newN[1]
oldMean = newMean[1]
oldSD = newSD[1]
# Combining each pair of rows from row 2 onwards
# seq_len(nr)[-1] is empty when there is only one row, so the loop is skipped
for(i in seq_len(nr)[-1])
{
  # data from current row (named so as not to mask the base functions mean() and sd())
  curN = myDataFrame$n[i]
  curMean = myDataFrame$mean[i]
  curSD = myDataFrame$sd[i]
  # combining with old values (Cochrane's algorithm)
  newN[i] = oldN + curN
  newMean[i] = (oldN * oldMean + curN * curMean) / (oldN + curN)
  newSD[i] = sqrt(((oldN-1)*oldSD^2 + (curN-1)*curSD^2 + oldN * curN / (oldN + curN) * (oldMean^2 + curMean^2
                  - 2 * oldMean * curMean)) / (oldN + curN -1))
  # designate values of current row as "old"
  oldN = newN[i]
  oldMean = newMean[i]
  oldSD = newSD[i]
}
# Concatenate columns of combined values to data frame
myDataFrame$newN <- newN
myDataFrame$newMean <- newMean
myDataFrame$newSD <- newSD
myDataFrame                # display dataframe containing original and combined results

The results are as follows

> myDataFrame
   n mean  sd newN  newMean    newSD
1 10 11.8 2.4   10 11.80000 2.400000
2 20 15.3 3.2   30 14.13333 3.363427
3 15  8.4 4.1   45 12.22222 4.502822

The final results from both algorithms are the same: combined n = 45, mean = 12.22222, and SD = 4.502822.
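
One way to check the validity of both algorithms is to start from raw values, summarise each group, combine the summaries, and compare against the SD of the pooled raw values. The sketch below does this with small made-up groups (illustrative data, not from the original page); combine_pair is a hypothetical helper that applies Cochrane's two-group formula, and Reduce applies it across the groups in turn.

```r
# Illustrative raw data, invented for this check
groups <- list(c(2, 4, 4, 5), c(7, 9, 11), c(1, 3, 3, 6, 8))
n <- sapply(groups, length)   # group sample sizes
m <- sapply(groups, mean)     # group means
s <- sapply(groups, sd)       # group SDs

# Algorithm 1: rebuild the sums of x and x squared, then combine
ex  <- n * m
exx <- s^2 * (n - 1) + ex^2 / n
sd1 <- sqrt((sum(exx) - sum(ex)^2 / sum(n)) / (sum(n) - 1))

# Algorithm 2: Cochrane's formula for two groups (hypothetical helper)
combine_pair <- function(a, b) {
  N <- a$n + b$n
  M <- (a$n * a$mean + b$n * b$mean) / N
  S <- sqrt(((a$n - 1) * a$sd^2 + (b$n - 1) * b$sd^2 +
             a$n * b$n / N * (a$mean^2 + b$mean^2 - 2 * a$mean * b$mean)) / (N - 1))
  list(n = N, mean = M, sd = S)
}
stats <- Map(function(n, m, s) list(n = n, mean = m, sd = s), n, m, s)
res2  <- Reduce(combine_pair, stats)   # fold the groups together pairwise

# Compare both combined SDs with the SD of the pooled raw values
pooled <- unlist(groups)
print(all.equal(sd1, sd(pooled)))
print(all.equal(res2$sd, sd(pooled)))
```

Because both algorithms are algebraic rearrangements of the ordinary sample SD, the comparisons are expected to agree up to floating-point rounding.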