Copyright © 2020. All Rights Reserved.
This page is a simple utility to combine multiple groups of n, mean, and SD into a single group using the following algorithm.
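The algorithm itself is not reproduced in this text. As a sketch reconstructed from the Calc1 function in the R code on this page, it can be written as below. The symbols k, S_i, Q_i, N, T_x and T_xx are notation assumed here for illustration; S_i, Q_i, T_x and T_xx correspond to x, xx, tx and txx in the R code.

```latex
% For each group i = 1, ..., k with sample size n_i, mean \bar{x}_i and standard deviation s_i
\begin{aligned}
S_i &= n_i \, \bar{x}_i
    && \text{(sum of values in group } i\text{)} \\
Q_i &= s_i^2 \,(n_i - 1) + \frac{S_i^2}{n_i}
    && \text{(sum of squared values in group } i\text{)} \\
N &= \sum_{i=1}^{k} n_i, \qquad
T_x = \sum_{i=1}^{k} S_i, \qquad
T_{xx} = \sum_{i=1}^{k} Q_i \\
\bar{x}_{\text{combined}} &= \frac{T_x}{N} \\
s_{\text{combined}} &= \sqrt{\frac{T_{xx} - T_x^2 / N}{N - 1}}
\end{aligned}
```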
The algorithm on this page differs from that described in the Cochrane Handbook (see References). Cochrane's formula combines two groups at a time, so multiple groups must either be combined in steps, or the formula must be modified to accommodate multiple groups. In either case, the results produced by both algorithms are the same (see the R code).

Please note: this algorithm must be used with care, as the statistical assumption is that all the groups are merely sub-samples of a single group, and combining them merely restores the original single group. In many cases this assumption is faulty, as the groups may be from different populations and sampled under different conditions.

It is therefore much safer to combine groups using the meta-analysis algorithm with the Random Effect Model, available in the Meta-analysis for Comparing Two Unpaired Groups Program Page, using the mean and Standard Error of the mean (SE) for each group. The SE is calculated as SE = SD / sqrt(n) for each group. After combining the groups using the Random Effect Model, the Standard Deviation can be recalculated as SD = SE * sqrt(tn), where tn is the sum of sample sizes from all the groups.

[Example results: the calculations of SE = SD / sqrt(n) before meta-analysis, and of SD = SE * sqrt(tn) after meta-analysis, are shown in bold.]
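The two conversions described above can be sketched in R as follows. This is only the arithmetic on either side of the meta-analysis, not the Random Effect Model itself and not the example results mentioned above. The helper names sd_to_se and se_to_sd are hypothetical, and the group values are borrowed from the R example elsewhere on this page.

```r
# Illustrative sketch only: SE/SD conversions before and after a meta-analysis.
# sd_to_se() and se_to_sd() are hypothetical helper names, not StatsToDo functions.

sd_to_se <- function(sd, n) sd / sqrt(n)      # SE = SD / sqrt(n), per group
se_to_sd <- function(se, tn) se * sqrt(tn)    # SD = SE * sqrt(tn), after pooling

# Sample sizes and SDs of the three example groups
grp_n  <- c(10, 20, 15)
grp_sd <- c(2.4, 3.2, 4.1)

grp_se <- sd_to_se(grp_sd, grp_n)  # one SE per group, for entry into the meta-analysis
tn     <- sum(grp_n)               # total sample size across all groups

print(grp_se)

# The pooled SE would come from the Random Effect Model and is not computed here.
# Once it is available (assumed variable name pooled_se):
#   pooled_sd <- se_to_sd(pooled_se, tn)
```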
References:

Altman DG, Machin D, Bryant TN and Gardner MJ (2000). Statistics with Confidence, Second Edition. BMJ Books. ISBN 0 7279 1375 1. p. 28-31.

Higgins JPT, Li T, Deeks JJ (editors). Chapter 6: Choosing effect measures and computing estimates of effect. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.0 (updated July 2019). Cochrane, 2019. Available from https://training.cochrane.org/handbook/current/chapter-06#section-6-5-2 (table 6.5.2a).
This page is the most accessed one in StatsToDo, and attracts many questions and much feedback. A major issue is that the formulation used differs from that published by others, particularly the formula described in the Cochrane Handbook (see References).
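One way to see why the two formulations agree is sketched below, in notation assumed for illustration: n_i, m_i and s_i are the size, mean and SD of group i, and N and M are the combined size and mean. Both formulations compute the same total sum of squares about the combined mean before dividing by N - 1. The version based on sums (Calc1 in the R code) reaches it through the sums of squared values, and the two-group version (Calc2) through the within-group and between-group parts.

```latex
\begin{aligned}
% Sum of squared values in group i, as computed in Calc1 (x^2/n with x = n_i m_i)
Q_i &= (n_i - 1)\, s_i^2 + n_i m_i^2 \\
% Total sum of squares about the combined mean, using \sum_i n_i m_i = N M
\sum_i Q_i - N M^2
  &= \sum_i (n_i - 1)\, s_i^2 + \sum_i n_i \,(m_i - M)^2 \\
% For two groups the between-group part reduces to the term used in Calc2
\sum_{i=1}^{2} n_i \,(m_i - M)^2
  &= \frac{n_1 n_2}{n_1 + n_2}\,(m_1 - m_2)^2
   = \frac{n_1 n_2}{n_1 + n_2}\,\left(m_1^2 + m_2^2 - 2 m_1 m_2\right)
\end{aligned}
```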
After a lot of cross checking, I have concluded that the algorithm used on this page is a valid one and that, although it differs from that described in the Cochrane Handbook, it produces the same results.

Much of the testing and validation was carried out using R, and the code used is now displayed here, so that users can satisfy themselves that the results are the same as those produced by Cochrane's formula, and so that the algorithm can be copied and used independently of the StatsToDo website.

The R code is designed to run from the source panel of RStudio. Those unfamiliar with R should read the R Explained Page to learn how to set up and run RStudio.

The code is divided into 3 sections. Section 1 contains data entry, section 2 the first option, using the algorithm in StatsToDo, and section 3 the second option, using Cochrane's formulation.

    # CombineMeanSD.R
    # https://training.cochrane.org/handbook/current/chapter-06#section-6-5-2

    # Section 1: Data entry
    g1 <- c(10, 11.8, 2.4)  # n, mean, sd of grp 1
    g2 <- c(20, 15.3, 3.2)  # n, mean, sd of grp 2
    g3 <- c(15, 8.4, 4.1)   # n, mean, sd of grp 3

    # Section 2: First option: StatsToDo formula
    Calc1 <- function(myDat)  # myDat is matrix with 3 cols of n, mean, and SD
    {
      m = nrow(myDat)  # number of groups
      tn = 0
      tx = 0
      txx = 0
      for(i in 1:m)
      {
        n = myDat[i,1]
        mean = myDat[i,2]
        sd = myDat[i,3]
        x = n * mean                  # sum of values in the group
        xx = sd^2*(n - 1) + x^2 / n   # sum of squared values in the group
        out <- cat("grp",i," n=",n," mean=",mean," SD=", sd, " Ex=", x, " Exx=",xx, "\n")
        tn = tn + n
        tx = tx + x
        txx = txx + xx
      }
      tmean = tx / tn
      tsd = sqrt((txx - tx^2/tn) / (tn - 1))
      out <- cat("Combined","n=",tn," mean=",tmean," SD=", tsd, " Ex=", tx, " Exx=",txx,"\n")
      c(tn,tmean,tsd)
    }
    # combine 3 groups entry data into matrix and call function
    ar <- Calc1(matrix(data=c(g1,g2,g3), ncol=3, byrow=TRUE))

    # Section 3: Second option: Cochrane formula combining two groups
    Calc2 <- function(n1,m1,sd1,n2,m2,sd2)  # n, mean and SD of two groups
    {
      out <- cat("grp1"," n=",n1," mean=",m1," SD=", sd1, "\n")
      out <- cat("grp2"," n=",n2," mean=",m2," SD=", sd2, "\n")
      tn = n1 + n2
      tmean = (n1*m1 + n2*m2) / (n1 + n2)
      tsd = sqrt(((n1-1)*sd1^2 + (n2-1)*sd2^2 + n1 * n2 / (n1 + n2) * (m1^2 + m2^2 - 2 * m1 * m2)) / (n1 + n2 -1))
      out <- cat("grp1+2"," n=",tn," mean=",tmean," SD=", tsd, "\n")
      c(tn, tmean, tsd)
    }
    ar <- Calc2(g1[1],g1[2],g1[3],g2[1],g2[2],g2[3])  # combine group 1+2
    ar <- Calc2(ar[1],ar[2],ar[3],g3[1],g3[2],g3[3])  # add grp 3 to result of combined 1+2

The results from section 2, first option, StatsToDo algorithm, are

    > ar <- Calc1(matrix(data=c(g1,g2,g3), ncol=3, byrow=TRUE))
    grp 1 n= 10 mean= 11.8 SD= 2.4 Ex= 118 Exx= 1444.24
    grp 2 n= 20 mean= 15.3 SD= 3.2 Ex= 306 Exx= 4876.36
    grp 3 n= 15 mean= 8.4 SD= 4.1 Ex= 126 Exx= 1293.74
    Combined n= 45 mean= 12.22222 SD= 4.502822 Ex= 550 Exx= 7614.34

The results from section 3, second option, Cochrane algorithm, are

    > ar <- Calc2(g1[1],g1[2],g1[3],g2[1],g2[2],g2[3]) # combine group 1+2
    grp1 n= 10 mean= 11.8 SD= 2.4
    grp2 n= 20 mean= 15.3 SD= 3.2
    grp1+2 n= 30 mean= 14.13333 SD= 3.363427
    > ar <- Calc2(ar[1],ar[2],ar[3],g3[1],g3[2],g3[3]) # add grp 3 to result of combined 1+2
    grp1 n= 30 mean= 14.13333 SD= 3.363427
    grp2 n= 15 mean= 8.4 SD= 4.1
    grp1+2 n= 45 mean= 12.22222 SD= 4.502822

Please note that Cochrane's formula combines two groups only. When there are more than two groups, the first two groups are combined, then each subsequent group is added to the result. The final results from both algorithms are the same.

For those who prefer to use the formula described in the Cochrane Handbook, I have produced a short Excel program that allows this to be done easily. The Excel workbook can be downloaded and used without the Internet.
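The stepwise procedure described above can also be automated for any number of groups. The following is a minimal sketch, not part of the original StatsToDo code: combine_two and combine_sums are hypothetical names, combine_two restates the two-group formula with the difference of means written as (m1 - m2)^2, and Reduce is used to fold it across the groups. A vectorised version of the formula based on sums is included so that the two results can be compared.

```r
# Illustrative sketch only; function names are hypothetical, not StatsToDo's.

# Two-group formula; a and b are numeric vectors c(n, mean, sd)
combine_two <- function(a, b) {
  n1 <- a[1]; m1 <- a[2]; s1 <- a[3]
  n2 <- b[1]; m2 <- b[2]; s2 <- b[3]
  tn <- n1 + n2
  tmean <- (n1 * m1 + n2 * m2) / tn
  tsd <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2 +
                 n1 * n2 / tn * (m1 - m2)^2) / (tn - 1))
  c(tn, tmean, tsd)
}

# Formula based on sums; groups is a matrix with columns n, mean, sd
combine_sums <- function(groups) {
  n <- groups[, 1]; m <- groups[, 2]; s <- groups[, 3]
  ex  <- n * m                      # sum of values per group
  exx <- s^2 * (n - 1) + ex^2 / n   # sum of squared values per group
  tn <- sum(n); tx <- sum(ex); txx <- sum(exx)
  c(tn, tx / tn, sqrt((txx - tx^2 / tn) / (tn - 1)))
}

# Same example data as the code above
groups <- rbind(c(10, 11.8, 2.4), c(20, 15.3, 3.2), c(15, 8.4, 4.1))

# Fold the two-group formula over the rows: ((g1 + g2) + g3)
rows     <- lapply(seq_len(nrow(groups)), function(i) groups[i, ])
stepwise <- Reduce(combine_two, rows)
direct   <- combine_sums(groups)

print(stepwise)
print(direct)
print(isTRUE(all.equal(stepwise, direct)))
```

With the example data, both vectors should match the combined n, mean and SD shown in the results above (45, 12.22222, 4.502822).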