Copyright @2020. All Rights Reserved.


Explanations and References
## Introduction
This page provides calculations to estimate sample size requirements, and the power of results, when comparing two sets of count data. The actual comparisons are in the Compare Two Counts program.

Three algorithms are available for this comparison:
- The most commonly used method, initially described by Przyborowski and Wilenski (see references), is known as the Conditional Test (the C Test). The test is based on the null hypothesis that the ratio of the two count rates (λ2 / λ1) is equal to 1.
- More recently, Krishnamoorthy and Thomson (see references) proposed an improvement on the C Test (the E Test), where the null hypothesis is that the difference between the two count rates (λ2 - λ1) is equal to 0. Although computation for this test is more complex, the advantages are that it is more robust and that the results have greater power.
- Whitehead (see references), in his text book on unpaired sequential analysis, provided algorithms to determine sample sizes for non-sequential methods, and a method for comparing two counts at the end of the sequence. This test depends on a transformation of the difference into a normally distributed mean, and compares the mean against the null hypothesis of 0.
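The conditional idea behind the C Test can be sketched in a few lines. This is only an illustration, assuming base R's `binom.test` and `poisson.test` (neither is part of the StatsToDo code), with illustrative counts and sample sizes: conditional on the total count k1 + k2, the count k1 is binomial with probability n1/(n1 + n2) under the null hypothesis that the rate ratio is 1.

```r
# Illustrative data (assumed for this sketch)
n1 <- 210; k1 <- 20   # sample size and count, group 1
n2 <- 215; k2 <- 50   # sample size and count, group 2

# Under H0 (rate ratio = 1), k1 given k1 + k2 is Binomial(k1 + k2, n1 / (n1 + n2))
p0 <- n1 / (n1 + n2)
c_test <- binom.test(k1, k1 + k2, p = p0, alternative = "less")

# Base R's two-sample poisson.test makes the same conditional (binomial) comparison
p_test <- poisson.test(c(k1, k2), c(n1, n2), r = 1, alternative = "less")

# One tail p values from the two routes
c(binomial = c_test$p.value, poisson = p_test$p.value)
```

The two p values should agree, because the two-sample `poisson.test` is itself a conditional binomial comparison. The E Test has no equivalent in base R, which is why the longer routines translated from the Fortran program are needed.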
**Comparing the 3 tests:** In most cases, the results from the 3 tests on the same set of data are approximately the same. There is only a need to choose when the sample size is small or when the difference between the two groups is minor, so that the statistical significance is ambiguous.
- The advantage of the Whitehead algorithm is the speed of computation, as there is no need for the repeated estimation of factorial numbers that is necessary in the other two tests. Users may prefer this test if the sample sizes are large or there are numerous calculations to be done.
- The C Test has been used the longest, and is quoted by most text books.
- The E Test is the most robust, least likely to result in a false statistical significance, and so is the safest one to use. With large numbers, however, it is also the most computationally intensive and time consuming.
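The factorial terms mentioned above are usually evaluated on the log scale, because factorials of counts above 170 exceed the range of double precision numbers. A minimal sketch, assuming base R's `lgamma` and `dpois`; the helper name `pois_prob` is hypothetical and chosen only for this illustration:

```r
# Poisson probability P(X = k) computed on the log scale.
# lgamma(k + 1) is log(k!), so no factorial is ever formed explicitly.
pois_prob <- function(k, lambda) {
  exp(-lambda + k * log(lambda) - lgamma(k + 1))
}

k <- c(0, 5, 50, 200)   # illustrative counts
lambda <- 40            # illustrative Poisson mean

# Side by side with R's built in Poisson density
cbind(k, log_scale = pois_prob(k, lambda), built_in = dpois(k, lambda))
```

The C and E Tests sum many such probabilities over a grid of possible counts, which is where their extra computing time comes from.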
As the 3 methods of comparison have different powers and precisions, they require different methods of estimating sample size.
- The Poisson test comparing two counts, initially described by Przyborowski and Wilenski and known as the Conditional Test (the C Test), is based on the Poisson distribution, with the null hypothesis that the ratio of the two count rates (λ2 / λ1) is equal to 1. It is the longest established test, but it is less powerful than the E Test and so requires a slightly larger sample size.
- Krishnamoorthy and Thomson (see references) proposed an improvement on the C Test (the E Test), where the null hypothesis is that the difference between the two count rates (λ2 - λ1) is equal to 0. Although this test is more complex, the advantages are that it is both more robust (less likely to commit the Type I Error) and more powerful. The sample size required is similar to, but slightly smaller than, that of the C Test.
- Whitehead, in his text book on unpaired sequential analysis, provided algorithms to determine sample sizes for non-sequential methods, including a method for comparing two counts. This method is based on transforming the counts to a normally distributed value, on the assumption that with large numbers the Poisson distribution approximates the normal. The advantage of the Whitehead algorithm is the speed of computation, as there is no need for the prolonged iterative estimation of probabilities required for the C and E Tests.
Please note: In StatsToDo, estimating sample size requirements for comparing two counts uses the one tail model. For a two tail model, do the same calculation using half the α value. For example, the sample size for α=0.05 in a two tail model is the same as that for α=0.025 in the one tail model, everything else being the same.
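The halving rule can be illustrated with a short sketch. It assumes the Whitehead formula as coded in the R section of this page; the function name `wh_total` and the input values are chosen only for this example.

```r
# Total sample size (both groups) by the Whitehead formula, one tail alpha.
# Restated here so the example is self-contained; wh_total is a hypothetical name.
wh_total <- function(alpha_one_tail, power, lambda_1, lambda_2, ratio = 1) {
  tr <- abs(log(lambda_1 / lambda_2))                 # log rate ratio
  lb <- (ratio * lambda_1 + lambda_2) / (ratio + 1)   # pooled rate
  v  <- ((qnorm(alpha_one_tail) + qnorm(1 - power)) / tr)^2
  round(v * (ratio + 1)^2 / (ratio * lb))
}

alpha_two_tail <- 0.05
n_one_tail <- wh_total(0.05, 0.8, 0.1, 0.6)                 # one tail, alpha = 0.05
n_two_tail <- wh_total(alpha_two_tail / 2, 0.8, 0.1, 0.6)   # two tail 0.05 = one tail 0.025

c(one_tail = n_one_tail, two_tail = n_two_tail)
```

The two tail requirement is the larger of the two, because the halved α moves the critical value further from zero.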
## Different Algorithms
The plot shows the relationship between the sample sizes calculated from the 3 algorithms. The x axis represents the sample size calculated by Whitehead's algorithm, and the y axis the percentage difference between the sample sizes calculated from the other two algorithms and that from Whitehead's algorithm.

It can be seen that the sample size for the C Test (in blue) is slightly greater than that for the E Test (in red), as the E Test is more powerful and so requires a smaller sample size. The sample sizes for both the C and E Tests are greater than that from Whitehead's algorithm, but the difference (in terms of %) decreases as sample size increases. Where the sample size according to Whitehead's algorithm is less than 50, the sample size for the C and E Tests can be double that figure.

The differences between the sample sizes from the 3 algorithms are therefore trivial when the sample size is more than 100, but become increasingly relevant with smaller sample sizes. The table of sample sizes can therefore be consulted, and the sample size required can be derived from numbers in the table. If the sample size is over 100, it is probably usable. If the initial sample size is estimated to be less than 100, a more precise sample size should be calculated using the Javascript program.

## References
Przyborowski J and Wilenski H (1940) Homogeneity of results in testing samples from Poisson series. Biometrika 31:313-323.

Krishnamoorthy K and Thomson J (2004) A more powerful test for comparing two Poisson means. Journal of Statistical Planning and Inference 119:249-267. Program adapted from the FORTRAN program by Krishnamoorthy and Thomson, downloaded from https://userweb.ucs.louisiana.edu/~kxk4695/statcalc/POIS2POW.FOR

Whitehead J (1992) The Design and Analysis of Sequential Clinical Trials (Revised 2nd Edition). John Wiley & Sons Ltd., Chichester. ISBN 0 47197550 8. p. 48-50.
This section provides a series of tables presenting commonly used sample sizes for comparing two count rates. The tables are for a Type I Error (α) of 0.05 for 1 tail and a power of 0.8, and assume that the two groups have similar sample sizes. The calculations are from Whitehead, C Test, and E Test, as referenced.

Although the tables present only a limited range of λs, sample sizes for other values can be extrapolated from them because, for the same ratio of the two λs, the sample size is approximately inversely proportional to the λs (exactly so in Whitehead's formula, before rounding).

Sample size for comparing two count rates (λ1 and λ2) with Poisson distribution, tabulated for λ in the ranges 0.1-0.3, 0.4-0.6, 0.7-0.9, and 1.0-1.2. Each table uses the following settings:
- Probability of Type I Error (α, 1 tail) = 0.05
- Power (1-β) = 0.8
- Assuming equal sample size in the two groups (Ratio n2/n1=1)
- Sample size according to Whitehead's formula (WH), for C Test (C), and E Test (E)
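The extrapolation rule can be made explicit for Whitehead's formula. The expression below is read off the R function `SSizWhitehead` on this page rather than quoted from Whitehead's text; the symbols r (sample size ratio), the pooled rate, and the scaling factor c are notation introduced for this note.

$$
n_{total} = \left(\frac{z_{\alpha} + z_{\beta}}{\ln(\lambda_1/\lambda_2)}\right)^{2} \frac{(r+1)^{2}}{r\,\bar{\lambda}},
\qquad
\bar{\lambda} = \frac{r\lambda_1 + \lambda_2}{r+1}
$$

Multiplying both rates by a factor c leaves the log ratio unchanged and multiplies the pooled rate by c, so

$$
n_{total}(c\lambda_1, c\lambda_2) = \frac{n_{total}(\lambda_1, \lambda_2)}{c}
$$

For example, with α = 0.05 (1 tail), power 0.8 and r = 1, the rates λ1 = 0.1 and λ2 = 0.6 give a total of 22 (11 per group), and doubling both rates to 0.2 and 1.2 gives a total of 11. For the C and E Tests the relationship is only approximate, as their sample sizes come from exact probability sums.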
Please note that computation in the Javascript program may take a long time if the sample size is large. On average, a sample size of less than 100 takes about 30 seconds. Time increases steeply with sample size, so that a sample size of 200 may take 120 seconds, and further increases may take hours.

Please note that some browsers have time limits, and when that limit is reached the browser asks the user whether to continue or not. Although long programs can be run, the user must attend and repeatedly tell the browser to continue. The limits are as follows.
- Internet Explorer - 5 million statements
- Firefox - 10 secs
- Safari - 5 secs
- Chrome - no time limit
- Opera - no time limit

Please note that the calculations are for one tail studies.

R Codes

This panel provides algorithms in R for power and sample size estimates for comparing two counts, based on the Poisson distribution.
The program is divided into two major sections: a short and easy section for the Whitehead algorithm, and the longer and more convoluted algorithms for the C and E Tests. More details concerning the three tests are in the Explanations panel.
## Section 1: Whitehead Algorithm
Firstly the subroutines. Please note that these subroutines are also needed for the C and E Test algorithms in Section 2.
```r
# SSiz2Counts.R

# Function SSizWhitehead estimates sample size requirements using the Whitehead algorithm
# alpha = probability of Type I Error (α). Commonly used value is 0.05
# power = 1 - β, where β is the probability of Type II Error. Commonly used value is 0.8
# lambda_1 and lambda_2 are the two average counts (k1/n1 and k2/n2)
# ratio is the ratio of the sample sizes of the two groups (n1 / n2)
# The function returns the total sample size (n1 + n2)
SSizWhitehead <- function(alpha, power, lambda_1, lambda_2, ratio)   # estimate sample size
{
  tr = abs(-log(lambda_1 / lambda_2))
  lb = (ratio * lambda_1 + lambda_2) / (ratio + 1)
  za = qnorm(alpha)       # 1 tail
  zb = qnorm(1 - power)
  v = ((za + zb) / tr)^2
  return (round(v / ratio / lb * (ratio + 1) * (ratio + 1)))   # total sample size (2 groups)
}

# Function PowerWhitehead estimates the power of the data presented
# alpha = probability of Type I Error (α) used to determine significance. Commonly used value is 0.05
# k1 k2 are the counts in the two groups
# n1 n2 are the sample sizes in the two groups
# The function returns power (1 - β)
PowerWhitehead <- function(alpha, n1, k1, n2, k2)   # estimate power
{
  ssiz = n1 + n2
  r = n1 / n2
  lambda_1 = k1 / n1
  lambda_2 = k2 / n2
  lP = 0.00001   # low power
  hP = 0.99999   # high power
  mP = 0.5       # middle power
  oldss = 0
  ss = SSizWhitehead(alpha, mP, lambda_1, lambda_2, r)
  # bisect on power until the sample size for that power matches the actual sample size
  while(oldss != ss & ss != ssiz)
  {
    oldss = ss
    if(ss < ssiz){ lP = mP } else { hP = mP }
    mP = (lP + hP) / 2.0
    ss = SSizWhitehead(alpha, mP, lambda_1, lambda_2, r)
  }
  return (mP)
}
```
Program 1a: Sample size for comparing two groups, Whitehead Algorithm
```r
# Input data
alpha = 0.05
power = 0.8
lambda_1 = 0.1   # k1/n1
lambda_2 = 0.6   # k2/n2
ratio = 1        # n1/n2
# Program
ssizTotal = SSizWhitehead(alpha, power, lambda_1, lambda_2, ratio)   # Sample size for the 2 groups
ssizWH1 = ceiling(ssizTotal/(1 + ratio))
ssizWH2 = ceiling(ssizTotal/(1 + 1/ratio))
c(ssizWH1,ssizWH2)   # sample size of the two groups
```
The results are
```
> c(ssizWH1,ssizWH2) # sample size of the two groups
[1] 11 11
```
Program 1b: Power estimates for two counts, Whitehead algorithm
```r
# Input data
alpha = 0.05
n1 = 210
k1 = 20
n2 = 215
k2 = 50
# Calculations and results
pw = PowerWhitehead(alpha,n1,k1,n2,k2)
pw   # power
```
The result is
```
> pw # power
[1] 0.9816798
```
## Section 2 Sample Size and Power for C and E Tests
Firstly the basic program, as subroutine functions to estimate the power of two sets of counts. They were copied from the Fortran program https://userweb.ucs.louisiana.edu/~kxk4695/statcalc/POIS2POW.FOR, modified into Javascript for the web based program on this page, and further modified here into R code. These subroutines are used later in the main C and E programs.
```r
# Subroutines for power for C and E Tests

# Global variables, updated by the functions below using <<-
pvalue1 = 0   # p value, E Test
power1 = 0    # power, E Test
pvalue2 = 0   # p value, C Test
power2 = 0    # power, C Test

# Functions
#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// Logarithmic gamma function = alng(x), x > 0
#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
alng1 <- function(x)   # translated from Fortran/Javascript
{
  b = c(8.33333333333333e-2, 3.33333333333333e-2, 2.52380952380952e-1, 5.25606469002695e-1,
        1.01152306812684, 1.51747364915329, 2.26948897420496, 3.00991738325940)
  if(x<8.0) {
    xx = x + 8.0
    indx = TRUE
  } else {
    indx = FALSE
    xx = x
  }
  fterm = (xx - 0.5) * log(xx) - xx + 9.1893853320467e-1
  sum = b[1] / (xx + b[2] / (xx + b[3] / (xx + b[4] / (xx + b[5] / (xx + b[6] /(xx + b[7] / (xx + b[8])))))))
  rv = sum + fterm
  if(indx) rv = rv - log(x + 7) - log(x + 6) - log(x + 5) - log(x + 4) - log(x + 3) - log(x + 2) - log(x + 1) - log(x)
  return (rv)
}
alng2 <- function(x)   # R function
{
  lgamma(x)
}
# alng1 and alng2 produce the same results and are both presented for users to choose
alng <- function(x)    # alng2 has been chosen for this presentation
{
  #alng1(x)   # translated Fortran code
  alng2(x)    # R function
}

#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// This program computes P(X <= x), where X is a beta random variable
#// with a = alpha and b = beta
#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
betadf <- function(x,a,b)
{
  eps = 1e-14
  if(x>=1.0) return (1)
  if(x<=0.0) return (0)
  if(x==0.5 & a==b) return (0.5)
  betf = alng(a) + alng(b) - alng(a + b)
  aplusb = a + b
  omx = 1.0 - x
  if (a<aplusb*x) {
    y = omx
    omx = x
    p = b
    q = a
    check = TRUE
  } else {
    y = x
    p = a
    q = b
    check = FALSE
  }
  ensteps = q + omx*aplusb
  xovomx = y/omx
  i = 1
  term = 1.0
  ai = 1.0
  ans = 1.0
  etermq = q - ai
  if(ensteps==0.0) xovomx = y
  term = term*etermq*xovomx/(p+ai)
  ans = ans + term
  while(!(abs(term)<=eps & abs(term)<=eps*ans | i>1000))
  {
    ai = ai + 1.0
    ensteps = ensteps-1.0
    i = i + 1
    if(ensteps>=0.0) {
      etermq = q - ai
      if(ensteps==0.0) xovomx = y
    } else {
      etermq = aplusb
      aplusb = aplusb + 1.0
    }
    term = term*etermq*xovomx/(p+ai)
    ans = ans + term
  }
  ans = ans * exp(p * log(y) + (q-1.0) * log(omx) - betf) / p
  if(check) ans = 1.0 - ans
  return (ans)
}

#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// This program computes the P(X = k), where X is a Poisson random
#// variable with mean # of defective items = el, and observed # of
#// defective items = k
#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
poipr <- function(k,el)
{
  return (exp(-el + k * log(el) - alng(k+1)))
}

#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// Here, we carry out the sum over i2 to compute the power of the E-test
#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
sumi2 <- function(iside,n1,n2,elhat2,t_k1k2,i1,pi1,i2mode,pi2mode,d)
{
  # sum upwards from the mode
  pi2 = pi2mode
  i2 = i2mode
  while(i2<=1000 & pi2>=1e-7)
  {
    elhati1 = 1.0*i1/n1
    elhati2 = 1.0*i2/n2
    diffi = elhati1 - elhati2 - d
    varr = elhati1/n1 + elhati2/n2
    if (iside==1 & t_k1k2>0.0 & diffi<0.0) break
    if(iside==1) {
      if(1.0*i1/n1 - 1.0*i2/n2 <= d){ t_i1i2 = 0 } else { t_i1i2 = diffi/sqrt(varr) }
      if(t_i1i2>=t_k1k2) pvalue1 <<- pvalue1 + pi1*pi2
    } else if(iside==2) {
      if(abs(1.0*i1/n1 - 1.0*i2/n2)<=d){ t_i1i2 = 0.0 } else { t_i1i2 = diffi/sqrt(varr) }
      if(abs(t_i1i2)>=abs(t_k1k2)) pvalue1 <<- pvalue1 + pi1*pi2
    }
    pi2 = elhat2*pi2/(i2+1.0)
    i2 = i2 + 1
  }
  # sum downwards from one below the mode
  pi2 = pi2mode
  pi2 = i2mode*pi2/elhat2
  i2 = i2mode-1
  while(i2>=0)
  {
    if(pi2<1e-07) return()
    elhati1 = 1.0*i1/n1
    elhati2 = 1.0*i2/n2
    diffi = elhati1 - elhati2 - d
    varr = elhati1/n1 + elhati2/n2
    if(iside==1) {
      if(1.0*i1/n1 - 1.0*i2/n2<=d){ t_i1i2 = 0.0 } else { t_i1i2 = diffi/sqrt(varr) }
      if(t_i1i2>=t_k1k2) pvalue1 <<- pvalue1 + pi1*pi2
    } else if(iside==2) {
      if(abs(1.0*i1/n1 - 1.0*i2/n2)<=d){ t_i1i2 = 0.0 } else { t_i1i2 = diffi/sqrt(varr) }
      if(abs(t_i1i2)>=abs(t_k1k2)) pvalue1 <<- pvalue1 + pi1*pi2
    }
    pi2 = i2*pi2/elhat2
    i2 = i2 - 1
  }
}

#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// Here, we carry out the sum over i1 to compute the power of the E-test
#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
sumi1 <- function(iside,n1,n2,elhatk,t_k1k2,d,alpha)
{
  pvalue1 <<- 0
  elhat1 = n1 * (elhatk + d)
  elhat2 = n2 * elhatk
  i1mode = floor(elhat1)
  i2mode = floor(elhat2)
  pi1mode = poipr(i1mode, elhat1)
  pi1 = pi1mode
  pi2mode = poipr(i2mode, elhat2)
  i1 = i1mode
  while(i1<=1000 & pi1>=1e-7)
  {
    sumi2(iside,n1,n2,elhat2,t_k1k2,i1,pi1,i2mode,pi2mode,d)
    if(pvalue1 > alpha) return()
    pi1 = elhat1 * pi1 / (i1 + 1.0)
    i1 = i1 + 1
  }
  pi1 = pi1mode
  pi1 = i1mode * pi1 / elhat1
  i1 = i1mode
  while(i1>=0)
  {
    if(pi1<1e-7) return()
    sumi2(iside,n1,n2,elhat2,t_k1k2,i1,pi1,i2mode,pi2mode,d)
    if(pvalue1>alpha) return()
    pi1 = i1 * pi1 / elhat1
    i1 = i1 - 1
  }
}

# minimum of two scalar values (masks base::min within this script)
min <- function(x1,x2)
{
  if(x1<x2) return (x1)
  return (x2)
}

#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// Here, we carry out the sum over k2 to compute the power of the C-test
#// and the E-test
#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
sumk2 <- function(iside,n1,k1,pk1,n2,el2n2,k2mode,pk2mode,d,ratio,alpha)
{
  pv_ls = 0
  pv_rs = 0
  pk2 = pk2mode
  sprob = ratio / (ratio + 1.0)
  # sum upwards from the mode
  k2 = k2mode
  while(k2<=1000 & pk2>=1e-7)
  {
    elhatk1 = k1 * 1.0 / n1
    elhatk2 = k2 * 1.0 / n2
    elhatk = (k1 + k2) * 1.0 / (n1 + n2) - d * n1 / (n1 + n2)
    if(iside==1) {
      if((k1 * 1.0 / n1 - k2 * 1.0 / n2)<=d) {
        pvalue1 <<- 1
      } else {
        varr = elhatk1 / n1 + elhatk2 / n2
        t_k1k2 = (elhatk1 - elhatk2 - d) / sqrt(varr)
        sumi1(iside, n1, n2, elhatk, t_k1k2, d, alpha)
      }
      if(k1==0){ pvalue2 <<- 1.0 } else { pvalue2 <<- betadf(sprob,k1 * 1,k2 + 1.0) }
      if(pvalue1<=alpha) power1 <<- power1 + pk1 * pk2
      if(pvalue2<=alpha) power2 <<- power2 + pk1 * pk2
    } else if(iside==2) {
      if(abs(k1 * 1.0 / n1 - k2 * 1.0 / n2)<=d) {
        pvalue1 <<- 1.0
      } else {
        varr = elhatk1 / n1 + elhatk2 / n2
        t_k1k2 = (elhatk1 - elhatk2 - d) / sqrt(varr)
        sumi1(iside,n1,n2,elhatk,t_k1k2,d,alpha)
      }
      if(k1==0 & k2==0) {
        pv_rs = 1.0
        pv_ls = 1.0
      } else if(k1!=0 & k2==0) {
        pv_ls = 1.0
        pv_rs = betadf(sprob, k1 * 1.0, k2 + 1.0)
      } else if(k1==0 & k2!=0) {
        pv_rs = 1.0
        pv_ls = betadf(sprob, k2 * 1.0, k1 + 1.0)
      } else if(k1!=0 & k2!=0) {
        pv_rs = betadf(sprob, k1 * 1.0, k2 + 1.0)
        pv_ls = betadf(sprob, k2 * 1.0, k1 + 1.0)
      }
      pvalue2 <<- min(pv_rs,pv_ls)
      if(pvalue1<=alpha) power1 <<- power1 + pk1 * pk2
      if(pvalue2<=alpha/2) power2 <<- power2 + pk1 * pk2
    }
    pk2 = el2n2 * pk2 / (k2 + 1.0)
    k2 = k2 + 1
  }
  # sum downwards from one below the mode
  pk2 = pk2mode
  pk2 = k2mode * pk2 / el2n2
  k2 = k2mode-1
  while(k2>=0)
  {
    if(pk2<1e-7) return()
    elhatk1 = k1 * 1.0 / n1
    elhatk2 = k2 * 1.0 / n2
    elhatk = (k1 + k2) * 1.0 / (n1 + n2) - d * n1 / (n1 + n2)
    if(iside==1) {
      if((k1 * 1.0 / n1 - k2 * 1.0 / n2) <= d) {
        pvalue1 <<- 1.0
      } else {
        varr = elhatk1 / n1 + elhatk2 / n2
        t_k1k2 = (elhatk1 - elhatk2 - d) / sqrt(varr)
        sumi1(iside,n1,n2,elhatk,t_k1k2,d,alpha)
      }
      if(k1==0){ pvalue2 <<- 1.0 } else { pvalue2 <<- betadf(sprob, k1 * 1.0, k2 + 1.0) }
      if(pvalue1<=alpha) power1 <<- power1 + pk1 * pk2
      if(pvalue2<=alpha) power2 <<- power2 + pk1 * pk2
    } else if(iside==2) {
      if(abs(k1 * 1.0 / n1 - k2 * 1.0 / n2)<=d) {
        pvalue1 <<- 1.0
      } else {
        varr = elhatk1 / n1 + elhatk2 / n2
        t_k1k2 = (elhatk1 - elhatk2 - d) / sqrt(varr)
        sumi1(iside,n1,n2,elhatk,t_k1k2,d,alpha)
      }
      if(k1==0 && k2==0) {
        pv_rs = 1.0
        pv_ls = 1.0
      } else if(k1!=0 & k2==0) {
        pv_ls = 1.0
        pv_rs = betadf(sprob, k1 * 1.0, k2 + 1.0)
      } else if(k1==0 & k2!=0) {
        pv_rs = 1.0
        pv_ls = betadf(sprob, k2 * 1.0, k1 + 1.0)
      } else if(k1!=0 & k2!=0) {
        pv_rs = betadf(sprob, k1 * 1.0, k2 + 1.0)
        pv_ls = betadf(sprob, k2 * 1.0, k1 + 1.0)
      }
      pvalue2 <<- min(pv_rs,pv_ls)
      if(pvalue1<=alpha) power1 <<- power1 + pk1 * pk2
      if(pvalue2<=alpha / 2) power2 <<- power2 + pk1 * pk2
    }
    pk2 = k2 * pk2 / el2n2
    k2 = k2 - 1
  }
}

#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#// Program for computing the power of the E-test and C-test
#// In the first subroutine, the sum over k1 is carried out
#// ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
poistest <- function(iside,n1,el1,n2,el2,d,ratio,alpha)
{
  el1n1 = el1 * n1
  el2n2 = el2 * n2
  k1mode = floor(el1n1)
  k2mode = floor(el2n2)
  pk1mode = poipr(k1mode, el1n1)
  pk1 = pk1mode
  pk2mode = poipr(k2mode, el2n2)
  power1 <<- 0.0
  power2 <<- 0.0
  # sum upwards from the mode
  k1 = k1mode
  while (k1<=1000 & pk1>=1e-7)
  {
    sumk2(iside,n1,k1,pk1,n2,el2n2,k2mode,pk2mode,d,ratio,alpha)
    pk1 = el1n1 * pk1 / (k1 + 1.0)
    k1 = k1 + 1
  }
  # sum downwards from one below the mode
  pk1 = pk1mode
  pk1 = k1mode * pk1 / el1n1
  k1 = k1mode - 1
  while(k1>=0)
  {
    if(pk1<1e-7) return()
    sumk2(iside,n1,k1,pk1,n2,el2n2,k2mode,pk2mode,d,ratio,alpha)
    pk1 = k1 * pk1 / el1n1
    k1 = k1 - 1
  }
}
# End of subroutine algorithms from Fortran program

# FindPowers finds the power of the data
#// power1 = power of the unconditional (E) test
#// power2 = power of the conditional (C) test (Przyborowski and Wilenski 1940)
#// cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
# alpha = probability of Type I Error (α)
# el1 el2 are the two averaged counts (lambdas)
# n1 n2 are the two sample sizes
# d = the difference to test against; 0 is used on this page because the test is against the null (0) hypothesis
# iside is the tail, 1 for one tail, 2 for two tail. One tail is used throughout this presentation
FindPowers <- function(alpha,el1,el2,n1,n2,d,iside)
{
  if(el2>el1)
  {
    # swap so that el1 is the larger rate (ratio is used as a temporary variable)
    ratio = el1
    el1 = el2
    el2 = ratio
  }
  ratio = n1 * 1.0 / n2 + n1 * d / (n2 * el2)
  poistest(iside,n1,el1,n2,el2,d,ratio,alpha)
}
# End of common subroutines for C and E Test
```
Section 2a. Power estimate. The power routine mainly calculates power for the E Test and produces that for the C Test as an addition, so the same program can be used to estimate power for both the C and E Tests, as follows.
```r
# Input data
alpha = 0.05   # Probability of Type I Error
n1 = 210       # sample size group 1
k1 = 20        # count group 1
n2 = 215       # sample size group 2
k2 = 50        # count group 2
# Calculations
FindPowers(alpha,k1/n1,k2/n2,n1,n2,0,1)   # General power function for C and E Test, one tail
power1   # Power E Test
power2   # Power C Test
```
The results are
```
> power1 # Power E Test
[1] 0.9721986
> power2 # Power C Test
[1] 0.9650461
```
Section 2b: Sample size for C and E Tests.
The following procedures are used.
- An initial sample size based on the Whitehead algorithm is calculated
- From this, the possible range of sample size is set. In the R programs on this page the range runs from 1 less than the Whitehead estimate (but not less than 2) to twice the Whitehead estimate
- The sample size is iteratively tested for power, until the closest power to that required is obtained
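The search in the last step can be sketched independently of the C and E Test routines. The example below is a simplified illustration, not the page's own search routine: `power_at` is a hypothetical stand-in (a one tail normal approximation with an assumed effect size of 0.5), and `smallest_n` is a name chosen for this sketch.

```r
# Hypothetical stand-in for a power calculation: one tail normal approximation
# with an assumed standardised effect size of 0.5 per observation
power_at <- function(n, alpha = 0.05, effect = 0.5) {
  pnorm(sqrt(n) * effect + qnorm(alpha))
}

# Smallest integer n in [lo, hi] whose power reaches the target, by bisection.
# Relies on power_at() increasing with n.
smallest_n <- function(target, lo, hi) {
  if (power_at(lo) >= target) return(lo)
  if (power_at(hi) < target) return(NA)    # target power is not bracketed
  while (hi - lo > 1) {
    mid <- (lo + hi) %/% 2                 # integer midpoint
    if (power_at(mid) >= target) hi <- mid else lo <- mid
  }
  hi   # power_at(hi) >= target and power_at(hi - 1) < target
}

n_req <- smallest_n(0.8, 2, 200)
c(n = n_req, power = power_at(n_req))
```

Each pass halves the search range, so the number of power evaluations grows with the logarithm of the range. This matters for the C and E Tests, where each power evaluation is itself a long summation.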
```r
# Subroutine to test the power of a set of data
# alpha = probability of Type I Error
# iside = tails, 1 for 1 tail, 2 for 2 tail. 1 is used throughout this presentation
# L1 L2 are the 2 lambdas, the averaged counts of the two groups
# Ld is the difference to be tested. 0 is used throughout this presentation
# ra is the ratio of the two sample sizes, applied here as n2 = ceiling(n1 * ra)
# test is 1 for the C Test and 2 for the E Test
# n is the sample size for group 1
TestPower <- function(alpha,iside,L1,L2,Ld,ra,test,n)
{
  n1 = n
  n2 = ceiling(n1 * ra)
  FindPowers(alpha,L1,L2,n1,n2,Ld,iside)
  if(test==1) return (power2)   # C Test
  return (power1)               # E Test
}

# Subroutine to find the sample size
#// iside is tails 1 or 2
#// nL and nR are the initial sample size range for the search
#// test is 1 for C Test and 2 for E Test
#// ra = ratio of the two sample sizes, as used in TestPower
#// Ld = difference under the null hypothesis, default 0
#// returns the sample size in group 1, or 0 if the required power is not bracketed by nL and nR
#// sample size group 2 = ratio * sample size group 1
FindSSiz <- function(alpha,power,iside,L1,L2,Ld,ra,nL,nR,test)
{
  pL = TestPower(alpha,iside,L1,L2,Ld,ra,test,nL)-power
  pR = TestPower(alpha,iside,L1,L2,Ld,ra,test,nR)-power
  if(pL*pR>0) return (0)
  nM = round((nL+nR)/2)
  pM = TestPower(alpha,iside,L1,L2,Ld,ra,test,nM)-power
  if(pL*pM>0){ nL = nM } else { nR = nM }
  i = 0
  while(i<20 & abs(pM)>=0.01 & abs(nL-nR)>1)
  {
    nM = round((nL+nR)/2)
    pM = TestPower(alpha,iside,L1,L2,Ld,ra,test,nM)-power
    pL = TestPower(alpha,iside,L1,L2,Ld,ra,test,nL)-power
    if(pL*pM>0){ nL = nM } else { nR = nM }
    i = i + 1
  }
  if(pM<0) nM = nM + 1
  return (nM)
}
```
Finally, the 2 sample size programs for the C Test and the E Test
```r
# Pgm 2c_1: Sample Size C Test
alpha = 0.05
power = 0.8
lambda_1 = 0.1   # k1/n1
lambda_2 = 0.6   # k2/n2
ratio = 1        # n1/n2
# initial estimate using Whitehead's algorithm
ssizTotal = SSizWhitehead(alpha, power, lambda_1, lambda_2, ratio)
ssizWH1 = ceiling(ssizTotal/(1 + ratio))
ssizWH2 = ceiling(ssizTotal/(1 + 1/ratio))
# set range for iterative search (dn to up)
dn = ssizWH1 - 1    # lower limit for iteration, 1 less than Whitehead sample size
if(dn<2) dn = 2     # make sure it is not less than 2
up = ssizWH1 * 2    # upper limit for iteration, twice that of Whitehead sample size
# Main Pgm 2c_1: sample size C Test
ssiz1 = FindSSiz(alpha,power,1,lambda_1,lambda_2,0,ratio,dn,up,1)   # sample size C Test group 1, one tail
ssiz2 = ceiling(ssiz1 * ratio)                                      # sample size C Test group 2, one tail
ssizTotal = ssiz1 + ssiz2
c(ssiz1, ssiz2, ssizTotal)   # Sample Size C Test Results
```
The results are
```
> c(ssiz1, ssiz2, ssizTotal) # Sample Size C Test Results
[1] 19 19 38
```
Sample Size for E Test
```r
# Pgm 2c_2: Sample Size E Test
alpha = 0.05
power = 0.8
lambda_1 = 0.1   # k1/n1
lambda_2 = 0.6   # k2/n2
ratio = 1        # n1/n2
# initial estimate using Whitehead's algorithm
ssizTotal = SSizWhitehead(alpha, power, lambda_1, lambda_2, ratio)
ssizWH1 = ceiling(ssizTotal/(1 + ratio))
ssizWH2 = ceiling(ssizTotal/(1 + 1/ratio))
# set range for iterative search (dn to up)
dn = ssizWH1 - 1    # lower limit for iteration, 1 less than Whitehead sample size
if(dn<2) dn = 2     # make sure it is not less than 2
up = ssizWH1 * 2    # upper limit for iteration, twice that of Whitehead sample size
# Main Pgm 2c_2: sample size E Test
ssiz1 = FindSSiz(alpha,power,1,lambda_1,lambda_2,0,ratio,dn,up,2)   # sample size E Test group 1, one tail
ssiz2 = ceiling(ssiz1 * ratio)                                      # sample size E Test group 2, one tail
ssizTotal = ssiz1 + ssiz2
c(ssiz1, ssiz2, ssizTotal)   # Sample Size E Test Results
```
The results are
```
> c(ssiz1, ssiz2, ssizTotal) # Sample Size E Test Results
[1] 17 17 34
```
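The three per group results for this example (11 by Whitehead, 19 for the C Test, 17 for the E Test) can be put on the percentage scale used when comparing the algorithms. A small sketch, assuming the percentage difference is taken relative to the Whitehead estimate; `pct_diff` is a name chosen for this illustration.

```r
# Per group sample sizes reported on this page for lambda_1 = 0.1, lambda_2 = 0.6
n_wh <- 11   # Whitehead
n_c  <- 19   # C Test
n_e  <- 17   # E Test

# Percentage difference relative to a reference sample size (assumed definition)
pct_diff <- function(n, ref) 100 * (n - ref) / ref

pct <- round(c(C = pct_diff(n_c, n_wh), E = pct_diff(n_e, n_wh)), 1)
pct   # C = 72.7, E = 54.5
```

With sample sizes this small the Whitehead estimate falls well short of both exact methods, which is why the C or E Test programs are preferred when the initial estimate is below 100.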