SSiz Paired Diff

Copyright @2020.
All Rights Reserved.

StatsToDo: Sample Size for Paired Differences


Explanations and References
This page presents 4 programs related to sample size requirements for estimating the mean of paired differences. They are available as Javascript programs for immediate use on the web page and as R code, together with a table of sample sizes.
The programs and tables on this page assume that the data used and the paired differences are continuous measurements that are normally distributed. For non-parametric ordinal measurements, such as those analysed with the Wilcoxon Signed Rank Test, the power efficiency is 95.5% of that of the parametric paired t test. This means that, when calculating sample size, the β (and so the power) used should be adjusted as follows
  - β for sample size estimation for Wilcoxon Signed Rank Test = β for paired t test * 0.955
  - For a power of 80% (0.8), β for paired t test = 0.2, β for Wilcoxon Test = 0.2 * 0.955 = 0.191, or power of 0.809 (80.9%)
  - For a power of 90% (0.9), β for paired t test = 0.1, β for Wilcoxon Test = 0.1 * 0.955 = 0.0955, or power of 0.9045 (90.5%)
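
The adjustment above is simple arithmetic and can be sketched in R as follows. The variable names (efficiency, power_t, beta_w and so on) are chosen for this illustration only and are not part of the programs on this page.

```r
# Power efficiency of the Wilcoxon Signed Rank Test relative to the
# paired t test, as quoted in the text above
efficiency <- 0.955

# Power levels that would be used for a paired t test
power_t <- c(0.8, 0.9)
beta_t  <- 1 - power_t

# Adjusted beta and power to enter when planning for the Wilcoxon test
beta_w  <- beta_t * efficiency
power_w <- 1 - beta_w

data.frame(power_t, beta_t, beta_w, power_w)
```

The adjusted power (0.809 or 0.9045) is then entered in place of 0.8 or 0.9 in the sample size program.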

The following 4 programs are available on this page

Sample Size requires the following input

  - The probability of Type I Error (p, α) that will be used to determine significance. This is usually 0.05, but 0.1, 0.01, and 0.001 are also used
  - The power (1 - β) required. Usually this is 0.8, and sometimes 0.9
  - The expected mean and Standard Deviation of the paired differences
The result is the sample size required, as the number of pairs. Sample sizes for the 1 and 2 tail models are produced.
For example, if α of 0.05 and power of 0.8 are required, and the mean difference (0.5) is half the Standard Deviation of the paired differences (1.0), then the sample size required is 27 pairs for a 1 tail study, or 34 pairs for a 2 tail study.
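
The worked example can be traced through the formula that the R code for Program 1 (further down this page) evaluates. Written out, it reads as below. The notation is this sketch's own: z_α is the normal deviate for α (for α/2 in the 2 tail model), z_β is that for β, es is the effect size, and n is rounded up to the next whole pair.

```latex
n = \left\lceil \left( \frac{z_{\alpha} + z_{\beta}}{es} \right)^{2} + \frac{z_{\alpha}^{2}}{2} \right\rceil ,
\qquad es = \frac{\text{mean of paired differences}}{\text{SD of paired differences}}

\text{1 tail } (z_{\alpha} = 1.6449,\ z_{\beta} = 0.8416,\ es = 0.5): \quad
\left( \frac{1.6449 + 0.8416}{0.5} \right)^{2} + \frac{1.6449^{2}}{2} \approx 24.73 + 1.35 = 26.08 \;\Rightarrow\; n = 27

\text{2 tail } (z_{\alpha} = 1.9600,\ z_{\beta} = 0.8416,\ es = 0.5): \quad
\left( \frac{1.9600 + 0.8416}{0.5} \right)^{2} + \frac{1.9600^{2}}{2} \approx 31.40 + 1.92 = 33.32 \;\Rightarrow\; n = 34
```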
Power requires the following input

  - The probability of Type I Error (p, α) that will be used to determine significance. This is usually 0.05, but 0.1, 0.01, and 0.001 are also used
  - The sample size (number of pairs) used in the study
  - The mean difference and Standard Deviation of the differences observed in the data
The results are the power of the comparison for the 1 and 2 tail models.
For example, if α of 0.05 is to be used to determine statistical significance, the sample size is 34 pairs, the mean of the paired differences is 0.5, and the Standard Deviation is 1, then the power is 0.89 (1 tail) or 0.81 (2 tail)
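
The same example can be sketched as a small R function. It uses the normal approximation found in the R code for Program 2 on this page; the function name paired_power and its argument names are hypothetical, chosen for this illustration.

```r
# Power for a paired comparison, using the normal approximation
#   z_beta = es * sqrt(n - z_alpha^2 / 2) - z_alpha
paired_power <- function(alpha, n, mean_diff, sd_diff, tails = 2) {
  es <- abs(mean_diff / sd_diff)       # effect size
  za <- qnorm(1 - alpha / tails)       # z for alpha (alpha / 2 when 2 tail)
  zb <- es * sqrt(n - za^2 / 2) - za   # z for beta
  pnorm(zb)                            # power = 1 - beta
}

power_1tail <- paired_power(0.05, 34, 0.5, 1, tails = 1)
power_2tail <- paired_power(0.05, 34, 0.5, 1, tails = 2)
round(c(power_1tail, power_2tail), 2)  # 0.89 0.81
```

A single tails argument is used here so that one function covers both models; the page's own program computes the two models in separate steps.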
Confidence Interval (CI) requires the following input

  - The percentage representing the level of confidence required. This is usually 95%, but sometimes 99% is used
  - The sample size (number of pairs) and Standard Deviation of the paired differences found in the data
The results are the confidence intervals of the difference for the 1 and 2 tail models, each expressed as the distance between the mean and the limit of the confidence interval
For example, if the sample size is 16 pairs, and the Standard Deviation = 6, then the 95% confidence intervals are

For the 1 tail model, 2.63. The actual 95% confidence interval is either > mean - 2.63 or < mean + 2.63
For the two tail model, 3.2. The actual 95% confidence interval is mean ± 3.2
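
The two figures in this example can be reproduced with a few lines of R, using the t distribution in the same way as the ConfIntv subroutine further down this page. The variable names are chosen for this illustration.

```r
n    <- 16   # number of pairs
sd   <- 6    # Standard Deviation of the paired differences
conf <- 95   # % confidence

se    <- sd / sqrt(n)     # Standard Error of the mean difference
alpha <- 1 - conf / 100   # convert % confidence into alpha

# Distance between the mean and the limit of the confidence interval
ci_1tail <- qt(1 - alpha, df = n - 1) * se
ci_2tail <- qt(1 - alpha / 2, df = n - 1) * se
round(c(ci_1tail, ci_2tail), 2)   # 2.63 3.20
```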

Pilot Studies requires the following parameters

  - The % confidence, usually 95% or 99%, which is then converted to α of 0.05 or 0.01
  - The expected Standard Deviation of the paired differences, 1 in this example
  - The interval of sample size (intv) used to examine changes in the confidence interval as sample size increases. Usually this is between 3 and 10 pairs
  - The maximum sample size (pairs) for the estimates. In most cases, pilot studies end in 30 to 40 pairs, and there is no point having a pilot study with more than 100 pairs. A common value used is 50
The program produces a table as shown below, listing for each sample size the width of the confidence interval (1 and 2 tail), the decrease from the previous row (Diff), the decrease per additional pair (Dec/case), and that decrease as a percentage of the previous width (%Dec/case). With increasing sample size, the reduction in confidence interval decreases, and it can be seen that beyond 35 pairs, the further decrease in confidence interval from one row to the next is less than 5% of the Standard Deviation, and can be considered trivial. A conclusion can therefore be made that a sample size of 35 pairs would be suitable for a pilot study, to define the expected mean and Standard Deviation of the paired differences, and provide information on the research environment, so that a formal comparison can be planned.

     |               1 Tail                 |               2 Tail                 |
Ssiz |      CI    Diff  Dec/case  %Dec/case |      CI    Diff  Dec/case  %Dec/case |
5    |  1.9068       -         -          - |  2.4833       -         -          - |
10   |  1.1594  0.7474    0.1495        8.0 |  1.4307  1.0526    0.2105        8.0 |
15   |  0.9095  0.2498    0.0500        4.0 |  1.1076  0.3231    0.0646        5.0 |
20   |  0.7733  0.1362    0.0272        3.0 |  0.9360  0.1715    0.0343        3.0 |
25   |  0.6844  0.0889    0.0178        2.0 |  0.8256  0.1105    0.0221        2.0 |
30   |  0.6204  0.0639    0.0128        2.0 |  0.7468  0.0788    0.0158        2.0 |
35   |  0.5716  0.0488    0.0098        2.0 |  0.6870  0.0598    0.0120        2.0 |
40   |  0.5328  0.0388    0.0078        1.0 |  0.6396  0.0474    0.0095        1.0 |
45   |  0.5009  0.0319    0.0064        1.0 |  0.6009  0.0388    0.0078        1.0 |
50   |  0.4742  0.0267    0.0053        1.0 |  0.5684  0.0325    0.0065        1.0 |
55   |  0.4513  0.0229    0.0046        1.0 |  0.5407  0.0277    0.0055        1.0 |
60   |  0.4315  0.0199    0.0040        1.0 |  0.5167  0.0240    0.0048        1.0 |
65   |  0.4140  0.0174    0.0035        1.0 |  0.4956  0.0211    0.0042        1.0 |
70   |  0.3985  0.0155    0.0031        1.0 |  0.4769  0.0187    0.0037        1.0 |
75   |  0.3847  0.0139    0.0028        1.0 |  0.4602  0.0167    0.0033        1.0 |
80   |  0.3722  0.0125    0.0025        1.0 |  0.4451  0.0151    0.0030        1.0 |
85   |  0.3608  0.0114    0.0023        1.0 |  0.4314  0.0137    0.0027        1.0 |
90   |  0.3504  0.0104    0.0021        1.0 |  0.4189  0.0125    0.0025        1.0 |
95   |  0.3409  0.0095    0.0019        1.0 |  0.4074  0.0115    0.0023        1.0 |
100  |  0.3321  0.0088    0.0018        1.0 |  0.3968  0.0106    0.0021        1.0 |
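
The reasoning that leads to 35 pairs can be sketched in R as follows. This is a minimal illustration, not the page's program: it assumes the planning values of the example (95% confidence, Standard Deviation of 1, interval of 5 pairs, maximum of 100 pairs), uses the 2 tail model only, and the 5% threshold and the variable names are this sketch's own reading of the paragraph above.

```r
# Assumed planning values, matching the example table
conf  <- 95    # % confidence
sd    <- 1     # expected Standard Deviation of the paired differences
intv  <- 5     # interval between candidate sample sizes (pairs)
max_n <- 100   # largest sample size considered

n     <- seq(intv, max_n, by = intv)
alpha <- 1 - conf / 100

# Full width of the 2 tail confidence interval at each sample size
width <- 2 * qt(1 - alpha / 2, df = n - 1) * sd / sqrt(n)

# Narrowing gained by adding a further intv pairs to each sample size
gain <- -diff(width)

# Smallest sample size beyond which a further intv pairs narrows the
# interval by less than 5% of the Standard Deviation
pilot_n <- n[which(gain < 0.05 * sd)[1]]
pilot_n   # 35
```

Changing intv or the threshold changes the answer, so the figure of 35 pairs applies to these planning values rather than to pilot studies in general.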

References
Machin D, Campbell M, Fayers P, Pinol A (1997) Sample Size Tables for Clinical Studies. Second Ed. Blackwell Science. ISBN 0-86542-870-0. p. 73-74
Johanson GA and Brooks GP (2010) Initial Scale Development: Sample Size for Pilot Studies. Educational and Psychological Measurement Vol. 70, Iss. 3; p. 394-400
Sample Size Table
Abbreviations
α = alpha, p, Probability of Type I Error
Power = 1-β, 1-Probability of Type II Error
es = Effect Size = Mean of paired difference to be detected / Expected Standard Deviation of the paired difference
Value in cells = sample size = number of pairs
bold : commonly used sample sizes for small, moderate, and large effect size

A dash marks a cell with no entry.

Power |                  0.8                    |                  0.9                    |                  0.95                   |
α     |       0.1      0.05      0.01     0.001 |       0.1      0.05      0.01     0.001 |       0.1      0.05      0.01     0.001 |
Tails |    1    2    1    2    1    2    1    2 |    1    2    1    2    1    2    1    2 |    1    2    1    2    1    2    1    2 |
es    |-----------------------------------------|-----------------------------------------|-----------------------------------------|
0.05  | 1804 2475 2475 3142 4018 4675 6189 6836 | 2629 3427 3427 4205 5210 5955 7650 8367 | 3427 4331 4331 5200 6311 7129 8974 9749 |
0.1   |  452  620  620  787 1007 1172 1551 1713 |  658  858  858 1053 1305 1492 1917 2096 |  858 1084 1084 1302 1580 1785 2247 2442 |
0.15  |  202  277  277  351  449  523  692  765 |  293  382  382  469  582  665  855  935 |  382  483  483  580  704  796 1002 1088 |
0.2   |  114  156  156  199  254  296  392  433 |  166  216  216  265  329  376  483  529 |  215  272  272  327  397  449  566  615 |
0.25  |   73  101  101  128  164  191  253  279 |  106  139  139  171  211  242  311  340 |  138  175  175  210  256  289  364  396 |
0.3   |   51   71   71   90  115  134  177  196 |   74   97   97  119  148  169  218  238 |   96  122  122  147  178  202  254  277 |
0.35  |   38   52   52   66   85   99  131  145 |   55   72   72   88  109  125  161  177 |   71   90   90  108  132  149  188  205 |
0.4   |   29   40   40   51   66   77  102  113 |   42   55   55   68   85   97  125  137 |   55   69   69   84  102  115  145  158 |
0.45  |   24   32   32   41   53   61   82   90 |   34   44   44   54   67   77  100  109 |   44   55   55   67   81   92  116  126 |
0.5   |   19   27   27   34   43   51   67   74 |   28   36   36   44   55   63   82   90 |   36   45   45   54   66   75   95  103 |
0.55  |   16   22   22   28   36   42   56   62 |   23   30   30   37   46   53   68   75 |   30   38   38   45   55   63   79   86 |
0.6   |   14   19   19   24   31   36   48   53 |   20   26   26   32   39   45   58   64 |   25   32   32   39   47   53   68   74 |
0.65  |   12   16   16   21   27   31   42   46 |   17   22   22   27   34   39   51   55 |   22   27   27   33   41   46   58   64 |
0.7   |   11   14   14   18   24   28   37   41 |   15   19   19   24   30   34   44   49 |   19   24   24   29   35   40   51   56 |
0.75  |    9   13   13   16   21   25   33   36 |   13   17   17   21   26   30   39   43 |   17   21   21   26   31   35   45   49 |
0.8   |    8   12   12   15   19   22   29   33 |   12   15   15   19   24   27   35   39 |   15   19   19   23   28   32   40   44 |
0.85  |    8   10   10   13   17   20   27   30 |   10   14   14   17   21   24   32   35 |   13   17   17   20   25   28   36   40 |
0.9   |    7    9    9   12   16   18   24   27 |    9   12   12   15   19   22   29   32 |   12   15   15   18   23   26   33   36 |
0.95  |    6    9    9   11   14   17   22   25 |    9   11   11   14   18   20   26   29 |   11   14   14   17   21   24   30   33 |
1     |    6    8    8   10   13   15   21   23 |    8   10   10   13   16   19   24   27 |   10   13   13   15   19   22   28   30 |
1.05  |    5    7    7   10   12   14   19   21 |    7   10   10   12   15   17   23   25 |    9   12   12   14   18   20   26   28 |
1.1   |    5    7    7    9   12   13   18   20 |    7    9    9   11   14   16   21   23 |    8   11   11   13   16   19   24   26 |
1.15  |    5    7    7    8   11   13   17   19 |    6    8    8   10   13   15   20   22 |    8   10   10   12   15   17   22   24 |
1.2   |    -    6    6    8   10   12   16   18 |    6    8    8   10   12   14   19   20 |    7    9    9   11   14   16   21   23 |
1.25  |    -    6    6    7   10   11   15   17 |    6    7    7    9   12   13   18   19 |    7    9    9   11   13   15   20   22 |
1.3   |    -    6    6    7    9   11   14   16 |    5    7    7    9   11   13   17   18 |    6    8    8   10   13   14   19   20 |
1.35  |    -    5    5    7    9   10   14   15 |    5    7    7    8   10   12   16   17 |    6    8    8   10   12   14   18   19 |
1.4   |    -    5    5    6    8   10   13   15 |    5    6    6    8   10   11   15   17 |    6    7    7    9   11   13   17   18 |
1.45  |    -    5    5    6    8    9   13   14 |    -    6    6    7    9   11   14   16 |    5    7    7    9   11   12   16   17 |
1.5   |    -    5    5    6    8    9   12   14 |    -    6    6    7    9   10   14   15 |    5    7    7    8   10   12   15   17 |
1.55  |    -    -    -    6    7    9   12   13 |    -    5    5    7    9   10   13   15 |    5    6    6    8   10   11   15   16 |
1.6   |    -    -    -    5    7    8   11   13 |    -    5    5    7    8   10   13   14 |    5    6    6    7    9   11   14   15 |
1.65  |    -    -    -    5    7    8   11   12 |    -    5    5    6    8    9   12   14 |    -    6    6    7    9   10   14   15 |
1.7   |    -    -    -    5    7    8   11   12 |    -    5    5    6    8    9   12   13 |    -    6    6    7    9   10   13   14 |
1.75  |    -    -    -    5    6    8   10   11 |    -    5    5    6    7    9   12   13 |    -    5    5    7    8   10   13   14 |
1.8   |    -    -    -    5    6    7   10   11 |    -    -    -    6    7    8   11   12 |    -    5    5    6    8    9   12   13 |
1.85  |    -    -    -    5    6    7   10   11 |    -    -    -    5    7    8   11   12 |    -    5    5    6    8    9   12   13 |
1.9   |    -    -    -    5    6    7   10   11 |    -    -    -    5    7    8   11   12 |    -    5    5    6    8    9   11   13 |
1.95  |    -    -    -    -    6    7    9   10 |    -    -    -    5    7    8   10   11 |    -    5    5    6    7    9   11   12 |
2     |    -    -    -    -    6    7    9   10 |    -    -    -    5    6    8   10   11 |    -    5    5    6    7    8   11   12 |
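
Any row of the table can be regenerated from the sample size formula used in the R code for Program 1 on this page. The sketch below does so for es = 0.5 and power 0.8. The function name paired_n and the use of expand.grid to lay out the columns are assumptions of this illustration, not part of the page's programs.

```r
# Sample size (pairs) from the normal approximation used in Program 1
paired_n <- function(es, alpha, power, tails) {
  za <- qnorm(1 - alpha / tails)   # z for alpha (alpha / 2 when 2 tail)
  zb <- qnorm(power)               # z for beta
  ceiling(((za + zb) / es)^2 + za^2 / 2)
}

# Column layout of one power block: for each alpha, 1 tail then 2 tail
alphas <- c(0.1, 0.05, 0.01, 0.001)
grid   <- expand.grid(tails = 1:2, alpha = alphas)   # tails varies fastest

row_es05 <- paired_n(es = 0.5, alpha = grid$alpha, power = 0.8, tails = grid$tails)
row_es05   # 19 27 27 34 43 51 67 74
```

The 2 tail column for one α always equals the 1 tail column for half that α, which is why neighbouring values such as 27 and 27 repeat across the table.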

Javascript Programs

Data Input for Sample Size Estimation is a table of 4 columns
  - Each row contains data from a separate study
  - Col 1 = probability of Type I error (α)
  - Col 2 = power (1-β)
  - Col 3 = mean of paired differences
  - Col 4 = Standard Deviation of paired differences

Data Input for Power Estimation is a table of 4 columns
  - Each row contains data from a separate study
  - Col 1 = probability of Type I error (α)
  - Col 2 = sample size (Pairs) used
  - Col 3 = mean of Paired Differences observed
  - Col 4 = Standard Deviation of paired differences

Data Input for Confidence Interval Estimation is a table of 3 columns
  - Each row contains data from a separate study
  - Col 1 = percent confidence (usually 95)
  - Col 2 = sample size used (pairs)
  - Col 3 = standard deviation of paired differences observed

Data Input for Pilot Study is for a single plan, a single column with 4 rows
  - Row 1 : Percent Confidence required, usually 95 or 99
  - Row 2 : Standard Deviation of paired differences
  - Row 3 : Sample Size Interval
  - Row 4 : Maximum sample size


R Codes
This panel presents R code related to sample size for paired differences
Program 1: Sample Size
Alpha = Probability of Type I Error α
Power = 1 - β
Diff = mean of paired differences
SD = Standard Deviation of paired differences
    # Pgm 1: Sample Size
    # data entry
    dat = ("
    Alpha Power Diff SD
    0.05 0.8 0.5 1.0
    0.01 0.8 0.5 1.0
    0.05 0.9 0.5 1.0
    0.01 0.9 0.5 1.0
    ")
    df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame
    # vectors to store results
    SSiz1Tail <- vector()
    SSiz2Tail <- vector()
    # Calculations
    delta <- abs(df$Diff / df$SD) # effect size
    zb <- abs(qnorm(1 - df$Power)) # z for beta
    # 1 tail
    za <- abs(qnorm(df$Alpha)) # 1 tail z for alpha
    f <- (za + zb) / delta
    SSiz1Tail <- append(SSiz1Tail,ceiling(f**2 + za**2 / 2.0))
    # 2 tail
    za <- abs(qnorm(df$Alpha / 2)) # 2 tail z for alpha
    f <- (za + zb) / delta
    SSiz2Tail <- append(SSiz2Tail,ceiling(f**2 + za**2 / 2.0))
    # append results to data frame
    df$SSiz1Tail <- SSiz1Tail
    df$SSiz2Tail <- SSiz2Tail
    df # show data frame with input data and results

The results are as follows. Sample size is the number of pairs

    > df # show data frame with input data and results
      Alpha Power Diff SD SSiz1Tail SSiz2Tail
    1  0.05   0.8  0.5  1        27        34
    2  0.01   0.8  0.5  1        43        51
    3  0.05   0.9  0.5  1        36        44
    4  0.01   0.9  0.5  1        55        63

Program 2: Power (1 - β)
Alpha = probability of Type I Error α
N = sample size (number of pairs)
MEAN = mean of paired differences
SD = Standard Deviation of paired differences

    # Pgm 2: Power
    # data entry
    dat = ("
    Alpha N MEAN SD
    0.05 34 0.5 1.0
    0.01 51 0.5 1.0
    0.05 44 0.5 1.0
    0.01 63 0.5 1.0
    ")
    df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame
    # vectors to store results
    Power1Tail <- vector()
    Power2Tail <- vector()
    # Calculations
    delta <- abs(df$MEAN / df$SD) # effect size
    # 1 tail
    ZA <- abs(qnorm(df$Alpha)) # 1 tail z for alpha
    ZB <- delta * sqrt(df$N - ZA**2 / 2) - ZA # z for beta
    Power1Tail <- append(Power1Tail,pnorm(ZB))
    # 2 tail
    ZA <- abs(qnorm(df$Alpha / 2)) # 2 tail z for alpha
    ZB <- delta * sqrt(df$N - ZA**2 / 2) - ZA # z for beta
    Power2Tail <- append(Power2Tail,pnorm(ZB))
    # append results to data frame
    df$Power1Tail <- Power1Tail
    df$Power2Tail <- Power2Tail
    df # show data frame with input data and results

The results are as follows

      Alpha  N MEAN SD Power1Tail Power2Tail
    1  0.05 34  0.5  1  0.8872503  0.8083861
    2  0.01 51  0.5  1  0.8745876  0.8097019
    3  0.05 44  0.5  1  0.9474256  0.9003350
    4  0.01 63  0.5  1  0.9401596  0.9009345

Program 3: Confidence Interval
Firstly, the subroutine to calculate confidence intervals, which is used by both this and the pilot study programs

    # subroutine to calculate confidence interval
    # pc = % confidence, ssiz = number of pairs, sd = Standard Deviation of paired differences
    ConfIntv <- function(pc,ssiz,sd)
    {
      se = sd / sqrt(ssiz) # Standard Error
      alpha = (1 - pc / 100) # convert % confidence into α
      ci1 = qt(1 - alpha, ssiz - 1) * se # 1 tail
      ci2 = qt(1 - alpha / 2, ssiz - 1) * se # 2 tail
      return (c(ci1, ci2)) # returns 1 and 2 tail CI
    }

Main program for confidence interval. Please note: the confidence interval here is the distance between the mean and the limit of the interval. The full 2 tail confidence interval is mean ± CI, so its width is twice the value shown here

    # data entry: PC = % confidence, N = sample size in number of pairs, SD = Standard Deviation of paired differences
    dat = ("
    PC N SD
    95 16 6.0
    99 16 6.0
    95 25 1.0
    99 25 1.0
    ")
    df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame
    # vectors for results
    CI1 <- vector() # Confidence interval 1 tail
    CI2 <- vector() # Confidence interval 2 tail
    # calculations, one row of input at a time
    for(i in 1 : nrow(df))
    {
      ar <- ConfIntv(df$PC[i],df$N[i],df$SD[i])
      CI1 <- append(CI1, ar[1]) # Confidence interval 1 tail
      CI2 <- append(CI2, ar[2]) # Confidence interval 2 tail
    }
    # combine results with input data
    df$CI1 <- CI1
    df$CI2 <- CI2
    df # display input data and results

The results are as follows

    > df # display input data and results
      PC  N SD       CI1       CI2
    1 95 16  6 2.6295755 3.1971743
    2 99 16  6 3.9037204 4.4200693
    3 95 25  1 0.3421764 0.4127797
    4 99 25  1 0.4984319 0.5593879

Program 4: Pilot Study

    # Program 4. Pilot study (requires the ConfIntv subroutine from the Confidence Interval program)
    # Parameters
    pc = 95 # % confidence
    sd = 1.0 # within group or population SD
    intv = 5 # interval
    maxN = 100 # maximum sample size
    # vectors for results
    SSiz <- vector() # sample size
    CI1 <- vector() # confidence interval 1 tail
    Diff1 <- vector() # difference in CI from previous row 1 tail
    DecCase1 <- vector() # decrease in CI per case increase 1 tail
    PDCase1 <- vector() # % decrease in CI per case increase 1 tail
    CI2 <- vector() # confidence interval 2 tail
    Diff2 <- vector() # difference in CI from previous row 2 tail
    DecCase2 <- vector() # decrease in CI per case increase 2 tail
    PDCase2 <- vector() # % decrease in CI per case increase 2 tail
    # Calculations
    # first row: no previous row, so the differences are entered as 0
    n = intv
    SSiz <- append(SSiz,n)
    ar <- ConfIntv(pc, n, sd)
    ci1 = ar[1] * 2 # twice the distance returned by ConfIntv
    CI1 <- append(CI1,sprintf(ci1, fmt="%#.4f"))
    Diff1 <- append(Diff1,0)
    DecCase1 <- append(DecCase1,0)
    PDCase1 <- append(PDCase1,0)
    ci2 = ar[2] * 2
    CI2 <- append(CI2,sprintf(ci2, fmt="%#.4f"))
    Diff2 <- append(Diff2,0)
    DecCase2 <- append(DecCase2,0)
    PDCase2 <- append(PDCase2,0)
    # subsequent rows
    while(n < maxN)
    {
      n = n + intv
      SSiz <- append(SSiz,n)
      ar <- ConfIntv(pc, n, sd)
      # 1 tail
      oldci1 = ci1
      ci1 = ar[1] * 2
      CI1 <- append(CI1,sprintf(ci1, fmt="%#.4f"))
      diff1 = oldci1 - ci1 # difference in CI from previous row
      Diff1 <- append(Diff1,sprintf(diff1, fmt="%#.4f"))
      decCase1 = diff1 / intv # decrease in CI per case increase
      DecCase1 <- append(DecCase1,sprintf(decCase1, fmt="%#.4f"))
      pDCase1 = sprintf(decCase1 / oldci1 * 100, fmt="%#.1f") # % decrease in CI per case increase
      PDCase1 <- append(PDCase1,pDCase1)
      # 2 tail
      oldci2 = ci2
      ci2 = ar[2] * 2
      CI2 <- append(CI2,sprintf(ci2, fmt="%#.4f"))
      diff2 = oldci2 - ci2
      Diff2 <- append(Diff2,sprintf(diff2, fmt="%#.4f"))
      decCase2 = diff2 / intv
      DecCase2 <- append(DecCase2,sprintf(decCase2, fmt="%#.4f"))
      pDCase2 = sprintf(decCase2 / oldci2 * 100, fmt="%#.1f")
      PDCase2 <- append(PDCase2,pDCase2)
    }
    # combine all results into data frame for display
    df <- data.frame(SSiz,CI1,Diff1,DecCase1,PDCase1,CI2,Diff2,DecCase2,PDCase2)
    df # display results in data frame

The resulting pilot study table is as follows

    > df # display results in data frame
       SSiz    CI1  Diff1 DecCase1 PDCase1    CI2  Diff2 DecCase2 PDCase2
     1    5 1.9068      0        0       0 2.4833      0        0       0
     2   10 1.1594 0.7474   0.1495     7.8 1.4307 1.0526   0.2105     8.5
     3   15 0.9095 0.2498   0.0500     4.3 1.1076 0.3232   0.0646     4.5
     4   20 0.7733 0.1362   0.0272     3.0 0.9360 0.1715   0.0343     3.1
     5   25 0.6844 0.0889   0.0178     2.3 0.8256 0.1105   0.0221     2.4
     6   30 0.6204 0.0639   0.0128     1.9 0.7468 0.0787   0.0157     1.9
     7   35 0.5716 0.0488   0.0098     1.6 0.6870 0.0598   0.0120     1.6
     8   40 0.5328 0.0388   0.0078     1.4 0.6396 0.0474   0.0095     1.4
     9   45 0.5009 0.0319   0.0064     1.2 0.6009 0.0388   0.0078     1.2
    10   50 0.4742 0.0267   0.0053     1.1 0.5684 0.0325   0.0065     1.1
    11   55 0.4513 0.0229   0.0046     1.0 0.5407 0.0277   0.0055     1.0
    12   60 0.4315 0.0199   0.0040     0.9 0.5167 0.0240   0.0048     0.9
    13   65 0.4140 0.0174   0.0035     0.8 0.4956 0.0211   0.0042     0.8
    14   70 0.3985 0.0155   0.0031     0.7 0.4769 0.0187   0.0037     0.8
    15   75 0.3847 0.0139   0.0028     0.7 0.4602 0.0167   0.0033     0.7
    16   80 0.3722 0.0125   0.0025     0.7 0.4451 0.0151   0.0030     0.7
    17   85 0.3608 0.0114   0.0023     0.6 0.4314 0.0137   0.0027     0.6
    18   90 0.3504 0.0104   0.0021     0.6 0.4189 0.0125   0.0025     0.6
    19   95 0.3409 0.0095   0.0019     0.5 0.4074 0.0115   0.0023     0.5
    20  100 0.3321 0.0088   0.0018     0.5 0.3968 0.0106   0.0021     0.5