Copyright @2020.
All Rights Reserved.

StatsToDo: Paired Difference in Measurements

Explanations and References
This page provides 3 statistical tests for paired differences in scalar measurements. These are the Paired t Test, the Wilcoxon Paired Signed Rank Test, and the Permutation Test for paired differences.
Paired t Test

Twin 1   Twin 2   Difference
3163     3124       39
3245     2807      438
3391     3014      377
2547     2727     -180
3042     3254     -212
3200     3826     -626
3115     2596      519
3294     2952      342
3019     3279     -260
3222     3325     -103
2831     2984     -153
3043     2765      278
2646     3109     -463
3327     2757      570
3182     3781     -599
2984     3061      -77
2878     3658     -780
2770     3665     -895
3092     2739      353
2735     3324     -589

The following example is used to demonstrate the Paired t Test for parametric data. Please note: the data is computer generated and does not represent any research information.
We purport to conduct a study testing whether birth order results in differences in the birth weights of twins. We obtained the birth weights (grams) from 20 sets of twins, shown in the first two columns of the table above, with the differences (Twin 1 - Twin 2) shown in the third column. The calculations are as follows:
Number of Pairs = 20
Mean Difference = -101
Standard Deviation of Difference = 456
Standard Error of Mean = 102 (Standard Deviation / √20)
t Test : t = -0.9910 df = 19 p(α) = 0.17 (1 tail) and 0.33 (2 tail)
95% Confidence Interval of Paired Difference (1 tail) = >-277 or <75
95% Confidence Interval of Paired Difference (2 tail) = -314 to 112
Our conclusion is therefore that the birth weights of first and second born twins are not significantly different.
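As a cross-check, the same quantities can be obtained from base R's t.test with paired = TRUE. The following is a minimal sketch rather than the program used by this page: the vector names twin1 and twin2 are introduced here for illustration, and the standard error is assumed to be the standard deviation of the differences divided by the square root of the number of pairs.

```r
# Birth weights (grams) from the table above; vector names are illustrative
twin1 <- c(3163, 3245, 3391, 2547, 3042, 3200, 3115, 3294, 3019, 3222,
           2831, 3043, 2646, 3327, 3182, 2984, 2878, 2770, 3092, 2735)
twin2 <- c(3124, 2807, 3014, 2727, 3254, 3826, 2596, 2952, 3279, 3325,
           2984, 2765, 3109, 2757, 3781, 3061, 3658, 3665, 2739, 3324)

d     <- twin1 - twin2        # paired differences (Twin 1 - Twin 2)
n     <- length(d)            # number of pairs
se    <- sd(d) / sqrt(n)      # standard error of the mean difference
tStat <- mean(d) / se         # paired t statistic, df = n - 1

res <- t.test(twin1, twin2, paired = TRUE)   # built-in two tail test

c(mean(d), sd(d), se)         # mean, SD and SE of the differences
c(tStat, res$statistic)       # manual and built-in t should agree
res$p.value                   # two tail p
res$conf.int                  # two tail 95% confidence interval
```

The one tail p is half of the two tail value here because the t distribution is symmetrical.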
Wilcoxon Paired Signed Rank Test

Before   After   Diff
3        1        2
2        1        1
2        0        2
3        0        3
1        1        0
1        0        1
2        2        0
1        2       -1
1        1        0
3        1        2
1        2       -1
1        1        0
1        1        0
3        0        3
3        1        2
1        2       -1
1        1        0
1        3       -2
1        1        0
3        1        2
3        2        1
1        1        0

The following example demonstrates the Wilcoxon Paired Signed Rank Test for non-parametric data. Please note: the data is computer generated and does not represent any research information.
We purport to study whether a new analgesic is effective in relieving headaches. We ask the subjects to describe their headache on a 4 point scale (0 to 3) as none (0), some (1), moderate (2), or severe (3), before and after administering the analgesic, and use the paired differences to evaluate the analgesic.
We collected data from 22 subjects. The headache scores are shown in the first 2 columns, and the paired differences (Before - After) in column 3, of the table above.

Paired diff (absolute)   -   +
1                        3   3
2                        1   5
3                        0   2

The non-zero differences are ranked by absolute value and counted as in the table above, with the number of negative differences in the - column and the number of positive differences in the + column.
There were 8 subjects whose headache scores did not change (0), and these are not included in the table of counts.
On the negative side, there were:
  - 3 subjects that got worse (-1)
  - 1 subject that got a lot worse (-2)
On the positive side, there were:
  - 3 subjects that got better (1)
  - 5 subjects that got a lot better (2)
  - 2 subjects that got completely better (3)
The sum of negative ranks (T-) = 20, and the sum of positive ranks (T+) = 85.
As the number of non-zero differences is 14, fewer than 16, Siegel's Table H is used to determine statistical significance, which gives p<0.05. We can therefore conclude that headaches decreased significantly after receiving the analgesic.
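The counting and ranking steps above can be sketched in a few lines of R. This is an illustrative shortcut, not the program listed later on this page; the vector names before and after are introduced here, and tied absolute differences are assumed to share the average of the ranks they occupy.

```r
# Headache scores from the table above; vector names are illustrative
before <- c(3, 2, 2, 3, 1, 1, 2, 1, 1, 3, 1, 1, 1, 3, 3, 1, 1, 1, 1, 3, 3, 1)
after  <- c(1, 1, 0, 0, 1, 0, 2, 2, 1, 1, 2, 1, 1, 0, 1, 2, 1, 3, 1, 1, 2, 1)

d <- before - after            # paired differences (Before - After)
d <- d[d != 0]                 # unchanged pairs are excluded
r <- rank(abs(d))              # ranks of absolute differences, ties averaged

table(abs(d), sign(d))         # counts of - and + differences by size
tMinus <- sum(r[d < 0])        # T-, sum of ranks of negative differences
tPlus  <- sum(r[d > 0])        # T+, sum of ranks of positive differences
c(length(d), tMinus, tPlus)    # 14 20 85
```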
Permutation Test for paired differences

The general principle underlying the permutation of paired differences is that, in a randomly allocated study, each value obtained could have been in either of the paired measurements. In other words, each paired difference could equally have been + or - of that difference.
The test therefore consists of calculating every possible permutation of the paired differences and examining the results. If the result from the original data is near the extremes (e.g. below the 5th percentile or above the 95th percentile in a one tail model), then it is unlikely to have arisen under the null hypothesis and is therefore considered statistically significant.
The advantages of using the Permutation test are :
  - Exhaustive permutation gives the exact probability, under the null hypothesis, of a result at least as extreme as that observed, so the Type I Error (α) is calculated exactly rather than approximated.
  - The test does not depend on any assumption about the distribution of the data, so it can be used with any regular interval data (where 10-9 is the same as 4-3). The test can therefore be used on parametric measurements, ratios, variances, and time.
  - Because of the above two characteristics, the test can be used with a very small sample size.

The disadvantages of using the test are related to the computation resources (time and memory) required. The number of permutations is 2ⁿ, where n is the number of pairs. Computation time therefore increases exponentially with increasing sample size, and a large dataset may either crash the program when available RAM is exhausted, or make the computation unacceptably long.
In theory, the Permutation Test can cope with any number of pairs. However, α<0.05 cannot be reached with fewer than 6 pairs unless the differences are uniformly in one direction, and computation will take an unacceptably long time with 22 pairs or more.
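These limits follow from the 2ⁿ count, and can be tabulated with a short R sketch. The column names are introduced here for illustration. The smallest attainable probability assumes that all differences are non-zero and in one direction, and the memory figure applies only to an implementation that holds every permuted sum at once as an 8-byte number.

```r
n <- c(4, 5, 6, 10, 20, 22, 30)        # illustrative numbers of pairs

limits <- data.frame(
  pairs        = n,
  permutations = 2^n,                  # number of +/- patterns
  minP1tail    = 1 / 2^n,              # smallest possible 1 tail probability
  minP2tail    = 2 / 2^n,              # smallest possible 2 tail probability
  megabytes    = 8 * 2^n / 1024^2      # storage if all sums are kept in memory
)
limits
```

With 5 pairs the smallest 1 tail probability is 1/32 = 0.03125 but the smallest 2 tail probability is 0.0625, so a 2 tail α<0.05 first becomes possible with 6 pairs.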
The Permutation Test is therefore ideal for handling small sets of interval data with uncertain distributions. With larger sample sizes, the Wilcoxon Paired Signed Rank Test for non-parametric data or the Paired t Test for parametric data should be preferred.
The mathematical argument of the Permutation Test is as follows:
  - In a pair of measurements, the null hypothesis is that there is no difference between the pair. In other words, the values observed could be in either member of the pair.
  - The Permutation Test therefore consists of examining the sum of paired differences in all permutations where the values in each pair are in either group. The total number of permutations is therefore 2ⁿ, where n is the number of pairs.
  - The sum of differences in the original data is then compared with all possible outcomes, so that its position (and thus its probability) can be determined.
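The argument can be illustrated with a very small set of differences. The sketch below uses 4 hypothetical paired differences (not data from this page) and base R's expand.grid to list all 2⁴ = 16 sign patterns; the variable names are illustrative.

```r
d <- c(5, -2, 8, 3)                    # hypothetical paired differences
n <- length(d)

# every combination of + and - signs, one row per permutation (2^n rows)
signs <- as.matrix(expand.grid(rep(list(c(1, -1)), n)))

permSums <- as.vector(signs %*% abs(d))   # sum of differences for each pattern
obs      <- sum(d)                        # sum observed in the data (14)

sort(permSums)                         # all 16 possible sums
pUpper <- mean(permSums >= obs)        # proportion as large as or larger than observed
pUpper                                 # 2 / 16 = 0.125
```

Only two patterns (all positive, and the observed one) give a sum of 14 or more, so the 1 tail probability is 2/16.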

The following example uses the same data as that presented in the Paired t Test.

There are 20 pairs of birth weights, so the number of possible permutations is 2²⁰ = 1048576
The sum of paired differences in the input data = -2021
There are 173010 permuted sums that are less than that from the input data (<-2021)
There are 262 permuted sums that are the same as that from the input data (=-2021)
There are 875304 permuted sums that are more than that from the input data (>-2021)
Therefore, there are 262 + 173010 = 173272 permuted sums that are equal to or less than that from the input data (<=-2021)
Therefore, the probability of having a permuted sum equal to or less than that of the input data is 173272 / 1048576 = 0.1652
Putting this another way

The input data is at the 16.5th percentile of all possible permuted values
Statistical significance (α) = 0.1652 (1 tail) and 2 × 0.16525 = 0.3305 (2 tail)
We can conclude that the paired differences are not significantly different from null (0)

Comparison of the 3 methods
Applying the 3 tests to the same twin data provides the following results

Paired t Test: α = 0.1671 (1 tail)
Wilcoxon PSRT: α = 0.1753
Permutation Test: α = 0.1652 (1 tail)
Although the differences between the α values in this example are trivial, they are in the direction expected from the relatively greater power of the Permutation Test (lower value of α) and the lesser power of the Wilcoxon Test (higher value of α), compared with the Paired t Test.
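The Wilcoxon figure for the twin data is not derived elsewhere on this page. It appears to correspond to the large sample normal approximation used in Program 2 (no continuity correction), which is an assumption of the following sketch; the vector names are illustrative.

```r
# Birth weights (grams) from the Paired t Test example; names are illustrative
twin1 <- c(3163, 3245, 3391, 2547, 3042, 3200, 3115, 3294, 3019, 3222,
           2831, 3043, 2646, 3327, 3182, 2984, 2878, 2770, 3092, 2735)
twin2 <- c(3124, 2807, 3014, 2727, 3254, 3826, 2596, 2952, 3279, 3325,
           2984, 2765, 3109, 2757, 3781, 3061, 3658, 3665, 2739, 3324)

d <- twin1 - twin2
d <- d[d != 0]                         # drop zero differences (none in this data)
n <- length(d)
r <- rank(abs(d))                      # ranks of absolute differences

tPlus  <- sum(r[d > 0])                # T+ = 80
tMinus <- sum(r[d < 0])                # T- = 130
tMax   <- max(tPlus, tMinus)           # larger of the two sums, as in Program 2

# normal approximation for n > 15, without continuity correction
z <- (tMax - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
p <- 1 - pnorm(abs(z))                 # 1 tail probability
c(tPlus, tMinus, z, p)
```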
References
Paired t Test :
Armitage P (1971) Statistical Methods in Medical Research. Blackwell Scientific Publications, Oxford. p. 189-207
Wilcoxon Paired Signed Rank Test :
Siegel S and Castellan Jr. NJ (2000) Nonparametric Statistics for the Behavioral Sciences. Second Edition. McGraw Hill, Sydney. ISBN 0-07-100326-6 p. 95. Table H Critical values p. 332-334
Permutation Test :
Siegel S and Castellan Jr. NJ (2000) Nonparametric Statistics for the Behavioral Sciences. Second Edition. McGraw Hill, Sydney. ISBN 0-07-100326-6 p. 95-101
Javascript Program

The data is for a single study, a 2 column table of numbers.
  - Each row represents data from a pair
  - The two columns are paired values from that pair

R Codes

Program 1: Paired t test

# Paired t test
# data entry
dat = ("
V1 V2
3163 3124
3245 2807
3391 3014
2547 2727
3042 3254
3200 3826
3115 2596
3294 2952
3019 3279
3222 3325
2831 2984
3043 2765
2646 3109
3327 2757
3182 3781
2984 3061
2878 3658
2770 3665
3092 2739
2735 3324
")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame
df$Diff <- df$V1 - df$V2                            # paired differences
#df                   # optional display of input data and paired differences

# Calculation: paired t
n = nrow(df)                   # sample size (number of pairs)
degFm = n - 1                  # degrees of freedom
mDiff = mean(df$Diff)          # mean of paired differences
sdDiff = sd(df$Diff)           # Standard Deviation of paired differences
se = sdDiff / sqrt(n)          # Standard Error of mean (sd() already divides by n - 1)
tStat = mDiff / se             # paired t
p = 1 - pt(abs(tStat),degFm)   # probability of Type I Error α (1 tail)

# 95% confidence interval
tc1 = qt(1 - 0.05, df=degFm)   # critical t for 95% CI 1 tail
ll1 = mDiff - tc1 * se         # lower limit 1 tail
ul1 = mDiff + tc1 * se         # upper limit 1 tail
tc2 = qt(1 - 0.025, df=degFm)  # critical t for 95% CI 2 tail
ll2 = mDiff - tc2 * se         # lower limit 2 tail
ul2 = mDiff + tc2 * se         # upper limit 2 tail

# result output (rounded for display)
round(c(n, mDiff, sdDiff), 4)          # sample size, mean difference and SD difference
round(c(degFm, se, tStat, p, p*2), 4)  # deg freedom, Standard Error, t, p(α) 1 and 2 tail
round(c(ll1, ul1), 2)                  # 95% CI 1 tail
round(c(ll2, ul2), 2)                  # 95% CI 2 tail
The results are as follows
> # result output (rounded for display)
> round(c(n, mDiff, sdDiff), 4)          # sample size, mean difference and SD difference
[1]   20.0000 -101.0500  455.9936
> round(c(degFm, se, tStat, p, p*2), 4)  # deg freedom, Standard Error, t, p(α) 1 and 2 tail
[1]  19.0000 101.9633  -0.9910   0.1671   0.3341
> round(c(ll1, ul1), 2)                  # 95% CI 1 tail
[1] -277.36   75.26
> round(c(ll2, ul2), 2)                  # 95% CI 2 tail
[1] -314.46  112.36

Program 2: Wilcoxon Paired Signed Rank Test

# Data input
dat = ("
V1 V2
3 1
2 1
2 0
3 0
1 1
1 0
2 2
1 2
1 1
3 1
1 2
1 1
1 1
3 0
3 1
1 2
1 1
1 3
1 1
3 1
3 2
1 1
")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame
df$Diff <- df$V1 - df$V2                            # paired difference
df                     # optional display of input data and paired difference

# Make Wilcoxon table with non-equal cases
x <- vector()          # abs values of non-zero differences, ranked later
y <- vector()          # sign of each non-zero difference (-1 or +1)
for(i in 1:nrow(df))
{
  v = df$Diff[i]                  # v = diff
  if(v != 0)                      # non-zero v
  {
    x <- append(x, abs(v))        # abs(diff)
    y <- append(y, v / abs(v))    # -1 or +1
  }
}
n = length(x)          # number of non-zero differences
x <- rank(x)           # abs diff ranked (ties share the average rank)
#x
mx <- xtabs(~ x + y)   # table of counts: rows = ranks, columns = - and +

# calculate Wilcoxon T
ranks <- rownames(mx)  # ranks as value array
#ranks                 # optional display during debug
tp = 0                 # T+ sum of positive ranks
tm = 0                 # T- sum of negative ranks
for(i in 1 : nrow(mx))
{
  r = as.numeric(ranks[i])   # rank of ith row
  tm = tm + mx[i,1] * r      # T- sum of negative ranks
  tp = tp + mx[i,2] * r      # T+ sum of positive ranks
}
t = tp
if(tm>t) t = tm        # use the larger of the 2 sums for calculation

# significance testing
if(n>15)               # large sample size use z and its probability
{
  z = (t - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
  p = 1-pnorm(abs(z))
  res = sprintf("z=%.4f p=%.4f", z, p)
} else                 # small sample size use critical values
{
  # each triple = critical T for p<0.05, p<0.01, p<0.001 (0 = not attainable)
  Sig <- c(c(0,0,0),c(0,0,0),c(0,0,0),c(0,0,0),c(0,0,0),c(15,0,0),        # n = 0 to 5
           c(19,0,0),c(24,28,0),c(31,35,0),c(37,42,0),c(45,50,55),         # n = 6 to 10
           c(53,59,65),c(61,69,76),c(70,79,87),c(80,90,99),c(90,100,111))  # n = 11 to 15
  # Table H. Siegel p.332-334
  mxSig <- matrix(data = Sig, nrow=3,ncol=16)  # converted to matrix for reference
  nn = n + 1                                   # column for n non-zero differences
  if(n<5)
  {
    res = "p = n.s."
  } else if(mxSig[3,nn]>0 && t>=mxSig[3,nn])
  {
    res = "p = <0.001"
  } else if(mxSig[2,nn]>0 && t>=mxSig[2,nn])
  {
    res = "p = <0.01"
  } else if(mxSig[1,nn]>0 && t>=mxSig[1,nn])
  {
    res = "p = <0.05"
  } else
  {
    res = "p = n.s."
  }
}

# result output
mx                       # matrix of +/- ranks
c(nrow(df), n, tm, tp)   # sample size, pairs different, Sum Rank - and +
res                      # statistical significance
The results are as follows

  - Table of neg and pos ranks. x = absolute rank, y = - or +
  - Initial sample size (nrow), number of non-zero differences (n), sums of - (tm) and + (tp) ranks
  - Statistical significance (p, α)

> # result output
> mx                       # matrix of +/- ranks
      y
x      -1 1
  3.5   3 3
  9.5   1 5
  13.5  0 2
> c(nrow(df), n, tm, tp)   # sample size, pairs different, Sum Rank - and +
[1] 22 14 20 85
> res                      # statistical significance
[1] "p = <0.05"
>
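For a small sample the table lookup can be cross-checked by enumeration, in the spirit of the Permutation Test described earlier. The sketch below is not part of Program 2: it assumes that, under the null hypothesis, each observed rank (ties included) is equally likely to belong to a + or a - difference, and the vector names are illustrative.

```r
# Headache scores from the Wilcoxon example; vector names are illustrative
before <- c(3, 2, 2, 3, 1, 1, 2, 1, 1, 3, 1, 1, 1, 3, 3, 1, 1, 1, 1, 3, 3, 1)
after  <- c(1, 1, 0, 0, 1, 0, 2, 2, 1, 1, 2, 1, 1, 0, 1, 2, 1, 3, 1, 1, 2, 1)

d <- before - after
d <- d[d != 0]                       # 14 non-zero differences
r <- rank(abs(d))                    # ranks, ties averaged
tPlus <- sum(r[d > 0])               # observed T+ (85)

# each rank is either counted in T+ or not, giving 2^14 possible values of T+
sums <- 0
for (rk in r) sums <- c(sums + rk, sums)

pExact <- mean(sums >= tPlus)        # 1 tail probability of T+ this large or larger
c(length(sums), tPlus, pExact)
```

If the assumption holds, pExact should fall below 0.05, in agreement with the table lookup.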

Program 3: Permutation of paired differences
Section 1: declare global variables
# Global variables
rows <- 0         # number of rows (pairs)
sumDiff <- 0      # sum of paired differences from input data
arV <- vector()   # array of abs(paired diff) from all pairs
nLess <- 0        # number of permuted sums less than sumDiff
nSame <- 0        # number of permuted sums the same as sumDiff
nMore <- 0        # number of permuted sums more than sumDiff
Section 2: subroutine for recursive calculation of paired differences
All variables except col, sum, sumPlus and sumMinus are globally assigned and read
Recurse <- function(col,sum)   # col = pair number, sum = running sum of paired diff
{
  sumPlus = sum + arV[col]     # sum with +ve paired diff
  sumMinus = sum - arV[col]    # sum with -ve paired diff
  if(col==rows)                # reaching last row
  {
    if(abs(sumPlus-sumDiff)<1e-10)   # sum == sumDiff (allowing minor differences by binary math)
    {
      assign("nSame", nSame+1, envir = .GlobalEnv)   # add to nSame
    } else if(sumPlus<sumDiff)
    {
      assign("nLess", nLess+1, envir = .GlobalEnv)   # add to nLess
    } else
    {
      assign("nMore", nMore+1, envir = .GlobalEnv)   # add to nMore
    }
    if(abs(sumMinus-sumDiff)<1e-10)
    {
      assign("nSame", nSame+1, envir = .GlobalEnv)   # add to nSame
    } else if(sumMinus<sumDiff)
    {
      assign("nLess", nLess+1, envir = .GlobalEnv)   # add to nLess
    } else
    {
      assign("nMore", nMore+1, envir = .GlobalEnv)   # add to nMore
    }
    return()
  }
  arCount <- Recurse(col+1,sumPlus)    # recursive call if not end row (return value unused)
  arCount <- Recurse(col+1,sumMinus)   # recursive call if not end row (return value unused)
  return()
}
Main program
# input data
dat = ("
V1 V2
3163 3124
3245 2807
3391 3014
2547 2727
3042 3254
3200 3826
3115 2596
3294 2952
3019 3279
3222 3325
2831 2984
3043 2765
2646 3109
3327 2757
3182 3781
2984 3061
2878 3658
2770 3665
3092 2739
2735 3324
")
df <- read.table(textConnection(dat),header=TRUE)  # conversion to data frame
rows <- nrow(df)               # number of pairs
df$Diff <- df$V1 - df$V2       # paired differences
#df                            # optional check to debug
sumDiff <- sum(df$Diff)        # sumDiff, sum of paired diff from input data
arV <- abs(df$Diff)            # vector of abs(paired differences) for use in permutation
col = 1                        # start permutation with first row
sum = 0                        # start permutation with sum of diff = 0
arCount <- Recurse(col, sum)   # permutation calculations subroutine

# Assemble results
nTot = nLess + nSame + nMore   # total number of possible permutations
pLess = nLess / nTot           # proportion with smaller sums
pSame = nSame / nTot           # proportion with the same sum
pMore = nMore / nTot           # proportion with larger sums
# probability of obtaining reference sum or more extreme
p = min(pLess,pMore) + pSame   # significance (probability, α)

# result output
c(rows, sumDiff)               # number of pairs and sum of paired diff from input data
c(nTot, nLess, nSame, nMore)   # result numbers
c(pLess, pSame, pMore)         # result proportions
p                              # probability of sum(x) or more extreme (p, α)
The results are as follows
> # result output
> c(rows, sumDiff)               # number of pairs and sum of paired diff from input data
[1]    20 -2021
> c(nTot, nLess, nSame, nMore)   # result numbers
[1] 1048576  173010     262  875304
> c(pLess, pSame, pMore)         # result proportions
[1] 0.1649951935 0.0002498627 0.8347549438
> p                              # probability of sum(x) or more extreme (p, α)
[1] 0.1652451
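The recursion in Program 3 can also be written without global variables by building the vector of permuted sums one pair at a time. The following is an alternative sketch, not the program used by this page: it trades memory for simplicity by holding all 2²⁰ sums (about 8 megabytes) at once, and the vector names are illustrative.

```r
# Birth weights (grams) from the Paired t Test example; names are illustrative
twin1 <- c(3163, 3245, 3391, 2547, 3042, 3200, 3115, 3294, 3019, 3222,
           2831, 3043, 2646, 3327, 3182, 2984, 2878, 2770, 3092, 2735)
twin2 <- c(3124, 2807, 3014, 2727, 3254, 3826, 2596, 2952, 3279, 3325,
           2984, 2765, 3109, 2757, 3781, 3061, 3658, 3665, 2739, 3324)

d <- twin1 - twin2             # paired differences
sumDiff <- sum(d)              # sum observed in the input data

# each pair doubles the list of sums: once with +|diff| and once with -|diff|
sums <- 0
for (v in abs(d)) sums <- c(sums + v, sums - v)

nLess <- sum(sums <  sumDiff)  # permuted sums less than the observed sum
nSame <- sum(sums == sumDiff)  # permuted sums equal to the observed sum
nMore <- sum(sums >  sumDiff)  # permuted sums more than the observed sum
p <- (min(nLess, nMore) + nSame) / length(sums)   # 1 tail probability

c(length(sums), nLess, nSame, nMore)
p
```

The exact comparison with == is safe here only because the data are whole numbers; with fractional data a small tolerance, as used in Program 3, would be more appropriate.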