This page provides a number of commonly used programs that compare measurements in 2 or more unpaired groups. The structure of this page is as follows:
Explanations: with the following subpanels
  Parametric comparisons: tests for the equality of variances (the F ratio of two variances, Bartlett's Test, and the Levene Test), the two sample t test, one way analysis of variance, and the post-hoc comparisons that follow it (Tukey's, Bonferroni's (also called Fisher's), and Scheffe's least significant differences, and the unadjusted confidence interval for the difference between means)
  Non-parametric comparisons: the Median Test, the Kruskal-Wallis analysis of variance by rank with post-hoc comparison of mean ranks, the Wilcoxon Mann Whitney Test, and the Robust Signed Rank Test (also known as the Mann-Whitney U Test)
  The Permutation Test for comparing two small groups
Javascript Programs: All the programs listed above are organized in a similar structure in the Javascript sub-panel. These programs are web based, requiring no more than the insertion of the data to be analysed and a click of the button.
R Codes: All the programs listed are also available as R code, and similarly organized. This allows users to copy the code, paste it into the R programming environment, and edit and use it as required.
References: The text books and published references from which the programs were developed.
This subpanel describes parametric statistical programs that compare data in unpaired groups, assuming that the data are continuous measurements with a normal (Gaussian) distribution.
Tests for equality of variance

In most parametric comparisons there is an assumption that the data in the different groups are from the same population, and that they differ only because of the treatment received in each group. In other words, the assumption is that the variances of the groups are equal. It is prudent that this assumption is tested and confirmed before the results of other comparisons are interpreted, and 3 programs are provided on this page for this purpose.

F Ratio of the two variances: This can be carried out prior to the two sample t test. The F ratio is the ratio of the variances in the two groups (the larger / the smaller). The probability for F is the Probability of Type I Error, and can be used to determine whether a significant difference exists. The test can be demonstrated here, using the example provided in the Javascript panel. If groups 1 and 2 in the example data are used for comparison, the Standard Deviation (and therefore the variance) of group 2 is larger than that of group 1. The F ratio test can be manually calculated as follows:

The smaller variance is from group 1, so variance = SD² = 1.04² = 1.08, with df = n - 1 = 10 - 1 = 9
F = 4.34 / 1.08 = 4.19, and the probability of F, with 19 and 9 degrees of freedom, is p (α) = 0.02
The conclusion is that the variances of the two groups are significantly unequal.
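For users working in R, the variance ratio test can also be carried out with the built-in var.test() function, which reports F, its two degrees of freedom, and a two tailed p value (so the order of the two groups does not matter). This is a minimal sketch; the two vectors below are illustrative placeholders, not the example data from the Javascript panel.

# F ratio test of equality of two variances using R's built-in var.test()
# grpA and grpB are illustrative vectors only
grpA <- c(10.2, 11.1, 9.8, 10.5, 11.4, 10.9, 9.9, 10.7, 11.0, 10.4)
grpB <- c(12.3, 8.9, 14.1, 9.5, 13.2, 11.8, 8.7, 12.9, 10.1, 13.6)
var.test(grpA, grpB)   # F = var(grpA) / var(grpB), with df and two tailed p value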
Bartlett's Test and the Levene Test can compare variances across 2 or more groups. Comparing all 3 groups of the example data with Bartlett's Test: chi sq = 4.98, df = 2, p = 0.08; the variances are not significantly unequal.

Choice of program: Please note: the following comments are my opinions, based on comments found on the Internet, and not on any formal study, text book, or peer review. Care is required in accepting these comments, and expert statistical advice should be sought if there is doubt. Both the variance ratio and the Bartlett test are more sensitive than the Levene test, and can be used to test for equality of variance before the two sample t test or one way analysis of variance. The Bartlett test is more flexible, as it can cope with any number of groups, but it requires a program. The F ratio can easily be calculated by hand, but can only be used for two group comparisons. The Levene test is best used when testing the equality of variance is the primary purpose, as it is less likely to lead to a false positive conclusion (that the variances are unequal).

Compare means between groups

The one way analysis of variance (OWAV) is used to compare means from 2 or more groups. The variation (variance, mean sum of squares, msq) in the data is partitioned into that attributable to differences between the groups and that within the groups (residual). The ratio of these 2 variances (F) is then tested for statistical significance.

The two sample t test is a special case of OWAV when there are only 2 groups. This test is important in that much evidence based practice rests on the meta-analysis of the difference between two means (Diff) and its Standard Error (SE). Although programs are provided on this page to calculate Diff and SE from two groups, they can just as easily be estimated from the results of a 2 group analysis of variance, as follows
From the analysis of variance table, F is obtained, and t = sqrt(F). As t = Diff / SE, SE = Diff / t.

Post-hoc comparisons between pairs of means

Following OWAV with more than 2 groups, the analyst may want to compare pairs of groups, to obtain more detail than simply whether a significant difference exists somewhere between the groups. Post-hoc analysis allows this to be done. It is important to emphasize that post-hoc analysis should not be used as the primary method of comparing group means, because with multiple groups there is a random chance that some pairs of groups will appear significantly different. The recommendation is therefore that post-hoc analysis should only be done if a significant difference has already been demonstrated by OWAV.

Tukey's least significant difference: Although this is a very commonly used test, there are variations in terminology and presentation that can be confusing. The following attempts to unravel this complexity. Please note, however, that these explanations are based on my understanding after consulting text books, published papers, and the Internet, and not on formal instruction, so they are not authoritative. Users are urged to study the methods or obtain advice before accepting these explanations.

Tukey initially provided the mathematical model for estimating the confidence interval and statistical significance of the difference between any two group means following analysis of variance. The model estimates the Studentised range q from the data, which allows estimation of the Standard Error of the difference between two means, and hence its confidence interval and statistical significance. The test was named Tukey's Studentized Range Test. The mathematics is unfortunately highly complex, requiring extensive numerical approximations. The proof was generally accepted, but the method was difficult to use in practice, especially at a time when computers were not widely available.
The Tukey-Kramer least significant difference is used in the Javascript program, and the calculation is also available in R code under post-hoc analysis. However, if the R package for analysis of variance is used, the default results are from the Tukey Studentised Range test.

Bonferroni's least significant difference, which is also called Fisher's least significant difference, reverses the mathematics of the t test to define the least significant difference when the residual variance is known and the significant p (α) is defined. Included in the algorithm is the Bonferroni correction, where the p (α) value is divided by the number of comparisons being made. The number of comparisons being made is automatically calculated in the Javascript analysis of variance program. When calculating the least significant difference from summary results of analysis of variance, the relationship between the number of groups (nGrp) and the number of comparisons (nComp) is 1 for 2 groups, 3 for 3 groups, 6 for 4 groups, and so on, with the generalised formula nComp = nGrp * (nGrp - 1) / 2.

Scheffe's least significant difference is conceptually similar to Bonferroni's least significant difference, but the calculation is based on the F distribution rather than the t distribution.

The unadjusted confidence interval for the difference between means calculates the confidence interval between any two means in the manner of the two sample t test, assuming that all groups have the same variance. This is a simplification of Tukey's Studentized range, without consideration of the number of groups or comparisons. In the paper by Bird, it was argued that the Bonferroni correction is not necessary if the comparisons are pre-determined before analysis, and if the variances are indeed equal, the results obtained are close to those from Tukey's, without the need for complex computations.

The choice of post-hoc method: Please note: the following comments are my own conclusions after reading all I can about these methods in text books, published papers, and on line discussions. Users should do their own studies before accepting these comments, and seek expert advice if in doubt. The most comprehensive, robust (least likely to wrongly conclude that a significant difference exists), and widely accepted post-hoc comparison is Tukey's Studentized range. However, this requires the original data and the availability of a software package (e.g. R, SAS, STATA, and so on). If Tukey's Studentized range is not available, the alternatives described above (the Tukey-Kramer least significant difference, Bonferroni's or Scheffe's least significant difference, or the unadjusted confidence interval) can be considered.
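As a quick cross-check of the post-hoc methods described above, R's built-in pairwise.t.test() can compute all pairwise comparisons using a pooled standard deviation, with the p values adjusted by the Bonferroni correction. This is only a sketch using the example data from the R Codes panel; it is not the same calculation as the least significant difference subroutines provided later on this page.

# Pairwise t tests with pooled SD and Bonferroni adjusted p values
val <- c(50,57,70,60,55, 58,65,70,70,72, 70,72,60,77,75)   # example data
grp <- factor(rep(c("g1","g2","g3"), each=5))
pairwise.t.test(val, grp, p.adjust.method="bonferroni", pool.sd=TRUE)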
This sub-panel describes the statistical procedures for comparing non-parametric measurements between groups. These are ordinal values, in that the values are ordered (1<2<3), but the interval between values is undefined. An obvious example is the Likert scale of 1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree. Here the numerical difference between disagree and strongly disagree (2-1=1) does not represent the same thing as that between strongly agree and agree (5-4=1). Ordinal measurements are usually converted to ranks before they are processed.
The Median Test: This is a simple and easy test for comparing ordinal values in 2 or more groups.
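A minimal sketch of the Median Test in R, using the example data from the R Codes panel: each value is classified as below, or at or above, the grand median, and the resulting group by category table of counts is tested with the built-in chisq.test(). Note that this sketch splits on the median of the raw values, so its handling of ties may differ slightly from the rank based program in the R Codes panel.

# Median Test: counts below vs at/above the grand median, tested with chi square
val <- c(50,57,70,60,55, 58,65,70,70,72, 70,72,60,77,75)   # example data
grp <- rep(c("g1","g2","g3"), each=5)
atOrAbove <- val >= median(val)        # TRUE if at or above the grand median
tab <- table(grp, atOrAbove)           # 3 x 2 table of counts
tab
chisq.test(tab)                        # chi square test of the table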
The Kruskal-Wallis analysis of variance by rank compares the mean ranks of 2 or more groups and produces the statistic H, which is tested against the chi squared distribution. If H is statistically significant, a post-hoc test estimating the least significant difference between group mean ranks can be used to compare individual pairs of groups. According to Siegel (see reference), the Kruskal-Wallis test has a power efficiency of 95.5% of that of the parametric F test. For sample size estimation, therefore, the Type II error (β) should be appropriately reduced. For example, if 100 cases per group would provide the required power for the parametric F test, roughly 100 / 0.955 ≈ 105 cases per group would be needed for the Kruskal-Wallis test.
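For reference, the Kruskal-Wallis test is also available in R as the built-in kruskal.test(), which applies the tie correction and reports H (as the chi-squared statistic), its degrees of freedom, and p. A minimal sketch with the example data:

# Kruskal-Wallis analysis of variance by rank with the built-in kruskal.test()
val <- c(50,57,70,60,55, 58,65,70,70,72, 70,72,60,77,75)   # example data
grp <- factor(rep(c("g1","g2","g3"), each=5))
kruskal.test(val ~ grp)   # reports H (chi-squared), df, and p (α)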
The Wilcoxon Mann Whitney Test has a power efficiency of 95.5% of that of the t test, and can be used to replace the t test when the distribution is uncertain. The Robust Signed Rank test, in addition, tests whether the distributions in the two groups are similar, and its power efficiency approaches that of the t test.
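R's built-in wilcox.test() carries out the Wilcoxon Mann Whitney (rank sum) test directly; a minimal sketch comparing groups g1 and g3 of the example data is shown below. With tied values R falls back to a normal approximation and issues a warning, so the p value may differ slightly from that of the program in the R Codes panel.

# Wilcoxon Mann Whitney (rank sum) test for two unpaired groups
g1 <- c(50,57,70,60,55)   # example data, group g1
g3 <- c(70,72,60,77,75)   # example data, group g3
wilcox.test(g1, g3)       # W statistic and p value (normal approximation when ties are present)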
The Permutation Test is intended to compare two groups when the sample size is very small, and the distribution uncertain.
The effect size of the input data is first computed. This is then compared with the effect sizes of all possible permutations of the data. The test determines where the effect size of the original data sits, in terms of its percentile among all permutations, and this is interpreted as the probability of Type I Error (α). The test has a power efficiency of 100%, as it makes no assumption about distributions. However, it is constrained by the total sample size of the two groups combined.

As the computation requires recursive calculations, the computation time and memory use increase exponentially with sample size. A total sample size exceeding 26 cases ((n1+n2)>26) will take an impracticably long time to compute, and the program may hang or crash if the available memory is exhausted.
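A non-recursive sketch of the same idea is shown below, using the example data from the Permutation Test sub-panel: combn() enumerates every possible assignment of the pooled values to group 1, the effect size is calculated for each permutation, and the observed effect size is located within that distribution. Under these assumptions it should reproduce the counts reported by the recursive program later on this page, but it is an illustration only, not the program used here.

# Exhaustive permutation test for two small unpaired groups (non-recursive sketch)
g1 <- c(57,70,60,55)                        # example data, group 1
g2 <- c(58,65,70,70,72,70,72,60,77,75)      # example data, group 2
pooled <- c(g1, g2)
m <- length(g1)
obs <- sum(g1) * 2 - sum(pooled)            # observed effect size, as defined in the program below
perms <- combn(length(pooled), m)           # all index sets that could form group 1
effects <- apply(perms, 2, function(ix) sum(pooled[ix]) * 2 - sum(pooled))
pLess <- mean(effects < obs - 1e-10)        # proportion with smaller effect size
pSame <- mean(abs(effects - obs) < 1e-10)   # proportion with the same effect size
pMore <- mean(effects > obs + 1e-10)        # proportion with larger effect size
p <- min(pLess, pMore) + pSame              # probability of an effect size this extreme or more (α)
c(ncol(perms), pLess, pSame, pMore, p)      # number of permutations and result proportions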
This panel provides the R code for the main programs associated with comparing means of parametric measurements from different groups.
Section 1: Data input and preparation

# data entry
dat = ("
Grp Val
g1 50
g1 57
g1 70
g1 60
g1 55
g2 58
g2 65
g2 70
g2 70
g2 72
g3 70
g3 72
g3 60
g3 77
g3 75
")
dfDat <- read.table(textConnection(dat),header=TRUE)   # conversion to data frame
# dfDat   # optional: show input data

# Organize and prepare data for analysis
# create a reference vector for group names
ar <- sapply(dfDat[, "Grp"], as.character)   # extract Grp as a character vector
uar <- unique(ar)                            # remove duplicates
# uar   # optional: show vector

# Template for getting vectors of Val selected by Grp
#for(i in 1 : length(uar))
#{
#  s <- subset(dfDat, dfDat$Grp==uar[i])$Val
#  print(s)
#}

# Separate the values (Val) into their groups and calculate the mean and SD of each group
# vectors to store results from each group
Grp <- uar            # name of each group
N <- vector()         # sample size of each group
Mean <- vector()      # mean of each group
SD <- vector()        # SD of each group
# Calculate
for(i in 1 : length(uar))
{
  v <- subset(dfDat, dfDat$Grp==uar[i])$Val   # vector of Val for each group
  N <- append(N, length(v))
  Mean <- append(Mean, mean(v))
  SD <- append(SD, sd(v))
}
dfNMS <- data.frame(Grp, N, Mean, SD)   # combine vectors into a data frame for display and future access
dfNMS   # show n, mean, SD in groups

The results are as follows

> dfNMS # show n, mean, SD in groups
  Grp N Mean       SD
1  g1 5 58.4 7.436397
2  g2 5 67.0 5.656854
3  g3 5 70.8 6.610598

Program 2: Bartlett's Test for equality of variance
This uses the summary data in dfNMS (n, mean, SD of groups)

# Bartlett's Test
k = nrow(dfNMS)   # number of groups
N = 0             # total sample size
for(i in 1 : k) N = N + dfNMS$N[i]   # total sample size (all grps)
Sp2 = 0
t = 0
b = 0
for(i in 1 : k)   # from each group
{
  num = dfNMS$N[i]
  mean = dfNMS$Mean[i]
  sd = dfNMS$SD[i]
  Sp2 = Sp2 + (num - 1) * sd^2 / (N - k)
  t = t + (num - 1) * log(sd^2)
  b = b + 1.0 / num
}
t = (N - k) * log(Sp2) - t
b = 1.0 + (1.0 / (3.0 * (k-1))) * b - 1.0 / (N - k)
chi = t / b
df = k - 1
p = 1 - pchisq(chi, df=df)
c(chi, df, p)   # result of Bartlett's test

The result is the significance of Chi Sq. p<0.05 means that significant differences in variance exist between the groups

> c(chi, df, p) # result of Bartlett's test
[1] 0.2907470 2.0000000 0.8646993

Program 3: Levene Test for equality of variances
This uses the raw data in dfDat (Grp and Val)

# Levene Test
k = nrow(dfNMS)   # number of groups
X = 0
N = 0
mm <- matrix(data=0, nrow=k, ncol=3)
for(i in 1 : k)
{
  vec <- subset(dfDat, dfDat$Grp==uar[i])$Val   # vector of Val for each group
  mean = mean(vec)
  for(j in 1 : length(vec))
  {
    z = abs(vec[j] - mean)    # absolute deviation from the group mean
    mm[i,1] = mm[i,1] + 1
    mm[i,2] = mm[i,2] + z
    mm[i,3] = mm[i,3] + z^2
    N = N + 1
    X = X + z
  }
}
Zu = X / N   # overall mean of the absolute deviations
Zi = 0       # between group sum of squares
Zj = 0       # within group sum of squares
for(i in 1 : k)
{
  zi = mm[i,2] / mm[i,1] - Zu
  Zi = Zi + mm[i,1] * zi^2
  Zj = Zj + mm[i,3] - mm[i,2]^2 / mm[i,1]
}
F = ((N - k) * Zi) / ((k-1) * Zj)
df1 = k-1
df2 = N - k
p = 1 - pf(F, df1, df2)
c(F, df1, df2, p)   # results of Levene Test

The results are as follows. These are F, its two degrees of freedom, and p (α). p<0.05 indicates significant differences in variances between groups

> c(F, df1, df2, p) # results of Levene Test
[1]  0.06891652  2.00000000 12.00000000  0.93377130

Program 4: One Way Analysis of Variance

res <- aov(dfDat$Val ~ dfDat$Grp)
summary(res)    # result of analysis of variance
TukeyHSD(res)   # Tukey's Least Significant Difference with adjusted p

The results of analysis of variance are as follows

> summary(res)
            Df Sum Sq Mean Sq F value Pr(>F)
dfDat$Grp    2  403.6  201.80   4.621 0.0325 *
Residuals   12  524.0   43.67
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The results of Tukey's HSD by R are as follows

> TukeyHSD(res)
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = dfDat$Val ~ dfDat$Grp)
$`dfDat$Grp`
      diff      lwr      upr     p adj
g2-g1  8.6 -2.54984 19.74984 0.1408789
g3-g1 12.4  1.25016 23.54984 0.0294074
g3-g2  3.8 -7.34984 14.94984 0.6449915

Program 5: Two samples t test

g1 <- c(50,57,70,60,55)
g2 <- c(70,72,60,77,75)
t.test(g1,g2, var.equal=TRUE)

The results are as follows.

        Two Sample t-test
data:  g1 and g2
t = -2.7867, df = 8, p-value = 0.02368
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -22.661071  -2.138929
sample estimates:
mean of x mean of y
     58.4      70.8

Please note: the vector named g2 in this program contains the values of group g3 from the example data above, so this t test compares groups g1 and g3 (hence mean of y = 70.8).
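As a cross-check of the hand-coded programs above, R also provides built-in equivalents, shown in the sketch below (assuming dfDat from the data entry step in Section 1). Note that bartlett.test() uses the standard correction term based on 1/(n-1) within each group, so its chi-squared value may differ slightly from that of Program 2.

# Built-in cross-checks, assuming dfDat from Section 1 above
bartlett.test(Val ~ Grp, data=dfDat)                  # Bartlett's test for equality of variances
oneway.test(Val ~ Grp, data=dfDat, var.equal=TRUE)    # one way analysis of variance assuming equal variances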
This subpanel presents R code for the 4 post-hoc tests of significance following analysis of variance with parametric data. These are the Tukey-Kramer least significant difference, Bonferroni's (also called Fisher's) least significant difference, Scheffe's least significant difference, and the unadjusted confidence interval for the difference between two means.
Subroutine functionsTukey-Kramer Least Significant Difference# subroutine for Tukey-Kramer Least Significant Difference # rms = residual mean square, rdf=residual degrees of freedom # nGrps = number of groups in aov, # n1, n2 = sample sizes of the 2 groups being compared # x05 and x01 are the tables (for p=0.05 and p=0.01)of studentized range statistics q. These are rearranged into q05 and q01 to suit the matric calls of R. The tables are copied from the text book by Steelr et.al, which quote the original tables by Pearson and Hartley (see references) TukeyLSD <- function(rms, rdf, nGrps, n1, n2) { x05 = c( c(3.635,4.602,5.218,5.673,6.033,6.330,6.582,6.801,6.995,7.167,7.323,7.466,7.596,7.716,7.828,7.932,8.030,8.122,8.208), #rdf=5 c(3.460,4.339,4.896,5.305,5.628,5.895,6.122,6.319,6.493,6.649,6.789,6.917,7.034,7.143,7.244,7.338,7.426,7.508,7.586), c(3.344,4.165,4.681,5.060,5.359,5.606,5.815,5.997,6.158,6.302,6.431,6.550,6.658,6.759,6.852,6.939,7.020,7.097,7.169), c(3.261,4.041,4.529,4.886,5.167,5.399,5.596,5.767,5.918,6.053,6.175,6.287,6.389,6.483,6.571,6.653,6.729,6.801,6.869), c(3.199,3.948,4.415,4.755,5.024,5.244,5.432,5.595,5.738,5.867,5.983,6.089,6.186,6.276,6.359,6.437,6.510,6.579,6.643), c(3.151,3.877,4.327,4.654,4.912,5.124,5.304,5.460,5.598,5.722,5.833,5.935,6.028,6.114,6.194,6.269,6.339,6.405,6.467), #rdf=q0 c(3.113,3.820,4.256,4.574,4.823,5.028,5.202,5.353,5.486,5.605,5.713,5.811,5.901,5.984,6.062,6.134,6.202,6.265,6.325), c(3.081,3.773,4.199,4.508,4.750,4.950,5.119,5.265,5.395,5.510,5.615,5.710,5.797,5.878,5.953,6.023,6.089,6.151,6.209), c(3.055,3.734,4.151,4.453,4.690,4.884,5.049,5.192,5.318,5.431,5.533,5.625,5.711,5.789,5.862,5.931,5.995,6.055,6.112), c(3.033,3.701,4.111,4.407,4.639,4.829,4.990,5.130,5.253,5.364,5.463,5.554,5.637,5.714,5.785,5.852,5.915,5.973,6.029), c(3.014,3.673,4.076,4.367,4.595,4.782,4.940,5.077,5.198,5.306,5.403,5.492,5.574,5.649,5.719,5.785,5.846,5.904,5.958), c(2.998,3.649,4.046,4.333,4.557,4.741,4.896,5.031,5.150,5.256,5.352,5.439,5.519,5.593,5.662,5.726,5.786,5.843,5.896), c(2.984,3.628,4.020,4.303,4.524,4.705,4.858,4.991,5.108,5.212,5.306,5.392,5.471,5.544,5.612,5.675,5.734,5.790,5.842), c(2.971,3.609,3.997,4.276,4.494,4.673,4.824,4.955,5.071,5.173,5.266,5.351,5.429,5.501,5.567,5.629,5.688,5.743,5.794), c(2.960,3.593,3.977,4.253,4.468,4.645,4.794,4.924,5.037,5.139,5.231,5.314,5.391,5.462,5.528,5.589,5.647,5.701,5.752), c(2.950,3.578,3.958,4.232,4.445,4.620,4.768,4.895,5.008,5.108,5.199,5.282,5.357,5.427,5.492,5.553,5.610,5.663,5.714), #rdf=20 c(2.941,3.565,3.942,4.213,4.424,4.597,4.743,4.870,4.981,5.081,5.170,5.252,5.327,5.396,5.460,5.520,5.576,5.629,5.679), c(2.933,3.553,3.927,4.196,4.405,4.577,4.722,4.847,4.957,5.056,5.144,5.225,5.299,5.368,5.431,5.491,5.546,5.599,5.648), c(2.926,3.542,3.914,4.180,4.388,4.558,4.702,4.826,4.935,5.033,5.121,5.201,5.274,5.342,5.405,5.464,5.519,5.571,5.620), c(2.919,3.532,3.901,4.166,4.373,4.541,4.684,4.807,4.915,5.012,5.099,5.179,5.251,5.319,5.381,5.439,5.494,5.545,5.594), c(2.913,3.523,3.890,4.153,4.358,4.526,4.667,4.789,4.897,4.993,5.079,5.158,5.230,5.297,5.359,5.417,5.471,5.522,5.570), c(2.907,3.514,3.880,4.141,4.345,4.511,4.652,4.773,4.880,4.975,5.061,5.139,5.211,5.277,5.339,5.396,5.450,5.500,5.548), c(2.902,3.506,3.870,4.130,4.333,4.498,4.638,4.758,4.864,4.959,5.044,5.122,5.193,5.259,5.320,5.377,5.430,5.480,5.528), c(2.897,3.499,3.861,4.120,4.322,4.486,4.625,4.745,4.850,4.944,5.029,5.106,5.177,5.242,5.302,5.359,5.412,5.462,5.509), 
c(2.892,3.493,3.853,4.111,4.311,4.475,4.613,4.732,4.837,4.930,5.014,5.091,5.161,5.226,5.286,5.342,5.395,5.445,5.491), c(2.888,3.486,3.845,4.102,4.301,4.464,4.601,4.720,4.824,4.917,5.001,5.077,5.147,5.211,5.271,5.327,5.379,5.429,5.475),#rdf+30 c(2.884,3.481,3.838,4.094,4.292,4.454,4.591,4.709,4.812,4.905,4.988,5.064,5.134,5.198,5.257,5.313,5.365,5.414,5.460), c(2.881,3.475,3.832,4.086,4.284,4.445,4.581,4.698,4.802,4.894,4.976,5.052,5.121,5.185,5.244,5.299,5.351,5.400,5.445), c(2.877,3.470,3.825,4.079,4.276,4.436,4.572,4.689,4.791,4.883,4.965,5.040,5.109,5.173,5.232,5.287,5.338,5.386,5.432), c(2.874,3.465,3.820,4.072,4.268,4.428,4.563,4.680,4.782,4.873,4.955,5.030,5.098,5.161,5.220,5.275,5.326,5.374,5.420), c(2.871,3.461,3.814,4.066,4.261,4.421,4.555,4.671,4.773,4.863,4.945,5.020,5.088,5.151,5.209,5.264,5.315,5.362,5.408), c(2.868,3.457,3.809,4.060,4.255,4.414,4.547,4.663,4.764,4.855,4.936,5.010,5.078,5.141,5.199,5.253,5.304,5.352,5.397), c(2.865,3.453,3.804,4.054,4.249,4.407,4.540,4.655,4.756,4.846,4.927,5.001,5.069,5.131,5.189,5.243,5.294,5.341,5.386), c(2.863,3.449,3.799,4.049,4.243,4.400,4.533,4.648,4.749,4.838,4.919,4.993,5.060,5.122,5.180,5.234,5.284,5.331,5.376), c(2.861,3.445,3.795,4.044,4.237,4.394,4.527,4.641,4.741,4.831,4.911,4.985,5.052,5.114,5.171,5.225,5.275,5.322,5.367), c(2.858,3.442,3.791,4.039,4.232,4.388,4.521,4.634,4.735,4.824,4.904,4.977,5.044,5.106,5.163,5.216,5.266,5.313,5.358),#rdf = 40 c(2.843,3.420,3.764,4.008,4.197,4.351,4.481,4.592,4.690,4.777,4.856,4.927,4.993,5.053,5.109,5.161,5.210,5.256,5.299),#rdf = 50 c(2.829,3.399,3.737,3.977,4.163,4.314,4.441,4.550,4.646,4.732,4.808,4.878,4.942,5.001,5.056,5.107,5.154,5.199,5.241),#rdf = 60 c(2.814,3.377,3.711,3.947,4.129,4.277,4.402,4.509,4.603,4.686,4.761,4.829,4.892,4.949,5.003,5.052,5.099,5.142,5.183),#ref = 90 c(2.800,3.356,3.685,3.917,4.096,4.241,4.363,4.468,4.560,4.641,4.714,4.781,4.842,4.898,4.950,4.998,5.043,5.086,5.126),#rdf =120 c(2.786,3.335,3.659,3.887,4.063,4.205,4.324,4.427,4.517,4.596,4.668,4.733,4.792,4.847,4.897,4.944,4.988,5.030,5.069),#rdf = 240 c(2.772,3.314,3.633,3.858,4.030,4.170,4.286,4.387,4.474,4.552,4.622,4.685,4.743,4.796,4.845,4.891,4.934,4.974,5.012))# inf q05 <- matrix(data=x05, nrow=19, ncol=42) x01=c( c(5.702,6.976,7.804,8.421,8.913,9.321,9.669,9.971,10.239,10.479,10.696,10.894,11.076,11.244,11.400,11.545,11.682,11.811,11.932), c(5.243,6.331,7.033,7.556,7.972,8.318,8.612,8.869,9.097,9.300,9.485,9.653,9.808,9.951,10.084,10.208,10.325,10.434,10.538), c(4.949,5.919,6.542,7.005,7.373,7.678,7.939,8.166,8.367,8.548,8.711,8.860,8.997,9.124,9.242,9.353,9.456,9.553,9.645), c(4.745,5.635,6.204,6.625,6.959,7.237,7.474,7.680,7.863,8.027,8.176,8.311,8.436,8.552,8.659,8.760,8.854,8.943,9.027), c(4.596,5.428,5.957,6.347,6.657,6.915,7.134,7.325,7.494,7.646,7.784,7.910,8.025,8.132,8.232,8.325,8.412,8.495,8.573), c(4.482,5.270,5.769,6.136,6.428,6.669,6.875,7.054,7.213,7.356,7.485,7.603,7.712,7.812,7.906,7.993,8.075,8.153,8.226), c(4.392,5.146,5.621,5.970,6.247,6.476,6.671,6.841,6.992,7.127,7.250,7.362,7.464,7.560,7.648,7.731,7.809,7.883,7.952), c(4.320,5.046,5.502,5.836,6.101,6.320,6.507,6.670,6.814,6.943,7.060,7.166,7.265,7.356,7.441,7.520,7.594,7.664,7.730), c(4.260,4.964,5.404,5.726,5.981,6.192,6.372,6.528,6.666,6.791,6.903,7.006,7.100,7.188,7.269,7.345,7.417,7.484,7.548), c(4.210,4.895,5.322,5.634,5.881,6.085,6.258,6.409,6.543,6.663,6.772,6.871,6.962,7.047,7.125,7.199,7.268,7.333,7.394), c(4.167,4.836,5.252,5.556,5.796,5.994,6.162,6.309,6.438,6.555,6.660,6.756,6.845,6.927,7.003,7.074,7.141,7.204,7.264), 
c(4.131,4.786,5.192,5.489,5.722,5.915,6.079,6.222,6.348,6.461,6.564,6.658,6.744,6.823,6.897,6.967,7.032,7.093,7.151), c(4.099,4.742,5.140,5.430,5.659,5.847,6.007,6.147,6.270,6.380,6.480,6.572,6.656,6.733,6.806,6.873,6.937,6.997,7.053), c(4.071,4.703,5.094,5.379,5.603,5.787,5.944,6.081,6.201,6.309,6.407,6.496,6.579,6.655,6.725,6.791,6.854,6.912,6.967), c(4.046,4.669,5.054,5.334,5.553,5.735,5.889,6.022,6.141,6.246,6.342,6.430,6.510,6.585,6.654,6.719,6.780,6.837,6.891), c(4.024,4.639,5.018,5.293,5.510,5.688,5.839,5.970,6.086,6.190,6.285,6.370,6.449,6.523,6.591,6.654,6.714,6.770,6.823), c(4.004,4.612,4.986,5.257,5.470,5.646,5.794,5.924,6.038,6.140,6.233,6.317,6.395,6.467,6.534,6.596,6.655,6.710,6.762), c(3.986,4.588,4.957,5.225,5.435,5.608,5.754,5.882,5.994,6.095,6.186,6.269,6.346,6.417,6.482,6.544,6.602,6.656,6.707), c(3.970,4.566,4.931,5.195,5.403,5.573,5.718,5.844,5.955,6.054,6.144,6.226,6.301,6.371,6.436,6.497,6.553,6.607,6.658), c(3.955,4.546,4.907,5.168,5.373,5.542,5.685,5.809,5.919,6.017,6.105,6.186,6.261,6.330,6.394,6.453,6.510,6.562,6.612), c(3.942,4.527,4.885,5.144,5.347,5.513,5.655,5.778,5.886,5.983,6.070,6.150,6.224,6.292,6.355,6.414,6.469,6.522,6.571), c(3.930,4.510,4.865,5.121,5.322,5.487,5.627,5.749,5.856,5.951,6.038,6.117,6.190,6.257,6.319,6.378,6.432,6.484,6.533), c(3.918,4.495,4.847,5.101,5.300,5.463,5.602,5.722,5.828,5.923,6.008,6.087,6.158,6.225,6.287,6.344,6.399,6.450,6.498), c(3.908,4.481,4.830,5.082,5.279,5.441,5.578,5.697,5.802,5.896,5.981,6.058,6.129,6.195,6.256,6.314,6.367,6.418,6.465), c(3.898,4.467,4.814,5.064,5.260,5.420,5.556,5.674,5.778,5.871,5.955,6.032,6.103,6.168,6.228,6.285,6.338,6.388,6.435), c(3.889,4.455,4.799,5.048,5.242,5.401,5.536,5.653,5.756,5.848,5.932,6.008,6.078,6.142,6.202,6.258,6.311,6.361,6.407), c(3.881,4.443,4.786,5.032,5.225,5.383,5.517,5.633,5.736,5.827,5.910,5.985,6.055,6.119,6.178,6.234,6.286,6.335,6.381), c(3.873,4.433,4.773,5.018,5.210,5.367,5.500,5.615,5.716,5.807,5.889,5.964,6.033,6.096,6.155,6.211,6.262,6.311,6.357), c(3.865,4.423,4.761,5.005,5.195,5.351,5.483,5.598,5.698,5.789,5.870,5.944,6.013,6.076,6.134,6.189,6.240,6.289,6.334), c(3.859,4.413,4.750,4.992,5.181,5.336,5.468,5.581,5.682,5.771,5.852,5.926,5.994,6.056,6.114,6.169,6.220,6.268,6.313), c(3.852,4.404,4.739,4.980,5.169,5.323,5.453,5.566,5.666,5.755,5.835,5.908,5.976,6.038,6.096,6.150,6.200,6.248,6.293), c(3.846,4.396,4.729,4.969,5.156,5.310,5.439,5.552,5.651,5.739,5.819,5.892,5.959,6.021,6.078,6.132,6.182,6.229,6.274), c(3.840,4.388,4.720,4.959,5.145,5.298,5.427,5.538,5.637,5.725,5.804,5.876,5.943,6.004,6.061,6.115,6.165,6.212,6.256), c(3.835,4.381,4.711,4.949,5.134,5.286,5.414,5.526,5.623,5.711,5.790,5.862,5.928,5.989,6.046,6.099,6.148,6.195,6.239), c(3.830,4.374,4.703,4.940,5.124,5.275,5.403,5.513,5.611,5.698,5.776,5.848,5.914,5.974,6.031,6.084,6.133,6.179,6.223), c(3.825,4.367,4.695,4.931,5.114,5.265,5.392,5.502,5.599,5.685,5.764,5.835,5.900,5.961,6.017,6.069,6.118,6.165,6.208), c(3.793,4.324,4.644,4.874,5.052,5.198,5.322,5.428,5.522,5.606,5.681,5.750,5.814,5.872,5.926,5.977,6.024,6.069,6.111), c(3.762,4.282,4.594,4.818,4.991,5.133,5.253,5.356,5.447,5.528,5.601,5.667,5.728,5.784,5.837,5.886,5.931,5.974,6.015), c(3.732,4.241,4.545,4.763,4.931,5.069,5.185,5.284,5.372,5.451,5.521,5.585,5.644,5.698,5.749,5.796,5.840,5.881,5.920), c(3.702,4.200,4.497,4.709,4.872,5.005,5.118,5.214,5.299,5.375,5.443,5.505,5.561,5.614,5.662,5.708,5.750,5.790,5.827), c(3.672,4.160,4.450,4.655,4.814,4.943,5.052,5.145,5.227,5.300,5.366,5.426,5.480,5.530,5.577,5.621,5.661,5.699,5.735), 
c(3.643,4.120,4.403,4.603,4.757,4.882,4.987,5.078,5.157,5.227,5.290,5.348,5.400,5.448,5.493,5.535,5.574,5.611,5.645))
q01 <- matrix(data=x01, nrow=19, ncol=42)
# select the column (number of groups) and row (residual df) of the tables
col = nGrps - 1
if(col<1) col = 1
if(col>19) col = 19
row = rdf - 4
if(rdf>40 & rdf<=50) row = 36 + 1
if(rdf>50 & rdf<=60) row = 36 + 2
if(rdf>60 & rdf<=90) row = 36 + 3
if(rdf>90 & rdf<=120) row = 36 + 4
if(rdf>120 & rdf<=240) row = 36 + 5
if(rdf>240) row = 36 + 6
lsd05 = q05[col,row] * sqrt(rms / 2 * (1 / n1 + 1 / n2))
lsd01 = q01[col,row] * sqrt(rms / 2 * (1 / n1 + 1 / n2))
return (c(lsd05, lsd01))   # LSD for p=0.05 and p=0.01
}

Subroutine function for Bonferroni's (also called Fisher's) least significant difference

# Subroutine for Bonferroni (Fisher) Least Significant Difference
# alpha = probability of Type I error (p, α)
# rms = residual mean square, rdf = residual degrees of freedom
# nComp = number of comparisons being made from the same aov
# n1, n2 = sample sizes of the 2 groups being compared
BonferroniLSD <- function(alpha, rms, rdf, nComp, n1, n2)
{
  return (qt(1 - alpha / nComp, rdf) * sqrt(rms * (1 / n1 + 1 / n2)))
}

Subroutine function for Scheffe's least significant difference

# Subroutine for Scheffe Least Significant Difference
# alpha = probability of Type I error (p, α)
# bdf = between group degrees of freedom = number of groups in analysis of variance - 1
# rms = residual mean square, rdf = residual degrees of freedom
# nComp = number of comparisons being made from the same aov
# n1, n2 = sample sizes of the 2 groups being compared
ScheffeLSD <- function(alpha, bdf, rms, rdf, nComp, n1, n2)
{
  f = qf(1 - alpha, df1=bdf, df2=rdf)   # find the F value for the significance
  return (sqrt(nComp * f) * sqrt(rms * (1.0 / n1 + 1.0 / n2)))
}

Subroutine for the unadjusted confidence interval of the difference between means following analysis of variance. Please note: the parameter for the level of confidence is usually expressed as a percentage, e.g. 95% CI. For this subroutine, the probability of Type I error (alpha) is used. The conversion is
e.g. for 95% confidence interval alpha = 1 - 95 / 100 = 1 - 0.95 = 0.05 # Subroutine for confidence interval # alpha 0.05 = confidence of 95%, 0.01 as 99% confidence # alpha / 2 for 2 tail mpdel ConfidenceInterval <- function(alpha, rms, rdf, n1, n2) { se = sqrt(rms * (1.0 / n1 + 1.0 / n2)) # Standard Error ci1 = se * qt(1 - alpha, df=rdf) # CI 1 tail ci2 = se * qt(1 - alpha / 2, df=rdf) # CI 2 tail return (c(ci1,ci2)) } Main ProgramAll the subroutines are run using a single main program. All the data are not used in every subroutines# data entry # From analysis of variance # BDF = between group degrees of freedom (number of groups -1) # RDF = residual degrees of freedom (total sample size - number of groups) # RMS = residual mean square (residual variance) # For each test # ALPHA = probability of Type I Error (p, α) # NCOMP = number of comparisons being made from the analysis of variance # N1, N2 the sample size of the two groups for calculating difference dat = (" BDF RDF RMS ALPHA NCOMP N1 N2 2 57 6.35 0.05 3 10 20 2 57 6.35 0.01 3 10 20 2 57 6.35 0.05 2 10 20 2 57 6.35 0.01 1 10 20 2 57 6.35 0.05 3 20 30 2 57 6.35 0.01 3 30 30 2 57 6.35 0.05 2 10 30 2 57 6.35 0.01 1 10 30 ") dfPostHoc <- read.table(textConnection(dat),header=TRUE) # conversion to data frame dfPostHoc # optional showing input data # vectors to record results T05 <- vector() # Tukey Kramer LSD for p=0.05 T01 <- vector() # Tukey Kramer LSD for p=0.01 BON <- vector() # Bonerroni LSD SCH <- vector() # Scheffe LSD CI1 <- vector() # confidence interval 1 tail CI2 <- vector() # confidence interval 2 tail # Calculations for(i in 1 : nrow(dfPostHoc)) { bdf = dfPostHoc$BDF[i] # between group degrees of freedom nGrps = bdf + 1 # number of groups in aov rdf = dfPostHoc$RDF[i] # residual degrees of freedom rms = dfPostHoc$RMS[i] # residual nean square alpha = dfPostHoc$ALPHA[i] # Probability of Tyupe I Error nComp = dfPostHoc$NCOMP[i] # number of comparisons between groups n1 = dfPostHoc$N1[i] n2 = dfPostHoc$N2[i] # Tukey-Kramer LSD ar <- TukeyLSD(rms, rdf, nGrps, n1, n2) T05 <- append(T05, sprintf(ar[1], fmt="%#.4f")) # Tukey Kramer LSD p=0.05 T01 <- append(T01, sprintf(ar[2], fmt="%#.4f")) # Tukey Kramer LSD p=0.01 # Bonferroni (Fisher) LSD bonferroniLSD = BonferroniLSD(alpha, rms, rdf, nComp, n1, n2) BON <- append(BON, sprintf(bonferroniLSD, fmt="%#.4f") ) # Bonferroni LSD # Scheffe LSD scheffeLSD = ScheffeLSD(alpha, bdf, rms, rdf, nComp, n1, n2) SCH <- append(SCH, sprintf(scheffeLSD, fmt="%#.4f") ) # Scheffe LSD # Un adjusted confidence interval ar <- ConfidenceInterval(alpha, rms, rdf, n1, n2) CI1 <- append(CI1, sprintf(ar[1], fmt="%#.4f") ) # confidence interval 1 tail CI2 <- append(CI2, sprintf(ar[2], fmt="%#.4f") ) # confidence interval 2 tail } # Add result vectors to data frame for display dfPostHoc$T05 <- T05 dfPostHoc$T01 <- T01 dfPostHoc$BON <- BON dfPostHoc$SCH <- SCH dfPostHoc$CI1 <- CI1 dfPostHoc$CI2 <- CI2 dfPostHoc # display input data and resultsThe results are as follows From input data
> dfPostHoc # display input data and results BDF RDF RMS ALPHA NCOMP N1 N2 T05 T01 BON SCH CI1 CI2 1 2 57 6.35 0.05 3 10 20 2.3457 2.9550 2.1285 3.0044 1.6318 1.9543 2 2 57 6.35 0.01 3 10 20 2.3457 2.9550 2.7483 3.7792 2.3360 2.6008 3 2 57 6.35 0.05 2 10 20 2.3457 2.9550 1.9543 2.4531 1.6318 1.9543 4 2 57 6.35 0.01 1 10 20 2.3457 2.9550 2.3360 2.1819 2.3360 2.6008 5 2 57 6.35 0.05 3 20 30 1.7484 2.2026 1.5865 2.2393 1.2163 1.4567 6 2 57 6.35 0.01 3 30 30 1.5638 1.9700 1.8322 2.5194 1.5574 1.7339 7 2 57 6.35 0.05 2 10 30 2.2115 2.7860 1.8426 2.3128 1.5385 1.8426 8 2 57 6.35 0.01 1 10 30 2.2115 2.7860 2.2024 2.0571 2.2024 2.4521
This sub-panel presents R code for comparing non-parametric measurements from different groups, consisting of the Median Test, the Kruskal-Wallis analysis of variance by rank with post-hoc comparison of mean ranks, the Wilcoxon Mann Whitney Test, and the Robust Signed Rank Test.
Data entry for 3 groups# data entry dat = (" Grp Val g1 50 g1 57 g1 70 g1 60 g1 55 g2 58 g2 65 g2 70 g2 72 g2 70 g3 70 g3 72 g3 60 g3 77 g3 75 ") df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame df$Rank <- rank(df$Val) # values as ranks df # optional display of data mxRanks <- xtabs(~ df$Grp + df$Rank) # matrix of counts by rank mxRanks # optional display of counts of ranksThe matrix of ranks for the 3 groups are displayed as follows > mxRanks # optional display of counts of ranks df$Rank df$Grp 1 2 3 4 5.5 7 9.5 12.5 14 15 g1 1 1 1 0 1 0 1 0 0 0 g2 0 0 0 1 0 1 2 1 0 0 g3 0 0 0 0 1 0 1 1 1 1 Median Testmedian = nrow(df) / 2 # median rank for all data MED <- vector(); # intermediate vector of 1(<) and 2(>=) median # calculate counts of < or >= medians in groups for(i in 1:nrow(df)) { if(df$Rank[i]<median) { MED <- append(MED,"<Median") } else { MED <- append(MED,">=median") } } # convert median values into a matrix of counts mxMed <- xtabs(~ df$Grp + MED) # matrix of counts for < medians and >= medians, display with results # create matrix to calculate chi square mxCount <- matrix(data=0, nrow=nrow(mxMed)+1,ncol=3) # for calculating chi sq # add row and col total for(i in 1:nrow(mxMed)) { for(j in 1 : ncol(mxMed)) { x = mxMed[i,j] mxCount[i,j] = x mxCount[i,3] = mxCount[i,3] + x mxCount[nrow(mxRanks)+1,j] = mxCount[nrow(mxRanks)+1,j] + x mxCount[nrow(mxRanks)+1,3] = mxCount[nrow(mxRanks)+1,3] + x } } #mxCount # optional display, for debug only # calculate chi sq ng = nrow(mxMed) # number of groups chi = 0 for(i in 1:ng) { for(j in 1:2) { o = mxCount[i,j] # observed e = mxCount[i,3] * mxCount[r+1,j] / mxCount[r+1,3] # expected chi = chi + (e-o)^2 / e # chi sq } } degF = r-1 # df p = 1 - pchisq(chi, df=degF) # p alpha # Show results mxMed # table of median counts c(chi, degF, p) # output results of Median Test: chi sq, df, and pThe results are as follows > mxMed # table of median counts MED df$Grp <Median >=median g1 4 1 g2 2 3 g3 1 4 > c(chi, degF, p) # output results of Median Test: chi sq, df, and p [1] 3.750000 2.000000 0.153355 Kruskall Wallis Analysis of Variance by RankFirstly, the subroutine to calculate post hoc comparison of mean ranks between groups, which will be called from the main program# subroutine for Least Significant differnce in mean ranks LSDRank <- function(TotN,NGrps,n1,n2,alpha) # total ssiz, num of grps 2 tail { tt = sqrt((TotN * (TotN + 1) / 12.0) * (1.0 / n1 + 1.0 / n2)) gg = alpha / (NGrps * (NGrps - 1)) # gg is Bonferroni corrected prob 1 tail return(-qnorm(gg) * tt) }The main program. This uses the same example data as in the median test, and results of the following preliminary calculations
# Main Program nn = nrow(df) # total sample size arGrpNames <- rownames(mxRanks) # name of groups ng = length(arGrpNames) # numver of groups # Step 1: calculate mean ranks arMeanRanks <- vector() for(k in 1 : ng) { s <- subset(df$Rank, df$Grp==arGrpNames[k]) arMeanRanks <- append(arMeanRanks, mean(s)) } #arMeanRanks # optional display the mean ranks (during debugging) # Cakcukate Kruskall Wallis nr = ncol(mxRanks) # number of columns (number of rank intervals # Calculations begin ET = 0 # for calculating ties arN <- array(0,ng) # group sample size arI <- array(0,nr) # rank sample size for(i in 1 : nr) { for(j in 1 : ng) { x = mx[j,i] arI[i] = arI[i] + x arN[j] = arN[j] + x } ET = ET + 1.0 * arI[i] * arI[i] * arI[i] - arI[i]; # ties } SumSq = 0 #Sum(Nj*Rj2) for(j in 1 : ng) { SumSq = SumSq + arN[j] * arMeanRanks[j]^2 } H = (12.0 / (1.0 * nn * (nn + 1)) * SumSq - 3.0 * (nn+1)) / (1 - ET / (1.0 * nn * nn * nn - nn)) degF = ng - 1 p = 1 - pchisq(H, df=degF) # Results arGrpNames # name of groups arN # sample size of groups arMeanRanks # mean ranks of groups c(H, degF, p) # Results Kruskall Wallis H, degrees of freedom, p (α)The initial results are displayed as follows > # Results > arGrpNames # name of groups [1] "g1" "g2" "g3" > arN # sample size of groups [1] 5 5 5 > arMeanRanks # mean ranks of groups [1] 4.2 8.5 11.3 > c(H, degF, p) # Results Kruskall Wallis H, degrees of freedom, p (α) [1] 6.53503650 2.00000000 0.03810087 >Post-hoc analysis # vectors to store variables and results # post hoc analysis Grp1 <- vector() Grp2 <- vector() Diff <- vector() LSD_05 <- vector() LSD_01 <- vector() LSD_005 <- vector() for(i in 1 : (ng - 1)) { for(j in (i+1) : ng) { Grp1 <- append(Grp1, arGrpNames[i]) Grp2 <- append(Grp2, arGrpNames[j]) Diff <- append(Diff, arMeanRanks[i] - (arMeanRanks[j])) n1 = arN[i] n2 = arN[j] LSD_05 <- append(LSD_05, LSDRank(nn,ng,n1,n2,0.05)) LSD_01 <- append(LSD_01, LSDRank(nn,ng,n1,n2,0.01)) LSD_005 <- append(LSD_005, LSDRank(nn,ng,n1,n2,0.005)) } } dfRes <- data.frame(Grp1, Grp2, Diff, LSD_05, LSD_01, LSD_005) dfRes # results of post-hoc comparison of mean ranksThe results are as follows > dfRes <- data.frame(Grp1, Grp2, Diff, LSD_05, LSD_01, LSD_005) > dfRes # results of post-hoc comparison of mean ranks Grp1 Grp2 Diff LSD_05 LSD_01 LSD_005 1 g1 g2 -4.3 6.771197 8.301998 8.892519 2 g1 g3 -7.1 6.771197 8.301998 8.892519 3 g2 g3 -2.8 6.771197 8.301998 8.892519 Non-parametric tests for 2 groupsIn addition to the Mrdian test (already listed above), 2 programs for comparing non-parametric measurements in 2 groups are presented here. These are
First, the data to be used. This is the same as that for the Kruskall Wallis test, except that the group g2 is removed, so that the 2 groups g1 and g3 are being compared. # data entry dat = (" Grp Val g1 50 g1 57 g1 70 g1 60 g1 55 g3 70 g3 72 g3 60 g3 77 g3 75 ") df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame df$Rank <- rank(df$Val) # values to rank df # optional display of data mxRanks <- xtabs(~ df$Grp + df$Rank) # matrix of counts by rank mxRanks # optional display of counts of ranks arGrpNames <- rownames(mxRanks) # name of groups Wilcoxon Mann Whitney Testng = nrow(mxRanks) # number of groups nr = ncol(mxRanks) # number of rank divisions c(ng,nr) m = sum(mxRanks[1,]) # sample size grroup 1 n = sum(mxRanks[2,]) # sample size grroup 2 nn = m + n # total sample size c(m,n,nn) # calculate ties t = 0 for(i in 1 : nr) { z = mxRanks[1,i] + mxRanks[1,i] if(z>1) { t = t + (z * z * z - z) / 12 # calculate ties } } # Calculate w = sum of ranks w = c(0,0) # w for the 2 groups for(i in 1 : 2) { s <- subset(df$Rank, df$Grp==arGrpNames[i]) # ranks of each geoup for(j in 1:length(s)) { w[i] = w[i] + s[j] # w for each group } } # Wilcoxon Mann Whitney Test z z = (w[1] + 0.5 - m * (nn + 1) / 2) / sqrt((m * n / (nn * (nn - 1))) * ((nn * nn * nn - nn) / 12 - t)) p = 1 - pnorm(abs(z)) # Results output #n, w, mean rank of each group c(m, w[1], w[1] / m) c(n, w[2], w[2] / n) # Wilcoxon Mann Whitney U z and p c(z, p)The results are as follows > #n, w, mean rank of each group > c(m, w[1], w[1] / m) [1] 5.0 17.0 3.4 > c(n, w[2], w[2] / n) [1] 5.0 38.0 7.6 > # Wilcoxon Mann Whitney z and p > c(z, p) [1] -2.12132034 0.01694743 Robust Signed Rank Test (Mann-Whitney UFirstly the subroutine function to calculate statistical significance (p, α) from U, which will be called from the main program# subroutine significance of U SigOfU <- function(U, m, n) { U = abs(U) if(m>12 | n>12) return (sprintf("p = %.4g",(1 - pnorm(U)))) if(m<3 | n<3) return ("p = n.s.") i = m j = n if(n<m) { i = n j = m } if(i==3) { mx <- c(c(2.347,1.732,1.632,1.897,1.644,1.500,1.575,1.611,1.638,1.616), c(1e10, 2.273,2.324,2.912,2.605,2.777,2.353,2.553,2.369,2.449), c(1e10, 1e10,4.195,5.116,6.037,4.082,3.566,3.651,3.503,3.406), c(1e10, 1e10, 1e10, 1e10, 1e10,6.957,7.876,8.795,5.831,5.000)) mxC <- matrix(data = mx, nrow=10,ncol=4) } else if(i==4) { mx <- c(c(1.586,1.500,1.434,1.428,1.371,1.434,1.466,1.448,1.455), c(2.502,2.160,2.247,2.104,2.162,2.057,2.000,2.067,2.096), c(4.483,3.265,3.021,3.295,2.868,2.683,2.951,2.776,2.847), c(1e10, 1e10,6.899,4.786,4.252,4.423,4.276,4.017,3.904)) mxC <- matrix(data = mx, nrow=9,ncol=4) } else if(i==5) { mx <- c(c(1.447,1.362,1.308,1.378,1.361,1.361,1.340,1.369), c(2.063,1.936,1.954,1.919,1.893,1.900,1.890,1.923), c(2.859,2.622,2.465,2.556,2.536,2.496,2.497,2.497), c(7.187,3.913,4.246,3.730,3.388,3.443,3.435,3.444)) mxC <- matrix(data = mx, nrow=8,ncol=4) } else if(i==6) { mx <- c(c(1.335,1.326,1.327,1.338,1.339,1.320,1.330), c(1.860,1.816,1.796,1.845,1.829,1.833,1.825), c(2.502,2.500,2.443,2.349,2.339,2.337,2.349), c(3.712,3.519,3.230,3.224,3.164,3.161,3.151)) mxC <- matrix(data = mx, nrow=7,ncol=4) } else if(i==7) { mx <- c(c(1.333,1.310,1.320,1.313,1.302,1.318), c(1.804,1.807,1.790,1.776,1.769,1.787), c(2.331,2.263,2.287,2.248,2.240,2.239), c(3.195,3.088,2.967,3.002,2.979,2.929)) mxC <- matrix(data = mx, nrow=6,ncol=4) } else if(i==8) { mx <- c(c(1.295,1.283,1.284,1.290,1.293), c(1.766,1.765,1.756,1.746,1.759), c(2.251,2.236,2.209,2.205,2.198), c(2.954,2.925,2.880,2.856,2.845)) 
mxC <- matrix(data = mx, nrow=5,ncol=4) } else if(i==9) { mx <- c(c(1.294,1.304,1.288,1.299), c(1.744,1.742,1.744,1.737), c(2.206,2.181,2.172,2.172), c(2.857,2.802,2.798,2.770)) mxC <- matrix(data = mx, nrow=4,ncol=4) } else if(i==10) { mx <- c(c(1.295,1.284,1.284), c(1.723,1.726,1.720), c(2.161,2.152,2.144), c(2.770,2.733,2,718)) mxC <- matrix(data = mx, nrow=3,ncol=4) } else if(i==11) { mx <- c(c(1.289,1.290), c(1.716,1.708), c(2.138,2.127), c(2.705,2.683)) mxC <- matrix(data = mx, nrow=2,ncol=4) } else if(i==12) { mx <- matrix(c(1.283), c(1.708), c(2.117), c(2.661)) mxC <- matrix(data = mx, nrow=1,ncol=4) } if(U>=mxC[j-i+1,4]) return("p<0.01") if(U>=mxC[j-i+1,3]) return("p<0.025") if(U>=mxC[j-i+1,2]) return("p<0.05") if(U>=mxC[j-i+1,1]) return("p<0.1") return("p=n.s.") }The main program" This uses the same data entry as the Wilcoxon Mann Whitney Test, listed above. This will not be repeated here. The program starts from a copy ot the procedures to convert measurements to rans # Data entry as in Wilcoson test # Create ranks and tuen ranks into a matrix, copied b\from Wilcoson Mann Whitney Test ng = nrow(mxRanks) # number of groups nr = ncol(mxRanks) # number of rank divisions c(ng,nr) m = sum(mxRanks[1,]) # sample size grroup 1 n = sum(mxRanks[2,]) # sample size grroup 2 nn = m + n # total sample size c(m,n,nn) x1 = mxRanks[2,1] x2 = mxRanks[1,1] uyx = 0 uxy = 0 for(i in 2 : nr) { uyx = uyx + mxRanks[1,i] * x1 uxy = uxy + mxRanks[2,i] * x2 x1 = x1 + mxRanks[2,i] x2 = x2 + mxRanks[1,i] } uyx = uyx / m uxy = uxy / n x1 = 0 x2 = 0 vx = 0 vy = 0 for(i in 1 : nr) { vx = vx + mxRanks[1,i] * (uyx - x1)^2 vy = vy + mxRanks[2,i] * (uxy - x2)^2 x1 = x1 + mxRanks[2,i] x2 = x2 + mxRanks[1,i] } U = (m * uyx - n * uxy) / (2 * sqrt(vx + vy + uyx * uxy)) p = SigOfU(U, m, n) # significance (p, alpha) of U # results output c(U, m, n) # Mann Whitney's U and sample size of the 2 groups p # significance (p, α) of UThe results are as follows > # results output > c(U, m, n) # Mann Whitney's U and sample size of the 2 groups [1] -4.753127 5.000000 5.000000 > p # significance (p, α) of U [1] "p<0.025"
This sub-panel provides R codes for the Permutation Test
Section 1: the global variables which will be set up in the main program bu=ut used in the recursive calculations nSame <- 0 nLess <- 0 nMore <- 0 nv <- 0 sumTot <- 0 effectSize <- 0 refAr <- vector()Section 2: Subroutine function Permutate a recursive function calculating the permutation results Permutate <- function(m, indexAr) { if(indexAr[m]>nv) # last element of index exceeds data size { return(c(nLess, nSame, nMore)) # all permutation done } # Calculate sum = 0; for(i in 1 : m){ sum = sum + refAr[indexAr[i]] } # sum of permutated values diff = sum * 2.0 - sumTot # difference to reference if(abs(diff-effectSize)<1e-10) # same (trivial difference by binary math notation) { assign("nSame", nSame+1, envir = .GlobalEnv) } else { if(diff<effectSize) # less than input effect size { assign("nLess", nLess+1, envir = .GlobalEnv) } else # more than input effect size { assign("nMore", nMore+1, envir = .GlobalEnv) } } # recurse indexAr[m] = indexAr[m] + 1 # increment last cell of index for(i in m:2) # backwards to second cell { if(indexAr[i]>(nv-m+i)) # cell exceeds max for that position { indexAr[i-1] = indexAr[i-1] + 1 # advance prvious cell for(j in i:m) indexAr[j] = indexAr[j-1] + 1 # rest cell } } Permutate(m, indexAr) # recurse }Section 3: the main program # main program # data entry dat = (" Grp Val g1 57 g1 70 g1 60 g1 55 g2 58 g2 65 g2 70 g2 70 g2 72 g2 70 g2 72 g2 60 g2 77 g2 75 ") df <- read.table(textConnection(dat),header=TRUE) # conversion to data frame # set up and initialize globasl variables nv <- nrow(df) # sample size (global) # cross tab mxTmp <- xtabs(~ df$Grp + df$Val) # tmp matrix of counts arName <- rownames(mxTmp) # name of groups refAr <- df$Val # all the values (global) sumTot <- sum(df$Val) # sum(y) for all cases (global) # Effect size of data as entered dfGrp1 <- subset(df, df$Grp==arName[1]) # group 1 data m = nrow(dfGrp1) # sample size grp 1 # effect size of the 2 grps entered, to be used for comparisons recursively sumM = sum(dfGrp1$Val) # sum (values in grp 1) effectSize <- sumM * 2 - sumTot # Effect size of data as entered (global) # results nLess <- 0 # number of cases with smaller effect size (global) nSame <- 0 # number of cases with the same effect size (global) nMore <- 0 # number of cases with larger effect size (global) # set up initial index vector to control recursion indexAr <- vector() # index vector for(col in 1:m) indexAr <- append(indexAr, col) # fill to sample size of grp 1 #Call subroutine to calculate resAr <- Permutate(m, indexAr) # assemble results nTot = nLess + nSame + nMore # Total number of possible permutations pLess = nLess / nTot # proportion with smaller effect size pSame = nSame / nTot # proportion with the same effect size pMore = nMore / nTot # proportion with larger effect size # probability of obtaining reference effect size or more extreme p = min(pLess,pMore) + pSame # significance (probability, α) # result output effectSize # effect size of the 2 groups in input data c(nTot, nLess, nSame, nMore) # result numbers c(pLess, pSame, pMore) # result proportions p # probability of sum(x) or more extreme (p, α)The resuts are as follows > effectSize # effect size of the 2 groups in input data [1] -447 > c(nTot, nLess, nSame, nMore) # result numbers [1] 1001 17 11 973 > c(pLess, pSame, pMore) # result proportions [1] 0.01698302 0.01098901 0.97202797 > p # probability of sum(x) or more extreme (p, α) [1] 0.02797203
Parametric Tests

Comparison of Variances
Bartlett's Test: I have not read the original reference, but it is quoted in the NIST web site from which I obtained the algorithm.
Formulae: I obtained these from the National Institute of Standards and Technology (NIST) resource website.

One Way Analysis of Variance
Algorithm: Armitage P (1971) Statistical Methods in Medical Research. Blackwell Scientific Publications, Oxford. p. 189-207.

Confidence Intervals
Altman DG, Machin D, Bryant TN, Gardner MJ (2000) Statistics with Confidence, Second Edition. BMJ Books. ISBN 0 7279 1375 1. p. 28-31.

Post hoc Analysis
Least significant difference, Tukey: Steel RGD, Torrie JH, Dickey DA (1997) Principles and Procedures of Statistics: A Biomedical Approach, 3rd Ed. ISBN 0-07-061028-2. p. 191-192.
Studentised range tables: Pearson ES, Hartley HO (1966) Biometrika Tables for Statisticians, 3rd Ed., Table 29. See also https://en.wikipedia.org/wiki/Tukey's_range_test from Wikipedia.
Scheffe: Pedhazur EJ (1993) Multiple Regression in Behavioral Research: Explanation and Prediction, 3rd Ed. Harcourt Brace College Publishers, Orlando, Florida. ISBN 0-03-072831-2. p. 369-371.
Portney LG, Watkins MP (2000) Foundations of Clinical Research: Applications to Practice, Second Edition. Prentice Hall Health, New Jersey. ISBN 0-8385-2695-0. p. 460-461.
Steel RGD, Torrie JH, Dickey DA (1997) Principles and Procedures of Statistics: A Biomedical Approach, 3rd Ed. ISBN 0-07-061028-2. p. 189-190.

Nonparametric Tests
Siegel S, Castellan NJ Jr (1988) Nonparametric Statistics for the Behavioral Sciences, 2nd Ed. McGraw-Hill, Boston, Massachusetts. ISBN 0-07-057357-3.
Least significant difference between mean ranks: Siegel S, Castellan NJ Jr (1988) Nonparametric Statistics for the Behavioural Sciences. McGraw-Hill International Editions Statistics Series. ISBN 0-07-057357-3. p. 213-214.
Dunn's Test
Median Test
Permutation Test