Copyright @2020.
All Rights Reserved.
StatsToDo : Probability of Chi Square
Explanations, Calculations, Codes and Tables

The chi-square distribution is the distribution of a sum of squared independent standard normal variables, and is a special case of the gamma distribution. There are several related distributions, such as the non-central chi-square distribution, the chi distribution and the non-central chi distribution. However, the most common is the central chi-square distribution, which is what this discussion focuses on. The chi-square distribution allows only non-negative numbers and is positively (right) skewed. The curve is specified by the degrees of freedom (df), which is the number of unconstrained variables whose squares are being summed, and which must be positive. As the degrees of freedom get larger, the chi-square distribution approaches the normal distribution. The mean of the distribution is the degrees of freedom, and the standard deviation is the square root of 2*df.
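The stated mean and standard deviation can be checked by simulation. The following is a minimal Python sketch, not part of the original page: the function names, the sample size and the seed are illustrative assumptions. It builds chi-square values by summing squared standard normal draws and compares the sample mean and standard deviation with df and the square root of 2*df.

```python
import math
import random


def simulate_chi_square(df, n_samples=20000, seed=1):
    """Draw n_samples values, each the sum of df squared standard normals.

    The name, sample size and seed are illustrative choices only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_samples)
    ]


def mean_and_sd(values):
    """Return the sample mean and the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    df = 5
    mean, sd = mean_and_sd(simulate_chi_square(df))
    print(f"simulated mean = {mean:.3f}, theoretical = {df}")
    print(f"simulated sd   = {sd:.3f}, theoretical = {math.sqrt(2 * df):.3f}")
```

The simulated values are never negative, and the sample mean and standard deviation should sit close to the theoretical values, with the agreement tightening as the sample size grows.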

The best-known applications of the chi-square distribution are the chi-square goodness-of-fit test, which compares an observed distribution to a theoretical one, and the test of independence between two categorical variables (Pearson's chi-square test). However, many other tests also use the chi-square distribution. It is also an integral part of the F distribution, whose test statistic is the ratio of two independent chi-square variables, each divided by its degrees of freedom.
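As an illustration of the two tests named above, the sketch below computes the Pearson chi-square statistic and its degrees of freedom in Python. The function names and the example counts are hypothetical. The goodness-of-fit degrees of freedom assume that no parameters of the theoretical distribution were estimated from the data.

```python
def goodness_of_fit(observed, expected):
    """Pearson statistic comparing observed counts to expected counts.

    Returns (statistic, df), with df = number of categories - 1, which
    assumes no parameters were estimated from the data.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


def independence(table):
    """Pearson statistic for independence in a rows x columns count table.

    Expected counts are row total * column total / grand total, and
    df = (rows - 1) * (columns - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calls.
    print(goodness_of_fit([18, 22, 20], [20, 20, 20]))
    print(independence([[10, 20], [20, 10]]))
```

The statistic and degrees of freedom returned by either function are the two inputs needed to look up or calculate the probability of chi-square.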

Chi-square for large degrees of freedom

Calculating the probability associated with chi-square using the standard algorithm, as described by Press et al., involves convoluted algorithms and very large intermediate numbers. Depending on the computer, the calculation becomes impossible at degrees of freedom somewhere between 100 and 300. The program either crashes, or a maximum chi-square value is presented regardless of further changes in probability or degrees of freedom.
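To give a sense of what the standard calculation involves, the following Python sketch evaluates the upper-tail probability from the series form of the regularized incomplete gamma function, with a = df/2 and x = chi-square/2. It is an illustrative sketch and not the program used on this page. The function name, tolerance and iteration limit are assumptions. The factor exp(-x) * x^a / Gamma(a) is evaluated through logarithms here, which is one way to keep the large intermediate numbers within range. Only the series branch is shown, so the sketch is suitable only when chi-square is not far above the degrees of freedom. Fuller implementations switch to a continued fraction when x exceeds a + 1.

```python
import math


def chi_square_upper_tail(chi2, df, max_terms=10000, tol=1e-14):
    """Upper-tail probability P(X > chi2) for a central chi-square with df.

    Uses the series for the regularized lower incomplete gamma function
    P(a, x), with a = df / 2 and x = chi2 / 2, and returns 1 - P(a, x).
    """
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if chi2 <= 0:
        return 1.0
    a = df / 2.0
    x = chi2 / 2.0

    # Series: sum over n of x**n / (a * (a + 1) * ... * (a + n)).
    term = 1.0 / a
    total = term
    denominator = a
    for _ in range(max_terms):
        denominator += 1.0
        term *= x / denominator
        total += term
        if abs(term) < abs(total) * tol:
            break

    # exp(-x) * x**a / Gamma(a), computed in log space so that x**a and
    # Gamma(a) never have to be represented on their own.
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    lower_tail = total * math.exp(log_prefactor)
    return min(1.0, max(0.0, 1.0 - lower_tail))


if __name__ == "__main__":
    for chi2, df in [(5.991, 2), (18.307, 10), (1000.0, 1000)]:
        print(f"chi2 = {chi2}, df = {df}, p = {chi_square_upper_tail(chi2, df):.4f}")
```

For 2 degrees of freedom the result can be checked against the closed form exp(-chi2 / 2).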

Wilson and Hilferty devised an approximation of the probability associated with chi-square that allows for very large chi-square values and degrees of freedom. The difference between this and the standard method was found to be trivial, less than 1%, when the degrees of freedom are 200 or more, but the approximation becomes progressively less accurate as the degrees of freedom decrease. The general advice is that the Wilson-Hilferty approximation is not necessary when the degrees of freedom are less than 100. Between 100 and 150 degrees of freedom, the results vary with the computer used. On fast computers with 64-bit processors, the basic calculations can be performed even with degrees of freedom as high as 300. With processors of 32 bits or fewer, the calculations fail somewhere between 100 and 150 degrees of freedom.
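The text above does not give the formula. The Python sketch below uses the commonly cited form of the Wilson-Hilferty approximation, in which the cube root of chi-square/df is treated as approximately normal with mean 1 - 2/(9*df) and variance 2/(9*df). That form, and the function name, are assumptions of this sketch and are not taken from the program on this page.

```python
import math


def wilson_hilferty_upper_tail(chi2, df):
    """Approximate P(X > chi2) for a central chi-square with df.

    Cube-root normal approximation: (chi2 / df) ** (1/3) is treated as
    normal with mean 1 - 2 / (9 * df) and variance 2 / (9 * df).
    """
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if chi2 <= 0:
        return 1.0
    variance = 2.0 / (9.0 * df)
    z = ((chi2 / df) ** (1.0 / 3.0) - (1.0 - variance)) / math.sqrt(variance)
    # Upper tail of the standard normal, written with the complementary
    # error function so small tail probabilities are computed directly.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for chi2, df in [(124.342, 100), (200.0, 200), (1.0e6, 1.0e6)]:
        print(f"chi2 = {chi2}, df = {df}, p = {wilson_hilferty_upper_tail(chi2, df):.4f}")
```

Because the calculation involves only a cube root, a square root and the normal tail, it avoids the large intermediate numbers of the standard method and stays usable at very large degrees of freedom.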

The JavaScript program on this page uses the basic algorithm for degrees of freedom up to 100, and the Wilson-Hilferty approximation for degrees of freedom of 101 or more. Users may therefore find some inconsistencies between the results from the JavaScript program and those from other sources, particularly for degrees of freedom between 100 and 150. However, the discrepancies should be less than 1% of the values.

The other panels on this page are

  • Calculations: JavaScript program to calculate the probability of chi-square
  • Codes: R and Python codes to calculate the probability of chi-square
  • Tables: Tables for the probability of chi-square
References

Wikipedia on chi-square: https://en.wikipedia.org/wiki/Chi-square_distribution

Norusis MJ (1978) SPSS Statistical Algorithms Release 8; A companion to the manual [SPSS, second Edition by Nie, et al.]. SPSS Inc. Library of Congress ID HA33.S15 029.779-4159

Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1994) Numerical Recipes in Pascal. Cambridge University Press UK. ISBN 0-521-37516-9 p. 180-186

Wilson EB and Hilferty MM (1931) Proceedings of the National Academy of Sciences 17 (12) p. 684-688

Greenwood JA and Sandomire MM (1950) Journal of the American Statistical Association 45 (250) p. 257-260