Copyright © 2020.
All Rights Reserved.
StatsToDo : Discriminant Analysis Explained


Related link :
Discriminant Analysis Program Page

This page provides explanations, clarifications, and support for Linear Discriminant Analysis, which is available as a web-based PHP program on the Discriminant Analysis Program Page and as an R program in a later panel on this page.

Discriminant Analysis is clearly and succinctly described in Wikipedia, and the general theory and descriptions of the model will not be repeated here.

However, Discriminant Analysis was first developed in the 1930s, and options on how the algorithm is used have developed since. The algorithm in StatsToDo differs from some of the statistical packages in the following choices.

  • Discriminant Analysis was created as a variant of multiple regression, based on the least squares analysis of variance. The coefficients were calculated using the covariance matrix derived from the values of the independent variables. Experience in using the method showed that the results can be distorted by differences in the scales used to represent the independent variables. For example, weight in pounds and height in inches will produce different results from the same data expressed in kilograms and centimetres.

    By first converting every measurement to its standard deviate z, where z = (value - mean) / Standard Deviation, all measurements are placed on the same scale, with a mean of 0 and a Standard Deviation of 1, and the results obtained are consistent regardless of the units of measurement in the original data. Some, but not all, statistical packages offer this option. The programs described here and in the Discriminant Analysis Program Page use the z values by default, so the results sometimes differ from those produced by the commonly used statistical packages.

  • The Linear Discriminant functions are estimated using Principal Component Analysis, and the maximum number of functions produced is one less than the number of outcome groups. As these functions are based on Principal Component extractions, they have a hierarchy of precision: the earlier extracted functions contain more discriminant capability than the later ones, particularly if there are correlations between the independent variables. Experience shows that, in some cases, using all the Discriminant functions produced actually reduces the precision of the algorithm, as the later functions contain mostly statistical noise.

    However, most statistical packages continue to use all the Discriminant functions. This is useful for understanding the structure of the model produced, but less useful when the functions are used to predict outcomes in future, different sets of data. The program in the Discriminant Analysis Program Page provides 2 buttons, allowing the user to choose whether all functions are used, or only those found to be statistically significant (and therefore containing less random noise).
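
The conversion to standard deviates described in the first point above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the StatsToDo program itself. The function name z_scores and the weight values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, as this page does not state which form is used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Convert each value to its standard deviate: (value - mean) / SD."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / sd for v in values]


# Hypothetical weights of five subjects, in kilograms and in pounds.
weights_kg = [60.0, 72.5, 81.0, 55.5, 90.0]
weights_lb = [kg * 2.20462 for kg in weights_kg]

z_kg = z_scores(weights_kg)
z_lb = z_scores(weights_lb)

# Compare the two sets of z values, allowing for floating-point rounding.
same = all(abs(a - b) < 1e-9 for a, b in zip(z_kg, z_lb))
print(same)
```

Because the z values are the same whichever unit the weights were recorded in, any analysis carried out on them does not depend on the original units of measurement.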

The contents of the other panels on this page are as follows:
  • Example explains how the program works, and how the results can be interpreted, using the algorithm and default example data from the Discriminant Analysis Program Page
  • R presents the basic program in the R language. This allows users who are familiar with R, or willing to learn it, to use the code as a template for developing their own programs, independent of the web and StatsToDo
  • References is a short list of references for Discriminant Analysis