Content Disclaimer. Copyright © 2020. All Rights Reserved.
This page provides explanations, clarifications, and support for Linear Discriminant Analysis, both as a web-based php program in the Discriminant Analysis Program Page and as an R program in a later panel on this page.
Discriminant Analysis is clearly and succinctly described in Wikipedia, and the general theory and description of the model will not be repeated here. However, Discriminant Analysis was first developed nearly a century ago, and the options for how the algorithm is used have evolved since. The algorithm in StatsToDo chooses options that differ from some statistical packages in the following manner.
This panel describes and explains the procedures used and the results presented by the Discriminant Analysis Program Page, using the default example data as demonstration.
Default Example Data

The default example data are artificially created to demonstrate the algorithm, and do not reflect reality. They purport to come from a study to discriminate between types of wine (SR for sweet red, DR for dry red, SW for sweet white, and DW for dry white), using 4 measurements: col 1 is a measurement of tannin, col 2 color, col 3 acidity, and col 4 sugar. Sixteen (16) wines are used to create the model, 4 from each type. The data table is as shown to the right, and is entered (without the headers) into the data input text area of the program.

Options

The program provides two buttons for the 2 options in how the results are presented.
Part 1 of Program: Create the Model

Part 1, which creates the Discriminant model, is the same for both options, regardless of which button is chosen.
Step 1. The program places the outcome group names in alphabetical order, and labels them as groups 1, 2, and so on. This reduces the space required to present the results when there are many outcome groups with long names. The table naming the groups is shown to the right, followed by the basic parameters: the number of cases, of variables, and of outcome groups.
Step 2. This calculates the means and Standard Deviations (SD) of all the independent variables. The means and Standard Deviations estimated from the reference data are assumed to be population parameters, and can be used validly with future data that use this model. The table of means and Standard Deviations is shown to the right. Please note that, with the default example data, variable 1=tannin, 2=color, 3=acidity, and 4=sugar.
Step 3. This calculates the standard deviate (z) for each independent variable, where z = (value - mean) / Standard Deviation. The z values are not displayed, but are used in all subsequent calculations.
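The standardization in Step 3 can be sketched in a few lines (an illustration only, not the program's own code; the values are column 1, tannin, of the default example data):

```python
import statistics

# Column 1 (tannin) of the default example data
tannin = [1.2, 1.3, 1.1, 1.6, 1.5, 1.5, 1.7, 1.6,
          1.1, 1.0, 0.9, 1.2, 1.4, 1.3, 1.1, 1.4]

m = statistics.mean(tannin)     # mean of the reference data
sd = statistics.stdev(tannin)   # sample Standard Deviation (n - 1 denominator)

# Standard deviate: z = (value - mean) / Standard Deviation
z = [(v - m) / sd for v in tannin]

# By construction, the z values have mean 0 and Standard Deviation 1
assert abs(statistics.mean(z)) < 1e-9
assert abs(statistics.stdev(z) - 1.0) < 1e-9
```

A future case can be standardized with the same m and sd, which is why the means and Standard Deviations of the reference data are retained as parameters of the model.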
Step 4. This calculates the coefficients for each combination of function and standardized variable value z. It then tests whether each function's contribution to discrimination is statistically significant, using the Chi Square Test. The table of significance is shown to the left, and the coefficients to the right. The # character identifies the functions that are not statistically significant. Please note that, in the coefficient table, the columns are the functions and the rows are the z values of the independent variables. The function coefficients are used to estimate the function scores.
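The significance of successive discriminant functions is commonly tested with Bartlett's Chi Square approximation on the eigenvalues of the remaining functions. Whether StatsToDo uses exactly this variant is not stated, so the sketch below is illustrative only, and the eigenvalues in the example are hypothetical:

```python
import math

def bartlett_chi2(eigenvalues, n_cases, n_vars, n_groups, k=0):
    """Chi Square statistic and degrees of freedom for testing whether
    the discriminant functions after the first k still discriminate.
    Uses Bartlett's approximation (a common, not the only, formulation)."""
    stat = (n_cases - 1 - (n_vars + n_groups) / 2.0) * sum(
        math.log(1.0 + ev) for ev in eigenvalues[k:])
    df = (n_vars - k) * (n_groups - k - 1)
    return stat, df

# Hypothetical eigenvalues, with 16 cases, 4 variables, 4 groups;
# k=2 tests only the third function: df = (4-2) * (4-2-1) = 2
stat, df = bartlett_chi2([5.0, 1.0, 0.1], 16, 4, 4, k=2)
```

The statistic is compared against the Chi Square distribution with the computed degrees of freedom; a non-significant result marks the remaining functions as dispensable, which is how function 3 of the default example comes to be flagged with #.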
This concludes Part 1 of the program, the development of the Discriminant model; the tables produced are the parameters of the model. Part 2 of the program consists of the description of the model produced.

Part 2 of Program: Description of the Model Created in Part 1

Depending on the button clicked in the Discriminant Analysis Program Page, either all the Discriminant functions created are used (with the default example data this is 3), or only those that are statistically significant (with the default example data this is 2, as function 3 is not statistically significant). The following explanations describe both, but users should be aware that only one set of results will be shown by the program, depending on which button is clicked.
Step 6. This combines the z values of the independent variables and the Discriminant function coefficients to estimate the function scores of each function for each case (row of data). The function scores are then collated in the multidimensional Euclidean space, and the distance from each case to each group centroid is computed. The distances are then converted to probabilities of belonging to each group. These probabilities are termed Maximum Likelihood, as they assume that the a priori probability of belonging to each group is the same, regardless of the sample sizes in the reference data. Two tables are shown here: the one on the left uses all 3 functions, and the one on the right the 2 significant functions. Please note: the program displays results with 4 decimal point precision, but these tables present only 2 decimal points, to conserve space and make the tables easier to read.
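The conversion from centroid distances to Maximum Likelihood probabilities can be sketched as follows, assuming the usual multivariate normal form in which the likelihood is proportional to exp(-d²/2); this illustrates the idea, and is not the program's literal code:

```python
import math

def max_likelihood_probs(distances):
    """Convert a case's Euclidean distances to the group centroids into
    probabilities of group membership, with equal a priori probability
    for every group (likelihood proportional to exp(-d^2 / 2))."""
    likelihoods = [math.exp(-d * d / 2.0) for d in distances]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]

# A case closest to the second group's centroid
probs = max_likelihood_probs([2.0, 0.5, 3.0, 2.5])
assert abs(sum(probs) - 1.0) < 1e-9   # probabilities sum to 1
assert probs.index(max(probs)) == 1   # the nearest centroid wins
```

Because every group gets the same weight before the distances are considered, the result does not depend on how many reference cases each group happened to contain.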
Step 7. The group with the highest probability is the group to which the case is allocated. This is compared with the original group designation in the data. A table of counts is thus produced, in which the numbers on the diagonal represent the cases correctly allocated by the model, while those off the diagonal were erroneously allocated. Only 1 table is shown here as, with the default reference data, the allocations perfectly match the designations under both options. The table is shown to the right.

Step 8. This produces 2 dimensional plots of the scores between pairs of Linear Discriminant functions. Each line in a plot joins a function score to the centroid of the designated group. With the default example data and all functions used, 3 plots are produced. If only the significant functions are included, then only 1 plot, between the first and second functions, is presented. The plots are shown as follows.
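The count table of Step 7 above is a confusion matrix of designated against allocated groups. A minimal sketch, using a hypothetical allocation with one error rather than the perfect allocation of the default data:

```python
from collections import Counter

def confusion_counts(designated, allocated, groups):
    """Rows are the original (designated) groups, columns the groups
    allocated by the model; diagonal cells count correct allocations."""
    pairs = Counter(zip(designated, allocated))
    return [[pairs[(d, a)] for a in groups] for d in groups]

groups = ["DR", "DW", "SR", "SW"]
designated = ["SR", "SR", "DR", "DW"]
allocated  = ["SR", "SW", "DR", "DW"]   # one SR case misallocated to SW

table = confusion_counts(designated, allocated, groups)
assert table[0][0] == 1   # a DR case correctly allocated (diagonal)
assert table[2][3] == 1   # an SR case erroneously allocated to SW
```

With the default example data every case lands on the diagonal, which is why the program shows a single, error-free table.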
Step 9. This reproduces the model parameters in the form of Javascript code, which can be copied and pasted into html pages, to be used away from the www and interactively with new data. A template html file, DiscriminantTemplate.html, is provided to incorporate and use these parameters. As the details of how to import the parameters and edit the html file are presented in the template file itself, they are not elaborated here.

Summary, Commentary, and Technical Issues

Discriminant Analysis was introduced by Fisher in the 1930s, as a variant of the multiple regression model, but with a multinomial group as the dependent variable and normally distributed measurements as the independent variables. It enjoys continued usage, but over time many modifications and additions have been made, producing options that users may wish to choose between, and results that are similar but numerically different. This confusion is further aggravated by some statistical packages keeping to the original algorithm, some including useful additions and modifications, and some providing menus from which users can choose different options. The following comments attempt to clarify some of these issues.
The Linear Discriminant analysis in R was carried out to check the accuracy of the web-based php program in the Discriminant Analysis Program Page. Only a minimal amount of code is used; users can search R resources for the numerous variations of calculation and graphical support for Discriminant analysis.
The R code

myDat = ("
v1 v2 v3 v4 Grp
1.2 45 3.16 72.7 SR
1.3 67 3.38 102.4 SR
1.1 48 3.61 33.7 SR
1.6 36 3.51 58.2 SR
1.5 47 3.20 44.2 DR
1.5 74 3.21 91.8 DR
1.7 47 3.39 53.1 DR
1.6 56 3.36 88.5 DR
1.1 27 3.30 36.3 SW
1.0 53 3.55 74.7 SW
0.9 37 3.23 94.2 SW
1.2 23 3.07 53.8 SW
1.4 44 3.34 20.7 DW
1.3 34 3.24 9.5 DW
1.1 37 3.24 17.8 DW
1.4 55 3.35 35.9 DW
")
myDataFrame <- read.table(textConnection(myDat), header=TRUE)
myDataFrame$z1 <- (myDataFrame$v1 - mean(myDataFrame$v1)) / sd(myDataFrame$v1)
myDataFrame$z2 <- (myDataFrame$v2 - mean(myDataFrame$v2)) / sd(myDataFrame$v2)
myDataFrame$z3 <- (myDataFrame$v3 - mean(myDataFrame$v3)) / sd(myDataFrame$v3)
myDataFrame$z4 <- (myDataFrame$v4 - mean(myDataFrame$v4)) / sd(myDataFrame$v4)
myDataFrame
#install.packages("MASS")   # if not already installed
library(MASS)
fit <- lda(Grp ~ z1 + z2 + z3 + z4, data=myDataFrame)
fit
predict(fit, newdata=myDataFrame, prior=c(1,1,1,1)/4)$x          # calculate function scores
predict(fit, newdata=myDataFrame, prior=c(1,1,1,1)/4)$posterior  # calculate probabilities

The Code and Results Explained

Please note that, on the original page, R code is shown in maroon and results in blue.

Step 1. Data entry

The data are entered with the myDat and read.table statements shown above. Please note that the headers are included in R, as they are required to call the algorithm.

Step 2. Calculate the Standard Deviate z = (v - mean) / SD

myDataFrame$z1 <- (myDataFrame$v1 - mean(myDataFrame$v1)) / sd(myDataFrame$v1)
myDataFrame$z2 <- (myDataFrame$v2 - mean(myDataFrame$v2)) / sd(myDataFrame$v2)
myDataFrame$z3 <- (myDataFrame$v3 - mean(myDataFrame$v3)) / sd(myDataFrame$v3)
myDataFrame$z4 <- (myDataFrame$v4 - mean(myDataFrame$v4)) / sd(myDataFrame$v4)

Step 3.
Display the data object, including the calculated z values

myDataFrame

The results are

   v1 v2   v3    v4 Grp          z1         z2         z3          z4
1  1.2 45 3.16  72.7  SR -0.45185501 -0.0460777 -1.1033570  0.58784387
2  1.3 67 3.38 102.4  SR -0.02657971  1.5758573  0.4019983  1.60105898
3  1.1 48 3.61  33.7  SR -0.87713031  0.1750953  1.9757788 -0.74264062
4  1.6 36 3.51  58.2  SR  1.24924619 -0.7095966  1.2915264  0.09317656
5  1.5 47 3.20  44.2  DR  0.82397089  0.1013709 -0.8296560 -0.38443326
6  1.5 74 3.21  91.8  DR  0.82397089  2.0919275 -0.7612308  1.23944012
7  1.7 47 3.39  53.1  DR  1.67452149  0.1013709  0.4704235 -0.08080988
8  1.6 56 3.36  88.5  DR  1.24924619  0.7648898  0.2651478  1.12686066
9  1.1 27 3.30  36.3  SW -0.87713031 -1.3731154 -0.1454036 -0.65394166
10 1.0 53 3.55  74.7  SW -1.30240561  0.5437168  1.5652273  0.65607384
11 0.9 37 3.23  94.2  SW -1.72768091 -0.6358722 -0.6243803  1.32131609
12 1.2 23 3.07  53.8  SW -0.45185501 -1.6680127 -1.7191841 -0.05692938
13 1.4 44 3.34  20.7  DW  0.39869559 -0.1198020  0.1282973 -1.18613545
14 1.3 34 3.24   9.5  DW -0.02657971 -0.8570452 -0.5559551 -1.56822331
15 1.1 37 3.24  17.8  DW -0.87713031 -0.6358722 -0.5559551 -1.28506892
16 1.4 55 3.35  35.9  DW  0.39869559  0.6911655  0.1967226 -0.66758765

Please note: z1, z2, z3, and z4 are the z values for v1, v2, v3, and v4.

Step 4.
Perform the Linear Discriminant analysis and display the results

#install.packages("MASS")   # if not already installed
library(MASS)
fit <- lda(Grp ~ z1 + z2 + z3 + z4, data=myDataFrame)
fit

Please note: the calculations are based on the z values and not on the original measurements.

Prior probabilities of groups:
  DR   DW   SR   SW
0.25 0.25 0.25 0.25

Group means:
            z1         z2         z3         z4
DR  1.14292737  0.7648898 -0.2138289  0.4752644
DW -0.02657971 -0.2303885 -0.1967226 -1.1767538
SR -0.02657971  0.2488196  0.6414866  0.3848597
SW -1.08976796 -0.7833209 -0.2309352  0.3166297

Coefficients of linear discriminants:
          LD1        LD2        LD3
z1  1.3710438 -0.7462890  0.1364612
z2  1.7208986  0.4156835 -0.2198221
z3 -0.6709095 -0.2961205 -0.8885895
z4 -1.5797134 -1.4082862  0.2128178

Proportion of trace:
   LD1    LD2    LD3
0.7899 0.1899 0.0202

The prior probabilities are calculated from the sample sizes of the groups. LD1, LD2, and LD3 are the 3 Linear Discriminant functions. The proportion of trace represents the proportion of the discriminating power carried by each function, and can be used to test for statistical significance.

Step 5.
Calculate the function scores

predict(fit, newdata=myDataFrame, prior=c(1,1,1,1)/4)$x   # calculate function scores

Please note that the prior term is actually unnecessary here, as it is not used when function scores are calculated. When a data object is passed as newdata, a separate set of data can be used, provided the appropriately labelled independent variables are present (in this example z1, z2, z3, and z4).

          LD1        LD2          LD3
1  -0.8871802 -0.1830651  1.054003243
2  -0.1234701 -1.6988951 -0.366512945
3  -1.0536723  1.1881588 -2.071887461
4  -0.5220622 -1.7409330 -0.801348480
5   2.4680678  0.2142880  0.745565901
6   3.2824521 -1.2654109  0.592784805
7   2.2823362 -1.2330374 -0.228987474
8   1.0710619 -2.2798046  0.006542448
9  -2.4349834  1.0478052  0.172180524
10 -2.9365081 -0.1894505 -1.548469220
11 -5.1313958 -0.6508717  0.740034619
12 -2.2466446  0.2331076  1.820538675
13  2.1281402  1.2850848 -0.285692745
14  1.3390091  2.0367135  0.345040379
15  0.1061805  2.3646456  0.240614788
16  2.6586689  0.8716648 -0.414407058

Step 6. Calculate the posterior (Bayesian) probability of belonging to each group

predict(fit, newdata=myDataFrame, prior=c(1,1,1,1)/4)$posterior   # calculate posterior (Bayesian) probabilities

Please note that the prior term, specifying the a priori probabilities, is used here. If it is left out, the program assumes the prior probabilities are the same as those in the reference data, which depend on the sample sizes of the groups there.
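The effect of the prior term can be sketched with Bayes' rule, in which the posterior probability is the prior times the likelihood, renormalized (an illustration of the principle, not MASS's internal code):

```python
def posterior_probs(likelihoods, priors):
    """Posterior probability of each group: prior x likelihood, rescaled
    so the probabilities sum to 1."""
    weighted = [p * lk for p, lk in zip(priors, likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]

# With equal priors, as given by prior=c(1,1,1,1)/4, the posteriors
# follow the likelihoods alone
equal = posterior_probs([0.6, 0.3, 0.1], [1/3, 1/3, 1/3])
assert abs(equal[0] - 0.6) < 1e-9

# Unequal priors (e.g. derived from unequal group sizes in the
# reference data) shift the posteriors toward the commoner groups
skewed = posterior_probs([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
assert skewed[1] > equal[1]
```

This is why specifying prior=c(1,1,1,1)/4 matters: it makes R's posterior probabilities match the equal-prior Maximum Likelihood probabilities produced by the StatsToDo program.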
             DR           DW           SR           SW
1  1.027719e-02 1.738244e-02 0.8056435151 1.666969e-01
2  7.584478e-02 1.695596e-03 0.9196816594 2.777961e-03
3  2.543569e-04 5.741198e-02 0.8883330540 5.400061e-02
4  1.793459e-02 5.428730e-04 0.9760628920 5.459645e-03
5  6.615467e-01 3.338972e-01 0.0045559379 1.919581e-07
6  9.948814e-01 4.791425e-03 0.0003272035 5.284128e-10
7  9.744939e-01 1.355788e-02 0.0119480765 1.254452e-07
8  8.326310e-01 1.399772e-03 0.1659475333 2.173213e-05
9  2.634318e-06 5.441063e-04 0.0758854570 9.235678e-01
10 7.201969e-07 1.160473e-05 0.1924352689 8.075524e-01
11 9.464596e-12 1.011071e-10 0.0001827173 9.998173e-01
12 2.029302e-05 2.288982e-04 0.0560473803 9.437034e-01
13 5.420201e-02 9.416221e-01 0.0041755258 3.779071e-07
14 4.865341e-03 9.917932e-01 0.0033348804 6.545613e-06
15 7.663001e-04 9.729023e-01 0.0250258317 1.305537e-03
16 2.029808e-01 7.940588e-01 0.0029603250 4.639190e-08

When translated to normal numerical format with 2 decimal point precision:

     DR   DW   SR   SW
1  0.01 0.02 0.81 0.17
2  0.08 0.00 0.92 0.00
3  0.00 0.06 0.89 0.05
4  0.02 0.00 0.98 0.01
5  0.66 0.33 0.00 0.00
6  0.99 0.00 0.00 0.00
7  0.97 0.01 0.01 0.00
8  0.83 0.00 0.17 0.00
9  0.00 0.00 0.08 0.92
10 0.00 0.00 0.19 0.81
11 0.00 0.00 0.00 1.00
12 0.00 0.00 0.06 0.94
13 0.05 0.94 0.00 0.00
14 0.00 0.99 0.00 0.00
15 0.00 0.97 0.03 0.00
16 0.20 0.79 0.00 0.00

The function coefficients, scores, and probabilities are the same as those produced by the program in the Discriminant Analysis Program Page, apart from minor discrepancies caused by different rounding errors.

General references on Discriminant analysis

Wikipedia on Discriminant Analysis.

George D and Mallery P (1999) SPSS for Windows Step by Step. A Simple Guide and Reference. Allyn and Bacon, Sydney. ISBN 0-205-28395-0. Chapter 26: The Discriminant Procedure, p.313-328.

Resources used to develop the web-based php program

These are very old books that still present the actual formulae and algorithms for every step of the calculations.
Most newer references do not provide detailed algorithms, but advise users to access available packages such as SAS, SPSS, R, and Python.

Overall JE and Klett CJ (1972) Applied Multivariate Analysis. McGraw Hill Series in Psychology. McGraw Hill Book Company, New York. Library of Congress No. 73-147164. ISBN 07-047935-6.
Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1989). Numerical Recipes in Pascal. Cambridge University Press. ISBN 0-521-37516-9. p.395-396 and p.402-404: the Jacobi method for finding eigenvalues and eigenvectors.

Norusis MJ (1979) SPSS Statistical Algorithms Release 8. SPSS Inc, Chicago. Chapter 23: Discriminant, p.69-83. Formulae for the algorithm provided by SPSS.

Useful references and advice on Discriminant analysis using R

https://www.geeksforgeeks.org/linear-discriminant-analysis-in-r-programming/
https://www.statmethods.net/advstats/discriminant.html

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.