Copyright @2020.
All Rights Reserved.

StatsToDo: Factor Analysis Explanation and Programs

Generalized Linear Models calculate regression coefficients by adapting the model to the data presented, using numerical approximation.

A particular advantage of the models is that the algorithms can cope with a combination of factors (group names in text) and values (numerical data). This allows the models to be highly complex, provided that the probability distribution of the data is known.
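The following is a minimal sketch of how a factor and a value can be combined in one model. The data frame, its column names (group, dose, y) and the numbers are made up for illustration and are not taken from the programs on this page. It uses model.matrix() to show how R expands a factor into indicator columns before the coefficients are estimated.

```r
# Hypothetical example data: one factor (group names in text) and one value
dat <- data.frame(
  group = factor(c("A", "A", "B", "B", "C", "C")),
  dose  = c(1.0, 2.0, 1.5, 2.5, 1.0, 3.0),
  y     = c(2.1, 3.0, 3.2, 4.1, 4.0, 6.2)
)

# model.matrix() shows how the factor is expanded into 0/1 indicator columns
# (treatment contrasts by default, with the first level "A" as the reference)
X <- model.matrix(y ~ group + dose, data = dat)
print(X)

# The same formula is passed unchanged to glm(), which handles the expansion
fit <- glm(y ~ group + dose, family = gaussian(), data = dat)
print(coef(fit))
```

The user therefore only needs to declare text columns as factors. The conversion to numerical indicator columns is handled by the formula interface.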

Programs

The panels of this page provide R code for regression with the following distributions for the dependent variable:

  • Gaussian, where the dependent variable is normally distributed. This model is also termed the General Linear Model, not to be confused with Generalized Linear Models, the generic name for all the models presented here
    • When all the independent variables are factors (group names in text), the results are similar to those produced by the Analysis of Variance
    • When all the independent variables are values (numerical measurements), the results are similar to those of Multiple Regression
    • When the independent variables are a mixture of factors and values, the results are similar to those of the Analysis of Covariance
  • Proportions. The following models of regression for proportions are available
    • Binomial, where the dependent variable is the probabilities in each of two groups (no/yes, false/true)
    • Negative Binomial, where the dependent variable is the Odds Ratio, the number of cases in one group for each case in the other group
    • Multinomial, the same as Binomial, except there are more than 2 groups
    • Ordinal, the same as Multinomial, except the groups are ordered
  • Poisson, where the dependent variable is a count of events in a defined environment
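As a rough illustration of the list above, the sketch below fits three of the models with the base R function glm(), changing only the family argument. The data are simulated and the variable names (grp, x, yGauss, yBin, yCount) are hypothetical. It also checks the statement that the Gaussian model reproduces an ordinary linear model. Negative Binomial, Multinomial and Ordinal models are not families of base glm(). Functions such as MASS::glm.nb, nnet::multinom and MASS::polr are commonly used for them, but whether the templates on this page use these functions is an assumption.

```r
set.seed(1)  # arbitrary seed so the made-up data are reproducible

# Hypothetical data: a two-level factor and a numeric predictor
n   <- 40
dat <- data.frame(
  grp = factor(rep(c("ctrl", "trt"), each = n / 2)),
  x   = runif(n, 0, 10)
)
dat$yGauss <- 5 + 2 * (dat$grp == "trt") + 0.5 * dat$x + rnorm(n)
dat$yBin   <- rbinom(n, size = 1, prob = plogis(-2 + 0.4 * dat$x))
dat$yCount <- rpois(n, lambda = exp(0.2 + 0.15 * dat$x))

# Same formula structure, different distribution for the dependent variable
fitGauss <- glm(yGauss ~ grp + x, family = gaussian(), data = dat)
fitBin   <- glm(yBin   ~ grp + x, family = binomial(), data = dat)
fitPois  <- glm(yCount ~ grp + x, family = poisson(),  data = dat)

# With the Gaussian family (identity link) the coefficients match lm(),
# here an Analysis of Covariance because grp is a factor and x is a value
fitLm <- lm(yGauss ~ grp + x, data = dat)
print(all.equal(coef(fitGauss), coef(fitLm), tolerance = 1e-6))

# Coefficient table of the Poisson model (log link by default)
print(summary(fitPois)$coefficients)
```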

Format

Each program is described in one of the subsequent panels. Each panel contains an introduction and a program template, each in a sub-panel.

Each program is provided with a set of example data, to demonstrate the procedures. Please note:

  • The research model is deliberately simplistic so the user is not distracted from the computation
  • The data are computer generated and do not represent anything real.
  • The sample size is deliberately small, to make visualization easier.

The R code in each program is broken up into its constituent steps, and each step contains

  • Description of the step (in black)
  • The R code (in Maroon)
  • The results from that step (in Navy blue)

To reconstitute the whole program, the user should

  1. Copy all the code in Maroon to the source code panel of RStudio, in the same order as in the template. This should include the example data
  2. Test run the program to make sure it works
  3. Change the example data to the user's own data
  4. Repeat cycles of test running and editing the code (add, delete, and modify) until the required results are produced.
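The steps above can be sketched as a small reconstituted program. This is an illustration only and is not one of the templates from the panels. The example data, the column names (grp, x, y), the choice of the Poisson family and the file name mydata.csv are all hypothetical.

```r
# Step 1: example data entered as text, copied together with the code
txt <- "
grp  x   y
A    1   3
A    2   5
A    3   4
B    1   6
B    2   9
B    3   11
"
# Text columns are converted to factors, numeric columns stay as values
dat <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

# Step 2: test run, fit the model and inspect the basic results
fit <- glm(y ~ grp + x, family = poisson(), data = dat)
print(summary(fit))

# Step 3: replace the example data with the user's own, for example
# dat <- read.csv("mydata.csv", stringsAsFactors = TRUE)  # hypothetical file

# Step 4: rerun and edit (formula, family, options) until the results are as required
```

Keeping the example data inside the script for the first test run makes it easier to separate errors in the code from problems in the user's own data.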

References

Please note that the R code template contains only the minimum amount of code, and produces only the basic results. Users may want to produce a more complete program, including the many options that are available with each procedure.

References are provided in the Explanation panel of each program for this purpose.