Full explanations on Naive Bayes Probability, the terminology used, and the default example of this program are provided in the Introduction and Naive Bayes panels of the Classification by Bayes Probability Explained Page. This panel presents brief summaries, sufficient only to help new users to negotiate data entry and program options.
Default Example
The data in the default example are artificially generated and do not represent reality. They purport to come from an exercise that uses hair color (Dark and Light) and eye color (Blue, Brown, Others) to identify ethnicity (French, German, Italian). Once the model is established, it is used in a population where the ratios of French:German:Italian are 3:2:1 (normalized to 0.5:0.3333:0.1667), and decisions are biased by cost coefficients of 1:2:1 (normalized to 0.25:0.5:0.25).
Input
Multiple Predictors are used in the Naive Bayes model to predict the outcome. In our example, hair and eye colors are presented as a pattern, an array [hair, eye], for processing. The program uses Col to represent predictors: Col1, Col2, and so on.
Attributes (a) are mutually exclusive alternatives in each of the predictors.
Patterns (p) are arrays of attributes, one from each predictor. Patterns are used to predict outcomes.
Input Data are placed in the first text area labeled "Data" in Program 1, and can be in one of two formats:
- For Program 1, for building and testing a model, both the attributes of predictors and the outcome names are required. These are in multiple columns separated by spaces or tabs. All the columns except the last on the right are predictor columns. The last column on the right contains the outcome names. The program identifies the predictors as Col1, Col2, and so on.
- For all other programs in the program panel, data are optional, and when present are interpreted by the model. The number of columns must be the same as the number of predictors in the model. If more columns are provided, only the number of columns that are in the model will be read. If insufficient columns are provided, the program may crash or produce erroneous answers.
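The two input formats can be sketched as follows. This is only an illustration, not the page's own parser: the function names parseTrainingRows and parsePatternRows are hypothetical, the sample rows are made up, and throwing an error on too few columns is a choice made here for safety rather than the documented behavior.

```javascript
// Hypothetical helper for format 1: every column except the last is a
// predictor (Col1, Col2, ...); the last column is the outcome name.
function parseTrainingRows(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const cells = line.split(/\s+/); // spaces or tabs
      return {
        pattern: cells.slice(0, -1),
        outcome: cells[cells.length - 1],
      };
    });
}

// Hypothetical helper for format 2: keep only as many columns as the model
// has predictors; extra columns are ignored.
function parsePatternRows(text, predictorCount) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const cells = line.split(/\s+/);
      if (cells.length < predictorCount) {
        // Assumption: fail loudly instead of producing erroneous answers.
        throw new Error(`Expected ${predictorCount} columns, got ${cells.length}`);
      }
      return cells.slice(0, predictorCount);
    });
}

// Made-up sample rows, not the default example data.
const sample = "Dark Brown Italian\nLight\tBlue German\n";
const rows = parseTrainingRows(sample);
console.log(rows[0].pattern, rows[0].outcome);
```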
a priori Probability (π) is a belief in the probability of belonging to an outcome prior to executing the Bayesian model. In data entry, any set of numbers representing relative proportions (sample sizes, ratios, percents) can be used, and the program will normalize them by dividing each by the total. The default example assigns the ratio of French:German:Italian as 3:2:1, normalized to 0.5:0.3333:0.1667.
Cost Coefficients (C) are used to insert a bias into decisions based on value judgement. The input is an array of numbers representing the relative cost to the user of the model if an outcome is wrongly not identified, so it represents the importance of each outcome. Any set of numbers representing relative values can be used (dollars, quality of life, death rate), as the program normalizes the numbers to proportions of the total. The default example sets the relative importance of French:German:Italian to 1:2:1, normalized to 0.25:0.5:0.25.
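The normalization described for both π and C is a division of each number by the total. A minimal sketch, using the default example's ratios (the function name normalize is hypothetical, and rejecting a non-positive total is an assumption made here):

```javascript
// Divide each relative weight by the total so the results sum to 1.
function normalize(weights) {
  const total = weights.reduce((sum, w) => sum + w, 0);
  if (!(total > 0)) {
    throw new Error("Weights must sum to a positive number");
  }
  return weights.map((w) => w / total);
}

const priors = normalize([3, 2, 1]); // French:German:Italian
const costs = normalize([1, 2, 1]);

console.log(priors.map((p) => p.toFixed(4)).join(":")); // 0.5000:0.3333:0.1667
console.log(costs.join(":")); // 0.25:0.5:0.25
```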
Program Output
Probabilities
During model development, the coefficient produced from the reference data is the Probability of attribute (a) given outcome (o), P(a|o). When P(a|o) and an array of attributes (a pattern (p)) are used to predict, however, the following probabilities can be produced:
- If neither a priori probabilities nor costs are set (or they are the same for all outcomes), the outcome probabilities are P(outcome|pattern) or P(o|p), also known as Maximum Likelihood. This describes the model and the relationships between the patterns and outcomes.
- If a priori probabilities are set, but costs are not, the outcome probabilities are πP(o|p). This is the most commonly used setting and output, and is referred to as Naive Bayesian Probability or a posteriori Probability.
- If both a priori probabilities and costs are set, the outcome probabilities are CπP(o|p), referred to as Naive Bayesian Probability with cost adjustment.
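The three settings can be illustrated with one function. This is a sketch under stated assumptions, not the program's code: the P(a|o) values below are invented for illustration, the function names are hypothetical, and it assumes the usual Naive Bayes calculation, in which the P(a|o) of the pattern's attributes are multiplied for each outcome, weighted by π and C when set, and rescaled so the outcome probabilities sum to 1.

```javascript
const outcomes = ["French", "German", "Italian"];

// Invented P(a|o) values (NOT the default example's output).
// Each array follows the order of `outcomes`.
const pAttrGivenOutcome = [
  { Dark: [0.6, 0.3, 0.8], Light: [0.4, 0.7, 0.2] }, // Col1: hair
  { Blue: [0.3, 0.5, 0.1], Brown: [0.5, 0.3, 0.7], Others: [0.2, 0.2, 0.2] }, // Col2: eye
];

function normalize(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  return values.map((v) => v / total);
}

// priors and costs are optional arrays of relative weights; when omitted,
// all outcomes are weighted equally.
function outcomeProbabilities(pattern, table, outcomeCount, priors, costs) {
  const pi = priors ? normalize(priors) : new Array(outcomeCount).fill(1);
  const c = costs ? normalize(costs) : new Array(outcomeCount).fill(1);
  const scores = [];
  for (let o = 0; o < outcomeCount; o++) {
    let likelihood = 1;
    pattern.forEach((attribute, col) => {
      likelihood *= table[col][attribute][o]; // naive independence assumption
    });
    scores.push(c[o] * pi[o] * likelihood);
  }
  return normalize(scores); // assumption: rescaled to sum to 1
}

const pattern = ["Dark", "Brown"];
const maxLikelihood = outcomeProbabilities(pattern, pAttrGivenOutcome, 3);
const posterior = outcomeProbabilities(pattern, pAttrGivenOutcome, 3, [3, 2, 1]);
const costAdjusted = outcomeProbabilities(pattern, pAttrGivenOutcome, 3, [3, 2, 1], [1, 2, 1]);

console.log(outcomes, maxLikelihood, posterior, costAdjusted);
```

With these invented values, the most probable outcome for [Dark, Brown] changes from Italian to French once the 3:2:1 a priori probabilities are applied, which shows how much the setting can influence the decision.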
Missing Data: The program assumes that all data used to develop the model are valid, and includes them as attributes or outcomes. When the developed coefficients are used to interpret attributes, any attribute string that does not match the list of reference attributes for its predictor causes the program to skip that predictor; the calculation continues with whatever valid attributes remain.
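The handling of unmatched attributes can be sketched as below. The function name patternLikelihoods and the P(a|o) values are hypothetical, and reporting the skipped columns is an addition made here for clarity; the source only states that the predictor is not processed.

```javascript
// Invented P(a|o) values for illustration; arrays follow French, German, Italian.
const referenceTable = [
  { Dark: [0.6, 0.3, 0.8], Light: [0.4, 0.7, 0.2] }, // Col1: hair
  { Blue: [0.3, 0.5, 0.1], Brown: [0.5, 0.3, 0.7], Others: [0.2, 0.2, 0.2] }, // Col2: eye
];

// Multiply P(a|o) over the pattern, skipping any predictor whose attribute
// string is not among that predictor's reference attributes.
function patternLikelihoods(pattern, table, outcomeCount) {
  const likelihoods = new Array(outcomeCount).fill(1);
  const skippedColumns = [];
  pattern.forEach((attribute, col) => {
    const known =
      table[col] !== undefined &&
      Object.prototype.hasOwnProperty.call(table[col], attribute);
    if (!known) {
      skippedColumns.push(col + 1); // report as Col1, Col2, ...
      return; // this predictor contributes nothing to the product
    }
    for (let o = 0; o < outcomeCount; o++) {
      likelihoods[o] *= table[col][attribute][o];
    }
  });
  return { likelihoods, skippedColumns };
}

// "Green" is not a reference eye color, so only hair color is used.
const partial = patternLikelihoods(["Dark", "Green"], referenceTable, 3);
console.log(partial.likelihoods, partial.skippedColumns);
```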
Javascript Program produces a function that matches an attribute array (pattern) against the reference attributes and returns the appropriate array of probabilities. The P(a|o) table must be available in the text area in Program 3, and the program also reads the a priori and cost coefficients from their text boxes near Program 1. The script produced can be copied and pasted into an html page and used to produce probabilities. NaiveBayesTemplate.html is a template html page demonstrating how the function produced by this page is used, with the default example data.
Programs
The programs on this page consist of a single program with 3 points of entry (Programs 1-3), and a supplementary program (Program 4) to create a Javascript interpreter. Each Example button loads the default example data for its entry and runs the program using those data. Users can enter their own data and run the program with the program buttons.
Program 1 produces a prediction model using raw data, and interprets the same data
- Reads the multiple column data from the data text area
- Counts all the attribute/outcome combinations to produce the count table
- Presents the count table in the results and in the text area of Program 2
- Passes the results to the algorithms of Program 2 to complete the program.
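The counting step of Program 1 can be sketched as follows. The layout of the count table used here (counts[col][attribute][outcome]) and the function name buildCountTable are assumptions for illustration, and the rows are made up rather than taken from the default example.

```javascript
// Count every attribute/outcome combination, predictor by predictor.
function buildCountTable(rows) {
  const counts = []; // counts[col][attribute][outcome] -> number of rows
  for (const { pattern, outcome } of rows) {
    pattern.forEach((attribute, col) => {
      if (!counts[col]) counts[col] = Object.create(null);
      if (!counts[col][attribute]) counts[col][attribute] = Object.create(null);
      const cell = counts[col][attribute];
      cell[outcome] = (cell[outcome] || 0) + 1;
    });
  }
  return counts;
}

// Made-up rows: [hair, eye] pattern plus outcome name.
const trainingRows = [
  { pattern: ["Dark", "Brown"], outcome: "Italian" },
  { pattern: ["Dark", "Brown"], outcome: "Italian" },
  { pattern: ["Light", "Blue"], outcome: "German" },
  { pattern: ["Dark", "Blue"], outcome: "French" },
];

const counts = buildCountTable(trainingRows);
console.log(counts[0].Dark.Italian); // 2
console.log(counts[1].Blue.German); // 1
```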
Program 2 produces the coefficients P(a|o) using a table of counts. On clicking the Example or Analyse Reference Table button, or if Program 1 is already running, the program:
- Reads the count table from the text area of Program 2 or uses the table as passed from Program 1
- Calculates the coefficients P(a|o), and deposits the coefficient table to the text area in Program 3
- Passes the results to the algorithm of Program 3
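How counts become coefficients can be sketched as below. This assumes P(a|o) is the plain relative frequency of attribute a among the reference cases with outcome o, within one predictor; whether the program applies any smoothing or zero-count correction is not stated here, so none is shown. The counts and the function name conditionalProbabilities are invented for illustration.

```javascript
// Invented counts for one predictor (hair color).
// Each array follows the outcome order French, German, Italian.
const hairCounts = { Dark: [30, 10, 40], Light: [20, 40, 10] };

// P(a|o) = count(a, o) / total count of outcome o within this predictor.
function conditionalProbabilities(countsForPredictor, outcomeCount) {
  const totals = new Array(outcomeCount).fill(0);
  for (const row of Object.values(countsForPredictor)) {
    row.forEach((n, o) => {
      totals[o] += n;
    });
  }
  const result = {};
  for (const [attribute, row] of Object.entries(countsForPredictor)) {
    // Assumption: an outcome with no reference cases yields 0, not NaN.
    result[attribute] = row.map((n, o) => (totals[o] > 0 ? n / totals[o] : 0));
  }
  return result;
}

const hairProbs = conditionalProbabilities(hairCounts, 3);
console.log(hairProbs.Dark); // P(Dark|French), P(Dark|German), P(Dark|Italian)
```

Under this assumption the coefficients for each outcome sum to 1 across the attributes of a predictor, for example P(Dark|French) + P(Light|French) = 1.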
Program 3 produces a posteriori probabilities using the table of coefficients P(a|o). On clicking the Example or Calculate Probabilities button, or if Program 1 or 2 is already running, the program:
- Reads the coefficient table P(a|o) from the text area of Program 3 or the table passed from Program 1 or 2
- Calculates and presents the a posteriori probabilities. These will be
- the Maximum Likelihood for patterns P(o|p), if neither a priori probabilities nor costs are set
- the Bayesian Probability πP(o|p), if a priori probabilities are set but costs not
- the Bayesian Probability with cost adjustment CπP(o|p), if both a priori probabilities and costs are set.
- If the number of possible patterns (the product of the numbers of attributes in all predictors) is 500 or fewer, their probabilities are presented in a table for reference purposes. This table is not produced if the number exceeds 500.
- If data exist in the text area of Program 1, they are interpreted and the results presented in a table. If the number of rows exceeds 500, only the first 500 are processed.
- These limits avoid the program crashing when the time limit for processing is exceeded.
- In the tables of output, the outcome with the highest probability is marked in bold, to indicate the preferred decision. If more than one outcome shares the highest value, only the first (alphabetically) is marked.
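The pattern limit and the tie-breaking rule can be sketched as follows. The function names are hypothetical, and returning null when the limit is exceeded is a choice made for this illustration.

```javascript
// List every possible pattern (one attribute from each predictor), or
// return null when the number of patterns exceeds the limit.
function enumeratePatterns(attributesPerPredictor, limit = 500) {
  const total = attributesPerPredictor.reduce((n, attrs) => n * attrs.length, 1);
  if (total > limit) return null;
  return attributesPerPredictor.reduce(
    (patterns, attrs) => patterns.flatMap((p) => attrs.map((a) => [...p, a])),
    [[]]
  );
}

// Pick the outcome with the highest probability; on a tie, the outcome
// that comes first alphabetically wins.
function preferredOutcome(outcomes, probabilities) {
  const order = outcomes
    .map((name, index) => index)
    .sort((a, b) => (outcomes[a] < outcomes[b] ? -1 : outcomes[a] > outcomes[b] ? 1 : 0));
  let best = order[0];
  for (const i of order) {
    if (probabilities[i] > probabilities[best]) best = i;
  }
  return outcomes[best];
}

// The default example's predictors give 2 x 3 = 6 patterns.
const patterns = enumeratePatterns([
  ["Dark", "Light"],
  ["Blue", "Brown", "Others"],
]);
console.log(patterns.length); // 6
console.log(preferredOutcome(["Italian", "French", "German"], [0.4, 0.4, 0.2])); // French
```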
Program 4 is supplementary, and can be executed when the table of P(a|o) is available in the text area in Program 3. On clicking the Create Javascript Code for Interpretation button, the program:
- Reads the table of P(a|o) from the text area of Program 3, and the a priori probabilities and costs from their text boxes in Program 1
- Creates and presents the Javascript code as a function, which accepts an array of attribute strings (pattern), and returns an array of probabilities
An example and template page can be examined to see how the Javascript function can be incorporated into an html file and used in other applications.