Practical Statistics for Data Scientists Notes Chapter Five
Logistic regression outputs a propensity score (a probability), so converting it to a yes/no decision requires a cutoff, and that cutoff can slide depending on the relative costs of false positives and false negatives.
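A minimal sketch of sliding the cutoff, assuming scikit-learn's LogisticRegression and toy data (the cutoffs 0.3/0.5/0.7 are illustrative assumptions, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy binary-outcome data (assumption: any labeled dataset works here)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# predict_proba returns [P(Y=0), P(Y=1)]; the second column is the propensity
propensity = model.predict_proba(X)[:, 1]

# Slide the cutoff: lower cutoffs flag more records as 1 (more recall,
# more false positives); 0.5 is only a default, not a rule.
for cutoff in (0.3, 0.5, 0.7):
    decision = (propensity >= cutoff).astype(int)
    print(f"cutoff={cutoff}: predicted 1s = {decision.sum()}")
```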
Naive Bayes
The Naive Bayes algorithm uses the probability of observing predictor values, given an outcome, to estimate the probability of observing outcome Y = i, given a set of predictor values. In the standard algorithm, the predictor variables must be categorical.
Naive Bayes can still be applied to numerical predictors in two ways: bin the numerical variables and convert them to categorical predictors, or use a probability model (for example, the normal distribution) to estimate the conditional probability P(X_j | Y = i). A sketch of both follows.
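A minimal sketch of both approaches in scikit-learn, assuming toy numerical predictors: CategoricalNB handles the binned version, GaussianNB implements the normal-model version.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB, GaussianNB
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))                        # numerical predictors (toy)
y = (X[:, 0] + rng.normal(size=300) > 0).astype(int)

# Approach 1: bin the numerical predictors, then treat the bins as categories
binner = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")
X_binned = binner.fit_transform(X).astype(int)
cat_nb = CategoricalNB().fit(X_binned, y)

# Approach 2: model each predictor as normally distributed within each class
gauss_nb = GaussianNB().fit(X, y)

print(cat_nb.predict_proba(X_binned[:3]))   # P(Y = i | binned predictors)
print(gauss_nb.predict_proba(X[:3]))        # P(Y = i | numerical predictors)
```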
Discriminant Analysis
LDA is the most common discriminant analysis technique. It has links to PCA, can provide a measure of predictor importance, and serves as a computationally efficient method of feature selection. To divide the records into two groups, LDA maximizes the “between” sum of squares relative to the “within” sum of squares, which yields the greatest separation between the two groups.
Using the discriminant function weights, LDA splits the predictor space into two regions (the solid line in the book's figure). Predictions farther from the dividing line have a higher level of confidence.
Both LDA and PCA are linear transformation techniques: LDA is supervised, whereas PCA is unsupervised and ignores class labels. Remember that LDA assumes normally distributed classes with equal class covariances. The discriminant function is applied to each record to derive a score, from the fitted weights, that determines its estimated class.
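A minimal LDA sketch with scikit-learn, assuming two toy classes drawn with a shared covariance (matching the equal-covariance assumption above); scalings_ holds the discriminant function weights, and projecting records onto them gives the per-record scores:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
cov = [[1.0, 0.5], [0.5, 1.0]]                 # shared covariance (toy)
X0 = rng.multivariate_normal([0, 0], cov, size=100)
X1 = rng.multivariate_normal([2, 2], cov, size=100)
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

lda = LinearDiscriminantAnalysis().fit(X, y)

print(lda.scalings_)            # discriminant function weights
scores = X @ lda.scalings_      # each record's score along the discriminant axis
print(lda.predict(X[:5]))       # estimated class per record
```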
Logistic Regression
In contrast to data-centric methods such as K-Nearest Neighbors and Naive Bayes, logistic regression is a structured model approach: it is fast to compute and produces a model that scores new records rapidly.
Its key ingredients are the logistic response function and the logit, with which we map a probability (confined to a 0-1 scale) onto the more expansive scale, from minus infinity to plus infinity, suitable for linear modeling.
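Written out, the two ingredients (standard notation, with q predictors):

```latex
% Logistic response function: maps the linear predictor to a probability in (0, 1)
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_q x_q)}}

% Logit (log-odds): maps p back to the full real line, where linear modeling lives
\operatorname{logit}(p) = \log\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_q x_q
```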
Evaluation:
- ROC curve: plots recall (sensitivity) against specificity as the cutoff slides through all possible values.
- AUC: the area under the ROC curve; 1 marks a perfect classifier, 0.5 a coin flip (see the sketch after this list).
- Lift: how much more effectively the model identifies 1s than random selection, often examined decile by decile.
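A minimal ROC/AUC sketch with scikit-learn (the model and the imbalanced toy data are assumptions; any fitted binary classifier with predicted propensities works):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Toy data, deliberately imbalanced (80/20) for illustration
X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
propensity = model.predict_proba(X_te)[:, 1]

# roc_curve evaluates every candidate cutoff at once:
# one (false positive rate, recall) point per cutoff
fpr, tpr, cutoffs = roc_curve(y_te, propensity)

# AUC condenses the whole curve into a single number
print("AUC:", roc_auc_score(y_te, propensity))
```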
Strategies for imbalanced data:
- Undersample: use fewer of the prevalent-class records when fitting.
- Oversample: bootstrap the rare-class records (or weight them up) so they are not swamped.
- Data generation: create new synthetic records resembling the rare class (e.g., SMOTE).
- Cost-based classification: pick the cutoff from the expected costs of misclassification rather than raw accuracy.
- Explore the predictions: inspect borderline and misclassified records to understand model behavior; a resampling sketch follows this list.
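A minimal undersampling/oversampling sketch using sklearn.utils.resample, assuming a toy frame with roughly 10% 1s (for the data-generation strategy, the separate imbalanced-learn package provides SMOTE):

```python
import numpy as np
import pandas as pd
from sklearn.utils import resample

# Toy imbalanced data: roughly 90% class 0, 10% class 1 (assumption)
rng = np.random.default_rng(2)
df = pd.DataFrame({"x": rng.normal(size=1000),
                   "y": (rng.random(1000) < 0.1).astype(int)})
majority, minority = df[df.y == 0], df[df.y == 1]

# Undersample: shrink the majority class down to the minority's size
under = pd.concat([
    resample(majority, n_samples=len(minority), replace=False, random_state=0),
    minority,
])

# Oversample: bootstrap the minority class up to the majority's size
over = pd.concat([
    majority,
    resample(minority, n_samples=len(majority), replace=True, random_state=0),
])

print(under.y.value_counts(), over.y.value_counts(), sep="\n")
```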