Predictive modeling is a field of study that applies advanced mathematical methods to find patterns in historical data, which can then be used to make predictions about future outcomes. It is a powerful technique that helps scientists and engineers understand complex problems using technology and a bit of ingenuity.
Preprocessing helps ensure that the data you feed the model is balanced and that the features it contains are the ones most likely to expose useful patterns. Think of it as removing the noise from your data so the model has a better chance of succeeding.
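As a rough illustration of what "removing the noise" can look like in practice, the sketch below applies two common preprocessing steps: dropping features that never vary, then centering and scaling what remains. The choice of these two steps, the function names `drop_constant_columns` and `standardize`, and the sample data are all assumptions made for this example, not steps prescribed by the text.

```python
from statistics import mean, pstdev

def drop_constant_columns(rows):
    """Remove columns whose values never vary; they carry no signal."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pstdev(col) > 0]
    return [[row[i] for i in keep] for row in rows], keep

def standardize(rows):
    """Center each column at 0 and scale it to unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical data: three features on very different scales,
# the last of which is constant.
raw = [
    [1.0, 200.0, 5.0],
    [2.0, 400.0, 5.0],
    [3.0, 600.0, 5.0],
]

filtered, kept = drop_constant_columns(raw)  # constant column is removed
scaled = standardize(filtered)               # remaining columns share one scale
```

Dropping constant columns first matters here: a column with zero standard deviation would otherwise cause a division by zero in `standardize`.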
Mathematicians have worked for centuries on ways to describe and resolve patterns that lead to something useful (buildings, molecules, etc.). The models described here are among the most popular in use today:
- Linear Regression Models
- Non-Linear Regression Models
- Regression Tree Models
- Rule Based Models
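The two main modeling approaches within machine learning are supervised and unsupervised techniques. Supervised models learn from examples with known outcomes, while unsupervised models look for structure in data that has no labeled outcome. Each has its own strengths and weaknesses.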
Modeling KPIs include:
- Regression Models
  - Root Mean Square Error (RMSE)
  - R^2
- Trees: ROC curve
- PCA: Scree Plot
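To make the two regression metrics in the list concrete, here is a minimal sketch that computes RMSE and R^2 by hand. It uses the common definition of R^2 as one minus the ratio of residual to total sum of squares; some texts instead define it as the squared correlation between observed and predicted values, and the two can differ. The sample values are made up for illustration.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error: the typical size of a prediction error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Share of the variance in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted values.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

print(rmse(actual, predicted))       # about 0.707
print(r_squared(actual, predicted))  # about 0.9
```

RMSE is expressed in the same units as the outcome, which makes it easy to interpret, while R^2 is unitless and easier to compare across data sets.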
Many of the models also report which features were most influential in their predictions. This information is useful for understanding the model and for verifying and validating it.
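The text does not say how each model measures influence, and the method varies by model. As one model-agnostic illustration, the sketch below uses permutation importance: shuffle one feature at a time and see how much the error grows. The technique is swapped in here for demonstration, and `toy_model`, the data, and all parameter names are hypothetical.

```python
import random
from math import sqrt

def rmse(actual, predicted):
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Average increase in RMSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other features intact.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            error = rmse(targets, [predict(r) for r in shuffled])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical fitted model that relies on feature 0 and ignores feature 1."""
    return 3.0 * row[0]

rows = [[float(i), float((i * 7) % 5)] for i in range(8)]
targets = [3.0 * r[0] for r in rows]

scores = permutation_importance(toy_model, rows, targets)
# Feature 0 gets a positive score; feature 1, which the model ignores, scores 0.
```

A feature whose shuffling barely changes the error is one the model does not depend on, which is a useful sanity check when validating that the model has learned from the features you expect.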