Updated: Nov 11, 2019
Recommended Online Resources to Learn About Bagging and Boosting Algorithms
Making Data Science Accessible: Tree methods, post on LinkedIn
Random Forests website by Leo Breiman and Adele Cutler
Introduction to Statistical Learning, book by James, Witten, Hastie and Tibshirani
Elements of Statistical Learning, book by Hastie, Tibshirani and Friedman
Recommended Scientific Articles to Learn About Bagging and Boosting
Bagging Predictors, Breiman, Machine Learning, 1996
Random Forests, Breiman, Machine Learning, 2001
The Strength of Weak Learnability, Robert Schapire, Machine Learning, 1990.
Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods, Schapire and others, Annals of Statistics, 1998.
Stochastic Gradient Boosting, Friedman, 1999.
Greedy Function Approximation: A Gradient Boosting Machine, Friedman, Annals of Statistics, 2001.
Boosting Algorithms: Regularization, Prediction and Model Fitting, Bühlmann and Hothorn, Statistical Science, 2007
Gradient Boosting Machines, a Tutorial, Natekin and Knoll, Frontiers in Neurorobotics, 2013
Overview of the Recommended Resources
For those taking their first steps into bagging algorithms and boosting trees, the LinkedIn post Making Data Science Accessible: Tree methods gives a good intuition on how these models work, without the stress of mathematical formulations. Following that, the Random Forests website by Leo Breiman and Adele Cutler provides a clear explanation of Random Forests, plus a variety of resources for further study. You can also learn about bagging and boosting algorithms in Trevor Hastie’s Gradient Boosting Machine talk on YouTube. The book Introduction to Statistical Learning also provides a good introduction to bagging and boosting algorithms, without relying heavily on maths, and provides examples in R.
To get a better understanding of the mathematical reasoning underlying these algorithms, the book Elements of Statistical Learning provides a good starting point, covering bagging and boosting extensively and offering explanatory examples.
To dig even deeper into these algorithms, turn to the scientific articles by the creators of the models, and to the more recent reviews that summarise the different algorithms and highlight the advantages and limitations of each model.
In the scientific articles, the authors describe the algorithms, e.g., Bagging, Boosting or Random Forests, and discuss in detail their advantages over existing models. They provide a variety of examples on real or simulated data where these methods have been successfully applied. Scientific articles are usually rich in the mathematical formulations that explain how a model works, so they are well suited to those seeking to understand the mathematical pillars of the algorithms and how they were developed.
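To make the contrast between the two families concrete, here is a minimal sketch, assuming scikit-learn is installed; the simulated dataset and parameter values are illustrative, not taken from any of the articles above. It trains a bagging-style ensemble (Random Forest) and a boosting ensemble (Gradient Boosting) on the same data:

```python
# Illustrative comparison of bagging vs boosting (assumes scikit-learn).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Simulated binary classification data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: deep trees grown independently on bootstrap samples,
# with predictions averaged to reduce variance.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Boosting: shallow trees fitted sequentially, each one correcting
# the errors of the ensemble built so far.
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0)
gbm.fit(X_train, y_train)

for name, model in [("Random Forest", rf), ("Gradient Boosting", gbm)]:
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.3f}")
```

On a toy problem like this both ensembles perform similarly; the resources above explain when each family tends to win in practice.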
Disclaimer: Opinions are my own and I do not receive financial compensation from any of the links included in this article.