Jan 6th, 2014 by Diego Evin
Hi there! We’re pleased to announce the release of our first dossier on machine learning.
Since this is a very complex subject, the dossier will necessarily be a bit limited, but we will continue to develop it and thus your suggestions are welcome.
Please note that most of the papers require some background knowledge of probability and statistics, linear algebra, and optimization.
We suggest that you bookmark each dossier so you can come back to it whenever you need essential documents, papers, and articles about the technologies that will shape the future of the digital world.
Enjoy, and give your knowledge a boost.
General & Introductory
The Discipline of Machine Learning 

Tom Mitchell  
Why is this here? This report defines the problem of machine learning as “how to get computers to program themselves.” It also puts the field of machine learning into the domain of computer science. Although it is a very introductory document, it sheds light on some current and future problems as well as research topics in the field. 
A Few Useful Things to Know about Machine Learning 

Pedro Domingos  
Why is this here? As its title promises, this short article contains a valuable introduction to some of the fundamental concepts and terms of machine learning, such as what learning is, what generalization implies, what the bias-variance tradeoff is, what the curse of dimensionality is, why it is important to select features, and other key topics in the area. 
Machine Learning Paradigms
Statistical Pattern Recognition: A Review 

Anil Jain, Robert Duin, and Jianchang Mao  
Why is this here? This paper provides an introduction to the statistical paradigm and helps clarify the taxonomy of statistical algorithms: their approaches, representations, recognition functions, and acceptance criteria. It is a general summary which presents and compares several methods and application domains for statistical pattern recognition. 
Instance-based Learning Algorithms 

David Aha, Dennis Kibler, and Marc Albert  
Why is this here? This paper reviews the paradigm of classifying unknown instances based on their similarity to specific training examples, including very popular algorithms and approaches such as nearest neighbor and case-based reasoning. It describes the caveats of this family of algorithms, such as noise tolerance and storage requirements, along with possible workarounds. 
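To make the idea of classifying by similarity concrete, here is a minimal sketch of a k-nearest-neighbor classifier. It is not taken from the paper: the function name `knn_predict`, the Euclidean distance, and the toy data are assumptions made for illustration. Note that the classifier keeps every training example around, which is exactly the storage requirement mentioned above.

```python
import math
from collections import Counter


def knn_predict(train, query, k=1):
    """Label `query` by majority vote among its k nearest training examples.

    `train` is a list of (feature_vector, label) pairs. The Euclidean
    distance is an illustrative choice, not the only possible similarity.
    """
    # Sort all stored examples by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters labeled "a" and "b".
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]

print(knn_predict(train, (0.2, 0.1), k=3))
```

With a larger `k`, a single mislabeled training example is less likely to decide the outcome, which is one simple way to improve noise tolerance.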
Kernel Methods in Machine Learning 

Thomas Hofmann, Bernhard Schölkopf, and Alexander Smola  
Why is this here? This paper reviews the concepts and algorithms related to machine learning which make use of positive definite kernels. Even though it is a bit dense and requires a solid background in algebra, this work presents several kernel methods in order to provide a general overview of this area.  
Algorithms
Random forests 

Leo Breiman  
Why is this here? This paper describes a very useful and widely applied algorithm that combines many tree predictors. The author studies the generalization capacity of these forests and compares their performance with that of another algorithm built on an ensemble of classifiers: AdaBoost.  

Efficient BackProp 

Yann LeCun, Léon Bottou, Genevieve Orr, and Klaus-Robert Müller  
Why is this here? This is one of the few technical publications with tips and tricks for training neural networks using backpropagation, including advice on how to make proper use of training data and how to select the optimal learning rate and momentum, while explaining why they work. It also suggests alternatives to avoid the undesirable behaviors of backpropagation.  

A Training Algorithm for Optimal Margin Classifiers 

Bernhard Boser, Isabelle Guyon, and Vladimir Vapnik  
Why is this here? This is the paper which introduced support vector machines (SVMs). This algorithm falls under the general category of kernel methods, and is the state of the art in several classification problems due to its high accuracy, ability to deal with high-dimensional data, and flexibility in modeling heterogeneous sources of data. 
Advanced Topics
The boosting approach to machine learning: An overview 

Robert Schapire  
Why is this here? This article describes one of the most commonly used methods for building classifier ensembles: boosting. Ensemble classifiers usually perform better than single classifiers, and this method constitutes the state of the art for several classification problems. The paper focuses on a particular algorithm, AdaBoost, which builds classifiers in sequence so that each new classifier concentrates on the instances misclassified by the previous ones. Furthermore, the paper explains extensions for multiclass classification and alternatives for incorporating human knowledge into boosting.  
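The reweighting idea behind AdaBoost can be sketched in a few lines. The code below is an illustrative toy, not the paper's pseudocode: the one-dimensional threshold "stumps" used as weak classifiers, the function names, and the data are all assumptions. What it does follow is the standard AdaBoost recipe: pick the weak classifier with the lowest weighted error, give it a vote of 0.5 * ln((1 - err) / err), then increase the weights of the examples it got wrong.

```python
import math


def stump_predict(threshold, polarity, x):
    """Weak classifier: predicts `polarity` when x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in xs:
        for polarity in (1, -1):
            err = sum(
                w
                for x, y, w in zip(xs, ys, weights)
                if stump_predict(threshold, polarity, x) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Train an ensemble of weighted stumps on labels in {-1, +1}."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this classifier
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [
            w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data that no single threshold can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]

ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this toy set a single stump always misclassifies at least two points, while three boosted stumps together label every training point correctly.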

A Unifying Review of Linear Gaussian Models 

Sam Roweis and Zoubin Ghahramani  
Why is this here? This article shows what factor analysis, principal component analysis, clustering through mixtures of Gaussians, vector quantization, Kalman filter models, and hidden Markov models all have in common. The paper links these algorithms and explains how they can be unified as variations of unsupervised learning under a single basic generative model. Overall, this paper provides a deeper understanding of the field of machine learning. 
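As a rough sketch of what such a single generative model looks like, a linear Gaussian model can be written in state-space form. The symbols below follow common state-space notation and are an assumption here; the paper itself should be consulted for the exact formulation.

```latex
\begin{aligned}
x_{t+1} &= A x_t + w_t, & w_t &\sim \mathcal{N}(0, Q) \\
y_t &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, R)
\end{aligned}
```

Here x_t is a hidden state and y_t is the observation. Roughly speaking, dropping the dynamics (so that each x is drawn independently from a Gaussian) gives static models such as factor analysis, where R is diagonal, and principal component analysis, which arises in the limit of vanishing isotropic observation noise. Keeping the dynamics gives the Kalman filter model, and making the hidden state discrete leads to mixtures of Gaussians and hidden Markov models.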