Hi there! We’re pleased to announce the release of our first dossier on machine learning.
Since this is a very complex subject, the dossier is necessarily somewhat limited, but we will continue to develop it, so your suggestions are welcome.
Note that most of the papers require some background in probability and statistics, linear algebra, and optimization.
We suggest bookmarking each dossier so you can return to it whenever you need essential documents, papers, and articles about the technologies that will shape the future of the digital world.
Enjoy, and give your knowledge a boost.
General & Introductory
|Why is this here? This report defines the problem of machine learning as “how to get computers to program themselves.” It also puts the field of machine learning into the domain of computer science. Although it is a very introductory document, it sheds light on some current and future problems as well as research topics in the field.|
|Why is this here? As its title promises, this short article contains a valuable introduction to some of the fundamental concepts and terms of machine learning, such as what learning is, what generalization implies, what the bias-variance trade-off is, what the curse of dimensionality is, why it is important to select features, and other key topics in the area.|
Machine Learning Paradigms
|Anil Jain, Robert Duin, and Jianchang Mao|
|Why is this here? Because this paper provides an introduction to the statistical paradigm and can be useful to clarify the taxonomy of statistical algorithms: its approaches, representations, recognition functions and acceptance criterion. This is a general summary which presents and compares several methods and domains of applications for statistical pattern recognition.|
|David Aha, Dennis Kibler, and Marc Albert|
|Why is this here? This paper reviews the paradigm of classifying unknown instances based on their similarity to specific training examples, which includes very popular approaches such as nearest neighbor and case-based reasoning. It also describes the caveats and possible workarounds for this family of algorithms, such as noise tolerance and storage requirements.|
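To give a flavor of instance-based learning, here is a minimal 1-nearest-neighbor classifier in plain Python. It is an illustrative sketch of the general idea, not the specific algorithms analyzed in the paper:

```python
import math

def nearest_neighbor(train, query):
    """Classify `query` by the label of the closest training example.

    `train` is a list of (features, label) pairs, where features are
    tuples of numbers. This is the simplest instance-based learner:
    no model is built; the training data itself is the model.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, label = min(train, key=lambda pair: dist(pair[0], query))
    return label

train = [((0.0, 0.0), "a"), ((5.0, 5.0), "b")]
label = nearest_neighbor(train, (1.0, 0.5))  # closest to (0, 0)
```

Note how the storage requirement the paper discusses is visible even here: the whole training set must be kept around at prediction time.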
|Thomas Hofmann, Bernhard Schölkopf, and Alexander Smola|
|Why is this here? This paper reviews the concepts and algorithms related to machine learning which make use of positive definite kernels. Even though it is a bit dense and requires a solid background in algebra, this work presents several kernel methods in order to provide a general overview of this area.|
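To make the kernel idea concrete, here is a small toy example of our own (not taken from the paper) showing that a degree-2 polynomial kernel computes the same inner product as an explicit feature map, without ever constructing the higher-dimensional space:

```python
import math

def poly_kernel(x, y, degree=2):
    # k(x, y) = (x . y)^degree: an inner product in a higher-dimensional
    # feature space, computed directly in the input space.
    return sum(a * b for a, b in zip(x, y)) ** degree

def phi(x):
    # The explicit degree-2 feature map for 2-d input that this kernel
    # implicitly uses: (x1^2, x2^2, sqrt(2) * x1 * x2).
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1])

x, y = (1.0, 2.0), (3.0, 4.0)
implicit = poly_kernel(x, y)
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))
# implicit and explicit agree up to rounding
```

This identity is the "kernel trick": algorithms that only touch the data through inner products can work in the feature space at the cost of a kernel evaluation.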
|Why is this here? This paper describes a very useful and widely applied algorithm consisting of a combination of tree predictors. The author studies the generalization capacity of these forests and compares their performance with that of AdaBoost, another algorithm built from an ensemble of classifiers.|
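To illustrate the combination-of-trees idea, here is a toy sketch of our own. It replaces full decision trees with one-split "stumps" and omits the random feature selection of real random forests, keeping only the bagging and majority-vote ingredients:

```python
import random
from collections import Counter

def majority(labels):
    # Most common label in a non-empty list.
    return Counter(labels).most_common(1)[0][0]

def train_stump(sample):
    # A one-split "tree": threshold a single numeric feature at the
    # sample mean and predict the majority label on each side.
    t = sum(x for x, _ in sample) / len(sample)
    left = [y for x, y in sample if x < t] or [y for _, y in sample]
    right = [y for x, y in sample if x >= t] or [y for _, y in sample]
    lp, rp = majority(left), majority(right)
    return lambda x: rp if x >= t else lp

def random_forest(data, n_trees=25, seed=0):
    # Bagging: each "tree" sees a bootstrap resample of the data, and
    # the forest predicts by majority vote over the ensemble.
    rng = random.Random(seed)
    trees = [train_stump([rng.choice(data) for _ in data])
             for _ in range(n_trees)]
    return lambda x: majority([tree(x) for tree in trees])

data = [(0.0, "a"), (1.0, "a"), (4.0, "b"), (5.0, "b")]
predict = random_forest(data)
```

The vote averages away the quirks of any single resample, which is the intuition behind the generalization results the paper proves.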
|Yann LeCun, Leon Bottou, Genevieve Orr and Klaus-Robert Müller|
|Why is this here? This is one of the few technical publications with tips and tricks for training neural networks using backpropagation, including advice on how to make proper use of training data and how to select the optimal learning rate and momentum, while explaining why they work. It also suggests alternatives to avoid the undesirable behaviors of backpropagation.|
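As a small illustration of the momentum trick discussed in the paper, the following sketch (a toy 1-d example of our own, not the authors' code) runs gradient descent with a momentum term on a quadratic loss:

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.9, steps=500):
    # Gradient descent with momentum: the velocity `v` accumulates past
    # gradients, damping oscillations and speeding progress along
    # shallow directions. `grad` returns dLoss/dw at w.
    w, v = w0, 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(w)
        w = w + v
    return w

# Minimize the toy loss (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = sgd_momentum(lambda w: 2 * (w - 3.0), w0=0.0)
```

The learning rate and momentum values here are arbitrary; choosing them well, and understanding why particular choices work, is exactly what the paper is about.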
|Bernhard Boser, Isabelle Guyon, and Vladimir Vapnik|
|Why is this here? This is the paper which introduced support vector machines (SVMs). This algorithm falls under the general category of kernel methods, and is the state of the art in several classification problems due to its high accuracy, ability to deal with high-dimensional data, and flexibility in modeling heterogeneous sources of data.|
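For a taste of the objective behind SVMs, here is a toy sketch that trains a linear classifier by subgradient descent on the hinge loss with an L2 penalty. Note that this is a simplified primal method of our own, not the dual quadratic-programming formulation introduced in the paper:

```python
def train_linear_svm(data, lam=0.01, lr=0.1, epochs=500):
    # Subgradient descent on the primal hinge-loss objective:
    #   lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    # `data` is a list of (features, label) pairs, labels in {-1, +1}.
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            active = margin < 1  # hinge term contributes inside the margin
            for i in range(dim):
                w[i] -= lr * (lam * w[i] - (y * x[i] if active else 0.0))
            if active:
                b += lr * y
    return w, b

def svm_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

data = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((2.0, 2.0), 1), ((2.0, 3.0), 1)]
w, b = train_linear_svm(data)
```

The L2 penalty pushes toward a large-margin separator, which is the core geometric idea of the paper; kernels (see the entry above) extend the same machinery to nonlinear boundaries.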
|Why is this here? This article describes one of the most commonly used methods for building classifier ensembles: boosting. Ensemble classifiers usually perform better than single classifiers, and this method constitutes the state of the art for several classification problems. The paper focuses on a particular algorithm, AdaBoost, which builds a cascade of classifiers in which each new classifier is tweaked to focus on the instances misclassified by its predecessors. Furthermore, the paper explains extensions for multiclass classification and alternatives for incorporating human knowledge into boosting.|
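To see boosting in action, here is a compact AdaBoost sketch of our own on 1-d data with threshold stumps as the weak learners; the data set and round count are arbitrary illustrations:

```python
import math

def adaboost(data, n_rounds=3):
    """AdaBoost with threshold stumps on 1-d data.

    `data` is a list of (x, y) pairs with y in {-1, +1}. Each round fits
    the stump minimizing weighted error, then re-weights the examples so
    the next stump concentrates on the points just misclassified.
    """
    n = len(data)
    w = [1.0 / n] * n
    ensemble = []  # (alpha, threshold, polarity) per round
    thresholds = sorted({x for x, _ in data})
    for _ in range(n_rounds):
        best = None
        for t in thresholds:
            for pol in (1, -1):
                err = sum(wi for wi, (x, y) in zip(w, data)
                          if (pol if x >= t else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard the log
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # up-weight mistakes, down-weight correct points, renormalize
        w = [wi * math.exp(-alpha * y * (pol if x >= t else -pol))
             for wi, (x, y) in zip(w, data)]
        s = sum(w)
        w = [wi / s for wi in w]
    def predict(x):
        score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
        return 1 if score >= 0 else -1
    return predict

# Labels +, +, -, -, +: no single stump can fit this pattern,
# but the weighted vote of a few stumps can.
data = [(0.0, 1), (1.0, 1), (2.0, -1), (3.0, -1), (4.0, 1)]
predict = adaboost(data)
```

The re-weighting line is the "cascade" idea from the blurb above in miniature: each round's mistakes get exponentially more weight in the next round.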
|Sam Roweis and Zoubin Ghahramani|
|Why is this here? This article shows what algorithms like factor analysis, principal component analysis, clustering through mixtures of Gaussians, vector quantization, Kalman filter models, and hidden Markov models all have in common. The paper links these algorithms and explains how they can be unified as variations of unsupervised learning under a single basic generative model. Overall, this paper provides a deeper understanding of the field of machine learning.|
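As one concrete instance of that unification, vector quantization (k-means) can be read as EM for a mixture of Gaussians in the limit of hard assignments. Here is a toy 1-d sketch of our own making that reading explicit:

```python
def kmeans_1d(points, centers, iters=20):
    # k-means / vector quantization as hard-assignment EM for a mixture
    # of Gaussians: assign each point to its nearest center (E-step
    # analogue), then move each center to the mean of its assigned
    # points (M-step analogue).
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers = kmeans_1d(points, centers=[0.0, 6.0])
```

Replacing the hard nearest-center assignment with posterior probabilities recovers full EM for the Gaussian mixture, which is exactly the kind of connection the paper systematizes.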
Previous: Dossier #1: Text Mining and related topics.