Machine learning

Disciplinary field: Methods (Mandatory)
Level: M2
Credits: 3 ECTS

Teacher: Nicolas Desassis (Mines)
Teaching type: Lectures and tutorials (Cours/TD)
Hourly volume: 22h – 30h max

Evaluation: Based on both a numerical project and a written exam (documents allowed).

Prerequisites: Basic knowledge of probability and statistics. Some familiarity with numerical optimization algorithms (such as gradient descent) is useful but by no means required.

We are currently accumulating large amounts of data of all types (numerical, images, text, etc.). Exploiting these data requires automating their mining and analysis. Many algorithms (including neural networks, SVMs, and random forests) make this possible at a scale beyond the reach of classical statistics. This course presents a panorama of machine learning techniques, their theoretical and methodological framework, and various applications, including to the geosciences.

Program – The course consists of 15 two-hour sessions, most of which will be split into 1 hour of lecture and 1 hour of hands-on practice in Python. Practicals will be hosted on Google Colab, so students do not need to install anything locally.

The course will discuss the following topics:

  • Overview of machine learning problems; generalization and overfitting; model selection
  • Model evaluation and selection
  • Bayesian methods
  • Empirical risk minimization; linear and logistic regression
  • Regularization
  • SVM and kernel methods
  • Feature selection
  • Unsupervised dimensionality reduction
  • Trees and ensemble methods
  • Clustering
  • Deep learning and images
  • Applications to geosciences

The course briefly introduces various paradigms and their applications, and focuses on practical aspects, in particular through hands-on sessions.