Recent Publications

Full list of publications

Contact

Email: joseph"dot"salmon "dot"taff@gmail"dot"com

Address: IMAG, c.c. 051
Université de Montpellier
Place Eugène Bataillon
34095 Montpellier Cedex 5
(office 415, building 9)

Joining? Open positions in my lab

I am looking for outstanding and highly motivated people to work (as an intern, Ph.D. student, postdoctoral researcher or research engineer) on machine learning, and more precisely on:

  • citizen science and crowdsourcing
  • high dimensional / robust statistics, variable selection, sparsity
  • optimization for machine learning (including federated learning, privacy, etc.)

The application process is light:

  1. Email a CV, a transcript of recent grades, and a short paragraph explaining why you are interested in joining.
  2. If I am interested, I will ask for two reference letters (only one for interns or Ph.D. students) to be sent directly to me.
  3. At this stage, an interview (possibly online) will be arranged to double-check your skills and profile compatibility.

News

  • Sept. 2023: ANR VITE (PI: B. Thirion, theme: variable importance / explainability) accepted.
  • March 2022: visitor at the Simons Institute for the Theory of Computing
  • July 2021: IUF Nomination (junior member): https://www.iufrance.fr/detail-de-lactualite/247.html
  • Dec. 2019: The ANR AI chair proposal CaMeLOt (CooperAtive MachinE Learning and OpTimization) has been selected.

Seminar and conference organization

Software and datasets

  • OrganizationFiles: This repository provides some tools, advice and guidelines for researchers working in applied mathematics, statistics or machine learning.
  • BenchOpt: a package to make transparent and reproducible comparisons between optimization algorithms
  • PlantNet-300K: a subset of the Pl@ntNet database, with about 300k labeled images (plant species) and 1k classes. The dataset is available on Zenodo.
  • Celer: a fast Lasso solver (associated with the ICML 2018 paper "Dual Extrapolation for Faster Lasso Solvers"), pdf, slides
  • sparse-ho: a package for fast hyperparameter selection, to choose the best Lasso parameter efficiently (associated with the ICML 2020 paper "Implicit differentiation of Lasso-type models for hyperparameter optimization", pdf)
  • Matlab toolboxes for statistics and image processing (legacy: I no longer use Matlab).

More on my Github Page

Team

Ph.D. Students

Engineers

Alumni

Short Bio

I am a statistician and an applied mathematician, with a strong interest in machine learning, optimization and data science. In terms of applications, I focus on citizen science, crowdsourcing and high dimensional statistics.

I have been a full professor at Université de Montpellier since 2018, and I am a junior member of the Institut Universitaire de France (IUF). For the spring and summer quarters of 2018, I was a visiting assistant professor in the Statistics department at UW. From 2012 to 2018, I was an assistant professor at Telecom ParisTech and an associate member of the INRIA Parietal Team. Back in 2011-2012, I was a postdoctoral associate at Duke University, working with Rebecca Willett.

In 2010, I finished my Ph.D. in statistics and image processing under the supervision of Dominique Picard and Erwan Le Pennec at the Laboratoire de Probabilités et de Modélisation Aléatoire (now LPSM), at Université Paris Diderot.


Teaching: list of courses

HAX603X - Modélisation Stochastique (2023-2024)

This is an undergraduate course on stochastic modeling and Monte Carlo methods, with exercises in Python.
Details can be found here: HAX603X - Stochastic Modeling.
Course language: 🇫🇷

HAX606X - Convex optimization (2020-2023)

This is an undergraduate course on convex optimization with exercises in Python.
Details can be found here: HAX606X - Convex optimization.
Course language: 🇫🇷

HAX712X - Software development for data science (2020-2023)

This is a master-level course on software development for data science, using Python.
Details can be found here: HAX712X.
Course language: 🇬🇧

HMMA308 - Statistical Machine Learning (2018-2021)

This course is mostly about supervised techniques in Machine Learning.
Details can be found here: HMMA308 - Statistical Machine Learning.
Course language: 🇫🇷

HLMA408 - Data science for ecology (2018-2021)

This is an undergraduate course introducing statistics and data visualisation.
Details can be found here: HLMA408 - Data science for ecology.
Course language: 🇫🇷

HMMA237 - Advanced time series (2018-2019)

This is an undergraduate course introducing advanced time series analysis.
Details can be found here: HMMA237 - Advanced time series.
Course language: 🇫🇷 and 🇬🇧

HMMA238 - Scientific Software Development (2018-2019)

This is a master-level course introducing scientific computing and modern software practices.
Details can be found here: HMMA238 - Scientific Software Development.
Course language: 🇬🇧

M2MO - Statistical Learning (2013-2018)

This contains Master 2 exercises on statistical learning.
Details can be found here: M2MO - Machine Learning.
Course language: 🇫🇷

SD204 - Linear Models (2016-2018)

This is an undergraduate course on linear models.
Details can be found here: SD204 - Linear Models.
Course language: 🇬🇧

STAT593 - Robust statistics (2018-2019)

This is a graduate course on robust statistics and optimization.
Details can be found here: STAT593 - Robust statistics.
Course language: 🇬🇧

HLMA310 - Scientific Python (2018-2020)

This is an undergraduate course introducing Python for scientific computing.
Details can be found here: HLMA310 - Scientific Python.
Course language: 🇫🇷

MDI720 - Linear Models (2013-2018)

This is an undergraduate course introducing linear models.
Details can be found here: MDI720 - Linear Models.
Course language: 🇫🇷

HMMA307 - Advanced Linear Models (2019-2021)

This is an undergraduate course introducing advanced linear models (ANOVA, Mixed-effects models, etc.).
Details can be found here: HMMA307 - Advanced Linear Models.
Course language: 🇫🇷 and 🇬🇧

CR12 - Machine Learning (2013-2015)

This is a Master 2 course on machine learning (with Z. Harchaoui, J. Mairal and L. Jacob).
Details can be found here:
CR12 - Machine Learning (2014-2015)
CR12 - Machine Learning (2013-2014).
Course language: 🇬🇧

SD3 - Descriptive Statistics (2010-2011)

This is an undergraduate course on descriptive statistics.
Details can be found here: SD3 - Descriptive Statistics.
Course language: 🇫🇷

M53010 - Econometrics (2009-2010)

This course is mostly about linear models in econometrics.
Details can be found here: M53010 - Econometrics.
Course language: 🇫🇷




Talks

Full list of talks, with slides and possibly videos when recorded.

Miscellaneous

Under Miscellaneous, you will find some (math)art and other distractions.