Citizen science & machine learning for plant identification: challenges from Pl@ntNet


Joseph Salmon

IMAG, Univ Montpellier, CNRS, Inria, Montpellier, France

Pl@ntNet Consortium

Pl@ntNet insights

Pl@ntNet: ML for citizen science


A citizen science platform using machine learning to help people identify plants with their mobile phones





Pl@ntNet: usage



The current Pl@ntNet Team

Alexis Joly
Principal investigator, INRIA
ResearchGate

Pierre Bonnet
Principal investigator, CIRAD
ResearchGate

Hervé Goëau
Researcher, CIRAD
ResearchGate

Antoine Affouard
Backend & Staff engineer, INRIA
LinkedIn

Jean-Christophe Lombardo
AI engineer, INRIA
LinkedIn

Mathias Chouet
Backend engineer, CIRAD
GitHub

Hugo Gresse
Mobile engineer, INRIA
GitHub | LinkedIn

Thomas Paillot
Frontend engineer, INRIA
LinkedIn

Rémi Palard
Geo & Fullstack engineer, CIRAD
LinkedIn

Vanessa Hequet
Botanist, IRD
LinkedIn

Murielle Simo-Droissart
Botanist, IRD
ResearchGate

Théo Simoes
Backend engineer, INRAE
LinkedIn

Jean-Marc Sadaillan
Project manager, INRAE
LinkedIn

Christophe Botella
Researcher, INRIA
ResearchGate

Joseph Salmon
Researcher, INRIA
Website

Benjamin Bourel
Researcher, INRIA
Website

Théo Larcher
PhD candidate, INRIA
LinkedIn

Giulio Martellucci
PhD candidate, INRIA
LinkedIn

Raphaël Benerradi
PhD candidate, INRIA
LinkedIn

Ilyass Moummad
Post-doc, INRIA
Personal website | Google Scholar

Pl@ntNet & Cooperative Learning

Chronology of Pl@ntNet


Note: I am mostly innocent; I only started working with the Pl@ntNet team in 2020

Scientific challenges


  • Collaborative effort, involving:
    • machine learners
    • ecologists
    • engineers
    • amateurs
  • Open problems:
    • theoretical
    • methodological
    • computational (due to the size of the problems)
    • interfacing many disciplines

We need you: come and help us improve it!

Contributions

Some personal contributions



  • Pl@ntNet-300K (Garcin et al., 2021): Creation and release of a large-scale dataset sharing the same property (Long Tail!) as Pl@ntNet; available for the community to improve learning systems


  • Prediction uncertainty quantification with long-tailed data (Ding et al., 2026): providing prediction sets with statistical guarantees via Conformal Prediction


  • Learning & crowd-sourced data (Lefort et al., 2024) and (Lefort et al., 2025): How to leverage multiple labels per image to improve the model? Need to assess the quality of the workers, the images/labels, the model, etc.


Pl@ntNet data characteristics

Plants trivia



Biomass on earth (Bar-On et al., 2018)
Diversity of plants: 400K+ species (illustration @ Rkitko, Wikimedia Commons)
Angiosperm Tree of Life (© RBG Kew)


Consequence for supervised learning: the classes are numerous and hierarchically structured

Intra-class variability

Guizotia abyssinica (L.f.) Cass. © Benoît Janichon
Diascia rigescens E.Mey. ex Benth. © Patrice SIROT
Lapageria rosea Ruiz & Pav. © Borquez Vicent
Casuarina cunninghamiana Miq. LC © Наталья
Guizotia abyssinica (L.f.) Cass. © Annette Bejany
Diascia rigescens E.Mey. ex Benth. © A Lee
Lapageria rosea Ruiz & Pav. © Daniel Barthelemy
Casuarina cunninghamiana Miq. LC © Campos Ignacio
Based on pictures only, plant species are challenging to discriminate!

Inter-class ambiguity

Cirsium rivulare (Jacq.) All. © stefano mazzotti
Chaerophyllum aromaticum L. © buqa Jarmil
Adenostyles leucophylla Rchb. © Walter Reider
Petrosedum montanum (Songeon & E.P.Perrier) Grulich © furs
Cirsium tuberosum (L.) All. © Rene Weck
Chaerophyllum temulum L. © Jcm Arthur
Adenostyles alpina (L.) Bluff & Fingerh. © pierre Lamy
Petrosedum rupestre (L.) P.V.Heath © Wolfi 41
Based on pictures only, plant species are challenging to discriminate!

Sampling bias

Geographic bias

Spatial density of images collected by Pl@ntNet (13/04/2024)


Food bias



Top-5 most observed plant species in Pl@ntNet (13/04/2024):


  • 25,134 obs.: Echium vulgare L. (© Llandrich anna)
  • 24,720 obs.: Ranunculus ficaria L. (© JYCO)
  • 24,103 obs.: Prunus spinosa L. (© fuerst.ernst)
  • 23,288 obs.: Zea mays L. (© Uta Groger)
  • 23,075 obs.: Alliaria petiolata (M.Bieb.) Cavara & Grande (© Pascal Ollagnier)

Beauty bias


  • 10,753 obs.: Centaurea jacea L. (© Dieter Wagner)
  • 6 obs.: Cenchrus agrimonioides Trin. (© David Eickhoff, EOL)

Size bias


  • 8,376 obs.: Magnolia grandiflora L. (© Patrick Cartier)
  • 413 obs.: Moehringia trinervia (L.) Clairv. (© Maximilien Perrin)

Many more biases …



  • Selection bias
    • Convenience sampling: easily vs. hardly accessible
    • Preference for certain species: visibility / ease of identification
    • Subjective bias: selection based on personal judgment, may not be random or representative
    • Rare species: rare or endangered species may be under-represented
  • Temporal bias / seasonal variation: seasonal changes in plant characteristics

The Pl@ntNet-300K dataset

A need for new benchmarks



Limitations of popular datasets:

  • label structure too simplistic (CIFAR-10, CIFAR-100)
  • classes too easy to discriminate (MNIST)
  • too well-balanced, with the same number of images per class (ImageNet)
  • duplicate, low-quality, or irrelevant images (Recht et al., 2019)
Motivation:

Release a dataset sharing similar features with the full Pl@ntNet database, to foster research in plant identification

\(\implies\) Pl@ntNet-300K (Garcin et al., 2021)



“The collective behavior induced by frictionless research exchange is the emergent superpower driving many events that are so striking today.” (Donoho, 2024)

Construction of Pl@ntNet-300K

Sample at the genus level to preserve intra-genus ambiguity, using the hierarchical structure of the taxonomy

Species distribution, long tail & Lorenz curve



  • Earth: 400K+ species
  • Pl@ntNet: 80K+ species
  • Pl@ntNet-300K: 1K+ species

Note: the long tail is preserved by genus-level subsampling


80% of species | 11% of images \(\iff\) 20% of species | 89% of images
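The 80%/11% statistic above can be checked directly from a vector of per-species image counts. A minimal sketch (function name is mine; the Pareto draw is a toy stand-in, not the real Pl@ntNet-300K counts):

```python
import numpy as np

def tail_image_share(class_counts, species_frac=0.8):
    """Fraction of images held by the rarest `species_frac` of species."""
    counts = np.sort(np.asarray(class_counts, dtype=float))  # ascending: rarest first
    k = int(species_frac * len(counts))
    return counts[:k].sum() / counts.sum()

# Toy long-tailed counts drawn from a Pareto law (NOT the real Pl@ntNet-300K data)
rng = np.random.default_rng(0)
toy_counts = np.ceil(rng.pareto(1.0, size=1000) * 10).astype(int) + 1
share = tail_image_share(toy_counts)  # rarest 80% of species hold a small share
```

For Pl@ntNet-300K itself, `class_counts` would be the number of images per species; the statement on this slide corresponds to `tail_image_share(counts) ≈ 0.11`.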

Pl@ntNet-300K: long-tail visualization




Details on Pl@ntNet-300K



Goal: provide a real-life benchmark for supervised learning, with a multi-class, long-tailed label distribution


Characteristics:


  • 306,146 color images
  • Size: 32 GB
  • Labels: 1,000+ species
  • Contributions from 2,000,000 volunteers

Prediction
&
uncertainty quantification

Joint work with


Tiffany Ding

UC Berkeley

within

Jean-Baptiste Fermanian

Inria



Conformal Prediction for Long-Tailed Classification

T. Ding, J.-B. Fermanian and J. Salmon

ICLR 2026

Pl@ntNet: set prediction (recommendation)



Elements to help guide the users

  • provide a set of possible species/labels
  • display similar images from proposed species
  • give a score of confidence

(Split) Conformal Prediction (Vovk et al., 2005)



Goal:

For an input image \(X\), propose the most probable classes \(y\) with confidence level \(1-\alpha\) (with small \(\alpha\))



Main idea: Return classes with predicted score above a threshold:

\[ \mathcal{C}_{\alpha}(X) = \big\{ y : s(X,y) \geq t_\alpha \big\} \]

In classification: a common conformal score is \(s(x,y) = \hat{p}(y|x)\) (e.g., softmax score from a classifier fitted on a train/validation set).



Key assumption: The calibration data \((X_i, Y_i)\) are exchangeable with the test points.

Conformal prediction: set \(t_{\alpha}\) so that a fraction \(1-\alpha\) of the true-class scores on a calibration set lie above it (the lower \(\alpha\)-quantile of the calibration scores)
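A minimal sketch of split conformal calibration with the softmax conformity score \(s(x,y)=\hat{p}(y|x)\). The helper names are mine; the threshold is the \(\lfloor\alpha(n+1)\rfloor\)-th smallest true-class score, a standard finite-sample correction:

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal: pick t_alpha from true-class scores on the calibration set.

    cal_probs: (n, K) predicted probabilities; cal_labels: (n,) true labels.
    """
    n = len(cal_labels)
    scores = cal_probs[np.arange(n), cal_labels]  # s(X_i, Y_i)
    k = int(np.floor(alpha * (n + 1)))            # finite-sample correction
    if k < 1:
        return -np.inf                            # too few points: predict all classes
    return np.sort(scores)[k - 1]                 # k-th smallest score

def prediction_set(probs, t):
    """C_alpha(x) = {y : s(x, y) >= t}."""
    return set(np.flatnonzero(probs >= t).tolist())
```

With this rule, at least a \(1-\alpha\) fraction of calibration scores exceed \(t_\alpha\), which yields the marginal coverage guarantee under exchangeability.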

Optimal sets (Sadinle et al., 2019)


Marginal coverage targets:

\[\mathbb{P}\big[ Y \in \mathcal{C}_{\alpha}(X) \big ] \geq 1 - \alpha.\]

Class conditional coverage targets:

\[\forall y,\quad \mathbb{P}\big[ Y \in \mathcal{C}_\alpha(X) | Y=y \big ] \geq 1 - \alpha.\]

The optimal set of minimum size and marginal coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha \right\} \]

The optimal set of minimum size and conditional coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha^{y} \right\} \]


Marginal:

calibrate \(t_{\alpha}\) on whole calibration set \((X_i, Y_i)_{i=1}^n\)

Conditional:

calibrate \(t_{\alpha}^y\) only on \((X_i, Y_i)\) such that \(Y_i = y\)
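The class-conditional variant calibrates one threshold per class on only that class's points. A sketch (naming is mine) that also shows why tiny classes break it: when \(\lfloor\alpha(n_y+1)\rfloor = 0\), the only valid threshold is \(-\infty\), so rare classes are always included:

```python
import numpy as np

def classwise_thresholds(cal_probs, cal_labels, alpha=0.1):
    """One conformal threshold t_alpha^y per class, calibrated on {i : Y_i = y}."""
    K = cal_probs.shape[1]
    t = np.empty(K)
    for y in range(K):
        scores = np.sort(cal_probs[cal_labels == y, y])
        k = int(np.floor(alpha * (len(scores) + 1)))
        # Rare class: too few points to calibrate, so class y is always predicted
        t[y] = scores[k - 1] if k >= 1 else -np.inf
    return t
```

In a long-tailed setting most classes have only a handful of calibration points, so most thresholds degenerate to \(-\infty\) and the sets blow up.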

Interactive optimal set visualization


Class-conditional coverage: useless for the long tail




Targeting Macro-Coverage


Goal: Average coverage across all classes (better for long-tail!)

\[ \text{MacroCoverage} = \frac{1}{|\mathcal{Y}|} \sum_{y \in \mathcal{Y}} \mathbb{P}\big( y \in \mathcal{C}(X) \, | \, Y = y \big) \]

The optimal set of minimum size and Macro-Coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : \frac{p(y|x)}{p(y)} \geq t_\alpha \right\} \]


Introduce a new conformal score, the Prevalence-Adjusted Softmax (PAS): \(s(x,y) = \frac{\hat{p}(y|x)}{\hat{p}(y)}\)
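The PAS score and the macro-coverage metric fit in a few lines; a sketch assuming `class_prior` is estimated (e.g., empirical class frequencies) and prediction sets are Python sets:

```python
import numpy as np

def pas_scores(probs, class_prior):
    """Prevalence-Adjusted Softmax: s(x, y) = p_hat(y | x) / p_hat(y)."""
    return probs / class_prior  # broadcasts over the (n, K) rows

def macro_coverage(pred_sets, labels, K):
    """Mean over classes y of the empirical P(y in C(X) | Y = y)."""
    per_class = [
        np.mean([y in s for s, lab in zip(pred_sets, labels) if lab == y])
        for y in range(K)
    ]
    return float(np.mean(per_class))
```

A single threshold on PAS scores can then be calibrated exactly as in split conformal prediction, pooling all classes.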

Interactive optimal set visualization (II)


Experiments on Pl@ntNet-300K



Generalization: weighted Macro-Coverage


Given user-chosen class weights \(\omega(y)\) for \(y \in \mathcal{Y}\) that sum to one, we can define the \(\omega\)-weighted macro-coverage as

\[ \begin{align} \mathrm{MacroCov}_{\omega}(\mathcal{C}) = \sum_{y \in \mathcal{Y}} \omega(y) \mathbb{P}(Y \in \mathcal{C}(X) \mid Y = y). \end{align} \]

The optimal set of minimum size and \(\omega\)-weighted Macro-Coverage of at least \(1-\alpha\) is: \[ \begin{align} \mathcal{C}^*(x) = \left\{ y \in \mathcal{Y} : \omega(y) \dfrac{p(y|x)}{p(y)} \geq t\right\}. \end{align} \]


Introduce a new conformal score, the Weighted Prevalence-Adjusted Softmax (WPAS): \(s(x,y) = \omega(y) \frac{\hat{p}(y|x)}{\hat{p}(y)}\)

Experiments on Pl@ntNet-300K: endangered species


Goal: handle at-risk species in Pl@ntNet-300K by weighting each at-risk species \(\gamma \geq 1\) times more:

\[ \omega(y) = \begin{cases} \frac{\gamma}{W} & \text{if } y \in \mathcal{Y}_{\text{at-risk}} \quad (\text{with } W = \gamma|\mathcal{Y}_{\text{at-risk}}| + |\mathcal{Y} \setminus \mathcal{Y}_{\text{at-risk}}|)\\ \frac{1}{W} & \text{otherwise}, \end{cases} \]
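The at-risk weighting above, together with the WPAS score, in a few lines (names are mine; \(\gamma\) and the at-risk set are user choices):

```python
import numpy as np

def at_risk_weights(K, at_risk, gamma=5.0):
    """omega(y) = gamma / W for at-risk classes, 1 / W otherwise,
    with W = gamma * |at-risk| + |rest|, so the weights sum to one."""
    omega = np.ones(K)
    omega[list(at_risk)] = gamma
    return omega / omega.sum()

def wpas_scores(probs, class_prior, omega):
    """Weighted Prevalence-Adjusted Softmax: s(x,y) = omega(y) * p_hat(y|x) / p_hat(y)."""
    return omega * probs / class_prior
```

Setting \(\gamma = 1\) recovers the unweighted PAS score up to a constant factor, which leaves the prediction sets unchanged.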

Crowd-sourced data:
Votes, labels & aggregation

Joint work with


Tanguy Lefort

Now at Seenovate

within

Benjamin Charlier

INRAE



“Cooperative learning of Pl@ntNet’s Artificial Intelligence algorithm:
how does it work and how can we improve it?”

T. Lefort et al.

Methods in Ecology and Evolution, 2025

Pl@ntNet online “votes”


Link: https://identify.plantnet.org/weurope/observations/1012500059

What about labels?



  • Images come from users… and so do the labels!

  • But users can be wrong, or may not be experts

  • Several labels can be available per image!

Users can make corrections


… sometimes can’t be trusted


Link: https://identify.plantnet.org/weurope/observations/1012500059

Pl@ntNet label aggregation (EM algorithm)


Weighting scheme: weight each user's vote by the number of species they have identified
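The weighting idea can be illustrated with a plain weighted plurality vote. The `log(1 + n)` weight below is a hypothetical choice for illustration only; the actual Pl@ntNet scheme (Lefort et al., 2025) and its EM refinement differ in detail:

```python
import math
from collections import defaultdict

def aggregate_votes(votes, n_identified):
    """Weighted plurality vote over species labels.

    votes: dict user -> proposed species label
    n_identified: dict user -> number of species that user has identified
    (the log(1 + n) weight is a hypothetical stand-in for Pl@ntNet's scheme)
    """
    tally = defaultdict(float)
    for user, label in votes.items():
        tally[label] += math.log1p(n_identified[user])
    return max(tally, key=tally.get)
```

With such a weight, one experienced botanist can outvote several novices, which matches the intuition behind the scheme.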

Take home message


  • Challenges in citizen science: many and varied (need more attention)
  • Crowdsourcing / Label uncertainty: helpful for data curation
  • Improved data quality \(\implies\) improved learning performance
  • Prediction: theory can guide the set to display

Dataset release:

Code release:

Future work

  • Handling label hierarchy
  • Human–computer interaction / performative learning / model collapse
  • Improve robustness to adversarial users
  • Leverage gamification for higher-quality labels: theplantgame.com

References

References


Bar-On, Y. M., Phillips, R., & Milo, R. (2018). The biomass distribution on Earth. Proceedings of the National Academy of Sciences, 115(25), 6506–6511.
Ding, T., Fermanian, J.-B., & Salmon, J. (2026). Conformal Prediction for Long-Tailed Classification. ICLR.
Donoho, D. (2024). Data science at the singularity. Harvard Data Science Review, 6(1).
Garcin, C., Joly, A., Bonnet, P., Affouard, A., Lombardo, J.-C., Chouet, M., Servajean, M., Lorieul, T., & Salmon, J. (2021). Pl@ntNet-300K: A plant image dataset with high label ambiguity and a long-tailed distribution. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks.
Lefort, T., Affouard, A., Charlier, B., Lombardo, J.-C., Chouet, M., Goëau, H., Salmon, J., Bonnet, P., & Joly, A. (2025). Cooperative learning of Pl@ntNet's artificial intelligence algorithm: How does it work and how can we improve it? Methods in Ecology and Evolution.
Lefort, T., Charlier, B., Joly, A., & Salmon, J. (2024). Identify ambiguous tasks combining crowdsourced labels by weighting areas under the margin. TMLR.
Recht, B., Roelofs, R., Schmidt, L., & Shankar, V. (2019). Do ImageNet classifiers generalize to ImageNet? ICML, 5389–5400.
Sadinle, M., Lei, J., & Wasserman, L. (2019). Least ambiguous set-valued classifiers with bounded error levels. Journal of the American Statistical Association, 114(525), 223–234.
Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic learning in a random world. Springer.