Citizen science & machine learning for plant identification: challenges from Pl@ntNet


Joseph Salmon

IMAG, Univ Montpellier, CNRS, Inria, Montpellier, France

Pl@ntNet Consortium

Pl@ntNet insights

Pl@ntNet: ML for citizen science


A citizen science platform using machine learning to help people identify plants with their mobile phones





Pl@ntNet: usage



The current Pl@ntNet Team

Alexis Joly
Principal investigator, INRIA
ResearchGate

Pierre Bonnet
Principal investigator, CIRAD
ResearchGate

Hervé Goëau
Researcher, CIRAD
ResearchGate

Antoine Affouard
Backend & Staff engineer, INRIA
LinkedIn

Jean-Christophe Lombardo
AI engineer, INRIA
LinkedIn

Mathias Chouet
Backend engineer, CIRAD
GitHub

Hugo Gresse
Mobile engineer, INRIA
GitHub | LinkedIn

Thomas Paillot
Frontend engineer, INRIA
LinkedIn

Rémi Palard
Geo & Fullstack engineer, CIRAD
LinkedIn

Vanessa Hequet
Botanist, IRD
LinkedIn

Murielle Simo-Droissart
Botanist, IRD
ResearchGate

Théo Simoes
Backend engineer, INRAE
LinkedIn

Jean-Marc Sadaillan
Project manager, INRAE
LinkedIn

Christophe Botella
Researcher, INRIA
ResearchGate

Joseph Salmon
Researcher, INRIA
Website

Benjamin Bourel
Researcher, INRIA
Website

Théo Larcher
PhD candidate, INRIA
LinkedIn

Giulio Martellucci
PhD candidate, INRIA
LinkedIn

Raphaël Benerradi
PhD candidate, INRIA
LinkedIn

Ilyass Moummad
Post-doc, INRIA
Personal website | Google Scholar

Pl@ntNet & Cooperative Learning

Chronology of Pl@ntNet


Note: I am mostly innocent; I only started working with the Pl@ntNet team in 2020

Scientific challenges


  • Collaborative effort, involving:
    • machine learners
    • ecologists
    • engineers
    • amateurs
  • Open problems:
    • theoretical
    • methodological
    • computational (due to the size of the problems)
    • interfacing many disciplines

We need you: come and help us improve it!

Contributions

Some personal contributions



  • Pl@ntNet-300K (Garcin et al., 2021): Creation and release of a large-scale dataset sharing the same property (Long Tail!) as Pl@ntNet; available for the community to improve learning systems


  • Prediction uncertainty quantification with long-tailed data (Ding et al., 2026): providing prediction sets with statistical guarantees via Conformal Prediction


  • Learning & crowd-sourced data (Lefort et al., 2024) and (Lefort et al., 2025): How to leverage multiple labels per image to improve the model? Need to assess the quality of the workers, the images/labels, the model, etc.


Pl@ntNet data characteristics

Plants trivia



Biomass on earth (Bar-On et al., 2018)
Diversity of plants: 400K+ species (illustration @ Rkitko, Wikimedia Commons)
Angiosperm Tree of Life (© RBG Kew)


Consequence for supervised learning: the classes are numerous and hierarchically structured

Intra-class variability

Guizotia abyssinica (L.f.) Cass. © Benoît Janichon
Diascia rigescens E.Mey. ex Benth. © Patrice SIROT
Lapageria rosea Ruiz & Pav. © Borquez Vicent
Casuarina cunninghamiana Miq. LC © Наталья
Guizotia abyssinica (L.f.) Cass. © Annette Bejany
Diascia rigescens E.Mey. ex Benth. © A Lee
Lapageria rosea Ruiz & Pav. © Daniel Barthelemy
Casuarina cunninghamiana Miq. LC © Campos Ignacio
Based on pictures only, plant species are challenging to discriminate!

Inter-class ambiguity

Cirsium rivulare (Jacq.) All. © stefano mazzotti
Chaerophyllum aromaticum L. © buqa Jarmil
Adenostyles leucophylla Rchb. © Walter Reider
Petrosedum montanum (Songeon & E.P.Perrier) Grulich © furs
Cirsium tuberosum (L.) All. © Rene Weck
Chaerophyllum temulum L. © Jcm Arthur
Adenostyles alpina (L.) Bluff & Fingerh. © pierre Lamy
Petrosedum rupestre (L.) P.V.Heath © Wolfi 41
Based on pictures only, plant species are challenging to discriminate!

Sampling bias

Geographic bias

Spatial density of images collected by Pl@ntNet (13/04/2024)


Food bias



Top-5 most observed plant species in Pl@ntNet (13/04/2024):


  • 25,134 obs.: Echium vulgare L. (© Llandrich anna)
  • 24,720 obs.: Ranunculus ficaria L. (© JYCO)
  • 24,103 obs.: Prunus spinosa L. (© fuerst.ernst)
  • 23,288 obs.: Zea mays L. (© Uta Groger)
  • 23,075 obs.: Alliaria petiolata (M.Bieb.) Cavara & Grande (© Pascal Ollagnier)

Beauty bias


  • 10,753 obs.: Centaurea jacea L. (© Dieter Wagner)
  • 6 obs.: Cenchrus agrimonioides Trin. (© David Eickhoff, EOL)

Size bias


  • 8,376 obs.: Magnolia grandiflora L. (© Patrick Cartier)
  • 413 obs.: Moehringia trinervia (L.) Clairv. (© Maximilien Perrin)

Many more biases …



  • Selection bias
    • Convenience sampling: easily vs. hardly accessible
    • Preference for certain species: visibility / ease of identification
    • Subjective bias: selection based on personal judgment, may not be random or representative
    • Rare species: rare or endangered species may be under-represented
  • Temporal bias / seasonal variation: seasonal changes in plant characteristics

The Pl@ntNet-300K dataset

A need for new benchmarks



Limitations of popular datasets:

  • label structure too simplistic (CIFAR-10, CIFAR-100)
  • classes too easy to discriminate (MNIST)
  • too well-balanced, with the same number of images per class (ImageNet)
  • duplicate, low-quality, or irrelevant images (Recht et al., 2019)
Motivation:

Release a dataset sharing similar features with the full Pl@ntNet database, to foster research in plant identification

\(\implies\) Pl@ntNet-300K (Garcin et al., 2021)



“The collective behavior induced by frictionless research exchange is the emergent superpower driving many events that are so striking today.” (Donoho, 2024)

Construction of Pl@ntNet-300K

Sample at the genus level to preserve intra-genus ambiguity, using the hierarchical structure of the taxonomy

Species distribution, long tail & Lorenz curve



  • Earth: 400K+ species
  • Pl@ntNet: 80K+ species
  • Pl@ntNet-300K: 1K+ species

Note: the long tail is preserved by genus-level subsampling


80% of species | 11% of images \(\iff\) 20% of species | 89% of images
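The 80%/11% statistic above can be checked directly from a vector of per-species image counts. A minimal sketch (function name is mine; the Pareto draw is a toy stand-in, not the real Pl@ntNet-300K counts):

```python
import numpy as np

def tail_image_share(class_counts, species_frac=0.8):
    """Fraction of images held by the rarest `species_frac` of species."""
    counts = np.sort(np.asarray(class_counts, dtype=float))  # ascending: rarest first
    k = int(species_frac * len(counts))
    return counts[:k].sum() / counts.sum()

# Toy long-tailed counts drawn from a Pareto law (NOT the real Pl@ntNet-300K data)
rng = np.random.default_rng(0)
toy_counts = np.ceil(rng.pareto(1.0, size=1000) * 10).astype(int) + 1
share = tail_image_share(toy_counts)  # rarest 80% of species hold a small share
```

For Pl@ntNet-300K itself, `class_counts` would be the number of images per species; the statement on this slide corresponds to `tail_image_share(counts) ≈ 0.11`.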

Pl@ntNet-300K: long-tail visualization




Details on Pl@ntNet-300K



Goal: provide a real-life benchmark for supervised learning, with a multi-class, long-tailed label distribution


Characteristics:


  • 306,146 color images
  • Size: 32 GB
  • Labels: 1,000+ species
  • Contributions from 2,000,000 volunteers

Prediction
&
uncertainty quantification

Joint work with


Tiffany Ding

UC Berkeley

within

Jean-Baptiste Fermanian

Inria



Conformal Prediction for Long-Tailed Classification

T. Ding, J.-B. Fermanian and J. Salmon

ICLR 2026

Pl@ntNet: set prediction (recommendation)



Elements to help guide the users

  • provide a set of possible species/labels
  • display similar images from proposed species
  • give a score of confidence

(Split) Conformal Prediction (Vovk et al., 2005)



Goal:

For an input image \(X\), propose the most probable classes \(y\) with confidence level \(1-\alpha\) (with small \(\alpha\))



Main idea: Return classes with predicted score above a threshold:

\[ \mathcal{C}_{\alpha}(X) = \big\{ y : s(X,y) \geq t_\alpha \big\} \]

In classification: a common conformal score is \(s(x,y) = \hat{p}(y|x)\) (e.g., softmax score from a classifier fitted on a train/validation set).



Key assumption: The calibration data \((X_i, Y_i)\) are exchangeable with the test points.

Conformal prediction: set \(t_{\alpha}\) so that a fraction \(1-\alpha\) of the true-class scores on a calibration set lie above it (the lower \(\alpha\)-quantile of the calibration scores)
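A minimal sketch of split conformal calibration with the softmax conformity score \(s(x,y)=\hat{p}(y|x)\). The helper names are mine; the threshold is the \(\lfloor\alpha(n+1)\rfloor\)-th smallest true-class score, a standard finite-sample correction:

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal: pick t_alpha from true-class scores on the calibration set.

    cal_probs: (n, K) predicted probabilities; cal_labels: (n,) true labels.
    """
    n = len(cal_labels)
    scores = cal_probs[np.arange(n), cal_labels]  # s(X_i, Y_i)
    k = int(np.floor(alpha * (n + 1)))            # finite-sample correction
    if k < 1:
        return -np.inf                            # too few points: predict all classes
    return np.sort(scores)[k - 1]                 # k-th smallest score

def prediction_set(probs, t):
    """C_alpha(x) = {y : s(x, y) >= t}."""
    return set(np.flatnonzero(probs >= t).tolist())
```

With this rule, at least a \(1-\alpha\) fraction of calibration scores exceed \(t_\alpha\), which yields the marginal coverage guarantee under exchangeability.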

Optimal sets (Sadinle et al., 2019)


Marginal coverage targets:

\[\mathbb{P}\big[ Y \in \mathcal{C}_{\alpha}(X) \big ] \geq 1 - \alpha.\]

Class conditional coverage targets:

\[\forall y,\quad \mathbb{P}\big[ Y \in \mathcal{C}_\alpha(X) | Y=y \big ] \geq 1 - \alpha.\]

The optimal set of minimum size and marginal coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha \right\} \]

The optimal set of minimum size and conditional coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha^{y} \right\} \]


Marginal:

calibrate \(t_{\alpha}\) on whole calibration set \((X_i, Y_i)_{i=1}^n\)

Conditional:

calibrate \(t_{\alpha}^y\) only on \((X_i, Y_i)\) such that \(Y_i = y\)
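The class-conditional variant calibrates one threshold per class on only that class's points. A sketch (naming is mine) that also shows why tiny classes break it: when \(\lfloor\alpha(n_y+1)\rfloor = 0\), the only valid threshold is \(-\infty\), so rare classes are always included:

```python
import numpy as np

def classwise_thresholds(cal_probs, cal_labels, alpha=0.1):
    """One conformal threshold t_alpha^y per class, calibrated on {i : Y_i = y}."""
    K = cal_probs.shape[1]
    t = np.empty(K)
    for y in range(K):
        scores = np.sort(cal_probs[cal_labels == y, y])
        k = int(np.floor(alpha * (len(scores) + 1)))
        # Rare class: too few points to calibrate, so class y is always predicted
        t[y] = scores[k - 1] if k >= 1 else -np.inf
    return t
```

In a long-tailed setting most classes have only a handful of calibration points, so most thresholds degenerate to \(-\infty\) and the sets blow up.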

Interactive optimal set visualization


Class-conditional coverage: useless for the long tail




Targeting Macro-Coverage


Goal: Average coverage across all classes (better for long-tail!)

\[ \text{MacroCoverage} = \frac{1}{|\mathcal{Y}|} \sum_{y \in \mathcal{Y}} \mathbb{P}\big( y \in \mathcal{C}(X) \, | \, Y = y \big) \]

The optimal set of minimum size and Macro-Coverage of at least \(1-\alpha\) is: \[ \mathcal{C}_{\alpha}(x) = \left\{ y : \frac{p(y|x)}{p(y)} \geq t_\alpha \right\} \]


Introduce a new conformal score, the Prevalence-Adjusted Softmax (PAS): \(s(x,y) = \frac{\hat{p}(y|x)}{\hat{p}(y)}\)
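The PAS score and the macro-coverage metric fit in a few lines; a sketch assuming `class_prior` is estimated (e.g., empirical class frequencies) and prediction sets are Python sets:

```python
import numpy as np

def pas_scores(probs, class_prior):
    """Prevalence-Adjusted Softmax: s(x, y) = p_hat(y | x) / p_hat(y)."""
    return probs / class_prior  # broadcasts over the (n, K) rows

def macro_coverage(pred_sets, labels, K):
    """Mean over classes y of the empirical P(y in C(X) | Y = y)."""
    per_class = [
        np.mean([y in s for s, lab in zip(pred_sets, labels) if lab == y])
        for y in range(K)
    ]
    return float(np.mean(per_class))
```

A single threshold on PAS scores can then be calibrated exactly as in split conformal prediction, pooling all classes.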

Interactive optimal set visualization (II)


Experiments on Pl@ntNet-300K



Generalization: weighted Macro-Coverage


Given user-chosen class weights \(\omega(y)\) for \(y \in \mathcal{Y}\) that sum to one, we can define the \(\omega\)-weighted macro-coverage as

\[ \begin{align} \mathrm{MacroCov}_{\omega}(\mathcal{C}) = \sum_{y \in \mathcal{Y}} \omega(y) \mathbb{P}(Y \in \mathcal{C}(X) \mid Y = y). \end{align} \]

The optimal set of minimum size and \(\omega\)-weighted Macro-Coverage of at least \(1-\alpha\) is: \[ \begin{align} \mathcal{C}^*(x) = \left\{ y \in \mathcal{Y} : \omega(y) \dfrac{p(y|x)}{p(y)} \geq t\right\}. \end{align} \]


Introduce a new conformal score, the Weighted Prevalence-Adjusted Softmax (WPAS): \(s(x,y) = \omega(y) \frac{\hat{p}(y|x)}{\hat{p}(y)}\)

Experiments on Pl@ntNet-300K: endangered species


Goal: handle at-risk species in Pl@ntNet-300K by weighting each at-risk species \(\gamma \geq 1\) times more:

\[ \omega(y) = \begin{cases} \frac{\gamma}{W} & \text{if } y \in \mathcal{Y}_{\text{at-risk}} \quad (\text{with } W = \gamma|\mathcal{Y}_{\text{at-risk}}| + |\mathcal{Y} \setminus \mathcal{Y}_{\text{at-risk}}|)\\ \frac{1}{W} & \text{otherwise}, \end{cases} \]
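The at-risk weighting above, together with the WPAS score, in a few lines (names are mine; \(\gamma\) and the at-risk set are user choices):

```python
import numpy as np

def at_risk_weights(K, at_risk, gamma=5.0):
    """omega(y) = gamma / W for at-risk classes, 1 / W otherwise,
    with W = gamma * |at-risk| + |rest|, so the weights sum to one."""
    omega = np.ones(K)
    omega[list(at_risk)] = gamma
    return omega / omega.sum()

def wpas_scores(probs, class_prior, omega):
    """Weighted Prevalence-Adjusted Softmax: s(x,y) = omega(y) * p_hat(y|x) / p_hat(y)."""
    return omega * probs / class_prior
```

Setting \(\gamma = 1\) recovers the unweighted PAS score up to a constant factor, which leaves the prediction sets unchanged.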

Crowd-sourced data:
Votes, labels & aggregation

Joint work with


Tanguy Lefort

Now at Seenovate

within

Benjamin Charlier

INRAE



“Cooperative learning of Pl@ntNet’s Artificial Intelligence algorithm:
how does it work and how can we improve it?”

T. Lefort et al.

Methods in Ecology and Evolution, 2025

Pl@ntNet online “votes”


Link: https://identify.plantnet.org/weurope/observations/1012500059

What about labels?



  • Images come from users… and so do the labels!

  • But users can be wrong, or may not be experts

  • Several labels can be available per image!

Users can make corrections


… sometimes can’t be trusted


Link: https://identify.plantnet.org/weurope/observations/1012500059

Pl@ntNet label aggregation (EM algorithm)


Weighting scheme: weight each user's vote by the number of species they have identified
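The weighting idea can be illustrated with a plain weighted plurality vote. The `log(1 + n)` weight below is a hypothetical choice for illustration only; the actual Pl@ntNet scheme (Lefort et al., 2025) and its EM refinement differ in detail:

```python
import math
from collections import defaultdict

def aggregate_votes(votes, n_identified):
    """Weighted plurality vote over species labels.

    votes: dict user -> proposed species label
    n_identified: dict user -> number of species that user has identified
    (the log(1 + n) weight is a hypothetical stand-in for Pl@ntNet's scheme)
    """
    tally = defaultdict(float)
    for user, label in votes.items():
        tally[label] += math.log1p(n_identified[user])
    return max(tally, key=tally.get)
```

With such a weight, one experienced botanist can outvote several novices, which matches the intuition behind the scheme.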

Take home message


  • Challenges in citizen science: many and varied (need more attention)
  • Crowdsourcing / Label uncertainty: helpful for data curation
  • Improved data quality \(\implies\) improved learning performance
  • Prediction: theory can guide the set to display

Dataset release:

Code release:

Future work

  • Handling label hierarchy
  • Human–computer interaction / performative learning / model collapse
  • Improve robustness to adversarial users
  • Leverage gamification for higher-quality labels: theplantgame.com

References

References


Bar-On, Y. M., Phillips, R., & Milo, R. (2018). The biomass distribution on Earth. Proceedings of the National Academy of Sciences, 115(25), 6506–6511.
Ding, T., Fermanian, J.-B., & Salmon, J. (2026). Conformal Prediction for Long-Tailed Classification. ICLR.
Donoho, D. (2024). Data science at the singularity. Harvard Data Science Review, 6(1).
Garcin, C., Joly, A., Bonnet, P., Affouard, A., Lombardo, J.-C., Chouet, M., Servajean, M., Lorieul, T., & Salmon, J. (2021). Pl@ntNet-300K: A plant image dataset with high label ambiguity and a long-tailed distribution. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks.
Lefort, T., Affouard, A., Charlier, B., Lombardo, J.-C., Chouet, M., Goëau, H., Salmon, J., Bonnet, P., & Joly, A. (2025). Cooperative learning of Pl@ntNet's artificial intelligence algorithm: How does it work and how can we improve it? Methods in Ecology and Evolution.
Lefort, T., Charlier, B., Joly, A., & Salmon, J. (2024). Identify ambiguous tasks combining crowdsourced labels by weighting areas under the margin. TMLR.
Recht, B., Roelofs, R., Schmidt, L., & Shankar, V. (2019). Do ImageNet classifiers generalize to ImageNet? ICML, 5389–5400.
Sadinle, M., Lei, J., & Wasserman, L. (2019). Least ambiguous set-valued classifiers with bounded error levels. Journal of the American Statistical Association, 114(525), 223–234.
Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic learning in a random world. Springer.