Citizen science for plant identification: insights from Pl@ntNet
Joseph Salmon
IMAG, Univ Montpellier, CNRS, Montpellier
Institut Universitaire de France (IUF)
Mainly joint work with:
and from
A citizen science platform using machine learning to help people identify plants with their mobile phones
https://identify.plantnet.org/stats
Note: I am mostly innocent; started working with the Pl@ntNet team in 2020
Motivation: an excellent app … but not a perfect one; how can it be improved?
Limitations of popular datasets:
Motivation:
release a large-scale dataset sharing features similar to those of the Pl@ntNet dataset, to foster research in plant identification
\(\implies\) Pl@ntNet-300K (Garcin et al. 2021)
Top-5 most observed plant species in Pl@ntNet (13/04/2024):
25134 obs. Echium vulgare L.
24720 obs. Ranunculus ficaria L.
24103 obs. Prunus spinosa L.
23288 obs. Zea mays L.
23075 obs. Alliaria petiolata
10753 obs. Centaurea jacea
6 obs. Cenchrus agrimonioides
8376 obs. Magnolia grandiflora
413 obs. Moehringia trinervia
Note: long tail preserved by genera subsampling
Characteristics:
Zenodo, 1 click download
https://zenodo.org/record/5645731
Code to train models
https://github.com/plantnet/PlantNet-300K
Images from users… so are the labels!
But users can be wrong or not experts
Several labels can be available per image!
Link: https://identify.plantnet.org/weurope/observations/1012500059
The good, the bad and the ugly
Provide for all images \(x_i\) an aggregated label \(\hat{y}_i\) based on the votes \(y^{u}_i\) of the workers \(u \in \mathcal{U}\).
Naive idea: make users vote and take the most voted label for each image
\[ \forall x_i \in \mathcal{X}_{\text{train}},\quad \hat y_i^{\text{MV}} = \mathop{\mathrm{arg\,max}}_{k\in [K]} \Big(\sum\limits_{u\in\mathcal{U}(x_i)} {1\hspace{-3.8pt} 1}_{\{y^{u}_i=k\}} \Big) \]
Properties:
✓ simple
✓ adapted for any number of users
✓ efficient: a few labelers suffice (say < 5, Snow et al. 2008)
✗ ineffective for borderline cases
✗ suffers from spammers / adversarial users
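The majority vote \(\hat y_i^{\text{MV}}\) can be sketched in a few lines of Python. This is only an illustration: the function name `majority_vote`, the dictionary layout (user id mapped to label) and the tie-breaking rule are assumptions, not taken from the Pl@ntNet code.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most voted label for one image.

    votes: dict mapping a user id to the label proposed by that user
    (hypothetical layout, chosen for this sketch).
    Ties go to the label that was seen first, an arbitrary choice.
    """
    counts = Counter(votes.values())
    return counts.most_common(1)[0][0]


# Toy example: two users out of three agree.
votes_image = {"u1": "Echium vulgare", "u2": "Echium vulgare", "u3": "Zea mays"}
print(majority_vote(votes_image))
```

Every vote counts equally here, which is why a single spammer or a borderline case can flip the result.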
Constraints: wide range of skills, different levels of expertise
Modeling aspect: add a user weight to balance votes
Assume given weights \((w_u)_{u\in\mathcal{U}}\) for now
The label confidence \(\mathrm{conf}_{i}(k)\) of label \(k\) for image \(x_i\) is the sum of the weights of the workers who voted for \(k\): \[ \forall k \in [K], \quad \mathrm{conf}_{i}(k) = \sum\limits_{u\in\mathcal{U}(x_i)} w_u {1\hspace{-3.8pt} 1}_{\{y^{u}_i=k\}} \]
Size effect:
The label accuracy \(\mathrm{acc}_{i}(k)\) of label \(k\) for image \(x_i\) is the normalized sum of weights of the workers who voted for \(k\): \[ \forall k \in [K], \quad \mathrm{acc}_{i}(k) = \frac{\mathrm{conf}_i(k)}{\sum\limits_{k'\in [K]} \mathrm{conf}_i(k')} \]
Interpretation:
Majority voting but weighted by a confidence score per user \(u\): \[ \forall x_i \in \mathcal{X}_{\text{train}},\quad \hat{y}_i^{\textrm{WMV}} = \mathop{\mathrm{arg\,max}}_{k\in [K]} \Big(\sum\limits_{u\in\mathcal{U}(x_i)} w_u {1\hspace{-3.8pt} 1}_{\{y^{u}_i=k\}} \Big) \]
Note: the weighted majority vote can be computed from confidence or accuracy \[ \hat{y}_i^{\textrm{WMV}} = \mathop{\mathrm{arg\,max}}_{k\in [K]} \Big( \mathrm{conf}_i(k) \Big) = \mathop{\mathrm{arg\,max}}_{k\in [K]} \Big(\mathrm{acc}_i(k) \Big) \]
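A minimal sketch of \(\mathrm{conf}_i\), \(\mathrm{acc}_i\) and the weighted majority vote for a single image. The function names and the dictionary layout are assumptions made for the illustration, and the weights are assumed to be positive so that the normalization is well defined.

```python
from collections import defaultdict


def label_confidence(votes, weights):
    """conf_i(k): sum of the weights of the users who voted for label k.

    Labels that received no vote have confidence 0 and are left out.
    """
    conf = defaultdict(float)
    for user, label in votes.items():
        conf[label] += weights[user]
    return dict(conf)


def label_accuracy(conf):
    """acc_i(k): confidence normalized over all voted labels."""
    total = sum(conf.values())
    return {label: c / total for label, c in conf.items()}


def weighted_majority_vote(votes, weights):
    """Label with the largest confidence (equivalently, largest accuracy)."""
    conf = label_confidence(votes, weights)
    return max(conf, key=conf.get)


# Toy example: one heavily weighted user outvotes two lightly weighted ones.
votes = {"u1": "A", "u2": "A", "u3": "B"}
weights = {"u1": 0.5, "u2": 0.5, "u3": 3.0}
conf = label_confidence(votes, weights)
acc = label_accuracy(conf)
print(conf, acc, weighted_majority_vote(votes, weights))
```

Since the accuracy is the confidence divided by a constant that does not depend on \(k\), both yield the same arg max.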
Two pillars for validating a label \(\hat{y}_i\) for an image \(x_i\) in Pl@ntNet:
Expertise: labels quality check
keep images with label confidence above a threshold \(\theta_{\text{conf}}\), validate \(\hat{y}_i\) when \[ \boxed{\mathrm{conf}_{i}(\hat{y}_i) > \theta_{\text{conf}}} \]
Consensus: labels agreement check
keep images with label accuracy above a threshold \(\theta_{\text{acc}}\), validate \(\hat{y}_i\) when \[ \boxed{\mathrm{acc}_{i}(\hat{y}_i) > \theta_{\text{acc}}} \]
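The two checks can be combined as in the sketch below. It assumes that a label is validated only when both conditions hold; the function name `validate_label` and the threshold values in the example are illustrative placeholders, not the values used in Pl@ntNet.

```python
def validate_label(votes, weights, theta_conf, theta_acc):
    """Return (y_hat, is_valid) for one image.

    votes: dict user id -> label; weights: dict user id -> positive weight.
    Assumption of this sketch: both the expertise check (confidence)
    and the consensus check (accuracy) must pass.
    """
    conf = {}
    for user, label in votes.items():
        conf[label] = conf.get(label, 0.0) + weights[user]
    y_hat = max(conf, key=conf.get)            # weighted majority vote
    acc = conf[y_hat] / sum(conf.values())     # normalized confidence
    is_valid = conf[y_hat] > theta_conf and acc > theta_acc
    return y_hat, is_valid


# Illustrative thresholds (hypothetical values).
THETA_CONF, THETA_ACC = 2.0, 0.7

votes = {"u1": "A", "u2": "A", "u3": "B"}
weights = {"u1": 1.5, "u2": 1.5, "u3": 1.0, "u4": 1.0}
print(validate_label(votes, weights, THETA_CONF, THETA_ACC))

# One more dissenting vote lowers the accuracy below the threshold.
votes_more = dict(votes, u4="B")
print(validate_label(votes_more, weights, THETA_CONF, THETA_ACC))
```

In the second call the confidence of label A is unchanged, but the extra vote for B dilutes the consensus, so the label is no longer validated.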
Weighting scheme: weight each user's vote by the number of species that user has identified
Take into account 4 users out of 6
Invalidated label: Adding User 5 reduces accuracy
Label switched: User 6 is an expert (even self-validating)
\[ f(n_u) = n_u^\alpha - n_u^\beta + \gamma \text{ with } \begin{cases} \alpha = 0.5 \\ \beta=0.2 \\ \gamma=\log(2.1)\simeq 0.74 \end{cases} \]
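A small sketch of the weight function, where \(n_u\) is read as the number of species identified by user \(u\). The function name `user_weight` is hypothetical, and \(\gamma\) is set to the rounded value 0.74 quoted above.

```python
def user_weight(n_u, alpha=0.5, beta=0.2, gamma=0.74):
    """f(n_u) = n_u**alpha - n_u**beta + gamma.

    n_u: number of species identified by the user (non-negative).
    gamma uses the rounded value 0.74 (an assumption of this sketch).
    """
    return n_u**alpha - n_u**beta + gamma


# Weights for users with increasing experience.
for n in (1, 10, 100, 1000):
    print(n, round(user_weight(n), 3))
```

With these parameters a user with a single identified species gets weight \(\gamma\), and the weight then grows roughly like \(\sqrt{n_u}\).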
Worker agreement with aggregate (WAWA): 2-step method
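The slide does not spell out the two steps. The sketch below follows the usual description of WAWA: (1) aggregate with a plain majority vote, (2) weight each worker by the fraction of their votes that agree with that aggregate, then recompute a weighted majority vote. Function name, data layout and tie-breaking are assumptions.

```python
from collections import Counter, defaultdict


def wawa(votes):
    """votes: dict image id -> dict worker id -> label.

    Returns (labels, weights): the aggregated label per image and the
    agreement-based weight per worker.
    """
    # Step 1: plain majority vote per image (ties: first-seen label).
    mv = {i: Counter(v.values()).most_common(1)[0][0] for i, v in votes.items()}

    # Worker weight: fraction of the worker's votes matching the majority vote.
    agree, total = defaultdict(int), defaultdict(int)
    for i, v in votes.items():
        for worker, label in v.items():
            total[worker] += 1
            agree[worker] += int(label == mv[i])
    weights = {w: agree[w] / total[w] for w in total}

    # Step 2: weighted majority vote with these weights.
    labels = {}
    for i, v in votes.items():
        conf = defaultdict(float)
        for worker, label in v.items():
            conf[label] += weights[worker]
        labels[i] = max(conf, key=conf.get)
    return labels, weights


votes = {
    "img1": {"u1": "A", "u2": "A", "u3": "B"},
    "img2": {"u1": "C", "u2": "C", "u3": "C"},
    "img3": {"u1": "D", "u2": "D", "u3": "E"},
}
labels, weights = wawa(votes)
print(labels, weights)
```

Workers who often disagree with the crowd end up with a small weight, without requiring any ground truth.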
No ground truth available to evaluate the strategies
Pl@ntNet South Western European flora
Why?
Main danger
\(\Longrightarrow\) confident AI with \(\theta_{\text{score}}=0.7\) performs best… but invalidating AI could be preferred for safety \(\Longleftarrow\)
peerannot: Python library to handle crowdsourced data
Dataset release:
Code release:
Future work