Conformal Prediction for Long-Tailed Classification
Joseph Salmon
IMAG, Univ Montpellier, CNRS, Inria, Montpellier, France
Pl@ntnet Consortium

Tiffany Ding
UC Berkeley

Jean-Baptiste Fermanian
Inria
and the whole Pl@ntNet team
Pl@ntNet: a citizen science platform using machine learning to help people identify plants with their mobile phones

Pl@ntNet-300K dataset (Garcin et al., 2021): a baby Pl@ntNet, available on Zenodo
\[ \mathcal{C}_{\alpha}(X) = \big\{ y : s(X,y) \geq t_\alpha \big\} \]
Marginal coverage targets:
\[\mathbb{P}\big[ Y \in \mathcal{C}_{\alpha}(X) \big ] \geq 1 - \alpha.\]
Class-conditional coverage targets:
\[\forall y,\quad \mathbb{P}\big[ Y \in \mathcal{C}_\alpha(X) | Y=y \big ] \geq 1 - \alpha.\]
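The gap between the two targets can be checked empirically on held-out data. The sketch below is a minimal illustration, not the authors' code: it assumes a score matrix `scores[i, y]` standing for \(s(X_i, y)\), integer labels, and a threshold `t` that is already given. The function names are hypothetical.

```python
import numpy as np

def prediction_sets(scores, t):
    """Boolean mask: sets[i, y] is True when class y belongs to C(X_i)."""
    return np.asarray(scores) >= t

def marginal_coverage(sets, labels):
    """Fraction of examples whose true label is in the prediction set."""
    labels = np.asarray(labels)
    return sets[np.arange(len(labels)), labels].mean()

def classwise_coverage(sets, labels):
    """Coverage computed separately for each class present in `labels`."""
    labels = np.asarray(labels)
    covered = sets[np.arange(len(labels)), labels]
    return {int(y): covered[labels == y].mean() for y in np.unique(labels)}

# Toy data (made up for illustration): class 0 is frequent, class 1 is rare.
scores = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]])
labels = np.array([0, 0, 0, 1])
sets = prediction_sets(scores, t=0.5)
print(marginal_coverage(sets, labels))   # 0.75
print(classwise_coverage(sets, labels))  # class 0 fully covered, class 1 never covered
```

On this toy example the marginal coverage is 0.75 while the rare class is never covered, which is the failure mode the class-conditional target is meant to rule out.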
Theorem (Informal)
Among prediction sets with marginal coverage at least \(1-\alpha\), the one of minimum expected size is:
\[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha \right\} \]
Theorem (Informal)
Among prediction sets with class-conditional coverage at least \(1-\alpha\), the one of minimum expected size is:
\[ \mathcal{C}_{\alpha}(x) = \left\{ y : p(y|x) \geq t_\alpha^{y} \right\} \]
Marginal: calibrate a single threshold \(t_{\alpha}\) on the whole calibration set \((X_i, Y_i)_{i=1}^n\)
Class-conditional: calibrate one threshold \(t_{\alpha}^y\) per class, using only the calibration points \((X_i, Y_i)\) such that \(Y_i = y\)
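The two calibration schemes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `probs[i, y]` is taken to be a classifier's estimate of \(p(y|X_i)\), and the threshold is the usual split-conformal order statistic of rank \(\lfloor \alpha (n+1) \rfloor\) among the true-class scores (a choice the slides do not spell out). All function names are hypothetical.

```python
import numpy as np

def conformal_threshold(scores, alpha):
    """Threshold t so that {y : s(x, y) >= t} targets coverage >= 1 - alpha.

    Assumption: standard split-conformal rank floor(alpha * (n + 1)).
    When that rank is 0 (too few points), return -inf: every label is kept.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    k = int(np.floor(alpha * (len(scores) + 1)))
    return scores[k - 1] if k >= 1 else -np.inf

def calibrate_marginal(probs, labels, alpha):
    """One threshold, computed from all calibration points."""
    labels = np.asarray(labels)
    true_scores = probs[np.arange(len(labels)), labels]
    return conformal_threshold(true_scores, alpha)

def calibrate_classwise(probs, labels, alpha):
    """One threshold per class, each computed from that class's points only."""
    labels = np.asarray(labels)
    true_scores = probs[np.arange(len(labels)), labels]
    return np.array([conformal_threshold(true_scores[labels == y], alpha)
                     for y in range(probs.shape[1])])

def prediction_sets(probs, thresholds):
    """Works with a scalar threshold or a per-class vector (broadcasting)."""
    return probs >= thresholds
```

With this rank, a class with fewer than \(1/\alpha - 1\) calibration examples gets the threshold \(-\infty\) and is therefore included in every prediction set. In a long-tailed dataset this applies to many classes, so class-wise calibration tends to produce very large sets.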
\[ \text{MacroCoverage} = \frac{1}{|\mathcal{Y}|} \sum_{y \in \mathcal{Y}} \mathbb{P}\big( y \in \mathcal{C}(X) \, | \, Y = y \big) \]
Theorem (Informal)
Among prediction sets with MacroCoverage at least \(1-\alpha\), the one of minimum expected size is:
\[ \mathcal{C}_{\alpha}(x) = \left\{ y : \frac{p(y|x)}{p(y)} \geq t_\alpha \right\} \]
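One way to see why the ratio appears: by Bayes' rule \(p(x|y) = p(y|x)\,p(x)/p(y)\), so MacroCoverage equals \(\mathbb{E}_X\big[\frac{1}{|\mathcal{Y}|}\sum_{y \in \mathcal{C}(X)} p(y|X)/p(y)\big]\), and each label added to a set contributes in proportion to \(p(y|x)/p(y)\). The sketch below is a plug-in illustration under assumptions, not the authors' implementation: `probs` stands for estimated \(p(y|x)\), `priors` for estimated class frequencies \(p(y)\) (clipped away from zero, an arbitrary safeguard), and the threshold is supplied by hand. All names are hypothetical.

```python
import numpy as np

def prevalence_adjusted_scores(probs, priors, eps=1e-12):
    """Score p(y|x) / p(y); `eps` guards against zero priors (assumption)."""
    return np.asarray(probs) / np.maximum(np.asarray(priors), eps)

def macro_coverage(sets, labels):
    """Unweighted mean of per-class coverages, over classes seen in `labels`."""
    labels = np.asarray(labels)
    covered = sets[np.arange(len(labels)), labels]
    return np.mean([covered[labels == y].mean() for y in np.unique(labels)])

# Toy data (made up): class 0 has prior 0.9, class 1 has prior 0.1.
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]])
priors = np.array([0.9, 0.1])
labels = np.array([0, 1, 1])

raw_sets = probs >= 0.5                                        # threshold on p(y|x)
adj_sets = prevalence_adjusted_scores(probs, priors) >= 0.9    # threshold on p(y|x)/p(y)
print(macro_coverage(raw_sets, labels))  # 0.5
print(macro_coverage(adj_sets, labels))  # 1.0
```

The two thresholds here are arbitrary and the sets differ in size, so the printed numbers only illustrate how dividing by the prior lifts rare classes into the sets. In practice the threshold would be calibrated, and calibrating a single threshold on this score with the standard marginal procedure yields a marginal guarantee, not a MacroCoverage guarantee.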