Conformal Prediction to Classify Animal Behaviours with Uncertainty
Highlights
- Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used to classify animal behaviors across taxa. However, practitioners face challenges such as sparse, class-imbalanced labels, noisy data, distribution shift, and unquantified uncertainty.
- We introduce an open-source pipeline that integrates 1D Convolutional Neural Networks (CNNs) with Conformal Prediction (CP) to produce statistically valid prediction sets for animal behavior.
- Our approach includes a flexible class rebalancing strategy that significantly improves the detection of rare, ecologically important behaviors.
- We evaluate our approach via simulation and highlight its utility using data collected from free-ranging African wild dogs (Lycaon pictus), a large carnivore, in the Okavango Delta, Botswana.
Challenges of Behavior Prediction in Ecology
Accelerometers capture waveform patterns corresponding to specific movements, but translating these into reliable behavioral logs is difficult. Current ML applications in ecology face four main limitations:
- Sparse and imbalanced labels: Ecological datasets are often dominated by common behaviors (e.g., Resting). In our dataset, minority classes like Feeding and Running comprised only ~6% of the data.
- Uncertainty quantification: Most models output a single "best guess" without indicating confidence, which can be misleading for ambiguous behaviors.
- Distribution shifts: Models trained on specific individuals or time periods often degrade when applied to new contexts (e.g., different years or animals).
- Temporal inconsistency: Frame-by-frame classifiers often produce rapid, biologically impossible fluctuations in predicted behavior.
Methodology
Feature Extraction & Class Rebalancing
We use a 1D CNN to extract features from raw accelerometry data. To address the severe class imbalance (e.g., 66% Resting vs 2% Feeding), we employ a flexible class rebalancing approach parametrized by $\theta$. This allows us to interpolate between the original class distribution ($\theta=0$) and a perfectly balanced uniform distribution ($\theta=1$), so that $\theta$ can be tuned to improve the detection of rare events.
Left: Our 1D CNN architecture extracts features from raw tri-axial accelerometer data. Right: Confusion matrices showing that our rebalancing approach improves recall on the rare Feeding class from 0.62 to 0.97.
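One way to read the interpolation controlled by $\theta$ is as a linear mix of the empirical class distribution and the uniform distribution. The sketch below assumes that linear form, which the description above does not spell out, and turns it into per-sample resampling weights. The function names and the resampling-with-replacement step are illustrative assumptions, not the released pipeline's API.

```python
import numpy as np


def rebalanced_distribution(labels, theta):
    """Return (classes, empirical, target) class distributions.

    Assumption: the target is a linear mix of the empirical class
    frequencies (theta=0) and the uniform distribution (theta=1).
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    classes, counts = np.unique(labels, return_counts=True)
    empirical = counts / counts.sum()
    uniform = np.full(len(classes), 1.0 / len(classes))
    target = (1.0 - theta) * empirical + theta * uniform
    return classes, empirical, target


def resample_indices(labels, theta, rng):
    """Draw a training sample whose expected class mix matches the target."""
    labels = np.asarray(labels)
    classes, empirical, target = rebalanced_distribution(labels, theta)
    # Each sample is weighted by target / empirical frequency of its class.
    ratio = dict(zip(classes.tolist(), target / empirical))
    weights = np.array([ratio[c] for c in labels.tolist()])
    weights = weights / weights.sum()
    return rng.choice(len(labels), size=len(labels), replace=True, p=weights)
```

With a label vector that is 66% Resting and 2% Feeding, `theta=0` reproduces those proportions in expectation, `theta=1` samples every class equally often, and intermediate values trade off the two.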
Uncertainty Quantification via Conformal Prediction
Standard neural networks output softmax scores that often do not reflect true probabilities. To address this, we apply Regularized Adaptive Prediction Sets (RAPS). This method calibrates the model using a held-out validation set to generate prediction sets $\mathcal{C}(X)$. These sets are guaranteed to contain the true behavior $y$ with probability at least $1-\alpha$ (e.g., 95%), where $\alpha$ is a user-specified error level, provided the held-out and test data are exchangeable. The size of a prediction set then signals ambiguity: a set containing several behaviors flags a data segment where the model is uncertain.
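The calibration step can be sketched in a few lines of NumPy. This is a simplified, non-randomized variant of RAPS written for illustration: the function names, the default penalty `lam` and rank offset `k_reg`, and the choice to always keep the top-ranked class are assumptions, not the pipeline's actual settings. Labels are assumed to be integer class indices.

```python
import numpy as np


def raps_scores(probs, lam=0.01, k_reg=1):
    """Conformity score for every class of every sample.

    Score = softmax mass of all classes ranked at or above the class,
    plus a penalty lam for each rank beyond k_reg.
    """
    order = np.argsort(-probs, axis=1)  # most to least likely
    sorted_probs = np.take_along_axis(probs, order, axis=1)
    cumulative = np.cumsum(sorted_probs, axis=1)
    ranks = np.arange(1, probs.shape[1] + 1)  # 1-based ranks
    penalised = cumulative + lam * np.maximum(ranks - k_reg, 0)
    scores = np.empty_like(penalised)
    np.put_along_axis(scores, order, penalised, axis=1)  # back to class order
    return scores


def calibrate(cal_probs, cal_labels, alpha=0.05, lam=0.01, k_reg=1):
    """Threshold q_hat from held-out data (finite-sample corrected quantile)."""
    n = len(cal_labels)
    scores = raps_scores(cal_probs, lam, k_reg)
    true_scores = scores[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points: only the full set is valid
        return np.inf
    return np.quantile(true_scores, k / n, method="higher")


def prediction_sets(probs, q_hat, lam=0.01, k_reg=1):
    """Boolean matrix: entry (i, c) is True if class c is in the set for sample i."""
    sets = raps_scores(probs, lam, k_reg) <= q_hat
    # Assumption: never return an empty set; keep the top-ranked class.
    sets[np.arange(len(probs)), probs.argmax(axis=1)] = True
    return sets
```

Rows of the returned matrix with more than one `True` entry correspond to ambiguous segments; dropping the randomization step makes the sets slightly larger than in full RAPS but does not reduce coverage.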
Temporal Smoothing
To mitigate noisy, rapid fluctuations in predictions, we implement a temporal smoothing step. We average softmax scores over a sliding window of size $s$, leveraging the temporal context of surrounding behaviors to enforce biological plausibility.
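A minimal sketch of this smoothing step is shown below. It assumes a centered window and edge padding at the start and end of the sequence; the description above fixes only the window size $s$, so both choices, like the function name, are illustrative.

```python
import numpy as np


def smooth_scores(probs, s):
    """Moving average of softmax scores over a sliding window of size s.

    probs has shape (time, classes). Assumptions: the window is centered
    and the first/last rows are repeated to pad the sequence ends.
    """
    if s < 1:
        raise ValueError("window size s must be at least 1")
    probs = np.asarray(probs, dtype=float)
    kernel = np.ones(s) / s
    pad_left = (s - 1) // 2
    pad_right = s - 1 - pad_left
    padded = np.pad(probs, ((pad_left, pad_right), (0, 0)), mode="edge")
    # Smooth each class's score trace independently along the time axis.
    columns = [
        np.convolve(padded[:, c], kernel, mode="valid")
        for c in range(probs.shape[1])
    ]
    return np.stack(columns, axis=1)
```

Because each output row is an average of rows that sum to one, the smoothed scores remain valid probability vectors. An isolated single-window flip in the predicted class is suppressed when the neighboring windows agree with each other.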
Citation
@article{agarwal2025leveraging,
title={Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty},
author={Agarwal, Medha and Rafiq, Kasim and Mehta, Ronak and Abrahms, Briana and Harchaoui, Zaid},
journal={Methods in Ecology and Evolution (MEE)},
year={2025}
}