Generating Synthetic Wildlife Health Data from Camera Trap Imagery: A Pipeline for Alopecia and Body Condition Training Data
David Brundage
Abstract: No publicly available, ML-ready datasets exist for wildlife health conditions in camera trap imagery, creating a fundamental barrier to automated health screening. We present a pipeline for generating synthetic training images depicting alopecia and body condition deterioration in wildlife from real camera trap photographs. Our pipeline constructs a curated base image set from iWildCam using MegaDetector-derived bounding boxes and center-frame-weighted stratified sampling across 8 North American species. A generative phenotype-editing system produces controlled severity variants depicting hair loss consistent with mange and emaciation. An adaptive scene-drift quality-control system uses a sham prefilter and a decoupled mask-then-score approach with complementary day/night metrics to reject images where the generative model altered the original scene. We frame the pipeline explicitly as a screening data source. From 201 base images across 4 species, we generate 553 QC-passing synthetic variants with an overall pass rate of 83%. A sim-to-real transfer experiment, training exclusively on synthetic data and testing on real camera trap images of suspected health conditions, achieves 0.85 AUROC, demonstrating that the synthetic data captures visual features sufficient for screening.
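The mask-then-score idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the metric, the threshold, or how the mask is built. The sketch assumes a rectangular animal bounding box, a mean absolute pixel difference as the drift score, and a hypothetical threshold of 0.05. The function names (`background_mask`, `drift_score`, `passes_qc`) are invented for illustration. The sham prefilter and the separate day/night metrics are not modelled here.

```python
from typing import List, Tuple

# Assumption: grayscale images as rows of pixel intensities in [0, 1].
Image = List[List[float]]
# Assumption: box is (x_min, y_min, x_max, y_max) with exclusive max edges.
Box = Tuple[int, int, int, int]


def background_mask(height: int, width: int, box: Box) -> List[List[bool]]:
    """Mask step: True for pixels outside the animal box (the scene)."""
    x0, y0, x1, y1 = box
    return [
        [not (x0 <= x < x1 and y0 <= y < y1) for x in range(width)]
        for y in range(height)
    ]


def drift_score(original: Image, edited: Image, mask: List[List[bool]]) -> float:
    """Score step: mean absolute difference over masked (scene) pixels only."""
    total, count = 0.0, 0
    for row_o, row_e, row_m in zip(original, edited, mask):
        for o, e, is_scene in zip(row_o, row_e, row_m):
            if is_scene:
                total += abs(o - e)
                count += 1
    return total / count if count else 0.0


def passes_qc(original: Image, edited: Image, box: Box,
              threshold: float = 0.05) -> bool:
    """Accept the edit if the scene outside the box barely changed.

    The threshold value is hypothetical, chosen only for this example.
    """
    mask = background_mask(len(original), len(original[0]), box)
    return drift_score(original, edited, mask) <= threshold
```

Keeping the mask and the score as separate functions means the scoring metric can be swapped (for example, one metric for daytime color frames and another for infrared night frames) without touching how the animal region is excluded.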
Subjects:
Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.26754 [cs.CV]
(or arXiv:2603.26754v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2603.26754
arXiv-issued DOI via DataCite
Submission history
From: David Brundage
[v1] Mon, 23 Mar 2026 13:35:28 UTC (14,536 KB)