Foundation Models for Bioacoustics -- a Comparative Review
Authors: Raphael Schwinger, Paria Vali Zadeh, Lukas Rauch, Mats Kurz, Tom Hauschild, Sam Lapp, Sven Tomforde
Abstract: Automated bioacoustic analysis is essential for biodiversity monitoring and conservation, requiring advanced deep learning models that can adapt to diverse bioacoustic tasks. This article presents a comprehensive review of large-scale pretrained bioacoustic foundation models and systematically investigates their transferability across multiple bioacoustic classification tasks. We give an overview of bioacoustic representation learning by analysing pretraining data sources and benchmarks. On this basis, we review bioacoustic foundation models, dissecting the models' training data, preprocessing, augmentations, architecture, and training paradigm. Additionally, we conduct an extensive empirical study of selected models on the BEANS and BirdSet benchmarks, evaluating generalisability under linear and attentive probing. Our experimental analysis reveals that Perch 2.0 achieves the highest BirdSet score (restricted evaluation) and the strongest linear probing result on BEANS, building on diverse multi-taxa supervised pretraining; that BirdMAE is the best model among probing-based strategies on BirdSet and second on BEANS after BEATs$_{NLM}$, the encoder of NatureLM-audio; that attentive probing helps extract the full performance of transformer-based models; and that general-purpose audio models trained with self-supervised learning on AudioSet outperform many specialised bird sound models on BEANS when evaluated with attentive probing. These findings provide valuable guidance for practitioners selecting models to adapt to new bioacoustic classification tasks via probing.
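The abstract contrasts linear and attentive probing without spelling out the difference. The following is a minimal NumPy sketch of the two probe heads applied to frozen encoder outputs, not the authors' implementation. It assumes the encoder returns a sequence of token embeddings of shape (batch, tokens, dim), that the linear probe mean-pools those tokens, and that the attentive probe uses a single learned query with one attention head. All function and parameter names are hypothetical, and the probes evaluated in the paper may differ (for example in pooling choice, number of heads, or normalisation).

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def linear_probe(tokens, W, b):
    """Linear probe (assumed form): mean-pool frozen tokens, then one linear layer.

    tokens: (B, T, D) frozen encoder outputs; W: (D, C); b: (C,).
    Only W and b would be trained.
    """
    pooled = tokens.mean(axis=1)  # (B, D), every token weighted equally
    return pooled @ W + b         # (B, C) class logits


def attentive_probe(tokens, query, W_k, W_v, W, b):
    """Attentive probe (assumed form): attention pooling, then one linear layer.

    A single learned query scores each token, so the head can focus on the
    informative frames instead of averaging over all of them.
    query: (D,); W_k, W_v: (D, D); W: (D, C); b: (C,).
    """
    keys = tokens @ W_k                                  # (B, T, D)
    values = tokens @ W_v                                # (B, T, D)
    scores = keys @ query / np.sqrt(query.shape[0])      # (B, T)
    weights = softmax(scores, axis=1)                    # (B, T), rows sum to 1
    pooled = np.einsum("bt,btd->bd", weights, values)    # (B, D)
    return pooled @ W + b, weights


# Toy usage with random stand-ins for frozen encoder outputs.
rng = np.random.default_rng(0)
B, T, D, C = 2, 5, 8, 3
tokens = rng.normal(size=(B, T, D))
W, b = rng.normal(size=(D, C)), np.zeros(C)

linear_logits = linear_probe(tokens, W, b)
attentive_logits, attn = attentive_probe(
    tokens, rng.normal(size=D), rng.normal(size=(D, D)), rng.normal(size=(D, D)), W, b
)
```

In this sketch the attentive probe strictly generalises the linear one: with a zero query and an identity value projection, the attention weights become uniform and the output equals the mean-pooled linear probe. That is one way to read the abstract's finding that attentive probing benefits transformer-based models, whose useful information may be spread unevenly across tokens.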
Comments: Preprint
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Quantitative Methods (q-bio.QM)
Cite as: arXiv:2508.01277 [cs.SD]
(or arXiv:2508.01277v2 [cs.SD] for this version)
https://doi.org/10.48550/arXiv.2508.01277
arXiv-issued DOI via DataCite
Submission history
From: Raphael Schwinger
[v1] Sat, 2 Aug 2025 09:15:16 UTC (622 KB)
[v2] Sun, 29 Mar 2026 12:45:56 UTC (649 KB)