Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System

August 29, 2025

arXiv:2508.20983v1 (cross-listed)
Abstract: The SAFE Challenge evaluates synthetic speech detection across three tasks: unmodified audio, processed audio with compression artifacts, and laundered audio designed to evade detection. We systematically explore self-supervised learning (SSL) front-ends, training data compositions, and audio length configurations for robust deepfake detection. Our AASIST-based approach uses a WavLM Large front-end with RawBoost augmentation and is trained on a multilingual dataset of 256,600 samples spanning 9 languages and over 70 TTS systems, drawn from CodecFake, MLAAD v5, SpoofCeleb, Famous Figures, and MAILABS. Through extensive experimentation with different SSL front-ends, three training data versions, and two audio lengths, we achieved second place in both Task 1 (unmodified audio detection) and Task 3 (laundered audio detection), demonstrating strong generalization and robustness.
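The abstract mentions comparing two audio length configurations but does not say how clips are brought to a fixed length. A common approach in detectors that take raw waveforms is to tile short clips and crop long ones. The sketch below illustrates that idea only; the function name `fit_to_length`, the 16 kHz sample rate, and the 4 s and 10 s durations are assumptions chosen for illustration, not values taken from the paper.

```python
import random
from typing import List, Optional

# Assumed values for illustration; the abstract does not state them.
SAMPLE_RATE = 16_000          # samples per second (hypothetical)
CLIP_SECONDS = (4, 10)        # two hypothetical audio-length settings


def fit_to_length(
    samples: List[float],
    target_len: int,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return exactly target_len samples.

    Short clips are tiled (repeated) until long enough, then truncated.
    Long clips are cropped: from a random offset if rng is given
    (typical for training), otherwise from the start (deterministic,
    typical for evaluation).
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not samples:
        # Nothing to tile, so fall back to silence.
        return [0.0] * target_len

    n = len(samples)
    if n < target_len:
        repeats = -(-target_len // n)  # ceiling division
        return (samples * repeats)[:target_len]

    # n >= target_len: pick the crop offset.
    max_start = n - target_len
    start = rng.randrange(max_start + 1) if rng is not None else 0
    return samples[start:start + target_len]


if __name__ == "__main__":
    clip = [0.1, 0.2, 0.3]
    for seconds in CLIP_SECONDS:
        fixed = fit_to_length(clip, seconds * SAMPLE_RATE)
        print(seconds, len(fixed))
```

Passing a seeded `random.Random` gives reproducible random crops during training, while omitting it keeps evaluation deterministic.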
