DCASE2021 Challenge

Antoine Deleforge

Sharath Adavanne

Prerak Srivastava

Daniel Krause

Tuomas Virtanen


Sound Event Detection and Separation in Domestic Environments

Domestic Task 4

The task evaluates systems for the detection of sound events using weakly labeled data (without timestamps). The target of the systems is to provide not only the event class but also the event time boundaries, given that multiple events can be present in an audio recording. This year, we also encourage participants to propose systems that use source separation jointly with sound event detection. The task aims to investigate how synthetic data can be exploited optimally and to what extent source separation can improve sound event detection, and vice versa. This task is a follow-up to DCASE 2020 Task 4.
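To make the expected output concrete (an event class plus its time boundaries), here is a minimal sketch of one common post-processing step: thresholding per-frame activity probabilities for a single class and merging consecutive active frames into (onset, offset) pairs. The function name, the threshold of 0.5, and the hop size are illustrative assumptions and are not taken from the challenge baseline.

```python
from typing import List, Tuple


def frames_to_events(
    probs: List[float],
    threshold: float = 0.5,    # assumed decision threshold
    hop_seconds: float = 0.5,  # assumed duration of one frame, in seconds
) -> List[Tuple[float, float]]:
    """Turn per-frame probabilities for ONE class into (onset, offset) pairs in seconds."""
    events = []
    onset = None  # index of the first frame of the event currently open, if any
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and onset is None:
            onset = i  # an event starts at this frame
        elif not active and onset is not None:
            events.append((onset * hop_seconds, i * hop_seconds))
            onset = None  # the event ended just before this frame
    if onset is not None:
        # The event is still active when the recording ends.
        events.append((onset * hop_seconds, len(probs) * hop_seconds))
    return events


# Hypothetical frame probabilities for a single class such as "Speech".
print(frames_to_events([0.1, 0.8, 0.9, 0.2, 0.7]))
# → [(0.5, 1.5), (2.0, 2.5)]
```

Because multiple events can overlap in a recording, a system would typically run this per class and collect the resulting events together with their class labels.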

Organizers

Romain Serizel

University of Lorraine

Nicolas Turpault

Francesca Ronchini

Scott Wisdom

Hakan Erdogan

John Hershey

Università Politecnica delle Marche

Justin Salamon

Adobe Research

Prem Seetharaman

Northwestern University

Eduardo Fonseca

Universitat Pompeu Fabra

Samuele Cornell

Daniel P. W. Ellis

Queen Mary University of London


Few-shot Bioacoustic Event Detection

Bio Task 5

This challenge focuses on sound event detection in a few-shot learning setting for animal (mammal and bird) vocalisations. Participants will be expected to create a method that can extract information from five exemplar vocalisations (shots) of mammals or birds and detect and classify sounds in field recordings. The main objective is to find reliable algorithms that are capable of dealing with data sparsity, class imbalance, and noisy/busy environments.
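As an illustration of the five-shot setting, the sketch below uses a generic prototype-based approach: the five exemplar embeddings are averaged into a positive prototype, and each query frame is marked as an event if it lies closer to that prototype than to a background prototype. The embedding values, the function names, and the use of a separate background prototype are all assumptions made for this example; the task description does not prescribe this method.

```python
from math import dist
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors (the class prototype)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def detect_frames(
    query_embeddings: Sequence[Sequence[float]],
    shot_embeddings: Sequence[Sequence[float]],        # e.g. the five exemplar vocalisations
    background_embeddings: Sequence[Sequence[float]],  # assumed frames containing no target call
) -> List[bool]:
    """Mark each query frame True if it is nearer the positive prototype (Euclidean distance)."""
    positive = mean_vector(shot_embeddings)
    negative = mean_vector(background_embeddings)
    return [dist(q, positive) < dist(q, negative) for q in query_embeddings]


# Toy 2-D embeddings; a real system would obtain them from a learned audio encoder.
shots = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [1.0, 0.8], [1.0, 1.2]]
background = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
queries = [[0.9, 1.0], [0.1, 0.0]]
print(detect_frames(queries, shots, background))
# → [True, False]
```

The resulting frame-level decisions would still need to be merged into events with onset and offset times before they can be evaluated.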

Organizers

Veronica Morfi

Dan Stowell

Queen Mary University of London
Tilburg University

Vincent Lostanlen

Centre National de la Recherche Scientifique (CNRS)

Ariana Strandburg-Peshkin

University of Konstanz
Max Planck Institute of Animal Behavior

Lisa Gill

BIOTOPIA Naturkundemuseum Bayern

Hanna Pamula

AGH University of Science and Technology

Ines Nolasco

Queen Mary University of London

David Benvent

Mathieu Duteil

University of Konstanz

Sripathi Sridhar

Shubhr Singh

Queen Mary University of London


Automated Audio Captioning

Caption Task 6

Automated audio captioning is the task of general audio content description using free text. It is an intermodal translation task (not speech-to-text), where a system accepts an audio signal as input and outputs a textual description (i.e. the caption) of that signal. Audio captioning methods can model concepts (e.g. "muffled sound"), physical properties of objects and environment (e.g. "the sound of a big car", "people talking in a small and empty room"), and high-level knowledge ("a clock rings three times"). This modeling can be used in various applications, ranging from automatic content description to intelligent and content-oriented machine-to-machine interaction. This task is a follow-up to DCASE 2020 Task 6.

Organizers

Konstantinos Drossos

Samuel Lipping

Tuomas Virtanen