Challenge has ended.

Results for each task are presented on the task-specific results pages.

Introduction

Sounds carry a large amount of information about our everyday environment and the physical events that take place in it. We can perceive the sound scene we are in (busy street, office, etc.), and recognize individual sound sources (car passing by, footsteps, etc.). Developing signal processing methods to automatically extract this information has huge potential in several applications, for example searching for multimedia based on its audio content, making context-aware mobile devices, robots, cars, etc., and intelligent monitoring systems that recognize activities in their environments using acoustic information. However, a significant amount of research is still needed to reliably recognize sound scenes and individual sound sources in realistic soundscapes, where multiple sounds are present, often simultaneously, and distorted by the environment.

Acoustic scene classification

Scenes Task 1

Subtasks: A (Match), B (Mismatch), C (External)

The goal of acoustic scene classification is to classify a test recording into one of the provided predefined classes that characterize the environment in which it was recorded. Audio data recorded in different large European cities poses a new, challenging problem by introducing more acoustic variability for each class than in previous editions.

Organizers

Annamaria Mesaros

Tuomas Virtanen

Toni Heittola

General-purpose audio tagging of Freesound content with AudioSet labels

Tags Task 2

The task evaluates systems for general-purpose audio tagging with an increased number of categories, using data with annotations of varying reliability. This poses two challenges: classifying sound events of a very diverse nature (including musical instruments, human sounds, domestic sounds, animals, etc.) and leveraging subsets of training data whose annotations differ in quality. The data are audio samples from Freesound, organized according to a subset of the categories of the AudioSet Ontology. This task will provide insight into the development of broadly applicable sound event classifiers that cover a larger and more diverse set of categories. These models can be used, for example, in automatic description of multimedia or in acoustic monitoring applications.

Organizers

Frederic Font Corbera

Universitat Pompeu Fabra

Eduardo Fonseca

Universitat Pompeu Fabra

Daniel P. W. Ellis

Google, Inc.

Manoj Plakal

Google, Inc.

Bird audio detection

Birds Task 3

Detecting bird sounds in audio is an important task for automatic wildlife monitoring, as well as in citizen science and audio library management. Bird sound detection is a commonly required first step before further analysis (e.g. classification, counting), and makes it possible to work with large datasets (e.g. continuous 24h monitoring) by filtering data down to regions of interest.

In order to be relevant to a wide variety of sound monitoring applications, and accessible to a wide range of methods, the bird detection task is deliberately simplified to a binary classification paradigm: within each ten-second time region, are there any birds present?
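The binary framing above can be sketched in a few lines of Python. This is only an illustration of the per-region decision structure, not the challenge baseline or any participant's system: the function names are hypothetical, and the energy-threshold detector is a stand-in assumption for whatever trained classifier a real system would use.

```python
from typing import Callable, List, Sequence

REGION_SECONDS = 10  # region length stated in the task description


def split_into_regions(samples: Sequence[float], sample_rate: int,
                       region_seconds: int = REGION_SECONDS) -> List[Sequence[float]]:
    """Cut a mono signal into consecutive, non-overlapping regions."""
    hop = sample_rate * region_seconds
    return [samples[start:start + hop] for start in range(0, len(samples), hop)]


def energy_detector(region: Sequence[float], threshold: float = 0.01) -> bool:
    """Placeholder detector (assumption): flags a region whose mean power
    exceeds a threshold. A real system would use a trained classifier."""
    if not region:
        return False
    power = sum(x * x for x in region) / len(region)
    return power > threshold


def detect_birds(samples: Sequence[float], sample_rate: int,
                 detector: Callable[[Sequence[float]], bool] = energy_detector) -> List[int]:
    """Return one binary decision per region (1 = present, 0 = absent)."""
    return [int(detector(region)) for region in split_into_regions(samples, sample_rate)]


# Toy signal at an artificially low sample rate of 4 Hz:
# 10 s of silence, 10 s of signal, then a short silent remainder.
toy = [0.0] * 40 + [0.5] * 40 + [0.0] * 10
print(detect_birds(toy, sample_rate=4))  # prints [0, 1, 0]
```

Whatever model replaces the placeholder, the output keeps the same shape: one yes/no answer per ten-second region.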

The major challenge in this task is generalisation. In real applications, the deployment conditions do not match the "training" conditions, and sound analysis algorithms should be able to handle this scenario in order to be practically useful. Hence, we provide development datasets recorded in different parts of the world, and we will use testing data from outdoor monitoring scenarios which do not match the development data. The challenge is to develop an algorithm which inherently generalises well, or which can self-adapt to the new conditions.

Organizers

Dan Stowell

Queen Mary University of London

Hervé Glotin

University of Toulon

Yannis Stylianou

University of Crete

Mike Wood

University of Salford

Large-scale weakly labeled semi-supervised sound event detection in domestic environments

Large-scale Task 4

The task evaluates systems for the large-scale detection of sound events using weakly labeled data. The challenge is to explore the possibility of exploiting a large amount of unbalanced and unlabelled training data together with a small weakly annotated training set to improve system performance. The data are YouTube video excerpts focusing on domestic contexts, which could be used, for example, in ambient assisted living applications. The domain was chosen for its scientific challenges (wide variety of sounds, time-localized events, etc.) and its potential industrial applications.
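To make the term "weakly labeled" concrete, the sketch below contrasts a time-localized (strong) annotation with the clip-level (weak) label derived from it. The `StrongEvent` type, the `to_weak_labels` helper, and the class names are illustrative assumptions, not the challenge's actual annotation format.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class StrongEvent:
    """A time-localized annotation: which event, and when it occurs."""
    label: str
    onset: float   # seconds from the start of the clip
    offset: float  # seconds from the start of the clip


def to_weak_labels(events: List[StrongEvent]) -> FrozenSet[str]:
    """Collapse strong annotations into a weak, clip-level label set.

    The weak label records only which classes occur somewhere in the
    clip; onsets, offsets, and repeat counts are discarded.
    """
    return frozenset(event.label for event in events)


# Hypothetical annotations for one clip (class names are illustrative).
clip_events = [
    StrongEvent("Speech", 0.5, 2.0),
    StrongEvent("Dishes", 1.2, 3.4),
    StrongEvent("Speech", 5.0, 6.1),
]
print(sorted(to_weak_labels(clip_events)))  # prints ['Dishes', 'Speech']
```

A system trained only on such clip-level sets has to infer the timing information that the weak labels leave out.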

Organizers

Romain Serizel

University of Lorraine

Hamid Eghbal-zadeh

Johannes Kepler University

Nicolas Turpault

Inria Nancy Grand-Est

Ankit Parag Shah

Carnegie Mellon University

Monitoring of domestic activities based on multi-channel acoustics

Monitor Task 5

There is a rising interest in smart environments that enhance the quality of life for humans in terms of, for example, safety, security, comfort, and home care. Smart functionality requires situational awareness, which might be obtained by interpreting a multitude of sensing modalities, including acoustics. Acoustic sensing is already used in voice assistants such as Google Home, Apple HomePod, and Amazon Echo. While these devices focus on speech, they could be extended to identify domestic activities carried out by humans. The recognition of activities based on acoustics has already been touched upon in the literature. Yet the acoustic models are typically based on single-channel, single-location recordings. This task investigates to what extent multi-channel acoustic recordings are beneficial for detecting domestic activities.

Organizers

Gert Dekkers

KU Leuven

Peter Karsmakers

KU Leuven

Lode Vuegen

KU Leuven

DCASE2018 Challenge

Schedule

15 Sep 2018

Challenge results

Contact

Recent news

DCASE2018 Challenge results published

DCASE2018 Challenge received 223 entries

DCASE2018 Challenge evaluation datasets available
