The goal of the task is to evaluate systems for the detection of sound events using real data that is either weakly labeled or unlabeled, and simulated data that is strongly labeled (with timestamps).

The challenge has ended. Full results for this task can be found on the Results page.

Description

This task is the follow-up to DCASE 2018 task 4. It evaluates systems for the large-scale detection of sound events using weakly labeled data (without timestamps). Systems must provide not only the event class but also the event time boundaries, given that multiple events can be present in an audio recording. As in 2018, the challenge is to exploit a large amount of unbalanced and unlabeled training data together with a small weakly annotated training set to improve system performance; in addition, a training set with strongly annotated synthetic data is provided. The labels in all the annotated subsets are verified and can be considered reliable. The task also aims to investigate an additional scientific question: do we really need real but partially and weakly annotated data, is synthetic data sufficient, or do we need both?
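
To make the difference between weak and strong labels concrete, the following minimal sketch collapses strong annotations (event class with onset and offset) into weak, clip-level labels. The tuple layout, the file names and the `strong_to_weak` helper are illustrative assumptions, not the official annotation format of the task.

```python
from collections import defaultdict

# Hypothetical strong annotations: (filename, onset_s, offset_s, event_label).
strong = [
    ("clip_a.wav", 0.5, 2.3, "Speech"),
    ("clip_a.wav", 1.0, 9.8, "Dishes"),
    ("clip_a.wav", 4.0, 5.1, "Speech"),
    ("clip_b.wav", 0.0, 10.0, "Vacuum_cleaner"),
]


def strong_to_weak(annotations):
    """Drop the time boundaries and keep the set of classes present in each clip."""
    weak = defaultdict(set)
    for filename, _onset, _offset, label in annotations:
        weak[filename].add(label)
    return {name: sorted(labels) for name, labels in weak.items()}


weak = strong_to_weak(strong)
print(weak)
```

Going in this direction is trivial; the task asks systems to go the other way, recovering time boundaries when most real training clips carry only the clip-level information.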

Figure 1: Overview of a sound event detection system.

Audio dataset

The dataset for this task is composed of 10-second audio clips recorded in domestic environments or synthesized to simulate a domestic environment. The task focuses on 10 classes of sound events that represent a subset of Audioset (not all the classes are present in Audioset, and some classes of sound events include several Audioset classes):

Class	Event label
Speech	Speech
Dog	Dog
Cat	Cat
Alarm/bell/ringing	Alarm_bell_ringing
Dishes	Dishes
Frying	Frying
Blender	Blender
Running water	Running_water
Vacuum cleaner	Vacuum_cleaner
Electric shaver/toothbrush	Electric_shaver_toothbrush
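
As a minimal sketch of how the ten event labels listed above might be used in practice, the snippet below encodes a clip-level (weak) label set as a multi-hot vector. The class order and the `encode_weak` helper are assumptions made for illustration; they are not prescribed by the task.

```python
# Event labels as listed above; the ordering is an arbitrary choice for this sketch.
CLASSES = [
    "Speech", "Dog", "Cat", "Alarm_bell_ringing", "Dishes", "Frying",
    "Blender", "Running_water", "Vacuum_cleaner", "Electric_shaver_toothbrush",
]
CLASS_INDEX = {label: i for i, label in enumerate(CLASSES)}


def encode_weak(labels):
    """Return a 10-dimensional multi-hot vector for a clip-level label set."""
    vector = [0] * len(CLASSES)
    for label in labels:
        vector[CLASS_INDEX[label]] = 1  # raises KeyError on an unknown label
    return vector


encoded = encode_weak(["Speech", "Dishes"])
print(encoded)
```

A multi-hot encoding (rather than one-hot) fits the task because several event classes can be present in the same clip.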

The dataset for DCASE 2019 task 4 is composed of a subset with real recordings (from Audioset) and a subset with synthetic recordings. The datasets used to generate it are described below.

Audioset: Real recordings are extracted from Audioset. Audioset consists of an expanding ontology of 632 sound event classes and a collection of 2 million human-labeled 10-second sound clips (less than 21% are shorter than 10 seconds) drawn from 2 million YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds.

Publication

Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, and Marvin Ritter. Audio Set: An ontology and human-labeled dataset for audio events. In Proc. IEEE ICASSP 2017, New Orleans, LA, 2017.

Coordinators

Romain Serizel, University of Lorraine
Nicolas Turpault, Inria Nancy Grand-Est
Ankit Parag Shah, Carnegie Mellon University
Justin Salamon, Adobe Research
Hamid Eghbal-zadeh, Johannes Kepler University

Class	# unique events
Speech	128
Dog	136
Cat	88
Alarm/bell/ringing	190
Dishes	109
Frying	64
Blender	98
Running water	68
Vacuum cleaner	74
Electric shaver/toothbrush	56
Total	1011

Class	# events
Speech	2132
Dog	516
Cat	547
Alarm/bell/ringing	755
Dishes	814
Frying	137
Blender	540
Running water	157
Vacuum cleaner	204
Electric shaver/toothbrush	230
Total	6032
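
The description above calls the data unbalanced. As a rough illustration, the sketch below computes each class's share of the events in the preceding table; the counts are copied from that table, while the variable names and the simple max/min imbalance ratio are choices made for this example only.

```python
# Event counts per class, copied from the table above.
events = {
    "Speech": 2132, "Dog": 516, "Cat": 547, "Alarm/bell/ringing": 755,
    "Dishes": 814, "Frying": 137, "Blender": 540, "Running water": 157,
    "Vacuum cleaner": 204, "Electric shaver/toothbrush": 230,
}

total = sum(events.values())
shares = {name: count / total for name, count in events.items()}

# Print classes from most to least frequent with their share of all events.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:28s}{share:6.1%}")

# Ratio between the most and the least frequent class.
imbalance = max(events.values()) / min(events.values())
print(f"max/min ratio: {imbalance:.1f}")
```

Speech accounts for roughly a third of the events in this table, which is one reason the task reports class-wise (macro) averages.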

Class	# events
Speech	1662
Dog	577
Cat	340
Alarm/bell/ringing	418
Dishes	492
Frying	91
Blender	96
Running water	230
Vacuum cleaner	92
Electric shaver/toothbrush	65
Total	4093

F-score metrics (macro averaged)
	Validation 2019	Evaluation 2018
Event-based	23.7 %	20.6 %
Segment-based	55.2 %	51.4 %
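
The F-scores above are macro averaged: an F-score is computed for each class and the per-class values are then averaged, so every class weighs the same regardless of how many events it has. The sketch below illustrates this with hypothetical per-class counts of true positives, false positives and false negatives; how those counts are obtained (event-based or segment-based matching) is outside the scope of this example, and the function names are illustrative.

```python
def f_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f_score(counts):
    """Average the per-class F-scores so that every class weighs the same."""
    scores = [f_score(*c) for c in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical per-class (TP, FP, FN) counts, for illustration only.
counts = {
    "Speech": (80, 20, 20),
    "Dog": (10, 10, 30),
    "Frying": (5, 15, 5),
}

macro = macro_f_score(counts)
# Micro average for comparison: pool the counts over classes first.
micro = f_score(*(sum(col) for col in zip(*counts.values())))
print(f"macro F-score: {macro:.3f}, micro F-score: {micro:.3f}")
```

With these made-up counts the macro average is lower than the micro average, because the two poorly detected classes count as much as the frequent, well-detected one.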

Rank	Code	Author	Affiliation	Technical Report	Event-based F-score (Evaluation dataset)
	Wang_NUDT_task4_4	Dezhi Wang	National University of Defense Technology, College of Meteorology and Oceanography, Changsha, China	task-sound-event-detection-in-domestic-environments-results#Wang2019	16.8
	Wang_NUDT_task4_3	Dezhi Wang	National University of Defense Technology, College of Meteorology and Oceanography, Changsha, China	task-sound-event-detection-in-domestic-environments-results#Wang2019	17.5
	Wang_NUDT_task4_2	Dezhi Wang	National University of Defense Technology, College of Meteorology and Oceanography, Changsha, China	task-sound-event-detection-in-domestic-environments-results#Wang2019	17.2
	Wang_NUDT_task4_1	Dezhi Wang	National University of Defense Technology, College of Meteorology and Oceanography, Changsha, China	task-sound-event-detection-in-domestic-environments-results#Wang2019	17.2
	Delphin_OL_task4_2	Lionel Delphin-Poulat	Orange Labs, HOME/CONTENT, Lannion, France	task-sound-event-detection-in-domestic-environments-results#Delphin-Poulat2019	42.1
	Delphin_OL_task4_1	Lionel Delphin-Poulat	Orange Labs, HOME/CONTENT, Lannion, France	task-sound-event-detection-in-domestic-environments-results#Delphin-Poulat2019	38.3
	Kong_SURREY_task4_1	Qiuqiang Kong	University of Surrey, Centre for Vision, Speech and Signal Processing (CVSSP), Guildford, England	task-sound-event-detection-in-domestic-environments-results#Kong2019	22.3
	CTK_NU_task4_2	Teck Kai Chan	Newcastle University, Singapore, Faculty of Science, Agriculture, Engineering, Singapore	task-sound-event-detection-in-domestic-environments-results#Chan2019	29.7
	CTK_NU_task4_3	Teck Kai Chan	Newcastle University, Singapore, Faculty of Science, Agriculture, Engineering, Singapore	task-sound-event-detection-in-domestic-environments-results#Chan2019	27.7
	CTK_NU_task4_4	Teck Kai Chan	Newcastle University, Singapore, Faculty of Science, Agriculture, Engineering, Singapore	task-sound-event-detection-in-domestic-environments-results#Chan2019	26.9
	CTK_NU_task4_1	Teck Kai Chan	Newcastle University, Singapore, Faculty of Science, Agriculture, Engineering, Singapore	task-sound-event-detection-in-domestic-environments-results#Chan2019	31.0
	Mishima_NEC_task4_3	Sakiko Mishima	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Mishima2019	18.3
	Mishima_NEC_task4_4	Sakiko Mishima	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Mishima2019	19.8
	Mishima_NEC_task4_2	Sakiko Mishima	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Mishima2019	17.7
	Mishima_NEC_task4_1	Sakiko Mishima	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Mishima2019	16.7
	CANCES_IRIT_task4_2	Thomas Pellegrini	Université Paul Sabatier Toulouse III, Institut de Recherche en Informatique de Toulouse, Theme 1 - Signal Image, Toulouse, France	task-sound-event-detection-in-domestic-environments-results#Cances2019	28.4
	CANCES_IRIT_task4_2	Thomas Pellegrini	Université Paul Sabatier Toulouse III, Institut de Recherche en Informatique de Toulouse, Theme 1 - Signal Image, Toulouse, France	task-sound-event-detection-in-domestic-environments-results#Cances2019	26.1
	PELLEGRINI_IRIT_task4_1	Thomas Pellegrini	Université Paul Sabatier Toulouse III, Institut de Recherche en Informatique de Toulouse, Theme 1 - Signal Image, Toulouse, France	task-sound-event-detection-in-domestic-environments-results#Cances2019	39.7
	Lin_ICT_task4_2	Liwei Lin	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Lin2019	40.9
	Lin_ICT_task4_4	Liwei Lin	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Lin2019	41.8
	Lin_ICT_task4_3	Liwei Lin	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Lin2019	42.7
	Lin_ICT_task4_1	Liwei Lin	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Lin2019	40.7
	Baseline_dcase2019	Romain Serizel	University of Lorraine, Loria, Department of Natural Language Processing & Knowledge Discovery, Nancy, France	task-sound-event-detection-in-domestic-environments-results#Turpault2019	25.8
	bolun_NWPU_task4_1	Wang bolun	Northwestern Polytechnical University, School of Computer Science, Xi'an, China	task-sound-event-detection-in-domestic-environments-results#Bolun2019	21.7
	bolun_NWPU_task4_4	Wang bolun	Northwestern Polytechnical University, School of Computer Science, Xi'an, China	task-sound-event-detection-in-domestic-environments-results#Bolun2019	25.3
	bolun_NWPU_task4_3	Wang bolun	Northwestern Polytechnical University, School of Computer Science, Xi'an, China	task-sound-event-detection-in-domestic-environments-results#Bolun2019	23.8
	bolun_NWPU_task4_2	Wang bolun	Northwestern Polytechnical University, School of Computer Science, Xi'an, China	task-sound-event-detection-in-domestic-environments-results#Bolun2019	27.8
	Agnone_PDL_task4_1	Anthony Agnone	Pindrop, Audio Research, Atlanta, GA	task-sound-event-detection-in-domestic-environments-results#Agnone2019	25.0
	Kiyokawa_NEC_task4_1	Yu Kiyokawa	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Kiyokawa2019	27.8
	Kiyokawa_NEC_task4_4	Yu Kiyokawa	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Kiyokawa2019	32.4
	Kiyokawa_NEC_task4_3	Yu Kiyokawa	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Kiyokawa2019	29.4
	Kiyokawa_NEC_task4_2	Yu Kiyokawa	NEC Corporation, Central Research Laboratories, Kanagawa, Japan	task-sound-event-detection-in-domestic-environments-results#Kiyokawa2019	28.3
	Kothinti_JHU_task4_2	Sandeep Kothinti	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments-results#Kothinti2019	30.5
	Kothinti_JHU_task4_3	Sandeep Kothinti	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments-results#Kothinti2019	29.0
	Kothinti_JHU_task4_4	Sandeep Kothinti	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments-results#Kothinti2019	29.4
	Kothinti_JHU_task4_1	Sandeep Kothinti	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments-results#Kothinti2019	30.7
	Shi_FRDC_task4_2	Ziqiang Shi	Fujitsu Research and Development Center, Information Technology Laboratory, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Shi2019	42.0
	Shi_FRDC_task4_3	Ziqiang Shi	Fujitsu Research and Development Center, Information Technology Laboratory, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Shi2019	40.9
	Shi_FRDC_task4_4	Ziqiang Shi	Fujitsu Research and Development Center, Information Technology Laboratory, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Shi2019	41.5
	Shi_FRDC_task4_1	Ziqiang Shi	Fujitsu Research and Development Center, Information Technology Laboratory, Beijing, China	task-sound-event-detection-in-domestic-environments-results#Shi2019	37.0
	ZYL_UESTC_task4_1	Zhang Zhenyuan	University of Electronic Science and Technology of China, Department of Internet of Things Engineering, Chengdu, China	task-sound-event-detection-in-domestic-environments-results#Zhang2019	29.4
	ZYL_UESTC_task4_2	Zhang Zhenyuan	University of Electronic Science and Technology of China, Department of Internet of Things Engineering, Chengdu, China	task-sound-event-detection-in-domestic-environments-results#Zhang2019	30.8
	Wang_YSU_task4_1	Qian Yang	Yanshan University, Information Science and Engineering, Qinhuangdao, China	task-sound-event-detection-in-domestic-environments-results#Yang2019	6.5
	Wang_YSU_task4_2	Qian Yang	Yanshan University, Information Science and Engineering, Qinhuangdao, China	task-sound-event-detection-in-domestic-environments-results#Yang2019	6.2
	Wang_YSU_task4_3	Qian Yang	Yanshan University, Information Science and Engineering, Qinhuangdao, China	task-sound-event-detection-in-domestic-environments-results#Yang2019	6.7
	Yan_USTC_task4_1	Jie Yan	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments-results#Yan2019	35.8
	Yan_USTC_task4_3	Jie Yan	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments-results#Yan2019	35.6
	Yan_USTC_task4_4	Jie Yan	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments-results#Yan2019	33.5
	Yan_USTC_task4_2	Jie Yan	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments-results#Yan2019	36.2
	Lee_KNU_task4_2	Seokjin Lee	Kyungpook National University, School of Electronics Engineering, Daegu, Republic of Korea	task-sound-event-detection-in-domestic-environments-results#Lee2019	25.8
	Lee_KNU_task4_4	Seokjin Lee	Kyungpook National University, School of Electronics Engineering, Daegu, Republic of Korea	task-sound-event-detection-in-domestic-environments-results#Lee2019	24.6
	Lee_KNU_task4_3	Seokjin Lee	Kyungpook National University, School of Electronics Engineering, Daegu, Republic of Korea	task-sound-event-detection-in-domestic-environments-results#Lee2019	26.7
	Lee_KNU_task4_1	Seokjin Lee	Kyungpook National University, School of Electronics Engineering, Daegu, Republic of Korea	task-sound-event-detection-in-domestic-environments-results#Lee2019	26.4
	Rakowski_SRPOL_task4_1	Alexander Rakowski	Samsung R&D Institute Poland, Audio Intelligence, Warsaw, Poland	task-sound-event-detection-in-domestic-environments-results#Rakowski2019	24.2
	Lim_ETRI_task4_1	Wootaek Lim	Electronics and Telecommunications Research Institute, Realistic AV Research Group, Daejeon, Korea	task-sound-event-detection-in-domestic-environments-results#Lim2019	32.6
	Lim_ETRI_task4_2	Wootaek Lim	Electronics and Telecommunications Research Institute, Realistic AV Research Group, Daejeon, Korea	task-sound-event-detection-in-domestic-environments-results#Lim2019	33.2
	Lim_ETRI_task4_3	Wootaek Lim	Electronics and Telecommunications Research Institute, Realistic AV Research Group, Daejeon, Korea	task-sound-event-detection-in-domestic-environments-results#Lim2019	32.5
	Lim_ETRI_task4_4	Wootaek Lim	Electronics and Telecommunications Research Institute, Realistic AV Research Group, Daejeon, Korea	task-sound-event-detection-in-domestic-environments-results#Lim2019	34.4
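
The Rank column of the table above is empty in this copy. As a hedged sketch, the snippet below shows how submissions could be ordered by event-based F-score and reduced to the best system per team. It uses only a handful of rows copied from the table, so the ranks it prints apply to this subset only; the official ranking is the one on the Results page. Deriving the team name from the submission code prefix is an assumption of this example.

```python
# A few (submission code, event-based F-score) pairs copied from the table above.
rows = [
    ("Delphin_OL_task4_2", 42.1),
    ("Lin_ICT_task4_4", 41.8),
    ("Lin_ICT_task4_3", 42.7),
    ("Baseline_dcase2019", 25.8),
    ("Shi_FRDC_task4_2", 42.0),
    ("Wang_YSU_task4_3", 6.7),
]

# Order submissions from highest to lowest F-score.
ranked = sorted(rows, key=lambda r: r[1], reverse=True)
for rank, (code, score) in enumerate(ranked, start=1):
    print(f"{rank}\t{code}\t{score:.1f}")

# Keep the best submission per team; the team is taken to be the code prefix
# before "_task4_" (codes without that marker are used unchanged).
best_per_team = {}
for code, score in ranked:
    team = code.split("_task4_")[0]
    best_per_team.setdefault(team, (code, score))
print(best_per_team)
```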


Citation

Sound event detection in domestic environments with weakly labeled data and soundscape synthesis
