The goal of the task is to evaluate systems for the detection of sound events using real data that is either weakly labeled or unlabeled, together with simulated data that is strongly labeled (with timestamps).

The challenge has ended. Full results for this task can be found on the Results page.

If you are interested in the task, you can join us on the dedicated Slack channel.

Description

This task is the follow-up to DCASE 2020 task 4. The task evaluates systems for the detection of sound events using weakly labeled data (without timestamps). Systems must provide not only the event class but also the temporal localization of each event, given that multiple events can be present in an audio recording (see also Fig 1). The challenge remains to exploit a large amount of unbalanced and unlabeled training data together with a small weakly annotated training set to improve system performance. Isolated sound events, background sound files and scripts to design a training set with strongly annotated synthetic data are provided. The labels in all the annotated subsets are verified and can be considered reliable.

Figure 1: Overview of a sound event detection system.

As last year, we encourage participants to propose systems that use sound separation jointly with sound event detection. This step can be used to separate overlapping sound events and extract foreground sound events from the background sound. To motivate participants to explore that direction, we provide a baseline sound separation model that can be used for pre-processing (see also Fig 2).

Figure 2: Example of a sound event detection system with sound separation as a pre-processing step.

Compared to previous years, this task aims to investigate how we can optimally exploit synthetic data, including non-target isolated events for both sound event detection and sound separation.

Audio dataset

The data for DCASE 2021 task 4 consist of several datasets designed for sound event detection and/or sound separation. The datasets are described below.

Audio material

Dataset	Subset	Type	Usage	Annotations	Event type	Sampling frequency
DESED	Real: weakly labeled	Recorded soundscapes	Training	Weak labels (no timestamps)	Target	44.1kHz
	Real: unlabeled	Recorded soundscapes	Training	No annotations	Target	44.1kHz
	Real: validation	Recorded soundscapes	Validation	Strong labels (with timestamps)	Target	44.1kHz
	Real: public evaluation	Recorded soundscapes	Evaluation (do not use this subset to tune hyperparameters)	Strong labels (with timestamps)	Target	44.1kHz
	Synthetic: training	Isolated events + synthetic soundscapes	Training/validation	Strong labels (with timestamps)	Target	16kHz
	Synthetic: evaluation	Isolated events + backgrounds	Evaluation (do not use this subset to tune hyperparameters)	Event level labels (no timestamps)	Target	16kHz
SINS		Background	Training/validation	No annotations	N/A	16kHz
TUT Acoustic scenes 2017, development dataset		Background	Training/validation	No annotations	N/A	44.1kHz
FUSS dataset		Isolated events + synthetic soundscapes	Training/validation	Weak annotations from FSD50K (no timestamps)	Target and non-target	16kHz
FSD50K dataset		Isolated events + recorded soundscapes	Training/validation	Weak annotations (no timestamps)	Target and non-target	44.1kHz
YFCC100M dataset		Recorded soundscapes	Training/validation	No annotations	Sound sources	44.1kHz

If you plan to perform source separation (or to use backgrounds from the SINS dataset), please resample your recorded data to 16kHz. If you use only recorded data and perform only sound event detection, you can use sampling rates up to 44.1kHz, but we strongly encourage participants to use 16kHz as the sampling rate.
Please note that the baselines work on 16kHz data.
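
As a rough illustration of the resampling step mentioned above, the sketch below computes the integer up/down factors and the resulting clip length when converting 44.1kHz audio to 16kHz. The helper name `resample_plan` is hypothetical and is not part of the baseline code. The factors it returns are the kind of arguments a polyphase resampler such as `scipy.signal.resample_poly` expects, but no resampling library is called here.

```python
from fractions import Fraction


def resample_plan(orig_sr: int, target_sr: int, n_samples: int):
    """Return (up, down, n_out) for rational-rate resampling.

    up/down is target_sr/orig_sr reduced to lowest terms, and n_out is
    the number of output samples, rounded up to a whole sample.
    """
    ratio = Fraction(target_sr, orig_sr)  # reduced automatically
    up, down = ratio.numerator, ratio.denominator
    n_out = -(-n_samples * up // down)  # ceiling division on integers
    return up, down, n_out


# A 10-second clip recorded at 44.1kHz, converted to 16kHz.
up, down, n_out = resample_plan(44100, 16000, 10 * 44100)
print(up, down, n_out)
```

Working with the reduced ratio keeps the computation exact in integer arithmetic, which avoids off-by-one clip lengths caused by floating-point rounding.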

DESED dataset

The DESED dataset is the dataset that was used in DCASE 2020 task 4. It is composed of 10-second audio clips that were either recorded in domestic environments or synthesized using Scaper to simulate a domestic environment. The task focuses on 10 classes of sound events that represent a subset of AudioSet (not all the classes are present in AudioSet as such; some sound event classes group several AudioSet classes):

Speech (label: Speech)
Dog (label: Dog)
Cat (label: Cat)
Alarm/bell/ringing (label: Alarm_bell_ringing)
Dishes (label: Dishes)
Frying (label: Frying)
Blender (label: Blender)
Running water (label: Running_water)
Vacuum cleaner (label: Vacuum_cleaner)
Electric shaver/toothbrush (label: Electric_shaver_toothbrush)
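
The list above pairs each class name with the label string used in the annotations. A minimal sketch of how these labels might be handled in code follows. It assumes a fixed class order (here simply the order of the list, which is an arbitrary choice) and a multi-hot vector as the clip-level (weak) target; both are common conventions rather than requirements of the task. The helper names `to_label_id` and `encode_weak` are hypothetical.

```python
# Class names as displayed in the list above, in the same order.
DISPLAY_NAMES = [
    "Speech", "Dog", "Cat", "Alarm/bell/ringing", "Dishes", "Frying",
    "Blender", "Running water", "Vacuum cleaner",
    "Electric shaver/toothbrush",
]


def to_label_id(display_name: str) -> str:
    """Turn a display name into its label string (spaces and slashes become underscores)."""
    return display_name.replace("/", "_").replace(" ", "_")


# Label strings, e.g. "Alarm_bell_ringing", in a fixed (assumed) order.
CLASSES = [to_label_id(name) for name in DISPLAY_NAMES]


def encode_weak(labels):
    """Encode the set of labels present in a clip as a multi-hot list."""
    present = set(labels)
    unknown = present - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in CLASSES]


print(CLASSES)
print(encode_weak(["Dog", "Speech"]))
```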

More information about this dataset and how to generate synthetic soundscapes can be found on the DESED website.

Publication

Nicolas Turpault, Romain Serizel, Ankit Parag Shah, and Justin Salamon. Sound event detection in domestic environments with weakly labeled data and soundscape synthesis. In Workshop on Detection and Classification of Acoustic Scenes and Events. New York City, United States, October 2019. URL: https://hal.inria.fr/hal-02160855.

Coordinators

Romain Serizel (University of Lorraine)
Nicolas Turpault (Inria Nancy Grand-Est)
Francesca Ronchini (Inria Nancy Grand-Est)
Scott Wisdom (Google, Inc.)
Hakan Erdogan (Google, Inc.)
John Hershey (Google, Inc.)
Justin Salamon (Adobe Research)
Prem Seetharaman (Northwestern University)
Eduardo Fonseca (Universitat Pompeu Fabra)
Samuele Cornell (Università Politecnica delle Marche)
Daniel P. W. Ellis (Google, Inc.)

Dataset	GitHub	Download	Automatic download with the script
DESED	DESED github repo	DESED real (zenodo) DESED synthetic (zenodo) DESED public eval (zenodo)	Yes
SINS	SINS github repo	SINS dev (zenodo)	Yes
TUT Acoustic scenes 2017, development dataset		TUT 2017 dev (zenodo)	Yes
FUSS dataset	FUSS github repo	FUSS (zenodo)	Yes
FSD50K dataset		FSD50K (zenodo)	Yes
YFCC100M dataset		YFCC100M website	No

Rank	Code	Author	Affiliation	Technical Report	Ranking score (Evaluation dataset)	PSDS 1 (Evaluation dataset)	PSDS 2 (Evaluation dataset)
	Na_BUPT_task4_SED_1	Tong Na	Beijing, China	task-sound-event-detection-in-domestic-environments#Na2021	0.80	0.245	0.452
	Hafsati_TUITO_task4_SED_3	Mohammed Hafsati	Tuito, Tuito R&D Speech Processing and AI team, La Ciotat, France	task-sound-event-detection-in-domestic-environments#Hafsati2021	0.91	0.287	0.502
	Hafsati_TUITO_task4_SED_4	Mohammed Hafsati	Tuito, Tuito R&D Speech Processing and AI team, La Ciotat, France	task-sound-event-detection-in-domestic-environments#Hafsati2021	0.91	0.287	0.502
	Hafsati_TUITO_task4_SED_1	Mohammed Hafsati	Tuito, Tuito R&D Speech Processing and AI team, La Ciotat, France	task-sound-event-detection-in-domestic-environments#Hafsati2021	1.03	0.334	0.549
	Hafsati_TUITO_task4_SED_2	Mohammed Hafsati	Tuito, Tuito R&D Speech Processing and AI team, La Ciotat, France	task-sound-event-detection-in-domestic-environments#Hafsati2021	1.04	0.336	0.550
	Gong_TAL_task4_SED_3	Yaguang Gong	Tomorrow Advancing Life Education Group, AI Department, Beijing, China	task-sound-event-detection-in-domestic-environments#Gong2021	1.16	0.370	0.626
	Gong_TAL_task4_SED_2	Yaguang Gong	Tomorrow Advancing Life Education Group, AI Department, Beijing, China	task-sound-event-detection-in-domestic-environments#Gong2021	1.15	0.367	0.616
	Gong_TAL_task4_SED_1	Yaguang Gong	Tomorrow Advancing Life Education Group, AI Department, Beijing, China	task-sound-event-detection-in-domestic-environments#Gong2021	1.14	0.364	0.611
	Park_JHU_task4_SED_2	Sangwook Park	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments#Park2021	1.07	0.327	0.603
	Park_JHU_task4_SED_4	Sangwook Park	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments#Park2021	0.86	0.237	0.524
	Park_JHU_task4_SED_1	Sangwook Park	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments#Park2021	1.01	0.305	0.579
	Park_JHU_task4_SED_3	Sangwook Park	Johns Hopkins University, Department of Electrical and Computer Engineering, Baltimore, MD, USA	task-sound-event-detection-in-domestic-environments#Park2021	0.84	0.222	0.537
	Zheng_USTC_task4_SED_4	Xu Zheng	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments#Zheng2021	1.30	0.389	0.742
	Zheng_USTC_task4_SED_1	Xu Zheng	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments#Zheng2021	1.33	0.452	0.669
	Zheng_USTC_task4_SED_3	Xu Zheng	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments#Zheng2021	1.29	0.386	0.746
	Zheng_USTC_task4_SED_2	Xu Zheng	University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China	task-sound-event-detection-in-domestic-environments#Zheng2021	1.33	0.447	0.676
	Nam_KAIST_task4_SED_2	Hyeonuk Nam	Korea Advanced Institute of Science and Technology, Department of Mechanical Engineering, Daejeon, South Korea	task-sound-event-detection-in-domestic-environments#Nam2021	1.19	0.399	0.609
	Nam_KAIST_task4_SED_1	Hyeonuk Nam	Korea Advanced Institute of Science and Technology, Department of Mechanical Engineering, Daejeon, South Korea	task-sound-event-detection-in-domestic-environments#Nam2021	1.16	0.378	0.617
	Nam_KAIST_task4_SED_3	Hyeonuk Nam	Korea Advanced Institute of Science and Technology, Department of Mechanical Engineering, Daejeon, South Korea	task-sound-event-detection-in-domestic-environments#Nam2021	1.09	0.324	0.634
	Nam_KAIST_task4_SED_4	Hyeonuk Nam	Korea Advanced Institute of Science and Technology, Department of Mechanical Engineering, Daejeon, South Korea	task-sound-event-detection-in-domestic-environments#Nam2021	0.75	0.059	0.715
	Koo_SGU_task4_SED_2	Hyejin Koo	Sogang University, Department of Electronic Engineering, Seoul, South Korea	task-sound-event-detection-in-domestic-environments#Koo2021	0.12	0.044	0.059
	Koo_SGU_task4_SED_3	Hyejin Koo	Sogang University, Department of Electronic Engineering, Seoul, South Korea	task-sound-event-detection-in-domestic-environments#Koo2021	0.41	0.058	0.348
	Koo_SGU_task4_SED_1	Hyejin Koo	Sogang University, Department of Electronic Engineering, Seoul, South Korea	task-sound-event-detection-in-domestic-environments#Koo2021	0.74	0.258	0.364
	deBenito_AUDIAS_task4_SED_4	Diego de Benito-Gorron	Universidad Autónoma de Madrid, Escuela Politécnica Superior, Madrid, Spain	task-sound-event-detection-in-domestic-environments#de Benito-Gorron2021	1.10	0.361	0.577
	deBenito_AUDIAS_task4_SED_1	Diego de Benito-Gorron	Universidad Autónoma de Madrid, Escuela Politécnica Superior, Madrid, Spain	task-sound-event-detection-in-domestic-environments#de Benito-Gorron2021	1.07	0.343	0.571
	deBenito_AUDIAS_task4_SED_2	Diego de Benito-Gorron	Universidad Autónoma de Madrid, Escuela Politécnica Superior, Madrid, Spain	task-sound-event-detection-in-domestic-environments#de Benito-Gorron2021	1.10	0.363	0.574
	deBenito_AUDIAS_task4_SED_3	Diego de Benito-Gorron	Universidad Autónoma de Madrid, Escuela Politécnica Superior, Madrid, Spain	task-sound-event-detection-in-domestic-environments#de Benito-Gorron2021	1.07	0.345	0.571
	Baseline_SSep_SED	Nicolas Turpault	Inria Nancy Grand-Est, Department of Natural Language Processing & Knowledge Discovery, Nancy, France	task-sound-event-detection-in-domestic-environments#turpault2020b	1.11	0.364	0.580
	Boes_KUL_task4_SED_4	Wim Boes	KU Leuven, ESAT, Leuven, Belgium	task-sound-event-detection-in-domestic-environments#Boes2021	0.60	0.117	0.457
	Boes_KUL_task4_SED_3	Wim Boes	KU Leuven, ESAT, Leuven, Belgium	task-sound-event-detection-in-domestic-environments#Boes2021	0.68	0.121	0.531
	Boes_KUL_task4_SED_2	Wim Boes	KU Leuven, ESAT, Leuven, Belgium	task-sound-event-detection-in-domestic-environments#Boes2021	0.77	0.233	0.440
	Boes_KUL_task4_SED_1	Wim Boes	KU Leuven, ESAT, Leuven, Belgium	task-sound-event-detection-in-domestic-environments#Boes2021	0.81	0.253	0.442
	Ebbers_UPB_task4_SED_2	Janek Ebbers	Paderborn University, Department of Communications Engineering, Paderborn, Germany	task-sound-event-detection-in-domestic-environments#Ebbers2021	1.10	0.335	0.621
	Ebbers_UPB_task4_SED_4	Janek Ebbers	Paderborn University, Department of Communications Engineering, Paderborn, Germany	task-sound-event-detection-in-domestic-environments#Ebbers2021	1.16	0.363	0.637
	Ebbers_UPB_task4_SED_3	Janek Ebbers	Paderborn University, Department of Communications Engineering, Paderborn, Germany	task-sound-event-detection-in-domestic-environments#Ebbers2021	1.24	0.416	0.635
	Ebbers_UPB_task4_SED_1	Janek Ebbers	Paderborn University, Department of Communications Engineering, Paderborn, Germany	task-sound-event-detection-in-domestic-environments#Ebbers2021	1.16	0.373	0.621
	Zhu_AIAL-XJU_task4_SED_2	Xiujuan Zhu	XinJiang University, Key Laboratory of Signal Detection and Processing, Urumqi, China	task-sound-event-detection-in-domestic-environments#Zhu2021	0.99	0.290	0.574
	Zhu_AIAL-XJU_task4_SED_1	Xiujuan Zhu	XinJiang University, Key Laboratory of Signal Detection and Processing, Urumqi, China	task-sound-event-detection-in-domestic-environments#Zhu2021	1.04	0.318	0.583
	Liu_BUPT_task4_4	Gang Liu	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu2021	0.37	0.102	0.231
	Liu_BUPT_task4_1	Gang Liu	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu2021	0.30	0.090	0.169
	Liu_BUPT_task4_2	Gang Liu	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu2021	0.54	0.152	0.322
	Liu_BUPT_task4_3	Gang Liu	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu2021	0.24	0.068	0.146
	Olvera_INRIA_task4_SED_2	Michel Olvera	Inria Nancy Grand-Est, Department of Information and Communication Sciences and Technologies, Nancy, France	task-sound-event-detection-in-domestic-environments#Olvera2021	0.98	0.338	0.481
	Olvera_INRIA_task4_SED_1	Michel Olvera	Inria Nancy Grand-Est, Department of Information and Communication Sciences and Technologies, Nancy, France	task-sound-event-detection-in-domestic-environments#Olvera2021	0.95	0.332	0.462
	Kim_AiTeR_GIST_SED_4	Nam Kyun Kim	Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, South Korea	task-sound-event-detection-in-domestic-environments#Kim2021	1.32	0.442	0.674
	Kim_AiTeR_GIST_SED_2	Nam Kyun Kim	Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, South Korea	task-sound-event-detection-in-domestic-environments#Kim2021	1.31	0.439	0.667
	Kim_AiTeR_GIST_SED_3	Nam Kyun Kim	Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, South Korea	task-sound-event-detection-in-domestic-environments#Kim2021	1.30	0.434	0.669
	Kim_AiTeR_GIST_SED_1	Nam Kyun Kim	Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, South Korea	task-sound-event-detection-in-domestic-environments#Kim2021	1.29	0.431	0.661
	Cai_SMALLRICE_task4_SED_1	Heinrich Dinkel	Xiaomi Corporation, Technology Committee, Beijing, China	task-sound-event-detection-in-domestic-environments#Dinkel2021	1.11	0.361	0.584
	Cai_SMALLRICE_task4_SED_2	Heinrich Dinkel	Xiaomi Corporation, Technology Committee, Beijing, China	task-sound-event-detection-in-domestic-environments#Dinkel2021	1.13	0.373	0.585
	Cai_SMALLRICE_task4_SED_3	Heinrich Dinkel	Xiaomi Corporation, Technology Committee, Beijing, China	task-sound-event-detection-in-domestic-environments#Dinkel2021	1.13	0.370	0.596
	Cai_SMALLRICE_task4_SED_4	Heinrich Dinkel	Xiaomi Corporation, Technology Committee, Beijing, China	task-sound-event-detection-in-domestic-environments#Dinkel2021	1.00	0.339	0.504
	HangYuChen_Roal_task4_SED_2	Chen HangYu		task-sound-event-detection-in-domestic-environments#HangYu2021	0.90	0.294	0.473
	HangYuChen_Roal_task4_SED_1	Chen YuHang		task-sound-event-detection-in-domestic-environments#YuHang2021	0.61	0.098	0.496
	Yu_NCUT_task4_SED_1	Dongchi Yu	North China University of Technology, Department of Electronic Information Engineering, Beijing, China	task-sound-event-detection-in-domestic-environments#Yu2021	0.20	0.038	0.157
	Yu_NCUT_task4_SED_2	Dongchi Yu	North China University of Technology, Department of Electronic Information Engineering, Beijing, China	task-sound-event-detection-in-domestic-environments#Yu2021	0.92	0.301	0.485
	lu_kwai_task4_SED_1	Rui Lu	Beijing Kuaishou Technology Co., Ltd, AI-Platform, Beijing, China	task-sound-event-detection-in-domestic-environments#Lu2021	1.27	0.419	0.660
	lu_kwai_task4_SED_4	Rui Lu	Beijing Kuaishou Technology Co., Ltd, AI-Platform, Beijing, China	task-sound-event-detection-in-domestic-environments#Lu2021	0.88	0.157	0.685
	lu_kwai_task4_SED_3	Rui Lu	Beijing Kuaishou Technology Co., Ltd, AI-Platform, Beijing, China	task-sound-event-detection-in-domestic-environments#Lu2021	0.86	0.148	0.686
	lu_kwai_task4_SED_2	Rui Lu	Beijing Kuaishou Technology Co., Ltd, AI-Platform, Beijing, China	task-sound-event-detection-in-domestic-environments#Lu2021	1.25	0.412	0.651
	Liu_BUPT_task4_SS_SED_2	Gang Liu_SS	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu_SS2021	0.94	0.302	0.507
	Liu_BUPT_task4_SS_SED_1	Gang Liu_SS	Beijing University of Posts and Telecommunications, Pattern Recognition and Intelligent System Laboratory (PRIS Lab), Beijing, China	task-sound-event-detection-in-domestic-environments#Liu_SS2021	0.94	0.302	0.507
	Tian_ICT-TOSHIBA_task4_SED_2	Gangyi Tian	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments#Tian2021	1.19	0.411	0.585
	Tian_ICT-TOSHIBA_task4_SED_1	Gangyi Tian	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments#Tian2021	1.19	0.413	0.586
	Tian_ICT-TOSHIBA_task4_SED_4	Gangyi Tian	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments#Tian2021	1.19	0.412	0.586
	Tian_ICT-TOSHIBA_task4_SED_3	Gangyi Tian	Institute of Computing Technology, Chinese Academy of Sciences, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Beijing, China	task-sound-event-detection-in-domestic-environments#Tian2021	1.18	0.409	0.584
	Yao_GUET_task4_SED_3	Yu Yao	Guilin University of Electronic Technology, School of Information and Communication, Guilin, China	task-sound-event-detection-in-domestic-environments#Yao2021	0.88	0.279	0.479
	Yao_GUET_task4_SED_1	Yu Yao	Guilin University of Electronic Technology, School of Information and Communication, Guilin, China	task-sound-event-detection-in-domestic-environments#Yao2021	0.88	0.277	0.482
	Yao_GUET_task4_SED_2	Yu Yao	Guilin University of Electronic Technology, School of Information and Communication, Guilin, China	task-sound-event-detection-in-domestic-environments#Yao2021	0.54	0.056	0.496
	Liang_SHNU_task4_SED_4	Yunhao Liang	Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering, Shanghai, China	task-sound-event-detection-in-domestic-environments#Liang2021	0.99	0.313	0.543
	Bajzik_UNIZA_task4_SED_2	Jakub Bajzik	University of Zilina, Department of Mechatronics and Electronics, Zilina, Slovak Republic	task-sound-event-detection-in-domestic-environments#Bajzik2021	1.02	0.330	0.544
	Bajzik_UNIZA_task4_SED_1	Jakub Bajzik	University of Zilina, Department of Mechatronics and Electronics, Zilina, Slovak Republic	task-sound-event-detection-in-domestic-environments#Bajzik2021	0.45	0.133	0.266
	Liang_SHNU_task4_SSep_SED_3	Yunhao Liang_SS	Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering, Shanghai, China	task-sound-event-detection-in-domestic-environments#Liang_SS2021	0.99	0.304	0.559
	Liang_SHNU_task4_SSep_SED_1	Yunhao Liang_SS	Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering, Shanghai, China	task-sound-event-detection-in-domestic-environments#Liang_SS2021	1.03	0.313	0.588
	Liang_SHNU_task4_SSep_SED_2	Yunhao Liang_SS	Shanghai Normal University, The College of Information, Mechanical and Electrical Engineering, Shanghai, China	task-sound-event-detection-in-domestic-environments#Liang_SS2021	1.01	0.325	0.542
	Baseline_SED	Nicolas Turpault	Inria Nancy Grand-Est, Department of Natural Language Processing & Knowledge Discovery, Nancy, France	task-sound-event-detection-in-domestic-environments#Turpault2020a	1.00	0.315	0.547
	Wang_NSYSU_task4_SED_1	Yih Wen Wang	National Sun Yat-sen University, Department of Computer Science and Engineering, Kaohsiung, Taiwan	task-sound-event-detection-in-domestic-environments#Wang2021	1.13	0.336	0.646
	Wang_NSYSU_task4_SED_4	Yih Wen Wang	National Sun Yat-sen University, Department of Computer Science and Engineering, Kaohsiung, Taiwan	task-sound-event-detection-in-domestic-environments#Wang2021	1.09	0.304	0.662
	Wang_NSYSU_task4_SED_2	Yih Wen Wang	National Sun Yat-sen University, Department of Computer Science and Engineering, Kaohsiung, Taiwan	task-sound-event-detection-in-domestic-environments#Wang2021	0.69	0.070	0.636
	Wang_NSYSU_task4_SED_3	Yih Wen Wang	National Sun Yat-sen University, Department of Computer Science and Engineering, Kaohsiung, Taiwan	task-sound-event-detection-in-domestic-environments#Wang2021	1.13	0.339	0.649
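
The ranking scores in the table above appear consistent with averaging the two PSDS values after dividing each by the corresponding Baseline_SED value (0.315 for PSDS 1 and 0.547 for PSDS 2). This is an inference from the tabulated numbers, not a formula stated in this section, so the sketch below should be read as a plausibility check rather than the official definition. The function name `ranking_score` is hypothetical.

```python
# Baseline_SED values taken from the table above.
BASELINE_PSDS1 = 0.315
BASELINE_PSDS2 = 0.547


def ranking_score(psds1: float, psds2: float) -> float:
    """Mean of the two PSDS values, each normalized by the SED baseline.

    Assumed formula, inferred from the results table.
    """
    return 0.5 * (psds1 / BASELINE_PSDS1 + psds2 / BASELINE_PSDS2)


# Compare a few rows of the table with the recomputed score.
rows = {
    "Zheng_USTC_task4_SED_1": (0.452, 0.669),  # table: 1.33
    "Nam_KAIST_task4_SED_4": (0.059, 0.715),   # table: 0.75
    "Baseline_SSep_SED": (0.364, 0.580),       # table: 1.11
}
for code, (p1, p2) in rows.items():
    print(code, round(ranking_score(p1, p2), 2))
```

Under this reading, a system with a very low PSDS 1 can still reach a moderate ranking score through a high PSDS 2, as the Nam_KAIST_task4_SED_4 row shows.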

	PSDS-scenario1	PSDS-scenario2	Intersection-based F1	Collar-based F1
Dev-test	0.342	0.527	76.6%	40.1%
