Task description
The task evaluates systems for the detection of sound events using weakly labeled data (without timestamps). Systems must provide not only the event class but also the event time boundaries, given that multiple events can be present in an audio recording. The challenge remains to exploit a large amount of unbalanced and unlabeled training data together with a small weakly annotated training set to improve system performance. Isolated sound events, background sound files, and scripts to design a training set with strongly annotated synthetic data are provided. The labels in all the annotated subsets are verified and can be considered reliable.
A more detailed task description can be found on the task description page.
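Concretely, most submitted systems produce frame-level class posteriors and obtain the required time boundaries by thresholding and smoothing them; the median-filter window lengths quoted in the machine learning characteristics table below (e.g. 93 ms or 340 ms) refer to this step. A minimal sketch of the idea, assuming a 16 ms frame hop and a 0.5 threshold (illustrative values, not those of any particular submission):

```python
import numpy as np
from scipy.signal import medfilt

def probs_to_events(probs, threshold=0.5, kernel=7, hop_s=0.016):
    """Turn frame-wise posteriors for one class into (onset, offset) pairs in seconds."""
    smoothed = medfilt(probs, kernel_size=kernel)      # suppress spurious frame-level spikes
    active = smoothed > threshold                      # binary activity mask per frame
    edges = np.diff(active.astype(int), prepend=0, append=0)
    onsets = np.where(edges == 1)[0]                   # rising edges: event starts
    offsets = np.where(edges == -1)[0]                 # falling edges: event ends
    return [(on * hop_s, off * hop_s) for on, off in zip(onsets, offsets)]
```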
Systems ranking
Rank | Submission code | Submission name | Technical Report | Sound Separation | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | PSDS 1 (Development dataset) | PSDS 2 (Development dataset)
---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.245 | 0.452 | 0.313 | 0.535 | ||
Hafsati_TUITO_task4_SED_3 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.91 | 0.287 | 0.502 | 0.325 | 0.561 | ||
Hafsati_TUITO_task4_SED_4 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.91 | 0.287 | 0.502 | 0.325 | 0.561 | ||
Hafsati_TUITO_task4_SED_1 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.03 | 0.334 | 0.549 | 0.345 | 0.555 | ||
Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.04 | 0.336 | 0.550 | 0.345 | 0.555 | ||
Gong_TAL_task4_SED_3 | TAL SED system | Gong2021 | 1.16 | 0.370 | 0.626 | 0.407 | 0.653 | ||
Gong_TAL_task4_SED_2 | TAL SED system | Gong2021 | 1.15 | 0.367 | 0.616 | 0.407 | 0.648 | ||
Gong_TAL_task4_SED_1 | TAL SED system | Gong2021 | 1.14 | 0.364 | 0.611 | 0.398 | 0.642 | ||
Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park2021 | 1.07 | 0.327 | 0.603 | 0.524 | 0.674 | ||
Park_JHU_task4_SED_4 | Park_JHU_task4_SED_4 | Park2021 | 0.86 | 0.237 | 0.524 | 0.446 | 0.561 | ||
Park_JHU_task4_SED_1 | Park_JHU_task4_SED_1 | Park2021 | 1.01 | 0.305 | 0.579 | 0.508 | 0.668 | ||
Park_JHU_task4_SED_3 | Park_JHU_task4_SED_3 | Park2021 | 0.84 | 0.222 | 0.537 | 0.456 | 0.596 | ||
Zheng_USTC_task4_SED_4 | DCASE2020 SED Mean teacher system 4 | Zheng2021 | 1.30 | 0.389 | 0.742 | 0.402 | 0.786 | ||
Zheng_USTC_task4_SED_1 | DCASE2020 SED Mean teacher system 1 | Zheng2021 | 1.33 | 0.452 | 0.669 | 0.454 | 0.671 | ||
Zheng_USTC_task4_SED_3 | DCASE2020 SED Mean teacher system 3 | Zheng2021 | 1.29 | 0.386 | 0.746 | 0.397 | 0.788 | ||
Zheng_USTC_task4_SED_2 | DCASE2020 SED Mean teacher system 2 | Zheng2021 | 1.33 | 0.447 | 0.676 | 0.454 | 0.680 | ||
Nam_KAIST_task4_SED_2 | SED_mixupratip=0.8_nband=(2,3)_medianfilter=5 | Nam2021 | 1.19 | 0.399 | 0.609 | 0.434 | 0.639 | ||
Nam_KAIST_task4_SED_1 | SED_default | Nam2021 | 1.16 | 0.378 | 0.617 | 0.423 | 0.658 | ||
Nam_KAIST_task4_SED_3 | SED_AFL | Nam2021 | 1.09 | 0.324 | 0.634 | 0.381 | 0.692 | ||
Nam_KAIST_task4_SED_4 | Weak_SED | Nam2021 | 0.75 | 0.059 | 0.715 | 0.064 | 0.816 | ||
Koo_SGU_task4_SED_2 | DCASE2021 SED system using wav2vec | Koo2021 | 0.12 | 0.044 | 0.059 | 0.316 | 0.337 | ||
Koo_SGU_task4_SED_3 | DCASE2021 SED system using wav2vec | Koo2021 | 0.41 | 0.058 | 0.348 | 0.249 | 0.711 | ||
Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo2021 | 0.74 | 0.258 | 0.364 | 0.295 | 0.503 | ||
deBenito_AUDIAS_task4_SED_4 | 5-Resolution Mean Teacher | de Benito-Gorron2021 | 1.10 | 0.361 | 0.577 | 0.386 | 0.600 | ||
deBenito_AUDIAS_task4_SED_1 | 3-Resolution Mean Teacher | de Benito-Gorron2021 | 1.07 | 0.343 | 0.571 | 0.380 | 0.589 | ||
deBenito_AUDIAS_task4_SED_2 | 3-Resolution Mean Teacher (Higher time resolutions) | de Benito-Gorron2021 | 1.10 | 0.363 | 0.574 | 0.386 | 0.578 | ||
deBenito_AUDIAS_task4_SED_3 | 4-Resolution Mean Teacher | de Benito-Gorron2021 | 1.07 | 0.345 | 0.571 | 0.372 | 0.600 | ||
Baseline_SSep_SED | DCASE2021 SSep SED baseline system | turpault2020b | 1.11 | 0.364 | 0.580 | 0.342 | 0.527 | ||
Boes_KUL_task4_SED_4 | CRNN with optimized pooling operations for scenario 2 (2) | Boes2021 | 0.60 | 0.117 | 0.457 | 0.154 | 0.729 | ||
Boes_KUL_task4_SED_3 | CRNN with optimized pooling operations for scenario 2 (1) | Boes2021 | 0.68 | 0.121 | 0.531 | 0.158 | 0.731 | ||
Boes_KUL_task4_SED_2 | CRNN with optimized pooling operations for scenario 1 (2) | Boes2021 | 0.77 | 0.233 | 0.440 | 0.359 | 0.601 | ||
Boes_KUL_task4_SED_1 | CRNN with optimized pooling operations for scenario 1 (1) | Boes2021 | 0.81 | 0.253 | 0.442 | 0.361 | 0.593 | ||
Ebbers_UPB_task4_SED_2 | UPB sytem 2 | Ebbers2021 | 1.10 | 0.335 | 0.621 | 0.377 | 0.748 | ||
Ebbers_UPB_task4_SED_4 | UPB sytem 4 | Ebbers2021 | 1.16 | 0.363 | 0.637 | 0.393 | 0.758 | ||
Ebbers_UPB_task4_SED_3 | UPB sytem 3 | Ebbers2021 | 1.24 | 0.416 | 0.635 | 0.454 | 0.726 | ||
Ebbers_UPB_task4_SED_1 | UPB sytem 1 | Ebbers2021 | 1.16 | 0.373 | 0.621 | 0.429 | 0.727 | ||
Zhu_AIAL-XJU_task4_SED_2 | Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.99 | 0.290 | 0.574 | 0.342 | 0.614 | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 0.318 | 0.583 | 0.354 | 0.613 | ||
Liu_BUPT_task4_4 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.37 | 0.102 | 0.231 | 0.348 | 0.551 | ||
Liu_BUPT_task4_1 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.30 | 0.090 | 0.169 | 0.348 | 0.551 | ||
Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.54 | 0.152 | 0.322 | 0.348 | 0.551 | ||
Liu_BUPT_task4_3 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.24 | 0.068 | 0.146 | 0.348 | 0.551 | ||
Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera2021 | 0.98 | 0.338 | 0.481 | ||||
Olvera_INRIA_task4_SED_1 | DA-SED + FG/BG | Olvera2021 | 0.95 | 0.332 | 0.462 | ||||
Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim2021 | 1.32 | 0.442 | 0.674 | 0.457 | 0.685 | ||
Kim_AiTeR_GIST_SED_2 | RCRNN-based noisy student SED | Kim2021 | 1.31 | 0.439 | 0.667 | 0.450 | 0.682 | ||
Kim_AiTeR_GIST_SED_3 | RCRNN-based noisy student SED | Kim2021 | 1.30 | 0.434 | 0.669 | 0.451 | 0.679 | ||
Kim_AiTeR_GIST_SED_1 | RCRNN-based noisy student SED | Kim2021 | 1.29 | 0.431 | 0.661 | 0.449 | 0.675 | ||
Cai_SMALLRICE_task4_SED_1 | DCASE2021_Cai_SED_CDur_Ensemble_1 | Dinkel2021 | 1.11 | 0.361 | 0.584 | 0.375 | 0.619 | ||
Cai_SMALLRICE_task4_SED_2 | DCASE2021_Cai_SED_CDur_Ensemble_2 | Dinkel2021 | 1.13 | 0.373 | 0.585 | 0.382 | 0.622 | ||
Cai_SMALLRICE_task4_SED_3 | DCASE2021_Cai_SED_CDur_Ensemble_3 | Dinkel2021 | 1.13 | 0.370 | 0.596 | 0.381 | 0.629 | ||
Cai_SMALLRICE_task4_SED_4 | DCASE2021_Cai_SED_CDur_Single_4 | Dinkel2021 | 1.00 | 0.339 | 0.504 | 0.369 | 0.571 | ||
HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYu2021 | 0.90 | 0.294 | 0.473 | 0.134 | 0.557 | ||
HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | YuHang2021 | 0.61 | 0.098 | 0.496 | 0.340 | 0.523 | ||
Yu_NCUT_task4_SED_1 | multi-scale CRNN | Yu2021 | 0.20 | 0.038 | 0.157 | 0.330 | 0.540 | ||
Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu2021 | 0.92 | 0.301 | 0.485 | 0.110 | 0.610 | ||
lu_kwai_task4_SED_1 | DCASE2021 SED CRNN Model1 | Lu2021 | 1.27 | 0.419 | 0.660 | 0.419 | 0.638 | ||
lu_kwai_task4_SED_4 | DCASE2021 SED Conformer Model2 | Lu2021 | 0.88 | 0.157 | 0.685 | 0.177 | 0.749 | ||
lu_kwai_task4_SED_3 | DCASE2021 SED Conformer Model1 | Lu2021 | 0.86 | 0.148 | 0.686 | 0.173 | 0.752 | ||
lu_kwai_task4_SED_2 | DCASE2021 SED CRNN Model2 | Lu2021 | 1.25 | 0.412 | 0.651 | 0.418 | 0.637 | ||
Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 0.302 | 0.507 | 0.360 | 0.550 | ||
Liu_BUPT_task4_SS_SED_1 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 0.302 | 0.507 | 0.360 | 0.550 | ||
Tian_ICT-TOSHIBA_task4_SED_2 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 0.411 | 0.585 | 0.396 | 0.587 | ||
Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 0.413 | 0.586 | 0.401 | 0.597 | ||
Tian_ICT-TOSHIBA_task4_SED_4 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 0.412 | 0.586 | 0.398 | 0.599 | ||
Tian_ICT-TOSHIBA_task4_SED_3 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.18 | 0.409 | 0.584 | 0.392 | 0.585 | ||
Yao_GUET_task4_SED_3 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.88 | 0.279 | 0.479 | 0.328 | 0.530 | ||
Yao_GUET_task4_SED_1 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.88 | 0.277 | 0.482 | 0.332 | 0.533 | ||
Yao_GUET_task4_SED_2 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.54 | 0.056 | 0.496 | 0.060 | 0.618 | ||
Liang_SHNU_task4_SED_4 | Guided Learning system | Liang2021 | 0.99 | 0.313 | 0.543 | 0.328 | 0.575 | ||
Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik2021 | 1.02 | 0.330 | 0.544 | 0.374 | 0.586 | ||
Bajzik_UNIZA_task4_SED_1 | CAM-based SED system | Bajzik2021 | 0.45 | 0.133 | 0.266 | 0.165 | 0.348 | ||
Liang_SHNU_task4_SSep_SED_3 | Mean teacher system | Liang_SS2021 | 0.99 | 0.304 | 0.559 | 0.426 | 0.726 | ||
Liang_SHNU_task4_SSep_SED_1 | Mean teacher system | Liang_SS2021 | 1.03 | 0.313 | 0.588 | 0.428 | 0.736 | ||
Liang_SHNU_task4_SSep_SED_2 | Mean teacher system | Liang_SS2021 | 1.01 | 0.325 | 0.542 | 0.418 | 0.721 | ||
Baseline_SED | DCASE2021 SED baseline system | turpault2020a | 1.00 | 0.315 | 0.547 | 0.342 | 0.527 | ||
Wang_NSYSU_task4_SED_1 | DCASE2021_SED_A | Wang2021 | 1.13 | 0.336 | 0.646 | 0.407 | 0.703 | ||
Wang_NSYSU_task4_SED_4 | DCASE2021_SED_D | Wang2021 | 1.09 | 0.304 | 0.662 | 0.370 | 0.724 | ||
Wang_NSYSU_task4_SED_2 | DCASE2021_SED_B | Wang2021 | 0.69 | 0.070 | 0.636 | 0.061 | 0.808 | ||
Wang_NSYSU_task4_SED_3 | DCASE2021_SED_C | Wang2021 | 1.13 | 0.339 | 0.649 | 0.388 | 0.672 |
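The ranking score is not defined on this page, but the values above are consistent with averaging each system's two PSDS values after normalising them by the corresponding Baseline_SED scores (0.315 and 0.547 on the evaluation dataset), which is why the baseline's ranking score is exactly 1.00. A minimal sketch under that assumption:

```python
BASELINE_PSDS1, BASELINE_PSDS2 = 0.315, 0.547   # Baseline_SED, evaluation dataset

def ranking_score(psds1: float, psds2: float) -> float:
    """Average of the two PSDS scenarios, each normalised by the baseline."""
    return 0.5 * (psds1 / BASELINE_PSDS1 + psds2 / BASELINE_PSDS2)

# Na_BUPT_task4_SED_1: 0.5 * (0.245/0.315 + 0.452/0.547) = 0.80, matching the table.
```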
Supplementary metrics
Rank | Submission code | Submission name | Technical Report | Sound Separation | PSDS 1 (Evaluation dataset) | PSDS 1 (Public evaluation) | PSDS 1 (Vimeo dataset) | PSDS 2 (Evaluation dataset) | PSDS 2 (Public evaluation) | PSDS 2 (Vimeo dataset) | F-score (Evaluation dataset) | F-score (Public evaluation) | F-score (Vimeo dataset)
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na2021 | 0.245 | 0.269 | 0.185 | 0.452 | 0.485 | 0.354 | 25.0 | 27.5 | 19.5 | ||
Hafsati_TUITO_task4_SED_3 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.287 | 0.321 | 0.207 | 0.502 | 0.547 | 0.386 | 35.7 | 39.2 | 27.4 | ||
Hafsati_TUITO_task4_SED_4 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.287 | 0.322 | 0.209 | 0.502 | 0.547 | 0.387 | 37.2 | 40.9 | 28.0 | ||
Hafsati_TUITO_task4_SED_1 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.334 | 0.370 | 0.249 | 0.549 | 0.591 | 0.437 | 39.5 | 43.8 | 29.0 | ||
Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.336 | 0.374 | 0.249 | 0.550 | 0.591 | 0.440 | 40.9 | 44.9 | 31.3 | ||
Gong_TAL_task4_SED_3 | TAL SED system | Gong2021 | 0.370 | 0.419 | 0.273 | 0.626 | 0.672 | 0.509 | 41.9 | 45.1 | 34.0 | ||
Gong_TAL_task4_SED_2 | TAL SED system | Gong2021 | 0.367 | 0.409 | 0.271 | 0.616 | 0.654 | 0.512 | 42.7 | 45.9 | 34.8 | ||
Gong_TAL_task4_SED_1 | TAL SED system | Gong2021 | 0.364 | 0.409 | 0.266 | 0.611 | 0.661 | 0.486 | 41.5 | 44.9 | 33.0 | ||
Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park2021 | 0.327 | 0.371 | 0.240 | 0.603 | 0.644 | 0.492 | 38.4 | 42.2 | 28.9 | ||
Park_JHU_task4_SED_4 | Park_JHU_task4_SED_4 | Park2021 | 0.237 | 0.267 | 0.174 | 0.524 | 0.568 | 0.417 | 36.9 | 39.8 | 29.6 | ||
Park_JHU_task4_SED_1 | Park_JHU_task4_SED_1 | Park2021 | 0.305 | 0.344 | 0.214 | 0.579 | 0.617 | 0.471 | 34.7 | 37.8 | 26.9 | ||
Park_JHU_task4_SED_3 | Park_JHU_task4_SED_3 | Park2021 | 0.222 | 0.244 | 0.166 | 0.537 | 0.578 | 0.430 | 33.5 | 35.4 | 28.5 | ||
Zheng_USTC_task4_SED_4 | DCASE2020 SED Mean teacher system 4 | Zheng2021 | 0.389 | 0.438 | 0.261 | 0.742 | 0.775 | 0.644 | 49.5 | 54.2 | 36.9 | ||
Zheng_USTC_task4_SED_1 | DCASE2020 SED Mean teacher system 1 | Zheng2021 | 0.452 | 0.517 | 0.318 | 0.669 | 0.725 | 0.530 | 52.3 | 57.4 | 39.2 | ||
Zheng_USTC_task4_SED_3 | DCASE2020 SED Mean teacher system 3 | Zheng2021 | 0.386 | 0.429 | 0.270 | 0.746 | 0.778 | 0.650 | 49.7 | 55.0 | 36.3 | ||
Zheng_USTC_task4_SED_2 | DCASE2020 SED Mean teacher system 2 | Zheng2021 | 0.447 | 0.506 | 0.318 | 0.676 | 0.730 | 0.546 | 52.9 | 57.7 | 40.2 | ||
Nam_KAIST_task4_SED_2 | SED_mixupratip=0.8_nband=(2,3)_medianfilter=5 | Nam2021 | 0.399 | 0.443 | 0.299 | 0.609 | 0.641 | 0.525 | 48.0 | 52.2 | 37.1 | ||
Nam_KAIST_task4_SED_1 | SED_default | Nam2021 | 0.378 | 0.426 | 0.285 | 0.617 | 0.666 | 0.506 | 44.2 | 47.8 | 34.5 | ||
Nam_KAIST_task4_SED_3 | SED_AFL | Nam2021 | 0.324 | 0.364 | 0.235 | 0.634 | 0.672 | 0.536 | 29.3 | 32.3 | 22.7 | ||
Nam_KAIST_task4_SED_4 | Weak_SED | Nam2021 | 0.059 | 0.069 | 0.022 | 0.715 | 0.750 | 0.616 | 12.5 | 13.1 | 11.4 | ||
Koo_SGU_task4_SED_2 | DCASE2021 SED system using wav2vec | Koo2021 | 0.044 | 0.050 | 0.024 | 0.059 | 0.057 | 0.047 | 12.4 | 13.8 | 9.4 | ||
Koo_SGU_task4_SED_3 | DCASE2021 SED system using wav2vec | Koo2021 | 0.058 | 0.060 | 0.048 | 0.348 | 0.406 | 0.257 | 8.5 | 9.0 | 7.3 | ||
Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo2021 | 0.258 | 0.282 | 0.183 | 0.364 | 0.401 | 0.241 | 20.5 | 22.2 | 16.2 | ||
deBenito_AUDIAS_task4_SED_4 | 5-Resolution Mean Teacher | de Benito-Gorron2021 | 0.361 | 0.405 | 0.262 | 0.577 | 0.635 | 0.443 | 42.7 | 46.7 | 32.7 | ||
deBenito_AUDIAS_task4_SED_1 | 3-Resolution Mean Teacher | de Benito-Gorron2021 | 0.343 | 0.387 | 0.245 | 0.571 | 0.628 | 0.439 | 42.6 | 46.4 | 33.2 | ||
deBenito_AUDIAS_task4_SED_2 | 3-Resolution Mean Teacher (Higher time resolutions) | de Benito-Gorron2021 | 0.363 | 0.406 | 0.265 | 0.574 | 0.630 | 0.449 | 43.1 | 47.0 | 33.6 | ||
deBenito_AUDIAS_task4_SED_3 | 4-Resolution Mean Teacher | de Benito-Gorron2021 | 0.345 | 0.383 | 0.255 | 0.571 | 0.628 | 0.438 | 42.2 | 46.4 | 31.6 | ||
Baseline_SSep_SED | DCASE2021 SSep SED baseline system | turpault2020b | 0.364 | 0.407 | 0.283 | 0.580 | 0.627 | 0.471 | 42.0 | 44.9 | 34.7 | ||
Boes_KUL_task4_SED_4 | CRNN with optimized pooling operations for scenario 2 (2) | Boes2021 | 0.117 | 0.131 | 0.078 | 0.457 | 0.500 | 0.346 | 10.6 | 11.8 | 7.9 | ||
Boes_KUL_task4_SED_3 | CRNN with optimized pooling operations for scenario 2 (1) | Boes2021 | 0.121 | 0.139 | 0.081 | 0.531 | 0.555 | 0.435 | 14.0 | 15.9 | 9.3 | ||
Boes_KUL_task4_SED_2 | CRNN with optimized pooling operations for scenario 1 (2) | Boes2021 | 0.233 | 0.266 | 0.143 | 0.440 | 0.489 | 0.310 | 31.2 | 34.4 | 22.6 | ||
Boes_KUL_task4_SED_1 | CRNN with optimized pooling operations for scenario 1 (1) | Boes2021 | 0.253 | 0.290 | 0.150 | 0.442 | 0.483 | 0.319 | 31.0 | 34.7 | 21.3 | ||
Ebbers_UPB_task4_SED_2 | UPB sytem 2 | Ebbers2021 | 0.335 | 0.369 | 0.269 | 0.621 | 0.661 | 0.519 | 54.1 | 57.2 | 46.7 | ||
Ebbers_UPB_task4_SED_4 | UPB sytem 4 | Ebbers2021 | 0.363 | 0.407 | 0.285 | 0.637 | 0.683 | 0.533 | 56.7 | 59.6 | 49.4 | ||
Ebbers_UPB_task4_SED_3 | UPB sytem 3 | Ebbers2021 | 0.416 | 0.455 | 0.328 | 0.635 | 0.684 | 0.519 | 56.7 | 59.6 | 49.4 | ||
Ebbers_UPB_task4_SED_1 | UPB sytem 1 | Ebbers2021 | 0.373 | 0.410 | 0.300 | 0.621 | 0.661 | 0.516 | 54.1 | 57.2 | 46.7 | ||
Zhu_AIAL-XJU_task4_SED_2 | Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.290 | 0.319 | 0.216 | 0.574 | 0.640 | 0.438 | 43.0 | 47.1 | 33.0 | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 0.318 | 0.357 | 0.238 | 0.583 | 0.641 | 0.451 | 40.2 | 43.5 | 32.3 | ||
Liu_BUPT_task4_4 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.102 | 0.123 | 0.043 | 0.231 | 0.244 | 0.165 | 17.5 | 19.4 | 12.9 | ||
Liu_BUPT_task4_1 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.090 | 0.101 | 0.040 | 0.169 | 0.176 | 0.110 | 18.1 | 19.6 | 13.8 | ||
Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.152 | 0.173 | 0.099 | 0.322 | 0.347 | 0.234 | 23.6 | 25.7 | 18.2 | ||
Liu_BUPT_task4_3 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.068 | 0.086 | 0.012 | 0.146 | 0.152 | 0.104 | 15.1 | 16.9 | 10.8 | ||
Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera2021 | 0.338 | 0.382 | 0.218 | 0.481 | 0.528 | 0.357 | 43.4 | 48.4 | 30.0 | ||
Olvera_INRIA_task4_SED_1 | DA-SED + FG/BG | Olvera2021 | 0.332 | 0.375 | 0.205 | 0.462 | 0.506 | 0.333 | 45.5 | 50.2 | 33.1 | ||
Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim2021 | 0.442 | 0.492 | 0.330 | 0.674 | 0.715 | 0.573 | 50.6 | 53.3 | 43.5 | ||
Kim_AiTeR_GIST_SED_2 | RCRNN-based noisy student SED | Kim2021 | 0.439 | 0.492 | 0.319 | 0.667 | 0.710 | 0.564 | 50.5 | 53.3 | 43.0 | ||
Kim_AiTeR_GIST_SED_3 | RCRNN-based noisy student SED | Kim2021 | 0.434 | 0.481 | 0.326 | 0.669 | 0.709 | 0.570 | 49.4 | 52.4 | 41.8 | ||
Kim_AiTeR_GIST_SED_1 | RCRNN-based noisy student SED | Kim2021 | 0.431 | 0.478 | 0.320 | 0.661 | 0.702 | 0.554 | 49.9 | 52.3 | 43.6 | ||
Cai_SMALLRICE_task4_SED_1 | DCASE2021_Cai_SED_CDur_Ensemble_1 | Dinkel2021 | 0.361 | 0.406 | 0.239 | 0.584 | 0.654 | 0.418 | 37.8 | 41.4 | 28.3 | ||
Cai_SMALLRICE_task4_SED_2 | DCASE2021_Cai_SED_CDur_Ensemble_2 | Dinkel2021 | 0.373 | 0.423 | 0.243 | 0.585 | 0.652 | 0.422 | 38.8 | 41.9 | 30.3 | ||
Cai_SMALLRICE_task4_SED_3 | DCASE2021_Cai_SED_CDur_Ensemble_3 | Dinkel2021 | 0.370 | 0.419 | 0.241 | 0.596 | 0.662 | 0.433 | 38.8 | 42.0 | 30.7 | ||
Cai_SMALLRICE_task4_SED_4 | DCASE2021_Cai_SED_CDur_Single_4 | Dinkel2021 | 0.339 | 0.386 | 0.212 | 0.504 | 0.561 | 0.356 | 38.4 | 42.0 | 29.5 | ||
HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYu2021 | 0.294 | 0.327 | 0.205 | 0.473 | 0.510 | 0.350 | 34.2 | 37.8 | 25.5 | ||
HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | YuHang2021 | 0.098 | 0.104 | 0.090 | 0.496 | 0.515 | 0.391 | 10.7 | 11.7 | 8.7 | ||
Yu_NCUT_task4_SED_1 | multi-scale CRNN | Yu2021 | 0.038 | 0.039 | 0.045 | 0.157 | 0.182 | 0.144 | 6.8 | 7.9 | 4.0 | ||
Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu2021 | 0.301 | 0.341 | 0.197 | 0.485 | 0.528 | 0.360 | 34.4 | 37.8 | 25.7 | ||
lu_kwai_task4_SED_1 | DCASE2021 SED CRNN Model1 | Lu2021 | 0.419 | 0.468 | 0.314 | 0.660 | 0.702 | 0.556 | 45.0 | 48.6 | 36.0 | ||
lu_kwai_task4_SED_4 | DCASE2021 SED Conformer Model2 | Lu2021 | 0.157 | 0.177 | 0.125 | 0.685 | 0.714 | 0.598 | 15.7 | 16.6 | 14.6 | ||
lu_kwai_task4_SED_3 | DCASE2021 SED Conformer Model1 | Lu2021 | 0.148 | 0.170 | 0.114 | 0.686 | 0.715 | 0.597 | 15.6 | 16.7 | 14.0 | ||
lu_kwai_task4_SED_2 | DCASE2021 SED CRNN Model2 | Lu2021 | 0.412 | 0.461 | 0.313 | 0.651 | 0.694 | 0.550 | 45.5 | 48.9 | 36.9 | ||
Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.302 | 0.328 | 0.235 | 0.507 | 0.537 | 0.410 | 37.6 | 40.5 | 30.5 | ||
Liu_BUPT_task4_SS_SED_1 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.302 | 0.328 | 0.235 | 0.507 | 0.537 | 0.410 | 38.4 | 40.9 | 32.2 | ||
Tian_ICT-TOSHIBA_task4_SED_2 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 0.411 | 0.462 | 0.307 | 0.585 | 0.639 | 0.473 | 38.3 | 41.2 | 31.6 | ||
Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 0.413 | 0.468 | 0.306 | 0.586 | 0.640 | 0.473 | 38.3 | 41.2 | 31.6 | ||
Tian_ICT-TOSHIBA_task4_SED_4 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 0.412 | 0.467 | 0.306 | 0.586 | 0.639 | 0.473 | 38.3 | 41.2 | 31.6 | ||
Tian_ICT-TOSHIBA_task4_SED_3 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 0.409 | 0.456 | 0.307 | 0.584 | 0.637 | 0.472 | 38.3 | 41.2 | 31.6 | ||
Yao_GUET_task4_SED_3 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.279 | 0.312 | 0.197 | 0.479 | 0.526 | 0.357 | 34.2 | 37.1 | 27.4 | ||
Yao_GUET_task4_SED_1 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.277 | 0.305 | 0.215 | 0.482 | 0.510 | 0.388 | 31.9 | 34.2 | 26.4 | ||
Yao_GUET_task4_SED_2 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.056 | 0.064 | 0.048 | 0.496 | 0.529 | 0.389 | 8.9 | 9.5 | 7.5 | ||
Liang_SHNU_task4_SED_4 | Guided Learning system | Liang2021 | 0.313 | 0.349 | 0.226 | 0.543 | 0.589 | 0.422 | 36.0 | 39.5 | 27.5 | ||
Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik2021 | 0.330 | 0.383 | 0.216 | 0.544 | 0.602 | 0.398 | 39.8 | 43.7 | 30.1 | ||
Bajzik_UNIZA_task4_SED_1 | CAM-based SED system | Bajzik2021 | 0.133 | 0.140 | 0.081 | 0.266 | 0.259 | 0.219 | 13.7 | 15.2 | 9.7 | ||
Liang_SHNU_task4_SSep_SED_3 | Mean teacher system | Liang_SS2021 | 0.304 | 0.345 | 0.218 | 0.559 | 0.604 | 0.441 | 34.2 | 37.0 | 27.8 | ||
Liang_SHNU_task4_SSep_SED_1 | Mean teacher system | Liang_SS2021 | 0.313 | 0.348 | 0.235 | 0.588 | 0.639 | 0.462 | 34.6 | 38.1 | 26.5 | ||
Liang_SHNU_task4_SSep_SED_2 | Mean teacher system | Liang_SS2021 | 0.325 | 0.371 | 0.240 | 0.542 | 0.600 | 0.408 | 37.0 | 40.5 | 28.7 | ||
Baseline_SED | DCASE2021 SED baseline system | turpault2020a | 0.315 | 0.359 | 0.222 | 0.547 | 0.596 | 0.407 | 37.3 | 40.8 | 29.7 | ||
Wang_NSYSU_task4_SED_1 | DCASE2021_SED_A | Wang2021 | 0.336 | 0.379 | 0.253 | 0.646 | 0.692 | 0.537 | 43.0 | 47.3 | 32.3 | ||
Wang_NSYSU_task4_SED_4 | DCASE2021_SED_D | Wang2021 | 0.304 | 0.340 | 0.233 | 0.662 | 0.710 | 0.554 | 38.2 | 41.3 | 30.4 | ||
Wang_NSYSU_task4_SED_2 | DCASE2021_SED_B | Wang2021 | 0.070 | 0.081 | 0.050 | 0.636 | 0.672 | 0.552 | 9.9 | 10.1 | 9.5 | ||
Wang_NSYSU_task4_SED_3 | DCASE2021_SED_C | Wang2021 | 0.339 | 0.384 | 0.251 | 0.649 | 0.698 | 0.540 | 43.0 | 46.4 | 34.4 |
Teams ranking
Table including only the best ranking score per submitting team; a team's best PSDS 1 and best PSDS 2 submissions may be different systems, hence the separate submission code columns for each scenario.
Rank | Submission code (PSDS 1) | Submission name (PSDS 1) | Submission code (PSDS 2) | Submission name (PSDS 2) | Technical Report | Sound Separation | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset)
---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.245 | 0.452 | ||
Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.04 | 0.336 | 0.550 | ||
Gong_TAL_task4_SED_3 | TAL SED system | Gong_TAL_task4_SED_3 | TAL SED system | Gong2021 | 1.16 | 0.370 | 0.626 | ||
Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park2021 | 1.07 | 0.327 | 0.603 | ||
Zheng_USTC_task4_SED_1 | DCASE2020 SED Mean teacher system 1 | Zheng_USTC_task4_SED_3 | DCASE2020 SED Mean teacher system 3 | Zheng2021 | 1.40 | 0.452 | 0.746 | ||
Nam_KAIST_task4_SED_2 | SED_mixupratip=0.8_nband=(2,3)_medianfilter=5 | Nam_KAIST_task4_SED_4 | Weak_SED | Nam2021 | 1.29 | 0.399 | 0.715 | ||
Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo2021 | 0.74 | 0.258 | 0.364 | ||
deBenito_AUDIAS_task4_SED_2 | 3-Resolution Mean Teacher (Higher time resolutions) | deBenito_AUDIAS_task4_SED_4 | 5-Resolution Mean Teacher | de Benito-Gorron2021 | 1.10 | 0.363 | 0.577 | ||
Baseline_SSep_SED | DCASE2021 SSep SED baseline system | Baseline_SSep_SED | DCASE2021 SSep SED baseline system | turpault2020b | 1.11 | 0.364 | 0.580 | ||
Boes_KUL_task4_SED_1 | CRNN with optimized pooling operations for scenario 1 (1) | Boes_KUL_task4_SED_3 | CRNN with optimized pooling operations for scenario 2 (1) | Boes2021 | 0.89 | 0.253 | 0.531 | ||
Ebbers_UPB_task4_SED_3 | UPB sytem 3 | Ebbers_UPB_task4_SED_4 | UPB sytem 4 | Ebbers2021 | 1.24 | 0.416 | 0.637 | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 0.318 | 0.583 | ||
Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.54 | 0.152 | 0.322 | ||
Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera2021 | 0.98 | 0.338 | 0.481 | ||
Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim2021 | 1.32 | 0.442 | 0.674 | ||
Cai_SMALLRICE_task4_SED_2 | DCASE2021_Cai_SED_CDur_Ensemble_2 | Cai_SMALLRICE_task4_SED_3 | DCASE2021_Cai_SED_CDur_Ensemble_3 | Dinkel2021 | 1.14 | 0.373 | 0.596 | ||
HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYu2021 | 0.90 | 0.294 | 0.473 | ||
HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | YuHang2021 | 0.61 | 0.098 | 0.496 | ||
Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu2021 | 0.92 | 0.301 | 0.485 | ||
lu_kwai_task4_SED_1 | DCASE2021 SED CRNN Model1 | lu_kwai_task4_SED_3 | DCASE2021 SED Conformer Model1 | Lu2021 | 1.29 | 0.419 | 0.686 | ||
Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 0.302 | 0.507 | ||
Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 0.413 | 0.586 | ||
Yao_GUET_task4_SED_3 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao_GUET_task4_SED_2 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.90 | 0.279 | 0.496 | ||
Liang_SHNU_task4_SED_4 | Guided Learning system | Liang_SHNU_task4_SED_4 | Guided Learning system | Liang2021 | 0.99 | 0.313 | 0.543 | ||
Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik2021 | 1.02 | 0.330 | 0.544 | ||
Liang_SHNU_task4_SSep_SED_2 | Mean teacher system | Liang_SHNU_task4_SSep_SED_1 | Mean teacher system | Liang_SS2021 | 1.05 | 0.325 | 0.588 | ||
Baseline_SED | DCASE2021 SED baseline system | Baseline_SED | DCASE2021 SED baseline system | turpault2020a | 1.00 | 0.315 | 0.547 | ||
Wang_NSYSU_task4_SED_3 | DCASE2021_SED_C | Wang_NSYSU_task4_SED_4 | DCASE2021_SED_D | Wang2021 | 1.14 | 0.339 | 0.662 |
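Consistent with the note above, the per-team scores appear to apply the per-submission formula to a team's best PSDS 1 and best PSDS 2 values taken separately. A sketch under the same normalisation assumption:

```python
def team_ranking_score(psds1_scores, psds2_scores):
    """Best PSDS 1 and best PSDS 2 may come from different submissions of a team."""
    return 0.5 * (max(psds1_scores) / 0.315 + max(psds2_scores) / 0.547)

# Zheng_USTC: best PSDS 1 = 0.452 (SED_1), best PSDS 2 = 0.746 (SED_3)
# -> 0.5 * (0.452/0.315 + 0.746/0.547) = 1.40, matching the teams table.
```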
Supplementary metrics
Rank | Submission code (PSDS 1) | Submission name (PSDS 1) | Submission code (PSDS 2) | Submission name (PSDS 2) | Technical Report | Sound Separation | Ranking score (Evaluation dataset) | Ranking score (Public evaluation) | Ranking score (Vimeo dataset)
---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.78 | 0.85 | ||
Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.04 | 1.02 | 1.10 | ||
Gong_TAL_task4_SED_3 | TAL SED system | Gong_TAL_task4_SED_3 | TAL SED system | Gong2021 | 1.16 | 1.15 | 1.24 | ||
Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park2021 | 1.07 | 1.06 | 1.15 | ||
Zheng_USTC_task4_SED_1 | DCASE2020 SED Mean teacher system 1 | Zheng_USTC_task4_SED_3 | DCASE2020 SED Mean teacher system 3 | Zheng2021 | 1.40 | 1.37 | 1.52 | ||
Nam_KAIST_task4_SED_2 | SED_mixupratip=0.8_nband=(2,3)_medianfilter=5 | Nam_KAIST_task4_SED_4 | Weak_SED | Nam2021 | 1.29 | 1.25 | 1.43 | ||
Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo2021 | 0.74 | 0.73 | 0.71 | ||
deBenito_AUDIAS_task4_SED_2 | 3-Resolution Mean Teacher (Higher time resolutions) | deBenito_AUDIAS_task4_SED_4 | 5-Resolution Mean Teacher | de Benito-Gorron2021 | 1.10 | 1.10 | 1.14 | ||
Baseline_SSep_SED | DCASE2021 SSep SED baseline system | Baseline_SSep_SED | DCASE2021 SSep SED baseline system | turpault2020b | 1.11 | 1.09 | 1.22 | ||
Boes_KUL_task4_SED_1 | CRNN with optimized pooling operations for scenario 1 (1) | Boes_KUL_task4_SED_3 | CRNN with optimized pooling operations for scenario 2 (1) | Boes2021 | 0.89 | 0.87 | 0.87 | ||
Ebbers_UPB_task4_SED_3 | UPB sytem 3 | Ebbers_UPB_task4_SED_4 | UPB sytem 4 | Ebbers2021 | 1.24 | 1.21 | 1.39 | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 1.03 | 1.09 | ||
Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.54 | 0.53 | 0.51 | ||
Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera2021 | 0.98 | 0.97 | 0.93 | ||
Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim2021 | 1.32 | 1.28 | 1.45 | ||
Cai_SMALLRICE_task4_SED_2 | DCASE2021_Cai_SED_CDur_Ensemble_2 | Cai_SMALLRICE_task4_SED_3 | DCASE2021_Cai_SED_CDur_Ensemble_3 | Dinkel2021 | 1.14 | 1.14 | 1.08 | ||
HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYu2021 | 0.90 | 0.88 | 0.89 | ||
HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | YuHang2021 | 0.61 | 0.58 | 0.68 | ||
Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu2021 | 0.92 | 0.92 | 0.89 | ||
lu_kwai_task4_SED_1 | DCASE2021 SED CRNN Model1 | lu_kwai_task4_SED_3 | DCASE2021 SED Conformer Model1 | Lu2021 | 1.29 | 1.25 | 1.44 | ||
Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 0.91 | 1.03 | ||
Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 1.19 | 1.27 | ||
Yao_GUET_task4_SED_3 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao_GUET_task4_SED_2 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.90 | 0.88 | 0.92 | ||
Liang_SHNU_task4_SED_4 | Guided Learning system | Liang_SHNU_task4_SED_4 | Guided Learning system | Liang2021 | 0.99 | 0.98 | 1.03 | ||
Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik2021 | 1.02 | 1.04 | 0.98 | ||
Liang_SHNU_task4_SSep_SED_2 | Mean teacher system | Liang_SHNU_task4_SSep_SED_1 | Mean teacher system | Liang_SS2021 | 1.05 | 1.05 | 1.11 | ||
Baseline_SED | DCASE2021 SED baseline system | Baseline_SED | DCASE2021 SED baseline system | turpault2020a | 1.00 | 1.00 | 1.00 | ||
Wang_NSYSU_task4_SED_3 | DCASE2021_SED_C | Wang_NSYSU_task4_SED_4 | DCASE2021_SED_D | Wang2021 | 1.14 | 1.13 | 1.25 |
Class-wise performance
Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | Alarm/bell ringing | Blender | Cat | Dishes | Dog | Electric shaver/toothbrush | Frying | Running water | Speech | Vacuum cleaner
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 23.5 | 28.6 | 42.5 | 25.0 | 16.5 | 15.7 | 19.3 | 19.9 | 35.4 | 23.2 | |
Hafsati_TUITO_task4_SED_3 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.91 | 25.0 | 39.7 | 53.7 | 19.4 | 28.3 | 39.7 | 38.3 | 25.1 | 49.3 | 38.2 | |
Hafsati_TUITO_task4_SED_4 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 0.91 | 25.7 | 40.8 | 50.7 | 26.5 | 28.8 | 39.7 | 42.6 | 26.4 | 49.7 | 41.0 | |
Hafsati_TUITO_task4_SED_1 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.03 | 30.4 | 38.1 | 63.7 | 27.6 | 29.1 | 35.6 | 37.0 | 28.4 | 52.7 | 52.3 | |
Hafsati_TUITO_task4_SED_2 | TASK AWARE SOUND EVENT DETECTION BASED ON SEMI-SUPERVISED CRNN WITH SKIP CONNECTIONS DCASE 2021 CHALLENGE, TASK 4 | Hafsati2021 | 1.04 | 32.8 | 39.0 | 63.3 | 28.9 | 32.8 | 39.7 | 41.0 | 27.9 | 51.5 | 52.6 | |
Gong_TAL_task4_SED_3 | TAL SED system | Gong2021 | 1.16 | 33.3 | 49.8 | 61.9 | 34.6 | 31.6 | 39.8 | 41.8 | 26.9 | 45.0 | 54.3 | |
Gong_TAL_task4_SED_2 | TAL SED system | Gong2021 | 1.15 | 35.0 | 48.1 | 62.1 | 33.9 | 36.3 | 40.3 | 41.4 | 28.1 | 45.8 | 55.7 | |
Gong_TAL_task4_SED_1 | TAL SED system | Gong2021 | 1.14 | 33.9 | 48.7 | 61.3 | 34.7 | 29.3 | 42.4 | 39.8 | 27.7 | 44.1 | 53.1 | |
Park_JHU_task4_SED_2 | Park_JHU_task4_SED_2 | Park2021 | 1.07 | 25.7 | 41.8 | 52.2 | 10.1 | 27.2 | 40.4 | 47.7 | 36.7 | 58.0 | 44.3 | |
Park_JHU_task4_SED_4 | Park_JHU_task4_SED_4 | Park2021 | 0.86 | 25.9 | 42.1 | 33.4 | 34.0 | 17.2 | 38.9 | 50.0 | 35.9 | 50.6 | 41.5 | |
Park_JHU_task4_SED_1 | Park_JHU_task4_SED_1 | Park2021 | 1.01 | 22.8 | 40.3 | 43.6 | 8.2 | 22.9 | 33.5 | 43.2 | 34.8 | 58.5 | 39.0 | |
Park_JHU_task4_SED_3 | Park_JHU_task4_SED_3 | Park2021 | 0.84 | 21.8 | 40.2 | 24.4 | 32.2 | 13.9 | 35.1 | 44.4 | 33.3 | 51.5 | 37.8 | |
Zheng_USTC_task4_SED_4 | DCASE2020 SED Mean teacher system 4 | Zheng2021 | 1.30 | 36.1 | 53.3 | 70.4 | 18.8 | 45.7 | 58.2 | 40.4 | 32.5 | 70.6 | 68.7 | |
Zheng_USTC_task4_SED_1 | DCASE2020 SED Mean teacher system 1 | Zheng2021 | 1.33 | 41.4 | 54.1 | 72.5 | 29.4 | 47.8 | 60.1 | 49.2 | 33.7 | 69.5 | 65.5 | |
Zheng_USTC_task4_SED_3 | DCASE2020 SED Mean teacher system 3 | Zheng2021 | 1.29 | 36.4 | 52.5 | 70.9 | 20.9 | 42.9 | 59.0 | 43.3 | 34.1 | 68.7 | 68.7 | |
Zheng_USTC_task4_SED_2 | DCASE2020 SED Mean teacher system 2 | Zheng2021 | 1.33 | 36.6 | 55.1 | 75.3 | 29.8 | 45.6 | 55.7 | 53.6 | 38.6 | 69.3 | 69.5 | |
Nam_KAIST_task4_SED_2 | SED_mixupratip=0.8_nband=(2,3)_medianfilter=5 | Nam2021 | 1.19 | 34.2 | 55.4 | 70.5 | 39.6 | 46.2 | 44.7 | 36.2 | 39.3 | 55.7 | 58.6 | |
Nam_KAIST_task4_SED_1 | SED_default | Nam2021 | 1.16 | 28.6 | 58.3 | 69.8 | 30.3 | 37.0 | 38.1 | 37.8 | 35.7 | 51.7 | 54.6 | |
Nam_KAIST_task4_SED_3 | SED_AFL | Nam2021 | 1.09 | 27.9 | 36.9 | 25.2 | 9.8 | 7.2 | 30.0 | 32.8 | 33.0 | 40.6 | 49.8 | |
Nam_KAIST_task4_SED_4 | Weak_SED | Nam2021 | 0.75 | 5.3 | 3.3 | 0.5 | 0.0 | 0.3 | 13.6 | 43.2 | 23.5 | 0.3 | 35.0 | |
Koo_SGU_task4_SED_2 | DCASE2021 SED system using wav2vec | Koo2021 | 0.12 | 0.0 | 20.5 | 9.9 | 1.0 | 2.3 | 12.5 | 20.0 | 15.7 | 26.8 | 14.8 | |
Koo_SGU_task4_SED_3 | DCASE2021 SED system using wav2vec | Koo2021 | 0.41 | 2.5 | 7.7 | 2.2 | 0.8 | 1.2 | 7.8 | 22.5 | 15.3 | 1.8 | 23.2 | |
Koo_SGU_task4_SED_1 | DCASE2021 SED system using wav2vec | Koo2021 | 0.74 | 15.4 | 23.5 | 30.5 | 15.1 | 20.6 | 21.1 | 21.1 | 18.5 | 19.0 | 20.0 | |
deBenito_AUDIAS_task4_SED_4 | 5-Resolution Mean Teacher | de Benito-Gorron2021 | 1.10 | 37.4 | 57.1 | 63.8 | 24.2 | 34.5 | 30.0 | 46.8 | 25.9 | 49.8 | 57.3 | |
deBenito_AUDIAS_task4_SED_1 | 3-Resolution Mean Teacher | de Benito-Gorron2021 | 1.07 | 37.6 | 58.1 | 63.1 | 23.9 | 34.2 | 35.4 | 43.5 | 29.8 | 49.3 | 51.3 | |
deBenito_AUDIAS_task4_SED_2 | 3-Resolution Mean Teacher (Higher time resolutions) | de Benito-Gorron2021 | 1.10 | 37.1 | 51.4 | 63.9 | 26.0 | 36.9 | 28.9 | 46.9 | 30.5 | 52.0 | 57.5 | |
deBenito_AUDIAS_task4_SED_3 | 4-Resolution Mean Teacher | de Benito-Gorron2021 | 1.07 | 36.2 | 57.6 | 63.1 | 24.4 | 34.8 | 35.0 | 41.9 | 27.1 | 48.2 | 53.5 | |
Baseline_SSep_SED | DCASE2021 SSep SED baseline system | turpault2020b | 1.11 | 36.7 | 47.4 | 66.3 | 33.1 | 40.5 | 34.8 | 37.2 | 21.5 | 53.0 | 49.3 | |
Boes_KUL_task4_SED_4 | CRNN with optimized pooling operations for scenario 2 (2) | Boes2021 | 0.60 | 3.7 | 24.2 | 1.4 | 0.0 | 0.6 | 13.9 | 23.7 | 12.6 | 6.2 | 19.9 | |
Boes_KUL_task4_SED_3 | CRNN with optimized pooling operations for scenario 2 (1) | Boes2021 | 0.68 | 6.6 | 16.5 | 1.4 | 0.0 | 0.3 | 21.3 | 33.0 | 18.7 | 6.6 | 35.1 | |
Boes_KUL_task4_SED_2 | CRNN with optimized pooling operations for scenario 1 (2) | Boes2021 | 0.77 | 16.9 | 32.9 | 63.1 | 7.7 | 19.4 | 25.6 | 32.6 | 14.8 | 51.8 | 47.7 | |
Boes_KUL_task4_SED_1 | CRNN with optimized pooling operations for scenario 1 (1) | Boes2021 | 0.81 | 19.0 | 29.0 | 59.1 | 7.7 | 20.9 | 34.5 | 24.8 | 13.0 | 54.0 | 47.9 | |
Ebbers_UPB_task4_SED_2 | UPB sytem 2 | Ebbers2021 | 1.10 | 37.2 | 60.8 | 73.0 | 24.2 | 45.6 | 58.5 | 65.9 | 36.9 | 65.0 | 73.5 | |
Ebbers_UPB_task4_SED_4 | UPB sytem 4 | Ebbers2021 | 1.16 | 39.2 | 61.7 | 74.2 | 33.4 | 46.6 | 57.1 | 64.0 | 45.4 | 67.6 | 77.6 | |
Ebbers_UPB_task4_SED_3 | UPB sytem 3 | Ebbers2021 | 1.24 | 39.2 | 61.7 | 74.2 | 33.4 | 46.6 | 57.1 | 64.0 | 45.4 | 67.6 | 77.6 | |
Ebbers_UPB_task4_SED_1 | UPB sytem 1 | Ebbers2021 | 1.16 | 37.2 | 60.8 | 73.0 | 24.2 | 45.6 | 58.5 | 65.9 | 36.9 | 65.0 | 73.5 | |
Zhu_AIAL-XJU_task4_SED_2 | Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.99 | 30.0 | 46.3 | 63.3 | 23.6 | 16.8 | 44.2 | 47.5 | 40.9 | 59.4 | 57.6 | |
Zhu_AIAL-XJU_task4_SED_1 | Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 31.8 | 48.2 | 58.3 | 28.5 | 26.7 | 37.4 | 48.6 | 36.0 | 51.1 | 35.3 | |
Liu_BUPT_task4_4 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.37 | 14.0 | 26.2 | 40.2 | 9.3 | 16.4 | 18.6 | 7.7 | 5.0 | 26.4 | 11.4 | |
Liu_BUPT_task4_1 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.30 | 13.1 | 18.1 | 49.0 | 8.6 | 19.3 | 20.0 | 6.4 | 6.2 | 28.6 | 11.4 | |
Liu_BUPT_task4_2 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.54 | 19.6 | 30.4 | 36.5 | 14.5 | 18.5 | 29.0 | 18.6 | 11.8 | 30.1 | 27.2 | |
Liu_BUPT_task4_3 | DCASE2020 liuliuliufangzhou system | Liu2021 | 0.24 | 16.5 | 19.0 | 37.2 | 6.2 | 19.3 | 13.7 | 3.5 | 6.8 | 25.0 | 3.8 | |
Olvera_INRIA_task4_SED_2 | SED ensemble 2 OT + FG/BG | Olvera2021 | 0.98 | 46.0 | 47.8 | 63.5 | 23.2 | 39.1 | 51.1 | 20.4 | 27.0 | 62.2 | 53.4 | |
Olvera_INRIA_task4_SED_1 | DA-SED + FG/BG | Olvera2021 | 0.95 | 43.7 | 52.3 | 63.6 | 30.0 | 40.8 | 52.6 | 24.4 | 26.6 | 63.9 | 56.9 | |
Kim_AiTeR_GIST_SED_4 | RCRNN-based noisy student SED | Kim2021 | 1.32 | 34.7 | 59.8 | 71.6 | 40.4 | 47.3 | 26.2 | 61.8 | 32.8 | 64.9 | 66.7 | |
Kim_AiTeR_GIST_SED_2 | RCRNN-based noisy student SED | Kim2021 | 1.31 | 37.9 | 57.4 | 72.9 | 41.8 | 46.8 | 25.2 | 60.5 | 36.9 | 64.3 | 60.8 | |
Kim_AiTeR_GIST_SED_3 | RCRNN-based noisy student SED | Kim2021 | 1.30 | 37.4 | 55.4 | 71.9 | 41.0 | 44.6 | 26.5 | 59.5 | 32.3 | 64.6 | 61.1 | |
Kim_AiTeR_GIST_SED_1 | RCRNN-based noisy student SED | Kim2021 | 1.29 | 33.0 | 57.1 | 70.0 | 42.5 | 49.6 | 28.2 | 60.6 | 31.3 | 65.0 | 62.3 | |
Cai_SMALLRICE_task4_SED_1 | DCASE2021_Cai_SED_CDur_Ensemble_1 | Dinkel2021 | 1.11 | 37.0 | 32.2 | 55.9 | 31.2 | 20.4 | 37.8 | 33.8 | 23.9 | 60.1 | 45.3 | |
Cai_SMALLRICE_task4_SED_2 | DCASE2021_Cai_SED_CDur_Ensemble_2 | Dinkel2021 | 1.13 | 37.8 | 37.4 | 53.8 | 31.8 | 22.1 | 35.9 | 32.3 | 28.9 | 61.3 | 46.6 | |
Cai_SMALLRICE_task4_SED_3 | DCASE2021_Cai_SED_CDur_Ensemble_3 | Dinkel2021 | 1.13 | 36.6 | 36.6 | 55.7 | 31.7 | 21.2 | 36.1 | 37.7 | 25.0 | 61.1 | 46.6 | |
Cai_SMALLRICE_task4_SED_4 | DCASE2021_Cai_SED_CDur_Single_4 | Dinkel2021 | 1.00 | 34.9 | 34.1 | 52.5 | 30.8 | 28.0 | 37.6 | 35.1 | 24.9 | 61.3 | 44.4 | |
HangYuChen_Roal_task4_SED_2 | DCASE2021 SED system | HangYu2021 | 0.90 | 29.0 | 30.7 | 59.3 | 24.5 | 31.8 | 35.3 | 30.2 | 26.0 | 49.3 | 25.9 | |
HangYuChen_Roal_task4_SED_1 | DCASE2021 SED system | YuHang2021 | 0.61 | 5.2 | 4.8 | 5.6 | 4.3 | 2.7 | 12.5 | 26.8 | 16.0 | 5.1 | 24.0 | |
Yu_NCUT_task4_SED_1 | multi-scale CRNN | Yu2021 | 0.20 | 0.5 | 7.1 | 0.7 | 2.0 | 1.7 | 9.5 | 24.4 | 1.2 | 1.6 | 19.7 | |
Yu_NCUT_task4_SED_2 | multi-scale CRNN | Yu2021 | 0.92 | 28.6 | 34.6 | 57.9 | 20.2 | 31.7 | 36.0 | 29.7 | 28.0 | 44.4 | 33.1 | |
lu_kwai_task4_SED_1 | DCASE2021 SED CRNN Model1 | Lu2021 | 1.27 | 37.1 | 41.4 | 62.5 | 40.6 | 39.7 | 46.5 | 46.5 | 34.5 | 54.5 | 46.9 | |
lu_kwai_task4_SED_4 | DCASE2021 SED Conformer Model2 | Lu2021 | 0.88 | 5.8 | 5.9 | 2.1 | 1.0 | 0.3 | 16.9 | 44.6 | 22.0 | 25.6 | 32.9 | |
lu_kwai_task4_SED_3 | DCASE2021 SED Conformer Model1 | Lu2021 | 0.86 | 6.3 | 5.1 | 1.3 | 0.8 | 0.3 | 15.7 | 43.8 | 21.8 | 29.3 | 31.9 | |
lu_kwai_task4_SED_2 | DCASE2021 SED CRNN Model2 | Lu2021 | 1.25 | 38.6 | 41.6 | 65.5 | 41.0 | 39.3 | 46.1 | 49.0 | 36.0 | 51.5 | 46.6 | |
Liu_BUPT_task4_SS_SED_2 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 31.7 | 38.2 | 63.5 | 19.9 | 30.1 | 46.6 | 32.0 | 21.1 | 49.4 | 43.4 | |
Liu_BUPT_task4_SS_SED_1 | DCASE2020 liuliuliufangzhou system | Liu_SS2021 | 0.94 | 34.3 | 38.8 | 63.1 | 25.7 | 27.3 | 45.3 | 31.1 | 25.8 | 49.4 | 43.7 | |
Tian_ICT-TOSHIBA_task4_SED_2 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 33.6 | 44.9 | 60.9 | 26.4 | 34.8 | 24.3 | 38.7 | 25.9 | 48.4 | 45.2 | |
Tian_ICT-TOSHIBA_task4_SED_1 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 33.6 | 44.9 | 60.9 | 26.4 | 34.8 | 24.3 | 38.7 | 25.9 | 48.4 | 45.2 | |
Tian_ICT-TOSHIBA_task4_SED_4 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.19 | 33.6 | 44.9 | 60.9 | 26.4 | 34.8 | 24.3 | 38.7 | 25.9 | 48.4 | 45.2 | |
Tian_ICT-TOSHIBA_task4_SED_3 | SOUND EVENT DETECTION USING METRIC LEARNING AND FOCAL LOSS | Tian2021 | 1.18 | 33.6 | 44.9 | 60.9 | 26.4 | 34.8 | 24.3 | 38.7 | 25.9 | 48.4 | 45.2 | |
Yao_GUET_task4_SED_3 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.88 | 32.2 | 32.4 | 58.2 | 21.7 | 17.6 | 36.2 | 26.8 | 24.6 | 49.4 | 42.9 | |
Yao_GUET_task4_SED_1 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.88 | 31.6 | 22.7 | 58.5 | 23.4 | 22.8 | 35.9 | 31.5 | 23.8 | 45.1 | 23.9 | |
Yao_GUET_task4_SED_2 | Adaptive Sequential Self Attention Span for Sound Event Detection | Yao2021 | 0.54 | 4.7 | 4.2 | 2.2 | 1.4 | 1.3 | 10.5 | 27.3 | 16.7 | 3.4 | 17.1 | |
Liang_SHNU_task4_SED_4 | Guided Learning system | Liang2021 | 0.99 | 38.0 | 40.7 | 48.3 | 26.0 | 24.2 | 22.6 | 35.6 | 30.0 | 44.6 | 50.0 | |
Bajzik_UNIZA_task4_SED_2 | CAM attention SED system | Bajzik2021 | 1.02 | 37.5 | 44.4 | 57.6 | 28.8 | 22.9 | 35.5 | 44.0 | 29.8 | 51.7 | 45.2 | |
Bajzik_UNIZA_task4_SED_1 | CAM-based SED system | Bajzik2021 | 0.45 | 17.1 | 7.5 | 34.2 | 7.1 | 15.2 | 8.9 | 1.2 | 4.8 | 34.1 | 7.1 | |
Liang_SHNU_task4_SSep_SED_3 | Mean teacher system | Liang_SS2021 | 0.99 | 33.6 | 38.7 | 47.2 | 22.5 | 17.6 | 21.1 | 38.6 | 28.3 | 44.7 | 50.0 | |
Liang_SHNU_task4_SSep_SED_1 | Mean teacher system | Liang_SS2021 | 1.03 | 29.4 | 25.7 | 60.7 | 20.5 | 29.1 | 30.6 | 38.3 | 24.9 | 55.0 | 32.3 | |
Liang_SHNU_task4_SSep_SED_2 | Mean teacher system | Liang_SS2021 | 1.01 | 33.1 | 37.1 | 52.0 | 26.8 | 32.8 | 31.7 | 41.0 | 28.0 | 49.2 | 37.8 | |
Baseline_SED | DCASE2021 SED baseline system | turpault2020a | 1.00 | 32.2 | 39.0 | 62.4 | 28.6 | 34.5 | 21.1 | 37.2 | 26.4 | 49.7 | 42.0 | |
Wang_NSYSU_task4_SED_1 | DCASE2021_SED_A | Wang2021 | 1.13 | 34.3 | 46.5 | 62.7 | 35.6 | 29.0 | 50.6 | 51.8 | 38.2 | 45.9 | 35.9 | |
Wang_NSYSU_task4_SED_4 | DCASE2021_SED_D | Wang2021 | 1.09 | 32.5 | 49.4 | 66.2 | 28.3 | 15.2 | 34.0 | 47.2 | 33.0 | 39.9 | 36.0 | |
Wang_NSYSU_task4_SED_2 | DCASE2021_SED_B | Wang2021 | 0.69 | 7.0 | 5.2 | 0.5 | 0.0 | 0.3 | 11.1 | 31.5 | 15.7 | 0.3 | 27.8 | |
Wang_NSYSU_task4_SED_3 | DCASE2021_SED_C | Wang2021 | 1.13 | 34.4 | 52.0 | 70.1 | 32.2 | 25.1 | 41.5 | 47.8 | 36.1 | 52.6 | 37.7 |
System characteristics
General characteristics
Rank | Code | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | Data augmentation | Features
---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.245 | 0.452 | log-mel energies | ||
Hafsati_TUITO_task4_SED_3 | Hafsati2021 | 0.91 | 0.287 | 0.502 | pitch shifting, audio concatenation, volume changing | log-mel energies | |
Hafsati_TUITO_task4_SED_4 | Hafsati2021 | 0.91 | 0.287 | 0.502 | pitch shifting, audio concatenation, volume changing | log-mel energies | |
Hafsati_TUITO_task4_SED_1 | Hafsati2021 | 1.03 | 0.334 | 0.549 | log-mel energies | ||
Hafsati_TUITO_task4_SED_2 | Hafsati2021 | 1.04 | 0.336 | 0.550 | log-mel energies | ||
Gong_TAL_task4_SED_3 | Gong2021 | 1.16 | 0.370 | 0.626 | SpecAugment, time shift, mixup | log-mel energies | |
Gong_TAL_task4_SED_2 | Gong2021 | 1.15 | 0.367 | 0.616 | SpecAugment, time shift, mixup | log-mel energies | |
Gong_TAL_task4_SED_1 | Gong2021 | 1.14 | 0.364 | 0.611 | SpecAugment, time shift, mixup | log-mel energies | |
Park_JHU_task4_SED_2 | Park2021 | 1.07 | 0.327 | 0.603 | mixup, frame shifting | log-mel energies | |
Park_JHU_task4_SED_4 | Park2021 | 0.86 | 0.237 | 0.524 | mixup, frame shifting | log-mel energies | |
Park_JHU_task4_SED_1 | Park2021 | 1.01 | 0.305 | 0.579 | mixup, frame shifting | log-mel energies | |
Park_JHU_task4_SED_3 | Park2021 | 0.84 | 0.222 | 0.537 | mixup, frame shifting | log-mel energies | |
Zheng_USTC_task4_SED_4 | Zheng2021 | 1.30 | 0.389 | 0.742 | spec-augment, time-shifting, mixup | log-mel energies | |
Zheng_USTC_task4_SED_1 | Zheng2021 | 1.33 | 0.452 | 0.669 | spec-augment, time-shifting, mixup | log-mel energies | |
Zheng_USTC_task4_SED_3 | Zheng2021 | 1.29 | 0.386 | 0.746 | spec-augment, time-shifting, mixup | log-mel energies | |
Zheng_USTC_task4_SED_2 | Zheng2021 | 1.33 | 0.447 | 0.676 | spec-augment, time-shifting, mixup | log-mel energies | |
Nam_KAIST_task4_SED_2 | Nam2021 | 1.19 | 0.399 | 0.609 | time shifting, mixup, time masking, FilterAugment | log-mel energies |
Nam_KAIST_task4_SED_1 | Nam2021 | 1.16 | 0.378 | 0.617 | time shifting, mixup, time masking, FilterAugment | log-mel energies |
Nam_KAIST_task4_SED_3 | Nam2021 | 1.09 | 0.324 | 0.634 | time shifting, mixup, time masking, FilterAugment | log-mel energies |
Nam_KAIST_task4_SED_4 | Nam2021 | 0.75 | 0.059 | 0.715 | time shifting, mixup, time masking, FilterAugment | log-mel energies |
Koo_SGU_task4_SED_2 | Koo2021 | 0.12 | 0.044 | 0.059 | raw waveform | ||
Koo_SGU_task4_SED_3 | Koo2021 | 0.41 | 0.058 | 0.348 | raw waveform | ||
Koo_SGU_task4_SED_1 | Koo2021 | 0.74 | 0.258 | 0.364 | raw waveform | ||
deBenito_AUDIAS_task4_SED_4 | de Benito-Gorron2021 | 1.10 | 0.361 | 0.577 | log-mel energies | ||
deBenito_AUDIAS_task4_SED_1 | de Benito-Gorron2021 | 1.07 | 0.343 | 0.571 | log-mel energies | ||
deBenito_AUDIAS_task4_SED_2 | de Benito-Gorron2021 | 1.10 | 0.363 | 0.574 | log-mel energies | ||
deBenito_AUDIAS_task4_SED_3 | de Benito-Gorron2021 | 1.07 | 0.345 | 0.571 | log-mel energies | ||
Baseline_SSep_SED | turpault2020b | 1.11 | 0.364 | 0.580 | mixup | log-mel energies | |
Boes_KUL_task4_SED_4 | Boes2021 | 0.60 | 0.117 | 0.457 | time masking, frequency masking, mixup | log-mel energies | |
Boes_KUL_task4_SED_3 | Boes2021 | 0.68 | 0.121 | 0.531 | time masking, frequency masking, mixup | log-mel energies | |
Boes_KUL_task4_SED_2 | Boes2021 | 0.77 | 0.233 | 0.440 | time masking, frequency masking, mixup | log-mel energies | |
Boes_KUL_task4_SED_1 | Boes2021 | 0.81 | 0.253 | 0.442 | time masking, frequency masking, mixup | log-mel energies | |
Ebbers_UPB_task4_SED_2 | Ebbers2021 | 1.10 | 0.335 | 0.621 | frequency warping, time-/frequency-masking, shifted superposition, random noise | log-mel energies |
Ebbers_UPB_task4_SED_4 | Ebbers2021 | 1.16 | 0.363 | 0.637 | frequency warping, time-/frequency-masking, shifted superposition, random noise | log-mel energies |
Ebbers_UPB_task4_SED_3 | Ebbers2021 | 1.24 | 0.416 | 0.635 | frequency warping, time-/frequency-masking, shifted superposition, random noise | log-mel energies |
Ebbers_UPB_task4_SED_1 | Ebbers2021 | 1.16 | 0.373 | 0.621 | frequency warping, time-/frequency-masking, shifted superposition, random noise | log-mel energies |
Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.99 | 0.290 | 0.574 | mixup | log-mel spectrogram | |
Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 0.318 | 0.583 | mixup | log-mel spectrogram | |
Liu_BUPT_task4_4 | Liu2021 | 0.37 | 0.102 | 0.231 | log-mel energies | ||
Liu_BUPT_task4_1 | Liu2021 | 0.30 | 0.090 | 0.169 | log-mel energies | ||
Liu_BUPT_task4_2 | Liu2021 | 0.54 | 0.152 | 0.322 | log-mel energies | ||
Liu_BUPT_task4_3 | Liu2021 | 0.24 | 0.068 | 0.146 | log-mel energies | ||
Olvera_INRIA_task4_SED_2 | Olvera2021 | 0.98 | 0.338 | 0.481 | log-mel energies | ||
Olvera_INRIA_task4_SED_1 | Olvera2021 | 0.95 | 0.332 | 0.462 | log-mel energies | ||
Kim_AiTeR_GIST_SED_4 | Kim2021 | 1.32 | 0.442 | 0.674 | time-frequency shift, mixup, specaugment | log-mel energies | |
Kim_AiTeR_GIST_SED_2 | Kim2021 | 1.31 | 0.439 | 0.667 | time-frequency shift, mixup, specaugment | log-mel energies | |
Kim_AiTeR_GIST_SED_3 | Kim2021 | 1.30 | 0.434 | 0.669 | time-frequency shift, mixup, specaugment | log-mel energies | |
Kim_AiTeR_GIST_SED_1 | Kim2021 | 1.29 | 0.431 | 0.661 | time-frequency shift, mixup, specaugment | log-mel energies | |
Cai_SMALLRICE_task4_SED_1 | Dinkel2021 | 1.11 | 0.361 | 0.584 | time shifting, mixup, time masking, frequency masking | log-mel energies | |
Cai_SMALLRICE_task4_SED_2 | Dinkel2021 | 1.13 | 0.373 | 0.585 | time shifting, mixup, time masking, frequency masking | log-mel energies | |
Cai_SMALLRICE_task4_SED_3 | Dinkel2021 | 1.13 | 0.370 | 0.596 | time shifting, mixup, time masking, frequency masking | log-mel energies | |
Cai_SMALLRICE_task4_SED_4 | Dinkel2021 | 1.00 | 0.339 | 0.504 | time shifting, mixup, time masking, frequency masking | log-mel energies | |
HangYuChen_Roal_task4_SED_2 | HangYu2021 | 0.90 | 0.294 | 0.473 | minmax | log-mel energies | |
HangYuChen_Roal_task4_SED_1 | YuHang2021 | 0.61 | 0.098 | 0.496 | minmax | log-mel energies | |
Yu_NCUT_task4_SED_1 | Yu2021 | 0.20 | 0.038 | 0.157 | mixup | log-mel energies | |
Yu_NCUT_task4_SED_2 | Yu2021 | 0.92 | 0.301 | 0.485 | mixup | log-mel energies | |
lu_kwai_task4_SED_1 | Lu2021 | 1.27 | 0.419 | 0.660 | mixup, frame-shift | log-mel energies | |
lu_kwai_task4_SED_4 | Lu2021 | 0.88 | 0.157 | 0.685 | mixup, frame-shift | log-mel energies | |
lu_kwai_task4_SED_3 | Lu2021 | 0.86 | 0.148 | 0.686 | mixup, frame-shift | log-mel energies | |
lu_kwai_task4_SED_2 | Lu2021 | 1.25 | 0.412 | 0.651 | mixup, frame-shift | log-mel energies | |
Liu_BUPT_task4_SS_SED_2 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | source augmentation, random track mixing | raw waveform | |
Liu_BUPT_task4_SS_SED_1 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | source augmentation, random track mixing | raw waveform | |
Tian_ICT-TOSHIBA_task4_SED_2 | Tian2021 | 1.19 | 0.411 | 0.585 | mixup | log-mel energies | |
Tian_ICT-TOSHIBA_task4_SED_1 | Tian2021 | 1.19 | 0.413 | 0.586 | mixup | log-mel energies | |
Tian_ICT-TOSHIBA_task4_SED_4 | Tian2021 | 1.19 | 0.412 | 0.586 | mixup | log-mel energies | |
Tian_ICT-TOSHIBA_task4_SED_3 | Tian2021 | 1.18 | 0.409 | 0.584 | mixup | log-mel energies | |
Yao_GUET_task4_SED_3 | Yao2021 | 0.88 | 0.279 | 0.479 | mixup | log-mel energies |
Yao_GUET_task4_SED_1 | Yao2021 | 0.88 | 0.277 | 0.482 | mixup | log-mel energies |
Yao_GUET_task4_SED_2 | Yao2021 | 0.54 | 0.056 | 0.496 | mixup | log-mel energies |
Liang_SHNU_task4_SED_4 | Liang2021 | 0.99 | 0.313 | 0.543 | mixup, specAugment | log-mel energies | |
Bajzik_UNIZA_task4_SED_2 | Bajzik2021 | 1.02 | 0.330 | 0.544 | log-mel energies | ||
Bajzik_UNIZA_task4_SED_1 | Bajzik2021 | 0.45 | 0.133 | 0.266 | log-mel energies | ||
Liang_SHNU_task4_SSep_SED_3 | Liang_SS2021 | 0.99 | 0.304 | 0.559 | log-mel energies | ||
Liang_SHNU_task4_SSep_SED_1 | Liang_SS2021 | 1.03 | 0.313 | 0.588 | log-mel energies | ||
Liang_SHNU_task4_SSep_SED_2 | Liang_SS2021 | 1.01 | 0.325 | 0.542 | log-mel energies | ||
Baseline_SED | turpault2020a | 1.00 | 0.315 | 0.547 | mixup | log-mel energies | |
Wang_NSYSU_task4_SED_1 | Wang2021 | 1.13 | 0.336 | 0.646 | Mixup, Time Shift, Time Mask, Frequency Mask | log-mel energies | |
Wang_NSYSU_task4_SED_4 | Wang2021 | 1.09 | 0.304 | 0.662 | Mixup, Time Shift, Time Mask, Frequency Mask | log-mel energies | |
Wang_NSYSU_task4_SED_2 | Wang2021 | 0.69 | 0.070 | 0.636 | Mixup, Time Shift, Time Mask, Frequency Mask | log-mel energies | |
Wang_NSYSU_task4_SED_3 | Wang2021 | 1.13 | 0.339 | 0.649 | Mixup, Time Shift, Time Mask, Frequency Mask | log-mel energies |
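Nearly every system in the table above uses log-mel energies as input features, and mixup is the most common augmentation. A minimal sketch of both, assuming librosa and illustrative parameters (16 kHz audio, 128 mel bands, mixup α = 0.2; the exact settings vary per submission):

```python
import numpy as np
import librosa

def logmel(path, sr=16000, n_fft=2048, hop=256, n_mels=128):
    """Log-mel energies: mel-filtered power spectrogram on a dB scale."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                         hop_length=hop, n_mels=n_mels)
    return librosa.power_to_db(mel)                 # shape: (n_mels, n_frames)

def mixup(x1, x2, y1, y2, alpha=0.2):
    """Mixup: convex combination of two examples and of their label vectors."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```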
Machine learning characteristics
Rank | Code | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | Classifier | Semi-supervised approach | Post-processing | Segmentation method | Decision making
---|---|---|---|---|---|---|---|---|---|---
Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.245 | 0.452 | CNN, conformer | mean-teacher student | median filtering (93ms) | |||
Hafsati_TUITO_task4_SED_3 | Hafsati2021 | 0.91 | 0.287 | 0.502 | CRNN | mean-teacher student | median filtering (93ms) | |||
Hafsati_TUITO_task4_SED_4 | Hafsati2021 | 0.91 | 0.287 | 0.502 | CRNN | mean-teacher student | median filtering (93ms) | |||
Hafsati_TUITO_task4_SED_1 | Hafsati2021 | 1.03 | 0.334 | 0.549 | CRNN | mean-teacher student | median filtering (93ms) | |||
Hafsati_TUITO_task4_SED_2 | Hafsati2021 | 1.04 | 0.336 | 0.550 | CRNN | mean-teacher student | median filtering (93ms) | |||
Gong_TAL_task4_SED_3 | Gong2021 | 1.16 | 0.370 | 0.626 | CRNN | mean-teacher, pseudo-labelling | class-wise median filtering | attention layers | mean | |
Gong_TAL_task4_SED_2 | Gong2021 | 1.15 | 0.367 | 0.616 | CRNN | mean-teacher | class-wise median filtering | attention layers | mean | |
Gong_TAL_task4_SED_1 | Gong2021 | 1.14 | 0.364 | 0.611 | CRNN | mean-teacher | class-wise median filtering | attention layers | mean | |
Park_JHU_task4_SED_2 | Park2021 | 1.07 | 0.327 | 0.603 | RCRNN | cross-referencing self-training | median filtering | |||
Park_JHU_task4_SED_4 | Park2021 | 0.86 | 0.237 | 0.524 | RCRNN | cross-referencing self-training | median filtering | |||
Park_JHU_task4_SED_1 | Park2021 | 1.01 | 0.305 | 0.579 | RCRNN | cross-referencing self-training | median filtering | |||
Park_JHU_task4_SED_3 | Park2021 | 0.84 | 0.222 | 0.537 | RCRNN | cross-referencing self-training | median filtering | |||
Zheng_USTC_task4_SED_4 | Zheng2021 | 1.30 | 0.389 | 0.742 | CRNN | mean-teacher student | median filtering (340ms) | averaging | ||
Zheng_USTC_task4_SED_1 | Zheng2021 | 1.33 | 0.452 | 0.669 | CRNN | mean-teacher student | median filtering (340ms) | averaging | ||
Zheng_USTC_task4_SED_3 | Zheng2021 | 1.29 | 0.386 | 0.746 | CRNN | mean-teacher student | median filtering (340ms) | averaging | ||
Zheng_USTC_task4_SED_2 | Zheng2021 | 1.33 | 0.447 | 0.676 | CRNN | mean-teacher student | median filtering (340ms) | averaging | ||
Nam_KAIST_task4_SED_2 | Nam2021 | 1.19 | 0.399 | 0.609 | CRNN, ensemble | mean-teacher student | median filtering (329ms), weak prediction masking | mean | ||
Nam_KAIST_task4_SED_1 | Nam2021 | 1.16 | 0.378 | 0.617 | CRNN, ensemble | mean-teacher student | median filtering (461ms), weak prediction masking | mean | ||
Nam_KAIST_task4_SED_3 | Nam2021 | 1.09 | 0.324 | 0.634 | CRNN, ensemble | mean-teacher student | median filtering (461ms), weak prediction masking | mean | ||
Nam_KAIST_task4_SED_4 | Nam2021 | 0.75 | 0.059 | 0.715 | CRNN, ensemble | mean-teacher student | weak SED | mean | ||
Koo_SGU_task4_SED_2 | Koo2021 | 0.12 | 0.044 | 0.059 | Transformer, RNN | mean-teacher student | median filtering (93ms) | |||
Koo_SGU_task4_SED_3 | Koo2021 | 0.41 | 0.058 | 0.348 | Transformer, RNN | mean-teacher student | median filtering (93ms) | |||
Koo_SGU_task4_SED_1 | Koo2021 | 0.74 | 0.258 | 0.364 | Transformer | mean-teacher student | median filtering (93ms) | |||
deBenito_AUDIAS_task4_SED_4 | de Benito-Gorron2021 | 1.10 | 0.361 | 0.577 | CRNN | mean-teacher student | median filtering (45ms) | average | ||
deBenito_AUDIAS_task4_SED_1 | de Benito-Gorron2021 | 1.07 | 0.343 | 0.571 | CRNN | mean-teacher student | median filtering (45ms) | average | ||
deBenito_AUDIAS_task4_SED_2 | de Benito-Gorron2021 | 1.10 | 0.363 | 0.574 | CRNN | mean-teacher student | median filtering (45ms) | average | ||
deBenito_AUDIAS_task4_SED_3 | de Benito-Gorron2021 | 1.07 | 0.345 | 0.571 | CRNN | mean-teacher student | median filtering (45ms) | average | ||
Baseline_SSep_SED | turpault2020b | 1.11 | 0.364 | 0.580 | CRNN | mean-teacher student | ||||
Boes_KUL_task4_SED_4 | Boes2021 | 0.60 | 0.117 | 0.457 | CRNN | mean teacher | median filtering (3.7s) | |||
Boes_KUL_task4_SED_3 | Boes2021 | 0.68 | 0.121 | 0.531 | CRNN | mean teacher | median filtering (3.7s) | |||
Boes_KUL_task4_SED_2 | Boes2021 | 0.77 | 0.233 | 0.440 | CRNN | mean teacher | median filtering (460ms) | |||
Boes_KUL_task4_SED_1 | Boes2021 | 0.81 | 0.253 | 0.442 | CRNN | mean teacher | median filtering (460ms) | |||
Ebbers_UPB_task4_SED_2 | Ebbers2021 | 1.10 | 0.335 | 0.621 | FBCRNN, CRNN | self-training | median filtering (class dependent) | MIL | |
Ebbers_UPB_task4_SED_4 | Ebbers2021 | 1.16 | 0.363 | 0.637 | FBCRNN, CRNN, CTNN, CNN | self-training | median filtering (class dependent) | averaging | |
Ebbers_UPB_task4_SED_3 | Ebbers2021 | 1.24 | 0.416 | 0.635 | FBCRNN, CRNN, CTNN, CNN | self-training | median filtering (class dependent) | averaging | |
Ebbers_UPB_task4_SED_1 | Ebbers2021 | 1.16 | 0.373 | 0.621 | FBCRNN, CRNN | self-training | median filtering (class dependent) | MIL | |
Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.99 | 0.290 | 0.574 | CRNN | mean-teacher student | median filtering | LinearSoftmax | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 0.318 | 0.583 | CRNN | mean-teacher student | median filtering | LinearSoftmax | ||
Liu_BUPT_task4_4 | Liu2021 | 0.37 | 0.102 | 0.231 | CRNN | mean-teacher student | median filtering (93ms) | |||
Liu_BUPT_task4_1 | Liu2021 | 0.30 | 0.090 | 0.169 | CRNN | mean-teacher student | median filtering (93ms) | |||
Liu_BUPT_task4_2 | Liu2021 | 0.54 | 0.152 | 0.322 | CRNN | mean-teacher student | median filtering (93ms) | |||
Liu_BUPT_task4_3 | Liu2021 | 0.24 | 0.068 | 0.146 | CRNN | mean-teacher student | median filtering (93ms) | |||
Olvera_INRIA_task4_SED_2 | Olvera2021 | 0.98 | 0.338 | 0.481 | CRNN | mean-teacher student | HMM smoothing | HMM smoothing | ||
Olvera_INRIA_task4_SED_1 | Olvera2021 | 0.95 | 0.332 | 0.462 | CRNN | mean-teacher student | HMM smoothing | HMM smoothing | ||
Kim_AiTeR_GIST_SED_4 | Kim2021 | 1.32 | 0.442 | 0.674 | RCRNN | mean-teacher student, self-training with noisy student | median filtering | mean | ||
Kim_AiTeR_GIST_SED_2 | Kim2021 | 1.31 | 0.439 | 0.667 | RCRNN | mean-teacher student, self-training with noisy student | median filtering | mean | ||
Kim_AiTeR_GIST_SED_3 | Kim2021 | 1.30 | 0.434 | 0.669 | RCRNN | mean-teacher student, self-training with noisy student | median filtering | mean | ||
Kim_AiTeR_GIST_SED_1 | Kim2021 | 1.29 | 0.431 | 0.661 | RCRNN | mean-teacher student, self-training with noisy student | median filtering | mean | ||
Cai_SMALLRICE_task4_SED_1 | Dinkel2021 | 1.11 | 0.361 | 0.584 | CRNN, ensemble | unsupervised data augmentation | average | |||
Cai_SMALLRICE_task4_SED_2 | Dinkel2021 | 1.13 | 0.373 | 0.585 | CRNN, ensemble | unsupervised data augmentation | average | |||
Cai_SMALLRICE_task4_SED_3 | Dinkel2021 | 1.13 | 0.370 | 0.596 | CRNN, ensemble | unsupervised data augmentation | average | |||
Cai_SMALLRICE_task4_SED_4 | Dinkel2021 | 1.00 | 0.339 | 0.504 | CRNN | unsupervised data augmentation | ||||
HangYuChen_Roal_task4_SED_2 | HangYu2021 | 0.90 | 0.294 | 0.473 | Transformer,CNN | mean-teacher student | median filtering (93ms) | attention layers | majority vote | |
HangYuChen_Roal_task4_SED_1 | YuHang2021 | 0.61 | 0.098 | 0.496 | CRNN | mean-teacher student | median filtering (93ms) | attention layers | majority vote | |
Yu_NCUT_task4_SED_1 | Yu2021 | 0.20 | 0.038 | 0.157 | Multi-scale CRNN | mean-teacher student | median filtering (93ms) | attention | ||
Yu_NCUT_task4_SED_2 | Yu2021 | 0.92 | 0.301 | 0.485 | Multi-scale CRNN | mean-teacher student | median filtering (93ms) | attention | ||
lu_kwai_task4_SED_1 | Lu2021 | 1.27 | 0.419 | 0.660 | CRNN | mean-teacher student | classwise median filtering | majority vote | ||
lu_kwai_task4_SED_4 | Lu2021 | 0.88 | 0.157 | 0.685 | Conformer | mean-teacher student | classwise median filtering | majority vote | ||
lu_kwai_task4_SED_3 | Lu2021 | 0.86 | 0.148 | 0.686 | Conformer | mean-teacher student | classwise median filtering | majority vote | ||
lu_kwai_task4_SED_2 | Lu2021 | 1.25 | 0.412 | 0.651 | CRNN | mean-teacher student | classwise median filtering | majority vote | ||
Liu_BUPT_task4_SS_SED_2 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | u-net, VGG | median filtering (93ms) | attention layers, d-vector | |||
Liu_BUPT_task4_SS_SED_1 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | u-net, VGG | median filtering (93ms) | attention layers, d-vector | |||
Tian_ICT-TOSHIBA_task4_SED_2 | Tian2021 | 1.19 | 0.411 | 0.585 | CNN | mean-teacher student | median filtering with adaptive window size | attention layers | |
Tian_ICT-TOSHIBA_task4_SED_1 | Tian2021 | 1.19 | 0.413 | 0.586 | CNN | mean-teacher student | median filtering with adaptive window size | attention layers | |
Tian_ICT-TOSHIBA_task4_SED_4 | Tian2021 | 1.19 | 0.412 | 0.586 | CNN | mean-teacher student | median filtering with adaptive window size | attention layers | |
Tian_ICT-TOSHIBA_task4_SED_3 | Tian2021 | 1.18 | 0.409 | 0.584 | CNN | mean-teacher student | median filtering with adaptive window size | attention layers | |
Yao_GUET_task4_SED_3 | Yao2021 | 0.88 | 0.279 | 0.479 | CRNN, Self Attention | mean-teacher student | median filtering (93ms) | |||
Yao_GUET_task4_SED_1 | Yao2021 | 0.88 | 0.277 | 0.482 | CRNN, Self Attention | mean-teacher student | median filtering (93ms) | |||
Yao_GUET_task4_SED_2 | Yao2021 | 0.54 | 0.056 | 0.496 | CRNN, Self Attention | mean-teacher student | median filtering (93ms) | |||
Liang_SHNU_task4_SED_4 | Liang2021 | 0.99 | 0.313 | 0.543 | CRNN | teacher student | median filtering (with adaptive window size) | |||
Bajzik_UNIZA_task4_SED_2 | Bajzik2021 | 1.02 | 0.330 | 0.544 | CRNN | mean-teacher student | median filtering (112ms) | |||
Bajzik_UNIZA_task4_SED_1 | Bajzik2021 | 0.45 | 0.133 | 0.266 | CNN | mean-teacher student | median filtering (112ms) | |||
Liang_SHNU_task4_SSep_SED_3 | Liang_SS2021 | 0.99 | 0.304 | 0.559 | CRNN | mean-teacher student | median filtering (with adaptive window size) | |||
Liang_SHNU_task4_SSep_SED_1 | Liang_SS2021 | 1.03 | 0.313 | 0.588 | CRNN | mean-teacher student | median filtering (with adaptive window size) | |||
Liang_SHNU_task4_SSep_SED_2 | Liang_SS2021 | 1.01 | 0.325 | 0.542 | CRNN | mean-teacher student | median filtering (with adaptive window size) | |||
Baseline_SED | turpault2020a | 1.00 | 0.315 | 0.547 | CRNN | mean-teacher student | ||||
Wang_NSYSU_task4_SED_1 | Wang2021 | 1.13 | 0.336 | 0.646 | CRNN | mean-teacher student | median filtering | attention layer | mean | |
Wang_NSYSU_task4_SED_4 | Wang2021 | 1.09 | 0.304 | 0.662 | CRNN, CNN-Transformer | mean-teacher student | median filtering | attention layer, exponential softmax layer | mean | |
Wang_NSYSU_task4_SED_2 | Wang2021 | 0.69 | 0.070 | 0.636 | CRNN | mean-teacher student | median filtering | exponential softmax layer | mean | |
Wang_NSYSU_task4_SED_3 | Wang2021 | 1.13 | 0.339 | 0.649 | CRNN, CNN-Transformer | mean-teacher student | median filtering | attention layer | mean |
Complexity
Rank | Code | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | Model complexity | Ensemble subsystems | Training time |
---|---|---|---|---|---|---|---|---|
Na_BUPT_task4_SED_1 | Na2021 | 0.80 | 0.245 | 0.452 | 3900000 | 40h (1 Quadro K1200) | ||
Hafsati_TUITO_task4_SED_3 | Hafsati2021 | 0.91 | 0.287 | 0.502 | 1100000 | 20h (1 Tesla V100-SXM2-16GB) | ||
Hafsati_TUITO_task4_SED_4 | Hafsati2021 | 0.91 | 0.287 | 0.502 | 1100000 | 20h (1 Tesla V100-SXM2-16GB) | ||
Hafsati_TUITO_task4_SED_1 | Hafsati2021 | 1.03 | 0.334 | 0.549 | 1100000 | 6h (1 Tesla V100-SXM2-16GB) | ||
Hafsati_TUITO_task4_SED_2 | Hafsati2021 | 1.04 | 0.336 | 0.550 | 1100000 | 6h (1 Tesla V100-SXM2-16GB) | ||
Gong_TAL_task4_SED_3 | Gong2021 | 1.16 | 0.370 | 0.626 | 6674520 | 6 | 22.5h (1 V100) | |
Gong_TAL_task4_SED_2 | Gong2021 | 1.15 | 0.367 | 0.616 | 2224840 | 2 | 7.5h (1 V100) | |
Gong_TAL_task4_SED_1 | Gong2021 | 1.14 | 0.364 | 0.611 | 4449680 | 4 | 15h (1 V100) | |
Park_JHU_task4_SED_2 | Park2021 | 1.07 | 0.327 | 0.603 | 9000000 | 20h (1 GTX 1080 Ti) | ||
Park_JHU_task4_SED_4 | Park2021 | 0.86 | 0.237 | 0.524 | 9000000 | 20h (1 GTX 1080 Ti) | ||
Park_JHU_task4_SED_1 | Park2021 | 1.01 | 0.305 | 0.579 | 9000000 | 20h (1 GTX 1080 Ti) | ||
Park_JHU_task4_SED_3 | Park2021 | 0.84 | 0.222 | 0.537 | 9000000 | 20h (1 GTX 1080 Ti) | ||
Zheng_USTC_task4_SED_4 | Zheng2021 | 1.30 | 0.389 | 0.742 | 1112420 | 9 | 3h (1 GTX 3090) | |
Zheng_USTC_task4_SED_1 | Zheng2021 | 1.33 | 0.452 | 0.669 | 1112420 | 3 | 3h (1 GTX 3090) | |
Zheng_USTC_task4_SED_3 | Zheng2021 | 1.29 | 0.386 | 0.746 | 1112420 | 10 | 3h (1 GTX 3090) | |
Zheng_USTC_task4_SED_2 | Zheng2021 | 1.33 | 0.447 | 0.676 | 1112420 | 9 | 3h (1 GTX 3090) | |
Nam_KAIST_task4_SED_2 | Nam2021 | 1.19 | 0.399 | 0.609 | 4427956 | 9 | 4h (1 GTX 2080 Ti) | |
Nam_KAIST_task4_SED_1 | Nam2021 | 1.16 | 0.378 | 0.617 | 4427956 | 16 | 4h (1 GTX 2080 Ti) | |
Nam_KAIST_task4_SED_3 | Nam2021 | 1.09 | 0.324 | 0.634 | 4427956 | 11 | 4h (1 GTX 2080 Ti) | |
Nam_KAIST_task4_SED_4 | Nam2021 | 0.75 | 0.059 | 0.715 | 4427956 | 9 | 4h (1 GTX 2080 Ti) | |
Koo_SGU_task4_SED_2 | Koo2021 | 0.12 | 0.044 | 0.059 | 102000000 | 19h (1 Tesla M40) | ||
Koo_SGU_task4_SED_3 | Koo2021 | 0.41 | 0.058 | 0.348 | 196000000 | 2 | 48h (1 Tesla M40) | |
Koo_SGU_task4_SED_1 | Koo2021 | 0.74 | 0.258 | 0.364 | 95800000 | 19h (1 RTX 3080 Ti) | ||
deBenito_AUDIAS_task4_SED_4 | de Benito-Gorron2021 | 1.10 | 0.361 | 0.577 | 5562100 | 5 | 20h (1 RTX 2080) | |
deBenito_AUDIAS_task4_SED_1 | de Benito-Gorron2021 | 1.07 | 0.343 | 0.571 | 3337260 | 3 | 12h (1 RTX 2080) | |
deBenito_AUDIAS_task4_SED_2 | de Benito-Gorron2021 | 1.10 | 0.363 | 0.574 | 3337260 | 3 | 12h (1 RTX 2080) | |
deBenito_AUDIAS_task4_SED_3 | de Benito-Gorron2021 | 1.07 | 0.345 | 0.571 | 4449600 | 4 | 16h (1 RTX 2080) | |
Baseline_SSep_SED | turpault2020b | 1.11 | 0.364 | 0.580 | 2200000 | 6h (1 GTX 1080 Ti) | ||
Boes_KUL_task4_SED_4 | Boes2021 | 0.60 | 0.117 | 0.457 | 1038314 | 5h (1 GTX 1080 Ti) | ||
Boes_KUL_task4_SED_3 | Boes2021 | 0.68 | 0.121 | 0.531 | 1038314 | 5h (1 GTX 1080 Ti) | ||
Boes_KUL_task4_SED_2 | Boes2021 | 0.77 | 0.233 | 0.440 | 1038314 | 5h (1 GTX 1080 Ti) | ||
Boes_KUL_task4_SED_1 | Boes2021 | 0.81 | 0.253 | 0.442 | 1038314 | 5h (1 GTX 1080 Ti) | ||
Ebbers_UPB_task4_SED_2 | Ebbers2021 | 1.10 | 0.335 | 0.621 | 9568030 | 1 | 72h (4 RTX 2070) | |
Ebbers_UPB_task4_SED_4 | Ebbers2021 | 1.16 | 0.363 | 0.637 | 59853372 | 6 | 72h (4 RTX 2070) | |
Ebbers_UPB_task4_SED_3 | Ebbers2021 | 1.24 | 0.416 | 0.635 | 59853372 | 6 | 72h (4 RTX 2070) | |
Ebbers_UPB_task4_SED_1 | Ebbers2021 | 1.16 | 0.373 | 0.621 | 9568030 | 1 | 72h (4 RTX 2070) | |
Zhu_AIAL-XJU_task4_SED_2 | Zhu2021 | 0.99 | 0.290 | 0.574 | 3900000 | 12.5h (1 RTX 3090) | ||
Zhu_AIAL-XJU_task4_SED_1 | Zhu2021 | 1.04 | 0.318 | 0.583 | 3900000 | 13.5h (1 RTX 3090) | ||
Liu_BUPT_task4_4 | Liu2021 | 0.37 | 0.102 | 0.231 | 1112420 | 12h (1 GTX 1080 Ti) | ||
Liu_BUPT_task4_1 | Liu2021 | 0.30 | 0.090 | 0.169 | 1112420 | 12h (1 GTX 1080 Ti) | ||
Liu_BUPT_task4_2 | Liu2021 | 0.54 | 0.152 | 0.322 | 1112420 | 12h (1 GTX 1080 Ti) | ||
Liu_BUPT_task4_3 | Liu2021 | 0.24 | 0.068 | 0.146 | 1112420 | 12h (1 GTX 1080 Ti) | ||
Olvera_INRIA_task4_SED_2 | Olvera2021 | 0.98 | 0.338 | 0.481 | 2225868 | 2 | 24h (1 GTX 1080) | |
Olvera_INRIA_task4_SED_1 | Olvera2021 | 0.95 | 0.332 | 0.462 | 3338802 | 3 | 24h (1 GTX 1080) | |
Kim_AiTeR_GIST_SED_4 | Kim2021 | 1.32 | 0.442 | 0.674 | 2162412 | 10 | 5h (1 GTX 1080 Ti) | |
Kim_AiTeR_GIST_SED_2 | Kim2021 | 1.31 | 0.439 | 0.667 | 2162412 | 5 | 5h (1 GTX 1080 Ti) | |
Kim_AiTeR_GIST_SED_3 | Kim2021 | 1.30 | 0.434 | 0.669 | 2162412 | 5 | 5h (1 GTX 1080 Ti) | |
Kim_AiTeR_GIST_SED_1 | Kim2021 | 1.29 | 0.431 | 0.661 | 2162412 | 5 | 5h (1 GTX 1080 Ti) | |
Cai_SMALLRICE_task4_SED_1 | Dinkel2021 | 1.11 | 0.361 | 0.584 | 2043204 | 3 | 3h (1 GTX 2080 Ti) | |
Cai_SMALLRICE_task4_SED_2 | Dinkel2021 | 1.13 | 0.373 | 0.585 | 2724272 | 4 | 3h (1 GTX 2080 Ti) | |
Cai_SMALLRICE_task4_SED_3 | Dinkel2021 | 1.13 | 0.370 | 0.596 | 3405340 | 5 | 3h (1 GTX 2080 Ti) | |
Cai_SMALLRICE_task4_SED_4 | Dinkel2021 | 1.00 | 0.339 | 0.504 | 681068 | 3h (1 GTX 2080 Ti) | ||
HangYuChen_Roal_task4_SED_2 | HangYu2021 | 0.90 | 0.294 | 0.473 | 11312420 | 2 | 6h (1 GTX 1080 Ti) | |
HangYuChen_Roal_task4_SED_1 | YuHang2021 | 0.61 | 0.098 | 0.496 | 1112420 | 2 | 3h (1 GTX 1080 Ti) | |
Yu_NCUT_task4_SED_1 | Yu2021 | 0.20 | 0.038 | 0.157 | 1300000 | 5h (1 GTX 1080) | ||
Yu_NCUT_task4_SED_2 | Yu2021 | 0.92 | 0.301 | 0.485 | 1300000 | 5h (1 GTX 1080) | ||
lu_kwai_task4_SED_1 | Lu2021 | 1.27 | 0.419 | 0.660 | 10500000 | 5 | 5h (1 GTX 2080 Ti) | |
lu_kwai_task4_SED_4 | Lu2021 | 0.88 | 0.157 | 0.685 | 39500000 | 5 | 10h (1 GTX 2080 Ti) | |
lu_kwai_task4_SED_3 | Lu2021 | 0.86 | 0.148 | 0.686 | 39500000 | 5 | 10h (1 GTX 2080 Ti) | |
lu_kwai_task4_SED_2 | Lu2021 | 1.25 | 0.412 | 0.651 | 10500000 | 5 | 5h (1 GTX 2080 Ti) | |
Liu_BUPT_task4_SS_SED_2 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | 192905515 | 17h (1 RTX 3090) | ||
Liu_BUPT_task4_SS_SED_1 | Liu_SS2021 | 0.94 | 0.302 | 0.507 | 192905515 | 17h (1 RTX 3090) | ||
Tian_ICT-TOSHIBA_task4_SED_2 | Tian2021 | 1.19 | 0.411 | 0.585 | 8471847 | 4 | 6h for each model (GTX 2080 Ti) |
Tian_ICT-TOSHIBA_task4_SED_1 | Tian2021 | 1.19 | 0.413 | 0.586 | 8471847 | 4 | 6h for each model (GTX 2080 Ti) |
Tian_ICT-TOSHIBA_task4_SED_4 | Tian2021 | 1.19 | 0.412 | 0.586 | 8471847 | 4 | 6h for each model (GTX 2080 Ti) |
Tian_ICT-TOSHIBA_task4_SED_3 | Tian2021 | 1.18 | 0.409 | 0.584 | 8471847 | 4 | 6h for each model (GTX 2080 Ti) |
Yao_GUET_task4_SED_3 | Yao2021 | 0.88 | 0.279 | 0.479 | 2500000 | 6h (1 Titan RTX) | |
Yao_GUET_task4_SED_1 | Yao2021 | 0.88 | 0.277 | 0.482 | 2500000 | 6h (1 Titan RTX) | |
Yao_GUET_task4_SED_2 | Yao2021 | 0.54 | 0.056 | 0.496 | 2500000 | 6h (1 Titan RTX) | |
Liang_SHNU_task4_SED_4 | Liang2021 | 0.99 | 0.313 | 0.543 | 1431280 | 16h (Tesla-V100) | ||
Bajzik_UNIZA_task4_SED_2 | Bajzik2021 | 1.02 | 0.330 | 0.544 | 2200000 | 13h (1 GeForce GTX 1650) | ||
Bajzik_UNIZA_task4_SED_1 | Bajzik2021 | 0.45 | 0.133 | 0.266 | 1200000 | 5h (1 GeForce GTX 1650) | ||
Liang_SHNU_task4_SSep_SED_3 | Liang_SS2021 | 0.99 | 0.304 | 0.559 | 1112420 | 3h (1 GTX 1080 Ti) | ||
Liang_SHNU_task4_SSep_SED_1 | Liang_SS2021 | 1.03 | 0.313 | 0.588 | 1112420 | 3h (1 GTX 1080 Ti) | ||
Liang_SHNU_task4_SSep_SED_2 | Liang_SS2021 | 1.01 | 0.325 | 0.542 | 1112420 | 3h (1 GTX 1080 Ti) | ||
Baseline_SED | turpault2020a | 1.00 | 0.315 | 0.547 | 2200000 | 6h (1 GTX 1080 Ti) | ||
Wang_NSYSU_task4_SED_1 | Wang2021 | 1.13 | 0.336 | 0.646 | 47213260 | 10 | 480h (1 GPU 1080 Ti) | |
Wang_NSYSU_task4_SED_4 | Wang2021 | 1.09 | 0.304 | 0.662 | 118739112 | 24 | 864h (1 GPU 1080Ti), 360h (1 GPU V100) | |
Wang_NSYSU_task4_SED_2 | Wang2021 | 0.69 | 0.070 | 0.636 | 3350984 | 8 | 384h (1 GPU 1080Ti) | |
Wang_NSYSU_task4_SED_3 | Wang2021 | 1.13 | 0.339 | 0.649 | 115388128 | 16 | 480h (1 GPU 1080 Ti), 360h (1 GPU V100) |
Technical reports
Sound Event Detection System For DCASE 2021 Challenge
Bajzik, Jakub
University of Zilina, Department of Mechatronics and Electronics, Žilina 010 26, Slovak Republic
Bajzik_UNIZA_task4_SED_1 Bajzik_UNIZA_task4_SED_2
Abstract
This paper presents the systems proposed for the DCASE 2021 challenge Task 4 (Sound event detection and separation in domestic environments). The aim is to provide event time localization timestamps in addition to event class probabilities. Two systems are proposed. System 1 is a convolutional neural network trained for sound event classification using only weakly labeled and unlabeled data; the strong labels are obtained using the class activation mapping technique. System 1 does not reach the baseline performance. System 2 combines a convolutional neural network and a recurrent neural network and uses the class activation mapping technique as part of the attention mechanism to improve on the baseline performance. The second model was trained using weakly labeled, strongly labeled, and unlabeled data. Both architectures are based on the 2021 Mean Teacher baseline system.
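The class activation mapping step is the interesting part of System 1: the clip-level classifier's last convolutional activations are reweighted by the classification weights to localize events in time. A minimal sketch under that reading, assuming a PyTorch model whose last conv layer outputs (batch, channels, time) maps and a linear classification head; the names, shapes, and threshold are illustrative, not the authors' implementation.

```python
# Sketch: deriving frame-level ("strong") activity from a weakly trained CNN
# via class activation mapping (CAM). Shapes and threshold are assumptions.
import torch

def frame_level_cam(feature_maps, fc_weights, threshold=0.5):
    """feature_maps: (batch, channels, time) activations of the last conv layer.
    fc_weights: (n_classes, channels) weights of the clip-level linear head.
    Returns a binary (batch, n_classes, time) activity map."""
    # Weight each channel by its contribution to the class logit.
    cam = torch.einsum("bct,kc->bkt", feature_maps, fc_weights)
    # Normalize each clip's map to [0, 1], then threshold into event activity.
    cam_min = cam.amin(dim=-1, keepdim=True)
    cam_max = cam.amax(dim=-1, keepdim=True)
    cam = (cam - cam_min) / (cam_max - cam_min + 1e-8)
    return (cam > threshold).float()
```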
System characteristics
Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection
Boes, Wim and Van Hamme, Hugo
ESAT, KU Leuven, Leuven, Belgium
Boes_KUL_task4_SED_1 Boes_KUL_task4_SED_2 Boes_KUL_task4_SED_3 Boes_KUL_task4_SED_4
Abstract
In this technical report, the systems we submitted for subtask 4 of the DCASE 2021 challenge, regarding sound event detection, are described in detail. These models are closely related to the baseline provided for this problem, as they are essentially convolutional recurrent neural networks trained in a mean teacher setting to deal with the heterogeneous annotation of the supplied data. However, the time resolution of the predictions was adapted to deal with the fact that these systems are evaluated using two intersection-based metrics involving different needs in terms of temporal localization. This was done by optimizing the pooling operations. For the first of the defined evaluation scenarios, imposing relatively strict requirements on the temporal localization accuracy, our best model achieved a PSDS score of 0.3609 on the validation data. This is only marginally better than the performance obtained by the baseline system (0.342): The amount of pooling in the baseline network already turned out to be optimal, and thus, no substantial changes were made, explaining this result. For the second evaluation scenario, imposing relatively lax restrictions on the localization accuracy, our best-performing system achieved a PSDS score of 0.7312 on the validation data. This is significantly better than the performance obtained by the baseline model (0.527), which can effectively be attributed to the changes that were applied to the pooling operations of the network.
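The report's central knob is the amount of time pooling, which fixes the granularity of the frame-level predictions. A back-of-the-envelope sketch of that relationship, with illustrative numbers rather than the submission's actual configuration:

```python
# How pooling factors set the temporal resolution of a CRNN's predictions.
def prediction_resolution_ms(hop_ms, time_pool_factors):
    """hop_ms: STFT hop size in ms; time_pool_factors: per-block time pooling."""
    total_pool = 1
    for factor in time_pool_factors:
        total_pool *= factor
    return hop_ms * total_pool

# With a 16 ms hop and 2x2x1 time pooling, predictions fall every 64 ms;
# dropping one pooling stage halves that to 32 ms (finer localization for PSDS-1).
print(prediction_resolution_ms(16, [2, 2, 1]))  # 64
```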
System characteristics
Convolution-Augmented Conformer For Sound Event Detection
Chen, YuHang
Royal Flush, 18, Tongshun Street, Wuchang Street, Yuhang District, Hangzhou 310000, China
HangYuChen_Royal_task4_SED_1 HangYuChen_Royal_task4_SED_2
Abstract
In this technical report, we describe our submission system for DCASE2021 Task4: sound event detection and separation in domestic environments. Our model employs conformer blocks, which combine self-attention and depth-wise convolution networks, to efficiently capture both the global and the local context of an audio feature sequence. In addition to this novel architecture, we further improve the performance by utilizing the mean teacher semi-supervised learning technique and data augmentation for each sound event class. We demonstrate that the proposed method achieves PSDS-1 and PSDS-2 scores of 34% and 55.7% on the validation set, outperforming the baseline scores.
System characteristics
Multi-Resolution Mean Teacher For DCASE 2021 Task 4
de Benito-Gorron, Diego and Segovia, Sergio and Ramos, Daniel and T. Toledano, Doroteo
AUDIAS Research Group, Universidad Autónoma de Madrid, Calle Francisco Tomás y Valiente, 11, 28049 Madrid, Spain
deBenito_AUDIAS_task4_SED_1 deBenito_AUDIAS_task4_SED_2 deBenito_AUDIAS_task4_SED_3 deBenito_AUDIAS_task4_SED_4
Abstract
This technical report describes our participation in DCASE 2021 Task 4: Sound event detection and separation in domestic environments. Aiming to take advantage of the different lengths and spectral characteristics of each target category, we follow the multiresolution feature extraction approach that we proposed for last year’s edition. It is found that each one of the proposed Polyphonic Sound Detection Score (PSDS) scenarios benefits from either a higher temporal resolution or a higher frequency resolution. Furthermore, combining several time-frequency resolutions via model fusion is able to improve the PSDS results in both scenarios.
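The multi-resolution idea can be sketched as extracting log-mel features under several time-frequency trade-offs and fusing the per-resolution models' outputs; the parameter pairs below are illustrative stand-ins, not the submission's configuration.

```python
# Sketch: log-mel features at several time-frequency resolutions.
import librosa

def multi_resolution_logmels(wav, sr=16000):
    feats = []
    # Larger n_fft -> finer frequency resolution; smaller hop -> finer time resolution.
    for n_fft, hop in [(2048, 256), (1024, 160), (512, 128)]:
        mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_fft=n_fft,
                                             hop_length=hop, n_mels=128)
        feats.append(librosa.power_to_db(mel))
    # One model is trained per resolution; their frame-level scores are then fused.
    return feats
```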
System characteristics
The Smallrice Submission To The Dcase2021 Task 4 Challenge: A Lightweight Approach For Semi-Supervised Sound Event Detection With Unsupervised Data Augmentation
Dinkel, Heinrich and Cai, Xinyu and Yan, Zhiyong and Wang, Yongqing and Zhang, Junbo and Wang, Yujun
Xiaomi Corporation, Beijing, China
Cai_SMALLRICE_task4_SED_1 Cai_SMALLRICE_task4_SED_2 Cai_SMALLRICE_task4_SED_3 Cai_SMALLRICE_task4_SED_4
Abstract
This paper describes our submission to the DCASE 2021 challenge. Unlike the baseline and most other approaches, our work focuses on training a lightweight, well-performing model that can be used in real-world applications. Compared to the baseline, our model contains only 600k parameters (15%), resulting in a size of 2.7 MB on disk, making it viable for low-resource devices such as mobile phones. Our model is trained using unsupervised data augmentation as its consistency criterion, which we show can achieve performance competitive with the more common mean teacher paradigm. On the validation set, our best single model achieves 36.91 PSDS-1 and 57.17 PSDS-2, outperforming the baseline by 2.7 and 5.0 absolute points, respectively. Notably, our approach achieves an Event-F1 score of 39.29 on the development set without post-processing. The best submitted ensemble system, a 4-way fusion, achieves a PSDS-1 of 38.23 and a PSDS-2 of 62.29 on the validation dataset.
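The consistency criterion the abstract mentions can be sketched as follows: the model's prediction on a clean clip serves as the target for its prediction on an augmented view of the same unlabeled clip. A generic formulation under that assumption, not the paper's exact recipe:

```python
# Sketch of an unsupervised-data-augmentation (UDA) style consistency loss.
import torch
import torch.nn.functional as F

def uda_consistency_loss(model, x_clean, x_augmented):
    # The prediction on the clean view is treated as a fixed target.
    with torch.no_grad():
        target = torch.sigmoid(model(x_clean))
    pred = torch.sigmoid(model(x_augmented))
    # Penalize disagreement between the two views of the same unlabeled clip.
    return F.mse_loss(pred, target)
```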
System characteristics
Self-Trained Audio Tagging And Sound Event Detection In Domestic Environments
Ebbers, Janek and Haeb-Umbach, Reinhold
Paderborn University, Department of Communications Engineering, Paderborn, Germany
Abstract
In this report we present our system for the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments. Our presented solution is an advancement of our system used in the previous edition of the task. We use our previously proposed forward-backward convolutional recurrent neural network (FBCRNN) for tagging and pseudo-labeling, and tag-conditioned sound event detection (SED) models which are trained using the strong pseudo-labels provided by the FBCRNN. Our advancement over our previous model is threefold. Firstly, we introduce a strong label loss in the objective of the FBCRNN to take advantage of the strongly labeled synthetic data during training, which leads to both better tagging and detection performance. Secondly, we perform multiple iterations of self-training for both the FBCRNN and the tag-conditioned SED models. Thirdly, while we used only tag-conditioned CNNs as our SED model in the last edition, here we explore more sophisticated SED model architectures, namely tag-conditioned bidirectional CRNNs and tag-conditioned bidirectional convolutional transformer neural networks (CTNNs), and combine them. With scenario- and class-dependent tuning of median filter lengths for post-processing, our final SED model, consisting of 6 submodels (2 of each architecture), achieves validation polyphonic sound event detection scores (PSDS) of 0.454 for scenario 1 and 0.758 for scenario 2 as well as a collar-based F1-score of 0.602, outperforming the baselines and our model from the last edition by far. Source code will be made publicly available at https://github.com/fgnt/pb_sed.
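The scenario- and class-dependent median filtering mentioned at the end is straightforward to sketch: each class's posterior track gets its own filter length, short for brief events and long for sustained ones. The filter lengths below are placeholders, not the tuned values.

```python
# Sketch: class-dependent median filtering of frame-level posteriors.
import numpy as np
from scipy.ndimage import median_filter

def classwise_median_filter(frame_probs, filter_lengths):
    """frame_probs: (time, n_classes); filter_lengths: one length (in frames) per class."""
    smoothed = np.empty_like(frame_probs)
    for k, length in enumerate(filter_lengths):
        smoothed[:, k] = median_filter(frame_probs[:, k], size=length)
    return smoothed

# e.g. a short filter for brief events (Dishes) and a long one for sustained
# ones (Vacuum_cleaner): classwise_median_filter(probs, [5, ..., 41])
```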
System characteristics
Improved Pseudo-Labeling Method For Semi-Supervised Sound Event Detection
Gong, Yaguang and Li, Changlong and Wang, Xintian and Ma, Lu and Yang, Song and Wu, Zhongqin
TAL Education Group, China
Gong_TAL_task4_SED_1 Gong_TAL_task4_SED_2 Gong_TAL_task4_SED_3
Abstract
This report illustrates a framework for DCASE2021 Task4 - Sound Event Detection. The proposed framework is built on the pseudo-labeling method widely applied in semi-supervised learning (SSL) tasks. The proposed method synthesizes weak pseudo-labels for the large amount of unlabeled data by utilizing the model's predictions on weakly augmented spectrograms. The weak pseudo-labels are then used as supervision for strongly augmented spectrograms of the same samples. Alongside this main contribution, this work introduces data augmentation techniques including random frequency masking and time shifting, training techniques such as a class-specific weighted loss, and model ensemble techniques. Experimental results demonstrate that the proposed method achieves PSDS of 0.407/0.653 (scenario1/scenario2) on the validation set, clearly outperforming the baseline scores of 0.342/0.527.
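The core of the method reads like a FixMatch-style scheme for audio tagging: confident predictions on a weakly augmented spectrogram become pseudo-labels for the strongly augmented view. A minimal sketch under that reading; the confidence thresholds are assumptions, not the paper's values.

```python
# Sketch: weak pseudo-labels from a weakly augmented view supervise the
# strongly augmented view of the same unlabeled clip.
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_weak_aug, x_strong_aug, pos_thr=0.9, neg_thr=0.1):
    with torch.no_grad():
        probs = torch.sigmoid(model(x_weak_aug))      # clip-level tag probabilities
        pseudo = (probs >= pos_thr).float()           # confident positives become labels
        confident = ((probs >= pos_thr) | (probs <= neg_thr)).float()
    logits = model(x_strong_aug)
    loss = F.binary_cross_entropy_with_logits(logits, pseudo, reduction="none")
    return (loss * confident).mean()                  # uncertain classes are ignored
```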
System characteristics
Task Aware Sound Event Detection Based On Semi-Supervised CRNN With Skip Connections: DCASE 2021 Challenge, Task 4
Hafsati, Mohammed and Bentounes, Kamil
1Beijing Kuaishou Technology Co., Ltd, China 2The State Key Laboratory of Automotive Safety and Energy, Tsinghua University, Beijing, China
Hafsati_TUITO_task4_SED_1 Hafsati_TUITO_task4_SED_2 Hafsati_TUITO_task4_SED_3 Hafsati_TUITO_task4_SED_4
Abstract
Sound Event Detection (SED) is the task of classifying the different sounds occurring in a recorded environment together with their onset and offset times. This is the primary goal of the fourth task of the DCASE challenge, which provides strongly labeled, partially labeled, and unlabeled datasets. In this paper, we describe our submitted approach for this challenge. Our neural network is based on sequential convolutional neural networks with skip connections over some layers, followed by a recurrent neural network. To overcome the challenge of using unlabeled data, we use semi-supervised learning, and to improve performance further, we use data augmentation techniques. With our model, we can slightly outperform the baseline with fewer filters and therefore fewer parameters; with a similar number of parameters to the baseline, we significantly outperform it.
System characteristics
Self-Training With Noisy Student Model And Semi-Supervised Loss Function For DCASE 2021 Challenge Task 4
Kim, Nam Kyun 1 and Kim, Hong Kook 1,2
1School of Electrical Engineering and Computer Science, 123 Cheomdangwagi-ro, Gwangju 61005, Republic of Korea 2 AI Graduate School Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Gwangju 61005, Republic of Korea
Kim_AiTeR_GIST_SED_1 Kim_AiTeR_GIST_SED_2 Kim_AiTeR_GIST_SED_3 Kim_AiTeR_GIST_SED_4
Abstract
This report proposes a polyphonic sound event detection (SED) method for the DCASE 2021 Challenge Task 4. The proposed SED model consists of two stages: a mean-teacher model that provides target labels for weakly labeled or unlabeled data, and a self-training-based noisy student model that predicts strong labels for sound events. The mean-teacher model, based on a residual convolutional recurrent neural network (RCRNN) for both the teacher and student models, is first trained using all the training data from a weakly labeled dataset, an unlabeled dataset, and a strongly labeled synthetic dataset. The trained mean-teacher model then predicts strong labels for the weakly labeled and unlabeled datasets, which are passed to the noisy student model in the second stage. The structure of the noisy student model is identical to that of the RCRNN-based student model in the first stage. It is self-trained with added feature noise, such as time-frequency shift, mixup, SpecAugment, and dropout-based model noise. In addition, a semi-supervised loss function is applied to train the noisy student model, which acts as label noise injection. The performance of the proposed SED model is evaluated on the validation set of the DCASE 2021 Challenge Task 4, and several ensemble models that combine five-fold validation models with different hyperparameters of the semi-supervised loss function are selected as our final models.
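The mean-teacher stage rests on one mechanism: the teacher's weights are an exponential moving average of the student's. A minimal sketch; the decay value is the common default rather than necessarily the submission's.

```python
# Sketch: EMA teacher update applied after each training step (mean-teacher framework).
import torch

@torch.no_grad()
def update_teacher(teacher, student, ema_decay=0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        # teacher = decay * teacher + (1 - decay) * student
        t_param.mul_(ema_decay).add_(s_param, alpha=1.0 - ema_decay)
```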
System characteristics
Sound Event Detection Based On Self-Supervised Learning Of Wav2vec 2.0
Koo, Hyejin1 and Park, Hyung-Min1 and Park, Jonghyeon2 and Oh, Myungwoo2
1Dept. of Electronic Engineering, Sogang University, Seoul 04107, South Korea, 2NAVER Corp. Gyeonggi-do 13561, South Korea
Koo_SGU_task4_SED_1 Koo_SGU_task4_SED_2 Koo_SGU_task4_SED_3
Abstract
In this report, we present our system for DCASE2021 Task4: Sound Event Detection (SED) and Separation in Domestic Environments. This task evaluates how to capture the information needed for SED from a relatively small amount of labeled data together with a large amount of unlabeled data. We apply wav2vec 2.0 to SED for the first time. Even though wav2vec 2.0 pre-training on the DCASE2021 Task4 dataset takes a long time to learn audio representations, the presented model achieved higher intersection F1 and PSDS2 scores. The baseline's mean-teacher model and dataset were used to compare wav2vec 2.0 and log-mel features. Under the same conditions, we show how wav2vec 2.0 features perform on the SED task.
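Feeding wav2vec 2.0 features into an SED front end can be sketched with the Hugging Face transformers API; the public checkpoint below is a stand-in, since the authors pre-trained on the Task 4 data themselves.

```python
# Sketch: wav2vec 2.0 frame-level features as a replacement for log-mel input.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

def wav2vec2_features(waveform_16k):
    inputs = extractor(waveform_16k, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**inputs)
    # (1, frames, 768): roughly one frame every 20 ms, fed to the SED classifier.
    return out.last_hidden_state
```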
System characteristics
Adaptive Focal Loss With Data Augmentation For Semi-Supervised Sound Event Detection
Liang, Yunhao and Tang, Tiantian and Long, Yanhua
Shanghai Normal University, Shanghai, China
Liang_SHNU_task4_SSep_SED_1 Liang_SHNU_task4_SSep_SED_2 Liang_SHNU_task4_SSep_SED_3 Liang_SHNU_task4_SED_4
Abstract
In this technical report, we describe our submission system for DCASE2021 Task4: sound event detection and separation in domestic environments. In our submissions, two different deep models are investigated. The first is a mean-teacher model with a convolutional recurrent neural network (CRNN). The second is a joint framework with an adaptive focal loss based on the Guided Learning architecture. To improve the performance of the system, we use methods such as SpecAugment data augmentation, the adaptive focal loss, and event-specific post-processing. To combine sound separation with sound event detection, we train models on the outputs of the sound separation baseline system. We demonstrate that the proposed method achieves an event-based macro F1 score of 44.4%, 0.428 in PSDS1 and 0.736 in PSDS2 on the validation set.
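The adaptive focal loss builds on the standard binary focal loss, which down-weights easy examples so training concentrates on hard ones; making the focusing parameter adaptive is the submission's extension. A sketch of the standard form with the usual default constants:

```python
# Sketch: binary focal loss for multi-label SED (standard form, default constants).
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # prob. of the true label
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t)^gamma shrinks the loss of well-classified frames toward zero.
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```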
System characteristics
Combined Sound Event Detection And Sound Event Separation Networks For DCASE 2021 Task 4
Liu, Gang and Liu, Zhuang Zhuang and Fang, Jun Yan and Liu, Yi and Zhou, Ming Kun
Beijing University of Posts and Telecommunications, Beijing,China
Liu_BUPT_task4_SED_1 Liu_BUPT_task4_SED_2 Liu_BUPT_task4_SED_3 Liu_BUPT_task4_SED_4 Liu_BUPT_task4_SS_SED_1 Liu_BUPT_task4_SS_SED_2
Abstract
Audio tagging aims to assign one or more labels to an audio clip. In this paper, we present the solutions applied in our submission for DCASE2021 Task4. The target of the systems is to provide not only the event class but also the event time localization, given that multiple events can be present in an audio recording. We present a convolutional recurrent neural network (CRNN) with two recurrent neural network (RNN) classifiers sharing the same preprocessing convolutional neural network (CNN). Both recurrent networks perform audio tagging: one processes the input audio signal in the forward direction and the other in the backward direction. We also use a spatial attention layer called FcaNet to improve our system. In addition, we built an independent system for sound event separation.
System characteristics
Integrating Advantages Of Recurrent And Transformer Structures For Sound Event Detection In Multiple Scenarios
Lu, Rui 1 and Hu, Wenzheng 2 and Duan, Zhiyao 1 and Liu, Ji 1
1Beijing Kuaishou Technology Co., Ltd, China 2The State Key Laboratory of Automotive Safety and Energy, Tsinghua University, Beijing, China
lu_kwai_task4_SED_1 lu_kwai_task4_SED_2 lu_kwai_task4_SED_3 lu_kwai_task4_SED_4
Abstract
In this technical report, we detail our submitted systems for task4 of DCASE2021: Sound Event Detection and Separation in Domestic Environments. Our systems exploit both recurrent and transformer structures to model the complicated dynamics in real-life domestic audio data. In addition to prevalent techniques such as semi-supervised mean-teacher learning, data augmentation and ensembling, we find that different models behave differently under the two scenarios, which emphasize different system properties. By integrating the advantages of both the recurrent and transformer structures, our proposed systems achieve an overall polyphonic sound event detection score (PSDS-scenario1 + PSDS-scenario2) of 1.171 on the hold-out test set of the development dataset, outperforming the baseline system by 34.8%.
System characteristics
Convolutional Network With Conformer For Semi-Supervised Sound Event Detection
Na, Tong and Zhang, Qinyi
Beijing University of Posts and Telecommunications, Beijing,China
Na_BUPT_task4_SED_1
Abstract
In this technical report, we describe our system submission for DCASE 2021 Task 4. Our model employs a convolutional network in conjunction with conformer blocks and utilizes the Mean-Teacher semi-supervised learning technique for further improvement.
System characteristics
Heavily Augmented Sound Event Detection utilizing Weak Predictions
Nam, Hyeonuk and Ko, Byeong-Yun and Lee, Gyeong-Tae and Kim, Seong-Hu and Jung, Won-Ho and Choi, Sang-Min and Park, Yong-Hwa
Korea Advanced Institute of Science and Technology, Department of Mechanical Engineering, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, South Korea
Nam_KAIST_task4_SED_1 Nam_KAIST_task4_SED_2 Nam_KAIST_task4_SED_3 Nam_KAIST_task4_SED_4
Abstract
The performance of Sound Event Detection (SED) systems is greatly limited by the difficulty of generating large strongly labeled datasets. In this work, we used two main approaches to overcome the lack of strongly labeled data. First, we applied heavy data augmentation to the input features, using not only conventional methods from the speech/audio domain but also our proposed method named FilterAugment. Second, we propose two methods that utilize weak predictions to enhance weakly supervised SED performance. As a result, we obtained a best PSDS1 of 0.4336 and a best PSDS2 of 0.8161 on the DESED real validation dataset.
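One plausible reading of utilizing weak predictions (listed as "weak prediction masking" in the system characteristics) is gating the frame-level posteriors with the clip-level ones, so a class is only detected in clips where it is tagged. A hedged sketch of that idea, not the authors' exact rule:

```python
# Sketch: clip-level (weak) probabilities mask frame-level (strong) probabilities.
import torch

def mask_strong_with_weak(strong_probs, weak_probs, weak_threshold=0.5):
    """strong_probs: (batch, n_classes, time); weak_probs: (batch, n_classes)."""
    mask = (weak_probs > weak_threshold).float().unsqueeze(-1)  # (batch, n_classes, 1)
    return strong_probs * mask  # zero out events in clips not tagged with the class
```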
System characteristics
Domain-Adapted Sound Event Detection System With Auxiliary Foreground-Background Classifier
Olvera, Michel1 and Vincent, Emmanuel1 and Gasso, Gilles2
1Université de Lorraine, Inria, Loria, F-54000 Nancy, France, 2LITIS EA 4108, Université & INSA Rouen Normandie, 76800 Saint-Étienne du Rouvray, France
Olvera_INRIA_task4_SED_1 Olvera_INRIA_task4_SED_2
Abstract
In this technical report, we propose a sound event detection system for the DCASE 2021 task 4 challenge, which consists of a foreground-background classification branch that is jointly trained with the baseline architecture. Furthermore, to account for the mismatch between synthetic annotated data and real unlabeled data used for training, we also propose a frame-level domain adaptation scheme to improve detection performance over real soundscapes. We show that these improvements to the baseline method help in the generalization of the sound event detection task.
System characteristics
Sound Event Detection with Cross-Referencing Self-Training
Park, Sangwook1 and Choi, Woohyun2 and Elhilali, Mounya1
1Department of Electrical and Computer Engineering, Johns Hopkins University, United States, 2LG Electronics
Park_JHU_task4_SED_1 Park_JHU_task4_SED_2 Park_JHU_task4_SED_3 Park_JHU_task4_SED_4
Abstract
This report describes a sound event detection method submitted to the DCASE2021 challenge, task 4. In this approach, we design a residual convolutional recurrent neural network and train it with a cross-referencing self-training approach that leverages extensive unlabeled data in combination with labeled data. The approach takes advantage of semi-supervised training using pseudo-labels from a balanced student-teacher model, and it outperforms the DCASE2021 challenge baseline in terms of the Polyphonic Sound Detection Score. Additionally, the proposed network yields more accurate predictions in class-wise collar-based F1, compared to the baseline.
System characteristics
Sound Event Detection Using Metric Learning And Focal Loss For DCASE 2021 Task 4
Tian, Gangyi1 and Huang, Yuxin1,2 and Ye, Zhirong1,2 and Ma, Shuo1,2 and Wang, Xiangdong1 and Liu, Hong1 and Qian, Yueliang1 and Tao, Rui3 and Yan, Long3 and Ouchi, Kazushige3 and Ebbers, Janek4 and Haeb-Umbach, Reinhold4
1Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, 2University of Chinese Academy of Sciences, Beijing, China, 3Toshiba China R&D Center, Beijing, China, Beijing University of Posts and Telecommunications, Beijing, China, 4Paderborn University, Germany
Tian_ICT-TOSHIBA_task4_SED_1 Tian_ICT-TOSHIBA_task4_SED_2 Tian_ICT-TOSHIBA_task4_SED_3 Tian_ICT-TOSHIBA_task4_SED_4
Abstract
In this technical report, we describe our system submission for DCASE 2021 Task 4. Our model employs a convolutional network in conjunction with conformer blocks and utilizes the Mean-Teacher semi-supervised learning technique for further improvement.
System characteristics
Training Sound Event Detection On A Heterogeneous Dataset
Turpault, Nicolas and Serizel, Romain
Université de Lorraine, CNRS, Inria, Loria, France
Abstract
Training a sound event detection algorithm on a heterogeneous dataset including both recorded and synthetic soundscapes that can have various labeling granularity is a non-trivial task that can lead to systems requiring several technical choices. These technical choices are often passed from one system to another without being questioned. We propose to perform a detailed analysis of DCASE 2020 task 4 sound event detection baseline with regards to several aspects such as the type of data used for training, the parameters of the mean-teacher or the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as default to replicate other approaches are shown to be sub-optimal.
System characteristics
Input | mono |
Classifier | CRNN |
Acoustic features | log-mel energies |
Decision making | p-norm |
Improving Sound Event Detection In Domestic Environments Using Sound Separation
Turpault, Nicolas1 and Wisdom, Scott2 and Erdogan, Hakan2 and Hershey, John R.2 and Serizel, Romain1 and Fonseca, Eduardo3 and Seetharaman, Prem4 and Salamon, Justin5
1Université de Lorraine, CNRS, Inria, Loria, Nancy, France, 2Google Research, AI Perception, Cambridge, United States, 3Music Technology Group, Universitat Pompeu Fabra, Barcelona, 4Interactive Audio Lab, Northwestern University, Evanston, United States, 5Adobe Research, San Francisco, United States
DCASE2020_SS_SED_baseline_system
Abstract
Performing sound event detection on real-world recordings often implies dealing with overlapping target sound events and non-target sounds, also referred to as interference or noise. Until now these problems were mainly tackled at the classifier level. We propose to use sound separation as a pre-processing stage for sound event detection. In this paper we start from a sound separation model trained on the Free Universal Sound Separation dataset and the DCASE 2020 task 4 sound event detection baseline. We explore different methods of combining separated sound sources and the original mixture within the sound event detection. Furthermore, we investigate the impact of adapting the universal sound separation model to the sound event detection data in terms of both separation and sound event detection performance.
System characteristics
Input | mono |
Classifier | CRNN |
Acoustic features | log-mel energies |
Decision making | p-norm |
CHT+NSYSU Sound Event Detection System With Multiscale Channel Attention And Multiple Consistency Training For DCASE 2021 Task 4
Wang, Yih-Wen 1 and Chen, Chia-Ping 1 and Lu, Chung-Li 2 and Chan, Bo-Cheng 1
1National Sun Yat-Sen University, Taiwan 2 Chunghwa Telecom Laboratories, Taiwan
Wang_NSYSU_task4_SED_1 Wang_NSYSU_task4_SED_2 Wang_NSYSU_task4_SED_3 Wang_NSYSU_task4_SED_4
Abstract
In this technical report, we describe our submission system for DCASE 2021 Task4: sound event detection and separation in domestic environments. The proposed system is based on the mean-teacher framework of semi-supervised learning and on CRNN and CNN-Transformer neural networks. We employ interpolation (ICT), shift (SCT), and clip-level (CCT) consistency training to enhance generalization and representation. A multiscale CNN block is applied to extract diverse features and mitigate the influence of event length diversity on the network. An efficient channel attention network (ECA-Net) and exponential softmax pooling enable the model to obtain definite sound event predictions. To further improve the performance, we use data augmentation including mixup, time shift, and time-frequency masks. Our ensemble system achieves a PSDS-scenario1 of 40.72% and a PSDS-scenario2 of 80.80% on the validation set, significantly outperforming the baseline scores of 34.2% and 52.7%, respectively.
System characteristics
Adaptive Memory Controlled Self Attention For Sound Event Detection
Yao, Yu and Song, Xiyu
Guilin University of Electronic Technology, Guilin, 541004, Guangxi, China
Yao_GUET_task4_SED_1 Yao_GUET_task4_SED_2 Yao_GUET_task4_SED_3
Abstract
Sound event detection is the task of detecting the time stamps and the class of sound events occurring in a recording. Real-life sound events overlap in recordings, and their durations vary far more than in synthetic data, making them even harder to recognize. In this paper we investigate how well an attention mechanism can improve real-life sound event detection (SED). Convolutional Recurrent Neural Networks (CRNN) have recently shown improved performance over established methods in various sound recognition tasks. In our work we use a CRNN to extract hidden-state feature representations; a self-attention mechanism is then introduced to memorize long-range dependencies of the features the CRNN extracts. Furthermore, we propose adaptive memory-controlled self-attention to explicitly compute the relations between time steps in the audio representation embedding. The proposed method is evaluated on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 challenge Task4 dataset, which contains overlapping sound events from real life and from synthesis. We develop a self-attention SED model that uses a memory-controlled strategy with a heuristically chosen fixed attention width, achieving a PSDS-scenario2 of 60.72% on average, indicating that the attention mechanism is able to improve sound event detection. We show that the proposed adaptive memory-controlled model reaches the same level of results as the fixed-attention-width memory-controlled model.
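Memory-controlled self-attention with a fixed width amounts to masking the attention scores so each frame only attends within a window; the adaptive variant would learn that width. A generic sketch of the fixed-width mask, not the authors' code:

```python
# Sketch: fixed-width attention mask for memory-controlled self-attention.
import torch

def attention_width_mask(n_frames, width):
    idx = torch.arange(n_frames)
    # True where |i - j| <= width: frame i may only attend to nearby frames.
    return (idx[None, :] - idx[:, None]).abs() <= width

# Applied to raw attention scores before the softmax, e.g.:
# scores = scores.masked_fill(~attention_width_mask(T, 20), float("-inf"))
```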
System characteristics
Semi-Supervised Sound Event Detection Using Multi-Scale Convolutional Recurrent Neural Network And Weighted Pooling
Yu, Dongchi and Cai, Xichang and Liu, Duxin and Liu, Zihan
North China University of Technology, Beijing, China
Yu_NCUT_task4_SED_1 Yu_NCUT_task4_SED_2
Abstract
In this technical report, we describe our submission system for DCASE2021 Task4: sound event detection and separation in domestic environments. We mainly focus on the scenario of recognizing sound events without source separation. Since the durations of different sound events can be quite different, our model employs a multi-scale convolutional recurrent network to extract multi-scale features of an audio sequence. To utilize weakly labeled training data more efficiently, a global weighted pooling strategy is introduced to aggregate frame-level predictions into a clip-level prediction. Additionally, our model uses the mean teacher semi-supervised learning technique and data augmentation. We demonstrate that the proposed method achieves a PSDS2 score of 0.61 and an event-based macro F1 score of 42.15% on the validation set.
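The global weighted pooling can be illustrated by the common "linear softmax" variant, in which each frame weights itself by its own probability, so confident frames dominate the clip-level score. One plausible instance of the idea, not necessarily the submission's exact weighting:

```python
# Sketch: weighted pooling from frame-level to clip-level predictions.
import torch

def linear_softmax_pool(frame_probs, eps=1e-8):
    """frame_probs: (batch, time, n_classes) -> clip-level probs (batch, n_classes)."""
    weights = frame_probs                    # each frame is weighted by its own prob.
    return (frame_probs * weights).sum(dim=1) / (weights.sum(dim=1) + eps)
```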
System characteristics
Zheng USTC Team’s Submission For DCASE2021 Task4 – Semi-Supervised Sound Event Detection
Zheng, Xu and Chen, Han and Song, Yan
National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, China.
Zheng_USTC_task4_SED_1 Zheng_USTC_task4_SED_2 Zheng_USTC_task4_SED_3 Zheng_USTC_task4_SED_4
Abstract
In this technical report, we present our submitted system for DCASE2021 Task4: sound event detection and separation in domestic environments. Three main techniques are applied to improve the performance of the official baseline system using both synthetic and real data (weakly labeled and unlabeled). Firstly, to improve the localization ability of the CRNN model, we propose to use the selective kernel (SK) unit. By stacking SK units, each neuron can adaptively adjust its receptive field for both short- and long-duration events. Secondly, based on the observation that detection outputs are dominated by high-confidence predictions (lower than 0.1 or higher than 0.9), we propose to use a soft detection output by setting a proper temperature parameter in the sigmoid, which can effectively improve the PSDS2 score. Thirdly, several data augmentation techniques and score fusion mechanisms are applied to improve the stability and robustness of the system. Experiments on the DCASE2021 Task4 validation dataset demonstrate the effectiveness of these techniques: PSDS scores of 0.45 and 0.78 are achieved for scenario1 and scenario2 respectively, outperforming the baseline results of 0.34 and 0.53.
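The second technique is easy to make concrete: dividing the logits by a temperature before the sigmoid pulls the near-saturated scores away from 0 and 1, producing the softer output curve that benefits PSDS2. The temperature value below is illustrative, not the tuned parameter:

```python
# Sketch: soft detection outputs via a temperature in the sigmoid.
import torch

def soft_detection_output(logits, temperature=2.0):
    # T > 1 flattens the curve, so fewer posteriors saturate near 0 or 1.
    return torch.sigmoid(logits / temperature)
```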
System characteristics
Multi-Scale Convolution Based Attention Network For Semi-Supervised Sound Event Detection
Zhu, Xiujuan1,3 and Sun, Xinghao1,3 and Hu, Ying1,3 and Chen, Yadong1,3 and Qiu, Wenbo1,3 and Tang, Yuwu1,3 and He, Liang1,2 and Xu, Minqiang4
1School of Information Science and Engineering, Xinjiang University, Urumqi, China, 2Department of Electronic Engineering, Tsinghua University, China, 3Key Laboratory of Signal Detection and Processing in Xinjiang, China, 4SpeakIn Technology
Zhu_AIAL-XJU_task4_SED_1 Zhu_AIAL-XJU_task4_SED_2
Abstract
Deep Convolutional Recurrent Neural Networks (CRNN) have drawn great attention in sound event detection (SED). Because the variation in duration of acoustic events is relatively large, it is critically important to design a good operator that can extract multi-scale features efficiently for SED. However, most CRNN-based models lack discriminative ability for different types of acoustic events and treat them equally, which limits the representational capacity of the models. Inspired by this, we propose a Multi-Scale Convolution based Attention Network (MSCA). Multi-scale convolution yields a more effective feature representation, naturally learning coarse-to-fine multi-scale features that help the model recognize different sound events. In addition, a channel-wise attention module is designed, which adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels.
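The channel-wise attention module described here follows the squeeze-and-excitation pattern: global pooling summarizes each channel, a small bottleneck models inter-channel dependencies, and the result rescales the feature maps. A minimal sketch with illustrative sizes, not the authors' exact module:

```python
# Sketch: squeeze-and-excitation style channel-wise attention.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (batch, channels, freq, time)
        squeeze = x.mean(dim=(2, 3))         # global average pool per channel
        scale = self.fc(squeeze)             # channel weights in (0, 1)
        return x * scale[:, :, None, None]   # recalibrate the feature maps
```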