Task description
This challenge focuses on sound event detection in a few-shot learning setting for animal (mammal and bird) vocalisations. Participants are expected to create a method that extracts information from five exemplar vocalisations (shots) of mammals or birds and then detects and classifies sounds in field recordings. The main objective is to find reliable algorithms capable of dealing with data sparsity, class imbalance, and noisy/busy environments.
A more detailed task description can be found on the task description page.
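Systems in the tables below are ranked by event-based F-score. As a rough illustration only (not the official scorer, and the matching criterion here — an onset tolerance in seconds — is an assumption for the sketch), predicted events can be greedily matched to reference events and scored like this:

```python
def event_f_score(pred, ref, tol=0.5):
    """Toy event-based F-score. `pred` and `ref` are lists of
    (onset, offset) tuples in seconds. A prediction counts as a true
    positive if its onset lies within `tol` seconds of a still-unmatched
    reference onset (greedy, first-come matching)."""
    matched = set()
    tp = 0
    for p_on, _ in pred:
        for i, (r_on, _) in enumerate(ref):
            if i not in matched and abs(p_on - r_on) <= tol:
                matched.add(i)
                tp += 1
                break
    fp = len(pred) - tp  # predictions with no matching reference event
    fn = len(ref) - tp   # reference events never matched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, one of two predictions matching one of two reference events gives precision = recall = 0.5 and therefore an F-score of 0.5.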
Systems ranking
Rank | Submission code | Submission name | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Event-based F-score (Validation dataset)
---|---|---|---|---|---
Baseline_TempMatch_task5_1 | Baseline Template Matching | 12.3 (11.5 - 12.8) | 3.4 | ||
Baseline_PROTO_task5_1 | Baseline Prototypical Network | 5.3 ( - ) | |||
Wu_SHNU_task5_1 | Continual_learning | Wu2022 | 40.9 (40.5 - 41.3) | 53.9 | |
Zhang_CQU_task5_1 | Zhang_CQU_task5_1 | Zhang2022 | 1.2 (0.9 - 1.3) | 46.5 | |
Zhang_CQU_task5_2 | Zhang_CQU_task5_2 | Zhang2022 | 0.9 (0.0 - 1.0) | 45.5 | |
Zhang_CQU_task5_3 | Zhang_CQU_task5_3 | Zhang2022 | 1.9 (1.0 - 2.0) | 44.2 | |
Zhang_CQU_task5_4 | Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | 44.2 | |
Kang_ET_task5_1 | FewShot_using_good_embedding_model | Kang2022 | 2.4 (2.4 - 2.4) | ||
Kang_ET_task5_2 | FewShot_using_good_embedding_model | Kang2022 | 2.8 (2.8 - 2.9) | ||
Hertkorn_ZF_task5_1 | ZF_CNN1 | Hertkorn2022 | 43.4 (42.9 - 43.8) | 60.6 | |
Hertkorn_ZF_task5_2 | ZF_CNN2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | 61.8 | |
Hertkorn_ZF_task5_3 | ZF_CNN3 | Hertkorn2022 | 41.4 (41.9 - 42.3) | 67.9 | |
Hertkorn_ZF_task5_4 | ZF_CNN4 | Hertkorn2022 | 33.8 (32.4 - 34.6) | 60.5 | |
Zou_PKU_task5_1 | TI_1 | Yang2022 | 19.2 (18.9 - 19.5) | 52.0 | |
Zou_PKU_task5_2 | TI_2 | Yang2022 | 18.7 (18.4 - 19.0) | 52.0 | |
Zou_PKU_task5_3 | TI_3 | Yang2022 | 18.9 (18.6 - 19.2) | 52.0 | |
Zou_PKU_task5_4 | TI_4 | Yang2022 | 15.8 (15.4 - 16.1) | 52.0 | |
Tan_WHU_task5_1 | Knowledge transfer 75% training 10 iteration adaptive (8) | Tan2022 | 8.1 (7.3 - 8.5) | 52.4 |
Tan_WHU_task5_2 | Knowledge transfer 90% training 15 iteration | Tan2022 | 16.9 (16.4 - 17.2) | 53.9 | |
Tan_WHU_task5_3 | Knowledge Transfer 90 training (4) | Tan2022 | 17.1 (16.7 - 17.4) | 54.9 | |
Tan_WHU_task5_4 | Knowledge Transfer 90 training adaptive (4) | Tan2022 | 17.2 (16.8 - 17.6) | 54.5 | |
Liu_BIT-SRCB_task5_1 | TI-PN ensemble | Liu2022 | 44.1 (43.6 - 44.5) | 61.2 | |
Liu_BIT-SRCB_task5_2 | TI-PN ensemble_2 | Liu2022 | 41.9 (41.6 - 42.2) | 63.3 | |
Liu_BIT-SRCB_task5_3 | TI_scalable | Liu2022 | 36.8 (36.5 - 37.2) | 43.5 | |
Liu_BIT-SRCB_task5_4 | pretrained TI-PN ensemble | Liu2022 | 44.3 (43.9 - 44.6) | 64.8 | |
Willbo_RISE_task5_1 | willbo_supervised_1 | Willbo2022 | 17.9 (17.6 - 18.2) | 51.4 | |
Willbo_RISE_task5_2 | willbo_supervised_2 | Willbo2022 | 20.4 (20.1 - 20.7) | 57.5 | |
Willbo_RISE_task5_3 | willbo_semi_1 | Willbo2022 | 20.2 (19.9 - 20.5) | 50.8 | |
Willbo_RISE_task5_4 | willbo_semi_2 | Willbo2022 | 21.7 (21.3 - 22.0) | 47.9 | |
ZGORZYNSKI_SRPOL_task5_1 | Siamese Network with fully connected head | Zgorzynski2022 | 28.1 (27.6 - 28.5) | 67.3 | |
ZGORZYNSKI_SRPOL_task5_2 | Siamese Network with fully connected head | Zgorzynski2022 | 16.3 (15.1 - 16.9) | 59.4 | |
ZGORZYNSKI_SRPOL_task5_3 | Siamese Network with fully connected head | Zgorzynski2022 | 29.9 (29.3 - 30.3) | 60.0 | |
ZGORZYNSKI_SRPOL_task5_4 | Siamese Network with fully connected head | Zgorzynski2022 | 33.2 (32.7 - 33.7) | 57.2 | |
Huang_SCUT_task5_1 | Transductive learning and modified central difference convolution | Huang2022 | 18.3 (18.0 - 18.6) | 54.6 | |
Martinsson_RISE_task5_1 | Adaptive prototypical ensemble | Martinsson2022 | 48.0 (47.5 - 48.4) | 60.0 | |
Martinsson_RISE_task5_2 | Adaptive prototypical ensemble | Martinsson2022 | 45.4 (44.9 - 45.9) | 30.6 | |
Martinsson_RISE_task5_3 | Adaptive prototypical ensemble | Martinsson2022 | 19.4 (18.6 - 20.0) | 44.6 | |
Martinsson_RISE_task5_4 | Adaptive prototypical ensemble | Martinsson2022 | 32.5 (31.7 - 33.1) | 13.3 | |
Liu_Surrey_task5_1 | Haohe_Liu_S1 | Liu2022a | 43.1 (42.7 - 43.4) | 58.5 | |
Liu_Surrey_task5_2 | Haohe_Liu_S2 | Liu2022a | 48.2 (48.5 - 48.9) | 50.0 | |
Liu_Surrey_task5_3 | Haohe_Liu_S3 | Liu2022a | 36.9 (36.5 - 37.2) | 40.7 | |
Liu_Surrey_task5_4 | Haohe_Liu_S4 | Liu2022a | 45.5 (45.8 - 46.2) | 60.2 | |
Li_QMUL_task5_1 | Prototypical Network with ResNet and SpecAugment | Li2022 | 15.5 (15.2 - 15.8) | 47.9 | |
Mariajohn_DSPC_task5_1 | Prototypical-1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | 43.9 | |
Du_NERCSLIP_task5_1 | Segment-level embedding learning | Du2022a | 36.5 (35.6 - 37.0) | 68.2 | |
Du_NERCSLIP_task5_2 | Frame-level embedding learning 1 | Du2022a | 60.2 (59.7 - 61.7) | 74.4 | |
Du_NERCSLIP_task5_3 | event filtering | Du2022a | 42.9 (42.4 - 43.4) | 53.4 | |
Du_NERCSLIP_task5_4 | Frame-level embedding learning 2 | Du2022a | 60.0 (58.5 - 61.5) | 74.4 |
Dataset wise metrics
Rank | Submission code | Submission name | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Event-based F-score (CHE dataset) | Event-based F-score (CT dataset) | Event-based F-score (MGE dataset) | Event-based F-score (MS dataset) | Event-based F-score (QU dataset) | Event-based F-score (DC dataset)
---|---|---|---|---|---|---|---|---|---|---
Baseline_TempMatch_task5_1 | Baseline Template Matching | 12.3 (11.5 - 12.8) | 21.1 | 7.1 | 44.1 | 8.0 | 9.7 | 35.0 | ||
Baseline_PROTO_task5_1 | Baseline Prototypical Network | 5.3 ( - ) | 42.6 | 8.0 | 3.8 | 11.6 | 1.6 | 40.1 | ||
Wu_SHNU_task5_1 | Continual_learning | Wu2022 | 40.9 (40.5 - 41.3) | 65.0 | 37.2 | 38.2 | 38.9 | 38.1 | 44.8 | |
Zhang_CQU_task5_1 | Zhang_CQU_task5_1 | Zhang2022 | 1.2 (0.9 - 1.3) | 30.3 | 24.6 | 5.8 | 1.1 | 0.3 | 25.4 | |
Zhang_CQU_task5_2 | Zhang_CQU_task5_2 | Zhang2022 | 0.9 (0.0 - 1.0) | 26.8 | 38.3 | 11.1 | 0.2 | 14.6 | 9.1 | |
Zhang_CQU_task5_3 | Zhang_CQU_task5_3 | Zhang2022 | 1.9 (1.0 - 2.0) | 29.2 | 26.0 | 55.6 | 0.4 | 15.8 | 17.5 | |
Zhang_CQU_task5_4 | Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | 29.6 | 17.6 | 55.3 | 0.9 | 18.2 | 30.2 | |
Kang_ET_task5_1 | FewShot_using_good_embedding_model | Kang2022 | 2.4 (2.4 - 2.4) | 11.0 | 0.7 | 3.3 | 3.5 | 4.3 | 4.7 | |
Kang_ET_task5_2 | FewShot_using_good_embedding_model | Kang2022 | 2.8 (2.8 - 2.9) | 8.7 | 0.9 | 3.3 | 3.9 | 5.3 | 4.7 | |
Hertkorn_ZF_task5_1 | ZF_CNN1 | Hertkorn2022 | 43.4 (42.9 - 43.8) | 70.2 | 37.8 | 68.4 | 64.1 | 22.5 | 51.2 | |
Hertkorn_ZF_task5_2 | ZF_CNN2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | 70.3 | 37.1 | 63.8 | 58.6 | 25.9 | 57.4 | |
Hertkorn_ZF_task5_3 | ZF_CNN3 | Hertkorn2022 | 41.4 (41.9 - 42.3) | 66.7 | 40.0 | 76.4 | 74.0 | 18.2 | 57.9 | |
Hertkorn_ZF_task5_4 | ZF_CNN4 | Hertkorn2022 | 33.8 (32.4 - 34.6) | 64.6 | 15.0 | 84.9 | 71.0 | 21.5 | 58.8 | |
Zou_PKU_task5_1 | TI_1 | Yang2022 | 19.2 (18.9 - 19.5) | 33.4 | 22.8 | 59.7 | 44.0 | 6.8 | 22.9 | |
Zou_PKU_task5_2 | TI_2 | Yang2022 | 18.7 (18.4 - 19.0) | 32.9 | 22.6 | 60.7 | 42.7 | 6.6 | 22.4 | |
Zou_PKU_task5_3 | TI_3 | Yang2022 | 18.9 (18.6 - 19.2) | 30.9 | 24.0 | 60.9 | 43.8 | 6.7 | 22.1 | |
Zou_PKU_task5_4 | TI_4 | Yang2022 | 15.8 (15.4 - 16.1) | 43.8 | 9.3 | 57.2 | 30.9 | 6.3 | 31.4 | |
Tan_WHU_task5_1 | Knowledge transfer 75% training 10 iteration adaptive (8) | Tan2022 | 8.1 (7.3 - 8.5) | 39.0 | 43.9 | 2.4 | 10.3 | 15.0 | 12.7 |
Tan_WHU_task5_2 | Knowledge transfer 90% training 15 iteration | Tan2022 | 16.9 (16.4 - 17.2) | 31.5 | 32.8 | 8.0 | 15.3 | 15.4 | 39.8 | |
Tan_WHU_task5_3 | Knowledge Transfer 90 training (4) | Tan2022 | 17.1 (16.7 - 17.4) | 25.5 | 40.3 | 8.4 | 15.7 | 18.0 | 28.6 | |
Tan_WHU_task5_4 | Knowledge Transfer 90 training adaptive (4) | Tan2022 | 17.2 (16.8 - 17.6) | 26.2 | 40.3 | 8.4 | 15.7 | 18.0 | 29.6 | |
Liu_BIT-SRCB_task5_1 | TI-PN ensemble | Liu2022 | 44.1 (43.6 - 44.5) | 54.6 | 45.7 | 47.3 | 51.5 | 32.4 | 48.5 | |
Liu_BIT-SRCB_task5_2 | TI-PN ensemble_2 | Liu2022 | 41.9 (41.6 - 42.2) | 54.6 | 56.3 | 47.3 | 51.5 | 24.0 | 48.5 | |
Liu_BIT-SRCB_task5_3 | TI_scalable | Liu2022 | 36.8 (36.5 - 37.2) | 52.2 | 41.0 | 51.6 | 49.3 | 22.2 | 33.6 | |
Liu_BIT-SRCB_task5_4 | pretrained TI-PN ensemble | Liu2022 | 44.3 (43.9 - 44.6) | 54.6 | 45.0 | 48.0 | 53.9 | 32.5 | 47.7 | |
Willbo_RISE_task5_1 | willbo_supervised_1 | Willbo2022 | 17.9 (17.6 - 18.2) | 43.8 | 19.1 | 24.6 | 20.9 | 12.2 | 12.8 | |
Willbo_RISE_task5_2 | willbo_supervised_2 | Willbo2022 | 20.4 (20.1 - 20.7) | 47.1 | 17.4 | 31.1 | 21.4 | 12.2 | 21.9 | |
Willbo_RISE_task5_3 | willbo_semi_1 | Willbo2022 | 20.2 (19.9 - 20.5) | 44.0 | 14.8 | 24.8 | 24.9 | 13.9 | 22.1 | |
Willbo_RISE_task5_4 | willbo_semi_2 | Willbo2022 | 21.7 (21.3 - 22.0) | 48.8 | 14.9 | 31.1 | 25.9 | 13.9 | 25.5 | |
ZGORZYNSKI_SRPOL_task5_1 | Siamese Network with fully connected head | Zgorzynski2022 | 28.1 (27.6 - 28.5) | 51.0 | 52.9 | 13.9 | 33.4 | 27.4 | 33.7 | |
ZGORZYNSKI_SRPOL_task5_2 | Siamese Network with fully connected head | Zgorzynski2022 | 16.3 (15.1 - 16.9) | 51.2 | 39.8 | 4.2 | 48.4 | 34.7 | 46.3 | |
ZGORZYNSKI_SRPOL_task5_3 | Siamese Network with fully connected head | Zgorzynski2022 | 29.9 (29.3 - 30.3) | 49.7 | 23.7 | 15.5 | 60.9 | 35.9 | 41.7 | |
ZGORZYNSKI_SRPOL_task5_4 | Siamese Network with fully connected head | Zgorzynski2022 | 33.2 (32.7 - 33.7) | 58.8 | 31.1 | 19.7 | 41.1 | 38.4 | 40.4 | |
Huang_SCUT_task5_1 | Transductive learning and modified central difference convolution | Huang2022 | 18.3 (18.0 - 18.6) | 17.9 | 20.6 | 65.6 | 56.0 | 7.4 | 22.1 | |
Martinsson_RISE_task5_1 | Adaptive prototypical ensemble | Martinsson2022 | 48.0 (47.5 - 48.4) | 71.7 | 48.4 | 77.6 | 70.6 | 24.6 | 53.1 | |
Martinsson_RISE_task5_2 | Adaptive prototypical ensemble | Martinsson2022 | 45.4 (44.9 - 45.9) | 56.3 | 37.6 | 61.5 | 70.7 | 29.5 | 49.4 | |
Martinsson_RISE_task5_3 | Adaptive prototypical ensemble | Martinsson2022 | 19.4 (18.6 - 20.0) | 67.1 | 4.7 | 65.5 | 73.3 | 34.7 | 45.0 | |
Martinsson_RISE_task5_4 | Adaptive prototypical ensemble | Martinsson2022 | 32.5 (31.7 - 33.1) | 50.9 | 13.4 | 47.8 | 71.2 | 34.1 | 42.5 | |
Liu_Surrey_task5_1 | Haohe_Liu_S1 | Liu2022a | 43.1 (42.7 - 43.4) | 81.9 | 58.4 | 46.4 | 48.4 | 22.8 | 52.0 | |
Liu_Surrey_task5_2 | Haohe_Liu_S2 | Liu2022a | 48.2 (48.5 - 48.9) | 76.9 | 57.4 | 48.0 | 60.7 | 28.9 | 56.8 | |
Liu_Surrey_task5_3 | Haohe_Liu_S3 | Liu2022a | 36.9 (36.5 - 37.2) | 83.0 | 52.2 | 29.1 | 53.5 | 18.5 | 53.7 | |
Liu_Surrey_task5_4 | Haohe_Liu_S4 | Liu2022a | 45.5 (45.8 - 46.2) | 80.5 | 61.8 | 38.8 | 47.7 | 30.3 | 53.8 | |
Li_QMUL_task5_1 | Prototypical Network with ResNet and SpecAugment | Li2022 | 15.5 (15.2 - 15.8) | 39.5 | 35.0 | 11.9 | 17.9 | 6.9 | 30.7 | |
Mariajohn_DSPC_task5_1 | Prototypical-1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | 27.4 | 23.6 | 55.4 | 65.5 | 19.4 | 14.9 | |
Du_NERCSLIP_task5_1 | Segment-level embedding learning | Du2022a | 36.5 (35.6 - 37.0) | 53.6 | 43.9 | 43.0 | 57.5 | 17.7 | 46.7 | |
Du_NERCSLIP_task5_2 | Frame-level embedding learning 1 | Du2022a | 60.2 (59.7 - 61.7) | 71.7 | 48.4 | 89.1 | 66.3 | 48.7 | 57.3 | |
Du_NERCSLIP_task5_3 | event filtering | Du2022a | 42.9 (42.4 - 43.4) | 57.4 | 48.6 | 62.3 | 42.4 | 23.5 | 52.2 | |
Du_NERCSLIP_task5_4 | Frame-level embedding learning 2 | Du2022a | 60.0 (58.5 - 61.5) | 73.3 | 49.6 | 91.3 | 64.4 | 46.3 | 57.7 |
Teams ranking
Table including only the best-performing system per submitting team.
Rank | Submission code | Submission name | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Event-based F-score (Development dataset)
---|---|---|---|---|---
Baseline_TempMatch_task5_1 | Baseline Template Matching | 12.3 (11.5 - 12.8) | 3.4 | ||
Baseline_PROTO_task5_1 | Baseline Prototypical Network | 5.3 ( - ) | |||
Wu_SHNU_task5_1 | Continual_learning | Wu2022 | 40.9 (40.5 - 41.3) | 53.9 | |
Zhang_CQU_task5_4 | Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | 44.2 | |
Kang_ET_task5_2 | FewShot_using_good_embedding_model | Kang2022 | 2.8 (2.8 - 2.9) | ||
Hertkorn_ZF_task5_2 | ZF_CNN2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | 61.8 | |
Zou_PKU_task5_1 | TI_1 | Yang2022 | 19.2 (18.9 - 19.5) | 52.0 | |
Tan_WHU_task5_4 | Knowledge Transfer 90 training adaptive (4) | Tan2022 | 17.2 (16.8 - 17.6) | 54.5 | |
Liu_BIT-SRCB_task5_4 | pretrained TI-PN ensemble | Liu2022 | 44.3 (43.9 - 44.6) | 64.8 | |
Willbo_RISE_task5_4 | willbo_semi_2 | Willbo2022 | 21.7 (21.3 - 22.0) | 47.9 | |
ZGORZYNSKI_SRPOL_task5_4 | Siamese Network with fully connected head | Zgorzynski2022 | 33.2 (32.7 - 33.7) | 57.2 | |
Huang_SCUT_task5_1 | Transductive learning and modified central difference convolution | Huang2022 | 18.3 (18.0 - 18.6) | 54.6 | |
Martinsson_RISE_task5_1 | Adaptive prototypical ensemble | Martinsson2022 | 48.0 (47.5 - 48.4) | 60.0 | |
Liu_Surrey_task5_2 | Haohe_Liu_S2 | Liu2022a | 48.2 (48.5 - 48.9) | 50.0 | |
Li_QMUL_task5_1 | Prototypical Network with ResNet and SpecAugment | Li2022 | 15.5 (15.2 - 15.8) | 47.9 | |
Mariajohn_DSPC_task5_1 | Prototypical-1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | 43.9 | |
Du_NERCSLIP_task5_2 | Frame-level embedding learning 1 | Du2022a | 60.2 (59.7 - 61.7) | 74.4 |
System characteristics
General characteristics
Rank | Code | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Sampling rate | Data augmentation | Features
---|---|---|---|---|---|---
Baseline_TempMatch_task5_1 | 12.3 (11.5 - 12.8) | any | spectrogram | |||
Baseline_PROTO_task5_1 | 5.3 ( - ) | 22.05 KHz | PCEN | |||
Wu_SHNU_task5_1 | Wu2022 | 40.9 (40.5 - 41.3) | any | Time masking, Frequency masking | PCEN | |
Zhang_CQU_task5_1 | Zhang2022 | 1.2 (0.9 - 1.3) | 22.05 KHz | Spectrogram | ||
Zhang_CQU_task5_2 | Zhang2022 | 0.9 (0.0 - 1.0) | 22.05 KHz | Spectrogram | ||
Zhang_CQU_task5_3 | Zhang2022 | 1.9 (1.0 - 2.0) | 22.05 KHz | Spectrogram | ||
Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | 22.05 KHz | Spectrogram | ||
Kang_ET_task5_1 | Kang2022 | 2.4 (2.4 - 2.4) | 16 KHz | specaugment | PCEN | |
Kang_ET_task5_2 | Kang2022 | 2.8 (2.8 - 2.9) | 16 KHz | Specaugment | PCEN | |
Hertkorn_ZF_task5_1 | Hertkorn2022 | 43.4 (42.9 - 43.8) | any | Spectrogram | ||
Hertkorn_ZF_task5_2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | any | Spectrogram | ||
Hertkorn_ZF_task5_3 | Hertkorn2022 | 41.4 (41.9 - 42.3) | any | Spectrogram | ||
Hertkorn_ZF_task5_4 | Hertkorn2022 | 33.8 (32.4 - 34.6) | any | Spectrogram | ||
Zou_PKU_task5_1 | Yang2022 | 19.2 (18.9 - 19.5) | 22.05 KHz | time and frequency masking, mixup | Spectrogram | |
Zou_PKU_task5_2 | Yang2022 | 18.7 (18.4 - 19.0) | 22.05 KHz | time and frequency masking, mixup | Spectrogram | |
Zou_PKU_task5_3 | Yang2022 | 18.9 (18.6 - 19.2) | 22.05 KHz | time and frequency masking, mixup | Spectrogram | |
Zou_PKU_task5_4 | Yang2022 | 15.8 (15.4 - 16.1) | 22.05 KHz | time masking, frequency masking, mixup | Spectrogram | |
Tan_WHU_task5_1 | Tan2022 | 8.1 (7.3 - 8.5) | 22.05 KHz | PCEN | ||
Tan_WHU_task5_2 | Tan2022 | 16.9 (16.4 - 17.2) | 22.05 KHz | PCEN | ||
Tan_WHU_task5_3 | Tan2022 | 17.1 (16.7 - 17.4) | 22.05 KHz | PCEN | ||
Tan_WHU_task5_4 | Tan2022 | 17.2 (16.8 - 17.6) | 22.05 KHz | PCEN | ||
Liu_BIT-SRCB_task5_1 | Liu2022 | 44.1 (43.6 - 44.5) | 22.05 KHz | Specaugment | PCEN | |
Liu_BIT-SRCB_task5_2 | Liu2022 | 41.9 (41.6 - 42.2) | 22.05 KHz | Specaugment | PCEN | |
Liu_BIT-SRCB_task5_3 | Liu2022 | 36.8 (36.5 - 37.2) | 22.05 KHz | PCEN | ||
Liu_BIT-SRCB_task5_4 | Liu2022 | 44.3 (43.9 - 44.6) | 22.05 KHz | Specaugment | PCEN | |
Willbo_RISE_task5_1 | Willbo2022 | 17.9 (17.6 - 18.2) | any | Mel-spectrogram, PCEN | ||
Willbo_RISE_task5_2 | Willbo2022 | 20.4 (20.1 - 20.7) | any | Mel-spectrogram, PCEN | ||
Willbo_RISE_task5_3 | Willbo2022 | 20.2 (19.9 - 20.5) | any | Mel-spectrogram, PCEN | ||
Willbo_RISE_task5_4 | Willbo2022 | 21.7 (21.3 - 22.0) | any | Mel-spectrogram, PCEN | ||
ZGORZYNSKI_SRPOL_task5_1 | Zgorzynski2022 | 28.1 (27.6 - 28.5) | 48 KHz | Noise mixing, Random Crop | Mel-spectrogram, PCEN | |
ZGORZYNSKI_SRPOL_task5_2 | Zgorzynski2022 | 16.3 (15.1 - 16.9) | 48 KHz | Noise mixing | Mel-spectrogram | |
ZGORZYNSKI_SRPOL_task5_3 | Zgorzynski2022 | 29.9 (29.3 - 30.3) | 48 KHz | Noise mixing | Mel-spectrogram | |
ZGORZYNSKI_SRPOL_task5_4 | Zgorzynski2022 | 33.2 (32.7 - 33.7) | 48 KHz | Noise mixing | Mel-spectrogram | |
Huang_SCUT_task5_1 | Huang2022 | 18.3 (18.0 - 18.6) | 22.05 KHz | Specaugment | PCEN | |
Martinsson_RISE_task5_1 | Martinsson2022 | 48.0 (47.5 - 48.4) | 22.05 KHz | Log-Mel energies, PCEN | ||
Martinsson_RISE_task5_2 | Martinsson2022 | 45.4 (44.9 - 45.9) | 22.05 KHz | Log-Mel energies, PCEN | ||
Martinsson_RISE_task5_3 | Martinsson2022 | 19.4 (18.6 - 20.0) | 22.05 KHz | PCEN | ||
Martinsson_RISE_task5_4 | Martinsson2022 | 32.5 (31.7 - 33.1) | 22.05 KHz | PCEN | ||
Liu_Surrey_task5_1 | Liu2022a | 43.1 (42.7 - 43.4) | 22.05 KHz | Dynamic dataloader | PCEN, Delta-MFCC | |
Liu_Surrey_task5_2 | Liu2022a | 48.2 (48.5 - 48.9) | 22.05 KHz | Dynamic dataloader | PCEN, Delta-MFCC | |
Liu_Surrey_task5_3 | Liu2022a | 36.9 (36.5 - 37.2) | 22.05 KHz | Dynamic dataloader | PCEN, Delta-MFCC | |
Liu_Surrey_task5_4 | Liu2022a | 45.5 (45.8 - 46.2) | 22.05 KHz | Dynamic dataloader | PCEN, Delta-MFCC | |
Li_QMUL_task5_1 | Li2022 | 15.5 (15.2 - 15.8) | any | time masking, frequency masking, time warping | PCEN, Spectrogram | |
Mariajohn_DSPC_task5_1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | any | time shifting, segment level mirroring | Log-Mel spectrogram | |
Du_NERCSLIP_task5_1 | Du2022a | 36.5 (35.6 - 37.0) | 22.05 KHz | SpecAugment | PCEN | |
Du_NERCSLIP_task5_2 | Du2022a | 60.2 (59.7 - 61.7) | 22.05 KHz | PCEN | ||
Du_NERCSLIP_task5_3 | Du2022a | 42.9 (42.4 - 43.4) | 22.05 KHz | PCEN | ||
Du_NERCSLIP_task5_4 | Du2022a | 60.0 (58.5 - 61.5) | 22.05 KHz | PCEN |
Machine learning characteristics
Rank | Code | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Classifier | Few-shot approach | Post-processing
---|---|---|---|---|---|---
Baseline_TempMatch_task5_1 | 12.3 (11.5 - 12.8) | template matching | template matching | peak picking, threshold | ||
Baseline_PROTO_task5_1 | 5.3 ( - ) | ResNet | prototypical | threshold | ||
Wu_SHNU_task5_1 | Wu2022 | 40.9 (40.5 - 41.3) | Continual Learning | prototypical, weight generator | threshold | |
Zhang_CQU_task5_1 | Zhang2022 | 1.2 (0.9 - 1.3) | CNN | prototypical | peak picking, threshold | |
Zhang_CQU_task5_2 | Zhang2022 | 0.9 (0.0 - 1.0) | CNN | prototypical | peak picking, threshold | |
Zhang_CQU_task5_3 | Zhang2022 | 1.9 (1.0 - 2.0) | CNN | prototypical | peak picking, threshold | |
Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | CNN | prototypical | peak picking, threshold | |
Kang_ET_task5_1 | Kang2022 | 2.4 (2.4 - 2.4) | TDNN | Fine tuning | ||
Kang_ET_task5_2 | Kang2022 | 2.8 (2.8 - 2.9) | TDNN | Fine tuning | ||
Hertkorn_ZF_task5_1 | Hertkorn2022 | 43.4 (42.9 - 43.8) | CNN | threshold, duration threshold, event stitching | ||
Hertkorn_ZF_task5_2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | CNN | threshold, duration threshold, event stitching | ||
Hertkorn_ZF_task5_3 | Hertkorn2022 | 41.4 (41.9 - 42.3) | CNN | threshold, duration threshold, event stitching | ||
Hertkorn_ZF_task5_4 | Hertkorn2022 | 33.8 (32.4 - 34.6) | CNN | threshold, duration threshold, event stitching | ||
Zou_PKU_task5_1 | Yang2022 | 19.2 (18.9 - 19.5) | CNN | prototypical | threshold, peak picking | |
Zou_PKU_task5_2 | Yang2022 | 18.7 (18.4 - 19.0) | CNN | prototypical | threshold, peak picking | |
Zou_PKU_task5_3 | Yang2022 | 18.9 (18.6 - 19.2) | CNN | prototypical | threshold, peak picking | |
Zou_PKU_task5_4 | Yang2022 | 15.8 (15.4 - 16.1) | CNN | prototypical | threshold, peak picking | |
Tan_WHU_task5_1 | Tan2022 | 8.1 (7.3 - 8.5) | CNN | prototypical, transductive inference | threshold, minimum event length | |
Tan_WHU_task5_2 | Tan2022 | 16.9 (16.4 - 17.2) | CNN | prototypical, transductive inference | threshold | |
Tan_WHU_task5_3 | Tan2022 | 17.1 (16.7 - 17.4) | CNN | prototypical, transductive inference | threshold | |
Tan_WHU_task5_4 | Tan2022 | 17.2 (16.8 - 17.6) | CNN | prototypical, transductive inference | threshold, minimum event length | |
Liu_BIT-SRCB_task5_1 | Liu2022 | 44.1 (43.6 - 44.5) | CNN | prototypical, transductive inference | peak picking, threshold, VAD | |
Liu_BIT-SRCB_task5_2 | Liu2022 | 41.9 (41.6 - 42.2) | CNN | prototypical, transductive inference | peak picking, threshold, VAD | |
Liu_BIT-SRCB_task5_3 | Liu2022 | 36.8 (36.5 - 37.2) | CNN | Transductive inference | peak picking, threshold | |
Liu_BIT-SRCB_task5_4 | Liu2022 | 44.3 (43.9 - 44.6) | CNN | prototypical, transductive inference | peak picking, threshold, VAD | |
Willbo_RISE_task5_1 | Willbo2022 | 17.9 (17.6 - 18.2) | ResNet | prototypical | median filtering, minimum event length, threshold | |
Willbo_RISE_task5_2 | Willbo2022 | 20.4 (20.1 - 20.7) | ResNet | prototypical, threshold fitting | median filtering, minimum event length, threshold | |
Willbo_RISE_task5_3 | Willbo2022 | 20.2 (19.9 - 20.5) | ResNet | prototypical | median filtering, minimum event length, threshold | |
Willbo_RISE_task5_4 | Willbo2022 | 21.7 (21.3 - 22.0) | ResNet | prototypical, threshold fitting | median filtering, minimum event length, threshold | |
ZGORZYNSKI_SRPOL_task5_1 | Zgorzynski2022 | 28.1 (27.6 - 28.5) | CNN | Siamese network with fully connected head, fine tuning | peak picking, threshold, Gaussian filter | |
ZGORZYNSKI_SRPOL_task5_2 | Zgorzynski2022 | 16.3 (15.1 - 16.9) | CNN | Siamese network with fully connected head, fine tuning | threshold, Gaussian filter | |
ZGORZYNSKI_SRPOL_task5_3 | Zgorzynski2022 | 29.9 (29.3 - 30.3) | CNN | Siamese network with fully connected head, fine tuning | threshold, Gaussian filter | |
ZGORZYNSKI_SRPOL_task5_4 | Zgorzynski2022 | 33.2 (32.7 - 33.7) | CNN | Siamese network with fully connected head, fine tuning | threshold, Gaussian filter | |
Huang_SCUT_task5_1 | Huang2022 | 18.3 (18.0 - 18.6) | transductive learning | transductive learning | peak picking, threshold | |
Martinsson_RISE_task5_1 | Martinsson2022 | 48.0 (47.5 - 48.4) | Ensemble, CNN | prototypical, input size | threshold, merging, filter too small, filter too big | |
Martinsson_RISE_task5_2 | Martinsson2022 | 45.4 (44.9 - 45.9) | Ensemble, CNN | prototypical, input size | threshold, merging, filter too small, filter too big | |
Martinsson_RISE_task5_3 | Martinsson2022 | 19.4 (18.6 - 20.0) | CNN | prototypical | threshold, merging, filter too small, filter too big | |
Martinsson_RISE_task5_4 | Martinsson2022 | 32.5 (31.7 - 33.1) | CNN | prototypical | threshold, merging, filter too small, filter too big | |
Liu_Surrey_task5_1 | Liu2022a | 43.1 (42.7 - 43.4) | CNN, ensemble | prototypical | threshold, filter by length, split long, remove long | |
Liu_Surrey_task5_2 | Liu2022a | 48.2 (48.5 - 48.9) | CNN | prototypical | threshold, filter by length, remove long, padding | |
Liu_Surrey_task5_3 | Liu2022a | 36.9 (36.5 - 37.2) | CNN | prototypical | threshold, filter by length, split long, remove long, merge short, padding | |
Liu_Surrey_task5_4 | Liu2022a | 45.5 (45.8 - 46.2) | CNN | prototypical | threshold, filter by length, remove long | |
Li_QMUL_task5_1 | Li2022 | 15.5 (15.2 - 15.8) | CNN | prototypical | peak picking, threshold | |
Mariajohn_DSPC_task5_1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | CNN | prototypical | threshold | |
Du_NERCSLIP_task5_1 | Du2022a | 36.5 (35.6 - 37.0) | CNN | fine tuning | peak picking, threshold | |
Du_NERCSLIP_task5_2 | Du2022a | 60.2 (59.7 - 61.7) | CNN | fine tuning | peak picking, threshold | |
Du_NERCSLIP_task5_3 | Du2022a | 42.9 (42.4 - 43.4) | CNN | fine tuning | peak picking, threshold | |
Du_NERCSLIP_task5_4 | Du2022a | 60.0 (58.5 - 61.5) | CNN | fine tuning | peak picking, threshold |
Complexity
Rank | Code | Technical Report | Event-based F-score with 95% confidence interval (Evaluation dataset) | Model complexity | Training time
---|---|---|---|---|---
Baseline_TempMatch_task5_1 | 12.3 (11.5 - 12.8) | ||||
Baseline_PROTO_task5_1 | 5.3 ( - ) | ||||
Wu_SHNU_task5_1 | Wu2022 | 40.9 (40.5 - 41.3) | 443520 | 2.5h | |
Zhang_CQU_task5_1 | Zhang2022 | 1.2 (0.9 - 1.3) | 90min | ||
Zhang_CQU_task5_2 | Zhang2022 | 0.9 (0.0 - 1.0) | 90min | ||
Zhang_CQU_task5_3 | Zhang2022 | 1.9 (1.0 - 2.0) | 90min | ||
Zhang_CQU_task5_4 | Zhang2022 | 4.3 (3.7 - 4.6) | 90min | ||
Kang_ET_task5_1 | Kang2022 | 2.4 (2.4 - 2.4) | |||
Kang_ET_task5_2 | Kang2022 | 2.8 (2.8 - 2.9) | |||
Hertkorn_ZF_task5_1 | Hertkorn2022 | 43.4 (42.9 - 43.8) | 54979 | 6 min/ wav file | |
Hertkorn_ZF_task5_2 | Hertkorn2022 | 44.4 (45.0 - 45.4) | 54979 | 6 min/ wav file | |
Hertkorn_ZF_task5_3 | Hertkorn2022 | 41.4 (41.9 - 42.3) | 54979 | 6 min/ wav file | |
Hertkorn_ZF_task5_4 | Hertkorn2022 | 33.8 (32.4 - 34.6) | 54979 | 6 min/ wav file | |
Zou_PKU_task5_1 | Yang2022 | 19.2 (18.9 - 19.5) | 468627 | 30 min | |
Zou_PKU_task5_2 | Yang2022 | 18.7 (18.4 - 19.0) | 468627 | 30 min | |
Zou_PKU_task5_3 | Yang2022 | 18.9 (18.6 - 19.2) | 468627 | 30 min | |
Zou_PKU_task5_4 | Yang2022 | 15.8 (15.4 - 16.1) | 468627 | 30 min | |
Tan_WHU_task5_1 | Tan2022 | 8.1 (7.3 - 8.5) | 700k | 1h | |
Tan_WHU_task5_2 | Tan2022 | 16.9 (16.4 - 17.2) | 700k | 1h | |
Tan_WHU_task5_3 | Tan2022 | 17.1 (16.7 - 17.4) | 700k | 1h | |
Tan_WHU_task5_4 | Tan2022 | 17.2 (16.8 - 17.6) | 700k | 1h | |
Liu_BIT-SRCB_task5_1 | Liu2022 | 44.1 (43.6 - 44.5) | 9627177 | 1.5h | |
Liu_BIT-SRCB_task5_2 | Liu2022 | 41.9 (41.6 - 42.2) | 9627177 | 1.5h | |
Liu_BIT-SRCB_task5_3 | Liu2022 | 36.8 (36.5 - 37.2) | 8757077 | 1.5h | |
Liu_BIT-SRCB_task5_4 | Liu2022 | 44.3 (43.9 - 44.6) | 9914068 | 1.5h | |
Willbo_RISE_task5_1 | Willbo2022 | 17.9 (17.6 - 18.2) | |||
Willbo_RISE_task5_2 | Willbo2022 | 20.4 (20.1 - 20.7) | |||
Willbo_RISE_task5_3 | Willbo2022 | 20.2 (19.9 - 20.5) | |||
Willbo_RISE_task5_4 | Willbo2022 | 21.7 (21.3 - 22.0) | |||
ZGORZYNSKI_SRPOL_task5_1 | Zgorzynski2022 | 28.1 (27.6 - 28.5) | 76700357 | 9h | |
ZGORZYNSKI_SRPOL_task5_2 | Zgorzynski2022 | 16.3 (15.1 - 16.9) | 76700357 | 9h | |
ZGORZYNSKI_SRPOL_task5_3 | Zgorzynski2022 | 29.9 (29.3 - 30.3) | 76700357 | 9h | |
ZGORZYNSKI_SRPOL_task5_4 | Zgorzynski2022 | 33.2 (32.7 - 33.7) | 76700357 | 9h | |
Huang_SCUT_task5_1 | Huang2022 | 18.3 (18.0 - 18.6) | 492206 | 50min, RTX3090 | |
Martinsson_RISE_task5_1 | Martinsson2022 | 48.0 (47.5 - 48.4) | 25994880 | ||
Martinsson_RISE_task5_2 | Martinsson2022 | 45.4 (44.9 - 45.9) | 25994880 | ||
Martinsson_RISE_task5_3 | Martinsson2022 | 19.4 (18.6 - 20.0) | 1732992 | ||
Martinsson_RISE_task5_4 | Martinsson2022 | 32.5 (31.7 - 33.1) | 1732992 | ||
Liu_Surrey_task5_1 | Liu2022a | 43.1 (42.7 - 43.4) | 724096 | 91 min, NVIDIA GeForce 3070 | |
Liu_Surrey_task5_2 | Liu2022a | 48.2 (48.5 - 48.9) | 724096 | 91 min, NVIDIA GeForce 3070 | |
Liu_Surrey_task5_3 | Liu2022a | 36.9 (36.5 - 37.2) | 724096 | 91 min, NVIDIA GeForce 3070 | |
Liu_Surrey_task5_4 | Liu2022a | 45.5 (45.8 - 46.2) | 724096 | 91 min, NVIDIA GeForce 3070 | |
Li_QMUL_task5_1 | Li2022 | 15.5 (15.2 - 15.8) | 40 min, Colab pro Tesla p100 | ||
Mariajohn_DSPC_task5_1 | Mariajohn2022 | 25.7 (25.4 - 25.9) | 2h | ||
Du_NERCSLIP_task5_1 | Du2022a | 36.5 (35.6 - 37.0) | 464531 | 5 minutes, TeslaP40-24GB | |
Du_NERCSLIP_task5_2 | Du2022a | 60.2 (59.7 - 61.7) | 469654 | 1 hour, TeslaV100-32GB | |
Du_NERCSLIP_task5_3 | Du2022a | 42.9 (42.4 - 43.4) | 12091947 | 1 hour, TeslaV100-32GB | |
Du_NERCSLIP_task5_4 | Du2022a | 60.0 (58.5 - 61.5) | 12091947 | 1 hour, TeslaV100-32GB |
Technical reports
BIOACOUSTIC FEW SHOT LEARNING WITH CLASS AUGMENTATION Technical Report
Mariajohn, Aaquila
Mariajohn_DSPC_task5_1
Abstract
This document details the results and techniques used for our submission to the DCASE 2022 Task 5 challenge. The goal is to identify positive events of the target class throughout an audio clip using few-shot learning. Prototypical networks are used for both few-shot training and inference, and the lack of data is compensated for with augmentations.
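The time-shifting and segment-level mirroring augmentations listed in the system characteristics can be sketched in a few lines; the exact shift ranges and segment boundaries used in the submission are not stated, so the versions below are illustrative:

```python
import numpy as np

def time_shift(x, shift):
    """Circularly shift a waveform (or a feature sequence) by `shift`
    samples, a cheap way to vary event position within a window."""
    return np.roll(x, shift)

def segment_mirror(x):
    """Mirror a segment along the time axis, yielding a time-reversed
    copy of the same event as an additional training example."""
    return x[::-1]
```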
System characteristics
Data augmentation | time shifting, segment level mirroring |
System embeddings | False |
Subsystem count | False |
External data usage | directly as additional training data |
FEW-SHOT EMBEDDING LEARNING AND EVENT FILTERING FOR BIOACOUSTIC EVENT DETECTION Technical Report
Tang, Jigang and Xueyang, Zhang and Gao, Tian and Liu, Diyuan and Fang, Xin and Pan, Jia and Wang, Qing and Du, Jan and Xu, Kele and Pan, Qinghua
iFLYTEK Research Institute
Du_NERCSLIP_task5_1 Du_NERCSLIP_task5_2 Du_NERCSLIP_task5_3 Du_NERCSLIP_task5_4
Judges’ award
Abstract
In this technical report, we describe our submission system for DCASE 2022 Task 5: few-shot bioacoustic event detection. We propose several methods to improve the representational ability of embeddings learned under limited positive samples, including a segment-level and a frame-level embedding learning strategy, a model adaptation technique, and an embedding-guided event filtering approach. The event filtering task is trained independently on each test file to improve the discrimination of embeddings between similar events. The proposed system is evaluated on the official validation set, and the best overall F-measure score is 74.4%.
Awards: Judges’ award
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
FEW-SHOT BIOACOUSTIC EVENT DETECTION: DON'T WASTE INFORMATION Technical Report
Hertkorn, Michael
ZF Friedrichshafen AG
Hertkorn_ZF_task5_1 Hertkorn_ZF_task5_2 Hertkorn_ZF_task5_3 Hertkorn_ZF_task5_4
Abstract
In the past, a lot of attention has been dedicated to finding a good neural network architecture, mainly by adopting large architectures from image processing [1]. The parameters of the fixed preprocessing stage, which usually consists of a short-time Fourier transform (STFT) optionally followed by a Mel or Mel-frequency cepstral coefficient (MFCC) transformation, can be made trainable [2]; however, some major parameters stay fixed, such as the window size and the fact that the magnitude of the complex Fourier output is taken. Moreover, a learnable frontend is not desirable in a few-shot training setting. This investigation demonstrates the importance of choosing suitable parameters for the acoustic preprocessing. To this end, a standard CNN with a minor tweak is used and pretraining on the training data is skipped, meaning the model is trained only on the 5 shots provided in the validation and evaluation datasets, similar to the template-matching baseline.
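The two preprocessing choices the report singles out — the STFT window size and the magnitude step that discards phase — are both visible in a minimal spectrogram implementation. The parameter values below (22.05 kHz, a 1024-sample window, a 256-sample hop) are illustrative defaults, not the report's settings:

```python
import numpy as np

def stft_mag(x, n_fft=1024, hop=256):
    """Minimal STFT magnitude spectrogram of a 1-D waveform.
    `n_fft` fixes the time/frequency resolution trade-off (bin spacing
    is sr / n_fft), and np.abs discards the phase of the complex FFT."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))
```

At 22.05 kHz, `n_fft=1024` gives roughly 21.5 Hz bin spacing; shrinking the window to 128 samples coarsens that to about 172 Hz but gives eight times finer time steps, which is the kind of trade-off the report argues should be tuned per dataset.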
System characteristics
System embeddings | False |
Subsystem count | False |
External data usage | False |
FEW-SHOT BIO-ACOUSTIC EVENT DETECTION BASED ON TRANSDUCTIVE LEARNING AND ADAPTED CENTRAL DIFFERENCE CONVOLUTION Technical Report
Huang, Qisheng and Li, Yanxiong and Cao, Wenchang and Chen, Hao
School of Electronic and Information Engineering, South China University of Technology, Guangzhou, China
Huang_SCUT_task5_1
Abstract
In this technical report, we present our submitted system for DCASE 2022 Task 5: few-shot bio-acoustic event detection. Our system employs a transductive learning strategy, data augmentation, and an adapted version of central difference convolution (CDC). Evaluated on the validation set, our method achieves an overall F-measure score of 41.1%.
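Central difference convolution combines a vanilla convolution with a central-difference term that emphasises local contrast. A 1-D sketch of the standard CDC formulation follows (the report uses an *adapted* version whose details may differ; `theta` balances the two terms):

```python
import numpy as np

def cdc_1d(x, w, theta=0.7):
    """1-D central difference convolution (correlation form).
    Per position: y = sum(w * patch) - theta * sum(w) * x_center,
    i.e. vanilla convolution minus a weighted central-difference term.
    theta=0 recovers plain convolution; inputs are zero-padded."""
    pad = len(w) // 2
    xp = np.pad(x, pad)
    y = np.empty(len(x))
    for i in range(len(x)):
        patch = xp[i:i + len(w)]
        y[i] = np.dot(patch, w) - theta * w.sum() * x[i]
    return y
```

This decomposed form (vanilla output minus `theta * sum(w) * center`) is algebraically identical to summing `w * (neighbour - center)` differences, which is why CDC can be implemented as a cheap correction on top of an ordinary convolution.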
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
FEW-SHOT BIOACOUSTIC EVENT DETECTION USING GOOD EMBEDDING MODEL Technical report
Kang, Taein
Chung-Ang University, Seoul, South Korea
Kang_ET_task5_1 Kang_ET_task5_2
Abstract
Few-shot learning is widely used as a benchmark for meta-learning: a few-shot learning algorithm attempts to show how quickly it adapts to test tasks with limited data. Unlike typical image few-shot learning, DCASE 2022 Task 5 [1] examines whether a system can detect the corresponding sound in the remainder of an audio recording when five annotations are given. In this paper, we demonstrate whether an embedding model that has learned bioacoustic information well can perform few-shot learning even with a simple classifier.
System characteristics
Data augmentation | Specaugment, inference-time augmentation |
System embeddings | False |
Subsystem count | 5 |
External data usage | AudioSet |
FEW-SHOT BIOACOUSTIC EVENT DETECTION USING PROTOTYPICAL NETWORKS WITH RESNET CLASSIFIER Technical Report
Li, Ren and Liang, Jinhua and Phan, Huy
Queen Mary University of London, United Kingdom
Li_QMUL_task5_1
Abstract
In this technical report, we describe our submission system for few-shot bioacoustic event detection in DCASE2022 Task 5. Participants are expected to develop a few-shot learning system for detecting mammal and bird sounds from audio recordings. In our system, prototypical networks are used to embed spectrograms into an embedding space and learn a non-linear mapping between data samples. We leverage various data augmentation techniques on mel-spectrograms and introduce a ResNet variant as the classifier. Our experiments demonstrate that the system can achieve an F1-score of 47.88% on the validation data.
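The prototypical-network inference step mentioned above can be sketched in a few lines of NumPy (names are ours; in the actual system a trained network first maps spectrograms to these embeddings):

```python
import numpy as np

def prototypes(support, labels):
    """Class prototype = mean of the support embeddings for that class."""
    classes = np.unique(labels)
    return classes, np.stack([support[labels == c].mean(axis=0)
                              for c in classes])

def classify(queries, protos, classes):
    """Assign each query to the nearest prototype (squared Euclidean
    distance), the standard prototypical-network decision rule."""
    d = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return classes[d.argmin(axis=1)]

# toy 2-D embeddings: two support samples per class
support = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
labels = np.array([0, 0, 1, 1])
classes, protos = prototypes(support, labels)
preds = classify(np.array([[0.0, 0.0], [5.0, 6.0]]), protos, classes)
```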
System characteristics
Data augmentation | time warping, time masking, frequency masking |
System embeddings | False |
Subsystem count | False |
External data usage | False |
BIT SRCB TEAM'S SUBMISSION FOR DCASE2022 TASK5 - FEW-SHOT BIOACOUSTIC EVENT DETECTION Technical Report
Liu, Miao and Zhang, Jianqian and Wang, Lizhong and Peng, Jiawei and Hu, Chenguang and Li, Kaige and Wang, Jing and Ma, Qiuyue
Beijing Institute of Technology, Beijing, China; Samsung Research China-Beijing (SRC-B), Beijing, China
Liu_BIT-SRCB_task5_1 Liu_BIT-SRCB_task5_2 Liu_BIT-SRCB_task5_3
Abstract
In this technical report, we present our system for Task 5 of the Detection and Classification of Acoustic Scenes and Events 2022 (DCASE2022) challenge, i.e. few-shot bioacoustic event detection. First, per-channel energy normalization (PCEN) features are extracted. To improve the diversity of the original audio, data augmentation methods such as SpecAugment are adopted. Then, a prototypical network with convolutional neural networks (CNN) and a transductive inference method are used for few-shot detection in our systems. Finally, we use the aforementioned features as inputs to train our CNN model. Moreover, we merge the prediction results of the improved prototypical network and the transductive inference method for better performance. We evaluate the proposed systems with the overall F-measure over the whole evaluation set, and our best F-measure score on the validation set is 64.77%.
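A minimal sketch of the SpecAugment-style masking mentioned above (mask counts and widths here are illustrative guesses, not the submission's settings):

```python
import numpy as np

def spec_augment(spec, n_freq_masks=1, n_time_masks=1, max_width=8,
                 rng=None):
    """SpecAugment-style masking: zero out random frequency bands and
    time spans of a (freq, time) spectrogram."""
    if rng is None:
        rng = np.random.default_rng(0)
    out = spec.copy()
    n_freq, n_time = out.shape
    for _ in range(n_freq_masks):
        w = int(rng.integers(1, max_width + 1))
        f0 = int(rng.integers(0, max(1, n_freq - w)))
        out[f0:f0 + w, :] = 0.0  # frequency mask
    for _ in range(n_time_masks):
        w = int(rng.integers(1, max_width + 1))
        t0 = int(rng.integers(0, max(1, n_time - w)))
        out[:, t0:t0 + w] = 0.0  # time mask
    return out

augmented = spec_augment(np.ones((40, 100)))
```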
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
SURREY SYSTEM FOR DCASE 2022 TASK 5 : FEW-SHOT BIOACOUSTIC EVENT DETECTION WITH SEGMENT-LEVEL METRIC LEARNING Technical Report
Liu, Haohe and Liu, Xubo and Mei, Xinhao and Kong, Qiuqiang and Wang, Wenwu and Plumbley, Mark D
University of Surrey
Liu_Surrey_task5_1 Liu_Surrey_task5_2 Liu_Surrey_task5_3 Liu_Surrey_task5_4
Abstract
Few-shot audio event detection is the task of detecting the occurrence time of a novel sound class given a few examples. In this work, we propose a system based on segment-level metric learning for the DCASE 2022 challenge few-shot bioacoustic event detection (Task 5). We make better use of the negative data within each sound class to build the loss function, and use transductive inference to gain better adaptation on the evaluation set. For the input feature, we find per-channel energy normalization concatenated with delta mel-frequency cepstral coefficients to be the most effective combination. We also introduce new data augmentation and post-processing procedures for this task. Our final system achieves an F-measure of 68.74 on the DCASE Task 5 validation set, outperforming the baseline performance of 29.5 by a large margin. Our system is fully open-sourced.
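The PCEN front-end the abstract refers to can be sketched directly from Wang et al.'s published formulation (first-order IIR smoother, adaptive gain, root compression). The constants below are common defaults, not necessarily those used by this submission:

```python
import numpy as np

def pcen(E, alpha=0.98, delta=2.0, r=0.5, s=0.025, eps=1e-6):
    """Per-channel energy normalization of a (freq, time) energy
    spectrogram: smooth each channel with a first-order IIR filter,
    divide by the smoothed envelope (adaptive gain), then apply
    root compression with bias delta.
    """
    M = np.zeros_like(E)
    M[:, 0] = E[:, 0]
    for t in range(1, E.shape[1]):
        M[:, t] = (1 - s) * M[:, t - 1] + s * E[:, t]
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r
```

On a constant-energy input the output is flat, illustrating PCEN's gain-normalizing behaviour that makes it attractive for recordings with varying background levels.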
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
FEW-SHOT BIOACOUSTIC EVENT DETECTION USING A PROTOTYPICAL NETWORK ENSEMBLE WITH ADAPTIVE EMBEDDING FUNCTIONS Technical Report
Martinsson, John and Willbo, Martin and Pirinen, Aleksis and Mogren, Olof and Sandsten, Maria
Computer Science, RISE Research Institutes of Sweden, Sweden, Centre for Mathematical Sciences, Lund University, Sweden
Martinsson_RISE_task5_1 Martinsson_RISE_task5_2 Martinsson_RISE_task5_3 Martinsson_RISE_task5_4
Abstract
In this report we present our method for the DCASE 2022 challenge on few-shot bioacoustic event detection. We use an ensemble of prototypical neural networks with adaptive embedding functions and show that both ensemble and adaptive embedding functions can be used to improve results from an average F-score of 41.3% to an average F-score of 60.0% on the validation dataset.
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
A NEW TRANSDUCTIVE FRAMEWORK FOR FEW-SHOT BIOACOUSTIC EVENT DETECTION TASK Technical Report
Tan, Yizhou and Xu, Lifan and Zhu, Chenyang and Li, Shengchen and Ai, Haojun and Shao, Xi
Wuhan University, School of Cyber Science and Engineering, Wuhan, China; Xi’an Jiaotong-Liverpool University, Department of Intelligent Science, School of Advanced Engineering, Suzhou, China; Jiangnan University, School of Artificial Intelligence and Computer Science, Wuxi, China; Nanjing University of Posts and Telecommunications, School of Communication and Information Engineering, Nanjing, China
Tan_WHU_task5_1 Tan_WHU_task5_2 Tan_WHU_task5_3 Tan_WHU_task5_4
Abstract
Few-shot learning is introduced to reduce the data requirements of machine learning, especially when labelling is labour-intensive. Few-shot learning algorithms usually suffer from the unusual feature distribution of the query class, especially in the few-shot bioacoustic event detection task. In this work, a knowledge transfer technique is introduced into the transductive inference process to restrict the feature distribution of a newly appeared class to a dedicated sub-space, while adapting the feature distribution of existing classes. The proposed system outperforms a traditional few-shot learning system on the development dataset provided by the bioacoustic event detection task (Task 5) of the DCASE 2022 data challenge. The F-measure score on the validation split of the development dataset reaches 57.40.
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
WIDE RESNET MODELS FOR FEW-SHOT SOUND EVENT DETECTION Technical report
Willbo, Martin and Martinsson, John and Pirinen, Aleksis and Mogren, Olof
Computer Science, RISE Research Institutes of Sweden, Sweden, Centre for Mathematical Sciences, Lund University, Sweden
Willbo_RISE_task5_1 Willbo_RISE_task5_2 Willbo_RISE_task5_3 Willbo_RISE_task5_4
Abstract
In this technical report we describe our few-shot sound event detection (SED) systems used to generate predictions for the DCASE 2022 Task 5 challenge. At the core of the SED systems is a wider variant of ResNet-18, i.e., each block throughout the depth of the network has more convolutional filters. In addition, for one of the submissions we include what we believe to be a novel approach to semi-supervised learning for prototypical networks. For both the fully supervised and semi-supervised methods we showcase the importance of calibrating the probability thresholds in the few-shot learning tasks, and provide a simple procedure for finding them.
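The threshold-calibration idea can be illustrated with a simple grid search that maximizes F1 on labelled segments (a stand-in sketch under our own assumptions; the authors' actual procedure may differ):

```python
import numpy as np

def calibrate_threshold(probs, labels, grid=None):
    """Pick the probability threshold that maximizes F1 on held-out
    (e.g. support-derived) segments. probs: per-segment event
    probabilities; labels: 1 = event, 0 = background.
    """
    grid = np.linspace(0.05, 0.95, 19) if grid is None else grid
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        pred = probs >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        f1 = 2 * tp / max(2 * tp + fp + fn, 1)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

best_t, best_f1 = calibrate_threshold(
    np.array([0.9, 0.8, 0.2, 0.1]), np.array([1, 1, 0, 0]))
```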
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
FEW-SHOT CONTINUAL LEARNING FOR BIOACOUSTIC EVENT DETECTION Technical Report
Wu, Xiaoxiao and Long, Yanhua
Shanghai Normal University, Shanghai, China
Wu_SHNU_task5_1
Abstract
In this technical report, we describe our submission system for DCASE2022 Task 5: few-shot bioacoustic event detection. In this submission, a few-shot continual learning framework is used for bioacoustic event detection, in which a trained base classifier can be continuously expanded to detect novel classes with only a few labelled examples at inference time. On the official validation set, the proposed continual learning framework achieves an overall F-measure of 53.876%.
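The idea of expanding a trained base classifier with novel classes at inference time can be sketched with a nearest-class-mean classifier (a toy illustration of the continual-learning mechanism, not the submission's actual neural architecture):

```python
import numpy as np

class NearestMeanClassifier:
    """A classifier that can be continually expanded at inference
    time: each new class is added by storing the mean embedding of
    its few labelled shots, without retraining existing classes.
    """
    def __init__(self):
        self.names, self.means = [], []

    def add_class(self, name, shots):
        # shots: a few labelled embeddings for the novel class
        self.names.append(name)
        self.means.append(np.asarray(shots, dtype=float).mean(axis=0))

    def predict(self, x):
        d = [np.sum((np.asarray(x, dtype=float) - m) ** 2)
             for m in self.means]
        return self.names[int(np.argmin(d))]

clf = NearestMeanClassifier()
clf.add_class("base_call", [[0, 0], [0, 1]])   # from base training
clf.add_class("novel_call", [[5, 5]])          # added from 1 shot
```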
System characteristics
Data augmentation | Time masking,Frequency masking |
System embeddings | False |
Subsystem count | False |
External data usage | False |
IMPROVED PROTOTYPICAL NETWORK WITH DATA AUGMENTATION Technical Report
Dongchao Yang and Helin Wang and Zhongjie Ye and Yuexian Zou
Peking University, School of ECE, Shenzhen, China; Xiaomi Corporation, Beijing, China
Zou_PKU_task5_1 Zou_PKU_task5_2 Zou_PKU_task5_3 Zou_PKU_task5_4
Abstract
In this technical report, we describe our few-shot bioacoustic event detection methods submitted to the Detection and Classification of Acoustic Scenes and Events Challenge 2022, Task 5. We follow our previous work and further improve our model through a data augmentation strategy. Specifically, we analyze why prototypical networks cannot perform well, and propose to use transductive inference for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. Furthermore, we use multiple data augmentation strategies to improve the feature extractor, including time and frequency masking, mixup, and so on. Experimental results indicate that our model performs better than the baseline, with an F1 score of about 51.9% on the evaluation set.
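The mutual-information objective described above (confident per-query predictions that stay balanced across classes) can be written down directly. This NumPy sketch computes the empirical quantity being maximized; the full training loss also includes the support-set supervision term:

```python
import numpy as np

def mutual_information(probs, eps=1e-12):
    """Empirical mutual information between queries and predicted
    labels: entropy of the marginal prediction distribution minus
    the mean per-query prediction entropy.
    probs: (n_queries, n_classes) softmax outputs.
    """
    marginal = probs.mean(axis=0)
    h_marginal = -np.sum(marginal * np.log(marginal + eps))
    h_cond = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return h_marginal - h_cond
```

Confident, class-balanced predictions score close to log(n_classes); uniform (maximally uncertain) predictions score close to zero.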
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
SIAMESE NETWORK FOR FEW-SHOT BIOACOUSTIC EVENT DETECTION Technical report
Zgorzynski, Bartlomiej and Matuszewski, Mateusz
Samsung R&D Institute, Poland
ZGORZYNSKI_SRPOL_task5_1 ZGORZYNSKI_SRPOL_task5_2 ZGORZYNSKI_SRPOL_task5_3 ZGORZYNSKI_SRPOL_task5_4
System characteristics
Data augmentation | False |
System embeddings | False |
Subsystem count | False |
External data usage | False |
A META-LEARNING FRAMEWORK FOR FEW-SHOT SOUND EVENT DETECTION Technical Report
Zhang, Tianyang and Wang, Yuyang and Wang, Ying
Chongqing University, Shapingba
Zhang_CQU_task5_1 Zhang_CQU_task5_2 Zhang_CQU_task5_3 Zhang_CQU_task5_4
Abstract
The report presents our submission to the Detection and Classification of Acoustic Scenes and Events 2022 (DCASE2022) challenge, Task 5. This task focuses on sound event detection in a few-shot learning setting for animal (mammal and bird) vocalisations. The main issue of this task is that only five exemplar vocalisations (shots) of mammals or birds are available. In this paper, we propose a meta-learning framework for the few-shot bioacoustic event detection challenge. Maximizing inter-class distance and minimizing intra-class distance (MIMI) is used as the criterion to fine-tune the embedding network for few-shot tasks. Experimental results indicate that our framework performs better than the baseline, with an F1 score of about 46.51% on the evaluation set.
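One simple way to make the MIMI criterion concrete is a ratio of mean inter-class centroid distance to mean intra-class distance (our illustrative formulation; the paper's exact criterion may differ):

```python
import numpy as np

def mimi_criterion(embeddings, labels):
    """Contrast mean inter-class distance (between class centroids)
    with mean intra-class distance (samples to their own centroid).
    Larger values mean classes are tighter and further apart.
    Assumes at least two classes are present.
    """
    classes = np.unique(labels)
    cents = np.stack([embeddings[labels == c].mean(0) for c in classes])
    intra = np.mean([np.linalg.norm(embeddings[labels == c] - cents[i],
                                    axis=1).mean()
                     for i, c in enumerate(classes)])
    inter = np.mean([np.linalg.norm(cents[i] - cents[j])
                     for i in range(len(classes))
                     for j in range(i + 1, len(classes))])
    return inter / (intra + 1e-12)
```

Fine-tuning an embedding network to increase such a criterion pushes the few-shot classes apart while keeping each class compact, which is the stated goal of MIMI.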
System characteristics
System embeddings | False |
Subsystem count | False |
External data usage | False |