Sound Event Detection with Weak Labels and Synthetic Soundscapes


Challenge results

Task description

A more detailed task description can be found on the task description page.

All confidence intervals are computed from the three runs per system, using bootstrapping on the evaluation set.
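For illustration, the sketch below shows one plausible way such intervals can be obtained: resample both the runs and the evaluation clips with replacement and recompute a statistic on each bootstrap draw. The function name, the mean-based statistic, and the 2.5/97.5 percentile bounds are assumptions made for the example; the challenge evaluation recomputes the actual metrics (PSDS and F-score) on each resampled set.

    import numpy as np

    def bootstrap_ci(scores_per_run, n_boot=1000, seed=0):
        # scores_per_run: array of shape (n_runs, n_clips) holding a per-clip
        # quantity for each of the three runs of one system (an illustrative
        # stand-in for the metric the organizers recompute on each resample).
        rng = np.random.default_rng(seed)
        scores = np.asarray(scores_per_run)
        n_runs, n_clips = scores.shape
        stats = []
        for _ in range(n_boot):
            run_idx = rng.integers(0, n_runs, n_runs)     # resample the runs
            clip_idx = rng.integers(0, n_clips, n_clips)  # resample eval clips
            stats.append(scores[np.ix_(run_idx, clip_idx)].mean())
        return np.percentile(stats, [2.5, 97.5])          # assumed 95% interval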

Team Ranking

The tables below report only the best-ranked submission per submitting team, first without ensembling and then with ensembling.
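The ranking score column is consistent with a simple baseline-normalized average of the two PSDS values: dividing each system's PSDS 1 and PSDS 2 by the corresponding baseline values (0.327 and 0.538) and averaging reproduces the reported scores. This reading is inferred from the tables below rather than quoted from an official definition; a quick check on the top-ranked team:

    # Inferred: ranking score ≈ mean of baseline-normalized PSDS 1 and PSDS 2
    psds1, psds2 = 0.591, 0.835            # Kim_GIST-HanwhaVision (PSDS 1 / PSDS 2)
    psds1_base, psds2_base = 0.327, 0.538  # DCASE2023 baseline system
    ranking_score = 0.5 * (psds1 / psds1_base + psds2 / psds2_base)
    print(round(ranking_score, 2))         # -> 1.68, matching the table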

Rank | Submission code (PSDS 1) | Submission code (PSDS 2) | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset)
Kim_GIST-HanwhaVision_task4a_2 Kim_GIST-HanwhaVision_task4a_3 Kim2023 1.68 0.591 (0.574 - 0.611) 0.835 (0.826 - 0.846)
Zhang_IOA_task4a_6 Zhang_IOA_task4a_7 Zhang2023 1.63 0.562 (0.552 - 0.575) 0.830 (0.820 - 0.842)
Wenxin_TJU_task4a_6 Wenxin_TJU_task4a_6 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842)
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4 Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827)
Guan_HIT_task4a_3 Guan_HIT_task4a_4 Guan2023 1.60 0.526 (0.513 - 0.539) 0.855 (0.844 - 0.867)
Chen_CHT_task4a_2 Chen_CHT_task4a_2 Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792)
Li_USTC_task4a_6 Li_USTC_task4a_6 Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796)
Liu_NSYSU_task4a_7 Liu_NSYSU_task4a_7 Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831)
Cheimariotis_DUTH_task4a_1 Cheimariotis_DUTH_task4a_1 Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808)
Baseline_BEATS Baseline_BEATS 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811)
Baseline Baseline 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566)
Wang_XiaoRice_task4a_1 Wang_XiaoRice_task4a_1 Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815)
Lee_CAUET_task4a_1 Lee_CAUET_task4a_2 Lee2023 1.28 0.425 (0.415 - 0.440) 0.674 (0.661 - 0.690)
Liu_SRCN_task4a_4 Liu_SRCN_task4a_4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676)
Barahona_AUDIAS_task4a_2 Barahona_AUDIAS_task4a_4 Barahona2023 1.21 0.380 (0.361 - 0.406) 0.673 (0.652 - 0.700)
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610)
Gan_NCUT_task4a_1 Gan_NCUT_task4a_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617)

With ensembling

Rank | Submission code (PSDS 1) | Submission code (PSDS 2) | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset)
Zhang_IOA_task4a_4 Zhang_IOA_task4a_2 Zhang2023 1.80 0.625 (0.615 - 0.637) 0.903 (0.895 - 0.911)
Kim_GIST-HanwhaVision_task4a_8 Kim_GIST-HanwhaVision_task4a_5 Kim2023 1.72 0.612 (0.599 - 0.626) 0.846 (0.838 - 0.855)
Liu_SRCN_task4a_1 Liu_SRCN_task4a_2 Chen2023a 1.71 0.585 (0.572 - 0.598) 0.877 (0.867 - 0.885)
Chen_CHT_task4a_3 Chen_CHT_task4a_4 Chen2023b 1.67 0.596 (0.585 - 0.606) 0.820 (0.810 - 0.831)
Wenxin_TJU_task4a_2 Wenxin_TJU_task4a_2 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854)
Li_USTC_task4a_2 Li_USTC_task4a_4 Li2023 1.64 0.556 (0.544 - 0.569) 0.852 (0.843 - 0.863)
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_8 Xiao2023 1.62 0.555 (0.545 - 0.567) 0.834 (0.824 - 0.847)
Liu_NSYSU_task4a_6 Liu_NSYSU_task4a_6 Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848)
Guan_HIT_task4a_1 Guan_HIT_task4a_2 Guan2023 1.62 0.536 (0.526 - 0.546) 0.862 (0.852 - 0.872)
Gan_NCUT_task4a_2 Gan_NCUT_task4a_3 Gan2023 1.54 0.511 (0.498 - 0.524) 0.816 (0.805 - 0.828)
Wang_XiaoRice_task4a_2 Wang_XiaoRice_task4a_3 Wang2023 1.53 0.497 (0.486 - 0.510) 0.835 (0.824 - 0.844)
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806)
Cheimariotis_DUTH_task4a_1 Cheimariotis_DUTH_task4a_1 Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808)
Barahona_AUDIAS_task4a_6 Barahona_AUDIAS_task4a_8 Barahona2023 1.29 0.401 (0.390 - 0.414) 0.729 (0.710 - 0.752)
Lee_CAUET_task4a_1 Lee_CAUET_task4a_2 Lee2023 1.28 0.425 (0.415 - 0.440) 0.674 (0.661 - 0.690)
Baseline_BEATS Baseline_BEATS 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811)
Baseline Baseline 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566)

Systems ranking

Performance obtained without ensembling.

Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | PSDS 1 (Development dataset) | PSDS 2 (Development dataset)
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) 0.359 0.562
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) 0.491 0.787
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) 0.555 0.791
Li_USTC_task4a_6 Pseudo labeling and single Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) 0.552 0.795
Li_USTC_task4a_7 SKCRNN MT Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) 0.451 0.662
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) 0.437 0.682
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) 0.456 0.687
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) 0.492 0.800
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) 0.511 0.780
Lee_CAU_task4A_1 CAU_ET Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) 0.437 0.654
Lee_CAU_task4A_2 CAU_ET Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) 0.070 0.734
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) 0.496 0.788
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) 0.516 0.781
Chen_CHT_task4_1 VGGSK Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) 0.424 0.633
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) 0.529 0.780
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Xiao2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) 0.464 0.711
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) 0.543 0.801
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) 0.098 0.845
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) 0.539 0.793
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) 0.517 0.782
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) 0.113 0.885
Wang_XiaoRice_task4a_1 SINGLE Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) 0.527 0.790
Zhang_IOA_task4_5 base system Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) 0.498 0.746
Zhang_IOA_task4_6 strong_single Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) 0.552 0.794
Zhang_IOA_task4_7 weak single Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) 0.065 0.865
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) 0.429 0.644
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) 0.374 0.575
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) 0.387 0.585
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) 0.224 0.696
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) 0.164 0.740
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) 0.402 0.620
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) 0.436 0.675
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) 0.471 0.715
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) 0.546 0.807
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) 0.543 0.806
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) 0.521 0.793
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) 0.512 0.808
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) 0.460 0.699
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) 0.067 0.781

Supplementary metrics

Rank | Submission code | Submission name | Technical Report | PSDS 1 (Evaluation dataset) | PSDS 1 (Public evaluation) | PSDS 1 (Vimeo dataset) | PSDS 2 (Evaluation dataset) | PSDS 2 (Public evaluation) | PSDS 2 (Vimeo dataset) | F-score (Evaluation dataset) | F-score (Public evaluation) | F-score (Vimeo dataset)
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 0.327 (0.317 - 0.339) 0.366 (0.347 - 0.385) 0.247 (0.220 - 0.275) 0.538 (0.515 - 0.566) 0.580 (0.552 - 0.612) 0.430 (0.397 - 0.466) 0.377 (0.351 - 0.402) 0.408 (0.379 - 0.441) 0.299 (0.269 - 0.330)
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 0.510 (0.496 - 0.523) 0.560 (0.541 - 0.579) 0.414 (0.395 - 0.435) 0.798 (0.782 - 0.811) 0.841 (0.829 - 0.853) 0.697 (0.671 - 0.718) 0.567 (0.544 - 0.588) 0.603 (0.571 - 0.629) 0.480 (0.454 - 0.504)
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 0.531 (0.520 - 0.544) 0.577 (0.562 - 0.595) 0.431 (0.409 - 0.457) 0.762 (0.751 - 0.773) 0.800 (0.789 - 0.812) 0.663 (0.637 - 0.688) 0.599 (0.584 - 0.613) 0.634 (0.620 - 0.653) 0.509 (0.477 - 0.543)
Li_USTC_task4a_6 Pseudo labeling and single Li2023 0.546 (0.529 - 0.562) 0.593 (0.573 - 0.614) 0.451 (0.429 - 0.473) 0.783 (0.771 - 0.796) 0.810 (0.797 - 0.825) 0.703 (0.679 - 0.724) 0.603 (0.589 - 0.615) 0.635 (0.618 - 0.651) 0.523 (0.502 - 0.545)
Li_USTC_task4a_7 SKCRNN MT Li2023 0.404 (0.389 - 0.421) 0.451 (0.431 - 0.473) 0.303 (0.281 - 0.320) 0.630 (0.612 - 0.648) 0.673 (0.654 - 0.693) 0.502 (0.466 - 0.532) 0.478 (0.467 - 0.489) 0.516 (0.502 - 0.530) 0.384 (0.363 - 0.405)
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 0.434 (0.420 - 0.448) 0.489 (0.472 - 0.506) 0.314 (0.292 - 0.334) 0.646 (0.633 - 0.660) 0.704 (0.688 - 0.721) 0.510 (0.485 - 0.531) 0.468 (0.447 - 0.485) 0.500 (0.477 - 0.518) 0.389 (0.359 - 0.414)
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 0.413 (0.394 - 0.438) 0.461 (0.438 - 0.488) 0.318 (0.293 - 0.342) 0.655 (0.638 - 0.673) 0.715 (0.693 - 0.737) 0.527 (0.500 - 0.549) 0.484 (0.470 - 0.497) 0.518 (0.502 - 0.535) 0.397 (0.369 - 0.422)
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 0.521 (0.510 - 0.531) 0.569 (0.555 - 0.586) 0.424 (0.404 - 0.446) 0.813 (0.796 - 0.831) 0.858 (0.839 - 0.876) 0.717 (0.694 - 0.742) 0.564 (0.551 - 0.575) 0.598 (0.586 - 0.611) 0.481 (0.460 - 0.503)
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 0.515 (0.488 - 0.536) 0.564 (0.532 - 0.587) 0.416 (0.390 - 0.440) 0.805 (0.791 - 0.818) 0.850 (0.832 - 0.868) 0.699 (0.676 - 0.719) 0.553 (0.529 - 0.574) 0.586 (0.557 - 0.608) 0.469 (0.446 - 0.498)
Lee_CAU_task4A_1 CAU_ET Lee2023 0.425 (0.415 - 0.440) 0.475 (0.458 - 0.492) 0.320 (0.302 - 0.339) 0.634 (0.618 - 0.648) 0.683 (0.662 - 0.704) 0.514 (0.490 - 0.542) 0.470 (0.459 - 0.481) 0.513 (0.500 - 0.528) 0.364 (0.342 - 0.384)
Lee_CAU_task4A_2 CAU_ET Lee2023 0.104 (0.090 - 0.117) 0.118 (0.098 - 0.136) 0.090 (0.075 - 0.105) 0.674 (0.661 - 0.690) 0.707 (0.690 - 0.727) 0.592 (0.560 - 0.622) 0.137 (0.119 - 0.151) 0.150 (0.126 - 0.169) 0.106 (0.091 - 0.119)
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 0.516 (0.504 - 0.529) 0.573 (0.555 - 0.593) 0.411 (0.391 - 0.433) 0.796 (0.784 - 0.808) 0.841 (0.828 - 0.854) 0.697 (0.675 - 0.719) 0.577 (0.566 - 0.588) 0.615 (0.599 - 0.632) 0.486 (0.469 - 0.504)
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 0.487 (0.475 - 0.502) 0.540 (0.521 - 0.560) 0.389 (0.366 - 0.410) 0.759 (0.745 - 0.773) 0.804 (0.785 - 0.823) 0.656 (0.633 - 0.682) 0.555 (0.543 - 0.566) 0.596 (0.580 - 0.611) 0.454 (0.432 - 0.477)
Chen_CHT_task4_1 VGGSK Chen2023b 0.441 (0.403 - 0.468) 0.488 (0.440 - 0.523) 0.333 (0.289 - 0.370) 0.620 (0.567 - 0.652) 0.666 (0.608 - 0.707) 0.496 (0.428 - 0.548) 0.504 (0.449 - 0.543) 0.544 (0.486 - 0.585) 0.406 (0.351 - 0.447)
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 0.563 (0.550 - 0.574) 0.621 (0.600 - 0.639) 0.451 (0.431 - 0.471) 0.779 (0.768 - 0.792) 0.821 (0.809 - 0.834) 0.690 (0.665 - 0.715) 0.628 (0.615 - 0.641) 0.669 (0.653 - 0.686) 0.530 (0.508 - 0.552)
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Xiao2023 0.403 (0.392 - 0.417) 0.455 (0.439 - 0.472) 0.309 (0.292 - 0.326) 0.660 (0.646 - 0.672) 0.705 (0.690 - 0.724) 0.549 (0.527 - 0.572) 0.483 (0.472 - 0.493) 0.522 (0.510 - 0.534) 0.388 (0.373 - 0.402)
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 0.525 (0.516 - 0.538) 0.566 (0.549 - 0.584) 0.438 (0.424 - 0.454) 0.808 (0.796 - 0.821) 0.848 (0.837 - 0.862) 0.705 (0.683 - 0.729) 0.579 (0.569 - 0.588) 0.613 (0.601 - 0.627) 0.498 (0.481 - 0.517)
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.071 (0.062 - 0.080) 0.084 (0.070 - 0.096) 0.061 (0.050 - 0.074) 0.807 (0.796 - 0.818) 0.845 (0.833 - 0.859) 0.723 (0.701 - 0.742) 0.131 (0.124 - 0.137) 0.138 (0.128 - 0.147) 0.118 (0.106 - 0.130)
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 0.551 (0.543 - 0.562) 0.605 (0.591 - 0.621) 0.451 (0.433 - 0.469) 0.813 (0.802 - 0.827) 0.855 (0.844 - 0.868) 0.718 (0.698 - 0.736) 0.581 (0.573 - 0.591) 0.628 (0.616 - 0.641) 0.467 (0.449 - 0.485)
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 0.526 (0.513 - 0.539) 0.572 (0.552 - 0.590) 0.435 (0.418 - 0.450) 0.800 (0.788 - 0.813) 0.840 (0.825 - 0.857) 0.716 (0.695 - 0.735) 0.548 (0.533 - 0.563) 0.584 (0.567 - 0.604) 0.462 (0.444 - 0.482)
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.082 (0.073 - 0.091) 0.096 (0.083 - 0.107) 0.057 (0.043 - 0.071) 0.855 (0.844 - 0.867) 0.890 (0.871 - 0.903) 0.775 (0.756 - 0.796) 0.142 (0.134 - 0.151) 0.150 (0.139 - 0.160) 0.126 (0.113 - 0.138)
Wang_XiaoRice_task4a_1 SINGLE Wang2023 0.494 (0.477 - 0.510) 0.551 (0.532 - 0.574) 0.380 (0.362 - 0.402) 0.801 (0.789 - 0.815) 0.838 (0.823 - 0.854) 0.713 (0.686 - 0.742) 0.487 (0.465 - 0.513) 0.514 (0.491 - 0.543) 0.423 (0.396 - 0.451)
Zhang_IOA_task4_5 base system Zhang2023 0.524 (0.513 - 0.537) 0.565 (0.549 - 0.579) 0.445 (0.421 - 0.472) 0.774 (0.762 - 0.786) 0.821 (0.804 - 0.837) 0.672 (0.651 - 0.695) 0.601 (0.591 - 0.610) 0.630 (0.616 - 0.644) 0.534 (0.513 - 0.553)
Zhang_IOA_task4_6 strong_single Zhang2023 0.562 (0.552 - 0.575) 0.612 (0.597 - 0.626) 0.467 (0.450 - 0.487) 0.795 (0.786 - 0.805) 0.848 (0.838 - 0.857) 0.683 (0.661 - 0.703) 0.626 (0.617 - 0.633) 0.658 (0.646 - 0.669) 0.550 (0.530 - 0.566)
Zhang_IOA_task4_7 weak single Zhang2023 0.055 (0.048 - 0.064) 0.062 (0.050 - 0.074) 0.030 (0.015 - 0.044) 0.830 (0.820 - 0.842) 0.882 (0.873 - 0.892) 0.714 (0.690 - 0.735) 0.129 (0.123 - 0.135) 0.135 (0.126 - 0.143) 0.115 (0.103 - 0.126)
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 0.391 (0.379 - 0.405) 0.437 (0.423 - 0.458) 0.295 (0.278 - 0.311) 0.596 (0.584 - 0.610) 0.638 (0.617 - 0.660) 0.484 (0.463 - 0.505) 0.466 (0.454 - 0.478) 0.513 (0.500 - 0.527) 0.347 (0.326 - 0.365)
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 0.351 (0.333 - 0.372) 0.394 (0.370 - 0.422) 0.257 (0.236 - 0.275) 0.562 (0.532 - 0.587) 0.612 (0.586 - 0.640) 0.434 (0.391 - 0.477) 0.390 (0.373 - 0.414) 0.422 (0.400 - 0.452) 0.311 (0.288 - 0.329)
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 0.380 (0.361 - 0.406) 0.427 (0.400 - 0.459) 0.278 (0.257 - 0.296) 0.575 (0.553 - 0.594) 0.625 (0.604 - 0.650) 0.444 (0.409 - 0.480) 0.408 (0.389 - 0.432) 0.442 (0.416 - 0.474) 0.323 (0.302 - 0.341)
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.200 (0.164 - 0.225) 0.227 (0.185 - 0.256) 0.153 (0.117 - 0.179) 0.646 (0.626 - 0.664) 0.681 (0.656 - 0.706) 0.556 (0.525 - 0.590) 0.163 (0.141 - 0.181) 0.173 (0.148 - 0.192) 0.146 (0.123 - 0.164)
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.141 (0.124 - 0.155) 0.160 (0.136 - 0.179) 0.105 (0.089 - 0.121) 0.673 (0.652 - 0.700) 0.708 (0.683 - 0.735) 0.580 (0.550 - 0.610) 0.155 (0.135 - 0.172) 0.161 (0.137 - 0.180) 0.144 (0.126 - 0.160)
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 0.365 (0.353 - 0.377) 0.403 (0.388 - 0.424) 0.281 (0.260 - 0.298) 0.603 (0.589 - 0.617) 0.656 (0.636 - 0.676) 0.473 (0.453 - 0.496) 0.437 (0.426 - 0.448) 0.469 (0.455 - 0.483) 0.353 (0.331 - 0.371)
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 0.412 (0.400 - 0.424) 0.450 (0.432 - 0.472) 0.334 (0.314 - 0.352) 0.663 (0.652 - 0.676) 0.707 (0.690 - 0.724) 0.556 (0.531 - 0.574) 0.472 (0.462 - 0.480) 0.504 (0.491 - 0.515) 0.390 (0.370 - 0.407)
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 0.459 (0.431 - 0.484) 0.504 (0.472 - 0.533) 0.368 (0.330 - 0.400) 0.701 (0.681 - 0.720) 0.750 (0.732 - 0.771) 0.590 (0.556 - 0.625) 0.545 (0.530 - 0.564) 0.582 (0.567 - 0.602) 0.453 (0.434 - 0.474)
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 0.591 (0.574 - 0.611) 0.645 (0.624 - 0.668) 0.489 (0.466 - 0.515) 0.831 (0.823 - 0.841) 0.868 (0.859 - 0.877) 0.751 (0.733 - 0.768) 0.646 (0.634 - 0.658) 0.684 (0.670 - 0.697) 0.554 (0.534 - 0.573)
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 0.581 (0.553 - 0.600) 0.633 (0.604 - 0.655) 0.483 (0.456 - 0.503) 0.835 (0.826 - 0.846) 0.871 (0.862 - 0.881) 0.754 (0.736 - 0.772) 0.638 (0.622 - 0.654) 0.675 (0.658 - 0.691) 0.549 (0.522 - 0.572)
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 0.539 (0.528 - 0.549) 0.598 (0.581 - 0.614) 0.423 (0.404 - 0.437) 0.816 (0.806 - 0.831) 0.858 (0.848 - 0.870) 0.710 (0.688 - 0.733) 0.569 (0.559 - 0.577) 0.605 (0.594 - 0.614) 0.481 (0.460 - 0.501)
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 0.546 (0.536 - 0.556) 0.596 (0.583 - 0.611) 0.432 (0.418 - 0.448) 0.831 (0.823 - 0.842) 0.875 (0.868 - 0.884) 0.735 (0.715 - 0.754) 0.582 (0.574 - 0.589) 0.615 (0.603 - 0.626) 0.498 (0.481 - 0.515)
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 0.440 (0.429 - 0.454) 0.491 (0.472 - 0.508) 0.331 (0.314 - 0.349) 0.686 (0.673 - 0.699) 0.730 (0.711 - 0.751) 0.567 (0.547 - 0.588) 0.504 (0.497 - 0.514) 0.547 (0.533 - 0.561) 0.397 (0.379 - 0.413)
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.059 (0.049 - 0.068) 0.076 (0.064 - 0.086) 0.054 (0.041 - 0.067) 0.707 (0.694 - 0.723) 0.739 (0.723 - 0.758) 0.634 (0.610 - 0.654) 0.131 (0.125 - 0.137) 0.140 (0.131 - 0.148) 0.116 (0.105 - 0.127)

With ensembling

Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | PSDS 1 (Development dataset) | PSDS 2 (Development dataset)
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) 0.359 0.562
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) 0.491 0.787
Li_USTC_task4a_1 TAFT and SdMT Li2023 1.54 0.539 (0.527 - 0.551) 0.769 (0.758 - 0.778) 0.562 0.795
Li_USTC_task4a_2 Pseudo labeling Li2023 1.58 0.556 (0.544 - 0.569) 0.781 (0.769 - 0.795) 0.554 0.799
Li_USTC_task4a_3 TAFT and AFL Li2023 1.54 0.546 (0.535 - 0.558) 0.756 (0.745 - 0.769) 0.558 0.798
Li_USTC_task4a_4 MaxFilter Li2023 0.89 0.061 (0.050 - 0.070) 0.852 (0.843 - 0.863) 0.093 0.899
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) 0.555 0.791
Li_USTC_task4a_6 Pseudo labeling and single Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) 0.552 0.795
Li_USTC_task4a_7 SKCRNN MT Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) 0.451 0.662
Liu_NSYSU_task4_1 DCASE2023 FDY_WeakSED_Ensemble Liu2023 0.80 0.051 (0.042 - 0.060) 0.779 (0.767 - 0.791) 0.063 0.711
Liu_NSYSU_task4_2 FDY_Ensemble Liu2023 1.36 0.466 (0.455 - 0.480) 0.701 (0.688 - 0.714) 0.473 0.714
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) 0.437 0.682
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) 0.456 0.687
Liu_NSYSU_task4_5 DCASE2023 FDY_BEATs_WeakSED Liu2023 0.82 0.045 (0.035 - 0.053) 0.806 (0.794 - 0.818) 0.061 0.839
Liu_NSYSU_task4_6 DCASE2023 FDY_BEATs Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848) 0.527 0.803
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) 0.492 0.800
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) 0.511 0.780
Lee_CAU_task4A_1 CAU_ET Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) 0.437 0.654
Lee_CAU_task4A_2 CAU_ET Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) 0.070 0.734
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) 0.496 0.788
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) 0.516 0.781
Chen_CHT_task4_1 VGGSK Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) 0.424 0.633
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) 0.529 0.780
Chen_CHT_task4_3 VGGSK+BEATs Chen2023b 1.66 0.596 (0.585 - 0.606) 0.810 (0.800 - 0.822) 0.552 0.794
Chen_CHT_task4_4 multi+BEATs Chen2023b 1.66 0.590 (0.578 - 0.601) 0.820 (0.810 - 0.831) 0.542 0.799
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Xiao2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) 0.464 0.711
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) 0.543 0.801
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) 0.098 0.845
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) 0.539 0.793
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_5_ensemble_model Xiao2023 1.61 0.555 (0.545 - 0.567) 0.821 (0.811 - 0.834) 0.544 0.801
Xiao_FMSG_task4a_6 Xiao_FMSG_task4a_6_ensemble_model Xiao2023 1.61 0.551 (0.541 - 0.561) 0.829 (0.819 - 0.842) 0.557 0.812
Xiao_FMSG_task4a_7 Xiao_FMSG_task4a_7_ensemble_model Xiao2023 0.87 0.075 (0.066 - 0.084) 0.811 (0.800 - 0.822) 0.098 0.854
Xiao_FMSG_task4a_8 Xiao_FMSG_task4a_8_ensemble_model Xiao2023 1.62 0.549 (0.540 - 0.560) 0.834 (0.824 - 0.847) 0.551 0.813
Guan_HIT_task4a_1 Guan_HIT_task4a_1 Guan2023 1.57 0.536 (0.526 - 0.546) 0.810 (0.800 - 0.822) 0.523 0.790
Guan_HIT_task4a_2 Guan_HIT_task4a_2 Guan2023 0.93 0.082 (0.074 - 0.090) 0.862 (0.852 - 0.872) 0.115 0.890
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) 0.517 0.782
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) 0.113 0.885
Guan_HIT_task4a_5 Guan_HIT_task4a_5 Guan2023 1.40 0.488 (0.475 - 0.503) 0.708 (0.696 - 0.720) 0.492 0.705
Guan_HIT_task4a_6 Guan_HIT_task4a_6 Guan2023 0.88 0.088 (0.080 - 0.096) 0.797 (0.787 - 0.810) 0.109 0.839
Wang_XiaoRice_task4a_1 SINGLE Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) 0.527 0.790
Wang_XiaoRice_task4a_2 SED Embed Wang2023 1.52 0.497 (0.486 - 0.510) 0.814 (0.803 - 0.828) 0.534 0.811
Wang_XiaoRice_task4a_3 L-TAG Wang2023 0.91 0.088 (0.076 - 0.098) 0.835 (0.824 - 0.844) 0.102 0.886
Zhang_IOA_task4_1 strong_ensemble Zhang2023 1.75 0.622 (0.613 - 0.634) 0.857 (0.849 - 0.866) 0.598 0.837
Zhang_IOA_task4_2 segment tagging model Zhang2023 0.95 0.070 (0.060 - 0.080) 0.903 (0.895 - 0.911) 0.071 0.921
Zhang_IOA_task4_3 strong_ensemble_all Zhang2023 1.71 0.613 (0.603 - 0.625) 0.828 (0.821 - 0.839) 0.601 0.847
Zhang_IOA_task4_4 strong_ensemble_1 Zhang2023 1.75 0.625 (0.615 - 0.637) 0.855 (0.847 - 0.864) 0.602 0.841
Zhang_IOA_task4_5 base system Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) 0.498 0.746
Zhang_IOA_task4_6 strong_single Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) 0.552 0.794
Zhang_IOA_task4_7 weak single Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) 0.065 0.865
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) 0.429 0.644
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806) 0.525 0.780
Wu_NCUT_task4a_3 Wu_NCUT_task4a_3 Wu2023 1.50 0.497 (0.486 - 0.509) 0.793 (0.783 - 0.806) 0.521 0.783
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) 0.374 0.575
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) 0.387 0.585
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) 0.224 0.696
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) 0.164 0.740
Barahona_AUDIAS_task4a_5 4-Resolution CRNN Barahona2023 1.14 0.378 (0.365 - 0.392) 0.604 (0.590 - 0.622) 0.405 0.624
Barahona_AUDIAS_task4a_6 4-Resolution CRNN with class-dependent median filtering Barahona2023 1.18 0.401 (0.390 - 0.414) 0.612 (0.596 - 0.630) 0.416 0.626
Barahona_AUDIAS_task4a_7 5-Resolution Conformer Barahona2023 1.06 0.274 (0.262 - 0.287) 0.684 (0.671 - 0.699) 0.306 0.727
Barahona_AUDIAS_task4a_8 5-Resolution Conformer with class-wise median filtering Barahona2023 1.00 0.213 (0.201 - 0.226) 0.729 (0.710 - 0.752) 0.243 0.781
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) 0.402 0.620
Gan_NCUT_task4_2 Gan_NCUT_SED_system_2 Gan2023 1.52 0.511 (0.498 - 0.524) 0.799 (0.785 - 0.813) 0.521 0.792
Gan_NCUT_task4_3 Gan_NCUT_SED_system_3 Gan2023 1.50 0.483 (0.467 - 0.498) 0.816 (0.805 - 0.828) 0.497 0.825
Liu_SRCN_task4a_1 DCASE2023 t4a system1 Chen2023a 1.65 0.585 (0.572 - 0.598) 0.817 (0.804 - 0.834) 0.570 0.843
Liu_SRCN_task4a_2 DCASE2023 t4a system2 Chen2023a 1.40 0.380 (0.369 - 0.392) 0.877 (0.867 - 0.885) 0.414 0.884
Liu_SRCN_task4a_3 DCASE2023 t4a system3 Chen2023a 1.65 0.556 (0.544 - 0.569) 0.861 (0.852 - 0.870) 0.554 0.833
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) 0.436 0.675
Liu_SRCN_task4a_5 DCASE2023 t4a system5 Chen2023a 0.94 0.098 (0.086 - 0.108) 0.851 (0.841 - 0.860) 0.118 0.889
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) 0.471 0.715
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) 0.546 0.807
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) 0.543 0.806
Kim_GIST-HanwhaVision_task4a_4 FDYLKA BEATs pool 1d stage1 Kim2023 1.63 0.576 (0.549 - 0.595) 0.809 (0.797 - 0.821) 0.525 0.770
Kim_GIST-HanwhaVision_task4a_5 FDYLKA BEATs all ensemble 48 Kim2023 1.72 0.611 (0.598 - 0.623) 0.846 (0.838 - 0.855) 0.566 0.815
Kim_GIST-HanwhaVision_task4a_6 FDYLKA BEATs PSDS1 ensemble 16 Kim2023 1.72 0.611 (0.590 - 0.628) 0.841 (0.832 - 0.851) 0.564 0.810
Kim_GIST-HanwhaVision_task4a_7 FDYLKA BEATs PSDS2 ensemble 16 Kim2023 1.69 0.591 (0.574 - 0.604) 0.844 (0.835 - 0.853) 0.554 0.817
Kim_GIST-HanwhaVision_task4a_8 FDYLKA BEATs PSDS sum ensemble 16 Kim2023 1.72 0.612 (0.599 - 0.626) 0.841 (0.831 - 0.851) 0.567 0.810
Wenxin_TJU_task4a_1 ensemble-pretrained-psds1-0 Wenxin2023 1.63 0.555 (0.543 - 0.566) 0.837 (0.828 - 0.847) 0.535 0.806
Wenxin_TJU_task4a_2 ensemble-pretrained-psds1-1 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854) 0.530 0.804
Wenxin_TJU_task4a_3 ensemble-pretrained-psds2-0 Wenxin2023 0.88 0.080 (0.071 - 0.088) 0.815 (0.802 - 0.825) 0.087 0.875
Wenxin_TJU_task4a_4 ensemble-pretrained-psds2-1 Wenxin2023 0.90 0.081 (0.071 - 0.090) 0.838 (0.828 - 0.849) 0.087 0.875
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) 0.521 0.793
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) 0.512 0.808
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) 0.460 0.699
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) 0.067 0.781

Supplementary metrics

Rank | Submission code | Submission name | Technical Report | PSDS 1 (Evaluation dataset) | PSDS 1 (Public evaluation) | PSDS 1 (Vimeo dataset) | PSDS 2 (Evaluation dataset) | PSDS 2 (Public evaluation) | PSDS 2 (Vimeo dataset) | F-score (Evaluation dataset) | F-score (Public evaluation) | F-score (Vimeo dataset)
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 0.327 (0.317 - 0.339) 0.366 (0.347 - 0.385) 0.247 (0.220 - 0.275) 0.538 (0.515 - 0.566) 0.580 (0.552 - 0.612) 0.430 (0.397 - 0.466) 0.377 (0.351 - 0.402) 0.408 (0.379 - 0.441) 0.299 (0.269 - 0.330)
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 0.510 (0.496 - 0.523) 0.560 (0.541 - 0.579) 0.414 (0.395 - 0.435) 0.798 (0.782 - 0.811) 0.841 (0.829 - 0.853) 0.697 (0.671 - 0.718) 0.567 (0.544 - 0.588) 0.603 (0.571 - 0.629) 0.480 (0.454 - 0.504)
Li_USTC_task4a_1 TAFT and SdMT Li2023 0.539 (0.527 - 0.551) 0.588 (0.574 - 0.605) 0.435 (0.418 - 0.451) 0.769 (0.758 - 0.778) 0.810 (0.800 - 0.820) 0.669 (0.647 - 0.687) 0.595 (0.584 - 0.606) 0.632 (0.619 - 0.646) 0.501 (0.478 - 0.522)
Li_USTC_task4a_2 Pseudo labeling Li2023 0.556 (0.544 - 0.569) 0.603 (0.589 - 0.618) 0.453 (0.435 - 0.469) 0.781 (0.769 - 0.795) 0.809 (0.794 - 0.823) 0.706 (0.683 - 0.725) 0.615 (0.598 - 0.629) 0.647 (0.631 - 0.663) 0.534 (0.507 - 0.559)
Li_USTC_task4a_3 TAFT and AFL Li2023 0.546 (0.535 - 0.558) 0.591 (0.577 - 0.607) 0.446 (0.430 - 0.464) 0.756 (0.745 - 0.769) 0.792 (0.780 - 0.805) 0.675 (0.655 - 0.697) 0.590 (0.581 - 0.599) 0.629 (0.615 - 0.642) 0.494 (0.475 - 0.513)
Li_USTC_task4a_4 MaxFilter Li2023 0.061 (0.050 - 0.070) 0.076 (0.064 - 0.088) 0.028 (0.016 - 0.040) 0.852 (0.843 - 0.863) 0.891 (0.882 - 0.900) 0.764 (0.743 - 0.786) 0.137 (0.130 - 0.143) 0.141 (0.131 - 0.150) 0.127 (0.113 - 0.139)
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 0.531 (0.520 - 0.544) 0.577 (0.562 - 0.595) 0.431 (0.409 - 0.457) 0.762 (0.751 - 0.773) 0.800 (0.789 - 0.812) 0.663 (0.637 - 0.688) 0.599 (0.584 - 0.613) 0.634 (0.620 - 0.653) 0.509 (0.477 - 0.543)
Li_USTC_task4a_6 Pseudo labeling and single Li2023 0.546 (0.529 - 0.562) 0.593 (0.573 - 0.614) 0.451 (0.429 - 0.473) 0.783 (0.771 - 0.796) 0.810 (0.797 - 0.825) 0.703 (0.679 - 0.724) 0.603 (0.589 - 0.615) 0.635 (0.618 - 0.651) 0.523 (0.502 - 0.545)
Li_USTC_task4a_7 SKCRNN MT Li2023 0.404 (0.389 - 0.421) 0.451 (0.431 - 0.473) 0.303 (0.281 - 0.320) 0.630 (0.612 - 0.648) 0.673 (0.654 - 0.693) 0.502 (0.466 - 0.532) 0.478 (0.467 - 0.489) 0.516 (0.502 - 0.530) 0.384 (0.363 - 0.405)
Liu_NSYSU_task4_1 DCASE2023 FDY_WeakSED_Ensemble Liu2023 0.051 (0.042 - 0.060) 0.062 (0.050 - 0.072) 0.018 (0.007 - 0.029) 0.779 (0.767 - 0.791) 0.807 (0.794 - 0.822) 0.697 (0.675 - 0.721) 0.128 (0.122 - 0.135) 0.136 (0.126 - 0.145) 0.110 (0.097 - 0.123)
Liu_NSYSU_task4_2 FDY_Ensemble Liu2023 0.466 (0.455 - 0.480) 0.521 (0.505 - 0.536) 0.354 (0.337 - 0.370) 0.701 (0.688 - 0.714) 0.756 (0.741 - 0.773) 0.577 (0.551 - 0.599) 0.516 (0.505 - 0.527) 0.550 (0.537 - 0.564) 0.427 (0.407 - 0.444)
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 0.434 (0.420 - 0.448) 0.489 (0.472 - 0.506) 0.314 (0.292 - 0.334) 0.646 (0.633 - 0.660) 0.704 (0.688 - 0.721) 0.510 (0.485 - 0.531) 0.468 (0.447 - 0.485) 0.500 (0.477 - 0.518) 0.389 (0.359 - 0.414)
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 0.413 (0.394 - 0.438) 0.461 (0.438 - 0.488) 0.318 (0.293 - 0.342) 0.655 (0.638 - 0.673) 0.715 (0.693 - 0.737) 0.527 (0.500 - 0.549) 0.484 (0.470 - 0.497) 0.518 (0.502 - 0.535) 0.397 (0.369 - 0.422)
Liu_NSYSU_task4_5 DCASE2023 FDY_BEATs_WeakSED Liu2023 0.045 (0.035 - 0.053) 0.059 (0.047 - 0.069) 0.007 (0.001 - 0.019) 0.806 (0.794 - 0.818) 0.835 (0.823 - 0.849) 0.725 (0.704 - 0.750) 0.142 (0.135 - 0.149) 0.151 (0.141 - 0.161) 0.124 (0.112 - 0.135)
Liu_NSYSU_task4_6 DCASE2023 FDY_BEATs Liu2023 0.552 (0.540 - 0.563) 0.600 (0.583 - 0.619) 0.452 (0.437 - 0.467) 0.838 (0.829 - 0.848) 0.879 (0.871 - 0.889) 0.746 (0.728 - 0.763) 0.589 (0.578 - 0.599) 0.625 (0.613 - 0.637) 0.504 (0.488 - 0.519)
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 0.521 (0.510 - 0.531) 0.569 (0.555 - 0.586) 0.424 (0.404 - 0.446) 0.813 (0.796 - 0.831) 0.858 (0.839 - 0.876) 0.717 (0.694 - 0.742) 0.564 (0.551 - 0.575) 0.598 (0.586 - 0.611) 0.481 (0.460 - 0.503)
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 0.515 (0.488 - 0.536) 0.564 (0.532 - 0.587) 0.416 (0.390 - 0.440) 0.805 (0.791 - 0.818) 0.850 (0.832 - 0.868) 0.699 (0.676 - 0.719) 0.553 (0.529 - 0.574) 0.586 (0.557 - 0.608) 0.469 (0.446 - 0.498)
Lee_CAU_task4A_1 CAU_ET Lee2023 0.425 (0.415 - 0.440) 0.475 (0.458 - 0.492) 0.320 (0.302 - 0.339) 0.634 (0.618 - 0.648) 0.683 (0.662 - 0.704) 0.514 (0.490 - 0.542) 0.470 (0.459 - 0.481) 0.513 (0.500 - 0.528) 0.364 (0.342 - 0.384)
Lee_CAU_task4A_2 CAU_ET Lee2023 0.104 (0.090 - 0.117) 0.118 (0.098 - 0.136) 0.090 (0.075 - 0.105) 0.674 (0.661 - 0.690) 0.707 (0.690 - 0.727) 0.592 (0.560 - 0.622) 0.137 (0.119 - 0.151) 0.150 (0.126 - 0.169) 0.106 (0.091 - 0.119)
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 0.516 (0.504 - 0.529) 0.573 (0.555 - 0.593) 0.411 (0.391 - 0.433) 0.796 (0.784 - 0.808) 0.841 (0.828 - 0.854) 0.697 (0.675 - 0.719) 0.577 (0.566 - 0.588) 0.615 (0.599 - 0.632) 0.486 (0.469 - 0.504)
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 0.487 (0.475 - 0.502) 0.540 (0.521 - 0.560) 0.389 (0.366 - 0.410) 0.759 (0.745 - 0.773) 0.804 (0.785 - 0.823) 0.656 (0.633 - 0.682) 0.555 (0.543 - 0.566) 0.596 (0.580 - 0.611) 0.454 (0.432 - 0.477)
Chen_CHT_task4_1 VGGSK Chen2023b 0.441 (0.403 - 0.468) 0.488 (0.440 - 0.523) 0.333 (0.289 - 0.370) 0.620 (0.567 - 0.652) 0.666 (0.608 - 0.707) 0.496 (0.428 - 0.548) 0.504 (0.449 - 0.543) 0.544 (0.486 - 0.585) 0.406 (0.351 - 0.447)
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 0.563 (0.550 - 0.574) 0.621 (0.600 - 0.639) 0.451 (0.431 - 0.471) 0.779 (0.768 - 0.792) 0.821 (0.809 - 0.834) 0.690 (0.665 - 0.715) 0.628 (0.615 - 0.641) 0.669 (0.653 - 0.686) 0.530 (0.508 - 0.552)
Chen_CHT_task4_3 VGGSK+BEATs Chen2023b 0.596 (0.585 - 0.606) 0.655 (0.643 - 0.668) 0.482 (0.464 - 0.500) 0.810 (0.800 - 0.822) 0.849 (0.837 - 0.860) 0.733 (0.714 - 0.752) 0.638 (0.630 - 0.648) 0.673 (0.660 - 0.685) 0.552 (0.533 - 0.573)
Chen_CHT_task4_4 multi+BEATs Chen2023b 0.590 (0.578 - 0.601) 0.649 (0.634 - 0.664) 0.476 (0.460 - 0.493) 0.820 (0.810 - 0.831) 0.862 (0.851 - 0.872) 0.731 (0.710 - 0.751) 0.639 (0.629 - 0.649) 0.676 (0.662 - 0.688) 0.547 (0.528 - 0.565)
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Xiao2023 0.403 (0.392 - 0.417) 0.455 (0.439 - 0.472) 0.309 (0.292 - 0.326) 0.660 (0.646 - 0.672) 0.705 (0.690 - 0.724) 0.549 (0.527 - 0.572) 0.483 (0.472 - 0.493) 0.522 (0.510 - 0.534) 0.388 (0.373 - 0.402)
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 0.525 (0.516 - 0.538) 0.566 (0.549 - 0.584) 0.438 (0.424 - 0.454) 0.808 (0.796 - 0.821) 0.848 (0.837 - 0.862) 0.705 (0.683 - 0.729) 0.579 (0.569 - 0.588) 0.613 (0.601 - 0.627) 0.498 (0.481 - 0.517)
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.071 (0.062 - 0.080) 0.084 (0.070 - 0.096) 0.061 (0.050 - 0.074) 0.807 (0.796 - 0.818) 0.845 (0.833 - 0.859) 0.723 (0.701 - 0.742) 0.131 (0.124 - 0.137) 0.138 (0.128 - 0.147) 0.118 (0.106 - 0.130)
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 0.551 (0.543 - 0.562) 0.605 (0.591 - 0.621) 0.451 (0.433 - 0.469) 0.813 (0.802 - 0.827) 0.855 (0.844 - 0.868) 0.718 (0.698 - 0.736) 0.581 (0.573 - 0.591) 0.628 (0.616 - 0.641) 0.467 (0.449 - 0.485)
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_5_ensemble_model Xiao2023 0.555 (0.545 - 0.567) 0.606 (0.592 - 0.623) 0.457 (0.442 - 0.474) 0.821 (0.811 - 0.834) 0.859 (0.850 - 0.873) 0.735 (0.717 - 0.751) 0.000 (0.000 - 0.000) 0.000 (0.000 - 0.000) 0.000 (0.000 - 0.000)
Xiao_FMSG_task4a_6 Xiao_FMSG_task4a_6_ensemble_model Xiao2023 0.551 (0.541 - 0.561) 0.595 (0.581 - 0.612) 0.464 (0.449 - 0.479) 0.829 (0.819 - 0.842) 0.867 (0.856 - 0.880) 0.741 (0.724 - 0.760) 0.599 (0.590 - 0.609) 0.643 (0.630 - 0.657) 0.495 (0.477 - 0.512)
Xiao_FMSG_task4a_7 Xiao_FMSG_task4a_7_ensemble_model Xiao2023 0.075 (0.066 - 0.084) 0.088 (0.074 - 0.101) 0.074 (0.059 - 0.088) 0.811 (0.800 - 0.822) 0.847 (0.837 - 0.861) 0.731 (0.710 - 0.751) 0.132 (0.126 - 0.138) 0.140 (0.130 - 0.149) 0.117 (0.106 - 0.128)
Xiao_FMSG_task4a_8 Xiao_FMSG_task4a_8_ensemble_model Xiao2023 0.549 (0.540 - 0.560) 0.594 (0.578 - 0.613) 0.464 (0.447 - 0.481) 0.834 (0.824 - 0.847) 0.870 (0.861 - 0.883) 0.747 (0.728 - 0.762) 0.602 (0.593 - 0.612) 0.641 (0.627 - 0.656) 0.509 (0.493 - 0.527)
Guan_HIT_task4a_1 Guan_HIT_task4a_1 Guan2023 0.536 (0.526 - 0.546) 0.579 (0.565 - 0.598) 0.445 (0.428 - 0.460) 0.810 (0.800 - 0.822) 0.851 (0.841 - 0.863) 0.727 (0.709 - 0.747) 0.559 (0.547 - 0.570) 0.598 (0.585 - 0.614) 0.465 (0.445 - 0.484)
Guan_HIT_task4a_2 Guan_HIT_task4a_2 Guan2023 0.082 (0.074 - 0.090) 0.096 (0.086 - 0.107) 0.055 (0.042 - 0.069) 0.862 (0.852 - 0.872) 0.899 (0.889 - 0.909) 0.783 (0.764 - 0.802) 0.143 (0.137 - 0.149) 0.151 (0.140 - 0.160) 0.127 (0.114 - 0.138)
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 0.526 (0.513 - 0.539) 0.572 (0.552 - 0.590) 0.435 (0.418 - 0.450) 0.800 (0.788 - 0.813) 0.840 (0.825 - 0.857) 0.716 (0.695 - 0.735) 0.548 (0.533 - 0.563) 0.584 (0.567 - 0.604) 0.462 (0.444 - 0.482)
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.082 (0.073 - 0.091) 0.096 (0.083 - 0.107) 0.057 (0.043 - 0.071) 0.855 (0.844 - 0.867) 0.890 (0.871 - 0.903) 0.775 (0.756 - 0.796) 0.142 (0.134 - 0.151) 0.150 (0.139 - 0.160) 0.126 (0.113 - 0.138)
Guan_HIT_task4a_5 Guan_HIT_task4a_5 Guan2023 0.488 (0.475 - 0.503) 0.535 (0.517 - 0.554) 0.394 (0.374 - 0.412) 0.708 (0.696 - 0.720) 0.758 (0.743 - 0.774) 0.592 (0.568 - 0.619) 0.511 (0.501 - 0.521) 0.548 (0.533 - 0.561) 0.422 (0.407 - 0.437)
Guan_HIT_task4a_6 Guan_HIT_task4a_6 Guan2023 0.088 (0.080 - 0.096) 0.100 (0.088 - 0.110) 0.059 (0.042 - 0.075) 0.797 (0.787 - 0.810) 0.838 (0.826 - 0.850) 0.698 (0.679 - 0.720) 0.137 (0.130 - 0.144) 0.146 (0.136 - 0.156) 0.115 (0.100 - 0.127)
Wang_XiaoRice_task4a_1 SINGLE Wang2023 0.494 (0.477 - 0.510) 0.551 (0.532 - 0.574) 0.380 (0.362 - 0.402) 0.801 (0.789 - 0.815) 0.838 (0.823 - 0.854) 0.713 (0.686 - 0.742) 0.487 (0.465 - 0.513) 0.514 (0.491 - 0.543) 0.423 (0.396 - 0.451)
Wang_XiaoRice_task4a_2 SED Embed Wang2023 0.497 (0.486 - 0.510) 0.556 (0.538 - 0.576) 0.387 (0.366 - 0.406) 0.814 (0.803 - 0.828) 0.849 (0.837 - 0.862) 0.727 (0.704 - 0.753) 0.482 (0.467 - 0.496) 0.512 (0.492 - 0.532) 0.413 (0.393 - 0.431)
Wang_XiaoRice_task4a_3 L-TAG Wang2023 0.088 (0.076 - 0.098) 0.100 (0.086 - 0.113) 0.069 (0.055 - 0.084) 0.835 (0.824 - 0.844) 0.864 (0.851 - 0.875) 0.755 (0.733 - 0.772) 0.122 (0.115 - 0.130) 0.130 (0.120 - 0.142) 0.103 (0.087 - 0.117)
Zhang_IOA_task4_1 strong_ensemble Zhang2023 0.622 (0.613 - 0.634) 0.671 (0.657 - 0.687) 0.523 (0.506 - 0.541) 0.857 (0.849 - 0.866) 0.892 (0.884 - 0.902) 0.748 (0.728 - 0.770) 0.666 (0.658 - 0.675) 0.690 (0.679 - 0.700) 0.616 (0.600 - 0.635)
Zhang_IOA_task4_2 segment tagging model Zhang2023 0.070 (0.060 - 0.080) 0.078 (0.066 - 0.089) 0.040 (0.027 - 0.054) 0.903 (0.895 - 0.911) 0.951 (0.946 - 0.957) 0.800 (0.782 - 0.821) 0.154 (0.147 - 0.161) 0.164 (0.155 - 0.173) 0.131 (0.118 - 0.142)
Zhang_IOA_task4_3 strong_ensemble_all Zhang2023 0.613 (0.603 - 0.625) 0.669 (0.656 - 0.683) 0.518 (0.503 - 0.535) 0.828 (0.821 - 0.839) 0.870 (0.860 - 0.882) 0.743 (0.722 - 0.764) 0.651 (0.643 - 0.659) 0.677 (0.665 - 0.690) 0.588 (0.572 - 0.606)
Zhang_IOA_task4_4 strong_ensemble_1 Zhang2023 0.625 (0.615 - 0.637) 0.673 (0.659 - 0.689) 0.526 (0.508 - 0.543) 0.855 (0.847 - 0.864) 0.891 (0.883 - 0.901) 0.745 (0.725 - 0.767) 0.668 (0.659 - 0.676) 0.691 (0.680 - 0.701) 0.619 (0.603 - 0.638)
Zhang_IOA_task4_5 base system Zhang2023 0.524 (0.513 - 0.537) 0.565 (0.549 - 0.579) 0.445 (0.421 - 0.472) 0.774 (0.762 - 0.786) 0.821 (0.804 - 0.837) 0.672 (0.651 - 0.695) 0.601 (0.591 - 0.610) 0.630 (0.616 - 0.644) 0.534 (0.513 - 0.553)
Zhang_IOA_task4_6 strong_single Zhang2023 0.562 (0.552 - 0.575) 0.612 (0.597 - 0.626) 0.467 (0.450 - 0.487) 0.795 (0.786 - 0.805) 0.848 (0.838 - 0.857) 0.683 (0.661 - 0.703) 0.626 (0.617 - 0.633) 0.658 (0.646 - 0.669) 0.550 (0.530 - 0.566)
Zhang_IOA_task4_7 weak single Zhang2023 0.055 (0.048 - 0.064) 0.062 (0.050 - 0.074) 0.030 (0.015 - 0.044) 0.830 (0.820 - 0.842) 0.882 (0.873 - 0.892) 0.714 (0.690 - 0.735) 0.129 (0.123 - 0.135) 0.135 (0.126 - 0.143) 0.115 (0.103 - 0.126)
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 0.391 (0.379 - 0.405) 0.437 (0.423 - 0.458) 0.295 (0.278 - 0.311) 0.596 (0.584 - 0.610) 0.638 (0.617 - 0.660) 0.484 (0.463 - 0.505) 0.466 (0.454 - 0.478) 0.513 (0.500 - 0.527) 0.347 (0.326 - 0.365)
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 0.519 (0.507 - 0.531) 0.576 (0.560 - 0.596) 0.429 (0.412 - 0.444) 0.793 (0.783 - 0.806) 0.840 (0.830 - 0.851) 0.693 (0.672 - 0.716) 0.587 (0.577 - 0.596) 0.620 (0.608 - 0.635) 0.506 (0.485 - 0.529)
Wu_NCUT_task4a_3 Wu_NCUT_task4a_3 Wu2023 0.497 (0.486 - 0.509) 0.553 (0.537 - 0.575) 0.418 (0.402 - 0.434) 0.793 (0.783 - 0.806) 0.840 (0.830 - 0.850) 0.691 (0.669 - 0.715) 0.572 (0.562 - 0.581) 0.605 (0.592 - 0.618) 0.491 (0.470 - 0.512)
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 0.351 (0.333 - 0.372) 0.394 (0.370 - 0.422) 0.257 (0.236 - 0.275) 0.562 (0.532 - 0.587) 0.612 (0.586 - 0.640) 0.434 (0.391 - 0.477) 0.390 (0.373 - 0.414) 0.422 (0.400 - 0.452) 0.311 (0.288 - 0.329)
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 0.380 (0.361 - 0.406) 0.427 (0.400 - 0.459) 0.278 (0.257 - 0.296) 0.575 (0.553 - 0.594) 0.625 (0.604 - 0.650) 0.444 (0.409 - 0.480) 0.408 (0.389 - 0.432) 0.442 (0.416 - 0.474) 0.323 (0.302 - 0.341)
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.200 (0.164 - 0.225) 0.227 (0.185 - 0.256) 0.153 (0.117 - 0.179) 0.646 (0.626 - 0.664) 0.681 (0.656 - 0.706) 0.556 (0.525 - 0.590) 0.163 (0.141 - 0.181) 0.173 (0.148 - 0.192) 0.146 (0.123 - 0.164)
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.141 (0.124 - 0.155) 0.160 (0.136 - 0.179) 0.105 (0.089 - 0.121) 0.673 (0.652 - 0.700) 0.708 (0.683 - 0.735) 0.580 (0.550 - 0.610) 0.155 (0.135 - 0.172) 0.161 (0.137 - 0.180) 0.144 (0.126 - 0.160)
Barahona_AUDIAS_task4a_5 4-Resolution CRNN Barahona2023 0.378 (0.365 - 0.392) 0.424 (0.407 - 0.442) 0.287 (0.266 - 0.304) 0.604 (0.590 - 0.622) 0.655 (0.627 - 0.683) 0.480 (0.450 - 0.511) 0.427 (0.415 - 0.439) 0.457 (0.442 - 0.473) 0.349 (0.330 - 0.366)
Barahona_AUDIAS_task4a_6 4-Resolution CRNN with class-dependent median filtering Barahona2023 0.401 (0.390 - 0.414) 0.449 (0.433 - 0.466) 0.300 (0.277 - 0.320) 0.612 (0.596 - 0.630) 0.664 (0.639 - 0.690) 0.487 (0.449 - 0.525) 0.435 (0.417 - 0.449) 0.467 (0.448 - 0.483) 0.358 (0.335 - 0.378)
Barahona_AUDIAS_task4a_7 5-Resolution Conformer Barahona2023 0.274 (0.262 - 0.287) 0.310 (0.295 - 0.325) 0.227 (0.208 - 0.247) 0.684 (0.671 - 0.699) 0.718 (0.697 - 0.741) 0.605 (0.585 - 0.628) 0.239 (0.231 - 0.248) 0.253 (0.240 - 0.264) 0.209 (0.196 - 0.223)
Barahona_AUDIAS_task4a_8 5-Resolution Conformer with class-wise median filtering Barahona2023 0.213 (0.201 - 0.226) 0.239 (0.220 - 0.257) 0.173 (0.154 - 0.188) 0.729 (0.710 - 0.752) 0.761 (0.736 - 0.789) 0.648 (0.621 - 0.672) 0.204 (0.197 - 0.213) 0.215 (0.203 - 0.226) 0.182 (0.169 - 0.196)
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 0.365 (0.353 - 0.377) 0.403 (0.388 - 0.424) 0.281 (0.260 - 0.298) 0.603 (0.589 - 0.617) 0.656 (0.636 - 0.676) 0.473 (0.453 - 0.496) 0.437 (0.426 - 0.448) 0.469 (0.455 - 0.483) 0.353 (0.331 - 0.371)
Gan_NCUT_task4_2 Gan_NCUT_SED_system_2 Gan2023 0.511 (0.498 - 0.524) 0.562 (0.545 - 0.581) 0.412 (0.396 - 0.429) 0.799 (0.785 - 0.813) 0.846 (0.835 - 0.858) 0.699 (0.676 - 0.721) 0.569 (0.553 - 0.584) 0.604 (0.589 - 0.620) 0.483 (0.457 - 0.510)
Gan_NCUT_task4_3 Gan_NCUT_SED_system_3 Gan2023 0.483 (0.467 - 0.498) 0.531 (0.515 - 0.549) 0.391 (0.374 - 0.414) 0.816 (0.805 - 0.828) 0.853 (0.842 - 0.865) 0.729 (0.707 - 0.751) 0.569 (0.558 - 0.579) 0.604 (0.591 - 0.618) 0.482 (0.461 - 0.501)
Liu_SRCN_task4a_1 DCASE2023 t4a system1 Chen2023a 0.585 (0.572 - 0.598) 0.636 (0.618 - 0.655) 0.484 (0.467 - 0.501) 0.817 (0.804 - 0.834) 0.853 (0.839 - 0.870) 0.725 (0.700 - 0.750) 0.632 (0.620 - 0.642) 0.672 (0.660 - 0.685) 0.541 (0.525 - 0.556)
Liu_SRCN_task4a_2 DCASE2023 t4a system2 Chen2023a 0.380 (0.369 - 0.392) 0.419 (0.405 - 0.435) 0.313 (0.297 - 0.328) 0.877 (0.867 - 0.885) 0.914 (0.907 - 0.920) 0.806 (0.788 - 0.825) 0.436 (0.426 - 0.446) 0.465 (0.453 - 0.477) 0.371 (0.355 - 0.389)
Liu_SRCN_task4a_3 DCASE2023 t4a system3 Chen2023a 0.556 (0.544 - 0.569) 0.601 (0.586 - 0.620) 0.469 (0.451 - 0.487) 0.861 (0.852 - 0.870) 0.901 (0.893 - 0.909) 0.775 (0.756 - 0.799) 0.634 (0.623 - 0.645) 0.671 (0.658 - 0.686) 0.541 (0.523 - 0.557)
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 0.412 (0.400 - 0.424) 0.450 (0.432 - 0.472) 0.334 (0.314 - 0.352) 0.663 (0.652 - 0.676) 0.707 (0.690 - 0.724) 0.556 (0.531 - 0.574) 0.472 (0.462 - 0.480) 0.504 (0.491 - 0.515) 0.390 (0.370 - 0.407)
Liu_SRCN_task4a_5 DCASE2023 t4a system5 Chen2023a 0.098 (0.086 - 0.108) 0.110 (0.097 - 0.123) 0.048 (0.033 - 0.067) 0.851 (0.841 - 0.860) 0.879 (0.869 - 0.890) 0.770 (0.750 - 0.786) 0.158 (0.151 - 0.165) 0.165 (0.155 - 0.176) 0.142 (0.129 - 0.154)
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 0.459 (0.431 - 0.484) 0.504 (0.472 - 0.533) 0.368 (0.330 - 0.400) 0.701 (0.681 - 0.720) 0.750 (0.732 - 0.771) 0.590 (0.556 - 0.625) 0.545 (0.530 - 0.564) 0.582 (0.567 - 0.602) 0.453 (0.434 - 0.474)
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 0.591 (0.574 - 0.611) 0.645 (0.624 - 0.668) 0.489 (0.466 - 0.515) 0.831 (0.823 - 0.841) 0.868 (0.859 - 0.877) 0.751 (0.733 - 0.768) 0.646 (0.634 - 0.658) 0.684 (0.670 - 0.697) 0.554 (0.534 - 0.573)
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 0.581 (0.553 - 0.600) 0.633 (0.604 - 0.655) 0.483 (0.456 - 0.503) 0.835 (0.826 - 0.846) 0.871 (0.862 - 0.881) 0.754 (0.736 - 0.772) 0.638 (0.622 - 0.654) 0.675 (0.658 - 0.691) 0.549 (0.522 - 0.572)
Kim_GIST-HanwhaVision_task4a_4 FDYLKA BEATs pool 1d stage1 Kim2023 0.576 (0.549 - 0.595) 0.628 (0.595 - 0.654) 0.479 (0.441 - 0.508) 0.809 (0.797 - 0.821) 0.854 (0.842 - 0.867) 0.712 (0.693 - 0.731) 0.612 (0.585 - 0.632) 0.656 (0.624 - 0.681) 0.503 (0.483 - 0.526)
Kim_GIST-HanwhaVision_task4a_5 FDYLKA BEATs all ensemble 48 Kim2023 0.611 (0.598 - 0.623) 0.661 (0.647 - 0.678) 0.511 (0.494 - 0.529) 0.846 (0.838 - 0.855) 0.883 (0.875 - 0.891) 0.760 (0.743 - 0.777) 0.655 (0.641 - 0.671) 0.694 (0.676 - 0.715) 0.558 (0.540 - 0.575)
Kim_GIST-HanwhaVision_task4a_6 FDYLKA BEATs PSDS1 ensemble 16 Kim2023 0.611 (0.590 - 0.628) 0.661 (0.640 - 0.682) 0.512 (0.485 - 0.535) 0.841 (0.832 - 0.851) 0.877 (0.867 - 0.887) 0.762 (0.743 - 0.779) 0.658 (0.639 - 0.674) 0.698 (0.675 - 0.717) 0.561 (0.540 - 0.583)
Kim_GIST-HanwhaVision_task4a_7 FDYLKA BEATs PSDS2 ensemble 16 Kim2023 0.591 (0.574 - 0.604) 0.643 (0.621 - 0.662) 0.489 (0.467 - 0.509) 0.844 (0.835 - 0.853) 0.882 (0.873 - 0.892) 0.759 (0.741 - 0.776) 0.638 (0.620 - 0.652) 0.676 (0.655 - 0.696) 0.543 (0.522 - 0.563)
Kim_GIST-HanwhaVision_task4a_8 FDYLKA BEATs PSDS sum ensemble 16 Kim2023 0.612 (0.599 - 0.626) 0.659 (0.644 - 0.675) 0.517 (0.500 - 0.536) 0.841 (0.831 - 0.851) 0.877 (0.867 - 0.887) 0.759 (0.740 - 0.777) 0.657 (0.644 - 0.671) 0.696 (0.679 - 0.712) 0.562 (0.543 - 0.581)
Wenxin_TJU_task4a_1 ensemble-pretrained-psds1-0 Wenxin2023 0.555 (0.543 - 0.566) 0.608 (0.592 - 0.627) 0.439 (0.425 - 0.454) 0.837 (0.828 - 0.847) 0.879 (0.871 - 0.888) 0.738 (0.719 - 0.754) 0.590 (0.581 - 0.600) 0.626 (0.614 - 0.640) 0.503 (0.488 - 0.519)
Wenxin_TJU_task4a_2 ensemble-pretrained-psds1-1 Wenxin2023 0.570 (0.559 - 0.580) 0.623 (0.606 - 0.638) 0.445 (0.429 - 0.458) 0.844 (0.836 - 0.854) 0.884 (0.876 - 0.894) 0.752 (0.732 - 0.770) 0.603 (0.595 - 0.612) 0.641 (0.629 - 0.653) 0.513 (0.495 - 0.531)
Wenxin_TJU_task4a_3 ensemble-pretrained-psds2-0 Wenxin2023 0.080 (0.071 - 0.088) 0.095 (0.083 - 0.105) 0.075 (0.063 - 0.088) 0.815 (0.802 - 0.825) 0.844 (0.833 - 0.860) 0.723 (0.700 - 0.744) 0.141 (0.135 - 0.147) 0.144 (0.132 - 0.152) 0.137 (0.125 - 0.150)
Wenxin_TJU_task4a_4 ensemble-pretrained-psds2-1 Wenxin2023 0.081 (0.071 - 0.090) 0.095 (0.085 - 0.105) 0.078 (0.063 - 0.093) 0.838 (0.828 - 0.849) 0.867 (0.857 - 0.878) 0.760 (0.738 - 0.780) 0.143 (0.136 - 0.150) 0.150 (0.140 - 0.158) 0.129 (0.117 - 0.140)
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 0.539 (0.528 - 0.549) 0.598 (0.581 - 0.614) 0.423 (0.404 - 0.437) 0.816 (0.806 - 0.831) 0.858 (0.848 - 0.870) 0.710 (0.688 - 0.733) 0.569 (0.559 - 0.577) 0.605 (0.594 - 0.614) 0.481 (0.460 - 0.501)
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 0.546 (0.536 - 0.556) 0.596 (0.583 - 0.611) 0.432 (0.418 - 0.448) 0.831 (0.823 - 0.842) 0.875 (0.868 - 0.884) 0.735 (0.715 - 0.754) 0.582 (0.574 - 0.589) 0.615 (0.603 - 0.626) 0.498 (0.481 - 0.515)
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 0.440 (0.429 - 0.454) 0.491 (0.472 - 0.508) 0.331 (0.314 - 0.349) 0.686 (0.673 - 0.699) 0.730 (0.711 - 0.751) 0.567 (0.547 - 0.588) 0.504 (0.497 - 0.514) 0.547 (0.533 - 0.561) 0.397 (0.379 - 0.413)
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.059 (0.049 - 0.068) 0.076 (0.064 - 0.086) 0.054 (0.041 - 0.067) 0.707 (0.694 - 0.723) 0.739 (0.723 - 0.758) 0.634 (0.610 - 0.654) 0.131 (0.125 - 0.137) 0.140 (0.131 - 0.148) 0.116 (0.105 - 0.127)

Class-wise performance

PSDS scenario 1

Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | Alarm/Bell/Ringing | Blender | Cat | Dishes | Dog | Electric shaver/toothbrush | Frying | Running water | Speech | Vacuum cleaner
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 1.00 0.389 (0.345 - 0.443) 0.654 (0.604 - 0.702) 0.574 (0.515 - 0.622) 0.168 (0.147 - 0.197) 0.311 (0.286 - 0.340) 0.546 (0.481 - 0.627) 0.620 (0.531 - 0.688) 0.408 (0.371 - 0.454) 0.744 (0.713 - 0.766) 0.699 (0.639 - 0.748)
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 1.52 0.662 (0.608 - 0.736) 0.871 (0.832 - 0.909) 0.771 (0.741 - 0.799) 0.286 (0.269 - 0.303) 0.495 (0.458 - 0.533) 0.778 (0.709 - 0.837) 0.773 (0.716 - 0.815) 0.641 (0.608 - 0.675) 0.782 (0.769 - 0.796) 0.864 (0.837 - 0.894)
Li_USTC_task4a_1 TAFT and SdMT Li2023 1.54 0.708 (0.666 - 0.752) 0.800 (0.765 - 0.834) 0.807 (0.786 - 0.825) 0.380 (0.362 - 0.401) 0.481 (0.453 - 0.510) 0.773 (0.745 - 0.805) 0.819 (0.792 - 0.842) 0.597 (0.566 - 0.631) 0.781 (0.767 - 0.796) 0.863 (0.838 - 0.885)
Li_USTC_task4a_2 Pseudo labeling Li2023 1.58 0.704 (0.661 - 0.757) 0.829 (0.799 - 0.858) 0.828 (0.807 - 0.846) 0.399 (0.380 - 0.421) 0.475 (0.448 - 0.501) 0.862 (0.833 - 0.893) 0.800 (0.759 - 0.839) 0.643 (0.604 - 0.678) 0.804 (0.791 - 0.817) 0.860 (0.834 - 0.885)
Li_USTC_task4a_3 TAFT and AFL Li2023 1.54 0.729 (0.698 - 0.770) 0.831 (0.801 - 0.856) 0.819 (0.799 - 0.836) 0.379 (0.362 - 0.399) 0.494 (0.465 - 0.523) 0.802 (0.774 - 0.826) 0.807 (0.786 - 0.827) 0.608 (0.576 - 0.640) 0.764 (0.750 - 0.779) 0.893 (0.872 - 0.916)
Li_USTC_task4a_4 MaxFilter Li2023 0.89 0.248 (0.219 - 0.281) 0.286 (0.243 - 0.322) 0.066 (0.051 - 0.081) 0.018 (0.011 - 0.023) 0.089 (0.062 - 0.110) 0.497 (0.459 - 0.541) 0.720 (0.682 - 0.758) 0.485 (0.448 - 0.524) 0.076 (0.063 - 0.089) 0.787 (0.758 - 0.816)
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 1.52 0.708 (0.665 - 0.756) 0.775 (0.741 - 0.813) 0.806 (0.784 - 0.825) 0.377 (0.358 - 0.398) 0.480 (0.442 - 0.520) 0.774 (0.746 - 0.808) 0.806 (0.764 - 0.837) 0.570 (0.535 - 0.608) 0.777 (0.761 - 0.793) 0.857 (0.816 - 0.892)
Li_USTC_task4a_6 Pseudo labeling and single Li2023 1.56 0.682 (0.623 - 0.743) 0.819 (0.789 - 0.852) 0.830 (0.807 - 0.852) 0.400 (0.377 - 0.429) 0.469 (0.433 - 0.505) 0.851 (0.825 - 0.884) 0.797 (0.754 - 0.846) 0.613 (0.556 - 0.659) 0.802 (0.789 - 0.817) 0.856 (0.830 - 0.885)
Li_USTC_task4a_7 SKCRNN MT Li2023 1.20 0.482 (0.422 - 0.536) 0.718 (0.677 - 0.757) 0.656 (0.624 - 0.690) 0.235 (0.214 - 0.261) 0.322 (0.287 - 0.369) 0.676 (0.644 - 0.710) 0.663 (0.630 - 0.697) 0.555 (0.526 - 0.587) 0.756 (0.742 - 0.769) 0.771 (0.736 - 0.812)
Liu_NSYSU_task4_1 DCASE2023 FDY_WeakSED_Ensemble Liu2023 0.80 0.190 (0.163 - 0.220) 0.268 (0.230 - 0.303) 0.043 (0.030 - 0.056) 0.012 (0.004 - 0.016) 0.087 (0.059 - 0.107) 0.485 (0.441 - 0.527) 0.714 (0.674 - 0.748) 0.460 (0.429 - 0.496) 0.092 (0.078 - 0.107) 0.770 (0.740 - 0.802)
Liu_NSYSU_task4_2 FDY_Ensemble Liu2023 1.36 0.554 (0.508 - 0.615) 0.781 (0.746 - 0.815) 0.701 (0.671 - 0.729) 0.277 (0.262 - 0.293) 0.461 (0.438 - 0.497) 0.683 (0.648 - 0.722) 0.757 (0.728 - 0.783) 0.547 (0.520 - 0.578) 0.795 (0.785 - 0.807) 0.818 (0.790 - 0.848)
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 1.26 0.555 (0.514 - 0.612) 0.719 (0.684 - 0.755) 0.675 (0.643 - 0.708) 0.239 (0.228 - 0.255) 0.435 (0.394 - 0.474) 0.677 (0.629 - 0.737) 0.735 (0.694 - 0.785) 0.494 (0.459 - 0.527) 0.756 (0.741 - 0.774) 0.719 (0.689 - 0.748)
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 1.24 0.488 (0.440 - 0.549) 0.765 (0.708 - 0.814) 0.670 (0.634 - 0.707) 0.244 (0.217 - 0.284) 0.380 (0.332 - 0.441) 0.654 (0.622 - 0.691) 0.690 (0.641 - 0.743) 0.507 (0.432 - 0.564) 0.742 (0.727 - 0.758) 0.812 (0.783 - 0.843)
Liu_NSYSU_task4_5 DCASE2023 FDY_BEATs_WeakSED Liu2023 0.82 0.199 (0.169 - 0.230) 0.290 (0.249 - 0.329) 0.052 (0.035 - 0.068) 0.010 (0.003 - 0.014) 0.086 (0.059 - 0.107) 0.509 (0.468 - 0.553) 0.733 (0.695 - 0.771) 0.507 (0.471 - 0.550) 0.031 (0.025 - 0.037) 0.786 (0.757 - 0.816)
Liu_NSYSU_task4_6 DCASE2023 FDY_BEATs Liu2023 1.62 0.684 (0.634 - 0.750) 0.908 (0.888 - 0.927) 0.797 (0.773 - 0.820) 0.340 (0.324 - 0.356) 0.590 (0.563 - 0.618) 0.731 (0.703 - 0.765) 0.812 (0.786 - 0.842) 0.652 (0.617 - 0.690) 0.801 (0.787 - 0.814) 0.886 (0.861 - 0.908)
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 1.55 0.696 (0.656 - 0.746) 0.831 (0.795 - 0.865) 0.772 (0.747 - 0.800) 0.318 (0.301 - 0.334) 0.552 (0.527 - 0.582) 0.685 (0.644 - 0.727) 0.805 (0.775 - 0.834) 0.600 (0.563 - 0.642) 0.730 (0.713 - 0.747) 0.860 (0.831 - 0.890)
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 1.53 0.667 (0.576 - 0.735) 0.866 (0.827 - 0.898) 0.774 (0.745 - 0.800) 0.312 (0.291 - 0.332) 0.551 (0.523 - 0.581) 0.660 (0.597 - 0.718) 0.818 (0.791 - 0.846) 0.607 (0.568 - 0.649) 0.755 (0.719 - 0.788) 0.872 (0.835 - 0.905)
Lee_CAU_task4A_1 CAU_ET Lee2023 1.24 0.528 (0.482 - 0.596) 0.752 (0.717 - 0.781) 0.673 (0.637 - 0.703) 0.252 (0.235 - 0.272) 0.408 (0.381 - 0.443) 0.631 (0.599 - 0.669) 0.622 (0.590 - 0.658) 0.515 (0.485 - 0.547) 0.789 (0.773 - 0.803) 0.783 (0.752 - 0.821)
Lee_CAU_task4A_2 CAU_ET Lee2023 0.79 0.229 (0.181 - 0.289) 0.311 (0.258 - 0.369) 0.143 (0.106 - 0.182) 0.032 (0.025 - 0.040) 0.113 (0.087 - 0.137) 0.449 (0.402 - 0.499) 0.619 (0.578 - 0.663) 0.452 (0.414 - 0.489) 0.166 (0.136 - 0.200) 0.694 (0.646 - 0.741)
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 1.53 0.652 (0.600 - 0.722) 0.867 (0.838 - 0.892) 0.786 (0.760 - 0.809) 0.321 (0.303 - 0.339) 0.497 (0.473 - 0.528) 0.824 (0.797 - 0.856) 0.788 (0.762 - 0.822) 0.592 (0.553 - 0.628) 0.775 (0.758 - 0.793) 0.865 (0.836 - 0.892)
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 1.45 0.485 (0.441 - 0.542) 0.815 (0.783 - 0.846) 0.781 (0.751 - 0.809) 0.316 (0.302 - 0.332) 0.519 (0.494 - 0.552) 0.789 (0.758 - 0.820) 0.707 (0.671 - 0.745) 0.577 (0.540 - 0.610) 0.814 (0.797 - 0.831) 0.870 (0.839 - 0.898)
Chen_CHT_task4_1 VGGSK Chen2023b 1.25 0.523 (0.385 - 0.629) 0.704 (0.660 - 0.745) 0.666 (0.622 - 0.707) 0.271 (0.224 - 0.307) 0.468 (0.437 - 0.500) 0.707 (0.659 - 0.751) 0.771 (0.717 - 0.815) 0.471 (0.438 - 0.505) 0.730 (0.717 - 0.746) 0.830 (0.789 - 0.870)
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 1.58 0.679 (0.616 - 0.730) 0.876 (0.841 - 0.904) 0.802 (0.777 - 0.825) 0.376 (0.356 - 0.395) 0.596 (0.565 - 0.629) 0.812 (0.782 - 0.843) 0.804 (0.768 - 0.844) 0.598 (0.569 - 0.630) 0.824 (0.812 - 0.839) 0.891 (0.856 - 0.921)
Chen_CHT_task4_3 VGGSK+BEATs Chen2023b 1.66 0.720 (0.682 - 0.765) 0.899 (0.879 - 0.918) 0.821 (0.799 - 0.841) 0.395 (0.375 - 0.413) 0.615 (0.591 - 0.641) 0.860 (0.834 - 0.886) 0.830 (0.801 - 0.861) 0.671 (0.644 - 0.700) 0.837 (0.824 - 0.850) 0.913 (0.893 - 0.933)
Chen_CHT_task4_4 multi+BEATs Chen2023b 1.66 0.692 (0.648 - 0.749) 0.896 (0.875 - 0.919) 0.832 (0.811 - 0.852) 0.384 (0.366 - 0.400) 0.610 (0.587 - 0.634) 0.859 (0.834 - 0.883) 0.803 (0.776 - 0.829) 0.688 (0.656 - 0.719) 0.839 (0.828 - 0.853) 0.912 (0.895 - 0.934)
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Zhang2023 1.23 0.393 (0.356 - 0.440) 0.746 (0.712 - 0.777) 0.690 (0.662 - 0.717) 0.228 (0.214 - 0.241) 0.396 (0.374 - 0.431) 0.711 (0.681 - 0.750) 0.720 (0.694 - 0.746) 0.514 (0.487 - 0.541) 0.787 (0.776 - 0.799) 0.824 (0.800 - 0.851)
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 1.55 0.635 (0.591 - 0.713) 0.849 (0.819 - 0.889) 0.774 (0.754 - 0.790) 0.303 (0.286 - 0.321) 0.513 (0.483 - 0.551) 0.786 (0.758 - 0.819) 0.864 (0.843 - 0.888) 0.647 (0.617 - 0.673) 0.798 (0.787 - 0.813) 0.899 (0.878 - 0.923)
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.86 0.196 (0.169 - 0.225) 0.359 (0.326 - 0.389) 0.071 (0.056 - 0.084) 0.042 (0.035 - 0.050) 0.095 (0.071 - 0.114) 0.523 (0.488 - 0.565) 0.702 (0.667 - 0.730) 0.445 (0.412 - 0.480) 0.095 (0.088 - 0.106) 0.787 (0.760 - 0.817)
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 1.60 0.692 (0.658 - 0.736) 0.818 (0.788 - 0.855) 0.790 (0.771 - 0.810) 0.360 (0.345 - 0.376) 0.544 (0.517 - 0.577) 0.774 (0.749 - 0.805) 0.759 (0.732 - 0.788) 0.624 (0.592 - 0.657) 0.813 (0.803 - 0.823) 0.888 (0.864 - 0.915)
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_5_ensemble_model Xiao2023 1.61 0.685 (0.641 - 0.739) 0.837 (0.806 - 0.873) 0.790 (0.766 - 0.811) 0.362 (0.345 - 0.377) 0.553 (0.524 - 0.583) 0.796 (0.771 - 0.822) 0.770 (0.742 - 0.801) 0.634 (0.604 - 0.663) 0.811 (0.799 - 0.823) 0.887 (0.861 - 0.909)
Xiao_FMSG_task4a_6 Xiao_FMSG_task4a_6_ensemble_model Xiao2023 1.61 0.673 (0.629 - 0.733) 0.837 (0.810 - 0.871) 0.810 (0.788 - 0.830) 0.334 (0.317 - 0.347) 0.556 (0.529 - 0.588) 0.815 (0.792 - 0.844) 0.808 (0.784 - 0.835) 0.640 (0.607 - 0.673) 0.823 (0.813 - 0.834) 0.905 (0.886 - 0.929)
Xiao_FMSG_task4a_7 Xiao_FMSG_task4a_7_ensemble_model Xiao2023 0.87 0.202 (0.175 - 0.231) 0.381 (0.344 - 0.416) 0.072 (0.057 - 0.086) 0.043 (0.033 - 0.051) 0.097 (0.073 - 0.117) 0.510 (0.471 - 0.553) 0.704 (0.669 - 0.733) 0.464 (0.427 - 0.501) 0.099 (0.092 - 0.109) 0.791 (0.764 - 0.821)
Xiao_FMSG_task4a_8 Xiao_FMSG_task4a_8_ensemble_model Xiao2023 1.62 0.678 (0.634 - 0.752) 0.850 (0.820 - 0.889) 0.803 (0.783 - 0.822) 0.332 (0.315 - 0.349) 0.553 (0.522 - 0.586) 0.821 (0.799 - 0.849) 0.804 (0.777 - 0.833) 0.640 (0.613 - 0.669) 0.816 (0.805 - 0.827) 0.899 (0.878 - 0.925)
Guan_HIT_task4a_1 Guan_HIT_task4a_1 Guan2023 1.57 0.698 (0.661 - 0.740) 0.851 (0.820 - 0.887) 0.813 (0.791 - 0.839) 0.322 (0.308 - 0.336) 0.466 (0.436 - 0.500) 0.850 (0.831 - 0.878) 0.838 (0.811 - 0.867) 0.668 (0.634 - 0.699) 0.792 (0.779 - 0.806) 0.905 (0.884 - 0.925)
Guan_HIT_task4a_2 Guan_HIT_task4a_2 Guan2023 0.93 0.269 (0.235 - 0.311) 0.494 (0.456 - 0.533) 0.063 (0.049 - 0.076) 0.024 (0.016 - 0.030) 0.090 (0.068 - 0.109) 0.621 (0.590 - 0.662) 0.777 (0.744 - 0.803) 0.509 (0.477 - 0.546) 0.089 (0.081 - 0.099) 0.865 (0.839 - 0.890)
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 1.55 0.686 (0.638 - 0.739) 0.842 (0.811 - 0.879) 0.806 (0.781 - 0.831) 0.320 (0.306 - 0.335) 0.452 (0.417 - 0.488) 0.843 (0.819 - 0.874) 0.822 (0.773 - 0.868) 0.651 (0.619 - 0.686) 0.781 (0.764 - 0.797) 0.901 (0.878 - 0.922)
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.92 0.269 (0.235 - 0.310) 0.497 (0.458 - 0.535) 0.064 (0.049 - 0.078) 0.025 (0.017 - 0.030) 0.090 (0.066 - 0.110) 0.617 (0.584 - 0.663) 0.765 (0.731 - 0.798) 0.498 (0.459 - 0.539) 0.091 (0.069 - 0.109) 0.865 (0.839 - 0.890)
Guan_HIT_task4a_5 Guan_HIT_task4a_5 Guan2023 1.40 0.500 (0.455 - 0.556) 0.777 (0.745 - 0.822) 0.760 (0.731 - 0.789) 0.326 (0.310 - 0.342) 0.461 (0.434 - 0.491) 0.744 (0.711 - 0.793) 0.738 (0.709 - 0.771) 0.621 (0.590 - 0.651) 0.811 (0.801 - 0.823) 0.887 (0.859 - 0.913)
Guan_HIT_task4a_6 Guan_HIT_task4a_6 Guan2023 0.88 0.226 (0.194 - 0.266) 0.447 (0.410 - 0.478) 0.049 (0.035 - 0.063) 0.021 (0.015 - 0.025) 0.093 (0.070 - 0.114) 0.558 (0.520 - 0.607) 0.715 (0.683 - 0.747) 0.512 (0.477 - 0.550) 0.185 (0.167 - 0.205) 0.847 (0.819 - 0.874)
Wang_XiaoRice_task4a_1 SINGLE Wang2023 1.50 0.553 (0.497 - 0.622) 0.816 (0.788 - 0.845) 0.753 (0.723 - 0.778) 0.334 (0.310 - 0.355) 0.493 (0.467 - 0.523) 0.703 (0.621 - 0.802) 0.742 (0.695 - 0.781) 0.595 (0.566 - 0.627) 0.767 (0.751 - 0.784) 0.870 (0.843 - 0.896)
Wang_XiaoRice_task4a_2 SED Embed Wang2023 1.52 0.524 (0.476 - 0.580) 0.833 (0.808 - 0.859) 0.732 (0.699 - 0.759) 0.341 (0.323 - 0.360) 0.477 (0.452 - 0.504) 0.671 (0.637 - 0.709) 0.749 (0.723 - 0.779) 0.666 (0.632 - 0.700) 0.785 (0.772 - 0.798) 0.833 (0.800 - 0.863)
Wang_XiaoRice_task4a_3 L-TAG Wang2023 0.91 0.214 (0.185 - 0.246) 0.310 (0.272 - 0.348) 0.068 (0.050 - 0.087) 0.022 (0.014 - 0.027) 0.114 (0.086 - 0.137) 0.512 (0.473 - 0.556) 0.728 (0.691 - 0.767) 0.521 (0.485 - 0.561) 0.208 (0.191 - 0.226) 0.792 (0.761 - 0.822)
Zhang_IOA_task4_1 strong_ensemble Zhang2023 1.75 0.786 (0.747 - 0.831) 0.925 (0.908 - 0.940) 0.911 (0.897 - 0.926) 0.395 (0.377 - 0.413) 0.588 (0.556 - 0.626) 0.826 (0.802 - 0.856) 0.878 (0.860 - 0.905) 0.786 (0.762 - 0.814) 0.855 (0.845 - 0.869) 0.929 (0.912 - 0.947)
Zhang_IOA_task4_2 segment tagging model Zhang2023 0.95 0.220 (0.190 - 0.252) 0.320 (0.280 - 0.357) 0.055 (0.043 - 0.068) 0.016 (0.010 - 0.019) 0.109 (0.080 - 0.132) 0.511 (0.470 - 0.555) 0.744 (0.705 - 0.781) 0.537 (0.506 - 0.577) 0.105 (0.095 - 0.116) 0.800 (0.769 - 0.834)
Zhang_IOA_task4_3 strong_ensemble_all Zhang2023 1.71 0.782 (0.744 - 0.833) 0.919 (0.902 - 0.935) 0.888 (0.874 - 0.906) 0.425 (0.406 - 0.447) 0.575 (0.543 - 0.611) 0.818 (0.793 - 0.847) 0.878 (0.859 - 0.902) 0.701 (0.673 - 0.725) 0.817 (0.806 - 0.830) 0.921 (0.903 - 0.940)
Zhang_IOA_task4_4 strong_ensemble_1 Zhang2023 1.75 0.786 (0.747 - 0.831) 0.925 (0.908 - 0.940) 0.911 (0.897 - 0.926) 0.404 (0.384 - 0.423) 0.588 (0.556 - 0.626) 0.826 (0.802 - 0.856) 0.878 (0.860 - 0.905) 0.786 (0.762 - 0.814) 0.855 (0.845 - 0.869) 0.929 (0.912 - 0.947)
Zhang_IOA_task4_5 base system Zhang2023 1.52 0.718 (0.675 - 0.769) 0.887 (0.857 - 0.919) 0.861 (0.845 - 0.876) 0.344 (0.325 - 0.364) 0.400 (0.368 - 0.433) 0.795 (0.767 - 0.829) 0.834 (0.809 - 0.860) 0.647 (0.615 - 0.677) 0.779 (0.760 - 0.797) 0.899 (0.879 - 0.922)
Zhang_IOA_task4_6 strong_single Zhang2023 1.60 0.747 (0.713 - 0.796) 0.912 (0.893 - 0.934) 0.873 (0.856 - 0.893) 0.391 (0.374 - 0.410) 0.452 (0.419 - 0.489) 0.807 (0.784 - 0.839) 0.856 (0.836 - 0.878) 0.674 (0.642 - 0.702) 0.801 (0.788 - 0.814) 0.928 (0.912 - 0.947)
Zhang_IOA_task4_7 weak single Zhang2023 0.86 0.196 (0.166 - 0.226) 0.294 (0.261 - 0.327) 0.059 (0.041 - 0.074) 0.013 (0.007 - 0.017) 0.090 (0.064 - 0.112) 0.497 (0.459 - 0.536) 0.717 (0.688 - 0.744) 0.468 (0.434 - 0.505) 0.085 (0.075 - 0.096) 0.799 (0.765 - 0.829)
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.379 (0.344 - 0.427) 0.670 (0.634 - 0.698) 0.640 (0.610 - 0.670) 0.243 (0.228 - 0.260) 0.355 (0.332 - 0.394) 0.689 (0.660 - 0.719) 0.714 (0.690 - 0.739) 0.521 (0.494 - 0.552) 0.752 (0.741 - 0.767) 0.768 (0.744 - 0.800)
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 1.53 0.495 (0.455 - 0.547) 0.824 (0.800 - 0.853) 0.775 (0.746 - 0.799) 0.334 (0.321 - 0.347) 0.558 (0.533 - 0.585) 0.837 (0.812 - 0.859) 0.840 (0.817 - 0.862) 0.645 (0.618 - 0.671) 0.809 (0.795 - 0.824) 0.893 (0.870 - 0.915)
Wu_NCUT_task4a_3 Wu_NCUT_task4a_3 Wu2023 1.50 0.490 (0.449 - 0.544) 0.830 (0.808 - 0.856) 0.775 (0.746 - 0.800) 0.297 (0.283 - 0.312) 0.500 (0.478 - 0.534) 0.837 (0.813 - 0.859) 0.841 (0.819 - 0.863) 0.646 (0.619 - 0.671) 0.816 (0.804 - 0.831) 0.893 (0.870 - 0.915)
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 1.06 0.418 (0.373 - 0.475) 0.657 (0.621 - 0.692) 0.597 (0.548 - 0.636) 0.194 (0.168 - 0.224) 0.341 (0.316 - 0.370) 0.518 (0.451 - 0.612) 0.658 (0.619 - 0.696) 0.429 (0.404 - 0.460) 0.766 (0.749 - 0.781) 0.716 (0.680 - 0.752)
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 1.12 0.437 (0.390 - 0.487) 0.661 (0.625 - 0.697) 0.596 (0.546 - 0.637) 0.224 (0.199 - 0.255) 0.410 (0.379 - 0.438) 0.522 (0.459 - 0.606) 0.669 (0.628 - 0.707) 0.450 (0.412 - 0.482) 0.763 (0.745 - 0.779) 0.733 (0.697 - 0.768)
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.91 0.265 (0.210 - 0.318) 0.504 (0.467 - 0.547) 0.291 (0.194 - 0.361) 0.053 (0.041 - 0.064) 0.128 (0.099 - 0.154) 0.557 (0.457 - 0.650) 0.604 (0.525 - 0.663) 0.463 (0.423 - 0.500) 0.581 (0.549 - 0.612) 0.736 (0.705 - 0.768)
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.84 0.242 (0.198 - 0.293) 0.457 (0.426 - 0.495) 0.070 (0.053 - 0.088) 0.029 (0.020 - 0.037) 0.111 (0.083 - 0.132) 0.546 (0.465 - 0.619) 0.631 (0.556 - 0.687) 0.482 (0.442 - 0.520) 0.551 (0.465 - 0.610) 0.752 (0.721 - 0.785)
Barahona_AUDIAS_task4a_5 4-Resolution CRNN Barahona2023 1.14 0.420 (0.378 - 0.472) 0.711 (0.669 - 0.750) 0.650 (0.612 - 0.687) 0.210 (0.193 - 0.228) 0.358 (0.334 - 0.385) 0.616 (0.559 - 0.691) 0.675 (0.633 - 0.725) 0.465 (0.437 - 0.497) 0.782 (0.767 - 0.797) 0.771 (0.729 - 0.818)
Barahona_AUDIAS_task4a_6 4-Resolution CRNN with class-dependent median filtering Barahona2023 1.18 0.426 (0.393 - 0.466) 0.703 (0.664 - 0.744) 0.646 (0.607 - 0.683) 0.238 (0.217 - 0.259) 0.418 (0.391 - 0.444) 0.615 (0.556 - 0.693) 0.686 (0.647 - 0.733) 0.479 (0.451 - 0.507) 0.777 (0.762 - 0.794) 0.774 (0.735 - 0.821)
Barahona_AUDIAS_task4a_7 5-Resolution Conformer Barahona2023 1.06 0.275 (0.240 - 0.315) 0.579 (0.534 - 0.618) 0.489 (0.450 - 0.532) 0.135 (0.123 - 0.151) 0.189 (0.158 - 0.214) 0.571 (0.532 - 0.613) 0.659 (0.624 - 0.691) 0.490 (0.454 - 0.527) 0.742 (0.729 - 0.756) 0.760 (0.731 - 0.791)
Barahona_AUDIAS_task4a_8 5-Resolution Conformer with class-wise median filtering Barahona2023 1.00 0.263 (0.234 - 0.296) 0.576 (0.531 - 0.627) 0.281 (0.239 - 0.331) 0.061 (0.052 - 0.069) 0.137 (0.110 - 0.161) 0.567 (0.528 - 0.604) 0.679 (0.643 - 0.709) 0.503 (0.468 - 0.538) 0.662 (0.628 - 0.692) 0.774 (0.743 - 0.803)
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 1.12 0.370 (0.329 - 0.416) 0.705 (0.676 - 0.733) 0.614 (0.569 - 0.647) 0.224 (0.211 - 0.238) 0.311 (0.285 - 0.340) 0.658 (0.633 - 0.689) 0.622 (0.589 - 0.663) 0.500 (0.474 - 0.529) 0.757 (0.745 - 0.769) 0.757 (0.727 - 0.796)
Gan_NCUT_task4_2 Gan_NCUT_SED_system_2 Gan2023 1.52 0.565 (0.515 - 0.632) 0.855 (0.825 - 0.884) 0.783 (0.756 - 0.807) 0.306 (0.290 - 0.322) 0.498 (0.470 - 0.532) 0.836 (0.811 - 0.864) 0.788 (0.762 - 0.816) 0.624 (0.593 - 0.654) 0.827 (0.813 - 0.841) 0.881 (0.859 - 0.906)
Gan_NCUT_task4_3 Gan_NCUT_SED_system_3 Gan2023 1.50 0.507 (0.458 - 0.568) 0.842 (0.815 - 0.868) 0.782 (0.757 - 0.804) 0.291 (0.275 - 0.308) 0.457 (0.427 - 0.492) 0.826 (0.801 - 0.856) 0.776 (0.752 - 0.806) 0.633 (0.602 - 0.664) 0.817 (0.804 - 0.831) 0.880 (0.858 - 0.904)
Liu_SRCN_task4a_1 DCASE2023 t4a system1 Chen2023a 1.65 0.682 (0.631 - 0.747) 0.872 (0.847 - 0.893) 0.836 (0.815 - 0.859) 0.400 (0.384 - 0.413) 0.554 (0.523 - 0.583) 0.784 (0.757 - 0.816) 0.902 (0.880 - 0.923) 0.694 (0.660 - 0.723) 0.830 (0.819 - 0.842) 0.915 (0.896 - 0.936)
Liu_SRCN_task4a_2 DCASE2023 t4a system2 Chen2023a 1.40 0.576 (0.520 - 0.634) 0.851 (0.821 - 0.886) 0.686 (0.655 - 0.715) 0.124 (0.110 - 0.138) 0.262 (0.238 - 0.296) 0.823 (0.797 - 0.854) 0.905 (0.886 - 0.924) 0.687 (0.658 - 0.717) 0.562 (0.547 - 0.576) 0.916 (0.896 - 0.934)
Liu_SRCN_task4a_3 DCASE2023 t4a system3 Chen2023a 1.65 0.697 (0.639 - 0.757) 0.919 (0.900 - 0.942) 0.836 (0.814 - 0.856) 0.338 (0.323 - 0.354) 0.465 (0.440 - 0.498) 0.876 (0.856 - 0.900) 0.892 (0.874 - 0.916) 0.722 (0.688 - 0.751) 0.851 (0.841 - 0.863) 0.921 (0.902 - 0.941)
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 1.25 0.400 (0.367 - 0.446) 0.730 (0.695 - 0.772) 0.694 (0.669 - 0.718) 0.251 (0.239 - 0.267) 0.395 (0.366 - 0.422) 0.681 (0.650 - 0.720) 0.761 (0.732 - 0.788) 0.538 (0.517 - 0.565) 0.768 (0.755 - 0.784) 0.864 (0.837 - 0.888)
Liu_SRCN_task4a_5 DCASE2023 t4a system5 Chen2023a 0.94 0.230 (0.196 - 0.263) 0.322 (0.279 - 0.363) 0.057 (0.039 - 0.073) 0.015 (0.004 - 0.020) 0.118 (0.087 - 0.142) 0.531 (0.490 - 0.573) 0.747 (0.711 - 0.784) 0.549 (0.516 - 0.590) 0.275 (0.256 - 0.293) 0.804 (0.774 - 0.833)
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 1.35 0.504 (0.455 - 0.573) 0.806 (0.762 - 0.853) 0.719 (0.665 - 0.760) 0.298 (0.273 - 0.329) 0.431 (0.345 - 0.495) 0.719 (0.687 - 0.755) 0.790 (0.758 - 0.819) 0.572 (0.538 - 0.607) 0.716 (0.660 - 0.750) 0.842 (0.812 - 0.875)
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 1.68 0.740 (0.684 - 0.803) 0.868 (0.843 - 0.894) 0.823 (0.798 - 0.848) 0.395 (0.372 - 0.415) 0.608 (0.557 - 0.676) 0.835 (0.811 - 0.864) 0.811 (0.784 - 0.839) 0.665 (0.628 - 0.699) 0.814 (0.801 - 0.827) 0.882 (0.859 - 0.903)
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 1.66 0.723 (0.659 - 0.786) 0.862 (0.837 - 0.888) 0.814 (0.785 - 0.840) 0.388 (0.367 - 0.408) 0.583 (0.508 - 0.657) 0.840 (0.807 - 0.871) 0.829 (0.791 - 0.865) 0.652 (0.599 - 0.697) 0.816 (0.803 - 0.829) 0.883 (0.859 - 0.905)
Kim_GIST-HanwhaVision_task4a_4 FDYLKA BEATs pool 1d stage1 Kim2023 1.63 0.715 (0.646 - 0.788) 0.871 (0.839 - 0.898) 0.797 (0.755 - 0.834) 0.387 (0.340 - 0.423) 0.601 (0.566 - 0.640) 0.833 (0.800 - 0.867) 0.811 (0.765 - 0.844) 0.636 (0.599 - 0.669) 0.797 (0.784 - 0.812) 0.867 (0.842 - 0.896)
Kim_GIST-HanwhaVision_task4a_5 FDYLKA BEATs all ensemble 48 Kim2023 1.72 0.750 (0.709 - 0.796) 0.891 (0.869 - 0.915) 0.827 (0.804 - 0.849) 0.415 (0.395 - 0.434) 0.642 (0.611 - 0.679) 0.860 (0.837 - 0.886) 0.851 (0.831 - 0.873) 0.664 (0.626 - 0.705) 0.835 (0.824 - 0.850) 0.894 (0.872 - 0.917)
Kim_GIST-HanwhaVision_task4a_6 FDYLKA BEATs PSDS1 ensemble 16 Kim2023 1.72 0.755 (0.701 - 0.809) 0.884 (0.853 - 0.910) 0.835 (0.811 - 0.855) 0.419 (0.392 - 0.442) 0.631 (0.576 - 0.681) 0.855 (0.830 - 0.882) 0.845 (0.823 - 0.870) 0.668 (0.631 - 0.703) 0.830 (0.817 - 0.843) 0.894 (0.872 - 0.916)
Kim_GIST-HanwhaVision_task4a_7 FDYLKA BEATs PSDS2 ensemble 16 Kim2023 1.69 0.725 (0.680 - 0.774) 0.875 (0.850 - 0.900) 0.814 (0.790 - 0.835) 0.389 (0.371 - 0.407) 0.620 (0.580 - 0.661) 0.851 (0.824 - 0.882) 0.848 (0.822 - 0.875) 0.652 (0.613 - 0.692) 0.828 (0.814 - 0.844) 0.890 (0.859 - 0.917)
Kim_GIST-HanwhaVision_task4a_8 FDYLKA BEATs PSDS sum ensemble 16 Kim2023 1.72 0.752 (0.711 - 0.798) 0.887 (0.864 - 0.913) 0.829 (0.806 - 0.851) 0.416 (0.397 - 0.433) 0.641 (0.608 - 0.679) 0.860 (0.838 - 0.887) 0.849 (0.828 - 0.873) 0.672 (0.634 - 0.712) 0.834 (0.822 - 0.847) 0.892 (0.869 - 0.916)
Wenxin_TJU_task4a_1 ensemble-pretrained-psds1-0 Wenxin2023 1.63 0.616 (0.569 - 0.673) 0.871 (0.849 - 0.894) 0.773 (0.747 - 0.797) 0.385 (0.371 - 0.400) 0.561 (0.536 - 0.589) 0.752 (0.722 - 0.787) 0.862 (0.842 - 0.890) 0.669 (0.635 - 0.706) 0.802 (0.789 - 0.816) 0.872 (0.848 - 0.899)
Wenxin_TJU_task4a_2 ensemble-pretrained-psds1-1 Wenxin2023 1.66 0.662 (0.624 - 0.704) 0.865 (0.838 - 0.889) 0.780 (0.757 - 0.803) 0.402 (0.387 - 0.417) 0.564 (0.539 - 0.589) 0.748 (0.722 - 0.779) 0.869 (0.851 - 0.891) 0.700 (0.672 - 0.738) 0.784 (0.770 - 0.798) 0.879 (0.853 - 0.901)
Wenxin_TJU_task4a_3 ensemble-pretrained-psds2-0 Wenxin2023 0.88 0.241 (0.214 - 0.274) 0.325 (0.301 - 0.349) 0.064 (0.048 - 0.079) 0.024 (0.020 - 0.027) 0.111 (0.084 - 0.133) 0.506 (0.465 - 0.550) 0.723 (0.688 - 0.757) 0.500 (0.465 - 0.536) 0.140 (0.127 - 0.152) 0.782 (0.752 - 0.813)
Wenxin_TJU_task4a_4 ensemble-pretrained-psds2-1 Wenxin2023 0.90 0.227 (0.200 - 0.259) 0.317 (0.285 - 0.347) 0.069 (0.054 - 0.084) 0.027 (0.024 - 0.030) 0.112 (0.085 - 0.134) 0.511 (0.470 - 0.553) 0.735 (0.698 - 0.769) 0.523 (0.487 - 0.559) 0.146 (0.134 - 0.160) 0.791 (0.763 - 0.820)
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 1.58 0.606 (0.561 - 0.652) 0.857 (0.834 - 0.884) 0.767 (0.739 - 0.791) 0.371 (0.354 - 0.389) 0.547 (0.521 - 0.572) 0.713 (0.679 - 0.751) 0.837 (0.814 - 0.860) 0.642 (0.609 - 0.670) 0.788 (0.776 - 0.801) 0.842 (0.813 - 0.869)
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 1.61 0.660 (0.624 - 0.704) 0.852 (0.831 - 0.876) 0.757 (0.728 - 0.784) 0.373 (0.357 - 0.387) 0.545 (0.522 - 0.575) 0.693 (0.663 - 0.724) 0.874 (0.855 - 0.898) 0.682 (0.653 - 0.717) 0.748 (0.732 - 0.764) 0.873 (0.849 - 0.896)
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 1.31 0.442 (0.403 - 0.493) 0.745 (0.716 - 0.779) 0.738 (0.711 - 0.763) 0.301 (0.286 - 0.319) 0.425 (0.401 - 0.457) 0.642 (0.608 - 0.681) 0.710 (0.678 - 0.740) 0.538 (0.512 - 0.566) 0.785 (0.774 - 0.796) 0.821 (0.796 - 0.848)
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.75 0.210 (0.184 - 0.238) 0.266 (0.237 - 0.294) 0.046 (0.036 - 0.055) 0.031 (0.025 - 0.035) 0.105 (0.080 - 0.126) 0.491 (0.450 - 0.533) 0.677 (0.642 - 0.712) 0.425 (0.397 - 0.458) 0.084 (0.073 - 0.096) 0.723 (0.691 - 0.764)

PSDS scenario 2

Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | Alarm Bell Ringing | Blender | Cat | Dishes | Dog | Electric shaver/toothbrush | Frying | Running water | Speech | Vacuum cleaner
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 1.00 0.665 (0.613 - 0.727) 0.742 (0.705 - 0.780) 0.818 (0.784 - 0.850) 0.376 (0.338 - 0.410) 0.689 (0.644 - 0.739) 0.800 (0.773 - 0.829) 0.734 (0.646 - 0.803) 0.478 (0.430 - 0.545) 0.861 (0.824 - 0.887) 0.761 (0.709 - 0.801)
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 1.52 0.917 (0.887 - 0.959) 0.914 (0.889 - 0.935) 0.963 (0.954 - 0.974) 0.709 (0.680 - 0.737) 0.918 (0.900 - 0.938) 0.920 (0.901 - 0.941) 0.892 (0.866 - 0.920) 0.778 (0.751 - 0.805) 0.886 (0.871 - 0.902) 0.916 (0.896 - 0.936)
Li_USTC_task4a_1 TAFT and SdMT Li2023 1.54 0.917 (0.896 - 0.938) 0.860 (0.836 - 0.884) 0.954 (0.944 - 0.965) 0.706 (0.686 - 0.728) 0.880 (0.861 - 0.897) 0.922 (0.908 - 0.938) 0.887 (0.867 - 0.907) 0.706 (0.677 - 0.734) 0.870 (0.856 - 0.886) 0.930 (0.915 - 0.945)
Li_USTC_task4a_2 Pseudo labeling Li2023 1.58 0.886 (0.854 - 0.940) 0.867 (0.842 - 0.892) 0.964 (0.955 - 0.975) 0.733 (0.714 - 0.756) 0.890 (0.874 - 0.905) 0.942 (0.930 - 0.958) 0.886 (0.862 - 0.909) 0.723 (0.690 - 0.752) 0.880 (0.866 - 0.897) 0.923 (0.907 - 0.941)
Li_USTC_task4a_3 TAFT and AFL Li2023 1.54 0.919 (0.896 - 0.938) 0.859 (0.837 - 0.883) 0.958 (0.949 - 0.969) 0.699 (0.679 - 0.726) 0.888 (0.871 - 0.907) 0.956 (0.947 - 0.968) 0.867 (0.852 - 0.884) 0.675 (0.646 - 0.703) 0.853 (0.838 - 0.872) 0.958 (0.946 - 0.968)
Li_USTC_task4a_4 MaxFilter Li2023 0.89 0.943 (0.928 - 0.959) 0.931 (0.918 - 0.948) 0.967 (0.959 - 0.976) 0.815 (0.797 - 0.838) 0.960 (0.951 - 0.968) 0.969 (0.960 - 0.981) 0.937 (0.925 - 0.948) 0.782 (0.756 - 0.806) 0.956 (0.949 - 0.964) 0.957 (0.946 - 0.971)
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 1.52 0.917 (0.895 - 0.939) 0.838 (0.809 - 0.868) 0.950 (0.935 - 0.966) 0.699 (0.675 - 0.728) 0.880 (0.861 - 0.900) 0.909 (0.886 - 0.931) 0.871 (0.838 - 0.900) 0.704 (0.676 - 0.732) 0.867 (0.852 - 0.884) 0.920 (0.886 - 0.946)
Li_USTC_task4a_6 Pseudo labeling and single Li2023 1.56 0.881 (0.847 - 0.932) 0.869 (0.847 - 0.892) 0.965 (0.955 - 0.977) 0.723 (0.699 - 0.749) 0.887 (0.868 - 0.906) 0.942 (0.925 - 0.960) 0.878 (0.854 - 0.904) 0.747 (0.717 - 0.774) 0.877 (0.864 - 0.895) 0.924 (0.903 - 0.946)
Li_USTC_task4a_7 SKCRNN MT Li2023 1.20 0.681 (0.634 - 0.733) 0.776 (0.742 - 0.810) 0.858 (0.836 - 0.879) 0.514 (0.460 - 0.560) 0.741 (0.699 - 0.778) 0.825 (0.786 - 0.868) 0.694 (0.667 - 0.727) 0.646 (0.616 - 0.678) 0.860 (0.844 - 0.874) 0.842 (0.805 - 0.877)
Liu_NSYSU_task4_1 DCASE2023 FDY_WeakSED_Ensemble Liu2023 0.80 0.914 (0.893 - 0.938) 0.871 (0.847 - 0.894) 0.896 (0.881 - 0.912) 0.703 (0.680 - 0.730) 0.920 (0.906 - 0.941) 0.937 (0.923 - 0.951) 0.903 (0.886 - 0.917) 0.719 (0.693 - 0.749) 0.881 (0.864 - 0.898) 0.919 (0.904 - 0.935)
Liu_NSYSU_task4_2 FDY_Ensemble Liu2023 1.36 0.813 (0.779 - 0.854) 0.861 (0.834 - 0.886) 0.907 (0.892 - 0.925) 0.559 (0.529 - 0.591) 0.745 (0.716 - 0.782) 0.914 (0.901 - 0.928) 0.903 (0.886 - 0.916) 0.693 (0.668 - 0.721) 0.900 (0.890 - 0.912) 0.918 (0.903 - 0.935)
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 1.26 0.806 (0.768 - 0.840) 0.794 (0.765 - 0.820) 0.876 (0.860 - 0.898) 0.496 (0.470 - 0.523) 0.718 (0.687 - 0.755) 0.866 (0.843 - 0.895) 0.850 (0.825 - 0.873) 0.625 (0.585 - 0.663) 0.856 (0.839 - 0.873) 0.833 (0.803 - 0.857)
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 1.24 0.749 (0.684 - 0.820) 0.844 (0.807 - 0.887) 0.872 (0.855 - 0.891) 0.521 (0.478 - 0.567) 0.707 (0.674 - 0.749) 0.893 (0.874 - 0.914) 0.847 (0.810 - 0.894) 0.640 (0.566 - 0.692) 0.862 (0.834 - 0.894) 0.870 (0.846 - 0.895)
Liu_NSYSU_task4_5 DCASE2023 FDY_BEATs_WeakSED Liu2023 0.82 0.976 (0.967 - 0.985) 0.933 (0.919 - 0.948) 0.914 (0.903 - 0.928) 0.716 (0.691 - 0.740) 0.954 (0.943 - 0.966) 0.967 (0.959 - 0.975) 0.942 (0.931 - 0.952) 0.841 (0.815 - 0.869) 0.804 (0.778 - 0.830) 0.954 (0.944 - 0.967)
Liu_NSYSU_task4_6 DCASE2023 FDY_BEATs Liu2023 1.62 0.947 (0.927 - 0.973) 0.948 (0.937 - 0.959) 0.967 (0.959 - 0.976) 0.742 (0.722 - 0.763) 0.941 (0.931 - 0.957) 0.957 (0.947 - 0.968) 0.943 (0.932 - 0.954) 0.840 (0.817 - 0.866) 0.896 (0.883 - 0.911) 0.949 (0.938 - 0.962)
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 1.55 0.936 (0.913 - 0.975) 0.921 (0.891 - 0.946) 0.962 (0.952 - 0.973) 0.736 (0.702 - 0.764) 0.926 (0.915 - 0.943) 0.934 (0.917 - 0.954) 0.919 (0.905 - 0.933) 0.798 (0.757 - 0.843) 0.860 (0.841 - 0.879) 0.920 (0.894 - 0.942)
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 1.53 0.916 (0.880 - 0.970) 0.921 (0.891 - 0.946) 0.963 (0.954 - 0.974) 0.716 (0.686 - 0.759) 0.917 (0.894 - 0.938) 0.939 (0.924 - 0.957) 0.925 (0.913 - 0.937) 0.787 (0.759 - 0.819) 0.872 (0.848 - 0.895) 0.929 (0.894 - 0.956)
Lee_CAU_task4A_1 CAU_ET Lee2023 1.24 0.794 (0.758 - 0.843) 0.820 (0.793 - 0.846) 0.902 (0.882 - 0.923) 0.523 (0.495 - 0.554) 0.699 (0.669 - 0.731) 0.830 (0.796 - 0.865) 0.729 (0.697 - 0.761) 0.586 (0.551 - 0.622) 0.889 (0.878 - 0.901) 0.821 (0.793 - 0.853)
Lee_CAU_task4A_2 CAU_ET Lee2023 0.79 0.826 (0.765 - 0.908) 0.803 (0.771 - 0.838) 0.890 (0.872 - 0.910) 0.617 (0.586 - 0.648) 0.831 (0.804 - 0.860) 0.838 (0.801 - 0.874) 0.757 (0.720 - 0.795) 0.609 (0.571 - 0.648) 0.846 (0.812 - 0.872) 0.797 (0.746 - 0.847)
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 1.53 0.925 (0.899 - 0.969) 0.914 (0.897 - 0.929) 0.967 (0.957 - 0.977) 0.736 (0.709 - 0.767) 0.912 (0.892 - 0.934) 0.934 (0.918 - 0.956) 0.906 (0.888 - 0.924) 0.739 (0.707 - 0.772) 0.874 (0.857 - 0.893) 0.922 (0.905 - 0.940)
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 1.45 0.825 (0.791 - 0.869) 0.883 (0.862 - 0.907) 0.954 (0.933 - 0.968) 0.672 (0.646 - 0.701) 0.924 (0.908 - 0.940) 0.911 (0.886 - 0.932) 0.890 (0.873 - 0.909) 0.702 (0.668 - 0.737) 0.906 (0.891 - 0.920) 0.922 (0.908 - 0.940)
Chen_CHT_task4_1 VGGSK Chen2023b 1.25 0.695 (0.544 - 0.796) 0.789 (0.758 - 0.818) 0.865 (0.828 - 0.890) 0.520 (0.445 - 0.571) 0.697 (0.658 - 0.742) 0.832 (0.764 - 0.882) 0.824 (0.774 - 0.868) 0.556 (0.523 - 0.594) 0.842 (0.829 - 0.855) 0.863 (0.830 - 0.896)
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 1.58 0.921 (0.888 - 0.950) 0.906 (0.886 - 0.926) 0.965 (0.954 - 0.977) 0.697 (0.676 - 0.720) 0.912 (0.890 - 0.932) 0.922 (0.893 - 0.948) 0.897 (0.869 - 0.928) 0.706 (0.675 - 0.735) 0.909 (0.898 - 0.920) 0.933 (0.917 - 0.953)
Chen_CHT_task4_3 VGGSK+BEATs Chen2023b 1.66 0.943 (0.929 - 0.958) 0.918 (0.901 - 0.936) 0.965 (0.958 - 0.974) 0.706 (0.684 - 0.727) 0.915 (0.900 - 0.933) 0.925 (0.910 - 0.941) 0.928 (0.914 - 0.942) 0.789 (0.761 - 0.815) 0.912 (0.901 - 0.923) 0.950 (0.937 - 0.963)
Chen_CHT_task4_4 multi+BEATs Chen2023b 1.66 0.947 (0.933 - 0.962) 0.918 (0.902 - 0.936) 0.965 (0.958 - 0.974) 0.715 (0.693 - 0.737) 0.918 (0.904 - 0.934) 0.942 (0.931 - 0.954) 0.936 (0.924 - 0.948) 0.809 (0.781 - 0.834) 0.915 (0.903 - 0.926) 0.952 (0.939 - 0.965)
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Zhang2023 1.23 0.750 (0.712 - 0.794) 0.833 (0.808 - 0.864) 0.904 (0.888 - 0.923) 0.523 (0.495 - 0.552) 0.774 (0.750 - 0.804) 0.857 (0.834 - 0.883) 0.814 (0.789 - 0.835) 0.615 (0.586 - 0.647) 0.896 (0.885 - 0.909) 0.878 (0.861 - 0.898)
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 1.55 0.864 (0.832 - 0.914) 0.928 (0.914 - 0.941) 0.951 (0.940 - 0.963) 0.711 (0.692 - 0.733) 0.914 (0.900 - 0.934) 0.944 (0.932 - 0.958) 0.926 (0.914 - 0.945) 0.799 (0.776 - 0.818) 0.918 (0.908 - 0.928) 0.950 (0.938 - 0.966)
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.86 0.891 (0.862 - 0.943) 0.889 (0.870 - 0.908) 0.929 (0.918 - 0.942) 0.743 (0.720 - 0.763) 0.951 (0.940 - 0.966) 0.933 (0.918 - 0.950) 0.893 (0.875 - 0.914) 0.764 (0.740 - 0.793) 0.899 (0.886 - 0.915) 0.957 (0.947 - 0.968)
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 1.60 0.898 (0.870 - 0.948) 0.917 (0.902 - 0.937) 0.960 (0.951 - 0.970) 0.724 (0.701 - 0.755) 0.931 (0.920 - 0.948) 0.933 (0.922 - 0.947) 0.911 (0.893 - 0.930) 0.786 (0.759 - 0.811) 0.919 (0.910 - 0.927) 0.949 (0.936 - 0.962)
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_5_ensemble_model Xiao2023 1.61 0.900 (0.873 - 0.949) 0.927 (0.914 - 0.947) 0.963 (0.956 - 0.973) 0.721 (0.700 - 0.746) 0.937 (0.927 - 0.953) 0.952 (0.942 - 0.962) 0.935 (0.922 - 0.948) 0.806 (0.780 - 0.830) 0.918 (0.908 - 0.928) 0.960 (0.950 - 0.970)
Xiao_FMSG_task4a_6 Xiao_FMSG_task4a_6_ensemble_model Xiao2023 1.61 0.903 (0.876 - 0.953) 0.935 (0.923 - 0.951) 0.963 (0.954 - 0.973) 0.737 (0.716 - 0.762) 0.947 (0.938 - 0.960) 0.956 (0.948 - 0.967) 0.941 (0.930 - 0.952) 0.808 (0.781 - 0.831) 0.927 (0.916 - 0.936) 0.964 (0.955 - 0.973)
Xiao_FMSG_task4a_7 Xiao_FMSG_task4a_7_ensemble_model Xiao2023 0.87 0.903 (0.876 - 0.957) 0.894 (0.875 - 0.912) 0.932 (0.922 - 0.944) 0.745 (0.724 - 0.767) 0.953 (0.943 - 0.970) 0.932 (0.918 - 0.945) 0.892 (0.872 - 0.912) 0.774 (0.749 - 0.804) 0.896 (0.884 - 0.913) 0.954 (0.943 - 0.965)
Xiao_FMSG_task4a_8 Xiao_FMSG_task4a_8_ensemble_model Xiao2023 1.62 0.907 (0.881 - 0.955) 0.940 (0.930 - 0.955) 0.965 (0.957 - 0.974) 0.740 (0.719 - 0.765) 0.944 (0.934 - 0.957) 0.958 (0.949 - 0.969) 0.943 (0.932 - 0.955) 0.821 (0.796 - 0.843) 0.925 (0.914 - 0.935) 0.966 (0.957 - 0.977)
Guan_HIT_task4a_1 Guan_HIT_task4a_1 Guan2023 1.57 0.937 (0.922 - 0.957) 0.947 (0.937 - 0.957) 0.960 (0.952 - 0.971) 0.714 (0.689 - 0.739) 0.897 (0.882 - 0.913) 0.946 (0.933 - 0.958) 0.941 (0.928 - 0.954) 0.777 (0.753 - 0.805) 0.905 (0.892 - 0.917) 0.960 (0.948 - 0.972)
Guan_HIT_task4a_2 Guan_HIT_task4a_2 Guan2023 0.93 0.976 (0.969 - 0.983) 0.957 (0.947 - 0.968) 0.969 (0.960 - 0.979) 0.824 (0.803 - 0.842) 0.967 (0.960 - 0.976) 0.962 (0.953 - 0.972) 0.934 (0.922 - 0.949) 0.797 (0.772 - 0.825) 0.941 (0.931 - 0.957) 0.958 (0.947 - 0.972)
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 1.55 0.930 (0.905 - 0.953) 0.940 (0.926 - 0.955) 0.958 (0.948 - 0.970) 0.702 (0.678 - 0.730) 0.888 (0.869 - 0.909) 0.941 (0.926 - 0.955) 0.936 (0.922 - 0.951) 0.762 (0.731 - 0.794) 0.900 (0.888 - 0.913) 0.952 (0.938 - 0.968)
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.92 0.973 (0.963 - 0.982) 0.955 (0.943 - 0.968) 0.967 (0.957 - 0.978) 0.817 (0.798 - 0.835) 0.965 (0.954 - 0.976) 0.959 (0.948 - 0.970) 0.928 (0.914 - 0.944) 0.782 (0.746 - 0.822) 0.946 (0.936 - 0.960) 0.951 (0.936 - 0.966)
Guan_HIT_task4a_5 Guan_HIT_task4a_5 Guan2023 1.40 0.773 (0.735 - 0.818) 0.864 (0.842 - 0.887) 0.903 (0.886 - 0.921) 0.570 (0.545 - 0.600) 0.782 (0.756 - 0.811) 0.877 (0.857 - 0.898) 0.895 (0.877 - 0.911) 0.711 (0.678 - 0.746) 0.912 (0.903 - 0.921) 0.915 (0.896 - 0.933)
Guan_HIT_task4a_6 Guan_HIT_task4a_6 Guan2023 0.88 0.850 (0.815 - 0.890) 0.886 (0.865 - 0.908) 0.931 (0.914 - 0.948) 0.750 (0.728 - 0.774) 0.889 (0.872 - 0.905) 0.908 (0.890 - 0.927) 0.897 (0.879 - 0.913) 0.742 (0.713 - 0.776) 0.956 (0.949 - 0.965) 0.916 (0.897 - 0.936)
Wang_XiaoRice_task4a_1 SINGLE Wang2023 1.50 0.884 (0.847 - 0.924) 0.870 (0.848 - 0.896) 0.967 (0.958 - 0.977) 0.724 (0.698 - 0.751) 0.933 (0.921 - 0.948) 0.946 (0.926 - 0.968) 0.875 (0.849 - 0.903) 0.785 (0.760 - 0.811) 0.905 (0.895 - 0.918) 0.924 (0.905 - 0.946)
Wang_XiaoRice_task4a_2 SED Embed Wang2023 1.52 0.862 (0.824 - 0.907) 0.894 (0.875 - 0.916) 0.955 (0.946 - 0.966) 0.719 (0.697 - 0.745) 0.929 (0.918 - 0.942) 0.956 (0.944 - 0.970) 0.905 (0.885 - 0.927) 0.837 (0.816 - 0.862) 0.913 (0.897 - 0.925) 0.920 (0.904 - 0.939)
Wang_XiaoRice_task4a_3 L-TAG Wang2023 0.91 0.947 (0.921 - 0.968) 0.913 (0.897 - 0.932) 0.930 (0.917 - 0.943) 0.737 (0.714 - 0.761) 0.948 (0.937 - 0.965) 0.982 (0.976 - 0.988) 0.950 (0.938 - 0.964) 0.821 (0.795 - 0.845) 0.923 (0.907 - 0.935) 0.964 (0.954 - 0.975)
Zhang_IOA_task4_1 strong_ensemble Zhang2023 1.75 0.931 (0.914 - 0.952) 0.950 (0.940 - 0.964) 0.982 (0.977 - 0.988) 0.757 (0.738 - 0.778) 0.964 (0.957 - 0.972) 0.923 (0.913 - 0.937) 0.942 (0.930 - 0.956) 0.897 (0.880 - 0.922) 0.930 (0.922 - 0.941) 0.953 (0.940 - 0.967)
Zhang_IOA_task4_2 segment tagging model Zhang2023 0.95 0.992 (0.990 - 0.996) 0.988 (0.984 - 0.997) 0.973 (0.965 - 0.982) 0.835 (0.818 - 0.855) 0.967 (0.959 - 0.978) 0.997 (0.997 - 0.999) 0.964 (0.954 - 0.975) 0.897 (0.880 - 0.917) 0.962 (0.954 - 0.974) 0.981 (0.974 - 0.991)
Zhang_IOA_task4_3 strong_ensemble_all Zhang2023 1.71 0.931 (0.915 - 0.952) 0.947 (0.936 - 0.961) 0.979 (0.974 - 0.986) 0.736 (0.718 - 0.756) 0.959 (0.951 - 0.966) 0.930 (0.918 - 0.941) 0.940 (0.928 - 0.956) 0.799 (0.778 - 0.823) 0.907 (0.895 - 0.921) 0.952 (0.940 - 0.966)
Zhang_IOA_task4_4 strong_ensemble_1 Zhang2023 1.75 0.931 (0.914 - 0.952) 0.950 (0.940 - 0.964) 0.982 (0.977 - 0.988) 0.751 (0.733 - 0.775) 0.964 (0.957 - 0.972) 0.923 (0.913 - 0.937) 0.942 (0.930 - 0.956) 0.897 (0.880 - 0.922) 0.930 (0.922 - 0.941) 0.953 (0.940 - 0.967)
Zhang_IOA_task4_5 base system Zhang2023 1.52 0.867 (0.819 - 0.905) 0.923 (0.902 - 0.942) 0.960 (0.950 - 0.970) 0.672 (0.651 - 0.696) 0.876 (0.853 - 0.903) 0.895 (0.876 - 0.915) 0.905 (0.889 - 0.925) 0.738 (0.710 - 0.768) 0.898 (0.886 - 0.912) 0.929 (0.910 - 0.948)
Zhang_IOA_task4_6 strong_single Zhang2023 1.60 0.896 (0.873 - 0.920) 0.943 (0.931 - 0.957) 0.965 (0.957 - 0.976) 0.704 (0.685 - 0.727) 0.881 (0.866 - 0.900) 0.914 (0.901 - 0.930) 0.923 (0.911 - 0.939) 0.749 (0.722 - 0.775) 0.906 (0.896 - 0.919) 0.954 (0.941 - 0.967)
Zhang_IOA_task4_7 weak single Zhang2023 0.86 0.932 (0.920 - 0.948) 0.935 (0.922 - 0.950) 0.930 (0.916 - 0.943) 0.752 (0.726 - 0.777) 0.929 (0.916 - 0.944) 0.956 (0.947 - 0.967) 0.914 (0.900 - 0.931) 0.787 (0.758 - 0.811) 0.954 (0.947 - 0.961) 0.946 (0.934 - 0.959)
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.665 (0.618 - 0.705) 0.774 (0.744 - 0.803) 0.857 (0.836 - 0.886) 0.440 (0.414 - 0.466) 0.719 (0.690 - 0.758) 0.826 (0.800 - 0.856) 0.768 (0.741 - 0.792) 0.586 (0.557 - 0.622) 0.871 (0.856 - 0.885) 0.812 (0.789 - 0.840)
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 1.53 0.865 (0.838 - 0.895) 0.897 (0.878 - 0.917) 0.953 (0.942 - 0.964) 0.672 (0.653 - 0.695) 0.935 (0.926 - 0.949) 0.936 (0.923 - 0.950) 0.909 (0.894 - 0.924) 0.799 (0.772 - 0.824) 0.907 (0.896 - 0.918) 0.945 (0.931 - 0.959)
Wu_NCUT_task4a_3 Wu_NCUT_task4a_3 Wu2023 1.50 0.870 (0.846 - 0.898) 0.899 (0.879 - 0.918) 0.955 (0.945 - 0.966) 0.662 (0.643 - 0.685) 0.933 (0.924 - 0.946) 0.938 (0.926 - 0.952) 0.911 (0.897 - 0.928) 0.803 (0.777 - 0.829) 0.912 (0.903 - 0.923) 0.946 (0.933 - 0.961)
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 1.06 0.673 (0.626 - 0.730) 0.734 (0.699 - 0.770) 0.819 (0.769 - 0.855) 0.415 (0.364 - 0.461) 0.683 (0.611 - 0.741) 0.769 (0.737 - 0.804) 0.747 (0.692 - 0.811) 0.521 (0.492 - 0.559) 0.871 (0.859 - 0.884) 0.766 (0.731 - 0.802)
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 1.12 0.653 (0.597 - 0.711) 0.742 (0.707 - 0.780) 0.821 (0.771 - 0.857) 0.437 (0.389 - 0.480) 0.693 (0.624 - 0.747) 0.777 (0.727 - 0.817) 0.779 (0.728 - 0.835) 0.540 (0.503 - 0.582) 0.868 (0.855 - 0.881) 0.787 (0.755 - 0.820)
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.91 0.745 (0.684 - 0.805) 0.787 (0.746 - 0.823) 0.884 (0.864 - 0.906) 0.534 (0.482 - 0.574) 0.788 (0.748 - 0.822) 0.873 (0.852 - 0.897) 0.735 (0.648 - 0.790) 0.611 (0.573 - 0.652) 0.868 (0.843 - 0.899) 0.858 (0.833 - 0.883)
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.84 0.776 (0.708 - 0.847) 0.828 (0.783 - 0.871) 0.882 (0.855 - 0.915) 0.549 (0.492 - 0.619) 0.842 (0.816 - 0.873) 0.888 (0.864 - 0.909) 0.768 (0.682 - 0.827) 0.638 (0.599 - 0.677) 0.853 (0.838 - 0.872) 0.875 (0.853 - 0.897)
Barahona_AUDIAS_task4a_5 4-Resolution CRNN Barahona2023 1.14 0.714 (0.666 - 0.773) 0.804 (0.774 - 0.835) 0.852 (0.823 - 0.883) 0.467 (0.432 - 0.508) 0.718 (0.683 - 0.760) 0.855 (0.831 - 0.882) 0.792 (0.754 - 0.827) 0.533 (0.502 - 0.567) 0.888 (0.875 - 0.899) 0.822 (0.786 - 0.856)
Barahona_AUDIAS_task4a_6 4-Resolution CRNN with class-dependent median filtering Barahona2023 1.18 0.696 (0.645 - 0.754) 0.801 (0.770 - 0.834) 0.852 (0.823 - 0.882) 0.486 (0.457 - 0.523) 0.719 (0.682 - 0.762) 0.876 (0.855 - 0.900) 0.815 (0.780 - 0.848) 0.545 (0.512 - 0.582) 0.886 (0.875 - 0.897) 0.828 (0.796 - 0.862)
Barahona_AUDIAS_task4a_7 5-Resolution Conformer Barahona2023 1.06 0.760 (0.717 - 0.811) 0.795 (0.768 - 0.821) 0.891 (0.875 - 0.910) 0.577 (0.546 - 0.609) 0.796 (0.767 - 0.831) 0.895 (0.879 - 0.911) 0.817 (0.793 - 0.841) 0.654 (0.619 - 0.689) 0.887 (0.874 - 0.901) 0.876 (0.854 - 0.900)
Barahona_AUDIAS_task4a_8 5-Resolution Conformer with class-wise median filtering Barahona2023 1.00 0.816 (0.779 - 0.860) 0.838 (0.812 - 0.867) 0.906 (0.887 - 0.926) 0.628 (0.586 - 0.687) 0.859 (0.834 - 0.884) 0.922 (0.908 - 0.936) 0.848 (0.825 - 0.871) 0.683 (0.653 - 0.718) 0.882 (0.870 - 0.894) 0.896 (0.875 - 0.918)
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 1.12 0.670 (0.622 - 0.728) 0.795 (0.767 - 0.822) 0.842 (0.819 - 0.867) 0.501 (0.476 - 0.527) 0.661 (0.630 - 0.703) 0.825 (0.800 - 0.850) 0.737 (0.708 - 0.766) 0.569 (0.541 - 0.603) 0.866 (0.854 - 0.878) 0.803 (0.777 - 0.832)
Gan_NCUT_task4_2 Gan_NCUT_SED_system_2 Gan2023 1.52 0.912 (0.887 - 0.942) 0.906 (0.889 - 0.925) 0.959 (0.949 - 0.969) 0.676 (0.647 - 0.706) 0.937 (0.926 - 0.951) 0.932 (0.919 - 0.947) 0.918 (0.903 - 0.932) 0.785 (0.758 - 0.813) 0.922 (0.911 - 0.932) 0.948 (0.936 - 0.962)
Gan_NCUT_task4_3 Gan_NCUT_SED_system_3 Gan2023 1.50 0.898 (0.871 - 0.931) 0.889 (0.869 - 0.913) 0.963 (0.954 - 0.975) 0.742 (0.717 - 0.767) 0.943 (0.932 - 0.955) 0.943 (0.930 - 0.958) 0.891 (0.875 - 0.908) 0.794 (0.768 - 0.820) 0.926 (0.916 - 0.937) 0.960 (0.950 - 0.972)
Liu_SRCN_task4a_1 DCASE2023 t4a system1 Chen2023a 1.65 0.883 (0.851 - 0.935) 0.906 (0.887 - 0.928) 0.975 (0.969 - 0.985) 0.781 (0.759 - 0.806) 0.950 (0.940 - 0.967) 0.807 (0.778 - 0.839) 0.935 (0.923 - 0.946) 0.818 (0.796 - 0.841) 0.934 (0.925 - 0.943) 0.960 (0.948 - 0.973)
Liu_SRCN_task4a_2 DCASE2023 t4a system2 Chen2023a 1.40 0.967 (0.959 - 0.978) 0.950 (0.941 - 0.963) 0.978 (0.971 - 0.987) 0.816 (0.796 - 0.835) 0.966 (0.960 - 0.974) 0.965 (0.957 - 0.976) 0.956 (0.947 - 0.965) 0.854 (0.830 - 0.875) 0.921 (0.912 - 0.929) 0.967 (0.958 - 0.978)
Liu_SRCN_task4a_3 DCASE2023 t4a system3 Chen2023a 1.65 0.962 (0.952 - 0.974) 0.956 (0.948 - 0.965) 0.979 (0.973 - 0.987) 0.784 (0.763 - 0.807) 0.940 (0.930 - 0.953) 0.957 (0.946 - 0.970) 0.951 (0.942 - 0.960) 0.845 (0.820 - 0.869) 0.931 (0.922 - 0.940) 0.965 (0.955 - 0.977)
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 1.25 0.730 (0.684 - 0.773) 0.808 (0.775 - 0.844) 0.885 (0.865 - 0.902) 0.561 (0.539 - 0.587) 0.785 (0.762 - 0.809) 0.825 (0.799 - 0.852) 0.870 (0.847 - 0.892) 0.593 (0.566 - 0.625) 0.890 (0.878 - 0.901) 0.878 (0.855 - 0.901)
Liu_SRCN_task4a_5 DCASE2023 t4a system5 Chen2023a 0.94 0.972 (0.965 - 0.980) 0.926 (0.912 - 0.942) 0.936 (0.926 - 0.948) 0.763 (0.742 - 0.785) 0.965 (0.957 - 0.975) 0.961 (0.952 - 0.974) 0.951 (0.941 - 0.959) 0.847 (0.820 - 0.866) 0.952 (0.946 - 0.957) 0.966 (0.957 - 0.977)
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 1.35 0.803 (0.744 - 0.845) 0.867 (0.818 - 0.908) 0.931 (0.918 - 0.945) 0.586 (0.549 - 0.624) 0.748 (0.695 - 0.804) 0.908 (0.885 - 0.937) 0.859 (0.840 - 0.879) 0.709 (0.674 - 0.746) 0.825 (0.760 - 0.866) 0.892 (0.847 - 0.933)
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 1.68 0.949 (0.925 - 0.973) 0.930 (0.908 - 0.950) 0.973 (0.965 - 0.984) 0.732 (0.712 - 0.754) 0.929 (0.911 - 0.948) 0.948 (0.935 - 0.960) 0.934 (0.922 - 0.945) 0.831 (0.805 - 0.859) 0.901 (0.887 - 0.914) 0.953 (0.941 - 0.965)
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 1.66 0.952 (0.933 - 0.970) 0.931 (0.907 - 0.950) 0.974 (0.967 - 0.983) 0.741 (0.718 - 0.766) 0.929 (0.914 - 0.947) 0.946 (0.935 - 0.960) 0.938 (0.926 - 0.949) 0.831 (0.803 - 0.859) 0.907 (0.895 - 0.919) 0.949 (0.936 - 0.963)
Kim_GIST-HanwhaVision_task4a_4 FDYLKA BEATs pool 1d stage1 Kim2023 1.63 0.946 (0.920 - 0.970) 0.910 (0.887 - 0.930) 0.970 (0.960 - 0.981) 0.721 (0.693 - 0.747) 0.908 (0.872 - 0.937) 0.955 (0.939 - 0.969) 0.911 (0.888 - 0.928) 0.804 (0.772 - 0.834) 0.887 (0.869 - 0.905) 0.911 (0.879 - 0.947)
Kim_GIST-HanwhaVision_task4a_5 FDYLKA BEATs all ensemble 48 Kim2023 1.72 0.961 (0.947 - 0.978) 0.935 (0.923 - 0.948) 0.974 (0.966 - 0.983) 0.755 (0.737 - 0.776) 0.937 (0.925 - 0.952) 0.961 (0.951 - 0.972) 0.943 (0.930 - 0.955) 0.841 (0.817 - 0.865) 0.912 (0.900 - 0.923) 0.951 (0.939 - 0.965)
Kim_GIST-HanwhaVision_task4a_6 FDYLKA BEATs PSDS1 ensemble 16 Kim2023 1.72 0.956 (0.941 - 0.974) 0.930 (0.917 - 0.945) 0.974 (0.967 - 0.983) 0.751 (0.731 - 0.772) 0.935 (0.924 - 0.950) 0.957 (0.937 - 0.973) 0.946 (0.934 - 0.956) 0.835 (0.808 - 0.861) 0.910 (0.898 - 0.922) 0.949 (0.937 - 0.962)
Kim_GIST-HanwhaVision_task4a_7 FDYLKA BEATs PSDS2 ensemble 16 Kim2023 1.69 0.963 (0.949 - 0.979) 0.931 (0.916 - 0.946) 0.973 (0.966 - 0.982) 0.755 (0.734 - 0.777) 0.940 (0.928 - 0.953) 0.959 (0.949 - 0.970) 0.938 (0.926 - 0.949) 0.840 (0.815 - 0.863) 0.910 (0.896 - 0.922) 0.951 (0.933 - 0.965)
Kim_GIST-HanwhaVision_task4a_8 FDYLKA BEATs PSDS sum ensemble 16 Kim2023 1.72 0.959 (0.940 - 0.976) 0.925 (0.910 - 0.940) 0.974 (0.966 - 0.983) 0.753 (0.736 - 0.773) 0.933 (0.920 - 0.951) 0.955 (0.943 - 0.968) 0.945 (0.933 - 0.957) 0.833 (0.803 - 0.861) 0.914 (0.902 - 0.925) 0.947 (0.932 - 0.962)
Wenxin_TJU_task4a_1 ensemble-pretrained-psds1-0 Wenxin2023 1.63 0.931 (0.917 - 0.945) 0.939 (0.930 - 0.952) 0.973 (0.966 - 0.982) 0.742 (0.722 - 0.766) 0.896 (0.879 - 0.915) 0.969 (0.962 - 0.977) 0.955 (0.946 - 0.964) 0.868 (0.851 - 0.885) 0.892 (0.878 - 0.907) 0.949 (0.936 - 0.963)
Wenxin_TJU_task4a_2 ensemble-pretrained-psds1-1 Wenxin2023 1.66 0.937 (0.923 - 0.950) 0.941 (0.932 - 0.952) 0.966 (0.958 - 0.977) 0.753 (0.733 - 0.775) 0.906 (0.891 - 0.923) 0.972 (0.966 - 0.982) 0.958 (0.949 - 0.968) 0.891 (0.871 - 0.909) 0.886 (0.873 - 0.900) 0.944 (0.932 - 0.958)
Wenxin_TJU_task4a_3 ensemble-pretrained-psds2-0 Wenxin2023 0.88 0.902 (0.873 - 0.956) 0.894 (0.878 - 0.916) 0.917 (0.904 - 0.932) 0.730 (0.704 - 0.754) 0.939 (0.927 - 0.956) 0.965 (0.958 - 0.974) 0.926 (0.914 - 0.939) 0.785 (0.758 - 0.811) 0.927 (0.917 - 0.936) 0.929 (0.913 - 0.946)
Wenxin_TJU_task4a_4 ensemble-pretrained-psds2-1 Wenxin2023 0.90 0.946 (0.930 - 0.966) 0.900 (0.881 - 0.922) 0.925 (0.913 - 0.939) 0.739 (0.717 - 0.764) 0.959 (0.951 - 0.969) 0.982 (0.977 - 0.986) 0.950 (0.942 - 0.960) 0.832 (0.807 - 0.852) 0.936 (0.929 - 0.946) 0.946 (0.933 - 0.959)
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 1.58 0.905 (0.887 - 0.927) 0.914 (0.899 - 0.931) 0.961 (0.952 - 0.972) 0.705 (0.679 - 0.741) 0.885 (0.867 - 0.904) 0.963 (0.955 - 0.971) 0.933 (0.922 - 0.945) 0.845 (0.823 - 0.863) 0.897 (0.886 - 0.910) 0.933 (0.920 - 0.948)
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 1.61 0.931 (0.917 - 0.944) 0.924 (0.908 - 0.939) 0.964 (0.957 - 0.973) 0.734 (0.714 - 0.760) 0.902 (0.889 - 0.919) 0.965 (0.957 - 0.976) 0.955 (0.945 - 0.963) 0.882 (0.865 - 0.901) 0.868 (0.854 - 0.885) 0.940 (0.926 - 0.953)
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 1.31 0.702 (0.659 - 0.750) 0.866 (0.843 - 0.890) 0.918 (0.904 - 0.936) 0.571 (0.543 - 0.602) 0.755 (0.724 - 0.786) 0.864 (0.842 - 0.888) 0.868 (0.848 - 0.890) 0.700 (0.673 - 0.726) 0.887 (0.877 - 0.896) 0.869 (0.848 - 0.891)
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.75 0.779 (0.735 - 0.838) 0.841 (0.818 - 0.862) 0.892 (0.877 - 0.913) 0.665 (0.635 - 0.695) 0.893 (0.877 - 0.914) 0.905 (0.891 - 0.920) 0.846 (0.825 - 0.864) 0.664 (0.636 - 0.695) 0.809 (0.792 - 0.832) 0.828 (0.805 - 0.856)

Energy Consumption
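The EW-PSDS columns report energy-weighted PSDS values. Judging from the baseline rows, where the energy-weighted score equals the raw PSDS, each system's PSDS appears to be rescaled by the ratio of the baseline's normalized energy consumption to the system's. The sketch below reproduces one entry under that reading; the constant names and the helper function are illustrative assumptions, not taken from the evaluation toolkit.

```python
# Minimal sketch of how the EW-PSDS columns appear to be derived.
# Inferred from the baseline rows (where EW-PSDS equals the raw PSDS);
# this is an assumption, not the official evaluation code.

BASELINE_TRAIN_KWH = 1.390  # Baseline_task4a_1, normalized training energy
BASELINE_TEST_KWH = 0.019   # Baseline_task4a_1, normalized test energy


def energy_weighted_psds(psds: float, system_kwh: float, baseline_kwh: float) -> float:
    """Scale a PSDS value by the baseline-to-system ratio of normalized energy (kWh)."""
    return psds * (baseline_kwh / system_kwh)


# Example: Baseline (Audioset+BEATs), training-energy weighting.
# 0.510 * (1.390 / 1.821) ~= 0.389, matching its EW-PSDS 1 (training energy) entry below.
print(round(energy_weighted_psds(0.510, 1.821, BASELINE_TRAIN_KWH), 3))
```

Under this reading, systems that train with less energy than the baseline (e.g. Liu_NSYSU_task4_3 at 0.192 kWh) can obtain an EW-PSDS above their raw PSDS.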

Rank | Submission code | Submission name | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | Energy (kWh) (training, normalized) | Energy (kWh) (test, normalized) | EW-PSDS 1 (training energy) | EW-PSDS 2 (training energy) | EW-PSDS 1 (test energy) | EW-PSDS 2 (test energy)
Baseline_task4a_1 DCASE2023 baseline system Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) 1.390 0.019 0.327 0.538 0.327 0.538
Baseline_task4a_2 DCASE2023 baseline system (Audioset+Beats) Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) 1.821 0.020 0.389 0.609 0.484 0.758
Li_USTC_task4a_1 TAFT and SdMT Li2023 1.54 0.539 (0.527 - 0.551) 0.769 (0.758 - 0.778) 10.045 0.026 0.075 0.106 0.394 0.562
Li_USTC_task4a_2 Pseudo labeling Li2023 1.58 0.556 (0.544 - 0.569) 0.781 (0.769 - 0.795) 6.496 0.019 0.119 0.167 0.556 0.781
Li_USTC_task4a_3 TAFT and AFL Li2023 1.54 0.546 (0.535 - 0.558) 0.756 (0.745 - 0.769) 6.496 0.019 0.117 0.162 0.546 0.756
Li_USTC_task4a_4 MaxFilter Li2023 0.89 0.061 (0.050 - 0.070) 0.852 (0.843 - 0.863) 6.496 0.019 0.013 0.182 0.061 0.852
Li_USTC_task4a_5 TAFT and SdMT and single Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) 3.347 0.010 0.221 0.316 1.009 1.447
Li_USTC_task4a_6 Pseudo labeling and single Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) 3.347 0.010 0.227 0.325 1.037 1.488
Li_USTC_task4a_7 SKCRNN MT Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) 5.472 0.003 0.103 0.160 2.559 3.990
Liu_NSYSU_task4_1 DCASE2023 FDY_WeakSED_Ensemble Liu2023 0.80 0.051 (0.042 - 0.060) 0.779 (0.767 - 0.791) 12.405 6.240 0.006 0.087 0.000 0.002
Liu_NSYSU_task4_2 FDY_Ensemble Liu2023 1.36 0.466 (0.455 - 0.480) 0.701 (0.688 - 0.714) 6.453 1.957 0.100 0.151 0.005 0.007
Liu_NSYSU_task4_3 DCASE2023 VGGSK_Single Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) 0.192 0.016 3.141 4.675 0.515 0.767
Liu_NSYSU_task4_4 DCASE2023 FDY_Single Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) 1.077 0.325 0.533 0.846 0.024 0.038
Liu_NSYSU_task4_5 DCASE2023 FDY_BEATs_WeakSED Liu2023 0.82 0.045 (0.035 - 0.053) 0.806 (0.794 - 0.818) 4.608 1.399 0.013 0.243 0.001 0.011
Liu_NSYSU_task4_6 DCASE2023 FDY_BEATs Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848) 4.608 1.399 0.167 0.253 0.007 0.011
Liu_NSYSU_task4_7 DCASE2023 FDY_BEATs Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) 0.923 0.279 0.784 1.225 0.035 0.055
Liu_NSYSU_task4_8 DCASE2023 FDY_BEATs Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) 0.923 0.279 0.775 1.212 0.035 0.055
Lee_CAU_task4A_1 CAU_ET Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) 2.686 0.016 0.220 0.328 0.505 0.753
Lee_CAU_task4A_2 CAU_ET Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) 3.011 0.012 0.048 0.311 0.164 1.068
Cheimariotis_DUTH_task4a_1 DuthApida Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) 1.666 0.033 0.430 0.664 0.297 0.458
Cheimariotis_DUTH_task4a_2 DuthApida Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) 1.964 0.375 0.345 0.537 0.025 0.038
Chen_CHT_task4_1 VGGSK Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) 0.655 0.005 0.935 1.315 1.675 2.355
Chen_CHT_task4_2 VGGSK+BEATs Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) 1.354 0.005 0.578 0.800 2.139 2.961
Chen_CHT_task4_3 VGGSK+BEATs Chen2023b 1.66 0.596 (0.585 - 0.606) 0.810 (0.800 - 0.822) 1.354 0.005 0.612 0.832 2.267 3.080
Chen_CHT_task4_4 multi+BEATs Chen2023b 1.66 0.590 (0.578 - 0.601) 0.820 (0.810 - 0.831) 1.354 0.005 0.606 0.842 2.243 3.118
Xiao_FMSG_task4a_1 Xiao_FMSG_task4a_1_single_model_without_external Zhang2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) 0.800 0.010 0.701 1.146 0.766 1.253
Xiao_FMSG_task4a_2 Xiao_FMSG_task4a_2_single_model Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) 0.971 0.007 0.752 1.156 1.425 2.193
Xiao_FMSG_task4a_3 Xiao_FMSG_task4a_3_single_model_psds2 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) 0.853 0.007 0.116 1.314 0.193 2.189
Xiao_FMSG_task4a_4 Xiao_FMSG_task4a_4_single_model Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) 0.811 0.006 0.945 1.394 1.746 2.575
Xiao_FMSG_task4a_5 Xiao_FMSG_task4a_5_ensemble_model Xiao2023 1.61 0.555 (0.545 - 0.567) 0.821 (0.811 - 0.834) 4.587 0.032 0.168 0.249 0.330 0.488
Xiao_FMSG_task4a_6 Xiao_FMSG_task4a_6_ensemble_model Xiao2023 1.61 0.551 (0.541 - 0.561) 0.829 (0.819 - 0.842) 9.707 0.075 0.079 0.119 0.140 0.210
Xiao_FMSG_task4a_7 Xiao_FMSG_task4a_7_ensemble_model Xiao2023 0.87 0.075 (0.066 - 0.084) 0.811 (0.800 - 0.822) 3.426 0.075 0.030 0.329 0.019 0.206
Xiao_FMSG_task4a_8 Xiao_FMSG_task4a_8_ensemble_model Xiao2023 1.62 0.549 (0.540 - 0.560) 0.834 (0.824 - 0.847) 10.240 0.075 0.075 0.113 0.139 0.211
Guan_HIT_task4a_1 Guan_HIT_task4a_1 Guan2023 1.57 0.536 (0.526 - 0.546) 0.810 (0.800 - 0.822) 6.673 0.038 0.112 0.169 0.268 0.405
Guan_HIT_task4a_2 Guan_HIT_task4a_2 Guan2023 0.93 0.082 (0.074 - 0.090) 0.862 (0.852 - 0.872) 7.313 0.038 0.016 0.164 0.041 0.431
Guan_HIT_task4a_3 Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) 8.806 0.049 0.083 0.126 0.204 0.310
Guan_HIT_task4a_4 Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) 8.806 0.049 0.013 0.135 0.032 0.332
Guan_HIT_task4a_5 Guan_HIT_task4a_5 Guan2023 1.40 0.488 (0.475 - 0.503) 0.708 (0.696 - 0.720) 8.981 0.049 0.076 0.110 0.189 0.274
Guan_HIT_task4a_6 Guan_HIT_task4a_6 Guan2023 0.88 0.088 (0.080 - 0.096) 0.797 (0.787 - 0.810) 8.981 0.049 0.014 0.123 0.034 0.309
Wang_XiaoRice_task4a_1 SINGLE Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) 0.358 0.056 1.918 3.110 0.168 0.272
Wang_XiaoRice_task4a_2 SED Embed Wang2023 1.52 0.497 (0.486 - 0.510) 0.814 (0.803 - 0.828) 1.882 0.056 0.367 0.601 0.168 0.276
Wang_XiaoRice_task4a_3 L-TAG Wang2023 0.91 0.088 (0.076 - 0.098) 0.835 (0.824 - 0.844) 1.882 0.056 0.065 0.617 0.030 0.283
Zhang_IOA_task4_1 strong_ensemble Zhang2023 1.75 0.622 (0.613 - 0.634) 0.857 (0.849 - 0.866) 69.120 1.024 0.013 0.017 0.012 0.016
Zhang_IOA_task4_2 segment tagging model Zhang2023 0.95 0.070 (0.060 - 0.080) 0.903 (0.895 - 0.911) 35.840 0.640 0.003 0.035 0.002 0.027
Zhang_IOA_task4_3 strong_ensemble_all Zhang2023 1.71 0.613 (0.603 - 0.625) 0.828 (0.821 - 0.839) 128.000 1.280 0.007 0.009 0.009 0.012
Zhang_IOA_task4_4 strong_ensemble_1 Zhang2023 1.75 0.625 (0.615 - 0.637) 0.855 (0.847 - 0.864) 69.120 1.024 0.013 0.017 0.012 0.016
Zhang_IOA_task4_5 base system Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) 15.360 0.640 0.047 0.070 0.016 0.023
Zhang_IOA_task4_6 strong_single Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) 20.480 0.640 0.038 0.054 0.017 0.024
Zhang_IOA_task4_7 weak single Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) 23.040 0.640 0.003 0.050 0.002 0.025
Wu_NCUT_task4a_1 Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) 1.002 0.002 0.543 0.826 3.715 5.659
Wu_NCUT_task4a_2 Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806) 3.238 0.010 0.223 0.341 0.985 1.507
Wu_NCUT_task4a_3 Wu_NCUT_task4a_3 Wu2023 1.50 0.497 (0.486 - 0.509) 0.793 (0.783 - 0.806) 4.317 0.014 0.160 0.255 0.675 1.076
Barahona_AUDIAS_task4a_1 CRNN T++ resolution Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) 10.007 0.095 0.049 0.078 0.070 0.112
Barahona_AUDIAS_task4a_2 CRNN T++ resolution with class-wise median filtering Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) 10.007 0.095 0.053 0.080 0.076 0.115
Barahona_AUDIAS_task4a_3 Conformer F+ resolution Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) 19.345 0.095 0.014 0.046 0.040 0.129
Barahona_AUDIAS_task4a_4 Conformer F+ resolution with class-wise median filtering Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) 19.345 0.095 0.010 0.048 0.028 0.135
Barahona_AUDIAS_task4a_5 4-Resolution CRNN Barahona2023 1.14 0.378 (0.365 - 0.392) 0.604 (0.590 - 0.622) 37.104 0.365 0.014 0.023 0.020 0.031
Barahona_AUDIAS_task4a_6 4-Resolution CRNN with class-dependent median filtering Barahona2023 1.18 0.401 (0.390 - 0.414) 0.612 (0.596 - 0.630) 37.104 0.365 0.015 0.023 0.021 0.032
Barahona_AUDIAS_task4a_7 5-Resolution Conformer Barahona2023 1.06 0.274 (0.262 - 0.287) 0.684 (0.671 - 0.699) 124.323 0.400 0.003 0.008 0.013 0.033
Barahona_AUDIAS_task4a_8 5-Resolution Conformer with class-wise median filtering Barahona2023 1.00 0.213 (0.201 - 0.226) 0.729 (0.710 - 0.752) 124.323 0.400 0.002 0.008 0.010 0.035
Gan_NCUT_task4_1 Gan_NCUT_SED_system_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) 1.175 0.006 0.432 0.714 1.156 1.911
Gan_NCUT_task4_2 Gan_NCUT_SED_system_2 Gan2023 1.52 0.511 (0.498 - 0.524) 0.799 (0.785 - 0.813) 7.477 0.039 0.095 0.149 0.249 0.389
Gan_NCUT_task4_3 Gan_NCUT_SED_system_3 Gan2023 1.50 0.483 (0.467 - 0.498) 0.816 (0.805 - 0.828) 8.971 0.075 0.075 0.126 0.122 0.207
Liu_SRCN_task4a_1 DCASE2023 t4a system1 Chen2023a 1.65 0.585 (0.572 - 0.598) 0.817 (0.804 - 0.834) 19.611 0.087 0.041 0.058 0.128 0.178
Liu_SRCN_task4a_2 DCASE2023 t4a system2 Chen2023a 1.40 0.380 (0.369 - 0.392) 0.877 (0.867 - 0.885) 18.171 0.064 0.029 0.067 0.113 0.260
Liu_SRCN_task4a_3 DCASE2023 t4a system3 Chen2023a 1.65 0.556 (0.544 - 0.569) 0.861 (0.852 - 0.870) 20.457 0.091 0.038 0.059 0.116 0.180
Liu_SRCN_task4a_4 DCASE2023 t4a system4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) 4.425 0.014 0.129 0.208 0.559 0.899
Liu_SRCN_task4a_5 DCASE2023 t4a system5 Chen2023a 0.94 0.098 (0.086 - 0.108) 0.851 (0.841 - 0.860) 18.171 0.064 0.007 0.065 0.029 0.253
Kim_GIST-HanwhaVision_task4a_1 DCASE2023 FDY-LKA CRNN without external single Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) 3.241 0.021 0.197 0.301 0.415 0.635
Kim_GIST-HanwhaVision_task4a_2 FDYLKA BEATs pool1d Stage2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) 3.912 0.028 0.210 0.295 0.401 0.564
Kim_GIST-HanwhaVision_task4a_3 LKAFDY BEATs Stage 2 interpolate Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) 3.432 0.023 0.235 0.338 0.480 0.690
Kim_GIST-HanwhaVision_task4a_4 FDYLKA BEATs pool 1d stage1 Kim2023 1.63 0.576 (0.549 - 0.595) 0.809 (0.797 - 0.821) 2.783 0.028 0.288 0.404 0.391 0.549
Kim_GIST-HanwhaVision_task4a_5 FDYLKA BEATs all ensemble 48 Kim2023 1.72 0.611 (0.598 - 0.623) 0.846 (0.838 - 0.855) 12.334 1.195 0.069 0.095 0.010 0.013
Kim_GIST-HanwhaVision_task4a_6 FDYLKA BEATs PSDS1 ensemble 16 Kim2023 1.72 0.611 (0.590 - 0.628) 0.841 (0.832 - 0.851) 12.334 0.398 0.069 0.095 0.029 0.040
Kim_GIST-HanwhaVision_task4a_7 FDYLKA BEATs PSDS2 ensemble 16 Kim2023 1.69 0.591 (0.574 - 0.604) 0.844 (0.835 - 0.853) 12.334 0.398 0.067 0.095 0.028 0.040
Kim_GIST-HanwhaVision_task4a_8 FDYLKA BEATs PSDS sum ensemble 16 Kim2023 1.72 0.612 (0.599 - 0.626) 0.841 (0.831 - 0.851) 12.334 0.398 0.069 0.095 0.029 0.040
Wenxin_TJU_task4a_1 ensemble-pretrained-psds1-0 Wenxin2023 1.63 0.555 (0.543 - 0.566) 0.837 (0.828 - 0.847) 190.549 0.474 0.004 0.006 0.022 0.034
Wenxin_TJU_task4a_2 ensemble-pretrained-psds1-1 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854) 95.275 0.237 0.008 0.012 0.046 0.068
Wenxin_TJU_task4a_3 ensemble-pretrained-psds2-0 Wenxin2023 0.88 0.080 (0.071 - 0.088) 0.815 (0.802 - 0.825) 171.494 0.427 0.001 0.007 0.004 0.036
Wenxin_TJU_task4a_4 ensemble-pretrained-psds2-1 Wenxin2023 0.90 0.081 (0.071 - 0.090) 0.838 (0.828 - 0.849) 190.549 0.474 0.001 0.006 0.003 0.034
Wenxin_TJU_task4a_5 single-pretrained-psds1-0 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) 9.527 0.024 0.079 0.119 0.427 0.646
Wenxin_TJU_task4a_6 single-pretrained-psds1-1 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) 9.527 0.024 0.080 0.121 0.432 0.658
Wenxin_TJU_task4a_7 single-psds1 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) 7.582 0.017 0.081 0.126 0.492 0.766
Wenxin_TJU_task4a_8 single-psds2 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) 7.582 0.017 0.011 0.130 0.065 0.790

System characteristics

General characteristics
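Most submissions in this table combine mixup with log-mel energy features. As a point of reference only, the sketch below shows a generic mixup step on a batch of log-mel clips with weak labels; the array shapes, variable names, and the Beta(0.2, 0.2) mixing coefficient are illustrative assumptions and are not taken from any particular submission.

```python
import numpy as np

# Illustrative sketch only: mixup as listed in the "Data augmentation" column,
# applied to a batch of log-mel features with weak (clip-level) labels.
def mixup(features: np.ndarray, labels: np.ndarray, alpha: float = 0.2):
    """Mix random pairs of examples; features: (batch, mels, frames), labels: (batch, classes)."""
    lam = np.random.beta(alpha, alpha)          # mixing coefficient
    perm = np.random.permutation(len(features)) # random pairing of clips
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_x, mixed_y

# Example: 4 clips, 64 mel bands, 156 frames, 10 sound-event classes.
x = np.random.randn(4, 64, 156).astype(np.float32)
y = np.random.randint(0, 2, size=(4, 10)).astype(np.float32)
mx, my = mixup(x, y)
print(mx.shape, my.shape)  # (4, 64, 156) (4, 10)
```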

Rank | Code | Technical Report | Ranking score (Evaluation dataset) | PSDS 1 (Evaluation dataset) | PSDS 2 (Evaluation dataset) | Data augmentation | Features
Baseline_task4a_1 Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) mixup log-mel energies
Baseline_task4a_2 Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) mixup log-mel energies
Li_USTC_task4a_1 Li2023 1.54 0.539 (0.527 - 0.551) 0.769 (0.758 - 0.778) specaugmentation log-mel energies
Li_USTC_task4a_2 Li2023 1.58 0.556 (0.544 - 0.569) 0.781 (0.769 - 0.795) specaugmentation, mixup log-mel energies
Li_USTC_task4a_3 Li2023 1.54 0.546 (0.535 - 0.558) 0.756 (0.745 - 0.769) specaugmentation, mixup log-mel energies
Li_USTC_task4a_4 Li2023 0.89 0.061 (0.050 - 0.070) 0.852 (0.843 - 0.863) specaugmentation, mixup log-mel energies
Li_USTC_task4a_5 Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) specaugmentation log-mel energies
Li_USTC_task4a_6 Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) specaugmentation, mixup log-mel energies
Li_USTC_task4a_7 Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) specaugmentation log-mel energies
Liu_NSYSU_task4_1 Liu2023 0.80 0.051 (0.042 - 0.060) 0.779 (0.767 - 0.791) mixup, filter augment log-mel energies
Liu_NSYSU_task4_2 Liu2023 1.36 0.466 (0.455 - 0.480) 0.701 (0.688 - 0.714) mixup, filter augment log-mel energies
Liu_NSYSU_task4_3 Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) mixup, filter augment, time shifting, pitch shifting, spec augment log-mel energies
Liu_NSYSU_task4_4 Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) mixup, filter augment, time shifting, pitch shifting, spec augment log-mel energies
Liu_NSYSU_task4_5 Liu2023 0.82 0.045 (0.035 - 0.053) 0.806 (0.794 - 0.818) mixup, filter augment log-mel energies
Liu_NSYSU_task4_6 Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848) mixup, filter augment log-mel energies
Liu_NSYSU_task4_7 Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) mixup, filter augment log-mel energies
Liu_NSYSU_task4_8 Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) mixup, filter augment log-mel energies
Lee_CAU_task4A_1 Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) mixup, time-masking, filteraugment log-mel spectrogram
Lee_CAU_task4A_2 Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) mixup, time-masking, filteraugment log-mel spectrogram
Cheimariotis_DUTH_task4a_1 Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) mixup log-mel spectrogram
Cheimariotis_DUTH_task4a_2 Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) mixup log-mel spectrogram
Chen_CHT_task4_1 Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) mix-up, noise, sct log-mel energies
Chen_CHT_task4_2 Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) mix-up, ict log-mel energies
Chen_CHT_task4_3 Chen2023b 1.66 0.596 (0.585 - 0.606) 0.810 (0.800 - 0.822) mix-up, ict, noise, mask log-mel energies
Chen_CHT_task4_4 Chen2023b 1.66 0.590 (0.578 - 0.601) 0.820 (0.810 - 0.831) mix-up, ict, noise, mask log-mel energies
Xiao_FMSG_task4a_1 Zhang2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) time masking, frequency shifting, mixup, filter-augmentation log-mel energies
Xiao_FMSG_task4a_2 Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_3 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_4 Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_5 Xiao2023 1.61 0.555 (0.545 - 0.567) 0.821 (0.811 - 0.834) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_6 Xiao2023 1.61 0.551 (0.541 - 0.561) 0.829 (0.819 - 0.842) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_7 Xiao2023 0.87 0.075 (0.066 - 0.084) 0.811 (0.800 - 0.822) time masking, frequency masking, mixup log-mel energies
Xiao_FMSG_task4a_8 Xiao2023 1.62 0.549 (0.540 - 0.560) 0.834 (0.824 - 0.847) time masking, frequency masking, mixup log-mel energies
Guan_HIT_task4a_1 Guan2023 1.57 0.536 (0.526 - 0.546) 0.810 (0.800 - 0.822) mixup, time mask, pitch shift, time shift log-mel energies
Guan_HIT_task4a_2 Guan2023 0.93 0.082 (0.074 - 0.090) 0.862 (0.852 - 0.872) mixup, time mask, pitch shift, time shift log-mel energies
Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) mixup, time mask, pitch shift, time shift log-mel energies
Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) mixup, time mask, pitch shift, time shift log-mel energies
Guan_HIT_task4a_5 Guan2023 1.40 0.488 (0.475 - 0.503) 0.708 (0.696 - 0.720) mixup, time mask, pitch shift, time shift log-mel energies
Guan_HIT_task4a_6 Guan2023 0.88 0.088 (0.080 - 0.096) 0.797 (0.787 - 0.810) mixup, time mask, pitch shift, time shift log-mel energies
Wang_XiaoRice_task4a_1 Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) Embeddings
Wang_XiaoRice_task4a_2 Wang2023 1.52 0.497 (0.486 - 0.510) 0.814 (0.803 - 0.828) Embeddings
Wang_XiaoRice_task4a_3 Wang2023 0.91 0.088 (0.076 - 0.098) 0.835 (0.824 - 0.844) Embeddings
Zhang_IOA_task4_1 Zhang2023 1.75 0.622 (0.613 - 0.634) 0.857 (0.849 - 0.866) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_2 Zhang2023 0.95 0.070 (0.060 - 0.080) 0.903 (0.895 - 0.911) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_3 Zhang2023 1.71 0.613 (0.603 - 0.625) 0.828 (0.821 - 0.839) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_4 Zhang2023 1.75 0.625 (0.615 - 0.637) 0.855 (0.847 - 0.864) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_5 Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_6 Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) specaugment, mixup, frame_shift, FilterAug log-mel energies
Zhang_IOA_task4_7 Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) specaugment, mixup, frame_shift, FilterAug log-mel energies
Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) mixup, frameshift, FilterAugment log-mel energies
Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806) mixup, FilterAugment log-mel energies
Wu_NCUT_task4a_3 Wu2023 1.50 0.497 (0.486 - 0.509) 0.793 (0.783 - 0.806) mixup, frameshift, FilterAugment log-mel energies
Barahona_AUDIAS_task4a_1 Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) mixup, time shifting log-mel energies
Barahona_AUDIAS_task4a_2 Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) mixup, time shifting log-mel energies
Barahona_AUDIAS_task4a_3 Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) mixup, filteraugment log-mel energies
Barahona_AUDIAS_task4a_4 Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) mixup, filteraugment log-mel energies
Barahona_AUDIAS_task4a_5 Barahona2023 1.14 0.378 (0.365 - 0.392) 0.604 (0.590 - 0.622) mixup, time shifting log-mel energies
Barahona_AUDIAS_task4a_6 Barahona2023 1.18 0.401 (0.390 - 0.414) 0.612 (0.596 - 0.630) mixup, time shifting log-mel energies
Barahona_AUDIAS_task4a_7 Barahona2023 1.06 0.274 (0.262 - 0.287) 0.684 (0.671 - 0.699) mixup, filteraugment log-mel energies
Barahona_AUDIAS_task4a_8 Barahona2023 1.00 0.213 (0.201 - 0.226) 0.729 (0.710 - 0.752) mixup, filteraugment log-mel energies
Gan_NCUT_task4_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) frame shift, time mask, mixup, FilterAugment log-mel energies
Gan_NCUT_task4_2 Gan2023 1.52 0.511 (0.498 - 0.524) 0.799 (0.785 - 0.813) frame shift, time mask, mixup, FilterAugment log-mel energies
Gan_NCUT_task4_3 Gan2023 1.50 0.483 (0.467 - 0.498) 0.816 (0.805 - 0.828) frame shift, time mask, mixup, FilterAugment log-mel energies
Liu_SRCN_task4a_1 Chen2023a 1.65 0.585 (0.572 - 0.598) 0.817 (0.804 - 0.834) mixup log-mel energies
Liu_SRCN_task4a_2 Chen2023a 1.40 0.380 (0.369 - 0.392) 0.877 (0.867 - 0.885) mixup log-mel energies
Liu_SRCN_task4a_3 Chen2023a 1.65 0.556 (0.544 - 0.569) 0.861 (0.852 - 0.870) mixup log-mel energies
Liu_SRCN_task4a_4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) mixup, time stretching, pitch shifting, time mask, frequency mask log-mel energies
Liu_SRCN_task4a_5 Chen2023a 0.94 0.098 (0.086 - 0.108) 0.851 (0.841 - 0.860) mixup log-mel energies
Kim_GIST-HanwhaVision_task4a_1 Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) frame shift, frequency shift, time masking, filter augment log-mel energies
Kim_GIST-HanwhaVision_task4a_2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_3 Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_4 Kim2023 1.63 0.576 (0.549 - 0.595) 0.809 (0.797 - 0.821) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_5 Kim2023 1.72 0.611 (0.598 - 0.623) 0.846 (0.838 - 0.855) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_6 Kim2023 1.72 0.611 (0.590 - 0.628) 0.841 (0.832 - 0.851) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_7 Kim2023 1.69 0.591 (0.574 - 0.604) 0.844 (0.835 - 0.853) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Kim_GIST-HanwhaVision_task4a_8 Kim2023 1.72 0.612 (0.599 - 0.626) 0.841 (0.831 - 0.851) time masking, time and frequency shift, filter-augment, mix-up, Gaussian noise log-mel energies
Wenxin_TJU_task4a_1 Wenxin2023 1.63 0.555 (0.543 - 0.566) 0.837 (0.828 - 0.847) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_2 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_3 Wenxin2023 0.88 0.080 (0.071 - 0.088) 0.815 (0.802 - 0.825) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_4 Wenxin2023 0.90 0.081 (0.071 - 0.090) 0.838 (0.828 - 0.849) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_5 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_6 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_7 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
Wenxin_TJU_task4a_8 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) FilterAugment, mixup, frameshift, SpecAugment, ICT, SCT log-mel energies
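
Nearly every system in the table above lists mixup among its augmentations. As an illustration only (interpolation strength and label handling vary between submissions and are not reported here), a minimal batch-level mixup on log-mel features and weak labels could look like this:

```python
import torch

def mixup(features, labels, alpha=0.2):
    """Mix a batch of log-mel features and labels with a Beta-sampled weight.

    features: (batch, n_mels, frames) log-mel energies
    labels:   (batch, n_classes) weak labels (strong labels can be mixed the same way)
    alpha:    Beta distribution parameter; 0.2 is a common but arbitrary choice.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(features.size(0))
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_x, mixed_y
```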



Machine learning characteristics

Rank Code Technical Report Ranking score (Evaluation dataset) PSDS 1 (Evaluation dataset) PSDS 2 (Evaluation dataset) Classifier Semi-supervised approach Post-processing Segmentation method Decision making
Baseline_task4a_1 Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) CRNN Mean-teacher student median filtering
Baseline_task4a_2 Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) CRNN Mean-teacher student median filtering
Li_USTC_task4a_1 Li2023 1.54 0.539 (0.527 - 0.551) 0.769 (0.758 - 0.778) PaSST-SED Mean-teacher student median filtering
Li_USTC_task4a_2 Li2023 1.58 0.556 (0.544 - 0.569) 0.781 (0.769 - 0.795) PaSST-SED Mean-teacher student, Pseudo-labelling median filtering
Li_USTC_task4a_3 Li2023 1.54 0.546 (0.535 - 0.558) 0.756 (0.745 - 0.769) PaSST-SED Mean-teacher student median filtering
Li_USTC_task4a_4 Li2023 0.89 0.061 (0.050 - 0.070) 0.852 (0.843 - 0.863) PaSST-SED Mean-teacher student max filtering
Li_USTC_task4a_5 Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) PaSST-SED Mean-teacher student median filtering
Li_USTC_task4a_6 Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) PaSST-SED Pseudo-labelling, Mean-teacher student median filtering
Li_USTC_task4a_7 Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) SKCRNN Mean-teacher student median filtering
Liu_NSYSU_task4_1 Liu2023 0.80 0.051 (0.042 - 0.060) 0.779 (0.767 - 0.791) CRNN Mean-teacher student median filtering (93ms) average
Liu_NSYSU_task4_2 Liu2023 1.36 0.466 (0.455 - 0.480) 0.701 (0.688 - 0.714) CRNN Mean-teacher student median filtering (93ms) average
Liu_NSYSU_task4_3 Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) CRNN Mean-teacher student median filtering (93ms)
Liu_NSYSU_task4_4 Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) CRNN Mean-teacher student median filtering (93ms)
Liu_NSYSU_task4_5 Liu2023 0.82 0.045 (0.035 - 0.053) 0.806 (0.794 - 0.818) CRNN Mean-teacher student median filtering (93ms) average
Liu_NSYSU_task4_6 Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848) CRNN Mean-teacher student median filtering (93ms) average
Liu_NSYSU_task4_7 Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) CRNN Mean-teacher student median filtering (93ms)
Liu_NSYSU_task4_8 Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) CRNN Mean-teacher student median filtering (93ms)
Lee_CAU_task4A_1 Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) CRNN Mean-teacher student median filtering attention layers None
Lee_CAU_task4A_2 Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) CRNN Mean-teacher student median filtering attention layers None
Cheimariotis_DUTH_task4a_1 Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) CRNN-FDY Mean-teacher student median filtering (93ms)
Cheimariotis_DUTH_task4a_2 Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) CRNN-FDY Mean-teacher student median filtering (93ms)
Chen_CHT_task4_1 Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) CRNN Mean-teacher student median filtering average
Chen_CHT_task4_2 Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) CRNN Mean-teacher student median filtering average
Chen_CHT_task4_3 Chen2023b 1.66 0.596 (0.585 - 0.606) 0.810 (0.800 - 0.822) CRNN Mean-teacher student median filtering average
Chen_CHT_task4_4 Chen2023b 1.66 0.590 (0.578 - 0.601) 0.820 (0.810 - 0.831) CRNN Mean-teacher student median filtering average
Xiao_FMSG_task4a_1 Xiao2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) FDY_CRNN Mean-teacher student classwise median filtering
Xiao_FMSG_task4a_2 Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) FDY_CRNN Mean-teacher student classwise median filtering
Xiao_FMSG_task4a_3 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) FDY_CRNN Mean-teacher student classwise median filtering
Xiao_FMSG_task4a_4 Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) FDY_CRNN Mean-teacher student classwise median filtering
Xiao_FMSG_task4a_5 Xiao2023 1.61 0.555 (0.545 - 0.567) 0.821 (0.811 - 0.834) FDY_CRNN Mean-teacher student classwise median filtering average
Xiao_FMSG_task4a_6 Xiao2023 1.61 0.551 (0.541 - 0.561) 0.829 (0.819 - 0.842) FDY_CRNN Mean-teacher student classwise median filtering average
Xiao_FMSG_task4a_7 Xiao2023 0.87 0.075 (0.066 - 0.084) 0.811 (0.800 - 0.822) FDY_CRNN Mean-teacher student classwise median filtering average
Xiao_FMSG_task4a_8 Xiao2023 1.62 0.549 (0.540 - 0.560) 0.834 (0.824 - 0.847) FDY_CRNN Mean-teacher student classwise median filtering average
Guan_HIT_task4a_1 Guan2023 1.57 0.536 (0.526 - 0.546) 0.810 (0.800 - 0.822) CRNN Mean-teacher student classwise median filtering average
Guan_HIT_task4a_2 Guan2023 0.93 0.082 (0.074 - 0.090) 0.862 (0.852 - 0.872) CRNN Mean-teacher student classwise median filtering average
Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) CRNN Mean-teacher student classwise median filtering
Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) CRNN Mean-teacher student classwise median filtering
Guan_HIT_task4a_5 Guan2023 1.40 0.488 (0.475 - 0.503) 0.708 (0.696 - 0.720) CRNN Mean-teacher student classwise median filtering average
Guan_HIT_task4a_6 Guan2023 0.88 0.088 (0.080 - 0.096) 0.797 (0.787 - 0.810) CRNN Mean-teacher student classwise median filtering average
Wang_XiaoRice_task4a_1 Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) GRU Mean-teacher student median filtering (320ms) average
Wang_XiaoRice_task4a_2 Wang2023 1.52 0.497 (0.486 - 0.510) 0.814 (0.803 - 0.828) GRU Mean-teacher student median filtering (93ms)
Wang_XiaoRice_task4a_3 Wang2023 0.91 0.088 (0.076 - 0.098) 0.835 (0.824 - 0.844) DNN Unsupervised data augmentation median filtering (93ms) average
Zhang_IOA_task4_1 Zhang2023 1.75 0.622 (0.613 - 0.634) 0.857 (0.849 - 0.866) CRNN, Transformer Mean-teacher student, Pseudo-labelling median filtering average
Zhang_IOA_task4_2 Zhang2023 0.95 0.070 (0.060 - 0.080) 0.903 (0.895 - 0.911) CRNN, Transformer Mean-teacher student, Pseudo-labelling median filtering average
Zhang_IOA_task4_3 Zhang2023 1.71 0.613 (0.603 - 0.625) 0.828 (0.821 - 0.839) CRNN, Transformer Mean-teacher student, Pseudo-labelling median filtering average
Zhang_IOA_task4_4 Zhang2023 1.75 0.625 (0.615 - 0.637) 0.855 (0.847 - 0.864) CRNN, Transformer Mean-teacher student, Pseudo-labelling median filtering average
Zhang_IOA_task4_5 Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) CRNN Mean-teacher student median filtering
Zhang_IOA_task4_6 Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) CRNN Mean-teacher student, Pseudo-labelling median filtering
Zhang_IOA_task4_7 Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) CRNN, Transformer Mean-teacher student, Pseudo-labelling median filtering
Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) RCRNN Mean-teacher student median filtering (93ms)
Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806) RCRNN Mean-teacher student median filtering (93ms) average
Wu_NCUT_task4a_3 Wu2023 1.50 0.497 (0.486 - 0.509) 0.793 (0.783 - 0.806) RCRNN Mean-teacher student median filtering (93ms) average
Barahona_AUDIAS_task4a_1 Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) CRNN Mean-teacher student median filtering (450ms)
Barahona_AUDIAS_task4a_2 Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) CRNN Mean-teacher student median filtering (class dependent)
Barahona_AUDIAS_task4a_3 Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) Conformer Mean-teacher student median filtering (1344 ms)
Barahona_AUDIAS_task4a_4 Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) Conformer Mean-teacher student median filtering (class-dependent)
Barahona_AUDIAS_task4a_5 Barahona2023 1.14 0.378 (0.365 - 0.392) 0.604 (0.590 - 0.622) CRNN Mean-teacher student median filtering (450ms) averaging
Barahona_AUDIAS_task4a_6 Barahona2023 1.18 0.401 (0.390 - 0.414) 0.612 (0.596 - 0.630) CRNN Mean-teacher student median filtering (class-dependent) averaging
Barahona_AUDIAS_task4a_7 Barahona2023 1.06 0.274 (0.262 - 0.287) 0.684 (0.671 - 0.699) Conformer Mean-teacher student median filtering (1344ms) averaging
Barahona_AUDIAS_task4a_8 Barahona2023 1.00 0.213 (0.201 - 0.226) 0.729 (0.710 - 0.752) Conformer Mean-teacher student median filtering (class-dependent) averaging
Gan_NCUT_task4_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) CRNN Mean-teacher student median filtering (93ms)
Gan_NCUT_task4_2 Gan2023 1.52 0.511 (0.498 - 0.524) 0.799 (0.785 - 0.813) CRNN Mean-teacher student median filtering (93ms) average
Gan_NCUT_task4_3 Gan2023 1.50 0.483 (0.467 - 0.498) 0.816 (0.805 - 0.828) CRNN Mean-teacher student median filtering (93ms) average
Liu_SRCN_task4a_1 Chen2023a 1.65 0.585 (0.572 - 0.598) 0.817 (0.804 - 0.834) CRNN, Transformer, ensemble Mean-teacher student median filtering
Liu_SRCN_task4a_2 Chen2023a 1.40 0.380 (0.369 - 0.392) 0.877 (0.867 - 0.885) CRNN, Transformer, ensemble Mean-teacher student median filtering
Liu_SRCN_task4a_3 Chen2023a 1.65 0.556 (0.544 - 0.569) 0.861 (0.852 - 0.870) CRNN, Transformer, ensemble Mean-teacher student median filtering
Liu_SRCN_task4a_4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) CRNN Mean-teacher student
Liu_SRCN_task4a_5 Chen2023a 0.94 0.098 (0.086 - 0.108) 0.851 (0.841 - 0.860) CRNN, Transformer, ensemble Mean-teacher student time pool
Kim_GIST-HanwhaVision_task4a_1 Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) CRNN Mean-teacher student class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner)
Kim_GIST-HanwhaVision_task4a_2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) CRNN with pretrained BEATs Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner)
Kim_GIST-HanwhaVision_task4a_3 Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) CRNN with pretrained BEATs Mean-teacher student, Pseudo-labelling class-wise median filtering (28 ms for alarm, 44 ms for blender, 40ms for cat, 32 ms for dog, 32 ms for dishes, 88 ms for electric shaver, 148 ms for frying, 124 ms for running water, 28 ms for speech, 60 ms for vacuum cleaner)
Kim_GIST-HanwhaVision_task4a_4 Kim2023 1.63 0.576 (0.549 - 0.595) 0.809 (0.797 - 0.821) CRNN with pretrained BEATs, ensemble Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner) Average
Kim_GIST-HanwhaVision_task4a_5 Kim2023 1.72 0.611 (0.598 - 0.623) 0.846 (0.838 - 0.855) CRNN with pretrained BEATs, ensemble Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner) Average
Kim_GIST-HanwhaVision_task4a_6 Kim2023 1.72 0.611 (0.590 - 0.628) 0.841 (0.832 - 0.851) CRNN with pretrained BEATs, ensemble Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner) Average
Kim_GIST-HanwhaVision_task4a_7 Kim2023 1.69 0.591 (0.574 - 0.604) 0.844 (0.835 - 0.853) CRNN with pretrained BEATs, ensemble Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner) Average
Kim_GIST-HanwhaVision_task4a_8 Kim2023 1.72 0.612 (0.599 - 0.626) 0.841 (0.831 - 0.851) CRNN with pretrained BEATs, ensemble Mean-teacher student, Pseudo-labelling class-wise median filtering (20 ms for alarm, 44 ms for blender, 20ms for cat, 20 ms for dog, 20 ms for dishes, 268 ms for electric shaver, 244 ms for frying, 196 ms for running water, 20 ms for speech, 68 ms for vacuum cleaner) Average
Wenxin_TJU_task4a_1 Wenxin2023 1.63 0.555 (0.543 - 0.566) 0.837 (0.828 - 0.847) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_2 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_3 Wenxin2023 0.88 0.080 (0.071 - 0.088) 0.815 (0.802 - 0.825) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_4 Wenxin2023 0.90 0.081 (0.071 - 0.090) 0.838 (0.828 - 0.849) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_5 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_6 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_7 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
Wenxin_TJU_task4a_8 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) FDYCRNN Mutual mean teaching median filtering (320ms 705ms 320ms 320ms 320ms 4295ms 3910ms 3141ms 320ms 1090ms) attention layers Average
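
Median filtering dominates the post-processing column above, often with a 93 ms window or class-wise window lengths. A hedged sketch of class-wise median filtering over frame-level posteriors is given below; the window lengths, the 0.5 threshold and the tensor layout are illustrative, not taken from any particular submission.

```python
import numpy as np
from scipy.ndimage import median_filter

def classwise_median_filter(probs, win_frames, threshold=0.5):
    """Smooth frame-level posteriors per class, then binarize.

    probs:      (frames, n_classes) sigmoid outputs of the SED model
    win_frames: per-class window lengths in frames (e.g. a 93 ms window divided by the hop size)
    """
    decisions = np.zeros_like(probs, dtype=bool)
    for c, win in enumerate(win_frames):
        smoothed = median_filter(probs[:, c], size=int(win))
        decisions[:, c] = smoothed > threshold
    return decisions
```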

Complexity

Rank Code Technical Report Ranking score (Evaluation dataset) PSDS 1 (Evaluation dataset) PSDS 2 (Evaluation dataset) Model complexity (parameters) MACS (multiply-accumulate operations) Ensemble subsystems Training time
Baseline_task4a_1 Turpault2023 1.00 0.327 (0.317 - 0.339) 0.538 (0.515 - 0.566) 1112420 930902000 3h (A100 40Gb)
Baseline_task4a_2 Turpault2023 1.52 0.510 (0.496 - 0.523) 0.798 (0.782 - 0.811) 1227236 948793000 3h (A100 40Gb)
Li_USTC_task4a_1 Li2023 1.54 0.539 (0.527 - 0.551) 0.769 (0.758 - 0.778) 623304000 714906000000 6 18h (2 GTX 3090)
Li_USTC_task4a_2 Li2023 1.58 0.556 (0.544 - 0.569) 0.781 (0.769 - 0.795) 415536000 476604000000 4 6h (2 GTX 3090)
Li_USTC_task4a_3 Li2023 1.54 0.546 (0.535 - 0.558) 0.756 (0.745 - 0.769) 415536000 476604000000 4 6h (2 GTX 3090)
Li_USTC_task4a_4 Li2023 0.89 0.061 (0.050 - 0.070) 0.852 (0.843 - 0.863) 415536000 476604000000 4 6h (2 GTX 3090)
Li_USTC_task4a_5 Li2023 1.52 0.531 (0.520 - 0.544) 0.762 (0.751 - 0.773) 103884000 119151000000 6h (2 GTX 3090)
Li_USTC_task4a_6 Li2023 1.56 0.546 (0.529 - 0.562) 0.783 (0.771 - 0.796) 103884000 119151000000 3h (2 GTX 3090)
Li_USTC_task4a_7 Li2023 1.20 0.404 (0.389 - 0.421) 0.630 (0.612 - 0.648) 2684000 15481000000 2.5h (2 GTX 3090)
Liu_NSYSU_task4_1 Liu2023 0.80 0.051 (0.042 - 0.060) 0.779 (0.767 - 0.791) 132600000 5470000000 6 36h (1 RTX3060)
Liu_NSYSU_task4_2 Liu2023 1.36 0.466 (0.455 - 0.480) 0.701 (0.688 - 0.714) 132600000 5470000000 6 36h (1 RTX3060)
Liu_NSYSU_task4_3 Liu2023 1.26 0.434 (0.420 - 0.448) 0.646 (0.633 - 0.660) 6600000 4632000000 6h (1 RTX3060)
Liu_NSYSU_task4_4 Liu2023 1.24 0.413 (0.394 - 0.438) 0.655 (0.638 - 0.673) 22100000 911717000 6h (1 RTX3060)
Liu_NSYSU_task4_5 Liu2023 0.82 0.045 (0.035 - 0.053) 0.806 (0.794 - 0.818) 110500000 4558000000 5 30h (1 RTX3060)
Liu_NSYSU_task4_6 Liu2023 1.62 0.552 (0.540 - 0.563) 0.838 (0.829 - 0.848) 110500000 4558000000 5 30h (1 RTX3060)
Liu_NSYSU_task4_7 Liu2023 1.55 0.521 (0.510 - 0.531) 0.813 (0.796 - 0.831) 22100000 911717000 6h (1 RTX3060)
Liu_NSYSU_task4_8 Liu2023 1.53 0.515 (0.488 - 0.536) 0.805 (0.791 - 0.818) 22100000 911717000 6h (1 RTX3060)
Lee_CAU_task4A_1 Lee2023 1.24 0.425 (0.415 - 0.440) 0.634 (0.618 - 0.648) 6200000 331754000000 None 6h (1 Quadro RTX 8000)
Lee_CAU_task4A_2 Lee2023 0.79 0.104 (0.090 - 0.117) 0.674 (0.661 - 0.690) 6600000 335673000000 None 5h (1 Quadro RTX 8000)
Cheimariotis_DUTH_task4a_1 Cheimariotis2023 1.53 0.516 (0.504 - 0.529) 0.796 (0.784 - 0.808) 6600000 3497000000 8h (1 A6000)
Cheimariotis_DUTH_task4a_2 Cheimariotis2023 1.45 0.487 (0.475 - 0.502) 0.759 (0.745 - 0.773) 6600000 3497000000 8h (1 A6000)
Chen_CHT_task4_1 Chen2023b 1.25 0.441 (0.403 - 0.468) 0.620 (0.567 - 0.652) 4920104 5707000000 13h (1 A100)
Chen_CHT_task4_2 Chen2023b 1.58 0.563 (0.550 - 0.574) 0.779 (0.768 - 0.792) 6329384 5854000000 23h (1 A100)
Chen_CHT_task4_3 Chen2023b 1.66 0.596 (0.585 - 0.606) 0.810 (0.800 - 0.822) 37976304 35124000000 6 138h (1 A100)
Chen_CHT_task4_4 Chen2023b 1.66 0.590 (0.578 - 0.601) 0.820 (0.810 - 0.831) 94940760 87814000000 15 345h (1 A100)
Xiao_FMSG_task4a_1 Xiao2023 1.23 0.403 (0.392 - 0.417) 0.660 (0.646 - 0.672) 2770884 120350000 4h (1 RTX A5000)
Xiao_FMSG_task4a_2 Xiao2023 1.55 0.525 (0.516 - 0.538) 0.808 (0.796 - 0.821) 8832068 3726000000 5h (1 RTX A5000)
Xiao_FMSG_task4a_3 Xiao2023 0.86 0.071 (0.062 - 0.080) 0.807 (0.796 - 0.818) 4729412 1483000000 5h (1 RTX A5000)
Xiao_FMSG_task4a_4 Xiao2023 1.60 0.551 (0.543 - 0.562) 0.813 (0.802 - 0.827) 4171844 459620000 4.5h (1 RTX A5000)
Xiao_FMSG_task4a_5 Xiao2023 1.61 0.555 (0.545 - 0.567) 0.821 (0.811 - 0.834) 21629908 9322000000 5 25h (1 RTX A5000)
Xiao_FMSG_task4a_6 Xiao2023 1.61 0.551 (0.541 - 0.561) 0.829 (0.819 - 0.842) 51596456 12125000000 10 50h (1 RTX A5000)
Xiao_FMSG_task4a_7 Xiao2023 0.87 0.075 (0.066 - 0.084) 0.811 (0.800 - 0.822) 16687376 5917000000 4 20h (1 A4000)
Xiao_FMSG_task4a_8 Xiao2023 1.62 0.549 (0.540 - 0.560) 0.834 (0.824 - 0.847) 60916904 18646000000 10 50h (1 RTX A5000)
Guan_HIT_task4a_1 Guan2023 1.57 0.536 (0.526 - 0.546) 0.810 (0.800 - 0.822) 17840496 88200000000 4 6h (1 GTX 3090)
Guan_HIT_task4a_2 Guan2023 0.93 0.082 (0.074 - 0.090) 0.862 (0.852 - 0.872) 35680992 88200000000 8 6h (1 GTX 3090)
Guan_HIT_task4a_3 Guan2023 1.55 0.526 (0.513 - 0.539) 0.800 (0.788 - 0.813) 4460124 88200000000 6h (1 GTX 3090)
Guan_HIT_task4a_4 Guan2023 0.92 0.082 (0.073 - 0.091) 0.855 (0.844 - 0.867) 4460124 88200000000 6h (1 GTX 3090)
Guan_HIT_task4a_5 Guan2023 1.40 0.488 (0.475 - 0.503) 0.708 (0.696 - 0.720) 49061364 88200000000 11 6h (1 GTX 3090)
Guan_HIT_task4a_6 Guan2023 0.88 0.088 (0.080 - 0.096) 0.797 (0.787 - 0.810) 49061364 88200000000 11 6h (1 GTX 3090)
Wang_XiaoRice_task4a_1 Wang2023 1.50 0.494 (0.477 - 0.510) 0.801 (0.789 - 0.815) 671000 105200000 20min (1 V100)
Wang_XiaoRice_task4a_2 Wang2023 1.52 0.497 (0.486 - 0.510) 0.814 (0.803 - 0.828) 3873000 606000000 6 30 min (1 GTX 1080 Ti)
Wang_XiaoRice_task4a_3 Wang2023 0.91 0.088 (0.076 - 0.098) 0.835 (0.824 - 0.844) 1979000 350000000 6 30min (1 V100)
Zhang_IOA_task4_1 Zhang2023 1.75 0.622 (0.613 - 0.634) 0.857 (0.849 - 0.866) 240652180 10518000000000 25 30h (4 Tesla A100)
Zhang_IOA_task4_2 Zhang2023 0.95 0.070 (0.060 - 0.080) 0.903 (0.895 - 0.911) 64870560 2480000000000 16 20h (4 Tesla A100)
Zhang_IOA_task4_3 Zhang2023 1.71 0.613 (0.603 - 0.625) 0.828 (0.821 - 0.839) 481304360 21036000000000 50 60h (4 Tesla A100)
Zhang_IOA_task4_4 Zhang2023 1.75 0.625 (0.615 - 0.637) 0.855 (0.847 - 0.864) 240652180 10518000000000 25 30h (4 Tesla A100)
Zhang_IOA_task4_5 Zhang2023 1.52 0.524 (0.513 - 0.537) 0.774 (0.762 - 0.786) 11325746 460000000000 6h (1 Tesla A100)
Zhang_IOA_task4_6 Zhang2023 1.60 0.562 (0.552 - 0.575) 0.795 (0.786 - 0.805) 9160520 368000000000 6h (1 Tesla A100)
Zhang_IOA_task4_7 Zhang2023 0.86 0.055 (0.048 - 0.064) 0.830 (0.820 - 0.842) 4670390 182000000000 6h (1 Tesla A100)
Wu_NCUT_task4a_1 Wu2023 1.15 0.391 (0.379 - 0.405) 0.596 (0.584 - 0.610) 17307000 98754000000 8h (1 GTX 4090)
Wu_NCUT_task4a_2 Wu2023 1.53 0.519 (0.507 - 0.531) 0.793 (0.783 - 0.806) 66637000 487389000000 4 29h (1 GTX 4090)
Wu_NCUT_task4a_3 Wu2023 1.50 0.497 (0.486 - 0.509) 0.793 (0.783 - 0.806) 66637000 492389000000 4 29h (1 GTX 4090)
Barahona_AUDIAS_task4a_1 Barahona2023 1.06 0.351 (0.333 - 0.372) 0.562 (0.532 - 0.587) 1112420 1824000000 14h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_2 Barahona2023 1.12 0.380 (0.361 - 0.406) 0.575 (0.553 - 0.594) 1112420 1824000000 14h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_3 Barahona2023 0.91 0.200 (0.164 - 0.225) 0.646 (0.626 - 0.664) 12637170 633627000 21h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_4 Barahona2023 0.84 0.141 (0.124 - 0.155) 0.673 (0.652 - 0.700) 12637170 633627000 21h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_5 Barahona2023 1.14 0.378 (0.365 - 0.392) 0.604 (0.590 - 0.622) 4449680 5432000000 4 56h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_6 Barahona2023 1.18 0.401 (0.390 - 0.414) 0.612 (0.596 - 0.630) 4449680 5432000000 4 56h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_7 Barahona2023 1.06 0.274 (0.262 - 0.287) 0.684 (0.671 - 0.699) 63185850 4426000000 5 133h (1 GeForce RTX 2080 Ti)
Barahona_AUDIAS_task4a_8 Barahona2023 1.00 0.213 (0.201 - 0.226) 0.729 (0.710 - 0.752) 63185850 5432000000 5 133h (1 GeForce RTX 2080 Ti)
Gan_NCUT_task4_1 Gan2023 1.12 0.365 (0.353 - 0.377) 0.603 (0.589 - 0.617) 11200000 911727000 7h (1 RTX 3090)
Gan_NCUT_task4_2 Gan2023 1.52 0.511 (0.498 - 0.524) 0.799 (0.785 - 0.813) 132500000 56778000000 6 40h (1 RTX 3090)
Gan_NCUT_task4_3 Gan2023 1.50 0.483 (0.467 - 0.498) 0.816 (0.805 - 0.828) 452500000 138254000000 10 44h (1 RTX 3090)
Liu_SRCN_task4a_1 Chen2023a 1.65 0.585 (0.572 - 0.598) 0.817 (0.804 - 0.834) 1804800000 765529000000 26 92h (NVIDIA A100-PCIE-40GB)
Liu_SRCN_task4a_2 Chen2023a 1.40 0.380 (0.369 - 0.392) 0.877 (0.867 - 0.885) 1057200000 578115000000 20 87h (NVIDIA A100-PCIE-40GB)
Liu_SRCN_task4a_3 Chen2023a 1.65 0.556 (0.544 - 0.569) 0.861 (0.852 - 0.870) 1872000000 1011000000000 33 98h (NVIDIA A100-PCIE-40GB)
Liu_SRCN_task4a_4 Chen2023a 1.25 0.412 (0.400 - 0.424) 0.663 (0.652 - 0.676) 5300000 896000000 15h (NVIDIA A100-PCIE-40GB)
Liu_SRCN_task4a_5 Chen2023a 0.94 0.098 (0.086 - 0.108) 0.851 (0.841 - 0.860) 1057200000 578115000000 20 87h (NVIDIA A100-PCIE-40GB)
Kim_GIST-HanwhaVision_task4a_1 Kim2023 1.35 0.459 (0.431 - 0.484) 0.701 (0.681 - 0.720) 4542556 7234000000 29h (3 RTX A6000)
Kim_GIST-HanwhaVision_task4a_2 Kim2023 1.68 0.591 (0.574 - 0.611) 0.831 (0.823 - 0.841) 4804956 7300000000 14h 36m (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_3 Kim2023 1.66 0.581 (0.553 - 0.600) 0.835 (0.826 - 0.846) 4804956 7300000000 15h 31m (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_4 Kim2023 1.63 0.576 (0.549 - 0.595) 0.809 (0.797 - 0.821) 4804956 7300000000 46 12h (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_5 Kim2023 1.72 0.611 (0.598 - 0.623) 0.846 (0.838 - 0.855) 9609912 335800000000 46 36h (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_6 Kim2023 1.72 0.611 (0.590 - 0.628) 0.841 (0.832 - 0.851) 9609912 116800000000 46 36h (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_7 Kim2023 1.69 0.591 (0.574 - 0.604) 0.844 (0.835 - 0.853) 9609912 116800000000 46 36h (4 RTX A6000)
Kim_GIST-HanwhaVision_task4a_8 Kim2023 1.72 0.612 (0.599 - 0.626) 0.841 (0.831 - 0.851) 9609912 116800000000 46 36h (4 RTX A6000)
Wenxin_TJU_task4a_1 Wenxin2023 1.63 0.555 (0.543 - 0.566) 0.837 (0.828 - 0.847) 942000000 20320000000 20 1d 7h * 20 (1 Tesla V100)
Wenxin_TJU_task4a_2 Wenxin2023 1.66 0.570 (0.559 - 0.580) 0.844 (0.836 - 0.854) 471000000 10160000000 10 1d 7h * 10 (1 Tesla V100)
Wenxin_TJU_task4a_3 Wenxin2023 0.88 0.080 (0.071 - 0.088) 0.815 (0.802 - 0.825) 847800000 18288000000 18 1d 7h * 18 (1 Tesla V100)
Wenxin_TJU_task4a_4 Wenxin2023 0.90 0.081 (0.071 - 0.090) 0.838 (0.828 - 0.849) 942000000 20320000000 20 1d 7h * 20 (1 Tesla V100)
Wenxin_TJU_task4a_5 Wenxin2023 1.58 0.539 (0.528 - 0.549) 0.816 (0.806 - 0.831) 47700000 1045000000 1d 7h (1 Tesla V100)
Wenxin_TJU_task4a_6 Wenxin2023 1.61 0.546 (0.536 - 0.556) 0.831 (0.823 - 0.842) 47700000 1045000000 1d 7h (1 Tesla V100)
Wenxin_TJU_task4a_7 Wenxin2023 1.31 0.440 (0.429 - 0.454) 0.686 (0.673 - 0.699) 44200000 911727000 1d 7h (1 Tesla V100)
Wenxin_TJU_task4a_8 Wenxin2023 0.75 0.059 (0.049 - 0.068) 0.707 (0.694 - 0.723) 44200000 911727000 1d 7h (1 Tesla V100)
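
Model complexity and MACS figures in this table are self-reported, and counting conventions differ between teams. For reference, the parameter count of a PyTorch model is commonly obtained as sketched below; estimating MACS would additionally require a profiler such as the third-party thop package, mentioned here only as one possible tool.

```python
import torch
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of trainable parameters, as typically reported."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Example with a toy module (real submissions above range from roughly 0.7 M to 1.9 G parameters):
toy = nn.GRU(input_size=128, hidden_size=64, bidirectional=True)
print(count_parameters(toy))

# MACS would additionally require a profiler, e.g. the third-party `thop` package:
#   from thop import profile
#   macs, params = profile(model, inputs=(dummy_input,))
```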

Technical reports

OPTIMIZING MULTI-RESOLUTION CONFORMER AND CRNN MODELS FOR DIFFERENT PSDS SCENARIOS IN DCASE CHALLENGE 2023 TASK 4A

Barahona, Sara and de Benito-Gorron, Diego and Segovia, Sergio and Ramos, Daniel and Toledano, Doroteo
Universidad Autónoma de Madrid, Madrid, Spain

Abstract

In this technical report we describe our submission to DCASE 2023 Task 4A: Sound Event Detection with Weak Labels and Synthetic Soundscapes. Considering that the different scenarios proposed for the Polyphonic Sound Detection Score (PSDS) highlight diverse properties of a Sound Event Detection (SED) system, we employ a different architecture to optimize each scenario. Whereas we exploit the temporal benefits of Convolutional Recurrent Neural Networks (CRNNs) to maximize PSDS1, we employ a Conformer network to improve sound event classification and therefore enhance PSDS2. Additionally, we follow the multi-resolution approach successfully employed in previous DCASE editions to take advantage of the temporal and spectral disparities among the different sound event categories.

System characteristics
PDF

SOUND EVENT DETECTION OF DOMESTIC ACTIVITIES USING FREQUENCY DYNAMIC CONVOLUTION AND BEATS EMBEDDINGS

Cheimariotis, Grigorios-Aris and Mitianoudis, Nikolaos
Democritus University of Thrace, Xanthi, Greece

Abstract

This technical report describes one submission for DCASE 2023 Task 4a, "Sound event detection of domestic activities". The proposed methodology is based on the baseline system provided by the organizers and consists mainly of feature extraction by passing spectrograms through a frequency dynamic convolution network, concatenation of these features with BEATs embeddings, and use of a BiGRU for sequence modelling. A mean-teacher model is also employed. The results for the submissions are PSDS1 0.496 and PSDS2 0.788 when using the AudioSet real strongly-labelled data, and PSDS1 0.516 and PSDS2 0.781 when that subset is not used.
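
The fusion step described above (frequency dynamic convolution features concatenated with BEATs embeddings, followed by a BiGRU) can be sketched roughly as follows; the layer sizes, the time alignment by interpolation and the sigmoid head are assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionSEDHead(nn.Module):
    """Toy fusion head: CNN features + pretrained embeddings -> BiGRU -> frame-wise classes."""

    def __init__(self, cnn_dim=256, emb_dim=768, hidden=256, n_classes=10):
        super().__init__()
        self.rnn = nn.GRU(cnn_dim + emb_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, cnn_feats, beats_emb):
        # cnn_feats: (batch, T, cnn_dim)     features from the convolutional front end
        # beats_emb: (batch, T_emb, emb_dim) frame embeddings from the pretrained model
        # Align the embedding time axis to the CNN time axis before concatenation.
        beats_emb = F.interpolate(beats_emb.transpose(1, 2), size=cnn_feats.size(1),
                                  mode="linear", align_corners=False).transpose(1, 2)
        fused, _ = self.rnn(torch.cat([cnn_feats, beats_emb], dim=-1))
        return torch.sigmoid(self.classifier(fused))  # frame-level class probabilities
```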

System characteristics
PDF

DCASE 2023 CHALLENGE TASK4 TECHNICAL REPORT

Chen, Minjun and Jin, Yongbin and Shao, Jun and Liu, Yangyang and Peng, Bo and Chen, Jie
Samsung Research China-Nanjing, Nanjing, China

Abstract

In this technical report we describe our submitted systems for DCASE 2023 Task 4: Sound Event Detection with Weak Labels and Synthetic Soundscapes (Subtask A), and Sound Event Detection with Soft Labels (Subtask B). We focus on constructing a CRNN model that fuses embeddings extracted by the BEATs or AST pre-trained models and uses frequency dynamic convolution (FDY-CRNN) and channel-wise selective kernel attention (SKA) to obtain an adaptive receptive field. To obtain multiple models of different architectures for an ensemble, we also fine-tune several BEATs models on the SED dataset. To further exploit the weakly labeled and unlabeled subsets of the DESED dataset, we pseudo-label these subsets through multiple iterations of self-training. We also use a small part of the AudioSet audio files, which follow the same self-training procedure. We train these models under two settings, one optimizing the PSDS1 score and the other optimizing the PSDS2 score. Our proposed systems achieve polyphonic sound detection scores (PSDS) of 0.570 (PSDS-scenario1) and 0.889 (PSDS-scenario2) on the development dataset of Subtask A, and a macro-average F1 score with optimum threshold per class (F1MO) of 49.70 on the development dataset of Subtask B.
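
The iterative self-training mentioned above boils down to generating pseudo labels with the current model and keeping only confident ones for the next training round. The sketch below is a simplified single round; the 0.9 threshold, the clip-level output shape and the confidence criterion are assumptions.

```python
import torch

@torch.no_grad()
def pseudo_label(model, unlabeled_batch, threshold=0.9):
    """One self-training round (simplified): keep confident clip-level predictions as pseudo labels.

    unlabeled_batch: (batch, n_mels, frames) features; the model is assumed to return
    clip-level logits of shape (batch, n_classes).
    """
    clip_probs = torch.sigmoid(model(unlabeled_batch))
    pseudo_weak = (clip_probs > threshold).float()    # multi-hot pseudo weak labels
    keep = clip_probs.max(dim=-1).values > threshold  # only keep clips with at least one confident class
    return pseudo_weak, keep
```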

System characteristics
PDF

SOUND EVENT DETECTION SYSTEM USING PRE-TRAINED MODEL FOR DCASE 2023 TASK 4

Chen, Wei-Yu and Lu, Chung-Li and Chuang, Hsiang-Feng and Cheng, Yu-Han and Chan, Bo-Cheng
Chunghwa Telecom Laboratories, Taiwan

Abstract

In this technical report, we briefly describe the system we designed for the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge Task 4: Sound Event Detection with Weak Labels and Synthetic Soundscapes. Our best single system combines the embeddings obtained from VGGSK and BEATs, using a GRU to classify sound events for each frame. Thresholding and smoothing are applied during post-processing. The mean teacher method is used for semi-supervised learning, with an EMA strategy to update the parameters of the teacher model. To exploit unlabeled data, pseudo labels are generated by the student model. For data augmentation, we use techniques such as mix-up, Gaussian noise and embedding masking. The submitted single system trained with extra data achieves a PSDS1 of 0.529 and a PSDS2 of 0.78 on the validation set.
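
The EMA update of the teacher mentioned above is the standard mean teacher recipe: the teacher's weights track an exponential moving average of the student's weights. A minimal sketch (the 0.999 decay is an illustrative value) is:

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Update teacher weights as an exponential moving average of the student weights."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
    # Buffers (e.g. BatchNorm running statistics) are often copied directly:
    for t_buf, s_buf in zip(teacher.buffers(), student.buffers()):
        t_buf.copy_(s_buf)
```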

System characteristics
PDF

SEMI-SUPERVISED SOUND EVENT DETECTION BASED ON PRETRAINED MODELS FOR DCASE 2023 TASK 4A

Gan, Yanggang and Qiao, Ziling and Wu, Juan and Cai, Xichang and Wu, Menglong
North China University of Technology, Beijing, China

Abstract

In this technical report, we present our submission system for DCASE 2023 Task 4A: Sound Event Detection with Weak Labels and Synthetic Soundscapes. The proposed system is based on the mean teacher framework for semi-supervised learning, a selective kernel multi-scale convolutional network and a frequency dynamic convolutional network. We extract frame embeddings from the pre-trained BEATs model, use adaptive average pooling to unify the embeddings to a fixed dimension, and finally fuse them with the features extracted by the convolutional layers of the SED model along the channel dimension. Our systems achieve a PSDS-scenario1 of 52.1% and a PSDS-scenario2 of 82.5% on the validation set.

System characteristics
PDF

SEMI-SUPERVISED SOUND EVENT DETECTION SYSTEM FOR DCASE 2023 TASK 4

Guan, Yadong and Shang, Qijie
Harbin Institute of Technology, Harbin, China

Abstract

In this report, we describe our submissions for Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge: Sound Event Detection in Domestic Environments. Our methods are mainly based on a Convolutional Recurrent Neural Network. We propose to use sound activity detection (SAD) as an auxiliary task for sound event detection and adopt a multi-task learning approach to train the two tasks simultaneously, thus improving generalization. Moreover, we propose a new local weak prediction to improve the PSDS2 metric. To prevent overfitting, we adopt data augmentation using hard mixup, pitch shift, and time shift. In addition, we use external data and the pretrained BEATs model to further improve performance, and try an ensemble of multiple subsystems to enhance the generalization capability of our system. Our final systems achieve PSDS1/PSDS2 scores of 0.523/0.890 on the development dataset.
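
The auxiliary SAD task described above amounts to adding a class-agnostic "any event active" output and combining its loss with the SED loss. The sketch below is one plausible formulation; the loss weighting and the way the SAD target is derived from the strong labels are assumptions.

```python
import torch
import torch.nn.functional as F

def multitask_loss(sed_logits, sad_logits, strong_targets, sad_weight=0.5):
    """Joint SED + sound-activity-detection loss.

    sed_logits:     (batch, frames, n_classes) frame-level class logits
    sad_logits:     (batch, frames)            frame-level "any event" logits
    strong_targets: (batch, frames, n_classes) binary strong labels (float)
    """
    sed_loss = F.binary_cross_entropy_with_logits(sed_logits, strong_targets)
    # SAD target: a frame counts as active if any class is active in that frame.
    sad_targets = strong_targets.max(dim=-1).values
    sad_loss = F.binary_cross_entropy_with_logits(sad_logits, sad_targets)
    return sed_loss + sad_weight * sad_loss
```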

System characteristics
PDF

SEMI-SUPERVISED LEARNING-BASED SOUND EVENT DETECTION USING FREQUENCY DYNAMIC CONVOLUTION WITH LARGE KERNEL ATTENTION FOR DCASE CHALLENGE 2023 TASK 4

Kim, Ji Won1 and Son, Sang Won1 and Song, Yoonah1 and Kim, Hong Kook1,2 and Song, Il Hoon3 and Lim, Jeong Eun3
1AI Graduate School, Gwangju Institute of Science and Technology, Gwangju, Korea 2School of EECS, Gwangju Institute of Science and Technology, Gwangju, Korea 3AI Lab. R&D Center, Hanwha Vision, Seongnam-si, Gyeonggi-do, Korea

Abstract

In this technical report, we present our submission system for DCASE 2023 Task4A: Sound Event Detection with Weak Labels and Synthetic Soundscapes. The proposed system is based on mean teacher framework of semi-supervised learning,selective kernel multi-scale convolutional network and frequency dynamic convolutional network. We extract the frame embeddings of the pre-trained model BEAT, and use adaptive average pooling to unify the embeddings to a fixed dimension, and finally fuse them with the features extracted by the convolutional layer of the SED model in the channel dimension. Our systems finally achieve the PSDS-scenario1 of 52.1% and PSDS-scenario2 of 82.5% on the validation set.

System characteristics
PDF

SOUND EVENT DETECTION USING CONVOLUTION ATTENTION MODULE FOR DCASE 2023 CHALLENGE TASK4A

Lee, Sumi and Kim, Narin and Lee, Juhyun and Hwang, Chaewon and Jang, Sojung and Kwak, Il-Youp
Chung-Ang University, Department of Applied Statistics, Seoul, South Korea

Abstract

In this technical report, we propose sound event detection models based on CRNNs for DCASE 2023 Challenge Task 4A. DCASE Task 4 evaluates models with two main metrics, PSDS1 and PSDS2, which have different characteristics, making it difficult to raise both dramatically with a single model. Therefore, we developed two models with different aims. The first is Flcam-CRNN, aimed at PSDS1; Flcam is an attention module designed around the time-frequency structure of 2D audio features. The second is Mha-CRNN, aimed at PSDS2. SED data typically contain several overlapping sounds within a scene, so multi-head attention is used to extract features from various perspectives.

System characteristics
PDF

LI USTC TEAM’S SUBMISSION FOR DCASE 2023 CHALLENGE TASK4A

Li, Kang and Cai, Pengfei and Song, Yan
University of Science and Technology of China, Hefei, China

Abstract

In this technical report, we present our submissions for DCASE 2023 Challenge Task 4A. We mainly study how to fine-tune the patchout fast spectrogram transformer (PaSST) for the sound event detection task (PaSST-SED). Firstly, we fine-tune PaSST on the weakly labeled DESED dataset. Task-aware fine-tuning (TAFT) and a self-distilled mean teacher (SdMT) are used as fine-tuning strategies: TAFT helps exploit both local and semantic information from PaSST, and SdMT helps train a robust model with soft knowledge distillation. Secondly, we fine-tune PaSST on pseudo-labeled DESED, using pseudo labels from the DCASE 2022 rank-1 system; mix-up is used to mix audio clips with true or pseudo labels. Besides, when testing with the PaSST-SED model, sliding window clipping (SWC) is used to compensate for the temporal resolution loss of PaSST features. We also evaluate post-processing methods including median filtering and max filtering. Experiments on the DCASE 2023 Task 4A validation dataset demonstrate the effectiveness of the techniques used in our systems. Specifically, our systems achieve best PSDS1/PSDS2 scores of 0.5624/0.8990.

System characteristics
PDF

CHT+NSYSU SOUND EVENT DETECTION SYSTEM WITH PRETRAINED EMBEDDINGS EXTRACTED FROM BEATS MODEL FOR DCASE 2023 TASK 4

Liu, Chia-Chuan1 and Kuo, Tzu-Hao1 and Chen, Chia-Ping1 and Lu, Chung-Li2 and Chan, Bo-Cheng2 and Cheng, Yu-Han2 and Chuang, Hsiang-Feng2
1National Sun Yat-Sen University, Taiwan 2Chunghwa Telecom Laboratories, Taiwan

Abstract

In this technical report, we describe our submission system for DCASE 2023 Task 4: sound event detection in domestic environments. We propose FDY CRNN systems using BEATs embeddings. The system adopts late fusion to concatenate the feature maps from frequency dynamic convolution with the frame-level embeddings from BEATs. A classification layer then produces predictions from the late-fusion features. The system is trained within the mean teacher framework. We use the Asymmetric Focal Loss as the supervised loss to alleviate the imbalance between positive and negative samples. Furthermore, we apply two-stage mean teacher training to use the training data adequately. Compared to the baseline system using BEATs embeddings, which reaches a PSDS-scenario 1 of 50% and a PSDS-scenario 2 of 76.2%, our FDY CRNN system achieves 50.1% and 79.8%, respectively. The ensemble of FDY CRNN systems further improves PSDS-scenario 1 to 52.5% and PSDS-scenario 2 to 80.4%.
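
The Asymmetric Focal Loss mentioned above applies different focusing behaviour to active and inactive frames so that the abundant negatives are down-weighted. One possible formulation is sketched below; the parameter names and default values are illustrative and not necessarily those of the submission.

```python
import torch

def asymmetric_focal_loss(probs, targets, gamma=0.0, zeta=1.0, eps=1e-7):
    """One possible asymmetric focal BCE: separate focusing terms for active/inactive frames.

    probs:   sigmoid outputs in (0, 1)
    targets: binary labels with the same shape
    gamma:   down-weighting exponent applied to the positive (active) term
    zeta:    down-weighting exponent applied to the negative (inactive) term
    """
    probs = probs.clamp(eps, 1.0 - eps)
    pos = targets * (1.0 - probs).pow(gamma) * torch.log(probs)
    neg = (1.0 - targets) * probs.pow(zeta) * torch.log(1.0 - probs)
    return -(pos + neg).mean()
```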

System characteristics
PDF

PEPE: PLAIN EFFICIENT PRETRAINED EMBEDDINGS FOR SOUND EVENT DETECTION

Wang, Yongqing and Dinkel, Heinrich and Yan, Zhiyong and Zhang, Junbo and Wang, Yujun
Xiaomi Corporation, Beijing, China

Abstract

This paper is a system description of the XiaoRice team submission to the DCASE 2023 Task 4 challenge. In light of the increasing availability of pretrained audio embedding models, our research addresses the need for efficient utilization of these resources, taking into account their environmental impact. Our method named plain efficient pretrained (audio) embeddings (PEPE) integrates a linear classifier or a bidirectional gated recurrent network (BiGRU) with those embeddings while prioritizing energy efficiency, training speed and minimizing carbon emissions. By employing a streamlined approach, we demonstrate that a linear classifier with 52K parameters surpasses the challenge baseline for PSDS-2 scores, highlighting the potential of eco-friendly solutions in achieving superior performance. We achieve a polyphonic sound detection score (PSDS)-1 score of 53.44 via a 6-way ensemble and a PSDS-2 score of 88.60 with a simple linear classifier using PEPE. Through our work, we aim to emphasize the adoption of environmentally conscious practices in the field.
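
A linear classifier on top of frozen pretrained embeddings, as described above, can be as small as a single linear layer. The sketch below uses illustrative sizes and a simple max pooling for the clip-level (weak) prediction; the 52K-parameter figure quoted in the abstract implies a different embedding dimension and head than assumed here.

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Frame-wise linear classifier on top of frozen pretrained audio embeddings."""

    def __init__(self, emb_dim=768, n_classes=10):
        super().__init__()
        self.linear = nn.Linear(emb_dim, n_classes)  # ~7.7K parameters for these illustrative sizes

    def forward(self, embeddings):
        # embeddings: (batch, frames, emb_dim) from a frozen pretrained model
        frame_probs = torch.sigmoid(self.linear(embeddings))  # strong (frame-level) predictions
        clip_probs = frame_probs.max(dim=1).values            # weak (clip-level) predictions
        return frame_probs, clip_probs
```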

System characteristics
PDF

SEMI-SUPERVISED SOUND EVENT DETECTION SYSTEM FOR DCASE 2023 TASK4A

Duo, Wenxin1 and Fang, Xiang2 and Li, Jie2
1Tianjin University, School of Electrical and Information Engineering, Tianjin, China 2China Telecom Corporation Ltd., Data&AI Technology Company, Beijing, China

Abstract

In this technical report, we describe our systems for DCASE 2023 Challenge Task4a. Our systems are mainly based on Frequency Dynamic Convolutional Recurrent Neural Network (FDYCRNN) and Mutual Mean Teaching (MMT) semi-supervised strategy. In order to prevent overfitting, we adopt data augmentation using mixup, frame shift, SpecAugment, FilterAugment, Interpolation Consistency Training (ICT) and Shift Consistency Training (SCT). Besides, we utilize strongly labeled AudioSet data as external data and several pretrained models to further improve performance, and try an ensemble of multiple systems with different pretrained models to enhance the generalization capability of our system.
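
The ICT term mentioned above can be viewed as a consistency loss on unlabeled clips: the student's prediction on a mixed input should match the mixture of the teacher's predictions on the original inputs (SCT is analogous, using time-shifted copies instead of mixtures). The sketch below assumes student and teacher models that return frame-level logits and is illustrative only.

```python
import torch
import torch.nn.functional as F

def ict_loss(student, teacher, unlabeled, alpha=0.5):
    """Interpolation Consistency Training term on an unlabeled batch (simplified)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(unlabeled.size(0))
    mixed_input = lam * unlabeled + (1.0 - lam) * unlabeled[perm]
    with torch.no_grad():
        # Target: the same interpolation applied to the teacher's predictions.
        target = lam * torch.sigmoid(teacher(unlabeled)) \
            + (1.0 - lam) * torch.sigmoid(teacher(unlabeled[perm]))
    return F.mse_loss(torch.sigmoid(student(mixed_input)), target)
```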

System characteristics
PDF

SEMI-SUPERVISED SOUND EVENT DETECTION SYSTEM WITH PRETRAINED MODEL

Wu, Juan and Gan, Yanggang and Cai, Xichang and Wu, Menglong
North China University of Technology, Beijing,China

Abstract

In this report, we present our sound event detection system for the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge Task 4: Sound Event Detection with Weak Labels and Synthetic Soundscapes. For Task 4A, we designed an SED system based on the Mean Teacher [1] architecture to detect sound events and their onset and offset times in audio sequences, using semi-supervised learning to address the lack of labeled data in the DCASE 2023 Challenge task. In addition, we use pre-trained models to leverage external data and further improve the stability of the system. We finally ensemble multiple systems, achieving a best PSDS1 of 0.525 and PSDS2 of 0.783.
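
Besides the mean-teacher setup, the system characteristics tables above list FilterAugment for these submissions. A simplified sketch of the band-wise gain idea behind FilterAugment follows; the band counts, boundaries and gain range are illustrative, and this is not the submission's exact implementation.

```python
import torch

def filter_augment(log_mel, db_range=6.0, min_bands=3, max_bands=6):
    """Simplified FilterAugment-style transform: random per-band gains on log-mel features.

    log_mel: (batch, n_mels, frames) log-mel energies on a dB-like scale.
    """
    batch, n_mels, _ = log_mel.shape
    n_bands = int(torch.randint(min_bands, max_bands + 1, (1,)))
    # Random band boundaries along the mel axis (duplicate boundaries simply yield empty bands).
    inner = torch.sort(torch.randint(1, n_mels, (n_bands - 1,))).values
    edges = [0] + inner.tolist() + [n_mels]
    out = log_mel.clone()
    for lo, hi in zip(edges[:-1], edges[1:]):
        gain = (torch.rand(batch, 1, 1, device=log_mel.device) * 2 - 1) * db_range
        out[:, lo:hi, :] += gain
    return out
```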

System characteristics
PDF

FMSG SUBMISSION FOR DCASE 2023 CHALLENGE TASK 4 ON SOUND EVENT DETECTION WITH WEAK LABELS AND SYNTHETIC SOUNDSCAPES

Xiao, Yang and Khandelwal, Tanmay and Das, Rohan Kumar
Fortemedia Singapore, Singapore

Abstract

This report presents the systems developed and submitted by Fortemedia Singapore (FMSG) for DCASE 2023 Task 4A, which focuses on sound event detection with weak labels and synthetic soundscapes. Our approach primarily involves integrating features from Bidirectional Encoder representation from Audio Transformers (BEATs) and frequency dynamic (FDY)-convolutional recurrent neural network (CRNN) into a single-stage setup. We focus on three main directions to enhance our approach. Firstly, we curate an external dataset from AudioSet by establishing relationships between AudioSet sound event categories and the target sound events. Secondly, we utilize multiple aggregation methods to leverage the strengths of different methods. Lastly, we employ the asymmetric focal loss (AFL) function to adjust the training weights based on the model’s training difficulty. Additionally, we use data augmentation techniques to prevent overfitting, apply adaptive post-processing methods, and experiment with an ensemble of multiple subsystems to improve the generalization capability of our system. Our method achieves the top PSDS1 and PSDS2 scores of 0.557 and 0.854, respectively, on the development set. Further, on the public evaluation set, our approach achieves the highest PSDS1 and PSDS2 scores of 0.607 and 0.875, respectively.

System characteristics
PDF

SOUND EVENT DETECTION WITH WEAK PREDICTION FOR DCASE 2023 CHALLENGE TASK4A

Xiao, Shengchang and Shen, Jiakun and Hu, Aolin and Zhang, Xueshuai and Zhang, Pengyuan and Yan, Yonghong
Institute of Acoustics, Beijing, China

Abstract

In this technical report, we describe our submitted systems for the DCASE 2023 Challenge Task 4A: Sound Event Detection with Weak Labels and Synthetic Soundscapes. Specifically, we design two different systems, one for PSDS1 and one for PSDS2. As in previous editions of the Challenge, we also predict weak labels of clips to improve PSDS2; the difference is that this year we use shorter segments for specific classes. Moreover, we adopt an energy-difference-based log-mel spectrogram to improve the feature representation, and we use multi-dimensional frequency dynamic convolution (MFDConv) to strengthen the feature extraction ability of the convolutional kernels. We use a confidence-weighted BCE loss in the self-training stage and, in addition, assign higher weights to classes with worse performance. For post-processing, we optimize the probability values of intervals between events to obtain sharper boundaries.
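
The class-weighted BCE described above (larger weights for poorly performing classes) reduces to multiplying the per-element BCE by a per-class weight vector; a confidence-weighted variant would additionally scale by a per-frame pseudo-label confidence. A minimal sketch with assumed tensor shapes is:

```python
import torch
import torch.nn.functional as F

def weighted_bce(logits, targets, class_weights):
    """BCE with per-class weights, e.g. larger weights for poorly performing classes.

    logits, targets: (batch, frames, n_classes)
    class_weights:   (n_classes,) relative weights chosen by the practitioner
    """
    per_element = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (per_element * class_weights).mean()
```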

System characteristics
PDF