Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring


Challenge results

Task description

The goal of this task is to identify whether the sound emitted from a target machine is normal or anomalous. The main challenge is to detect unknown anomalous sounds when only normal sound samples are provided as training data. Although normal/anomaly detection looks like a two-class classification problem, it cannot be solved as one, because no anomalous samples are available for training. Prompt detection of machine anomalies from their sounds is useful for machine condition monitoring.
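Because only normal samples are available for training, submitted systems generally fit a model of normal sound features and report how far a test clip deviates from it as an anomaly score. A minimal illustrative sketch, assuming simple fixed-length feature vectors and a diagonal-Gaussian scorer (not any particular submitted system):

```python
import math

def fit_normal_model(train_feats):
    # Per-dimension mean and std estimated from normal-only training features.
    n, d = len(train_feats), len(train_feats[0])
    mean = [sum(x[j] for x in train_feats) / n for j in range(d)]
    std = [max(1e-8, math.sqrt(sum((x[j] - mean[j]) ** 2 for x in train_feats) / n))
           for j in range(d)]
    return mean, std

def anomaly_score(feat, model):
    # Mean squared z-score: larger means further from the normal training data.
    mean, std = model
    return sum(((f - m) / s) ** 2 for f, m, s in zip(feat, mean, std)) / len(feat)
```

A test clip is then flagged as anomalous when its score exceeds a threshold chosen on normal data; the baseline system uses an autoencoder's reconstruction error in the same role.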

A more detailed task description can be found on the task description page.

Teams ranking

The table includes only the best-performing system per submitting team.
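The tables report AUC and pAUC (partial AUC restricted to a low false-positive-rate range; this challenge used p = 0.1) per machine type. A self-contained sketch of computing both from anomaly scores (function names are our own):

```python
def roc_points(y_true, scores):
    # Trace the ROC curve by sweeping a threshold over descending scores;
    # samples tied at the same score are processed together.
    pairs = sorted(zip(scores, y_true), key=lambda t: -t[0])
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    pts = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        pts.append((fp / neg, tp / pos))
    return pts

def auc(pts, max_fpr=1.0):
    # Trapezoidal area under the ROC curve for FPR in [0, max_fpr],
    # normalized by max_fpr; max_fpr < 1 gives the pAUC.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x1 <= max_fpr:
            area += (x1 - x0) * (y0 + y1) / 2
        else:
            if x0 < max_fpr:
                # Linearly interpolate the TPR at FPR = max_fpr.
                y_mid = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
                area += (max_fpr - x0) * (y0 + y_mid) / 2
            break
    return area / max_fpr
```

For example, auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])) evaluates to 0.75; multiplying by 100 gives values on the scale shown in the tables.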

Each row lists: submission code, technical report, official rank, then AUC and pAUC scores (in percent) per machine type (fan, pump, slider, valve, ToyCar, ToyConveyor), first on the evaluation dataset and then on the development dataset.
DCASE2020_baseline_task2_1 Koizumi2020 93 82.80 65.80 82.37 64.11 79.41 58.87 57.37 50.79 80.14 66.17 85.36 66.95 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Alam_CRIM_task2_4 Alam2020 54 67.67 52.98 83.28 63.16 95.26 78.75 87.13 81.50 82.44 66.71 90.44 73.54 72.47 52.68 78.33 67.74 93.78 84.10 95.40 86.97 82.95 69.71 74.80 62.42
Giri_Amazon_task2_2 Giri2020 1 94.54 84.30 93.65 81.73 97.63 89.73 96.13 90.89 94.34 89.73 91.19 73.34 82.33 78.97 86.94 79.60 97.28 89.54 97.38 91.21 95.04 90.39 80.67 65.90
Hayashi_HDL_task2_3 Hayashi2020 10 92.72 80.52 90.63 73.61 95.68 81.48 97.43 89.69 91.75 83.97 92.10 76.76 86.59 76.31 88.83 78.43 97.16 89.27 99.69 98.42 93.17 85.44 77.91 63.50
Jiang_UESTC_task2_2 Jiang2020 60 85.86 68.69 85.87 65.02 81.84 59.76 69.70 52.57 84.61 69.85 90.18 74.44 69.58 57.52 73.15 62.80 85.19 71.60 67.95 51.68 81.12 70.34 73.06 59.77
Hoang_FPT_task2_2 Hoang2020 71 82.72 66.87 76.38 67.32 90.43 68.82 37.25 49.41 73.74 57.30 91.06 74.16 74.99 60.97 83.02 71.85 91.96 76.71 87.77 69.74 86.42 70.40 76.57 62.29
Tian_BUPT_task2_2 Tian2020 101 49.36 49.82 50.02 49.38 46.51 50.16 49.33 49.96 82.55 63.44 61.83 54.20 59.10 72.42 100.00 100.00 49.31 66.93 58.88 60.74 81.56 70.06 73.30 60.74
Durkota_NSW_task2_3 Durkota2020 24 90.74 83.38 88.70 75.97 96.18 87.49 97.48 92.46 94.32 89.01 64.38 53.79 77.40 71.47 75.47 67.28 97.12 89.85 97.95 92.53 92.42 84.72 64.43 54.00
Bai_LFXS_task2_1 Bai2020 48 85.64 66.78 86.16 65.82 92.05 77.00 78.47 60.24 82.01 68.47 88.46 70.31 65.00 53.00 80.00 63.00 92.00 78.00 85.00 66.00 80.00 67.00 73.00 61.00
Ahmed_Mila_task2_2 Ahmed2020 76 91.93 77.10 78.10 68.21 74.88 64.60 73.41 71.22 65.19 57.88 82.72 66.67 83.81 70.42 86.40 79.33 89.15 77.97 74.50 59.80 87.56 74.77 75.50 61.25
Chaudhary_NCS_task2_2 Chaudhary2020 62 87.32 66.30 86.33 65.75 82.66 60.24 66.67 51.37 85.19 69.55 88.68 71.09 70.18 56.25 74.34 63.75 85.05 66.53 76.66 52.80 85.70 72.12 74.19 59.87
Wilkinghoff_FKIE_task2_3 Wilkinghoff2020 22 93.75 80.68 93.19 81.10 95.71 79.45 94.87 83.58 94.06 86.80 84.22 69.12 83.12 71.77 94.00 83.48 97.79 88.39 91.55 73.03 94.33 83.65 73.30 59.74
Daniluk_SRPOL_task2_4 Daniluk2020 3 99.13 96.40 95.07 90.23 98.18 91.98 90.97 77.41 93.52 83.87 90.51 77.56 94.12 88.23 97.31 92.56 97.85 94.54 98.35 92.11 98.30 93.55 89.02 73.89
Xiao_THU_task2_4 Xiao2020 98 74.02 63.68 51.91 49.85 55.88 55.72 92.54 84.44 75.85 66.54 52.93 53.02 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Shinmura_JPN_task2_1 Shinmura2020 42 81.56 64.40 88.83 68.70 96.99 87.43 91.03 74.80 88.21 74.94 69.95 58.65 76.40 89.00 97.30 94.10 90.90 70.00
Grollmisch_IDMT_task2_2 Grollmisch2020 26 89.65 78.33 87.99 71.28 91.05 70.01 94.98 83.61 94.07 86.78 87.41 72.25 79.57 67.64 84.77 72.21 90.90 70.76 99.59 97.93 94.31 84.78 79.97 64.70
Haunschmid_CPJKU_task2_2 Haunschmid2020 39 91.48 74.32 92.30 72.14 89.74 76.43 81.99 69.82 81.50 67.00 88.01 70.52 75.69 62.13 79.84 69.48 93.55 87.76 94.50 81.86 81.92 67.05 73.46 60.98
Zhou_PSH_task2_3 Zhou2020 12 99.79 98.92 95.79 92.60 99.84 99.17 91.83 84.74 95.60 91.30 73.61 64.06 88.73 82.29 91.84 81.62 100.00 99.98 92.91 82.70 94.64 89.29 65.21 57.35
Wen_UESTC_task2_2 Wen2020 104 49.16 50.90 62.76 59.02 66.23 52.06 41.00 49.63 58.31 54.25 55.06 52.35 69.00 53.20 75.23 61.20 80.68 57.92 67.25 51.00 82.92 69.05 79.41 64.84
Chen_UESTC_task2_3 Chen2020 105 54.70 52.57 64.65 57.20 71.97 56.91 35.14 49.00 59.61 51.54 48.27 51.53 69.56 51.40 74.53 59.72 90.32 72.09 81.17 53.65 79.37 66.93 77.59 62.51
Shao_ELCN_task2_1 Shao2020 79 85.44 65.88 78.12 64.22 83.64 65.14 67.92 52.02 81.18 67.03 86.35 70.93 66.22 52.43 74.77 61.04 91.56 82.24 81.55 55.72 79.01 66.32 74.95 62.50
Zhao_TAU_task2_3 Zhao2020 43 88.85 68.49 86.94 68.22 89.10 67.46 89.67 76.12 85.44 69.06 85.04 71.98 81.70 61.30 85.80 67.60 92.80 78.30 91.30 74.10 91.60 79.50 77.70 62.50
Sakamoto_fixstars_task2_1 Sakamoto2020 27 96.63 84.34 89.07 71.88 78.09 60.49 92.92 85.44 84.71 70.89 90.82 78.31 82.05 67.18 82.66 69.30 94.39 79.60 92.84 83.71 92.22 77.52 82.26 65.13
Naranjo-Alcazar_Vfy_task2_3 Naranjo-Alcazar2020 114 49.05 51.18 46.97 49.89 54.29 51.03 46.47 50.62 59.81 50.51 53.89 51.57 78.63 71.26 80.33 70.94 78.94 70.08 80.94 70.83 87.27 74.21 90.35 81.50
Jalali_AIT_task2_1 Jalali2020 110 49.37 51.12 61.06 58.17 71.78 55.87 39.76 49.28 60.18 55.07 56.61 51.28 67.32 52.05 73.94 61.01 84.99 67.47 67.82 51.07 75.63 66.39 70.80 57.63
Primus_CP-JKU_task2_2 Primus2020 5 96.84 95.24 97.76 92.24 97.29 88.74 90.15 86.65 86.37 83.83 88.28 79.15 92.86 83.53 92.98 87.23 98.95 94.54 97.77 93.57 95.67 89.62 85.27 72.60
Wei_Kuaiyu_task2_3 Wei2020 66 86.20 63.42 86.51 65.78 84.74 60.69 66.12 51.10 80.49 62.39 91.40 75.47 66.02 61.68 74.53 61.68 88.76 68.92 77.26 52.05 79.96 66.25 77.86 63.17
Morita_SECOM_task2_3 Morita2020 33 90.12 76.19 85.62 65.02 92.00 74.62 89.24 76.74 93.42 86.61 92.34 75.72 82.50 64.81 81.04 65.77 91.19 79.88 91.52 79.06 95.69 88.72 79.66 64.90
Uchikoshi_JRI_task2_2 Uchikoshi2020 78 85.57 67.64 84.34 64.65 81.66 59.01 59.12 51.08 82.64 68.56 88.30 70.40 74.94 59.64
Park_LGE_task2_4 Park2020 58 82.30 59.97 84.38 64.23 96.39 83.58 83.86 61.99 81.40 66.37 86.41 71.92 70.77 54.50 76.21 62.06 94.16 83.97 89.67 72.85 82.73 70.35 76.61 64.01
Vinayavekhin_IBM_task2_2 Vinayavekhin2020 7 98.84 94.89 94.37 88.27 95.68 83.09 97.82 94.93 93.16 87.69 87.41 72.03 88.73 79.82 93.20 82.52 99.47 97.20 99.77 98.79 95.74 88.15 81.60 67.71
Lapin_BMSTU_task2_1 Lapin2020 99 51.35 50.97 64.67 57.88 67.11 56.01 73.80 73.42 57.61 52.55 55.71 52.08 68.75 54.61 76.00 61.11 90.78 72.80 84.90 63.78 70.29 57.15 75.57 60.31
He_THU_task2_1 Wang2020 74 79.94 56.72 81.03 63.07 84.74 60.47 77.22 55.87 82.46 71.10 88.92 71.30 68.25 53.21 72.93 61.52 82.04 67.58 84.19 62.47 80.61 72.55 74.00 61.81
Zhang_NJUPT_task2_2 Zhang2020 82 86.47 70.40 86.21 65.60 76.01 55.94 46.07 49.61 82.55 63.44 87.25 68.96 70.61 76.40 81.19 72.30 80.25 71.75
Kaltampanidis_AUTH_task2_1 Kaltampanidis2020 70 80.45 73.99 75.01 66.19 76.56 62.62 82.90 78.87 83.87 71.11 71.63 60.40 85.57 78.68 77.33 73.69 83.03 69.59 87.24 76.48 82.11 72.35 70.35 59.92
Tiwari_IITKGP_task2_4 Tiwari2020 59 81.04 67.26 80.43 60.60 88.25 67.06 85.61 85.32 81.49 67.18 88.48 70.45 73.23 58.29 82.60 69.65 90.28 79.36 96.84 90.81 87.00 71.92 72.53 60.43
Pilastri_CCG_task2_2 Ribeiro2020 63 79.45 56.36 84.60 62.23 88.59 63.24 69.41 51.59 81.06 71.59 91.57 75.97 66.78 52.63 72.07 60.96 91.77 76.20 78.83 53.10 78.04 69.12 75.93 60.03
Lopez_IL_task2_1 Lopez2020 18 93.09 90.67 93.98 90.72 98.88 95.38 96.80 90.61 86.59 81.85 71.21 61.41 88.23 80.57 93.21 86.19 99.97 99.82 99.89 99.41 95.73 90.32 74.17 65.86
Agrawal_mSense_task2_3 Agrawal2020 32 95.84 82.45 89.73 69.19 89.88 65.86 78.71 61.26 90.14 74.47 92.62 80.56 86.70 70.58 88.70 72.04 91.15 73.19 88.60 75.34 92.54 79.31 86.00 69.72
Phan_UIUC_task2_2 Phan2020 47 88.92 72.67 87.27 67.68 87.23 64.45 82.39 59.43 82.65 77.16 87.43 69.68 73.87 59.25 73.59 64.97 88.54 68.55 89.33 65.79 74.21 66.42 74.12 59.70



Systems ranking

Each row lists: submission code, technical report, official rank, then AUC and pAUC scores (in percent) per machine type (fan, pump, slider, valve, ToyCar, ToyConveyor), first on the evaluation dataset and then on the development dataset.
DCASE2020_baseline_task2_1 Koizumi2020 93 82.80 65.80 82.37 64.11 79.41 58.87 57.37 50.79 80.14 66.17 85.36 66.95 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Alam_CRIM_task2_1 Alam2020 90 78.02 54.83 73.77 59.70 88.94 67.43 85.55 85.21 75.74 63.43 61.22 50.57 71.84 51.51 78.12 67.90 90.68 78.94 96.46 89.86 80.49 63.77 59.40 50.60
Alam_CRIM_task2_2 Alam2020 65 47.20 50.02 73.77 59.70 95.56 80.36 85.55 85.21 81.32 65.50 89.73 71.86 71.90 58.77 78.12 67.90 93.91 84.31 96.46 89.86 81.21 70.10 74.56 62.62
Alam_CRIM_task2_3 Alam2020 75 47.20 50.02 73.77 59.70 94.68 76.68 85.55 85.21 80.67 66.01 87.51 69.31 71.90 58.77 78.12 67.90 93.91 84.31 96.46 89.86 81.21 70.10 74.56 62.62
Alam_CRIM_task2_4 Alam2020 54 67.67 52.98 83.28 63.16 95.26 78.75 87.13 81.50 82.44 66.71 90.44 73.54 72.47 52.68 78.33 67.74 93.78 84.10 95.40 86.97 82.95 69.71 74.80 62.42
Giri_Amazon_task2_1 Giri2020 2 94.08 82.94 93.73 82.31 97.42 88.91 95.92 91.54 94.81 90.54 91.27 73.41 82.39 78.23 87.64 82.37 97.09 88.03 98.46 94.87 95.57 91.54 81.46 66.62
Giri_Amazon_task2_2 Giri2020 1 94.54 84.30 93.65 81.73 97.63 89.73 96.13 90.89 94.34 89.73 91.19 73.34 82.33 78.97 86.94 79.60 97.28 89.54 97.38 91.21 95.04 90.39 80.67 65.90
Giri_Amazon_task2_3 Giri2020 3 94.54 84.55 94.62 85.87 97.89 90.97 95.61 89.96 94.55 91.12 90.46 71.67 82.75 79.72 86.73 79.60 97.62 89.70 99.07 96.20 94.64 89.48 80.53 65.58
Giri_Amazon_task2_4 Giri2020 38 84.52 59.33 88.07 67.47 95.18 78.42 84.23 63.17 83.64 73.38 90.97 73.80 70.10 53.62 75.68 68.97 93.29 83.46 89.68 70.95 80.51 71.89 76.03 60.70
Hayashi_HDL_task2_1 Hayashi2020 20 88.55 75.64 89.23 73.77 94.59 77.57 97.11 88.72 90.51 81.90 91.06 74.60 86.59 76.31 88.83 78.43 97.16 89.27 99.68 98.33 92.62 83.37 77.91 63.50
Hayashi_HDL_task2_2 Hayashi2020 25 90.67 79.89 88.66 71.51 93.34 75.54 95.00 79.10 83.50 67.45 92.03 78.24 84.54 72.43 88.47 78.60 95.80 86.19 99.03 95.24 80.84 62.79 78.01 63.07
Hayashi_HDL_task2_3 Hayashi2020 10 92.72 80.52 90.63 73.61 95.68 81.48 97.43 89.69 91.75 83.97 92.10 76.76 86.59 76.31 88.83 78.43 97.16 89.27 99.69 98.42 93.17 85.44 77.91 63.50
Hayashi_HDL_task2_4 Hayashi2020 14 91.85 77.90 90.63 73.61 95.67 81.06 97.28 88.87 92.24 84.54 93.08 78.26 87.95 79.33 90.29 82.08 97.37 89.41 99.82 99.05 93.50 85.44 79.06 64.75
Jiang_UESTC_task2_1 Jiang2020 67 85.43 69.50 85.83 67.29 81.73 60.29 64.32 51.43 82.53 65.96 89.73 72.76 71.44 57.48 75.99 63.67 83.81 64.50 69.98 51.10 82.92 69.36 71.60 58.97
Jiang_UESTC_task2_2 Jiang2020 60 85.86 68.69 85.87 65.02 81.84 59.76 69.70 52.57 84.61 69.85 90.18 74.44 69.58 57.52 73.15 62.80 85.19 71.60 67.95 51.68 81.12 70.34 73.06 59.77
Hoang_FPT_task2_1 Hoang2020 92 50.17 51.05 64.35 57.48 90.43 68.82 37.25 49.41 50.59 49.38 91.06 74.16 75.05 60.07 86.86 71.62 91.96 76.71 87.77 69.74 89.78 75.60 76.57 62.29
Hoang_FPT_task2_2 Hoang2020 71 82.72 66.87 76.38 67.32 90.43 68.82 37.25 49.41 73.74 57.30 91.06 74.16 74.99 60.97 83.02 71.85 91.96 76.71 87.77 69.74 86.42 70.40 76.57 62.29
Hoang_FPT_task2_3 Hoang2020 93 50.17 51.05 64.35 57.48 90.43 68.82 37.25 49.41 49.11 48.76 91.06 74.16 75.05 60.07 86.86 71.62 91.96 76.71 87.77 69.74 87.87 71.22 76.57 62.29
Hoang_FPT_task2_4 Hoang2020 112 50.17 51.05 64.35 57.48 78.15 58.62 37.25 49.41 49.11 48.76 47.67 49.39 75.05 60.07 86.86 71.62 91.74 74.52 87.77 69.74 87.87 71.22 76.31 54.41
Tian_BUPT_task2_1 Tian2020 103 48.35 49.33 47.70 50.88 49.20 49.43 50.23 50.32 82.55 63.44 61.83 54.20 68.54 63.09 99.99 99.97 79.62 78.39 82.18 71.45 83.18 71.40 76.51 59.86
Tian_BUPT_task2_2 Tian2020 101 49.36 49.82 50.02 49.38 46.51 50.16 49.33 49.96 82.55 63.44 61.83 54.20 59.10 72.42 100.00 100.00 49.31 66.93 58.88 60.74 81.56 70.06 73.30 60.74
Durkota_NSW_task2_1 Durkota2020 34 89.81 83.55 88.28 75.49 96.56 86.03 95.29 82.88 90.20 82.58 62.78 54.34 79.62 73.92 82.50 74.95 96.47 87.55 95.70 82.80 88.38 79.58 68.03 59.37
Durkota_NSW_task2_2 Durkota2020 29 91.35 83.60 90.50 80.77 96.11 86.64 88.79 85.56 93.91 86.04 68.33 56.19 76.35 68.75 77.70 68.58 98.10 91.70 95.50 88.15 93.25 86.58 64.63 55.73
Durkota_NSW_task2_3 Durkota2020 24 90.74 83.38 88.70 75.97 96.18 87.49 97.48 92.46 94.32 89.01 64.38 53.79 77.40 71.47 75.47 67.28 97.12 89.85 97.95 92.53 92.42 84.72 64.43 54.00
Bai_LFXS_task2_1 Bai2020 48 85.64 66.78 86.16 65.82 92.05 77.00 78.47 60.24 82.01 68.47 88.46 70.31 65.00 53.00 80.00 63.00 92.00 78.00 85.00 66.00 80.00 67.00 73.00 61.00
Bai_LFXS_task2_2 Bai2020 60 84.30 67.06 86.16 65.82 90.11 66.53 78.47 60.24 79.79 66.71 88.46 70.31 65.00 52.00 80.00 63.00 91.00 76.00 85.00 66.00 79.00 67.00 73.00 61.00
Bai_LFXS_task2_3 Bai2020 57 86.34 66.51 86.16 65.82 91.69 71.67 78.47 60.24 81.22 67.77 87.92 67.06 65.00 53.00 80.00 63.00 91.00 77.00 85.00 66.00 80.00 67.00 72.00 60.00
Bai_LFXS_task2_4 Bai2020 80 79.59 57.11 84.23 65.75 89.32 69.51 76.22 59.91 77.08 66.32 84.40 63.96 67.00 53.00 80.00 63.00 84.00 73.00 81.00 65.00 78.00 67.00 72.00 61.00
Ahmed_Mila_task2_1 Ahmed2020 85 91.62 76.99 77.63 68.37 79.49 66.00 72.50 71.89 65.74 50.44 53.41 49.89 83.98 72.01 87.46 79.50 89.38 78.22 74.07 59.87 87.18 74.33 74.84 61.27
Ahmed_Mila_task2_2 Ahmed2020 76 91.93 77.10 78.10 68.21 74.88 64.60 73.41 71.22 65.19 57.88 82.72 66.67 83.81 70.42 86.40 79.33 89.15 77.97 74.50 59.80 87.56 74.77 75.50 61.25
Ahmed_Mila_task2_3 Ahmed2020 88 90.93 78.28 73.29 65.08 74.85 64.62 60.61 52.77 68.68 59.39 73.69 62.04 75.36 66.94 81.32 66.66 89.18 71.34 71.48 52.10 87.49 74.73 74.41 61.25
Ahmed_Mila_task2_4 Ahmed2020 82 90.37 74.82 77.26 66.92 55.45 56.88 72.86 71.57 67.06 58.64 86.39 67.72 80.09 70.77 81.43 75.46 88.12 77.51 74.32 59.86 87.57 74.85 74.91 61.25
Chaudhary_NCS_task2_1 Chaudhary2020 69 86.82 66.51 84.36 64.50 84.55 59.97 73.39 53.13 83.11 65.79 88.50 72.01 69.95 56.21 75.08 63.28 87.68 67.29 80.72 54.07 84.05 70.16 74.53 60.86
Chaudhary_NCS_task2_2 Chaudhary2020 62 87.32 66.30 86.33 65.75 82.66 60.24 66.67 51.37 85.19 69.55 88.68 71.09 70.18 56.25 74.34 63.75 85.05 66.53 76.66 52.80 85.70 72.12 74.19 59.87
Wilkinghoff_FKIE_task2_1 Wilkinghoff2020 30 86.50 76.96 89.28 76.18 95.48 83.57 96.54 89.16 94.31 88.35 73.77 59.01 78.02 71.08 93.64 82.97 97.43 87.19 95.65 85.64 92.10 81.45 66.56 55.59
Wilkinghoff_FKIE_task2_2 Wilkinghoff2020 36 85.31 76.50 84.95 72.26 93.71 78.19 97.89 93.96 93.11 87.20 69.87 55.28 81.12 73.43 93.22 81.97 97.18 86.16 98.41 93.46 91.30 79.33 63.69 54.29
Wilkinghoff_FKIE_task2_3 Wilkinghoff2020 22 93.75 80.68 93.19 81.10 95.71 79.45 94.87 83.58 94.06 86.80 84.22 69.12 83.12 71.77 94.00 83.48 97.79 88.39 91.55 73.03 94.33 83.65 73.30 59.74
Wilkinghoff_FKIE_task2_4 Wilkinghoff2020 23 92.27 79.75 92.39 81.18 96.69 84.37 96.79 89.71 93.02 86.47 79.61 61.89 83.82 74.41 94.37 83.13 98.02 89.65 96.75 85.84 93.03 80.91 71.85 59.81
Daniluk_SRPOL_task2_1 Daniluk2020 41 88.18 75.93 84.32 72.04 89.93 73.72 75.14 58.21 88.08 78.27 83.45 75.67 80.94 66.55 85.26 74.35 95.64 90.74 91.54 77.10 93.39 85.46 83.08 68.98
Kapka_SRPOL_task2_2 Kapka2020 40 90.70 82.41 92.65 84.27 88.01 76.34 80.84 59.42 87.26 78.01 84.42 63.59 79.29 74.88 84.54 77.76 81.25 68.49 82.21 56.46 88.87 85.67 68.62 58.82
Kosmider_SRPOL_task2_3 Kosmider2020 21 97.35 91.07 85.98 78.92 98.05 91.60 89.20 76.50 88.66 71.56 89.57 75.58 93.52 85.10 95.87 89.53 97.36 94.61 97.95 91.59 96.73 89.30 87.23 72.59
Daniluk_SRPOL_task2_4 Daniluk2020 3 99.13 96.40 95.07 90.23 98.18 91.98 90.97 77.41 93.52 83.87 90.51 77.56 94.12 88.23 97.31 92.56 97.85 94.54 98.35 92.11 98.30 93.55 89.02 73.89
Xiao_THU_task2_1 Xiao2020 114 47.82 50.48 66.10 59.69 66.84 52.92 39.06 49.06 59.33 53.68 46.83 50.22 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Xiao_THU_task2_2 Xiao2020 102 52.16 62.20 61.12 56.64 76.53 65.53 55.68 51.87 48.42 48.52 47.44 49.02 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Xiao_THU_task2_3 Xiao2020 106 52.91 57.27 50.90 50.88 52.21 50.33 71.37 69.50 39.26 49.90 49.96 49.97 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Xiao_THU_task2_4 Xiao2020 98 74.02 63.68 51.91 49.85 55.88 55.72 92.54 84.44 75.85 66.54 52.93 53.02 65.83 52.45 72.89 59.99 84.76 66.53 66.28 50.98 78.77 67.58 72.53 60.43
Shinmura_JPN_task2_1 Shinmura2020 42 81.56 64.40 88.83 68.70 96.99 87.43 91.03 74.80 88.21 74.94 69.95 58.65 76.40 89.00 97.30 94.10 90.90 70.00
Grollmisch_IDMT_task2_1 Grollmisch2020 27 88.29 78.29 88.47 70.36 91.52 71.60 96.59 87.08 93.99 86.00 86.67 72.81 77.40 63.17 83.98 70.81 90.90 70.76 98.05 93.14 94.31 84.78 77.98 62.28
Grollmisch_IDMT_task2_2 Grollmisch2020 26 89.65 78.33 87.99 71.28 91.05 70.01 94.98 83.61 94.07 86.78 87.41 72.25 79.57 67.64 84.77 72.21 90.90 70.76 99.59 97.93 94.31 84.78 79.97 64.70
Haunschmid_CPJKU_task2_1 Haunschmid2020 50 87.24 69.08 85.24 65.88 91.76 70.17 79.28 58.11 81.36 66.71 90.37 73.39 66.30 53.11 73.65 60.18 91.44 78.71 85.24 59.08 79.49 68.60 73.58 61.31
Haunschmid_CPJKU_task2_2 Haunschmid2020 39 91.48 74.32 92.30 72.14 89.74 76.43 81.99 69.82 81.50 67.00 88.01 70.52 75.69 62.13 79.84 69.48 93.55 87.76 94.50 81.86 81.92 67.05 73.46 60.98
Haunschmid_CPJKU_task2_3 Haunschmid2020 45 90.83 73.23 93.04 72.27 90.02 77.68 84.40 70.62 80.77 65.50 86.84 68.83 75.00 60.87 78.83 68.73 93.57 88.65 94.56 81.37 79.99 66.05 73.87 61.59
Haunschmid_CPJKU_task2_4 Haunschmid2020 46 91.11 73.26 92.29 72.50 88.49 76.94 83.06 70.21 80.50 66.65 86.62 68.71 75.23 61.11 79.13 69.09 93.47 88.00 94.49 81.90 79.38 65.54 72.81 60.60
Zhou_PSH_task2_1 Zhou2020 15 99.70 98.57 95.54 89.92 99.78 98.84 90.79 83.05 94.47 89.64 69.42 61.26 88.12 83.12 91.59 81.52 99.99 99.95 92.98 84.56 93.99 89.23 64.83 57.23
Zhou_PSH_task2_2 Zhou2020 18 99.79 98.87 91.78 88.74 98.16 91.68 89.86 80.33 94.16 87.47 73.77 66.26 85.92 80.59 92.17 80.88 99.58 97.84 90.05 76.55 93.56 88.51 68.57 60.89
Zhou_PSH_task2_3 Zhou2020 12 99.79 98.92 95.79 92.60 99.84 99.17 91.83 84.74 95.60 91.30 73.61 64.06 88.73 82.29 91.84 81.62 100.00 99.98 92.91 82.70 94.64 89.29 65.21 57.35
Zhou_PSH_task2_4 Zhou2020 13 99.87 99.32 94.89 92.05 99.34 96.54 91.19 83.45 96.21 91.92 75.01 66.79 88.44 82.17 92.53 81.76 99.89 99.43 92.37 82.25 94.65 89.95 68.26 60.73
Wen_UESTC_task2_1 Wen2020 112 52.68 51.39 65.03 57.08 70.81 55.15 34.22 48.86 59.29 56.69 51.88 51.40 66.57 51.69 73.33 60.89 91.42 74.62 82.00 54.39 75.33 63.01 77.08 62.73
Wen_UESTC_task2_2 Wen2020 104 49.16 50.90 62.76 59.02 66.23 52.06 41.00 49.63 58.31 54.25 55.06 52.35 69.00 53.20 75.23 61.20 80.68 57.92 67.25 51.00 82.92 69.05 79.41 64.84
Wen_UESTC_task2_3 Wen2020 109 49.16 50.90 62.76 59.02 70.81 55.15 34.22 48.86 58.31 54.25 55.06 52.35 69.00 53.20 75.23 61.20 91.42 74.62 82.00 54.39 82.92 69.05 79.41 64.84
Chen_UESTC_task2_1 Chen2020 106 54.70 52.57 64.65 57.20 71.97 56.91 35.14 49.00 56.96 53.52 48.27 51.53 69.56 51.40 74.53 59.72 90.32 72.09 81.17 53.65 81.62 66.97 77.59 62.51
Chen_UESTC_task2_2 Chen2020 110 50.34 51.64 64.65 57.20 71.41 56.81 33.41 48.93 59.90 54.05 48.27 51.53 67.93 51.42 74.53 59.72 90.22 72.40 80.39 53.72 76.07 64.81 77.58 62.51
Chen_UESTC_task2_3 Chen2020 105 54.70 52.57 64.65 57.20 71.97 56.91 35.14 49.00 59.61 51.54 48.27 51.53 69.56 51.40 74.53 59.72 90.32 72.09 81.17 53.65 79.37 66.93 77.59 62.51
Shao_ELCN_task2_1 Shao2020 79 85.44 65.88 78.12 64.22 83.64 65.14 67.92 52.02 81.18 67.03 86.35 70.93 66.22 52.43 74.77 61.04 91.56 82.24 81.55 55.72 79.01 66.32 74.95 62.50
Zhao_TAU_task2_1 Zhao2020 68 88.38 68.27 86.54 68.14 81.02 57.14 84.33 66.66 81.29 65.64 81.63 69.01 80.70 61.10 86.50 69.00 85.60 62.20 81.50 54.00 89.30 76.70 74.90 60.70
Zhao_TAU_task2_2 Zhao2020 53 88.85 68.49 86.94 68.22 81.02 57.14 84.33 66.66 85.44 69.06 85.04 71.98 81.70 61.30 85.80 67.60 85.60 62.20 81.50 54.00 91.60 79.50 77.70 62.50
Zhao_TAU_task2_3 Zhao2020 43 88.85 68.49 86.94 68.22 89.10 67.46 89.67 76.12 85.44 69.06 85.04 71.98 81.70 61.30 85.80 67.60 92.80 78.30 91.30 74.10 91.60 79.50 77.70 62.50
Sakamoto_fixstars_task2_1 Sakamoto2020 27 96.63 84.34 89.07 71.88 78.09 60.49 92.92 85.44 84.71 70.89 90.82 78.31 82.05 67.18 82.66 69.30 94.39 79.60 92.84 83.71 92.22 77.52 82.26 65.13
Sakamoto_fixstars_task2_2 Sakamoto2020 52 94.91 78.53 86.16 63.03 86.27 61.65 91.76 83.85 70.26 63.41 86.54 72.20 78.95 61.21 81.06 65.36 92.34 73.88 92.09 81.43 90.13 77.34 79.53 63.40
Sakamoto_fixstars_task2_3 Sakamoto2020 31 96.65 84.31 88.69 71.69 74.44 58.95 92.86 85.69 90.12 72.75 90.88 78.07 82.46 67.24 83.38 68.77 93.82 75.22 92.68 83.48 92.31 77.52 82.67 65.55
Sakamoto_fixstars_task2_4 Sakamoto2020 37 96.21 83.18 84.84 70.74 90.26 71.23 92.29 79.60 77.17 65.31 89.06 72.53 82.61 65.67 82.77 68.65 95.89 82.00 91.22 77.45 90.05 75.90 79.96 63.40
Naranjo-Alcazar_Vfy_task2_1 Naranjo-Alcazar2020 117 46.07 50.65 47.94 48.87 55.11 50.79 46.54 50.60 57.33 50.05 52.82 51.11 79.87 70.78 81.51 70.99 80.86 70.69 82.85 71.62 95.67 87.14 96.63 90.45
Naranjo-Alcazar_Vfy_task2_2 Naranjo-Alcazar2020 116 49.09 49.94 48.38 49.16 56.04 51.37 46.62 50.38 67.80 53.44 52.97 51.26 80.40 72.56 82.61 72.33 81.16 69.94 83.19 72.34 91.12 73.41 93.36 80.32
Naranjo-Alcazar_Vfy_task2_3 Naranjo-Alcazar2020 114 49.05 51.18 46.97 49.89 54.29 51.03 46.47 50.62 59.81 50.51 53.89 51.57 78.63 71.26 80.33 70.94 78.94 70.08 80.94 70.83 87.27 74.21 90.35 81.50
Jalali_AIT_task2_1 Jalali2020 110 49.37 51.12 61.06 58.17 71.78 55.87 39.76 49.28 60.18 55.07 56.61 51.28 67.32 52.05 73.94 61.01 84.99 67.47 67.82 51.07 75.63 66.39 70.80 57.63
Primus_CP-JKU_task2_1 Primus2020 6 96.84 95.24 97.76 92.24 98.23 91.97 90.15 86.65 88.72 85.32 86.45 77.45 92.27 82.30 92.98 87.23 98.95 94.54 94.25 89.04 94.90 87.52 83.76 72.80
Primus_CP-JKU_task2_2 Primus2020 5 96.84 95.24 97.76 92.24 97.29 88.74 90.15 86.65 86.37 83.83 88.28 79.15 92.86 83.53 92.98 87.23 98.95 94.54 97.77 93.57 95.67 89.62 85.27 72.60
Primus_CP-JKU_task2_3 Primus2020 10 97.86 94.77 97.57 92.38 97.38 88.92 90.70 85.41 86.67 85.16 87.51 77.78 92.82 82.84 92.10 87.06 98.59 92.68 96.96 91.18 95.47 89.14 85.15 73.75
Primus_CP-JKU_task2_4 Primus2020 9 97.26 94.87 97.67 92.50 97.15 87.48 91.29 85.77 87.12 85.44 86.88 76.90 92.30 82.85 91.47 86.78 98.23 91.09 93.83 87.93 95.05 88.90 82.54 70.27
Wei_Kuaiyu_task2_1 Wei2020 73 86.71 80.11 82.02 70.37 87.39 71.88 85.19 74.79 60.10 51.02 60.42 52.85 66.53 60.84 87.89 76.24 87.36 77.16 72.74 61.64 67.56 55.82 60.24 52.17
Wei_Kuaiyu_task2_2 Wei2020 97 71.05 64.67 75.52 63.21 83.61 64.91 81.15 70.18 57.45 51.80 60.78 52.23 66.87 64.10 81.86 69.57 86.44 69.47 69.18 54.97 72.11 58.86 59.61 51.60
Wei_Kuaiyu_task2_3 Wei2020 66 86.20 63.42 86.51 65.78 84.74 60.69 66.12 51.10 80.49 62.39 91.40 75.47 66.02 61.68 74.53 61.68 88.76 68.92 77.26 52.05 79.96 66.25 77.86 63.17
Wei_Kuaiyu_task2_4 Wei2020 77 81.59 56.76 86.27 64.58 87.33 62.60 65.62 51.33 80.34 64.75 90.39 74.96 67.79 52.25 73.55 62.07 83.10 70.64 75.69 52.65 74.69 61.77 76.07 62.95
Morita_SECOM_task2_1 Morita2020 49 89.57 74.40 84.53 64.52 81.74 60.33 62.93 52.36 93.91 87.15 88.74 74.72 81.30 63.96 80.36 65.77 80.96 63.30 69.75 50.39 95.21 87.53 77.33 61.11
Morita_SECOM_task2_2 Morita2020 91 78.15 60.44 63.96 54.32 92.00 74.62 83.95 68.57 74.30 62.37 60.31 51.35 66.07 50.70 64.70 59.16 91.19 79.88 88.85 73.50 72.88 59.31 56.82 51.97
Morita_SECOM_task2_3 Morita2020 33 90.12 76.19 85.62 65.02 92.00 74.62 89.24 76.74 93.42 86.61 92.34 75.72 82.50 64.81 81.04 65.77 91.19 79.88 91.52 79.06 95.69 88.72 79.66 64.90
Uchikoshi_JRI_task2_1 Uchikoshi2020 100 64.63 55.36 58.84 52.81 59.71 50.83 73.38 65.57 63.97 59.18 53.00 52.55 60.29 51.72 62.53 54.90 67.37 53.59 72.84 59.56 61.68 54.56 56.61 53.11
Uchikoshi_JRI_task2_2 Uchikoshi2020 78 85.57 67.64 84.34 64.65 81.66 59.01 59.12 51.08 82.64 68.56 88.30 70.40 74.94 59.64
Uchikoshi_JRI_task2_3 Uchikoshi2020 93 80.66 66.25 78.07 62.99 78.59 58.09 69.95 52.89 81.19 64.66 87.24 66.04 81.93 56.68
Park_LGE_task2_1 Park2020 106 52.43 51.25 64.58 58.50 71.71 55.54 36.68 49.03 59.98 54.72 50.04 51.37 69.02 53.66 73.59 61.01 85.48 67.03 65.69 50.77 79.40 68.37 74.60 62.02
Park_LGE_task2_2 Park2020 82 76.55 55.22 85.11 63.34 91.39 68.48 75.41 53.07 78.40 60.61 86.86 67.40 64.17 51.86 74.50 62.35 92.35 76.92 82.87 56.11 78.76 63.81 71.03 60.12
Park_LGE_task2_3 Park2020 85 79.32 56.08 82.59 61.53 83.59 59.91 61.07 50.93 81.26 66.02 89.86 73.39 67.29 52.55 73.14 60.83 87.22 69.62 72.16 51.50 80.78 68.54 74.57 61.18
Park_LGE_task2_4 Park2020 58 82.30 59.97 84.38 64.23 96.39 83.58 83.86 61.99 81.40 66.37 86.41 71.92 70.77 54.50 76.21 62.06 94.16 83.97 89.67 72.85 82.73 70.35 76.61 64.01
Vinayavekhin_IBM_task2_1 Vinayavekhin2020 8 98.83 94.94 94.61 89.51 95.89 83.62 97.69 94.74 93.80 88.26 87.32 71.89 89.05 80.98 93.32 82.95 99.50 97.35 99.77 98.77 95.66 88.13 81.71 67.68
Vinayavekhin_IBM_task2_2 Vinayavekhin2020 7 98.84 94.89 94.37 88.27 95.68 83.09 97.82 94.93 93.16 87.69 87.41 72.03 88.73 79.82 93.20 82.52 99.47 97.20 99.77 98.79 95.74 88.15 81.60 67.71
Vinayavekhin_IBM_task2_3 Vinayavekhin2020 17 98.98 95.49 93.87 87.95 92.63 77.47 98.02 95.39 91.06 83.92 79.88 66.74 89.13 81.49 91.60 80.83 99.31 96.48 99.53 98.65 92.48 77.68 76.90 63.48
Vinayavekhin_IBM_task2_4 Vinayavekhin2020 16 91.35 84.01 92.95 83.75 96.29 84.86 96.07 92.14 93.06 88.09 85.82 69.61 81.27 71.94 90.44 79.24 98.08 90.76 98.81 94.98 91.37 87.77 79.45 66.81
Lapin_BMSTU_task2_1 Lapin2020 99 51.35 50.97 64.67 57.88 67.11 56.01 73.80 73.42 57.61 52.55 55.71 52.08 68.75 54.61 76.00 61.11 90.78 72.80 84.90 63.78 70.29 57.15 75.57 60.31
He_THU_task2_1 Wang2020 74 79.94 56.72 81.03 63.07 84.74 60.47 77.22 55.87 82.46 71.10 88.92 71.30 68.25 53.21 72.93 61.52 82.04 67.58 84.19 62.47 80.61 72.55 74.00 61.81
Zhang_NJUPT_task2_1 Zhang2020 87 86.47 70.40 86.21 65.60 76.01 55.94 31.76 49.36 82.55 63.44 87.25 68.96 70.61 76.40 81.19 72.30 80.25 71.75
Zhang_NJUPT_task2_2 Zhang2020 82 86.47 70.40 86.21 65.60 76.01 55.94 46.07 49.61 82.55 63.44 87.25 68.96 70.61 76.40 81.19 72.30 80.25 71.75
Kaltampanidis_AUTH_task2_1 Kaltampanidis2020 70 80.45 73.99 75.01 66.19 76.56 62.62 82.90 78.87 83.87 71.11 71.63 60.40 85.57 78.68 77.33 73.69 83.03 69.59 87.24 76.48 82.11 72.35 70.35 59.92
Tiwari_IITKGP_task2_1 Tiwari2020 96 81.04 67.26 79.45 62.63 66.73 50.04 87.86 71.93 76.83 60.87 59.43 52.42 73.23 58.29 78.26 63.15 71.93 52.63 79.65 53.86 83.16 66.75 56.13 52.20
Tiwari_IITKGP_task2_2 Tiwari2020 89 77.77 54.38 73.85 59.79 88.25 67.06 85.61 85.32 76.96 64.64 61.25 50.58 72.11 51.74 78.72 68.36 90.28 79.36 96.84 90.81 80.96 64.24 58.57 50.63
Tiwari_IITKGP_task2_3 Tiwari2020 81 83.47 63.90 80.43 60.60 83.78 57.07 88.49 84.42 81.49 67.18 63.09 52.06 76.89 53.84 82.60 69.65 86.00 69.57 94.59 82.21 87.00 71.92 61.78 51.58
Tiwari_IITKGP_task2_4 Tiwari2020 59 81.04 67.26 80.43 60.60 88.25 67.06 85.61 85.32 81.49 67.18 88.48 70.45 73.23 58.29 82.60 69.65 90.28 79.36 96.84 90.81 87.00 71.92 72.53 60.43
Pilastri_CCG_task2_1 Ribeiro2020 71 81.88 57.47 82.33 62.06 84.04 60.08 61.80 50.70 82.45 67.30 90.40 76.32 72.03 53.25 73.06 60.94 87.08 68.10 72.16 51.17 80.79 71.17 76.43 63.79
Pilastri_CCG_task2_2 Ribeiro2020 63 79.45 56.36 84.60 62.23 88.59 63.24 69.41 51.59 81.06 71.59 91.57 75.97 66.78 52.63 72.07 60.96 91.77 76.20 78.83 53.10 78.04 69.12 75.93 60.03
Pilastri_CCG_task2_3 Ribeiro2020 63 81.88 57.47 82.33 62.06 88.59 63.24 69.41 51.59 82.45 67.30 90.40 76.32 72.03 53.25 73.06 60.94 91.77 76.20 78.83 53.10 80.79 71.17 76.43 63.79
Lopez_IL_task2_1 Lopez2020 18 93.09 90.67 93.98 90.72 98.88 95.38 96.80 90.61 86.59 81.85 71.21 61.41 88.23 80.57 93.21 86.19 99.97 99.82 99.89 99.41 95.73 90.32 74.17 65.86
Agrawal_mSense_task2_1 Agrawal2020 44 93.64 76.96 86.28 65.60 82.90 58.61 73.36 55.48 88.63 74.37 92.45 80.52 84.89 67.84 84.93 66.57 86.66 67.74 85.54 59.39 95.64 85.99 86.52 70.81
Agrawal_mSense_task2_2 Agrawal2020 35 96.80 84.56 89.54 70.89 89.20 66.00 74.90 57.94 86.51 67.92 91.60 79.53 86.06 69.37 87.86 70.43 92.36 76.11 86.00 68.33 93.46 80.52 84.21 69.31
Agrawal_mSense_task2_3 Agrawal2020 32 95.84 82.45 89.73 69.19 89.88 65.86 78.71 61.26 90.14 74.47 92.62 80.56 86.70 70.58 88.70 72.04 91.15 73.19 88.60 75.34 92.54 79.31 86.00 69.72
Agrawal_mSense_task2_4 Agrawal2020 55 87.93 67.84 81.34 64.81 90.04 65.97 73.58 57.78 84.15 64.80 92.01 79.65 79.29 60.27 82.58 64.53 91.24 74.49 84.50 64.90 92.20 78.81 82.90 67.46
Phan_UIUC_task2_1 Phan2020 56 88.84 73.07 86.60 67.64 87.38 64.12 81.37 58.74 81.29 76.53 87.77 70.14 74.56 59.43 74.09 65.72 89.07 68.72 89.31 67.19 74.54 66.66 73.58 59.44
Phan_UIUC_task2_2 Phan2020 47 88.92 72.67 87.27 67.68 87.23 64.45 82.39 59.43 82.65 77.16 87.43 69.68 73.87 59.25 73.59 64.97 88.54 68.55 89.33 65.79 74.21 66.42 74.12 59.70
Phan_UIUC_task2_3 Phan2020 51 89.04 72.06 86.91 67.91 87.48 64.54 81.81 59.20 80.48 76.80 88.12 70.91 74.76 59.21 73.17 63.70 89.38 70.04 89.97 67.73 74.94 67.04 74.03 59.79



System characteristics

Summary of the submitted system characteristics.
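Many submissions are ensembles; the decision-making column records how the subsystem anomaly scores are merged (maximum, average, weighted average, or geometric mean). A small sketch of these combination rules, illustrative rather than any team's exact code:

```python
def ensemble_score(scores, mode="average", weights=None):
    # Combine per-subsystem anomaly scores into a single decision score.
    if mode == "maximum":
        return max(scores)
    if mode == "average":
        return sum(scores) / len(scores)
    if mode == "weighted average":
        # weights are assumed to be supplied, one per subsystem.
        return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    if mode == "geometric mean":
        p = 1.0
        for s in scores:
            p *= s
        return p ** (1.0 / len(scores))
    raise ValueError("unknown mode: " + mode)
```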

Each row lists: official rank, submission code, technical report, classifier, system complexity (number of parameters), acoustic feature, data augmentation, decision making, system embeddings, subsystem count, external data usage, and front-end system. Fields that do not apply to a submission are left blank.
93 DCASE2020_baseline_task2_1 Koizumi2020 AE 269992 log-mel energies
90 Alam_CRIM_task2_1 Alam2020 AE 269992 modulation spectrogram
65 Alam_CRIM_task2_2 Alam2020 AE, VAE, ensemble 3547000 log-mel energies, LFCC, modulation spectrogram maximum 5
75 Alam_CRIM_task2_3 Alam2020 AE, VAE, ensemble 3547000 log-mel energies, LFCC, modulation spectrogram maximum 5
54 Alam_CRIM_task2_4 Alam2020 AE, VAE, CVAE, ensemble 8461000 log-mel energies, LFCC, modulation spectrogram, PSCC maximum, average 10
2 Giri_Amazon_task2_1 Giri2020 IDNN/IAE, MobileNetV2, ResNet50, ensemble 73450548 log-mel energies mixup, spectrogram warping average, maximum 7
1 Giri_Amazon_task2_2 Giri2020 IDNN/IAE, MobileNetV2, ensemble 2779530 log-mel energies mixup, spectrogram warping average 4
3 Giri_Amazon_task2_3 Giri2020 IDNN/IAE, MobileNetV2, ArcFace, ensemble 3494426 log-mel energies mixup, spectrogram warping average, maximum 3
38 Giri_Amazon_task2_4 Giri2020 IDNN/IAE 663552 log-mel energies
20 Hayashi_HDL_task2_1 Hayashi2020 AE, Conformer, GMM, ID regression 5714035 log-mel energies
25 Hayashi_HDL_task2_2 Hayashi2020 AE, Conformer, GMM, ID embedding 5230130 log-mel energies
10 Hayashi_HDL_task2_3 Hayashi2020 AE, Conformer, Transformer, GMM, ID regression, ID embedding, ensemble 30463585 log-mel energies weighted average 6
14 Hayashi_HDL_task2_4 Hayashi2020 AE, Conformer, Transformer, GMM, ID regression, ID embedding, ensemble 47264120 log-mel energies weighted average 10
67 Jiang_UESTC_task2_1 Jiang2020 AE 272080 log-mel energies
60 Jiang_UESTC_task2_2 Jiang2020 VAE, AE, GMM 538976 log-mel energies
92 Hoang_FPT_task2_1 Hoang2020 AE, U-Net AE, IDNN/IAE, LSTM AE 3570807 MFCC, log-mel energies, STFT, chroma features, spectral contrast, tonnetz
71 Hoang_FPT_task2_2 Hoang2020 AE, U-Net AE, IDNN/IAE, LSTM AE 3570807 MFCC, log-mel energies, STFT, chroma features, spectral contrast, tonnetz
93 Hoang_FPT_task2_3 Hoang2020 AE, U-Net AE, IDNN/IAE 3026421 MFCC, log-mel energies, STFT, chroma features, spectral contrast, tonnetz
112 Hoang_FPT_task2_4 Hoang2020 U-Net AE, IDNN/IAE 1884328 MFCC, log-mel energies, STFT, chroma features, spectral contrast, tonnetz
103 Tian_BUPT_task2_1 Tian2020 VAE 225216 log-mel energies
101 Tian_BUPT_task2_2 Tian2020 VAE 225216 log-mel energies
34 Durkota_NSW_task2_1 Durkota2020 KNN 2672449 spectrogram Siamese Network
29 Durkota_NSW_task2_2 Durkota2020 KNN 2672449 spectrogram Siamese Network
24 Durkota_NSW_task2_3 Durkota2020 KNN 3292033 spectrogram Siamese Network
48 Bai_LFXS_task2_1 Bai2020 AE, ensemble 269992 log-mel energies, HPSS_H, HPSS_P 3 HPSS
60 Bai_LFXS_task2_2 Bai2020 AE, ensemble 269992 log-mel energies, log-spectrogram, MFCC, HPSS_H, HPSS_P 5 HPSS
57 Bai_LFXS_task2_3 Bai2020 AE, ensemble 269992 log-mel energies, log-spectrogram, MFCC, HPSS_H, HPSS_P 5 HPSS
80 Bai_LFXS_task2_4 Bai2020 AE, ensemble 269992 log-mel energies, HPSS_H, HPSS_P 3 HPSS
85 Ahmed_Mila_task2_1 Ahmed2020 AE, ResNet classifier, phase-shift prediction, GMM, ensemble 9000000 log-mel energies weighted average 13
76 Ahmed_Mila_task2_2 Ahmed2020 AE, ResNet classifier, phase-shift prediction, GMM, ensemble 9000000 log-mel energies weighted average 13
88 Ahmed_Mila_task2_3 Ahmed2020 AE, ResNet classifier, ensemble 9000000 log-mel energies weighted average 11
82 Ahmed_Mila_task2_4 Ahmed2020 AE, ResNet classifier, phase-shift prediction, GMM, ensemble 9000000 log-mel energies weighted average 13
69 Chaudhary_NCS_task2_1 Chaudhary2020 CNN, AE 9627136 log-mel energies, spectrogram
62 Chaudhary_NCS_task2_2 Chaudhary2020 CNN, AE, ensemble 9627136 log-mel energies, spectrogram geometric mean 4
30 Wilkinghoff_FKIE_task2_1 Wilkinghoff2020 CNN, PCA, RLDA, PLDA 10635245 log-mel energies manifold mixup OpenL3 embeddings
36 Wilkinghoff_FKIE_task2_2 Wilkinghoff2020 CNN, PCA, RLDA, PLDA 10635245 log-mel energies manifold mixup OpenL3 embeddings
22 Wilkinghoff_FKIE_task2_3 Wilkinghoff2020 CNN, AE, ensemble, PCA, RLDA, PLDA 10905237 log-mel energies manifold mixup concatenation OpenL3 2 embeddings
23 Wilkinghoff_FKIE_task2_4 Wilkinghoff2020 CNN, AE, ensemble, PCA, RLDA, PLDA 10905237 log-mel energies manifold mixup concatenation OpenL3 2 embeddings
41 Daniluk_SRPOL_task2_1 Daniluk2020 Heteroskedastic VAE, ensemble 96600000 log-mel energies, spectrogram average OpenL3 10 Audioset U-Net-based noise reduction
40 Kapka_SRPOL_task2_2 Kapka2020 CAE 2372832 log-mel energies
21 Kosmider_SRPOL_task2_3 Kosmider2020 CNN, ensemble 60941952 log-mel energies, mel energies, sqrt-mel energies average 53
3 Daniluk_SRPOL_task2_4 Daniluk2020 VAE, CAE, CNN, ensemble 179000000 log-mel energies, spectrogram, mel energies, sqrt-mel energies weighted average OpenL3 3 Audioset U-Net-based noise reduction
114 Xiao_THU_task2_1 Xiao2020 AE, CNN 469342 log-mel energies, raw waveform
102 Xiao_THU_task2_2 Xiao2020 AE, CNN 469342 log-mel energies, raw waveform
106 Xiao_THU_task2_3 Xiao2020 AE, CNN 469342 log-mel energies, raw waveform
98 Xiao_THU_task2_4 Xiao2020 AE, CNN 469342 log-mel energies, raw waveform
42 Shinmura_JPN_task2_1 Shinmura2020 MobileNetV2, ensemble, ArcFace 705936 spectrogram average 10
27 Grollmisch_IDMT_task2_1 Grollmisch2020 IDNN/IAE, GMM, PCA 4938007 log-mel energies OpenL3 embeddings
26 Grollmisch_IDMT_task2_2 Grollmisch2020 IDNN/IAE, GMM, PCA 4938007 log-mel energies OpenL3 embeddings
50 Haunschmid_CPJKU_task2_1 Haunschmid2020 AE 269992 log-mel energies
39 Haunschmid_CPJKU_task2_2 Haunschmid2020 normalizing flow 29720576 log-mel energies
45 Haunschmid_CPJKU_task2_3 Haunschmid2020 normalizing flow 14868480 log-mel energies
46 Haunschmid_CPJKU_task2_4 Haunschmid2020 normalizing flow 16005120 log-mel energies
15 Zhou_PSH_task2_1 Zhou2020 CNN, ArcFace 1021632 spectrogram
18 Zhou_PSH_task2_2 Zhou2020 CNN, ArcFace 3535872 spectrogram
12 Zhou_PSH_task2_3 Zhou2020 CNN, ArcFace 1021632 spectrogram
13 Zhou_PSH_task2_4 Zhou2020 CNN, ArcFace, ensemble 4557504 spectrogram average 2
112 Wen_UESTC_task2_1 Wen2020 AE, VAE, GMM 2282144 log-mel energies simulation of anomalous samples
104 Wen_UESTC_task2_2 Wen2020 AE, VAE, GMM, CNN 3714193 log-mel energies simulation of anomalous samples
109 Wen_UESTC_task2_3 Wen2020 AE, VAE, GMM, CNN 5996337 log-mel energies simulation of anomalous samples
106 Chen_UESTC_task2_1 Chen2020 AE 16894776 log-mel energies time stretching, time shifting, adding white noise
110 Chen_UESTC_task2_2 Chen2020 AE 4531616 log-mel energies
105 Chen_UESTC_task2_3 Chen2020 AE 13578384 log-mel energies time stretching, time shifting, adding white noise
79 Shao_ELCN_task2_1 Shao2020 AE 423279 log-mel energies, spectrogram, raw waveform VGGish, OpenL3 simulation of anomalous samples, features extraction filter
68 Zhao_TAU_task2_1 Zhao2020 None MFCC KL divergence
53 Zhao_TAU_task2_2 Zhao2020 None MFCC KL divergence
43 Zhao_TAU_task2_3 Zhao2020 None MFCC KL divergence
27 Sakamoto_fixstars_task2_1 Sakamoto2020 mahalanobis distance, subspace distance k-nearest neighbor, matrix normal distribution, ensemble 141728 log-mel energies mahalanobis distance 3
52 Sakamoto_fixstars_task2_2 Sakamoto2020 mahalanobis distance, subspace distance k-nearest neighbor, matrix normal distribution, ensemble 141728 log-mel energies mahalanobis distance 3
31 Sakamoto_fixstars_task2_3 Sakamoto2020 mahalanobis distance, subspace distance k-nearest neighbor, matrix normal distribution, ensemble 16512 log-mel energies mahalanobis distance 2
37 Sakamoto_fixstars_task2_4 Sakamoto2020 mahalanobis distance, subspace distance k-nearest neighbor, matrix normal distribution, ensemble 222912 log-mel energies mahalanobis distance 2
117 Naranjo-Alcazar_Vfy_task2_1 Naranjo-Alcazar2020 AE 8851009 gammatone
116 Naranjo-Alcazar_Vfy_task2_2 Naranjo-Alcazar2020 AE 8851009 gammatone
114 Naranjo-Alcazar_Vfy_task2_3 Naranjo-Alcazar2020 semi-supervised AE 8851009 gammatone
110 Jalali_AIT_task2_1 Jalali2020 LSTM AE 755776 log-mel energies
6 Primus_CP-JKU_task2_1 Primus2020 CNN 1000000 log-mel energies
5 Primus_CP-JKU_task2_2 Primus2020 CNN 12000000 log-mel energies
10 Primus_CP-JKU_task2_3 Primus2020 CNN 59000000 log-mel energies median 5
9 Primus_CP-JKU_task2_4 Primus2020 CNN 136000000 log-mel energies median 13
73 Wei_Kuaiyu_task2_1 Wei2020 MobileNetV2, L2-softmax 706224 log-mel energies
97 Wei_Kuaiyu_task2_2 Wei2020 MobileNetV2, L2-softmax 706224 log-mel energies
66 Wei_Kuaiyu_task2_3 Wei2020 AE 2710992 log-mel energies
77 Wei_Kuaiyu_task2_4 Wei2020 AE 2710992 log-mel energies
49 Morita_SECOM_task2_1 Morita2020 LOF 8120000 log-mel energies
91 Morita_SECOM_task2_2 Morita2020 GMM 207360 log-mel energies
33 Morita_SECOM_task2_3 Morita2020 LOF, GMM 8120000 log-mel energies
100 Uchikoshi_JRI_task2_1 Uchikoshi2020 CNN, KNN, GMM 74346100 log-mel energies random noise
78 Uchikoshi_JRI_task2_2 Uchikoshi2020 CNN, KNN, GMM, AE, ensemble 74616092 log-mel energies random noise weighted average 2
93 Uchikoshi_JRI_task2_3 Uchikoshi2020 CNN, KNN, GMM, AE, ensemble 74616092 log-mel energies random noise weighted average 2
106 Park_LGE_task2_1 Park2020 AE 269992 log-mel energies latent space sampling
82 Park_LGE_task2_2 Park2020 AE 4206173 spectrogram
85 Park_LGE_task2_3 Park2020 AE 4206173 log-mel energies, median-filtered spectrogram median-filter
58 Park_LGE_task2_4 Park2020 AE 3523357 spectrogram
8 Vinayavekhin_IBM_task2_1 Vinayavekhin2020 ensemble, CNN 813943 log-mel energies time stretching, pitch shifting probability aggregation 2
7 Vinayavekhin_IBM_task2_2 Vinayavekhin2020 ensemble, CNN 813943 log-mel energies time stretching, pitch shifting probability aggregation 2
17 Vinayavekhin_IBM_task2_3 Vinayavekhin2020 CNN 418162 log-mel energies
16 Vinayavekhin_IBM_task2_4 Vinayavekhin2020 CNN 395781 log-mel energies time stretching, pitch shifting
99 Lapin_BMSTU_task2_1 Lapin2020 AE, ensemble 2282272 log-mel energies, pseudo wigner ville maximum 2
74 He_THU_task2_1 Wang2020 IDNN/IAE 187560 log-mel energies
87 Zhang_NJUPT_task2_1 Zhang2020 AE, dictionary learning, OCSVM 27144 logfbank, variance, square root amplitude, kurtosis index, crook index, slope, effective value, pulse index, waveform index, kurtosis, margin index, root mean square frequency, mean square frequency, fourier sum of squares, frequency variance center of gravity, center of gravity, frequency standard deviation
82 Zhang_NJUPT_task2_2 Zhang2020 AE, dictionary learning, OCSVM 27144 logfbank, variance, square root amplitude, kurtosis index, crook index, slope, effective value, pulse index, waveform index, kurtosis, margin index, root mean square frequency, mean square frequency, fourier sum of squares, frequency variance center of gravity, center of gravity, frequency standard deviation
70 Kaltampanidis_AUTH_task2_1 Kaltampanidis2020 ProtoPNet 13128 spectrogram
96 Tiwari_IITKGP_task2_1 Tiwari2020 GMM 141458 MFCC i-vectors
89 Tiwari_IITKGP_task2_2 Tiwari2020 graph clustering, KNN 0 modulation spectrogram
81 Tiwari_IITKGP_task2_3 Tiwari2020 graph clustering, KNN, GMM, ensemble 141458 modulation spectrogram, MFCC i-vectors geometric mean 2
59 Tiwari_IITKGP_task2_4 Tiwari2020 graph clustering, KNN, GMM, AE, ensemble 411450 modulation spectrogram, MFCC i-vectors, log-mel energies geometric mean, maximum 3
71 Pilastri_CCG_task2_1 Ribeiro2020 AE 2257576 log-mel energies
63 Pilastri_CCG_task2_2 Ribeiro2020 CNN AE 4133673 log-mel energies
63 Pilastri_CCG_task2_3 Ribeiro2020 AE 2882992 log-mel energies
18 Lopez_IL_task2_1 Lopez2020 CNN, stats pooling, large margin cosine distance 1826030 log-mel energies, spectrogram weighted linear interpolation
44 Agrawal_mSense_task2_1 Agrawal2020 AE 784729 log-mel energies
35 Agrawal_mSense_task2_2 Agrawal2020 AE 870841 log-mel energies
32 Agrawal_mSense_task2_3 Agrawal2020 AE 870841 log-mel energies
55 Agrawal_mSense_task2_4 Agrawal2020 AE 864257 log-mel energies
56 Phan_UIUC_task2_1 Phan2020 IDNN/IAE 249664 log-mel energies
47 Phan_UIUC_task2_2 Phan2020 IDNN/IAE 249664 log-mel energies
51 Phan_UIUC_task2_3 Phan2020 IDNN/IAE 249664 log-mel energies



Technical reports

Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring

Vipin Agrawal (mSense, Inc)*; Shiv Shankar Maurya (mSense, Inc)
mSense Inc, CA, USA and mSense Inc, Bangalore, India

Abstract

Autoencoders are a popular approach to detecting anomalies in a system, where the reconstruction error is generally used as the anomaly score. However, reconstruction errors generated in this manner contain the system's external noise, making them less effective as anomaly scores. In this brief, we present the additional hypothesis that autoencoders may also introduce statistical noise into the reconstruction errors. Our proposal includes the design of an autoencoder, lays out a theoretical basis for designing a noise filter for reconstruction errors, and outlines various aggregation methods to reduce the effect of the noise. While further work is still needed, we show accuracy improvements from using these aggregation methods.
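As a rough illustration of the aggregation idea (not the report's specific filter design, which is not reproduced here), frame-wise reconstruction errors can be pooled into a clip-level anomaly score in several ways:

```python
import numpy as np

def clip_anomaly_score(frame_errors, method="mean"):
    """Pool per-frame reconstruction errors into one clip-level score."""
    e = np.asarray(frame_errors, dtype=float)
    if method == "mean":              # plain average over the clip
        return float(e.mean())
    if method == "median":            # robust to a few noisy frames
        return float(np.median(e))
    if method == "topk":              # mean of the largest 10% of errors:
        k = max(1, len(e) // 10)      # emphasises short anomalous bursts
        return float(np.sort(e)[-k:].mean())
    raise ValueError(method)

normal = np.full(100, 0.1)                    # uniformly low error
burst = normal.copy(); burst[40:45] = 5.0     # brief high-error event
```

A top-k style aggregation emphasises short bursts of high error while suppressing frame-level noise, which is the kind of effect the various aggregation methods aim for.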

System characteristics
Classifier AE
System complexity 784729, 864257, 870841 parameters
Acoustic features log-mel energies
PDF

An ensemble approach for detecting machine failure from sound

Faruk Ahmed (Mila, Universite de Montreal)*; Phong Nguyen (Hitachi, Ltd.); Aaron Courville (Universite de Montreal)
Mila-Universite de Montreal, Montreal, Canada and Central Research Laboratory, Hitachi, Ltd., Tokyo, Japan

Abstract

We develop an ensemble-based approach for our submission to the anomaly detection challenge at DCASE 2020. The main members of our ensemble are auto-encoders (with reconstruction error as the signal), classifiers (with negative predictive confidence as the signal), mismatch of the time-shifted signal with its Fourier-phase-shifted version, and a Gaussian mixture model on a set of common short-term features extracted from the waveform. The scores are passed through an exponential non-linearity and weighted to provide the final score, where the weighting and scaling hyper-parameters are learned on the development set. Our ensemble improves over the baseline on the development set.
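The score fusion described above (member scores passed through an exponential non-linearity, then weighted) can be sketched as follows; the member scores, weights, and scales are illustrative stand-ins rather than the hyper-parameters learned on the development set.

```python
import numpy as np

def ensemble_score(member_scores, weights, scales):
    """Pass each member's anomaly score through an exponential
    non-linearity, then take a weighted sum."""
    s = np.asarray(member_scores, dtype=float)
    return float(np.sum(np.asarray(weights) * np.exp(np.asarray(scales) * s)))

# Two hypothetical members, e.g. an AE reconstruction error and a
# classifier's negative confidence; the weights/scales are placeholders
# for the hyper-parameters tuned on the development set.
w, a = np.array([0.3, 0.7]), np.array([1.0, 1.0])
low  = ensemble_score([0.2, 0.1], w, a)
high = ensemble_score([0.9, 1.5], w, a)
```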

System characteristics
Classifier AE, GMM, ResNet classifier, ensemble, phase-shift prediction
System complexity 9000000 parameters
Acoustic features log-mel energies
Decision making weighted average
Subsystem count 11, 13
PDF

An Ensemble Approach to Unsupervised Anomalous Sound Detection

Jahangir Alam (Computer Research Institute of Montreal (CRIM), Montreal (Quebec) Canada)*; Gilles Boulianne (CRIM); Vishwa Gupta (CRIM); Abderrahim Fathan (CRIM)
Speech group, CRIM, Montreal, Canada

Abstract

The task of anomalous sound detection (ASD) is to determine whether an observed sound is anomalous or normal. Both supervised and unsupervised approaches can be adopted for the ASD task: in the supervised approach, anomalous and normal data are used in training, whereas in the unsupervised approach only normal data are used. In this work, we provide an overview of the systems developed for Task 2 of the DCASE 2020 challenge, i.e., unsupervised detection of anomalous sounds for machine condition monitoring. We employ various handcrafted local representations from the short-time spectral analysis of sounds. We also use Fisher vector encoding, a learned global representation obtained from local representations of sound. Autoencoder variants and copy-detection approaches are applied on top of the local representations, and a standard GMM classifier is used with the Fisher vector encodings for unsupervised detection of anomalous sounds.

System characteristics
Classifier AE, CVAE, VAE, ensemble
System complexity 269992, 3547000, 8461000 parameters
Acoustic features LFCC, PSCC, log-mel energies, modulation spectrogram
Decision making average, maximum
Subsystem count 10, 5
PDF

Bai_LFXS_NWPU_dcase2020_submission

Jisheng Bai (Northwestern Polytechnical University)*
LianFeng Acoustic Technologies Co., Ltd., Xi'an, China and School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, China

Abstract

task 2 and task 5

System characteristics
Classifier AE, ensemble
System complexity 269992 parameters
Acoustic features HPSS_H, HPSS_P, MFCC, log-mel energies, log-spectrogram
Subsystem count 3, 5
Front end system HPSS
PDF

CONVOLUTIONAL AUTO ENCODER FOR MACHINE CONDITION MONITORING FROM ACOUSTIC SIGNATURES

Nitesh K Chaudhary (NCS Pte Ltd)*; Josey Mathew (NCS); Sunil Sivadas (NCS)
NEXT Products & Platform, NCS Pte. Ltd., Singapore

Abstract

Condition monitoring of machinery is critical for early detection and prevention of failures in factories. Recent advancements in machine learning are driving the development of data-driven tools such as monitoring acoustic signatures from microphones. This report presents a convolutional autoencoder (CAE) trained to minimize the acoustic spectrogram reconstruction error during normal operation of the machine. The model is pre-trained on machines of a similar type and then fine-tuned on a specific machine. The reconstruction error is used as the anomaly score for an unseen acoustic sample. The proposed model improves performance compared to the baseline system for the DCASE 2020 Challenge Task 2.

System characteristics
Classifier AE, CNN, ensemble
System complexity 9627136 parameters
Acoustic features log-mel energies, spectrogram
Decision making geometric mean
Subsystem count 4
PDF

Anomalous Sounds Detection Using A New Type of Autoencoder based on Residual Connection

Yunqi Chen (University of Electronic Science and Technology of China)*
University of Electronic Science and Technology of China, Chengdu, China

Abstract

This report describes our submission for Task 2 (Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring) of the DCASE 2020. In this report we propose networks of fully connected autoencoders based on residual connections, which can increase the accuracy of anomalous sound detection. For data preprocessing, we use data augmentation methods to generate more data from the existing data. Feature extraction is still carried out with the log-mel spectrogram. Finally, our method achieved an average AUC of 0.7912 and an average pAUC of 0.6105 on the development dataset.

System characteristics
Classifier AE
System complexity 13578384, 16894776, 4531616 parameters
Acoustic features log-mel energies
Data augmentation time stretching, time shifting, adding white noise
PDF

ENSEMBLE OF AUTO-ENCODER BASED SYSTEMS FOR ANOMALY DETECTION

Pawel Daniluk (Samsung R&D Institute Poland)*; Marcin Gozdziewski (Samsung R&D Institute Poland); Slawomir Kapka (Samsung R&D Institute Poland); Michal Kosmider (Samsung R&D Institute Poland)
Artificial Intelligence, Samsung R&D Institute Poland, Warsaw, Poland

Abstract

In this paper we report an ensemble of models used to perform anomaly detection for DCASE Challenge 2020 Task 2. Our solution comprises three families of models: Variational Heteroskedastic Auto-encoders, Conditioned Auto-encoders and a WaveNet-based network. Noisy recordings are preprocessed using a U-Net trained for noise removal on training samples augmented with noise obtained from AudioSet. Models operate either on OpenL3 embeddings or log-mel power spectra. Heteroskedastic VAEs have a non-standard loss function which uses the model's own error estimation to weigh the typical MSE loss. The model architecture, i.e. sizes of layers, dimension of the latent space and size of the error-estimating network, is independently selected for each machine type. ID-Conditioned AEs are an adaptation of the class-conditioned auto-encoder approach designed for open-set recognition. Assuming that non-anomalous samples constitute distinct IDs, we apply the class-conditioned auto-encoder with machine IDs as labels. Our approach omits the classification subtask and reduces the learning process to a single run. We simplify the learning process further by fixing a target for non-matching labels. Anomalies are predicted either by poor reconstruction or by attribution of samples to the wrong machine ID. The third solution is based on a convolutional neural network and a simple noise-reduction method. The architecture of the model is inspired by WaveNet and uses causal convolutional layers with growing dilation rates. It works by predicting the next frame in the spectrogram of a given recording. The anomaly score is derived from the reconstruction error. We present results obtained by each kind of model separately, as well as the result of an ensemble obtained by averaging anomaly scores computed by individual models.
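The heteroskedastic loss mentioned above can be illustrated with a minimal per-element Gaussian negative log-likelihood, in which the model's own variance estimate weighs the MSE term; the numbers below are invented for illustration.

```python
import numpy as np

def heteroskedastic_nll(x, x_hat, log_var):
    """MSE weighed by the model's own error estimate: each squared error is
    scaled by the predicted inverse variance, plus a log-variance penalty."""
    inv_var = np.exp(-log_var)
    return float(np.mean(inv_var * (x - x_hat) ** 2 + log_var))

x     = np.array([1.0, 2.0, 3.0])
x_hat = np.array([1.1, 1.9, -2.0])   # last element grossly mis-reconstructed
# Predicting high variance on the hard element lowers the loss: the model is
# allowed to say "I expect to be wrong here" instead of being penalised fully.
confident = heteroskedastic_nll(x, x_hat, np.zeros(3))
uncertain = heteroskedastic_nll(x, x_hat, np.array([0.0, 0.0, 2.0]))
```

Minimising such a loss encourages the error-estimating network to predict large variance exactly where reconstruction is systematically poor.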

System characteristics
Classifier CAE, CNN, Heteroskedastic VAE, VAE, ensemble
System complexity 179000000, 2372832, 60941952, 96600000 parameters
Acoustic features log-mel energies, mel energies, spectrogram, sqrt-mel energies
Decision making average, weighted average
System embeddings OpenL3
Subsystem count 10, 3, 53
External data usage Audioset
Front end system U-Net-based noise reduction
PDF

NEURON-NET: SIAMESE NETWORK FOR ANOMALY DETECTION

Karel Durkota (CTU in Prague)*; Michal Linda (NeuronSW SE); Martin Ludvik (NeuronSW SE); Jan Tozicka (NeuronSW SE)
NSW, Prague, Czech Republic

Abstract

This paper describes our submission to the DCASE 2020 challenge Task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring." Acoustic-based machine condition monitoring is a challenging task with a very unbalanced training dataset. In this submission, we combine a Siamese Network feature extractor with the KNN anomaly detection algorithm. Experimental results show it to be a viable approach, with an average AUC of 85.85 and pAUC of 77.1. This novel approach has not been used by NeuronSW SE so far.
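The combination of a learned feature extractor with KNN scoring can be sketched as below; the embeddings are random stand-ins for the output of a trained Siamese network.

```python
import numpy as np

def knn_anomaly_score(embedding, normal_bank, k=5):
    """Anomaly score = distance to the k-th nearest embedding of a normal
    training clip; large distances mean the clip is unlike anything normal."""
    d = np.linalg.norm(normal_bank - embedding, axis=1)
    return float(np.sort(d)[k - 1])

rng = np.random.default_rng(0)
bank = rng.normal(size=(200, 16))          # stand-in Siamese embeddings
near = knn_anomaly_score(np.zeros(16), bank)
far  = knn_anomaly_score(np.full(16, 10.0), bank)
```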

System characteristics
Classifier KNN
System complexity 2672449, 3292033 parameters
Acoustic features spectrogram
System embeddings Siamese Network
PDF

Unsupervised Anomalous Sound Detection Using Self-Supervised Classification and Group Masked Autoencoder for Density Estimation

Ritwik Giri (Amazon)*; Srikanth Tenneti (Amazon); Fangzhou Cheng (Amazon); Karim Helwani (Amazon); Umut Isik (Amazon); Arvindh Krishnaswamy (Amazon)
Amazon Web Services, California, United States

Abstract

This technical report outlines our solutions to Task 2 of the DCASE 2020 challenge, Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. The objective is to detect audio recordings containing anomalous machine sounds in a test set, when the training dataset itself does not contain any examples of anomalies. Our approaches are based on an ensemble of a novel density-estimation-based anomaly detector (Group Masked Autoencoder for Density Estimation, GMADE) and a self-supervised classification based anomaly detector.
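The self-supervised classification side of the ensemble can be illustrated as follows: a classifier trained to predict machine IDs from normal audio should become unconfident on anomalous clips. The logits below are invented for illustration.

```python
import numpy as np

def selfsup_anomaly_score(logits, true_id):
    """Negative log softmax probability that the classifier assigns to the
    clip's own machine ID; low confidence on the clip's true ID suggests
    the sound is anomalous."""
    z = logits - logits.max()                 # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    return float(-np.log(p[true_id] + 1e-12))

confident_logits = np.array([4.0, 0.5, 0.2])  # clearly machine ID 0
confused_logits  = np.array([1.0, 0.9, 1.1])  # no clear ID
```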

System characteristics
Classifier ArcFace, IDNN/IAE, MobileNetV2, ResNet50, ensemble
System complexity 2779530, 3494426, 663552, 73450548 parameters
Acoustic features log-mel energies
Data augmentation mixup, spectrogram warping
Decision making average, maximum
Subsystem count 3, 4, 7
PDF

IAEO3 - Combining OpenL3 Embeddings and Interpolation Autoencoder for Anomalous Sound Detection

Sascha Grollmisch (Fraunhofer IDMT)*; David Johnson (Fraunhofer IDMT); Jakob Abeßer (Fraunhofer IDMT); Hanna Lukashevich (Fraunhofer Institute for Digital Media Technology, Germany)
Institute of Media Technology, Technische Universität Ilmenau, Ilmenau, Germany and Industrial Media Applications, Fraunhofer IDMT, Ilmenau, Germany and Semantic Music Technologies, Fraunhofer IDMT, Ilmenau, Germany

Abstract

In this technical report, we present our system for task 2 of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE2020 Challenge): Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. The focus of this task is to detect anomalous industrial machine sounds using an acoustic quality control system, which is only trained with sound samples from the normal (machine) condition. The dataset covers a variety of machines ranging from stable sound sources such as car engines, to transient sounds such as opening and closing valves. Our proposed method combines pre-trained OpenL3 embeddings with the reconstruction error of an interpolation autoencoder, using a Gaussian mixture model as the final predictor. The optimized model achieved 88.5% AUC and 76.8% pAUC on average over all machines and types provided with the development dataset, and outperformed the published baseline by 14.9% AUC and 17.2% pAUC.
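The final GMM predictor can be sketched with scikit-learn; random 2-D vectors stand in for the OpenL3 and interpolation-autoencoder features the actual system uses.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a GMM on (stand-in) embeddings of normal clips; the negative
# log-likelihood of a test embedding serves as the anomaly score.
rng = np.random.default_rng(0)
normal_embeddings = rng.normal(size=(500, 2))

gmm = GaussianMixture(n_components=4, random_state=0).fit(normal_embeddings)

def anomaly_score(embedding):
    return float(-gmm.score_samples(embedding.reshape(1, -1))[0])

near = anomaly_score(np.zeros(2))           # close to the normal cloud
far  = anomaly_score(np.array([8.0, 8.0]))  # far from anything seen
```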

System characteristics
Classifier GMM, IDNN/IAE, PCA
System complexity 4938007 parameters
Acoustic features log-mel energies
System embeddings OpenL3
External data usage embeddings
PDF

Anomalous Sound Detection with Masked Autoregressive Flows and Machine Type Dependent Postprocessing

Verena Haunschmid (Johannes Kepler University Linz)*; Patrick Praher (Software Competence Center Hagenberg)
Institute of Computational Perception, Johannes Kepler University, Linz, Austria and Data Analysis Systems, Software Competence Center Hagenberg GmbH, Hagenberg, Austria

Abstract

This technical report describes the submission from the CP JKU/SCCH team for Task 2 of the DCASE2020 challenge - Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. Our approach uses a Masked Autoregressive Flow (MAF) model for density estimation trained solely on normal samples. Anomaly scores per input snippet are computed using the negative log likelihood of new samples. The anomaly scores per input audio are summarised using different metrics depending on the machine type instead of simply averaging them.
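The machine-type-dependent summarisation can be sketched as below; the pooling assignment per machine type is a hypothetical example, not the mapping chosen in the report.

```python
import numpy as np

# Per-snippet negative log-likelihoods (as produced by a density model such
# as a Masked Autoregressive Flow) are pooled into one clip score with a
# machine-type-dependent statistic: averaging suits stationary sources,
# while max-style pooling highlights short transient anomalies.
POOLING = {"fan": np.mean, "valve": np.max}   # hypothetical assignment

def clip_score(snippet_nll, machine_type):
    return float(POOLING[machine_type](np.asarray(snippet_nll)))

nll = [1.0, 1.1, 9.0, 1.0]   # one anomalous-looking snippet in the clip
```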

System characteristics
Classifier AE, normalizing flow
System complexity 14868480, 16005120, 269992, 29720576 parameters
Acoustic features log-mel energies
PDF

CONFORMER-BASED ID-AWARE AUTOENCODER FOR UNSUPERVISED ANOMALOUS SOUND DETECTION

Tomoki Hayashi (Human Dataware Lab. Co., Ltd.)*; Takenori Yoshimura (Human Dataware Lab. Co., Ltd.); Yusuke Adachi (Human Dataware Lab. Co., Ltd.)
Human Dataware Lab. Co., Ltd., Nagoya, Japan

Abstract

This paper presents an autoencoder-based unsupervised anomalous sound detection (ASD) method for the DCASE 2020 Challenge Task 2. Inspired by the great successes of the self-attention architecture in various fields such as speech recognition, we propose Transformer- and Conformer-based autoencoders for ASD, enabling us to perform sequence-by-sequence processing. As opposed to the standard autoencoder, they can extract sequence-level information from whole audio inputs. Furthermore, we propose two simple methods for exploiting machine ID information: machine ID embedding and machine ID regression. The two methods enable the proposed models to avoid the confusion of anomalous and normal sounds among the different machine IDs. The experimental evaluation demonstrates that the proposed autoencoders outperform the conventional frame-level autoencoder, and the explicit use of machine ID information significantly improves the ASD performance. We achieved an averaged area under the curve (AUC) of 91.33% and averaged partial AUC of 83.34% on the development set.

System characteristics
Classifier AE, Conformer, GMM, ID embedding, ID regression, Transformer, ensemble
System complexity 30463585, 47264120, 5230130, 5714035 parameters
Acoustic features log-mel energies
Decision making weighted average
Subsystem count 10, 6
PDF

Unsupervised Detection Of Anomalous Sound For Machine Condition Monitoring Using Different Auto-encoder Methods

Truong Hoang (FPT Software)*; Hieu Nguyen (FPT Software); Giao Pham (FPT Software)
STU.HCM, FPT Software, Ho Chi Minh, Vietnam and FHN.DCS, FPT Software, Hanoi, Vietnam and FWI.AAA, FPT Software, Hanoi, Vietnam

Abstract

Anomaly detection from the sound of machines is an important task for monitoring machines. This paper presents four auto-encoder methods to detect anomalous sound for machine condition monitoring: a long short-term memory auto-encoder, a U-Net auto-encoder, an interpolation deep neural network, and a fully connected auto-encoder. In experiments on the same dataset as the baseline system, results show that our methods outperform the baseline in terms of the AUC and pAUC evaluation metrics.
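The interpolation (IDNN) idea can be sketched as follows: the centre frame of a small window is removed, predicted from its context, and scored by the prediction error. The toy context-mean predictor stands in for a trained network.

```python
import numpy as np

def interpolation_errors(frames, predict_center):
    """Slide a 5-frame window over the clip; predict the (removed) centre
    frame from the 4 surrounding frames and record the squared error."""
    scores = []
    for t in range(2, len(frames) - 2):
        context = np.concatenate([frames[t - 2:t], frames[t + 1:t + 3]])
        pred = predict_center(context.ravel())
        scores.append(float(np.mean((pred - frames[t]) ** 2)))
    return scores

# Toy predictor: the mean of the context frames, exact for slowly varying
# (normal) signals and badly wrong at sudden jumps.
predict = lambda ctx: ctx.reshape(4, -1).mean(axis=0)

smooth = np.ones((10, 3))                 # stand-in log-mel frames
jumpy = smooth.copy(); jumpy[5] = 10.0    # one abrupt anomalous frame
```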

System characteristics
Classifier AE, IDNN/IAE, LSTM AE, U-Net AE
System complexity 1884328, 3026421, 3570807 parameters
Acoustic features MFCC, STFT, chroma features, log-mel energies, spectral contrast, tonnetz
PDF

DCASE Challenge 2020: Unsupervised Anomalous Sound Detection of Machinery with Deep Autoencoders

Anahid Jalali (Austrian Institute of Technology)*
Digital Safety and Security, Austrian Institute of Technology, Vienna, Austria

Abstract

In our work, we present an unsupervised anomalous sound detection framework trained on the DCASE2020 audio dataset. This dataset is a subset of two datasets, ToyADMOS and MIMII. We use the state-of-the-art anomaly detection approach, a deep autoencoder architecture trained on Mel-spectrograms. This architecture uses LSTM-RNN units to learn the normal condition of the machine, and is proven efficient at detecting diverse machine anomalies. Our model trained on the MIMII dataset achieves an average result of 73.51% AUC and 57.90% pAUC, a slight improvement compared to the baseline system with average results of 72.44% AUC and 57.48% pAUC. The average performance of the baseline system on the ToyADMOS dataset is 75.65% AUC and 64% pAUC, where our model reaches an average of 73.51% AUC and 57.90% pAUC. Our system reaches an overall average of 73.41% AUC and 59.27% pAUC on the development data set, with overall similar performance to the baseline system with an average of 73.51% AUC and 59.66% pAUC.

System characteristics
Classifier LSTM AE
System complexity 755776 parameters
Acoustic features log-mel energies
PDF

ABNORMAL SOUND DETECTION SYSTEM BASED ON AUTOENCODER

huitian jiang (University of Electronic Science and Technology of China)*; bo lan (University of Electronic Science and Technology of China); Huiyong Li (University of Electronic Science and Technology of China)
School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China and University of Electronic Science and Technology of China, Chengdu, China

Abstract

This report describes our submissions with an autoencoder (AE) to solve the DCASE 2020 challenge task 2 (unsupervised detection of anomalous sounds for machine condition monitoring). Previous research results show that AE is a very effective solution to abnormal sound detection (ASD). This design continues previous research, using AE to implement unsupervised ASD. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sound. In addition, the design uses a variational autoencoder (VAE) to generate normal sound samples. The generated sound samples are used to enhance the AE's ability to reconstruct normal sound samples.

System characteristics
Classifier AE, GMM, VAE
System complexity 272080, 538976 parameters
Acoustic features log-mel energies
PDF

Unsupervised Detection of Anomalous Sounds via ProtoPNet

Yannis A Kaltampanidis (Aristotle University of Thessaloniki)*
Polytech School of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece and Aristotle University of Thessaloniki, Thessaloniki, Greece

Abstract

Prototypical part network (ProtoPNet) is a novel method proposed for the task of image classification, offering the ability to interpret the network's reasoning process during classification. The subject of this work is the examination of ProtoPNet as an unsupervised anomaly detection method, through its application to the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 task 2 challenge. It is also shown that ProtoPNet shares common ground with the Deep One-Class Support Vector Data Descriptor (DOCSVDD).

System characteristics
Classifier ProtoPNet
System complexity 13128 parameters
Acoustic features spectrogram
PDF

Description and discussion on DCASE2020 challenge task2: unsupervised anomalous sound detection for machine condition monitoring

Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada
Media Intelligence Laboratories, NTT Corporation, Tokyo, Japan and Research and Development Group, Hitachi, Ltd., Tokyo, Japan and Doshisha University, Kyoto, Japan

Abstract

This paper presents the details of DCASE 2020 Challenge Task 2: Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. The goal of anomalous sound detection (ASD) is to identify whether the sound emitted from a target machine is normal or anomalous. The main challenge of this task is to detect unknown anomalous sounds under the condition that only normal sound samples have been provided as training data. We have designed a DCASE challenge task that serves as a starting point and benchmark for ASD research, specifying the dataset, evaluation metrics, a simple baseline system, and other detailed rules. After the challenge submission deadline, challenge results and an analysis of the submissions will be added.
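The task's evaluation metrics can be computed with scikit-learn; note that `roc_auc_score(..., max_fpr=0.1)` returns a standardized (McClish-corrected) partial AUC, so its values are comparable in spirit but not numerically identical to the challenge's pAUC definition.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy anomaly scores for 4 normal (0) and 4 anomalous (1) clips.
y_true  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95])

auc  = roc_auc_score(y_true, y_score)               # full AUC
pauc = roc_auc_score(y_true, y_score, max_fpr=0.1)  # partial AUC (standardized)
```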

System characteristics
Classifier AE
System complexity 269992 parameters
Acoustic features log-mel energies
PDF

THE STUDY OF ANOMALOUS MACHINE SOUND DETECTION BASED ON CYCLOSTATIONARITY MODEL

Dmitriy Lapin (BMSTU)*; Vladimir Klychnikov (BSMTU); Mark Hubbatulin (BMSTU)
Bauman Moscow State Technical University, Moscow, Russia

Abstract

In industrial predictive maintenance, one of the most important directions in Industry 4.0, machine monitoring and diagnostics are a critical part of operation. Non-contact acoustic data gathering is of particular interest because of its high ergonomics and low cost. The general method for processing this type of data is anomalous sound detection, which allows express diagnostics of machines and units with minimal integration. Based on the DCASE 2020 Challenge, a study of the proposed method is presented. A problem description with physical interpretation and a review of the model's elements is given. A model based on the Wigner-Ville transform, with architecture improvements and ensemble score calculation, was developed, and its results on the provided development dataset were calculated. The results are discussed, together with assumptions for further research and development, and conclusions about the present study and future work are drawn.

System characteristics
Classifier AE, ensemble
System complexity 2282272 parameters
Acoustic features log-mel energies, pseudo Wigner-Ville
Decision making maximum
Subsystem count 2
PDF

A Speaker Recognition Approach To Anomaly Detection

Jose A Lopez (Intel Labs)*; Jonathan Huang (Apple); Georg Stemmer (Intel); Paulo Lopez Meyer (Intel); Hong Lu (Intel Labs); Lama Nachman (Intel Labs)
Intel Labs, Intel Corp, Santa Clara, CA, USA and Intel Labs, Intel Corp, Zapopan, JAL, Mexico and Intel Labs, Intel Corp, Neubiberg, Germany and Apple Corp, Cupertino, CA, USA

Abstract

We discuss our unsupervised speaker-recognition-based submission to the DCASE 2020 Challenge Task 2. We found that a speaker-recognition approach enables the use of all the training data, even from different machine types, to detect anomalies in specific machines. Using this approach, we obtained AUCs close to, or greater than, 0.9 for 5 out of 6 machines. We also discuss the modifications needed to surpass the baseline score for the ToyConveyor data.

System characteristics
Classifier CNN, large margin cosine distance, stats pooling
System complexity 1826030 parameters
Acoustic features log-mel energies, spectrogram
Data augmentation weighted linear interpolation
PDF

ANOMALOUS SOUND DETECTION BY USING LOCAL OUTLIER FACTOR AND GAUSSIAN MIXTURE MODEL

Kazuki Morita (SECOM)*; Tomohiko Yano (SECOM); Khai Q. Tran (SECOM)
Intelligent Systems Laboratory, SECOM CO.,LTD., Tokyo, Japan

Abstract

In this report, we introduce our methods and results for anomalous sound detection in DCASE2020 Task 2. We attempted to detect anomalous sounds without using deep learning methods. Specifically, we first extracted features by applying principal component analysis (PCA) to the log-mel spectrogram of the sound signal. We then used the Local Outlier Factor (LOF) and a Gaussian Mixture Model (GMM) as the anomaly detection methods. Our experiments showed that the proposed method improved the Area Under Curve (AUC) to 0.8706 and the partial Area Under Curve (pAUC) to 0.7403 compared to the baseline system on the development dataset.

System characteristics
Classifier GMM, LOF
System complexity 207360, 8120000 parameters
Acoustic features log-mel energies
PDF
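The deep-learning-free pipeline above (PCA features, then a density-based outlier score) can be sketched in a few lines of numpy. As a simplification, a single Gaussian scored by Mahalanobis distance stands in for the authors' LOF/GMM scorers, and all data, dimensions, and the component count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.normal(size=(200, 64))   # stand-in for normal-only log-mel frames

# PCA via SVD: project onto the top-k principal components.
k = 8
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
project = lambda x: (x - mean) @ vt[:k].T

# Fit a single Gaussian to the projected training data; the anomaly
# score of a frame is its Mahalanobis distance under that Gaussian
# (a one-component simplification of a GMM log-likelihood score).
z = project(train)
mu, cov = z.mean(axis=0), np.cov(z, rowvar=False)
cov_inv = np.linalg.inv(cov)

def score(x):
    d = project(x) - mu
    return float(d @ cov_inv @ d)

normal_frame = rng.normal(size=64)
odd_frame = normal_frame + 10.0      # grossly shifted spectrum
assert score(odd_frame) > score(normal_frame)
```

Swapping in `sklearn`'s `LocalOutlierFactor` or `GaussianMixture` on the projected features would recover the report's actual scorers.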

TASK 2 DCASE 2020: ANOMALOUS SOUND DETECTION USING UNSUPERVISED AND SEMI-SUPERVISED AUTOENCODERS AND GAMMATONE AUDIO REPRESENTATION

Javier Naranjo-Alcazar (Visualfy)*; Sergi Perez-Castanos (Visualfy); Pedro Zuccarello (Visualfy); Maximo Cobos (Universitat de Valencia); Jose Ferrandis (Visualfy)
Computer Science Department, Burjassot, Spain and AI department, Benisano, Valencia

Abstract

Anomalous sound detection (ASD) is one of the fields of machine listening attracting the most attention in the scientific community. Unsupervised detection is of particular interest due to its immediate applicability in many fields. For example, in industrial processes, the early detection of malfunctions or damage in machines can mean great savings and an improvement in process efficiency. This problem calls for an unsupervised ASD solution, since industrial machines cannot simply be damaged in order to obtain anomalous audio data for the training stage. This paper proposes a novel framework based on convolutional autoencoders (both unsupervised and semi-supervised) and a Gammatone-based representation of the audio. The results obtained by these architectures substantially exceed the baseline results.

System characteristics
Classifier AE, semi-supervised AE
System complexity 8851009 parameters
Acoustic features gammatone
PDF

Unsupervised detection of anomalous machine sound using various spectral features and focused hypothesis test in the reverberant and noisy environment

Jihwan Park (lge)*; Sooyeon Yoo (lge)
Advanced Robot Research Laboratory, LG Electronics, Seoul, South Korea

Abstract

In this technical report, we describe our anomalous sound detection (ASD) systems submitted to DCASE 2020 Task 2. To improve ASD performance in reverberant and noisy conditions, normal machine sound augmentation, a focused hypothesis test, and selection of distinctive spectral features are applied to a deep neural network (DNN)-based autoencoder (AE). In the experiments, we found that our approaches outperform the baseline methods under the condition that only reverberant and noisy normal sound samples have been provided as training data.

System characteristics
Classifier AE
System complexity 269992, 3523357, 4206173 parameters
Acoustic features log-mel energies, median-filtered spectrogram, spectrogram
Data augmentation latent space sampling
Front end system median-filter
PDF

DCASE 2020 TASK 2: UNSUPERVISED DETECTION OF ANOMALOUS SOUNDS FOR MACHINE CONDITION MONITORING

Duc H Phan (University of Illinois)*; Douglas L. Jones (University of Illinois Urbana-Champaign)
ECE, Illinois, USA

Abstract

A multiple layer neural network predictor is proposed for anomalous sound detection instead of a traditional auto-encoder approach. The network operates on the log-mel-spectrogram, predicting the log-mel feature vector given the previous and future feature vectors. The prediction error is used as the anomaly score measure. The proposed system outperforms the baseline system [1] on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE2020) Task 2 development data set [2, 3].

System characteristics
Classifier IDNN/IAE
System complexity 249664 parameters
Acoustic features log-mel energies
PDF
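The predictor-based scoring described above (predict the middle log-mel frame from its neighbours, use the prediction error as the anomaly score) can be sketched as follows. Linear interpolation of the adjacent frames stands in for the trained multi-layer network, and the toy clip is illustrative:

```python
import numpy as np

# Hypothetical stand-in for the trained predictor: linear interpolation
# of the two frames adjacent to the gap replaces the neural network.
def predict_middle(context):
    before, after = context[1], context[2]
    return 0.5 * (before + after)

def anomaly_score(clip):
    """Slide a 5-frame window over the clip; the score is the mean
    squared error of predicting each middle frame from the other four."""
    errors = []
    for t in range(2, len(clip) - 2):
        context = np.stack([clip[t - 2], clip[t - 1], clip[t + 1], clip[t + 2]])
        pred = predict_middle(context)
        errors.append(np.mean((clip[t] - pred) ** 2))
    return float(np.mean(errors))

t = np.linspace(0, 1, 50)
smooth = np.outer(np.sin(2 * np.pi * t), np.ones(64))  # predictable clip
spiky = smooth.copy()
spiky[25] += 4.0                                       # transient "anomaly"
assert anomaly_score(spiky) > anomaly_score(smooth)
```

The design intuition is that a predictor, unlike a plain autoencoder, cannot simply copy its input, so transient anomalies produce large, localized errors.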

Reframing Unsupervised Machine Condition Monitoring as a Supervised Classification Task with Outlier-Exposed Classifiers

Paul Primus (Johannes Kepler University)*
Computational Perception, JKU, Austria, Linz

Abstract

This technical report contains a detailed summary of our submissions to the Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring (MCM) Task of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE). The goal of acoustic MCM is to identify whether a sound emitted from a machine is normal or anomalous. In contrast to the task coordinators' claim that 'this task cannot be solved as a simple classification problem,' we show that a simple binary classifier substantially outperforms the provided unsupervised autoencoder baseline across all machine types and instances, if outliers, i.e., various other recordings, are available. In addition to this technical description, we release our complete source code to make our submission fully reproducible.

System characteristics
Classifier CNN
System complexity 1000000, 12000000, 136000000, 59000000 parameters
Acoustic features log-mel energies
Decision making median
Subsystem count 13, 5
PDF

Deep Dense and Convolutional Autoencoders for Unsupervised Anomaly Detection in Machine Condition Sounds

Andre Pilastri (CCG - Centro de Computacao Grafica)*; Alexandrine Ribeiro (CCG - Centro de Computacao Grafica); Luis Matos (ALGORITMI Centre, University of Minho); Pedro Pereira (ALGORITMI Centre, University of Minho); Eduardo Nunes (ALGORITMI Centre, University of Minho); Andre Ferreira (Bosch Car Multimedia Portugal, S.A.); Paulo Cortez (University of Minho)
Centro de Computacao Grafica, Guimaraes, Portugal and Department of Information Systems, ALGORITMI Centre, University of Minho, Guimaraes, Portugal and Bosch Car Multimedia Portugal, Braga, Portugal and Department of Information Systems, ALGORITMI Centre, Guimaraes, Portugal

Abstract

This technical report describes two methods that were developed for Task 2 of the DCASE 2020 challenge. The challenge involves unsupervised learning to detect anomalous sounds; thus, only normal machine working condition samples are available during the training process. The two methods involve deep autoencoders, based on dense and convolutional architectures, that use mel-spectrogram sound features. Experiments were held using the six machine type datasets of the challenge. Overall, competitive results were achieved by the proposed dense and convolutional AEs, outperforming the baseline challenge method.

System characteristics
Classifier AE, CNN AE
System complexity 2257576, 2882992, 4133673 parameters
Acoustic features log-mel energies
PDF

ANOMALY CALCULATION FOR EACH COMPONENT OF SOUND DATA AND ITS INTEGRATION FOR DCASE 2020 CHALLENGE TASK 2

Yuya Sakamoto (Fixstars Corporation)*
Fixstars Corporation, Tokyo, Japan

Abstract

This paper is a technical report on the method that we submitted to the DCASE2020 Challenge Task 2. In our method, we first convert each sample into a log-mel-spectrogram, as in the baseline system. Next, the log-mel-spectrogram is decomposed into a mean component, a basis component, and a latent component by principal component analysis, and an anomaly score is calculated for each of these components. The final anomaly score is then determined by integrating the anomaly scores of the individual components. Each anomaly score is calculated using the Mahalanobis distance, k-nearest neighbors based on subspace distance, or a distance based on the matrix normal distribution.

System characteristics
Classifier ensemble, mahalanobis distance, matrix normal distribution, subspace distance k-nearest neighbor
System complexity 141728, 16512, 222912 parameters
Acoustic features log-mel energies
Decision making mahalanobis distance
Subsystem count 2, 3
PDF

A METHOD OF ANOMALOUS SOUND DETECTION WITH MULTI-DIMENSIONAL AUDIO FEATURE INPUTS

Dinghou Lin (Tsinghua University)*; Youfang Han (Tsinghua University); Ran Shao (ELCOM (Suzhou) Co. Ltd.); Chunping Li (Tsinghua University)
ELCOM (Suzhou) Co. Ltd., Suzhou, China and Data Mining Group, School of software, Tsinghua University, Beijing, China

Abstract

In this technical report, we describe the system we submitted to DCASE2020 Task 2, anomalous sound detection (ASD), in detail. The goal is to train a model that can distinguish normal sounds from abnormal ones when only normal sound samples are used as training data. To achieve this goal, we need to identify the characteristics of normal sound. First, we adopt a preprocessing method to intensify the features of normal audio; second, we extract different types of features, including artificial features and implicit features. Moreover, we also use psychoacoustic features to assist in training the model. Finally, we achieve better performance than the DCASE2020 baseline system.

System characteristics
Classifier AE
System complexity 423279 parameters
Acoustic features log-mel energies, raw waveform, spectrogram
System embeddings VGGish, OpenL3
External data usage simulation of anomalous samples, features extraction
Front end system filter
PDF

DCASE2020 TASK2 SELF-SUPERVISED LEARNING SOLUTION

zero shinmura (none)*
Nagano, Japan

Abstract

The detection of anomalies by sound is very useful because, unlike with images, there is no need to worry about adjusting lighting or shielding. We propose an anomalous sound detection method using self-supervised learning with deep metric learning. Our approach is fast because it uses MobileNetV2, and it performed well on non-stationary sounds, achieving an AUC of 0.9 or higher for most of them.

System characteristics
Classifier ArcFace, MobileNetV2, ensemble
System complexity 705936 parameters
Acoustic features spectrogram
Decision making average
Subsystem count 10
PDF

ANOMALOUS MACHINE DETECTION ALGORITHM BASED ON SEMI-VARIATIONAL AUTO-ENCODER OF MEL SPECTROGRAM

ke tian (tianke); Guoheng Fu (Beijing University of Posts and Telecommunications); Shengchen Li (Beijing University of Posts and Telecommunications)*; gang tang (Beijing University of Chemical Technology); Xi Shao (Nanjing University of Posts and Telecommunications)
BUPT, Beijing, China and BUPT, BUPT, China and BUCT, Beijing, China and NJUPT, Nanjing, China

Abstract

This report proposes a solution for Task 2 of the IEEE DCASE data challenge 2020, which attempts to detect anomalous machines from acoustic data. The proposed solution uses a semi variational auto-encoder. The term "semi" indicates that the resulting variational auto-encoder may not successfully reconstruct the input, as the key aim of the task is to distinguish outlier samples according to specific features rather than to reconstruct the input precisely. As a result, a few minor changes are introduced to the provided baseline system, setting up different training stop criteria and a different anomaly scoring system. The proposed method suggests that using different stop-training criteria for a variational autoencoder may help with different objectives.

System characteristics
Classifier VAE
System complexity 225216 parameters
Acoustic features log-mel energies
PDF

MODULATION SPECTRAL SIGNAL REPRESENTATION AND I-VECTORS FOR ANOMALOUS SOUND DETECTION

Parth Tiwari (IITKGP)*; Yash Jain (IITKGP); Anderson R. Avila (Institut national de la recherche scientifique (INRS-EMT), Quebec, Canada); Joao B Monteiro (Institut National de la Recherche Scientifique); Shruti R Kshirsagar (INRS-EMT); Amr Gaballah (INRS); Tiago H Falk (INRS-EMT)
Industrial and Systems Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India and Mathematics, Indian Institute of Technology Kharagpur, Kharagpur, India and Institut national de la recherche scientifique (INRS), Montreal, Canada

Abstract

This report summarizes our submission for Task-2 of the DCASE 2020 Challenge. We propose two different anomalous sound detection systems, one based on features extracted from a modulation spectral signal representation and the other based on i-vectors extracted from mel-band features. The first system uses a nearest neighbour graph to construct clusters which capture local variations in the training data. Anomalies are then identified based on their distance from the cluster centroids. The second system uses i-vectors extracted from mel-band spectra for training a Gaussian Mixture Model. Anomalies are then identified using their negative log-likelihood. Both these methods show significant improvement over the DCASE Challenge baseline AUC scores, with an average improvement of 6% across all machines. An ensemble of the two systems is shown to further improve the average performance by 11% over the baseline.

System characteristics
Classifier AE, GMM, KNN, ensemble, graph clustering
System complexity 0, 141458, 411450 parameters
Acoustic features MFCC i-vectors, log-mel energies, modulation spectrogram
Decision making geometric mean, maximum
Subsystem count 2, 3
PDF

ANOMALY DETECTION USING THE MIDDLE LAYER OF THE CNN-CLASSIFICATION MODEL

Motonobu Uchikoshi (The Japan Research Institute, Limited)*
Development Promotion Division, The Japan Research Institute, Limited, Tokyo, Japan

Abstract

I have confirmed the effectiveness of anomaly detection using the middle layer of a CNN classification model. The model is a 3-class classification model using 3 different normal datasets (normal data, normal data with noise, and a different type of normal data). Abnormality is defined as the distance from the region of normal data outputs in the latent space of the model's middle layer. To characterize the region occupied by normal data, clustering with a Gaussian mixture model is used. Though the average scores of the CNN model were below the AE baselines, some tasks obtained better scores than the baselines, so I also tried ensembling the CNN model and the AE model. Index Terms: CNN, KNN, GMM, AE

System characteristics
Classifier AE, CNN, GMM, KNN, ensemble
System complexity 74346100, 74616092 parameters
Acoustic features log-mel energies
Data augmentation random noise
Decision making weighted average
Subsystem count 2
PDF

Detection of Anomalous Sounds for Machine Condition Monitoring using Classification Confidence

Tadanobu Inoue (IBM Research); Phongtharin Vinayavekhin (IBM Research)*; Shu Morikuni (IBM Research); Shiqiang Wang (IBM Research); Tuan Hoang Trong (IBM Research); David Wood (IBM Research); Michiaki Tatsubori (IBM Research - Tokyo); Ryuki Tachibana (IBM Research)
AI, IBM Research, Tokyo, Japan and AI, IBM Research, Yorktown Heights, NY, USA and AI, Yorktown Heights, NY, USA

Abstract

We propose unsupervised anomalous sound detection methods using an ensemble of two classifiers. Both classifiers are trained with either known or generated properties of normal sounds as labels: (1) one is a model that classifies sounds into machine types and IDs, and (2) the other is a model that classifies transformed sounds into data augmentation types. To train the latter, we augment the normal sounds using transformation techniques such as pitch shifting, and use each data augmentation type as a label. For both classifiers, we use the classification confidence as the normality score of the input sample at run time. We ensemble these approaches by probability aggregation of their anomaly scores. The experimental results show superior performance to the baseline provided by the DCASE organizers.

System characteristics
Classifier CNN, ensemble
System complexity 395781, 418162, 813943 parameters
Acoustic features log-mel energies
Data augmentation time stretching, pitch shifting
Decision making probability aggregation
Subsystem count 2
PDF
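The confidence-as-normality idea above can be sketched in a few lines. A softmax over a classifier's logits gives the confidence assigned to the clip's known machine ID; the logits and labels below are illustrative, not taken from the authors' system:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def normality_score(logits, true_class):
    """Confidence assigned to the clip's known machine ID. Low confidence
    on a sample from a known machine suggests an anomalous sound."""
    return float(softmax(logits)[true_class])

# Toy logits from a hypothetical machine-ID classifier; both clips
# come from machine ID 0.
confident = np.array([5.0, 0.1, -1.0])   # normal sound: sharp prediction
confused = np.array([0.3, 0.2, 0.1])     # anomalous sound: flat prediction
assert normality_score(confident, 0) > normality_score(confused, 0)
```

Negating or inverting this score yields an anomaly score that can be ensembled with other detectors, as in the report's probability aggregation.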

ANOMALOUS SOUND DETECTION BASED ON A NOVEL AUTOENCODER

Xianwei Zhang (Tsinghua University)*
Department of Electronic Engineering, Tsinghua University, Beijing, China

Abstract

The DCASE2020 Task 2 is Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring [1]. This technical report describes the approach we used to participate in this task. We utilize the interpolation deep neural network (IDNN) [2] based on the autoencoder (AE). For every 5 frames of a spectrogram from sounds in the development dataset, we remove the middle frame, send the rest into the AE, and obtain an output with the same shape as the middle frame. The reconstruction error between the output and the original middle frame is used as the anomaly score. Compared with the baseline, the AUC score is improved on the validation dataset for the valve.

System characteristics
Classifier IDNN/IAE
System complexity 187560 parameters
Acoustic features log-mel energies
PDF

AUTO-ENCODER AND METRIC-LEARNING FOR ANOMALOUS SOUND DETECTION TASK

Qingkai WEI (Beijing Kuaiyu Co. Ltd.)*
Beijing, China and Beijing Kuaiyu Electronics Co., Ltd., Beijing, PRC

Abstract

DCASE 2020 Task 2 aims at the problem of anomalous sound detection: judging whether the target machine is in a normal state by the sound it emits [1]. The challenge of this task is to detect an anomalous state while only sounds of the normal state are provided. With only samples of the normal state, the supervised learning usually used in sound event detection cannot be applied. The given baseline uses an auto-encoder with the log-mel-spectrogram as input, reconstructs it, and takes the reconstruction error as the anomaly score. Building on the idea of the baseline, we tuned the parameters of the auto-encoder network structure and tried variational and convolutional auto-encoders. The results show that merely tuning the auto-encoder parameters yields an AUC improvement of 0.05 for some of the machine types. In addition, we applied metric learning, which is usually used in face recognition, to extract feature vectors for this task. The local outlier factor is then used to obtain the anomaly score. The results on the validation dataset show a larger improvement, increasing pAUC by about 0.1 for four machine types.

System characteristics
Classifier AE, L2-softmax, MobileNetV2
System complexity 2710992, 706224 parameters
Acoustic features log-mel energies
PDF

Unsupervised Detection of Anomalous Sounds using Abnormal Sound Simulation Algorithm and Auto-encoder Classifier

Wen Haifeng (University of Electronic Science and Technology of China)*; Shi Shaoyang (University of Electronic Science and Technology of China); Chuang Shi (University of Electronic Science and Technology of China)
University of Electronic Science and Technology of China, Information and Communication, Chengdu, China

Abstract

This report describes our contribution to Unsupervised Detection of Anomalous Sounds in the DCASE 2020 challenge (Task 2). In our work, we made some changes to an algorithm for simulating abnormal sounds, following the idea of abnormal sound simulation. Moreover, to make use of the simulated abnormal sounds, we changed the output of the auto-encoder-based classification system and trained it with binary cross-entropy. The experimental results show a significant performance improvement compared with the baseline system's results. In this report, we propose two systems built on these basic ideas, based on fully connected neural networks and convolutional neural networks (CNNs), respectively.

System characteristics
Classifier AE, CNN, GMM, VAE
System complexity 2282144, 3714193, 5996337 parameters
Acoustic features log-mel energies
External data usage simulation of anomalous samples
PDF

Anomalous Sound Detection with Look, Listen, and Learn Embeddings

Kevin Wilkinghoff (Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE )*
Communication Systems, Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, Wachtberg, Germany

Abstract

The goal of anomalous sound detection is to unsupervisedly train a system to distinguish normal from anomalous sounds that substantially differ from the normal sounds used for training. In this paper, a system based on Look, Listen, and Learn embeddings, which participated in task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" of the DCASE challenge 2020, is presented. The experimental results show that the presented system significantly outperforms the baseline system of the challenge both in detecting outliers and in recognizing the correct machine type or exact machine id. Moreover, it is shown that an ensemble consisting of the presented system and the baseline system performs even better than both of its components.

System characteristics
Classifier AE, CNN, PCA, PLDA, RLDA, ensemble
System complexity 10635245, 10905237 parameters
Acoustic features log-mel energies
Data augmentation manifold mixup
Decision making concatenation
System embeddings OpenL3
Subsystem count 2
External data usage embeddings
PDF

UNSUPERVISED DETECTION OF ANOMALOUS SOUNDS TECHNICAL REPORT

Yao Xiao (Tsinghua University)*
School of Software, Tsinghua University, Beijing, China

Abstract

This report describes the solution to Task 2 of the DCASE 2020 challenge. Besides the autoencoder-based unsupervised anomaly detector used in the baseline, a classifier-based unsupervised anomaly detector is used, and the classification error on the normal or anomalous machine sounds is used as the anomaly score.

System characteristics
Classifier AE, CNN
System complexity 469342 parameters
Acoustic features log-mel energies, raw waveform
PDF

UNSUPERVISED DETECTION OF ANOMALOUS SOUNDS BASED ON DICTIONARY LEARNING AND AUTOENCODER

Chenxu Zhang (Nanjing University of Posts and Telecommunications); Yao Yao (NJUPT)*; Yuxuan Zhou (BUCT); Guoheng Fu (Beijing University of Posts and Telecommunications); Shengchen Li (Beijing University of Posts and Telecommunications); gang tang (Beijing University of Chemical Technology); Xi Shao (Nanjing University of Posts and Telecommunications)
NJUPT, Nanjing, China and BUCT, Beijing, China and BUPT, Beijing, China

Abstract

The DCASE2020 Challenge Task 2 is to develop an unsupervised detection system for anomalous sounds from six types of machines. In this paper, we propose two methods. One uses traditional auditory features and dictionary learning (DL) to train a dictionary. The other uses auditory spectral features and a deep learning method to train an autoencoder (AE). Both proposed methods achieve an improvement compared to the baseline system, and better performance can be obtained with a mixture of the two methods. Experiments demonstrate the practicability of the proposed methods for anomaly detection.

System characteristics
Classifier AE, OCSVM, dictionary learning
System complexity 27144 parameters
Acoustic features center of gravity, crook index, effective value, fourier sum of squares, frequency standard deviation, frequency variance center of gravity, kurtosis, kurtosis index, logfbank, margin index, mean square frequency, pulse index, root mean square frequency, slope, square root amplitude, variance, waveform index
PDF

ACOUSTIC ANOMALY DETECTION BASED ON SIMILARITY ANALYSIS

Shuyang Zhao (Tampere University)*
Pervasive computing, Tampere University, Tampere, Finland

Abstract

This study uses the nearest neighbour distance as a measure of anomaly. The nearest neighbour distance is defined as the distance from a test sample to its nearest neighbour in the training dataset, which contains only sounds recorded in normal condition. A sample is represented by a multivariate Gaussian distribution of its MFCCs. The Kullback-Leibler divergence is used to measure the dissimilarity between two distributions, and thus serves as the distance between two samples. The three submissions vary in the use of MFCC deltas and the type of covariance matrices used for the Gaussian distributions.

System characteristics
Classifier None
Acoustic features MFCC
Decision making KL divergence
PDF
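The KL divergence between two multivariate Gaussians used as the sample-to-sample distance above has a closed form, which can be written directly (the toy parameters are illustrative; real use would fit each Gaussian to a clip's MFCC frames):

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for k-dimensional Gaussians:
    0.5 * [ tr(S1^-1 S0) + (m1-m0)^T S1^-1 (m1-m0) - k + ln(|S1|/|S0|) ]."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

mu = np.zeros(3); cov = np.eye(3)
assert abs(gaussian_kl(mu, cov, mu, cov)) < 1e-9   # identical: zero divergence
assert gaussian_kl(mu + 1.0, cov, mu, cov) > 0     # shifted: positive divergence
```

Since KL divergence is asymmetric, a symmetrized variant (e.g. the sum of both directions) is a common choice when it is used as a distance; the report does not specify which variant was used.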

ARCFACE BASED SOUND MOBILENETS FOR DCASE 2020 TASK 2

Qiping Zhou (PFU SHANGHAI Co., LTD)*
R&D department, PFU Shanghai Co., LTD, Shanghai, China

Abstract

In this report, we propose our anomalous sounds detection neural network for the DCASE 2020 challenge's Task 2 (Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring). We propose a metric learning model based on additive angular margin loss (ArcFace). In order to learn the embedding efficiently, a CNN architecture based on MobileFaceNets is employed.

System characteristics
Classifier ArcFace, CNN, ensemble
System complexity 1021632, 3535872, 4557504 parameters
Acoustic features spectrogram
Decision making average
Subsystem count 2
PDF