Program in EST (UTC-5)

Note: All presentations in the oral sessions will be played twice. Attendees in different time zones can choose whichever of the two plays is more convenient.

Sunday 1st November

* Recommended sessions for attendees in EST (UTC-5)

Day1

19:00

* Welcome

Nobutaka Ono
Tokyo Metropolitan University, Department of Computer Science

19:20

* Keynote I (1st play)

Mounya Elhilali
Johns Hopkins University, Department of Electrical and Computer Engineering

Active listening in everyday soundscapes

20:20

* Oral session I (1st play)

Sound Event Detection and Localization I
(Session Co-Chairs: Keisuke Imoto and Jonathan Le Roux)

L01	Conformer-based sound event detection with semi-supervised learning and data augmentation	Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda
L02	On the effectiveness of spatial and multi-channel features for multi-channel polyphonic sound event detection	Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan
L03	Guided multi-branch learning systems for sound event detection with sound separation	Yuxin Huang, Liwei Lin, Shuo Ma, Xiangdong Wang, Hong Liu, Yueliang Qian, Min Liu, Kazushige Ouchi
L04	Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking	Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

21:50

Poster highlights

The short highlight videos will be broadcast on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

Monday 2nd November

* Recommended sessions for attendees in EST (UTC-5)

0:00

Poster highlights

The short highlight videos will be broadcast on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

2:00

* Virtual tour I

Japan: Is Japan an ideal destination for your holidays?

3:10

* Challenge spotlights

Reports on the 6 tasks of the DCASE 2020 challenge
(Session Co-Chairs: Annamaria Mesaros and Romain Serizel)

4:10

Oral session II (1st play)

Sound Event Detection and Localization II
(Session Co-Chairs: Sharath Adavanne and Yasunori Ohishi)

L05	On multitask loss function for audio event detection and localization	Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins
L06	Event-independent network for polyphonic sound event localization and detection	Yin Cao, Turab Iqbal, Qiuqiang Kong, Yue Zhong, Wenwu Wang, Mark D. Plumbley
L07	A multi-resolution approach to sound event detection in DCASE 2020 Task4	Diego de Benito-Gorron, Daniel Ramos, Doroteo T. Toledano
L08	Training sound event detection on a heterogeneous dataset	Nicolas Turpault, Romain Serizel

6:00

Oral session I (2nd play)

Sound Event Detection and Localization I
(Session Co-Chairs: Yin Cao and Keisuke Imoto)

L01	Conformer-based sound event detection with semi-supervised learning and data augmentation	Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda
L02	On the effectiveness of spatial and multi-channel features for multi-channel polyphonic sound event detection	Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan
L03	Guided multi-branch learning systems for sound event detection with sound separation	Yuxin Huang, Liwei Lin, Shuo Ma, Xiangdong Wang, Hong Liu, Yueliang Qian, Min Liu, Kazushige Ouchi
L04	Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking	Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

7:30

Poster highlights

The short highlight videos will be broadcast on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

9:00

Keynote I (2nd play)

Mounya Elhilali
Johns Hopkins University, Department of Electrical and Computer Engineering

Active listening in everyday soundscapes

10:00

* Oral session II (2nd play)

Sound Event Detection and Localization II
(Session Co-Chairs: Yasunori Ohishi and Tuomas Virtanen)

L05	On multitask loss function for audio event detection and localization	Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins
L06	Event-independent network for polyphonic sound event localization and detection	Yin Cao, Turab Iqbal, Qiuqiang Kong, Yue Zhong, Wenwu Wang, Mark D. Plumbley
L07	A multi-resolution approach to sound event detection in DCASE 2020 Task4	Diego de Benito-Gorron, Daniel Ramos, Doroteo T. Toledano
L08	Training sound event detection on a heterogeneous dataset	Nicolas Turpault, Romain Serizel

End of day1

The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

Day2

19:10

* Welcome

Yohei Kawaguchi
Hitachi, Ltd.

19:20

* Keynote II (1st play)

Shin'ichi Satoh
National Institute of Informatics

How benchmarks work for visual recognition research? -- Historical review and future prospects

20:20

* Oral session III (1st play)

Scene Classification and Anomalous Sound Detection I
(Session Co-Chairs: Yuma Koizumi and Gordon Wichern)

L09	Group masked autoencoder based density estimator for audio anomaly detection	Ritwik Giri, Fangzhou Cheng, Karim Helwani, Srikanth V. Tenneti, Umut Isik, Arvindh Krishnaswamy
L10	Detection of anomalous sounds for machine condition monitoring using classification confidence	Tadanobu Inoue, Phongtharin Vinayavekhin, Shu Morikuni, Shiqiang Wang, Tuan Hoang Trong, David Wood, Michiaki Tatsubori, Ryuki Tachibana
L11	Acoustic scene classification with spectrogram processing strategies	Helin Wang, Yuexian Zou, Dading Chong
L12	Searching for efficient network architectures for acoustic scene classification	Yuzhong Wu, Tan Lee

21:50

Poster highlights

The short highlight videos will be broadcast on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

Tuesday 3rd November

* Recommended sessions for attendees in EST (UTC-5)

0:00

Oral session IV (1st play)

Audio Captioning
(Session Co-Chairs: Il-Young Jeong and Tatsuya Komatsu)

L16	Effects of word-frequency based pre- and post- processings for audio captioning	Daiki Takeuchi, Yuma Koizumi, Yasunori Ohishi, Noboru Harada, Kunio Kashino
L17	Multi-task regularization based on infrequent classes for audio captioning	Emre Çakir, Konstantinos Drossos, Tuomas Virtanen
L18	Temporal sub-sampling of audio feature sequences for automated audio captioning	Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen

1:10

Oral session V (1st play)

Scene Classification and Anomalous Sound Detection II
(Session Co-Chairs: Sakiko Mishima and Nobutaka Ono)

L13	Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping	Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer
L14	ID-Conditioned auto-encoder for unsupervised anomaly detection	Sławomir Kapka
L15	Anomalous sound detection as a simple binary classification problem with careful selection of proxy outlier examples	Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

2:20

* Virtual tour II

Tokyo: A city of old and new

3:30

* Sponsor Event

Short presentations by the platinum and gold sponsors (Hitachi, LINE, and NEC)

4:00

Keynote II (2nd play)

Shin'ichi Satoh
National Institute of Informatics

How benchmarks work for visual recognition research? -- Historical review and future prospects

5:00

Poster highlights

The short highlight videos will be broadcast on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

7:20

* Oral session IV (2nd play)

Audio Captioning
(Session Co-Chairs: Tatsuya Komatsu and Robin Scheibler)

L16	Effects of word-frequency based pre- and post- processings for audio captioning	Daiki Takeuchi, Yuma Koizumi, Yasunori Ohishi, Noboru Harada, Kunio Kashino
L17	Multi-task regularization based on infrequent classes for audio captioning	Emre Çakir, Konstantinos Drossos, Tuomas Virtanen
L18	Temporal sub-sampling of audio feature sequences for automated audio captioning	Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen

8:30

* Oral session III (2nd play)

Scene Classification and Anomalous Sound Detection I
(Session Co-Chairs: Yuma Koizumi and Gordon Wichern)

L10	Detection of anomalous sounds for machine condition monitoring using classification confidence	Tadanobu Inoue, Phongtharin Vinayavekhin, Shu Morikuni, Shiqiang Wang, Tuan Hoang Trong, David Wood, Michiaki Tatsubori, Ryuki Tachibana
L11	Acoustic scene classification with spectrogram processing strategies	Helin Wang, Yuexian Zou, Dading Chong
L12	Searching for efficient network architectures for acoustic scene classification	Yuzhong Wu, Tan Lee
L09	Group masked autoencoder based density estimator for audio anomaly detection	Ritwik Giri, Fangzhou Cheng, Karim Helwani, Srikanth V. Tenneti, Umut Isik, Arvindh Krishnaswamy

10:00

* Oral session V (2nd play)

Scene Classification and Anomalous Sound Detection II
(Session Co-Chairs: Nobutaka Ono and Mark Plumbley)

L13	Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping	Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer
L14	ID-Conditioned auto-encoder for unsupervised anomaly detection	Sławomir Kapka
L15	Anomalous sound detection as a simple binary classification problem with careful selection of proxy outlier examples	Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

End of day2

The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

Wednesday 4th November

* Recommended sessions for attendees in EST (UTC-5)

Day3

3:00

* Welcome

Noboru Harada
NTT Corporation, NTT Communication Science Laboratories

3:10

* Challenge & paper awards

Award Co-Chairs: Sharath Adavanne, Annamaria Mesaros, Romain Serizel, and Sayaka Shiota

3:30

* DCASE2021 announcements

3:50

* Town hall discussion

4:50

* Closing remarks

End of workshop