Program in EST (UTC-5)

Note: All presentations in the oral sessions will be played twice. Attendees in different time zones can pick either of the two plays.

Sunday 1st November

* Recommended sessions for attendees in EST (UTC-5)
Day1
19:00 * Welcome

Nobutaka Ono
Tokyo Metropolitan University, Department of Computer Science

19:20 * Keynote I (1st play)

Mounya Elhilali
Johns Hopkins University, Department of Electrical and Computer Engineering

Active listening in everyday soundscapes

Abstract & bio

20:20 * Oral session I (1st play)

Sound Event Detection and Localization I
(Session Co-Chairs: Keisuke Imoto and Jonathan Le Roux)

L01

Conformer-based sound event detection with semi-supervised learning and data augmentation
Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

L02

On the effectiveness of spatial and multi-channel features for multi-channel polyphonic sound event detection
Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

L03

Guided multi-branch learning systems for sound event detection with sound separation
Yuxin Huang, Liwei Lin, Shuo Ma, Xiangdong Wang, Hong Liu, Yueliang Qian, Min Liu, Kazushige Ouchi

L04

Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking
Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

21:50 Poster highlights

The short highlight videos will be streamed on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

Monday 2nd November

* Recommended sessions for attendees in EST (UTC-5)
0:00 Poster highlights

The short highlight videos will be streamed on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

2:00 * Virtual tour I

Japan: Is Japan an ideal destination for your holidays?

3:10 * Challenge spotlights

Reports on the 6 tasks of the DCASE 2020 challenge
(Session Co-Chairs: Annamaria Mesaros and Romain Serizel)

4:10 Oral session II (1st play)

Sound Event Detection and Localization II
(Session Co-Chairs: Sharath Adavanne and Yasunori Ohishi)

L05

On multitask loss function for audio event detection and localization
Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

L06

Event-independent network for polyphonic sound event localization and detection
Yin Cao, Turab Iqbal, Qiuqiang Kong, Yue Zhong, Wenwu Wang, Mark D. Plumbley

L07

A multi-resolution approach to sound event detection in DCASE 2020 Task4
Diego de Benito-Gorron, Daniel Ramos, Doroteo T. Toledano

L08

Training sound event detection on a heterogeneous dataset
Nicolas Turpault, Romain Serizel

6:00 Oral session I (2nd play)

Sound Event Detection and Localization I
(Session Co-Chairs: Yin Cao and Keisuke Imoto)

L01

Conformer-based sound event detection with semi-supervised learning and data augmentation
Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda

L02

On the effectiveness of spatial and multi-channel features for multi-channel polyphonic sound event detection
Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

L03

Guided multi-branch learning systems for sound event detection with sound separation
Yuxin Huang, Liwei Lin, Shuo Ma, Xiangdong Wang, Hong Liu, Yueliang Qian, Min Liu, Kazushige Ouchi

L04

Ensemble of sequence matching networks for dynamic sound event localization, detection, and tracking
Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon Seng Gan

7:30 Poster highlights

The short highlight videos will be streamed on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

9:00 Keynote I (2nd play)

Mounya Elhilali
Johns Hopkins University, Department of Electrical and Computer Engineering

Active listening in everyday soundscapes

Abstract & bio

10:00 * Oral session II (2nd play)

Sound Event Detection and Localization II
(Session Co-Chairs: Yasunori Ohishi and Tuomas Virtanen)

L05

On multitask loss function for audio event detection and localization
Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

L06

Event-independent network for polyphonic sound event localization and detection
Yin Cao, Turab Iqbal, Qiuqiang Kong, Yue Zhong, Wenwu Wang, Mark D. Plumbley

L07

A multi-resolution approach to sound event detection in DCASE 2020 Task4
Diego de Benito-Gorron, Daniel Ramos, Doroteo T. Toledano

L08

Training sound event detection on a heterogeneous dataset
Nicolas Turpault, Romain Serizel

End of day1

The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

Day2
19:10 * Welcome

Yohei Kawaguchi
Hitachi, Ltd.

19:20 * Keynote II (1st play)

Shin'ichi Satoh
National Institute of Informatics

How benchmarks work for visual recognition research? -- Historical review and future prospects

Abstract & bio

20:20 * Oral session III (1st play)

Scene Classification and Anomalous Sound Detection I
(Session Co-Chairs: Yuma Koizumi and Gordon Wichern)

L09

Group masked autoencoder based density estimator for audio anomaly detection
Ritwik Giri, Fangzhou Cheng, Karim Helwani, Srikanth V. Tenneti, Umut Isik, Arvindh Krishnaswamy

L10

Detection of anomalous sounds for machine condition monitoring using classification confidence
Tadanobu Inoue, Phongtharin Vinayavekhin, Shu Morikuni, Shiqiang Wang, Tuan Hoang Trong, David Wood, Michiaki Tatsubori, Ryuki Tachibana

L11

Acoustic scene classification with spectrogram processing strategies
Helin Wang, Yuexian Zou, Dading Chong

L12

Searching for efficient network architectures for acoustic scene classification
Yuzhong Wu, Tan Lee

21:50 Poster highlights

The short highlight videos will be streamed on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

Tuesday 3rd November

* Recommended sessions for attendees in EST (UTC-5)
0:00 Oral session IV (1st play)

Audio Captioning
(Session Co-Chairs: Il-Young Jeong and Tatsuya Komatsu)

L16

Effects of word-frequency based pre- and post- processings for audio captioning
Daiki Takeuchi, Yuma Koizumi, Yasunori Ohishi, Noboru Harada, Kunio Kashino

L17

Multi-task regularization based on infrequent classes for audio captioning
Emre Çakir, Konstantinos Drossos, Tuomas Virtanen

L18

Temporal sub-sampling of audio feature sequences for automated audio captioning
Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen

1:10 Oral session V (1st play)

Scene Classification and Anomalous Sound Detection II
(Session Co-Chairs: Sakiko Mishima and Nobutaka Ono)

L13

Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping
Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer

L14

ID-Conditioned auto-encoder for unsupervised anomaly detection
Sławomir Kapka

L15

Anomalous sound detection as a simple binary classification problem with careful selection of proxy outlier examples
Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

2:20 * Virtual tour II

Tokyo: A city of old and new

3:30 * Sponsor Event

Short presentations by the platinum and gold sponsors (Hitachi, LINE, and NEC)

4:00 Keynote II (2nd play)

Shin'ichi Satoh
National Institute of Informatics

How benchmarks work for visual recognition research? -- Historical review and future prospects

Abstract & bio

5:00 Poster highlights

The short highlight videos will be streamed on the virtual platform by the organizers. The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

List of posters

7:20 * Oral session IV (2nd play)

Audio Captioning
(Session Co-Chairs: Tatsuya Komatsu and Robin Scheibler)

L16

Effects of word-frequency based pre- and post- processings for audio captioning
Daiki Takeuchi, Yuma Koizumi, Yasunori Ohishi, Noboru Harada, Kunio Kashino

L17

Multi-task regularization based on infrequent classes for audio captioning
Emre Çakir, Konstantinos Drossos, Tuomas Virtanen

L18

Temporal sub-sampling of audio feature sequences for automated audio captioning
Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen

8:30 * Oral session III (2nd play)

Scene Classification and Anomalous Sound Detection I
(Session Co-Chairs: Yuma Koizumi and Gordon Wichern)

L09

Detection of anomalous sounds for machine condition monitoring using classification confidence
Tadanobu Inoue, Phongtharin Vinayavekhin, Shu Morikuni, Shiqiang Wang, Tuan Hoang Trong, David Wood, Michiaki Tatsubori, Ryuki Tachibana

L10

Acoustic scene classification with spectrogram processing strategies
Helin Wang, Yuexian Zou, Dading Chong

L11

Searching for efficient network architectures for acoustic scene classification
Yuzhong Wu, Tan Lee

L12

Group masked autoencoder based density estimator for audio anomaly detection
Ritwik Giri, Fangzhou Cheng, Karim Helwani, Srikanth V. Tenneti, Umut Isik, Arvindh Krishnaswamy

10:00 * Oral session V (2nd play)

Scene Classification and Anomalous Sound Detection II
(Session Co-Chairs: Nobutaka Ono and Mark Plumbley)

L13

Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping
Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer

L14

ID-Conditioned auto-encoder for unsupervised anomaly detection
Sławomir Kapka

L15

Anomalous sound detection as a simple binary classification problem with careful selection of proxy outlier examples
Paul Primus, Verena Haunschmid, Patrick Praher, Gerhard Widmer

End of day2

The 15-minute presentation videos and Q&A fora can be accessed at any time during the workshop.

Wednesday 4th November

* Recommended sessions for attendees in EST (UTC-5)
Day3
3:00 * Welcome

Noboru Harada
NTT Corporation, NTT Communication Science Laboratories

3:10 * Challenge & paper awards

Award Co-Chairs: Sharath Adavanne, Annamaria Mesaros, Romain Serizel, and Sayaka Shiota

3:30 * DCASE2021 announcements


3:50 * Town hall discussion


4:50 * Closing remarks


End of workshop