DCASE2017 Workshop

Workshop on Detection and Classification of Acoustic Scenes and Events

16 - 17 November 2017, Munich, Germany

Sponsors

Audio Analytic

Google

Research at Google

The DCASE 2017 Workshop is the second Workshop on Detection and Classification of Acoustic Scenes and Events, organized once again in conjunction with the DCASE challenge. We aim to bring together researchers from universities and companies with an interest in the topic, and to provide an opportunity for the scientific exchange of ideas and opinions.

The technical program will include invited speakers on the topic of computational everyday sound analysis and recognition, and oral and poster presentations of accepted papers. In addition, a special poster session will be dedicated to the DCASE 2017 challenge entries.

The workshop is organized as a one-and-a-half-day event, to be held on 16-17 November 2017 at Hotel Maritim, Munich, Germany.

Topics

We invite submissions on the topics of computational analysis of acoustic scenes and sound events, including but not limited to:

Tasks in computational environmental audio analysis

Acoustic scene classification
Sound event detection and localization
Audio tagging
Challenges in real-life applications (e.g., rare events, overlapping sound events, weak labels)

Methods for computational environmental audio analysis

Signal processing methods
Machine learning methods
Auditory-motivated methods
Cross-disciplinary methods involving, e.g., acoustics, biology, psychology, geography, materials science, transport science

Resources, applications, and evaluation of computational environmental audio analysis

Publicly available datasets or software, taxonomies and ontologies, evaluation procedures
Applications
Descriptions of systems submitted to the DCASE 2017 Challenge

The results of the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge 2017 will also be announced at the workshop.

Program

Full technical program

Day 1, Thursday 16.11.2017, 9:00 - 18:00

8:45 Registration & Coffee

9:10 Welcome, Annamaria Mesaros (Tampere University of Technology, Finland)

9:20 Keynote: General-Purpose Sound Event Recognition, Shawn Hershey (Google Research)

10:10 Coffee break

10:30 Oral session I: presentations of workshop papers (5 presentations)

12:30 Lunch

14:00 Oral session II: presentations of workshop papers (4 presentations)

15:20 Coffee

15:20 Poster session I: posters of workshop papers and DCASE2017 challenge results (12 posters)

17:00 Open discussion, moderated by Mark Plumbley (University of Surrey, United Kingdom)

Day 2, Friday 17.11.2017, 9:00 - 12:10

9:00 Keynote: Sound Texture Perception via Summary Statistics, Josh McDermott (MIT, USA)

9:50 Oral session III: presentations of workshop papers (3 presentations)

10:50 Coffee

10:50 Poster session II: posters of workshop papers (8 posters)

12:00 Closing remarks

Keynote speakers

General-Purpose Sound Event Recognition

Shawn Hershey

Machine Hearing Group, Google Research

Day 1, Thursday 16.11.2017, 9:20

Abstract

Inspired by the success of general-purpose object recognition in images, we have been working on automatic, real-time systems for recognizing sound events regardless of domain. Our goal is a system that can tag or describe an arbitrary soundtrack - as might be found on a media sharing site like YouTube - using terms that make sense to a human. I will cover the process of defining this task, our deep learning approach, our efforts to collect training data, and our current results. I'll discuss some factors important for accurate models, and some ideas about how to get the best return from manual labeling investment.

Biography

Shawn Hershey is a software engineer at Google Research, working in the Machine Hearing Group on machine learning for speech and audio processing. He is currently working on soundtrack classification and audio event detection. Before Google he worked as the first Software Engineer at Lyric Semiconductors, building tools to aid the development of hardware accelerators for AI. On the side, Shawn travels the world teaching Lindy Hop and blues dancing and playing in swing and blues bands. Long ago Shawn graduated from the University of Rochester with a BA in Computer Science and half of a degree from the Eastman School of Music.

Research Scientist

Shawn Hershey

Machine Hearing Group

Google Research New York

Sound Texture Perception via Summary Statistics

Josh McDermott

Fred & Carole Middleton Career Development Assistant Professor, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology

Day 2, Friday 17.11.2017, 9:00

Abstract

Sound textures are produced by superpositions of large numbers of similar acoustic features (as in rain, swarms of insects, or galloping horses). Textures are noteworthy for being stationary, raising the possibility that time-averaged statistics might capture their structure. I will describe several lines of work testing this idea. I will show how the synthesis of textures from statistics of biological auditory models provides evidence for statistical texture representations. I will then describe experiments that characterize the process by which texture statistics are measured by the auditory system, and that explore their role in auditory scene analysis.

Biography

Josh McDermott is a perceptual scientist studying sound and hearing in the Department of Brain and Cognitive Sciences at MIT, where he is the Fred & Carole Middleton Career Development Assistant Professor and heads the Laboratory for Computational Audition. His research addresses human and machine audition using tools from experimental psychology, engineering, and neuroscience. McDermott obtained a BA in Brain and Cognitive Science from Harvard, an MPhil in Computational Neuroscience from University College London, a PhD in Brain and Cognitive Science from MIT, and postdoctoral training in psychoacoustics at the University of Minnesota and in computational neuroscience at NYU. He is the recipient of a Marshall Scholarship, a James S. McDonnell Foundation Scholar Award, and an NSF CAREER Award.

Assistant Professor

Josh McDermott

Department of Brain and Cognitive Sciences

Massachusetts Institute of Technology

Proceedings

In total, 27 papers were accepted for presentation at the workshop. The full proceedings can be accessed here.

Venue

Hotel Maritim, Munich, Germany

Instructions for authors

Detailed instructions for preparation and submission of workshop papers can be found here.

Registration

Registration is currently open for authors and non-authors. The registration system (ConfTool) can be accessed here. The registration fee is 150 euros and includes lunch and coffee breaks.

At least one author per accepted paper must register for the Workshop before 20th October 2017.

Invitation letter

A personalized invitation letter (mentioning the name of the participant) is available in your ConfTool user account after registration and payment of the workshop fee. The document is in PDF format and can be printed for visa application purposes.

If you require a more detailed letter, please enter your passport information in the personal details of your ConfTool user account and send a request by email to dcase.workshop@gmail.com. Please also mention whether you need an original signature on the letter.