Keynotes


Mounya Elhilali

Johns Hopkins University, Department of Electrical and Computer Engineering

Active listening in everyday soundscapes


Abstract

Every day we face the challenge of parsing a cacophony of sounds that constantly impinge on our ears. Yet we have no difficulty navigating these complex soundscapes and making sense of the sound events around us. While completely intuitive and almost automatic, this ability engages intricate brain mechanisms and feedback systems whose neural underpinnings and theoretical formulations are not fully understood. Attention plays a vital role in mapping our acoustic surroundings and guiding neural processing to focus on informative cues in the sensory input. A growing body of work has been amending our view of analysis in the auditory system, replacing the conventional notion of ‘static’ processing in sensory cortex with a more active and malleable mapping that rapidly adapts to the task at hand and the listening conditions. After all, humans and most animals are not specialists but generalists, whose perception is shaped by experience, context and changing behavioral demands. Leveraging these attentional capabilities in audio technologies leads to promising improvements in our ability to track sounds of interest amidst competing distractors.


Biography

Mounya Elhilali received her Ph.D. degree in Electrical and Computer Engineering from the University of Maryland, College Park in 2004. She is now a professor of Electrical and Computer Engineering at Johns Hopkins University, with a joint appointment in the Department of Psychological and Brain Sciences. She directs the Laboratory for Computational Audio Perception and is affiliated with the Center for Speech and Language Processing and the Center for Hearing and Balance. Her research examines sound processing by humans and machines in noisy soundscapes, and aims to reverse-engineer the intelligent processing of sound by brain networks, with applications to speech and audio technologies and to medical systems. She was named the Charles Renn Faculty Scholar in 2015, received a Johns Hopkins Catalyst Award in 2017, and was recognized as an Outstanding Woman Innovator in 2020. Dr. Elhilali is the recipient of the National Science Foundation CAREER award and the Office of Naval Research Young Investigator award.

Video


Shin'ichi Satoh

National Institute of Informatics

How do benchmarks work for visual recognition research? -- Historical review and future prospects


Abstract

Recent technical breakthroughs, especially deep learning technologies, have brought significant performance improvements in many fields, from acoustic signal analysis to visual recognition. There is no doubt that this would not have happened without large-scale, realistic benchmark datasets. This talk will examine the tight relationship between visual recognition tasks and benchmark datasets. As visual recognition tasks evolve, the benchmark datasets that must be prepared for them evolve as well, and their construction is becoming more complicated. The evolution of visual recognition tasks will be showcased along with the corresponding benchmark datasets. A couple of issues, including noise and biases in datasets, will also be discussed.


Biography

Shin'ichi Satoh is a professor at the National Institute of Informatics (NII), Tokyo. He received his Ph.D. degree from the University of Tokyo in 1992. His research interests include image processing, video content analysis, and multimedia databases. He currently leads the video processing project at NII, addressing video analysis, indexing, retrieval, and mining for broadcast video archives.

Video