Proceedings - DCASE

The proceedings of the DCASE2022 Workshop have been published as an electronic publication:

Mathieu Lagrange, Annamaria Mesaros, Thomas Pellegrini, Gaël Richard, Romain Serizel and Dan Stowell (eds.), Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2022), Nov. 2022.

ISBN (Electronic): 978-952-03-2677-7

Total cites: 628 (updated 30.11.2024)

Integrating Isolated Examples with Weakly-Supervised Sound Event Detection: A Direct Approach

Mohammad Abdollahi¹, Romain Serizel¹, Alain Rakotomamonjy¹, and Gilles Gasso¹

¹Universite de Rouen

PDF

Abstract

To mitigate the need for high-quality strong annotations in Sound Event Detection (SED), one approach has been to resort to a mix of weakly labelled data, unlabelled data and a small set of representative (isolated) examples. The common way to integrate the representative examples into training is to use them to create synthetic soundscapes. Synthesizing soundscapes, however, can introduce artefacts and a mismatch with real recordings, which can harm overall performance. A more direct alternative is to use the isolated examples in a form of template matching. To this end, we propose to train an isolated event classifier on the representative examples. By sliding the classifier across a recording, we use its output as an auxiliary feature vector that is concatenated with intermediate spectro-temporal representations extracted by the SED system. Experimental results on the DESED dataset demonstrate improvements in segmentation performance when using the auxiliary features, and results comparable to the baseline when using them without synthetic soundscapes. Furthermore, we show that this auxiliary feature vector block could act as a gateway for integrating external annotated datasets to further boost the SED system's performance.
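
The sliding-classifier idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_classifier` is a hypothetical stand-in for the trained isolated event classifier, and the window length, frame values and feature sizes are made-up assumptions chosen only to show how per-frame classifier outputs could be concatenated with SED features.

```python
from typing import Callable, List

Frame = List[float]


def sliding_aux_features(
    frames: List[Frame],
    classifier: Callable[[List[Frame]], List[float]],
    window: int,
) -> List[List[float]]:
    """Apply `classifier` to a window centred on each frame.

    Windows are truncated at the recording boundaries, so the result
    has exactly one score vector per input frame.
    """
    half = window // 2
    n = len(frames)
    aux = []
    for t in range(n):
        lo = max(0, t - half)
        hi = min(n, t + half + 1)
        aux.append(classifier(frames[lo:hi]))
    return aux


def concat_features(
    sed_features: List[List[float]], aux: List[List[float]]
) -> List[List[float]]:
    """Concatenate SED features and auxiliary scores frame by frame."""
    return [f + a for f, a in zip(sed_features, aux)]


def toy_classifier(segment: List[Frame]) -> List[float]:
    # Hypothetical stand-in: two "class scores" derived from mean energy.
    mean = sum(sum(fr) / len(fr) for fr in segment) / len(segment)
    return [mean, 1.0 - mean]


# Assumed toy data: 4 frames with 2 spectral bins each.
frames = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Assumed intermediate SED representation: 3 values per frame.
sed_features = [[0.1, 0.2, 0.3] for _ in frames]

aux = sliding_aux_features(frames, toy_classifier, window=3)
combined = concat_features(sed_features, aux)
```

Each row of `combined` then holds the three assumed SED values followed by the two auxiliary scores for that frame.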

Impact of Temporal Resolution on Convolutional Recurrent Networks for Audio Tagging and Sound Event Detection

Wim Boes¹ and Hugo Van hamme¹

¹ESAT, KU Leuven, Belgium

3 cites

PDF

Abstract

Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures. Typically, they are trained in a mean teacher setting to deal with the heterogeneous annotation of the available data. In this work, we present a thorough analysis of how changing the temporal resolution of these convolutional recurrent neural networks - which can be done by simply adapting their pooling operations - impacts their performance. By using a variety of evaluation metrics, we investigate the effects of adapting this design parameter under several sound recognition scenarios involving different needs in terms of temporal localization.
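
As a rough illustration of how pooling operations set the temporal resolution, the sketch below pools a sequence of frames along time and computes the resulting output frame period. The hop size and pooling factors are assumed example values, not the configurations studied in the paper, and the helper names are hypothetical.

```python
from typing import List


def avg_pool_time(frames: List[List[float]], factor: int) -> List[List[float]]:
    """Average non-overlapping groups of `factor` consecutive frames.

    Trailing frames that do not fill a complete group are dropped.
    """
    n_out = len(frames) // factor
    return [
        [sum(col) / factor for col in zip(*frames[i * factor:(i + 1) * factor])]
        for i in range(n_out)
    ]


def output_resolution(hop_seconds: float, pool_factors: List[int]) -> float:
    """Duration covered by one output frame after all temporal pooling layers."""
    total = 1
    for f in pool_factors:
        total *= f
    return hop_seconds * total


# Assumed example: five single-bin frames pooled by a factor of 2.
pooled = avg_pool_time([[0.0], [2.0], [4.0], [6.0], [8.0]], 2)

# Assumed example: 16 ms input hop and three pooling layers (2, 2, 1).
coarse = output_resolution(0.016, [2, 2, 1])
# Removing temporal pooling entirely keeps the input resolution.
fine = output_resolution(0.016, [1, 1, 1])
```

Larger pooling factors give fewer, coarser output frames, which is the design parameter whose effect the paper analyses.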

Confidence Regularized Entropy for Polyphonic Sound Event Detection

Won-Gook Choi¹ and Joon-Hyuk Chang¹

¹Department of Electronic Engineering, Hanyang University, Seoul, Republic of Korea

4 cites

PDF

Abstract

One of the main issues in polyphonic sound event detection (PSED) is the class imbalance caused by the unequal proportions of active and inactive frames. Since the target sounds appear only occasionally, binary cross-entropy makes the model mainly fit the inactive frames. This paper introduces an effective objective function, confidence regularized entropy, which regularizes the confidence level to prevent overfitting to the dominant classes. The proposed method exhibits fewer overfitted samples and better detection performance than binary cross-entropy. We also compare our method with another objective function, the asymmetric focal loss, which is also designed to solve the class imbalance problem in PSED. The two objective functions show different system characteristics. From an end-user perspective, we suggest choosing the objective function that suits the intended purpose.
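
The imbalance argument can be made concrete with a small sketch. It shows only standard binary cross-entropy and how much of the total loss comes from inactive frames; it does not implement confidence regularized entropy or the asymmetric focal loss, whose formulas are not given in this abstract. The 5% activity rate and the constant predictions are assumed toy values.

```python
import math
from typing import List


def bce(p: float, y: int, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one frame with predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def inactive_loss_share(targets: List[int], preds: List[float]) -> float:
    """Fraction of the summed BCE loss contributed by inactive (y = 0) frames."""
    total = sum(bce(p, y) for p, y in zip(preds, targets))
    inactive = sum(bce(p, y) for p, y in zip(preds, targets) if y == 0)
    return inactive / total


# Assumed toy clip: 5 active frames and 95 inactive frames.
targets = [1] * 5 + [0] * 95
# An untrained model that outputs 0.5 everywhere.
preds = [0.5] * 100

share = inactive_loss_share(targets, preds)
```

With these assumed values every frame contributes the same loss, so the inactive frames account for 95% of the total, which is why the gradient is dominated by them.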

EDANSA-2019: The Ecoacoustic Dataset from Arctic North Slope Alaska

Enis Berk Çoban¹, Megan Perra², Dara Pir³, and Michael I Mandel¹,⁴

¹The Graduate Center, CUNY, New York, NY, USA, ²Institute of Arctic Biology, UAF, Fairbanks, AK, USA, ³Guttman Community College, CUNY, New York, NY, USA, ⁴Brooklyn College, CUNY, Brooklyn, NY, USA

6 cites

Ensemble of Multiple Anomalous Sound Detectors

MIMII DG: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection for Domain Generalization Task

Description and Discussion on DCASE 2022 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Applying Domain Generalization Techniques

Convolutional Neural Network for Audibility Assessment of Acoustic Alarms

Detection and Identification of Beehive Piping Audio Signals

Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains

CoLoC: Conditioned Localizer and Classifier for Sound Event Localization and Detection

Model Training that Prioritizes Rare Overlapped Labels for Polyphonic Sound Event Detection

Analyzing the Effect of Equal-Angle Spatial Discretization on Sound Event Localization and Detection

Is my Automatic Audio Captioning System so Bad? SPIDEr-max: A Metric to Consider Several Caption Candidates

Multi-Scale Architecture and Device-Aware Data-Random-Drop Based Fine-Tuning Method for Acoustic Scene Classification

A Hybrid System of Sound Event Detection Transformer and Frame-Wise Model for DCASE 2022 Task 4

Unsupervised Anomalous Sound Detection for Machine Condition Monitoring Using Temporal Modulation Features on Gammatone Auditory Filterbank

Few-Shot Bioacoustic Event Detection: Enhanced Classifiers for Prototypical Networks

Leveraging Label Hierarchies for Few-Shot Everyday Sound Recognition

Segment-Level Metric Learning for Few-Shot Bioacoustic Event Detection

Explaining the Decision of Anomalous Sound Detectors

A Device Classification-Aided Multi-Task Framework for Low-Complexity Acoustic Scene Classification

Low-Complexity Acoustic Scene Classification in DCASE 2022 Challenge

A Summarization Approach to Evaluating Audio Captioning

Few-Shot Bioacoustic Event Detection Using an Event-Length Adapted Ensemble of Prototypical Networks

Quantity Over Quality? Investigating the Effects of Volume and Strength of Training Data in Marine Bioacoustics

DG-Mix: Domain Generalization for Anomalous Sound Detection Based on Self-Supervised Learning

Few-Shot Bioacoustic Event Detection at the DCASE 2022 Challenge

Exploring Eco-Acoustic Data with K-Determinantal Point Processes

Polyphonic Sound Event Detection for Highly Dense Birdsong Scenes

Language-Based Audio Retrieval with Textual Embeddings of Tag Names

Latent and Adversarial Data Augmentations for Sound Event Detection and Classification

STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

Improving Natural-Language-Based Audio Retrieval with Transfer Learning and Audio & Text Augmentations

Description and Analysis of Novelties Introduced in DCASE Task 4 2022 on the Baseline System

Sound Event Localization and Detection with Pre-Trained Audio Spectrogram Transformer and Multichannel Separation Network

Knowledge Distillation from Transformers for Low-Complexity Acoustic Scene Classification

Feature Selection Using Alternating Direction Method of Multiplier for Low-Complexity Acoustic Scene Classification

Low-Complexity CNNs for Acoustic Scene Classification

Improved Domain Generalization via Disentangled Multi-Task Learning in Unsupervised Anomalous Sound Detection