Sound event detection / Office Live


Challenge results

The official results are shown on the original DCASE2013 Challenge website. The purpose of this page is to present the results in a uniform way, consistent with more recent editions of the DCASE Challenge.

Task description

The event detection challenge will address the problem of identifying individual sound events that are prominent in an acoustic scene. Two distinct experiments will take place: one for simple acoustic scenes without overlapping sounds, and the other for complex scenes in a polyphonic scenario.

A more detailed task description can be found on the task description page.

Systems

Frame-based results

| Code | Author | Affiliation | Technical Report | AEER | F1 (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 2.5900 | 10.7 | 12.1 | 10.6 |
| CPS | Sameer Chauhan | Electrical Engineering, Cooper Union for the Advancement of Science and Art, New York, USA | Chauhan2013 | 2.1160 | 3.8 | 9.2 | 3.0 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 3.1280 | 26.0 | 19.8 | 45.3 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.0840 | 31.9 | 61.8 | 22.3 |
| NR2 | Waldo Nogueira | Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain | Nogueira2013 | 1.8850 | 34.7 | 37.1 | 35.0 |
| NVM_1 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.1150 | 40.9 | 59.9 | 32.9 |
| NVM_2 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.1020 | 42.8 | 61.1 | 34.3 |
| NVM_3 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.2120 | 45.5 | 57.2 | 38.8 |
| NVM_4 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.3600 | 42.9 | 50.8 | 37.8 |
| SCS_1 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.1670 | 53.0 | 59.9 | 48.3 |
| SCS_2 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.0160 | 61.5 | 66.2 | 57.8 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.0010 | 43.4 | 68.1 | 32.6 |
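
This page does not define the frame-based metrics, so the following is only a minimal sketch of how such scores are commonly computed. It assumes each frame is represented as a set of active event labels, and it assumes AEER is the acoustic event error rate (D + I + S) / N with S = min(D, I), where D counts deletions, I counts insertions, and N is the number of reference events. The function name and the data layout are hypothetical, and the scores are returned as fractions rather than percentages.

```python
def frame_based_scores(reference, estimated):
    """Score a system output frame by frame.

    reference, estimated: equal-length lists holding one set of active
    event labels per frame (hypothetical layout, chosen for illustration).
    """
    n_ref = n_sys = n_corr = 0
    for ref, est in zip(reference, estimated):
        n_ref += len(ref)          # events that should be active
        n_sys += len(est)          # events the system reported
        n_corr += len(ref & est)   # events reported correctly

    deletions = n_ref - n_corr     # missed events
    insertions = n_sys - n_corr    # extra events
    # Assumption: substitutions are defined as min(D, I).
    substitutions = min(deletions, insertions)

    precision = n_corr / n_sys if n_sys else 0.0
    recall = n_corr / n_ref if n_ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    aeer = (deletions + insertions + substitutions) / n_ref if n_ref else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "aeer": aeer}


# Four frames: one miss (frame 2) and one false alarm (frame 3).
reference = [{"speech"}, {"speech"}, set(), {"cough"}]
estimated = [{"speech"}, set(), {"knock"}, {"cough"}]
scores = frame_based_scores(reference, estimated)
```

Under this assumed definition the error rate is not bounded by 1, which would be consistent with the AEER values above 1 reported in the table.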

Event-based results (onset-only)

| Code | Author | Affiliation | Technical Report | AEER | F1 (%) | Precision (%) | Recall (%) | Class-wise AEER | Class-wise F1 (%) | Class-wise Precision (%) | Class-wise Recall (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 5.9000 | 7.4 | 4.8 | 18.2 | 5.9600 | 9.0 | 7.3 | 21.6 |
| CPS | Sameer Chauhan | Electrical Engineering, Cooper Union for the Advancement of Science and Art, New York, USA | Chauhan2013 | 2.2850 | 2.2 | 3.2 | 1.9 | 1.8720 | 0.7 | 0.4 | 2.2 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 2.5190 | 26.7 | 22.8 | 33.5 | 2.1820 | 30.7 | 31.0 | 35.9 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.7790 | 15.5 | 61.8 | 22.2 | 1.5560 | 13.2 | 14.2 | 13.8 |
| NR2 | Waldo Nogueira | Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain | Nogueira2013 | 3.0760 | 19.2 | 14.8 | 27.7 | 2.8570 | 21.5 | 20.9 | 28.5 |
| NVM_1 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.8640 | 32.6 | 33.9 | 32.2 | 1.6390 | 29.4 | 28.9 | 34.2 |
| NVM_2 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.8520 | 34.2 | 34.9 | 34.2 | 1.6020 | 33.0 | 33.1 | 33.3 |
| NVM_3 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.8270 | 34.5 | 36.1 | 33.8 | 1.5750 | 33.5 | 35.1 | 24.6 |
| NVM_4 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 1.9060 | 30.5 | 31.8 | 30.1 | 1.6500 | 28.2 | 30.2 | 30.8 |
| SCS_1 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.6690 | 39.5 | 41.7 | 37.8 | 1.5790 | 36.3 | 40.6 | 39.6 |
| SCS_2 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.6010 | 45.2 | 45.5 | 45.4 | 1.5110 | 41.5 | 43.4 | 46.4 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 2.0540 | 30.8 | 31.3 | 32.5 | 1.7620 | 24.6 | 22.3 | 33.0 |

Event-based results (onset-offset)

| Code | Author | Affiliation | Technical Report | AEER | F1 (%) | Precision (%) | Recall (%) | Class-wise AEER | Class-wise F1 (%) | Class-wise Precision (%) | Class-wise Recall (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 6.3180 | 1.6 | 1.0 | 4.2 | 6.4620 | 1.9 | 1.4 | 4.9 |
| CPS | Sameer Chauhan | Electrical Engineering, Cooper Union for the Advancement of Science and Art, New York, USA | Chauhan2013 | 2.3010 | 1.6 | 2.4 | 1.3 | 1.8910 | 0.5 | 0.3 | 1.6 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 2.6760 | 22.4 | 19.1 | 28.2 | 2.3700 | 25.3 | 25.5 | 29.6 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.8310 | 13.5 | 21.9 | 12.9 | 1.6060 | 12.0 | 13.2 | 12.1 |
| NR2 | Waldo Nogueira | Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain | Nogueira2013 | 3.2440 | 15.3 | 11.8 | 22.0 | 3.0100 | 17.6 | 16.6 | 23.3 |
| NVM_1 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 2.0950 | 24.9 | 26.1 | 24.5 | 1.8990 | 21.8 | 21.3 | 25.6 |
| NVM_2 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 2.0950 | 26.3 | 27.1 | 26.1 | 1.8770 | 24.9 | 24.7 | 28.1 |
| NVM_3 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 2.0520 | 27.0 | 28.4 | 26.3 | 1.8460 | 24.6 | 25.3 | 27.6 |
| NVM_4 | Maria E. Niessen | AGT International, Darmstadt, Germany | Niessen2013 | 2.0830 | 24.7 | 25.9 | 24.2 | 1.8490 | 21.6 | 23.1 | 24.1 |
| SCS_1 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.7490 | 36.7 | 38.9 | 35.1 | 1.6770 | 34.2 | 38.8 | 36.3 |
| SCS_2 | Jens Schröder | Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany | Schroeder2013 | 1.7270 | 41.1 | 41.4 | 41.2 | 1.6460 | 38.3 | 40.6 | 41.9 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 2.2240 | 25.4 | 25.8 | 26.9 | 1.9490 | 20.4 | 18.8 | 26.8 |
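
The difference between the onset-only and onset-offset results can be illustrated with a small sketch. The matching rules used here are assumptions, not taken from this page: an estimated event is counted as correct when its label matches a reference event and its onset lies within 100 ms of the reference onset, and the onset-offset variant additionally requires the offset to lie within 50% of the reference event's duration. The greedy one-to-one matching, the function name, and the tuple layout are simplifications chosen for illustration.

```python
def event_based_scores(reference, estimated, onset_tol=0.1,
                       use_offset=False, offset_ratio=0.5):
    """Match events and return precision, recall and F1 as fractions.

    reference, estimated: lists of (onset_sec, offset_sec, label) tuples.
    onset_tol and offset_ratio are assumed tolerance values.
    """
    matched = set()  # indices of reference events already used
    n_corr = 0
    for est_on, est_off, est_label in estimated:
        for i, (ref_on, ref_off, ref_label) in enumerate(reference):
            if i in matched or ref_label != est_label:
                continue
            if abs(est_on - ref_on) > onset_tol:
                continue
            if use_offset:
                # Offset tolerance scales with the reference event duration.
                offset_tol = offset_ratio * (ref_off - ref_on)
                if abs(est_off - ref_off) > offset_tol:
                    continue
            matched.add(i)
            n_corr += 1
            break

    precision = n_corr / len(estimated) if estimated else 0.0
    recall = n_corr / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


reference = [(0.0, 1.0, "speech"), (2.0, 2.4, "knock")]
estimated = [(0.05, 1.2, "speech"), (2.05, 3.0, "knock"), (4.0, 4.5, "cough")]

onset_only = event_based_scores(reference, estimated)
onset_offset = event_based_scores(reference, estimated, use_offset=True)
```

In this toy example the "knock" event has a correct onset but ends far too late, so it counts as a hit under onset-only scoring and as a miss under onset-offset scoring. That mirrors the generally lower F1 values in the onset-offset table.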

System characteristics

| Code | Technical Report | Input | Sampling rate | Features | Classifier |
|---|---|---|---|---|---|
| DCASE2013 baseline | | mono | 44.1kHz | NMF | NMF |
| CPS | Chauhan2013 | mono | 44.1kHz | loudness, wavelet decomposition coefficients, autocorrelation, spectral centroid, spectral flux, spectral entropy, short time energy, spectral roll-off, MFCC | LRT |
| DHV | Diment2013 | mono | 44.1kHz | MFCC | HMM |
| GVV | Gemmeke2013 | mono | 44.1kHz | NMF | HMM |
| NR2 | Nogueira2013 | mono | 44.1kHz | MFCC | SVM |
| NVM_1 | Niessen2013 | mono | 44.1kHz | STE, ZCR, flatness, spectral flux, spectral roll-off, spectral flatness, spectral brightness, MFCC, LPC | hierarchical HMM, random forests |
| NVM_2 | Niessen2013 | mono | 44.1kHz | STE, ZCR, flatness, spectral flux, spectral roll-off, spectral flatness, spectral brightness, MFCC, LPC | hierarchical HMM, random forests |
| NVM_3 | Niessen2013 | mono | 44.1kHz | STE, ZCR, flatness, spectral flux, spectral roll-off, spectral flatness, spectral brightness, MFCC, LPC | hierarchical HMM, random forests |
| NVM_4 | Niessen2013 | mono | 44.1kHz | STE, ZCR, flatness, spectral flux, spectral roll-off, spectral flatness, spectral brightness, MFCC, LPC | hierarchical HMM, random forests |
| SCS_1 | Schroeder2013 | mono | 44.1kHz | Gabor filterbank | HMM |
| SCS_2 | Schroeder2013 | mono | 44.1kHz | Gabor filterbank | HMM |
| VVK | Vuegen2013 | mono | 44.1kHz | MFCC | GMM |



Technical reports

Event Detection and Classification

Sameer Chauhan, Sharang Phadke and Christian Sherland
Electrical Engineering, Cooper Union for the Advancement of Science and Art, New York, USA

Abstract

The IEEE AASP Challenge addresses the problem of acoustic event detection and classification in an office environment. Our system performs segmentation and event classification on a continuous stream of acoustic activity in an office using basic feature extraction techniques and a single layer frame-by-frame classifier. We achieve high classification accuracy in noiseless environments, but performance severely deteriorates in noisy environments.

System characteristics
Input mono
Sampling rate 44.1kHz
Features loudness, wavelet decomposition coefficients, autocorrelation, spectral centroid, spectral flux, spectral entropy, short time energy, spectral roll-off, MFCC
Classifier LRT

Sound Event Detection for Office Live and Office Synthetic AASP Challenge

Aleksandr Diment, Toni Heittola and Tuomas Virtanen
Tampere University of Technology, Tampere, Finland

Abstract

We present a sound event detection system based on hidden Markov models. The system is evaluated with development material provided in the AASP Challenge on Detection and Classification of Acoustic Scenes and Events. Two approaches using the same basic detection scheme are presented. The first one, developed for acoustic scenes with non-overlapping sound events, is evaluated with the Office Live development dataset. The second one, developed for acoustic scenes with some degree of overlapping sound events, is evaluated with the Office Synthetic development dataset.

System characteristics
Input mono
Sampling rate 44.1kHz
Features MFCC
Classifier HMM

An Exemplar-Based NMF Approach for Audio Event Detection

Jort F Gemmeke (1), Lode Vuegen (1,2,3), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium

Abstract

We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation (NMF). Building on recent work in noise robust automatic speech recognition, we model events as a linear combination of dictionary atoms, and mixtures as a linear combination of overlapping events. The exemplar based dictionary is created by extracting all available training data, artificially augmented by linear time warping at multiple rates. The method is evaluated on the Office Live and Office Synthetic development datasets released by the AASP Challenge on Detection and Classification of Acoustic Scenes and Events.
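
The augmentation step mentioned in the abstract, linear time warping of exemplars at multiple rates, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the warping rates or the interpolation scheme, so the rates below, the use of linear interpolation between neighbouring frames, and the function names are all hypothetical.

```python
def time_warp(frames, rate):
    """Resample a sequence of feature frames along the time axis.

    frames: list of equal-length lists (one spectral frame per time step).
    rate: speed factor; rate > 1 shortens the exemplar, rate < 1 stretches it.
    """
    n_in = len(frames)
    n_out = max(1, round(n_in / rate))
    if n_in == 1 or n_out == 1:
        return [list(frames[0])]

    warped = []
    for t in range(n_out):
        # Map the output index onto the input time axis.
        pos = t * (n_in - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        # Linear interpolation between the two neighbouring frames.
        warped.append([(1 - w) * a + w * b
                       for a, b in zip(frames[lo], frames[hi])])
    return warped


def augment(exemplar, rates=(0.5, 1.0, 2.0)):
    """Return one warped copy of the exemplar per (hypothetical) rate."""
    return [time_warp(exemplar, r) for r in rates]


exemplar = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
variants = augment(exemplar)
```

Each warped copy could then be added to the dictionary as an additional atom sequence, which is one way to read "artificially augmented" in the abstract.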

System characteristics
Input mono
Sampling rate 44.1kHz
Features NMF
Classifier HMM

Hierarchical Sound Event Detection

Maria E. Niessen, Tim L. M Van Kasteren and Andreas Merentitis
AGT International, Darmstadt, Germany

Abstract

Environmental sound recognition in real-world conditions is a particularly challenging topic, since it requires significant efforts on both the feature extraction and the classification modeling parts in order to achieve satisfactory results. In the presented work we propose a multi-tier method that employs best of breed techniques at all relevant tasks; initially feature extraction takes place focusing on a broad range of audio features. Following feature extraction a Hierarchical Hidden Markov Model classifier scheme with explicit modeling of the finishing of the state to better detect transitions is developed. Finally, the best result is achieved when ensemble methods are added on top of the previous scheme. Specifically, a variation of Stacking using a Random Forest and the HHMM as level-1 classifiers and a second instance of HHMM as the metaclassifier is selected. Results indicate that this final method is on one hand able to deliver the best overall performance, as well as explore different tradeoffs between classes and metrics (e.g. emphasize on specific metrics or classes that are of higher importance).

System characteristics
Input mono
Sampling rate 44.1kHz
Features STE, ZCR, flatness, spectral flux, spectral roll-off, spectral flatness, spectral brightness, MFCC, LPC
Classifier hierarchical HMM, random forests

Automatic Event Classification Using Front End Single Channel Noise Reduction, MFCC Features and a Support Vector Machine Classifier

Waldo Nogueira, Gerard Roma and Perfecto Herrera
Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain

Abstract

This submission to the sub-task scene event classification Office Live of the IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events first uses a single channel noise reduction to clean stationary background noise; next, MFCCs are extracted, and finally a support vector machine classifier is used to classify the events. This short paper explains how to use the implementation and gives a short description of the system.

System characteristics
Input mono
Sampling rate 44.1kHz
Features MFCC
Classifier SVM

Acoustic Event Detection Using Signal Enhancement and Spectro-Temporal Feature Extraction

Jens Schröder (1), Benjamin Cauchi (1), Marc René Schädler (2), Niko Moritz (1), Kamil Adiloglu (3), Jörn Anemüller (1,2), Simon Doclo (1,2), Birger Kollmeier (1,2,3) and Stefan Goetze (1)
(1) Project Group Hearing, Speech and Audio Technology, Fraunhofer IDMT, Oldenburg, Germany; (2) Department of Medical Physics and Acoustics, University of Oldenburg, Oldenburg, Germany; (3) Hörtech gGmbH, Oldenburg, Germany

Abstract

In this paper, an acoustic event detection system is proposed. It consists of a noise reduction signal enhancement step based on the noise power spectral density estimator proposed in [1] and on the noise suppression by [2], a Gabor filterbank feature extraction stage, and a two-layer hidden Markov model as back-end classifier. Optimization on the development set yields an F-score of up to 0.73 on the frame-based measure and 0.63 on the onset-and-offset-based measure.

System characteristics
Input mono
Sampling rate 44.1kHz
Features Gabor filterbank
Classifier HMM

An MFCC-GMM Approach for Event Detection and Classification

Lode Vuegen (1,2,3), Bert Van Den Broeck (2,3,4), Peter Karsmakers (2,3,4), Jort F Gemmeke (1), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium

Abstract

This abstract explores Gaussian Mixture Models (GMM) estimated from Mel Frequency Cepstral Coefficients (MFCCs) for acoustic event detection and classification. To limit the impact of silence, a shared background model is used. An average F-score of 48% for the Office Live subtask is obtained. However, the analysis reveals that the proposed method has difficulty coping with the large intra-class variations (e.g. time durations, dynamic range, characteristic sounds) in the provided dataset.

System characteristics
Input mono
Sampling rate 44.1kHz
Features MFCC
Classifier GMM