Sound event detection / Office Synthetic


Challenge results

The official results are published on the original DCASE2013 Challenge website. The purpose of this page is to present those results in a format consistent with more recent editions of the DCASE Challenge.

Task description

The event detection challenge will address the problem of identifying individual sound events that are prominent in an acoustic scene. Two distinct experiments will take place: one for simple acoustic scenes without overlapping sounds, and the other using complex scenes in a polyphonic scenario.

A more detailed task description can be found on the task description page.

Systems

Frame-based results

| Rank | Code | Author | Affiliation | Technical Report | AEER | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| | DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 2.8040 | 12.8 | 14.4 | 14.9 |
| | DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 7.9800 | 18.7 | 13.6 | 45.3 |
| | GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.3180 | 21.3 | 38.5 | 15.2 |
| | VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.8880 | 13.5 | 22.6 | 12.3 |
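
As a rough illustration of how frame-based scores like those in the frame-based results can be computed, the sketch below compares a reference annotation and a system output, each given as one set of active event labels per frame. The function name `frame_metrics`, the toy labels, and the error-rate formula (substitutions, deletions and insertions counted per frame and normalised by the number of reference events) are assumptions for illustration. The official DCASE2013 evaluation scripts may define AEER differently.

```python
def frame_metrics(reference, estimated):
    """Frame-based precision, recall, F1 and an error rate.

    reference, estimated: lists of sets, one set of active labels per frame.
    """
    if len(reference) != len(estimated):
        raise ValueError("both sequences must cover the same frames")

    n_ref = n_sys = n_corr = 0
    subs = dels = ins = 0
    for ref, est in zip(reference, estimated):
        correct = len(ref & est)
        fn = len(ref) - correct   # reference events missed in this frame
        fp = len(est) - correct   # extra events reported in this frame
        s = min(fn, fp)           # a miss paired with an extra counts as a substitution
        subs += s
        dels += fn - s
        ins += fp - s
        n_ref += len(ref)
        n_sys += len(est)
        n_corr += correct

    precision = n_corr / n_sys if n_sys else 0.0
    recall = n_corr / n_ref if n_ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Assumed error-rate definition; not taken from the challenge scripts.
    error_rate = (subs + dels + ins) / n_ref if n_ref else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "error_rate": error_rate}


# Toy example: four frames, with overlapping events in the second frame.
reference = [{"keys"}, {"keys", "phone"}, {"phone"}, set()]
estimated = [{"keys"}, {"keys"}, {"doorslam"}, {"phone"}]
scores = frame_metrics(reference, estimated)
```

In this toy case two of the four reference frame-events are detected, so precision, recall and F1 are all 0.5, and one substitution, one deletion and one insertion give an error rate of 0.75.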

Event-based results (onset-only)

| Rank | Code | Author | Affiliation | Technical Report | AEER | F1 | Precision | Recall | Class-wise AEER | Class-wise F1 | Class-wise Precision | Class-wise Recall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 6.2630 | 7.8 | 4.9 | 21.6 | 5.3320 | 9.5 | 7.1 | 21.8 |
| | DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 3.5157 | 16.1 | 12.4 | 26.0 | 2.9030 | 18.7 | 17.2 | 26.1 |
| | GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.6810 | 17.0 | 22.3 | 14.2 | 1.2160 | 14.2 | 17.1 | 14.3 |
| | VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.6830 | 13.8 | 20.9 | 11.3 | 1.3770 | 10.5 | 12.3 | 11.5 |

Event-based results (onset-offset)

| Rank | Code | Author | Affiliation | Technical Report | AEER | F1 | Precision | Recall | Class-wise AEER | Class-wise F1 | Class-wise Precision | Class-wise Recall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 6.8710 | 0.6 | 0.4 | 1.4 | 5.9450 | 0.3 | 0.2 | 1.3 |
| | DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 3.7643 | 11.1 | 8.6 | 17.7 | 3.1550 | 12.2 | 11.1 | 17.7 |
| | GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.7660 | 13.6 | 18.0 | 11.4 | 1.2980 | 11.4 | 13.6 | 11.5 |
| | VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.7980 | 9.2 | 14.0 | 7.6 | 1.4910 | 7.5 | 9.1 | 7.7 |
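
The two sets of event-based results differ in how a detected event is matched to a reference event: onset-only scoring checks the start time, while onset-offset scoring also checks the end time. A minimal sketch of that distinction follows. The tolerance values (100 ms on the onset; for the offset, the larger of 100 ms and half the reference duration), the greedy one-to-one matching, and the function names are assumptions for illustration, not the official evaluation code.

```python
def is_match(ref, est, use_offset, onset_tol=0.1, offset_ratio=0.5):
    """Decide whether an estimated event matches a reference event.

    ref and est are (onset_s, offset_s, label) tuples.
    """
    r_on, r_off, r_label = ref
    e_on, e_off, e_label = est
    if r_label != e_label or abs(r_on - e_on) > onset_tol:
        return False
    if not use_offset:
        return True
    # Assumed offset tolerance: scales with the reference event duration.
    offset_tol = max(onset_tol, offset_ratio * (r_off - r_on))
    return abs(r_off - e_off) <= offset_tol


def event_metrics(reference, estimated, use_offset):
    """Event-based precision, recall and F1 with greedy one-to-one matching."""
    unmatched = list(estimated)
    correct = 0
    for ref in reference:
        for est in unmatched:
            if is_match(ref, est, use_offset):
                unmatched.remove(est)  # each estimate may match only one reference
                correct += 1
                break
    precision = correct / len(estimated) if estimated else 0.0
    recall = correct / len(reference) if reference else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Toy example: the "phone" estimate starts on time but ends one second late.
reference = [(1.0, 2.0, "phone"), (3.0, 3.4, "keys")]
estimated = [(1.05, 3.0, "phone"), (3.02, 3.5, "keys")]

onset_only = event_metrics(reference, estimated, use_offset=False)
onset_offset = event_metrics(reference, estimated, use_offset=True)
```

Under these assumptions both events count as correct in onset-only scoring, while the late-ending "phone" event is rejected in onset-offset scoring. This is consistent with the lower scores in the onset-offset results.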

System characteristics

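| Rank | Code | Technical Report | Accuracy (Eval) | Input | Sampling rate | Features | Classifier |
|---|---|---|---|---|---|---|---|
| | DCASE2013 baseline | | | mono | 44.1 kHz | NMF | NMF |
| | DHV | Diment2013 | | mono | 44.1 kHz | MFCC | HMM |
| | GVV | Gemmeke2013 | | mono | 44.1 kHz | NMF | HMM |
| | VVK | Vuegen2013 | | mono | 44.1 kHz | MFCC | GMM |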



Technical reports

Sound Event Detection for Office Live and Office Synthetic AASP Challenge

Aleksandr Diment, Toni Heittola and Tuomas Virtanen
Tampere University of Technology, Tampere, Finland

Abstract

We present a sound event detection system based on hidden Markov models. The system is evaluated with development material provided in the AASP Challenge on Detection and Classification of Acoustic Scenes and Events. Two approaches using the same basic detection scheme are presented. The first, developed for acoustic scenes with non-overlapping sound events, is evaluated with the Office Live development dataset. The second, developed for acoustic scenes with some degree of overlapping sound events, is evaluated with the Office Synthetic development dataset.

System characteristics
Input: mono
Sampling rate: 44.1 kHz
Features: MFCC
Classifier: HMM

An Exemplar-Based NMF Approach for Audio Event Detection

Jort F Gemmeke (1), Lode Vuegen (1,2,3), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium

Abstract

We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation (NMF). Building on recent work in noise-robust automatic speech recognition, we model events as a linear combination of dictionary atoms, and mixtures as a linear combination of overlapping events. The exemplar-based dictionary is created by extracting all available training data, artificially augmented by linear time warping at multiple rates. The method is evaluated on the Office Live and Office Synthetic development datasets released by the AASP Challenge on Detection and Classification of Acoustic Scenes and Events.
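
The modelling idea in this abstract, a mixture spectrum approximated as a non-negative linear combination of dictionary atoms, can be sketched with a fixed dictionary and multiplicative updates for the activations. This is a generic Euclidean-distance NMF update written in plain Python, not the authors' implementation. The toy four-bin atoms, the iteration count and the function name `nmf_activations` are assumptions for illustration.

```python
def nmf_activations(dictionary, spectrum, n_iter=200, eps=1e-12):
    """Estimate non-negative activations h so that sum_k h[k] * atom_k ~ spectrum.

    dictionary: list of atoms, each a list of non-negative magnitudes.
    spectrum:   list of non-negative magnitudes for one frame.
    """
    n_atoms = len(dictionary)
    n_bins = len(spectrum)
    h = [1.0] * n_atoms
    # W^T v is constant across iterations, so compute it once.
    numer = [sum(atom[f] * spectrum[f] for f in range(n_bins)) for atom in dictionary]
    for _ in range(n_iter):
        # Current reconstruction W h, computed from the previous activations.
        approx = [sum(h[k] * dictionary[k][f] for k in range(n_atoms))
                  for f in range(n_bins)]
        for k, atom in enumerate(dictionary):
            denom = sum(atom[f] * approx[f] for f in range(n_bins)) + eps
            h[k] *= numer[k] / denom  # multiplicative update keeps h non-negative
    return h


# Toy dictionary with one hypothetical atom per event class.
atoms = {"phone": [1.0, 0.2, 0.5, 0.0], "keys": [0.2, 1.0, 0.0, 0.5]}
# Synthetic overlapping mixture: 2.0 x phone + 0.5 x keys.
mixture = [2.0 * p + 0.5 * k for p, k in zip(atoms["phone"], atoms["keys"])]
activations = nmf_activations(list(atoms.values()), mixture)
```

For this toy mixture the activations converge towards 2.0 for the first atom and 0.5 for the second. A detector could then threshold the per-class activations frame by frame, although the thresholding step is not shown here.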

System characteristics
Input: mono
Sampling rate: 44.1 kHz
Features: NMF
Classifier: HMM

An MFCC-GMM Approach for Event Detection and Classification

Lode Vuegen (1,2,3), Bert Van Den Broeck (2,3,4), Peter Karsmakers (2,3,4), Jort F Gemmeke (1), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium

Abstract

This abstract explores Gaussian Mixture Models (GMM) estimated from Mel Frequency Cepstral Coefficients (MFCCs) for acoustic event detection and classification. To limit the impact of silence, a shared background model is used. An average F-score of 48% is obtained for the Office Live subtask. However, the analysis reveals that the proposed method has difficulty coping with the large intra-class variations (e.g. time durations, dynamic range, characteristic sounds) in the provided dataset.

System characteristics
Input: mono
Sampling rate: 44.1 kHz
Features: MFCC
Classifier: GMM