Official results are shown on the original DCASE2013 Challenge website. The purpose of this page is to present the results in a uniform way, consistent with more recent editions of the DCASE Challenge.
Task description
The event detection challenge addresses the problem of identifying individual sound events that are prominent in an acoustic scene. Two distinct experiments will take place: one for simple acoustic scenes without overlapping sounds, and the other using complex scenes in a polyphonic scenario.
A more detailed task description can be found on the task description page.
Systems
Frame-based results
| Code | Author | Affiliation | Technical Report | AEER | F1 (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 2.8040 | 12.8 | 14.4 | 14.9 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 7.9800 | 18.7 | 13.6 | 45.3 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.3180 | 21.3 | 38.5 | 15.2 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.8880 | 13.5 | 22.6 | 12.3 |
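The frame-based metrics compare the system output and the reference annotation in short time frames: precision, recall and F1 count frames in which an event class is correctly marked active, while the AEER accumulates substitutions, deletions and insertions. The sketch below shows one way to compute these scores from binary activity matrices; the exact AEER formulation (substitutions as overlapping miss/false-alarm pairs, normalised by the number of active reference frames) is our reading of the DCASE2013 metric definition and should be checked against the official task description.

```python
import numpy as np

def frame_based_metrics(ref, est):
    """Frame-based precision, recall, F1 and AEER (sketch).

    ref, est: binary activity matrices of shape (n_frames, n_classes),
    1 where an event class is annotated / detected as active in a frame.
    """
    tp = np.logical_and(ref == 1, est == 1).sum()
    fp = np.logical_and(ref == 0, est == 1).sum()
    fn = np.logical_and(ref == 1, est == 0).sum()

    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0.0

    # AEER: per-frame substitutions, deletions and insertions, normalised by
    # the total number of active reference frames (assumed formulation).
    n_ref = ref.sum(axis=1)        # active reference events per frame
    n_est = est.sum(axis=1)        # active system events per frame
    n_correct = np.logical_and(ref == 1, est == 1).sum(axis=1)
    subs = np.minimum(n_ref, n_est) - n_correct
    dels = np.maximum(0, n_ref - n_est)
    ins = np.maximum(0, n_est - n_ref)
    aeer = (subs + dels + ins).sum() / max(n_ref.sum(), 1)

    return precision, recall, f1, aeer
```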
Event-based results (onset-only)
| Code | Author | Affiliation | Technical Report | AEER (event-based) | F1 (event-based, %) | Precision (event-based, %) | Recall (event-based, %) | AEER (class-wise) | F1 (class-wise, %) | Precision (class-wise, %) | Recall (class-wise, %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 6.2630 | 7.8 | 4.9 | 21.6 | 5.3320 | 9.5 | 7.1 | 21.8 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 3.5157 | 16.1 | 12.4 | 26.0 | 2.9030 | 18.7 | 17.2 | 26.1 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.6810 | 17.0 | 22.3 | 14.2 | 1.2160 | 14.2 | 17.1 | 14.3 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.6830 | 13.8 | 20.9 | 11.3 | 1.3770 | 10.5 | 12.3 | 11.5 |
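For the onset-only event-based results, a detected event counts as correct when its label matches an as-yet-unmatched reference event and its onset lies within a fixed tolerance of the reference onset. The sketch below assumes a 100 ms onset tolerance and a simple greedy matching; both are assumptions, not confirmed by this page.

```python
def onset_only_scores(ref_events, est_events, tolerance=0.1):
    """Event-based precision, recall and F1 with an onset-only matching rule.

    ref_events, est_events: lists of (onset_s, offset_s, label) tuples.
    tolerance: onset tolerance in seconds (100 ms assumed here).
    """
    matched = set()
    tp = 0
    for onset, _offset, label in est_events:
        for i, (r_onset, _r_offset, r_label) in enumerate(ref_events):
            if i in matched:
                continue
            if r_label == label and abs(onset - r_onset) <= tolerance:
                matched.add(i)
                tp += 1
                break
    fp = len(est_events) - tp
    fn = len(ref_events) - tp
    precision = tp / (tp + fp) if est_events else 0.0
    recall = tp / (tp + fn) if ref_events else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```

The class-wise variant applies the same matching separately for each event class and averages the resulting per-class scores.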
Event-based results (onset-offset)
| Code | Author | Affiliation | Technical Report | AEER (event-based) | F1 (event-based, %) | Precision (event-based, %) | Recall (event-based, %) | AEER (class-wise) | F1 (class-wise, %) | Precision (class-wise, %) | Recall (class-wise, %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DCASE2013 baseline | Dimitrios Giannoulis | Centre for Digital Music, Queen Mary University of London, London, UK | | 6.8710 | 0.6 | 0.4 | 1.4 | 5.9450 | 0.3 | 0.2 | 1.3 |
| DHV | Aleksandr Diment | Tampere University of Technology, Tampere, Finland | Diment2013 | 3.7643 | 11.1 | 8.6 | 17.7 | 3.1550 | 12.2 | 11.1 | 17.7 |
| GVV | Jort F Gemmeke | ESAT-PSI, KU Leuven, Heverlee, Belgium | Gemmeke2013 | 1.7660 | 13.6 | 18.0 | 11.4 | 1.2980 | 11.4 | 13.6 | 11.5 |
| VVK | Lode Vuegen | ESAT-PSI, KU Leuven, Heverlee, Belgium; Future Health Department, iMinds, Heverlee, Belgium; MOBILAB, TM Kempen, Geel, Belgium | Vuegen2013 | 1.7980 | 9.2 | 14.0 | 7.6 | 1.4910 | 7.5 | 9.1 | 7.7 |
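The onset-offset results additionally require the detected offset to match the reference. A common rule accepts the offset if it lies within the onset tolerance or within half of the reference event length; whether DCASE2013 used exactly these values is an assumption. The helper below could replace the matching condition in the onset-only sketch above.

```python
def onset_offset_match(est_event, ref_event, tolerance=0.1, length_ratio=0.5):
    """True if est_event matches ref_event under an onset-offset rule (sketch).

    Events are (onset_s, offset_s, label) tuples. The onset must lie within
    `tolerance` of the reference onset, and the offset within
    max(tolerance, length_ratio * reference length) of the reference offset.
    The tolerance values are assumptions, not the official DCASE2013 settings.
    """
    e_on, e_off, e_label = est_event
    r_on, r_off, r_label = ref_event
    if e_label != r_label:
        return False
    offset_tol = max(tolerance, length_ratio * (r_off - r_on))
    return abs(e_on - r_on) <= tolerance and abs(e_off - r_off) <= offset_tol
```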
System characteristics
| Code | Technical Report | Input | Sampling rate | Features | Classifier |
|---|---|---|---|---|---|
| DCASE2013 baseline | | mono | 44.1 kHz | NMF | NMF |
| DHV | Diment2013 | mono | 44.1 kHz | MFCC | HMM |
| GVV | Gemmeke2013 | mono | 44.1 kHz | NMF | HMM |
| VVK | Vuegen2013 | mono | 44.1 kHz | MFCC | GMM |
Technical reports
Sound Event Detection for Office Live and Office Synthetic AASP Challenge
Aleksandr Diment, Toni Heittola and Tuomas Virtanen
Tampere University of Technology, Tampere, Finland
DHV
Abstract
We present a sound event detection system based on hidden Markov models. The system is evaluated with the development material provided in the AASP Challenge on Detection and Classification of Acoustic Scenes and Events. Two approaches using the same basic detection scheme are presented. The first, developed for acoustic scenes with non-overlapping sound events, is evaluated on the Office Live development dataset. The second, developed for acoustic scenes with some degree of overlapping sound events, is evaluated on the Office Synthetic development dataset.
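The abstract does not spell out the HMM topology or the feature configuration, so the following is only a minimal sketch of an MFCC + HMM pipeline of the kind described, using librosa and hmmlearn; the number of MFCCs, the number of HMM states and the segment-wise scoring are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import librosa
from hmmlearn import hmm

def extract_mfcc(path, sr=44100, n_mfcc=20):
    """Frame-wise MFCCs of a mono recording, shape (n_frames, n_mfcc)."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_event_models(training_clips, n_states=3):
    """Train one Gaussian HMM per event class on its isolated training clips.

    training_clips: dict mapping event label -> list of audio file paths.
    """
    models = {}
    for label, paths in training_clips.items():
        feats = [extract_mfcc(p) for p in paths]
        lengths = [f.shape[0] for f in feats]
        model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag")
        model.fit(np.vstack(feats), lengths)
        models[label] = model
    return models

def classify_segment(models, path):
    """Score a segment against every class model and return the best label."""
    feats = extract_mfcc(path)
    return max(models, key=lambda label: models[label].score(feats))
```

A full detection system would additionally segment the continuous recording (e.g. with Viterbi decoding over a network of class models), which is omitted here.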
System characteristics
| Input | Sampling rate | Features | Classifier |
|---|---|---|---|
| mono | 44.1 kHz | MFCC | HMM |
An Exemplar-Based NMF Approach for Audio Event Detection
Jort F Gemmeke (1), Lode Vuegen (1,2,3), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium
GVV
Abstract
We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation (NMF). Building on recent work in noise-robust automatic speech recognition, we model events as a linear combination of dictionary atoms, and mixtures as a linear combination of overlapping events. The exemplar-based dictionary is created from all available training data, artificially augmented by linear time warping at multiple rates. The method is evaluated on the Office Live and Office Synthetic development datasets released by the AASP Challenge on Detection and Classification of Acoustic Scenes and Events.
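As a rough illustration of the idea, the sketch below keeps an exemplar dictionary fixed and estimates non-negative activations with multiplicative updates, then sums the activations of each class's atoms into a per-frame activity curve. The dictionary construction, the divergence choice and any thresholding or post-processing are assumptions; the abstract does not specify them.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-10):
    """Estimate non-negative activations H such that V ≈ W @ H, keeping the
    exemplar dictionary W fixed (multiplicative updates for the KL divergence).

    V: non-negative spectrogram, shape (n_features, n_frames).
    W: exemplar dictionary, shape (n_features, n_atoms).
    """
    H = np.random.rand(W.shape[1], V.shape[1])
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
    return H

def class_activity(H, atom_labels, class_names):
    """Sum the activations of all atoms belonging to each event class,
    giving a per-class, per-frame activity curve that could then be
    thresholded and smoothed into detected events."""
    return {c: H[[i for i, l in enumerate(atom_labels) if l == c]].sum(axis=0)
            for c in class_names}
```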
System characteristics
| Input | Sampling rate | Features | Classifier |
|---|---|---|---|
| mono | 44.1 kHz | NMF | HMM |
An MFCC-GMM Approach for Event Detection and Classification
Lode Vuegen (1,2,3), Bert Van Den Broeck (2,3,4), Peter Karsmakers (2,3,4), Jort F Gemmeke (1), Bart Vanrumste (2,3,4) and Hugo Van hamme (1)
(1) ESAT-PSI, KU Leuven, Heverlee, Belgium; (2) Future Health Department, iMinds, Heverlee, Belgium; (3) MOBILAB, TM Kempen, Geel, Belgium; (4) ESAT-SISTA, KU Leuven, Heverlee, Belgium
VVK
Abstract
This abstract explores Gaussian Mixture Models (GMMs) estimated from Mel-frequency cepstral coefficients (MFCCs) for acoustic event detection and classification. To limit the impact of silence, a shared background model is used. An average F-score of 48% is obtained for the Office Live subtask. However, the analysis reveals that the proposed method has difficulty coping with the large intra-class variations (e.g. time durations, dynamic range, characteristic sounds) in the provided dataset.
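A minimal sketch of the described setup, assuming one GMM per event class plus a shared background GMM and a frame-wise likelihood comparison (the component counts and the decision rule are illustrative assumptions), could look like this with scikit-learn:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmms(class_frames, background_frames, n_components=8):
    """Train one GMM per event class plus a shared background model.

    class_frames: dict mapping event label -> MFCC frames, shape (n, d).
    background_frames: MFCC frames of silence / background, shape (m, d).
    """
    models = {label: GaussianMixture(n_components=n_components).fit(X)
              for label, X in class_frames.items()}
    background = GaussianMixture(n_components=n_components).fit(background_frames)
    return models, background

def framewise_decisions(models, background, mfcc_frames):
    """Assign each frame to the best-scoring event class, but only when it
    beats the shared background model; otherwise the frame is left unlabelled
    (one simple way to realise the background model mentioned in the abstract)."""
    bg = background.score_samples(mfcc_frames)
    scores = {label: m.score_samples(mfcc_frames) for label, m in models.items()}
    decisions = []
    for t in range(mfcc_frames.shape[0]):
        best = max(scores, key=lambda label: scores[label][t])
        decisions.append(best if scores[best][t] > bg[t] else None)
    return decisions
```

Consecutive frames with the same label would then be merged into events with an onset and an offset.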
System characteristics
| Input | Sampling rate | Features | Classifier |
|---|---|---|---|
| mono | 44.1 kHz | MFCC | GMM |