Introduction
A challenge submission consists of a submission package (one zip package) containing system outputs and system meta information, and a technical report (PDF file). The technical report can also be submitted as a scientific paper to the DCASE2017 Workshop.
- Please prepare the submission package (zip file) as instructed here. The submission package contains system outputs for all tasks (maximum 4 per task) and system meta information.
- Please use the provided paper template for your technical report.
- Follow the submission process to submit your system to the DCASE Challenge.
Submission package
Participants are instructed to pack their system output(s) and system meta information into one zip-package. Example package:
Please prepare your submission zip file following the provided example. Use the same file structure, and fill in the meta information following the structure of the *.meta.yaml files. The zip file should contain system outputs for all tasks, maximum 4 submissions per task, and separate meta information for each submission.
More detailed instructions can be found in the following subsections.
Submission label
The submission label is used to index all your submissions (systems per task). To avoid overlapping labels among submitted systems, form your label as follows:
[Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
As an example, for the baseline systems this translates to the following labels:
Heittola_TUT_task1_1
Heittola_TUT_task2_1
Heittola_TUT_task3_1
Elizalde_CMU_task4_1
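The label format above can be checked mechanically. The following is a minimal sketch (a hypothetical helper, not part of the official tooling) that validates a label against the required pattern; it assumes names and institute abbreviations contain only letters, which may be too strict for some author names.

```python
import re

# Pattern: [Last name]_[Institute abbreviation]_task[1-4]_[index 1-4]
# (simplifying assumption: letters only in the name and abbreviation parts)
LABEL_PATTERN = re.compile(r"^[A-Za-z]+_[A-Za-z]+_task[1-4]_[1-4]$")

def is_valid_label(label: str) -> bool:
    """Return True if the submission label matches the required format."""
    return LABEL_PATTERN.match(label) is not None

print(is_valid_label("Heittola_TUT_task1_1"))  # True
print(is_valid_label("Heittola_TUT_task1_5"))  # False: submission index out of range 1-4
```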
System meta information
To allow meta-analysis of the submitted systems, participants should provide basic meta information in a structured and correctly formatted YAML file.
See the example meta files below for each baseline system. These examples are also available in the example submission package. The meta file structure is mostly the same for all tasks; only the metrics collected in the results->development_dataset section differ per challenge task.
Example meta information file for the Task 1 baseline system, task1/Heittola_TUT_task1_1/Heittola_TUT_task1_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # The label is used to index submissions. To avoid overlapping codes among
  # submissions, form your label as follows:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Heittola_TUT_task1_1

  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2017 baseline system

  # Submission name abbreviated
  # This abbreviated name will be used in the results tables when space is tight; maximum 10 characters
  abbreviation: Baseline

  # Submission authors in order; mark one of the authors as corresponding author
  authors:
    - lastname: Heittola
      firstname: Toni
      email: toni.heittola@tut.fi        # Contact email address
      corresponding: true                # Mark true for one of the authors

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

    - lastname: Mesaros
      firstname: Annamaria
      email: annamaria.mesaros@tut.fi    # Contact email address

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

# System information
system:
  # System description; the metadata provided here will be used for
  # meta-analysis of the submitted systems. Use general-level tags and,
  # if possible, the tags provided in the comments.
  description:
    # Audio input
    input_channels: mono                 # e.g. one or combination of [mono, binaural, left, right, mixed, ...]
    input_sampling_rate: 44.1kHz

    # Acoustic representation
    acoustic_features: log-mel energies  # e.g. one or combination of [MFCC, log-mel energies, spectrogram, CQT, ...]

    # Data augmentation methods
    data_augmentation: null              # [time stretching, block mixing, pitch shifting, ...]

    # Machine learning
    machine_learning_method: MLP         # e.g. one or combination of [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]

    # Decision making methods
    decision_making: majority vote       # [majority vote, ...]

  # URL to the source code of the system [optional]
  source_code: https://github.com/TUT-ARG/DCASE2017-baseline-system

# System results
results:
  # System evaluation results for the provided cross-validation setup
  development_dataset:
    # Overall accuracy (mean of class-wise accuracies)
    overall:
      accuracy: 74.8

    # Class-wise accuracies
    class_wise:
      beach:
        accuracy: 75.3
      bus:
        accuracy: 71.8
      cafe/restaurant:
        accuracy: 57.7
      car:
        accuracy: 97.1
      city_center:
        accuracy: 90.7
      forest_path:
        accuracy: 79.5
      grocery_store:
        accuracy: 58.7
      home:
        accuracy: 68.6
      library:
        accuracy: 57.1
      metro_station:
        accuracy: 91.7
      office:
        accuracy: 99.7
      park:
        accuracy: 70.2
      residential_area:
        accuracy: 64.1
      train:
        accuracy: 58.0
      tram:
        accuracy: 74.8
Example meta information file for the Task 2 baseline system, task2/Heittola_TUT_task2_1/Heittola_TUT_task2_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # The label is used to index submissions. To avoid overlapping codes among
  # submissions, form your label as follows:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Heittola_TUT_task2_1

  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2017 baseline system

  # Submission name abbreviated
  # This abbreviated name will be used in the results tables when space is tight; maximum 10 characters
  abbreviation: Baseline

  # Submission authors in order; mark one of the authors as corresponding author
  authors:
    - lastname: Heittola
      firstname: Toni
      email: toni.heittola@tut.fi        # Contact email address
      corresponding: true                # Mark true for one of the authors

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

    - lastname: Mesaros
      firstname: Annamaria
      email: annamaria.mesaros@tut.fi    # Contact email address

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

# System information
system:
  # System description; the metadata provided here will be used for
  # meta-analysis of the submitted systems. Use general-level tags and,
  # if possible, the tags provided in the comments.
  description:
    # Audio input
    input_channels: mono                 # e.g. one or combination of [mono, binaural, left, right, mixed, ...]
    input_sampling_rate: 44.1kHz

    # Acoustic representation
    acoustic_features: log-mel energies  # e.g. one or combination of [MFCC, log-mel energies, spectrogram, CQT, ...]

    # Data augmentation methods
    data_augmentation: null              # [time stretching, block mixing, pitch shifting, ...]

    # Machine learning
    machine_learning_method: MLP         # e.g. one or combination of [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]

    # Decision making methods
    decision_making: median filtering    # [sliding median filter, ...]

  # URL to the source code of the system [optional]
  source_code: https://github.com/TUT-ARG/DCASE2017-baseline-system

# System results
results:
  # System evaluation results for the provided cross-validation setup
  development_dataset:
    event_based:
      # Overall metrics
      overall:
        er: 0.56
        f1: 71.7

      # Class-wise metrics
      class_wise:
        babycry:
          er: 0.77
          f1: 69.2
        glassbreak:
          er: 0.22
          f1: 88.5
        gunshot:
          er: 0.56
          f1: 71.7
Example meta information file for the Task 3 baseline system, task3/Heittola_TUT_task3_1/Heittola_TUT_task3_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # The label is used to index submissions. To avoid overlapping codes among
  # submissions, form your label as follows:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Heittola_TUT_task3_1

  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2017 baseline system

  # Submission name abbreviated
  # This abbreviated name will be used in the results tables when space is tight; maximum 10 characters
  abbreviation: Baseline

  # Submission authors in order; mark one of the authors as corresponding author
  authors:
    - lastname: Heittola
      firstname: Toni
      email: toni.heittola@tut.fi        # Contact email address
      corresponding: true                # Mark true for one of the authors

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

    - lastname: Mesaros
      firstname: Annamaria
      email: annamaria.mesaros@tut.fi    # Contact email address

      # Affiliation information for the author
      affiliation:
        abbreviation: TUT
        institute: Tampere University of Technology
        department: Laboratory of Signal Processing
        location: Tampere, Finland

# System information
system:
  # System description; the metadata provided here will be used for
  # meta-analysis of the submitted systems. Use general-level tags and,
  # if possible, the tags provided in the comments.
  description:
    # Audio input
    input_channels: mono                 # e.g. one or combination of [mono, binaural, left, right, mixed, ...]
    input_sampling_rate: 44.1kHz

    # Acoustic representation
    acoustic_features: log-mel energies  # e.g. one or combination of [MFCC, log-mel energies, spectrogram, CQT, ...]

    # Data augmentation methods
    data_augmentation: null              # [time stretching, block mixing, pitch shifting, ...]

    # Machine learning
    machine_learning_method: MLP         # e.g. one or combination of [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]

    # Decision making methods
    decision_making: median filtering    # [sliding median filter, ...]

  # URL to the source code of the system [optional]
  source_code: https://github.com/TUT-ARG/DCASE2017-baseline-system

# System results
results:
  # System evaluation results for the provided cross-validation setup
  development_dataset:
    segment_based:
      # Overall metrics
      overall:
        er: 0.69
        f1: 56.7

      # Class-wise metrics
      class_wise:
        brakes_squeking:
          er: 0.98
          f1: 4.1
        car:
          er: 0.57
          f1: 74.1
        children:
          er: 1.35
          f1: 0.0
        large_vehicle:
          er: 0.90
          f1: 50.8
        people_speaking:
          er: 1.25
          f1: 18.5
        people_walking:
          er: 0.84
          f1: 55.6
Example meta information file for the Task 4 baseline system, task4/Elizalde_CMU_task4_1/Elizalde_CMU_task4_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # The label is used to index submissions. To avoid overlapping codes among
  # submissions, form your label as follows:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Elizalde_CMU_task4_1

  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2017 baseline system

  # Submission name abbreviated
  # This abbreviated name will be used in the results tables when space is tight; maximum 10 characters
  abbreviation: Baseline

  # Submission authors in order; mark one of the authors as corresponding author
  authors:
    - lastname: Elizalde
      firstname: Benjamin
      email: bmartin1@andrew.cmu.edu     # Contact email address
      corresponding: true                # Mark true for one of the authors

      # Affiliation information for the author
      affiliation:
        abbreviation: CMU
        institute: Carnegie Mellon University
        location: Pittsburgh, USA

    - lastname: Badlani
      firstname: Rohan
      email: rohan.badlani@gmail.com     # Contact email address

      # Affiliation information for the author
      affiliation:
        institute: Birla Institute of Technology & Science
        location: Rajasthan, India

    - lastname: Shah
      firstname: Ankit
      email: ankit.tronix@gmail.com      # Contact email address

      # Affiliation information for the author
      affiliation:
        abbreviation: CMU
        institute: Carnegie Mellon University
        location: Pittsburgh, USA

# System information
system:
  # System description; the metadata provided here will be used for
  # meta-analysis of the submitted systems. Use general-level tags and,
  # if possible, the tags provided in the comments.
  description:
    # Audio input
    input_channels: mono                 # e.g. one or combination of [mono, binaural, left, right, mixed, ...]
    input_sampling_rate: 44.1kHz

    # Acoustic representation
    acoustic_features: log-mel energies  # e.g. one or combination of [MFCC, log-mel energies, spectrogram, CQT, ...]

    # Data augmentation methods
    data_augmentation: null              # [time stretching, block mixing, pitch shifting, ...]

    # Machine learning
    machine_learning_method: MLP         # e.g. one or combination of [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]

    # Decision making methods
    decision_making: median filtering    # [sliding median filter, ...]

  # URL to the source code of the system [optional]
  source_code: https://github.com/ankitshah009/Task-4-Large-scale-weakly-supervised-sound-event-detection-for-smart-cars

# System results
results:
  # System evaluation results for the provided cross-validation setup
  development_dataset:
    subtask_a:
      overall:
        f1: 19.8
        precision: 16.2
        recall: 25.6
    subtask_b:
      segment_based:
        # Overall metrics
        overall:
          er: 1.00
          f1: 11.4
System output
- Participants must submit results for the provided evaluation dataset (see the download page).
- Tasks are independent; you can participate in a single task or in multiple tasks.
- Multiple submissions for the same task are allowed (maximum 4 per task). Use a running index in the submission label, and give more descriptive names to the submitted systems in the system meta information files. Please mark clearly the connection between the submitted systems and the system descriptions in the technical report, for example by referring to the systems by their submission label or system name (as given in the system meta information file).
- Submitted system outputs will be published later on the DCASE2017 website to allow future evaluations.
Examples for formatting the output for the different tasks are given below.
Task 1 - Acoustic scene classification
A single text file (in CSV format) containing the classification result for each audio file in the evaluation set. The result items can be in any order. Format:
[filename (string)][tab][scene label (string)]
Example task1_results.txt file
audio/178.wav residential_street
audio/62.wav office
audio/261.wav home
...
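A file in the format above can be produced with a few lines of Python. The following is a minimal sketch (not official tooling); the `predictions` dictionary stands in for whatever your system outputs.

```python
import csv

# Hypothetical system output: evaluation filename -> predicted scene label
predictions = {
    "audio/178.wav": "residential_street",
    "audio/62.wav": "office",
    "audio/261.wav": "home",
}

# Write tab-separated rows of [filename][tab][scene label]
with open("task1_results.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    for filename, scene_label in predictions.items():
        writer.writerow([filename, scene_label])
```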
Task 2 - Detection of rare sound events
A single text file (in CSV format) containing the detected sound events for each audio file. Events can be in any order. Format:
[filename (string)][tab][event onset time in seconds (float)][tab][event offset time in seconds (float)][tab][event label (string)]
Example task2_results.txt file
audio/mixture_evaltest_babycry_001_0e22e5d08617707ea812e0268d628031.wav 1.44 3.8 babycry
audio/mixture_evaltest_babycry_000_35c7bc20a21ec8fbb7097c6fb71487b5.wav
audio/mixture_evaltest_glassbreak_000_c711628a46aab5b1032e19b003bf78d7.wav 2.44 5.8 glassbreak
audio/mixture_evaltest_glassbreak_001_1b1c3c26ee642fafed65ed873910adad.wav
audio/mixture_evaltest_gunshot_001_9a2eb11a7c6edea6c75ff30dc3f5de12.wav 3.44 6.8 gunshot
audio/mixture_evaltest_gunshot_002_54d69de42bbd0acc7b4a89b0207eacf1.wav
...
If no event is detected for the particular audio signal, the system should still output a row containing only the file name, to indicate that the file was processed. This is used to verify that participants processed all evaluation files.
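A writer that respects the filename-only rule above can be sketched as follows (an assumed example, not official tooling; the filenames and detections are hypothetical).

```python
import csv

# Hypothetical output: filename -> list of (onset, offset, label) detections
detections = {
    "audio/mixture_a.wav": [(1.44, 3.8, "babycry")],
    "audio/mixture_b.wav": [],  # no event detected in this file
}

with open("task2_results.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    for filename, events in detections.items():
        if not events:
            # Filename-only row marks the file as processed
            writer.writerow([filename])
        for onset, offset, label in events:
            writer.writerow([filename, onset, offset, label])
```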
Task 3 - Sound event detection in real life audio
A single text file (in CSV format) containing the detected sound events for each audio file. Events can be in any order. Format:
[filename (string)][tab][event onset time in seconds (float)][tab][event offset time in seconds (float)][tab][event label (string)]
Example task3_results.txt file
audio/a029.wav 1.0000 2.8000 car
audio/a029.wav 0.0000 0.2000 people walking
audio/a033.wav
audio/a034.wav 4.0000 4.1000 brakes squeaking
...
If no event is detected for the particular audio signal, the system should still output a row containing only the file name, to indicate that the file was processed. This is used to verify that participants processed all evaluation files.
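Reading such a file back is equally simple. The sketch below (an assumed helper, not part of the evaluation scripts) parses a results file into a dictionary, treating filename-only rows as "processed, no events".

```python
def read_results(path: str) -> dict:
    """Parse a tab-separated results file into filename -> [(onset, offset, label)]."""
    results = {}
    with open(path) as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            filename = parts[0]
            # Register the file even when the row carries no event
            results.setdefault(filename, [])
            if len(parts) == 4:
                onset, offset, label = float(parts[1]), float(parts[2]), parts[3]
                results[filename].append((onset, offset, label))
    return results
```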
Task 4 - Large-scale weakly supervised sound event detection for smart cars
For Subtasks A and B, provide a text file (in CSV format) containing the detected sound events for each audio file. Events can be in any order. Both subtasks (with and without timestamps) will be evaluated using the same output format; the latter will simply ignore the timestamps. The system output file can be the same for both subtasks, or two different versions can be provided. We require one folder per system output, which includes one YAML file and one or two prediction files (one per subtask, or one for both). Note that you are not obligated to submit a prediction file for both subtasks if you want to participate in only one subtask.
Format:
[filename (string)][tab][event onset time in seconds (float)][tab][event offset time in seconds (float)][tab][event label (string)]
Note that "audio/" and/or "Y" can be inserted in front of the filename and will be properly parsed by the metric scripts, assuming the chosen convention is consistent throughout the file.
Example team_task4_results.txt file
Y--0w1YA1Hm4_30.000_40.000.wav 1.0000 2.8000 Car alarm
Y-fCSO8SVWZU_6.000_16.000.wav 0.0000 0.2000 Police car (siren)
Y0Hz4R_m0hmI_80.000_90.000.wav 7.3000 9.4000 Fire engine, fire truck (siren)
Y0Hz4R_m0hmI_80.000_90.000.wav
...
If no event is detected for the particular audio signal, the system should still output a row containing only the file name, to indicate that the file was processed. This is used to verify that participants processed all evaluation files.
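One way to handle the optional prefixes mentioned above is to normalize filenames before comparing them. This is a sketch of an assumed approach, not the actual logic of the metric scripts; it assumes the canonical filename carries neither the "audio/" directory nor the leading "Y".

```python
def normalize_filename(name: str) -> str:
    """Strip the optional 'audio/' and leading 'Y' prefixes (assumed convention)."""
    if name.startswith("audio/"):
        name = name[len("audio/"):]
    if name.startswith("Y"):
        name = name[len("Y"):]
    return name

print(normalize_filename("audio/Y-fCSO8SVWZU_6.000_16.000.wav"))
# -fCSO8SVWZU_6.000_16.000.wav
```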
Package structure
Make sure your zip-package has the following structure (example zip-file for baseline systems):
Zip-package root
│
└───task1 Task1 submissions
│ │
│ └───Heittola_TUT_task1_1 System 1 submission files
│ │ │ Heittola_TUT_task1_1.meta.yaml System 1 meta information
│ │ │ Heittola_TUT_task1_1.output.txt System 1 output
│ │
│ └───Heittola_TUT_task1_2 System 2 submission files
│ │ │ Heittola_TUT_task1_2.meta.yaml System 2 meta information
│ │ │ Heittola_TUT_task1_2.output.txt System 2 output
│ │
│ │...
│
└───task2 Task2 submissions
│ │
│ └───Heittola_TUT_task2_1 System 1 submission files
│ │ │ Heittola_TUT_task2_1.meta.yaml System 1 meta information
│ │ │ Heittola_TUT_task2_1.output.txt System 1 output
│ │
│ │...
│
│
└───task3 Task3 submissions
│ │
│ └───Heittola_TUT_task3_1 System 1 submission files
│ │ │ Heittola_TUT_task3_1.meta.yaml System 1 meta information
│ │ │ Heittola_TUT_task3_1.output.txt System 1 output
│ │
│ │...
│
│
└───task4 Task4 submissions
│
└───Elizalde_CMU_task4_1
│ │ Elizalde_CMU_task4_1.meta.yaml System 1 meta information
│ │ Elizalde_CMU_task4_1_A.output.txt System 1 output subtask A
│ │ Elizalde_CMU_task4_1_B.output.txt System 1 output subtask B
│ │ Elizalde_CMU_task4_1_AB.output.txt In case the System 1 output is used for both subtasks
│
│...
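Before uploading, it can be worth checking your zip package against this structure. The following is a minimal sketch (a hypothetical helper, not an official validator) that verifies every submission folder contains a matching .meta.yaml file; it assumes the folder layout shown above.

```python
import zipfile

def check_submission(zip_path: str) -> list:
    """Return a list of problems found in the submission package (empty if none)."""
    problems = []
    with zipfile.ZipFile(zip_path) as z:
        names = z.namelist()
        # Collect submission labels from task*/[label]/... paths
        labels = {parts[1] for parts in (n.split("/") for n in names)
                  if len(parts) >= 3 and parts[0].startswith("task")}
        for label in labels:
            task = label.split("_")[2]  # e.g. "task1" from Heittola_TUT_task1_1
            meta = f"{task}/{label}/{label}.meta.yaml"
            if meta not in names:
                problems.append(f"missing {meta}")
    return problems
```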
Technical report
All participants are expected to submit a technical report about the submitted system, to help the DCASE community better understand how the algorithm works.
Technical reports are not peer-reviewed. They will be published on the challenge website together with all other information about the submitted systems. The technical report need not follow the structure of a scientific publication closely (for example, there is no need for an extensive literature review). The report should, however, contain a sufficient description of the system.
Please report the system performance using the provided cross-validation setup or development set, according to the task. For participants taking part in multiple tasks, one technical report covering all tasks is sufficient.
Authors will have the opportunity to update their technical report to include the challenge evaluation results. The deadline for the camera-ready report is 20th October 2017.
When referring to the DCASE2017 Challenge use the following:
A. Mesaros, T. Heittola, A. Diment, B. Elizalde, A. Shah, E. Vincent, B. Raj, and T. Virtanen. DCASE 2017 challenge setup: tasks, datasets and baseline system. In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 85–92. November 2017.
Participants can also submit the same report as a scientific paper to the DCASE 2017 Workshop. In this case, the paper must follow the structure of a scientific publication and be prepared according to the provided Workshop paper instructions. Please report the system performance using the provided cross-validation setup or development set, according to the task. Scientific papers will be peer-reviewed.
Template
Reports use a 4+1 page format: papers are a maximum of 5 pages, including all text, figures, and references, with the 5th page containing only references. The template is the same for technical reports and DCASE2017 Workshop papers.
Submission process
1. Create a user account in the submission system. The submitting author is considered the corresponding author for the submission.
2. Select Your Submissions in the menu. Using this system you can submit a contribution to the DCASE 2017 Challenge and also to the DCASE 2017 Workshop.
3. Select Challenge Submission.
   - Fill in the names and affiliations of all authors.
   - Fill in the submission details (title and abstract).
   - Indicate the task you are submitting for by selecting the corresponding topic. If your zip file contains system outputs for multiple tasks, tick multiple topics accordingly.
   - Mark whether the main author is a student.
   - Check the copyright form box.
   - At the next step you will be able to upload the files.
4. Upload the files:
   - Technical report in PDF format
   - System outputs as a zip file
5. The system will send a confirmation email.
6. If you intend to submit your technical report as a scientific paper to the DCASE2017 Workshop, return to step 2 (Your Submissions) in the submission system and select Workshop Submission. Follow the steps to upload the PDF. The system will also send a confirmation email about the workshop submission.