The submission deadline is June 15th 2020 23:59 Anywhere on Earth (AoE)
Introduction
A challenge submission consists of one submission package (a single zip file) containing the system outputs, system meta information, and a technical report (PDF file).
The submission process in brief:
- Participants run their system on the evaluation dataset and produce the system output in the specified format. Participants are allowed to submit up to 4 different system outputs per task or subtask.
- Participants create a meta information file to accompany each system output, describing the system used to produce that particular output. The meta information file has a predefined format to help the automatic handling of the challenge submissions. Information provided in the meta file will later be used to produce the challenge results. Participants should fill in all meta information and make sure the meta information file follows the defined formatting.
- Participants describe their system in a technical report in sufficient detail. A template is provided for the technical report.
- Participants prepare the submission package (zip file). The submission package contains the system outputs (maximum 4 per task), the system meta information, and the technical report.
- Participants submit the submission package and the technical report to the DCASE2020 Challenge.
Please read carefully the requirements for the files included in the submission package!
Submission system
The submission system is now available:
- Create a user account and log in
- Go to the "All Conferences" tab in the system and type DCASE to filter the list
- Select "2020 Challenge on Detection and Classification of Acoustic Scenes and Events"
- Create a new submission
The challenge deadline is 15 June 2020 (AOE).
The technical report in the submission package must contain at least title, authors, and abstract. An updated camera-ready version of the technical report can be submitted separately until 17 June 2020 (AOE).
Note: the submission system does not send any confirmation email. You can check in your author console that your submission has been taken into account. A confirmation email will be sent to all participants once the submissions are closed.
By submitting to the challenge, participants agree for the system output to be evaluated and to be published together with the results and the technical report on the DCASE Challenge website under CC-BY license.
Submission package
Participants are instructed to pack their system output(s), system meta information, and technical report into one zip-package. Example package:
Please prepare your submission zip file following the provided example. Follow the same file structure and fill in the meta information using the same structure as in the *.meta.yaml files. The zip file should contain system outputs for all tasks/subtasks (maximum 4 submissions per task/subtask), separate meta information for each system, and technical report(s) covering all submitted systems.
If you submit similar systems for multiple tasks, you can describe everything in one technical report. If your approaches for different tasks are significantly different, prepare one technical report for each and include it in the corresponding task folder.
More detailed instructions for constructing the package can be found in the following sections. Technical report template is available here.
A script for checking the content of the submission package is provided for selected tasks. In that case, please validate your submission package accordingly.
For task 1, use the validator code from the repository
For task 4, use the validator code task4/validate_submissions.py delivered inside the example submission package
Submission label
The submission label is used to index all your submissions (systems per task). To avoid overlapping labels among the submitted systems, form your label as follows:
[Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number][subtask letter (optional)]_[index number of your submission (1-4)]
For example the baseline systems would have the following labels:
Heittola_TAU_task1a_1
Koizumi_NTT_task2_1
Politis_TAU_task3_1
Turpault_INR_task4_1
Cartwright_NYU_task5_1
Drossos_TAU_task6_1
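The labelling scheme above can be sketched in a few lines of Python. This is only an illustrative helper, not part of the official tooling: the function names and the regular expression (letters only in the last name, tasks 1 to 6, a single lowercase subtask letter) are assumptions made for this example.

```python
import re

# Assumed pattern for the scheme described above:
# [Last name]_[Institute abbreviation]_task[number][optional subtask letter]_[index 1-4]
LABEL_PATTERN = re.compile(
    r"^(?P<lastname>[A-Za-z]+)_(?P<institute>[A-Za-z0-9]+)_"
    r"task(?P<task>[1-6])(?P<subtask>[a-z]?)_(?P<index>[1-4])$"
)


def make_label(lastname, institute, task, index, subtask=""):
    """Assemble a submission label from its parts."""
    return f"{lastname}_{institute}_task{task}{subtask}_{index}"


def parse_label(label):
    """Return the label parts as a dict, or None if the label does not match."""
    match = LABEL_PATTERN.match(label)
    return match.groupdict() if match else None


print(make_label("Heittola", "TAU", 1, 1, subtask="a"))  # Heittola_TAU_task1a_1
print(parse_label("Koizumi_NTT_task2_1"))
print(parse_label("Koizumi_NTT_task2_5"))  # None: index outside 1-4
```

Task 4 labels that carry a scenario tag (for example Turpault_INR_task4_SS_SED_1) do not match this simplified pattern and would need their own rule.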
Package structure
Make sure your zip-package follows the provided file naming convention and directory structure:
Zip-package root
│
└───task1                                               Task 1 submissions
│   │   Heittola_TAU_task1.technical_report.pdf         Technical report covering all subtasks
│   │   Heittola_TAU_task1a.technical_report.pdf        (optional) Technical report for subtask A system only
│   │   Heittola_TAU_task1b.technical_report.pdf        (optional) Technical report for subtask B system only
│   │
│   └───Heittola_TAU_task1a_1                           Subtask A System 1 submission files
│   │       Heittola_TAU_task1a_1.meta.yaml             Subtask A System 1 meta information
│   │       Heittola_TAU_task1a_1.output.csv            Subtask A System 1 output
│   :
│   └───Heittola_TAU_task1a_4                           Subtask A System 4 submission files
│   │       Heittola_TAU_task1a_4.meta.yaml             Subtask A System 4 meta information
│   │       Heittola_TAU_task1a_4.output.csv            Subtask A System 4 output
│   │
│   └───Heittola_TAU_task1b_1                           Subtask B System 1 submission files
│   │       Heittola_TAU_task1b_1.meta.yaml             Subtask B System 1 meta information
│   │       Heittola_TAU_task1b_1.output.csv            Subtask B System 1 output
│   :
│   └───Heittola_TAU_task1b_4                           Subtask B System 4 submission files
│           Heittola_TAU_task1b_4.meta.yaml             Subtask B System 4 meta information
│           Heittola_TAU_task1b_4.output.csv            Subtask B System 4 output
│
└───task2                                               Task 2 submissions
│   │   Koizumi_NTT_task2.technical_report.pdf          Technical report
│   │
│   └───Koizumi_NTT_task2_1                             System 1 submission files
│   │       Koizumi_NTT_task2_1.meta.yaml               System 1 meta information
│   │       anomaly_score_ToyCar_id_05.csv              System 1 output for each Machine ID in the evaluation dataset
│   │       anomaly_score_ToyCar_id_06.csv
│   │       anomaly_score_ToyCar_id_07.csv
│   │       anomaly_score_ToyConveyor_id_04.csv
│   :       :
│   │       anomaly_score_valve_id_05.csv
│   │
│   └───Koizumi_NTT_task2_4                             System 4 submission files
│           Koizumi_NTT_task2_4.meta.yaml               System 4 meta information
│           anomaly_score_ToyCar_id_05.csv              System 4 output for each Machine ID in the evaluation dataset
│           anomaly_score_ToyCar_id_06.csv
│           anomaly_score_ToyCar_id_07.csv
│           anomaly_score_ToyConveyor_id_04.csv
│           :
│           anomaly_score_valve_id_05.csv
│
└───task3                                               Task 3 submissions
│   │   Politis_TAU_task3.technical_report.pdf          Technical report
│   │
│   └───Politis_TAU_task3_1                             System 1 submission files
│   │       Politis_TAU_task3_1.meta.yaml               System 1 meta information
│   │       Politis_TAU_task3_1                         System 1 output files (200 files in total)
│   :
│   │
│   └───Politis_TAU_task3_4                             System 4 submission files
│           Politis_TAU_task3_4.meta.yaml               System 4 meta information
│           Politis_TAU_task3_4                         System 4 output files (200 files in total)
│
└───task4                                               Task 4 submissions (not all the 3 scenarios are needed)
│   │   Turpault_task4_SED.technical_report.pdf         Technical report for a SED submission
│   │   Turpault_task4_SS_SED.technical_report.pdf      Technical report for a SS+SED submission
│   │   Wisdom_task4_SS.technical_report.pdf            Technical report for a SS submission
│   │   validate_submissions.py                         Submission validation code
│   │   readme.md                                       Instructions how to use the submission validation code
│   │
│   └───Turpault_INR_SED_task4_1                        SED System 1 submission files
│   │       Turpault_INR_task4_SED_1.meta.yaml          SED System 1 meta information
│   │       Turpault_INR_task4_SED_1.output.csv         SED System 1 output
│   :
│   │
│   └───Turpault_INR_SED_task4_4                        SED System 4 submission files
│   │       Turpault_INR_task4_SED_4.meta.yaml          SED System 4 meta information
│   │       Turpault_INR_task4_SED_4.output.csv         SED System 4 output
│   │
│   └───Turpault_INR_SS_SED_task4_1                     SS+SED System 1 submission files
│   │       Turpault_INR_task4_SS_SED_1.meta.yaml       SS+SED System 1 meta information
│   │       Turpault_INR_task4_SS_SED_1.output.csv      SS+SED System 1 output
│   :
│   │
│   └───Turpault_INR_SS_SED_task4_4                     SS+SED System 4 submission files
│   │       Turpault_INR_task4_SS_SED_4.meta.yaml       SS+SED System 4 meta information
│   │       Turpault_INR_task4_SS_SED_4.output.csv      SS+SED System 4 output
│   │
│   └───Wisdom_GOO_SS_task4_1                           SS System 1 submission files
│   │       Wisdom_GOO_task4_SS_1.meta.yaml             SS System 1 meta information
│   │       Wisdom_GOO_task4_SS_1.output.csv            SS System 1 output
│   :
│   │
│   └───Wisdom_GOO_SS_task4_4                           SS System 4 submission files
│   │       Wisdom_GOO_task4_SS_4.meta.yaml             SS System 4 meta information
│   │       Wisdom_GOO_task4_SS_4.output.csv            SS System 4 output
│
└───task5                                               Task 5 submissions
│   │   Cartwright_NYU_task5.technical_report.pdf       Technical report
│   │
│   └───Cartwright_NYU_task5_1                          System 1 submission files
│   │       Cartwright_NYU_task5_1.meta.yaml            System 1 meta information
│   │       Cartwright_NYU_task5_1.output.csv           System 1 output
│   :
│   │
│   └───Cartwright_NYU_task5_4                          System 4 submission files
│           Cartwright_NYU_task5_4.meta.yaml            System 4 meta information
│           Cartwright_NYU_task5_4.output.csv           System 4 output
│
└───task6                                               Task 6 submissions
    │   Drossos_TAU_task6_1.technical_report.pdf        Technical report
    │
    └───Drossos_TAU_task6_1                             System 1 submission files
    │       Drossos_TAU_task6_1.meta.yaml               System 1 meta information
    │       Drossos_TAU_task6_1.output.csv              System 1 output
    :
    │
    └───Drossos_TAU_task6_4                             System 4 submission files
            Drossos_TAU_task6_4.meta.yaml               System 4 meta information
            Drossos_TAU_task6_4.output.csv              System 4 output
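Before uploading, a quick self-check of the package layout can catch missing files. The following Python sketch is not the official validator; it is a hypothetical helper that only assumes the convention visible in the tree above for most tasks, namely that each system folder `task<N>/<label>/` contains a file named `<label>.meta.yaml`.

```python
import io
import zipfile


def missing_meta_files(zip_file):
    """List the expected '<task>/<label>/<label>.meta.yaml' entries absent from the zip."""
    with zipfile.ZipFile(zip_file) as package:
        names = set(package.namelist())
    # A system folder is identified by the first two path components of any entry
    # that lives at least two levels deep: task<N>/<label>/...
    folders = {tuple(name.split("/")[:2]) for name in names if name.count("/") >= 2}
    missing = []
    for task, label in sorted(folders):
        expected = f"{task}/{label}/{label}.meta.yaml"
        if expected not in names:
            missing.append(expected)
    return missing


# Build a small in-memory package: system 2 lacks its meta file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("task1/Heittola_TAU_task1a_1/Heittola_TAU_task1a_1.meta.yaml", "")
    package.writestr("task1/Heittola_TAU_task1a_1/Heittola_TAU_task1a_1.output.csv", "")
    package.writestr("task1/Heittola_TAU_task1a_2/Heittola_TAU_task1a_2.output.csv", "")

print(missing_meta_files(buffer))
# ['task1/Heittola_TAU_task1a_2/Heittola_TAU_task1a_2.meta.yaml']
```

The task 4 folders in the tree are not named identically to the files they contain, so this simple rule would need adapting there; for tasks 1 and 4 the provided validator code remains the reference.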
System outputs
Participants must submit the results for the provided evaluation datasets.
- Follow the system output format specified in the task description.
- Tasks are independent. You can participate in a single task or in multiple tasks.
- Multiple submissions for the same task are allowed (maximum 4 per task). Use a running index in the submission label, and give more detailed names for the submitted systems in the system meta information files. Please mark clearly the connection between the submitted systems and the system parameter descriptions in the technical report (for example, by referring to the systems by the submission label or by the system name given in the system meta information file).
- Submitted system outputs will be published online on the DCASE2020 website later to allow future evaluations.
Meta information
To enable fast processing of the submissions and meta analysis of the submitted systems, participants should provide the meta information in a structured and correctly formatted YAML file. Participants are advised to fill in the meta information carefully and to make sure all requested information is correctly provided.
A complete meta file will help us notice possible errors before officially publishing the results (for example, an unexpectedly large difference in performance between the development and evaluation sets) and allow us to contact the authors if we consider it necessary. Please note that task organizers may ask you to update the meta file after the challenge submission deadline.
See the example meta files below for each baseline system. These examples are also available in the example submission package. The meta file structure is mostly the same for all tasks; only the metrics collected in the results->development_dataset section differ per challenge task.
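A few of the rules stated in the comments of the example files below (an abbreviation of at most 10 characters, one author marked as corresponding) can be checked automatically once the YAML has been parsed into a dictionary. The sketch below is a hypothetical pre-submission check, not the organizers' tooling; the function name is invented for this example, and reading "one of the authors" as "exactly one" is an assumption.

```python
def check_submission_meta(meta):
    """Return a list of problems found in the 'submission' part of a parsed meta file."""
    problems = []
    submission = meta.get("submission") or {}
    # Fields that appear in every example meta file on this page.
    for field in ("label", "name", "abbreviation", "authors"):
        if not submission.get(field):
            problems.append(f"missing field: submission.{field}")
    abbreviation = submission.get("abbreviation") or ""
    if len(abbreviation) > 10:
        problems.append("abbreviation is longer than 10 characters")
    authors = submission.get("authors") or []
    corresponding = [a for a in authors if a.get("corresponding") is True]
    if len(corresponding) != 1:
        problems.append("exactly one author should be marked as corresponding")
    return problems


example = {
    "submission": {
        "label": "Heittola_TAU_task1a_1",
        "name": "DCASE2020 baseline system",
        "abbreviation": "Baseline",
        "authors": [
            {"lastname": "Heittola", "firstname": "Toni", "corresponding": True},
            {"lastname": "Mesaros", "firstname": "Annamaria"},
        ],
    }
}
print(check_submission_meta(example))  # []
```

In practice the dictionary would come from a YAML parser, for example `yaml.safe_load` from the third-party PyYAML package; the dictionary is written out by hand here to keep the example free of dependencies.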
Example meta information file for Task 1 baseline system task1/Heittola_TAU_task1a_1/Heittola_TAU_task1a_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # Label is used to index submissions.
  # Generate your label in the following way to avoid
  # overlapping codes among submissions:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number][subtask letter (optional)]_[index number of your submission (1-4)]
  label: Heittola_TAU_task1a_1
  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2020 baseline system
  # Submission name abbreviated
  # This abbreviated name will be used in the results table when space is tight.
  # Use maximum 10 characters.
  abbreviation: Baseline
  # Authors of the submitted system. Mark authors in
  # the order you want them to appear in submission lists.
  # One of the authors has to be marked as corresponding author,
  # this will be listed next to the submission in the results tables.
  authors:
    # First author
    - lastname: Heittola
      firstname: Toni
      email: toni.heittola@tuni.fi # Contact email address
      corresponding: true # Mark true for one of the authors
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences # Optional
        location: Tampere, Finland
    # Second author
    - lastname: Mesaros
      firstname: Annamaria
      email: annamaria.mesaros@tuni.fi
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences
        location: Tampere, Finland
    # Third author
    - lastname: Virtanen
      firstname: Tuomas
      email: tuomas.virtanen@tuni.fi
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences
        location: Tampere, Finland
# System information
system:
  # System description, meta data provided here will be used to do
  # meta analysis of the submitted system.
  # Use general level tags, when possible use the tags provided in comments.
  # If information field is not applicable to the system, use "!!null".
  description:
    # Audio input
    # e.g. 16kHz, 22.05kHz, 44.1kHz
    input_sampling_rate: 44.1kHz
    # Acoustic representation
    # one or multiple labels, e.g. MFCC, log-mel energies, spectrogram, CQT, raw waveform, ...
    acoustic_features: !!null
    # Embeddings
    # e.g. VGGish, OpenL3, ...
    embeddings: OpenL3
    # Data augmentation methods
    # e.g. mixup, time stretching, block mixing, pitch shifting, ...
    data_augmentation: !!null
    # Machine learning
    # In case using ensemble methods, please specify all methods used (comma separated list).
    # one or multiple, e.g. GMM, HMM, SVM, MLP, CNN, RNN, CRNN, ResNet, ensemble, ...
    machine_learning_method: MLP
    # Ensemble method subsystem count
    # In case ensemble method is not used, mark !!null.
    # e.g. 2, 3, 4, 5, ...
    ensemble_method_subsystem_count: !!null
    # Decision making methods
    # e.g. average, majority vote, maximum likelihood, ...
    decision_making: !!null
    # External data usage method
    # e.g. directly, embeddings, pre-trained model, ...
    external_data_usage: embeddings
  # System complexity, meta data provided here will be used to evaluate
  # submitted systems from the computational load perspective.
  complexity:
    # Total amount of parameters used in the acoustic model.
    # For neural networks, this information is usually given before training process
    # in the network summary.
    # For other than neural networks, if parameter count information is not directly
    # available, try estimating the count as accurately as possible.
    # In case of ensemble approaches, add up parameters for all subsystems.
    # In case embeddings are used, add up parameter count of the embedding
    # extraction networks and classification network
    # Use numerical value.
    total_parameters: 5012931 # embeddings (OpenL3)=4684224, classifier=328707
  # List of external datasets used in the submission.
  # Development dataset is used here only as example, list only external datasets
  external_datasets:
    # Dataset name
    - name: TAU Urban Acoustic Scenes 2020, Development dataset
      # Dataset access url
      url: https://doi.org/10.5281/zenodo.3819968
      # Total audio length in minutes
      total_audio_length: 3840 # minutes
  # URL to the source code of the system [optional]
  source_code: https://github.com/toni-heittola/dcase2020_task1_baseline
# System results
results:
  development_dataset:
    # System results for development dataset with the provided cross-validation setup.
    # Full results are not mandatory, however, they are highly recommended
    # as they are needed for thorough analysis of the challenge submissions.
    # If you are unable to provide all results, also incomplete
    # results can be reported.
    # Overall metrics
    overall:
      accuracy: 51.6 # mean of class-wise accuracies
      logloss: 1.405
    # Class-wise metrics
    class_wise:
      airport:
        accuracy: 36.5
        logloss: 1.989
      bus:
        accuracy: 52.9
        logloss: 1.014
      metro:
        accuracy: 46.8
        logloss: 1.429
      metro_station:
        accuracy: 47.1
        logloss: 1.477
      park:
        accuracy: 72.7
        logloss: 0.971
      public_square:
        accuracy: 59.6
        logloss: 1.182
      shopping_mall:
        accuracy: 42.4
        logloss: 1.714
      street_pedestrian:
        accuracy: 20.9
        logloss: 2.421
      street_traffic:
        accuracy: 74.7
        logloss: 0.861
      tram:
        accuracy: 62.8
        logloss: 0.989
    # Device-wise
    device_wise:
      a:
        accuracy: 68.8
        logloss: 0.946
      b:
        accuracy: 60.2
        logloss: 1.158
      c:
        accuracy: 59.9
        logloss: 1.038
      s1:
        accuracy: 50.3
        logloss: 1.408
      s2:
        accuracy: 50.0
        logloss: 1.405
      s3:
        accuracy: 50.9
        logloss: 1.468
      s4:
        accuracy: 45.2
        logloss: 1.642
      s5:
        accuracy: 44.8
        logloss: 1.646
      s6:
        accuracy: 34.8
        logloss: 1.931
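Two of the figures in the Task 1 subtask A example above can be cross-checked with simple arithmetic: the total parameter count is the sum of the embedding and classifier counts given in its comment, and the overall accuracy is described as the mean of the class-wise accuracies. The short Python sketch below only reproduces those two checks with the numbers from the example; the variable names are chosen for illustration.

```python
# Parameter counts taken from the comment next to total_parameters.
embedding_parameters = 4684224
classifier_parameters = 328707
total_parameters = embedding_parameters + classifier_parameters
print(total_parameters)  # 5012931

# Class-wise accuracies from results -> development_dataset -> class_wise.
class_wise_accuracy = {
    "airport": 36.5,
    "bus": 52.9,
    "metro": 46.8,
    "metro_station": 47.1,
    "park": 72.7,
    "public_square": 59.6,
    "shopping_mall": 42.4,
    "street_pedestrian": 20.9,
    "street_traffic": 74.7,
    "tram": 62.8,
}
# The overall accuracy is the unweighted mean over the classes.
overall_accuracy = sum(class_wise_accuracy.values()) / len(class_wise_accuracy)
print(round(overall_accuracy, 1))  # 51.6
```

Running the same kind of check on your own meta file before submission is a cheap way to catch copy-paste mistakes in the results section.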
Example meta information file for Task 1 baseline system task1/Heittola_TAU_task1b_1/Heittola_TAU_task1b_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # Label is used to index submissions.
  # Generate your label in the following way to avoid
  # overlapping codes among submissions:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number][subtask letter (optional)]_[index number of your submission (1-4)]
  label: Heittola_TAU_task1b_1
  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2020 baseline system
  # Submission name abbreviated
  # This abbreviated name will be used in the results table when space is tight.
  # Use maximum 10 characters.
  abbreviation: Baseline
  # Authors of the submitted system. Mark authors in
  # the order you want them to appear in submission lists.
  # One of the authors has to be marked as corresponding author,
  # this will be listed next to the submission in the results tables.
  authors:
    # First author
    - lastname: Heittola
      firstname: Toni
      email: toni.heittola@tuni.fi # Contact email address
      corresponding: true # Mark true for one of the authors
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences # Optional
        location: Tampere, Finland
    # Second author
    - lastname: Mesaros
      firstname: Annamaria
      email: annamaria.mesaros@tuni.fi
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences
        location: Tampere, Finland
    # Third author
    - lastname: Virtanen
      firstname: Tuomas
      email: tuomas.virtanen@tuni.fi
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Computing Sciences
        location: Tampere, Finland
# System information
system:
  # System description, meta data provided here will be used to do
  # meta analysis of the submitted system.
  # Use general level tags, when possible use the tags provided in comments.
  # If information field is not applicable to the system, use "!!null".
  description:
    # Audio input / channels
    # one or multiple: e.g. mono, binaural, left, right, mixed, ...
    input_channels: mono
    # Audio input / sampling rate
    # e.g. 16kHz, 22.05kHz, 44.1kHz, 48.0kHz
    input_sampling_rate: 48.0kHz
    # Acoustic representation
    # one or multiple labels, e.g. MFCC, log-mel energies, spectrogram, CQT, raw waveform, ...
    acoustic_features: log-mel energies
    # Embeddings
    # e.g. VGGish, OpenL3, ...
    embeddings: !!null
    # Data augmentation methods
    # e.g. mixup, time stretching, block mixing, pitch shifting, ...
    data_augmentation: !!null
    # Machine learning
    # In case using ensemble methods, please specify all methods used (comma separated list).
    # one or multiple, e.g. GMM, HMM, SVM, MLP, CNN, RNN, CRNN, ResNet, ensemble, ...
    machine_learning_method: CNN
    # Ensemble method subsystem count
    # In case ensemble method is not used, mark !!null.
    # e.g. 2, 3, 4, 5, ...
    ensemble_method_subsystem_count: !!null
    # Decision making methods
    # e.g. average, majority vote, maximum likelihood, ...
    decision_making: !!null
    # External data usage method
    # e.g. directly, embeddings, pre-trained model, ...
    external_data_usage: embeddings
    # Method for handling the complexity restrictions
    # e.g. weight quantization, sparsity, ...
    complexity_management: !!null
  # System complexity, meta data provided here will be used to evaluate
  # submitted systems from the computational load perspective.
  complexity:
    # Total amount of parameters used in the acoustic model.
    # For neural networks, this information is usually given before training process
    # in the network summary.
    # For other than neural networks, if parameter count information is not directly
    # available, try estimating the count as accurately as possible.
    # In case of ensemble approaches, add up parameters for all subsystems.
    # In case embeddings are used, add up parameter count of the embedding
    # extraction networks and classification network
    # Use numerical value.
    total_parameters: 115219
    # Total amount of non-zero parameters in the acoustic model.
    # Calculated with same principles as "total_parameters".
    # Use numerical value.
    total_parameters_non_zero: 115219
    # Model size calculated as instructed in task description page.
    # Use numerical value, unit is KB
    model_size: 450.1 # KB
  # List of external datasets used in the submission.
  # Development dataset is used here only as example, list only external datasets
  external_datasets:
    # Dataset name
    - name: TAU Urban Acoustic Scenes 2020 3Class, Development dataset
      # Dataset access url
      url: https://doi.org/10.5281/zenodo.3670185
      # Total audio length in minutes
      total_audio_length: 2400 # minutes
  # URL to the source code of the system [optional]
  source_code: https://github.com/toni-heittola/dcase2020_task1_baseline
# System results
results:
  development_dataset:
    # System results for development dataset with the provided cross-validation setup.
    # Full results are not mandatory, however, they are highly recommended
    # as they are needed for thorough analysis of the challenge submissions.
    # If you are unable to provide all results, also incomplete
    # results can be reported.
    # Overall metrics
    overall:
      accuracy: 88.0
      logloss: 0.481
    # Class-wise accuracies
    class_wise:
      indoor:
        accuracy: 83.7
        logloss: 0.746
      outdoor:
        accuracy: 89.5
        logloss: 0.367
      transportation:
        accuracy: 90.7
        logloss: 0.356
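The model_size value in the subtask B example above is consistent with a simple calculation from the non-zero parameter count. The Python sketch below is not the official rule (the binding instructions are on the task description page, which is not reproduced here); it assumes 32-bit parameters and 1 KB = 1024 bytes, an assumption that happens to reproduce the 450.1 KB of the baseline.

```python
def model_size_kb(non_zero_parameters, bits_per_parameter=32):
    """Estimate model size in KB, assuming 1 KB = 1024 bytes (an assumption of this sketch)."""
    size_in_bytes = non_zero_parameters * bits_per_parameter / 8
    return size_in_bytes / 1024


# Baseline figures from the example meta file: 115219 non-zero parameters.
print(round(model_size_kb(115219), 1))  # 450.1

# Hypothetical illustration: the same parameter count stored with 16 bits each.
print(round(model_size_kb(115219, bits_per_parameter=16), 1))  # 225.0
```

If your system uses quantization or sparsity, report the value obtained with the method described in the task description rather than with this approximation.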
Example meta information file for Task 2 baseline system task2/Koizumi_NTT_task2_1/Koizumi_NTT_task2_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # Label is used to index submissions.
  # Generate your label in the following way to avoid overlapping codes among submissions:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Koizumi_NTT_task2_1
  # Submission name
  # This name will be used in the results tables when space permits.
  name: DCASE2020 baseline system
  # Submission name abbreviated
  # This abbreviated name will be used in the results table when space is tight.
  # Use a maximum of 10 characters.
  abbreviation: Baseline
  # Authors of the submitted system.
  # Mark authors in the order you want them to appear in submission lists.
  # One of the authors has to be marked as corresponding author, this will be listed next to the submission in the results tables.
  authors:
    # First author
    - lastname: Koizumi
      firstname: Yuma
      email: koizumi.yuma@ieee.org # Contact email address
      corresponding: true # Mark true for one of the authors
      # Affiliation information for the author
      affiliation:
        institution: NTT Corporation
        department: Media Intelligence Laboratories # Optional
        location: Tokyo, Japan
    # Second author
    - lastname: Kawaguchi
      firstname: Yohei
      email: yohei.kawaguchi.xk@hitachi.com
      # Affiliation information for the author
      affiliation:
        institution: Hitachi, Ltd.
        department: Research and Development Group
        location: Tokyo, Japan
    # Third author
    - lastname: Imoto
      firstname: Keisuke
      email: keisuke.imoto@ieee.org
      # Affiliation information for the author
      affiliation:
        institution: Doshisha University
        location: Kyoto, Japan
# System information
system:
  # System description, metadata provided here will be used to do a meta-analysis of the submitted system.
  # Use general level tags, when possible use the tags provided in comments.
  # If information field is not applicable to the system, use "!!null".
  description:
    # Audio input
    # Please specify all sampling rates (comma-separated list).
    # e.g. 16kHz, 22.05kHz, 44.1kHz
    input_sampling_rate: 16kHz
    # Data augmentation methods
    # Please specify all methods used (comma-separated list).
    # e.g. mixup, time stretching, block mixing, pitch shifting, ...
    data_augmentation: !!null
    # Front-end (preprocessing) methods
    # Please specify all methods used (comma-separated list).
    # e.g. HPSS, WPE, NMF, NN filter, RPCA, ...
    front_end: !!null
    # Acoustic representation
    # one or multiple labels, e.g. MFCC, log-mel energies, spectrogram, CQT, raw waveform, ...
    acoustic_features: log-mel energies
    # Embeddings
    # Please specify all embeddings used (comma-separated list).
    # one or multiple, e.g. VGGish, OpenL3, ...
    embeddings: !!null
    # Machine learning
    # In case using ensemble methods, please specify all methods used (comma-separated list).
    # e.g. AE, VAE, GAN, GMM, k-means, OCSVM, normalizing flow, CNN, LSTM, random forest, ensemble, ...
    machine_learning_method: AE
    # Method for aggregating predictions over time
    # Please specify all methods used (comma-separated list).
    # e.g. average, median, maximum, minimum, ...
    aggregation_method: average
    # Ensemble method subsystem count
    # In case ensemble method is not used, mark !!null.
    # e.g. 2, 3, 4, 5, ...
    ensemble_method_subsystem_count: !!null
    # Decision making in ensemble
    # e.g. average, median, maximum, minimum, ...
    decision_making: !!null
    # External data usage method
    # Please specify all usages (comma-separated list).
    # e.g. simulation of anomalous samples, embeddings, pre-trained model, ...
    external_data_usage: !!null
    # Usage of the development dataset
    # Please specify all usages (comma-separated list).
    # e.g. development, pre-training, fine-tuning
    development_data_usage: development
  # System complexity, metadata provided here may be used to evaluate submitted systems from the computational load perspective.
  complexity:
    # Total amount of parameters used in the acoustic model.
    # For neural networks, this information is usually given before training process in the network summary.
    # For other than neural networks, if parameter count information is not directly available, try estimating the count as accurately as possible.
    # In case of ensemble approaches, add up parameters for all subsystems.
    # In case embeddings are used, add up parameter count of the embedding extraction networks and classification network.
    # Use numerical value.
    total_parameters: 269992
  # List of external datasets used in the submission.
  # Development dataset is used here only as an example, list only external datasets
  external_datasets:
    # Dataset name
    - name: DCASE 2020 Challenge Task 2 Development Dataset
      # Dataset access URL
      url: https://zenodo.org/record/3678171
  # URL to the source code of the system [optional, highly recommended]
  # Reproducibility will be used to evaluate submitted systems.
  source_code: https://github.com/y-kawagu/dcase2020_task2_baseline
# System results
results:
  development_dataset:
    # System results for development dataset.
    # Full results are not mandatory, however, they are highly recommended as they are needed for a thorough analysis of the challenge submissions.
    # If you are unable to provide all results, also incomplete results can be reported.
    # Average of AUCs over all Machine IDs [%]
    # No need to round numbers
    ToyCar:
      averaged_auc: 78.77
      averaged_pauc: 67.58
    ToyConveyor:
      averaged_auc: 72.53
      averaged_pauc: 60.43
    fan:
      averaged_auc: 65.83
      averaged_pauc: 52.45
    pump:
      averaged_auc: 72.89
      averaged_pauc: 59.99
    slider:
      averaged_auc: 84.76
      averaged_pauc: 66.53
    valve:
      averaged_auc: 66.28
      averaged_pauc: 50.98
Example meta information file for Task 3 baseline system task3/Politis_TAU_task3_1/Politis_TAU_task3_1.meta.yaml:
# Submission information
submission:
  # Submission label
  # Label is used to index submissions, to avoid overlapping codes among submissions
  # use the following way to form your label:
  # [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
  label: Politis_TAU_task3_1
  # Submission name
  # This name will be used in the results tables when space permits
  name: DCASE2020 Ambisonic example
  # Submission name abbreviated
  # This abbreviated name will be used in the results table when space is tight, maximum 10 characters
  abbreviation: FOA_base
  # Submission authors in order, mark one of the authors as corresponding author.
  authors:
    # First author
    - lastname: Politis
      firstname: Archontis
      email: archontis.politis@tuni.fi # Contact email address
      corresponding: true # Mark true for one of the authors
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Audio Research Group
        location: Tampere, Finland
    # Second author
    - lastname: Adavanne
      firstname: Sharath
      email: sharath.adavanne@tuni.fi # Contact email address
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Audio Research Group
        location: Tampere, Finland
    # Third author
    - lastname: Virtanen
      firstname: Tuomas
      email: tuomas.virtanen@tuni.fi # Contact email address
      # Affiliation information for the author
      affiliation:
        abbreviation: TAU
        institute: Tampere University
        department: Audio Research Group
        location: Tampere, Finland
# System information
system:
  # System description, meta data provided here will be used to do
  # meta analysis of the submitted system. Use general level tags, if possible use the tags provided in comments.
  # If information field is not applicable to the system, use "!!null".
  description:
    # Audio input
    input_format: Ambisonic # e.g. Ambisonic or Microphone Array or both
    input_sampling_rate: 24kHz #
    # Acoustic representation
    acoustic_features: mel spectra, intensity vector # e.g one or multiple [phase and magnitude spectra, mel spectra, GCC-PHAT, TDOA, intensity vector ...]
    # Data augmentation methods
    data_augmentation: !!null # [time stretching, block mixing, pitch shifting, ...]
    # Machine learning
    # In case using ensemble methods, please specify all methods used (comma separated list).
    machine_learning_method: CRNN # e.g one or multiple [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]
  # System complexity, meta data provided here will be used to evaluate
  # submitted systems from the computational load perspective.
  complexity:
    # Total amount of parameters used in the acoustic model. For neural networks, this
    # information is usually given before training process in the network summary.
    # For other than neural networks, if parameter count information is not directly available,
    # try estimating the count as accurately as possible.
    # In case of ensemble approaches, add up parameters for all subsystems.
    total_parameters: 116118
  # URL to the source code of the system [optional]
  source_code: https://github.com/sharathadavanne/seld-dcase2020
  # Evaluation setup used for training your final submitted model
  evaluation_setup:
    # List the folds used for training and validating your submitted evaluation model. For instance the baseline SELDnet was trained and validated with the following folds from the dataset
    training_folds: 2, 3, 4, 5, 6
    validation_folds: 1
# System results
results:
  development_dataset:
    # System result for development dataset with the provided evaluation setup.
    # Overall score
    overall:
      ER_20: 0.72
      F_20: 37.4
      LE_CD: 22.8
      LR_CD: 60.7
Example meta information file for Task 4 baseline system task4/Turpault_INR_task4_SS_SED_1/Turpault_INR_task4_SS_SED_1.meta.yaml:
# Submission information
submission:
# Submission label
# Label is used to index submissions, to avoid overlapping codes among submissions
# use following way to form your label:
# [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
label: Turpault_INR_task4_SS_SED_1
# Submission name
# This name will be used in the results tables when space permits
name: DCASE2020 SS+SED baseline system
# Submission name abbreviated
# This abbreviated name will be used in the results table when space is tight, maximum 10 characters
abbreviation: SS+SED Baseline
# Submission authors in order, mark one of the authors as corresponding author.
authors:
# First author
- lastname: Turpault
firstname: Nicolas
email: nicolas.turpault@inria.fr # Contact email address
corresponding: true # Mark true for one of the authors
# Affiliation information for the author
affiliation:
abbreviation: INR
institute: Inria Nancy Grand-Est
department: Department of Natural Language Processing & Knowledge Discovery
location: Nancy, France
# Second author
- lastname: Serizel
firstname: Romain
email: romain.serizel@loria.fr # Contact email address
# Affiliation information for the author
affiliation:
abbreviation: ULO
institute: University of Lorraine, Loria
department: Department of Natural Language Processing & Knowledge Discovery
location: Nancy, France
# Third author
- firstname: John
lastname: Hershey
# Affiliation information for the author
affiliation:
abbreviation: GOO
institute: Google, Inc.
department: AI Perception
location: Cambridge, United States
# Fourth author
- firstname: Scott
lastname: Wisdom
# Affiliation information for the author
affiliation:
abbreviation: GOO
institute: Google, Inc.
department: AI Perception
location: Cambridge, United States
# Fifth author
- firstname: Hakan
lastname: Erdogan
# Affiliation information for the author
affiliation:
abbreviation: GOO
institute: Google, Inc.
department: AI Perception
location: Cambridge, United States
#...
# SED System information
sed_system:
# SED system description, meta data provided here will be used to do
# meta analysis of the submitted system. Use general-level tags; if possible, use the tags provided in the comments.
# If an information field is not applicable to the system, use "!!null".
description:
# Audio input
input_channels: mono # e.g. one or multiple [mono, binaural, left, right, mixed, ...]
input_sampling_rate: 16 # In kHz
# Acoustic representation
acoustic_features: log-mel energies # e.g. one or multiple [MFCC, log-mel energies, spectrogram, CQT, ...]
# Data augmentation methods
data_augmentation: !!null # [time stretching, block mixing, pitch shifting, ...]
# Machine learning
# If ensemble methods are used, please specify all methods used (comma-separated list).
machine_learning_method: CRNN # e.g. one or multiple [GMM, HMM, SVM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]
# Ensemble method subsystem count
# In case ensemble method is not used, mark !!null.
ensemble_method_subsystem_count: 3 # [2, 3, 4, 5, ... ]
# Decision making methods
decision_making: P-norm # [majority vote, ...]
# Semi-supervised method used to exploit both labelled and unlabelled data
machine_learning_semi_supervised: mean-teacher student # e.g one or multiple [pseudo-labelling, mean-teacher student...]
# Segmentation method
segmentation_method: !!null # E.g. [RBM, attention layers...]
# Post-processing, followed by the time span (in ms) in case of smoothing
post-processing: median filtering (93ms) # [median filtering, time aggregation...]
# System complexity, meta data provided here will be used to evaluate
# submitted systems from the computational load perspective.
complexity:
# Total amount of parameters used in the acoustic model. For neural networks, this
# information is usually given before training process in the network summary.
# For other than neural networks, if parameter count information is not directly available,
# try estimating the count as accurately as possible.
# In case of ensemble approaches, add up parameters for all subsystems.
total_parameters: 1112420
# Approximate training time followed by the hardware used
training_time: 3h (1 GTX 1080 Ti)
# Model size in MB
model_size: 4.5
# The training subsets used to train the model, followed by the amount of data (number of clips) used per subset.
subsets: # [weak (xx), unlabel_in_domain (xx), synthetic (xx), FUSS (xx)...]
# URL to the source code of the system [optional, highly recommended]
source_code: https://github.com/turpaultn/dcase20_task4/tree/public_branch/baseline
# SS System information
ss_system:
# SS system description, meta data provided here will be used to do
# meta analysis of the submitted system. Use general-level tags; if possible, use the tags provided in the comments.
# If an information field is not applicable to the system, use "!!null".
# Description of sound separation system (optional).
input_sampling_rate: 16 #in kHz
# Basis
basis: stft # e.g. one or multiple [stft, learnable_conv, ...]
# Data augmentation methods
data_augmentation: !!null # [time stretching, block mixing, pitch shifting, mix-on-the-fly, ...]
# Network architecture
# If ensemble methods are used, please specify all methods used (comma-separated list).
network_architecture: TDCN++ # e.g. one or multiple [MLP, CNN, RNN, CRNN, NMF, ConvTasNet, TDCN++, ensemble, ...]
# Ensemble method subsystem count
# In case ensemble method is not used, mark !!null.
ensemble_method_subsystem_count: !!null # [2, 3, 4, 5, ... ]
# Other details about the separation model
other_details: mixture consistency # comma-delimited
complexity:
# Total amount of parameters used in the acoustic model. For neural networks, this
# information is usually given before training process in the network summary.
# For other than neural networks, if parameter count information is not directly available,
# try estimating the count as accurately as possible.
# In case of ensemble approaches, add up parameters for all subsystems.
total_parameters: 1112420
# Approximate training time followed by the hardware used
training_time: 3h (1 GTX 1080 Ti)
# Model size in MB
model_size: 4.5
# URL to the source code of the system [optional, highly recommended]
source_code: https://github.com/google-research/sound-separation/tree/master/models/dcase2020_fuss_baseline
# Description of joint SED and SS system.
sed_ss_description:
# Integration type
integration_type: late # [early, late, both,...]
# Integration method
integration_method: average # [average, concat, ...]
# Separated sources used
separated_sources_used: DESED foreground # [DESED foreground, all sources, ...]
# Training procedure
training: separately # [separately, end-to-end, ...]
# System results (SED)
sed_results:
# Full results are not mandatory, but they are recommended for a thorough analysis of the challenge submissions.
# If you cannot provide all results, incomplete results can also be reported.
development_dataset:
# System result for development dataset with the provided cross-validation setup.
overall:
F-score: 34.8
PSDS: 0.610
# Class-wise F-scores
class_wise:
Alarm_bell_ringing:
F-score: 36.1
Blender:
F-score: 35.2
Cat:
F-score: 45.1
Dishes:
F-score: 25.7
Dog:
F-score: 22.1
Electric_shaver_toothbrush:
F-score: 37.6
Frying:
F-score: 24.1
Running_water:
F-score: 33.4
Speech:
F-score: 50.9
Vacuum_cleaner:
F-score: 45.7
# System results (see SS evaluation script for more details)
ss_results:
# SS on DESED+FUSS development set
dry_fuss_development_dataset:
SISNR_mixture_target: 0
SISNR_separated_target: 0
SISNR_mixture_background: 0
SISNR_separated_background: 0
# SS on DESED+FUSS evaluation set
dry_fuss_evaluation_dataset:
SISNR_mixture_target: 0
SISNR_separated_target: 0
SISNR_mixture_background: 0
SISNR_separated_background: 0
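The comments in the meta files describe two constraints that are easy to check before packaging: the label naming scheme and the 10-character limit on the abbreviated name. The sketch below is one possible check, not an official validator. The regular expression is an assumption derived from the label description and from the example labels on this page (it allows optional tags such as `SS_SED` between the task number and the index), and `check_submission` is a hypothetical helper name.

```python
import re

# Assumed pattern: Lastname_INST_task<number>[_optional tags]_<index 1-4>.
# The optional tags cover labels such as "..._task4_SS_SED_1".
LABEL_PATTERN = re.compile(
    r"^[^_\s]+_[A-Za-z0-9]+_task\d+[a-z]?(?:_[A-Za-z0-9]+)*_[1-4]$"
)


def check_submission(label, abbreviation, max_abbreviation_length=10):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not LABEL_PATTERN.match(label):
        problems.append(f"label {label!r} does not follow the naming scheme")
    if len(abbreviation) > max_abbreviation_length:
        problems.append(
            f"abbreviation {abbreviation!r} is longer than "
            f"{max_abbreviation_length} characters"
        )
    return problems


# A well-formed label with a short abbreviation, then a malformed pair.
print(check_submission("Turpault_INR_task4_SS_SED_1", "Baseline"))
print(check_submission("Turpault_INR_task4_5", "A much too long name"))
```

Returning a list of problems rather than raising on the first one makes it possible to report every issue in a meta file at once.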
Example meta information file for Task 5 baseline system task5/Cartwright_NYU_task5_1/Cartwright_NYU_task5_1.meta.yaml:
# Submission information
submission:
# Submission label
# Label is used to index submissions. To avoid overlapping codes among submissions,
# form your label as follows:
# [Last name of corresponding author]_[Abbreviation of institution of the corresponding author]_task[task number]_[index number of your submission (1-4)]
label: Cartwright_NYU_task5_1
# Submission name
# This name will be used in the results tables when space permits
name: DCASE2020 baseline system
# Submission name abbreviated
# This abbreviated name will be used in the results table when space is tight, maximum 10 characters
abbreviation: Baseline
# Submission authors in order, mark one of the authors as corresponding author.
authors:
# First author
- firstname: Mark
lastname: Cartwright
email: mark.cartwright@nyu.edu # Contact email address
corresponding: true # Mark true for one of the authors
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Computer Science and Engineering, Center for Urban Science and Progress
location: New York, New York, USA
# Second author
- firstname: Jason
lastname: Cramer
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Electrical and Computer Engineering
location: New York, New York, USA
# Third author
- firstname: Ana Elisa
lastname: Mendez Mendez
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Music and Performing Arts Professions
location: New York, New York, USA
# Fourth author
- firstname: Yu
lastname: Wang
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Music and Performing Arts Professions
location: New York, New York, USA
# Fifth author
- firstname: Ho-Hsiang
lastname: Wu
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Music and Performing Arts Professions
location: New York, New York, USA
# Sixth author
- firstname: Vincent
lastname: Lostanlen
# Affiliation information for the author
affiliation:
institution: Cornell University
department: Cornell Lab of Ornithology
location: Ithaca, New York, USA
# Seventh author
- firstname: Magdalena
lastname: Fuentes
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Music and Performing Arts Professions
location: New York, New York, USA
# Eighth author
- firstname: Justin
lastname: Salamon
# Affiliation information for the author
affiliation:
institution: Adobe Research
department: Machine Perception Team
location: San Francisco, CA, USA
# Ninth author
- firstname: Juan P.
lastname: Bello
# Affiliation information for the author
affiliation:
institution: New York University
department: Music and Audio Research Laboratory, Department of Computer Science and Engineering, Center for Urban Science and Progress
location: New York, New York, USA
# System information
system:
# System description, meta data provided here will be used to do
# meta analysis of the submitted system. Use general-level tags; if possible, use the tags provided in the comments.
# If an information field is not applicable to the system, use "!!null".
description:
# Audio input
input_channels: mono # e.g. one or multiple [mono, binaural, left, right, mixed, ...]
input_sampling_rate: 48kHz
# Acoustic representation
acoustic_features: openl3 # e.g. one or multiple [MFCC, log-mel energies, spectrogram, CQT, deep embedding (e.g. vggish), ...]
# Data augmentation methods
data_augmentation: !!null # [time stretching, block mixing, pitch shifting, ...]
# Machine learning method
# If ensemble methods are used, please specify all methods used (comma-separated list).
# You do not need to repeat model types used multiple times in the ensemble.
machine_learning_method: MLP # e.g. one or multiple [GMM, HMM, kNN, MLP, CNN, RNN, CRNN, NMF, random forest, ensemble, ...]
# Ensemble method subsystem count
# In case ensemble method is not used, mark !!null.
ensemble_method_subsystem_count: !!null # [2, 3, 4, 5, ... ]
# Specify if any of the additional metadata was used for training
used_annotator_id: false
used_proximity: false
used_sensor_id: false
used_borough: false
used_block: false
used_latitude: true
used_longitude: true
used_year: false
used_week: true
used_day: true
used_hour: true
# Method for aggregating predictions over time, if relevant
aggregation_method: !!null # [mean, ...]
# STC data type and source
stc_external_data_and_sources: !!null # [(precipitation, NOAA), (temperature, NOAA), (311complaints, NYCOpenData), ...]
# External data type and source
other_external_data_and_sources: !!null # [(audio_data, AudioSet), ...]
# Annotation level targeted by model
# That is, if the model only predicts fine level annotations
# should be specified as "fine". A model that is specifically
# trained to predict both fine and coarse annotations should
# be specified as "both".
target_level: fine # [fine, coarse, both]
# Method for determining targets for training from annotations
target_method: minority vote # [minority vote, majority vote, ...]
# Re-labeling of the train set
re_labeling: !!null # [automatic, manual, ...]
# NOTE: These should only be provided if providing detections, rather than
# probabilities of class presence.
#
# Type of method used to determine detection thresholds
detection_threshold_method: !!null # [automatic, manual, fixed, !!null]
# Specify if the method for determining threshold was done over all classes
# or per class
detection_threshold_level: !!null # [global, classwise, !!null]
# System complexity, meta data provided here will be used to evaluate
# submitted systems from the computational load perspective.
complexity:
# Total amount of learned parameters used in the acoustic model.
# For neural networks, this information is usually given before training process
# in the network summary. For other than neural networks, if parameter count
# information is not directly available, try estimating the count as accurately
# as possible. In case of ensemble approaches, add up parameters for all subsystems.
total_parameters: 79534
# URL to the source code of the system
source_code: https://github.com/sonyc-project/dcase2020task5-uststc-baseline
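The example files write `input_sampling_rate` in two ways: as a bare number with an "In kHz" comment (Task 4) and as a string with a unit (Tasks 5 and 6). The challenge page does not prescribe a parser, so the following is only a sketch of how such values could be normalised to a number in kHz and how a malformed unit could be caught. The helper name `sampling_rate_khz` and the accepted spellings are assumptions.

```python
import re

# Accepts a number with an optional, case-insensitive "kHz" suffix.
_KHZ = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(?:khz)?\s*$", re.IGNORECASE)


def sampling_rate_khz(value):
    """Parse 16, "16", "44.1kHz" or "48 kHz" into a float in kHz.

    Raises ValueError for anything else, for example a misspelled unit.
    """
    match = _KHZ.match(str(value))
    if match is None:
        raise ValueError(f"unrecognised sampling rate: {value!r}")
    return float(match.group(1))


for raw in (16, "44.1kHz", "48 kHz"):
    print(raw, "->", sampling_rate_khz(raw))
```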
Example meta information file for Task 6 baseline system task6/Drossos_TAU_task6_1/Drossos_TAU_task6_1.meta.yaml:
# Submission information
submission:
# Submission label
# Label is used to index submissions.
# Generate your label in the following way to avoid
# overlapping codes among submissions:
# [Last name of corresponding author]_[Abbreviation of institute of the corresponding author]_task[task number]_[index number of your submission (1-4)]
label: Drossos_TAU_task6_1
# Submission name
# This name will be used in the results tables when space permits
name: DCASE2020 baseline system
# Submission name abbreviated
# This abbreviated name will be used in the results table when space is tight.
# Use maximum 10 characters.
abbreviation: Baseline
# Authors of the submitted system. Mark authors in
# the order you want them to appear in submission lists.
# One of the authors has to be marked as corresponding author,
# this will be listed next to the submission in the results tables.
authors:
# First author
- lastname: Drossos
firstname: Konstantinos
email: konstantinos.drossos@tuni.fi # Contact email address
corresponding: true # Mark true for one of the authors
# Affiliation information for the author
affiliation:
abbreviation: TAU
institute: Tampere University
department: Computing Sciences # Optional
location: Tampere, Finland
# Second author
- lastname: Lipping
firstname: Samuel
email: samuel.lipping@tuni.fi
# Affiliation information for the author
affiliation:
abbreviation: TAU
institute: Tampere University
department: Computing Sciences
location: Tampere, Finland
# Third author
- lastname: Virtanen
firstname: Tuomas
email: tuomas.virtanen@tuni.fi
# Affiliation information for the author
affiliation:
abbreviation: TAU
institute: Tampere University
department: Computing Sciences
location: Tampere, Finland
# System information
system:
# System description, meta data provided here will be used to do
# meta analysis of the submitted system.
# Use general-level tags; when possible, use the tags provided in the comments.
# If an information field is not applicable to the system, use "!!null".
description:
# Audio input
# e.g. 16kHz, 22.05kHz, 44.1kHz
input_sampling_rate: 44.1kHz
# Acoustic representation
# one or multiple labels, e.g. MFCC, log-mel energies, spectrogram, CQT, raw waveform, ...
acoustic_features: !!null
# Word representation
# one or multiple labels, e.g. embeddings, one-hot, ...
word_features: !!null
# Usage of available metadata (i.e. the tags of each file)
# Either True or False
used_metadata: False
# Data augmentation methods
# e.g. mixup, time stretching, block mixing, pitch shifting, ...
data_augmentation: !!null
# Machine learning
# If ensemble methods are used, please specify all methods used (comma-separated list).
# one or multiple, e.g. seq2seq
machine_learning_method: MLP
# In the case that you used a seq2seq, indicate
# what encoder you have, e.g. multi-layer GRU, CNNs, etc
seq2seq_encoder: multi-layer RNN
# and what decoder you have, e.g. multi-layer GRU, CNNs, etc
seq2seq_decoder: multi-layer RNN
# Classifier
# e.g. feed-forward, SVM, ...
classifier: feed-forward
# System complexity, meta data provided here will be used to evaluate
# submitted systems from the computational load perspective.
complexity:
# Total amount of parameters used in the system.
# For neural networks, this information is usually given before training process
# in the network summary.
# For other than neural networks, if parameter count information is not directly
# available, try estimating the count as accurately as possible.
# In case of ensemble approaches, add up parameters for all subsystems.
# In case embeddings are used, add up parameter count of the embedding
# extraction networks and classification network.
# If you use extra modules during training, but not during testing, here
# indicate the total (i.e. during training) amount of parameters.
# Use numerical value.
total_parameters: 5012931 # embeddings (OpenL2)=4684224, classifier=328707
# And here, if different from `total_parameters`, indicate
# the amount of parameters that your system use during inference.
inference_parameters: 10931
# URL to the source code of the system [optional, but strongly encouraged]
source_code: https://github.com/audio-captioning/dcase-2020-baseline
# URL to the pre-trained weights of the system [optional, but strongly encouraged]
pre_trained_weights: https://zenodo.org/record/3697687
# System results
results:
# For both development and evaluation results, full results are not mandatory,
# however, they are highly recommended as they are needed for a thorough analysis
# of the challenge submissions. If you are unable to provide all results,
# incomplete results can also be reported.
#
# For all metrics, numerical precision should be at three decimal digits.
development_dataset:
# System results for **development** dataset with the provided setup.
# Final results are reported in the **evaluation** section, just
# below this one.
# Metrics
bleu_1: 0.389
bleu_2: 0.136
bleu_3: 0.055
bleu_4: 0.015
rouge_l: 0.262
meteor: 0.084
cider: 0.074
spice: 0.033
spider: 0.054
evaluation_dataset:
# System results for **evaluation** dataset with the provided evaluation setup.
# These results are the ones that will rank your submission.
# Metrics
bleu_1: # add here the BLEU 1 score of your method for DCASE evaluation split
bleu_2: # add here the BLEU 2 score of your method for DCASE evaluation split
bleu_3: # add here the BLEU 3 score of your method for DCASE evaluation split
bleu_4: # add here the BLEU 4 score of your method for DCASE evaluation split
rouge_l: # add here the ROUGE L score of your method for DCASE evaluation split
meteor: # add here the METEOR score of your method for DCASE evaluation split
cider: # add here the CIDEr score of your method for DCASE evaluation split
spice: # add here the SPICE score of your method for DCASE evaluation split
spider: # add here the SPIDEr score of your method for DCASE evaluation split
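The complexity comments ask for parameter counts to be added up across components (for example embedding extractor plus classifier), for `inference_parameters` to be given when it differs from the total, and for metrics to be reported with three decimal digits. A minimal sketch of these three steps follows. The function names are hypothetical, and the rule that the inference count should not exceed the total is an assumption based on the comment that the total covers everything used during training.

```python
def total_parameters(components):
    """Sum per-component parameter counts, e.g. embedding network plus classifier."""
    return sum(components.values())


def check_complexity(total, inference=None):
    """Assumed rule: inference parameters, when given, are positive and at most the total."""
    return inference is None or 0 < inference <= total


def format_metric(value):
    """Format a metric with three decimal digits, as requested for the results section."""
    return f"{value:.3f}"


# Component counts as given in the comment next to total_parameters above.
components = {"embeddings": 4684224, "classifier": 328707}
total = total_parameters(components)
print(total, check_complexity(total, inference=10931))
print(format_metric(0.389))
```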
Technical report
All participants are expected to submit a technical report about the submitted system, to help the DCASE community better understand how the algorithm works.
Technical reports are not peer-reviewed. The technical reports will be published on the challenge website together with all other information about the submitted system. The technical report does not need to follow the structure of a scientific publication closely (for example, there is no need for an extensive literature review). The report should, however, describe the system in sufficient detail.
Please report the system performance using the provided cross-validation setup or development set, according to the task. For participants taking part in multiple tasks, one technical report covering all tasks is sufficient if the systems have only small differences. Describe the task-specific parameters in the report.
Participants can also submit the same report as a scientific paper to the DCASE2020 Workshop. In this case, the paper must respect the structure of a scientific publication and be prepared according to the provided Workshop paper instructions and template. Please note that the template is slightly different, and you will have to create a separate submission to the DCASE2020 Workshop track in the submission system. Please refer to the workshop webpage for more details. DCASE2020 Workshop papers will be peer-reviewed.
Template
Reports follow a 4+1 page format: they are at most 5 pages long, including all text, figures, and references, with the 5th page containing only references. The templates for the technical report are available here: