Poster Session II


Language-Queried Audio Source Separation Enhanced by Expanded Language-Audio Contrastive Loss
Haechun Chung (KT Corporation), JaeHoon Jung (AI2XL Lab., Institute of Convergence Technology, KT Corp.)

A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining
Feiyang Xiao (Harbin Engineering University), Jian Guan (Harbin Engineering University), Qiaoxi Zhu (University of Technology Sydney), Xubo Liu (University of Surrey), Wenbo Wang (Harbin Institute of Technology), Shuhan Qi (Harbin Institute of Technology, Shenzhen), Kejia Zhang (Harbin Engineering University), Jianyuan Sun (University of Sheffield), Wenwu Wang (University of Surrey)

Large-Language-Model-Based Caption Augmentation for Language-Queried Audio Source Separation
Yoonah Song (GIST), Dohyun Lee (GIST), Hong Kook Kim (Gwangju Institute of Science and Technology)

Angular Distance Distribution Loss for Audio Classification
Antonio Almudévar (University of Zaragoza), Romain Serizel (Université de Lorraine), Alfonso Ortega (Universidad de Zaragoza)

Auxiliary Decoder-Based Learning of Sound Event Detection using Multi-Channel Features and Maximum Probability Aggregation
Sang Won Son (Gwangju Institute of Science and Technology), Jongyeon Park (Gwangju Institute of Science and Technology), Hong Kook Kim (Gwangju Institute of Science and Technology), Sulaiman Vesal (Hanwha Vision), Jeong Eun Lim (Hanwha Vision)

Data-Efficient Acoustic Scene Classification with Pre-Training, Bayesian Ensemble Averaging, and Extensive Augmentations
Aida Rostamza (Johannes Kepler Universität Linz), David Nadrchal (Johannes Kepler Universität Linz), Patrick Schilcher (Johannes Kepler Universität Linz)

Task 6 Automated Audio Captioning

Huang Xie (Tampere University), Tuomas Virtanen (Tampere University), Romain Serizel (University of Lorraine), Etienne Labbé (Université Toulouse III – Paul Sabatier), Thomas Pellegrini (Université Toulouse III – Paul Sabatier), Xinhao Mei (University of Surrey), Xuenan Xu (University of Surrey), Mark D. Plumbley (University of Surrey), Wenwu Wang (University of Surrey), Mengyue Wu (Shanghai Jiao Tong University)

Task 7 Sound Scene Synthesis

Mathieu Lagrange (CNRS, Ecole Centrale Nantes, Nantes Université), Junwon Lee (Gaudio Lab, Inc. / Korea Advanced Institute of Science & Technology (KAIST)), Modan Tailleur (Laboratoire des sciences du numérique de Nantes), Laurie Heller (Carnegie Mellon University), Keunwoo Choi (Gaudio Lab, Inc.), Brian McFee (New York University), Keisuke Imoto (Doshisha University), Yuki Okamoto (The University of Tokyo)

Task 8 Language-Based Audio Retrieval

Huang Xie (Tampere University), Tuomas Virtanen (Tampere University), Romain Serizel (University of Lorraine), Etienne Labbé (Université Toulouse III – Paul Sabatier), Thomas Pellegrini (Université Toulouse III – Paul Sabatier)

Task 9 Language-Queried Audio Source Separation

Xubo Liu (University of Surrey), Wenwu Wang (University of Surrey), Mark D. Plumbley (University of Surrey), Jonathan Le Roux (Mitsubishi Electric Research Laboratories), Gordon Wichern (Mitsubishi Electric Research Laboratories), Yan Zhao (ByteDance), Yuzhuo Liu (ByteDance), Hangting Chen (Tencent AI Lab)

Task 10 Acoustic-based Traffic Monitoring

Luca Bondi (Bosch Research, USA), Shabnam Ghaffarzadegan (Bosch Research, USA), Stefano Damiano (KU Leuven), Abinaya Kumar (Bosch Research, USA), Ho-Hsiang Wu (Bosch Research, USA), Wei-Cheng Lin (Bosch Research, USA), Samarjit Das (Bosch Research, USA), Hans-Georg Horst (Robert Bosch GmbH), Toon van Waterschoot (KU Leuven)