Task description
A more detailed task description can be found on the task description page.
Teams ranking
The best system from each team is listed below, ordered by average accuracy.
| Submission Code | Name | Corresponding Author | Affiliation | Continual learning solution | Technical Report | Avg Accuracy (%) | D1 Accuracy (%) | D2 Accuracy (%) | D3 Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| Park_GIST-HanwhaVision_task7_4 | XSTACK-ENS | Jongyeon Park | Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea | Architectural, Replay | Park2026 | 79.62 | 87.07 | 81.45 | 70.34 | |
| Chun_Chosun_task7_1 | OR-KDL-L | Chun Chanjun | Chosun University, Gwangju, Republic of Korea | Regularization, Architectural | Chanjun2026 | 77.96 | 91.71 | 79.74 | 62.42 | |
| Im_SSU_task7_2 | DIR_OOD-S2 | Sungbin Im | Soongsil University, Seoul, Republic of Korea | Architectural, Regularization | Choi2026 | 76.92 | 88.82 | 78.62 | 63.32 | |
| Chang_Surrey_task7_1 | PH-Route | Peiwei Chang | University of Surrey, Guildford, UK | Architectural, Regularization | Chang2026 | 75.12 | 80.25 | 82.63 | 62.49 | |
| Divakar_IND_task7_2 | Augmentation | Aakash Divakar | Independent Researcher | Architectural, Regularization | Divakar2026 | 72.78 | 89.22 | 76.25 | 52.87 | |
| Miyazaki_CyberAgent_task7_4 | Chain-tri | Koichi Miyazaki | CyberAgent, Tokyo, Japan | Architectural, Regularization | Miyazaki2026 | 72.32 | 87.95 | 70.03 | 58.98 | |
| Chang_HYU_task7_3 | HYU | Joon-Hyuk Chang | Hanyang University, Seoul, Republic of Korea | Architectural, Regularization | Son2026 | 71.27 | 90.23 | 68.43 | 55.14 | |
| BelHadj_VUB_task7_2 | LoRAAdapt | Yacine Bel-Hadj | Vrije Universiteit Brussel, Brussels, Belgium | Architectural, Optimization | Bel-Hadj2026 | 70.65 | 85.72 | 67.88 | 58.34 | |
| Guan_HEU_task7_2 | ProDEx-WPE | Xuefeng Yang | Harbin Engineering University, Harbin, China | Architectural | Yang2026 | 69.91 | 79.26 | 70.58 | 59.89 | |
| Li_SCUT_task7_1 | SoftD3Exp | Yanxiong Li | South China University of Technology, Guangzhou, China | Architectural | Li2026 | 69.38 | 86.80 | 68.08 | 53.25 | |
| Zhang_XJTLU_task7_4 | MAJ5 | Peihong Zhang | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural | Zhang2026a | 69.08 | 81.54 | 69.25 | 56.44 | |
| Gao_SHNU_task7_1 | CkptAvg | Zhe Gao | Shanghai Normal University, Shanghai, China | Optimization | Sun2026 | 68.87 | 87.14 | 59.49 | 59.99 | |
| Huang_WHU_task7_3 | MR-CLIAC | Gongping Huang | Wuhan University, Wuhan, China | Architectural, Regularization | Pi2026 | 68.78 | 73.84 | 72.24 | 60.25 | |
| Heo_SeoulTech_task7_3 | MoE-TTA | Se-Min Heo | Seoul National University of Science and Technology (SeoulTech), Seoul, Republic of Korea | Architectural | Heo2026 | 66.66 | 72.69 | 82.01 | 45.28 | |
| Chung_IND_task7_4 | System4 | Haechun Chung | Independent, Seoul, Republic of Korea | Architectural | Chung2026 | 65.46 | 89.14 | 60.76 | 46.49 | |
| Tyagi_IITM_Task7_1 | Tyagi | Akansha Tyagi | Indian Institute of Technology Mandi, Himachal Pradesh, India | Architectural | Tyagi2026 | 63.09 | 81.54 | 66.99 | 40.76 | |
| Takami_OU_task7_1 | DCCMS | Haruto Takami | Okayama University, Okayama, Japan | Architectural | Takami2026 | 62.42 | 86.14 | 53.33 | 47.81 | |
| Giacomini_ICMC_task7_1 | MFSeqEns | Anderson Giacomini | University of São Paulo (USP), São Carlos, SP, Brazil | Architectural, Regularization | Giacomini2026 | 62.34 | 82.89 | 56.47 | 47.66 | |
| Yang_USST_task7_1 | SoftBNDist | Wenxing Yang | University of Shanghai for Science and Technology, Shanghai, China | Architectural | Zhang2026 | 62.33 | 86.44 | 55.06 | 45.48 | |
| Hu_XJTLU_task7_2 | LoRA-ANS | Shengchen Li | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural, Regularization | Hu2026 | 61.58 | 72.79 | 60.24 | 51.73 | |
| Baseline System | Baseline System | Manjunath Mulimani | Tampere University, Tampere, Finland | Regularization, Architectural | | 60.35 | 89.15 | 49.72 | 42.18 | |
| Raja_IITM_task7_1 | BN-Ridge | Risan Raja | Indian Institute of Technology Madras, Chennai, India | Architectural, Optimization | Raja2026a | 59.66 | 77.64 | 41.11 | 60.22 |
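The Avg Accuracy column appears to be the unweighted mean of the D1, D2 and D3 accuracies. The page does not state this explicitly, so treat it as an assumption. A minimal Python check on three rows copied from the table above (the helper name `mean_accuracy` is illustrative):

```python
# Rows copied from the table: (reported average, D1, D2, D3), all in percent.
reported = {
    "Park_GIST-HanwhaVision_task7_4": (79.62, 87.07, 81.45, 70.34),
    "Chun_Chosun_task7_1": (77.96, 91.71, 79.74, 62.42),
    "Im_SSU_task7_2": (76.92, 88.82, 78.62, 63.32),
}


def mean_accuracy(d1, d2, d3):
    """Unweighted mean over the three domains (assumed definition)."""
    return (d1 + d2 + d3) / 3.0


# Difference between the recomputed mean and the reported average per system.
differences = {
    code: abs(mean_accuracy(d1, d2, d3) - avg)
    for code, (avg, d1, d2, d3) in reported.items()
}

for code, diff in differences.items():
    print(f"{code}: |recomputed - reported| = {diff:.4f}")
```

For these rows the recomputed mean agrees with the reported value to within rounding at two decimals.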
Systems ranking
| Submission Code | Name | Author | Affiliation | Continual learning solution | Technical Report | Avg Accuracy (%) | D1 Accuracy (%) | D2 Accuracy (%) | D3 Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| Park_GIST-HanwhaVision_task7_4 | XSTACK-ENS | Jongyeon Park | Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea | Architectural, Replay | Park2026 | 79.62 | 87.07 | 81.45 | 70.34 | |
| Park_GIST-HanwhaVision_task7_3 | DEEP_EX2 | Jongyeon Park | Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea | Architectural, Replay | Park2026 | 78.78 | 86.32 | 80.07 | 69.94 | |
| Park_GIST-HanwhaVision_task7_2 | DEEP_EX1 | Jongyeon Park | Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea | Architectural, Replay | Park2026 | 78.77 | 86.84 | 79.55 | 69.90 | |
| Park_GIST-HanwhaVision_task7_1 | EX_ONLY | Jongyeon Park | Gwangju Institute of Science and Technology (GIST), Gwangju, South Korea | Architectural, Replay | Park2026 | 78.00 | 84.58 | 80.80 | 68.62 | |
| Chun_Chosun_task7_1 | OR-KDL-L | Chun Chanjun | Chosun University, Gwangju, Republic of Korea | Regularization, Architectural | Chanjun2026 | 77.96 | 91.71 | 79.74 | 62.42 | |
| Chun_Chosun_task7_3 | OR-KDL-CL | Chun Chanjun | Chosun University, Gwangju, Republic of Korea | Regularization, Architectural | Chanjun2026 | 77.34 | 88.88 | 79.93 | 63.22 | |
| Im_SSU_task7_2 | DIR_OOD-S2 | Sungbin Im | Soongsil University, Seoul, Republic of Korea | Architectural, Regularization | Choi2026 | 76.92 | 88.82 | 78.62 | 63.32 | |
| Im_SSU_task7_1 | DIR_OOD-S1 | Sungbin Im | Soongsil University, Seoul, Republic of Korea | Architectural, Regularization | Choi2026 | 76.68 | 88.59 | 78.45 | 63.00 | |
| Chun_Chosun_task7_2 | OR-KDL-LS | Chun Chanjun | Chosun University, Gwangju, Republic of Korea | Regularization, Architectural | Chanjun2026 | 76.58 | 88.45 | 78.35 | 62.94 | |
| Chun_Chosun_task7_4 | OR-KDL-CLS | Chun Chanjun | Chosun University, Gwangju, Republic of Korea | Regularization, Architectural | Chanjun2026 | 76.26 | 87.33 | 78.40 | 63.04 | |
| Chang_Surrey_task7_1 | PH-Route | Peiwei Chang | University of Surrey, Guildford, UK | Architectural, Regularization | Chang2026 | 75.12 | 80.25 | 82.63 | 62.49 | |
| Divakar_IND_task7_2 | Augmentation | Aakash Divakar | Independent Researcher | Architectural, Regularization | Divakar2026 | 72.78 | 89.22 | 76.25 | 52.87 | |
| Miyazaki_CyberAgent_task7_4 | Chain-tri | Koichi Miyazaki | CyberAgent, Tokyo, Japan | Architectural, Regularization | Miyazaki2026 | 72.32 | 87.95 | 70.03 | 58.98 | |
| IM_SSU_task7_4 | SCR-S2 | Jungyu Choi | Soongsil University, Seoul, Republic of Korea | Architectural, Regularization | Choi2026 | 72.25 | 89.90 | 75.38 | 51.48 | |
| Divakar_IND_task7_3 | Ensemble | Aakash Divakar | Independent Researcher | Architectural, Regularization | Divakar2026 | 72.20 | 88.84 | 75.36 | 52.41 | |
| IM_SSU_task7_3 | SCR-S1 | Jungyu Choi | Soongsil University, Seoul, Republic of Korea | Architectural, Regularization | Choi2026 | 71.42 | 87.21 | 75.48 | 51.56 | |
| Chang_HYU_task7_3 | HYU | Joon-Hyuk Chang | Hanyang University, Seoul, Republic of Korea | Architectural, Regularization | Son2026 | 71.27 | 90.23 | 68.43 | 55.14 | |
| Chang_HYU_task7_2 | HYU | Joon-Hyuk Chang | Hanyang University, Seoul, Republic of Korea | Architectural, Regularization | Son2026 | 71.17 | 90.34 | 69.51 | 53.65 | |
| Miyazaki_CyberAgent_task7_3 | Chain-cKL | Koichi Miyazaki | CyberAgent, Tokyo, Japan | Architectural, Regularization | Miyazaki2026 | 70.94 | 86.40 | 69.41 | 57.00 | |
| BelHadj_VUB_task7_2 | LoRAAdapt | Yacine Bel-Hadj | Vrije Universiteit Brussel, Brussels, Belgium | Architectural, Optimization | Bel-Hadj2026 | 70.65 | 85.72 | 67.88 | 58.34 | |
| BelHadj_VUB_task7_1 | AlignStack | Yacine Bel-Hadj | Vrije Universiteit Brussel, Brussels, Belgium | Architectural, Optimization | Bel-Hadj2026 | 70.64 | 87.31 | 66.48 | 58.15 | |
| Miyazaki_CyberAgent_task7_2 | Chain-KL | Koichi Miyazaki | CyberAgent, Tokyo, Japan | Architectural, Regularization | Miyazaki2026 | 70.60 | 87.77 | 68.83 | 55.20 | |
| Guan_HEU_task7_2 | ProDEx-WPE | Xuefeng Yang | Harbin Engineering University, Harbin, China | Architectural | Yang2026 | 69.91 | 79.26 | 70.58 | 59.89 | |
| Li_SCUT_task7_1 | SoftD3Exp | Yanxiong Li | South China University of Technology, Guangzhou, China | Architectural | Li2026 | 69.38 | 86.80 | 68.08 | 53.25 | |
| Li_SCUT_task7_2 | ClsCalib | Yanxiong Li | South China University of Technology, Guangzhou, China | Architectural | Li2026 | 69.35 | 86.80 | 68.85 | 52.39 | |
| Zhang_XJTLU_task7_4 | MAJ5 | Peihong Zhang | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural | Zhang2026a | 69.08 | 81.54 | 69.25 | 56.44 | |
| Gao_SHNU_task7_1 | CkptAvg | Zhe Gao | Shanghai Normal University, Shanghai, China | Optimization | Sun2026 | 68.87 | 87.14 | 59.49 | 59.99 | |
| Guan_HEU_task7_4 | ProtoMoE-L | Xuefeng Yang | Harbin Engineering University, Harbin, China | Architectural | Yang2026 | 68.84 | 77.95 | 69.38 | 59.18 | |
| Huang_WHU_task7_3 | MR-CLIAC | Gongping Huang | Wuhan University, Wuhan, China | Architectural, Regularization | Pi2026 | 68.78 | 73.84 | 72.24 | 60.25 | |
| Divakar_IND_task7_1 | ResAdapt | Aakash Divakar | Independent Researcher | Architectural, Regularization | Divakar2026 | 68.34 | 84.67 | 69.38 | 50.98 | |
| Zhang_XJTLU_task7_1 | STABLE | Peihong Zhang | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural | Zhang2026a | 68.34 | 81.82 | 66.75 | 56.45 | |
| Guan_HEU_task7_1 | ProDEx | Xuefeng Yang | Harbin Engineering University, Harbin, China | Architectural | Yang2026 | 67.95 | 78.26 | 67.17 | 58.42 | |
| Chang_HYU_task7_1 | HYU | Joon-Hyuk Chang | Hanyang University, Seoul, Republic of Korea | Architectural, Regularization | Son2026 | 67.88 | 89.39 | 59.18 | 55.07 | |
| Zhang_XJTLU_task7_3 | D3ORIENT | Peihong Zhang | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural | Zhang2026a | 67.85 | 80.71 | 67.06 | 55.78 | |
| Zhang_XJTLU_task7_2 | D1SAFE | Peihong Zhang | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural | Zhang2026a | 67.32 | 80.31 | 66.33 | 55.33 | |
| Huang_WHU_task7_4 | ProtoTri | Gongping Huang | Wuhan University, Wuhan, China | Architectural, Regularization | Pi2026 | 66.91 | 79.94 | 65.79 | 55.01 | |
| Divakar_IND_task7_4 | Mahanalobis | Aakash Divakar | Independent Researcher | Architectural, Regularization | Divakar2026 | 66.86 | 77.86 | 70.62 | 52.12 | |
| Heo_SeoulTech_task7_3 | MoE-TTA | Se-Min Heo | Seoul National University of Science and Technology (SeoulTech), Seoul, Republic of Korea | Architectural | Heo2026 | 66.66 | 72.69 | 82.01 | 45.28 | |
| Heo_SeoulTech_task7_4 | MeanAvg | Se-Min Heo | Seoul National University of Science and Technology (SeoulTech), Seoul, Republic of Korea | Architectural | Heo2026 | 65.96 | 73.01 | 79.94 | 44.93 | |
| Heo_SeoulTech_task7_2 | MoE-t4 | Se-Min Heo | Seoul National University of Science and Technology (SeoulTech), Seoul, Republic of Korea | Architectural | Heo2026 | 65.66 | 72.68 | 80.03 | 44.29 | |
| Heo_SeoulTech_task7_1 | MoE-t3 | Se-Min Heo | Seoul National University of Science and Technology (SeoulTech), Seoul, Republic of Korea | Architectural | Heo2026 | 65.61 | 72.45 | 80.19 | 44.18 | |
| Chung_IND_task7_4 | System4 | Haechun Chung | Independent, Seoul, Republic of Korea | Architectural | Chung2026 | 65.46 | 89.14 | 60.76 | 46.49 | |
| Guan_HEU_task7_3 | ProtoMoE | Xuefeng Yang | Harbin Engineering University, Harbin, China | Architectural | Yang2026 | 65.18 | 77.76 | 61.82 | 55.97 | |
| Li_SCUT_task7_3 | PTR | Yanxiong Li | South China University of Technology, Guangzhou, China | Architectural | Li2026 | 63.91 | 89.96 | 58.79 | 42.99 | |
| Tyagi_IITM_Task7_1 | Tyagi | Akansha Tyagi | Indian Institute of Technology Mandi, Himachal Pradesh, India | Architectural | Tyagi2026 | 63.09 | 81.54 | 66.99 | 40.76 | |
| Tyagi_IITM_Task7_2 | Tyagi | Akansha Tyagi | Indian Institute of Technology Mandi, Himachal Pradesh, India | Architectural | Tyagi2026 | 63.09 | 81.54 | 66.99 | 40.76 | |
| Tyagi_IITM_Task7_3 | Tyagi | Akansha Tyagi | Indian Institute of Technology Mandi, Himachal Pradesh, India | Architectural | Tyagi2026 | 63.09 | 81.54 | 66.99 | 40.76 | |
| Chang_HYU_task7_4 | HYU | Joon-Hyuk Chang | Hanyang University, Seoul, Republic of Korea | Architectural, Regularization | Son2026 | 62.82 | 83.86 | 54.94 | 49.66 | |
| Takami_OU_task7_1 | DCCMS | Haruto Takami | Okayama University, Okayama, Japan | Architectural | Takami2026 | 62.42 | 86.14 | 53.33 | 47.81 | |
| Giacomini_ICMC_task7_1 | MFSeqEns | Anderson Giacomini | University of São Paulo (USP), São Carlos, SP, Brazil | Architectural, Regularization | Giacomini2026 | 62.34 | 82.89 | 56.47 | 47.66 | |
| Yang_USST_task7_1 | SoftBNDist | Wenxing Yang | University of Shanghai for Science and Technology, Shanghai, China | Architectural | Zhang2026 | 62.33 | 86.44 | 55.06 | 45.48 | |
| Huang_WHU_task7_2 | ProtoRoute | Gongping Huang | Wuhan University, Wuhan, China | Architectural, Regularization | Pi2026 | 62.04 | 67.98 | 68.27 | 49.88 | |
| Hu_XJTLU_task7_2 | LoRA-ANS | Shengchen Li | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural, Regularization | Hu2026 | 61.58 | 72.79 | 60.24 | 51.73 | |
| Hu_XJTLU_task7_3 | LoRA-NS | Shengchen Li | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural, Regularization | Hu2026 | 60.78 | 71.21 | 59.27 | 51.85 | |
| Baseline System | Baseline System | Manjunath Mulimani | Tampere University, Tampere, Finland | Regularization, Architectural | | 60.35 | 89.15 | 49.72 | 42.18 | |
| Miyazaki_CyberAgent_task7_1 | BN-stat | Koichi Miyazaki | CyberAgent, Tokyo, Japan | Architectural, Regularization | Miyazaki2026 | 60.20 | 84.15 | 51.42 | 45.03 | |
| Raja_IITM_task7_1 | BN-Ridge | Risan Raja | Indian Institute of Technology Madras, Chennai, India | Architectural, Optimization | Raja2026a | 59.66 | 77.64 | 41.11 | 60.22 | |
| Hu_XJTLU_task7_1 | LoRA-G | Shengchen Li | Xi'an Jiaotong-Liverpool University, Suzhou, China | Architectural, Regularization | Hu2026 | 59.05 | 71.20 | 55.06 | 50.91 | |
| Raja_IITM_task7_2 | Proteus | Risan Raja | Indian Institute of Technology Madras, Chennai, India | Architectural, Optimization | Raja2026b | 55.70 | 74.89 | 54.85 | 37.35 | |
| Raja_IITM_task7_3 | LeJEPA | Risan Raja | Indian Institute of Technology Madras, Chennai, India | Architectural, Regularization | Raja2026c | 49.73 | 72.15 | 27.24 | 49.80 | |
| Huang_WHU_task7_1 | DS-RanPAC | Gongping Huang | Wuhan University, Wuhan, China | Architectural, Regularization | Pi2026 | 41.89 | 59.68 | 42.84 | 23.16 |
Technical reports
DCASE 2026 TASK 7 SUBMISSION: ALIGNED LOGIT STACKING AND LOW-RANK ADAPTER EXPERTS
Yacine Bel-Hadj, Ricardo Fracasso and Jan Van Rompaey
Department of Mechanical Engineering, Brussels, Belgium
BelHadj_VUB_task7_2 BelHadj_VUB_task7_1
Abstract
This technical report describes the VUB systems submitted to DCASE 2026 Task 7, domain-agnostic incremental learning for audio classification. We start from the official CNN14-style baseline with domain-specific batch-normalization branches. At inference time the domain label is unavailable, so the baseline evaluates all branches and selects the prediction with the lowest entropy. Our submissions keep the same branch structure but replace this fixed selection rule with a learned decision stage. BelHadj_VUB_task7_1 trains an aligned logit stacker on branch logits, posterior probabilities, confidence measures, and branch-disagreement statistics. BelHadj_VUB_task7_2 first adds task-specific low-rank convolutional adapters to the backbone, then trains the same type of branch-level stacker on the adapter-enhanced logits. The two systems therefore test two related strategies: learning a better domain-agnostic decision rule and adding parameter-efficient domain-specific capacity. Both systems use only the data and checkpoints released for the challenge, and each evaluation recording is classified independently.
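The baseline's lowest-entropy selection rule described in this abstract can be sketched as follows. This is a minimal illustration, not the organizers' code: the function names and the example logits are hypothetical, and it assumes one logit vector per domain-specific branch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_min_entropy(branch_logits):
    """Return (branch_index, predicted_class) for the lowest-entropy branch."""
    branch_probs = [softmax(logits) for logits in branch_logits]
    entropies = [entropy(p) for p in branch_probs]
    best = min(range(len(entropies)), key=entropies.__getitem__)
    probs = branch_probs[best]
    return best, max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits from three domain-specific branches for one recording.
example_logits = [
    [1.0, 1.1, 0.9],  # branch 0: nearly uniform, high entropy
    [0.2, 4.0, 0.1],  # branch 1: sharply peaked, low entropy
    [1.5, 0.5, 1.4],  # branch 2: ambiguous between two classes
]
branch, label = select_min_entropy(example_logits)
print(branch, label)
```

A learned stacker, as in the submitted systems, would replace `select_min_entropy` with a trained model that takes the same branch outputs as input.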
A DOMAIN-AGNOSTIC INCREMENTAL LEARNING FOR AUDIO CLASSIFICATION SYSTEM FOR DCASE 2026 TASK 7
Peiwei Chang, Yuelan Cheng, Yongqiang Chen and Wenwu Wang
Centre for Vision, Speech and Signal Processing (CVSSP), Guildford, UK
Chang_Surrey_task7_1
Abstract
We describe our submissions to DCASE2026 Task 7 (Domain-Agnostic Incremental Learning for Audio Classification). Starting from the provided CNN14 baseline frozen at its D1 checkpoint, each incremental domain learns a strictly disjoint set of parameters, including domain-specific batch normalization, low-rank (r=8) convolutional adapters, and a zero-initialized per-domain classifier delta. Catastrophic forgetting is therefore prevented by construction. At inference time, a small residual-MLP ensemble routes between domain heads using multi-scale pooled features extracted from the frozen D1 path. The final logits are computed as a router-weighted mixture of per-domain ensemble heads, whose member subsets are selected independently for each head by exhaustive screening over 28 trained variants. On the development test set our primary system achieves 77.5% average accuracy (D2: 85.7%, D3: 69.3%).
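The router-weighted mixture of per-domain head logits mentioned in this abstract can be illustrated with a short sketch. It assumes the router emits one score per domain head and that the scores are normalized with a softmax; both the normalization and the names `mix_logits`, `router_scores` and `head_logits` are assumptions, not details taken from the report.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_logits(router_scores, head_logits):
    """Weight each domain head's logits by the router's softmax weight and sum."""
    weights = softmax(router_scores)
    n_classes = len(head_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, head_logits))
        for c in range(n_classes)
    ]


# Hypothetical example: two domain heads, two classes, an undecided router.
router_scores = [0.0, 0.0]               # equal scores give weights 0.5 / 0.5
head_logits = [[2.0, 0.0], [0.0, 4.0]]   # one logit vector per domain head
mixed = mix_logits(router_scores, head_logits)
print(mixed)
```

When the router is confident, the softmax weight concentrates on one head and the mixture approaches hard routing.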
OR-KDL: ORTHOGONAL KNOWLEDGE DISTILLED LORA FOR DOMAIN-AGNOSTIC INCREMENTAL LEARNING IN DCASE 2026 TASK 7
Chun Chanjun, Kim Hanseul, Kim Eojin, Lee Jihyuk, Bae Junho and Park Donghyeok
Department of Computer Engineering, Gwangju, Republic of Korea
Chun_Chosun_task7_1
Abstract
We present OR-KDL (Orthogonal Knowledge Distilled LoRA), a continual adaptation system for the domain-incremental acoustic classification setting of DCASE 2026 Task 7. A CNN14 backbone pretrained on D1 is kept frozen throughout, with adaptation to D2 and D3 performed solely via domain-specific low-rank adapters, preserving the pretrained representation while minimizing trainable parameters. Knowledge distillation (KD) is applied at each stage: logit distillation for D1→D2, and logit combined with cosine feature distillation for D2→D3 to mitigate representational drift. To prevent catastrophic forgetting without access to prior-domain data, Weight-based Orthogonal Gradient Projection (OGP) constructs protected subspaces via SVD of previous domain weights and adapter updates, projecting gradients onto their orthogonal complement. We compare standard LoRA and a CoLoRA-based adapter under this framework, and select the ensemble size N of a stratified 5-fold Top-N soft-voting ensemble on a held-out validation set. The final system, CoLoRA with uniform augmentation and a Top-20 ensemble, achieves D2@D3 of 79.73%, D3@D3 of 66.55%, Accuracy of 73.14%, and Forgetting of 4.04 percentage points.
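The gradient projection step can be sketched in a few lines. Note the substitution: to stay dependency-free, this sketch builds the orthonormal basis of the protected subspace with Gram-Schmidt rather than the SVD used in the report, and it works on flat vectors. All names and numbers are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt basis for the span of `vectors` (stand-in for an SVD)."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            coeff = dot(w, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis


def project_out(grad, basis):
    """Remove from `grad` every component lying in the protected subspace."""
    g = list(grad)
    for u in basis:
        coeff = dot(g, u)
        g = [gi - coeff * ui for gi, ui in zip(g, u)]
    return g


# Hypothetical protected directions derived from earlier-domain weights.
protected = orthonormal_basis([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
projected_grad = project_out([3.0, -2.0, 5.0], protected)
print(projected_grad)
```

An update along `projected_grad` leaves the protected directions untouched, which is the property the orthogonal-complement projection is meant to provide.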
DIRNET: DOMAIN-AGNOSTIC INCREMENTAL ROUTING NETWORK
Jungyu Choi1, Seowoo Kim1 and Sungbin Im2
1Dept. of IT Engineering, Seoul, Republic of Korea, 2School of Electronic Engineering, Seoul, Republic of Korea
Im_SSU_task7_2 Im_SSU_task7_1 Im_SSU_task7_4 Im_SSU_task7_3
Abstract
This technical report presents DIRNET, our submission to DCASE 2026 Challenge Task 7 on domain-agnostic incremental sound classification. The task requires a model to learn sequential acoustic domains (D1, D2, and D3) while preserving previously acquired knowledge and performing inference without domain labels. To address this challenge, we propose a unified CNN14-based framework with two domain-routing strategies. DIRNET-OOD performs soft domain fusion using class-wise prototype distances and a D1 out-of-distribution residual prior. DIRNET-SCR estimates domain responsibility from confidence, prediction margin, and entropy after domain-specific calibration, followed by soft probability mixture across domain paths. Experimental results show that both systems substantially outperform the official baseline, demonstrating the importance of effective scoring and fusion strategies for domain-agnostic incremental sound classification.
ROUTER-BASED PROGRESSIVE KNOWLEDGE REUSE FOR DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION
Haechun Chung
Seoul, Republic of Korea
Chung_IND_task7_4
Abstract
Domain-Incremental Learning (DIL) for sound classification requires models to learn new domains without forgetting previously acquired knowledge. In DCASE 2026 Task 7, this challenge is further complicated by the absence of domain identity during inference. To address this problem, we propose a Router-based Progressive Knowledge Reuse framework that introduces domain-specific classifiers and lightweight routing modules for adaptive knowledge reuse across domains. An entropy-threshold strategy is employed during training to regulate the transition between knowledge reuse and knowledge expansion, while inference is performed through entropy-based selection of progressively aggregated domain predictions. Experiments on the DCASE 2026 Task 7 development set show that the proposed method achieves 72.14% accuracy after learning Domain 2 and 60.7% average accuracy after learning Domain 3, outperforming the reproduced baseline by 12.72 and 7.23 percentage points, respectively. The results demonstrate that the proposed approach effectively mitigates forgetting while improving domain-agnostic incremental learning performance.
PER-DOMAIN RESIDUAL ADAPTERS WITH ASYMMETRIC DEPTH FOR DOMAIN-INCREMENTAL AUDIO CLASSIFICATION
Aakash Divakar, Angad Ripudaman Singh Bajwa and Atharva Anand Joshi
Independent Researcher
Divakar_IND_task7_2 Divakar_IND_task7_1 Divakar_IND_task7_4 Divakar_IND_task7_3
Abstract
We describe our submission to DCASE 2026 Task 7, Domain-Incremental Learning (DIL) for audio classification. Starting from the released ADIL baseline, which adds per-domain Batch Normalization (BN) and a per-domain classifier to a shared, frozen CNN14 backbone, we identify the limited per-domain capacity as the main bottleneck for the harder domains. We add, on top of the frozen backbone, a stack of zero-initialised residual bottleneck adapters after every convolutional block. Because each domain owns an independent adapter stack, the depth of the stack can be set per domain: the more difficult domain (D3) receives a deeper stack than the easier one (D2). Combined with a warm-started per-domain classifier head and a supervised contrastive auxiliary loss, this raises the domain-agnostic average accuracy on the dev-test set from 52.5% to 68.5%, with D2 and D3 improving from 58.6%/46.1% to 76.2%/60.8%. The shared backbone and the D1 behaviour are preserved bit-for-bit, so no previously learned domain is forgotten.
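A zero-initialised residual bottleneck adapter of the kind described in this abstract can be sketched on a plain feature vector. This is an illustration under assumptions: the report applies adapters after convolutional blocks, whereas the class below uses dense projections, and the class name, dimensions and initial scale are hypothetical.

```python
import random


class ResidualAdapter:
    """y = x + up(relu(down(x))), with the up-projection initialised to zero."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection: small random weights (bottleneck x dim).
        self.down = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(bottleneck)
        ]
        # Up-projection: all zeros (dim x bottleneck), so the adapter starts
        # as an exact identity mapping.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, x):
        hidden = [
            max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in self.down
        ]
        delta = [sum(w * h for w, h in zip(row, hidden)) for row in self.up]
        return [xi + di for xi, di in zip(x, delta)]


adapter = ResidualAdapter(dim=4, bottleneck=2)
features = [0.5, -1.0, 2.0, 0.0]
output = adapter(features)
print(output)
```

Because the residual branch contributes exactly zero at initialisation, a new domain starts from the frozen backbone's behaviour, and stacking more adapters for a harder domain does not change that starting point.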
MULTIFEATURE SEQUENTIAL LEARNING WITH CNN14 ENSEMBLE FOR DCASE 2026 TASK 7
Anderson Giacomini
Institute of Mathematics and Computer Sciences (ICMC), São Carlos, SP, Brazil
Giacomini_ICMC_task7_1
Abstract
We present a system for DCASE 2026 Task 7 (Domain-Agnostic Incremental Learning) that combines a multi-feature cross-attention transformer (MultiFeatureSED) with the organizer-provided CNN14 baseline. MultiFeatureSED is trained from scratch on Domain 2 (D2) and fine-tuned on Domain 3 (D3) with L2 weight regularization to mitigate catastrophic forgetting. D1 knowledge is incorporated exclusively via the CNN14 checkpoint, whose domain-specific batch normalization and entropy-based domain routing complement the multi-feature model at inference. An equal-weight ensemble with domain-conditional per-class probability calibration, optimized on the development set, achieves 61.5% mean balanced accuracy versus 55.2% for CNN14 alone.
FINE-TUNED EXPERT AGGREGATION FOR DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION
Se-Min Heo1, Seunggyu Jeong2 and Seong-Eun Kim2
1Department of Applied Artificial Intelligence, Seoul, Republic of Korea, 2Seoul, Korea
Heo_SeoulTech_task7_4 Heo_SeoulTech_task7_1 Heo_SeoulTech_task7_2 Heo_SeoulTech_task7_3
Abstract
This technical report describes our submission for Task 7, Domain-Agnostic Incremental Learning for Audio Classification, of the DCASE 2026 Challenge. The task requires an audio classification system to adapt to sequentially introduced domains while performing inference without access to domain labels. Our submitted systems use two separately fine-tuned MCnn14 experts for D2 and D3, both initialized from the official D1 checkpoint. The D2 expert is trained with device augmentation, and the D3 expert is trained with gain-shift augmentation. At inference time, each input sample is evaluated by both experts, and their probability outputs are combined without using domain labels. We evaluate entropy-guided soft aggregation, full-safe test-time augmentation, and mean probability averaging as inference variants. Using the class-wise macro accuracy protocol, the best submitted system obtains a final-stage average validation accuracy of 63.79%.
PA-LORA: PROTOTYPE ANCHORED LOW-RANK ADAPTATION FOR DOMAIN-INCREMENTAL AUDIO CLASSIFICATION
Bohan Hu, Yiqiang Cai and Shengchen Li
School of Advanced Technology, Suzhou, China
Hu_XJTLU_task7_2 Hu_XJTLU_task7_3 Hu_XJTLU_task7_1
Abstract
Task 7 of the DCASE 2026 Challenge focuses on developing a universal domain-incremental learning system that learns to classify audio from different domains sequentially over time without significantly forgetting the knowledge of any of the previously learned domains. This technical report details the systems we submitted. Rather than rebuilding decision boundaries for each new domain, we propose Prototype Anchored Low-Rank Adaptation (PA-LoRA), which anchors the basic classifier as a frozen prototype space and learns only lightweight, low-rank boundary deformations for new domains, with a learnable gate that autonomously calibrates adaptation. Continual Normalization combines group and batch normalization to prevent feature statistics drift. Furthermore, we introduce negative sampling-based domain routing regularization to reduce misrouting errors during incremental training. On the DIL-DCASE26 development dataset, our best system LoRA-NS achieves an average domain-agnostic accuracy of 61.51%.
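A gated low-rank update on top of a frozen classifier weight can be sketched as follows. The form W + gate · (up · down) with a single scalar gate is an assumption made for illustration; the report may parameterize the gate differently, and all names and values here are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def adapted_weight(frozen, down, up, gate):
    """Return frozen + gate * (up @ down); `frozen` itself is never modified."""
    delta = matmul(up, down)  # (out x r) @ (r x in) gives an (out x in) update
    return [
        [w + gate * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(frozen, delta)
    ]


# Hypothetical rank-1 example: 2 classes, 3 input features.
frozen = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen prototype-style classifier
up = [[1.0], [2.0]]                           # out x r
down = [[1.0, 1.0, 1.0]]                      # r x in
half_open = adapted_weight(frozen, down, up, gate=0.5)
closed = adapted_weight(frozen, down, up, gate=0.0)
print(half_open)
```

With the gate at zero the adapted classifier equals the frozen one, so the gate directly controls how far a new domain may deform the original decision boundaries.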
Domain-Calibrated Routing and Parameter-Isolated Training for Domain-Agnostic Incremental Audio Classification
Qianqian Li, Guoqing Chen and Yanxiong Li
School of Electronic and Information Engineering, Guangzhou, China
Li_SCUT_task7_1 Li_SCUT_task7_2 Li_SCUT_task7_3
Abstract
This technical report describes three systems submitted to DCASE 2026 Task 7, Domain-Agnostic Incremental Learning for Audio Classification. The task requires sequential adaptation from the released D1 baseline to D2 and D3 without revisiting previous-domain audio and without using external data or pretrained models. Systems 1 and 2 share a Multi-Cnn14 model with domain-specific batch normalization, squeeze-and-excitation attention, residual adapters, and domain-specific classifier heads; they differ only in deterministic inference calibration. System 3 uses a late-private high-level branch architecture, where D1 retains the original high-level path while D2/D3 use private block5/block6 modules and domain classifiers. Its inference uses Prototype-Likelihood Routing (PLR), which compares test features with saved D2/D3 prototype statistics and falls back to entropy routing when the prototype match is unreliable. System 2 gives the best D2/D3 mean (66.04%), whereas System 1 gives the best D3 score (61.76%). The report focuses on the modeling choices, routing policies, rule compliance, reproducibility, and development-set behavior of the three submitted systems.
BATCHNORM DOMAIN ROUTING WITH LOW-RANK ADAPTERS FOR INCREMENTAL AUDIO CLASSIFICATION
Koichi Miyazaki and Katsuhiko Yamamoto
Tokyo, Japan
Miyazaki_CyberAgent_task7_3 Miyazaki_CyberAgent_task7_2 Miyazaki_CyberAgent_task7_1 Miyazaki_CyberAgent_task7_4
Abstract
We describe our submitted system to DCASE 2026 Task 7, which studies domain-agnostic incremental audio classification. Our system keeps the shared convolutional neural network backbone frozen while adapting to new domains. For each new domain, it adds domain-specific low-rank adapters and classifier heads. At test time, it routes each clip to the branch whose batch normalization statistics best match the input. Combined with selective self-distillation, batch normalization recalibration, and chunk-level prediction averaging, our best system achieves 70.1% average class-wise macro accuracy on the development test set, compared with 53.5% for the official baseline.
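Routing a clip by batch-normalization statistics can be sketched as below. The mismatch score used here, a variance-normalized squared distance between the clip's per-channel means and each branch's stored running means, is an assumption chosen for simplicity; the report's actual matching criterion is not given in the abstract. Branch names and numbers are hypothetical.

```python
def channel_means(frames):
    """Per-channel mean over the frames of one clip."""
    n = len(frames)
    return [sum(frame[c] for frame in frames) / n for c in range(len(frames[0]))]


def route_by_bn_stats(frames, branches, eps=1e-5):
    """Return the name of the branch whose running statistics best fit the clip."""
    mu = channel_means(frames)

    def mismatch(stats):
        # Squared distance of means, scaled by the branch's running variance.
        return sum(
            (m - rm) ** 2 / (rv + eps)
            for m, rm, rv in zip(mu, stats["mean"], stats["var"])
        )

    return min(branches, key=lambda name: mismatch(branches[name]))


# Hypothetical running statistics stored per domain branch (two channels).
branches = {
    "D1": {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    "D2": {"mean": [2.0, -1.0], "var": [1.0, 1.0]},
}
clip = [[1.9, -1.2], [2.1, -0.8]]  # two frames, two channels
chosen = route_by_bn_stats(clip, branches)
print(chosen)
```

The chosen branch's adapters and classifier head would then produce the final prediction for the clip.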
DOMAIN-INCREMENTAL AUDIO CLASSIFICATION USING DOMAIN-SPECIFIC EXPERTS AND PROTOTYPE CLASSIFIER
Jongyeon Park1, Do-Hyeon Lim1, Sang-won Park2, Hong Kook Kim1, Kyungdeuk Ko3, Hyeongcheol Geum3 and Jeong Eun Lim3
1Department of AI Convergence, Gwangju, South Korea, 2Department of Electrical Engineering and Computer Science (EECS), Gwangju, South Korea, 3Seongnam, South Korea
Park_GIST-HanwhaVision_task7_1 Park_GIST-HanwhaVision_task7_2 Park_GIST-HanwhaVision_task7_3 Park_GIST-HanwhaVision_task7_4
Abstract
This technical report presents submission systems for Task 7 (domain-incremental audio classification) of the DCASE 2026 Challenge. The main obstacle is that the system can never access past- and future-domain data at the same time. We approached domain-incremental learning (DIL) as a frozen-feature replay problem. At each incremental stage, one or two compact experts are trained and then kept fixed; at the final stage, the penultimate features from all frozen experts are concatenated and used to train a lightweight per-class prototype classifier solely on cached features. This design prevents catastrophic forgetting by preserving each frozen expert at inference. To retain earlier-domain knowledge without raw audio, each expert is trained with DeepInversion-based generative replay. Separately, a cross-stage regression imputer—trained only on samples for which all expert slots are legitimately observable—fills the feature slots of experts that did not yet exist at an earlier stage. We submit four fully DIL-compliant systems: three based on diverse frozen five-expert backbones and their cross-stack ensemble, achieving 78.38% micro / 78.92% macro on the development set, outperforming every individual backbone on both metrics.
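The final-stage classifier described in this abstract, per-class prototypes fitted on concatenated cached features from frozen experts, can be sketched as follows. It assumes prototypes are class means and that prediction picks the nearest prototype in squared Euclidean distance; both are illustrative assumptions, as are the feature values and class names.

```python
def concat(expert_features):
    """Concatenate the penultimate features of all frozen experts."""
    return [v for feats in expert_features for v in feats]


def fit_prototypes(vectors, labels):
    """One prototype per class: the mean of that class's feature vectors."""
    sums, counts = {}, {}
    for x, y in zip(vectors, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(prototypes, x):
    """Label of the prototype closest to x in squared Euclidean distance."""
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, x))

    return min(prototypes, key=lambda y: sq_dist(prototypes[y]))


# Hypothetical cached features: one vector per frozen expert for each sample.
cached = [
    ([1.0, 0.0], [0.9]),
    ([0.8, 0.2], [1.1]),
    ([0.0, 1.0], [-1.0]),
    ([0.2, 0.8], [-0.8]),
]
labels = ["dog", "dog", "rain", "rain"]
prototypes = fit_prototypes([concat(sample) for sample in cached], labels)
prediction = predict(prototypes, concat(([0.9, 0.1], [1.0])))
print(prediction)
```

Since only cached features are needed, the experts stay frozen and no raw audio from earlier domains has to be revisited at this stage.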
DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION WITH COMPLEMENTARY INCREMENTAL LEARNING SYSTEMS
Kai Pi1, Yunqi Chen1, Fan Zhong1, Jiahui Yin1, Yike Zhang1, Shihong Tan1, Xudong Zhao2 and Gongping Huang1
1Electronic Information School, Wuhan, China, 2London, U.K.
Huang_WHU_task7_4 Huang_WHU_task7_3 Huang_WHU_task7_1 Huang_WHU_task7_2
Abstract
This report presents our submissions to Task 7 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2026 Challenge, which focuses on domain-agnostic incremental learning for audio classification. Our approaches provide complementary strategies to improve classification accuracy by reducing domain-routing errors and enhancing the classification models. Specifically, System 1 uses the challenge-provided D1 checkpoint as a frozen CNN14 feature extractor and performs analytic adaptation via hyperspherical random projection and streaming ridge regression. Systems 2 and 4 are both built on the official CNN-BN baseline and improve domain-agnostic inference through a two-stage prototype router and a triple-resolver mechanism, respectively. System 3 follows the same routing principle as System 2, but adopts a separate CRNN-LoRA architecture for the D2 and D3 domains. Among the four submitted systems, System 4 obtains the best local validation performance, achieving 70.2% average macro accuracy on the local D2/D3 development-test sanity split after selecting the final static resolver variant on that split.
LeJEPA: Fast-Weight Temporal Modeling for Forgetting-Free Domain-Agnostic Sound Event Detection
Risan Raja
Chennai, India
Raja_IITM_task7_3
Abstract
We present LeJEPA (Latent Evolution Joint-Embedding Predictive Architecture), a continual learning system for domain-agnostic sound event detection that addresses catastrophic forgetting through temporal prediction and fast-weight adaptation. The approach combines a frozen pre-trained acoustic backbone (MCnn14 trained on Domain 1) with a learnable meta-encoder that predicts future temporal states while maintaining domain-invariant representations through fast-weight world modeling. Our architecture employs three key mechanisms: (1) a frozen-backbone temporal feature extractor with per-domain BatchNorm sets for domain-specific statistics while preserving shared convolutional features, (2) a fast-weight delta world model that predicts future latent states with per-clip adaptation (η ∈ [0.001, 0.5], meta-initialization W_init, and controller MLP), and (3) forgetting-free prototype accumulation across all seen domains for domain-agnostic inference. The system achieves parameter efficiency by training only the meta-encoder and fast-weight components (∼1.3M parameters) while keeping the 15M-parameter backbone frozen after D1. Domain-agnostic inference is enabled through multi-prototype Nearest Class Mean (NCM) classification with Semantic Drift Compensation (SDC), a replay-free prototype alignment mechanism that shifts stored prototypes into evolving feature spaces using Gaussian-similarity-weighted drift estimation. Experimental results on DCASE 2026 Task 7 demonstrate that LeJEPA maintains D1 accuracy within 2% after sequential D2/D3 training (76.8% post-D3 vs. 78.5% post-D1) while achieving competitive performance across acoustic domains (macro accuracy: 64.3% D2, 52.1% D3) through geometric manifold constraints and temporal dynamics modeling rather than replay-based memorization.
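To make the prototype-shifting idea concrete, here is a minimal NumPy sketch of Gaussian-weighted drift compensation followed by nearest-class-mean prediction. It follows the commonly published form of Semantic Drift Compensation and is not taken from the report: the function names, the bandwidth `sigma`, the single prototype per class, and the toy features are all assumptions for illustration.

```python
import numpy as np

def compensate_prototype(prototype, old_feats, new_feats, sigma=1.0):
    """Shift a stored prototype by a Gaussian-weighted average of feature drift.

    old_feats / new_feats are features of the same current-domain clips,
    extracted before and after the encoder update (assumed setup).
    """
    old = np.asarray(old_feats, dtype=np.float64)
    new = np.asarray(new_feats, dtype=np.float64)
    drift = new - old
    # Clips whose old features lie close to the prototype get larger weights.
    sq_dist = np.sum((old - prototype) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    shift = (weights[:, None] * drift).sum(axis=0) / (weights.sum() + 1e-12)
    return prototype + shift

def ncm_predict(feature, prototypes):
    """Nearest class mean: index of the closest prototype (Euclidean distance)."""
    dists = np.linalg.norm(np.asarray(prototypes) - feature, axis=1)
    return int(dists.argmin())

# Toy example: every feature drifts by the same offset.
old_feats = np.array([[0.0, 0.0], [0.1, -0.1], [5.0, 5.0]])
new_feats = old_feats + np.array([1.0, 0.5])
proto = np.array([0.0, 0.0])
shifted = compensate_prototype(proto, old_feats, new_feats)

prototypes = np.array([shifted, np.array([5.0, 5.0])])
predicted = ncm_predict(np.array([0.9, 0.6]), prototypes)
```

With a uniform drift, the compensated prototype moves by that same offset, so a query near the drifted class centre is still assigned to the original class without replaying any stored audio.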
Domain-Incremental Audio Classification with Per-Domain BatchNorm and Closed-Form Ridge Regression
Risan Raja
Chennai, India
Raja_IITM_task7_1
Abstract
We present a domain-incremental learning system for audio classification that combines per-domain BatchNorm adaptation with closed-form Ridge regression readouts. Building on the DCASE 2026 Task 7 MCnn14 baseline, our approach employs a three-phase training protocol per domain: (1) warm-start initialization of domain-specific BatchNorm and classification head from the previous domain, (2) gradient fine-tuning of these components using cross-entropy loss, and (3) replacement of the gradient-trained head with a closed-form Ridge regression readout computed on frozen 2048-dimensional features. The Ridge regression formulation provides a deterministic, globally optimal solution with strong L2 regularization (γ = 589.69, optimized via hyperparameter sweep). Our uncentered formulation (bias = 0) exploits the implicit biases already embedded in the deep feature representations. Standard deviation scale matching ensures stable training dynamics by aligning ridge logit magnitudes with fine-tuned FC magnitudes, preventing saturation when subsequent domains warm-start from the ridge solution. At inference, entropy-based task head selection identifies the correct domain without oracle labels. The system achieves 67.74% macro overall accuracy on the three-domain DCASE 2026 Task 7 benchmark while maintaining O(d²) memory complexity and computational efficiency through single-pass feature extraction and DDP-safe closed-form solutions. Per-domain parameter overhead is minimal (∼24K parameters per domain vs. ∼15M shared parameters).
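As an illustration of the closed-form readout, the sketch below solves uncentered (bias-free) ridge regression on frozen features with NumPy. The function names, the one-hot targets, and the toy data are assumptions for illustration; the report works with 2048-dimensional features and γ = 589.69, and the rescaling step is only one plausible reading of "standard deviation scale matching".

```python
import numpy as np

def ridge_readout(features, labels, num_classes, gamma):
    """Bias-free ridge readout: W = (X^T X + gamma I)^-1 X^T Y."""
    X = np.asarray(features, dtype=np.float64)
    Y = np.eye(num_classes)[labels]            # one-hot targets (assumption)
    d = X.shape[1]
    gram = X.T @ X + gamma * np.eye(d)         # d x d matrix, hence O(d^2) memory
    return np.linalg.solve(gram, X.T @ Y)      # shape (d, num_classes)

def match_logit_scale(W, features, target_std):
    """Rescale W so the ridge logits have a chosen standard deviation (hypothetical)."""
    logits = features @ W
    return W * (target_std / logits.std())

# Toy data: 200 clips, 16-dimensional "frozen" features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
true_W = rng.normal(size=(16, 3))
y = (X @ true_W).argmax(axis=1)

W_raw = ridge_readout(X, y, num_classes=3, gamma=1.0)
W = match_logit_scale(W_raw, X, target_std=2.0)
acc = ((X @ W).argmax(axis=1) == y).mean()
```

Because only the d × d Gram matrix and the d × C cross term are needed, both can be accumulated in a single pass over the features; the positive rescaling changes logit magnitudes but not the predicted classes.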
Proteus with Isometric Domain Shaping: Domain-Agnostic Audio Classification for DCASE 2026 Task 7
Risan Raja
Chennai, India
Raja_IITM_task7_2
Abstract
We present Proteus, a domain-agnostic audio classification system that addresses catastrophic forgetting in domain-incremental learning through Isometric Domain Shaping (IDS). The approach combines selective Low-Rank Adaptation (LoRA) with dual geometric regularization—Gram matrix isometry preservation and SIGReg feature space shaping—to enable sequential domain learning while maintaining knowledge from previously seen acoustic environments. Our architecture builds upon the CNN14 backbone from PANNs, applying LoRA adapters exclusively to the semantic layers (blocks 4-6) while keeping the frozen classification head and shared batch normalization statistics from the source domain (D1). The Isometric Domain Shaping framework enforces two complementary geometric constraints: (1) Gram Isometry Loss preserves relative pairwise feature geometry between a frozen teacher and the adapting student via L2-normalized Gram matrix alignment, and (2) SIGReg Loss shapes the feature manifold toward a zero-mean, isotropic Gaussian distribution by decorrelating dimensions and uniformizing variance. The system achieves parameter efficiency by training only 0.13% of the full model (approximately 100K out of 75.6M parameters) through LoRA adapters with rank-8 decomposition and α = 16 scaling. The teacher network is progressively updated after each domain to accumulate geometric knowledge across the sequence. Domain-agnostic inference is enabled through entropy-based task-head selection across seen domains, eliminating the requirement for domain identifiers at test time. Experimental results on DCASE 2026 Task 7 demonstrate that Proteus with Isometric Domain Shaping maintains competitive accuracy across sequential acoustic domains while mitigating catastrophic forgetting through geometric manifold constraints rather than replay-based memorization.
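The Gram isometry constraint can be sketched as a comparison of pairwise cosine-similarity matrices computed from teacher and student features. This is a minimal NumPy illustration, assuming row-wise L2 normalization of features and a mean-squared penalty; the function name, the toy batch, and the exact loss form are assumptions rather than the report's implementation.

```python
import numpy as np

def gram_isometry_loss(teacher_feats, student_feats, eps=1e-12):
    """Mean squared difference between L2-normalized Gram matrices."""
    def normalized_gram(feats):
        f = np.asarray(feats, dtype=np.float64)
        f = f / (np.linalg.norm(f, axis=1, keepdims=True) + eps)
        return f @ f.T                          # pairwise cosine similarities
    diff = normalized_gram(teacher_feats) - normalized_gram(student_feats)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 32))              # 8 clips, 32-dim features (toy sizes)

# An orthogonal transform keeps all pairwise angles, so the loss stays near zero.
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
rotated = teacher @ Q

# Additive noise changes the pairwise geometry, so the loss grows.
distorted = teacher + rng.normal(size=teacher.shape)

loss_rotated = gram_isometry_loss(teacher, rotated)
loss_distorted = gram_isometry_loss(teacher, distorted)
```

The penalty ignores rotations of the feature space and reacts only to changes in relative geometry, which is why it leaves the student free to adapt while discouraging it from reshuffling which clips are similar to each other.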
HYU SUBMISSION TO DCASE 2026 TASK 7: MULTI-BRANCH FUSION AND CONSERVATIVE ROUTING FOR DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION
Gihun Son1, Pil Moo Byun2 and Joon-Hyuk Chang3
1Artificial Intelligence Application, Seoul, Republic of Korea, 2Artificial Intelligence, Seoul, Republic of Korea, 3Artificial Intelligence Application / Artificial Intelligence, Seoul, Republic of Korea
Chang_HYU_task7_1 Chang_HYU_task7_3 Chang_HYU_task7_4 Chang_HYU_task7_2
Abstract
This paper presents the Hanyang University team’s submission to DCASE 2026 Task 7 on domain-agnostic incremental audio classification. The task requires adapting an audio classifier to sequentially revealed acoustic domains while preserving performance on previously learned domains and performing inference without domain labels. To address this challenge, we explore four systems based on stage-specific branch adaptation and frozen inference-time routing. System 1 uses stage-local acceptors with a fixed D3 override rule, while Systems 2 and 3 perform multi-branch fusion over a retention-oriented balanced path and a more plastic specialist path with different fusion rules. System 4 adopts a conservative low-fire disagreement router to reduce harmful D3 overrides. On the released development report split, System 3 achieves the best D2/D3 average among our submissions, while System 2 provides stronger D2 retention and System 1 achieves the highest D3 accuracy. The submitted systems offer complementary operating points along the retention-plasticity trade-off in domain-agnostic incremental learning.
AUGMENTATION-DRIVEN CHECKPOINT AVERAGING FOR DOMAIN-AGNOSTIC INCREMENTAL SOUND CLASSIFICATION
Jiajun Sun and Zhe Gao
Information and Telecommunication Engineering, Shanghai, China
Gao_SHNU_task7_1
Abstract
We design a system for DCASE 2026 Task 7, domain-agnostic incremental learning for audio classification, which requires a classifier to separate sequential audio clips from both previously observed and newly released domains, while no class labels from previous domains are allowed in new domains. Our method introduces an augmentation-driven checkpoint averaging strategy into the official CNN14-based architecture. We start from the original checkpoints of the official CNN14 model, then apply data augmentation with randomly generated seeds for each newly released audio domain, and fine-tune the model for each seed to obtain a new checkpoint. Next, we update the checkpoint by taking a weighted average of the previous checkpoint and the newly generated one, using the optimal scale parameter chosen through an ablation study. After the checkpoint is updated, for each audio clip in the new domain, five 3-second crops are extracted from different positions along the time axis, and the prediction with the highest softmax confidence under the updated checkpoint is taken as the final output for that clip. In Task 7, the checkpoint is updated twice, from the initial model to domain D2 and then to D3. With the final checkpoint after D3, we test our method and achieve class-wise accuracies on domains D2 and D3 of 63.53% and 67.08%, respectively, or 65.31% on average. All the results are better than the baseline results, demonstrating the superiority and domain adaptivity of our method.
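The two steps described here, weighted checkpoint averaging and picking the most confident crop, can be sketched in plain Python. The parameter names, the `scale` value, and the toy logits are hypothetical; the sketch merges a single new checkpoint, and the report's scale parameter and handling of multiple seeds may differ.

```python
import math

def average_checkpoints(previous, new, scale):
    """Parameter-wise interpolation: (1 - scale) * previous + scale * new."""
    return {
        name: [(1.0 - scale) * p + scale * n
               for p, n in zip(previous[name], new[name])]
        for name in previous
    }

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_most_confident_crop(crop_logits):
    """Return (class, confidence) from the crop with the largest top softmax probability."""
    best_class, best_conf = -1, -1.0
    for logits in crop_logits:
        probs = softmax(logits)
        conf = max(probs)
        if conf > best_conf:
            best_class, best_conf = probs.index(conf), conf
    return best_class, best_conf

# Toy checkpoints stored as flat parameter lists (illustrative only).
prev_ckpt = {"fc.weight": [0.0, 2.0], "fc.bias": [1.0]}
new_ckpt = {"fc.weight": [4.0, 0.0], "fc.bias": [3.0]}
merged = average_checkpoints(prev_ckpt, new_ckpt, scale=0.25)

# Logits for five crops of one clip, as produced by the merged model (made up).
crops = [[1.0, 1.2, 0.9], [0.2, 3.5, 0.1], [2.0, 0.5, 0.4],
         [0.0, 0.1, 0.0], [1.0, 1.0, 1.0]]
label, confidence = predict_most_confident_crop(crops)
```

A small `scale` keeps the merged model close to the previous checkpoint, which favours retention; a larger value favours the newly fine-tuned weights.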
A DOMAIN COMPATIBILITY AND CONFIDENCE-BASED MODEL SELECTION SYSTEM FOR DOMAIN-AGNOSTIC INCREMENTAL LEARNING
Haruto Takami and Sunao Hara
Okayama, Japan
Takami_OU_task7_1
Abstract
Domain-agnostic Incremental Learning requires a model to adapt to newly introduced domains while preserving previously acquired knowledge under the constraint that past-domain training data cannot be accessed again. In the baseline method, multiple domain-specific classification models are maintained, and the model used for inference is selected based on output entropy. However, using entropy as a confidence criterion often yields high confidence even when the prediction is incorrect. To address this issue, we propose a model selection method that combines a VAE-based domain rejection mechanism with a ConfidNet-based confidence estimation mechanism. First, domain-specific VAEs are used to reject domain models that are not compatible with the input sample. Next, ConfidNet estimates the confidence of the remaining candidate models, and the classification result of the model with the highest confidence score is selected as the final output. Experimental results on the DCASE 2026 Task 7 development dataset demonstrate that the proposed method consistently outperforms the baseline system under all evaluation conditions. In particular, the classification accuracy on the D2 task improved from 54.77% to 70.11%, while the average accuracy on the D3 task improved from 45.50% to 59.09%. These results confirm that the proposed method provides an effective model selection strategy for Domain-agnostic Incremental Learning.
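The entropy-based baseline selection that this abstract sets out to improve can be sketched as follows. The function names and toy logits are assumptions for illustration; the sketch covers only the baseline rule and not the VAE rejection or ConfidNet stages.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_by_entropy(per_model_logits):
    """Pick the domain model with the lowest output entropy; return (model index, class)."""
    best = min(range(len(per_model_logits)),
               key=lambda i: entropy(softmax(per_model_logits[i])))
    probs = softmax(per_model_logits[best])
    return best, probs.index(max(probs))

# Logits for one clip from three domain-specific models (made-up values).
logits_by_domain = [[1.0, 1.1, 0.9], [0.1, 4.0, 0.2], [2.0, 1.5, 0.3]]
model_idx, predicted_class = select_by_entropy(logits_by_domain)
```

Low entropy only shows that a model is confident, not that it is right, which is the weakness the proposed rejection and confidence-estimation stages are meant to address.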
DOMAIN-AGNOSTIC INCREMENTAL LEARNING USING FUSION-BASED REPRESENTATIONS FOR AUDIO CLASSIFICATION
Akansha Tyagi
School of Computing and Electrical Engineering, Himachal Pradesh, India
Tyagi_IITM_task7_1 Tyagi_IITM_task7_3 Tyagi_IITM_task7_2
Abstract
This work extends the DCASE Task 7 baseline system. The proposed approach enhances the baseline by combining audio representations learned from both transformer-based and Convolutional Neural Network-based architectures, leveraging the strengths of each architecture through a fusion-based framework. The proposed system performs better than the baseline system by 5.5% when data from both domains D2 and D3 are seen by the model.
GISP@HEU’S SUBMISSION FOR DCASE 2026 TASK 7: PROTOTYPE-GUIDED EXPERT NETWORK FOR DOMAIN-INCREMENTAL LEARNING
Xuefeng Yang1, Tong Ye1, Xiaoyu Feng1, Wenbo Wang1, Qiaoxi Zhu2 and Jian Guan1
1Harbin, China, 2Ultimo, Australia
Guan_HEU_task7_3 Guan_HEU_task7_2 Guan_HEU_task7_4 Guan_HEU_task7_1
Abstract
This report presents our submission to DCASE 2026 Task 7 on domain-agnostic incremental audio classification, where models learn D1→D2→D3 without revisiting previous-domain data and automatically infer test domains. All systems we submitted use a domain-incremental backbone with a frozen shared encoder, domain-specific BatchNorm, incremental experts, and a prototype-gated selector. System 1 adds a spectral-temporal dual-branch head for reverberant D3. System 2 evaluates weighted prediction error (WPE) dereverberation on D3. System 3 combines raw, WPE, and multi-window dereverberated predictions in a three-way ensemble with conservative routing. System 4 uses the same ensemble but lowers the fallback threshold to favor D2 and D3 routing. On the development set, all systems achieve 65.31%–68.18% average accuracy across D2 and D3, surpassing the 52.50% official baseline. System 4 performs best, reaching 68.18% average accuracy, including 72.78% on D2 and 63.59% on D3.
BN-DISTANCE SOFT ROUTING FOR DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION
Zhenping Zhang1, Wenxing Yang1, Yuzhu Wang2, Wenqiang Zhao1 and Qianyi Wang1
1College of Oriental Pan-Vascular Devices Innovation, Shanghai, China, 2Signal Processing Research Center, Tampere, Finland
Yang_USST_task7_1
Abstract
This report describes our submission to DCASE 2026 Task 7, Domain-Agnostic Incremental Learning for Audio Classification. The submitted system keeps the official CNN14-style acoustic classifier and its domain-specific batch-normalization (BN) branches unchanged. The core idea is to treat shallow BN running statistics as domain fingerprints. For each test clip, BN-distance is computed between the clip’s pre-BN activation statistics and the stage-available BN branch statistics at selected shallow BN layers. These distances are converted to softmax branch weights and used to softly fuse the branch class posteriors. No learned router or selector is trained; only a deterministic routing rule is applied at inference time. The method uses no external data, no data augmentation, and no evaluation-set statistics. On the development set, after-D2 D2 macro accuracy improves from 58.60% to 65.03%, and after-D3 D2/D3 average macro accuracy improves from 52.50% to 58.87%.
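The routing rule can be sketched as a softmax over negative BN-distances followed by a weighted sum of branch posteriors. The distance used below (variance-normalized mean gap plus log-variance gap), the temperature, and all numbers are assumptions for illustration; the abstract does not specify the exact distance or the layers used.

```python
import numpy as np

def bn_distance(clip_mean, clip_var, run_mean, run_var, eps=1e-5):
    """Hypothetical distance between clip statistics and a branch's BN running statistics."""
    mean_term = (clip_mean - run_mean) ** 2 / (run_var + eps)
    var_term = (np.log(clip_var + eps) - np.log(run_var + eps)) ** 2
    return float(np.mean(mean_term + var_term))

def soft_route(distances, branch_posteriors, temperature=1.0):
    """Turn distances into softmax weights and fuse the branch class posteriors."""
    scores = -np.asarray(distances, dtype=np.float64) / temperature
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    fused = weights @ np.asarray(branch_posteriors, dtype=np.float64)
    return weights, fused

# Per-channel pre-BN statistics of one test clip at a shallow layer (toy, 2 channels).
clip_mean, clip_var = np.array([0.9, -0.1]), np.array([1.1, 0.9])

# Running statistics stored in each domain branch (toy values).
branches = {
    "D1": (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    "D2": (np.array([1.0, 0.0]), np.array([1.0, 1.0])),
}
distances = [bn_distance(clip_mean, clip_var, m, v) for m, v in branches.values()]

# Class posteriors produced by each branch for the same clip (made up).
posteriors = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
weights, fused = soft_route(distances, posteriors)
```

Since the weights depend only on stored running statistics and the clip's own activations, nothing is trained for routing; the temperature controls how close the fusion is to a hard branch choice.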
EVIDENCE-GUIDED EXPERT EXPANSION FOR DOMAIN-AGNOSTIC INCREMENTAL AUDIO CLASSIFICATION
Peihong Zhang, Yiqiang Cai and Shengchen Li
Suzhou, China
Zhang_XJTLU_task7_2 Zhang_XJTLU_task7_1 Zhang_XJTLU_task7_3 Zhang_XJTLU_task7_4
Abstract
Domain-agnostic incremental audio classification is not only a forgetting problem: after new-domain experts have been learned, the system must still decide which frozen expert is reliable for an unlabeled-domain test sample. This report studies this routing problem under the DCASE 2026 Task 7 constraints: no external data, no pretrained audio model, no replay of previous-domain audio, no evaluation-set calibration, and no test-time adaptation. We ask whether compact aggregate evidence from frozen experts can replace the unavailable domain label at inference time. Our final system freezes the provided D1 expert, expands compact D2 and D3 experts, and routes each sample using batch-normalization evidence, class-conditional prototype distance, entropy, margin, and a calibrated D2/D3 energy score. A five-model P3a ensemble obtains 70.82% D2 accuracy, 62.41% D3 accuracy, and 66.62% D2/D3 average accuracy on the development split. Relative to the same top-five ensemble with the clean router, energy evidence reduces D3-to-D2 routing from 24.69% to 18.24%. Negative results are also informative: hard routers, tiny learned meta-routers, D3-oriented model mixing, and fixed TTA each improved one diagnostic but reduced robustness or D2 accuracy.