Shanghai Jiao Tong University · Department of Computer Science & Engineering

AudioCC Lab

Auditory Cognition & Computational Acoustics

Highlights

Sound 2025! A Review of AudioCC Lab's Annual Highlights

This article reviews the achievements of Shanghai Jiao Tong University's Auditory Cognition and Computational Acoustics Lab in 2025 across research innovation, talent cultivation, and academic exchange, including paper publications, model releases, team-building activities, and awards.

award · asr

Good News | Professor Qian Yanmin Wins the Second Shanghai Jiao Tong University Ruiyuan Youth Science and Technology Award

Professor Qian Yanmin was awarded the Second Ruiyuan Youth Science and Technology Award in Information and Space Technology for his outstanding achievements in the field of auditory artificial intelligence. His innovative research effectively addressed the long-standing 'cocktail party problem' in the field, laying a technical foundation for the large-scale application of auditory processing and speech interaction technologies.

competition · sed

Good News | Two Championships and One Third Place in DCASE 2024 International Challenge

Shanghai Jiao Tong University, in collaboration with several universities and companies, won two championships in the Low-Complexity Acoustic Scene Classification and Industrial Machine Anomalous Sound Detection tasks, and a third place in the Automated Audio Captioning task at the DCASE 2024 International Challenge.

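Paper Express | ICASSP 2026 | MeanSE: Efficient Generative Speech Enhancement with Mean Velocity Flow

In this paper, we propose MeanSE, an efficient generative speech enhancement model based on Mean Flow. By modeling the average velocity field, the model achieves high-quality enhancement with a single function evaluation. Experimental results show that, under the single-function-evaluation condition, the proposed MeanSE significantly outperforms the flow matching baseline.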

Publications

Full list →

arXiv 2026 · Journal

DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice

Leying Zhang, Tingxiao Zhou, Haiyang Sun, Mengxiao Bi, Yanmin Qian

Paper

T-ASLP 2026 · Journal

Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification

Bei Liu, Yanmin Qian

Paper

CSL 2026 · Journal

An End-to-end Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe

Paper

INTERSPEECH 2026 · Conference

Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

Haiyang Sun, Shujie Hu, Shujie Liu, Lingwei Meng, Hui Wang, Bing Han, Yifan Yang, Yanqing Liu, Sheng Zhao, Yan Lu, Yanmin Qian

Paper

INTERSPEECH 2026 · Conference

Speaking Guided by Listening: Unsupervised Text-to-Speech Generative Model Guided by End-to-End Speech Recognition

Chenda Li, Wei Wang, Samuele Cornell, Bing Han, Leying Zhang, Zhengyang Chen, Shinji Watanabe, Yanmin Qian

INTERSPEECH 2026 · Conference

ContextSpeech: A Large-Scale Real-Human Speech Corpus with Context-Aware Descriptions

Haiyang Sun, Bing Han, Zheng Lian, Leying Zhang, Chenda Li, Chenyang Le, Ye Bai, Yi Zhao, Yanmin Qian

Research Themes

Speech Signal Processing Frontends

Signal processing techniques that enhance and separate speech signals, such as speech enhancement, separation, dereverberation, and robust acoustic feature extraction.

Spoken Language Understanding

Methods for transcribing and interpreting speech, including automatic speech recognition, speech translation, and contextual spoken language understanding.

Speech & Audio Generation

Generative models for producing speech and audio from text, semantic representations, or other modalities, including text-to-speech and expressive speech synthesis.

Members

Yanmin Qian

Professor

Scholar Email

Dr. Yanmin Qian is a Full Professor at Shanghai Jiao Tong University, China. He received his PhD from the Department of Electronic Engineering at Tsinghua University, China, in 2012, and was a Research Associate in the Speech Group at the Cambridge University Engineering Department, UK, from 2015 to 2016. He is a senior member of IEEE, a member of ISCA, and one of the founding members of the Kaldi Speech Recognition Toolkit. He has published more than 300 papers on speech and language processing with 20,000 citations, and has been granted more than 120 patents in China and the US. He has led teams to first place in international challenges six times. His awards include the IEEE SPS Best Paper Award, the Elsevier Speech Communication Best Paper Award, and Best Paper Awards from IEEE ISCSLP'24, IEEE ASRU'19, and IEEE ISCSLP'16. He has also received several high-level talent awards in China, including the Chang Jiang Scholars Program of the Ministry of Education, the Excellent Youth Scientists award of the National Natural Science Foundation of China, and the First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Award. He is currently a member of the IEEE Signal Processing Society Speech and Language Technical Committee. His research interests include speech recognition and translation, speaker and language recognition, speech separation and enhancement, natural language understanding, and multimedia signal processing.