Yanmin Qian | AudioCC Lab

Biography

Dr. Yanmin Qian is a Full Professor at Shanghai Jiao Tong University, China. He received his PhD from the Department of Electronic Engineering at Tsinghua University, China, in 2013, and was a Research Associate in the Speech Group at the Cambridge University Engineering Department, UK, from 2015 to 2016. He is a senior member of IEEE, a member of ISCA, and one of the founding members of the Kaldi Speech Recognition Toolkit. He has published more than 200 papers on speech and language processing with 12,000+ citations, and has been granted more than 80 patents in China and the US. He has led teams to first place in international challenges five times. His awards include the Best Journal Paper Award from Speech Communication and Best Paper Awards at IEEE ASRU’19 and IEEE ISCSLP’16. He has also been honored with several high-level talent awards in China, including the Chang Jiang Scholars Program of the Ministry of Education, the Outstanding Youth Fund of the National Natural Science Foundation of China, and the First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Award. He is currently a member of the IEEE Signal Processing Society Speech and Language Technical Committee. His research interests include speech recognition and translation, speaker and language recognition, speech separation and enhancement, natural language understanding, and multimedia signal processing.

Work Experience

2022-present, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Full Professor
2017-2022, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Associate Professor
2015-2016, Cambridge University, Department of Engineering, Machine Intelligence Laboratory: Research Associate
2013-2016, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Assistant Professor

Education

2007-2013, Tsinghua University, Department of Electronic Engineering: Ph.D. in Electronic Engineering
2003-2007, Huazhong University of Science & Technology, Department of Electronic and Information Engineering: B.E. in Electronic and Information Engineering

Research Interests

Speech and language understanding and human-computer interaction
Automatic speech recognition and translation
Speaker and language recognition
Speech separation and enhancement
Natural language understanding
Deep learning and machine learning
Multimedia processing
GPU- and SOC-based fast speech processing systems

Projects

Research on Brain-like Auditory Frontend Model and System, supported by the Ministry of Science and Technology of China (PI, 18,742,000¥)
Speech Signal Processing, Analysis and Recognition, supported by the NSFC (PI, 2,000,000￥)
Research on Fast and Efficient Adaptation in Deep Learning Speech Recognition, supported by the NSFC (PI, 540,000¥)
Structured Deep Learning Study for the Robust Speech Recognition in the Heterogeneous Noisy Scenario, supported by the NSFC (PI, 220,000￥)
Shanghai Sailing Program, supported by the Shanghai Government (PI, 200,000￥)
Multi-talker Speech Recognition for Cocktail Party Problem, supported by the Tencent Corporation (PI, 150,000￥)
High Performance Speech and Speaker Recognition System, supported by AVIC (PI, 500,000￥)
Deep Neural Network based denoising technology, supported by Baidu (PI, 100,000￥)
Deep Learning for Noise Robust Speech Recognition, supported by Shanghai Jiao Tong University (PI, 100,000￥)
Speech Objective Recognition and Content Transcription under Complex Environment, supported by the NSFC (Co-PI, 2,510,000￥)
Joint SJTU-AISpeech Laboratory, supported by AISpeech Corporation (Co-PI, 5,000,000￥)
Big Data Driven Natural Language Understanding, QA and Translation, supported by the National Key Research and Development Program of China (involved, ~50,000,000￥)
Cloud Service Platform for Service Robot, supported by the National Key Research and Development Program of China (involved, ~28,000,000￥)
Natural Speech Technology, supported by UK-EPSRC (involved, ~9,000,000$)
Babel, supported by USA-IARPA (involved, ~20,000,000$)
Speech Recognition Technology Under the Low-Data-Resource Conditions, supported by the NSFC (involved, 830,000￥) and the PhD Research and Innovation Fund of Tsinghua University (involved, 40,000￥)
Kaldi Speech Recognition Toolkit Development and Research
Large Vocabulary Continuous Speech Recognition System and Spoken Term Detection System Development and Research, supported by the China 863 Projects, NSFC Projects and Projects from China's Ministry of National Defense
Multilingual Speech Recognition Research, supported by the Interdisciplinary Fund of the School of Information Science and Technology at Tsinghua University (involved, 100,000￥)
Speech Recognition SOC System Development Under the Low-Hardware-Resource Condition; the SOC system was used in the 2008 Olympic mascots and won the High-tech Olympics Advanced Award (involved)

Activities

Membership & Qualification

IEEE Senior Member & ISCA Member
Member of IEEE Speech and Language Processing Technical Committee
Kaldi Group Member & Developer
Regular reviewer for IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Journal of Selected Topics in Signal Processing, IEEE Signal Processing Letters, Speech Communication, Computer Speech and Language, Neurocomputing, Multimedia Tools and Applications, etc.
Regular reviewer for international conferences: ICASSP, INTERSPEECH, ASRU, SLT, ISCSLP, ChinaSIP, EUSIPCO, COCOSDA, NCMMSC, ICPR, etc.

Open-source toolkit

Wespeaker Speaker Embedding Learning Toolkit, released in 2023 (https://github.com/wenet-e2e/wespeaker)
ESPNet-SE End-to-End Speech Enhancement and Separation Toolkit, released in 2021 (https://github.com/espnet/espnet)
The Kaldi Speech Recognition Toolkit: http://github.com/kaldi-asr/kaldi
CUED-RNNLM-An open-source toolkit for efficient training and evaluation of recurrent neural network language models: http://mi.eng.cam.ac.uk/projects/cued-rnnlm/

International Challenges

2022--Conversational Short-phrase Speaker Diarization Challenge, ranked 1st of 40 teams
2022--VoxCeleb Speaker Recognition Challenge, ranked 3rd of 30 teams in track 1
2022--VoxCeleb Speaker Recognition Challenge, ranked 3rd of 30 teams in track 3
2022--CN-Celeb Speaker Recognition Challenge, ranked 1st of 15 teams in the fixed track
2022--CN-Celeb Speaker Recognition Challenge, ranked 2nd of 15 teams in the open track
2021--Children Speech Recognition Challenge, ranked 1st of 28 teams in track 1
2020--Accented English Speech Recognition Challenge, ranked 1st of 49 teams in track 1
2020--Accented English Speech Recognition Challenge, ranked 2nd of 49 teams in track 2
2019--Mandarin-English Code-Mix Speech Recognition Challenge, ranked 2nd of 28 teams in track 3
2016--BTAS 2016 Speaker Anti-spoofing Competition, ranked 3rd of 7 teams
2015--MGB Recognition Challenge - Recognition of Multi-Genre Broadcast Data, ranked 1st of 20 teams
2015--Automatic Speaker Verification Spoofing and Countermeasures Challenge, ranked 3rd of 16 teams

Awards

2022--Chang Jiang Scholars Program of the Ministry of Education in China
2020--The First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Award
2019--Speech Communication Journal Best Paper Award
2019--IEEE ASRU Best Paper Award
2017--Shanghai Jiao Tong University SMC-Chenxin Level-B Young Scholar Award
2016--ISCSLP Best Student Paper Award
2016--Shanghai Science and Technology Young Scholar Award
2015--The First Prize of the MGB Data Recognition Challenge
2015--The Third Prize of the Automatic Speaker Verification Spoofing and Countermeasures Challenge
2015--Shanghai Jiao Tong University SMC-Chenxin Young Scholar Award
2014--The Second Prize of the Fourth Wu Wenjun Artificial Intelligence Science and Technology Award
2013--The Second Excellent Doctoral Dissertation Award in Tsinghua University
2012--Google Grants Award in InterSpeech 2012 (Total 4 PhDs around the world)
2012--Tsinghua-JiangZhen Scholarship, First Class (Total 25 students in Tsinghua University)
2011--Tsinghua-JiangZhen Scholarship, First Class (Total 25 students in Tsinghua University)
2010--Excellent PhD Academic Newcomer Award Nomination of Chinese Education Ministry
2010--PhD Research and Innovation Award of Tsinghua University
2009--Interdisciplinary Fund Support by School of Information Science and Technology in Tsinghua University

Contact

Yanmin Qian
SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
3-515 SEIEE Building, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
Email: yanminqian@sjtu.edu.cn

Publications

More than 200 papers published or accepted to date: 37 journal papers and 179 international conference papers.
157 papers appear in top-tier journals and international conferences on speech and signal processing, including IEEE T-ASLP, Speech Communication, ICASSP, InterSpeech, ASRU, etc.
16 IEEE Transactions, 5 Speech Communication; 64 ICASSP, 60 InterSpeech, 12 ASRU
Papers have been cited 12,000+ times (Google Scholar); 9 papers ranked in the ESI top 1%, 20 papers ranked in the ESI top 3%
85 China National Invention Patents applied for, 52 granted; 3 USA Patents applied for, 3 granted
2 co-authored book chapters
2 co-translated books
List of Publications

Curriculum vitae

resume.pdf

钱彦旻