Biography
Dr. Yanmin Qian is a Full Professor at Shanghai Jiao Tong University, China. He received his PhD from the Department of Electronic Engineering at Tsinghua University, China, in 2013, and was a Research Associate in the Speech Group at the Cambridge University Engineering Department, UK, from 2015 to 2016. He is a senior member of the IEEE, a member of ISCA, and one of the founding members of the Kaldi Speech Recognition Toolkit. He has published more than 200 papers on speech and language processing with 12,000+ citations, and has been granted more than 80 patents in China and the US. He has led teams to first place in international challenges five times. He has received several awards, including the Best Journal Paper Award from Speech Communication and Best Paper Awards at IEEE ASRU’19 and IEEE ISCSLP’16. He has also been honored with several high-level talent awards in China, including the Chang Jiang Scholars Program of the Ministry of Education, the Outstanding Youth Fund of the National Natural Science Foundation of China, and the First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Award. He is currently a member of the IEEE Signal Processing Society Speech and Language Technical Committee. His research interests include speech recognition and translation, speaker and language recognition, speech separation and enhancement, natural language understanding, and multimedia signal processing.
Work Experience
- 2022-present, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Full Professor
- 2017-2022, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Associate Professor
- 2015-2016, Cambridge University, Department of Engineering, Machine Intelligence Laboratory: Research Associate
- 2013-2016, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Assistant Professor
Education
- 2007-2013, Tsinghua University, Department of Electronic Engineering: Ph.D. in Electronic Engineering
- 2003-2007, Huazhong University of Science & Technology, Department of Electronic and Information Engineering: B.E. in Electronic and Information Engineering
Research Interests
- Speech and language understanding and human-computer interaction
- Automatic speech recognition and translation
- Speaker and language recognition
- Speech separation and enhancement
- Natural language understanding
- Deep learning and machine learning
- Multimedia processing
- GPU- and SoC-based fast speech processing systems
Projects
- Research on Brain-like Auditory Frontend Model and System, supported by the Ministry of Science and Technology of China (PI, 18,742,000¥)
- Speech Signal Processing, Analysis and Recognition, supported by the NSFC (PI, 2,000,000¥)
- Research on Fast and Efficient Adaptation in Deep Learning Speech Recognition, supported by the NSFC (PI, 540,000¥)
- Structured Deep Learning Study for the Robust Speech Recognition in the Heterogeneous Noisy Scenario, supported by the NSFC (PI, 220,000¥)
- Shanghai Sailing Program, supported by the Shanghai Government (PI, 200,000¥)
- Multi-talker Speech Recognition for Cocktail Party Problem, supported by the Tencent Corporation (PI, 150,000¥)
- High Performance Speech and Speaker Recognition System, supported by AVIC (PI, 500,000¥)
- Deep Neural Network based denoising technology, supported by Baidu (PI, 100,000¥)
- Deep Learning for Noise Robust Speech Recognition, supported by Shanghai Jiao Tong University (PI, 100,000¥)
- Speech Objective Recognition and Content Transcription under Complex Environment, supported by the NSFC (Co-PI, 2,510,000¥)
- Joint SJTU-AISpeech Laboratory, supported by AISpeech Corporation (Co-PI, 5,000,000¥)
- Big Data Driven Natural Language Understanding, QA and Translation, supported by the National Key Research and Development Program of China (involved, ~50,000,000¥)
- Cloud Service Platform for Service Robot, supported by the National Key Research and Development Program of China (involved, ~28,000,000¥)
- Natural Speech Technology, supported by UK-EPSRC (involved, ~9,000,000$)
- Babel, supported by USA-IARPA (involved, ~20,000,000$)
- Speech Recognition Technology Under the Low-Data-Resource Conditions, supported by the NSFC (involved, 830,000¥) and the PhD Research and Innovation Fund of Tsinghua University (involved, 40,000¥)
- Kaldi Speech Recognition Toolkit Development and Research
- Large Vocabulary Continuous Speech Recognition System and Spoken Term Detection System Development and Research, supported by the China 863 Projects, NSFC Projects and projects from China's Ministry of National Defense
- Multilingual Speech Recognition Research, supported by the Interdisciplinary Fund of the School of Information Science and Technology at Tsinghua University (involved, 100,000¥)
- Speech Recognition SoC System Development Under the Low-Hardware-Resource Condition; the SoC system was used in the 2008 Olympic mascots and won the High-tech Olympics Advanced Award (involved)
Activities
Membership & Qualification
- IEEE Senior Member & ISCA Member
- Member of IEEE Speech and Language Processing Technical Committee
- Kaldi Group Member & Developer
- Regular reviewer for journals: IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Journal of Selected Topics in Signal Processing, IEEE Signal Processing Letters, Speech Communication, Computer Speech and Language, Neurocomputing, Multimedia Tools and Applications, etc.
- Regular reviewer for international conferences: ICASSP, INTERSPEECH, ASRU, SLT, ISCSLP, ChinaSIP, EUSIPCO, COCOSDA, NCMMSC, ICPR, etc.
Open-source toolkit
- Wespeaker Speaker Embedding Learning Toolkit, released in 2023 (https://github.com/wenet-e2e/wespeaker)
- ESPNet-SE End-to-End Speech Enhancement and Separation Toolkit, released in 2021 (https://github.com/espnet/espnet)
- The Kaldi Speech Recognition Toolkit: http://github.com/kaldi-asr/kaldi
- CUED-RNNLM-An open-source toolkit for efficient training and evaluation of recurrent neural network language models: http://mi.eng.cam.ac.uk/projects/cued-rnnlm/
International Challenges
- 2022--Conversational Short-phrase Speaker Diarization Challenge, ranked 1st of 40 teams
- 2022--VoxCeleb Speaker Recognition Challenge, ranked 3rd of 30 teams in track 1
- 2022--VoxCeleb Speaker Recognition Challenge, ranked 3rd of 30 teams in track 3
- 2022--CN-Celeb Speaker Recognition Challenge, ranked 1st of 15 teams in the fixed track
- 2022--CN-Celeb Speaker Recognition Challenge, ranked 2nd of 15 teams in the open track
- 2021--Children Speech Recognition Challenge, ranked 1st of 28 teams in track 1
- 2020--Accented English Speech Recognition Challenge, ranked 1st of 49 teams in track 1
- 2020--Accented English Speech Recognition Challenge, ranked 2nd of 49 teams in track 2
- 2019--Mandarin-English Code-Mix Speech Recognition Challenge, ranked 2nd of 28 teams in track 3
- 2016--BTAS 2016 Speaker Anti-spoofing Competition, ranked 3rd of 7 teams
- 2015--MGB Recognition Challenge - Recognition of Multi-Genre Broadcast Data, ranked 1st of 20 teams
- 2015--Automatic Speaker Verification Spoofing and Countermeasures Challenge, ranked 3rd of 16 teams
Awards
- 2022--Chang Jiang Scholars Program of the Ministry of Education in China
- 2020--The First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Award
- 2019--Speech Communication Journal Best Paper Award
- 2019--IEEE ASRU Best Paper Award
- 2017--Shanghai Jiao Tong University SMC-Chenxin Level-B Young Scholar Award
- 2016--ISCSLP Best Student Paper Award
- 2016--Shanghai Science and Technology Young Scholar Award
- 2015--The First Prize of the MGB Data Recognition Challenge
- 2015--The Third Prize of the Automatic Speaker Verification Spoofing and Countermeasures Challenge
- 2015--Shanghai Jiao Tong University SMC-Chenxin Young Scholar Award
- 2014--The Second Prize of the Fourth Wu Wenjun Artificial Intelligence Science and Technology Award
- 2013--The Second Excellent Doctoral Dissertation Award in Tsinghua University
- 2012--Google Grants Award at InterSpeech 2012 (4 PhD students worldwide in total)
- 2012--Tsinghua-JiangZhen Scholarship, First Class (25 students at Tsinghua University in total)
- 2011--Tsinghua-JiangZhen Scholarship, First Class (25 students at Tsinghua University in total)
- 2010--Excellent PhD Academic Newcomer Award Nomination of Chinese Education Ministry
- 2010--PhD Research and Innovation Award of Tsinghua University
- 2009--Interdisciplinary Fund Support by School of Information Science and Technology in Tsinghua University
Contact
Yanmin Qian
SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
3-515 SEIEE Building, 800 Dongchuan Road, Minhang District, Shanghai 200240, China
Email: yanminqian@sjtu.edu.cn
Publications
- More than 200 papers published or accepted to date: 37 journal papers and 179 international conference papers.
- 157 papers published in top-tier journals and international conferences on speech and signal processing, including IEEE T-ASLP, Speech Communication, ICASSP, InterSpeech, ASRU, etc.
- 16 IEEE Transactions, 5 Speech Communication; 64 ICASSP, 60 InterSpeech, 12 ASRU
- Papers cited 12,000+ times (Google Scholar); 9 papers ranked in the ESI top 1%, 20 papers ranked in the ESI top 3%
- 85 Chinese national invention patents applied for, 52 granted; 3 US patents applied for, 3 granted
- 2 co-authored book chapters
- 2 co-translated books
- List of Publications