Dr. Minghui Dong is a Principal Scientist and Head of the Speech Generation Team in the Aural and Language Intelligence Department, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore. He also serves as President of the Pattern Recognition and Machine Intelligence Association, Vice-President of the Chinese and Oriental Languages Information Processing Society (COLIPS), Chairperson of the ISCA Special Interest Group on Chinese Spoken Language Processing (SIG-CSLP), Executive Committee Member of the Asian Federation of Natural Language Processing (AFNLP), and Editor-in-Chief of the International Journal of Asian Language Processing (IJALP). He obtained his Bachelor of Science degree from the University of Science and Technology of China (USTC), a Master of Science degree from Peking University (PKU), and a PhD from the National University of Singapore (NUS). Before joining I2R in December 2004, he worked as a Research Engineer at Peking University from July 1995 to July 1998 and as a Researcher at Infotalk Technology (Singapore) from July 2001 to November 2004.
Dr. Dong's research interests span spoken language processing, natural language processing, music and singing processing, and machine learning methods. He has published over 100 research papers in top-tier conferences and journals and edited 14 books and proceedings. He has contributed to the Asian and international research communities by holding positions at conferences including IJCNLP, ACL, InterSpeech, PACLIC, ISCSLP, IALP, CLSW, ICCPOL, COLING, and APSIPA, and by serving as an officer of COLIPS, ISCA SIG-CSLP, and AFNLP. He oversees the organization of the IALP conference series and the publication of the IJALP journal, both of which aim to foster interaction among researchers working on low-resource language processing.
Within speech and language processing, he has led Text-to-Speech (TTS) research and development. He has developed TTS systems for a range of local languages, including English, Chinese, Malay, and Tamil, among others, and optimized them for cloud, PC, and smartphone platforms. He has also contributed to projects in speech recognition, speaker and language recognition, computer-aided language learning, natural language processing, transliteration, sentiment analysis, parsing, language models, semantics, and dialogue systems. He currently leads research on deep learning technologies for spoken language processing, among other areas.
Served as technical reviewer or program committee member for the following journals and conferences:
- B Sisman, M Zhang, M Dong, H Li, On the study of generative adversarial networks for cross-lingual voice conversion, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019), 2019.
- B Sisman, K Vijayan, M Dong, H Li, SINGAN: Singing voice conversion with generative adversarial networks, 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2019), 2019.
- Y Lu, M Dong, Y Chen, Implementing prosodic phrasing in Chinese end-to-end speech synthesis, 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), 2019.
- DY Huang, E Chandra, X Yang, Y Zhou, H Ming, W Lin, M Dong, H Li, Visual Speech Emotion Conversion using Deep Learning for 3D Talking Head, Proceedings of the Joint Workshop of the 4th Workshop on Affective Social Computing, 2018
- Dongyan Huang, Wan Ding, Yu Ming, Huaiping Ming, Xinguo Yu, Minghui Dong, Haizhou Li. Multimodal Prediction Of Affective Dimensions Fusing Multiple Regression Techniques, INTERSPEECH 2017.
- Minghui Dong, Chenyu Yang, Huaiping Ming, Jochen Ehnes, Dongyan Huang, Frame Labeling and Mapping for Non-parallel Voice Conversion, IEEE International Conference on Signal and Image Processing (ICSIP) 2017
- Xiaoxi Ma, Dongyan Huang, Weisi Lin, Minghui Dong, Haizhou Li. Facial Emotion Recognition, IEEE International Conference on Signal and Image Processing (ICSIP) 2017.
- Yanfeng Lu, Chenyu Yang, Minghui Dong, Word Level Prosody Prediction Using Large Audiobook Dataset, Asia-Pacific Signal and Information Processing Association (APSIPA) 2017.
- Minghui Dong, Zhengchen Zhang, Huaiping Ming, Representing Raw Linguistic Information in Chinese Text-to-Speech System, Asia-Pacific Signal and Information Processing Association (APSIPA) 2017.
- Huaiping Ming, Yanfeng Lu, Zhengchen Zhang, Minghui Dong, A Light-weight Method of Building an LSTM-RNN-based Bilingual TTS System, International Conference on Asian Language Processing (IALP) 2017.
- Paul Chan, Minghui Dong, Haizhou Li, SERAPHIM - A Wavetable Synthesis System with 3D Lip Animation for Real-time Speech and Singing Applications on Mobile Platforms, INTERSPEECH 2016, Sep 2016.
- Paul Chan, Minghui Dong, Haizhou Li, SERAPHIM Live! Singing Synthesis for the Performer, the Composer, and the 3D Game Developer, INTERSPEECH 2016, Sep 2016
- Huaiping Ming, Dongyan Huang, Minghui Dong, Haizhou Li, Lei Xie, Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion, INTERSPEECH 2016, Sep 2016.
- Zhengchen Zhang, Fuxiang Wu, Chenyu Yang, Minghui Dong, Fugen Zhou, Mandarin Prosodic Phrase Prediction based on Syntactic Trees, 9th ISCA Speech Synthesis Workshop, Sep 2016.
- Wei Yang Quek, D.Y. Huang, W.S. Lin, M.H. Dong, Mobile Acoustic Emotion Recognition, IEEE Region 10 Technical Conference (TENCON), Nov 2016.
- Fuxiang Wu, Minghui Dong, Zhengchen Zhang and Fugen Zhou, A Stack LSTM Transition-Based Dependency Parser with Context Enhancement and K-best Decoding, 17th International Workshop on Chinese Lexical Semantics, May 2016.
- Xin Wang, Minghui Dong, Zhen-Hua Ling, A Full Training Framework Of Cross-Stream Dependence Modelling For HMM-Based Singing Voice Synthesis, IEEE International Conference on Acoustics, Speech And Signal Processing (ICASSP), Mar 2016.
- Huaiping Ming, Dongyan Huang, Lei Xie, Shaofei Zhang, Minghui Dong, Haizhou Li: Exemplar-based Sparse Representation of Timbre and Prosody for Voice Conversion. ICASSP 2016, Mar 2016.
- Dong-Yan Huang, Minghui Dong, Haizhou Li, Combining Multiple Kernel Models for Automatic Intelligibility Detection of Pathological Speech. IEEE International Conference on Acoustics, Speech And Signal Processing (ICASSP), Mar 2016.
- S.W.Y Lee, Quy Hy Nguyen, K.A. Lee, Xiaohai Tian, Longting Xu, E.S. Chng, M.H. Dong, H.Z. Li, Voiceproducer: Many-To-Many Voice Conversion For Arbitrary Speakers, IEEE International Conference on Acoustics, Speech And Signal Processing (ICASSP), Mar 2016.
- Wan Ding, Mingyu Xu, Dong-Yan Huang, Weisi Lin, Minghui Dong, Xinguo Yu, Haizhou Li: Audio and face video emotion recognition in the wild using deep neural networks and small datasets. ICMI 2016: 506-513
- Zhengchen Zhang, Chen Zhang, Fuxiang Wu, Dongyan Huang, Weisi Lin, Minghui Dong: I2R-NTU at SemEval-2016 Task 4: Classifier Fusion for Polarity Classification in Twitter. Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016); Jan 2016
- Shaofei Zhang, Dongyan Huang, Lei Xie, Eng Siong Chng, Haizhou Li, Minghui Dong: Non-negative Matrix Factorization using Stable Alternating Direction Method of Multipliers for Source Separation. APSIPA ASC 2015, Hong Kong, China; 12/2015
- Minghui Dong, Chenyu Yang, Yanfeng Lu, Jochen Walter Ehnes, Dongyan Huang, Huaiping Ming, Rong Tong, Siu Wa Lee, Haizhou Li: Mapping Frames with DNN-HMM Recognizer for Non-parallel Voice Conversion. APSIPA ASC 2015, Hong Kong; 12/2015
- Bo Fan, Siu Wa Lee, Xiaohai Tian, Lei Xie, Minghui Dong: A Waveform Representation Framework for High-quality Statistical Parametric Speech Synthesis. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) 2015, Hong Kong; 12/2015
- Zhengchen Zhang, Fuxiang Wu, Minghui Dong, Fugen Zhou: Mandarin Prosodic Word Prediction Using Dependency Relationships. 2015 International Conference on Asian Language Processing (IALP); 10/2015
- Gillian Chua, Qian Ci Chang, Ye Won Park, Paul Yaozhu Chan, Minghui Dong, Haizhou Li: The Expression of Singing Emotion - Contradicting the Constraints of Song. 2015 International Conference on Asian Language Processing (IALP); 10/2015
- Yang Yu, Weisi Lin, Dong-Yan Huang, Minghui Dong, Haizhou Li: Performance Scoring of Singing voice. 2015 International Conference on Asian Language Processing (IALP); 10/2015
- Huaiping Ming, Dongyan Huang, Minghui Dong, Haizhou Li, Lei Xie, Shaofei Zhang: Fundamental Frequency Modeling Using Wavelets for Emotional Voice Conversion. 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), Xi'an, Shaanxi, China; 09/2015
- Huaiping Ming, Dongyan Huang, Lei Xie, Haizhou Li, Minghui Dong: An Alternating Optimization Approach for Phase Retrieval. INTERSPEECH 2015, Dresden, Germany; 09/2015
- Shaofei Zhang, Dongyan Huang, Lei Xie, Eng Siong Chng, Haizhou Li, Minghui Dong: Regularized Non-negative Matrix Factorization Using Alternating Direction Method of Multipliers and Its Application to Source Separation. INTERSPEECH 2015, Dresden, Germany; 09/2015
- Dong-Yan Huang, Minghui Dong, Haizhou Li: A Real-Time Variable-Q Non-Stationary Gabor Transform for Pitch Shifting. INTERSPEECH 2015, Dresden, Germany; 09/2015
- Xiaohai Tian, Zhizheng Wu, Siu Wa Lee, Nguyen Quy, Minghui Dong, Eng Siong Chng: System Fusion for High-Performance Voice Conversion. INTERSPEECH 2015; 09/2015
- Andrea Der, C Yang, D.-Y Huang, M Dong, H Li: Perceptual Speech Quality Improvement for Vocoder based on Amplitude Spectrum of Residual Signal. The Third IEEE China Summit and International Conference on Signal and Information Processing 2015, Chengdu; 07/2015
- Paul Yaozhu Chan, Minghui Dong, Yi Qian Lim, Ashleigh Toh, Elliot Chong, Mantita Yeo, Megan Chua, Haizhou Li: Formant Excursion in Singing Synthesis. 2015 IEEE International Conference on Digital Signal Processing (DSP); 07/2015
- Xiaohai Tian, Zhizheng Wu, Siu Wa Lee, Hy Quy Nguyen, Eng Siong Chng, Minghui Dong: Sparse Representation for Frequency Warping based Voice Synthesis. ICASSP, Brisbane, Australia; 04/2015
- Renbo Zhao, S. W. Lee, Dong-Yan Huang, Minghui Dong: Soft Constrained Leading Voice Separation with Music Score Guidance. 2014 9th International Symposium on Chinese Spoken Language Processing, Singapore; 09/2014
- Minghui Dong, Siu Wa Lee, Haizhou Li, Paul Chan, Xuejian Peng, Jochen Walters Ehnes, Dongyan Huang: I2R Speech2singing Perfects Everyone's Singing. INTERSPEECH, Singapore; 09/2014
- Siu Wa Lee, Zhizheng Wu, Minghui Dong, Xiaohai Tian, Haizhou Li: A Comparative Study of Spectral Transformation Techniques for Singing Voice Synthesis. INTERSPEECH, Singapore; 09/2014
- Dong-Yan Huang, Minghui Dong, Haizhou Li: Intelligibility Detection of Pathological Speech using Asymmetric Sparse Kernel Partial Least Squares Classifier. ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 05/2014
- Dongyan Huang, Haizhou Li, and Minghui Dong, Ensemble Nyström Method for Predicting Conflict Level from Speech, Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014.
- Zhengchen Zhang, Minghui Dong, and Shuzhi Sam Ge, Emotion Analysis of Children's Stories with Context Information, Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014.
- Shuojun Liu, Dongyan Huang, Weisi Lin, Minghui Dong, Haizhou Li, and Ee Ping Ong, Emotional Facial Expression Transfer based on Temporal Restricted Boltzmann Machines, Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014.
- Zhengchen Zhang, Minghui Dong, The Power of Special Characters in Prosodic Word Prediction for Chinese TTS, International Symposium on Chinese Spoken Language Processing (ISCSLP), 2014
- Kelvin Poon-Feng, D.Y. Huang, Minghui Dong, Haizhou Li: Acoustic Emotion Recognition based on Fusion of Multiple Feature-Dependent Deep Boltzmann Machines, International Symposium on Chinese Spoken Language Processing (ISCSLP), 2014.
- Dong-Yan Huang, Minghui Dong, Haizhou Li: A Dynamic Gaussian Process for Voice Conversion. 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 07/2013
- S.W. Lee, Minghui Dong, Haizhou Li: A Study of F0 Modelling and Generation with Lyrics and Shape Characterization for Singing Voice Synthesis. 2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP); 12/2012
- S.W. Lee, Shen Ting Ang, Minghui Dong, Haizhou Li: Generalized F0 Modelling with Absolute and Relative Pitch Features for Singing Voice Synthesis. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 03/2012
- Ling Cen, Minghui Dong, Paul Yaozhu Chan: Template-based Personalized Singing Voice Synthesis. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP); 03/2012
- Ling Cen, Minghui Dong, Paul Yaozhu Chan: Linear Regression for Prosody Prediction via Convex Optimization. The International Conference on Asian Language Processing; 11/2011
- Ling Cen, Minghui Dong, Paul Yaozhu Chan: Data Pre-processing in Emotional Speech Synthesis by Emotion Recognition. APSIPA Annual Summit and Conference; 10/2011
- Ling Cen, Minghui Dong, Paul Yaozhu Chan: Segmentation of Speech Signals in Template-based Speech to Singing Conversion. APSIPA Annual Summit and Conference; 10/2011
- Minghui Dong, S W Lee, Paul Yaozhu, Ling CEN: I2R Text-to-Speech System for Blizzard Challenge 2011. Blizzard Challenge Workshop 2010; 09/2011
- Paul Y. Chan, Minghui Dong, S. W. Lee, Ling Cen: Solo to A Capella Conversion - Synthesizing Vocal Harmony from Lead Vocals. Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain.
- S. W. Lee, Minghui Dong: Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope. INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011.
- Ling Cen, Zhu Liang Yu, Minghui Dong: Speech Emotion Recognition System Based on L1 Regularized Linear Regression and Decision Fusion. Affective Computing and Intelligent Interaction - Fourth International Conference, ACII 2011, Memphis, TN, USA, October 9-12, 2011, Proceedings, Part II.
- Hwee Teng Tan, Minghui Dong, "Analyzing the Relationship between Formants and Pitch for Singing Voice", The International Conference on Asian Language Processing (IALP) 2011.
- Ling Cen, Minghui Dong, Paul Chan, "Linear Regression for Prosody Prediction via Convex Optimization", The International Conference on Asian Language Processing (IALP) 2011.
- Minghui Dong, Paul Chang, Ling Cen, Haizhou Li: Aligning Singing Voice with MIDI Melody Using Synthesized Audio Signal. 7th International Symposium on Chinese Spoken Language Processing; 12/2010
- Paul Yaozhu Chan, Minghui Dong, Ling Cen: The Psychoacoustic Approach towards Enhancing Speech Intelligibility in Noise. 7th International Symposium on Chinese Spoken Language Processing; 12/2010
- Ling Cen, Paul Chan, Minghui Dong: Generating Emotional Speech from Neutral Speech. 7th International Symposium on Chinese Spoken Language Processing; 12/2010
- Minghui Dong, Ling Cen, Paul Chan, Haizhou Li: Considering Readability in Text-to-Speech Recording Script Design. 7th Speech Synthesis Workshop; 09/2010
- Tin Lay Nwe, Minghui Dong, Paul Y. Chan, Xi Wang, Bin Ma, Haizhou Li: Voice Conversion: From Spoken Vowels to Singing Vowels. Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, ICME 2010, 19-23 July 2010, Singapore.
- Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua: Phonetic Segmentation of Singing Voice Using MIDI and Parallel Speech. INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010.
- Minghui Dong, Ling Cen, Paul Y. Chan, Haizhou Li: Refining Unit Boundaries for Mandarin Text-to-Speech Database. 2009 International Conference on Asian Language Processing, IALP 2009, Singapore, December 7-9, 2009.
- Minghui Dong, Paul Chan, Ling Cen, Bin Ma, Haizhou Li: I2R Text-to-Speech System for Blizzard Challenge 2010. Blizzard Challenge Workshop 2010.
- Ling Cen, Minghui Dong, Paul Chan, Haizhou Li: Boundary Labeling of Prosodic Words for Mandarin TTS Database. International Symposium on Asian Speech Resources; 08/2009
- Minghui Dong, Haizhou Li: Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis. 6th International Symposium on Chinese Spoken Language Processing, 2008. ISCSLP '08.
- Ling Cen, Minghui Dong, Paul Y. Chan, Haizhou Li: Unit selection based speech synthesis for poor channel condition. INTERSPEECH 2009, 10th Annual Conference of the International Speech Communication Association, Brighton, United Kingdom, September 6-10, 2009.
- Dongyan Huang, E.P. Wong, S. Rahardja, Minghui Dong, Haizhou Li, "Transformation of Vocal Characteristics: A Review of Literature", International Conference on Machine Vision, Image Processing, and Pattern Analysis (ICMVIPPA), 2009.
- Minghui Dong, Ling Cen, Paul Chan, Dongyan Huang, Donglai Zhu, Bin Ma, Haizhou Li, "I2R Text-to-Speech System for Blizzard Challenge 2009", Blizzard Challenge workshop 2009, Sep 2009, Edinburgh, UK.
- Ling Cen, Minghui Dong, Paul Chan, "Feature Integration and Dimension Reduction in Unit Selection based TTS", International Conference on Asian Language Processing 2009, Dec 2009, Singapore.
- Tin Lay Nwe, Minghui Dong, Swe Zin Kalayar Khine, Haizhou Li: Multi-speaker meeting audio segmentation. INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008.
- Minghui Dong, Donglai Zhu, Bin Ma, Haizhou Li, "I2R Submission for Blizzard Challenge 2008", ISCA Blizzard Challenge Workshop, September 21, 2008, Brisbane, Australia.
- Haizhou Li, Khe Chai Sim, Jin-Shea Kuo, Minghui Dong: Semantic Transliteration of Personal Names. ACL 2007, Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, June 23-30, 2007, Prague, Czech Republic.
- Minghui Dong, Haizhou Li, Tin Lay Nwe: Evaluating prosody of Mandarin speech for language learning. INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006.
- Tin Lay Nwe, Haizhou Li, Minghui Dong: Analysis and detection of speech under sleep deprivation. INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006.
- Kong-Aik Lee, Hanwu Sun, Rong Tong, Bin Ma, Minghui Dong, Changhuai You, Donglai Zhu, Chin-Wei Eugene Koh, Lei Wang, Tomi Kinnunen, Chng Eng Siong, Haizhou Li: The IIR Submission to CSLP 2006 Speaker Recognition Evaluation. Chinese Spoken Language Processing, 5th International Symposium, ISCSLP 2006, Singapore, December 13-16, 2006, Proceedings.
- Rong Tong, Bin Ma, Kong-Aik Lee, Changhuai You, Donglai Zhu, Tomi Kinnunen, Hanwu Sun, Minghui Dong, Chng Eng Siong, Haizhou Li: Fusion of Acoustic and Tokenization Features for Speaker Recognition. Chinese Spoken Language Processing, 5th International Symposium, ISCSLP 2006, Singapore, December 13-16, 2006, Proceedings.
- Minghui Dong, Kim-Teng Lua, Haizhou Li: A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS. INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005.
- Minghui Dong, Kim-Teng Lua, Jun Xu: Selecting Prosody Parameters for Unit Selection Based Chinese TTS. Natural Language Processing - IJCNLP 2004, First International Joint Conference, Hainan Island, China, March 22-24, 2004, Revised Selected Papers.
- Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li: On unit analysis for Cantonese corpus-based TTS. 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003.
- Minghui Dong, Kim-Teng Lua: Pitch contour model for Chinese text-to-speech using CART and statistical model. 7th International Conference on Spoken Language Processing, ICSLP2002 - INTERSPEECH 2002, Denver, Colorado, USA, September 16-20, 2002.
- Minghui Dong and Kim-Teng Lua, "Prosodic Phrase Detection For Chinese TTS Using Cart And Statistical Model", International Symposium on Chinese Spoken Language Processing (ISCSLP 2002), Taipei, 2002.
- Minghui Dong and Kim-Teng Lua, "Automatic Prosodic Break Labelling For Mandarin Chinese Speech Data", International Conference on Spoken Language Processing (ICSLP 2002), Denver, USA, 2002.
- Minghui Dong, Kim-Teng Lua: Using prosody database in Chinese speech synthesis. Sixth International Conference on Spoken Language Processing, ICSLP 2000 / INTERSPEECH 2000, Beijing, China, October 16-20, 2000; 01/2000
- Minghui Dong and Kim-Teng Lua, "An Example-Based Approach For Prosody Generation In Chinese Speech Synthesis", International Symposium on Chinese Spoken Language Processing (ISCSLP 2000), Beijing, China, 2000.