Raymond W. M. Ng

I am currently a research associate with the University of Sheffield. My current research focus is spoken language translation, and I am also involved in research related to speaker and language recognition.

Research:

– Spoken language translation

Spoken language translation (SLT) combines automatic speech recognition (ASR) and machine translation (MT). State-of-the-art SLT systems normally require careful tuning of the parameters of both the ASR and the MT components. To achieve reasonable performance, the ASR component is normally required to be robust (with a low word error rate, WER), and a pipeline approach is then adopted to link the ASR and MT components. In more practical scenarios, an SLT system has to deal with more varied inputs. When the input or domain is mismatched, the automatic transcript may have a much higher WER. At present little is known about which types of errors in high-WER scenarios cause specific degradation in MT performance. This research focuses on improving SLT system performance by means of system integration. By modelling the information flow between the ASR and MT components, the ASR output can be filtered and/or adapted to alleviate model mismatch. The MT system can also be tuned to the filtered, coupled output.
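As a small illustration of the WER measure mentioned above, the sketch below computes it in the standard way: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. The function name and the whitespace tokenisation are assumptions made for this example, and the code is not taken from any of the systems described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length.

    Illustrative sketch only; tokenisation is plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted against the reference length, the WER can exceed 1.0 for badly mismatched input, which is one reason high-WER transcripts are hard for a downstream MT component.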

http://staffwww.dcs.shef.ac.uk/people/W.Ng/

Publications

2019

Rosanna Milner; Md Asif Jalal; Raymond W. M. Ng; Thomas Hain: A Cross-Corpus Study on Speech Emotion Recognition. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 304–311, IEEE, 2019. (Type: Proceedings Article | Links)

2018

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)

2017

Raymond W. M. Ng; Mauro Nicolao; Thomas Hain: Unsupervised crosslingual adaptation of tokenisers for spoken language recognition. In: Comput. Speech Lang., vol. 46, pp. 327–342, 2017. (Type: Journal Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Exploring the use of acoustic embeddings in neural machine translation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp. 450–457, IEEE, 2017. (Type: Proceedings Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2715–2719, ISCA, 2017. (Type: Proceedings Article | Links)

Chenhao Wu; Raymond W. M. Ng; Oscar Saz; Thomas Hain: Analysing acoustic model changes for active learning in automatic speech recognition. In: International Conference on Systems, Signals and Image Processing, IWSSIP 2017, Poznań, Poland, May 22-24, 2017, pp. 1–5, IEEE, 2017. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Alvin C. M. Kwan; Tan Lee; Thomas Hain: Shefce: A Cantonese-English bilingual speech corpus for pronunciation assessment. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5825–5829, IEEE, 2017. (Type: Proceedings Article | Links)

2016

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: Groupwise learning for ASR k-best list reranking in spoken language translation. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 6120–6124, IEEE, 2016. (Type: Proceedings Article | Links)

Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu: webASR 2 - Improved Cloud Based Speech Technology. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1613–1617, ISCA, 2016. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Automatic Genre and Show Identification of Broadcast Media. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2115–2119, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Bhusan Chettri; Thomas Hain: Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2939–2943, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Oscar Saz; Madina Hasan; Bhusan Chettri; Mortaza Doulaty; Tan Lee; Thomas Hain: The Sheffield language recognition system in NIST LRE 2015. In: Rodríguez-Fuentes, Luis Javier; Lleida, Eduardo (Ed.): Odyssey 2016: The Speaker and Language Recognition Workshop, Bilbao, Spain, June 21-24, 2016, pp. 181–187, ISCA, 2016. (Type: Proceedings Article | Links)

2015

Kashif Shah; Raymond W. M. Ng; Fethi Bougares; Lucia Specia: Investigating continuous space language models for machine translation quality estimation. In: 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015. (Type: Proceedings Article)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Latent Dirichlet Allocation based organisation of broadcast media archives for deep neural network adaptation. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 130–136, IEEE, 2015. (Type: Proceedings Article | Links)

Oscar Saz; Mortaza Doulaty; Salil Deena; Rosanna Milner; Raymond W. M. Ng; Madina Hasan; Yulan Liu; Thomas Hain: The 2015 Sheffield system for transcription of Multi-Genre Broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 624–631, IEEE, 2015. (Type: Proceedings Article | Links)

Rosanna Milner; Oscar Saz; Salil Deena; Mortaza Doulaty; Raymond W. M. Ng; Thomas Hain: The 2015 Sheffield system for longitudinal diarisation of broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 632–638, IEEE, 2015. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Wilker Aziz; Lucia Specia; Thomas Hain: Quality estimation for ASR k-best list rescoring in spoken language translation. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 5226–5230, IEEE, 2015. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: A study on the stability and effectiveness of features in quality estimation for spoken language translation. In: INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp. 2257–2261, ISCA, 2015. (Type: Proceedings Article | Links)

Ghada AlHarbi; Raymond W. M. Ng; Thomas Hain: Annotating meta-discourse in academic lectures from different disciplines. In: Steidl, Stefan; Batliner, Anton; Jokisch, Oliver (Ed.): ISCA International Workshop on Speech and Language Technology in Education, SLaTE 2015, Leipzig, Germany, September 4-5, 2015, pp. 161–166, ISCA, 2015. (Type: Proceedings Article | Links)

2014

Raymond W. M. Ng; Mortaza Doulaty; Rama Doddipatla; Wilker Aziz; Kashif Shah; Oscar Saz; Madina Hasan; Ghada AlHarbi; Lucia Specia; Thomas Hain: The USFD SLT system for IWSLT 2014. In: Federico, Marcello; Stüker, Sebastian; Yvon, François (Ed.): Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, Lake Tahoe, CA, USA, December 4-5, 2014, 2014. (Type: Proceedings Article | Links)

2013

Raymond W. M. Ng; Thomas Hain; Trevor Cohn: Adaptation of lecture speech recognition system with machine translation output. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. 8401–8405, IEEE, 2013. (Type: Proceedings Article | Links)