Thomas Hain

Head of Group

Thomas holds the degree ‘Dipl.-Ing’ from the University of Technology, Vienna, and a PhD from Cambridge University. After working at Philips Speech Processing, Vienna, he joined the Cambridge University Engineering Department in 1997 and moved to SpandH in 2004. He was promoted to Professor in 2012 and leads the Machine Intelligence for Natural Interfaces subgroup. Thomas has more than 100 publications on machine learning and speech recognition in international conferences, journals and books (h-index 26). Apart from membership of many technical committees, including serving as speech recognition area chair at ICASSP and Interspeech, he served on the IEEE Speech Technical Committee from 2007 to 2009 and as an organizing committee member of Interspeech 2009, IEEE ASRU 2011 and 2013. He is currently a member of the editorial board of Computer Speech and Language (CSL) and an Associate Editor of the ACM Transactions on Speech and Language Processing. His recent research focuses on the recognition of natural speech in realistic environments and on the integration of speech technology with downstream processes such as content linking, summarisation or machine translation.

Contact

Where to find us?

Office G041, Regent Court, 211 Portobello, Sheffield S1 4DP, UK

+44 114 222 1836

t.hain@sheffield.ac.uk

Publications

2019

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. In: IEEE ACM Trans. Audio Speech Lang. Process., vol. 27, no. 3, pp. 572–582, 2019. (Type: Journal Article | Links)

Rosanna Milner; Md Asif Jalal; Raymond W. M. Ng; Thomas Hain: A Cross-Corpus Study on Speech Emotion Recognition. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 304–311, IEEE, 2019. (Type: Proceedings Article | Links)

Md Asif Jalal; Roger K. Moore; Thomas Hain: Spatio-Temporal Context Modelling for Speech Emotion Classification. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 853–859, IEEE, 2019. (Type: Proceedings Article | Links)

Hardik B. Sailor; Salil Deena; Md Asif Jalal; Rasa Lileikyte; Thomas Hain: Unsupervised Adaptation of Acoustic Models for ASR Using Utterance-Level Embeddings from Squeeze and Excitation Networks. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 980–987, IEEE, 2019. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 584–588, ISCA, 2019. (Type: Proceedings Article | Links)

Md Asif Jalal; Erfan Loweimi; Roger K. Moore; Thomas Hain: Learning Temporal Clusters Using Capsule Routing for Speech Emotion Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1701–1705, ISCA, 2019. (Type: Proceedings Article | Links)

Mortaza Doulaty; Thomas Hain: Latent Dirichlet Allocation Based Acoustic Data Selection for Automatic Speech Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 3228–3232, ISCA, 2019. (Type: Proceedings Article | Links)

2018

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Exploring the Use of Group Delay for Generalised VTS Based Noise Compensation. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018, pp. 4824–4828, IEEE, 2018. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: On the Usefulness of the Speech Phase Spectrum for Pitch Extraction. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 696–700, ISCA, 2018. (Type: Proceedings Article | Links)

Mauro Nicolao; Michiel Sanders; Thomas Hain: Improved Acoustic Modelling for Automatic Literacy Assessment of Children. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 1666–1670, ISCA, 2018. (Type: Proceedings Article | Links)

Rahhal Errattahi; Salil Deena; Asmaa El Hannani; Hassan Ouahmane; Thomas Hain: Improving ASR Error Detection with RNNLM Adaptation. In: 2018 IEEE Spoken Language Technology Workshop, SLT 2018, Athens, Greece, December 18-21, 2018, pp. 190–196, IEEE, 2018. (Type: Proceedings Article | Links)

2017

Oscar Saz; Thomas Hain: Acoustic adaptation to dynamic background conditions with asynchronous transformations. In: Comput. Speech Lang., vol. 41, pp. 180–194, 2017. (Type: Journal Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Thomas Hain: Unsupervised crosslingual adaptation of tokenisers for spoken language recognition. In: Comput. Speech Lang., vol. 46, pp. 327–342, 2017. (Type: Journal Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Exploring the use of acoustic embeddings in neural machine translation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp. 450–457, IEEE, 2017. (Type: Proceedings Article | Links)

Rosanna Milner; Thomas Hain: DNN approach to speaker diarisation using speaker channels. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 4925–4929, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Statistical normalisation of phase-based feature representation for robust speech recognition. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5310–5314, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Channel Compensation in the Generalised Vector Taylor Series Approach to Robust ASR. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2466–2470, ISCA, 2017. (Type: Proceedings Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2715–2719, ISCA, 2017. (Type: Proceedings Article | Links)

Chenhao Wu; Raymond W. M. Ng; Oscar Saz; Thomas Hain: Analysing acoustic model changes for active learning in automatic speech recognition. In: International Conference on Systems, Signals and Image Processing, IWSSIP 2017, Poznań, Poland, May 22-24, 2017, pp. 1–5, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Oscar Saz; Thomas Hain: Robust Source-Filter Separation of Speech Signal in the Phase Domain. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 414–418, ISCA, 2017. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Alvin C. M. Kwan; Tan Lee; Thomas Hain: Shefce: A Cantonese-English bilingual speech corpus for pronunciation assessment. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5825–5829, IEEE, 2017. (Type: Proceedings Article | Links)

2016

Rosanna Milner; Thomas Hain: Segment-oriented evaluation of speaker diarisation performance. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 5460–5464, IEEE, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: Groupwise learning for ASR k-best list reranking in spoken language translation. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 6120–6124, IEEE, 2016. (Type: Proceedings Article | Links)

Sarah Al-Shareef; Thomas Hain: Colloquialising Modern Standard Arabic Text for Improved Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1345–1349, ISCA, 2016. (Type: Proceedings Article | Links)

Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu: webASR 2 - Improved Cloud Based Speech Technology. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1613–1617, ISCA, 2016. (Type: Proceedings Article | Links)

Julia Olcoz; Oscar Saz; Thomas Hain: Error Correction in Lightly Supervised Alignment of Broadcast Subtitles. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2110–2114, ISCA, 2016. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Automatic Genre and Show Identification of Broadcast Media. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2115–2119, ISCA, 2016. (Type: Proceedings Article | Links)

Rosanna Milner; Thomas Hain: DNN-Based Speaker Clustering for Speaker Diarisation. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2185–2189, ISCA, 2016. (Type: Proceedings Article | Links)

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2343–2347, ISCA, 2016. (Type: Proceedings Article | Links)

Iñigo Casanueva; Thomas Hain; Phil D. Green: Improving Generalisation to New Speakers in Spoken Dialogue State Tracking. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2726–2730, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Bhusan Chettri; Thomas Hain: Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2939–2943, ISCA, 2016. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 3798–3802, ISCA, 2016. (Type: Proceedings Article | Links)

Yulan Liu; Charles Fox; Madina Hasan; Thomas Hain: The Sheffield Wargame Corpus - Day Two and Day Three. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 3833–3837, ISCA, 2016. (Type: Proceedings Article | Links)

Ghada AlHarbi; Thomas Hain: The OpenCourseWare Metadiscourse (OCWMD) Corpus. In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Goggi, Sara; Grobelnik, Marko; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Moreno, Asunción; Odijk, Jan; Piperidis, Stelios (Ed.): Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016, European Language Resources Association (ELRA), 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Oscar Saz; Madina Hasan; Bhusan Chettri; Mortaza Doulaty; Tan Lee; Thomas Hain: The Sheffield language recognition system in NIST LRE 2015. In: Rodríguez-Fuentes, Luis Javier; Lleida, Eduardo (Ed.): Odyssey 2016: The Speaker and Language Recognition Workshop, Bilbao, Spain, June 21-24, 2016, pp. 181–187, ISCA, 2016. (Type: Proceedings Article | Links)

Iñigo Casanueva; Thomas Hain; Mauro Nicolao; Phil D. Green: Using phone features to improve dialogue state tracking generalisation to unseen states. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp. 80–89, The Association for Computational Linguistics, 2016. (Type: Proceedings Article | Links)

Mauro Nicolao; Heidi Christensen; Stuart P. Cunningham; Phil D. Green; Thomas Hain: A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus. In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Goggi, Sara; Grobelnik, Marko; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Moreno, Asunción; Odijk, Jan; Piperidis, Stelios (Ed.): Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016, European Language Resources Association (ELRA), 2016. (Type: Proceedings Article | Links)

2015

Heidi Christensen; Mauro Nicolao; Stuart P. Cunningham; Salil Deena; Phil D. Green; Thomas Hain: Speech-Enabled Environmental Control in an AAL setting for people with Speech Disorders: a Case Study. In: IET International Conference on Technologies for Active and Assisted Living, TechAAL 2015, London, UK, 2015. (Type: Proceedings Article | )

Erfan Loweimi; Jon Barker; Thomas Hain: Emotion Recognition from Speech Signal by Effective Combination of the Generative and Discriminative Models. In: University of Sheffield Engineering Symposium, Sheffield, UK, 2015. (Type: Proceedings Article | )

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Latent Dirichlet Allocation based organisation of broadcast media archives for deep neural network adaptation. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 130–136, IEEE, 2015. (Type: Proceedings Article | Links)

Oscar Saz; Mortaza Doulaty; Salil Deena; Rosanna Milner; Raymond W. M. Ng; Madina Hasan; Yulan Liu; Thomas Hain: The 2015 Sheffield system for transcription of Multi-Genre Broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 624–631, IEEE, 2015. (Type: Proceedings Article | Links)

Rosanna Milner; Oscar Saz; Salil Deena; Mortaza Doulaty; Raymond W. M. Ng; Thomas Hain: The 2015 Sheffield system for longitudinal diarisation of broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 632–638, IEEE, 2015. (Type: Proceedings Article | Links)

Peter Bell; Mark J. F. Gales; Thomas Hain; Jonathan Kilgour; Pierre Lanchantin; Xunying Liu; Andrew McParland; Steve Renals; Oscar Saz; Mirjam Wester; Philip C. Woodland: The MGB challenge: Evaluating multi-genre broadcast media recognition. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 687–693, IEEE, 2015. (Type: Proceedings Article | Links)

Ghada AlHarbi; Thomas Hain: Using Topic Segmentation Models for the Automatic Organisation of MOOCs resources. In: Santos, Olga C.; Boticario, Jesus; Romero, Cristóbal; Pechenizkiy, Mykola; Merceron, Agathe; Mitros, Piotr; Luna, José María; Mihaescu, Marian Cristian; Moreno, Pablo; Hershkovitz, Arnon; Ventura, Sebastián; Desmarais, Michel C. (Ed.): Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 524–527, International Educational Data Mining Society (IEDMS), 2015. (Type: Proceedings Article | Links)

Yulan Liu; Penny Karanasou; Thomas Hain: An investigation into speaker informed DNN front-end for LVCSR. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 4300–4304, IEEE, 2015. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Wilker Aziz; Lucia Specia; Thomas Hain: Quality estimation for ASR k-best list rescoring in spoken language translation. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 5226–5230, IEEE, 2015. (Type: Proceedings Article | Links)

Mauro Nicolao; Amy V. Beeston; Thomas Hain: Automatic assessment of English learner pronunciation using discriminative classifiers. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 5351–5355, IEEE, 2015. (Type: Proceedings Article | Links)
