Publications

Md Asif Jalal; Roger K. Moore; Thomas Hain: Spatio-Temporal Context Modelling for Speech Emotion Classification. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 853–859, IEEE, 2019. (Type: Proceedings Article | Links)

Hardik B. Sailor; Salil Deena; Md Asif Jalal; Rasa Lileikyte; Thomas Hain: Unsupervised Adaptation of Acoustic Models for ASR Using Utterance-Level Embeddings from Squeeze and Excitation Networks. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 980–987, IEEE, 2019. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 584–588, ISCA, 2019. (Type: Proceedings Article | Links)

Md Asif Jalal; Erfan Loweimi; Roger K. Moore; Thomas Hain: Learning Temporal Clusters Using Capsule Routing for Speech Emotion Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1701–1705, ISCA, 2019. (Type: Proceedings Article | Links)

Mortaza Doulaty; Thomas Hain: Latent Dirichlet Allocation Based Acoustic Data Selection for Automatic Speech Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 3228–3232, ISCA, 2019. (Type: Proceedings Article | Links)

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Exploring the Use of Group Delay for Generalised VTS Based Noise Compensation. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018, pp. 4824–4828, IEEE, 2018. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: On the Usefulness of the Speech Phase Spectrum for Pitch Extraction. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 696–700, ISCA, 2018. (Type: Proceedings Article | Links)

Mauro Nicolao; Michiel Sanders; Thomas Hain: Improved Acoustic Modelling for Automatic Literacy Assessment of Children. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 1666–1670, ISCA, 2018. (Type: Proceedings Article | Links)

Rahhal Errattahi; Salil Deena; Asmaa El Hannani; Hassan Ouahmane; Thomas Hain: Improving ASR Error Detection with RNNLM Adaptation. In: 2018 IEEE Spoken Language Technology Workshop, SLT 2018, Athens, Greece, December 18-21, 2018, pp. 190–196, IEEE, 2018. (Type: Proceedings Article | Links)

Rosanna Milner; Thomas Hain: DNN approach to speaker diarisation using speaker channels. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 4925–4929, IEEE, 2017. (Type: Proceedings Article | Links)

Oscar Saz; Thomas Hain: Acoustic adaptation to dynamic background conditions with asynchronous transformations. In: Comput. Speech Lang., vol. 41, pp. 180–194, 2017. (Type: Journal Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Thomas Hain: Unsupervised crosslingual adaptation of tokenisers for spoken language recognition. In: Comput. Speech Lang., vol. 46, pp. 327–342, 2017. (Type: Journal Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Exploring the use of acoustic embeddings in neural machine translation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp. 450–457, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Statistical normalisation of phase-based feature representation for robust speech recognition. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5310–5314, IEEE, 2017. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Alvin C. M. Kwan; Tan Lee; Thomas Hain: Shefce: A Cantonese-English bilingual speech corpus for pronunciation assessment. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5825–5829, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Oscar Saz; Thomas Hain: Robust Source-Filter Separation of Speech Signal in the Phase Domain. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 414–418, ISCA, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Channel Compensation in the Generalised Vector Taylor Series Approach to Robust ASR. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2466–2470, ISCA, 2017. (Type: Proceedings Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2715–2719, ISCA, 2017. (Type: Proceedings Article | Links)

Chenhao Wu; Raymond W. M. Ng; Oscar Saz; Thomas Hain: Analysing acoustic model changes for active learning in automatic speech recognition. In: International Conference on Systems, Signals and Image Processing, IWSSIP 2017, Poznań, Poland, May 22-24, 2017, pp. 1–5, IEEE, 2017. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: Groupwise learning for ASR k-best list reranking in spoken language translation. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. (Type: Proceedings Article)

Rosanna Milner: Using deep neural networks for speaker diarisation. University of Sheffield, UK, 2016. (Type: PhD Thesis | Links)

Rosanna Milner; Thomas Hain: Segment-oriented evaluation of speaker diarisation performance. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 5460–5464, IEEE, 2016. (Type: Proceedings Article | Links)

Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu: webASR 2 - Improved Cloud Based Speech Technology. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1613–1617, ISCA, 2016. (Type: Proceedings Article | Links)

Rosanna Milner; Thomas Hain: DNN-Based Speaker Clustering for Speaker Diarisation. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2185–2189, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: Groupwise learning for ASR k-best list reranking in spoken language translation. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 6120–6124, IEEE, 2016. (Type: Proceedings Article | Links)

Sarah Al-Shareef; Thomas Hain: Colloquialising Modern Standard Arabic Text for Improved Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1345–1349, ISCA, 2016. (Type: Proceedings Article | Links)

Julia Olcoz; Oscar Saz; Thomas Hain: Error Correction in Lightly Supervised Alignment of Broadcast Subtitles. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2110–2114, ISCA, 2016. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Automatic Genre and Show Identification of Broadcast Media. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2115–2119, ISCA, 2016. (Type: Proceedings Article | Links)

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2343–2347, ISCA, 2016. (Type: Proceedings Article | Links)

Iñigo Casanueva; Thomas Hain; Phil D. Green: Improving Generalisation to New Speakers in Spoken Dialogue State Tracking. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2726–2730, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Bhusan Chettri; Thomas Hain: Combining Weak Tokenisers for Phonotactic Language Recognition in a Resource-Constrained Setting. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2939–2943, ISCA, 2016. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Use of Generalised Nonlinearity in Vector Taylor Series Noise Compensation for Robust Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 3798–3802, ISCA, 2016. (Type: Proceedings Article | Links)

Yulan Liu; Charles Fox; Madina Hasan; Thomas Hain: The Sheffield Wargame Corpus - Day Two and Day Three. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 3833–3837, ISCA, 2016. (Type: Proceedings Article | Links)

Ghada AlHarbi; Thomas Hain: The OpenCourseWare Metadiscourse (OCWMD) Corpus. In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Goggi, Sara; Grobelnik, Marko; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Moreno, Asunción; Odijk, Jan; Piperidis, Stelios (Ed.): Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016, European Language Resources Association (ELRA), 2016. (Type: Proceedings Article | Links)

Mauro Nicolao; Heidi Christensen; Stuart P. Cunningham; Phil D. Green; Thomas Hain: A Framework for Collecting Realistic Recordings of Dysarthric Speech - the homeService Corpus. In: Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Goggi, Sara; Grobelnik, Marko; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Moreno, Asunción; Odijk, Jan; Piperidis, Stelios (Ed.): Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, Portorož, Slovenia, May 23-28, 2016, European Language Resources Association (ELRA), 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Oscar Saz; Madina Hasan; Bhusan Chettri; Mortaza Doulaty; Tan Lee; Thomas Hain: The Sheffield language recognition system in NIST LRE 2015. In: Rodríguez-Fuentes, Luis Javier; Lleida, Eduardo (Ed.): Odyssey 2016: The Speaker and Language Recognition Workshop, Bilbao, Spain, June 21-24, 2016, pp. 181–187, ISCA, 2016. (Type: Proceedings Article | Links)

Iñigo Casanueva; Thomas Hain; Mauro Nicolao; Phil D. Green: Using phone features to improve dialogue state tracking generalisation to unseen states. In: Proceedings of the SIGDIAL 2016 Conference, The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 13-15 September 2016, Los Angeles, CA, USA, pp. 80–89, The Association for Computational Linguistics, 2016. (Type: Proceedings Article | Links)

Heidi Christensen; Mauro Nicolao; Stuart P. Cunningham; Salil Deena; Phil D. Green; Thomas Hain: Speech-Enabled Environmental Control in an AAL setting for people with Speech Disorders: a Case Study. In: IET International Conference on Technologies for Active and Assisted Living, TechAAL 2015, London, UK, 2015. (Type: Proceedings Article)

Kashif Shah; Raymond W. M. Ng; Fethi Bougares; Lucia Specia: Investigating continuous space language models for machine translation quality estimation. In: 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015. (Type: Proceedings Article)

Yulan Liu; Penny Karanasou; Thomas Hain: An Investigation Into Speaker Informed DNN Front-end for LVCSR. In: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, 2015. (Type: Proceedings Article)

Erfan Loweimi; Jon Barker; Thomas Hain: Emotion Recognition from Speech Signal by Effective Combination of the Generative and Discriminative Models. In: University of Sheffield Engineering Symposium, Sheffield, UK, 2015. (Type: Proceedings Article)

David Martínez; Eduardo Lleida; Phil D. Green; Heidi Christensen; Alfonso Ortega; Antonio Miguel: Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace. In: ACM Transactions on Accessible Computing (TACCESS), vol. 6, no. 3, pp. 10, 2015. (Type: Journal Article)

Oscar Saz; Mortaza Doulaty; Salil Deena; Rosanna Milner; Raymond W. M. Ng; Madina Hasan; Yulan Liu; Thomas Hain: The 2015 Sheffield system for transcription of Multi-Genre Broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 624–631, IEEE, 2015. (Type: Proceedings Article | Links)

Rosanna Milner; Oscar Saz; Salil Deena; Mortaza Doulaty; Raymond W. M. Ng; Thomas Hain: The 2015 Sheffield system for longitudinal diarisation of broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 632–638, IEEE, 2015. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Latent Dirichlet Allocation based organisation of broadcast media archives for deep neural network adaptation. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 130–136, IEEE, 2015. (Type: Proceedings Article | Links)

Peter Bell; Mark J. F. Gales; Thomas Hain; Jonathan Kilgour; Pierre Lanchantin; Xunying Liu; Andrew McParland; Steve Renals; Oscar Saz; Mirjam Wester; Philip C. Woodland: The MGB challenge: Evaluating multi-genre broadcast media recognition. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 687–693, IEEE, 2015. (Type: Proceedings Article | Links)

Ghada AlHarbi; Thomas Hain: Using Topic Segmentation Models for the Automatic Organisation of MOOCs resources. In: Santos, Olga C.; Boticario, Jesus; Romero, Cristóbal; Pechenizkiy, Mykola; Merceron, Agathe; Mitros, Piotr; Luna, José María; Mihaescu, Marian Cristian; Moreno, Pablo; Hershkovitz, Arnon; Ventura, Sebastián; Desmarais, Michel C. (Ed.): Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, pp. 524–527, International Educational Data Mining Society (IEDMS), 2015. (Type: Proceedings Article | Links)

Yulan Liu; Penny Karanasou; Thomas Hain: An investigation into speaker informed DNN front-end for LVCSR. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 4300–4304, IEEE, 2015. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Kashif Shah; Wilker Aziz; Lucia Specia; Thomas Hain: Quality estimation for ASR k-best list rescoring in spoken language translation. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015, pp. 5226–5230, IEEE, 2015. (Type: Proceedings Article | Links)