Publications

Muhammad Umar Farooq; Thomas Hain: Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition. In: Interspeech 2023, 2023. (Type: Proceedings Article | Links)

Rehan Ahmad; Md Asif Jalal; Muhammad Umar Farooq; Anna Ollerenshaw; Thomas Hain: Towards Domain Generalisation in ASR with Elitist Sampling and Ensemble Knowledge Distillation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023, IEEE, 2023. (Type: Proceedings Article | Links)

William Ravenscroft; Stefan Goetze; Thomas Hain: Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation. In: 2022 17th International Workshop on Acoustic Signal Enhancement (IWAENC), 2022. (Type: Proceedings Article | Links)

Madina Hasan; Nicholas Jefferson; Thomas Hain; Jeremy Dawson: Automatic detection of behavioural codes in team interactions. In: Comput. Speech Lang., vol. 74, pp. 101339, 2022. (Type: Journal Article | Links)

Chanho Park; Rehan Ahmad; Thomas Hain: Unsupervised Data Selection for Speech Recognition with Contrastive Loss Ratios. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8587-8591, 2022. (Type: Proceedings Article | Links)

Jose Antonio Lopez Saenz; Thomas Hain: A Model for Assessor Bias in Automatic Pronunciation Assessment. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7267-7271, 2022. (Type: Proceedings Article | Links)

George Close; Thomas Hain; Stefan Goetze: MetricGAN+/-: Increasing Robustness of Noise Reduction on Unseen Data. In: 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, IEEE, 2022. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Thomas Hain: Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition. In: Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, South Korea, September 18 - 22, 2022, ISCA, 2022. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Darshan Adiga Haniya Narayana; Thomas Hain: Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion. In: Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, South Korea, September 18 - 22, 2022, ISCA, 2022. (Type: Proceedings Article | Links)

Thomas Hain; Md Asif Jalal; Anna Ollerenshaw: Insights of Neural Representations in Multi-Banded and Multi-Channel Convolutional Transformers for End-to-End ASR. In: IEEE 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, 2022. (Type: Proceedings Article)

William Ravenscroft; Stefan Goetze; Thomas Hain: Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation. In: IEEE 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, 2022. (Type: Proceedings Article | Links)

William Ravenscroft; Stefan Goetze; Thomas Hain: Att-TasNet: Attending to Encodings in Time-Domain Audio Speech Separation of Noisy, Reverberant Speech Mixtures. In: 2022. (Type: Journal Article | Links)

Jose Antonio Lopez Saenz; Md Asif Jalal; Rosanna Milner; Thomas Hain: Attention Based Model for Segmental Pronunciation Error Detection. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021, Cartagena, Colombia, December 13-17, 2021, pp. 725–732, IEEE, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: H-VECTORS: Improving the robustness in utterance-level speaker embeddings using a hierarchical attention model. In: Neural Networks, vol. 142, pp. 329–339, 2021. (Type: Journal Article | Links)

Korbinian Friedl; Georgios Rizos; Lukas Stappen; Madina Hasan; Lucia Specia; Thomas Hain; Björn W. Schuller: Uncertainty Aware Review Hallucination for Science Article Classification. In: Zong, Chengqing; Xia, Fei; Li, Wenjie; Navigli, Roberto (Ed.): Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, pp. 5004–5009, Association for Computational Linguistics, 2021. (Type: Proceedings Article | Links)

Mingjie Chen; Yanpei Shi; Thomas Hain: Towards Low-Resource StarGAN Voice Conversion Using Weight Adaptive Instance Normalization. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 5949–5953, IEEE, 2021. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 6473–6477, IEEE, 2021. (Type: Proceedings Article | Links)

Anna Ollerenshaw; Md Asif Jalal; Thomas Hain: Insights on Neural Representations for End-to-End Speech Recognition. In: Hermansky, Hynek; Cernocký, Honza; Burget, Lukás; Lamel, Lori; Scharenborg, Odette; Motlícek, Petr (Ed.): Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August - 3 September 2021, pp. 4079–4083, ISCA, 2021. (Type: Proceedings Article | Links)

Shengjie Huang; Mingjie Chen; Yanyan Xu; Dengfeng Ke; Thomas Hain: WINVC: One-Shot Voice Conversion with Weight Adaptive Instance Normalization. In: Pham, Duc Nghia; Theeramunkong, Thanaruk; Governatori, Guido; Liu, Fenrong (Ed.): PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part II, pp. 559–573, Springer, 2021. (Type: Proceedings Article | Links)

Jose Antonio Lopez Saenz; Thomas Hain: Use of Speaker Metadata for Improving Automatic Pronunciation Assessment. In: Anke, Luis Espinosa; Martín-Vide, Carlos; Spasic, Irena (Ed.): Statistical Language and Speech Processing - 9th International Conference, SLSP 2021, Cardiff, UK, November 23-25, 2021, Proceedings, pp. 61–72, Springer, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Thomas Hain: Contextual Joint Factor Acoustic Embeddings. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 750–757, IEEE, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Thomas Hain: Supervised Speaker Embedding De-Mixing in Two-Speaker Environment. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 758–765, IEEE, 2021. (Type: Proceedings Article | Links)

Md Asif Jalal; Rosanna Milner; Thomas Hain; Roger K. Moore: Removing Bias with Residual Mixture of Multi-View Attention for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4084–4088, ISCA, 2020. (Type: Proceedings Article | Links)

Md Asif Jalal; Rosanna Milner; Thomas Hain: Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4113–4117, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: H-Vectors: Utterance-Level Speaker Embedding Using a Hierarchical Attention Model. In: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020, pp. 7579–7583, IEEE, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Speaker Re-Identification with Speaker Dependent Speech Enhancement. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1530–1534, ISCA, 2020. (Type: Proceedings Article | Links)

Lukas Stappen; Georgios Rizos; Madina Hasan; Thomas Hain; Björn W. Schuller: Uncertainty-Aware Machine Support for Paper Reviewing on the Interspeech 2019 Submission Corpus. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1808–1812, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 2992–2996, ISCA, 2020. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4611–4615, ISCA, 2020. (Type: Proceedings Article | Links)

Hardik B. Sailor; Thomas Hain: Multilingual Speech Recognition Using Language-Specific Phoneme Recognition as Auxiliary Task for Indian Languages. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4756–4760, ISCA, 2020. (Type: Proceedings Article | Links)

Mingjie Chen; Thomas Hain: Unsupervised Acoustic Unit Representation Learning for Voice Conversion Using WaveNet Auto-Encoders. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4866–4870, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Robust Speaker Recognition Using Speech Enhancement And Attention Model. In: Lee, Kong-Aik; Koshinaka, Takafumi; Shinoda, Koichi (Ed.): Odyssey 2020: The Speaker and Language Recognition Workshop, 1-5 November 2020, Tokyo, Japan, pp. 451–458, ISCA, 2020. (Type: Proceedings Article | Links)

Rosanna Milner; Md Asif Jalal; Raymond W. M. Ng; Thomas Hain: A Cross-Corpus Study on Speech Emotion Recognition. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 304–311, IEEE, 2019. (Type: Proceedings Article | Links)

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. In: IEEE ACM Trans. Audio Speech Lang. Process., vol. 27, no. 3, pp. 572–582, 2019. (Type: Journal Article | Links)

Md Asif Jalal; Roger K. Moore; Thomas Hain: Spatio-Temporal Context Modelling for Speech Emotion Classification. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 853–859, IEEE, 2019. (Type: Proceedings Article | Links)

Hardik B. Sailor; Salil Deena; Md Asif Jalal; Rasa Lileikyte; Thomas Hain: Unsupervised Adaptation of Acoustic Models for ASR Using Utterance-Level Embeddings from Squeeze and Excitation Networks. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 980–987, IEEE, 2019. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 584–588, ISCA, 2019. (Type: Proceedings Article | Links)

Md Asif Jalal; Erfan Loweimi; Roger K. Moore; Thomas Hain: Learning Temporal Clusters Using Capsule Routing for Speech Emotion Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1701–1705, ISCA, 2019. (Type: Proceedings Article | Links)

Mortaza Doulaty; Thomas Hain: Latent Dirichlet Allocation Based Acoustic Data Selection for Automatic Speech Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 3228–3232, ISCA, 2019. (Type: Proceedings Article | Links)

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Exploring the Use of Group Delay for Generalised VTS Based Noise Compensation. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018, pp. 4824–4828, IEEE, 2018. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: On the Usefulness of the Speech Phase Spectrum for Pitch Extraction. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 696–700, ISCA, 2018. (Type: Proceedings Article | Links)

Mauro Nicolao; Michiel Sanders; Thomas Hain: Improved Acoustic Modelling for Automatic Literacy Assessment of Children. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 1666–1670, ISCA, 2018. (Type: Proceedings Article | Links)

Rahhal Errattahi; Salil Deena; Asmaa El Hannani; Hassan Ouahmane; Thomas Hain: Improving ASR Error Detection with RNNLM Adaptation. In: 2018 IEEE Spoken Language Technology Workshop, SLT 2018, Athens, Greece, December 18-21, 2018, pp. 190–196, IEEE, 2018. (Type: Proceedings Article | Links)

Rosanna Milner; Thomas Hain: DNN approach to speaker diarisation using speaker channels. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 4925–4929, IEEE, 2017. (Type: Proceedings Article | Links)

Oscar Saz; Thomas Hain: Acoustic adaptation to dynamic background conditions with asynchronous transformations. In: Comput. Speech Lang., vol. 41, pp. 180–194, 2017. (Type: Journal Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Thomas Hain: Unsupervised crosslingual adaptation of tokenisers for spoken language recognition. In: Comput. Speech Lang., vol. 46, pp. 327–342, 2017. (Type: Journal Article | Links)

Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Exploring the use of acoustic embeddings in neural machine translation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp. 450–457, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Thomas Hain: Statistical normalisation of phase-based feature representation for robust speech recognition. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5310–5314, IEEE, 2017. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Alvin C. M. Kwan; Tan Lee; Thomas Hain: Shefce: A Cantonese-English bilingual speech corpus for pronunciation assessment. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5825–5829, IEEE, 2017. (Type: Proceedings Article | Links)