Publications

Amit Meghanani; Thomas Hain: SCORE: Self-supervised Correspondence Fine-Tuning for Improved Content Representations. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming. (Type: Proceedings Article)

Rehan Ahmad; Muhammad Umar Farooq; Thomas Hain: Progressive Unsupervised Domain Adaptation for ASR Using Ensemble Models and Multi-stage Training. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming. (Type: Proceedings Article)

William Ravenscroft; Stefan Goetze; Thomas Hain: Combining Conformer and Dual-Path-Transformer Networks for Single Channel Noisy Reverberant Speech Separation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming. (Type: Proceedings Article)

George Close; William Ravenscroft; Thomas Hain; Stefan Goetze: MULTI-CMGAN+/+: Leveraging Multi-Objective Speech Quality Metric Prediction for Speech Enhancement. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming. (Type: Proceedings Article)

Rhiannon Mogridge; George Close; Robert Sutherland; Thomas Hain; Jon Barker; Stefan Goetze; Anton Ragni: Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users Using Intermediate ASR Features and Human Memory Models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming. (Type: Proceedings Article)

Amit Meghanani; Thomas Hain: Improving Acoustic Word Embeddings through Correspondence Training of Self-supervised Speech Representations. In: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024), Forthcoming. (Type: Proceedings Article)

Will Ravenscroft; Stefan Goetze; Thomas Hain: On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments. In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2023. (Type: Proceedings Article | Links)

Elaf Islam; Thomas Hain; Protima Nomo Sudro: Simulation of Teacher-Learner Interaction in English Language Pronunciation Learning. In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2023. (Type: Proceedings Article | Links)

Amit Meghanani; Thomas Hain: Deriving Translational Acoustic Sub-Word Embeddings. In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2023. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Rehan Ahmad; Thomas Hain: MUST: A Multilingual Student-Teacher Approach for Low-Resource Speech Recognition. In: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2023. (Type: Proceedings Article | Links)

George Close; Thomas Hain; Stefan Goetze: The Effect of Spoken Language on Speech Enhancement Using Self-Supervised Speech Representation Loss Functions. In: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023. (Type: Proceedings Article | Links)

William Ravenscroft; Stefan Goetze; Thomas Hain: On Data Sampling Strategies for Training Neural Network Speech Separation Models. In: 31st European Signal Processing Conference (EUSIPCO), 2023. (Type: Proceedings Article | Links)

Protima Nomo Sudro; Anton Ragni; Thomas Hain: Adapting pretrained models for adult to child voice conversion. In: 31st European Signal Processing Conference (EUSIPCO), 2023. (Type: Proceedings Article | Links)

Anna Ollerenshaw; Md Asif Jalal; Thomas Hain: Probing Statistical Representations for End-to-End ASR. In: 31st European Signal Processing Conference (EUSIPCO), 2023. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Thomas Hain: Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition. In: Interspeech 2023, 2023. (Type: Proceedings Article | Links)

George Close; William Ravenscroft; Thomas Hain; Stefan Goetze: The University of Sheffield CHiME-7 UDASE Challenge Speech Enhancement System. 2023. (Type: Technical Report | Links)

Elaf Islam; Chanho Park; Thomas Hain: Exploring Speech Representations for Proficiency Assessment in Language Learning. In: 9th Workshop on Speech and Language Technology in Education (SLaTE), 2023. (Type: Proceedings Article | Links)

Rehan Ahmad; Md Asif Jalal; Muhammad Umar Farooq; Anna Ollerenshaw; Thomas Hain: Towards Domain Generalisation in ASR with Elitist Sampling and Ensemble Knowledge Distillation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, 2023. (Type: Proceedings Article | Links)

William Ravenscroft; Stefan Goetze; Thomas Hain: Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation. In: 17th International Workshop on Acoustic Signal Enhancement (IWAENC), 2022. (Type: Proceedings Article | Links)

Madina Hasan; Nicholas Jefferson; Thomas Hain; Jeremy Dawson: Automatic detection of behavioural codes in team interactions. In: Comput. Speech Lang., vol. 74, pp. 101339, 2022. (Type: Journal Article | Links)

Chanho Park; Rehan Ahmad; Thomas Hain: Unsupervised Data Selection for Speech Recognition with Contrastive Loss Ratios. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8587-8591, 2022. (Type: Proceedings Article | Links)

Jose Antonio Lopez Saenz; Thomas Hain: A Model for Assessor Bias in Automatic Pronunciation Assessment. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7267-7271, 2022. (Type: Proceedings Article | Links)

George Close; Thomas Hain; Stefan Goetze: MetricGAN+/-: Increasing Robustness of Noise Reduction on Unseen Data. In: 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, IEEE, 2022. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Thomas Hain: Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition. In: Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, South Korea, September 18 - 22, 2022, ISCA, 2022. (Type: Proceedings Article | Links)

Muhammad Umar Farooq; Darshan Adiga Haniya Narayana; Thomas Hain: Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion. In: Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, Incheon, South Korea, September 18 - 22, 2022, ISCA, 2022. (Type: Proceedings Article | Links)

Thomas Hain; Md Asif Jalal; Anna Ollerenshaw: Insights of Neural Representations in Multi-Banded and Multi-Channel Convolutional Transformers for End-to-End ASR. In: IEEE 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, 2022. (Type: Proceedings Article)

William Ravenscroft; Stefan Goetze; Thomas Hain: Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation. In: IEEE 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, 2022. (Type: Proceedings Article | Links)

William Ravenscroft; Stefan Goetze; Thomas Hain: Att-TasNet: Attending to Encodings in Time-Domain Audio Speech Separation of Noisy, Reverberant Speech Mixtures. In: 2022. (Type: Journal Article | Links)

Jose Antonio Lopez Saenz; Md Asif Jalal; Rosanna Milner; Thomas Hain: Attention Based Model for Segmental Pronunciation Error Detection. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021, Cartagena, Colombia, December 13-17, 2021, pp. 725–732, IEEE, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: H-VECTORS: Improving the robustness in utterance-level speaker embeddings using a hierarchical attention model. In: Neural Networks, vol. 142, pp. 329–339, 2021. (Type: Journal Article | Links)

Korbinian Friedl; Georgios Rizos; Lukas Stappen; Madina Hasan; Lucia Specia; Thomas Hain; Björn W. Schuller: Uncertainty Aware Review Hallucination for Science Article Classification. In: Zong, Chengqing; Xia, Fei; Li, Wenjie; Navigli, Roberto (Ed.): Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, pp. 5004–5009, Association for Computational Linguistics, 2021. (Type: Proceedings Article | Links)

Mingjie Chen; Yanpei Shi; Thomas Hain: Towards Low-Resource Stargan Voice Conversion Using Weight Adaptive Instance Normalization. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 5949–5953, IEEE, 2021. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 6473–6477, IEEE, 2021. (Type: Proceedings Article | Links)

Anna Ollerenshaw; Md Asif Jalal; Thomas Hain: Insights on Neural Representations for End-to-End Speech Recognition. In: Hermansky, Hynek; Cernocký, Honza; Burget, Lukás; Lamel, Lori; Scharenborg, Odette; Motlícek, Petr (Ed.): Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August - 3 September 2021, pp. 4079–4083, ISCA, 2021. (Type: Proceedings Article | Links)

Shengjie Huang; Mingjie Chen; Yanyan Xu; Dengfeng Ke; Thomas Hain: WINVC: One-Shot Voice Conversion with Weight Adaptive Instance Normalization. In: Pham, Duc Nghia; Theeramunkong, Thanaruk; Governatori, Guido; Liu, Fenrong (Ed.): PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part II, pp. 559–573, Springer, 2021. (Type: Proceedings Article | Links)

Jose Antonio Lopez Saenz; Thomas Hain: Use of Speaker Metadata for Improving Automatic Pronunciation Assessment. In: Anke, Luis Espinosa; Martín-Vide, Carlos; Spasic, Irena (Ed.): Statistical Language and Speech Processing - 9th International Conference, SLSP 2021, Cardiff, UK, November 23-25, 2021, Proceedings, pp. 61–72, Springer, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Thomas Hain: Contextual Joint Factor Acoustic Embeddings. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 750–757, IEEE, 2021. (Type: Proceedings Article | Links)

Yanpei Shi; Thomas Hain: Supervised Speaker Embedding De-Mixing in Two-Speaker Environment. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 758–765, IEEE, 2021. (Type: Proceedings Article | Links)

Md Asif Jalal; Rosanna Milner; Thomas Hain; Roger K. Moore: Removing Bias with Residual Mixture of Multi-View Attention for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4084–4088, ISCA, 2020. (Type: Proceedings Article | Links)

Md Asif Jalal; Rosanna Milner; Thomas Hain: Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4113–4117, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: H-Vectors: Utterance-Level Speaker Embedding Using a Hierarchical Attention Model. In: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020, pp. 7579–7583, IEEE, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Speaker Re-Identification with Speaker Dependent Speech Enhancement. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1530–1534, ISCA, 2020. (Type: Proceedings Article | Links)

Lukas Stappen; Georgios Rizos; Madina Hasan; Thomas Hain; Björn W. Schuller: Uncertainty-Aware Machine Support for Paper Reviewing on the Interspeech 2019 Submission Corpus. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1808–1812, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 2992–2996, ISCA, 2020. (Type: Proceedings Article | Links)

Qiang Huang; Thomas Hain: Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4611–4615, ISCA, 2020. (Type: Proceedings Article | Links)

Hardik B. Sailor; Thomas Hain: Multilingual Speech Recognition Using Language-Specific Phoneme Recognition as Auxiliary Task for Indian Languages. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4756–4760, ISCA, 2020. (Type: Proceedings Article | Links)

Mingjie Chen; Thomas Hain: Unsupervised Acoustic Unit Representation Learning for Voice Conversion Using WaveNet Auto-Encoders. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4866–4870, ISCA, 2020. (Type: Proceedings Article | Links)

Yanpei Shi; Qiang Huang; Thomas Hain: Robust Speaker Recognition Using Speech Enhancement And Attention Model. In: Lee, Kong-Aik; Koshinaka, Takafumi; Shinoda, Koichi (Ed.): Odyssey 2020: The Speaker and Language Recognition Workshop, 1-5 November 2020, Tokyo, Japan, pp. 451–458, ISCA, 2020. (Type: Proceedings Article | Links)

Rosanna Milner; Md Asif Jalal; Raymond W. M. Ng; Thomas Hain: A Cross-Corpus Study on Speech Emotion Recognition. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 304–311, IEEE, 2019. (Type: Proceedings Article | Links)

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. In: IEEE ACM Trans. Audio Speech Lang. Process., vol. 27, no. 3, pp. 572–582, 2019. (Type: Journal Article | Links)