Publications

131 entries « 1 of 3 »

2022

Madina Hasan; Nicholas Jefferson; Thomas Hain; Jeremy Dawson: Automatic detection of behavioural codes in team interactions. In: Comput. Speech Lang., vol. 74, pp. 101339, 2022. (Type: Journal Article | Links)
Chanho Park; Rehan Ahmad; Thomas Hain: Unsupervised Data Selection for Speech Recognition with Contrastive Loss Ratios. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8587-8591, 2022. (Type: Inproceedings | Links)
Jose Antonio Lopez Saenz; Thomas Hain: A Model for Assessor Bias in Automatic Pronunciation Assessment. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7267-7271, 2022. (Type: Inproceedings | Links)
George Close; Thomas Hain; Stefan Goetze: MetricGAN+/-: Increasing Robustness of Noise Reduction on Unseen Data. In: 30th European Signal Processing Conference, EUSIPCO 2022, Belgrade, Serbia, August 29 - September 2, 2022, IEEE, 2022. (Type: Inproceedings | )

2021

Jose Antonio Lopez Saenz; Md Asif Jalal; Rosanna Milner; Thomas Hain: Attention Based Model for Segmental Pronunciation Error Detection. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021, Cartagena, Colombia, December 13-17, 2021, pp. 725–732, IEEE, 2021. (Type: Inproceedings | Links)
Yanpei Shi; Qiang Huang; Thomas Hain: H-VECTORS: Improving the robustness in utterance-level speaker embeddings using a hierarchical attention model. In: Neural Networks, vol. 142, pp. 329–339, 2021. (Type: Journal Article | Links)
Korbinian Friedl; Georgios Rizos; Lukas Stappen; Madina Hasan; Lucia Specia; Thomas Hain; Björn W. Schuller: Uncertainty Aware Review Hallucination for Science Article Classification. In: Zong, Chengqing; Xia, Fei; Li, Wenjie; Navigli, Roberto (Ed.): Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, Online Event, August 1-6, 2021, pp. 5004–5009, Association for Computational Linguistics, 2021. (Type: Inproceedings | Links)
Mingjie Chen; Yanpei Shi; Thomas Hain: Towards Low-Resource Stargan Voice Conversion Using Weight Adaptive Instance Normalization. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 5949–5953, IEEE, 2021. (Type: Inproceedings | Links)
Qiang Huang; Thomas Hain: Improving Audio Anomalies Recognition Using Temporal Convolutional Attention Networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021, pp. 6473–6477, IEEE, 2021. (Type: Inproceedings | Links)
Anna Ollerenshaw; Md Asif Jalal; Thomas Hain: Insights on Neural Representations for End-to-End Speech Recognition. In: Hermansky, Hynek; Cernocký, Honza; Burget, Lukás; Lamel, Lori; Scharenborg, Odette; Motlícek, Petr (Ed.): Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August - 3 September 2021, pp. 4079–4083, ISCA, 2021. (Type: Inproceedings | Links)
Shengjie Huang; Mingjie Chen; Yanyan Xu; Dengfeng Ke; Thomas Hain: WINVC: One-Shot Voice Conversion with Weight Adaptive Instance Normalization. In: Pham, Duc Nghia; Theeramunkong, Thanaruk; Governatori, Guido; Liu, Fenrong (Ed.): PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part II, pp. 559–573, Springer, 2021. (Type: Inproceedings | Links)
Jose Antonio Lopez Saenz; Thomas Hain: Use of Speaker Metadata for Improving Automatic Pronunciation Assessment. In: Anke, Luis Espinosa; Martín-Vide, Carlos; Spasic, Irena (Ed.): Statistical Language and Speech Processing - 9th International Conference, SLSP 2021, Cardiff, UK, November 23-25, 2021, Proceedings, pp. 61–72, Springer, 2021. (Type: Inproceedings | Links)
Yanpei Shi; Thomas Hain: Contextual Joint Factor Acoustic Embeddings. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 750–757, IEEE, 2021. (Type: Inproceedings | Links)
Yanpei Shi; Thomas Hain: Supervised Speaker Embedding De-Mixing in Two-Speaker Environment. In: IEEE Spoken Language Technology Workshop, SLT 2021, Shenzhen, China, January 19-22, 2021, pp. 758–765, IEEE, 2021. (Type: Inproceedings | Links)

2020

Md Asif Jalal; Rosanna Milner; Thomas Hain; Roger K. Moore: Removing Bias with Residual Mixture of Multi-View Attention for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4084–4088, ISCA, 2020. (Type: Inproceedings | Links)
Md Asif Jalal; Rosanna Milner; Thomas Hain: Empirical Interpretation of Speech Emotion Perception with Attention Based Model for Speech Emotion Recognition. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4113–4117, ISCA, 2020. (Type: Inproceedings | Links)
Yanpei Shi; Qiang Huang; Thomas Hain: H-Vectors: Utterance-Level Speaker Embedding Using a Hierarchical Attention Model. In: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020, pp. 7579–7583, IEEE, 2020. (Type: Inproceedings | Links)
Yanpei Shi; Qiang Huang; Thomas Hain: Speaker Re-Identification with Speaker Dependent Speech Enhancement. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1530–1534, ISCA, 2020. (Type: Inproceedings | Links)
Lukas Stappen; Georgios Rizos; Madina Hasan; Thomas Hain; Björn W. Schuller: Uncertainty-Aware Machine Support for Paper Reviewing on the Interspeech 2019 Submission Corpus. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 1808–1812, ISCA, 2020. (Type: Inproceedings | Links)
Yanpei Shi; Qiang Huang; Thomas Hain: Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 2992–2996, ISCA, 2020. (Type: Inproceedings | Links)
Qiang Huang; Thomas Hain: Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4611–4615, ISCA, 2020. (Type: Inproceedings | Links)
Hardik B. Sailor; Thomas Hain: Multilingual Speech Recognition Using Language-Specific Phoneme Recognition as Auxiliary Task for Indian Languages. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4756–4760, ISCA, 2020. (Type: Inproceedings | Links)
Mingjie Chen; Thomas Hain: Unsupervised Acoustic Unit Representation Learning for Voice Conversion Using WaveNet Auto-Encoders. In: Meng, Helen; Xu, Bo; Zheng, Thomas Fang (Ed.): Interspeech 2020, 21st Annual Conference of the International Speech Communication Association, Virtual Event, Shanghai, China, 25-29 October 2020, pp. 4866–4870, ISCA, 2020. (Type: Inproceedings | Links)
Yanpei Shi; Qiang Huang; Thomas Hain: Robust Speaker Recognition Using Speech Enhancement And Attention Model. In: Lee, Kong-Aik; Koshinaka, Takafumi; Shinoda, Koichi (Ed.): Odyssey 2020: The Speaker and Language Recognition Workshop, 1-5 November 2020, Tokyo, Japan, pp. 451–458, ISCA, 2020. (Type: Inproceedings | Links)

2019

Rosanna Milner; Md Asif Jalal; Raymond W. M. Ng; Thomas Hain: A Cross-Corpus Study on Speech Emotion Recognition. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 304–311, IEEE, 2019. (Type: Inproceedings | Links)
Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. In: IEEE ACM Trans. Audio Speech Lang. Process., vol. 27, no. 3, pp. 572–582, 2019. (Type: Journal Article | Links)
Md Asif Jalal; Roger K. Moore; Thomas Hain: Spatio-Temporal Context Modelling for Speech Emotion Classification. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 853–859, IEEE, 2019. (Type: Inproceedings | Links)
Hardik B. Sailor; Salil Deena; Md Asif Jalal; Rasa Lileikyte; Thomas Hain: Unsupervised Adaptation of Acoustic Models for ASR Using Utterance-Level Embeddings from Squeeze and Excitation Networks. In: IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pp. 980–987, IEEE, 2019. (Type: Inproceedings | Links)
Qiang Huang; Thomas Hain: Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 584–588, ISCA, 2019. (Type: Inproceedings | Links)
Md Asif Jalal; Erfan Loweimi; Roger K. Moore; Thomas Hain: Learning Temporal Clusters Using Capsule Routing for Speech Emotion Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1701–1705, ISCA, 2019. (Type: Inproceedings | Links)
Mortaza Doulaty; Thomas Hain: Latent Dirichlet Allocation Based Acoustic Data Selection for Automatic Speech Recognition. In: Kubin, Gernot; Kacic, Zdravko (Ed.): Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 3228–3232, ISCA, 2019. (Type: Inproceedings | Links)

2018

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)
Erfan Loweimi; Jon Barker; Thomas Hain: Exploring the Use of Group Delay for Generalised VTS Based Noise Compensation. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018, pp. 4824–4828, IEEE, 2018. (Type: Inproceedings | Links)
Erfan Loweimi; Jon Barker; Thomas Hain: On the Usefulness of the Speech Phase Spectrum for Pitch Extraction. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 696–700, ISCA, 2018. (Type: Inproceedings | Links)
Mauro Nicolao; Michiel Sanders; Thomas Hain: Improved Acoustic Modelling for Automatic Literacy Assessment of Children. In: Yegnanarayana, B. (Ed.): Interspeech 2018, 19th Annual Conference of the International Speech Communication Association, Hyderabad, India, 2-6 September 2018, pp. 1666–1670, ISCA, 2018. (Type: Inproceedings | Links)
Rahhal Errattahi; Salil Deena; Asmaa El Hannani; Hassan Ouahmane; Thomas Hain: Improving ASR Error Detection with RNNLM Adaptation. In: 2018 IEEE Spoken Language Technology Workshop, SLT 2018, Athens, Greece, December 18-21, 2018, pp. 190–196, IEEE, 2018. (Type: Inproceedings | Links)

2017

Rosanna Milner; Thomas Hain: DNN approach to speaker diarisation using speaker channels. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 4925–4929, IEEE, 2017. (Type: Inproceedings | Links)
Oscar Saz; Thomas Hain: Acoustic adaptation to dynamic background conditions with asynchronous transformations. In: Comput. Speech Lang., vol. 41, pp. 180–194, 2017. (Type: Journal Article | Links)
Raymond W. M. Ng; Mauro Nicolao; Thomas Hain: Unsupervised crosslingual adaptation of tokenisers for spoken language recognition. In: Comput. Speech Lang., vol. 46, pp. 327–342, 2017. (Type: Journal Article | Links)
Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Exploring the use of acoustic embeddings in neural machine translation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017, Okinawa, Japan, December 16-20, 2017, pp. 450–457, IEEE, 2017. (Type: Inproceedings | Links)
Erfan Loweimi; Jon Barker; Thomas Hain: Statistical normalisation of phase-based feature representation for robust speech recognition. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5310–5314, IEEE, 2017. (Type: Inproceedings | Links)
Raymond W. M. Ng; Alvin C. M. Kwan; Tan Lee; Thomas Hain: Shefce: A Cantonese-English bilingual speech corpus for pronunciation assessment. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017, pp. 5825–5829, IEEE, 2017. (Type: Inproceedings | Links)
Erfan Loweimi; Jon Barker; Oscar Saz; Thomas Hain: Robust Source-Filter Separation of Speech Signal in the Phase Domain. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 414–418, ISCA, 2017. (Type: Inproceedings | Links)
Erfan Loweimi; Jon Barker; Thomas Hain: Channel Compensation in the Generalised Vector Taylor Series Approach to Robust ASR. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2466–2470, ISCA, 2017. (Type: Inproceedings | Links)
Salil Deena; Raymond W. M. Ng; Pranava Swaroop Madhyastha; Lucia Specia; Thomas Hain: Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 2715–2719, ISCA, 2017. (Type: Inproceedings | Links)
Chenhao Wu; Raymond W. M. Ng; Oscar Saz; Thomas Hain: Analysing acoustic model changes for active learning in automatic speech recognition. In: International Conference on Systems, Signals and Image Processing, IWSSIP 2017, Poznań, Poland, May 22-24, 2017, pp. 1–5, IEEE, 2017. (Type: Inproceedings | Links)

2016

Raymond W. M. Ng; Kashif Shah; Lucia Specia; Thomas Hain: Groupwise learning for ASR k-best list reranking in spoken langauge translation. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. (Type: Inproceedings | )
Rosanna Milner: Using deep neural networks for speaker diarisation. University of Sheffield, UK, 2016. (Type: PhD Thesis | Links)
Rosanna Milner; Thomas Hain: Segment-oriented evaluation of speaker diarisation performance. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016, pp. 5460–5464, IEEE, 2016. (Type: Inproceedings | Links)
Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu: webASR 2 - Improved Cloud Based Speech Technology. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1613–1617, ISCA, 2016. (Type: Inproceedings | Links)
131 entries « 1 of 3 »

Back to Top