Machine Intelligence for Natural Interfaces (MINI) Group

MINI Group

The Machine Intelligence for Natural Interfaces (MINI) group is a research group in the Department of Computer Science at the University of Sheffield, UK, which is known for its world-leading research groups in machine learning, natural language and speech processing. MINI is part of the Speech and Hearing (SPandH) group and is connected to CATCH and many other research groups in the University.

Our research focuses on intelligent speech and multi-modal interfaces for the next generation of intelligent systems that can interact with people in natural ways. These interfaces must not only recognise events in a complex world, but also grasp meaning and react in a comprehensible form. Our work is interdisciplinary, combining engineering, machine learning, linguistics, phonetics, vision, language understanding and artificial intelligence. Application areas include media, mobile computing, health, knowledge organization and education.

Latest Publications

  • P. Bell, M. Gales, T. Hain, J. Kilgour, P. Lanchantin, A. Liu, A. McParland, S. Renals, O. Saz, M. Wester, and P. Woodland, “The MGB Challenge: Evaluating Multi-genre Broadcast Media Recognition,” in Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, 2015.
    [Bibtex]
    @inproceedings{Bell_ASRU,
    address = {Scottsdale, AZ},
    author = {Peter Bell and Mark Gales and Thomas Hain and Jonathan Kilgour and Pierre Lanchantin and Andrew Liu and Andrew McParland and Steve Renals and Oscar Saz and Mirjam Wester and Phil Woodland},
    booktitle = {{Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)}},
    project = {nst},
    title = {{The MGB Challenge: Evaluating Multi-genre Broadcast Media Recognition}},
    year = {2015}
    }
  • [PDF] I. Casanueva, T. Hain, H. Christensen, R. Marxer, and P. Green, “Knowledge transfer between speakers for personalised dialogue management,” in Proceedings of SIGDial, Prague, Czech Republic, 2015.
    [Bibtex]
    @inproceedings{casanueva:15,
    address = {Prague, Czech Republic},
    author = {Inigo Casanueva and Thomas Hain and Heidi Christensen and Ricard Marxer and Phil Green},
    booktitle = {{Proceedings of SIGDial}},
    pdf = {http://staffwww.dcs.shef.ac.uk/people/i.casanueva/pdfs/SIGDIAL15_casanueva.pdf},
    project = {nst-homeService},
    title = {{Knowledge transfer between speakers for personalised dialogue management}},
    year = {2015}
    }
  • [PDF] H. Christensen, M. Nicolao, S. Cunningham, S. Deena, P. Green, and T. Hain, “Speech-Enabled Environmental Control in an AAL setting for people with Speech Disorders: a Case Study,” in IET International Conference on Technologies for Active and Assisted Living, TechAAL 2015, London, UK, 2015.
    [Bibtex]
    @inproceedings{christensen_techaal15,
    address = {London, UK},
    author = {Christensen, Heidi and Nicolao, Mauro and Cunningham, Stuart and Deena, Salil and Green, Phil and Hain, Thomas},
    booktitle = {{IET International Conference on Technologies for Active and Assisted Living, TechAAL 2015}},
    project = {nst,nst-homeservice},
    title = {{Speech-Enabled Environmental Control in an AAL setting for people with Speech Disorders: a Case Study}},
    year = {2015}
    }
  • [PDF] M. Doulaty, O. Saz, and T. Hain, “Data-selective Transfer Learning for Multi-Domain Speech Recognition,” in Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech), Dresden, Germany, 2015.
    [Bibtex]
    @inproceedings{doulaty15,
    address = {Dresden, Germany},
    author = {Mortaza Doulaty and Oscar Saz and Thomas Hain},
    booktitle = {{Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech)}},
    project = {nst},
    title = {{Data-selective Transfer Learning for Multi-Domain Speech Recognition}},
    year = {2015}
    }
  • [PDF] M. Doulaty, O. Saz, and T. Hain, “Unsupervised Domain Discovery using Latent Dirichlet Allocation for Acoustic Modelling in Speech Recognition,” in Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech), Dresden, Germany, 2015.
    [Bibtex]
    @inproceedings{doulaty15b,
    address = {Dresden, Germany},
    author = {Mortaza Doulaty and Oscar Saz and Thomas Hain},
    booktitle = {{Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech)}},
    project = {nst},
    title = {{Unsupervised Domain Discovery using Latent Dirichlet Allocation for Acoustic Modelling in Speech Recognition}},
    year = {2015}
    }
  • [PDF] M. Doulaty, O. Saz, R. W. M. Ng, and T. Hain, “Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation,” in Proceedings of the 2015 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), Scottsdale, Arizona, USA, 2015.
    [Bibtex]
    @inproceedings{doulaty15c,
    address = {Scottsdale, Arizona, USA},
    author = {Mortaza Doulaty and Oscar Saz and Raymond W. M. Ng and Thomas Hain},
    booktitle = {{Proceedings of the 2015 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015)}},
    project = {nst},
    title = {{Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation}},
    year = {2015}
    }
  • [PDF] Y. Liu, P. Karanasou, and T. Hain, “An Investigation Into Speaker Informed DNN Front-end for LVCSR,” in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, 2015.
    [Bibtex]
    @inproceedings{liu2015,
    author = {Yulan Liu and Penny Karanasou and Thomas Hain},
    booktitle = {{Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on}},
    keyword = {speech recognition; deep neural network; speaker adaptation; speaker informed training; bias adaptation},
    month = {April},
    pdf = {http://staffwww.dcs.shef.ac.uk/people/Y.Liu/publications/pdf/Liu2015.pdf},
    project = {NST},
    title = {{An Investigation Into Speaker Informed DNN Front-end for {LVCSR}}},
    year = {2015}
    }
  • [PDF] E. Loweimi, J. Barker, and T. Hain, “Source-filter Separation of Speech Signal in the Phase Domain,” in Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech), Dresden, Germany, 2015.
    [Bibtex]
    @inproceedings{loweimi_is15,
    address = {Dresden, Germany},
    author = {Erfan Loweimi and Jon Barker and Thomas Hain},
    booktitle = {{Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech)}},
    title = {{Source-filter Separation of Speech Signal in the Phase Domain}},
    year = {2015}
    }
  • [PDF] E. Loweimi, M. Doulaty, J. Barker, and T. Hain, “Long-term statistical Feature Extraction from Speech Signal and its Application in Emotion Recognition,” in Statistical Language and Speech Processing (SLSP), Budapest, Hungary, 2015.
    [Bibtex]
    @inproceedings{loweimi_slsp15,
    address = {Budapest, Hungary},
    author = {Erfan Loweimi and Mortaza Doulaty and Jon Barker and Thomas Hain},
    booktitle = {{Statistical Language and Speech Processing (SLSP)}},
    title = {{Long-term statistical Feature Extraction from Speech Signal and its Application in Emotion Recognition}},
    year = {2015}
    }
  • [PDF] E. Loweimi, J. Barker, and T. Hain, “Emotion Recognition from Speech Signal by Effective Combination of the Generative and Discriminative Models,” in University of Sheffield Engineering Symposium, Sheffield, UK, 2015.
    [Bibtex]
    @inproceedings{loweimi_uses15,
    address = {Sheffield, UK},
    author = {Erfan Loweimi and Jon Barker and Thomas Hain},
    booktitle = {{University of Sheffield Engineering Symposium}},
    title = {{Emotion Recognition from Speech Signal by Effective Combination of the Generative and Discriminative Models}},
    year = {2015}
    }
  • M. Hasan, R. Doddipatla, and T. Hain, “Noise-matched training of CRF based sentence end detection models,” in Interspeech 2015, 2015.
    [Bibtex]
    @inproceedings{madinainterspeech2015,
    author = {Madina Hasan and Rama Doddipatla and Thomas Hain},
    booktitle = {{Interspeech 2015}},
    project = {nst},
    title = {{Noise-matched training of CRF based sentence end detection models}},
    year = {2015}
    }
  • [PDF] D. Martínez, E. Lleida, P. Green, H. Christensen, A. Ortega, and A. Miguel, “Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace,” ACM Transactions on Accessible Computing (TACCESS), vol. 6, iss. 3, p. 10, 2015.
    [Bibtex]
    @article{martinez2015intelligibility,
    author = {Mart{\'\i}nez, David and Lleida, Eduardo and Green, Phil and Christensen, Heidi and Ortega, Alfonso and Miguel, Antonio},
    journal = {ACM Transactions on Accessible Computing (TACCESS)},
    number = {3},
    pages = {10},
    project = {nst,nst-homeservice},
    publisher = {ACM},
    title = {{Intelligibility Assessment and Speech Recognizer Word Accuracy Rate Prediction for Dysarthric Speakers in a Factor Analysis Subspace}},
    volume = {6},
    year = {2015}
    }
  • R. Milner, O. Saz, S. Deena, M. Doulaty, R. Ng, and T. Hain, “The 2015 Sheffield System for Longitudinal Diarisation of Broadcast Media,” in Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, 2015.
    [Bibtex]
    @inproceedings{milner_ASRU2015,
    address = {Scottsdale, AZ},
    author = {Rosanna Milner and Oscar Saz and Salil Deena and Mortaza Doulaty and Raymond Ng and Thomas Hain},
    booktitle = {{Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)}},
    project = {nst},
    title = {{The 2015 Sheffield System for Longitudinal Diarisation of Broadcast Media}},
    year = {2015}
    }
  • R. W. M. Ng, K. Shah, W. Aziz, L. Specia, and T. Hain, “Quality estimation for ASR k-best list rescoring in spoken language translation,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
    [Bibtex]
    @inproceedings{ng_icassp15,
    author = {Raymond W. M. Ng and Kashif Shah and Wilker Aziz and Lucia Specia and Thomas Hain},
    booktitle = {{2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}},
    keyword = {spoken language translation, quality estimation, system integration},
    month = {April},
    project = {WFST,NST},
    title = {{Quality estimation for {ASR} k-best list rescoring in spoken language translation}},
    year = {2015}
    }
  • [PDF] R. W. M. Ng, K. Shah, L. Specia, and T. Hain, “A study on the stability and effectiveness of features in quality estimation for spoken language translation,” in the 16th Annual Conference of the International Speech Communication Association (Interspeech), 2015.
    [Bibtex]
    @inproceedings{ng_is15,
    author = {Raymond W. M. Ng and Kashif Shah and Lucia Specia and Thomas Hain},
    booktitle = {{the 16th Annual Conference of the International Speech Communication Association (Interspeech)}},
    keyword = {spoken language translation, quality estimation, system robustness},
    month = {September},
    project = {WFST,NST},
    title = {{A study on the stability and effectiveness of features in quality estimation for spoken language translation}},
    year = {2015}
    }
  • [PDF] [DOI] M. Nicolao, A. V. Beeston, and T. Hain, “Automatic Assessment of English Learner Pronunciation Using Discriminative Classifiers,” in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, Brisbane, Australia, 2015, pp. 5351-5355.
    [Bibtex]
    @inproceedings{nicolao_icassp2015,
    address = {Brisbane, Australia},
    author = {Nicolao, Mauro and Beeston, Amy V and Hain, Thomas},
    booktitle = {{IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015}},
    doi = {10.1109/ICASSP.2015.7178993},
    month = {April},
    pages = {5351--5355},
    project = {ITSLanguage},
    title = {{Automatic Assessment of English Learner Pronunciation Using Discriminative Classifiers}},
    year = {2015}
    }
  • O. Saz, M. Doulaty, S. Deena, R. Milner, R. Ng, M. Hasan, Y. Liu, and T. Hain, “The 2015 Sheffield System for Transcription of Multi–Genre Broadcast Media,” in Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, 2015.
    [Bibtex]
    @inproceedings{Saz_ASRU,
    address = {Scottsdale, AZ},
    author = {Oscar Saz and Mortaza Doulaty and Salil Deena and Rosanna Milner and Raymond Ng and Madina Hasan and Yulan Liu and Thomas Hain},
    booktitle = {{Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)}},
    project = {nst},
    title = {{The 2015 Sheffield System for Transcription of Multi--Genre Broadcast Media}},
    year = {2015}
    }
  • K. Shah, R. W. M. Ng, F. Bougares, and L. Specia, “Investigating continuous space language models for machine translation quality estimation,” in 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
    [Bibtex]
    @inproceedings{shah_emnlp15,
    author = {Kashif Shah and Raymond W. M. Ng and Fethi Bougares and Lucia Specia},
    booktitle = {{2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)}},
    keyword = {machine translation, quality estimation},
    month = {September},
    project = {WFST},
    title = {{Investigating continuous space language models for machine translation quality estimation}},
    year = {2015}
    }
