Oscar Saz

Past Members, Researchers

PhD by the University of Zaragoza (Spain)

Research Associate at the Machine Intelligence for Natural Interfaces Group, Speech and Hearing Group, Department of Computer Science, University of Sheffield.

Contact information

Department of Computer Science
University of Sheffield
Regent Court
211 Portobello St
Sheffield, S1 4DP.
e-mail: O.SazTorralba_AT_sheffield_DOT_ac_DOT_uk

Employment history

2012-2016 – Research Associate (University of Sheffield)
2010-2012 – Post-doctorate associate as a Fulbright/MEC Fellow (Carnegie Mellon University)
2008-2010 – Research assistant (University of Zaragoza)

Academic history

2004-2009 – PhD by the University of Zaragoza
2004-2006 – DEA (equivalent MSc) in Program “Information and Communication Technologies in Mobile Networks ” (University of Zaragoza)
1998-2004 – BSc Telecommunications Engineer (University of Zaragoza)

Full personal website at DCS

http://staffwww.dcs.shef.ac.uk/people/O.Saztorralba/

Projects

Natural Speech Technology

Publications

2019

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. In: IEEE ACM Trans. Audio Speech Lang. Process., vol. 27, no. 3, pp. 572–582, 2019. (Type: Journal Article | Links)

2018

Oscar Saz; Salil Deena; Mortaza Doulaty; Madina Hasan; Bilal Khaliq; Rosanna Milner; Raymond W. M. Ng; Julia Olcoz; Thomas Hain: Lightly supervised alignment of subtitles on multi-genre broadcasts. In: Multim. Tools Appl., vol. 77, no. 23, pp. 30533–30550, 2018. (Type: Journal Article | Links)

2017

Oscar Saz; Thomas Hain: Acoustic adaptation to dynamic background conditions with asynchronous transformations. In: Comput. Speech Lang., vol. 41, pp. 180–194, 2017. (Type: Journal Article | Links)

Chenhao Wu; Raymond W. M. Ng; Oscar Saz; Thomas Hain: Analysing acoustic model changes for active learning in automatic speech recognition. In: International Conference on Systems, Signals and Image Processing, IWSSIP 2017, Poznań, Poland, May 22-24, 2017, pp. 1–5, IEEE, 2017. (Type: Proceedings Article | Links)

Erfan Loweimi; Jon Barker; Oscar Saz; Thomas Hain: Robust Source-Filter Separation of Speech Signal in the Phase Domain. In: Lacerda, Francisco (Ed.): Interspeech 2017, 18th Annual Conference of the International Speech Communication Association, Stockholm, Sweden, August 20-24, 2017, pp. 414–418, ISCA, 2017. (Type: Proceedings Article | Links)

2016

Thomas Hain; Jeremy Christian; Oscar Saz; Salil Deena; Madina Hasan; Raymond W. M. Ng; Rosanna Milner; Mortaza Doulaty; Yulan Liu: webASR 2 - Improved Cloud Based Speech Technology. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 1613–1617, ISCA, 2016. (Type: Proceedings Article | Links)

Julia Olcoz; Oscar Saz; Thomas Hain: Error Correction in Lightly Supervised Alignment of Broadcast Subtitles. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2110–2114, ISCA, 2016. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Automatic Genre and Show Identification of Broadcast Media. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2115–2119, ISCA, 2016. (Type: Proceedings Article | Links)

Salil Deena; Madina Hasan; Mortaza Doulaty; Oscar Saz; Thomas Hain: Combining Feature and Model-Based Adaptation of RNNLMs for Multi-Genre Broadcast Speech Recognition. In: Morgan, Nelson (Ed.): Interspeech 2016, 17th Annual Conference of the International Speech Communication Association, San Francisco, CA, USA, September 8-12, 2016, pp. 2343–2347, ISCA, 2016. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Mauro Nicolao; Oscar Saz; Madina Hasan; Bhusan Chettri; Mortaza Doulaty; Tan Lee; Thomas Hain: The Sheffield language recognition system in NIST LRE 2015. In: Rodríguez-Fuentes, Luis Javier; Lleida, Eduardo (Ed.): Odyssey 2016: The Speaker and Language Recognition Workshop, Bilbao, Spain, June 21-24, 2016, pp. 181–187, ISCA, 2016. (Type: Proceedings Article | Links)

2015

Mortaza Doulaty; Oscar Saz; Raymond W. M. Ng; Thomas Hain: Latent Dirichlet Allocation based organisation of broadcast media archives for deep neural network adaptation. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 130–136, IEEE, 2015. (Type: Proceedings Article | Links)

Oscar Saz; Mortaza Doulaty; Salil Deena; Rosanna Milner; Raymond W. M. Ng; Madina Hasan; Yulan Liu; Thomas Hain: The 2015 sheffield system for transcription of Multi-Genre Broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 624–631, IEEE, 2015. (Type: Proceedings Article | Links)

Rosanna Milner; Oscar Saz; Salil Deena; Mortaza Doulaty; Raymond W. M. Ng; Thomas Hain: The 2015 sheffield system for longitudinal diarisation of broadcast media. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 632–638, IEEE, 2015. (Type: Proceedings Article | Links)

Peter Bell; Mark J. F. Gales; Thomas Hain; Jonathan Kilgour; Pierre Lanchantin; Xunying Liu; Andrew McParland; Steve Renals; Oscar Saz; Mirjam Wester; Philip C. Woodland: The MGB challenge: Evaluating multi-genre broadcast media recognition. In: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, AZ, USA, December 13-17, 2015, pp. 687–693, IEEE, 2015. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Thomas Hain: Data-selective transfer learning for multi-domain speech recognition. In: INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp. 2897–2901, ISCA, 2015. (Type: Proceedings Article | Links)

Mortaza Doulaty; Oscar Saz; Thomas Hain: Unsupervised domain discovery using latent dirichlet allocation for acoustic modelling in speech recognition. In: INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, September 6-10, 2015, pp. 3640–3644, ISCA, 2015. (Type: Proceedings Article | Links)

2014

Oscar Saz; Thomas Hain: Using contextual information in joint factor eigenspace MLLR for speech recognition in diverse scenarios. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pp. 6314–6318, IEEE, 2014. (Type: Proceedings Article | Links)

Oscar Saz; Mortaza Doulaty; Thomas Hain: Background-tracking acoustic features for genre identification of broadcast shows. In: 2014 IEEE Spoken Language Technology Workshop, SLT 2014, South Lake Tahoe, NV, USA, December 7-10, 2014, pp. 118–123, IEEE, 2014. (Type: Proceedings Article | Links)

Raymond W. M. Ng; Mortaza Doulaty; Rama Doddipatla; Wilker Aziz; Kashif Shah; Oscar Saz; Madina Hasan; Ghada AlHarbi; Lucia Specia; Thomas Hain: The USFD SLT system for IWSLT 2014. In: Federico, Marcello; Stüker, Sebastian; Yvon, François (Ed.): Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, Lake Tahoe, CA, USA, December 4-5, 2014, 2014. (Type: Proceedings Article | Links)

2013

Pierre Lanchantin; Peter Bell; Mark J. F. Gales; Thomas Hain; Xunying Liu; Yanhua Long; Jennifer Quinnell; Steve Renals; Oscar Saz; Matthew Stephen Seigel; Pawel Swietojanski; Philip C. Woodland: Automatic Transcription of Multi-genre Media Archives. In: Gravier, Guillaume; Béchet, Frédéric (Ed.): Proceedings of the First Workshop on Speech, Language and Audio in Multimedia, Marseille, France, August 22-23, 2013, pp. 26–31, CEUR-WS.org, 2013. (Type: Proceedings Article | Links)

Oscar Saz; Thomas Hain: Asynchronous factorisation of speaker and background with feature transforms in speech recognition. In: Bimbot, Frédéric; Cerisara, Christophe; Fougeron, Cécile; Gravier, Guillaume; Lamel, Lori; Pellegrino, François; Perrier, Pascal (Ed.): INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon, France, August 25-29, 2013, pp. 1238–1242, ISCA, 2013. (Type: Proceedings Article | Links)