TR2016-075

Data selection by sequence summarizing neural network in mismatch condition training

- Zmolikova, K., Karafiat, M., Vesely, K., Delcroix, M., Watanabe, S., Burget, L., Cernocky, J.H., "Data selection by sequence summarizing neural network in mismatch condition training", Interspeech, DOI: 10.21437/Interspeech.2016-741, September 2016, pp. 2354-2358.
  @inproceedings{Zmolikova2016sep,
    author = {Zmolikova, Katerina and Karafiat, Martin and Vesely, Karel and Delcroix, Marc and Watanabe, Shinji and Burget, Lukas and Cernocky, Jan Honza},
    title = {{Data selection by sequence summarizing neural network in mismatch condition training}},
    booktitle = {Interspeech},
    year = 2016,
    pages = {2354--2358},
    month = sep,
    doi = {10.21437/Interspeech.2016-741},
    url = {https://www.merl.com/publications/TR2016-075}
  }
Research Areas:

Artificial Intelligence, Speech & Audio

Abstract:

Data augmentation is a simple and efficient technique for improving the robustness of a speech recognizer deployed under mismatched training and test conditions. Our paper proposes a new approach for selecting data with respect to the similarity of acoustic conditions. The similarity is computed using a sequence summarizing neural network, which extracts a vector containing an acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. Results are reported for a mismatched condition, using the AMI training set with the proposed data selection and the CHiME3 test set.
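As a rough illustration of selecting data by similarity of summary-vectors, the sketch below ranks training utterances by cosine similarity to the mean summary-vector of the target condition and keeps the top k. The abstract does not say which selection methods or similarity measures were used, so the cosine measure, the mean-of-target comparison, the function names, and the toy two-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from the sequence summarizing network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_utterances(train_vectors, target_vectors, k):
    """Return the ids of the k training utterances most similar to the target mean.

    train_vectors: dict mapping utterance id -> summary-vector (hypothetical layout).
    target_vectors: list of summary-vectors from the target (test-like) condition.
    """
    target_mean = mean_vector(target_vectors)
    ranked = sorted(
        train_vectors,
        key=lambda utt_id: cosine(train_vectors[utt_id], target_mean),
        reverse=True,
    )
    return ranked[:k]


# Toy summary-vectors; real ones would be produced by the network.
train = {
    "utt_clean": [1.0, 0.0],
    "utt_noisy": [0.1, 1.0],
    "utt_mixed": [0.7, 0.7],
}
target = [[0.0, 1.0], [0.2, 0.9]]

selected = select_utterances(train, target, k=2)
print(selected)
```

Comparing against a single mean vector is the simplest choice here; it assumes the target condition is roughly homogeneous, which may not hold when the test data mixes several noise environments.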
