TR2025-097

Single- and Multi-Channel Speech Enhancement and Separation for Far-Field Conversation Recognition

- Masuyama, Y., "Single- and Multi-Channel Speech Enhancement and Separation for Far-Field Conversation Recognition," Tech. Rep. TR2025-097, Jelinek Summer Workshop on Speech and Language Technology (JSALT), June 2025.
    @techreport{Masuyama2025jun,
      author = {Masuyama, Yoshiki},
      title = {{Single- and Multi-Channel Speech Enhancement and Separation for Far-Field Conversation Recognition}},
      institution = {Jelinek Summer Workshop on Speech and Language Technology (JSALT)},
      year = 2025,
      month = jun,
      url = {https://www.merl.com/publications/TR2025-097}
    }
MERL Contact:
- Yoshiki Masuyama
Research Areas:

Artificial Intelligence, Machine Learning, Speech & Audio

Abstract:

While automatic speech recognition (ASR) achieves superhuman performance on clean benchmarks, it struggles in real-world scenarios such as meeting transcription, where word error rates (WER) exceed 35% versus under 3% on clean data. This lecture examines the challenges of robust ASR for conversational speech, including noise, reverberation, multiple speakers, and overlapped speech (more than 15% of meeting duration). It covers evaluation methodologies for long-form multi-speaker audio, including concatenated minimum-permutation WER (cpWER), and surveys key datasets from AMI to current benchmarks such as CHiME-7/8 and NOTSOFAR-1. Technical approaches are categorized into front-end methods (speech separation, beamforming, target speaker extraction) and back-end methods (self-supervised features, serialized output training, target-speaker ASR). Robust ASR remains an active research area with significant opportunities, particularly as large language models enable new applications such as automated meeting summarization. Key challenges include speaker tracking, training-inference mismatches, and the integration of speech separation, diarization, and recognition components.
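As a rough illustration of the cpWER metric named in the abstract, the sketch below follows the commonly used definition: each speaker's utterances are concatenated in time order, every assignment of hypothesis speakers to reference speakers is scored with word-level edit distance, and the lowest total error is divided by the number of reference words. This is not code from the lecture. The function names `edit_distance` and `cpwer`, the dictionary input layout, and the brute-force search over permutations are all assumptions made for illustration.

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def cpwer(ref_by_speaker, hyp_by_speaker):
    """Illustrative cpWER (hypothetical helper, not a library API).

    Both arguments map a speaker label to that speaker's utterances
    (strings) in time order. Labels need not match across the two dicts.
    """
    # Concatenate each speaker's utterances into one word stream.
    refs = [" ".join(utts).split() for utts in ref_by_speaker.values()]
    hyps = [" ".join(utts).split() for utts in hyp_by_speaker.values()]

    # Pad with empty streams so both sides have the same speaker count.
    n = max(len(refs), len(hyps))
    refs += [[] for _ in range(n - len(refs))]
    hyps += [[] for _ in range(n - len(hyps))]

    total_ref_words = sum(len(r) for r in refs)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")

    # Brute-force search over speaker assignments; fine for a few speakers.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_words


reference = {"A": ["hello world", "how are you"], "B": ["fine thanks"]}
hypothesis = {"spk1": ["fine thanks"], "spk2": ["hello word", "how are you"]}
score = cpwer(reference, hypothesis)  # one substitution over seven reference words
```

The exhaustive permutation search grows factorially with the number of speakers, so it is only practical for small meetings; solving the same speaker assignment with the Hungarian algorithm would be the natural replacement for larger ones.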
