Software & Data Downloads — TF-Locoformer

Transformer-based model with LOcal-modeling by COnvolution for speech enhancement and audio source separation, presented in our IWAENC 2024 paper.

This code implements TF-Locoformer, a Transformer-based model with LOcal-modeling by COnvolution for speech enhancement and audio source separation, presented in our IWAENC 2024 paper. Training and inference scripts are provided, along with pretrained models for the WSJ0-2mix, Libri2Mix, WHAMR!, and DNS-Interspeech2020 datasets.

    •  Saijo, K., Wichern, G., Germain, F.G., Pan, Z., Le Roux, J., "TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement", International Workshop on Acoustic Signal Enhancement (IWAENC), DOI: 10.1109/IWAENC61483.2024.10694313, September 2024, pp. 205-209.
      @inproceedings{Saijo2024sep2,
        author = {Saijo, Kohei and Wichern, Gordon and Germain, François G and Pan, Zexu and {Le Roux}, Jonathan},
        title = {{TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement}},
        booktitle = {International Workshop on Acoustic Signal Enhancement (IWAENC)},
        year = 2024,
        pages = {205--209},
        month = sep,
        doi = {10.1109/IWAENC61483.2024.10694313},
        issn = {2835-3439},
        isbn = {979-8-3503-6185-8},
        url = {https://www.merl.com/publications/TR2024-126}
      }

    Access software at https://github.com/merlresearch/tf-locoformer.