Software & Data Downloads — TF-Locoformer

This code implements TF-Locoformer, a Transformer-based model with LOcal-modeling by COnvolution for speech enhancement and audio source separation, presented in our IWAENC 2024 paper. Training and inference scripts are provided, along with pretrained models for the WSJ0-2mix, Libri2mix, WHAMR!, and DNS-Interspeech2020 datasets.

    •  Saijo, K., Wichern, G., Germain, F.G., Pan, Z., Le Roux, J., "TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement", International Workshop on Acoustic Signal Enhancement (IWAENC), September 2024.
      BibTeX (TR2024-126):
      @inproceedings{Saijo2024sep2,
        author = {Saijo, Kohei and Wichern, Gordon and Germain, François G and Pan, Zexu and Le Roux, Jonathan},
        title = {TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement},
        booktitle = {International Workshop on Acoustic Signal Enhancement (IWAENC)},
        year = 2024,
        month = sep,
        url = {https://www.merl.com/publications/TR2024-126}
      }

    Access software at https://github.com/merlresearch/tf-locoformer.