NEWS    Jonathan Le Roux discusses MERL's audio source separation work on popular machine learning podcast

Date released: January 25, 2022


  • Date:

    January 24, 2022

  • Where:

    The TWIML AI Podcast

  • Description:

    MERL Speech & Audio Senior Team Leader Jonathan Le Roux was featured in an extended interview on the popular TWIML AI Podcast, presenting MERL's work towards solving the "cocktail party problem". Humans have the extraordinary ability to focus on particular sounds of interest within a complex acoustic scene, such as a cocktail party. MERL's Speech & Audio Team has been at the forefront of the field's effort to develop algorithms that give machines similar abilities. Jonathan talked with host Sam Charrington about the group's decade-long journey on this topic, from early pioneering work using deep learning for speech enhancement and speech separation to recent work on weakly-supervised separation, hierarchical sound separation, and the separation of real-world soundtracks into speech, music, and sound effects (also known as the "cocktail fork problem").

    The TWIML AI Podcast, formerly known as This Week in Machine Learning & AI, was created in 2016 and is followed by more than 10,000 subscribers on YouTube and Twitter. Jonathan's interview marks the 555th episode of the podcast.


  • External Link:

    http://www.twimlai.com/go/555

  • Research Areas:

    Artificial Intelligence, Machine Learning, Speech & Audio

    •  Manilow, E., Wichern, G., Le Roux, J., "Hierarchical Musical Instrument Separation", International Society for Music Information Retrieval (ISMIR) Conference, October 2020, pp. 376-383.
      @inproceedings{Manilow2020oct,
        author = {Manilow, Ethan and Wichern, Gordon and Le Roux, Jonathan},
        title = {Hierarchical Musical Instrument Separation},
        booktitle = {International Society for Music Information Retrieval (ISMIR) Conference},
        year = 2020,
        pages = {376--383},
        month = oct,
        isbn = {978-0-9813537-0-8},
        url = {https://www.merl.com/publications/TR2020-136}
      }
    •  Pishdadian, F., Wichern, G., Le Roux, J., "Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision", IEEE/ACM Transactions on Audio, Speech, and Language Processing, DOI: 10.1109/TASLP.2020.3013105, Vol. 28, pp. 2386-2399, September 2020.
      @article{Pishdadian2020sep,
        author = {Pishdadian, Fatemeh and Wichern, Gordon and Le Roux, Jonathan},
        title = {Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision},
        journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
        year = 2020,
        volume = 28,
        pages = {2386--2399},
        month = sep,
        doi = {10.1109/TASLP.2020.3013105},
        url = {https://www.merl.com/publications/TR2020-126}
      }
    •  Hershey, J.R., Chen, Z., Le Roux, J., Watanabe, S., "Deep Clustering: Discriminative Embeddings for Segmentation and Separation", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2016.7471631, March 2016, pp. 31-35.
      @inproceedings{Hershey2016mar,
        author = {Hershey, John R. and Chen, Zhuo and Le Roux, Jonathan and Watanabe, Shinji},
        title = {Deep Clustering: Discriminative Embeddings for Segmentation and Separation},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2016,
        pages = {31--35},
        month = mar,
        doi = {10.1109/ICASSP.2016.7471631},
        url = {https://www.merl.com/publications/TR2016-003}
      }