TR2024-072

Deep Neural Room Acoustics Primitive


    •  He, Y., Cherian, A., Wichern, G., Markham, A., "Deep Neural Room Acoustics Primitive", International Conference on Machine Learning (ICML), June 2024.
      BibTeX:
      @inproceedings{He2024jun,
        author = {He, Yuhang and Cherian, Anoop and Wichern, Gordon and Markham, Andrew},
        title = {Deep Neural Room Acoustics Primitive},
        booktitle = {International Conference on Machine Learning (ICML)},
        year = 2024,
        month = jun,
        url = {https://www.merl.com/publications/TR2024-072}
      }
  • Research Areas:

    Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio

Abstract:

The primary objective of room acoustics is to model the intricate sound propagation dynamics from any source to receiver position within enclosed 3D spaces. These dynamics are encapsulated in the form of a 1D room impulse response (RIR). Precisely measuring RIR is difficult due to the complexity of sound propagation encompassing reflection, diffraction, and absorption. In this work, we propose to learn a continuous neural room acoustics field that implicitly encodes all essential sound propagation primitives for each enclosed 3D space, so that we can infer the RIR corresponding to arbitrary source-receiver positions unseen in the training dataset. Our framework, dubbed DeepNeRAP, is trained in a self-supervised manner without requiring direct access to RIR ground truth that is often needed in prior methods. The key idea is to design two cooperative acoustic agents to actively probe a 3D space, one emitting and the other receiving sound at various locations. Analyzing this sound helps to inversely characterize the acoustic primitives. Our framework is well-grounded in the fundamental physical principles of sound propagation, including reciprocity and globality, and thus is acoustically interpretable and meaningful. We present experiments on both synthetic and real-world datasets, demonstrating superior quality in RIR estimation against closely related methods.
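
Two ideas in the abstract can be illustrated with a small sketch: the sound heard at the receiver is the emitted sound convolved with the RIR, and reciprocity means that swapping the source and receiver positions leaves the RIR unchanged. The code below is not the DeepNeRAP model. It is a toy first-order image-source approximation for a rectangular room, and the names `toy_rir` and `render`, the room size, the absorption value, the sample rate, and the positions are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value for air at room temperature)
SAMPLE_RATE = 16000     # Hz (assumed)


def toy_rir(source, receiver, room=(5.0, 4.0, 3.0), absorption=0.3, length=2048):
    """Toy RIR: direct path plus the six first-order wall reflections.

    Hypothetical helper for illustration only; it ignores diffraction and
    higher-order reflections.
    """
    source = np.asarray(source, dtype=float)
    receiver = np.asarray(receiver, dtype=float)

    # Direct path, then one mirror image of the source per wall.
    images = [(source, 1.0)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = source.copy()
            image[axis] = 2.0 * wall - source[axis]
            images.append((image, 1.0 - absorption))

    rir = np.zeros(length)
    for position, gain in images:
        distance = np.linalg.norm(position - receiver)
        delay = int(round(distance / SPEED_OF_SOUND * SAMPLE_RATE))
        if delay < length:
            # Amplitude decays with distance (spherical spreading).
            rir[delay] += gain / max(distance, 1e-3)
    return rir


def render(dry_signal, rir):
    """Received sound = emitted (dry) sound convolved with the RIR."""
    return np.convolve(dry_signal, rir)


if __name__ == "__main__":
    emitter, listener = (1.0, 1.5, 1.2), (3.5, 2.0, 1.6)
    h_forward = toy_rir(emitter, listener)
    h_swapped = toy_rir(listener, emitter)
    print("reciprocal:", np.allclose(h_forward, h_swapped))

    rng = np.random.default_rng(0)
    probe = rng.standard_normal(SAMPLE_RATE // 10)  # 0.1 s noise burst
    received = render(probe, h_forward)
    print("received samples:", received.shape[0])
```

In this toy model reciprocity holds by construction, because each path length depends only on the sum or difference of the source and receiver coordinates. The abstract describes a self-supervised setting in which only the emitted and received sounds are observed, so the forward relation in `render` is the part such a setup would have to invert.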