NEWS  |  Chiori Hori will give keynote on scene understanding via multimodal sensing at AI Electronics Symposium

Date released: February 9, 2021


  • Date:

    February 15, 2021

  • Description:

    Chiori Hori, a Senior Principal Researcher in MERL's Speech and Audio Team, will be a keynote speaker at the 2nd International Symposium on AI Electronics, alongside Alex Acero, Senior Director of Apple Siri; Roberto Cipolla, Professor of Information Engineering at the University of Cambridge; and Hiroshi Amano, Professor at Nagoya University and winner of the Nobel Prize in Physics for his work on blue light-emitting diodes. The symposium, organized by Tohoku University, will be held online on February 15, 2021, from 10am to 4pm (JST).

    Chiori's talk, titled "Human Perspective Scene Understanding via Multimodal Sensing", will present MERL's work towards the development of scene-aware interaction. One important technology still missing from human-machine interaction is natural, context-aware interaction, in which machines understand their surrounding scene from the human perspective and can share that understanding with humans using natural language. To bridge this communication gap, MERL has been working at the intersection of research fields such as spoken dialog, audio-visual understanding, sensor signal understanding, and robotics to build a new AI paradigm, called scene-aware interaction, that enables machines to express their perception and understanding of a scene, and respond to it, in natural language so that they can interact more effectively with humans. The talk will survey these technologies and introduce an application to future car navigation.

  • Where:

    The 2nd International Symposium on AI Electronics

  • External Link:

    https://www.aie.tohoku.ac.jp/data/news/AIE_Sympo_2021.pdf

  • Research Areas:

    Artificial Intelligence, Computer Vision, Machine Learning, Speech & Audio

    •  Hori, C., Cherian, A., Marks, T., Hori, T., "Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog", Interspeech, September 2019, pp. 1886-1890.
      @inproceedings{Hori2019sep,
        author = {Hori, Chiori and Cherian, Anoop and Marks, Tim and Hori, Takaaki},
        title = {Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog},
        booktitle = {Interspeech},
        year = 2019,
        pages = {1886--1890},
        month = sep,
        publisher = {ISCA},
        url = {https://www.merl.com/publications/TR2019-097}
      }
    •  Alamri, H., Cartillier, V., Das, A., Wang, J., Lee, S., Anderson, P., Essa, I., Parikh, D., Batra, D., Cherian, A., Marks, T.K., Hori, C., "Audio-Visual Scene-Aware Dialog", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), DOI: 10.1109/CVPR.2019.00774, June 2019, pp. 7550-7559.
      @inproceedings{Alamri2019jun,
        author = {Alamri, Huda and Cartillier, Vincent and Das, Abhishek and Wang, Jue and Lee, Stefan and Anderson, Peter and Essa, Irfan and Parikh, Devi and Batra, Dhruv and Cherian, Anoop and Marks, Tim K. and Hori, Chiori},
        title = {Audio-Visual Scene-Aware Dialog},
        booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
        year = 2019,
        pages = {7550--7559},
        month = jun,
        doi = {10.1109/CVPR.2019.00774},
        url = {https://www.merl.com/publications/TR2019-048}
      }
    •  Hori, C., Alamri, H., Wang, J., Wichern, G., Hori, T., Cherian, A., Marks, T.K., Cartillier, V., Lopes, R., Das, A., Essa, I., Batra, D., Parikh, D., "End-to-End Audio Visual Scene-Aware Dialog Using Multimodal Attention-Based Video Features", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2019.8682583, May 2019.
      @inproceedings{Hori2019may2,
        author = {Hori, Chiori and Alamri, Huda and Wang, Jue and Wichern, Gordon and Hori, Takaaki and Cherian, Anoop and Marks, Tim K. and Cartillier, Vincent and Lopes, Raphael and Das, Abhishek and Essa, Irfan and Batra, Dhruv and Parikh, Devi},
        title = {End-to-End Audio Visual Scene-Aware Dialog Using Multimodal Attention-Based Video Features},
        booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
        year = 2019,
        month = may,
        doi = {10.1109/ICASSP.2019.8682583},
        url = {https://www.merl.com/publications/TR2019-016}
      }