TR2022-151
Human Perspective Scene Understanding via Multimodal Sensing
- "Human Perspective Scene Understanding via Multimodal Sensing," Tech. Rep. TR2022-151, Audio-Visual Scene Understanding Tutorial at CVPR 2021, June 2021.
Research Areas:
Artificial Intelligence, Computer Vision, Human-Computer Interaction, Speech & Audio
Abstract:
This talk introduces our research on scene-aware interaction, including neural conversation, video captioning, and Audio Visual Scene-aware Dialog (AVSD), along with our demo of future car navigation built on scene-aware interaction technologies.
Presented as part of a tutorial on Audio-Visual Scene Understanding, CVPR 2021.