TR2021-143

Keypoint-aligned 3D Human Shape Recovery from A Single Image with Bilayer-Graph


    •  Yu, X., van Baar, J., Chen, S., Sullivan, A., "Keypoint-aligned 3D Human Shape Recovery from A Single Image with Bilayer-Graph", International Conference on 3D Vision (3DV), DOI: 10.1109/3DV53792.2021.00060, December 2021, pp. 505-514.
      @inproceedings{Yu2021dec,
        author = {Yu, Xin and van Baar, Jeroen and Chen, Siheng and Sullivan, Alan},
        title = {Keypoint-aligned 3D Human Shape Recovery from A Single Image with Bilayer-Graph},
        booktitle = {International Conference on 3D Vision (3DV)},
        year = 2021,
        pages = {505--514},
        month = dec,
        doi = {10.1109/3DV53792.2021.00060},
        url = {https://www.merl.com/publications/TR2021-143}
      }
  • Research Areas:

    Artificial Intelligence, Computer Vision, Machine Learning

Abstract:

The ability to estimate 3D human shape and pose from images can be useful in many contexts. Recent approaches have explored graph convolutional networks and achieved promising results. Because the 3D shape is represented by a mesh, which is an undirected graph, graph convolutional networks are a natural fit for this problem. However, graph convolutional networks have limited representation power: information from a node is passed only to its connected neighbors, so propagating it over longer distances requires successive graph convolutions. To overcome this limitation, we propose a dual-scale graph approach. We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape. Information in coarse graphs can be propagated over longer distances than in dense graphs. In addition, information about pose can guide the recovery of local shape detail, and vice versa. We recognize that the connection between the coarse and dense graphs is itself a graph, and introduce graph fusion blocks to exchange information between graphs of different scales. We train our model end-to-end and show that it achieves state-of-the-art results on several evaluation datasets.
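
The propagation argument in the abstract can be illustrated with a small NumPy sketch. This is not the authors' architecture: the path-shaped toy graph, the pairwise assignment matrix `assign`, the mean-style neighbor averaging, and the additive fusion step are all assumptions chosen for illustration, and learned weights and nonlinearities are omitted. The sketch only shows that a single propagation step on a dense graph moves a signal one hop, while a single step on a coarser graph, mapped back through the coarse-to-dense connection, reaches farther.

```python
import numpy as np


def path_adjacency(n):
    """Adjacency matrix of an undirected path graph with n nodes."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj


def propagate(adj, feats, steps=1):
    """Average each node with its neighbors (self-loops added), `steps` times.

    This is graph convolution with the learned weights and nonlinearity removed.
    """
    a_hat = adj + np.eye(adj.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    for _ in range(steps):
        feats = a_hat @ feats
    return feats


def reach(feats):
    """Number of nodes that have received a nonzero signal."""
    return int(np.count_nonzero(feats > 0))


n_dense = 8
dense_adj = path_adjacency(n_dense)

# Hypothetical assignment: dense nodes (0,1), (2,3), ... each map to one coarse node.
assign = np.zeros((n_dense, n_dense // 2))
assign[np.arange(n_dense), np.arange(n_dense) // 2] = 1.0

# Coarse nodes are connected if any of their member dense nodes are connected.
coarse_adj = assign.T @ dense_adj @ assign
np.fill_diagonal(coarse_adj, 0.0)
coarse_adj = (coarse_adj > 0).astype(float)

# Place a unit signal on dense node 0.
signal = np.zeros((n_dense, 1))
signal[0] = 1.0

# One step on the dense graph alone.
dense_out = propagate(dense_adj, signal, steps=1)

# One step on the coarse graph: pool (mean), propagate, then unpool and fuse.
pooled = (assign.T @ signal) / assign.sum(axis=0)[:, None]
coarse_out = propagate(coarse_adj, pooled, steps=1)
fused = dense_out + assign @ coarse_out

print("dense nodes reached, dense graph only:", reach(dense_out))
print("dense nodes reached, with coarse graph:", reach(fused))
```

In this toy setup the assignment matrix plays the role of the coarse-to-dense connection described in the abstract; treating that connection as a graph in its own right is what lets features be exchanged in both directions between the two scales.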