We introduce the task of dense captioning in 3D scans from commodity RGB-D sensors. As input, we assume a point cloud of a 3D scene; the expected output is the bounding boxes along with the ...