Recent advances in estimating 3D human pose and shape could benefit many application areas. However, most approaches process a single frame at a time, so they cannot track individuals over time or recover the paths those individuals travel. The problem is especially hard in hand-held videos, where the camera itself shakes and moves.
To address these issues, a team of researchers from the Harbin Institute of Technology, Explore Academy of JD.com, Max Planck Institute for Intelligent Systems, and HiDream.ai has developed a technique called TRACE. TRACE uses a 5D representation (three spatial dimensions, time, and identity) to reason about people in a scene. Its main innovations include two “maps” that describe 3D motion from the camera’s perspective and from the world’s perspective, and a memory module that keeps track of individuals even after long absences. Together, these let TRACE recover 3D human models and track their movements in global coordinates from moving cameras in a single stage.
The main objective of TRACE is to simultaneously reconstruct each person’s 3D position, shape, identity, and motion in global coordinates. To achieve this, TRACE first extracts temporal information and then decodes each sub-task with a dedicated network branch. The Detection and Tracking branches perform multi-subject tracking to reconstruct 3D human motion in camera coordinates. The estimated 3D Motion Offset map gives the relative movement of each subject in space between two frames. A memory unit extracts subject identities and builds human trajectories in camera coordinates from the estimated 3D detections and motion offsets. The World branch computes a world motion map to estimate the subjects’ trajectories in global coordinates.
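The bookkeeping described above can be illustrated with a small sketch. This is not the authors' implementation: the class `TrackMemory`, the function `global_trajectory`, the greedy nearest-neighbour matching, and the `max_dist` threshold are all hypothetical stand-ins, and the inputs are plain arrays rather than the network's learned maps. The sketch only shows how per-frame 3D detections and motion offsets could be turned into identities and trajectories, and how per-frame world offsets could be accumulated into a global path.

```python
import numpy as np


class TrackMemory:
    """Toy memory unit (hypothetical): remembers where each subject was last seen."""

    def __init__(self, max_dist=0.5):
        self.max_dist = max_dist   # assumed matching threshold, in metres
        self.last_pos = {}         # subject id -> last 3D position (camera coords)
        self.trajectories = {}     # subject id -> list of 3D positions
        self._next_id = 0

    def update(self, detections, motion_offsets):
        """Assign an identity to every detection in the current frame.

        detections:     (N, 3) current 3D positions in camera coordinates
        motion_offsets: (N, 3) estimated movement since the previous frame
        """
        ids = []
        free = set(self.last_pos)  # remembered subjects not yet matched this frame
        for pos, off in zip(detections, motion_offsets):
            pos = np.asarray(pos, dtype=float)
            prev_guess = pos - np.asarray(off, dtype=float)  # implied previous position
            best_id, best_d = None, self.max_dist
            for sid in free:
                d = np.linalg.norm(self.last_pos[sid] - prev_guess)
                if d < best_d:
                    best_id, best_d = sid, d
            if best_id is None:
                # Nothing in memory is close enough: start a new identity.
                best_id = self._next_id
                self._next_id += 1
                self.trajectories[best_id] = []
            else:
                free.discard(best_id)
            self.last_pos[best_id] = pos
            self.trajectories[best_id].append(pos)
            ids.append(best_id)
        # Unmatched subjects stay in memory, so they can be re-identified later.
        return ids


def global_trajectory(start, world_offsets):
    """Accumulate per-frame world-space offsets into a path of shape (T + 1, 3)."""
    start = np.asarray(start, dtype=float)
    steps = np.cumsum(np.asarray(world_offsets, dtype=float), axis=0)
    return np.vstack([start, start + steps])


memory = TrackMemory()
zero = np.zeros((2, 3))
# Frame 1: two people appear.
ids_1 = memory.update(np.array([[0.0, 0.0, 3.0], [1.0, 0.0, 3.0]]), zero)
# Frame 2: same people, listed in the opposite order, each having moved slightly.
ids_2 = memory.update(
    np.array([[1.1, 0.0, 3.0], [0.0, 0.0, 3.2]]),
    np.array([[0.1, 0.0, 0.0], [0.0, 0.0, 0.2]]),
)
# Frame 3: the second person is hidden; frame 4: they reappear where last seen.
ids_3 = memory.update(np.array([[0.0, 0.0, 3.2]]), np.zeros((1, 3)))
ids_4 = memory.update(np.array([[1.1, 0.0, 3.0]]), np.zeros((1, 3)))

path = global_trajectory([0.0, 0.0, 0.0], [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
```

In this toy version, a subject who disappears simply stays in memory at their last known position, which is a much cruder stand-in for the long-term memory described in the article. A real system would also need to handle appearance cues and camera motion, which the sketch ignores.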
To evaluate TRACE, the team used a dataset called DynaCam, which simulates camera motions in natural environments. They also tested TRACE on two multi-person benchmarks. The results showed that TRACE outperforms previous approaches at tracking people through long-term occlusion and at estimating their overall 3D trajectories.
For future work, the team suggests investigating explicit camera motion estimation with more complex training data. They plan to explore datasets that combine complicated human motion, 3D scenes, and camera motions.
To learn more about TRACE, see the paper, code, and project page.