Abstract
Despite the widespread application of 3D human animation, traditional skeleton reconstruction and motion data prediction methods still suffer from limited modeling capability, discontinuous trajectories, and response delays when dealing with point cloud density, multi-view pose changes, and missing key points. To address this, this study proposes a 3D human animation and motion data generation model that integrates mean shift clustering, graph-structured neural network modeling, and Kalman filter prediction. The method adaptively extracts key bone points through dual-ring neighborhood density clustering, combines local geometry and pose semantics to construct a complete skeleton topology, and introduces a state-recursive mechanism for missing point repair and temporal smoothing in the output stage. In experiments, the model achieved the highest F1 values, 94.83% on the Human3.6M dataset and 93.94% on the Carnegie Mellon University motion capture dataset, with a skeleton point missing rate as low as 6.21%. It maintained a mean opinion score of up to 4.78 under multi-view and complex lighting conditions, with an average processing delay of no more than 34 ms. The model also outperformed the comparison models in reconstruction accuracy, acceleration smoothness, and output integrity under extreme poses. These results indicate significant advantages in accuracy, stability, and real-time performance, providing reliable technical support for high-quality animation generation, motion capture, and intelligent virtual human systems.
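The state-recursive repair of missing points mentioned above can be illustrated with a generic Kalman filter. The following is a minimal sketch, not the model described in this article: it assumes a constant-velocity motion model for a single joint coordinate, marks missing frames with `None`, and uses hypothetical names and noise values (`kalman_fill`, `q`, `r`) chosen only for illustration.

```python
def kalman_fill(observations, dt=1.0, q=1e-3, r=1e-2):
    """Filter one joint coordinate over time; None marks a missing frame.

    Illustrative assumptions: constant-velocity state [position, velocity],
    process noise q added to both diagonal terms, measurement noise r,
    and at least one observed frame in the sequence.
    """
    first = next(z for z in observations if z is not None)
    x, v = first, 0.0                      # initial state estimate
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # initial 2x2 covariance
    out = []
    for z in observations:
        # Predict step: x' = F x, P' = F P F^T + Q with F = [[1, dt], [0, 1]]
        x = x + v * dt
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        if z is not None:
            # Update step with H = [1, 0]: only the position is observed
            s = p00 + r                    # innovation variance
            k0, k1 = p00 / s, p10 / s      # Kalman gain
            y = z - x                      # innovation
            x += k0 * y
            v += k1 * y
            # P = (I - K H) P
            n00 = (1 - k0) * p00
            n01 = (1 - k0) * p01
            n10 = p10 - k1 * p00
            n11 = p11 - k1 * p01
            p00, p01, p10, p11 = n00, n01, n10, n11
        # On a missing frame the prediction alone serves as the repaired value
        out.append(x)
    return out


# Two frames are missing in an otherwise linear trajectory
track = kalman_fill([0.0, 1.0, 2.0, None, None, 5.0, 6.0])
```

In this sketch a missing frame skips the update step, so the filter extrapolates from the estimated velocity; when observations resume, the update step pulls the estimate back toward the measurement, which is what gives the temporal smoothing effect.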
