figshare

TGMF-Pose: Text-guided multi-view 3D pose estimation and fusion network for online sports instruction

Version 2 2025-11-12, 08:48
Version 1 2025-10-30, 03:06
software
posted on 2025-11-12, 08:48 authored by Xiaohong Qi, Yuanyuan Meng
Background. Pose estimation is widely used in online sports instruction to recognize movements precisely and correct them in real time, which improves teaching quality and learning outcomes. However, existing methods often fail to capture the subtle differences between visually similar but technically distinct actions in multi-motion scenarios, which leads to semantic ambiguity. In addition, when limbs are occluded, monocular 2D-to-3D mapping lacks multi-view information fusion and geometric structure constraints, resulting in ambiguous depth estimates and inaccurate keypoint localization.

Methods. To address these problems, we present TGMF-Pose, a text-guided multi-view 3D pose estimation and fusion network for online sports instruction. It consists of three core components that enhance the fine-grained representation of semantic information and the accuracy of depth estimation: (1) The joint feature embedding module models the distances and angles between keypoints in the 2D pose estimate. Guided by text-based prompts, it captures subtle geometric differences in limb movements across various sports. (2) The multi-view generator addresses limb occlusion by estimating the complete 3D pose of the central frame: it queries keypoint features from nearby available frames, guided by textual prompts and geometric constraints. (3) The multi-view fusion module aggregates information from all views to refine features and achieve accurate pose depth estimation.

Results. Extensive experiments on several publicly available pose estimation datasets show that TGMF-Pose outperforms existing state-of-the-art methods in both recognition accuracy and depth recovery. Moreover, when integrated into an AI-based instructional system, TGMF-Pose enables real-time feedback and semantic motion evaluation, offering practical value for intelligent, interactive sports education.
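
As a rough illustration of the geometric quantities mentioned for the joint feature embedding module (distances and angles between 2D keypoints), the sketch below computes pairwise keypoint distances and the angle at a joint. It is not the authors' implementation: the function names, the three-keypoint arm layout, and the flat feature vector are all assumptions made for this example, and the text-prompt guidance described above is not modelled.

```python
import math

def pairwise_distances(keypoints):
    """Euclidean distance between every pair of 2D keypoints."""
    n = len(keypoints)
    return [[math.dist(keypoints[i], keypoints[j]) for j in range(n)]
            for i in range(n)]

def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate limb: coincident keypoints
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))

def geometric_features(keypoints, angle_triplets):
    """Concatenate unique pairwise distances with the requested joint angles.

    angle_triplets is a hypothetical list of (a, b, c) keypoint indices,
    where b is the joint at which the angle is measured.
    """
    dists = pairwise_distances(keypoints)
    n = len(keypoints)
    upper = [dists[i][j] for i in range(n) for j in range(i + 1, n)]
    angles = [joint_angle(keypoints[a], keypoints[b], keypoints[c])
              for a, b, c in angle_triplets]
    return upper + angles

# Hypothetical arm: shoulder, elbow, wrist in image coordinates.
arm = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
features = geometric_features(arm, [(0, 1, 2)])
```

Here `features` holds the three pairwise distances followed by the elbow angle. In a learned model, a vector like this would typically be normalized (for example by torso length) before being embedded, but that step is also an assumption rather than something the description states.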

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
