From 3D Pose to Prose: Biomechanics-Grounded Vision-Language Coaching

arXiv, March 31, 2026

arXiv:2603.26938v1 (announce type: new)

Authors: Yuyang Ji, Yixuan Shen, Shengjie Zhu, Yu Kong, Feng Liu

Abstract: We present BioCoach, a biomechanics-grounded vision-language framework for fitness coaching from streaming video. BioCoach fuses visual appearance and 3D skeletal kinematics through a novel three-stage pipeline: an exercise-specific degree-of-freedom selector that focuses analysis on salient joints; a structured biomechanical context that pairs individualized morphometrics with cycle and constraint analysis; and a vision-biomechanics conditioned feedback module that applies cross-attention to generate precise, actionable text. Using parameter-efficient training that freezes the vision and language backbones, BioCoach yields transparent, personalized reasoning rather than pattern matching. To enable learning and fair evaluation, we augment QEVD-fit-coach with biomechanics-oriented feedback to create QEVD-bio-fit-coach, and we introduce a biomechanics-aware LLM judge metric. BioCoach delivers clear gains on QEVD-bio-fit-coach across lexical and judgment metrics while maintaining temporal triggering; on the original QEVD-fit-coach, it improves text quality and correctness with near-parity timing, demonstrating that explicit kinematics and constraints are key to accurate, phase-aware coaching.
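As a rough illustration of what "explicit kinematics and constraints" can mean in practice, the Python sketch below computes a joint angle from three 3D keypoints and checks it against an allowed range. It is not taken from the paper: the function names `joint_angle` and `violates_constraint`, the keypoint coordinates, and the 70 to 100 degree range are all hypothetical and chosen only for illustration.

```python
import math

def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, formed by the segments b->a and b->c.

    a, b, c are 3D points given as (x, y, z) tuples. Hypothetical helper,
    not the paper's implementation.
    """
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        raise ValueError("degenerate joint: two keypoints coincide")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def violates_constraint(angle_deg, lo_deg, hi_deg):
    """Return True if the angle falls outside the allowed [lo_deg, hi_deg] range."""
    return not (lo_deg <= angle_deg <= hi_deg)

# Hypothetical hip, knee, and ankle keypoints (arbitrary units).
hip, knee, ankle = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
knee_angle = joint_angle(hip, knee, ankle)
# Assumed, illustrative target range for the bottom of a squat.
out_of_range = violates_constraint(knee_angle, 70.0, 100.0)
```

A per-frame check like this is the kind of structured signal that could be passed to a text generator alongside visual features; how BioCoach actually encodes and uses such signals is described in the paper itself.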

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.26938 [cs.CV] (or arXiv:2603.26938v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2603.26938

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Yuyang Ji
[v1] Fri, 27 Mar 2026 19:26:28 UTC (1,051 KB)

Original source: arXiv, https://arxiv.org/abs/2603.26938
