Vision-Based Kinematic Analysis of Human Motion in Dynamic Board Sports

Research Project · Developed in collaboration with TRAX
Researcher: TRAX, Satoshi Endo
Motivation
Analysing human movement in dynamic sports such as surfing requires models that capture full-body kinematics (posture, joint coordination, and movement timing) and their interaction with equipment under real-world conditions. Wearable inertial sensors provide high-frequency motion and force measurements, but computer vision methods can recover the complete skeletal structure of the athlete without constraining the performer, offering a powerful, non-invasive complement to onboard sensing.
This project is developed in collaboration with TRAX, a company building a next-generation surf performance platform. TRAX combines a board-mounted motion sensor with a pressure-sensing traction pad and an app to turn each surfing session into measurable performance data - capturing speed generation, turn angles, and weight shifts. By comparing motion and pressure patterns to ideal reference data from high-level surfers, TRAX highlights what is working, what is holding athletes back, and suggests targeted data-driven adjustments and drills for faster improvement. TRAX is currently in active development, with a functional prototype and a roadmap towards a beta-ready device and scalable software.
Recent advances in pose estimation and multi-view motion capture enable accurate reconstruction of skeletal joint trajectories from calibrated video rigs. In surfing, performance depends strongly on precise body positioning, weight transfer, and coordinated joint motion across the manoeuvre cycle. These kinematic signatures are difficult to quantify with sensors alone, making vision-based methods a valuable complement to the IMU-based measurements already captured by TRAX. A key feature of this project is access to a multi-perspective video capture setup alongside precisely time-synchronised IMU data, enabling rigorous multi-modal analysis that existing literature has not fully exploited.
The challenge is to build a complete analysis pipeline, from raw video to biomechanically meaningful features, that is robust to the variability of outdoor environments, occlusion, and the rapid, full-body movements characteristic of surfing and/or other dynamic sports. Achieving this requires combining accurate 3D pose reconstruction with physics-based constraints and sport-specific domain knowledge to ensure that learned representations are both accurate and interpretable.
Research questions
- How accurately can 3D body kinematics be reconstructed from monocular or stereo video in outdoor, dynamic environments?
- Which kinematic representations (joint trajectories, angles, angular velocity fields) or learned temporal embeddings best capture the structure of complex athletic manoeuvres such as surfing turns?
- How do kinematic patterns differ across manoeuvre types, performance levels, and individual execution styles?
- How can physics-based constraints and sport-specific heuristics be incorporated into machine learning models as inductive biases to improve accuracy, data efficiency, and interpretability?
- How can vision-derived kinematic representations be aligned with synchronised IMU time-series from TRAX hardware to enable joint modelling of body movement and equipment dynamics?
Approach
The project develops a computer vision pipeline for kinematic motion analysis in surfing, with scalability to dynamic sports more broadly. Pose estimators are applied to video sequences to extract skeletal joint positions, and temporal smoothing yields physiologically plausible trajectories across full manoeuvre sequences. Biomechanical features (joint angles, angular velocities, body orientation, centre-of-mass proxies, and inter-joint coordination metrics) are extracted from the reconstructed skeleton to represent movement dynamics in a form directly relevant to the performance indicators tracked by TRAX.
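As an illustration of the feature-extraction step, the sketch below computes a joint angle and its angular velocity from keypoint trajectories, with a simple moving-average filter standing in for the pipeline's actual pose estimator and smoother. The hip-knee-ankle keypoints and frame rate are toy placeholders, not project data:

```python
import numpy as np

def joint_angle(a, b, c):
    """Interior angle (radians) at joint b, formed by keypoints a-b-c."""
    u, v = a - b, c - b
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

def smooth(x, k=5):
    """Centred moving-average smoothing of a 1-D signal."""
    return np.convolve(x, np.ones(k) / k, mode="same")

# Toy sequence: hip-knee-ankle 3-D keypoints over T frames.
T, fps = 50, 30.0
t = np.linspace(0.0, 1.0, T)
hip   = np.stack([np.zeros(T), np.ones(T), np.zeros(T)], axis=1)
knee  = np.stack([np.zeros(T), 0.5 * np.ones(T), np.zeros(T)], axis=1)
ankle = np.stack([np.sin(t), np.zeros(T), np.cos(t)], axis=1)

angles  = smooth(np.array([joint_angle(hip[i], knee[i], ankle[i]) for i in range(T)]))
ang_vel = np.gradient(angles) * fps   # angular velocity in rad/s via finite differences
```

In the full pipeline the keypoints would come from the pose estimator and the smoother would be replaced by a physiologically motivated filter, but the angle and finite-difference computations carry over directly.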
A central contribution is the integration of physical constraints and domain knowledge into the learning pipeline. Rigid-body dynamics, joint range-of-motion limits, and momentum-conservation terms are embedded as soft regularisers during training, while expert-annotated kinematic thresholds and manoeuvre grammar priors encode surfing-specific heuristics. Sequence models (TCNs, Transformers) and contrastive representation learning are trained on these physics-grounded features to identify characteristic patterns across manoeuvre types and skill levels.
The precisely synchronised IMU stream from the TRAX board sensor serves dual roles: as a quantitative ground truth against which vision-only reconstructions are evaluated, and as a complementary modality for multi-modal fusion. Cross-attention and learned alignment modules exploit the temporal correspondence between video kinematics and board dynamics, enabling richer joint models of athlete-equipment interaction. This fusion directly extends the performance insights already generated by TRAX, linking body movement patterns to board response metrics such as speed generation and rail engagement.
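The alignment idea can be sketched as single-head scaled dot-product cross-attention, with video-frame features as queries attending over the higher-rate IMU sequence as keys and values. Feature dimensions and sampling rates here are placeholders, and a trained model would add learned projections:

```python
import numpy as np

def cross_attention(q_feats, kv_feats):
    """Single-head scaled dot-product cross-attention in NumPy:
    each query row attends over all key/value rows."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # softmax over IMU time steps
    return w @ kv_feats, w

rng = np.random.default_rng(0)
video = rng.normal(size=(10, 16))   # 10 video frames, 16-D pose features (placeholder)
imu   = rng.normal(size=(40, 16))   # 40 IMU samples at a higher rate (placeholder)

fused, attn = cross_attention(video, imu)   # per-frame IMU summary + attention map
```

The attention map itself is interpretable: it shows which board-dynamics samples each body-pose frame draws on, which is useful when linking movement patterns to board response.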
Key results and achievements
- Developed a validated pipeline for 3D kinematic reconstruction and biomechanical feature extraction from synchronised camera video, providing full-body skeletal trajectories across complete manoeuvre sequences.
- Established quantitative benchmarking of vision-derived kinematics against TRAX IMU ground truth, characterising reconstruction errors by joint and manoeuvre phase.
- Demonstrated that physics-based constraints and surfing-specific heuristics improve model accuracy, data efficiency, and interpretability over purely data-driven baselines.
- Built a reusable multi-modal fusion framework linking body kinematics to TRAX equipment dynamics, capturing surfer-board interaction in a form applicable to surfing and potentially transferable to other dynamic sports.
- Produced kinematic performance profiles that complement and enrich the turn angle, weight shift, and speed metrics already generated by the TRAX platform, supporting targeted athlete feedback and coaching.
Research participation available for
- Master thesis
- Forschungspraktikum
Sensor-Augmented Training for Robust Motion Analytics under Reduced Sensing
Research Project · Developed in collaboration with TRAX
Researcher: TRAX, Satoshi Endo
Motivation
Many modern sensing systems are developed using rich, multi-modal sensor setups, while deployed products must operate with a significantly reduced set of sensors due to constraints on cost, power, wearability, and robustness. This creates a fundamental challenge: models trained with many sensor modalities must remain reliable when some of those modalities are absent at deployment time.
This challenge is directly relevant to TRAX, a company building a surf performance platform that combines a board-mounted motion sensor with a pressure-sensing traction pad and an app to deliver measurable performance data including speed generation, turn angles, and weight shifts. TRAX compares athlete movement and pressure patterns against reference data from high-level surfers, providing targeted feedback and drills. The core TRAX product is intentionally minimal: a clean, unobtrusive hardware setup the surfer barely notices. However, during data collection and model development, access to richer sensor configurations, such as an IMU vest worn by the athlete or an additional front traction pad, is feasible. The key scientific question is whether models trained with these auxiliary sensors retain meaningfully higher confidence and accuracy when those sensors are removed, compared to models that never had access to them.
This idea, sometimes called privileged information learning or sensor-augmented training, has been studied in limited settings within machine learning. However, comparatively little work has examined it in the context of nonlinear dynamical systems with structured time-series sensor signals, where temporal interactions between modalities play a critical role in model behaviour. Surfing and other dynamic sports provide a compelling test case: the biomechanical signals are rich and highly structured, the deployment sensor set is commercially constrained, and the performance stakes (coaching quality) make model confidence directly meaningful.
Research questions
- Does training with auxiliary sensors - such as a body-worn IMU vest or a front pressure pad - lead to models that retain higher predictive confidence and accuracy under reduced sensing at deployment, compared to models trained on the reduced sensor set alone?
- How much predictive information for surfing performance tasks (manoeuvre classification, quality scoring, weight transfer dynamics) is contained in each sensing modality and in their combinations, and which auxiliary sensors offer the greatest training-time benefit?
- Which training strategies best exploit auxiliary sensor data at training time while gracefully degrading when those sensors are absent at test time, including modality dropout, teacher–student distillation, and privileged information frameworks?
- What is the minimal deployment sensor configuration that achieves acceptable performance after sensor-augmented training, and how does this compare to performance achievable with the full sensor set?
- How does the benefit of sensor-augmented training vary across different motion tasks, manoeuvre types, and athlete skill levels relevant to the TRAX platform?
Approach
The project is structured around a controlled experimental programme using multi-modal surfing datasets, with models trained and evaluated across different sensor configurations to isolate the effect of auxiliary sensing at training time.
Data collection and preparation
Surfing sessions are instrumented with the full TRAX sensor suite and supplemented by auxiliary sensors including a body-worn IMU vest and/or a front pressure pad. All streams are synchronised and processed into structured time-series representations, enabling controlled simulation of any desired sensor subset during training and evaluation.
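Simulating a sensor subset can be as simple as zeroing the channel blocks of the absent modalities in the synchronised feature matrix. The channel layout below is a hypothetical illustration, not the actual TRAX channel mapping:

```python
import numpy as np

# Hypothetical channel layout for the synchronised feature matrix
# (T time steps x 22 channels): board IMU, traction pad, auxiliary IMU vest.
MODALITIES = {"board_imu": slice(0, 6), "pad": slice(6, 10), "vest": slice(10, 22)}

def mask_modalities(x, keep):
    """Zero every modality block not listed in `keep`, simulating its absence."""
    out = np.zeros_like(x)
    for name in keep:
        out[..., MODALITIES[name]] = x[..., MODALITIES[name]]
    return out

x = np.random.default_rng(1).normal(size=(100, 22))      # one synchronised session
deploy = mask_modalities(x, keep={"board_imu", "pad"})   # deployment-like subset
```

Because every subset is derived from the same recording, the effect of removing a modality can be measured with everything else held fixed.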
Baseline and ablation experiments
Baseline sequence models (temporal convolutional networks, recurrent architectures, and lightweight Transformer variants) are trained on the full sensor set and on the reduced TRAX deployment set. Controlled ablation experiments systematically remove modalities at test time to characterise performance degradation across manoeuvre classification, turn quality regression, and weight-shift dynamics tasks.
Sensor-augmented training strategies
The core experimental programme compares three families of sensor-augmented training strategies. Modality dropout trains models with stochastic sensor masking, teaching them to remain reliable under any subset of inputs. Teacher–student distillation uses a full-sensor ‘teacher’ model to supervise a reduced-sensor ‘student’, transferring richer internal representations to the deployment model. Privileged information frameworks, inspired by learning using privileged information (LUPI), treat auxiliary sensors as side information available only at training time, incorporated through regularisation or auxiliary prediction objectives. Each strategy is evaluated on both model accuracy and calibrated confidence at deployment.
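Modality dropout, the simplest of the three strategies, can be sketched as stochastic zeroing of whole modality blocks during training. The channel layout here is a hypothetical illustration, and a real training loop would apply this per batch before the forward pass:

```python
import numpy as np

# Hypothetical channel layout: board IMU, traction pad, auxiliary IMU vest.
MODALITIES = {"board_imu": slice(0, 6), "pad": slice(6, 10), "vest": slice(10, 22)}

def modality_dropout(x, modalities, p_drop=0.5, rng=None):
    """Randomly zero whole modality blocks so the model learns to remain
    reliable under any sensor subset; always keeps at least one modality."""
    rng = rng or np.random.default_rng()
    names = list(modalities)
    drop = [n for n in names if rng.random() < p_drop]
    if len(drop) == len(names):                    # never drop every modality
        drop.remove(names[rng.integers(len(names))])
    out = x.copy()
    for n in drop:
        out[..., modalities[n]] = 0.0
    return out

x = np.random.default_rng(1).normal(size=(100, 22))   # T x channels
x_masked = modality_dropout(x, MODALITIES, p_drop=0.7)
```

Distillation and LUPI variants reuse the same masking: the teacher sees the unmasked input while the student sees the masked one, with a consistency loss between their representations.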
Confidence and uncertainty quantification
A key evaluation metric is model confidence under sensor reduction. Bayesian and ensemble-based uncertainty quantification methods are applied to measure whether sensor-augmented training produces models that are not only more accurate but also better calibrated, knowing what they know and what they do not, when auxiliary sensors are removed. This directly supports the TRAX use case, where feedback should only be surfaced when the model has sufficient confidence in its analysis.
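A minimal ensemble-based confidence sketch: average the softmax probabilities of the ensemble members and use predictive entropy as the gating score. Member count, class count, and the entropy threshold are placeholders:

```python
import numpy as np

def ensemble_confidence(logits_stack):
    """Mean softmax probability across ensemble members, plus predictive
    entropy of the mean as an uncertainty score (higher = less confident)."""
    z = logits_stack - logits_stack.max(axis=-1, keepdims=True)   # stable softmax
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)
    p_mean = p.mean(axis=0)                                       # average over members
    entropy = -(p_mean * np.log(p_mean + 1e-12)).sum(axis=-1)
    return p_mean, entropy

rng = np.random.default_rng(2)
logits = rng.normal(size=(5, 8, 4))   # 5 members, 8 windows, 4 manoeuvre classes

p_mean, H = ensemble_confidence(logits)
confident = H < 0.9                   # surface feedback only below the threshold
```

Under this scheme, windows where the reduced sensor set leaves the ensemble divided simply produce no athlete-facing feedback, rather than an unreliable one.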
Key results and achievements
- Demonstrated that models trained with auxiliary sensors (IMU vest, front pressure pad) retain significantly higher predictive confidence and accuracy on TRAX deployment tasks compared to models trained on the reduced sensor set alone.
- Characterised the information content of each sensing modality for surfing performance tasks, identifying which auxiliary sensors provide the greatest training-time benefit for manoeuvre classification, quality scoring, and weight-transfer analysis.
- Benchmarked modality dropout, teacher–student distillation, and privileged information frameworks against each other and against reduced-set baselines, providing clear guidelines for which strategy is most effective under different sensing constraints.
- Defined minimal deployment sensor configurations that achieve acceptable performance after sensor-augmented training, directly informing hardware design decisions for the TRAX product roadmap.
- Produced calibrated uncertainty estimates for the TRAX reduced sensor set, enabling the platform to surface performance feedback only when model confidence meets a reliable threshold, improving the trustworthiness of athlete-facing insights.
Research participation available for
- Master thesis
- Forschungspraktikum