...

Fit3D Dataset

611 multi-view sequences; minimum 5 annotated repetitions per sequence; 2,964,236 highly accurate ground truth 3d skeletons, GHUM & SMPLX human pose and shape parameters;

More Info | Download

Abstract

I went to the gym today, but how well did I do? And where should I improve? Ah, my back hurts slightly… User engagement can be sustained and injuries avoided by being able to reconstruct 3d human pose, shape, and motion, relate it to good training practices, identify errors, and provide early, real-time feedback. In this paper we introduce the first automatic system, AIFit, that performs 3d human sensing for fitness training. The system can be used at home, outdoors, or at the gym. AIFit is able to reconstruct 3d human pose and motion, reliably segment exercise repetitions, and identify in real time the deviations between standards learnt from trainers and the execution of a trainee. As a result, localized, quantitative feedback for correct execution of exercises, reduced risk of injury, and continuous improvement is possible. To support research and evaluation, we introduce the first large-scale dataset, Fit3D, containing over 3 million images and corresponding 3d human shape and motion capture ground truth configurations, with over 37 repeated exercises, covering all the major muscle groups, performed by instructors and trainees. Our statistical coach is governed by a global parameter that captures how critical it should be of a trainee’s performance. This is an important aspect that helps adapt to a student’s level of fitness (i.e. beginner vs. advanced vs. expert), or to the expected accuracy of a 3d pose reconstruction method. We show that, for different values of the global parameter, our feedback system based on 3d pose estimates achieves good accuracy compared to the one based on ground-truth motion capture. Our statistical coach offers feedback in natural language, and with spatio-temporal visual grounding.
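The abstract describes a statistical coach whose strictness is controlled by a single global parameter. As a rough illustration of that idea only (this is not the authors' released implementation; the function name, array layout, and thresholding rule below are assumptions), the sketch flags per-joint deviations of a time-normalized trainee repetition against per-frame statistics pooled from trainer executions, with the global parameter deciding how large a deviation must be before it is reported.

```python
# Hypothetical sketch of criticality-controlled feedback (illustrative assumptions only).
import numpy as np

def feedback(trainee_angles, trainer_mean, trainer_std, criticality=2.0):
    """
    trainee_angles: (T, J) joint angles of one time-normalized trainee repetition.
    trainer_mean, trainer_std: (T, J) per-frame statistics from trainer repetitions.
    criticality: lower values flag smaller deviations (a stricter coach).
    Returns (frame, joint) pairs where the trainee deviates beyond the threshold.
    """
    # Normalized deviation per frame and joint (z-score against trainer statistics).
    z = np.abs(trainee_angles - trainer_mean) / (trainer_std + 1e-6)
    frames, joints = np.nonzero(z > criticality)
    return list(zip(frames.tolist(), joints.tolist()))

# Usage with random data standing in for real pose estimates:
T, J = 100, 17
rng = np.random.default_rng(0)
trainer_mean = rng.normal(size=(T, J))
trainer_std = np.full((T, J), 0.2)
trainee = trainer_mean + rng.normal(scale=0.3, size=(T, J))
print(len(feedback(trainee, trainer_mean, trainer_std, criticality=2.0)), "flagged deviations")
```

Raising the criticality threshold would make the coach tolerant of coarser 3d pose estimates or of beginners, while lowering it would demand near-trainer precision, which is the trade-off the abstract attributes to the global parameter.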

Paper

...

Video

Supplementary Material Video

Citation

@InProceedings{Fieraru_2021_CVPR,
  author    = {Fieraru, Mihai and Zanfir, Mihai and Pirlea, Silviu-Cristian and Olaru, Vlad and Sminchisescu, Cristian},
  title     = {AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training},
  booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2021}
}


Related Datasets

...

FlickrCI3D

Interaction Contact Signatures: 11,770 images; 14,866 contact events; 138,213 selected contact regions; 81,233 facet-level surface correspondences;
Interaction Contact Classification: 90,167 pairs of people;

More Info

...

CHI3D

631 multi-view sequences; 2,524 interaction contact events; 728,664 highly accurate ground truth 3d skeletons, GHUM & SMPLX human pose and shape parameters;

More Info

...

FlickrSC3D

Self-Contact Signatures: 3,415 images of 3,969 self-contact events; 25,297 facet-level surface correspondences;
Self-Contact Classification: 24,312 people;

More Info

...

HumanSC3D

1,032 multi-view sequences; 4,128 self-contact events; 1,246,487 highly accurate ground truth 3d skeletons, GHUM & SMPLX human pose and shape parameters;

More Info