
Pose from action: Unsupervised learning of pose features based on motion

Purushwalkam S, Gupta A | ECCV workshop, 2016
Human actions are composed of sequences of poses, which makes videos of humans a rich and dense source of human poses. We propose an unsupervised method for learning pose features from videos. It exploits motion, a signal that is complementary to appearance and can serve as supervision. The key idea is that humans move through poses in a predictable manner while performing actions. Hence, given two poses, it should be possible to model the motion that caused the change between them. We represent each pose as a feature computed by a CNN (the Appearance ConvNet) and generate a motion encoding from optical flow maps using a separate CNN (the Motion ConvNet). The data for this task is generated automatically, allowing us to train without human supervision. We demonstrate the strength of the learned representation by finetuning the trained model for pose estimation on the FLIC dataset, for static-image action recognition on PASCAL, and for action recognition in videos on UCF101 and HMDB51.
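As a rough illustration of how training data for such a task could be generated without labels, the sketch below pairs two frames of a video with the optical-flow window between them (a positive example) and with a flow window taken from elsewhere in the same video (a negative example). The abstract does not specify the sampling scheme, so the frame gap, the negative-sampling rule, and all names here (`Sample`, `make_samples`, `gap`, `per_video`) are assumptions made for illustration only, not the authors' procedure. Frames and flow windows are represented by indices; no images or flow maps are computed.

```python
import random
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    video_id: int
    frame_a: int                 # index of the first pose frame
    frame_b: int                 # index of the second pose frame
    flow_span: Tuple[int, int]   # (start, end) frames covered by the flow window
    label: int                   # 1 if the flow window connects frame_a to frame_b


def make_samples(video_lengths: List[int], gap: int = 8,
                 per_video: int = 4, seed: int = 0) -> List[Sample]:
    """Build positive/negative (frame pair, flow window) samples from indices.

    Hypothetical scheme: a positive uses the flow between the two frames;
    a negative reuses the same frame pair with a flow window that starts
    at least `gap` frames away, so it cannot describe the same motion.
    """
    rng = random.Random(seed)
    samples: List[Sample] = []
    for vid, n_frames in enumerate(video_lengths):
        # Skip videos too short to guarantee a distant negative window.
        if n_frames < 3 * gap:
            continue
        starts = range(0, n_frames - gap)
        for _ in range(per_video):
            start = rng.choice(starts)
            end = start + gap
            # Positive: the flow window spans exactly the two pose frames.
            samples.append(Sample(vid, start, end, (start, end), 1))
            # Negative: a flow window from a different part of the video.
            far_starts = [s for s in starts if abs(s - start) >= gap]
            neg_start = rng.choice(far_starts)
            samples.append(
                Sample(vid, start, end, (neg_start, neg_start + gap), 0)
            )
    return samples


if __name__ == "__main__":
    for sample in make_samples([40, 10, 64], gap=8, per_video=1):
        print(sample)
```

Under these assumptions, each positive would feed the two frames to the Appearance ConvNet and the matching flow window to the Motion ConvNet, while negatives give the model mismatched motion to reject. Since the labels come entirely from frame indices, no human annotation is involved.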