Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies

Guillaume Seguin; Karteek Alahari; Josef Sivic; Ivan Laptev

doi:10.1109/TPAMI.2014.2369050

Journal article. IEEE Transactions on Pattern Analysis and Machine Intelligence. Year: 2015

Guillaume Seguin

Role: Author

Models of visual object recognition and scene understanding

Laboratoire d'informatique de l'école normale supérieure

Karteek Alahari

Role: Author
PersonId: 19670
IdHAL: karteek
ORCID: 0000-0002-1838-5936
IdRef: 196283892

Learning and recognition in vision

Josef Sivic

Role: Author
PersonId: 945630

Models of visual object recognition and scene understanding

Laboratoire d'informatique de l'école normale supérieure

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

Laboratoire d'informatique de l'école normale supérieure

Abstract

We describe a method to obtain a pixel-wise segmentation and pose estimation of multiple people in stereoscopic videos. This task involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes with multiple people. We cast the problem as a discrete labelling task involving multiple person labels, devise a suitable cost function, and optimize it efficiently. The contributions of our work are two-fold: First, we develop a segmentation model incorporating person detections and learnt articulated pose segmentation masks, as well as colour, motion, and stereo disparity cues. The model also explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from feature-length movies "StreetDance 3D" and "Pina". The dataset contains 587 annotated human poses, 1158 bounding box annotations and 686 pixel-wise segmentations of people. The dataset is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset from (Sheasby et al. ACCV 2012).
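As a rough illustration of what "a discrete labelling task involving multiple person labels" can look like, the sketch below assigns each pixel one of K labels (0 for background, 1..K-1 for people) by minimizing a per-pixel cost plus a smoothness penalty between neighbouring pixels. This is a generic Potts model minimized with iterated conditional modes, not the cost function or optimizer of the paper, which the abstract does not specify. The function names, the random `unary` costs, and the `smoothness` weight are all hypothetical.

```python
import numpy as np

def energy(labels, unary, smoothness):
    """Sum of per-pixel costs plus a Potts penalty on 4-connected neighbours."""
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    data_cost = unary[rows, cols, labels].sum()
    # Count label disagreements between vertical and horizontal neighbours.
    disagreements = ((labels[1:, :] != labels[:-1, :]).sum()
                     + (labels[:, 1:] != labels[:, :-1]).sum())
    return float(data_cost + smoothness * disagreements)

def icm(unary, smoothness, sweeps=10):
    """Iterated conditional modes: greedy per-pixel updates, local minimum only."""
    unary = np.asarray(unary, dtype=float)
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)  # start from the cheapest label per pixel
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                costs = unary[y, x].copy()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Every label that differs from the neighbour's pays the penalty.
                        costs += smoothness * (np.arange(k) != labels[ny, nx])
                best = int(costs.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels

# Hypothetical input: a 6x8 image with background plus two person labels.
rng = np.random.default_rng(0)
unary = rng.random((6, 8, 3))
result = icm(unary, smoothness=0.5)
```

In a model of the kind the abstract describes, the per-pixel costs would presumably come from cues such as person detections, pose masks, colour, motion, and disparity rather than random numbers, and a stronger optimizer than this greedy scheme would likely be used.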

Keywords

Person detection; Pose estimation; Segmentation; 3D data; Stereo movies

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

seguin15.pdf (4.4 MB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01089660

Submitted on: Friday, 7 August 2015, 19:35:26

Last modified on: Saturday, 27 April 2024, 03:16:23

Long-term archiving on: Wednesday, 26 April 2017, 09:44:40

Dates and versions

hal-01089660, version 1 (02-12-2014)

hal-01089660, version 2 (07-08-2015)

Identifiers

HAL Id: hal-01089660, version 2
DOI: 10.1109/TPAMI.2014.2369050

Cite

Guillaume Seguin, Karteek Alahari, Josef Sivic, Ivan Laptev. Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37 (8), pp. 1643-1655. ⟨10.1109/TPAMI.2014.2369050⟩. ⟨hal-01089660v2⟩

Collections

ENS-PARIS UGA CNRS INRIA INSMI LJK LJK_GI LJK_GI_LEAR QUAERO INRIA2 PSL

751 views

706 downloads
