Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

Meera Hahn; Nataniel Ruiz; Jean-Baptiste Alayrac; Ivan Laptev; James M. Rehg

Preprint, Working Paper. Year: 2019

Meera Hahn

Role: Author

Georgia Institute of Technology [Atlanta]

Nataniel Ruiz

Role: Author

Boston University [Boston]

Jean-Baptiste Alayrac

Role: Author
PersonId : 6558
IdHAL : jean-baptiste-alayrac
IdRef : 253131529

Models of visual object recognition and scene understanding

Ivan Laptev

Role: Author

Models of visual object recognition and scene understanding

James M. Rehg

Role: Author

Georgia Institute of Technology [Atlanta]

Abstract

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision. The task is challenging because of the semantic gap between the visual and natural language domains. This paper addresses the task of automatically aligning a set of written instructions with a first-person video demonstrating an activity. The sparsity and ambiguity of written instructions create significant alignment challenges. The key to our approach is the use of egocentric cues to generate a concise set of action proposals, which are then matched to recipe steps using object recognition and computational linguistic techniques. We obtain promising results on both the Extended GTEA Gaze+ dataset and the Bristol Egocentric Object Interactions Dataset.

Domains

Computer Vision and Pattern Recognition [cs.CV]; Artificial Intelligence [cs.AI]

Contributor: Jean-Baptiste Alayrac

https://hal.science/hal-01979719

Submitted on: Sunday, January 13, 2019, 19:47:19

Last modified on: Friday, April 19, 2024, 16:18:58

Dates and versions

hal-01979719, version 1 (13-01-2019)

Identifiers

HAL Id: hal-01979719, version 1
arXiv: 1809.08381

Cite

Meera Hahn, Nataniel Ruiz, Jean-Baptiste Alayrac, Ivan Laptev, James M. Rehg. Learning to Localize and Align Fine-Grained Actions to Sparse Instructions. 2019. ⟨hal-01979719⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL
