Colloquium - Manfred Pinkal (Saarland University) "Making TACoS: Grounding Distributional Models of Action Descriptions in Videos"
Mon, April 22, 2013 • 3:00 PM - 4:30 PM • CLA 1.302B
I will present the newly released Saarbrücken Corpus of Textually Annotated Cooking Scenes (TACoS). The corpus aligns high-quality videos with multiple natural-language descriptions of the actions portrayed in the videos. I will report experimental results demonstrating that a text-based model of similarity between actions improves substantially when combined with visual information. I will also touch upon ongoing research investigating the use of textual information for video recognition, as well as the generation of natural-language descriptions from videos. The work presented in this talk results from an ongoing collaboration with the Visual Processing group of the Max-Planck-Institute for Computer Science.
Reference: Regneri, M., Rohrbach, M., Wetzel, D., Thater, S., Schiele, B., Pinkal, M. (2013): Grounding Action Descriptions in Videos. Transactions of the Association for Computational Linguistics 1.