Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
DOI: 10.18653/v1/n18-2125
Date
January 1, 2018
Publisher
Association for Computational Linguistics