Journal article

Aligning source visual and target language domains for unpaired video captioning

Abstract:

Training a supervised video captioning model requires coupled video-caption pairs. However, for many target languages, sufficient paired data are not available. To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language. To solve the task, a natural choice is to employ a two-step pipeline system: first utilizing a video-to-pivot captioning model to generate captions in a pivot language and then utilizing pivot-to-targ...

Publication status:
Published
Peer review status:
Peer reviewed

Access Document


Publisher copy:
10.1109/tpami.2021.3132229

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Magdalen College
Role:
Author
ORCID:
0000-0001-7715-5228
National Natural Science Foundation of China
Publisher:
IEEE
Journal:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume:
44
Issue:
12
Pages:
9255-9268
Publication date:
2022-12-02
Acceptance date:
2021-11-16
DOI:
10.1109/tpami.2021.3132229
EISSN:
1939-3539
ISSN:
0162-8828
PMID:
34855588
