Self-critical n-step training for image captioning

  • Junlong Gao
  • Shiqi Wang*
  • Shanshe Wang
  • Siwei Ma
  • Wen Gao
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Existing methods for image captioning are usually trained by cross entropy loss, which leads to exposure bias and the inconsistency between the optimizing function and evaluation metrics. Recently it has been shown that these two issues can be addressed by incorporating techniques from reinforcement learning, where one of the popular techniques is the advantage actor-critic algorithm that calculates per-token advantage by estimating state value with a parametrized estimator at the cost of introducing estimation bias. In this paper, we estimate state value without using a parametrized value estimator. With the properties of image captioning, namely, the deterministic state transition function and the sparse reward, state value is equivalent to its preceding state-action value, and we reformulate advantage function by simply replacing the former with the latter. Moreover, the reformulated advantage is extended to n-step, which can generally increase the absolute value of the mean of reformulated advantage while lowering variance. Then two kinds of rollout are adopted to estimate state-action value, which we call self-critical n-step training. Empirically we find that our method can obtain better performance compared to the state-of-the-art methods that use the sequence level advantage and parametrized estimator respectively on the widely used MSCOCO benchmark.
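
The reformulation described in the abstract can be illustrated with a minimal sketch. It is one reading of the abstract, not the authors' implementation: the function names, the `q_start` baseline for the empty caption, the truncation at the end of the caption, and the undiscounted n-step form A_t^(n) = Q(s_{t+n-1}, a_{t+n-1}) - Q(s_{t-1}, a_{t-1}) are all assumptions made for illustration. The state-action values are taken as given here; the abstract states that they are estimated by rollouts.

```python
from typing import List


def one_step_advantages(q_values: List[float], q_start: float) -> List[float]:
    """One-step reformulated advantage: A_t = Q(s_t, a_t) - Q(s_{t-1}, a_{t-1}).

    q_values[t] is an estimate of Q(s_t, a_t) for token t of a sampled caption.
    q_start is a hypothetical stand-in for the value preceding the first token.
    """
    advantages = []
    previous = q_start
    for q in q_values:
        advantages.append(q - previous)
        previous = q
    return advantages


def n_step_advantages(q_values: List[float], q_start: float, n: int) -> List[float]:
    """Assumed n-step form: A_t^(n) = Q(s_{t+n-1}, a_{t+n-1}) - Q(s_{t-1}, a_{t-1}).

    Near the end of the caption the look-ahead is truncated at the last token.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # padded[t] holds the value preceding token t; padded[t + 1] holds Q(s_t, a_t).
    padded = [q_start] + list(q_values)
    last = len(q_values)
    return [padded[min(t + n, last)] - padded[t] for t in range(last)]


if __name__ == "__main__":
    q = [0.5, 0.75, 1.0, 1.0]  # made-up values, not results from the paper
    print(one_step_advantages(q, q_start=0.25))
    print(n_step_advantages(q, q_start=0.25, n=2))
```

Under these assumptions, `n = 1` reduces to the one-step form, and for larger `n` each entry equals the sum of the next `n` one-step advantages, because the intermediate state-action values cancel.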

Original language: English
Title of host publication: Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Publisher: IEEE Computer Society
Pages: 6293-6301
Number of pages: 9
ISBN (Electronic): 9781728132938
DOI
Publication status: Published - Jun 2019
Externally published
Event: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 Jun 2019 to 20 Jun 2019

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2019-June
ISSN (Print): 1063-6919

Conference

Conference: 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/Territory: United States
City: Long Beach
Period: 16/06/19 to 20/06/19
