Semi-Autoregressive Transformer for Image Captioning | Data Science by ODS.ai 🦜

Semi-Autoregressive Transformer for Image Captioning

Current state-of-the-art image captioning models use autoregressive decoders: they generate one word after another, which leads to high latency during inference. Non-autoregressive models predict all the words in parallel; however, their caption quality degrades because they discard too much of the dependence between words.
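
The difference between the two decoder families can be written as two factorizations of the caption probability. This is the standard textbook form, not a formula quoted from the paper; the symbols are assumptions: I is the image, y_1, ..., y_T are the caption words, and y_{<t} denotes the words before position t.

```latex
p_{\mathrm{AR}}(y \mid I) = \prod_{t=1}^{T} p\left(y_t \mid y_{<t},\, I\right),
\qquad
p_{\mathrm{NAR}}(y \mid I) = \prod_{t=1}^{T} p\left(y_t \mid I\right)
```

The autoregressive product must be evaluated one factor at a time, because each factor depends on the previously generated words. The non-autoregressive product can be evaluated in a single parallel pass, but it treats the words as conditionally independent given the image.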

The authors propose a semi-autoregressive model for image captioning (SATIC) to improve the trade-off between speed and quality: the model keeps the autoregressive property globally but generates words in parallel locally. Experiments on MSCOCO show that SATIC achieves a better trade-off without bells and whistles.

Paper: https://arxiv.org/abs/2106.09436

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-satic

#imagecaptioning #deeplearning #transformer

Data Science by ODS.ai 🦜

First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of f...
