Semi-Autoregressive Transformer for Image Captioning
Current state-of-the-art image captioning models use autoregressive decoders: they generate one word after another, which leads to high latency during inference. Non-autoregressive models predict all the words in parallel; however, they suffer from quality degradation because they remove too much of the dependence between words.
The authors propose a semi-autoregressive approach to image captioning that improves the trade-off between speed and quality: the model stays autoregressive globally but generates words in parallel locally. Experiments on MSCOCO show that SATIC, the proposed model, achieves a better trade-off without bells and whistles.
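To make the "autoregressive globally, parallel locally" idea concrete, here is a minimal toy sketch of a group-wise decoding loop. It is not the authors' implementation: `semi_autoregressive_decode`, `make_toy_predictor`, and the `group_size` parameter are hypothetical names chosen for illustration, and the "model" is a stand-in that simply reads from a fixed caption rather than attending to an image.

```python
from typing import Callable, List, Tuple

EOS = "<eos>"

# A predictor takes the words generated so far and a group size,
# and returns the next group of words in a single call.
Predictor = Callable[[List[str], int], List[str]]


def semi_autoregressive_decode(
    predict_group: Predictor, group_size: int, max_len: int
) -> Tuple[List[str], int]:
    """Generate a caption group by group.

    Groups are produced one after another (autoregressive across groups),
    while the words inside a group come from one predictor call
    (parallel within a group). Returns the words and the number of calls.
    """
    words: List[str] = []
    steps = 0
    while len(words) < max_len:
        group = predict_group(words, group_size)  # one decoding step
        steps += 1
        for word in group:
            if word == EOS:
                return words, steps
            words.append(word)
            if len(words) == max_len:
                return words, steps
    return words, steps


def make_toy_predictor(caption: List[str]) -> Predictor:
    """Stand-in for a trained decoder (assumption): replays a fixed caption."""
    padded = caption + [EOS]

    def predict_group(prefix: List[str], k: int) -> List[str]:
        start = len(prefix)
        group = padded[start:start + k]
        # Pad short groups with EOS so every call returns exactly k words.
        return group + [EOS] * (k - len(group))

    return predict_group


if __name__ == "__main__":
    caption = "a man riding a wave on a surfboard".split()
    for k in (1, 3):
        words, steps = semi_autoregressive_decode(make_toy_predictor(caption), k, 20)
        print(k, steps, " ".join(words))
```

With a group size of 1 the loop reduces to ordinary autoregressive decoding; for the eight-word toy caption that takes 9 decoder calls, while a group size of 3 takes 3. Larger groups mean fewer sequential steps, at the cost of weaker dependence between words inside a group, which is the trade-off the paper targets.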
Paper: https://arxiv.org/abs/2106.09436
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-satic