Google | Autoregressive Decoders for Multi-Tasking in Computer Vision

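This work studies multi-task autoregressive decoders in computer vision, covering classification, captioning, visual question answering, and OCR. Through extensive experiments, it examines how task and data mixing, training and regularization hyperparameters, conditioning type and specificity, and modality combination affect the performance of autoregressive decoders. It proposes a decoder setup named Locked-image Tuning with Decoder (LiT Decoder).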

A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai
[Google Research]

https://arxiv.org/abs/2303.17376

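Motivation: In recent years, a growing number of computer vision models handle multiple tasks by pairing an image encoder with an autoregressive decoder. Most existing work, however, presents only a single model and its results, with little detail on design decisions and system trade-offs. The paper aims to fill this gap.
Method: The paper examines in depth the factors that affect multi-task learning with autoregressive decoders in computer vision, including task and data mixing, training and regularization hyperparameters, conditioning type and specificity, and modality combination. It proposes a model architecture that locks (freezes) the image encoder and trains an autoregressive decoder on top of it (LiT decoder). This can be seen as teaching a decoder to interact with a pretrained vision model through natural language.
Strengths: Extensive systematic experiments and comparisons against single-task baselines reveal the cost of multi-task learning and show that a small autoregressive decoder on top of a pretrained encoder performs very well.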
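To make the setup described above concrete, here is a minimal sketch of a frozen encoder feeding a task-conditioned autoregressive decoding loop. It is an illustration only, not the paper's implementation: `frozen_encoder`, `greedy_decode`, `toy_scorer`, the token ids, and the use of greedy decoding are all hypothetical choices made for this example, and the rule-based scorer stands in for a trained decoder.

```python
from typing import Callable, List, Sequence

Features = List[float]
# A scorer maps (image features, tokens so far) to one score per vocabulary id.
Scorer = Callable[[Features, Sequence[int]], List[float]]

# Hypothetical token ids for this toy example.
BOS, EOS = 0, 1


def frozen_encoder(image: Sequence[float]) -> Features:
    """Stand-in for a locked, pretrained image encoder (no trainable state)."""
    mean = sum(image) / len(image)
    return [mean, max(image), min(image)]


def greedy_decode(
    features: Features,
    task_prompt: Sequence[int],
    scorer: Scorer,
    max_len: int = 8,
) -> List[int]:
    """Generate tokens one at a time, conditioned on image features and a task prompt."""
    tokens = [BOS, *task_prompt]
    prefix_len = len(tokens)
    for _ in range(max_len):
        scores = scorer(features, tokens)
        # Greedy choice: take the highest-scoring vocabulary id.
        next_token = max(range(len(scores)), key=scores.__getitem__)
        if next_token == EOS:
            break
        tokens.append(next_token)
    # Return only the generated part, without BOS and the task prompt.
    return tokens[prefix_len:]


def toy_scorer(features: Features, tokens: Sequence[int]) -> List[float]:
    """Hypothetical rule-based scorer: emits token 5 twice, then EOS.

    Assumes the prefix is BOS plus exactly one task-prompt token.
    """
    vocab_size = 6
    scores = [0.0] * vocab_size
    generated = len(tokens) - 2
    scores[5 if generated < 2 else EOS] = 1.0
    return scores


# Token id 2 plays the role of a task prompt such as "caption:".
output = greedy_decode(frozen_encoder([0.1, 0.5, 0.9]), [2], toy_scorer)
```

In this sketch the task is selected purely by the prompt tokens placed before generation starts, so the same decoder loop can serve classification, captioning, VQA, or OCR by changing the prefix. Only the scorer would be trained in such a setup; the encoder function is never modified.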
