Event Registration | How to Train a 100B-Parameter Language Model from Scratch on a ¥700K Budget


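Yequan Wang (王业全)

Head of the Cognitive Model team at the Beijing Academy of Artificial Intelligence (BAAI), Ph.D. from Tsinghua University, and member of the Affective Computing Technical Committee of the Chinese Information Processing Society of China. In 2022 he was named an AI 2000 Most Influential Scholar in artificial intelligence (natural language processing). His research focuses on large language models and natural language processing; representative work includes FLM-101B, FreeLM, Mu-Scaling, MSG, and ATAE-LSTM.

He has published a number of papers at top international conferences, with more than 2,500 citations on Google Scholar. His papers ATAE-LSTM and RNN-Capsule were named most influential papers by Paper Digest, and his work has repeatedly appeared in the Google Scholar Metrics lists.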

How to Train a 100B-Parameter Language Model from Scratch on a ¥700K Budget

Large language models (LLMs), with the GPT series as the best-known example, have achieved remarkable success in NLP and multimodal tasks. However, their high cost constrains further rapid development, which creates both opportunities and challenges for academia and industry. To lower this barrier, FLM-101B employs a growth strategy and brings the cost of training a 100B-level dense model down to ¥700,000 (CNY).

In addition, to evaluate LLMs more comprehensively and rationally, we propose an IQ test for LLMs that complements existing knowledge-based assessments and borrows part of its concept from psychology. Experimental results show that the model trained on the ¥700K budget achieves performance comparable to powerful, well-known models and demonstrates impressive capabilities. We believe the growth strategy offers new possibilities for breakthroughs in training monolithic dense models at the trillion-parameter scale.

Time: Thursday, September 21, 14:30-15:30

Format: online livestream; scan the QR code below to register


Published in: BAAI (智源)

September 20, 2023
