CS Peer Talk | Last-Iterate Convergence for Learning…

Issue 39

Speaker: Weiqiang Zheng, Yale University

Time: Saturday, January 13, 4:00 pm

Venue: Room 204, Jingyuan Courtyard No. 5 (静园五院)

Talk Information

Title

Last-Iterate Convergence for Learning in Markov Games with Bandit Feedback

Abstract

Online learning in multi-player games captures many modern machine learning applications, ranging from generative adversarial networks and adversarial training to robust optimization and multi-agent reinforcement learning. Understanding last-iterate convergence in games is crucial since the last iterate characterizes the stability of the learning process and is widely used in practice.

In this talk, we study the problem of learning in two-player zero-sum Markov games, focusing on developing decentralized learning algorithms with non-asymptotic last-iterate convergence rates to Nash equilibrium. We first present a simple algorithm with a last-iterate convergence rate in two-player zero-sum matrix games with bandit feedback. To the best of our knowledge, this is the first result that obtains a finite-time last-iterate convergence rate given access to only bandit feedback. We then extend our result to the setting of two-player zero-sum Markov games, providing the first set of decentralized algorithms with non-asymptotic last-iterate/path convergence rates. This talk is based on joint work with Yang Cai, Haipeng Luo, and Chen-Yu Wei.

Biography

Weiqiang Zheng is a third-year PhD student in Computer Science at Yale University, advised by Prof. Yang Cai. He received his bachelor’s degree in Computer Science from the Turing Class at Peking University. He has broad research interests in game theory, online learning, and optimization.

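About CS Peer Talk

As the initiators of this event, we are the Research Activities Committee of the Turing Class at Peking University, made up mainly of Turing Class students from all years. We hope to build a platform where CS students can exchange ideas, to foster communication and collaboration among students, to help them practice presenting, and to build friendships along the way.

Series currently being planned include, but are not limited to:

Tutorial series: mainly student speakers, introducing their research areas
Research series: mainly student speakers, presenting their research results
Guest series: invited faculty give talks on a chosen topic

Unless the speaker specifically requests otherwise, talks are closed to the public by default; we hope to create a free and relaxed yet mutually motivating atmosphere for exchange.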

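Call for Speakers

If you would like to share your research results or experiences, look back on your work, and spark new ideas, you are welcome to nominate yourself.

Speaker sign-up: email cs_research_tc@163.com with your proposed title, content, and time.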

Turing Class Research Activities Committee, Peking University

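All content on this WeChat official account that is created or collected by the WeChat account of the Center on Frontiers of Computing Studies, Peking University, including text, images, and audio and video materials, is copyrighted by the WeChat account of the Center on Frontiers of Computing Studies, Peking University; text, images, and audio and video materials collected or compiled from public sources or reposted with authorization are copyrighted by their original authors. If an original author does not wish their content to be published on this account, please notify us promptly and it will be removed.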
