Northeastern University (US): Asking GPT-4 to reflect on "Why were you wrong?" can improve performance by 30%

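Recent large language model (LLM) agents have achieved impressive decision-making results across a variety of benchmarks. However, these state-of-the-art approaches typically require internal model fine-tuning, external model fine-tuning, or policy optimization over a defined state space. Implementing them can be challenging because high-quality training data is scarce or a well-defined state space is lacking. Moreover, these agents lack certain qualities inherent to human decision-making, in particular the ability to learn from mistakes. Self-reflection lets humans solve novel problems efficiently through trial and error.

Building on recent research, the paper proposes Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities. To achieve full automation, the authors introduce a simple yet effective heuristic that lets the agent pinpoint hallucination instances, avoid repetition in action sequences, and, in some environments, construct an internal memory map of the given environment. To evaluate the approach, the authors test the agent's ability to complete decision-making tasks in AlfWorld environments and knowledge-intensive, search-based question-answering tasks in HotPotQA environments, observing success rates of 97% and 51%, respectively, and discuss the emergent property of self-reflection.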

Title: Reflexion: an autonomous agent with dynamic memory and self-reflection

Authors: Noah Shinn, Beck Labash, Ashwin Gopinath

Paper: https://arxiv.org/abs/2303.11366

Code: https://github.com/noahshinn024/reflexion
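
The trial-and-reflect loop described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the class name ReflexionLoop, the helper is_stuck, the callables act, step, reflect, and reset, and the thresholds MAX_STEPS and REPEAT_LIMIT are all hypothetical and chosen for illustration. The sketch assumes the heuristic flags a trajectory when the same action yields the same observation several times in a row, and that each failed trial adds one reflection to a memory that the agent sees on its next attempt. In a real system, act and reflect would be LLM calls; here they are plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (action, observation)

MAX_STEPS = 30    # assumed per-trial step budget (illustrative value)
REPEAT_LIMIT = 3  # assumed: this many identical consecutive steps count as a loop


def is_stuck(history: List[Step], repeat_limit: int = REPEAT_LIMIT) -> bool:
    """Return True if the last `repeat_limit` (action, observation) pairs are identical."""
    if len(history) < repeat_limit:
        return False
    tail = history[-repeat_limit:]
    return all(step == tail[0] for step in tail)


@dataclass
class ReflexionLoop:
    # All four callables are hypothetical stand-ins for the agent and environment.
    act: Callable[[List[str], List[Step]], str]  # (reflections, history) -> action
    step: Callable[[str], Tuple[str, bool]]      # action -> (observation, done)
    reflect: Callable[[List[Step]], str]         # failed history -> reflection text
    reset: Callable[[], None]                    # restore the environment
    memory: List[str] = field(default_factory=list)

    def run_trial(self) -> bool:
        """Run one trial; on failure, store a reflection for later trials."""
        self.reset()
        history: List[Step] = []
        for _ in range(MAX_STEPS):
            action = self.act(self.memory, history)
            observation, done = self.step(action)
            history.append((action, observation))
            if done:
                return True
            if is_stuck(history):
                break  # heuristic fired: stop early instead of repeating
        self.memory.append(self.reflect(history))
        return False

    def run(self, max_trials: int = 3) -> int:
        """Return the 1-based index of the first successful trial, or 0 if none succeed."""
        for trial in range(1, max_trials + 1):
            if self.run_trial():
                return trial
        return 0


def demo() -> Tuple[int, List[str]]:
    """Toy task: the agent only tries the right action after it has reflected once."""
    def act(reflections: List[str], history: List[Step]) -> str:
        return "open drawer" if reflections else "open door"

    def step(action: str) -> Tuple[str, bool]:
        if action == "open drawer":
            return "you find the key", True
        return "nothing happens", False

    def reflect(history: List[Step]) -> str:
        return f"Repeating '{history[-1][0]}' did not work; try something else."

    loop = ReflexionLoop(act=act, step=step, reflect=reflect, reset=lambda: None)
    return loop.run(), loop.memory
```

Calling demo() runs the toy task: the first trial repeats a useless action until is_stuck fires, the stored reflection changes the agent's choice, and the second trial succeeds. Keeping the reflections as plain text in a list is a deliberate simplification; it shows how memory carries across trials without any model fine-tuning.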
