LLM + Agent Session: Latest Advances in LLMs and Agents | The 16th China-R Conference and 2023 X-AGI Conference

2023-11-13 08:11

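The 16th China-R Conference and 2023 X-AGI Conference will be held at Renmin University of China from November 25 to 30, 2023. The conference is hosted by the School of Statistics of Renmin University of China, the Center for Applied Statistics of Renmin University of China, Capital of Statistics (统计之都), Yuanling Technology (原灵科技), and the Artificial Intelligence Branch (in preparation) of the China Business Statistics Society (中国商业统计学会). It is organized by the Department of Data Science and Big Data Statistics of the School of Statistics, Renmin University of China, and sponsored by Kuande Investment (宽德投资), Minghong Investment (明汯投资), Heywhale Technology (和鲸科技), and Zibo Design (子博设计). The conference will combine online and in-person sessions.

Visit the official website of the China-R Conference and 2023 X-AGI Conference for more information!

Link: https://china-r.cosx.org/bj2023/index.html
More details: in the menu bar of the 统计之都 WeChat official account, tap "最新活动" (Latest Events) -> "最新R会" (Latest R Conference).

Below are the talks in the LLM + Agent session of the China-R Conference and 2023 X-AGI Conference. The session chair is Zihan Wang (王子涵).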

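LLM + Agent Session

Time: 9:30-11:45 a.m., November 26, 2023

Venue:

In person: Room 802, Lide Building (立德楼), Renmin University of China
Online: click "Read the original" or scan the QR code below

Session Program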

Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

Zihan Wang (王子涵)

Biography:

Zihan Wang is an undergraduate student at GSAI, Renmin University of China, where he is currently advised by Prof. Zhicheng Dou. He works closely with Prof. Heng Ji from UIUC and Dr. Weiyan Shi from Stanford University. His research interests lie mainly in augmented language models, including (1) general language model interaction and (2) the cross-application of language models and information retrieval (IR) systems.

Abstract:

Current LLM evaluations focus on single-turn settings and overlook multi-turn real-world scenarios. We introduce the MINT benchmark to assess LLMs in multi-turn interaction with tools and language feedback. Our study of 20 LLMs shows that they benefit from multi-turn interaction, but current RLHF and SIFT methods might hinder this ability. MINT aims to encourage research on the multi-turn capabilities of LLMs, especially open-source models.
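To make the idea of multi-turn evaluation with language feedback concrete, here is a minimal Python sketch. It is not the MINT benchmark or its code: the number-guessing task, the `bisect_policy` stand-in for a model, and the "too low" / "too high" hints are all hypothetical, and tool use is left out entirely. It only illustrates how a success rate can be measured as a function of the turn budget.

```python
from typing import Callable, List, Tuple

# One past turn: (answer the policy gave, feedback text it received).
History = List[Tuple[int, str]]


def run_episode(policy: Callable[[History], int], target: int, max_turns: int) -> bool:
    """Let the policy answer up to max_turns times, with feedback after each miss."""
    history: History = []
    for _ in range(max_turns):
        answer = policy(history)
        if answer == target:
            return True
        # Hypothetical stand-in for natural-language feedback.
        hint = "too low" if answer < target else "too high"
        history.append((answer, hint))
    return False


def bisect_policy(history: History) -> int:
    """Toy stand-in for an LLM: narrows the range [0, 15] using past feedback."""
    lo, hi = 0, 15
    for answer, hint in history:
        if hint == "too low":
            lo = max(lo, answer + 1)
        else:
            hi = min(hi, answer - 1)
    return (lo + hi) // 2


def success_rate(max_turns: int) -> float:
    """Fraction of the 16 toy tasks solved within the given turn budget."""
    targets = range(16)
    solved = sum(run_episode(bisect_policy, t, max_turns) for t in targets)
    return solved / len(targets)


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, success_rate(k))
```

In this toy setting the success rate rises with the turn budget, because each round of feedback lets the policy rule out part of the answer space.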

Boosting Language Models with High-quality Feedback

Lifan Yuan (袁立凡)

Biography:

Lifan Yuan is a member of THUNLP, advised by Prof. Zhiyuan Liu. Currently, he is also a research intern at Blender NLP, UIUC, working with Prof. Heng Ji and Prof. Hao Peng. His research interests mainly lie in building trustworthy NLP systems, and he is also interested in enhancing LM agents through internal alignment (e.g., RLHF) and external interaction (with tools/feedback).

Abstract:

Reinforcement learning from human feedback (RLHF) has become a pivotal technique for aligning large language models (LLMs) with human preferences. However, the scarcity of diverse, naturalistic, large-scale datasets of human preferences on LLM outputs poses a great challenge to RLHF and to feedback learning research in the open-source community. This work investigates how high-quality feedback data can enhance LLMs.

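UAV Air Combat Decision-Making Based on Reinforcement Learning

Zeyi Liu (刘泽一)

Biography:

Zeyi Liu is with the Chinese Aeronautical Establishment. He received his Ph.D. from the National University of Defense Technology, and his main research interests are complex network analysis and multi-agent deep reinforcement learning.

Abstract:

As a major component of the future battlefield, unmanned aerial vehicles (UAVs) urgently need autonomous air combat decision-making capability. Rule-based air combat decision algorithms often cannot adapt to complex battlefield environments, whereas reinforcement learning can overcome this problem and make air combat decisions based on the real-time situation. For autonomous decision-making in close-range air combat, this work proposes a UAV autonomous air combat decision-making method based on the SAC (Soft Actor-Critic) algorithm. To make the problem more realistic, the simulation environment is built on a six-degree-of-freedom aircraft model and accounts for air combat elements such as missiles. The reward function is designed from the relative angle, position, and velocity of the friendly and enemy aircraft, and the agent is trained through self-play. Simulation results show that the method achieves autonomous UAV air combat decision-making and improves the UAV's autonomous decision-making capability.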
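The abstract says the reward is built from the relative angle, position, and velocity of the two aircraft but does not give its form. The Python sketch below shows one way such a shaped reward could look. Every term, weight, and constant (`ideal_range`, `speed_scale`, the three `w_*` weights) is an assumption for illustration, not the speaker's design. SAC training, the six-degree-of-freedom model, missiles, and self-play are not shown.

```python
import math
from typing import Sequence


def _norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(a * a for a in v))


def _angle_between(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle in radians between two 3D vectors (0 if either vector is zero)."""
    nu, nv = _norm(u), _norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))


def shaped_reward(
    own_pos: Sequence[float],
    own_vel: Sequence[float],
    enemy_pos: Sequence[float],
    enemy_vel: Sequence[float],
    ideal_range: float = 1000.0,  # assumed preferred distance, in meters
    speed_scale: float = 100.0,   # assumed scale for speed differences, in m/s
    w_angle: float = 0.5,         # assumed weights, chosen only for illustration
    w_range: float = 0.3,
    w_speed: float = 0.2,
) -> float:
    """Hypothetical per-step reward in [0, 1] from angle, range, and speed terms."""
    # Line of sight from own aircraft to the enemy.
    los = [e - o for o, e in zip(own_pos, enemy_pos)]
    dist = _norm(los)

    # Angle term: 1 when own velocity points at the enemy, 0 when pointing away.
    r_angle = 1.0 - _angle_between(own_vel, los) / math.pi

    # Range term: peaks at 1 when the distance equals ideal_range.
    r_range = math.exp(-abs(dist - ideal_range) / ideal_range)

    # Speed term: own speed advantage squashed into (0, 1).
    advantage = _norm(own_vel) - _norm(enemy_vel)
    r_speed = 0.5 * (1.0 + math.tanh(advantage / speed_scale))

    return w_angle * r_angle + w_range * r_range + w_speed * r_speed
```

For example, with both aircraft flying at the same speed, the enemy directly ahead at `ideal_range` gives the maximum angle and range terms, while turning directly away from the enemy drops the angle term to zero.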

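In-Person Participation

This session will run online and in person simultaneously. The in-person venue is at Renmin University of China, and the online venue is the Xueshuo (学说) live-streaming platform. In-person attendees need to register by scanning the QR code below. Everyone is welcome to join, whether online or in person!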

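About the Conference

Hosts:

School of Statistics, Renmin University of China
Center for Applied Statistics, Renmin University of China
Capital of Statistics (统计之都)
Yuanling Technology (原灵科技)
Artificial Intelligence Branch (in preparation), China Business Statistics Society (中国商业统计学会)

Sponsors:

Kuande Investment (宽德投资)
Minghong Investment (明汯投资)
Heywhale Technology (和鲸科技)
Zibo Design (子博设计)

Contact

WeChat official account: 统计之都
Conference email: [email protected]

Capital of Statistics (统计之都): a professional, people-oriented, and upright Chinese statistics community.

How to follow: scan the QR code below, or search WeChat official accounts for 统计之都 or CapStat.

Past posts: open the 统计之都 chat window and tap the person icon in the top-right corner to view the message history.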
