The emergence of technologies such as autonomous unmanned aerial vehicles, automated trading algorithms, and large online ride-sharing platforms has placed increasingly complex demands on the associated control systems, and traditional centralized control models are becoming inadequate. Cooperative multiagent reinforcement learning has emerged in response. A main challenge in this field is extending multiagent reinforcement learning to large-scale cooperative systems, that is, how to use a unified reinforcement learning framework to describe the learning process so that a very large number of agents learn to complete a common task and to cooperate with one another. To address this problem, this project summarizes the shortcomings of existing cooperative multiagent reinforcement learning techniques and theories and proposes new research ideas and methods. It optimizes models and improves algorithms from three perspectives: self-organization of the multiagent interaction network, classification of active and passive agents, and individual agent learning. The aim is to design low-overhead cooperative reinforcement learning algorithms and mechanisms applicable to large-scale agent environments, promoting coordination and order within multiagent groups and maximizing the overall utility of the system. The results of this project are significant for further developing and improving multiagent group learning theory, providing theoretical guidance for distributed intelligent control and design, and enhancing China's international competitiveness in this field.
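This project addressed two scientific questions in large-scale multiagent settings: how to reduce the cost of interaction among agents, and how to design learning strategies with different functional requirements according to the differences among agents. It optimized models and improved algorithms from three perspectives (self-organization of the multiagent interaction network, classification of active and passive agents, and individual agent learning) and designed low-overhead cooperative reinforcement learning algorithms and mechanisms applicable to large-scale agent environments, promoting coordination and order within multiagent groups and maximizing the overall utility of the system. Supported by this project, the author and the team published 9 papers in journals including IEEE T-ITS, IEEE TVT, and IEEE TNSE and at international conferences including AAAI and DAI. The algorithms were tested and analyzed in several multiagent application scenarios, including automatic control of urban traffic signals, advertisement recommendation algorithm design, and analysis of opinion evolution in social networks, in support of industry-academia integration and of societal needs such as smart city development and online public opinion guidance.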
Data last updated: 2023-05-31