By identifying the true model of other agents, a subject agent can correctly predict their behaviour and act optimally. However, the subject agent's limited model space does not always contain the true model of other agents, in which case Bayesian rules are no longer valid for updating its beliefs over other agents' models. This project builds on the framework of Interactive Dynamic Influence Diagrams (I-DID) and investigates how to learn the dependency between agents' models in order to improve I-DID solutions in practice. By relating the true model to the candidate models of other agents, the subject agent can weight the candidate models more precisely and optimize its decisions during their interactions. The project will integrate multiagent systems and information theory with machine learning techniques to study the learning of opponent models in I-DID. First, it will use mutual information to quantify model dependency and study its relevant properties. Second, it will compute the mutual information by learning the parameters of dynamic hidden Bayesian networks, which introduce hidden variables into dynamic Bayesian networks. Third, it will propose a sequential learning mechanism to improve algorithmic adaptability. Fourth, it will further optimize the subject agent's model space by tracking model dependency over time; the reduced complexity of the model space is expected to significantly improve I-DID solutions. Finally, it will develop an Unmanned Aerial Vehicle (UAV) simulation platform to verify the effectiveness and reliability of the proposed model and algorithms. It will also demonstrate the proposed techniques in real-world competitive scenarios with humanoid mobile robots.
In summary, this project will be the first to incorporate online machine learning techniques into I-DID solutions, overcoming the limitations of Bayesian updates in I-DID. This will establish a solid foundation for solving a wide range of sequential multiagent decision-making problems in practice.
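The limitation of Bayesian updates mentioned above can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the project itself: two hypothetical candidate models, each summarized only by the probability it assigns to the action the other agent was just observed to take. The function name `bayes_update` and all numbers are illustrative assumptions.

```python
def bayes_update(prior, likelihoods):
    """Posterior over candidate models, or None when the update is undefined.

    prior[i]       -- current belief in candidate model i
    likelihoods[i] -- Pr(observed action | candidate model i)
    """
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # No candidate explains the observation: normalizing would be 0/0.
        return None
    return [u / total for u in unnormalized]


# Hypothetical uniform prior over two candidate models.
prior = [0.5, 0.5]

# Case 1: the first candidate predicts the observed action, the second does not.
posterior_explained = bayes_update(prior, [1.0, 0.0])

# Case 2: the true model lies outside the model space, so neither candidate
# assigns any probability to the observed action.
posterior_unexplained = bayes_update(prior, [0.0, 0.0])

print(posterior_explained)
print(posterior_unexplained)
```

In the second case the belief cannot be renormalized, which is the situation in which a dependency-based weighting of candidate models becomes attractive.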
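For sequential multiagent decision optimization problems, the subject agent needs to identify the true models of other agents in order to predict their behaviour accurately. However, the subject agent's limited model space often does not contain the true models of other agents, so the traditional Bayesian formula cannot be used to update the beliefs over other agents' models. Based on interactive dynamic influence diagrams, this project studies how to optimize the subject agent's decisions by learning the dependency between models. Combining multiagent decision systems, machine learning, and information theory, the project uses mutual information to quantify the dependency between models; constructs dynamic hidden Bayesian networks to compute the mutual information accurately; establishes a sequential learning technique to strengthen algorithmic adaptability; dynamically optimizes the model space according to real-time changes in dependency; develops an unmanned aerial vehicle simulation platform to verify the correctness of the model and its algorithms; and uses adversarial exercises with humanoid robots to demonstrate the practical utility of the techniques. The project will be the first to integrate online machine learning methods into the solution process of interactive dynamic influence diagrams, thereby overcoming the limitations of traditional Bayesian model updating and providing a solid foundation for solving practical sequential multiagent decision optimization problems.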
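As a rough illustration of using mutual information to quantify model dependency, the sketch below estimates it empirically from paired action histories and turns the scores into weights over candidate models. This is a simplification under stated assumptions: the project proposes computing mutual information by learning dynamic hidden Bayesian networks, which this sketch does not attempt. The action sequences and the helper names `mutual_information` and `dependency_weights` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def dependency_weights(scores):
    """Normalize non-negative dependency scores; fall back to uniform if all are zero."""
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


# Hypothetical observed actions of the other agent.
observed = ["L", "R", "L", "R", "L", "R", "L", "R"]

# Candidate A never matches the observation exactly, but is perfectly dependent on it.
model_a = ["R", "L", "R", "L", "R", "L", "R", "L"]
# Candidate B is statistically independent of the observed actions.
model_b = ["L", "L", "R", "R", "L", "L", "R", "R"]

mi_a = mutual_information(model_a, observed)
mi_b = mutual_information(model_b, observed)
weights = dependency_weights([mi_a, mi_b])
print(mi_a, mi_b, weights)
```

Candidate A would receive zero likelihood under a Bayesian update because it never predicts the observed action, yet its mutual information with the observations is maximal, so a dependency-based weighting still treats it as informative.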
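For sequential multiagent decision optimization problems, the subject agent needs to identify the true models of other agents in order to predict their behaviour accurately. However, the subject agent's limited model space often does not contain the true models of other agents, so the traditional Bayesian formula cannot be used to update the beliefs over other agents' models. Building on the basic framework of interactive dynamic influence diagrams, this project started from the dependency between models and used machine learning methods to reduce the model space and thereby improve model solving. The research further improved the capability of multiagent sequential decision making and examined in depth the problem of identifying an agent's true model. It produced several main research results, which offer some guidance for research on multiagent systems. The main results were published in top international conferences in the agent research field and in important academic journals.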
Data last updated: 2023-05-31
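Research and Application of Multiagent-Based Communicative Interactive Dynamic Influence Diagrams
Research on Deep-Learning-Based Methods for Identifying the True Model of Opponents
Research on Operation Control of Photovoltaic-Storage Microgrids Based on Interactive Dynamic Influence Diagrams
Research and Application of Value-Equivalence-Based Solution Methods for Interactive Dynamic Influence Diagrams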