Learning Unknown Opponent Models Based on Interactive Dynamic Influence Diagrams

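Basic Information

Grant number: 61375070

Project category: General Program

Funding amount: 76.00

Principal investigator: 曾一锋

Discipline classification:

Host institution: Xiamen University

Approval year: 2013

Completion year: 2017

Project period: 2014-01-01 - 2017-12-31

Project status: Completed

Project participants: 罗键, 杨帆, 陈碧连, 武鹤, 武斌, 吴文渊, 李旋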

Keywords:

opponent model learning; Bayesian network learning; interactive dynamic influence diagrams

Final Report Abstract

By identifying the true model of other agents, a subject agent can correctly predict their behaviour and act optimally. However, the subject agent's limited model space may not contain the true model of other agents, in which case Bayes' rule can no longer be used to update the beliefs over other agents' models. This project builds on the framework of Interactive Dynamic Influence Diagrams (I-DIDs) and investigates learning the dependency among agents' models, with the aim of improving I-DID solutions in practice. By relating the true model to the candidate models of other agents, the subject agent can weight the candidate models precisely and optimize its decisions during the interaction. The project will integrate multiagent systems and information theory with machine learning techniques to study opponent model learning in I-DIDs. Firstly, it will use mutual information to quantify the model dependency and will study its relevant properties. Secondly, it will compute the mutual information by learning the parameters of dynamic hidden Bayesian networks, which introduce hidden variables into dynamic Bayesian networks. Thirdly, it will propose a sequential learning mechanism to improve algorithmic adaptability. Fourthly, it will further optimize the subject agent's model space by tracking the model dependency over time; the reduced complexity of the model space is expected to improve I-DID solutions significantly. Finally, it will develop an Unmanned Aerial Vehicle (UAV) simulation platform and verify the performance of the proposed model and algorithms in terms of effectiveness and reliability. It will also demonstrate the proposed techniques on real-world competitive humanoid mobile robots.
In summary, this project will, for the first time, incorporate online machine learning techniques into I-DID solutions and overcome the limitations of Bayesian updates in I-DIDs. This will establish a solid foundation for solving a wide range of sequential multiagent decision-making problems in practice.
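The idea of using mutual information to weight candidate models can be illustrated with a minimal sketch. This is not the project's algorithm: the function names `mutual_information` and `weight_candidates`, the toy action sequences, and the choice to normalise the scores into weights are all assumptions made for illustration. The sketch estimates the empirical mutual information between the actions each candidate model predicts and the actions actually observed.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def weight_candidates(observed, predictions):
    """Hypothetical weighting: normalise each candidate's MI score to sum to 1."""
    scores = {name: mutual_information(pred, observed)
              for name, pred in predictions.items()}
    total = sum(scores.values())
    if total == 0:
        # No candidate is informative: fall back to uniform weights.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: s / total for name, s in scores.items()}


# Toy data (assumed): observed actions of the other agent and the actions
# that two hypothetical candidate models would have predicted.
observed = ["L", "R", "L", "R", "L", "R", "L", "R"]
predictions = {
    "m1": ["L", "R", "L", "R", "L", "R", "L", "R"],  # tracks the observations
    "m2": ["L", "L", "L", "L", "L", "L", "L", "L"],  # constant prediction
}
weights = weight_candidates(observed, predictions)
```

Note that mutual information rewards any consistent statistical relation, including a perfectly inverted one, so a candidate need not match the true model exactly to receive weight. Unlike a Bayesian update, this does not require the true model to lie inside the model space.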

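In multi-agent sequential decision optimization, a subject agent needs to identify the true models of other agents in order to predict their behaviour accurately. However, the subject agent's limited model space often does not contain the true models of other agents, so the traditional Bayesian formula cannot be used to update the beliefs over other agents' models. Based on interactive dynamic influence diagrams, this project intends to study how to optimize the subject agent's decisions by learning the correlations among models. Combining multi-agent decision systems, machine learning, and information theory, the project uses mutual information to quantify the correlations among models; constructs dynamic hidden Bayesian networks to compute the mutual information accurately; establishes a sequential learning technique to enhance algorithmic adaptability; dynamically optimizes the model space according to real-time changes in the correlations; develops an unmanned aerial vehicle simulation platform to verify the correctness of the model and its algorithms; and uses adversarial exercises with humanoid robots to demonstrate the practical utility of the techniques. The project will, for the first time, integrate online machine learning methods into the solution process of interactive dynamic influence diagrams, thereby overcoming the limitations of traditional Bayesian model updating and providing a solid foundation for solving practical multi-agent sequential decision optimization problems.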

Project Abstract

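In multi-agent sequential decision optimization, a subject agent needs to identify the true models of other agents in order to predict their behaviour accurately. However, the subject agent's limited model space often does not contain the true models of other agents, so the traditional Bayesian formula cannot be used to update the beliefs over other agents' models. Building on the basic framework of interactive dynamic influence diagrams, this project started from the study of correlations among models and used machine learning methods to reduce the model space and thereby make the models easier to solve. The research further improved the sequential decision-making capability of multi-agent systems and examined in depth the problem of identifying an agent's true model. It produced several main results that offer guidance for research on multi-agent systems. The main results were published at top international conferences and in major academic journals in the field of agent research.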

Project Outputs

No outputs of this type are listed.

Data last updated: 2023-05-31


Other Grants Held by 曾一锋

Grant number: 61772442

Approval year: 2017

Funding amount: 58.00

Project category: General Program

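Similar NSFC Grants

Research and Application of Multi-Agent-Based Communicative Interactive Dynamic Influence Diagrams

Grant number: 60975052

Approval year: 2009

Principal investigator: 罗键

Discipline classification: F0304

Funding amount: 31.00

Project category: General Program

Research on Deep-Learning-Based Methods for Identifying Opponents' True Models

Grant number: 61703156

Approval year: 2017

Principal investigator: 武鹤

Discipline classification: F0305

Funding amount: 21.00

Project category: Young Scientists Fund

Research on Operation Control of Photovoltaic-Storage Microgrids Based on Interactive Dynamic Influence Diagrams

Grant number: 61703091

Approval year: 2017

Principal investigator: 李波

Discipline classification: F0302

Funding amount: 23.00

Project category: Young Scientists Fund

Research and Application of Value-Equivalence-Based Solution Methods for Interactive Dynamic Influence Diagrams

Grant number: 61772442

Approval year: 2017

Principal investigator: 曾一锋

Discipline classification: F06

Funding amount: 58.00

Project category: General Program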
