The existing literature on stochastic games concentrates on the case of an infinite optimality horizon and a constant discount factor. In many real-world game problems, however, the optimality horizon is random (e.g., the time at which a contest is won), and the discount factor, which corresponds to the so-called discount rate, is uncertain: the interest rate offered by a bank, for instance, may vary with the amount an investor deposits. In addition, as applications have spread, computation has become an active topic in stochastic games. Building on existing work, this proposal addresses two aspects of stochastic games: theory and computation. The first part deals with the first passage expected discounted payoff criterion with a random optimality horizon and state-dependent discount factors, which generalizes the standard stochastic game with an infinite optimality horizon and a constant discount factor. For this part, we seek conditions for the existence of Nash equilibria and discuss applications of the main results. The second part concerns computation under the same first passage criterion. As is well known, computation for a finite model is easier than for a model with a denumerable state space, so this part proceeds in two steps: (1) we first design an algorithm for computing saddle points in the finite game model; (2) using the results of (1) and an approximation technique, we then develop an approximation algorithm for saddle points in the game model with denumerable states. These topics are new in stochastic games; they are intended to advance both theoretical and applied research and to make the theory easier to use in practice.
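Existing theoretical work on stochastic games concentrates on the case of an infinite optimality horizon and a constant discount factor, whereas in many real-life game problems the optimality horizon is random (such as the time needed to win a contest) and the discount factor is uncertain (such as a bank's interest rate). Moreover, as applications have spread, computation has become an active topic in stochastic games. Building on existing work, this project studies both the theory and the computation of stochastic games. The theoretical part explores the first passage expected discounted payoff criterion, in which the optimality horizon is random and the discount factor depends on the state; it seeks conditions for the existence of Nash equilibria and analyzes applications of the main results, extending current stochastic games with a constant discount factor and an infinite optimality horizon. As is well known, computation for finite models is easier than for the countable-state case. For the first passage criterion above, the computational part includes: (1) designing a method for computing saddle points in the finite model; (2) using approximation techniques to develop an approximate method for computing saddle points in the countable-state model. These topics are new in stochastic games; they not only advance the theory of stochastic games but also deepen its applications and help theory guide practice.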
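The criterion described above can be written out as follows. This is only a sketch under assumed notation, none of which appears in the abstract: state space $S$, target set $B \subset S$, state process $x_n$, action processes $a_n$ and $b_n$ of the two players, payoff function $r$, state-dependent discount factor $\alpha(\cdot)$, and strategy pair $(\pi^1, \pi^2)$.

```latex
% First passage time into the (assumed) target set B
\tau_B := \inf\{\, n \ge 0 : x_n \in B \,\}

% First passage expected discounted payoff from initial state i,
% with an empty product (n = 0) equal to 1
V(i, \pi^1, \pi^2) :=
  \mathbb{E}_i^{\pi^1, \pi^2}\!\left[
    \sum_{n=0}^{\tau_B - 1}
      \Bigl( \prod_{k=0}^{n-1} \alpha(x_k) \Bigr)\, r(x_n, a_n, b_n)
  \right]
```

In this notation, taking $B = \emptyset$ (so that $\tau_B = \infty$) and $\alpha(\cdot) \equiv \alpha$ constant recovers the standard infinite-horizon discounted payoff $\mathbb{E}\bigl[\sum_{n \ge 0} \alpha^n r(x_n, a_n, b_n)\bigr]$, which is the sense in which the criterion generalizes the classical one.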
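This project mainly studies problems related to first passage criteria for stochastic games, specifically:
(1) Existence and computation of the optimal value function and optimal policies under the first passage mean-variance discounted criterion. For the optimization problem under this criterion, within the framework of semi-Markov decision processes (that is, semi-Markov stochastic games with a single player), we established the existence and uniqueness of the solution to the optimality equation, proved on this basis that optimal policies exist, and provided a policy improvement algorithm for computing optimal policies together with an iterative algorithm for the optimal value function. The results were published in the journal Kybernetika.
(2) Existence conditions for the value function and saddle points under the first passage expected discounted payoff criterion, and an approximation method for the value function. For the optimization problem under this criterion, within the framework of discrete-time two-person zero-sum stochastic games, we established the Shapley equation, obtained conditions for the existence of the value function and of saddle points, and gave an iterative algorithm for approximating the value function, error estimates, and two equivalent conditions for a pair of strategies to be a saddle point. The results were published in the journal Optimization.
Item (1) develops the mean-variance portfolio problem of Markowitz, a Nobel laureate in economics, in the setting of semi-Markov decision processes; item (2) extends, for the first time, the discounted criterion with varying discount factors to the two-person zero-sum stochastic game model.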
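The iterative approximation of the value function mentioned in item (2) can be illustrated with a small sketch. It is not the algorithm from the published paper: it assumes a finite state set, two actions per player in every state, a discount factor that depends only on the current state, and a Shapley-type update of the form V(s) = val[r(s,a,b) + alpha(s) * sum over s' of p(s'|s,a,b) V(s')] outside the target set, with V = 0 on the target set. All names (`matrix_game_value_2x2`, `first_passage_value_iteration`, the example data) are hypothetical.

```python
def matrix_game_value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))  # best guaranteed cap for the column player
    if maximin == minimax:
        return maximin  # a saddle point in pure strategies exists
    # No pure saddle point: closed-form value of the mixed equilibrium.
    return (a * d - b * c) / (a + d - b - c)


def first_passage_value_iteration(states, target, reward, trans, alpha,
                                  tol=1e-10, max_iter=10_000):
    """Iterate the assumed Shapley-type operator until successive iterates agree.

    reward[s][a][b] : stage payoff in non-target state s
    trans[s][a][b]  : dict mapping next state -> transition probability
    alpha[s]        : state-dependent discount factor, assumed in (0, 1)
    """
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {}
        for s in states:
            if s in target:
                new_v[s] = 0.0  # nothing is collected once the target is reached
                continue
            # Auxiliary matrix game whose value is the next iterate at s.
            m = [[reward[s][a][b]
                  + alpha[s] * sum(p * v[t] for t, p in trans[s][a][b].items())
                  for b in range(2)]
                 for a in range(2)]
            new_v[s] = matrix_game_value_2x2(m)
        diff = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if diff < tol:
            break
    return v


# Hypothetical example: states 0 and 1 are transient, state 2 is the target.
states = [0, 1, 2]
target = {2}
reward = {0: [[1.0, -1.0], [-1.0, 1.0]],
          1: [[2.0, 0.0], [0.0, 1.0]]}
trans = {0: [[{0: 0.3, 1: 0.4, 2: 0.3}, {1: 0.5, 2: 0.5}],
             [{0: 0.6, 2: 0.4}, {0: 0.2, 1: 0.2, 2: 0.6}]],
         1: [[{0: 0.5, 2: 0.5}, {1: 0.7, 2: 0.3}],
             [{0: 0.1, 1: 0.1, 2: 0.8}, {2: 1.0}]]}
alpha = {0: 0.9, 1: 0.7, 2: 1.0}

values = first_passage_value_iteration(states, target, reward, trans, alpha)
print(values)
```

The closed-form 2x2 solver keeps the sketch dependency-free; with more than two actions per player, each auxiliary matrix game would typically be solved as a linear program instead.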
Data last updated: 2023-05-31