Testing whether two independent random samples come from the same distribution is one of the most important hypothesis testing problems in statistics. Classical nonparametric tests, such as the Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling tests, are built on empirical distribution functions. In the one-dimensional two-sample setting these tests have many desirable properties, including robustness and being “distribution-free”, but they are far less popular in high dimensions, mainly because their theoretical properties are lacking there and because of the curse of dimensionality. This project aims to address these problems using projection pursuit, a well-known idea in statistics. The research has three main parts. First, at the population level, we will use projections to construct nonparametric measures that characterize the difference between two population distributions, and study the population properties of these measures. Second, at the sample level, we will build test statistics from these measures and study their asymptotic behavior under the “large sample size, fixed dimension” paradigm. Third, we will study the asymptotic behavior of the proposed test statistics under the “fixed sample size, large dimension” paradigm and consider how to modify the new tests in high-dimensional settings to improve their power.
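As an illustration only, and not as the measure the project defines, one hypothetical projection-based measure averages a one-dimensional Cramér-von Mises-type distance over all directions on the unit sphere. Here $F_\beta$ and $G_\beta$ denote the distribution functions of the projections $\beta^\top X$ and $\beta^\top Y$, $w_\beta$ is an assumed weight measure on the real line, and $\lambda$ is the uniform distribution on the unit sphere $\mathbb{S}^{p-1}$; all of these symbols are introduced here for the sketch.

```latex
D(F, G) = \int_{\mathbb{S}^{p-1}} \int_{\mathbb{R}}
  \bigl( F_\beta(t) - G_\beta(t) \bigr)^2 \, \mathrm{d}w_\beta(t) \, \mathrm{d}\lambda(\beta)
```

By the Cramér-Wold device, two distributions on $\mathbb{R}^p$ coincide exactly when all of their one-dimensional projections coincide, which is why a measure of this kind can, for suitable weights, vanish only when $F = G$.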
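In many hypothesis testing problems, increasing the dimension of the variables often leaves the corresponding test statistic unable to control the type I error well and also reduces the power of the test. To counter this adverse effect of dimension, this project proposed test statistics based on random projections and integral transforms. Multivariate random variables are projected to univariate ones, which sidesteps the dimensionality problem to some extent. The project studied the comparison of two sample distributions for unpaired data. Under suitable weight functions, the test statistics based on random projections and integral transforms have a simple form and are easy to compute. Using the theory of U- and V-statistics, the project established large-sample properties of the proposed test statistics, such as consistency and weak convergence, and showed that their asymptotic convergence rate does not depend on the dimension of the variables. The project further applied these statistics to the high-dimensional two-sample testing problem and derived their asymptotic distribution when the sample size is fixed but the dimension diverges. This work effectively addresses the dimensionality problem in two-sample testing, and the proposed approach is fairly general: it offers a feasible route to other related testing problems such as tests of independence, goodness-of-fit tests, and tests for heteroscedasticity in models. With the support of this young-investigator project, the principal investigator's results were published in domestic and international journals including Journal of the American Statistical Association, Statistica Sinica, Journal of Multivariate Analysis, Computational Statistics & Data Analysis, Journal of Statistical Planning and Inference, and Science China Mathematics.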
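The idea of projecting multivariate samples to one dimension and comparing them there can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's statistic: it averages a Cramér-von Mises-type distance between empirical distribution functions over random unit directions, and it calibrates the test by permutation rather than by the asymptotic distributions the project derives. The function names `random_directions`, `projected_cvm` and `permutation_pvalue` are hypothetical.

```python
import numpy as np


def random_directions(k, dim, rng):
    """Draw k directions uniformly on the unit sphere in R^dim."""
    d = rng.standard_normal((k, dim))
    return d / np.linalg.norm(d, axis=1, keepdims=True)


def projected_cvm(x, y, directions):
    """Average, over directions, of a 1-D Cramér-von Mises-type distance.

    Illustrative statistic only (an assumption of this sketch).
    """
    n, m = len(x), len(y)
    px = x @ directions.T  # projected first sample, shape (n, k)
    py = y @ directions.T  # projected second sample, shape (m, k)
    total = 0.0
    for j in range(directions.shape[0]):
        a = np.sort(px[:, j])
        b = np.sort(py[:, j])
        pooled = np.concatenate([a, b])
        # Empirical distribution functions evaluated at the pooled points.
        fx = np.searchsorted(a, pooled, side="right") / n
        fy = np.searchsorted(b, pooled, side="right") / m
        total += np.mean((fx - fy) ** 2)
    return total / directions.shape[0]


def permutation_pvalue(x, y, n_dirs=50, n_perm=199, seed=0):
    """Permutation p-value for the projected statistic (directions held fixed)."""
    rng = np.random.default_rng(seed)
    dirs = random_directions(n_dirs, x.shape[1], rng)
    observed = projected_cvm(x, y, dirs)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = projected_cvm(pooled[idx[:n]], pooled[idx[n:]], dirs)
        count += int(stat >= observed)
    return observed, (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((40, 5))
    y = rng.standard_normal((40, 5)) + 2.0  # mean-shifted second sample
    print(permutation_pvalue(x, y, n_dirs=20, n_perm=99))
```

Each projection reduces the comparison to a one-dimensional problem, so the per-direction cost is dominated by sorting and does not grow with the dimension beyond the cost of the projection itself.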
Data last updated: 2023-05-31
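Testing equality of two high-dimensional population covariance matrices and its detection boundary
Multi-sample mean testing for high-dimensional data
High-dimensional multi-sample mean testing under unequal covariance matrices
Multiple change-point testing and two-sample testing for Ornstein-Uhlenbeck-type processes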