Exome sequence data are often obtained in small human families. Extracting SNPs from DNA variants allows for linkage analysis, but power is often low because only small numbers of families are generally sequenced. The customary procedure in linkage analysis is to initially assume homogeneity among families (null hypothesis, H0) and search for high lod scores. A subsequent analysis then generally tests for heterogeneity (alternative hypothesis, H1) to see whether potential disease loci occur at different genomic locations in different families. Combining the two steps may yield increased power, but the concept is counterintuitive for complex traits, where multiple susceptibility variants presumably contribute to disease. Here we propose a novel approach that reverses the traditional hypothesis testing scenario: we initially assume that disease variants in different families can occur anywhere in the genome (null hypothesis, H0). Under this null hypothesis, it is unlikely that variants with large lod scores in two families occur at (approximately) the same position; in fact, two such variants have a probability of approximately 5% of occurring on the same chromosome, and a much smaller probability of occurring within, say, 100 bp of each other. This "surprise factor" has previously been expressed in an ad hoc manner. Here we quantify the effect and develop a general hypothesis testing framework and resulting software in which homogeneity is the alternative hypothesis (H1), for which we have developed two test procedures. In other words, we want to test whether potential disease variants occur at approximately the same positions in two families. A significant result of such a test would indicate that (1) the two families share some genetic vulnerabilities and (2) there is significant evidence for the presence of variants linked with disease.
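The "approximately 5%" figure can be checked with a short calculation. The sketch below is illustrative only and is not the project's software: it assumes that each variant falls at a uniformly random position on the 22 autosomes, and it uses rounded chromosome lengths in megabases (approximate GRCh38 values, entered by hand as an assumption). The function names are hypothetical.

```python
# Rounded autosome lengths in megabases, chromosomes 1 to 22.
# Assumption: approximate GRCh38 values; sex chromosomes are ignored.
AUTOSOME_MB = [248, 242, 198, 190, 182, 171, 159, 145, 138, 134, 135,
               133, 114, 107, 102, 90, 83, 80, 59, 64, 47, 51]


def p_same_chromosome(lengths):
    """Probability that two independent, uniformly placed variants
    land on the same chromosome: sum of squared length fractions."""
    total = sum(lengths)
    return sum((length / total) ** 2 for length in lengths)


def p_within_distance(lengths, d):
    """Probability that two such variants lie on the same chromosome
    and at most d units apart. For two uniform points on a segment of
    length L, P(|X - Y| <= d) = (2*d*L - d*d) / L**2 when d <= L."""
    total = sum(lengths)
    return sum(2 * d * length - d * d
               for length in lengths if length >= d) / total ** 2


lengths_bp = [mb * 1_000_000 for mb in AUTOSOME_MB]
p_chrom = p_same_chromosome(lengths_bp)
p_100bp = p_within_distance(lengths_bp, 100)

print(f"P(same chromosome) = {p_chrom:.4f}")
print(f"P(within 100 bp)   = {p_100bp:.2e}")
```

With these rounded lengths the first probability comes out slightly above 5%, and the second is many orders of magnitude smaller. Exome variants are concentrated in coding regions rather than spread uniformly, so these values should be read as rough orders of magnitude under the stated assumption, not as the null distribution used by the proposed tests.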
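Whole-exome sequencing data are usually collected in small pedigrees. Single nucleotide polymorphisms (SNPs) can be used for linkage analysis, but power is low because only a small number of pedigrees are currently sequenced. Traditional linkage analysis assumes homogeneity among pedigrees (null hypothesis) in order to search for regions with high lod scores, with heterogeneity as the alternative hypothesis, that is, whether disease genes lie at different genomic positions in different pedigrees. Traditional linkage analysis based on this assumption may increase power, but for complex traits shaped jointly by multiple susceptibility loci the assumption is counterintuitive. We therefore propose a new method that reverses the traditional hypotheses: the null hypothesis is that disease genes in different pedigrees lie at different genomic positions, and homogeneity is the alternative hypothesis. Under the null hypothesis, genetic variants with high lod scores in two pedigrees are unlikely to fall in the same (approximate) region. In fact, the probability that two such variants fall on the same chromosome is about 5%, and the probability that they fall within 100 bp of each other is far lower. This "surprise factor" is one of the rationales for this project. On this basis, we propose to test whether disease-susceptibility variants in two families occur at the same position. Preliminary studies indicate that our approach is more effective than existing methods.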
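Throughout our research, we used single nucleotide polymorphisms (SNPs) and mutations as the main genetic markers and applied a range of statistical genetics methods to study the genetics of several common diseases. These include argininosuccinic aciduria, childhood attention-deficit/hyperactivity disorder (ADHD), neonatal growth retardation (FGR), post-traumatic stress disorder (PTSD), and substance dependence. We found mutations in different pedigrees that lie at genomic positions very close to one another, which further demonstrates the existence and importance of homogeneity among pedigrees. We also identified multiple genetic markers, and the corresponding genes, associated with these diseases. Our research provides an important reference for exploring the genetic mechanisms of disease.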
Data last updated: 2023-05-31