Many complex systems can be described as networks, in which nodes represent the fundamental entities of a system and edges represent the relationships or interactions between them. The study of networks has a long history and has proved highly successful in explaining the structure and dynamics of complex systems. A prominent problem in network research is community detection, i.e. the detection of groups of nodes, known as communities, that share common properties and/or play similar roles. Previous research on community detection has overwhelmingly focused on homogeneous networks, in which only one type of node is present and all edges are of the same type. Real-world systems, however, often contain more than one type of entity and several types of interaction between them, which makes heterogeneous networks prevalent...The goal of this research project is to provide a principled generalization of community detection to heterogeneous networks. The approach extends techniques previously developed in network science, such as spectral analysis, optimization theory, information theory, and statistical inference. The research covers the following key topics: 1) the definition of a community, based on empirical study and computer simulation; 2) the study of several principled frameworks that unify community detection across various heterogeneous networks; 3) a comparison of the algorithms under the different frameworks, summarizing their advantages, disadvantages, and respective scopes of application; 4) the application of community detection to large-scale websites...The expected results will facilitate the analysis of the structure and function of real-world heterogeneous systems and advance the discovery of latent, valuable knowledge.
The theoretical results are expected to find applications in a broad range of areas, including Web search, analysis of online user behavior, targeted advertising, and personalized services.
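Community detection is the process of partitioning the nodes of a network into groups, based on the network's connection structure, such that each group corresponds to a functional unit. Community detection has attracted considerable attention in recent years, but most studies restrict the problem to homogeneous networks. In practice, heterogeneous networks composed of different types of nodes and edges are widespread and take many forms, and community detection algorithms designed for homogeneous networks cannot be applied to these more complex networks. Building on extensions of the optimization, information-theoretic, spectral-analysis, and statistical-inference theory developed for homogeneous networks, this project introduces four ideas (divide and conquer, holistic planning, equivalent transformation, and function optimization) to build algorithmic frameworks, and systematically studies community detection across a wide variety of heterogeneous networks. The aim is to reveal the relationship between structure and function in heterogeneous networks and to provide effective methods for structural analysis, detection of unknown functions, and knowledge discovery in real-world complex heterogeneous systems. The expected results have broad application prospects in Web information search, analysis of website user behavior, targeted advertising, and personalized services.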
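Community detection is the process of partitioning the nodes of a network into groups, based on the network's connection structure, such that each group corresponds to a functional unit. Community detection has attracted considerable attention in recent years, but most studies restrict the problem to homogeneous networks. In practice, heterogeneous networks composed of different types of nodes and edges are widespread and take many forms. In a social network, for example, different types of edges represent friendship, hostility, business ties, and other relationships. On the photo-sharing website Flickr, a "user" can upload "photos", annotate "photos" with "tags", and add other "users" as friends. The Flickr system can therefore be described as a heterogeneous network containing three types of nodes ("user", "photo", and "tag") and three types of edges, one for each of these three actions. In addition to the topology of nodes and edges, many real-world networks carry attribute information on nodes and edges: in a social network, nodes have attributes such as age, hobbies, nationality, religion, and place of residence, and edges have attributes such as distance and closeness. We call such networks attributed networks. Traditional community detection algorithms for homogeneous networks cannot be applied to the more complex heterogeneous and attributed networks. We studied this problem along four lines (divide and conquer, holistic planning, equivalent transformation, and function optimization) and proposed three algorithmic frameworks. We tested the algorithms extensively, applying them to synthetic networks, small-scale networks widely used by researchers worldwide, a real-world mobile phone communication network provided by the French company Orange, and a large-scale Digg network obtained by manual sampling. These tests revealed relationships between network structure and function, enabled visualization of the systems, and allowed us to summarize the strengths, weaknesses, and scope of application of the different algorithms. The divide-and-conquer algorithm is fast and parallelizes well, and suits heterogeneous networks with relatively few node and edge types. The holistic-planning algorithm is highly accurate, and suits heterogeneous networks with substantial noise and complex correspondences between communities. The function-optimization algorithm generalizes the traditional modularity algorithm: it can be applied to traditional non-attributed networks to uncover hierarchical community structure, and to attributed networks to uncover communities that are independent of the attributes. Our research provides effective methods for structural analysis, detection of unknown functions, and knowledge discovery in real-world complex systems.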
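To make the Flickr example concrete, the following Python sketch shows one possible way to store a heterogeneous network with typed nodes and typed edges, split it into one edge list per edge type, and score a partition with the standard Newman-Girvan modularity for an undirected, unweighted, homogeneous network. It is an illustration under stated assumptions, not the project's algorithms: the node names, the edge-type labels (`uploads`, `tagged_with`, `friend_of`), and the helper functions `split_by_edge_type` and `modularity` are all hypothetical, and the modularity shown is the traditional measure rather than the generalization described above.

```python
from collections import defaultdict

# Hypothetical toy data in the spirit of the Flickr example: node -> node type.
node_type = {
    "alice": "user", "bob": "user",
    "p1": "photo", "p2": "photo",
    "sunset": "tag",
}

# Typed edges as (endpoint, endpoint, edge type). The type labels are assumptions.
edges = [
    ("alice", "p1", "uploads"),
    ("bob", "p2", "uploads"),
    ("p1", "sunset", "tagged_with"),
    ("p2", "sunset", "tagged_with"),
    ("alice", "bob", "friend_of"),
]


def split_by_edge_type(typed_edges):
    """Group typed edges into one untyped edge list per edge type."""
    layers = defaultdict(list)
    for u, v, edge_type in typed_edges:
        layers[edge_type].append((u, v))
    return dict(layers)


def modularity(edge_list, community):
    """Newman-Girvan modularity of a partition of an undirected, unweighted graph.

    edge_list: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edge_list)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # sum of node degrees in the community
    for u, v in edge_list:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


layers = split_by_edge_type(edges)

# Homogeneous test graph: two triangles joined by a single bridge edge.
two_triangles = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(two_triangles, by_triangle)
```

Splitting the edges by type gives one simpler network per relation, which is one plausible reading of a "divide" step; how the per-type results would then be combined is left open here because the text above does not specify it.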
Data last updated: 2023-05-31
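Research on a series of problems in community detection and evolution in heterogeneous complex social networks
Research on community detection methods for multilayer heterogeneous networks based on high-order tensor representation and compressed spectral embedding
Research on semi-supervised community detection methods for complex networks
Research on community detection in multimodal heterogeneous mobile social networks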