基于图论方法的符号网络中重叠聚类算法的研究

基本信息
批准号:11401346
项目类别:青年科学基金项目
资助金额:22.00
负责人:亓兴勤
学科分类:
依托单位:山东大学
批准年份:2014
结题年份:2017
起止时间:2015-01-01 - 2017-12-31
项目状态: 已结题
项目参与者:宋慧敏,曹祝楼,于祥田,吴海燕,陈辉辉
关键词:
社区发现重叠聚类类簇符号网络复杂网络
结项摘要

Network community structure or network cluster structure is one of the most fundamental and important topological properties of complex networks, as well as the other two properties “small world” and “scale-free”. Complex networks can be modeled as graphs with nodes and edges, where nodes indicate individual actors and edges indicate the relationships between actors. Traditional clustering always assumes that all edge weights in the complex network are positive, and the task of clustering is to group the vertices of the graph into clusters taking into consideration the edge structure of the graph in such a way that there should be many edges within each cluster and relatively few between the clusters. But in many real complex networks, there are not only positive relationships but also negative relationships. For example, online news and review websites such as Epinions and Slashdot allow users to approve or denounce others. These kinds of networks can be modeled as signed networks where a positive relationship is denoted by a positive edge weight while a negative relationship is denoted by a negative edge weight. Thus, the clustering problem in signed networks is to find antagonistic clusters such that entities with the same cluster have a positive relationship with each other and entities between different clusters have a negative relationship with each other. While much research on graph clustering has been conducted on unsigned graphs which contain only positive edges, it is a very challenging project to design new clustering algorithms for signed networks since the goal of clustering changes. . In this project, we will design a novel algorithm for the clustering problem of signed networks based on graph theory. At first, we will construct an edge prediction model to predict the values between any pair of nodes. Then based on the valued complete network, we will give a new criterion named “density” to measure the tightness (or friendship) of a sub-graph. Graph clustering is a process that detects all dense sub-graphs in this complete graph, then contracts them, and repeats this process until there is only one vertex left or only negative edges left. We will construct a hierarchically nested system to illustrate their inclusion relation. Furthermore, we will refine the clustering result based on the number of most central nodes in each clusters. This project will solve "how to find the number of the clusters automatically" and "how to model overlapping clustering" these two important research problems in clustering of signed networks. This project is an extension for traditional complex network clustering, and can be used to social network analysis (e.g. terrorist organization detection), biological network analysis (e.g. protein function prediction) and other WEB text clustering.

类簇结构或社区结构是复杂网络最普遍和最重要的拓扑结构属性之一。目前已有的聚类算法大多是基于非符号复杂网络(边上权重为正),对符号网络(signed network)(边上权重扩展为正、负)的研究少之又少。本项目利用图论技术,基于申请人在非符号复杂网络已有聚类算法的基础上,通过寻找稠密子图和收缩子图的方法,对符号网络的类簇结构进行研究,设计一个高效精确的层次聚类算法,并结合申请人在复杂网络核心节点寻找问题上已有结果,利用每个子类簇内核心节点的个数,对初始聚类结果进行二次迭代,提高算法精确性。该项目的研究,将解决目前符号网络聚类算法中存在的“无法实现一个对象可属于多个类簇的重叠聚类”和“类簇个数需要用户事先给定”这两个主要问题。本项目的研究结果可以广泛应用于社会网络分析(例如恐怖组织识别等),生物网络分析(例如蛋白质交互网络分析和未知蛋白质功能预测等)以及Web社区挖掘和文档聚类等众多领域。

项目摘要

类簇结构或社区结构是复杂网络最普遍和最重要的拓扑结构属性之一。目前已有的聚类算法大多是针对传统的非符号复杂网络(边上权重为正),对符号网络(signed network)(边上权重扩展为正、负)的研究少之又少。本项目利用图论技术,对符号网络的类簇结构进行了研究,设计出两个高效算法,分别为 SQCM 算法和 Eb&D 算法,这两个算法均避免了目前算法中普遍要求的 “社团个数需要事先给定” 的弊端;并且SQCM 算法能够实现“一个对象可属于多个社团的重叠聚类”。这两个算法在现实例子和模拟数据上均取得了很好的效果。 本项目同时考虑了复杂网络中重要节点的寻找问题,该问题的研究在病毒营销、疾病控制、舆情控制等方面有重要的应用价值。具体的,针对无向图,我们设计出了两个顶点重要性衡量排序指标,分别为“Spanning tree Centrality”和“DPRank centrality”。针对有向图,设计了Laplace指标。这些指标的计算复杂性较低,不同于目前存在的算法,这些新颖的指标是从网络结构的不同视角来评价顶点的重要性。我们用大量的数据验证了这些指标的有效性及优越性。 本项目的研究极大丰富了复杂网络相关领域的研究结果,并为下一步研究复杂网络中的影响最大化问题提供了新的思路。项目执行期间共发表相关论文5篇,项目组成员有两人获得博士学位,两人获得硕士学位,陆续新增四名硕士研究生加入该项目。

项目成果
{{index+1}}

{{i.achievement_title}}

{{i.achievement_title}}

DOI:{{i.doi}}
发表时间:{{i.publish_year}}

暂无此项成果

数据更新时间:2023-05-31

其他相关文献

1

EBPR工艺运行效果的主要影响因素及研究现状

EBPR工艺运行效果的主要影响因素及研究现状

DOI:10.16796/j.cnki.1000-3770.2022.03.003
发表时间:2022
2

基于铁路客流分配的旅客列车开行方案调整方法

基于铁路客流分配的旅客列车开行方案调整方法

DOI:
发表时间:2021
3

复杂系统科学研究进展

复杂系统科学研究进展

DOI:10.12202/j.0476-0301.2022178
发表时间:2022
4

基于文献计量学和社会网络分析的国内高血压病中医学术团队研究

基于文献计量学和社会网络分析的国内高血压病中医学术团队研究

DOI:10.11842/wst.20190724002
发表时间:2020
5

含饱和非线性的主动悬架系统自适应控制

含饱和非线性的主动悬架系统自适应控制

DOI:10.3969/j.issn.1674-0696.2020.10.20
发表时间:2020

亓兴勤的其他基金

相似国自然基金

1

基于图论模型的文本重叠聚类研究

批准号:61202312
批准年份:2012
负责人:吴秦
学科分类:F0605
资助金额:23.00
项目类别:青年科学基金项目
2

符号数据的聚类有效性分析与优化算法研究

批准号:61305073
批准年份:2013
负责人:白亮
学科分类:F0603
资助金额:26.00
项目类别:青年科学基金项目
3

网络设计中的图论方法

批准号:11531011
批准年份:2015
负责人:孟吉翔
学科分类:A0408
资助金额:230.00
项目类别:重点项目
4

基于几何覆盖方法的半监督聚类算法研究

批准号:61302157
批准年份:2013
负责人:顾磊
学科分类:F0113
资助金额:25.00
项目类别:青年科学基金项目