Research on Subspace Clustering Algorithms for Social Media Data

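Basic Information

Grant No.: 61403247

Project type: Young Scientists Fund

Funding amount: 25.00

Principal investigator: 朱林

Discipline classification:

Host institution: Shanghai University of Electric Power (上海电力大学)

Approval year: 2014

Completion year: 2017

Project period: 2015-01-01 - 2017-12-31

Project status: Completed

Project participants: 李红娇, 杜海舟, 杨吟冬, 李博, 卢露, 吴仕兵, 王珊

Keywords:

multi-view data; subspace clustering; social media mining; evolutionary data; link-constrained data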

Final Report Abstract

Social media has gained popularity in recent years and is increasingly becoming an integral part of daily life. The size and variety of social media data grow rapidly as more people use platforms such as Facebook, Twitter, and LinkedIn. This data is a massive “treasure trove” of interest to researchers and practitioners across disciplines, and a rich source for data mining. However, although both are large-scale, social media data differs from the attribute-value data of classic data mining. In social media, events concerning groups can be defined by comparing communities across time; these events include growth, contraction, merging, splitting, birth, and death. Social media data points are also inherently linked rather than independent and identically distributed (i.i.d.). Furthermore, social media data is noisy and incomplete, comes from multiple sources, and is embedded in multi-mode and multi-dimensional networks. These unique properties present unprecedented challenges for mining social media data.

In real-world applications, high-dimensional data is ubiquitous, from text categorization to image processing to Web search. Subspace clustering has therefore been studied extensively in recent years. Its goal is to locate clusters, each with its own associated dimensions, that are embedded in different subspaces of the original data space. Existing subspace clustering algorithms that have proven effective for data mining are not equipped for social media mining. In this research, we propose a new kind of subspace clustering to facilitate the computational understanding of social media, investigating the associated fundamental research issues and developing new, effective algorithms.

As networks are highly dynamic, we propose to develop new algorithms that can cluster high-dimensional evolutionary social media data from the subspace clustering perspective. We also define the problem of subspace clustering with linked data and present a preliminary study to demonstrate how link information can be integrated into subspace clustering for social media data. A prominent characteristic of social media is that its data comes from multiple sources. As the data from each source can be noisy, partial, or redundant, selecting relevant sources and using them together can support effective subspace clustering. We define types of sources and propose to study subspace clustering using source information.

The project lies at the confluence of data mining and social computing. The preliminary work aims to develop novel methods and expand research capabilities in clustering analysis and social media mining; it can also contribute to improving machine learning and information retrieval and to expediting the development of a new generation of social media mining tools.

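With the spread and popularity of the Internet, a large number of user-participatory Web applications and social information networks have emerged, including blogs, forums, media-sharing platforms, microblogs, social networks, social news, social bookmarking, and Wikipedia, collectively referred to as social media. Because social media plays an increasingly important role in politics, the economy, and everyday life, research on data mining and machine learning algorithms for social media has become a hot topic in the field. Against the background of solving social media mining problems, this project studies subspace clustering methods for high-dimensional social media data. The research covers: 1) proposing subspace clustering algorithms for community-evolution data based on data-integration and model-integration strategies; 2) modeling the link constraints among social media data and proposing subspace clustering algorithms for link-constrained data; 3) exploiting the relationships among features of different views of social media data to propose subspace clustering algorithms for multi-view data; 4) collecting and organizing social media data and extending the proposed algorithms to applications in social media mining. The project has a solid research foundation, a clear approach, and a well-defined application background; its results are expected to offer significant academic value to fields such as data mining and social computing.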

Project Abstract

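During execution of the National Natural Science Foundation of China grant (No. 61403247) from January 2015 to December 2017, research on subspace clustering algorithms for social media data was carried out according to the schedule in the project application and project plan, and was extended to related directions on that basis. The main work was as follows. First, in view of the high-dimensional, evolving, link-constrained, and multi-view characteristics of social media data, the state of research on soft subspace clustering algorithms in China and abroad was surveyed. Second, stream-data clustering techniques for community-evolution data, semi-supervised learning techniques for link-constrained data, and techniques for improving prediction accuracy on social media data were each investigated. The project produced a body of research results in related fields, and these results are of value to subspace clustering theory and its application to social media mining.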

Project Outputs

No outputs recorded

Data last updated: 2023-05-31

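Other grants by 朱林

Grant No.: 81271377

Approval year: 2012

Funding amount: 70.00

Project type: General Program

Grant No.: 51407079

Approval year: 2014

Funding amount: 24.00

Project type: Young Scientists Fund

Grant No.: 51107048

Approval year: 2011

Funding amount: 24.00

Project type: Young Scientists Fund

Grant No.: 30971870

Approval year: 2009

Funding amount: 26.00

Project type: General Program

Grant No.: 31860135

Approval year: 2018

Funding amount: 39.00

Project type: Fund for Less Developed Regions

Grant No.: 21705137

Approval year: 2017

Funding amount: 22.00

Project type: Young Scientists Fund

Grant No.: 21602239

Approval year: 2016

Funding amount: 20.00

Project type: Young Scientists Fund

Grant No.: 11901390

Approval year: 2019

Funding amount: 25.00

Project type: Young Scientists Fund

Grant No.: 51575003

Approval year: 2015

Funding amount: 62.00

Project type: General Program

Grant No.: 61064001

Approval year: 2010

Funding amount: 25.00

Project type: Fund for Less Developed Regions

Grant No.: 31160478

Approval year: 2011

Funding amount: 52.00

Project type: Fund for Less Developed Regions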

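Similar NSFC Projects

Fast ensemble clustering for heterogeneous big data from social media

Grant No.: 71471009

Approval year: 2014

Principal investigator: 李红

Discipline classification: G0112

Funding amount: 60.00

Project type: General Program

Ridge-regression subspace clustering algorithms for large-scale two-dimensional data

Grant No.: 61806106

Approval year: 2018

Principal investigator: 彭冲

Discipline classification: F0603

Funding amount: 22.00

Project type: Young Scientists Fund

Subspace clustering algorithms based on sparse and low-rank representation

Grant No.: 61502175

Approval year: 2015

Principal investigator: 刘小兰

Discipline classification: F0605

Funding amount: 20.00

Project type: Young Scientists Fund

Subspace clustering methods for complex multi-view high-dimensional data

Grant No.: 61602081

Approval year: 2016

Principal investigator: 于红

Discipline classification: F06

Funding amount: 21.00

Project type: Young Scientists Fund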

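Other grants by 朱林

Effects of zinc homeostasis imbalance on nuclear factor-κB and secondary neural injury after traumatic brain injury

Mechanism of dynamic reactive power source-load mismatch in AC/DC systems and coordinated control of transient voltage stability

Architecture and key technologies of digital substations based on the process bus

Expression and regulatory mechanisms of key enzyme genes in selenium nutrient metabolism of tea plants

Water-use characteristics of shrubs in the Mu Us Sandy Land and vegetation-soil moisture feedback mechanisms

Identifying new key nodes of anti-inflammatory regulatory networks using mass spectrometry imaging and quantitative proteomics

Mechanistic study of fluorine atom transfer radical reactions

Gorenstein homological algebra and the ladders it induces

Regulation mechanism and new implementation methods for long cutting life of pendulum-type plough bodies under coupled high-speed soil rheology and chiseling behavior

Data-driven parallel control methods for the NdFeB hydrogen decrepitation process

Water-use efficiency of alfalfa based on carbon isotope discrimination and related indicators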
