Sentiment analysis of Weibo is an active research topic with important implications for national security and social stability, because Weibo lets individual users set up real-time information-sharing communities. Most classical sentiment analysis methods, which rely on the polarity of sentiment words, achieve lower precision than researchers expect. One important reason is that many Weibo users express their opinions through irony. To address this problem, this project studies the recognition of Chinese irony in Weibo data. Building on the three-way decision-theoretic rough set model combined with semi-supervised learning and ensemble learning, the project covers the following. First, data on different topics will be collected from Sina Weibo, and a hierarchical multimodal feature space comprising user behavior features, content features, and user state features will be constructed by analyzing discussion topics and user information. Second, a Chinese irony recognition system for Weibo will be built. Third, cost-sensitive learning methods based on three-way decisions will be studied. The project has two points of innovation. First, it is the first to study automatic Chinese irony recognition in Weibo. Second, the proposed method is a feasible and reasonable choice, because the three-way decision-theoretic rough set model has advantages over other methods in feature selection, reducing misclassification error, and handling imbalanced data. The recognition system can help improve the precision of sentiment analysis and provide strong decision support for early warning of online public opinion.
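The use of irony in short online texts lowers the classification precision of traditional sentiment analysis methods, and because Chinese and English differ in vocabulary and semantic expression, existing English irony recognition techniques cannot be applied directly to Chinese data. To address this problem, this project studies Chinese irony recognition for Weibo. It proposes to combine three-way decision-theoretic rough sets with semi-supervised learning and ensemble learning to study the following: (1) a multimodal, hierarchical feature system for irony recognition on Weibo data; (2) a hierarchical classification model for Weibo irony recognition based on three-way decisions; (3) cost-sensitive learning methods based on three-way decisions. The project is novel in studying Chinese irony recognition in Weibo systematically from a computational perspective, and it builds on three advantages of three-way decision-theoretic rough sets: multi-granularity feature selection, lower misclassification rates, and the ability to handle imbalanced data directly. The aim is to improve the classification precision of sentiment analysis for Weibo and to provide strong decision support for scientifically grounded social early warning. The work has scientific significance for advancing computable approaches to context analysis in natural language understanding, and practical significance for national security and social stability.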
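For orientation, the thresholds of a three-way decision can be derived from a loss table in the standard decision-theoretic rough set formulation. The abstract does not give the project's own loss values or notation, so the symbols below are assumptions: $\lambda_{PP}, \lambda_{BP}, \lambda_{NP}$ are the losses of accepting, deferring, and rejecting a post that is ironic; $\lambda_{PN}, \lambda_{BN}, \lambda_{NN}$ are the corresponding losses for a post that is not ironic; and $p$ is the estimated probability that the post is ironic.

```latex
% Expected loss of each action (accept P, defer B, reject N) for a post x
\begin{align}
R(a_P \mid x) &= \lambda_{PP}\, p + \lambda_{PN}\,(1 - p), \\
R(a_B \mid x) &= \lambda_{BP}\, p + \lambda_{BN}\,(1 - p), \\
R(a_N \mid x) &= \lambda_{NP}\, p + \lambda_{NN}\,(1 - p).
\end{align}

% Assuming \lambda_{PP} \le \lambda_{BP} < \lambda_{NP} and
% \lambda_{NN} \le \lambda_{BN} < \lambda_{PN}, minimizing the expected
% loss yields the two thresholds
\begin{align}
\alpha &= \frac{\lambda_{PN} - \lambda_{BN}}
               {(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \\
\beta  &= \frac{\lambda_{BN} - \lambda_{NN}}
               {(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.
\end{align}

% Decision rule when \beta < \alpha:
%   accept (ironic)       if p \ge \alpha,
%   reject (not ironic)   if p \le \beta,
%   defer  (boundary)     if \beta < p < \alpha.
```

As a worked instance with made-up losses $\lambda_{PP} = \lambda_{NN} = 0$, $\lambda_{BP} = \lambda_{BN} = 1$, and $\lambda_{NP} = \lambda_{PN} = 4$, the thresholds are $\alpha = 3/4$ and $\beta = 1/4$. Raising the cost of a missed ironic post ($\lambda_{NP}$) lowers $\beta$, which is one way such a scheme can be made sensitive to a rare class.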
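Given Weibo's real-time nature and rapid propagation, sentiment analysis and early-warning techniques for Weibo are becoming a research hotspot, with important value and practical significance for national and social security. This project uses Weibo data and focuses on irony recognition, a special kind of sentiment analysis task. The main research content includes: (1) building a multimodal, hierarchical feature system for irony recognition on Weibo data; (2) designing a hierarchical classification model for irony recognition that suits the imbalanced nature of Weibo data; (3) studying cost-sensitive learning with three-way decisions. The main results are: (1) for the irony recognition feature system, the project proposed a range of features, including basic lexical sentiment, punctuation, homophonic words, post length, verb passivization, and textual sentiment ambiguity, and, taking feature distributions into account, built a two-stage hierarchical classification feature system; (2) for feature selection, combining the strengths of three-way decision theory and Bayesian networks, it proposed a three-way decision Bayesian network model and designed a cost-minimizing attribute reduction method under that model; (3) for the class imbalance in Weibo irony recognition, it proposed a two-stage ensemble classification method for irony recognition based on multiple heterogeneous classifiers. Experimental results show that the proposed two-stage hierarchical feature system and two-stage ensemble classification method effectively improve irony recognition accuracy.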
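The summary does not spell out how the two stages interact, so the following is only a minimal sketch of one plausible reading: a first-stage score settles the clear cases through a three-way decision, and posts that fall in the boundary region are passed to a second-stage ensemble. The function names, the thresholds 0.75 and 0.25, and the toy scorers are all illustrative assumptions, not the project's actual classifiers or features.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str], float]  # maps a post to an estimated P(ironic)


def three_way_decision(p: float, alpha: float, beta: float) -> str:
    """Return 'accept', 'reject', or 'defer' for probability p."""
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p >= alpha:
        return "accept"
    if p <= beta:
        return "reject"
    return "defer"


def average_ensemble(voters: Sequence[Scorer]) -> Scorer:
    """Combine several scorers by averaging their outputs (an assumed rule)."""
    def score(text: str) -> float:
        return sum(v(text) for v in voters) / len(voters)
    return score


def two_stage_classify(
    texts: Sequence[str],
    stage1: Scorer,
    stage2: Scorer,
    alpha: float = 0.75,  # assumed acceptance threshold
    beta: float = 0.25,   # assumed rejection threshold
) -> List[Tuple[bool, int]]:
    """Return (is_ironic, deciding_stage) for each post."""
    results = []
    for text in texts:
        decision = three_way_decision(stage1(text), alpha, beta)
        if decision == "defer":
            # Boundary region: let the second-stage ensemble decide.
            results.append((stage2(text) >= 0.5, 2))
        else:
            results.append((decision == "accept", 1))
    return results


def toy_stage1(text: str) -> float:
    """Hypothetical stand-in: score grows with the number of '!' and '?' marks."""
    marks = sum(text.count(c) for c in "!?！？")
    return min(1.0, marks / 4)


if __name__ == "__main__":
    # Hypothetical second stage: two toy voters averaged together.
    stage2 = average_ensemble([lambda t: 0.8, lambda t: 0.6])
    for post, result in zip(["a", "b!!", "c!!!!"],
                            two_stage_classify(["a", "b!!", "c!!!!"], toy_stage1, stage2)):
        print(post, result)
```

Deferring instead of forcing an early label keeps the cheap first stage from committing on ambiguous posts, which is where a minority class such as irony is most likely to be misclassified.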
Data last updated: 2023-05-31
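Geographic identification and classification of multidimensional deprivation in the living environment: a case study of the main urban area of Zhengzhou
Named entity recognition based on fine-grained word representation
Application of collaborative-representation-based graph embedding discriminant analysis to face recognition
A new principle of parameter-identification pilot protection for cable lines with a mid-line shunt reactor
A new inductive method for Weibo rumor detection based on graph convolutional networks
Research on Chinese Weibo information mining methods based on semantic analysis
Research on matroid extension theory based on three-way decisions
Identification of hyped topics on Weibo and analysis of the spreading population
Theory and methods of dynamic three-way decisions based on granular computing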