Research on Automatic Abstraction Based on Statistics and Semantic Analysis for Chinese and English Texts

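Basic Information

Grant number: 69972025

Project category: General Program

Funding amount: 13.00

Principal investigator: 罗振声

Discipline classification:

Host institution: Tsinghua University

Approval year: 1999

Completion year: 2002

Project period: 2000-01-01 - 2002-12-31

Project status: Completed

Project participants: 程慕胜, 黄国营, 彭迎喜, 杨志强, 刘颍, 宋晖, 万敏, 郭玉箐, 肖奔放

Keywords: text structure analysis; topic sentences; statistical methods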

Final Report Abstract

After five years of work, the project "Research on Automatic Abstraction Based on Statistics and Semantic Analysis for Chinese and English Texts" has been completed. The project has two prominent characteristics, one in technique and one in method. First, the system introduces semantic concept hierarchies on top of traditional word-frequency statistics. It uses an extended dictionary of synonyms for Chinese, and WordNet together with the related theory of concept hierarchies for English. This yields a better Vector Space Model (VSM) and more precise statistical information. To analyze and identify multi-topic texts, the system examines the distribution of several kinds of title words and keywords, a first successful step toward resolving the unbalanced topic coverage of abstracts. Second, to make the abstract more readable, several readability-oriented processing steps are applied to the raw abstract. These mainly include sentence-pattern analysis for Chinese, link grammar analysis for English, removal of redundant repetition among extracted sentences, rearrangement and transformation of sentence patterns, and template matching to handle dangling conjunctions. Building on this work, we completed a more general and more capable automatic abstracting system for Chinese and English texts.
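The idea of counting concepts rather than surface words in a Vector Space Model can be illustrated with a minimal sketch. It is not the project's implementation: the `CONCEPTS` table is a hypothetical toy stand-in for the synonym dictionary and WordNet resources mentioned above, and the function names, whitespace tokenization, and cosine-based sentence ranking are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy synonym table: surface word -> concept label.
# It stands in for the synonym dictionary / WordNet resources named in the abstract.
CONCEPTS = {
    "car": "VEHICLE",
    "automobile": "VEHICLE",
    "truck": "VEHICLE",
    "fast": "SPEED",
    "quick": "SPEED",
}

def concept_vector(tokens):
    """Count concepts instead of surface words; unknown words map to themselves."""
    return Counter(CONCEPTS.get(t, t) for t in tokens)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

def rank_sentences(sentences):
    """Return sentence indices ordered by similarity to the whole-document vector."""
    # Assumption: whitespace tokenization, which suits English but not Chinese.
    tokenized = [s.lower().split() for s in sentences]
    doc_vec = concept_vector([t for toks in tokenized for t in toks])
    scores = [cosine(concept_vector(toks), doc_vec) for toks in tokenized]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

With this mapping, "car" and "automobile" contribute to the same dimension, so sentences that express the same topic with different words reinforce each other. A plain word-frequency model would treat them as unrelated.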

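Project Abstract

With the rapid development of science and technology, humanity now lives in a vast ocean of information. Obtaining the most useful information quickly and effectively is vital to today's economic and technological development. This project draws fully on the results and experience of the research group's large-scale corpus system and its system for automatic analysis and distribution statistics of Chinese sentence patterns. Focusing mainly on Chinese, it combines statistical information with semantic analysis to build a high-quality, broad-coverage automatic abstracting system for Chinese and English. It will have broad application prospects and substantial social and economic benefits.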

Project Outcomes

No outcomes of this type are available.

Data last updated: 2023-05-31

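Other Grants Held by 罗振声

Sentence Pattern Analysis and Sentence Pattern Statistics for Chinese

Grant number: 69373044

Approval year: 1993

Funding amount: 6.00

Project category: General Program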

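Similar NSFC Grants

Research on Query Expansion Based on Statistical Machine Translation and Automatic Abstracting

Grant number: 61363045

Approval year: 2013

Principal investigator: 李卫疆

Discipline classification: F0211

Funding amount: 43.00

Project category: Regional Science Fund Project

Research on Automatic Subject Indexing Based on Semantic Analysis and Statistics

Grant number: 60872133

Approval year: 2008

Principal investigator: 吕学强

Discipline classification: F0113

Funding amount: 30.00

Project category: General Program

Multi-Document Automatic Abstracting Techniques Based on Information Reorganization

Grant number: 60803092

Approval year: 2008

Principal investigator: 徐永东

Discipline classification: F0211

Funding amount: 20.00

Project category: Young Scientists Fund

Multi-Document Automatic Abstracting Techniques Based on a Logical Framework

Grant number: 60373100

Approval year: 2003

Principal investigator: 王晓龙

Discipline classification: F0211

Funding amount: 8.00

Project category: General Program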
