Applied Technology and Research

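Research on Text Abstract Generation Based on T5 PEGASUS and DeepKE

  • 1. Henan Vocational College of Water Conservancy and Environment 2. North China University of Water Resources and Electric Power

Online publication date: 2024-11-01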

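Abstract

To reduce fabricated information, repetition, and related problems in summaries generated by the T5 PEGASUS model, this paper proposes T5 PEGASUS-DK, a text summarization model based on T5 PEGASUS and DeepKE. The model integrates the T5 PEGASUS model with the DeepKE framework in three steps: it first applies the Pkuseg word segmentation method to improve segmentation quality, then uses the DeepKE framework to extract triples from the text, and finally concatenates the word vectors of the triples with the representation vectors of the text. By establishing a mapping between the text and its triples, the model can capture factual knowledge and therefore select information that is more consistent with the source text for the summary. Experimental results show that T5 PEGASUS-DK achieves the highest score on every ROUGE metric, and that its summaries are more factual, more coherent, and more consistent with the source text.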
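The final fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say along which axis the vectors are concatenated, so this example assumes the triple vectors are appended along the sequence axis. The names `toy_embed`, `flatten_triples`, and `fuse` are hypothetical, and the toy embedding stands in for the learned embeddings of the real model; no Pkuseg or DeepKE API is called, and the tokens and triples are supplied by hand.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)
DIM = 4  # toy embedding width, chosen only for illustration


def toy_embed(token: str, dim: int = DIM) -> List[float]:
    """Hypothetical stand-in for a learned embedding lookup.

    Builds a deterministic vector from the token's code points so the
    example runs without any model weights.
    """
    seed = sum(ord(ch) for ch in token)
    return [((seed * (i + 1)) % 97) / 97.0 for i in range(dim)]


def flatten_triples(triples: List[Triple]) -> List[str]:
    """Turn a list of triples into one flat token sequence."""
    return [part for triple in triples for part in triple]


def fuse(text_tokens: List[str], triples: List[Triple]) -> List[List[float]]:
    """Concatenate text vectors and triple vectors along the sequence axis.

    Assumption: the paper's concatenation is modeled here as appending
    the triple vectors after the text vectors.
    """
    text_vectors = [toy_embed(tok) for tok in text_tokens]
    triple_vectors = [toy_embed(tok) for tok in flatten_triples(triples)]
    return text_vectors + triple_vectors


# Tokens as a word segmenter might produce them, and one triple as a
# knowledge extractor might produce it (both written by hand here).
tokens = ["黄河", "流经", "河南"]
triples = [("黄河", "流经", "河南")]
fused = fuse(tokens, triples)
print(len(fused), len(fused[0]))
```

In this sketch the fused sequence holds one vector per text token followed by three vectors per triple, so an encoder reading it would see the extracted facts alongside the original text.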

Cite this article

ZHANG Qi, WANG Ling, SHEN Jie. Research on Text Abstract Generation Based on T5 PEGASUS and DeepKE [J]. Computer & Telecommunication (电脑与电信), 2024, 1(6): 62-67. DOI: 10.15966/j.cnki.dnydx.2024.06.016
