Author: Xiachong Feng, SCIR, Harbin Institute of Technology

Summarization is one of the classic natural language processing tasks [1] and has been continuously explored and advanced by researchers for many years. The task aims to convert input data into a short overview that retains the key information. In earlier years, research in this direction centered on datasets such as DUC, CNN/DailyMail (CNNDM), and Gigaword [2], and achieved notable progress. More recently, to meet diverse needs, directions such as cross-lingual summarization [3], multimodal summarization [4], unsupervised summarization [5], factual consistency of summaries [6], dialogue summarization [7], scientific document summarization [8], pretraining-based summarization [9], and analysis of the summarization task itself [10] have flourished, and the number of papers keeps growing: beyond the summarization papers at major conferences (e.g., ACL and EMNLP), a large number of summarization papers also appear on arXiv.
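
To make the task definition concrete, below is a minimal sketch of abstractive summarization with an off-the-shelf pretrained model. It assumes the Hugging Face transformers library (with a PyTorch backend) is installed and uses the public facebook/bart-large-cnn checkpoint, which was fine-tuned on CNN/DailyMail; the model choice is purely for illustration and is not tied to the reading list itself.

```python
# Minimal sketch: abstractive summarization with a pretrained model.
# Assumes `pip install transformers torch`; facebook/bart-large-cnn is a
# publicly available CNN/DailyMail-finetuned checkpoint used only as an example.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "Summarization converts input data into a short overview that retains "
    "the key information. Early work centered on datasets such as DUC, "
    "CNN/DailyMail, and Gigaword, while recent work explores cross-lingual, "
    "multimodal, unsupervised, and dialogue summarization, among others."
)

# max_length / min_length bound the generated summary length in tokens.
result = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```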

Inspired by projects such as yizhen20133868/NLP-Conferences-Code [11], teacherpeterpan/Question-Generation-Paper-List [12], thunlp/PLMpapers [13], thu-coai/PaperForONLG [14], and NiuTrans/ABigSurvey [15], Xiachong Feng, a PhD student in the text generation group of SCIR, has collected and organized a summarization paper reading list that consolidates existing summarization research and tracks the latest papers. Each entry in the list includes the paper title, authors, a PDF link, the publication venue, and whether an implementation is available, helping researchers quickly gather the core materials in this direction. The list will be maintained and updated over the long term.




Figure 1: Summarization paper reading list

Beyond the paper information, the repository also includes the text generation group's paper notes and presentation slides on summarization, which can help beginners quickly get familiar with and get started on this task.




Figure 2: Summarization paper notes and presentation slides

Project URL:

https://github.com/xcfcode/Summarization-Papers

References

[1] Paice C D. Constructing literature abstracts by computer: Techniques and prospects[J]. Information Processing & Management, 1990, 26(1): 171-186.

[2] Gambhir M, Gupta V. Recent automatic text summarization techniques: a survey[J]. Artificial Intelligence Review, 2017, 47(1): 1-66.

[3] Cao Y, Liu H, Wan X. Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 6220-6231.

[4] Li M, Chen X, Gao S, et al. VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles[J]. arXiv preprint arXiv:2010.05406, 2020.

[5] Kohita R, Wachi A, Zhao Y, et al. Q-learning with Language Model for Edit-based Unsupervised Summarization[J]. arXiv preprint arXiv:2010.04379, 2020.

[6] Dong Y, Wang S, Gan Z, et al. Multi-Fact Correction in Abstractive Text Summarization[J]. arXiv preprint arXiv:2010.02443, 2020.

[7] Feng X, Feng X, Qin B, et al. Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks[J]. arXiv preprint arXiv:2010.10044, 2020.

[8] Subramanian S, Li R, Pilault J, et al. On Extractive and Abstractive Neural Document Summarization with Transformer Language Models[J]. arXiv preprint arXiv:1909.03186, 2019.

[9] Bi B, Li C, Wu C, et al. PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation[J]. arXiv preprint arXiv:2004.07159, 2020.

[10] Bhandari M, Gour P, Ashfaq A, et al. Re-evaluating Evaluation in Text Summarization[J]. arXiv preprint arXiv:2010.07100, 2020.

[11] https://github.com/yizhen20133868/NLP-Conferences-Code

[12] https://github.com/teacherpeterpan/Question-Generation-Paper-List

[13] https://github.com/thunlp/PLMpapers

[14] https://github.com/thu-coai/PaperForONLG


[15] https://github.com/NiuTrans/ABigSurvey