21 summarization-related papers from NAACL 2022. Not all papers on summarization are covered because of time constraints.
I only skimmed many of the papers, so there may be hallucinations or missing information. Don’t trust my summaries; use this list just as a starting point. Feel free to contact me about mistakes, additions, etc.
Summary
- There are many works on evaluating and improving the factual consistency of generated summaries.
- Because of the theme of NAACL 2022, some papers focus on human-AI interaction.
A personal favorite
- What Makes a Good and Useful Summary? Incorporating Users in Automatic Summarization Research
- Problem: We don’t quite know whether the current direction of summarization research actually helps users.
- Approach: Designed a survey, collected answers from students, and proposed some understudied aspects that they require.
List
- FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
- Problem: Summarization models can generate factually incorrect information.
- Approach: Propose to extend the pre-training objective of PEGASUS with FactCC during the pseudo-summary selection process.
- An Exploration of Post-Editing Effectiveness in Text Summarization
- Problem: Human-machine hybrid approaches to summarization are understudied.
- Approach: Through experiments with 72 people on two summarization datasets, they show that post-editing helps when a person doesn’t know the domain, but not much otherwise.
- TSTR: Too Short to Represent, Summarize with Details! Intro-Guided Extended Summary Generation
- Problem: Abstract-length summaries are not informative enough for long documents such as scholarly papers.
- Approach: Propose a model that uses the introduction text as pointers into the main text to select salient information.
- Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
- Problem: Using NLI models to compare an input document and a hypothesis is not straightforward.
- Approach: Propose a pipeline that generates document-hypothesis pairs to train NLI models for factual inconsistency detection.
- Exploring Neural Models for Query-Focused Summarization
- Problem: Contributions to query-focused summarization are increasing, but there is no comprehensive study of the models.
- Approach: A systematic study of two approaches, two-stage extractive-abstractive and end-to-end, as well as the effectiveness of transfer learning and two extensions.
- Reference-free Summarization Evaluation via Semantic Correlation and Compression Ratio
- Problem: The Shannon score evaluates a summary by comparing how much information is needed to generate the source document with and without the summary as a prompt, but it ignores saliency and token position.
- Approach: Propose to extend the Shannon score by computing the correlation between the distributions with and without the summary prompt, so that saliency and token position are taken into account.
- Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries
- Problem: There is no standardized human evaluation method for the factual consistency of summarization systems.
- Approach: Performed crowdsourced human evaluation of the factual consistency of various models to identify the best setup.
- Proposition-Level Clustering for Multi-Document Summarization
- Problem: Sentence-level clustering for multi-document summarization can be noisy and redundant.
- Approach: Propose to cluster at the proposition level to obtain more precise salient pieces of text.
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness
- Problem: The extract-then-abstract approach to multi-document summarization has some problems, such as the quality of the pseudo-oracles used for the extraction model.
- Approach: Propose to give the model an additional signal during training by weighting the loss according to the importance of each sentence with respect to the reference.
- Does Summary Evaluation Survive Translation to Other Languages?
- Problem: Extending summarization datasets to other languages with manual annotation is expensive.
- Approach: After machine-translating the English SummEval dataset into seven languages, they evaluate whether the English human scores can still be used to assess quality.
- SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
- Problem: Reference-free evaluation models are often based on non-summarization datasets such as QA, which introduces noise and bias.
- Approach: Propose to generate training samples for reference-free summary evaluation models by mutating reference summaries.
- Masked Summarization to Generate Factually Inconsistent Summaries for Improved Factual Consistency Checking
- Problem: Obtaining summaries that are factually incorrect yet relevant to the source text is challenging.
- Approach: Propose to generate factually incorrect summaries by masking key information in the reference summaries.
- Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control
- Problem: Applying pre-trained language models to abstractive summarization is known to lead to unfaithful summaries.
- Approach: Propose to prepend a new special token, computed from the named-entity overlap between the source text and the reference summary, to inform the model of the level of faithfulness.
- ExtraPhrase: Efficient Data Augmentation for Abstractive Summarization
- Problem: Obtaining abstractive summarization datasets is expensive.
- Approach: Propose a two-step method that generates pseudo-summaries: first extract salient sentences from a text, then generate paraphrases of them.
- Interactive Query-Assisted Summarization via Deep Reinforcement Learning
- Problem: Current neural models for interactive summarization have latency problems that prevent real-time processing.
- Approach: Propose to decompose the interactive summarization system actions into 1) initial summary and query responses generation and 2) generation of suggested queries, and model them with reinforcement learning.
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias
- Problem: How media outlets frame the same event in their articles differs depending on their political leanings.
- Approach: Present a new task/dataset aiming to generate framing-bias-free summaries from articles with different political bias.
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations
- Problem: Graph representations of text have not yet been used to evaluate the factuality of summarization systems.
- Approach: Propose to use AMR and adapter-enhanced models to evaluate summaries for factuality.
- QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization
- Problem: The two categories of factual consistency evaluation, 1) entailment-based and 2) QA-based, haven’t been fairly compared.
- Approach: Found that QA-based methods capture factual consistency better than entailment-based methods, and propose a new evaluation metric.
- Mapping the Design Space of Human-AI Interaction in Text Summarization
- Problem: There is no study of how humans are involved in the development of text summarization systems.
- Approach: Conducted experiments on five types of human-AI interaction in a text summarization task to evaluate users’ experience.
- Interactive Query-Assisted Summarization via Deep Reinforcement Learning
- Problem: Current interactive summarization systems are too computationally expensive, and sample efficiency is not well studied.
- Approach: Propose two reinforcement-learning-based modules: 1) one to find salient information in user queries, and 2) one to list possible queries.
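
As a rough illustration of the control-token idea in the Entity Coverage Control entry above, here is a minimal sketch. It is not the paper's implementation: the function names, the `<cov_k>` token format, the use of precision (the share of summary entities that also appear in the source), and the three-bin split are all my assumptions, and the entity lists are assumed to come from some NER tool that is not shown.

```python
from typing import Iterable


def entity_coverage_precision(source_entities: Iterable[str],
                              summary_entities: Iterable[str]) -> float:
    """Fraction of summary entities that also appear in the source (hypothetical helper)."""
    summary_set = {e.lower() for e in summary_entities}
    if not summary_set:
        # Assumption: a summary with no entities cannot contain an unsupported entity.
        return 1.0
    source_set = {e.lower() for e in source_entities}
    return len(summary_set & source_set) / len(summary_set)


def control_token(precision: float, num_bins: int = 3) -> str:
    """Map a precision in [0, 1] to a bucketed special token such as <cov_2>."""
    bin_index = min(int(precision * num_bins), num_bins - 1)
    return f"<cov_{bin_index}>"


def prepend_control_token(source_text: str,
                          source_entities: Iterable[str],
                          summary_entities: Iterable[str],
                          num_bins: int = 3) -> str:
    """Build a training input: control token followed by the source text."""
    precision = entity_coverage_precision(source_entities, summary_entities)
    return f"{control_token(precision, num_bins)} {source_text}"


if __name__ == "__main__":
    src = "Paris is the capital of France."
    # "Berlin" is not in the source, so only half of the summary entities are covered.
    print(prepend_control_token(src, ["Paris", "France"], ["Paris", "Berlin"]))
    # prints "<cov_1> Paris is the capital of France."
```

At inference time, the same idea would presumably be used by always prepending the highest-coverage token to ask the model for a more faithful summary.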