Background
- NAACL 2022 papers with one-sentence summaries (data obtained from the official GitHub repository).
- Summaries were automatically generated by a BART model fine-tuned on the SciTLDR dataset.
- The summaries have not been validated by humans, so do not take them at face value; treat them only as a starting point for your own work.
- Using a dataset from our group (link), I generated summaries in four other languages.
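
For reference, generation with such a model looks roughly like the sketch below. This is a minimal illustration using the HuggingFace `transformers` API; the checkpoint name is a placeholder, not the exact model used to build this dataset.

```python
# Minimal sketch of one-sentence TLDR generation with a BART summarizer.
# MODEL_NAME is a hypothetical placeholder; substitute the actual
# SciTLDR-fine-tuned BART checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "path/to/bart-finetuned-scitldr"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def tldr(abstract: str) -> str:
    """Generate a one-sentence summary of a paper abstract."""
    inputs = tokenizer(abstract, truncation=True, max_length=1024,
                       return_tensors="pt")
    output_ids = model.generate(**inputs, num_beams=4, max_length=64,
                                early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because decoding uses a hard `max_length` cap, outputs can be cut off mid-sentence, which explains some of the truncated TLDRs below.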
Summaries
- Social Norms Guide Reference Resolution
- Mitchell Abrams, Matthias Scheutz
- TLDR: We investigate how social norms can influence the interpretation of referents in ambiguous cases and propose a new approach to resolve them.
- Learning Natural Language Generation with Truncated Reinforcement Learning
- Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin
- TLDR: We present a novel approach to train conditional language models without a supervised learning phase, using only reinforcement learning.
- Language Model Augmented Monotonic Attention for Simultaneous Translation
- Sathish Reddy Indurthi, Mohd Abbas Zaidi, Beomseok Lee, Nikhil Kumar Lakumarapu, Sangha Kim
- TLDR: We propose a framework to aid monotonic attention with external language model to improve its decisions.
- What Makes a Good and Useful Summary? Incorporating Users in Automatic Summarization Research
- Maartje Ter Hoeve, Julia Kiseleva, Maarten Rijke
- TLDR: We propose a survey methodology that can be used to investigate the needs of users of automatically generated summaries.
- ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction
- Xun Yuan, Derek Pham, Sam Davidson, Zhou Yu
- TLDR: We present a novel GEC dataset consisting of parallel original and corrected utterances drawn from open-domain chatbot conversations; to our knowledge, this is the first GEC dataset targeted at a human-machine conversational setting.
- Semantic Diversity in Dialogue with Natural Language Inference
- Katherine Stasaski, Marti Hearst
- TLDR: We propose a novel metric which uses Natural Language Inference to measure the semantic diversity of a set of model responses for a conversation.
- LEA: Meta Knowledge-Driven Self-Attentive Document Embedding for Few-Shot Text Classification
- S. K. Hong, Tae Young Jang
- TLDR: We propose a novel learning method for learning how to attend, called LEA, through which meta-level attention aspects are derived based on our meta-learning strategy.
- Enhancing Self-Attention with Knowledge-Assisted Attention Maps
- Jiangang Bai, Yujing Wang, Hong Sun, Ruonan Wu, Tianmeng Yang, Pengfei Tang, Defu Cao, Mingliang Zhang, Yunhai Tong, Yaming Yang, Jing Bai, Ruofei Zhang, Hao Sun, Wei Shen
- TLDR: We propose a novel and generic approach for knowledge infusion into pre-trained language models by incorporating knowledge-generated attention maps into the self-attention mechanism.
- Batch-Softmax Contrastive Loss for Pairwise Sentence Scoring Tasks
- Anton Chernyavskiy, Dmitry Ilvovsky, Pavel Kalinin, Preslav Nakov
- TLDR: We explore the idea of using a batch-softmax contrastive loss when fine-tuning large-scale pre-trained transformer models to learn better task-specific sentence embeddings for pairwise sentence scoring tasks.
- NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge
- Alexander Spangher, Xiang Ren, Jonathan May, Nanyun Peng
- TLDR: We present the first publicly available dataset of news revision histories, NewsEdits, and develop a high-accuracy extraction algorithm to identify article-level edit actions.
- Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia
- Samee Ibraheem, Gaoyue Zhou, John DeNero
- TLDR: We show that there are differences in language produced by players with different roles in the game of Mafia.
- SUBS: Subtree Substitution for Compositional Semantic Parsing
- Jingfeng Yang, Le Zhang, Diyi Yang
- TLDR: We propose to use subtree substitution for compositional data augmentation, where we consider subtrees with similar semantic functions as exchangeable.
- Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks
- Paul Rottger, Bertie Vidgen, Dirk Hovy, Janet Pierrehumbert
- TLDR: We propose two contrasting paradigms for data annotation: one encourages annotator subjectivity, while the other supports training models that consistently apply one belief.
- Do Deep Neural Nets Display Human-like Attention in Short Answer Scoring?
- Zijie Zeng, Xinyu Li, Dragan Gasevic, Guanliang Chen
- TLDR: We investigate whether and to what extent DL-based graders align with human graders regarding the important words they identify when marking short answer questions.
- Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation
- Yu Li, Baolin Peng, Yelong Shen, Yi Mao, Lars Liden, Zhou Yu, Jianfeng Gao
- TLDR: We present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks.
- CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data
- Rui Feng, Chen Luo, Qingyu Yin, Bing Yin, Tuo Zhao, Chao Zhang
- TLDR: We propose CERES, a graph-based transformer model for semi-structured session data that captures both intra-item and inter-item interactions and outperforms strong pretraining baselines in three session search and entity linking tasks.
- Political Ideology and Polarization: A Multi-dimensional Approach
- Barea Sinno, Bernardo Oviedo, Katherine Atwell, Malihe Alikhani, Junyi Jessy Li
- TLDR: We present a novel and more nuanced approach for the study of ideology and polarization in news media, and present a new dataset of news articles whose ideological positions are annotated by trained political scientists and linguists at the paragraph level.
- Cooperative Self-training of Machine Reading Comprehension
- Hongyin Luo, Shang-Wen Li, Mingye Gao, Seunghak Yu, James Glass
- TLDR: We propose a cooperative self-training framework for extracting question-answer pairs from text corpora without annotation.
- GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
- Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
- TLDR: We present a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates this throughout layers.
- A Robustly Optimized BMRC for Aspect Sentiment Triplet Extraction
- Shu Liu, Kaiwen Li, Zuhe Li
- TLDR: We present a robust and efficient bidirectional machine reading comprehension method for aspect sentiment triplet extraction.
- Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds
- Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han
- TLDR: We propose a novel framework for seed-guided topic discovery that leverages the power and general knowledge of pre-trained language models (PLMs).
- Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs
- Xu Wang, Simin Fan, Jessica Houghton, Lu Wang
- TLDR: We investigate how instructors construct questions and identify touch points to enhance the underlying NLP models for better QG systems.
- SwahBERT: Language Model of Swahili
- Gati Martin, Medard Edmund Mswahili, Young-Seob Jeong
- TLDR: We present a new monolingual language model for Swahili, namely SwahBERT, built on our collected pre-training data and tested on four downstream tasks including emotion classification.
- Deconstructing NLG Evaluation: Evaluation Practices, Assumptions, and Their Implications
- Kaitlyn Zhou, Su Lin Blodgett, Adam Trischler, Hal Daumé III, Kaheer Suleman, Alexandra Olteanu
- TLDR: We explore the goals, assumptions, and constraints of natural language generation practitioners and how they affect their evaluation of NLG systems.
- TSTR: Too Short to Represent, Summarize with Details! Intro-Guided Extended Summary Generation
- Sajad Sotudeh, Nazli Goharian
- TLDR: Extended summaries for scientific documents with long abstracts and long-form text.
- Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems
- Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, Anil Nelakanti, Vineet Gandhi
- TLDR: We present a method to control the emotional prosody of Text to Speech (TTS) systems by using phoneme-level intermediate features (pitch, energy, and duration) as levers.
- The Why and The How: A Survey on Natural Language Interaction in Visualization
- Henrik Voigt, Ozge Alacam, Monique Meuschke, Kai Lawonn, Sina Zarrieß
- TLDR: We provide an overview of natural language-based interaction in the research area of visualization.
- Understand before Answer: Improve Temporal Reading Comprehension via Precise Question Understanding
- Hao Huang, Xiubo Geng, Guodong Long, Daxin Jiang
- TLDR: We propose a novel reading comprehension approach with precise question understanding and a novel auxiliary contrastive loss for representation learning of temporal relations.
- User-Driven Research of Medical Note Generation Software
- Tom Knoll, Francesco Moramarco, Alex Papadopoulos Korfiatis, Rachel Young, Claudia Ruffini, Mark Perera, Christian Perstl, Ehud Reiter, Anya Belz, Aleksandar Savkov
- TLDR: We present three rounds of user studies on how to adapt a medical note generation system to clinical practice.
- Ask Me Anything in Your Native Language
- Nikita Sorokin, Dmitry Abulkhanov, Irina Piontkovskaya, Valentin Malykh
- TLDR: We present a novel approach based on a single encoder for query and passage for retrieval from a multilingual collection, together with a cross-lingual generative reader.
- Diversifying Neural Dialogue Generation via Negative Distillation
- Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li
- TLDR: We propose a novel negative training paradigm for generative dialogue models that avoids generic responses and maximizes distance with multi-level negative knowledge.
- On Synthetic Data for Back Translation
- Jiahao Xu, Yubin Ruan, Wei Bi, Guoping Huang, Shuming Shi, Lihui Chen, Lemao Liu
- TLDR: We identify two key factors on synthetic data controlling the back-translation NMT performance, which are quality and importance.
- Mapping the Design Space of Human-AI Interaction in Text Summarization
- Ruijia Cheng, Alison Smith-Renner, Ke Zhang, Joel Tetreault, Alejandro Jaimes-Larrarte
- TLDR: We map the design opportunities and considerations for human-AI interaction in text summarization and broader text generation tasks.
- Towards Robust and Semantically Organised Latent Representations for Unsupervised Text Style Transfer
- Sharan Narasimhan, Suvodip Dey, Maunendra Desarkar
- TLDR: We propose a novel style transfer model based on perturbation of the latent space and a novel noise component on the continuous embedding space.
- An Exploration of Post-Editing Effectiveness in Text Summarization
- Vivian Lai, Alison Smith-Renner, Ke Zhang, Ruijia Cheng, Wenjuan Zhang, Joel Tetreault, Alejandro Jaimes-Larrarte
- TLDR: We explore whether post-editing AI-provided summaries improves summarization performance and user experience on formal and informal text.
- Automatic Correction of Human Translations
- Jessy Lin, Geza Kovacs, Aditya Shastry, Joern Wuebker, John DeNero
- TLDR: We introduce translation error correction (TEC), the task of automatically correcting human-generated translations.
- On the Robustness of Reading Comprehension Models to Entity Renaming
- Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren
- TLDR: We study the robustness of machine reading comprehension (MRC) models to entity renaming—do models make more wrong predictions when the same questions are asked about an entity whose name has been changed?
- Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data
- Cynthia Sullivan, William Brackenbury, Andrew McNut, Kevin Bryson, Yuxin Chen, Michael Littman, Chenhao Tan, Blase Ur
- TLDR: We studied how humans select rationales, a subset of input tokens relevant to the chosen label, and how different instructions and user interface affordances impact the rationales chosen.
- Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
- Haode Zhang, Haowen Liang, Yuwei Zhang, Li-Ming Zhan, Xiao-Ming Wu, Xiaolei Lu, Albert Lam
- TLDR: We propose to improve supervised pre-training by regularizing the feature space towards isotropy.
- Cross-document Misinformation Detection based on Event Graph Reasoning
- Xueqing Wu, Kung-Hsiang Huang, Yi Fung, Heng Ji
- TLDR: We propose a novel task of cross-document misinformation detection and graph-based methods to address it.
- Disentangled Action Recognition with Knowledge Bases
- Zhekun Luo, Shalini Ghosh, Devin Guillory, Keizo Kato, Trevor Darrell, Huijuan Xu
- TLDR: We propose a novel compositional action recognition model that uses knowledge graphs to learn compositional representations for verbs and nouns.
- Machine-in-the-Loop Rewriting for Creative Image Captioning
- Vishakh Padmakumar, He He
- TLDR: We train a rewriting model that, when prompted, modifies specified spans of text within the user’s original draft to introduce descriptive and figurative elements in the text.
- A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction
- Yong Xie, Dakuo Wang, Pin-Yu Chen, Jinjun Xiong, Sijia Liu, Oluwasanmi Koyejo
- TLDR: We propose a novel adversarial attack method to fool stock prediction models by simply concatenating a perturbed but semantically similar tweet.
- Building Multilingual Machine Translation Systems That Serve Arbitrary XY Translations
- Akiko Eriguchi, Shufang Xie, Tao Qin, Hany Hassan
- TLDR: We propose a novel approach to train a multilingual neural machine translation system that can serve arbitrary X-Y translation directions while leveraging multilinguality with a two-stage training strategy of pretraining and finetuning.
- Non-Autoregressive Neural Machine Translation with Consistency Regularization Optimized Variational Framework
- Minghao Zhu, Junli Wang, Chungang Yan
- TLDR: We propose a method for improving the translation quality of VAE-based neural machine translation models by improving the posterior consistency of latent variables.
- User-Centric Gender Rewriting
- Bashar Alhafni, Nizar Habash, Houda Bouamor
- TLDR: We propose a multi-step system for gender rewriting in contexts involving two users (I and/or You) with independent grammatical gender preferences.
- Reframing Human-AI Collaboration for Generating Free-Text Explanations
- Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- TLDR: We present a new algorithm for generating explanations for classification decisions using human-written examples in a few-shot manner.
- EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction
- Benfeng Xu, Quan Wang, Yajuan Lyu, Yabing Shi, Yong Zhu, Jie Gao, Zhendong Mao
- TLDR: Multi-triple extraction is a challenging task due to the existence of informative inter-triples correlations, and consequently rich interactions across the constituent entities and relations.
- Meta Learning for Natural Language Processing: A Survey
- Hung-yi Lee, Shang-Wen Li, Thang Vu
- TLDR: Meta-learning in NLP.
- Analyzing Modality Robustness in Multimodal Sentiment Analysis
- Devamanyu Hazarika, Yingting Li, Bo Cheng, Shuai Zhao, Roger Zimmermann, Soujanya Poria
- TLDR: We propose simple diagnostic checks for modality robustness in multimodal models and propose robust training strategies to address the issues.
- Fuse It More Deeply! A Variational Transformer with Layer-Wise Latent Variable Inference for Text Generation
- Jinyi Hu, Xiaoyuan Yi, Wenhao Li, Maosong Sun, Xing Xie
- TLDR: Variational Auto-Encoder for text generation.
- Easy Adaptation to Mitigate Gender Bias in Multilingual Text Classification
- Xiaolei Huang
- TLDR: We present a standard domain adaptation model to reduce the gender bias and improve performance of text classifiers under multilingual settings.
- On the Use of External Data for Spoken Named Entity Recognition
- Ankita Pasad, Felix Wu, Suwon Shon, Karen Livescu, Kyu Han
- TLDR: We present a new approach to learning low-resource spoken named entity recognition models using external data that are not annotated for the task.
- Long-term Control for Dialogue Generation: Methods and Evaluation
- Ramya Ramakrishnan, Hashan Narangodage, Mauro Schilman, Kilian Weinberger, Ryan McDonald
- TLDR: We propose new metrics for evaluating long-term dialogue generation and a retrieval-augmented method that improves performance of long-term controlled generation via logit modification techniques.
- Learning Dialogue Representations from Consecutive Utterances
- Zhihan Zhou, Dejiao Zhang, Wei Xiao, Nicholas Dingwall, Xiaofei Ma, Andrew Arnold, Bing Xiang
- TLDR: We introduce Dialogue Sentence Embedding, a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks.
- On the Machine Learning of Ethical Judgments from Natural Language
- Zeerak Talat, Hagen Blix, Josef Valvoda, Maya Indira Ganesh, Ryan Cotterell, Adina Williams
- TLDR: We offer a critique of recent work on computational approaches for predicting morality in NLP, and propose a new approach to address ethical issues in machine learning.
- NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
- Ximing Lu, Sean Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah Smith, Yejin Choi
- TLDR: We propose a novel method for controlling text generation under lexical constraints.
- PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining
- Machel Reid, Mikel Artetxe
- TLDR: We propose PARADISE, a novel method for integrating parallel data into sequence-to-sequence pretraining, which improves the performance of multilingual sequence-to-sequence models by improving the accuracy of cross-lingual inference.
- Explaining Toxic Text via Knowledge Enhanced Text Generation
- Rohit Sridhar, Diyi Yang
- TLDR: We present a novel knowledge-informed encoder-decoder framework for detecting and explaining stereotypes in toxic speech.
- Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection
- Angelica Chen, Vicky Zayats, Daniel Walker, Dirk Padfield
- TLDR: We propose a streaming BERT-based sequence tagging model that, combined with a novel training objective, is capable of detecting disfluencies in real-time while balancing accuracy and latency.
- GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering
- Yoonseok Yang, Kyu Seok Kim, Minsam Kim, Juneyoung Park
- TLDR: We propose GRAM, a new method for learning content-based collaborative filtering that improves training efficiency by up to 146x.
- Generating Repetitions with Appropriate Repeated Words
- Toshiki Kawamoto, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura
- TLDR: We propose Weighted Label Smoothing, a smoothing method for explicitly learning which words to repeat during fine-tuning, and a repetition scoring method that can output more appropriate repetitions during decoding.
- Textless Speech-to-Speech Translation on Real Data
- Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu
- TLDR: We present a textless speech-to-speech translation system that can translate speech from one language into another language and can be built without the need of any text data.
- WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding
- Guoqing Zheng, Giannis Karamanolakis, Kai Shu, Ahmed Awadallah
- TLDR: We propose a new weak supervision benchmark for natural language understanding tasks with diverse tasks and real-world weak labeling rules.
- CompactIE: Compact Facts in Open Information Extraction
- Farima Fatahi Bayat, Nikita Bhutani, H. Jagadish
- TLDR: We propose a new system for extracting compact facts in open information extraction.
- CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination
- Hyounghun Kim, Abhay Zala, Mohit Bansal
- TLDR: We present a new dataset for commonsense reasoning about counterfactual scene imagination and show that AI models significantly lag behind human performance.
- Abstraction not Memory: BERT and the English Article System
- Harish Tayyar Madabushi, Dagmar Divjak, Petar Milin
- TLDR: We show that BERT is able to learn to predict the zero articles of a given article by detecting them using rules that the deep neural model can easily pick up.
- OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering
- Zhengbao Jiang, Yi Mao, Pengcheng He, Graham Neubig, Weizhu Chen
- TLDR: We propose an omnivorous pretraining approach that consumes both natural and synthetic data to endow models with these respective abilities.
- Provably Confidential Language Modelling
- Xuandong Zhao, Lei Li, Yu-Xiang Wang
- TLDR: We propose Confidentially Redacted Training (CRT), a method to train language generation models while protecting the confidential segments.
- KAT: A Knowledge Augmented Transformer for Vision-and-Language
- Liangke Gui, Borui Wang, Qiuyuan Huang, Alexander Hauptmann, Yonatan Bisk, Jianfeng Gao
- TLDR: We propose a multimodal transformer that integrates implicit and explicit knowledge in its reasoning, and achieve state-of-the-art results on the open-domain multimodal task of OK-VQA.
- When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
- Sebastian Schuster, Tal Linzen
- TLDR: We present a new evaluation suite for language models that targets the knowledge of the interactions between sentential operators and indefinite NPs.
- On Curriculum Learning for Commonsense Reasoning
- Adyasha Maharana, Mohit Bansal
- TLDR: We use paced curriculum learning to rank data and sample training mini-batches with increasing levels of difficulty from the ranked dataset during finetuning language models for commonsense reasoning tasks.
- DocTime: A Document-level Temporal Dependency Graph Parser
- Puneet Mathur, Vlad Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain
- TLDR: We introduce DocTime, a novel temporal dependency graph parser that takes a text document as input and produces a temporal dependency graph.
- FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
- David Wan, Mohit Bansal
- TLDR: We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning.
- ScAN: Suicide Attempt and Ideation Events Dataset
- Bhanu Pratap Singh Rawat, Samuel Kovaly, Hong Yu, Wilfred Pigeon
- TLDR: We present a novel dataset for suicide attempt and ideation events and a model for predicting suicidal behavior.
- Socially Aware Bias Measurements for Hindi Language Representations
- Vijit Malik, Sunipa Dev, Akihiro Nishi, Nanyun Peng, Kai-Wei Chang
- TLDR: We investigate the biases present in Hindi language representations such as caste and religion associated biases.
- AmbiPun: Generating Humorous Puns with Ambiguous Context
- Anirudh Mittal, Yufei Tian, Nanyun Peng
- TLDR: We propose a simple yet effective way to generate pun sentences that does not require any training on existing puns.
- EmpHi: Generating Empathetic Responses with Human-like Intents
- Mao Yan Chen, Siheng Li, Yujiu Yang
- TLDR: We propose a novel model to generate empathetic responses with human-like empathetic intents.
- Yes, No or IDK: The Challenge of Unanswerable Yes/No Questions
- Elior Sulem, Jamaal Hay, Dan Roth
- TLDR: We extend the Yes/No QA task, adding questions with an IDK answer, and show its considerable difficulty compared to the original 2-label task.
- Inducing and Using Alignments for Transition-based AMR Parsing
- Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramón Astudillo
- TLDR: We propose a neural aligner for AMR that learns node-to-word alignments without relying on complex pipelines.
- Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?
- Xiang Zhou, Shiyue Zhang, Mohit Bansal
- TLDR: We propose a new language model that can learn tag agreement in a very simplified setting and show mixed results on both the English Penn WSJ dataset and universal treebank.
- DREAM: Improving Situational QA by First Elaborating the Situation
- Yuling Gu, Bhavana Dalvi, Peter Clark
- TLDR: We train a new model to answer questions that elaborate the scenes that situated questions are about, and then provide those elaborations as additional context to a question-answering (QA) model.
- CoSe-Co: Text Conditioned Generative CommonSense Contextualizer
- Rachit Bansal, Milan Aggarwal, Sumit Bhatia, Jivat Kaur, Balaji Krishnamurthy
- TLDR: We propose a framework for generating commonsense knowledge conditioned on sentences, making it generically usable across natural language tasks.
- Probing via Prompting
- Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan
- TLDR: We propose a novel probing-via-prompting approach that is comparable to or better than diagnostic probes at extracting information while learning much less on its own.
- Database Search Results Disambiguation for Task-Oriented Dialog Systems
- Kun Qian, Satwik Kottur, Ahmad Beirami, Shahin Shayandeh, Paul Crook, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar
- TLDR: We propose Database Search Result Disambiguation, a novel task that focuses on disambiguating database search results, which enhances user experience by allowing them to choose from multiple options instead of just one.
- Unsupervised Slot Schema Induction for Task-oriented Dialog
- Dian Yu, Mingqiu Wang, Yuan Cao, Izhak Shafran, Laurent Shafey, Hagen Soltau
- TLDR: We propose an unsupervised approach for slot schema induction from unlabeled dialog corpora.
- Towards a Progression-Aware Autonomous Dialogue Agent
- Abraham Sanders, Tomek Strzalkowski, Mei Si, Albert Chang, Deepanshu Dey, Jonas Braasch, Dakuo Wang
- TLDR: We propose a framework in which dialogue agents can evaluate the progression of a conversation toward or away from desired outcomes, and use this signal to inform planning for subsequent responses.
- Cross-Domain Detection of GPT-2-Generated Technical Text
- Juan Rodriguez, Todd Hay, David Gros, Zain Shamsi, Ravi Srinivasan
- TLDR: We show that paragraph-level detectors can be used to detect GPT-2-generated technical research text.
- DISAPERE: A Dataset for Discourse Structure in Peer Review Discussions
- Neha Kennard, Tim O’Gorman, Rajarshi Das, Akshay Sharma, Chhandak Bagchi, Matthew Clinton, Pranay Kumar Yelugam, Hamed Zamani, Andrew McCallum
- TLDR: We present DISAPERE, a labeled dataset of 20k sentences contained in 506 review-rebuttal pairs in English, annotated by experts.
- MultiSpanQA: A Dataset for Multi-Span Question Answering
- Haonan Li, Martin Tomko, Maria Vasardani, Timothy Baldwin
- TLDR: We present MultiSpanQA, a reading comprehension dataset that focuses on multi-span questions.
- Context-Aware Abbreviation Expansion Using Large Language Models
- Shanqing Cai, Subhashini Venugopalan, Katrin Tomanek, Ajit Narayanan, Meredith Morris, Michael Brenner
- TLDR: We propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters, and show that this allows for efficient abbreviation expansion and robustness against typo noise.
- Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models
- Yang Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou
- TLDR: We introduce the sensitivity test (SeT) for measuring stereotypical associations from language models.
- Sort by Structure: Language Model Ranking as Dependency Probing
- Max Müller-Eberstein, Rob Goot, Barbara Plank
- TLDR: We propose probing to rank LMs, specifically for parsing dependencies in a given language, by measuring the degree to which labeled trees are recoverable from an LM’s contextualized embeddings.
- Quantifying Synthesis and Fusion and their Impact on Machine Translation
- Arturo Oncevay, Duygu Ataman, Niels Van Berkel, Barry Haddow, Alexandra Birch, Johannes Bjerva
- TLDR: We propose to quantify morphological typology at the word and segment level and show that both synthesis and fusion are important for machine translation quality.
- Commonsense and Named Entity Aware Knowledge Grounded Dialogue Generation
- Deeksha Varshney, Akshara Prabhakar, Asif Ekbal
- TLDR: We present a novel open-domain dialogue generation model which effectively utilizes the large-scale commonsense and named entity based knowledge in addition to the unstructured topic-specific knowledge associated with each utterance.
- Efficient Hierarchical Domain Adaptation for Pretrained Language Models
- Alexandra Chronopoulou, Matthew Peters, Jesse Dodge
- TLDR: We propose a novel method to adapt language models to many diverse domains using a hierarchical tree structure and a computationally efficient adapter approach.
- Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate
- Hannah Kirk, Bertie Vidgen, Paul Rottger, Tristan Thrush, Scott Hale
- TLDR: We present HatemojiCheck, a test suite of 3,930 short-form statements that allows us to evaluate performance on hateful language expressed with emoji.
- On the Economics of Multilingual Few-shot Learning: Modeling the Cost-Performance Trade-offs of Machine Translated and Manual Data
- Kabir Ahuja, Monojit Choudhury, Sandipan Dandapat
- TLDR: We propose a framework to systematically evaluate the performance and cost trade-offs between machine-translated and manually-created labelled data for task-specific fine-tuning of massively multilingual language models.
- Learning to Selectively Learn for Weakly Supervised Paraphrase Generation with Model-based Reinforcement Learning
- Haiyan Yin, Dingcheng Li, Ping Li
- TLDR: Weakly supervised paraphrase generation with reinforcement learning and reward normalization.
- Quality-Aware Decoding for Neural Machine Translation
- Patrick Fernandes, António Farinhas, Ricardo Rei, José De Souza, Perez Ogayo, Graham Neubig, Andre Martins
- TLDR: We propose a new method for machine translation quality estimation and evaluation based on beam search.
- Pretrained Models for Multilingual Federated Learning
- Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme
- TLDR: We explore the impact of multilingual text on federated learning algorithms and show that using pretrained models reduces the negative effects of FL, even when using non-IID partitioning.
- AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models
- Yue Yu, Lingkai Kong, Jieyu Zhang, Rongzhi Zhang, Chao Zhang
- TLDR: We develop AcTune, a new framework that improves the label efficiency of active PLM fine-tuning by unleashing the power of unlabeled data via self-training.
- Label Anchored Contrastive Learning for Language Understanding
- Zhenyu Zhang, Yuming Zhao, Meng Chen, Xiaodong He
- TLDR: We propose a novel label anchored contrastive learning approach for language understanding.
- Go Back in Time: Generating Flashbacks in Stories with Event Temporal Prompts
- Rujun Han, Hong Chen, Yufei Tian, Nanyun Peng
- TLDR: We propose to generate stories end-to-end using structured storylines that encode events and their pair-wise temporal relations (before, after, and vague) as temporal prompts guiding how stories should unfold temporally.
- Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts
- Felix Drinkall, Stefan Zohren, Janet Pierrehumbert
- TLDR: We present a novel approach incorporating transformer-based language models into infectious disease modelling.
- Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
- Rahul Kumar, Sandeep Mathias, Sriparna Saha, Pushpak Bhattacharyya
- TLDR: Most research in the area of automatic essay grading (AEG) is geared towards scoring the essay holistically.
- Natural Language Inference with Self-Attention for Veracity Assessment of Pandemic Claims
- Miguel Arana-Catania, Elena Kochkina, Arkaitz Zubiaga, Maria Liakata, Robert Procter, Yulan He
- TLDR: We present a comprehensive work on automated veracity assessment from dataset creation to developing novel methods based on Natural Language Inference (NLI), focusing on misinformation related to the COVID-19 pandemic.
- Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding
- Ao Jia, Yu He, Yazhou Zhang, Sagar Uprety, Dawei Song, Christina Lioma
- TLDR: We present the first multi-modal and multi-task sentiment, emotion and desire dataset, which contains 9,190 text-image pairs, with English text.
- Relation-Specific Attentions over Entity Mentions for Enhanced Document-Level Relation Extraction
- Jiaxin Yu, Deqing Yang, Shuyu Tian
- TLDR: We propose a novel method for document-level relation extraction which performs selective attentions over different entity mentions with respect to candidate relations.
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation
- Giscard Biamby, Grace Luo, Trevor Darrell, Anna Rohrbach
- TLDR: Detecting out-of-context media, such as “miscaptioned” images on Twitter, is a relevant problem, especially in domains of high public significance.
- BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation
- Yuchen Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Rico Sennrich, Ryan Cotterell, Mrinmaya Sachan, Ming Zhou
- TLDR: We propose a novel automatic metric for document-level translation quality evaluation that is more sensitive to document-specific nuances and offers better selectivity and interpretability.
- Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude Detection in Social Media
- Lixing Zhu, Zheng Fang, Gabriele Pergola, Robert Procter, Yulan He
- TLDR: We propose a novel semi-supervised approach for vaccine attitude detection on social media using variational autoencoding and a variational model for sentiment analysis.
- SKILL: Structured Knowledge Infusion for Large Language Models
- Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- TLDR: We propose a method to infuse structured knowledge into large language models by directly training T5 models on factual triples of knowledge graphs.
- Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models
- Karolina Stanczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, Isabelle Augenstein
- TLDR: We investigate whether morphosyntactic information is encoded in the same subset of neurons in different languages.
- Aspect Is Not You Need: No-aspect Differential Sentiment Framework for Aspect-based Sentiment Analysis
- Jiahao Cao, Rui Liu, Huailiang Peng, Lei Jiang, Xu Bai
- TLDR: We propose a novel framework for the aspect-based sentiment analysis task that can predict the sentiment of an aspect even if we don’t know what the aspect is.
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
- Simiao Zuo, Qingru Zhang, Chen Liang, Pengcheng He, Tuo Zhao, Weizhu Chen
- TLDR: We propose a novel method for training small language models by adapting feed-forward neural networks into multiple experts.
- Implicit n-grams Induced by Recurrence
- Xiaobing Sun, Wei Lu
- TLDR: We show that there are explainable features in recurrent neural networks that are reminiscent of classical n-gram features, and use them to model interesting linguistic phenomena such as negation and intensification.
- Guiding Visual Question Generation
- Nihir Vedd, Zixu Wang, Marek Rei, Yishu Miao, Lucia Specia
- TLDR: We present Guiding Visual Question Generation - a variant of VQG which conditions the question generator on categorical information based on expectations on the type of question and the objects it should explore.
- OPERA: Operation-Pivoted Discrete Reasoning over Text
- Yongwei Zhou, Junwei Bao, Chaoqun Duan, Haipeng Sun, Jiahui Liang, Yifan Wang, Jing Zhao, Youzheng Wu, Xiaodong He, Tiejun Zhao
- TLDR: OPERA is a novel multi-predictor-based discrete reasoning framework that uses symbolic operations as neural modules to facilitate the reasoning ability and interpretability of machine reading comprehension.
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness
- Yun-Zhu Song, Yi-Syuan Chen, Hong-Han Shuai
- TLDR: We present an extract-then-abstract Transformer framework for multi-document summarization that learns salient sentence selection and summarizes the selected contents.
- Improving Constituent Representation with Hypertree Neural Networks
- Hao Zhou, Gongshen Liu, Kewei Tu
- TLDR: We propose a novel hypertree neural network for span representation based on compositional structures of natural language.
- Measuring Fairness with Biased Rulers: A Comparative Study on Bias Metrics for Pre-trained Language Models
- Pieter Delobelle, Ewoenam Tokpo, Toon Calders, Bettina Berendt
- TLDR: We evaluate the compatibility of many metrics for fairness in language models and show that many metrics are not compatible with each other and highly depend on (i) templates, (ii) attribute and target seeds, and (iii) the choice of embeddings.
- MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset
- Yahui Liu, Haoping Yang, Chen Gong, Qingrong Xia, Zhenghua Li, Min Zhang
- TLDR: We present a new dataset for cross-domain predicate-argument annotation and provide benchmark results on cross-domain SRL.
- Representation Learning for Conversational Data using Discourse Mutual Information Maximization
- Bishal Santra, Sumegh Roychowdhury, Aishik Mandal, Vasu Gurram, Atharva Naik, Manish Gupta, Pawan Goyal
- TLDR: We propose a structure-aware Mutual Information based loss-function DMI (Discourse Mutual Information) for training dialog-representation models, that additionally captures the inherent uncertainty in response prediction.
- ValCAT: Variable-Length Contextualized Adversarial Transformations Using Encoder-Decoder Language Model
- Chuyun Deng, Mingxuan Liu, Yue Qin, Jia Zhang, Hai-Xin Duan, Donghong Sun
- TLDR: We propose ValCAT, a black-box attack framework that misleads the language model by applying variable-length contextualized transformations to the original text.
- A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation
- Kexun Zhang, Rui Wang, Xu Tan, Junliang Guo, Yi Ren, Tao Qin, Tie-Yan Liu
- TLDR: We study the syntactic multi-modality problem in non-autoregressive translation and propose new loss functions for it.
- CIAug: Equipping Interpolative Augmentation with Curriculum Learning
- Ramit Sawhney, Ritesh Soun, Shrey Pandit, Megh Thakkar, Sarvagya Malaviya, Yuval Pinter
- TLDR: We propose CIAug, a curriculum-based learning method that builds upon mixup to improve interpolative data augmentation and generalize faster.
- Proposition-Level Clustering for Multi-Document Summarization
- Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan
- TLDR: We propose a novel text clustering method for multi-document summarization that uses salient propositions to indicate information saliency and avoids redundancy.
- Non-Autoregressive Machine Translation: It’s Not as Fast as it Seems
- Jindřich Helcl, Barry Haddow, Alexandra Birch
- TLDR: We provide a fair evaluation of non-autoregressive machine translation models and show that they are almost always slower than autoregressive models.
- BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer
- Marinela Parović, Goran Glavaš, Ivan Vulić, Anna Korhonen
- TLDR: We show that bilingual language pair adapters are more effective than dedicated language adapters for cross-lingual transfer when the goal is to optimize performance for a particular source-target transfer direction.
- Combining Humor and Sarcasm for Improving Political Parody Detection
- Xiao Ao, Danae Sanchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras
- TLDR: We present a multi-encoder model that combines three parallel encoders to enrich parody-specific representations with humor and sarcasm information.
- TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages
- Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu
- TLDR: We propose a novel approach to improve the structural reading comprehension task on web pages by leveraging the informative topology of web pages.
- RSTGen: Imbuing Fine-Grained Interpretable Control into Long-Form Text Generators
- Rilwan Adewoyin, Ritabrata Dutta, Yulan He
- TLDR: We propose RSTGen, a framework that utilises Rhetorical Structure Theory (RST), a classical language theory, to control the discourse structure, semantics and topics of generated text.
- Intent Detection and Discovery from User Logs via Deep Semi-Supervised Contrastive Clustering
- Rajat Kumar, Mayur Patidar, Vaibhav Varshney, Lovekesh Vig, Gautam Shroff
- TLDR: We propose an end-to-end deep contrastive clustering algorithm that jointly updates model parameters and cluster centers via supervised and self-supervised learning and optimally utilizes both labeled and unlabeled data.
- Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations
- Daniela Brook Weiss, Paul Roit, Ori Ernst, Ido Dagan
- TLDR: We extend the sentence fusion task dataset to more than triple its size and scope.
- The Devil is in the Details: On the Pitfalls of Vocabulary Selection in Neural Machine Translation
- Tobias Domhan, Eva Hasler, Ke Tran, Sony Trenous, Bill Byrne, Felix Hieber
- TLDR: We propose a model of vocabulary selection, integrated into the neural translation model, that predicts the set of allowed output words from contextualized encoder representations.
- MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting
- Anne Lauscher, Brandon Ko, Bailey Kuehl, Sophie Johnson, Arman Cohan, David Jurgens, Kyle Lo
- TLDR: We present a new dataset of 12.6K citation contexts from 1.2K computational linguistics papers that fully models multi-sentence, multi-label citation phenomena.
- DEGREE: A Data-Efficient Generation-Based Event Extraction Model
- I-Hung Hsu, Kuan-Hao Huang, Elizabeth Boschee, Scott Miller, Prem Natarajan, Kai-Wei Chang, Nanyun Peng
- TLDR: We propose DEGREE, a data-efficient model that formulates event extraction as a conditional generation problem and uses a natural sentence generator to extract the event arguments.
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling
- Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Daxin Jiang
- TLDR: We show that the language modeling objective creates a gap between the pre-training and fine-tuning stages.
- Hero-Gang Neural Model For Named Entity Recognition
- Jinpeng Hu, Yaling Shen, Yang Liu, Xiang Wan, Tsung-Hui Chang
- TLDR: We propose a novel Hero-Gang Neural structure (HGN) to leverage both global and local information to promote NER.
- MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification
- Jianhai Zhang, Mieradilijiang Maimaiti, Gao Xing, Yuanhang Zheng, Ji Zhang
- TLDR: Meta-learning for few-shot text classification.
- All You May Need for VQA are Image Captions
- Soravit Changpinyo, Doron Kukliansy, Idan Szpektor, Xi Chen, Nan Ding, Radu Soricut
- TLDR: We propose a method that automatically generates VQA examples from existing image-caption annotations and neural models for textual question generation.
- Frustratingly Easy System Combination for Grammatical Error Correction
- Muhammad Qorib, Seung-Hoon Na, Hwee Tou Ng
- TLDR: We formulate system combination for grammatical error correction as a simple machine learning task: binary classification.
- Simple Local Attentions Remain Competitive for Long-Context Tasks
- Wenhan Xiong, Barlas Oguz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Scott Yih, Yashar Mehdad
- TLDR: We show that the most efficient long-range attention variants can beat a simple local window attention under standard pretraining paradigms.
- Even the Simplest Baseline Needs Careful Re-investigation: A Case Study on XML-CNN
- Si-An Chen, Jie-jyun Liu, Tsung-Han Yang, Hsuan-Tien Lin, Chih-Jen Lin
- TLDR: We show that the superior performance claimed in the original paper was mainly due to some unbelievable coincidences.
- Multi-Relational Graph Transformer for Automatic Short Answer Grading
- Rajat Agarwal, Varun Khurana, Karish Grover, Mukesh Mohania, Vikram Goyal
- TLDR: We propose a multi-relational graph Transformer for automatic short answer grading based on the structural context of the sentence.
- Event Schema Induction with Double Graph Autoencoders
- Xiaomeng Jin, Manling Li, Heng Ji
- TLDR: We propose a new event schema induction framework using double graph autoencoders, which captures the global dependencies among nodes in event graphs.
- CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course
- Changyoon Lee, Yeon Seonwoo, Alice Oh
- TLDR: We present CS1QA, a dataset for code-based question answering in the programming education domain.
- Unsupervised Cross-Lingual Transfer of Structured Predictors without Source Data
- Kemal Kurniawan, Lea Frermann, Philip Schulz, Trevor Cohn
- TLDR: We propose a method for unsupervised cross-lingual transfer that aggregates multiple source models for structured prediction, without access to source data.
- Don’t Take It Literally: An Edit-Invariant Sequence Loss for Text Generation
- Guangyi Liu, Zichao Yang, Tianhua Tao, Xiaodan Liang, Junwei Bao, Zhen Li, Xiaodong He, Shuguang Cui, Zhiting Hu
- TLDR: We propose a novel Edit-Invariant Sequence Loss (EISL) algorithm for neural text generation models that computes the matching loss of a target sequence with a generated sequence.
- Modeling Exemplification in Long-form Question Answering via Retrieval
- Shufan Wang, Fangyuan Xu, Laure Thompson, Eunsol Choi, Mohit Iyyer
- TLDR: We provide the first computational study of exemplification in long-form question answering and show that state-of-the-art LFQA models struggle to generate relevant examples.
- D2U: Distance-to-Uniform Learning for Out-of-Scope Detection
- Eyup Yilmaz, Cagri Toraman
- TLDR: We propose a zero-shot post-processing step that exploits the shape of the output distribution to detect out-of-scope utterances in conversational systems.
- Reference-free Summarization Evaluation via Semantic Correlation and Compression Ratio
- Yizhu Liu, Qi Jia, Kenny Zhu
- TLDR: We propose a new automatic reference-free evaluation metric for summarization that compares semantic distribution between source document and summary by pretrained language models and considers summary compression ratio.
- KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation
- Marzieh Tahaei, Ella Charlaix, Vahid Nia, Ali Ghodsi, Mehdi Rezagholizadeh
- TLDR: We present a Transformer-based pre-trained language model compression method that outperforms state-of-the-art compression methods on GLUE and SQuAD benchmarks.
- Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models
- Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, Woomyoung Park
- TLDR: We propose a new method for building role-specified dialogue datasets for open-domain dialogue systems and show that the resulting systems retain conversational ability while keeping consistent roles.
- Sentence-Level Resampling for Named Entity Recognition
- Xiaochen Wang, Yue Wang
- TLDR: We propose sentence-level resampling methods for NER that improve span-level F1-scores of NER models across corpora from diverse domains.
- Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem
- Ryoma Sato
- TLDR: We propose WordTour, unsupervised one-dimensional word embeddings, which are extremely efficient and provide a minimal means to handle word embeddings.
- On the Diversity and Limits of Human Explanations
- Chenhao Tan
- TLDR: We provide an overview of the diversity of human explanations in NLP and provide implications for collecting and using human explanations.
- Locally Aggregated Feature Attribution on Natural Language Model Understanding
- Sheng Zhang, Jin Wang, Haitao Jiang, Rui Song
- TLDR: We propose a novel gradient-based feature attribution method for NLP models.
- Generic and Trend-aware Curriculum Learning for Relation Extraction
- Nidhi Vakil, Hadi Amiri
- TLDR: We present a generic and trend-aware curriculum learning approach that effectively integrates textual and structural information in text graphs for relation extraction between entities, which we consider as node pairs in graphs.
- On Systematic Style Differences between Unsupervised and Supervised MT and an Application for High-Resource Machine Translation
- Kelly Marchisio, Markus Freitag, David Grangier
- TLDR: We show that unsupervised machine translation is more fluent and structurally different in comparison to human translation than is supervised MT.
- Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
- Akari Asai, Matt Gardner, Hannaneh Hajishirzi
- TLDR: We present a method to incorporate evidentiality into the training of retrieval-augmented generation models, mitigating reliance on spurious cues.
- Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
- Yu Jin Kim, Beong-woo Kwak, Youngwook Kim, Reinald Kim Amplayo, Seung-won Hwang, Jinyoung Yeo
- TLDR: We propose a modular variant of the knowledge aggregation for commonsense reasoning that can be utilized synergetically in multiple-source settings.
- Learning to Express in Knowledge-Grounded Conversation
- Xueliang Zhao, Tingchen Fu, Chongyang Tao, Wei Wu, Dongyan Zhao, Rui Yan
- TLDR: We propose a segmentation-based generation model and optimize the model by a variational approach to discover the underlying pattern of knowledge expression in a response.
- End-to-End Chinese Speaker Identification
- Dian Yu, Ben Zhou, Dong Yu
- TLDR: We present a novel end-to-end speaker identification system for Chinese that achieves better results than state-of-the-art models on all public SI datasets for Chinese.
- MINION: a Large-Scale and Diverse Dataset for Multilingual Event Detection
- Amir Pouran Ben Veyseh, Minh Van Nguyen, Franck Dernoncourt, Thien Nguyen
- TLDR: We present a large-scale multilingual dataset for event detection in 8 different languages; 5 of them have not been supported by existing multilingual datasets.
- Do Prompt-Based Models Really Understand the Meaning of Their Prompts?
- Albert Webson, Ellie Pavlick
- TLDR: We show that prompt-based models can learn just as fast with many prompts that are intentionally irrelevant or even pathologically misleading as they do with instructively “good” prompts.
- GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
- Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych
- TLDR: Unsupervised domain adaptation method for dense retrieval.
- Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models
- Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech
- TLDR: We aim to further push the limit of inference speed by distilling teacher models into bigger, sparser student models.
- Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models
- Patrick Huber, Giuseppe Carenini
- TLDR: We propose a novel approach to infer discourse information for arbitrarily long documents.
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction
- Yuxin Xiao, Zecheng Zhang, Yuning Mao, Carl Yang, Jiawei Han
- TLDR: We propose to explicitly teach the model to capture relevant contexts and entity types for relation extraction by supervising and augmenting intermediate steps (SAIS) for RE.
- LITE: Intent-based Task Representation Learning Using Weak Supervision
- Naoki Otani, Michael Gamon, Sujay Kumar Jauhar, Mei Yang, Sri Raghu Malireddi, Oriana Riva
- TLDR: We propose a novel multi-task learning framework for to-do text representation that learns to represent English to-dos with weak-supervision labels from external resources.
- Does Summary Evaluation Survive Translation to Other Languages?
- Spencer Braun, Oleg Vasilyev, Neslihan Iskender, John Bohannon
- TLDR: We explore whether summary quality evaluations survive translation of summarization datasets into other languages, comparing human judgments with automatic evaluation measures.
- A Shoulder to Cry on: Towards A Motivational Virtual Assistant for Assuaging Mental Agony
- Tulika Saha, Saichethan Reddy, Anindya Das, Sriparna Saha, Pushpak Bhattacharyya
- TLDR: Mental Health Disorders continue plaguing humans worldwide.
- SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative Sampling
- Forrest Bao, Ge Luo, Hebi Li, Minghui Qiu, Yinfei Yang, Youbiao He, Cen Chen
- TLDR: We present a proof-of-concept study of a weakly supervised summary evaluation approach that works without reference summaries.
- Combating the Curse of Multilinguality in Cross-Lingual WSD by Aligning Sparse Contextualized Word Representations
- Gábor Berend
- TLDR: We propose a novel method for using large pre-trained monolingual language models in cross-lingual zero-shot word sense disambiguation (WSD), coupled with a contextualized mapping mechanism.
- Cheat Codes to Quantify Missing Source Information in Neural Machine Translation
- Proyag Pal, Kenneth Heafield
- TLDR: This paper describes a method to quantify the amount of information missing from the source in neural machine translation.
- WiC = TSV = WSD: On the Equivalence of Three Semantic Tasks
- Bradley Hauer, Grzegorz Kondrak
- TLDR: We show that the word-in-context task and the related TSV task are equivalent, and that the two tasks can be pairwise reduced to each other.
- What do tokens know about their characters and how do they know it?
- Ayush Kaushal, Kyle Mahowald
- TLDR: We investigate the mechanisms through which pre-trained language models acquire English-language character information during training, and show that this knowledge is acquired through multiple phenomena, including a systematic relationship between particular characters and particular parts of speech.
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization
- Alexander Fabbri, Xiaojian Wu, Srini Iyer, Haoran Li, Mona Diab
- TLDR: We propose a novel unsupervised approach for multi-perspective data augmentation for answer summarization and propose reinforcement learning rewards to improve factual consistency and answer coverage.
- Paragraph-based Transformer Pre-training for Multi-Sentence Inference
- Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
- TLDR: We show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.
- Text Style Transfer via Optimal Transport
- Nasim Nouri
- TLDR: We propose Optimal Transport for TST to simultaneously incorporate syntactic and semantic information into similarity computation between the source and the converted text.
- Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning
- Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis
- TLDR: We show that increasing the scale of multi-task learning, in terms of the number of tasks, indeed results in better learned representations than smaller multi-task setups.
- Interactive Query-Assisted Summarization via Deep Reinforcement Learning
- Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Ido Dagan, Yael Amsterdamer
- TLDR: We propose novel deep reinforcement learning models for interactive summarization that address all of the task requirements and show that they improve informativeness while preserving positive user experience.
- Data Augmentation with Dual Training for Offensive Span Detection
- Nasim Nouri
- TLDR: We propose a novel model for offensive span detection (OSD), whose goal is to identify the spans responsible for the offensive tone of the text.
- Training Mixed-Domain Translation Models via Federated Learning
- Peyman Passban, Tanya Roosta, Rahul Gupta, Ankit Chadha, Clement Chung
- TLDR: We propose a federated learning approach to train neural machine translation engines that can adapt to different domains and perform on par with state-of-the-art baselines.
- QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization
- Alexander Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
- TLDR: We show that selecting the components of a QA-based metric, especially question generation and answerability classification, is critical to performance.
- How Gender Debiasing Affects Internal Model Representations, and Why It Matters
- Hadas Orgad, Seraphina Goldfarb-Tarrant, Yonatan Belinkov
- TLDR: We propose a new metric for measuring intrinsic bias in NLP models and show that it is a better indicator of debiasing than extrinsic bias.
- A Structured Span Selector
- Tianyu Liu, Yuchen Jiang, Ryan Cotterell, Mrinmaya Sachan
- TLDR: We propose a grammar-based structured span selection model which learns to make use of the partial span-level annotation provided for such problems.
- Unified Semantic Typing with Meaningful Label Inference
- James Y. Huang, Bangzheng Li, Jiashu Xu, Muhao Chen
- TLDR: Unified framework for semantic typing that captures label semantics by projecting both inputs and labels into a joint semantic embedding space.
- Learning To Retrieve Prompts for In-Context Learning
- Ohad Rubin, Jonathan Herzig, Jonathan Berant
- TLDR: We propose an efficient method for retrieving prompts for in-context learning using annotated data and an LM.
- Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection
- Esma Balkir, Isar Nejadgholi, Kathleen Fraser, Svetlana Kiritchenko
- TLDR: We present a novel feature attribution method for explaining text classifiers, and analyze it in the context of hate speech detection.
- Learning to Retrieve Passages without Supervision
- Ori Ram, Gal Shachaf, Omer Levy, Jonathan Berant, Amir Globerson
- TLDR: We show that a novel pretraining scheme for ODQA retrievers can be used to generate pseudo examples for contrastive learning without any labeled training data.
- Re2G: Retrieve, Rerank, Generate
- Michael Glass, Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Ankita Naik, Pengshan Cai, Alfio Gliozzo
- TLDR: We propose Re2G, which combines neural initial retrieval and reranking with BART-based sequence-to-sequence generation.
- Don’t sweat the small stuff, classify the rest: Sample Shielding to protect text classifiers against adversarial attacks
- Jonathan Rusert, Padmini Srinivasan
- TLDR: We show that the ‘make minimal changes’ approach of SOTA attackers leads to critical vulnerabilities that can be defended against with an intuitive sampling strategy.
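The sampling defense can be sketched in a few lines (a rough illustration, assuming a generic `classify` callable and word-level subsampling; the paper's exact sampling scheme may differ):

```python
import random
from collections import Counter

def shielded_predict(text, classify, k=5, keep=0.5, seed=0):
    """Classify k random word-level subsets ("samples") of the input and
    majority-vote: an adversarial perturbation confined to a few words is
    unlikely to survive in every sample."""
    rng = random.Random(seed)
    words = text.split()
    votes = []
    for _ in range(k):
        sample = [w for w in words if rng.random() < keep]
        votes.append(classify(" ".join(sample) if sample else text))
    return Counter(votes).most_common(1)[0][0]

def toy_classifier(text):
    # stand-in for a real model: counts sentiment words
    return "pos" if text.count("good") >= text.count("bad") else "neg"

print(shielded_predict("the movie was good , really good , not bad", toy_classifier))
```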
- Federated Learning with Noisy User Feedback
- Rahul Sharma, Anil Ramakrishna, Ansel MacLaughlin, Anna Rumshisky, Jimit Majmudar, Clement Chung, Salman Avestimehr, Rahul Gupta
- TLDR: We propose a novel method for training federated ML models from noisy positive and negative user feedback, and show that it improves performance on text classification datasets.
- Gender Bias in Masked Language Models for Multiple Languages
- Masahiro Kaneko, Aizhan Imankulova, Danushka Bollegala, Naoaki Okazaki
- TLDR: We propose Multilingual Bias Evaluation Score, a score for evaluating the bias of masked language models in various languages using only English attribute word lists and parallel corpora.
- Multi-Domain Targeted Sentiment Analysis
- Orith Toledo-Ronen, Matan Orbach, Yoav Katz, Noam Slonim
- TLDR: We present a multi-domain TSA system based on augmenting a given training set with diverse weak labels from assorted domains.
- Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
- Prasetya Utama, Joshua Bambrick, Nafise Moosavi, Iryna Gurevych
- TLDR: We present a novel data generation pipeline that produces document-level NLI examples for recognizing factual inconsistency in summarization.
- Dynamic Gazetteer Integration in Multilingual Models for Cross-Lingual and Cross-Domain Named Entity Recognition
- Besnik Fetahu, Anjie Fang, Oleg Rokhlenko, Shervin Malmasi
- TLDR: We propose a novel approach to address the NER knowledge gap across languages and domains by augmenting multilingual transformers with gazetteers containing named entities from a target language or domain.
- MetaICL: Learning to Learn In Context
- Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- TLDR: Meta-training for In-Context Learning, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
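The meta-training data format can be pictured with a small sketch (the separators and templates below are assumptions, not the paper's exact format): each instance concatenates k demonstrations with a query input, and the LM is trained to emit the query's output.

```python
def build_metaicl_context(demos, query_input, sep="\n\n"):
    """Concatenate k (input, output) demonstrations with a query input; the
    LM's training target is the query's output, so across many tasks it
    learns *how* to learn from in-context examples."""
    parts = [f"{x}\n{y}" for x, y in demos]
    parts.append(query_input)
    return sep.join(parts)

demos = [("Review: great film. Sentiment?", "positive"),
         ("Review: dull plot. Sentiment?", "negative")]
print(build_metaicl_context(demos, "Review: loved every minute. Sentiment?"))
# the training target for this context would be: "positive"
```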
- Enhancing Knowledge Selection for Grounded Dialogues via Document Semantic Graphs
- Sha Li, Mahdi Namazifar, Di Jin, Mohit Bansal, Heng Ji, Yang Liu, Dilek Hakkani-Tur
- TLDR: We propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs.
- Using Natural Sentence Prompts for Understanding Biases in Language Models
- Sarah Alnegheimish, Alicia Guo, Yi Sun
- TLDR: We aim to understand the differences between using template-based prompts and natural sentence prompts when studying gender-occupation biases in language models.
- Robust Conversational Agents against Imperceptible Toxicity Triggers
- Ninareh Mehrabi, Ahmad Beirami, Fred Morstatter, Aram Galstyan
- TLDR: We propose adversarial attacks against conversational agents that are imperceptible, i.e., they can automatically trigger the system into generating toxic language.
- Selective Differential Privacy for Language Modeling
- Weiyan Shi, Aiqi Cui, Evan Li, Ruoxi Jia, Zhou Yu
- TLDR: We propose a new privacy notion, Selective-DPSGD, for language models that provides rigorous privacy guarantees on the sensitive portion of the data to improve model utility.
- Do Trajectories Encode Verb Meaning?
- Dylan Ebert, Chen Sun, Ellie Pavlick
- TLDR: We investigate the extent to which trajectories (i.e. the position and rotation of objects over time) naturally encode verb semantics.
- Long Context Question Answering via Supervised Contrastive Learning
- Avi Caciularu, Ido Dagan, Jacob Goldberger, Arman Cohan
- TLDR: We propose a novel method for equipping long-context QA models with an additional sequence-level objective for better identification of the supporting evidence.
- The USMLE® Step 2 Clinical Skills Patient Note Corpus
- Victoria Yaneva, Janet Mee, Le Ha, Polina Harik, Michael Jodoin, Alex Mechaber
- TLDR: We present a corpus of 43,985 clinical patient notes written by 35,156 examinees during the high-stakes USMLE® Step 2 Clinical Skills examination.
- Learning to Borrow – Relation Representation for Without-Mention Entity-Pairs for Knowledge Graph Completion
- Huda Hakami, Mona Hakami, Angrosh Mandya, Danushka Bollegala
- TLDR: We propose and evaluate several methods to represent relations between entities that do not co-occur in a single sentence using LDPs.
- Improving Entity Disambiguation by Reasoning over a Knowledge Base
- Tom Ayoola, Joseph Fisher, Andrea Pierleoni
- TLDR: We present a new entity disambiguation model which uses structured knowledge base information to explain the relationships between entities.
- Modal Dependency Parsing via Language Model Priming
- Jiarui Yao, Nianwen Xue, Bonan Min
- TLDR: We present a new modal dependency parser that is based on priming pre-trained language models and evaluate its performance on two data sets.
- Document-Level Relation Extraction with Sentences Importance Estimation and Focusing
- Wang Xu, Kehai Chen, Lili Mou, Tiejun Zhao
- TLDR: We propose a sentence importance score and sentence focusing loss for document-level relation extraction, which improves overall performance and makes DocRE models more robust.
- Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification
- Yang Xiao, Jinlan Fu, See-Kiong Ng, Pengfei Liu
- TLDR: We investigate the question of whether all the datasets in the benchmark are necessary.
- Triggerless Backdoor Attack for NLP Tasks with Clean Labels
- Leilei Gan, Jiwei Li, Tianwei Zhang, Xiaoya Li, Yuxian Meng, Fei Wu, Yi Yang, Shangwei Guo, Chun Fan
- TLDR: We propose a novel method to perform textual backdoor attack which does not require an external trigger and the poisoned samples are correctly labeled.
- PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding
- Antoine Chaffin, Vincent Claveau, Ewa Kijak
- TLDR: We propose several new methods to search for novel sequences of text that satisfy certain constraints in language models without tuning the LM.
- Interpretable Proof Generation via Iterative Backward Reasoning
- Hanhao Qu, Yu Cao, Jun Gao, Liang Ding, Ruifeng Xu
- TLDR: Iterative Backward Reasoning for rule-based Question Answering.
- Domain Confused Contrastive Learning for Unsupervised Domain Adaptation
- Quanyu Long, Tianze Luo, Wenya Wang, Sinno Pan
- TLDR: We propose Domain Confused Contrastive Learning for Unsupervised Domain Adaptation in the absence of target labels.
- Incorporating Centering Theory into Neural Coreference Resolution
- Haixia Chai, Michael Strube
- TLDR: We propose to incorporate centering transitions derived from centering theory in the form of a graph into a neural coreference model.
- Progressive Class Semantic Matching for Semi-supervised Text Classification
- Haiming Xu, Lingqiao Liu, Ehsan Abbasnejad
- TLDR: We propose a novel way to combine semi-supervised learning and pre-trained language models for text classification.
- Low Resource Style Transfer via Domain Adaptive Meta Learning
- Xiangyang Li, Xiang Long, Yu Xia, Sujian Li
- TLDR: We propose a domain adaptive meta-learning approach for text style transfer without parallel data.
- Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection
- Alan Ramponi, Sara Tonelli
- TLDR: We analyze lexical biases in hate speech detection and show that distinct spurious artifacts require different treatments to ultimately attain both robustness and fairness in hate speech detection.
- Document-Level Event Argument Extraction by Leveraging Redundant Information and Closed Boundary Loss
- Hanzhang Zhou, Kezhi Mao
- TLDR: We propose a new loss function to build classifiers with closed boundaries for document-level event argument extraction.
- A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
- David Adelani, Jesujoba Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Emezue, Colin Leong, Michael Beukman, Shamsuddeen Muhammad, Guyo Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir, Benjamin Ajibade, Tunde Ajayi, Yvonne Gitau, Jade Abbott, Mohamed Ahmed, Millicent Ochieng, Anuoluwapo Aremu, Perez Ogayo, Jonathan Mukiibi, Fatoumata Ouoba Kabore, Godson Kalipe, Derguene Mbaye, Allahsera Auguste Tapo, Victoire Memdjokam Koagne, Edwin Munkoh-Buabeng, Valencia Wagner, Idris Abdulmumin, Ayodele Awokoya, Happy Buzaaba, Blessing Sibanda, Andiswa Bukula, Sam Manthalu
- TLDR: We present a novel African news corpus for low-resource translation systems and show that the resulting translation models can effectively transfer to new domains.
- Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis
- Yiwei Wang, Muhao Chen, Wenxuan Zhou, Yujun Cai, Yuxuan Liang, Dayiheng Liu, Baosong Yang, Juncheng Liu, Bryan Hooi
- TLDR: We propose a novel method for debiasing sentence-level relation extraction that captures the causal effects of specific entity mentions in each instance.
- Analyzing Encoded Concepts in Transformer Language Models
- Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, Abdul Khan, Jia Xu
- TLDR: We propose a novel framework, ConceptX, to analyze how latent concepts are encoded in representations learned within pre-trained language models.
- Boosted Dense Retriever
- Patrick Lewis, Barlas Oguz, Wenhan Xiong, Fabio Petroni, Scott Yih, Sebastian Riedel
- TLDR: We propose DrBoost, a dense retrieval ensemble inspired by boosting.
- MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction
- Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang
- TLDR: We present MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources.
- NeuS: Neutral Multi-News Summarization for Mitigating Framing Bias
- Nayeon Lee, Yejin Bang, Tiezheng Yu, Andrea Madotto, Pascale Fung
- TLDR: Neural NLG model for neutralizing news framing bias.
- Enhance Incomplete Utterance Restoration by Joint Learning Token Extraction and Text Generation
- Shumpei Inoue, Tsungwei Liu, Son Nguyen, Minh-Tien Nguyen
- TLDR: We present a novel model for incomplete utterance restoration, which works well on both extraction and abstraction scenarios.
- Efficient Constituency Tree based Encoding for Natural Language to Bash Translation
- Shikhar Bharadwaj, Shirish Shevade
- TLDR: We propose a Segmented Invocation Transformer for Bash translation that captures the structure of the text and uses it to improve the model.
- Privacy-Preserving Text Classification on BERT Embeddings with Homomorphic Encryption
- Garam Lee, Minsoo Kim, Jai Hyun Park, Seung-won Hwang, Jung Hee Cheon
- TLDR: We propose a new method for text classification based on homomorphic encryption of BERT embeddings, so that no piece of information is revealed in the process of text classification.
- ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition
- Xinyu Wang, Min Gui, Yong Jiang, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Kewei Tu
- TLDR: We propose a novel attention mechanism to model the interactions between image and text representations in multi-modal named entity recognition.
- A Dataset for N-ary Relation Extraction of Drug Combinations
- Aryeh Tiktinsky, Vijay Viswanathan, Danna Niezni, Dana Meron Azagury, Yosi Shamay, Hillel Taub-Tabib, Tom Hope, Yoav Goldberg
- TLDR: We present a dataset for extracting information about the efficacy of drug-combinations from the scientific literature.
- Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding
- Zeming Chen, Qiyue Gao
- TLDR: We present Curriculum, a new format of NLI benchmark for evaluation of broad-coverage linguistic phenomena and an evaluation procedure for diagnosing how well a language model captures reasoning skills for distinct types of linguistic phenomena.
- Neural Language Taskonomy: Which NLP Tasks are the most Predictive of fMRI Brain Activity?
- Subba Reddy Oota, Jashn Arora, Veeral Agarwal, Mounika Marreddy, Manish Gupta, Bapi Surampudi
- TLDR: We explore transfer learning from task-specific Transformer representations learned for ten popular natural language processing tasks for predicting brain responses from two diverse datasets: Pereira (subjects reading sentences from paragraphs) and Narratives (subjects listening to spoken stories).
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations
- Leonardo Ribeiro, Mengwen Liu, Iryna Gurevych, Markus Dreyer, Mohit Bansal
- TLDR: We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MR), which are more suitable for factuality evaluation.
- Unsupervised Paraphrasability Prediction for Compound Nominalizations
- John Sie Yuen Lee, Ho Hung Lim, Carol Webster
- TLDR: We propose a new method for generating clausal paraphrases for nominalizations that can be used to paraphrase a compound nominalization.
- Global Entity Disambiguation with BERT
- Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto
- TLDR: We propose a global entity disambiguation model based on BERT.
- Clues Before Answers: Generation-Enhanced Multiple-Choice QA
- Zixian Huang, Ao Wu, Jiaying Zhou, Yu Gu, Yue Zhao, Gong Cheng
- TLDR: Generative MCQA model that leverages the knowledge of a pre-trained encoder-decoder model to enhance a reader for MCQAs.
- Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
- Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
- TLDR: We present ELUE, a standard evaluation and a public leaderboard for efficient NLP models.
- Stylized Knowledge-Grounded Dialogue Generation via Disentangled Template Rewriting
- Qingfeng Sun, Can Xu, Huang Hu, Yujing Wang, Jian Miao, Xiubo Geng, Yining Chen, Fei Xu, Daxin Jiang
- TLDR: We propose a novel disentangled template rewriting (DTR) method which generates responses by combining disentangled style templates (from a monolingual stylized corpus) and content templates (determined by the KDG corpus).
- LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking
- Yifan Wang, Jing Zhao, Junwei Bao, Chaoqun Duan, Youzheng Wu, Xiaodong He
- TLDR: We propose LUNA, a SLot-TUrN Alignment enhanced approach for dialogue state tracking which uses the most relevant utterances in the dialogue history to predict the current dialogue state.
- Crossroads, Buildings and Neighborhoods: A Dataset for Fine-grained Location Recognition
- Pei Chen, Haotian Xu, Cheng Zhang, Ruihong Huang
- TLDR: We present a new dataset, HarveyNER, with fine-grained locations annotated in tweets, and new heuristic curricula for learning to recognize them.
- Tricks for Training Sparse Translation Models
- Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan
- TLDR: We propose two simple and effective methods to mitigate the effects of multi-task learning with sparse architectures.
- Persona-Guided Planning for Controlling the Protagonist’s Persona in Story Generation
- Zhexin Zhang, Jiaxin Wen, Jian Guan, Minlie Huang
- TLDR: We propose a planning-based generation model named ConPer to explicitly model the relationship between personas and events in story generation.
- CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking
- Xuming Hu, Zhijiang Guo, GuanYu Wu, Aiwei Liu, Lijie Wen, Philip Yu
- TLDR: We present a new evidence-based fact-checking dataset for non-English-language misinformation and propose a novel approach to train a veracity prediction model in an end-to-end fashion.
- VGNMN: Video-grounded Neural Module Networks for Video-Grounded Dialogue Systems
- Hung Le, Nancy Chen, Steven Hoi
- TLDR: We present a novel model for the video-grounded dialogue task that uses neural module networks to model the information retrieval process in video-grounded language tasks as a pipeline of neural modules.
- Multimodal Dialogue State Tracking
- Hung Le, Nancy Chen, Steven Hoi
- TLDR: We propose a novel dialogue state tracking task for visual objects in video-grounded dialogues and propose a new baseline for multimodal dialogue state generation and prediction.
- On the Use of Bert for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation
- Yongjie Wang, Chuang Wang, Ruobing Li, Hui Lin
- TLDR: We present a novel multi-scale essay representation for BERT that can be jointly learned and achieve state-of-the-art results in Automated Essay Scoring.
- Recognition of They/Them as Singular Personal Pronouns in Coreference Resolution
- Connor Baumler, Rachel Rudinger
- TLDR: We present a new benchmark for coreference resolution systems which evaluates singular personal “they” recognition.
- TWEETSPIN: Fine-grained Propaganda Detection in Social Media Using Multi-View Representations
- Prashanth Vijayaraghavan, Soroush Vosoughi
- TLDR: We propose a novel neural approach to detect and categorize propaganda tweets across fine-grained propaganda techniques.
- UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis
- Fatemehsadat Mireshghallah, Vaishnavi Shrivastava, Milad Shokouhi, Taylor Berg-Kirkpatrick, Robert Sim, Dimitrios Dimitriadis
- TLDR: We propose UserIdentifier, a novel scheme for training a single shared model for all users.
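The scheme is simple enough to sketch directly (the identifier format below is an assumption; the paper experiments with several, including random token strings):

```python
def with_user_identifier(user_id: str, text: str) -> str:
    """Prepend a fixed, non-trainable user identifier string to each input,
    so one shared model can produce personalized predictions without
    per-user parameters."""
    return f"user {user_id} says: {text}"

print(with_user_identifier("u4921", "the battery life is surprisingly decent"))
```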
- Improving Neural Models for Radiology Report Retrieval with Lexicon-based Automated Annotation
- Luyao Shi, Tanveer Syeda-mahmood, Tyler Baldwin
- TLDR: We present a novel query match learning method for clinical information retrieval based on clinical finding detection.
- Transparent Human Evaluation for Image Captioning
- Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah Smith
- TLDR: We establish THumB, a rubric-based human evaluation protocol for image captioning models.
- Lifting the Curse of Multilinguality by Pre-training Modular Transformers
- Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe
- TLDR: We propose a language-specific module-based approach for multilingual pre-trained models that mitigates the negative interference between languages, and enables positive transfer, resulting in improved monolingual and cross-lingual performance.
- DocAMR: Multi-Sentence AMR Representation and Evaluation
- Tahira Naseem, Austin Blodgett, Sadhana Kumaravel, Tim O’Gorman, Young-Suk Lee, Jeffrey Flanigan, Ramón Astudillo, Radu Florian, Salim Roukos, Nathan Schneider
- TLDR: We present a simple algorithm for parsing English sentences into abstract meaning representation graphs and use it to re-evaluate the best document-level AMR parser and coreference resolution systems.
- Learning to Transfer Prompts for Text Generation
- Junyi Li, Tianyi Tang, Jian-Yun Nie, Ji-Rong Wen, Xin Zhao
- TLDR: We propose a novel prompt-based method for text generation in a transferable setting.
- ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models
- Junyi Li, Tianyi Tang, Zheng Gong, Lixin Yang, Zhuohao Yu, Zhipeng Chen, Jingyuan Wang, Xin Zhao, Ji-Rong Wen
- TLDR: We present a large-scale empirical study on general language ability evaluation of PLMs (ElitePLM).
- Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand
- Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander Fabbri, Yejin Choi, Noah Smith
- TLDR: We propose a generalization of bidimensional leaderboards for language generation models and metrics for their evaluation.
- Improving In-Context Few-Shot Learning via Self-Supervised Training
- Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva
- TLDR: We propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage with the goal to teach the model to perform in-context few-shot learning.
- Exposing the Limits of Video-Text Models through Contrast Sets
- Jae Sung Park, Sheng Shen, Ali Farhadi, Trevor Darrell, Yejin Choi, Anna Rohrbach
- TLDR: We propose a new evaluation framework that probes video-text models with hard negatives to determine whether they comprehend the semantics of the text.
- Zero-shot Sonnet Generation with Discourse-level Planning and Aesthetics Features
- Yufei Tian, Nanyun Peng
- TLDR: We present a novel framework to generate sonnets that does not require training on poems.
- Benchmarking Intersectional Biases in NLP
- John Lalor, Yi Yang, Kendall Smith, Nicole Forsgren, Ahmed Abbasi
- TLDR: We benchmark multiple NLP models for fairness and performance across multiple demographic dimensions and show that while current debiasing strategies fare well in terms of the fairness-accuracy trade-off, they are unable to effectively alleviate bias.
- When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer
- Ameet Deshpande, Partha Talukdar, Karthik Narasimhan
- TLDR: We show that the absence of sub-word overlap significantly affects zero-shot transfer between languages and their counterparts constructed by modifying aspects such as the script, word order, and syntax.
- How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns
- Stephanie Brandl, Ruixiang Cui, Anders Søgaard
- TLDR: Gender-neutral pronouns in Danish, English, and Swedish are associated with higher perplexity, more dispersed attention patterns, and worse downstream performance.
- Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts
- Daniel Khashabi, Xinxi Lyu, Sewon Min, Lianhui Qin, Kyle Richardson, Sean Welleck, Hannaneh Hajishirzi, Tushar Khot, Ashish Sabharwal, Sameer Singh, Yejin Choi
- TLDR: We show that continuous prompts solve a task while being projected to an arbitrary text, and show that this behavior can generalize across models and tasks.
- Contrastive Representation Learning for Cross-Document Coreference Resolution of Events and Entities
- Benjamin Hsu, Graham Horwood
- TLDR: We present an approach to entity and event coreference resolution utilizing contrastive representation learning.
- Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese
- Chenxuan Cui, Katherine Zhang, David Mortensen
- TLDR: We show that computational models learn to predict the order of coordinate compounds and elaborate expressions in Hmong, Lahu, and Chinese on the basis of phonology and lexical distribution.
- FRUIT: Faithfully Reflecting Updated Information in Text
- Robert Iv, Alexandre Passos, Sameer Singh, Ming-Wei Chang
- TLDR: We introduce the novel generation task of faithfully reflecting updated information in text (FRUIT) where the goal is to update an existing article given new evidence.
- Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog
- Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Ponzetto, Goran Glavaš
- TLDR: We present a new multilingual multi-domain task-oriented dialog dataset and a framework for multilingual conversational specialization of pretrained language models for cross-lingual transfer for downstream TOD tasks.
- ChapterBreak: A Challenge Dataset for Long-Range Language Models
- Simeng Sun, Katherine Thai, Mohit Iyyer
- TLDR: We present ChapterBreak, a novel dataset that provides an LRLM with a long segment from a narrative that ends at a chapter boundary and asks it to distinguish the beginning of the ground-truth next chapter from a set of negative segments sampled from the same narrative.
- ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
- Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, Matei Zaharia
- TLDR: We present a retriever that improves the quality of neural information retrieval while reducing its space footprint by 6–10x.
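The late-interaction scoring that ColBERTv2 retains from ColBERT is easy to sketch in NumPy (the v2-specific contributions, residual compression and denoised supervision, are not shown here):

```python
import numpy as np

def late_interaction_score(q_emb, d_emb):
    """ColBERT-style MaxSim: for each query token embedding, take its maximum
    similarity over all document token embeddings, then sum over query tokens.
    q_emb: (n_q, dim), d_emb: (n_d, dim), rows L2-normalized."""
    sim = q_emb @ d_emb.T  # (n_q, n_d) cosine similarity matrix
    return float(sim.max(axis=1).sum())

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
q /= np.linalg.norm(q, axis=1, keepdims=True)
d = rng.normal(size=(12, 8))
d /= np.linalg.norm(d, axis=1, keepdims=True)
print(late_interaction_score(q, d))
```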
- Quantifying Language Variation Acoustically with Few Resources
- Martijn Bartelds, Martijn Wieling
- TLDR: We show that deep acoustic models can learn to distinguish low-resource (Dutch) dialects from high-resource languages without requiring phonetic transcriptions.
- Adaptable Adapters
- Nafise Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych
- TLDR: We propose to use adaptable adapters for designing efficient and effective adapter architectures.
- Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants
- Max Bartolo, Tristan Thrush, Sebastian Riedel, Pontus Stenetorp, Robin Jia, Douwe Kiela
- TLDR: We present a novel approach to adversarial data collection that improves model fooling rates and improves downstream model performance.
- GMN: Generative Multi-modal Network for Practical Document Information Extraction
- Haoyu Cao, Jiefeng Ma, Antai Guo, Yiqing Hu, Hao Liu, Deqiang Jiang, Yinsong Liu, Bo Ren
- TLDR: Generative Multi-modal Network for Document Information Extraction.
- One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation
- Chenze Shao, Xuanfu Wu, Yang Feng
- TLDR: We propose diverse distillation with reference selection for non-autoregressive machine translation, which improves the state-of-the-art performance of non-autoregressive neural machine translation by over 1 BLEU.
- Can Rationalization Improve Robustness?
- Howard Chen, Jacqueline He, Karthik Narasimhan, Danqi Chen
- TLDR: We investigate the robustness of neural NLP models that generate rationales that can explain their model predictions.
- On the Effectiveness of Sentence Encoding for Intent Detection Meta-Learning
- Tingting Ma, Qianhui Wu, Zhiwei Yu, Tiejun Zhao, Chin-Yew Lin
- TLDR: We show that sentence embeddings without any fine-tuning on intent detection data could produce a non-trivially strong performance.
- A Computational Acquisition Model for Multimodal Word Categorization
- Uri Berger, Gabriel Stanovsky, Omri Abend, Lea Frermann
- TLDR: We present a cognitively-inspired, multimodal acquisition model for child language acquisition, trained from image-caption pairs on naturalistic data using cross-modal self-supervision.
- Residue-Based Natural Language Adversarial Attack Detection
- Vyas Raina, Mark Gales
- TLDR: We present a simple sentence-embedding adversarial detection method for neural networks, which outperforms current state of the art adversarial detectors.
- Does it Really Generalize Well on Unseen Data? Systematic Evaluation of Relational Triple Extraction Methods
- Juhyuk Lee, Min-Joong Lee, June Yong Yang, Eunho Yang
- TLDR: We show that existing extraction models are able to easily memorize and recall already seen triples, but not generalize effectively for unseen triples.
- From spoken dialogue to formal summary: An utterance rewriting for dialogue summarization
- Yue Fang, Hainan Zhang, Hongshen Chen, Zhuoye Ding, Bo Long, Yanyan Lan, Yanquan Zhou
- TLDR: We propose a new model for dialogue summarization task, ReWriteSum, which significantly outperforms baseline models, in terms of both metric-based and human evaluations.
- EASE: Entity-Aware Contrastive Learning of Sentence Embedding
- Sosuke Nishikawa, Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka, Isao Echizen
- TLDR: We present EASE, a novel method for learning sentence embeddings via contrastive learning between sentences and their related entities.
- Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics
- Zihan Zhang, Meng Fang, Ling Chen, Mohammad Reza Namazi Rad
- TLDR: We show that directly clustering high-quality sentence embeddings with an appropriate word selecting method can generate more coherent and diverse topics than NTMs, achieving also higher efficiency and simplicity.
- Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering
- Jianguo Mao, Wenbin Jiang, Xiangdong Wang, Zhifan Feng, Yajuan Lyu, Hong Liu, Yong Zhu
- TLDR: We propose a novel video question answering model which performs dynamic multistep reasoning between questions and videos.
- TRUE: Re-evaluating Factual Consistency Evaluation
- Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansy, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias
- TLDR: We present a meta-evaluation protocol for factual consistency metrics that is more actionable and interpretable than previous reported correlations, yielding clearer quality measures.
- Knowledge Inheritance for Pre-trained Language Models
- Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
- TLDR: We propose a pre-training framework for large-scale pre-trained language models that could be used to efficiently learn larger PLMs.
- Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation
- Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang
- TLDR: We introduce Bi-SimCut: a simple but effective training strategy to boost neural machine translation (NMT) performance.
- On Transferability of Prompt Tuning for Natural Language Processing
- Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
- TLDR: We empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work.
- DocEE: A Large-Scale and Fine-grained Benchmark for Document-level Event Extraction
- MeiHan Tong, Bin Xu, Shuai Wang, Meihuan Han, Yixin Cao, Jiangqi Zhu, Siyu Chen, Lei Hou, Juanzi Li
- TLDR: We present DocEE, a document-level event extraction dataset including 27,000+ events and 180,000+ arguments.
- Towards Debiasing Translation Artifacts
- Koel Dutta Chowdhury, Rricha Jalota, Cristina España-Bonet, Josef Genabith
- TLDR: We propose a novel approach to reducing translationese by extending an established bias-removal technique.
- WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
- Benjamin Minixhofer, Fabian Paischer, Navid Rekabsaz
- TLDR: We present a novel method for cross-lingual parameter transfer of large language models and show that it is possible to transfer large language models to new languages with up to 64x less training effort.
- A New Concept of Knowledge based Question Answering (KBQA) System for Multi-hop Reasoning
- Yu Wang, V. Srinivasan, Hongxia Jin
- TLDR: We present a new KBQA system which can leverage information from multiple reasoning paths and requires only labeled answers as supervision.
- Bilingual Tabular Inference: A Case Study on Indic Languages
- Chaitanya Agarwal, Vivek Gupta, Anoop Kunchukuttan, Manish Shrivastava
- TLDR: We present a novel task of bilingual Tabular Natural Language Inference, in which the tabular premise and a hypothesis over it are in two separate languages.
- Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning
- Hongyi Yuan, Zheng Yuan, Sheng Yu
- TLDR: We propose to inject synonyms knowledge in biomedical entity linking tasks without candidate selection and propose synonyms-aware fine-tuning to select concept names for training.
- Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting
- Linzhi Wu, Pengjun Xie, Jie Zhou, Meishan Zhang, Ma Chunping, Guangwei Xu, Min Zhang
- TLDR: Meta-reweighting for heterogeneous self-augmentation for NER.
- Unsupervised Stem-based Cross-lingual Part-of-Speech Tagging for Morphologically Rich Low-Resource Languages
- Ramy Eskander, Cass Lowry, Sujay Khandagale, Judith Klavans, Maria Polinsky, Smaranda Muresan
- TLDR: We propose an unsupervised stem-based cross-lingual approach for POS tagging for low-resource languages of rich morphology and show that it improves the POS models for all the target languages.
- Optimising Equal Opportunity Fairness in Model Training
- Aili Shen, Xudong Han, Trevor Cohn, Timothy Baldwin, Lea Frermann
- TLDR: We propose two novel training objectives which directly optimise for the widely-used criterion of fairness in neural network training.
- Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval
- Siyu Ren, Kenny Zhu
- TLDR: We present a novel two-stage framework to compress large pre-trained dual-encoder for lightweight text-image retrieval.
- Joint Learning-based Heterogeneous Graph Attention Network for Timeline Summarization
- Jingyi You, Dongyuan Li, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura
- TLDR: We present a new algorithm for the TLS task that captures the information interaction between sentences and dates.
- Early Rumor Detection Using Neural Hawkes Process with a New Benchmark Dataset
- Fengzhu Zeng, Wei Gao
- TLDR: We propose a novel rumor detection model based on the neural Hawkes process for EARD, which can guide a generic rumor detection model to make timely, accurate and stable predictions.
- Emp-RFT: Empathetic Response Generation via Recognizing Feature Transitions between Utterances
- Wongyu Kim, Youbin Ahn, Donghyun Kim, Kyong-Ho Lee
- TLDR: We propose a novel approach of recognizing feature transitions between utterances in multi-turn empathetic dialogues and propose a response generation strategy to help focus on emotion and keywords related to appropriate features when generating responses.
- KCD: Knowledge Walks and Textual Cues Enhanced Political Perspective Detection in News Media
- Wenqian Zhang, Shangbin Feng, Zilong Chen, Zhenyu Lei, Jundong Li, Minnan Luo
- TLDR: We propose a graph-based approach for political perspective detection that leverages knowledge walks and textual cues in news articles to enable multi-hop knowledge reasoning and incorporates textual cues as paragraph-level labels.
- Collective Relevance Labeling for Passage Retrieval
- Jihyuk Kim, Minsoo Kim, Seung-won Hwang
- TLDR: We propose knowledge distillation for informed labeling, without incurring high computation overheads at evaluation time.
- COGMEN: COntextualized GNN based Multimodal Emotion recognitioN
- Abhinav Joshi, Ashwani Bhat, Ayush Jain, Atin Singh, Ashutosh Modi
- TLDR: We propose a COntextualized Graph Neural Network based Multimodal Emotion RecognitioN system that leverages local information (i.e., inter/intra dependency between speakers) and global information (context).
- Revisit Overconfidence for OOD Detection: Reassigned Contrastive Learning with Adaptive Class-dependent Threshold
- Yanan Wu, Keqing He, Yuanmeng Yan, QiXiang Gao, Zhiyuan Zeng, Fujia Zheng, Lulu Zhao, Huixing Jiang, Wei Wu, Weiran Xu
- TLDR: We propose a novel reassigned contrastive learning method to discriminate overconfidence of neural models and an adaptive class-dependent local threshold mechanism to separate similar IND and OOD intents for over-confident IND.
- AISFG: Abundant Information Slot Filling Generator
- Yang Yan, Junda Ye, Zhongbao Zhang, Liwen Wang
- TLDR: We propose Abundant Information Slot Filling Generator (AISFG), a generative model with a novel query template that incorporates domain descriptions, slot descriptions, and examples with context.
- Improving negation detection with negation-focused pre-training
- Thinh Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor
- TLDR: We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking, to better incorporate negation information into language models.
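One of the two ingredients, negation masking, can be sketched as a biased masking policy (the cue list and rates below are illustrative assumptions, not the paper's exact configuration):

```python
import random

NEGATION_CUES = {"not", "no", "never", "none", "nothing", "n't"}  # illustrative list

def negation_focused_mask(tokens, mask_token="[MASK]", fallback_rate=0.15, seed=0):
    """Preferentially mask negation cues (plus a little random masking), so the
    model must learn to reconstruct negation from context."""
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        if tok.lower() in NEGATION_CUES or rng.random() < fallback_rate:
            out.append(mask_token)
        else:
            out.append(tok)
    return out

print(negation_focused_mask("I did not enjoy the film at all".split()))
```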
- Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers
- Vivek Kumar, Rishabh Maheshwary, Vikram Pudi
- TLDR: We propose several data augmentation techniques for Math Word Problem Solvers that significantly increase the generalization and robustness of existing solvers.
- DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings
- Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljacic, Shang-Wen Li, Scott Yih, Yoon Kim, James Glass
- TLDR: We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings that are sensitive to the difference between the original sentence and an edited sentence, where the edited sentence is obtained by stochastically masking out the original sentence and then sampling from a masked language model.
- Generative Cross-Domain Data Augmentation for Aspect and Opinion Co-Extraction
- Junjie Li, Jianfei Yu, Rui Xia
- TLDR: We propose a novel domain adaptation method for aspect and opinion co-extraction by exploiting the labeled data in the source domain.
- ProQA: Structural Prompt-based Pre-training for Unified Question Answering
- Wanjun Zhong, Yifan Gao, Ning Ding, Yujia Qin, Zhiyuan Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
- TLDR: We present ProQA, a unified QA paradigm that solves various tasks through a single model.
- A Data Cartography based MixUp for Pre-trained Language Models
- Seo Yeon Park, Cornelia Caragea
- TLDR: We propose a novel MixUp strategy that leverages Training Dynamics and allows more informative samples to be combined for generating new data samples.
- Grapheme-to-Phoneme Conversion for Thai using Neural Regression Models
- Tomohiro Yamasaki
- TLDR: We propose a novel Thai grapheme-to-phoneme conversion method based on neural regression models trained to predict the similarity between a candidate and the correct pronunciation.
- Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation
- Siyu Lai, Zhen Yang, Fandong Meng, Xue Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
- TLDR: We propose a new definition for NMT adversarial examples based on the Doubly Round-Trip Translation (DRTT) and introduce the masked language models to construct bilingual adversarial pairs based on DRTT.
- TVShowGuess: Character Comprehension in Stories as Speaker Guessing
- Yisi Sang, Xiangyang Mou, Mo Yu, Shunyu Yao, Jing Li, Jeffrey Stanton
- TLDR: We propose a new task for assessing machines’ skills of understanding fictional characters in narrative stories.
- Causal Distillation for Language Models
- Zhengxuan Wu, Atticus Geiger, Joshua Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah Goodman
- TLDR: We show that it is beneficial to augment distillation with a third objective that encourages the student to imitate the causal computational process of the teacher model.
- FNet: Mixing Tokens with Fourier Transforms
- James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon
- TLDR: We show that Transformer encoder architectures can be sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that “mix” input tokens.
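The FNet mixing sublayer itself is essentially one line: a 2D discrete Fourier transform over the sequence and hidden dimensions, keeping the real part (a minimal sketch of just this parameter-free sublayer; layer norm and feed-forward sublayers are omitted):

```python
import numpy as np

def fnet_token_mixing(x: np.ndarray) -> np.ndarray:
    """FNet mixing sublayer: DFT along the sequence axis and the hidden axis,
    keep the real part. No learned parameters, unlike self-attention."""
    return np.fft.fft2(x).real  # x: (seq_len, hidden_dim)

x = np.random.default_rng(0).normal(size=(6, 4))
print(fnet_token_mixing(x).shape)  # (6, 4)
```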
- Answer Consolidation: Formulation and Benchmarking
- Wenxuan Zhou, Qiang Ning, Heba Elfardy, Kevin Small, Muhao Chen
- TLDR: We present a method for answer consolidation in question answering systems and evaluate multiple models.
- Informativeness and Invariance: Two Perspectives on Spurious Correlations in Natural Language
- Jacob Eisenstein
- TLDR: Spurious correlations are a threat to the trustworthiness of natural language processing systems, motivating research into methods for identifying and eliminating them.
- FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
- Zi-Yi Dou, Nanyun Peng
- TLDR: We present FOAM, a FOllower-Aware speaker model that is constantly updated given the follower feedback, so that the generated instructions can be more suitable to the current learning state of the follower.
- Improving Compositional Generalization with Latent Structure and Data Augmentation
- Linlu Qiu, Peter Shaw, Panupong Pasupat, Pawel Nowak, Tal Linzen, Fei Sha, Kristina Toutanova
- TLDR: We present a new method for compositional augmentation of generative models using example recombination and a sequence-to-sequence model.
- Joint Extraction of Entities, Relations, and Events via Modeling Inter-Instance and Inter-Label Dependencies
- Minh Van Nguyen, Bonan Min, Franck Dernoncourt, Thien Nguyen
- TLDR: We propose to induce a dependency graph among task instances from data to boost representation learning and improve the performance of Joint Information Extraction.
- Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling
- Jakob Prange, Nathan Schneider, Lingpeng Kong
- TLDR: We examine the extent to which, in principle, different syntactic and semantic graph representations can complement and improve neural language modeling.
- Imagination-Augmented Natural Language Understanding
- Yujie Lu, Wanrong Zhu, Xin Wang, Miguel Eckstein, William Yang Wang
- TLDR: We present a novel neural model for natural language understanding that combines visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models.
- What company do words keep? Revisiting the distributional semantics of J.R. Firth & Zellig Harris
- Mikael Brunila, Jack LaViolette
- TLDR: We present two different and divergent theories of meaning of language that are both grounded in the notion of context and propose new ways of modeling language embeddings.
- Compositional Task-Oriented Parsing as Abstractive Question Answering
- Wenting Zhao, Konstantine Arkoudas, Weiqi Sun, Claire Cardie
- TLDR: We present a general reduction of task-oriented parsing to abstractive question answering that overcomes some limitations of canonical paraphrasing.
- Learning Cross-Lingual IR from an English Retriever
- Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-Suk Lee, Avirup Sil
- TLDR: We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation.
- Testing the Ability of Language Models to Interpret Figurative Language
- Emmy Liu, Chenxuan Cui, Kenneth Zheng, Graham Neubig
- TLDR: We present Fig-QA, a Winograd-style nonliteral language understanding task consisting of correctly interpreting paired figurative phrases with divergent meanings.
- Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity
- Sheshera Mysore, Arman Cohan, Tom Hope
- TLDR: We present a new scientific document similarity model based on matching fine-grained aspects of texts.
- CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning
- Siddharth Verma, Justin Fu, Sherry Yang, Sergey Levine
- TLDR: We show how offline reinforcement learning can be used to train dialogue agents entirely using static datasets collected from human speakers.
- Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer
- Yanpeng Zhao, Jack Hessel, Youngjae Yu, Ximing Lu, Rowan Zellers, Yejin Choi
- TLDR: We propose VIP-ANT, a novel audio-text alignment algorithm that induces audio-text alignment by using images as a pivot, without any parallel audio-text data.
- SURF: Semantic-level Unsupervised Reward Function for Machine Translation
- Atijit Anuchitanukul, Julia Ive
- TLDR: We present a novel reward function for RL which mimics human evaluation and outperforms the standard sparse reward function in both in- and out-of-domain settings.
- Disentangling Categorization in Multi-agent Emergent Communication
- Washington Garcia, Hamilton Clouse, Kevin Butler
- TLDR: We propose a new method for quantifying the categorization power of artificial agents and show that it can improve communication ability, despite encouraging compositionality in the artificial language.
- Show, Don’t Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue
- Raghav Gupta, Harrison Lee, Jeffrey Zhao, Yuan Cao, Abhinav Rastogi, Yonghui Wu
- TLDR: We propose Show, Don’t Tell, which prompts seq2seq models with a labeled example dialogue to show the semantics of schema elements rather than tell the model through descriptions.
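A toy prompt contrasting the two styles (the templates and slot names below are illustrative, not the paper's exact format):

```python
def description_prompt(new_turn):
    # "tell": natural-language schema descriptions precede the input
    return ("hotel-price: preferred cost of the hotel. "
            "hotel-area: part of town. "
            f"[input] {new_turn} =>")

def demonstration_prompt(example_turn, example_state, new_turn):
    # "show": one labeled example dialogue demonstrates the schema semantics
    return (f"[example] {example_turn} => {example_state}\n"
            f"[input] {new_turn} =>")

print(description_prompt("Find me an expensive place to stay downtown."))
print(demonstration_prompt(
    "I need a cheap hotel in the north.",
    "hotel-price=cheap; hotel-area=north",
    "Find me an expensive place to stay downtown."))
```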
- Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge
- Ian Porada, Alessandro Sordoni, Jackie Cheung
- TLDR: We show that commonsense knowledge is acquired from surface-level co-occurrence patterns rather than induced, systematic reasoning in transformer models.
- Using Paraphrases to Study Properties of Contextual Embeddings
- Laura Burdick, Jonathan Kummerfeld, Rada Mihalcea
- TLDR: We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT.
- Measure and Improve Robustness in NLP Models: A Survey
- Xuezhi Wang, Haohan Wang, Diyi Yang
- TLDR: We provide a unified survey of how to define, measure and improve robustness in NLP models.
- Learning to Generate Examples for Semantic Processing Tasks
- Danilo Croce, Simone Filice, Giuseppe Castellucci, Roberto Basili
- TLDR: We propose a neural approach to automatically learn to generate new examples using a pre-trained sequence-to-sequence model.
- Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
- Peter West, Chandra Bhagavatula, Jack Hessel, Jena Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, Yejin Choi
- TLDR: We propose a new method for training commonsense models that uses large general language models to distill high-quality commonsense from a small commonsense graph.
- GenIE: Generative Information Extraction
- Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, Robert West
- TLDR: We present GenIE, a novel autoregressive formulation of closed information extraction, which generalizes from fewer training data points than baselines, scales to a previously unmanageable number of entities and relations.
- Entity Linking via Explicit Mention-Mention Coreference Modeling
- Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum
- TLDR: We present and empirically analyze a novel training approach for learning mention and entity representations that is based on building minimum spanning arborescences over mentions and entities across documents to explicitly model mention coreference relationships.
- Massive-scale Decoding for Text Generation using Lattices
- Jiacheng Xu, Siddhartha Jonnalagadda, Greg Durrett
- TLDR: We present a search algorithm to construct lattices encoding a massive number of generation options.
- Disentangling Indirect Answers to Yes-No Questions in Real Conversations
- Krishna Sanagavarapu, Jathin Singaraju, Anusha Kakileti, Anirudh Kaza, Aaron Mathews, Helen Li, Nathan Brito, Eduardo Blanco
- TLDR: We show that the answer to yes-no questions in real conversations is not the same as the answer given in synthetic corpora.
- Quantifying Adaptability in Pre-trained Language Models with 500 Tasks
- Belinda Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, Jacob Andreas
- TLDR: We present a large-scale empirical study of the features and limits of LM adaptability to new tasks, and show that generalization to new examples can be systematically described and understood.
- Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection
- Indira Sen, Mattia Samory, Claudia Wagner, Isabelle Augenstein
- TLDR: We show that construct-driven and construct-agnostic counterfactually augmented data improve out-of-domain generalizability, an indicator of model robustness.
- A Study of the Attention Abnormality in Trojaned BERTs
- Weimin Lyu, Songzhu Zheng, Tengfei Ma, Chao Chen
- TLDR: We propose an attention-based Trojan detector based on the transformer’s attention.
- EPiDA: An Easy Plug-in Data Augmentation Framework for High Performance Text Classification
- Minyi Zhao, Lu Zhang, Yi Xu, Jiandong Ding, Jihong Guan, Shuigeng Zhou
- TLDR: We present an easy and plug-in data augmentation framework EPiDA to support effective text classification.
- Partial-input baselines show that NLI models can ignore context, but they don’t.
- Neha Srikanth, Rachel Rudinger
- TLDR: We show that state-of-the-art NLI models are capable of learning to condition on context, despite being trained on artifact-ridden datasets.
- Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora
- Xisen Jin, Dejiao Zhang, Henghui Zhu, Wei Xiao, Shang-Wen Li, Xiaokai Wei, Andrew Arnold, Xiang Ren
- TLDR: We study a lifelong language model pretraining challenge where a PTLM is continually updated so as to adapt to emerging data.
- Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition
- Pengshan Cai, Hui Wan, Fei Liu, Mo Yu, Hong Yu, Sachindra Joshi
- TLDR: We propose novel AI-empowered chatbots for learning as conversation, where a user does not read a passage but gains information and knowledge through conversation with a teacher bot.
- Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
- Songlin Yang, Wei Liu, Kewei Tu
- TLDR: We propose a new method for inference of factor graph grammars that uses tensor rank decomposition to reduce inference computational complexity for HMMs and PCFGs.
- What Factors Should Paper-Reviewer Assignments Rely On? Community Perspectives on Issues and Ideals in Conference Peer-Review
- Terne Thorn Jakobsen, Anna Rogers
- TLDR: We present the first survey of the NLP community, identifying common issues and perspectives on what factors should be considered by paper-reviewer matching systems.
- Reducing Disambiguation Biases in NMT by Leveraging Explicit Word Sense Information
- Niccolò Campolungo, Tommaso Pasini, Denis Emelin, Roberto Navigli
- TLDR: We present a novel approach for automatically creating high-precision sense-annotated parallel corpora for neural machine translation, and show that this approach can significantly improve the accuracy of a baseline NMT model.
- Mining Clues from Incomplete Utterance: A Query-enhanced Network for Incomplete Utterance Rewriting
- Shuzheng Si, Shuang Zeng, Baobao Chang
- TLDR: We propose a QUEry-Enhanced Network (QUEEN) for incomplete utterance rewriting.
- Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization
- Lulu Zhao, Fujia Zheng, Weihao Zeng, Keqing He, Weiran Xu, Huixing Jiang, Wei Wu, Yanan Wu
- TLDR: We propose a novel and efficient domain-oriented prefix-based fine-tuning method for dialogue summarization that improves generalization ability on new domains and improves domain adaptation benchmarks.
- Interactive Symbol Grounding with Complex Referential Expressions
- Rimvydas Rubavicius, Alex Lascarides
- TLDR: We present a procedure for learning to ground symbols from a sequence of stimuli consisting of an arbitrarily complex noun phrase (e.g. “all but one green square above both red circles.”) and its designation in the visual scene.
- Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks
- Ruixiang Cui, Daniel Hershcovich, Anders Søgaard
- TLDR: We present adversarial generalized quantifier NLI tasks and show that pre-trained language models have a clear lack of robustness in generalized quantifiers reasoning.
- Exact Paired-Permutation Testing for Structured Test Statistics
- Ran Zmigrod, Tim Vieira, Ryan Cotterell
- TLDR: We provide an efficient exact algorithm for the paired-permutation test for a family of structured test statistics.
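For intuition, the brute-force version of the exact paired-permutation test enumerates all 2^n sign flips of the paired differences (feasible only for tiny n; the paper's contribution is avoiding this blow-up for structured test statistics):

```python
from itertools import product

def exact_paired_permutation_pvalue(diffs):
    """Exact paired-permutation test for the mean paired difference: count
    the fraction of all 2^n sign assignments whose statistic is at least as
    extreme as the observed one."""
    n = len(diffs)
    observed = abs(sum(diffs))
    hits = 0
    for signs in product((1, -1), repeat=n):
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            hits += 1
    return hits / 2 ** n

print(exact_paired_permutation_pvalue([0.8, 1.1, -0.2, 0.5, 0.9, 0.4]))
```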
- A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank
- Dan Malkin, Tomasz Limisiewicz, Gabriel Stanovsky
- TLDR: We show that the choice of pretraining languages affects downstream cross-lingual transfer for BERT-based models.
- SSEGCN: Syntactic and Semantic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis
- Zheng Zhang, Zili Zhou, Yanna Wang
- TLDR: We propose a novel aspect-aware attention mechanism combined with self-attention to obtain attention score matrices of a sentence, which can not only learn the aspect-related semantic correlations, but also learn the global semantics of the sentence.
- Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy
- Allison Lahnala, Charles Welch, Béla Neuendorf, Lucie Flek
- TLDR: We improve over recent work on controllable text generation by using empathetic data, and show that the degree of improvements is subject to specific communication components of empathy.
- DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks
- Lin Tian, Xiuzhen Zhang, Jey Han Lau
- TLDR: We propose DUCK, a model for rumour detection on social media that captures both user and comment propagation networks.
- Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES
- Felix Stahlberg, Shankar Kumar
- TLDR: We propose a novel multi-label classification layer for machine translation that can model ambiguity more effectively.
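The core change can be sketched as replacing the softmax output layer with independent sigmoids, so several plausible translations of an ambiguous source can all receive high probability (a simplified view of the output layer only; training uses the paper's contrastive objective):

```python
import numpy as np

def scones_probs(logits):
    """Multi-label output layer: independent sigmoids over the vocabulary,
    so near-synonymous target tokens need not compete for probability mass
    the way they do under softmax."""
    return 1.0 / (1.0 + np.exp(-np.asarray(logits)))

logits = np.array([2.2, 2.1, -1.5, -3.0])  # two competing translations, two fillers
print(scones_probs(logits))                   # both leading tokens get ~0.9
print(np.exp(logits) / np.exp(logits).sum())  # softmax forces them to split mass
```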
- SkillSpan: Hard and Soft Skill Extraction from English Job Postings
- Mike Zhang, Kristian Jensen, Sif Sonniks, Barbara Plank
- TLDR: We present a novel dataset for skill extraction and a novel framework for annotating spans and language models.
- RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction
- Yuan Liang, Zhuoxuan Jiang, Di Yin, Bo Ren
- TLDR: We propose a novel and tailored transformer for document-level event extraction which can model the relation dependencies of event arguments.
- A Double-Graph Based Framework for Frame Semantic Parsing
- Ce Zheng, Xudong Chen, Runxin Xu, Baobao Chang
- TLDR: We propose a Knowledge-guided Incremental semantic parser with Double-graph for frame semantic parsing.
- An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling
- Peiyi Wang, Runxin Xu, Tianyu Liu, Qingyu Zhou, Yunbo Cao, Baobao Chang, Zhifang Sui
- TLDR: We propose ESD, an Enhanced Span-based Decomposition method for FSSL, a novel approach for the tagging model.
- A Two-Stream AMR-enhanced Model for Document-level Event Argument Extraction
- Runxin Xu, Peiyi Wang, Tianyu Liu, Shuang Zeng, Baobao Chang, Zhifang Sui
- TLDR: We propose a novel method for document-level event argument extraction that captures the long-distance dependencies between triggers and arguments across sentences.
- Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning
- Fei Wang, Zhewei Xu, Pedro Szekely, Muhao Chen
- TLDR: We propose a structured table-to-text generation model that captures the relations of content pieces in a table and is robust to structural transformations.
- JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering
- Yueqing Sun, Qi Shi, Le Qi, Yu Zhang
- TLDR: Joint reasoning with language models and graph neural networks (GNNs) for commonsense question answering.
- Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens
- Itay Itzhak, Omer Levy
- TLDR: We show that language models learn the internal composition of whole-word and subword tokens to a surprising extent, without ever seeing the characters coupled with the tokens.
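One way to substantiate the claim in the entry above is a character probe: train a small classifier to predict, from a frozen token embedding alone, which characters the token contains. A minimal sketch under assumed shapes; the paper's actual probing setup is more involved.

```python
import torch
import torch.nn as nn

# Multi-label character probe: one sigmoid per letter of a 26-letter alphabet.
EMB_DIM = 768                     # assumed embedding width
probe = nn.Linear(EMB_DIM, 26)
criterion = nn.BCEWithLogitsLoss()

frozen_embs = torch.randn(32, EMB_DIM)               # embeddings of 32 tokens
char_labels = torch.randint(0, 2, (32, 26)).float()  # which letters occur
loss = criterion(probe(frozen_embs), char_labels)
loss.backward()                   # only the probe's weights receive gradients
```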
- A Corpus for Understanding and Generating Moral Stories
- Jian Guan, Ziqi Liu, Minlie Huang
- TLDR: We propose two understanding tasks and two generation tasks to assess the ability of machines to understand and write moral stories.
- Modeling Multi-Granularity Hierarchical Features for Relation Extraction
- Xinnian Liang, Shuangzhi Wu, Mu Li, Zhoujun Li
- TLDR: We propose a novel method to extract multi-granularity features based solely on the original input sentences.
- Cross-modal Contrastive Learning for Speech Translation
- Rong Ye, Mingxuan Wang, Lei Li
- TLDR: We propose ConST, a cross-modal contrastive learning method for end-to-end speech-to-text translation.
- Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances
- Seungju Han, Beomsu Kim, Jin Yong Yoo, Seokjun Seo, Sangbum Kim, Enkhbayar Erdenee, Buru Chang
- TLDR: We present a new practical task where only a few utterances of each fictional character are available to generate responses mimicking them.
- DynamicTOC: Persona-based Table of Contents for Consumption of Long Documents
- Himanshu Maheshwari, Nethraa Sivakumar, Shelly Jain, Tanvi Karandikar, Vinay Aggarwal, Navita Goyal, Sumit Shekhar
- TLDR: We present a dynamic table-of-contents navigator for non-linear, persona-based document consumption.
- KALA: Knowledge-Augmented Language Model Adaptation
- Minki Kang, Jinheon Baek, Sung Ju Hwang
- TLDR: We propose a novel domain adaptation framework for PLMs that modulates the intermediate hidden representations of PLMs with domain knowledge, consisting of entities and their relational facts.
- On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
- Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung
- TLDR: We investigate the effects of the source and size of the pretraining corpus on in-context learning in HyperCLOVA, a Korean-centric GPT-3 model.
- Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences
- Yifan Chen, Qi Zeng, Dilek Hakkani-Tur, Di Jin, Heng Ji, Yun Yang
- TLDR: We propose Skeinformer, which uses matrix sketching to accelerate self-attention and improve the accuracy of matrix approximations to self-attention.
- Partner Personas Generation for Dialogue Response Generation
- Hongyuan Lu, Wai Lam, Hong Cheng, Helen Meng
- TLDR: We propose a novel framework that leverages automatic partner personas generation to enhance the succeeding dialogue response generation.
- Semantically Informed Slang Interpretation
- Zhewei Sun, Richard Zemel, Yang Xu
- TLDR: We propose a semantically informed slang interpretation framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang.
- Dual-Channel Evidence Fusion for Fact Verification over Texts and Tables
- Nan Hu, Zirui Wu, Yuxuan Lai, Xiao Liu, Yansong Feng
- TLDR: We propose a Dual Channel Unified Format fact verification model that unifies various evidence into parallel streams, i.e., natural language sentences and a global evidence table, simultaneously.
- TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding
- Le Zhang, Zichao Yang, Diyi Yang
- TLDR: We propose a compositional data augmentation approach for natural language understanding called TreeMix.
- Syn2Vec: Synset Colexification Graphs for Lexical Semantic Similarity
- John Harvill, Roxana Girju, Mark Hasegawa-Johnson
- TLDR: We build large-scale synset colexification graphs across BabelNet’s typologically diverse set of 499 world languages and learn synset embedding representations from them.
- On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
- Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- TLDR: We investigate the causes of hallucination in knowledge-grounded conversational models and show that the standard benchmarks consist of > 60% hallucinated responses, leading to models that not only hallucinate but even amplify hallucinations.
- Is “My Favorite New Movie” My Favorite Movie? Probing the Understanding of Recursive Noun Phrases
- Qing Lyu, Zheng Hua, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch
- TLDR: We present a dataset of three textual inference tasks for recursive noun phrases and show that such knowledge is learnable with appropriate data.
- Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance
- Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan, Bernhard Schölkopf
- TLDR: We show that the train-test direction match and data-model direction match are important factors in the impact of translationese on the machine translation performance.
- Visual Commonsense in Pretrained Unimodal and Multimodal Models
- Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, Elias Stengel-Eskin
- TLDR: We investigate to what degree unimodal (language-only) and multimodal models capture a broad range of visually salient attributes.
- QuALITY: Question Answering with Long Input Texts, Yes!
- Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel Bowman
- TLDR: We present a new dataset for long-document comprehension that allows for building and testing models on long documents.
- ExSum: From Local Explanations to Model Understanding
- Yilun Zhou, Marco Tulio Ribeiro, Julie Shah
- TLDR: We propose a mathematical framework for quantifying model understanding and metrics for assessing its quality.
- Maximum Bayes Smatch Ensemble Distillation for AMR Parsing
- Young-Suk Lee, Ramón Astudillo, Hoang Thanh Lam, Tahira Naseem, Radu Florian, Salim Roukos
- TLDR: We propose a new state-of-the-art method for AMR parsing by combining Smatch-based ensembling techniques with ensemble distillation.
- When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes
- Mycal Tucker, Tiwalayo Eisape, Peng Qian, Roger Levy, Julie Shah
- TLDR: We show that language models do encode syntactic information redundantly and introduce a new probe design that guides probes to consider all syntactic input information present in embeddings.
- Modeling Task Interactions in Document-Level Joint Entity and Relation Extraction
- Liyan Xu, Jinho Choi
- TLDR: We propose Graph Compatibility for document-level relation extraction in an end-to-end setting, where the model needs to jointly perform mention extraction, coreference resolution (COREF) and relation extraction (RE) at once.
- Few-Shot Semantic Parsing with Language Models Trained on Code
- Richard Shin, Benjamin Van Durme
- TLDR: We show that pre-trained models can perform semantic parsing with little training data, when prompted with in-context examples.
- CORWA: A Citation-Oriented Related Work Annotation Dataset
- Xiangci Li, Biswadip Mandal, Jessica Ouyang
- TLDR: We present CORWA, a citation-oriented related work annotation dataset, and a framework for generating related work sections from text fragments drawn from different sources.
- Overcoming Catastrophic Forgetting During Domain Adaptation of Seq2seq Language Generation
- Dingcheng Li, Zheng Chen, Eunah Cho, Jie Hao, Xiaohu Liu, Fan Xing, Chenlei Guo, Yang Liu
- TLDR: We propose a novel approach to learn language generation models that can be trained offline with multiple domains in a sequential fashion without requiring additional data storage.
- Extreme Zero-Shot Learning for Extreme Text Classification
- Yuanhao Xiong, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Inderjit Dhillon
- TLDR: We propose a novel approach to learn the semantic embeddings of instances and labels with raw text.
- ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence
- Yibo Hu, MohammadSaleh Hosseini, Erick Skorupa Parolin, Javier Osorio, Latifur Khan, Patrick Brandt, Vito D’Orazio
- TLDR: We present ConfliBERT, a domain-specific pre-trained language model for conflict and political violence.
- Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification
- Han Wang, Canwen Xu, Julian McAuley
- TLDR: We propose Automatic Multi-Label Prompting (AMuLaP), a simple yet effective method to automatically select label mappings for few-shot text classification with prompting.
- Few-shot Subgoal Planning with Language Models
- Lajanugen Logeswaran, Yao Fu, Moontae Lee, Honglak Lee
- TLDR: Language models can infer detailed subgoal sequences from few training sequences without any subgoal supervision.
- IDPG: An Instance-Dependent Prompt Generation Method
- Zhuofeng Wu, Sinong Wang, Jiatao Gu, Rui Hou, Yuxiao Dong, V.G.Vinod Vydiswaran, Hao Ma
- TLDR: We propose a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.
- Embedding Hallucination for Few-shot Language Fine-tuning
- Yiren Jian, Chongyang Gao, Soroush Vosoughi
- TLDR: We propose an efficient and effective method for fine-tuning language models that can overcome over-fitting.
- Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models
- Ramit Sawhney, Shivam Agarwal, Vivek Mittal, Paolo Rosso, Vikram Nanda, Sudheer Chava
- TLDR: We present and publicly release CryptoBubbles, a novel multi-span identification task for cryptocoins and a dataset covering more than 400 cryptocoins from 9 exchanges over five years, spanning over two million tweets.
- Nearest Neighbor Knowledge Distillation for Neural Machine Translation
- Zhixian Yang, Renliang Sun, Xiaojun Wan
- TLDR: We propose to distill the knowledge of k-nearest-neighbor machine translation (kNN-MT) into a base NMT model, retaining its quality gains while avoiding expensive datastore retrieval at inference.
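Background for the entry above: in standard kNN-MT (which the paper distills into the base model to avoid this cost), every decoding step queries a datastore of (decoder state, target token) pairs and mixes the resulting nearest-neighbor distribution with the model's own. A minimal numpy sketch; the datastore, temperature, and mixing weight are illustrative assumptions.

```python
import numpy as np

def knn_mt_distribution(h, keys, values, p_model, vocab_size,
                        k=4, tau=10.0, lam=0.5):
    """Mix the NMT model's next-token distribution with a kNN distribution.

    h:       (d,) current decoder hidden state (the query)
    keys:    (N, d) stored decoder states
    values:  (N,) target-token id paired with each key
    p_model: (vocab_size,) the base model's softmax distribution
    """
    dists = np.sum((keys - h) ** 2, axis=1)       # squared L2 to all keys
    nn_idx = np.argsort(dists)[:k]                # k nearest neighbors
    weights = np.exp(-dists[nn_idx] / tau)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for w, tok in zip(weights, values[nn_idx]):   # aggregate weight per token
        p_knn[tok] += w
    return lam * p_knn + (1.0 - lam) * p_model
```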
- DEMix Layers: Disentangling Domains for Modular Language Modeling
- Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah Smith, Luke Zettlemoyer
- TLDR: We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text.
- Contrastive Learning for Prompt-based Few-shot Language Learners
- Yiren Jian, Chongyang Gao, Soroush Vosoughi
- TLDR: We propose a contrastive learning framework for prompt-based few-shot learners that improves the generalization of models trained with only limited examples.
- Cross-Lingual Event Detection via Optimized Adversarial Training
- Luis Guzman-Nateras, Minh Van Nguyen, Thien Nguyen
- TLDR: We present a cross-lingual event detection model trained with optimized adversarial training.
- Identifying Implicitly Abusive Remarks about Identity Groups using a Linguistically Informed Approach
- Michael Wiegand, Elisabeth Eder, Josef Ruppenhofer
- TLDR: We present a new dataset consisting of atomic negative sentences about identity groups and a new linguistically informed approach to detecting implicitly abusive language.
- Label Definitions Improve Semantic Role Labeling
- Li Zhang, Ishan Jindal, Yunyao Li
- TLDR: We propose a semantic role labeling model that learns from label definitions of predicate senses and achieves state-of-the-art performance on the CoNLL09 dataset.
- Shedding New Light on the Language of the Dark Web
- Youngjin Jin, Eugene Jang, Yongjae Lee, Seungwon Shin, Jin-Woo Chung
- TLDR: We present a publicly available Dark Web dataset for text-based Dark Web analysis and examine the textual differences between the Dark Web and the Surface Web.
- Conceptualizing Treatment Leakage in Text-based Causal Inference
- Adel Daoud, Connor Jerzak, Richard Johansson
- TLDR: We identify the treatment-leakage problem in text-based causal inference and propose a novel method to address it.
- Consistency Training with Virtual Adversarial Discrete Perturbation
- Jungsoo Park, Gyuwan Kim, Jaewoo Kang
- TLDR: We propose a novel consistency-training method based on a virtual adversarial discrete perturbation: discrete noise chosen to maximize the divergence between the model’s predictions.
- CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning
- Xiangru Tang, Arjun Nair, Borui Wang, Bingyao Wang, Jai Desai, Aaron Wade, Haoran Li, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev
- TLDR: We propose a novel contrastive fine-tuning strategy for dialog summarization that significantly improves the factual consistency and overall quality of summaries.
- CoMPM: Context Modeling with Speaker’s Pre-trained Memory Tracking for Emotion Recognition in Conversation
- Joosung Lee, Wooin Lee
- TLDR: We propose a new method for emotion recognition in conversation that combines the speaker’s pre-trained memory with a context model, and we show that it improves the performance of context models.
- Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries
- Xiangru Tang, Alexander Fabbri, Haoran Li, Ziming Mao, Griffin Adams, Borui Wang, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev
- TLDR: We provide a crowdsourcing method for evaluating factual consistency in summarization and show that the reliability of Likert ratings is highly dependent on the target dataset and the evaluation design.
- DialSummEval: Revisiting Summarization Evaluation for Dialogues
- Mingqi Gao, Xiaojun Wan
- TLDR: We present a unified human evaluation of dialogue summarization models and show that both current models and their automatic evaluation metrics have notable shortcomings.
- Hyperbolic Relevance Matching for Neural Keyphrase Extraction
- Mingyang Song, Yi Feng, Liping Jing
- TLDR: We propose a new hyperbolic matching model for keyphrase extraction in hyperbolic space and show that it outperforms recent state-of-the-art baselines.
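Hyperbolic matching models like the one in the entry above typically score relevance with (a function of) the standard Poincaré-ball distance; the paper's exact scoring function may differ, but the underlying distance is:

$$ d(\mathbf{u},\mathbf{v}) = \operatorname{arcosh}\!\left( 1 + \frac{2\,\lVert\mathbf{u}-\mathbf{v}\rVert^{2}}{\left(1-\lVert\mathbf{u}\rVert^{2}\right)\left(1-\lVert\mathbf{v}\rVert^{2}\right)} \right) $$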
- Template-free Prompt Tuning for Few-shot NER
- Ruotian Ma, Xin Zhou, Tao Gui, Yiding Tan, Linyang Li, Qi Zhang, Xuanjing Huang
- TLDR: We propose a new method to reformulate token-level few-shot labeling tasks as LM problems without any templates.
- Few-Shot Document-Level Relation Extraction
- Nicholas Popovic, Michael Färber
- TLDR: We present FREDo, a few-shot document-level relation extraction benchmark.
- LaMemo: Language Modeling with Look-Ahead Memory
- Haozhe Ji, Rongsheng Zhang, Zhenyu Yang, Zhipeng Hu, Minlie Huang
- TLDR: We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens and interpolating with the old memory states to maintain long-term information in the history.
- Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs
- Ghazi Felhi, Joseph Roux, Djamé Seddah
- TLDR: We propose a generative model for text generation, which exhibits disentangled latent representations of syntax and semantics.
- Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints
- Chun Zeng, Jiangjie Chen, Tianyi Zhuang, Rui Xu, Hao Yang, Qin Ying, Shimin Tao, Yanghua Xiao
- TLDR: We propose a plug-in algorithm for non-autoregressive translation for low-frequency constraints, which improves constraint preservation and translation quality.
- What do Toothbrushes do in the Kitchen? How Transformers Think our World is Structured
- Alexander Henlein, Alexander Mehler
- TLDR: We investigate the extent to which transformer-based language models can extract knowledge about object relations.
- Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation
- Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-Rong Wen
- TLDR: We propose to refine the user dialogue history on a large scale to obtain more abundant and accurate persona information.
- A Holistic Framework for Analyzing the COVID-19 Vaccine Debate
- Maria Pacheco, Tunazzina Islam, Monal Mahajan, Andrey Shor, Ming Yin, Lyle Ungar, Dan Goldwasser
- TLDR: We propose a holistic analysis framework connecting stance and reason analysis, and fine-grained entity level moral sentiment analysis.
- Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training
- Yuanxin Liu, Fandong Meng, Zheng Lin, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou
- TLDR: We propose to directly optimize the subnetwork structure towards the pre-training objectives, which can better preserve pre-training performance.
- You Don’t Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers’ Private Personas
- Haoran Li, Yangqiu Song, Lixin Fan
- TLDR: We propose effective defense objectives that prevent the hidden states of chatbots trained by language modeling from leaking speakers’ private personas.
- Explaining Dialogue Evaluation Metrics using Adversarial Behavioral Analysis
- Baber Khalid, Sungjin Lee
- TLDR: We show, through adversarial behavioral analysis, that dialogue metrics for both open-domain and task-oriented settings are biased in their assessments of different conversation behaviors and fail to properly penalize problematic conversations.
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
- Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah Smith
- TLDR: We investigate the effect of annotator identities and beliefs on toxicity annotations and show strong associations between them and their ratings of toxicity.
- Non-Autoregressive Chinese ASR Error Correction with Phonological Training
- Zheng Fang, Ruiqing Zhang, Zhongjun He, Hua Wu, Yanan Cao
- TLDR: We propose a novel non-autoregressive method with phonological training for Chinese ASR error correction.
- Hate Speech and Counter Speech Detection: Conversational Context Does Matter
- Xinchen Yu, Eduardo Blanco, Lingzi Hong
- TLDR: We show that the conversational context of Reddit comments matters for both annotating and detecting hate and counter speech.
- DACSA: A large-scale Dataset for Automatic summarization of Catalan and Spanish newspaper Articles
- Encarnación Segarra Soriano, Vicent Ahuir, Lluís-F. Hurtado, José González
- TLDR: We describe the construction of a large-scale corpus of Catalan and Spanish newspaper articles for automatic summarization and provide a set of metrics for summarization tasks.
- Time Waits for No One! Analysis and Challenges of Temporal Misalignment
- Kelvin Luu, Daniel Khashabi, Suchin Gururangan, Karishma Mandyam, Noah Smith
- TLDR: We show that temporal misalignment in NLP models can degrade end-to-end performance and propose a new approach to address this problem.
- MCSE: Multimodal Contrastive Learning of Sentence Embeddings
- Miaoran Zhang, Marius Mosbach, David Adelani, Michael Hedderich, Dietrich Klakow
- TLDR: We propose a sentence embedding learning approach that exploits both visual and textual information via a multimodal contrastive objective.
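Several entries in this list (ConST above, MCSE above, and HiURE below) specialize some form of the InfoNCE contrastive objective. A generic sketch for the text-image case in the MCSE entry above; the temperature and batch construction are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def info_nce(text_emb, image_emb, temperature=0.05):
    """Generic InfoNCE: row i of each tensor is a positive pair; all other
    in-batch images serve as negatives for text i."""
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.t() / temperature  # (batch, batch) similarities
    labels = torch.arange(text_emb.size(0))          # positives on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(4, 16), torch.randn(4, 16))
```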
- HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction
- Shuliang Liu, Xuming Hu, Chenwei Zhang, Shu’ang Li, Lijie Wen, Philip Yu
- TLDR: We propose a novel contrastive learning framework for unsupervised relation extraction that can efficiently extract hierarchical signals from relational feature space using cross hierarchy attention and effectively optimize relation representation of sentences under exemplar-wise contrastive training.
- Diagnosing Vision-and-Language Navigation: What Really Matters
- Wanrong Zhu, Yuankai Qi, Pradyumna Narayana, Kazoo Sone, Sugato Basu, Xin Wang, Qi Wu, Miguel Eckstein, William Yang Wang
- TLDR: We show that indoor navigation agents refer to both object and direction tokens when making decisions.
- Aligning to Social Norms and Values in Interactive Narratives
- Prithviraj Ammanabrolu, Liwei Jiang, Maarten Sap, Hannaneh Hajishirzi, Yejin Choi
- TLDR: We present a novel agent that uses the social commonsense knowledge present in specially trained language models to contextually restrict its action space to only those actions that are aligned with socially beneficial values.
- MOVER: Mask, Over-generate and Rank for Hyperbole Generation
- Yunxiang Zhang, Xiaojun Wan
- TLDR: We propose an unsupervised method for hyperbole generation that does not require parallel literal-hyperbole pairs.
- Embarrassingly Simple Performance Prediction for Abductive Natural Language Inference
- Emīls Kadiķis, Vaibhav Srivastav, Roman Klinger
- TLDR: We propose a simple method for predicting the performance of pre-trained models on an abductive natural language inference task by comparing the cosine similarity of sentence embeddings with the performance achieved when training a classifier on top of them.
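The entry above boils down to one computation: embed the two sentences and take the cosine similarity. A minimal sketch; `embed` stands for any pre-trained sentence encoder and is a hypothetical placeholder.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two sentence-embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical usage, where `embed` is any pre-trained sentence encoder:
# score = cosine_similarity(embed(premise), embed(hypothesis))
```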
- Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics
- Daniel Deutsch, Rotem Dror, Dan Roth
- TLDR: We propose to measure system-level correlations for summarization evaluation metrics using the full test set instead of the subset of summaries judged by humans.
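The quantity at issue in the last entry, a system-level correlation, is computed from one aggregate metric score and one aggregate human score per system. A minimal sketch with made-up numbers:

```python
from scipy.stats import pearsonr

# One average score per system; the paper argues the metric averages should
# come from the full test set, not only the human-judged subset.
metric_scores = [0.412, 0.398, 0.455, 0.430]   # e.g. mean ROUGE per system
human_scores  = [3.1,   2.8,   3.6,   3.3]     # e.g. mean Likert per system

r, p = pearsonr(metric_scores, human_scores)
print(f"system-level Pearson r = {r:.3f} (p = {p:.3f})")
```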