EMNLP 2022
- Generative Knowledge Graph Construction: A Review
- Hongbin Ye, Ningyu Zhang, Hui Chen, Huajun Chen
- TLDR: We present a detailed, complete taxonomy for the generative KGC methods and provide theoretical insight and empirical analysis.
- CDConv: A Benchmark for Contradiction Detection in Chinese Conversations
- Chujie Zheng, Jinfeng Zhou, Yinhe Zheng, Libiao Peng, Zhen Guo, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Minlie Huang
- TLDR: We propose a new benchmark for dialogue contradiction detection in Chinese conversations, containing 12K multi-turn conversations annotated with three typical contradiction categories: Intra-sentence Contradiction, Role Confusion, and History Contradiction.
- Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
- Mor Geva, Avi Caciularu, Kevin Wang, Yoav Goldberg
- TLDR: We propose a new approach to understanding the prediction process of transformer models, and show that it is often human-interpretable.
- Learning to Generate Question by Asking Question: A Primal-Dual Approach with Uncommon Word Generation
- Qifan Wang, Li Yang, Xiaojun Quan, Fuli Feng, Dongfang Liu, Zenglin Xu, Sinong Wang, Hao Ma
- TLDR: We propose a novel approach that incorporates question generation and its dual problem, question answering, into a unified primal-dual framework.
- Graph-based Model Generation for Few-Shot Relation Extraction
- Wanli Li, Tieyun Qian
- TLDR: We propose a model generation framework for few-shot relation extraction that combines a general large model for all tasks and many tiny task-specific models for each individual task.
- Backdoor Attacks in Federated Learning by Rare Embeddings and Gradient Ensembling
- Ki Yoon Yoo, Nojun Kwak
- TLDR: We investigate the feasibility of model poisoning for backdoor attacks through rare word embeddings of NLP models.
- Generating Natural Language Proofs with Verifier-Guided Search
- Kaiyu Yang, Jia Deng, Danqi Chen
- TLDR: We present a novel stepwise method for generating relevant proof steps conditioned on the hypothesis.
- Toward Unifying Text Segmentation and Long Document Summarization
- Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Fei Liu, Dong Yu
- TLDR: We propose a novel approach to extractive summarization of long text by learning robust sentence representations and optimizing the regularizer to promote selection of diverse summary sentences.
- The Geometry of Multilingual Language Model Representations
- Tyler Chang, Zhuowen Tu, Benjamin Bergen
- TLDR: We show that multilingual language models maintain a shared multilingual representation space while still encoding language-sensitive information in each language.
- Improving Complex Knowledge Base Question Answering via Question-to-Action and Question-to-Question Alignment
- Yechun Tang, Xiaoxia Cheng, Weiming Lu
- TLDR: We propose a new complex question answering framework which improves the performance of existing methods by using question rewriting and question-to-action alignment.
- PAIR: Prompt-Aware margIn Ranking for Counselor Reflection Scoring in Motivational Interviewing
- Do June Min, Verónica Pérez-Rosas, Kenneth Resnicow, Rada Mihalcea
- TLDR: We propose PAIR, a prompt-aware margin ranking approach for scoring counselor reflections in motivational interviewing.
- Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via Heterogeneous Semantics-Label Graphs
- Bowen Xing, Ivor Tsang
- TLDR: Heterogeneous semantics-label graphs achieve mutual guidance between multiple intent detection and slot filling.
- The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains
- Haoran Xu, Philipp Koehn, Kenton Murray
- TLDR: We propose a general task-agnostic method to balance the contribution of all parameters and show that it improves generalization performance significantly.
- Interpreting Language Models with Contrastive Explanations
- Kayo Yin, Graham Neubig
- TLDR: We show that contrastive explanations for language generation decisions are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena, and that they significantly improve contrastive model simulatability for human observers.
- RankGen: Improving Text Generation with Large Ranking Models
- Kalpesh Krishna, Yapei Chang, John Wieting, Mohit Iyyer
- TLDR: We present RankGen, a 1.2B parameter encoder model for English that scores model generations given a prefix.
- Learning a Grammar Inducer from Massive Uncurated Instructional Videos
- Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu, Jiebo Luo
- TLDR: We investigate the scenario in which text and video are only in loose correspondence, and show that video-aided grammar induction can still be effective.
- Normalized Contrastive Learning for Text-Video Retrieval
- Yookoon Park, Mahmoud Azab, Seungwhan Moon, Bo Xiong, Florian Metze, Gourab Kundu, Kirmani Ahmed
- TLDR: We propose Normalized Contrastive Learning (NCL), which improves cross-modal contrastive learning by normalizing the sum of retrieval probabilities of each instance.
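A minimal sketch of the normalization idea, assuming a text-video similarity matrix and a Sinkhorn-style alternating normalization; the function name, temperature, and iteration count are illustrative, not the paper's exact NCL objective:

```python
import numpy as np

def normalized_retrieval_probs(sim, n_iters=5, temp=0.05):
    """Alternately normalize text->video and video->text retrieval
    probabilities so each instance's summed retrieval mass is ~uniform.
    Toy illustration only; not the paper's exact algorithm."""
    p = np.exp(sim / temp)
    for _ in range(n_iters):
        p = p / p.sum(axis=1, keepdims=True)  # text -> video direction
        p = p / p.sum(axis=0, keepdims=True)  # video -> text direction
    return p

sim = np.random.randn(4, 4)  # similarities for 4 text-video pairs
print(normalized_retrieval_probs(sim).sum(axis=0))  # ~uniform mass per video
```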
- Estimating Soft Labels for Out-of-Domain Intent Detection
- Hao Lang, Yinhe Zheng, Jian Sun, Fei Huang, Luo Si, Yongbin Li
- TLDR: We propose an adaptive soft pseudo labeling method that can estimate soft labels for pseudo OOD samples when training OOD detectors.
- Multi-VQG: Generating Engaging Questions for Multiple Images
- Min-Hsuan Yeh, Vincent Chen, Ting-Hao Huang, Lun-Wei Ku
- TLDR: We propose a novel visual question generation dataset for visual-and-language models that allows them to construct stories behind a series of images to generate engaging questions.
- Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation
- Jannis Bulian, Christian Buck, Wojciech Gajewski, Benjamin Börschinger, Tal Schuster
- TLDR: We present the first systematic conceptual and data-driven analysis to examine the shortcomings of token-level equivalence measures.
- Non-Parametric Domain Adaptation for End-to-End Speech Translation
- Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen
- TLDR: We propose a novel non-parametric method that leverages in-domain text translation corpus to achieve domain adaptation for E2E-ST systems.
- Prompting for Multimodal Hateful Meme Classification
- Rui Cao, Roy Ka-Wei Lee, Wen-Haw Chong, Jing Jiang
- TLDR: We propose PromptHate, a simple yet effective prompt-based model that prompts pre-trained language models (PLMs) for hateful meme classification.
- Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking
- Minghan Li, Xinyu Zhang, Ji Xin, Hongyang Zhang, Jimmy Lin
- TLDR: We propose a certified error control of candidate set pruning for relevance ranking.
- Linearizing Transformer with Key-Value Memory
- Yizhe Zhang, Deng Cai
- TLDR: We propose Memsizer, a transformer variant with linear computational complexity that improves efficiency while maintaining the accuracy of vanilla transformers.
- Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
- Gaurav Verma, Vishwa Vinay, Ryan Rossi, Srijan Kumar
- TLDR: We investigate the robustness of multimodal classifiers to cross-modal dilutions, and show that their performance drops by 23.3% and 22.5%, respectively, in the presence of dilutions generated by our model.
- Translation between Molecules and Natural Language
- Carl Edwards, Tuan Lai, Kevin Ros, Garrett Honke, Kyunghyun Cho, Heng Ji
- TLDR: We present MolT5 - a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings.
- What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment
- Matthew Finlayson, Kyle Richardson, Ashish Sabharwal, Peter Clark
- TLDR: We use the task of deciding whether a given string matches a regular expression (viewed as an instruction) to identify properties of tasks, instructions, and instances that make instruction learning challenging.
- Sentence-Incremental Neural Coreference Resolution
- Matt Grenander, Shay B. Cohen, Mark Steedman
- TLDR: We propose a sentence-incremental neural coreference resolution system that incrementally builds clusters after marking mention boundaries in a shift-reduce manner.
- SNaC: Coherence Error Detection for Narrative Summarization
- Tanya Goyal, Junyi Jessy Li, Greg Durrett
- TLDR: We develop a framework for narrative coherence evaluation and a method for eliciting coherence judgments from crowdworkers in long document summarization.
- HydraSum: Disentangling Style Features in Text Summarization with Multi-Decoder Models
- Tanya Goyal, Nazneen Rajani, Wenhao Liu, Wojciech Kryscinski
- TLDR: We present a new summarization architecture that learns contrasting summary styles from multiple decoders and use it to improve the quality of the final output.
- A Good Neighbor, A Found Treasure: Mining Treasured Neighbors for Knowledge Graph Entity Typing
- Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
- TLDR: We propose a novel method for knowledge graph entity typing that uses the multi-hop neighbor information of the central entity to infer the missing types for entities in knowledge graphs.
- Guiding Neural Entity Alignment with Compatibility
- Bing Liu, Harrisen Scells, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang
- TLDR: We propose a training framework for neural entity alignment models that guides them toward making compatible predictions.
- InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
- Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, Jeffrey Bigham
- TLDR: We present a novel instruction tuning framework for dialogue tasks that enables good zero-shot performance on unseen datasets and tasks such as dialogue evaluation and intent detection, and even better performance in a few-shot setting.
- Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling
- Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
- TLDR: We propose a new architecture to encode the information for lexicon exploration and feature induction of Chinese sequence labeling tasks using unsupervised statistical boundary information.
- RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
- Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- TLDR: We propose RetroMAE, a new retrieval oriented pre-training paradigm based on Masked Auto-Encoder (MAE).
- Aligning Recommendation and Conversation via Dual Imitation
- Jinfeng Zhou, Bo Wang, Minlie Huang, Dongming Zhao, Kun Huang, Ruifang He, Yuexian Hou
- TLDR: We propose DICR, a novel approach that models recommendation actions as recommendation paths in a knowledge graph.
- QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance
- Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu
- TLDR: We propose QRelScore, a context-aware relevance evaluation metric for question generation.
- Abstract Visual Reasoning with Tangram Shapes
- Anya Ji, Noriyuki Kojima, Noah Rush, Alane Suhr, Wai Keen Vong, Robert Hawkins, Yoav Artzi
- TLDR: We introduce KiloGram, a richly annotated dataset for studying abstract visual reasoning in humans and machines.
- UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
- Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
- TLDR: UnifiedSKG unifies structured knowledge grounding tasks into a text-to-text format for multi-task learning with language models.
- Balanced Adversarial Training: Balancing Tradeoffs between Fickleness and Obstinacy in NLP Models
- Hannah Chen, Yangfeng Ji, David Evans
- TLDR: We show that adversarial training methods focused on reducing vulnerability to fickle adversarial examples may make a model more vulnerable to obstinate adversarial examples.
- When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
- Ankur Sikarwar, Arkil Patel, Navin Goyal
- TLDR: We show that transformers can generalize to new compositional attributes and propose a new compositional generalization task.
- Generative Language Models for Paragraph-Level Question Generation
- Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
- TLDR: We propose a robust benchmark for paragraph-level question generation with language models, along with a manual evaluation protocol.
- A Unified Encoder-Decoder Framework with Entity Memory
- Zhihan Zhang, Wenhao Yu, Chenguang Zhu, Meng Jiang
- TLDR: We propose an entity memory encoder-decoder framework for informative text generation and a unified framework for entity-intensive question answering and generation tasks.
- Segmenting Numerical Substitution Ciphers
- Nada Aldarrab, Jonathan May
- TLDR: We propose automatic methods to segment historical substitution ciphers using Byte Pair Encoding and unigram language models.
- Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
- Ashish V. Thapliyal, Jordi Pont Tuset, Xi Chen, Radu Soricut
- TLDR: We present a dataset for massively multilingual image captioning that is highly accurate and consistent across 36 languages.
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select
- Yuchen Zhuang, Yinghao Li, Junyang Zhang, Yue Yu, Yingjun Mou, Xiang Chen, Le Song, Chao Zhang
- TLDR: We propose a novel method for extracting N-ary relation tuples from scientific articles by first retrieving the most relevant paragraph/table and then selecting the target entity from the retrieved component.
- GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs
- Dong Yang, Peijun Qing, Yang Li, Haonan Lu, Xiaodong Lin
- TLDR: We propose a novel probabilistic embedding model for knowledge graphs for multi-hop logical reasoning.
- Reasoning Like Program Executors
- Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen
- TLDR: We present POET, a novel reasoning pre-training paradigm that empowers language models to harvest the reasoning knowledge possessed by program executors via a data-driven approach.
- SEM-F1: an Automatic Way for Semantic Evaluation of Multi-Narrative Overlap Summaries at Scale
- Naman Bansal, Mousumi Akter, Shubhra Kanti Karmaker Santu
- TLDR: We propose a new sentence-level precision-recall style automated evaluation metric for the novel semantic overlap summarization task.
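A rough sketch of such a sentence-level precision-recall metric, assuming an arbitrary sentence encoder (here a hashing bag-of-words stand-in); the matching scheme is illustrative and may differ from SEM-F1's exact formulation:

```python
import numpy as np

def toy_embed(sentence, dim=256):
    # Hashing bag-of-words stand-in for a real sentence encoder.
    v = np.zeros(dim)
    for tok in sentence.lower().split():
        v[hash(tok) % dim] += 1.0
    return v

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def sem_f1(pred_sents, ref_sents, embed=toy_embed):
    # Credit each predicted sentence with its best-matching reference
    # sentence (precision) and each reference with its best match (recall).
    P = [embed(s) for s in pred_sents]
    R = [embed(s) for s in ref_sents]
    precision = float(np.mean([max(cosine(p, r) for r in R) for p in P]))
    recall = float(np.mean([max(cosine(p, r) for p in P) for r in R]))
    return 2 * precision * recall / (precision + recall + 1e-9)

print(sem_f1(["the meeting moved to friday"],
             ["the meeting is now on friday", "lunch was unchanged"]))
```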
- Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
- Yifan Chen, Devamanyu Hazarika, Mahdi Namazifar, Yang Liu, Di Jin, Dilek Hakkani-Tur
- TLDR: We propose to understand and further develop prefix-tuning through the kernel lens.
- DocInfer: Document-level Natural Language Inference using Optimal Evidence Selection
- Puneet Mathur, Gautam Kunapuli, Riyaz Bhat, Manish Shrivastava, Dinesh Manocha, Maneesh Singh
- TLDR: We present DocInfer - a novel, end-to-end Document-level Natural Language Inference model that builds a hierarchical document graph enriched through inter-sentence relations (topical, entity-based, concept-based), performs paragraph pruning using the novel SubGraph Pooling layer, followed by optimal evidence selection based on REINFORCE algorithm to identify the most important context sentences for a given hypothesis.
- LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework via Three-view Label Propagation
- Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan
- TLDR: We propose a neural-free entity alignment algorithm for multi-source KGs that achieves comparable results to state-of-the-art methods across all datasets and even surpasses them on many.
- Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning
- Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, Sm Yiu, Nan Duan
- TLDR: We propose a new method for efficient and effective commonsense generation by distilling knowledge from the metric and re-ranking the candidates with a ranker.
- Efficient Document Retrieval by End-to-End Refining and Quantizing BERT Embedding with Contrastive Product Quantization
- Zexuan Qiu, Qinliang Su, Jianxing Yu, Shijing Si
- TLDR: We propose to leverage BERT embeddings to perform efficient document retrieval based on the product quantization technique, which assigns every document a real-valued codeword from the codebook, instead of a binary code as in semantic hashing.
- Curriculum Knowledge Distillation for Emoji-supervised Cross-lingual Sentiment Analysis
- Jianyang Zhang, Tao Liang, Mingyang Wan, Guowu Yang, Fengmao Lv
- TLDR: We propose a novel cross-lingual sentiment analysis approach dubbed Curriculum Knowledge Distiller (CKD) to transfer sentiment knowledge across languages with the help of emojis.
- Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking
- Hongyan Xie, Haoxiang Su, Shuangyong Song, Hao Huang, Bo Zou, Kun Deng, Jianghua Lin, Zhihui Zhang, Xiaodong He
- TLDR: We propose Correctable Dialogue State Tracking (Correctable-DST) which achieves 67.51%, 68.24%, 70.30%, 71.38%, and 81.27% joint goal accuracy on MultiWOZ 2.0-2.4 datasets, respectively, and achieves a new state-of-the-art performance with significant improvements.
- DropMix: A Textual Data Augmentation Combining Dropout with Mixup
- Fanshuang Kong, Richong Zhang, Xiaohui Guo, Samuel Mensah, Yongyi Mao
- TLDR: We propose a novel textual data augmentation and regularization framework that mitigates overfitting in text learning.
- Cross-document Event Coreference Search: Task, Dataset and Modeling
- Alon Eirew, Avi Caciularu, Ido Dagan
- TLDR: We propose a novel setup for the task of Cross-document Coreference Resolution, focusing in this paper on event coreference.
- VIRT: Improving Representation-based Text Matching via Virtual Interaction
- Dan Li, Yang Yang, Hongyin Tang, Jiahao Liu, Qifan Wang, Jingang Wang, Tong Xu, Wei Wu, Enhong Chen
- TLDR: We propose a novel method for improving representation-based text matching by effectively transferring knowledge from the interaction-based model.
- MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction
- Xiaozhi Wang, Yulin Chen, Ning Ding, Hao Peng, Zimu Wang, Yankai Lin, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou
- TLDR: We present a unified large-scale human-annotated ERE dataset MAVEN-ERE with improved annotation schemes.
- Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models
- Aniruddha Mahapatra, Sharmila Reddy Nangi, Aparna Garimella, Anandhavelu N
- TLDR: We present effective ways to select data from unlabeled corpora of target domains for language model pretraining to improve the performances in downstream entity extraction tasks.
- How Large Language Models are Transforming Machine-Paraphrase Plagiarism
- Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp
- TLDR: We present a new study on the role of large autoregressive models in generating machine-paraphrased plagiarism and their detection.
- M2D2: A Massively Multi-Domain Language Modeling Dataset
- Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
- TLDR: We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs).
- “Will You Find These Shortcuts?” A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
- Jasmijn Bastings, Sebastian Ebert, Polina Zablotskaia, Anders Sandholm, Katja Filippova
- TLDR: We present a protocol for faithfulness evaluation of salience methods and show that some of the most popular methods are not faithful and some are surprisingly good.
- Information-Transport-based Policy for Simultaneous Translation
- Shaolei Zhang, Yang Feng
- TLDR: We propose an Information-Transport-based Simultaneous Translation (ITST) policy to determine whether to translate a target token or wait for the next source token.
- Learning to Adapt to Low-Resource Paraphrase Generation
- Zhigen Li, Yanmeng Wang, Rizhao Fan, Ye Wang, Jianfeng Li, Shaojun Wang
- TLDR: We adapt large pre-trained language models to low-resource paraphrase generation via meta-learning.
- A Distributional Lens for Multi-Aspect Controllable Text Generation
- Yuxuan Gu, Xiaocheng Feng, Sicheng Ma, Lingyuan Zhang, Heng Gong, Bing Qin
- TLDR: We propose a new method for multi-aspect controllable text generation by fusing multiple attribute distributions as their combination for generation.
- ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation
- Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen
- TLDR: We propose ELMER, an efficient and effective PLM for non-autoregressive (NAR) text generation that explicitly models token dependencies during generation.
- Multilingual Relation Classification via Efficient and Effective Prompting
- Yuxuan Chen, David Harbecke, Leonhard Hennig
- TLDR: We present a prompt-based multilingual relation classification method that can be used as a strong baseline for similar multilingual classification tasks.
- Topic-Regularized Authorship Representation Learning
- Jitkapat Sawatphol, Nonthakit Chaiwong, Can Udomcharoenchaikit, Sarana Nutanong
- TLDR: We propose a novel method for authorship attribution that can handle unseen authors and unseen topics in open-set and cross-topic settings.
- Fine-grained Contrastive Learning for Relation Extraction
- William Hogan, Jiacheng Li, Jingbo Shang
- TLDR: We propose fine-grained contrastive learning for relation extraction, which leverages fine-grained information to improve the quality of learned relation representations for RE.
- Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization
- Changqun Li, Linlin Wang, Xin Lin, Gerard de Melo, Liang He
- TLDR: We propose a novel curriculum-based prompt learning method with self-training to address the problem of summarizing dialogue.
- Zero-Shot Text Classification with Self-Training
- Ariel Gera, Alon Halfon, Eyal Shnarch, Yotam Perlitz, Liat Ein-Dor, Noam Slonim
- TLDR: We propose a plug-and-play method to bridge the gap between domain expertise and off-the-shelf text classification models by using only the class names and an unlabeled dataset.
- Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts
- T.y.s.s Santosh, Shanshan Xu, Oana Ichim, Matthias Grabmair
- TLDR: We use domain expertise to strategically identify statistically predictive but legally irrelevant information and use adversarial training to prevent the system from relying on it.
- SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
- Alex Wang, Richard Yuanzhe Pang, Angelica Chen, Jason Phang, Samuel R. Bowman
- TLDR: We use a novel method to collect question-focused summaries from scratch and use them to evaluate summarization systems.
- MetaASSIST: Robust Dialogue State Tracking with Meta Learning
- Fanghua Ye, Xi Wang, Jie Huang, Shenghui Li, Samuel Stern, Emine Yilmaz
- TLDR: Meta-learning for dialogue state tracking with adaptive weighting parameter tuning.
- Multilingual Machine Translation with Hyper-Adapters
- Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale
- TLDR: We propose a rescaling fix for hyper-adapters that improves convergence and enables training larger hyper-networks.
- Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination
- Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen
- TLDR: We develop a novel approach, Z-LaVI, to endow language models with visual imagination capabilities.
- Using Commonsense Knowledge to Answer Why-Questions
- Yash Kumar Lal, Niket Tandon, Tanvi Aggarwal, Horace Liu, Nathanael Chambers, Raymond Mooney, Niranjan Balasubramanian
- TLDR: We analyze how model size affects the use of external commonsense knowledge in large language models, and show that the largest models, as expected, yield substantial improvements over base models.
- Affective Idiosyncratic Responses to Music
- Sky CH-Wang, Evan Li, Oliver Li, Smaranda Muresan, Zhou Yu
- TLDR: We develop computational methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform.
- Successive Prompting for Decomposing Complex Questions
- Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- TLDR: We propose a new method for complex question answering in a few-shot setting that iteratively breaks a complex task into simpler ones, solves each, and repeats the process until reaching the final solution.
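A minimal sketch of the iterative decompose-then-solve loop; `llm` is a hypothetical text-completion callable, and the prompt templates and stop token are illustrative, not the paper's:

```python
def successive_prompting(question, llm, max_steps=8):
    """Alternately elicit the next simple sub-question and its answer,
    accumulating solved Q/A pairs until the model signals DONE."""
    context = f"Complex question: {question}\n"
    for _ in range(max_steps):
        sub_q = llm(context + "Next simple question (or DONE):").strip()
        if sub_q == "DONE":
            break
        sub_a = llm(context + f"Simple question: {sub_q}\nAnswer:").strip()
        context += f"Q: {sub_q}\nA: {sub_a}\n"
    return llm(context + "Final answer to the complex question:").strip()
```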
- Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
- Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi
- TLDR: We develop Maieutic Prompting, a new method for generating explanations that self-guide the inference of language models.
- DANLI: Deliberative Agent for Following Natural Language Instructions
- Yichi Zhang, Jianing Yang, Jiayi Pan, Shane Storks, Nikhil Devraj, Ziqiao Ma, Keunwoo Yu, Yuwei Bao, Joyce Chai
- TLDR: We propose a neuro-symbolic deliberative agent that learns to follow language instructions by learning symbolic representations acquired from past experience.
- Tracing Semantic Variation in Slang
- Zhewei Sun, Yang Xu
- TLDR: We propose a new theory of the role of communicative need and semantic distinction in the evolution of slang meaning and show that both play a role in the variation of slang usages over the course of history.
- Fine-grained Category Discovery under Coarse-grained supervision with Hierarchical Weighted Self-contrastive Learning
- Wenbin An, Feng Tian, Ping Chen, Siliang Tang, Qinghua Zheng, QianYing Wang
- TLDR: We propose a hierarchical weighted self-contrastive module for novel category discovery under coarse-grained supervision, and show both the effectiveness and efficiency of our model compared to existing methods.
- PLM-based World Models for Text-based Games
- Minsoo Kim, Yeonjoon Jung, Dohyeon Lee, Seung-won Hwang
- TLDR: We propose a novel world model for text-based game environments based on pre-trained language models and transformers.
- Prompt-Based Meta-Learning For Few-shot Text Classification
- Haoxing Zhang, Xiaofeng Zhang, Haibo Huang, Lei Yu
- TLDR: We propose a new meta-learning framework that alleviates the need for large amounts of data during meta-training.
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
- Hritik Bansal, Da Yin, Masoud Monajatipoor, Kai-Wei Chang
- TLDR: We study the effect on the diversity of the generated images when adding neutral text descriptions to text-to-image generative models.
- Geographic Citation Gaps in NLP Research
- Mukund Rungta, Janvijay Singh, Saif M. Mohammad, Diyi Yang
- TLDR: We show that there are substantial geographical disparities in paper acceptance and citation for NLP publications.
- Language Models of Code are Few-Shot Commonsense Learners
- Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- TLDR: We show that when we frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs fine-tuned on natural language corpora.
- Numerical Optimizations for Weighted Low-rank Estimation on Language Models
- Ting Hua, Yen-Chang Hsu, Felicity Wang, Qian Lou, Yilin Shen, Hongxia Jin
- TLDR: We propose numerical optimizations for weighted low-rank estimation of Transformer-based language models.
- Generative Multi-hop Retrieval
- Hyunji Lee, Sohee Yang, Hanseok Oh, Minjoon Seo
- TLDR: We propose a novel multi-hop retrieval algorithm that uses a fully generative model to perform multi-hop retrieval.
- Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
- Yu Zhao, Jianguo Wei, ZhiChao Lin, Yueheng Sun, Meishan Zhang, Min Zhang
- TLDR: We present Visual Spatial Description, a new perspective for image-to-text generation oriented toward spatial semantics.
- M3: A Multi-View Fusion and Multi-Decoding Network for Multi-Document Reading Comprehension
- Liang Wen, Houfeng Wang, Yingwei Luo, Xiaolin Wang
- TLDR: We propose a novel multi-view fusion and multi-decoding mechanism to tackle the multi-document reading comprehension task.
- COCO-DR: Combating the Distribution Shift in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
- Yue Yu, Chenyan Xiong, Si Sun, Chao Zhang, Arnold Overwijk
- TLDR: We present a new zero-shot dense retrieval method, COCO-DR, to improve the generalization ability of dense retrieval by combating the distribution shifts between source training tasks and target scenarios.
- Language Model Pre-Training with Sparse Latent Typing
- Liliang Ren, Zixuan Zhang, Han Wang, Clare Voss, ChengXiang Zhai, Heng Ji
- TLDR: We propose a new pre-training objective for language models that enables the model to sparsely extract sentence-level keywords with diverse latent types.
- On the Transformation of Latent Space in Fine-Tuned NLP Models
- Nadir Durrani, Hassan Sajjad, Fahim Dalvi, Firoj Alam
- TLDR: We study the evolution of latent space in fine-tuned NLP models.
- Watch the Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery
- Yutao Mou, Keqing He, Pei Wang, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
- TLDR: We propose a unified K-nearest neighbor contrastive learning framework to discover out-of-domain intents in dialogue systems.
- Extracted BERT Model Leaks More Information than You Think!
- Xuanli He, Lingjuan Lyu, Chen Chen, Qiongkai Xu
- TLDR: We present a new attack on model extraction that can cause severe privacy leakage even when victim models are facilitated with state-of-the-art defensive strategies.
- Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
- Mitja Nikolaus, Emmanuelle Salin, Stephane Ayache, Abdellah Fourtassi, Benoit Favre
- TLDR: We present a new multimodal task for evaluating understanding of predicate-noun dependencies in a controlled setup and show that the best models are able to track syntactic dependencies between words.
- A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference
- Kerem Zaman, Yonatan Belinkov
- TLDR: We present a multilingual approach for evaluating attribution methods for the Natural Language Inference task in terms of faithfulness and plausibility.
- Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging
- Ayyoob ImaniGooghari, Silvia Severini, Masoud Jalili Sabet, François Yvon, Hinrich Schütze
- TLDR: We propose a novel method for graph-based label propagation for unsupervised POS tagging of low-resource languages.
- SubeventWriter: Iterative Sub-event Sequence Generation with Coherence Controller
- Zhaowei Wang, Hongming Zhang, Tianqing Fang, Yangqiu Song, Ginny Wong, Simon See
- TLDR: We propose a new task of sub-event sequence generation for an unseen process, to evaluate the understanding of the coherence of sub-event actions and objects.
- Infinite SCAN: An Infinite Model of Diachronic Semantic Change
- Seiichi Inoue, Mamoru Komachi, Toshinobu Ogiso, Hiroya Takamura, Daichi Mochihashi
- TLDR: We propose a Bayesian model that can jointly estimate the number of senses of words and their changes through time.
- Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization
- Yuxian Gu, Pei Ke, Xiaoyan Zhu, Minlie Huang
- TLDR: We propose a novel method for instruction tuning that uses pseudo-labeled data to improve the performance of language models on unseen tasks.
- Counterfactual Data Augmentation via Perspective Transition for Open-Domain Dialogues
- Jiao Ou, Jinchao Zhang, Yang Feng, Jie Zhou
- TLDR: We propose a data augmentation method to augment high-quality responses with different semantics by counterfactual inference.
- SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning
- Yushi Bai, Xin Lv, Juanzi Li, Lei Hou, Yincen Qu, Zelin Dai, Feiyu Xiong
- TLDR: We present SQUIRE, the first Sequence-to-sequence based multi-hop reasoning framework, which utilizes an encoder-decoder Transformer structure to translate the query to a path.
- SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
- Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei
- TLDR: We propose a unified-modal speech-unit-text pre-training model for speech recognition and translation tasks.
- Learning Label Modular Prompts for Text Classification in the Wild
- Hailin Chen, Amrita Saha, Shafiq Joty, Steven C.H. Hoi
- TLDR: Modular Prompt tuning framework for text classification in the wild.
- Unbiased and Efficient Sampling of Dependency Trees
- Miloš Stanojević
- TLDR: We show that the current algorithm for sampling with replacement from the dependency tree distribution is biased and provide two algorithms that are unbiased.
- Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions
- Shuhao Gu, Bojie Hu, Yang Feng
- TLDR: We propose a two-stage training method for continual learning of large-scale pretrained neural machine translation model without accessing the previous training data or introducing model separation.
- COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
- Bowen Shen, Zheng Lin, Yuanxin Liu, Zhengxiao Liu, Lei Wang, Weiping Wang
- TLDR: We propose a collaborative optimization for PLMs that integrates static model compression and dynamic inference acceleration.
- Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction
- Zhen Wan, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Jiwei Li
- TLDR: We introduce a simple enhancement of RE using pre-trained language models.
- StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
- Hong Chen, Duc Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama
- TLDR: We present a novel Story Evaluation method that mimics human preference when judging a story, which consists of three sub-tasks: Ranking, Rating and Reasoning.
- Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference
- Eric Mitchell, Joseph Noh, Siyan Li, Will Armstrong, Ananth Agarwal, Patrick Liu, Chelsea Finn, Christopher Manning
- TLDR: We propose a framework for boosting the consistency and accuracy of pre-trained NLP models using pre-trained natural language inference (NLI) models, without fine-tuning or re-training.
- Robustness of Demonstration-based Learning Under Limited Data Scenario
- Hongxin Zhang, Yanzhe Zhang, Ruiyi Zhang, Diyi Yang
- TLDR: We design pathological demonstrations by gradually removing intuitively useful information from standard ones to take a deep dive into the robustness of demonstration-based sequence labeling, and show that (1) demonstrations composed of random tokens still make the model a better few-shot learner; (2) the length of random demonstrations and the relevance of random tokens are the main factors affecting performance; and (3) demonstrations increase the confidence of model predictions on captured superficial patterns.
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases
- Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein
- TLDR: We present the first paraphrase dataset of scientific findings annotated for degree of information change and show that models trained on SPICED can reveal large-scale trends in the degrees to which people and organizations faithfully communicate new scientific findings.
- Word Order Matters When You Increase Masking
- Karim Lasri, Alessandro Lenci, Thierry Poibeau
- TLDR: We show that the amount of masking in Transformer-based language models can affect the importance of position information for the pre-training objective.
- An Empirical Analysis of Memorization in Fine-tuned Autoregressive Language Models
- Fatemehsadat Mireshghallah, Archit Uniyal, Tianhao Wang, David Evans, Taylor Berg-Kirkpatrick
- TLDR: We empirically study memorization of fine-tuning methods using membership inference and extraction attacks, and show that their susceptibility to attacks is very different.
- Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition
- Shuguang Chen, Leonardo Neves, Thamar Solorio
- TLDR: We propose a new method that effectively transforms text from a high-resource domain to a low-resource domain by changing its style-related attributes, generating synthetic data for training.
- Linguistic Corpus Annotation for Automatic Text Simplification Evaluation
- Rémi Cardon, Adrien Bibal, Rodrigo Wilkens, David Alfter, Magali Norré, Adeline Müller, Watrin Patrick, Thomas François
- TLDR: We propose annotations of the ASSET corpus that can be used to shed more light on ATS evaluation.
- Semantic Framework based Query Generation for Temporal Question Answering over Knowledge Graphs
- Wentao Ding, Hao Chen, Huayu Li, Yuzhong Qu
- TLDR: We propose a temporal question answering method, SF-TQA, which generates query graphs by exploring the relevant facts of mentioned entities, where the exploring process is restricted by SF-TCons.
- There Is No Standard Answer: Knowledge-Grounded Dialogue Generation with Adversarial Activated Multi-Reference Learning
- Xueliang Zhao, Tingchen Fu, Chongyang Tao, Rui Yan
- TLDR: We propose a span-based variational model for knowledge-grounded dialogue and a wake-sleep style variational approach to learn the one-to-many generalization.
- Stop Measuring Calibration When Humans Disagree
- Joris Baan, Wilker Aziz, Barbara Plank, Raquel Fernandez
- TLDR: We show that measuring calibration to human majority given inherent disagreements is theoretically problematic, demonstrate this empirically on the ChaosNLI dataset, and derive several instance-level measures of calibration that capture key statistical properties of human judgements - including class frequency, ranking and entropy.
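Entropy of the human judgement distribution is the easiest of these instance-level statistics to picture. A toy example with hypothetical annotation counts, showing how a majority-calibrated model can look overconfident next to genuinely split human judgements:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

# Hypothetical NLI item: annotators split 55/35/10 over the three labels.
human = [0.55, 0.35, 0.10]   # human judgement distribution
model = [0.98, 0.01, 0.01]   # model predictive distribution
print(entropy(human))  # ~0.93 nats: genuine disagreement
print(entropy(model))  # ~0.11 nats: far more confident than the humans
```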
- Improving compositional generalization for multi-step quantitative reasoning in question answering
- Armineh Nourbakhsh, Cathy Jiao, Sameena Shah, Carolyn Rosé
- TLDR: We propose a method for modeling the compositional nature of quantitative text and use it to improve the performance and robustness of numerical QA models.
- A Comprehensive Comparison of Neural Networks as Cognitive Models of Inflection
- Adam Wiemerslage, Shiran Dudy, Katharina Kann
- TLDR: We address the question of whether neural networks are a feasible account for human behavior in morphological inflection.
- Can Visual Context Improve Automatic Speech Recognition for an Embodied Agent?
- Pradip Pramanick, Chayan Sarkar
- TLDR: We present a method to incorporate a robot’s visual information into an ASR system and improve the recognition of a spoken utterance containing a visible entity.
- AfroLID: A Neural Language Identification Tool for African Languages
- Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Inciarte
- TLDR: We present a neural language identification toolkit for 517 African languages and varieties.
- EvEntS ReaLM: Event Reasoning of Entity States via Language Models
- Evangelia Spiliopoulou, Artidoro Pagnoni, Yonatan Bisk, Eduard Hovy
- TLDR: We show that proper model prompting can dramatically improve performance of reported baseline results across multiple tasks.
- Large language models are few-shot clinical information extractors
- Monica Agrawal, Stefan Hegselmann, Hunter Lang, Yoon Kim, David Sontag
- TLDR: We show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain.
- Towards a Unified Multi-Dimensional Evaluator for Text Generation
- Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji, Jiawei Han
- TLDR: Unified multi-dimensional evaluator for Natural Language Generation.
- GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models
- Da Yin, Hritik Bansal, Masoud Monajatipoor, Liunian Harold Li, Kai-Wei Chang
- TLDR: We benchmark 11 standard multilingual PLMs on GeoMLAMA.
- The (Undesired) Attenuation of Human Biases by Multilinguality
- Cristina España-Bonet, Alberto Barrón-Cedeño
- TLDR: We show that human biases are not universal: the values they encode differ across languages.
- Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning
- Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- TLDR: We present a question-answering system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning.
- Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
- Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
- TLDR: We propose a new and simple automatic evaluation method for natural language generation tasks called Near-Negative Distinction (NND) that repurposes prior human annotations into NND tests.
- ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
- Badr AlKhamissi, Faisal Ladhak, Srinivasan Iyer, Veselin Stoyanov, Zornitsa Kozareva, Xian Li, Pascale Fung, Lambert Mathias, Asli Celikyilmaz, Mona Diab
- TLDR: We present a few-shot learning method for hate speech detection that outperforms the baseline by 17.83% absolute gain in the 16-shot case.
- Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations
- Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal
- TLDR: We show that explanations can be generated for both easy and hard examples, and that explanation quality is influenced by sample hardness.
- Stanceosaurus: Classifying Stance Towards Multicultural Misinformation
- Jonathan Zheng, Ashutosh Baheti, Tarek Naous, Wei Xu, Alan Ritter
- TLDR: We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi and Arabic annotated with stance towards 250 misinformation claims.
- Gendered Mental Health Stigma in Masked Language Models
- Inna Lin, Lucille Njoo, Anjalie Field, Ashish Sharma, Katharina Reinecke, Tim Althoff, Yulia Tsvetkov
- TLDR: We investigate gendered mental health stigma in masked language models and show that different models capture dimensions of stigma differently for men and women, associating stereotypes like anger, blame, and pity more with women with mental health conditions than with men.
- Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization
- Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum
- TLDR: We present a new approach for efficient k-nearest neighbor search that uses cross-encoders for both query and candidate matching.
- Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models
- Mirac Suzgun, Luke Melas-Kyriazi, Dan Jurafsky
- TLDR: We propose a method for arbitrary textual style transfer using pre-trained language models.
- Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts
- Ben Zhou, Kyle Richardson, Xiaodong Yu, Dan Roth
- TLDR: We show that with large-scale intermediate pre-training of decomposition-based transformers using distant supervision from comparable texts, particularly large-scale parallel news, developing robust decomposition models for a diverse range of tasks becomes more feasible.
- Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
- Anuj Diwan, Layne Berry, Eunsol Choi, David Harwath, Kyle Mahowald
- TLDR: We show that solving the Winoground task requires not just compositional language understanding, but a host of other abilities like commonsense reasoning or locating small, out-of-focus objects in low-resolution images.
- Gradient-based Constrained Sampling from Language Models
- Sachin Kumar, Biswajit Paria, Yulia Tsvetkov
- TLDR: We propose a novel method for constrained sampling of language models that combines log-likelihood of the language model with arbitrary (differentiable) constraints in a single energy function, and then generates samples in a non-autoregressive manner.
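A schematic sketch of one sampling step on such an energy function, using Langevin dynamics over a continuous relaxation of the sequence; `log_p` and `constraint` are hypothetical differentiable callables, and the update rule is illustrative rather than the paper's exact procedure:

```python
import torch

def langevin_constrained_step(x, log_p, constraint, lam=1.0, lr=0.1):
    # One noisy gradient step on E(x) = -log p(x) + lam * constraint(x),
    # where x is a continuous relaxation (e.g., soft token embeddings).
    x = x.detach().requires_grad_(True)
    energy = -log_p(x) + lam * constraint(x)
    energy.backward()
    with torch.no_grad():
        noise = torch.randn_like(x) * (2 * lr) ** 0.5
        return x - lr * x.grad + noise

x = torch.zeros(5, 8)                       # 5 "soft tokens" of dim 8
log_p = lambda z: -(z ** 2).sum()           # stand-in LM score
constraint = lambda z: (z.mean() - 1) ** 2  # stand-in attribute constraint
x = langevin_constrained_step(x, log_p, constraint)
```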
- TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data
- Fan Zhou, Mengkang Hu, Haoyu Dong, Zhoujun Cheng, Fan Cheng, Shi Han, Dongmei Zhang
- TLDR: We present TaCube, a portable pre-computation solution for numerical reasoning that improves the accuracy of auto-regressive pre-trained language models on WikiTQ and TAT-QA.
- Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
- Hung-Ting Chen, Michael Zhang, Eunsol Choi
- TLDR: We present a new calibration study that explores how question answering models use rich knowledge sources and how they interact with each other.
- QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation
- Zhenrui Yue, Huimin Zeng, Bernhard Kratzwald, Stefan Feuerriegel, Dong Wang
- TLDR: We propose a novel self-supervised framework for QA domain adaptation, together with a novel data augmentation pipeline for QA.
- When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain
- Raj Shah, Kunal Chawla, Dheeraj Eidnani, Agam Shah, Wendi Du, Sudheer Chava, Natraj Raman, Charese Smiley, Jiaao Chen, Diyi Yang
- TLDR: We propose a novel domain-specific Financial LANGuage model (FLANG) which uses financial keywords and phrases for better masking, together with a span boundary objective and an in-filling objective.
- Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer
- Zhengbao Jiang, Luyu Gao, Zhiruo Wang, Jun Araki, Haibo Ding, Jamie Callan, Graham Neubig
- TLDR: We present a novel end-to-end Transformer for knowledge-intensive question answering that outperforms state-of-the-art retrieval and QA systems.
- Reproducibility in Computational Linguistics: Is Source Code Enough?
- Mohammad Arvan, Luís Pina, Natalie Parde
- TLDR: We study the availability of source code at major computational linguistics conferences and show that source code releases leave much to be desired.
- Generating Information-Seeking Conversations from Unlabeled Documents
- Gangwoo Kim, Sungdong Kim, Kang Min Yoo, Jaewoo Kang
- TLDR: Synthesizing datasets for conversational question answering (CQA) from unlabeled documents remains challenging due to its interactive nature.
- Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation
- Ru Peng, Yawen Zeng, Jake Zhao
- TLDR: We present a novel multimodal machine translation framework that supports image-free inference at test time.
- A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques
- Malik Altakrori, Thomas Scialom, Benjamin C. M. Fung, Jackie Chi Kit Cheung
- TLDR: We re-evaluate different authorship obfuscation techniques on detection evasion and content preservation and show that evasion detection is not as effective as content preservation.
- SafeText: A Benchmark for Exploring Physical Safety in Language Models
- Sharon Levy, Emily Allaway, Melanie Subbiah, Lydia Chilton, Desmond Patton, Kathleen McKeown, William Yang Wang
- TLDR: We present a benchmark dataset for commonsense physical safety in natural language processing and show that state-of-the-art models are susceptible to the generation of unsafe text and have difficulty rejecting unsafe advice.
- Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
- Kang Min Yoo, Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Taeuk Kim
- TLDR: We re-examine the importance of ground-truth labels in in-context learning and show that the correct input-label mappings can have varying impacts on the downstream in-Context learning performances, depending on the experimental configuration.
- D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat
- Binwei Yao, Chao Shi, Likai Zou, Lingfeng Dai, Mengyue Wu, Lu Chen, Zhen Wang, Kai Yu
- TLDR: We propose a new dialogue system for depression-diagnosis-oriented clinical sessions that combines task-oriented dialogue and chit-chat, with uniqueness in dialogue topics and procedures.
- Exploiting domain-slot related keywords description for Few-Shot Cross-Domain Dialogue State Tracking
- Gao Qixiang, Guanting Dong, Yutao Mou, Liwen Wang, Chen Zeng, Daichi Guo, Mingyang Sun, Weiran Xu
- TLDR: We propose a novel framework based on domain-slot related description to tackle the challenge of few-shot cross-domain DST.
- CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation
- Sneha Mondal, Ritika ., Shreya Pathak, Preethi Jyothi, Aravindan Raghuveer
- TLDR: We present CoCoa, an encoder-decoder translation model that converts monolingual Hindi text to Hindi-English code-switched text with both encoder and decoder-side interventions to achieve fine-grained controllable generation.
- Towards Climate Awareness in NLP Research
- Daniel Hershcovich, Nicolas Webersinke, Mathias Kraus, Julia Bingler, Markus Leippold
- TLDR: We propose a climate performance model card that allows for systematic climate reporting in NLP research.
- Navigating Connected Memories with a Task-oriented Dialog System
- Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- TLDR: We propose dialogs for connected memories as a powerful tool to empower users to search their media collection through a multi-turn, interactive conversation.
- Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models
- Hao Zhang
- TLDR: We propose Language Model Decomposition (LMD) to represent a LM using a linear combination of other LMs as basis, and derive the closed-form solution.
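The "linear combination of basis LMs" view reduces, in the simplest setting, to an ordinary least-squares problem with a closed-form solution. A toy sketch where each LM is represented by its next-token probabilities over a shared probe set; all shapes and numbers here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
basis = [rng.random((50, 100)) for _ in range(3)]   # 3 toy "basis LMs"
target = 0.5 * basis[0] + 0.3 * basis[1] + 0.2 * basis[2]

# Stack each basis LM's probability table as a column and solve
# the least-squares problem in closed form.
A = np.stack([b.ravel() for b in basis], axis=1)
coef, *_ = np.linalg.lstsq(A, target.ravel(), rcond=None)
print(coef)  # recovers [0.5, 0.3, 0.2]
```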
- SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser
- Yue Zhang, Bo Zhang, Zhenghua Li, Zuyi Bao, Chen Li, Min Zhang
- TLDR: Syntax-enhanced grammatical error correction with a tailored GEC-oriented parser trained on parallel GEC data.
- Varifocal Question Generation for Fact-checking
- Nedjma Ousidhoum, Zhangdie Yuan, Andreas Vlachos
- TLDR: We present a new question generation algorithm for verifying a claim.
- Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport
- Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, Philipp Koehn
- TLDR: Graph-matching method for bilingual lexicon induction with optimal transport.
- Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
- Suchin Gururangan, Dallas Card, Sarah Dreier, Emily Gade, Leroy Wang, Zeyu Wang, Luke Zettlemoyer, Noah A. Smith
- TLDR: We investigate whose language is preferred by the quality filter used for language models, and why.
- ConReader: Exploring Implicit Relations in Contracts for Contract Clause Extraction
- Weiwen Xu, Yang Deng, Wenqiang Lei, Wenlong Zhao, Tat-Seng Chua, Wai Lam
- TLDR: We study automatic Contract Clause Extraction by modeling implicit relations in legal contracts.
- Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU
- Fenia Christopoulou, Gerasimos Lampouras, Ignacio Iacobacci
- TLDR: We propose to use training dynamics as difficulty metrics for NLU and show that training dynamics can lead to better performance on zero-shot cross-lingual transfer and OOD settings with improvements up to 8.5% in certain cases.
- Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
- Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
- TLDR: We investigate the instability of parameter-efficient tuning methods and propose a new method for improving their stability.
- Transfer Learning from Semantic Role Labeling to Event Argument Extraction with Template-based Slot Querying
- Zhisong Zhang, Emma Strubell, Eduard Hovy
- TLDR: We propose a new approach for transfer learning from semantic role labeling to event argument extraction, based on role querying and argument augmentation.
- Calibrating Zero-shot Cross-lingual (Un-)structured Predictions
- Zhengping Jiang, Anqi Liu, Benjamin Van Durme
- TLDR: We investigate model calibration in the setting of zero-shot cross-lingual transfer with large-scale pre-trained language models.
- PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training
- Song Xu, Haoran Li, Peng Yuan, Youzheng Wu, Xiaodong He
- TLDR: We propose a knowledge-enhanced pre-training paradigm for language models that learns to reconstruct the original text given a noise-corrupted one.
- How Far are We from Robust Long Abstractive Summarization?
- Huan Yee Koh, Jiaxin Ju, He Zhang, Ming Liu, Shirui Pan
- TLDR: We present a new set of metrics for long document abstractive summarization and show that ROUGE remains the best for summarization.
- Measuring Context-Word Biases in Lexical Semantic Datasets
- Qianchu Liu, Diana McCarthy, Anna Korhonen
- TLDR: We present the first quantitative analysis of the degree of context or word bias in existing datasets, and propose measures to calculate and visualize this degree.
- Iteratively Prompt Pre-trained Language Models for Chain of Thought
- Boshi Wang, Xiang Deng, Huan Sun
- TLDR: We propose an iterative prompting framework for multi-step inference of pre-trained language models, which learns to synthesize prompts conditioned on the current step’s contexts.
- Unobserved Local Structures Make Compositional Generalization Hard
- Ben Bogin, Shivanshu Gupta, Jonathan Berant
- TLDR: We propose a criterion, based on unobserved local structures, that predicts when compositional generalization will be hard on a particular test instance.
- Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning
- Xiaobao Wu, Anh Tuan Luu, Xinshuai Dong
- TLDR: We propose a novel contrastive learning method for short text topic modeling that improves the quality of topic distributions, whether or not data augmentation is available.
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling
- Yiyang Li, Hai Zhao, Zhuosheng Zhang
- TLDR: We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder, which explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
- Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
- Dongfang Li, Baotian Hu, Qingcai Chen
- TLDR: We propose a method to make the model less confident with non-inductive attributions and show that it improves confidence calibration of black-box models.
- Non-Autoregressive Neural Machine Translation: A Call for Clarity
- Robin Schmidt, Telmo Pires, Stephan Peitz, Jonas Lööf
- TLDR: We provide novel insights for establishing strong baselines using length prediction or CTC-based architecture variants, and contribute standardized BLEU, chrF++, and TER scores using sacreBLEU on four translation tasks; these have crucially been missing, as inconsistencies in the use of tokenized BLEU lead to deviations of up to 1.7 BLEU points.
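For reference, computing standardized scores with the sacrebleu Python package (2.x) looks roughly like this; note that inputs are detokenized, since sacreBLEU applies its own canonical tokenization internally:

```python
import sacrebleu  # pip install sacrebleu

hyps = ["the cat sat on the mat"]
refs = [["the cat is sitting on the mat"]]  # one list per reference stream

print(sacrebleu.corpus_bleu(hyps, refs).score)  # standardized BLEU
print(sacrebleu.corpus_chrf(hyps, refs).score)  # chrF (chrF++ via word_order=2)
print(sacrebleu.corpus_ter(hyps, refs).score)   # TER
```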
- RED-ACE: Robust Error Detection for ASR using Confidence Embeddings
- Zorik Gekhman, Dina Zverinski, Jonathan Mallinson, Genady Beryozkin
- TLDR: We propose a novel ASR error detection method that combines the ASR system’s word-level confidence scores with the transcribed text.
- Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
- Xiang Hu, Haitao Mi, Liang Li, Gerard de Melo
- TLDR: We propose a unified R2D2 method for grammar induction and inference that improves the grammar induction quality and achieves competitive results in downstream tasks.
- A Localized Geometric Method to Match Knowledge in Low-dimensional Hyperbolic Space
- Bo Hui, Tian Xia, Wei-Shinn Ku
- TLDR: We propose a localized geometric method to find equivalent entities in hyperbolic space for knowledge fusion.
- Memory-assisted prompt editing to improve GPT-3 after deployment
- Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- TLDR: We present a memory-assisted approach that improves GPT-3 after deployment: cases where the model misunderstands a user's intent are recorded together with user feedback, and this memory is used to edit future prompts.
- LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation
- Hongcheng Guo, Jiaheng Liu, Haoyang Huang, Jian Yang, Zhoujun Li, Dongdong Zhang, Zheng Cui
- TLDR: We propose a new multilingual multimodal machine translation task which uses visual prompts to support multi-language translation.
- PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning
- Zifeng Wang, Jimeng Sun
- TLDR: We propose to generate synthetic multimodal EHRs by language models and use prompt learning to control the generation conditioned by numerical and categorical demographic features.
- ROSE: Robust Selective Fine-tuning for Pre-trained Language Models
- Lan Jiang, Hao Zhou, Yankai Lin, Peng Li, Jie Zhou, Rui Jiang
- TLDR: We present ROSE, a novel selective fine-tuning approach for pre-trained language models that can defend against adversarial attacks.
- CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search
- Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan
- TLDR: We propose a novel code-text contrastive pre-training model for function-level code semantic representations.
- Open-Topic False Information Detection on Social Networks with Contrastive Adversarial Learning
- Guanghui Ma, Chunming Hu, Ling Ge, Hong Zhang
- TLDR: We propose a novel Contrastive Adversarial Learning Network for false information detection based on conversation graphs on social networks that is better suited to actual social networks.
- Mitigating Inconsistencies in Multimodal Sentiment Analysis under Uncertain Missing Modalities
- Jiandian Zeng, Jiantao Zhou, Tianyi Liu
- TLDR: We propose a novel method to recover missing features of the key modality in MSA and to mitigate the resulting inconsistencies.
- ConvTrans: Transforming Web Search Sessions for Conversational Dense Retrieval
- Kelong Mao, Zhicheng Dou, Hongjin Qian, Fengran Mo, Xiaohua Cheng, Zhao Cao
- TLDR: We present ConvTrans, a data augmentation method that can automatically transform easily-accessible web search sessions into conversational search sessions to fundamentally alleviate the data scarcity problem for conversational dense retrieval.
- MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts
- Xiangyu Xi, Jianwei Lv, Shuaipeng Liu, Wei Ye, Fan Yang, Guanglu Wan
- TLDR: We propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service.
- Reproducibility Issues for BERT-based Evaluation Metrics
- Yanran Chen, Jonas Belouadi, Steffen Eger
- TLDR: We show that while BERT-based metrics themselves can be reproduced, several of the claims and results built on them cannot.
- Improving Multi-task Stance Detection with Multi-task Interaction Network
- Heyan Chai, Siyu Tang, Jinhao Cui, Ye Ding, Binxing Fang, Qing Liao
- TLDR: We propose a novel multi-task interaction network for improving the performance of stance detection and sentiment analysis tasks simultaneously.
- Neural-based Mixture Probabilistic Query Embedding for Answering FOL queries on Knowledge Graphs
- Xiao Long, Liansheng Zhuang, Li Aodi, Shafei Wang, Houqiang Li
- TLDR: We propose a novel query embedding model that encodes the answer set of each mini-query as a mixture of Gaussian distributions with multiple means and covariance parameters, which can approximate any distribution arbitrarily well in real KGs.
- Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning
- Yi Cheng, Wenge Liu, Wenjie Li, Jiashuo Wang, Ruihui Zhao, Bang Liu, Xiaodan Liang, Yefeng Zheng
- TLDR: We propose a novel system MultiESC to provide emotional support to users in emotional distress.
- Conformal Predictor for Improving Zero-Shot Text Classification Efficiency
- Prafulla Kumar Choubey, Yu Bai, Chien-Sheng Wu, Wenhao Liu, Nazneen Rajani
- TLDR: We improve the efficiency of zero-shot models by restricting the number of likely labels using a fast base-classifier-based conformal predictor (CP) calibrated on samples labeled by the zero-shot model (see the sketch below).
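A minimal sketch of the conformal filtering idea described above: calibrate a nonconformity threshold for a fast base classifier on data pseudo-labeled by the zero-shot model, then pass only the surviving labels to the expensive model. All names and the nonconformity choice here are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def calibrate_threshold(cal_probs: np.ndarray, cal_y: np.ndarray, alpha: float = 0.1) -> float:
    """Split-conformal threshold; cal_y are labels produced by the zero-shot model."""
    scores = 1.0 - cal_probs[np.arange(len(cal_y)), cal_y]  # nonconformity = 1 - p(pseudo-label)
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(scores, level))

def likely_labels(test_probs: np.ndarray, q: float) -> list:
    """Keep labels whose nonconformity stays under the threshold; the zero-shot
    model then only needs to score this reduced candidate set."""
    return [np.where(1.0 - p <= q)[0] for p in test_probs]
```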
- Effective and Efficient Query-aware Snippet Extraction for Web Search
- Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie
- TLDR: We propose a novel query-aware webpage snippet extraction method based on query-specific sentence representations and a novel efficient query-based sentence encoder.
- You Only Need One Model for Open-domain Question Answering
- Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher Manning, Kyoung-Gu Woo
- TLDR: We propose a novel Transformer architecture for Open-domain Question Answering that uses internal passage-wise attention mechanisms applied sequentially within the transformer architecture and feeding computed representations to the reader, with the hidden representations progressively refined at each stage.
- Generative Entity Typing with Curriculum Learning
- Siyu Yuan, Deqing Yang, Jiaqing Liang, Zhixu Li, Jinxi Liu, Jingyue Huang, Yanghua Xiao
- TLDR: We propose a novel generative entity typing paradigm that can exploit heterogeneous data and handle few-shot and zero-shot situations.
- SetGNER: General Named Entity Recognition as Entity Set Generation
- Yuxin He, Buzhou Tang
- TLDR: We propose a novel entity set generation framework for general NER scenarios.
- Opinion Summarization by Weak-Supervision from Mix-structured Data
- Yizhu Liu, Qi Jia, Kenny Zhu
- TLDR: We propose a new method to synthesize training pairs that take mix-structured data as input and a textual summary as output, and design a summarization model with an OA encoder and an IS encoder.
- Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model
- Mingqi Li, Fei Ding, Dan Zhang, Long Cheng, Hongxin Hu, Feng Luo
- TLDR: We propose Multi-level Multilingual Knowledge Distillation, a novel method for improving multilingual language models.
- Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval
- Houxing Ren, Linjun Shou, Ning Wu, Ming Gong, Daxin Jiang
- TLDR: We propose a novel query generator as the teacher in the cross-lingual domain for knowledge distillation and a novel enhancement method for dual-encoder dense retrieval.
- R2F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference
- Hao Wang, Yixin Cao, Yangguang Li, Zhen Huang, Kun Wang, Jing Shao
- TLDR: We propose a novel framework for document-level natural language inference, which improves both performance and interpretability with the power of evidence.
- Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Processing
- Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Xinyu Duan, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Phillippe Langlais
- TLDR: We present a new set of Arabic language models that significantly outperform existing Arabic PLMs and achieve new state-of-the-art performance on discriminative and generative Arabic NLU and NLG tasks.
- KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering
- Jianing Wang, Chengyu Wang, Minghui Qiu, Qiuhui Shi, Hongbin Wang, Jun Huang, Ming Gao
- TLDR: We propose a novel framework for few-shot extractive question answering that elicits answers from pre-trained language models via knowledge-enhanced contrastive prompting.
- Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding
- Jianing Wang, Wenkang Huang, Minghui Qiu, Qiuhui Shi, Hongbin Wang, Xiang Li, Ming Gao
- TLDR: We introduce a knowledge-prompting-based PLM framework KP-PLM and propose a knowledge prompting paradigm for natural language understanding tasks.
- On the Evaluation Metrics for Paraphrase Generation
- Lingfeng Shen, Lemao Liu, Haiyun Jiang, Shuming Shi
- TLDR: We propose ParaScore, a new evaluation metric for paraphrase generation that explicitly models lexical divergence.
- Curriculum Learning Meets Weakly Supervised Multimodal Correlation Learning
- Sijie Mai, Ya Sun, Haifeng Hu
- TLDR: We propose curriculum learning for weakly supervised multimodal correlation learning, improving correlation modeling across modalities.
- Rethinking Positional Encoding in Tree Transformer for Code Representation
- Han Peng, Ge Li, Yunfei Zhao, Zhi Jin
- TLDR: We propose a novel tree Transformer encoding node positions based on our new description method for tree structures.
- RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL
- Jiexing Qi, Jingyao Tang, Ziwei He, Xiangpeng Wan, Yu Cheng, Chenghu Zhou, Xinbing Wang, Quanshi Zhang, Zhouhan Lin
- TLDR: We propose a Transformer seq2seq architecture augmented with relation-aware self-attention that could leverage a variety of relational structures while inheriting the pretrained parameters from the T5 model effectively.
- COM-MRC: A COntext-Masked Machine Reading Comprehension Framework for Aspect Sentiment Triplet Extraction
- Zepeng Zhai, Hao Chen, Fangxiang Feng, Ruifan Li, Xiaojie Wang
- TLDR: We propose a novel approach to extract sentiment triplets from sentences by using aspect terms.
- CEM: Machine-Human Chatting Handoff via Causal-Enhance Module
- Shanshan Zhong, Jinghui Qin, Zhongzhan Huang, Daifeng Li
- TLDR: We propose the Causal-Enhance Module for Machine-Human Chatting Handoff, which improves existing MHCH methods by modeling the causal relationships among the underlying variables.
- Nearest Neighbor Zero-Shot Inference
- Weijia Shi, Julian Michael, Suchin Gururangan, Luke Zettlemoyer
- TLDR: We show that the gains of non-parametric augmentation of language models on perplexity-based evaluations transfer to few-shot tasks, and that other advantages of non-parametric augmentation hold for end tasks (the underlying interpolation is sketched below).
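The non-parametric augmentation referenced above is typically the kNN-LM interpolation; a compact sketch under that assumption (the datastore layout, k, temperature, and λ are illustrative choices):

```python
import numpy as np

def knn_lm_probs(query_hidden, keys, values, lm_probs, vocab_size,
                 k=16, temperature=10.0, lam=0.25):
    """Interpolate LM next-token probabilities with a nearest-neighbor distribution.

    keys:   (N, d) hidden states stored in the datastore
    values: (N,)   the next token observed after each stored state
    """
    dists = np.linalg.norm(keys - query_hidden, axis=1)   # L2 distance to every key
    nn = np.argsort(dists)[:k]                            # indices of the k nearest
    weights = np.exp(-dists[nn] / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for w, token in zip(weights, values[nn]):
        p_knn[token] += w                                 # aggregate mass per next token
    return lam * p_knn + (1.0 - lam) * lm_probs
```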
- Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems
- David Gros, Yu Li, Zhou Yu
- TLDR: We present a new empirical analysis of the feasibility of two-turn dialogs that are designed to output human-like responses.
- A Joint Learning Framework for Restaurant Survival Prediction and Explanation
- Xin Li, Xiaojie Zhang, Peng JiaHao, Rui Mao, Mingyang Zhou, Xing Xie, Hao Liao
- TLDR: We propose a novel joint learning framework for explainable restaurant survival prediction based on the multi-modal data of user-restaurant interactions and users’ textual reviews.
- Making Pretrained Language Models Good Long-tailed Learners
- Chen Zhang, Lei Ren, Jingang Wang, Wei Wu, Dawei Song
- TLDR: We show that prompt-tuning makes pretrained language models good long-tailed learners.
- UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
- Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen, Xiaodan Liang
- TLDR: Unified geometry problem solving with multi-task Geometric Transformer framework and efficient sequence generation.
- Face-Sensitive Image-to-Emotional-Text Cross-modal Translation for Multimodal Aspect-based Sentiment Analysis
- Hao Yang, Yanyan Zhao, Bing Qin
- TLDR: We present a face-sensitive image-to-emotional-text translation method for aspect-level multimodal aspect-based sentiment analysis, which captures visual emotion cues from images and matches them with the target aspect in textual modality.
- FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation
- Chen Zhang, Luis Fernando D’Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li
- TLDR: We propose a multi-dimensional dialogue-level metric that combines three sub-metrics with each targeting a specific dimension.
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective
- Bohong Wu, Hai Zhao
- TLDR: We propose a novel generative language model based on phrase reconstruction for sentence representation learning.
- RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning
- Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric Xing, Zhiting Hu
- TLDR: We propose RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning, which improves performance on few-shot classification and unsupervised text style transfer.
- DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
- Hanqing Zhang, Dawei Song
- TLDR: We propose a new CTG approach, namely DisCup, which incorporates the attribute knowledge of discriminator to optimize the control-prompts, steering a frozen CLM to produce attribute-specific texts.
- CPL: Counterfactual Prompt Learning for Vision and Language Models
- Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Wang
- TLDR: We present a novel counterfactual prompt learning method for vision and language models that learns more generalizable prompt representation from both factual and counterfactually-similar examples via contrastive learning.
- Red Teaming Language Models with Language Models
- Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving
- TLDR: We automatically find cases where a target language model behaves in a harmful way, by generating test cases (“red teaming”) using another LM.
- CapOnImage: Context-driven Dense-Captioning on Image
- Yiqi Gao, Xinglin Hou, Yuanmeng Zhang, Tiezheng Ge, Yuning Jiang, Peng Wang
- TLDR: We propose a novel captioning task for images which uses text as decorations and generates captions for different locations of the image based on contextual information.
- SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition
- Jianing Wang, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang, Jun Huang, Ming Gao
- TLDR: We propose a novel span-based prototypical network for few-shot named entity recognition that outperforms strong baselines by a large margin.
- Discovering Differences in the Representation of People using Contextualized Semantic Axes
- Li Lucy, Divya Tadimeti, David Bamman
- TLDR: We construct contextualized semantic axes for BERT embeddings and show that they can characterize differences in the way people view women and the contexts around them.
- Generating Literal and Implied Subquestions to Fact-check Complex Claims
- Jifan Chen, Aniruddh Sriram, Eunsol Choi, Greg Durrett
- TLDR: We present CLAIMDECOMP, a dataset of decompositions for over 1000 claims.
- Machine Translation Robustness to Natural Asemantic Variation
- Jacob Bremerman, Xiang Ren, Jonathan May
- TLDR: We introduce and formalize a new category of source-side variation that preserves meaning in the target language, and show that existing MT models fail when presented with it.
- Natural Language to Code Translation with Execution
- Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, Sida I. Wang
- TLDR: We introduce execution result–based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code translation (see the selection sketch below).
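A minimal sketch of execution-based MBR selection as described above: sample several candidate programs, execute them, and keep the one whose outputs agree most often with the other samples. The `run` executor is a hypothetical sandboxed helper, not part of the paper's code.

```python
def mbr_exec(programs: list[str], inputs: list, run) -> str:
    """Select the program minimizing Bayes risk under an execution-match loss."""
    results = [[run(p, x) for x in inputs] for p in programs]  # execute every sample
    def agreement(i: int) -> int:
        # Count how many other candidates produce identical execution results.
        return sum(results[i] == results[j] for j in range(len(programs)) if j != i)
    return programs[max(range(len(programs)), key=agreement)]
```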
- Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes
- Oren Sultan, Dafna Shahaf
- TLDR: We develop an interpretable, scalable algorithm for extracting analogies from procedural text, and demonstrate that it identifies the correct mappings 87% of the time for procedural texts and 94% for stories from the cognitive-psychology literature.
- Language Contamination Helps Explain the Cross-lingual Capabilities of English Pretrained Models
- Terra Blevins, Luke Zettlemoyer
- TLDR: We show that English pretrained language models are not monolingual when pretrained at scale, and that even when non-English text makes up less than 1% of the data, it can facilitate cross-lingual transfer.
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
- Terra Blevins, Hila Gonen, Luke Zettlemoyer
- TLDR: We investigate when multilingual pretrained models acquire their in-language and cross-lingual abilities, and show that the point in pretraining when a model learns to transfer cross-lingually differs across language pairs.
- Neural Machine Translation with Contrastive Translation Memories
- Xin Cheng, Shen Gao, Lemao Liu, Dongyan Zhao, Rui Yan
- TLDR: We propose a new retrieval-augmented NMT model that retrieves translation memories which are holistically similar to the source sentence yet individually contrastive to each other, providing maximal information gain in three phases.
- Distilling Causal Effect from Miscellaneous Other-Class for Continual Named Entity Recognition
- Junhao Zheng, Zhanxian Liang, Haibin Chen, Qianli Ma
- TLDR: We propose a unified causal framework to retrieve the causality from both new entity types and Other-Class, and propose a self-adaptive weight for balancing the causal effects between new entity types and Other-Class.
- Exploring the Secrets Behind the Learning Difficulty of Meaning Representations for Semantic Parsing
- Zhenwen Li, Jiaqi Guo, Qian Liu, Jian-Guang Lou, Tao Xie
- TLDR: We propose a data-aware metric, ISS (incremental structural stability), for meaning representations (MRs), and demonstrate that ISS is highly correlated with final performance.
- That’s the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data
- Jered McInerney, Geoffrey Young, Jan-Willem van de Meent, Byron Wallace
- TLDR: We show that the text in multimodal models on EHRs does not influence attention, and that the alignment of the model is not consistent with anatomical information.
- Unsupervised Tokenization Learning
- Anton Kolonin, Vignav Ramesh
- TLDR: We show that the transition freedom metric for unsupervised tokenization is superior to other metrics, and that different languages require different offshoots of that metric (such as derivative, variance, and peak values) for successful tokenization.
- A Template-based Method for Constrained Neural Machine Translation
- Shuo Wang, Peng Li, Zhixing Tan, Zhaopeng Tu, Maosong Sun, Yang Liu
- TLDR: We propose a template-based method for neural machine translation that yields high translation quality and match accuracy, with inference speed comparable to unconstrained NMT models.
- PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models
- Yupeng Zhang, Hongzhi Zhang, Sirui Wang, Wei Wu, Zhoujun Li
- TLDR: We present a noisy training mechanism which considers each parameter’s importance in the downstream task to help fine-tune PLMs.
- Towards Reinterpreting Neural Topic Models via Composite Activations
- Jia Peng Lim, Hady Lauw
- TLDR: We present a model-free two-stage process to reinterpret neural topic models and derive further insights on the state of the trained model.
- Few-shot Query-Focused Summarization with Prefix-Merging
- Ruifeng Yuan, Zili Wang, Ziqiang Cao, Wenjie Li
- TLDR: We propose prefix-merging, a prefix-based pretraining strategy for few-shot learning in query-focused summarization.
- Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment
- Siyu Lai, Zhen Yang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
- TLDR: We propose Cross-Align to model deep interactions between the input sentence pairs, in which the source and target sentences are encoded separately with shared self-attention modules in the shallow layers, while cross-lingual interactions are explicitly constructed by cross-attention modules in the upper layers.
- BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation
- Tianxiang Sun, Junliang He, Xipeng Qiu, Xuanjing Huang
- TLDR: We present the first systematic study on the social bias in PLM-based metrics and propose a debiasing method to mitigate it.
- HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification
- Zihan Wang, Peiyi Wang, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui, Houfeng Wang
- TLDR: We propose a hierarchy-aware prompt tuning method that handles hierarchical text classification as a multi-label problem.
- Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering
- Md Arafat Sultan, Avi Sil, Radu Florian
- TLDR: We show that as a model learns its source domains better, its zero-shot out-of-domain utility improves at an even faster pace.
- Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
- Maarten Sap, Ronan Le Bras, Daniel Fried, Yejin Choi
- TLDR: We show that large language models lack social intelligence and Theory of Mind, and propose a theory to explain this shortcoming.
- Improving Passage Retrieval with Zero-Shot Question Generation
- Devendra Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer
- TLDR: We propose a simple and effective re-ranking method for improving passage retrieval in open question answering (see the scoring sketch below).
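A minimal sketch of this style of zero-shot re-ranking: score each passage by the likelihood of the question given the passage under a seq2seq LM. The model choice (t5-base) and the prompt wording are assumptions for illustration, not the paper's exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")            # assumed scoring LM
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base").eval()

def rerank(question: str, passages: list[str]) -> list[str]:
    """Order passages by log P(question | passage)."""
    labels = tokenizer(question, return_tensors="pt").input_ids
    scores = []
    for passage in passages:
        # Hypothetical prompt template; the exact wording is an assumption.
        prompt = f"Passage: {passage} Please write a question based on this passage."
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
        with torch.no_grad():
            loss = model(**inputs, labels=labels).loss  # mean NLL of question tokens
        scores.append(-loss.item())
    return [p for _, p in sorted(zip(scores, passages), key=lambda t: -t[0])]
```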
- Summarizing Community-based Question-Answer Pairs
- Ting-Yao Hsu, Yoshi Suhara, Xiaolan Wang
- TLDR: We propose a novel CQA summarization task that aims to create a concise summary from CQA pairs.
- Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models
- Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Marek Rei
- TLDR: We present a logical reasoning framework for NLI that improves interpretability and robustness in a reduced data setting.
- How to disagree well: Investigating the dispute tactics used on Wikipedia
- Christine De Kock, Andreas Vlachos
- TLDR: We propose a framework of dispute tactics that unifies two complementary perspectives on disagreement, along with other dialogue acts that play a role in resolving disputes, such as asking questions and providing clarification.
- Chapter Ordering in Novels
- Allen Kim, Steve Skiena
- TLDR: We formulate chapter ordering in novels as a constraint-solving problem, and show that it is significantly more challenging than sentence ordering.
- Open-ended Knowledge Tracing for Computer Science Education
- Naiming Liu, Zichao Wang, Richard Baraniuk, Andrew Lan
- TLDR: We propose a new knowledge tracing method for programming questions and show its promise in computer science education.
- Logical Neural Networks for Knowledge Base Completion with Embeddings & Rules
- Prithviraj Sen, Breno William Carvalho, Ibrahim Abdelaziz, Pavan Kapanipathi, Salim Roukos, Alexander Gray
- TLDR: We propose to learn both conjunction-of-disjunctions and disjunction-of-conjunctions in logical neural networks for knowledge base completion.
- MedCLIP: Contrastive Learning from Unpaired Medical Images and Text
- Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, Jimeng Sun
- TLDR: We propose a novel multimodal contrastive learning framework based on medical knowledge and show that it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval.
- GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization
- Zhiyuan Zhang, Ruixuan Luo, Qi Su, Xu Sun
- TLDR: We propose a gradient-strength based adaptive sharpness-aware minimization algorithm that helps learning algorithms find flat minima that generalize better.
- Sparse Teachers Can Be Dense with Knowledge
- Yi Yang, Chen Zhang, Dawei Song
- TLDR: We propose a sparse teacher trick that removes student-unfriendly parameters, showing that a sparse teacher can be dense with knowledge.
- BBTv2: Towards a Gradient-Free Future with Large Language Models
- Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
- TLDR: We present gradient-free tuning of pre-trained models for few-shot learning.
- Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models
- Shujian Zhang, Chengyue Gong, Xingchao Liu
- TLDR: We introduce a learnable passage mask mechanism which desensitizes the impact from the top-rank retrieval passages and prevents the model from overfitting.
- Mixed-effects transformers for hierarchical adaptation
- Julia White, Noah Goodman, Robert Hawkins
- TLDR: We introduce the mixed-effects transformer, a novel approach for learning hierarchically-structured prefixes to account for structured variation in language use.
- On Measuring the Intrinsic Few-Shot Hardness of Datasets
- Xinran Zhao, Shikhar Murty, Christopher Manning
- TLDR: We show that few-shot learning is made possible by exploiting feature-space invariances between training and test samples.
- Group is better than individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling
- Bowen Xing, Ivor Tsang
- TLDR: We propose a novel model for joint multiple intent detection and slot filling that captures beneficial correlations among labels from heterogeneous label graphs (HLG).
- An Empirical Study on Finding Spans
- Weiwei Gu, Boyuan Zheng, Yunmo Chen, Tongfei Chen, Benjamin Van Durme
- TLDR: We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks.
- MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
- Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad Morariu
- TLDR: We propose MGDoc, a novel multi-modal multi-granular pre-training framework that encodes page-level, region-level and word-level information at the same time.
- Understanding Jargon: Combining Extraction and Generation for Definition Modeling
- Jie Huang, Hanyin Shao, Kevin Chen-Chuan Chang, Jinjun Xiong, Wen-mei Hwu
- TLDR: We propose a new method for jargon definition modeling that can generate high-quality definitions for jargon and outperforms state-of-the-art models significantly.
- ProsocialDialog: A Prosocial Backbone for Conversational Agents
- Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi, Maarten Sap
- TLDR: We present ProsocialDialog, a large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms.
- Exploiting Global and Local Hierarchies for Hierarchical Text Classification
- Ting Jiang, Deqing Wang, Leilei Sun, Zhongzhi Chen, Fuzhen Zhuang, Qinghong Yang
- TLDR: We propose Hierarchy-guided BERT with Global and Local hierarchies, a new hierarchical text classification method that exploits global and local hierarchies in multi-label text classification.
- Semantic-aware Contrastive Learning for More Accurate Semantic Parsing
- Shan Wu, Chunlei Xin, Bo Chen, Xianpei Han, Le Sun
- TLDR: We propose a new contrastive learning algorithm for semantic parsers that learns to distinguish fine-grained meaning representations and takes overall sequence-level semantics into consideration.
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs
- Xiuying Chen, Mingzhe Li, Shen Gao, Rui Yan, Xin Gao, Xiangliang Zhang
- TLDR: We propose a graph-based unsupervised summarization model for scientific paper extractive summarization.
- Hardness-guided domain adaptation to recognise biomedical named entities under low-resource scenarios
- Ngoc Dang Nguyen, Lan Du, Wray Buntine, Changyou Chen, Richard Beare
- TLDR: We present a simple yet effective hardness-guided domain adaptation framework for bioNER tasks that can effectively leverage the domain hardness information to improve the adaptability of the learnt model in the low-resource scenarios.
- Syntactic Multi-view Learning for Open Information Extraction
- Kuicai Dong, Aixin Sun, Jung-Jae Kim, Xiaoli Li
- TLDR: We propose a novel neural OpenIE model that learns from the syntactic structures of open-domain sentences.
- TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
- Chaoya Jiang, Haiyang Xu, Chenliang Li, Ming Yan, Wei Ye, Shikun Zhang, Bin Bi, Songfang Huang
- TLDR: We propose an efficient vision-and-language pre-training model with text-guided patch-selection layer in the visual backbone for efficient training and inference.
- CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation
- Yinpei Dai, Wanwei He, Bowen Li, Yuchuan Wu, Zheng Cao, Zhongqi An, Jian Sun, Yongbin Li
- TLDR: We propose CGoDial, a new challenging and comprehensive Chinese benchmark for multi-domain Goal-oriented Dialog evaluation.
- Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding
- SongYang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang
- TLDR: We propose Kernel-Whitening, a Nystrom kernel approximation method for deep learning which improves the generalization ability of fine-tuned models while mitigating bias.
- A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling
- Ye Wang, Xinxin Liu, Wenxin Hu, Tao Zhang
- TLDR: We propose a unified positive-unlabeled learning framework for document-level relation extraction and show that it outperforms previous state-of-the-art results under both fully supervised and extremely unlabeled settings.
- Automatic Generation of Socratic Subquestions for Teaching Math Word Problems
- Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan
- TLDR: We explore the ability of large language models (LMs) in generating sequential questions for guiding math word problem-solving.
- Mixture of Attention Heads: Selecting Attention Heads Per Token
- Xiaofeng Zhang, Yikang Shen, Zeyu Huang, Jie Zhou, Wenge Rong, Zhang Xiong
- TLDR: We propose a new architecture that combines multi-head attention with the MoE mechanism to achieve stronger performance than the standard multi-head attention layer.
- The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models
- Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh
- TLDR: We present a new method for sparsifying BERT models based on approximate second-order information and show state-of-the-art results in both pre-training and fine-tuning.
- Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue
- Sunjae Yoon, Eunseop Yoon, Hee Suk Yoon, Junyeong Kim, Chang Yoo
- TLDR: We propose a novel method for tackling the text hallucination problem in video-grounded dialogue systems.
- DSM: Question Generation over Knowledge Base via Modeling Diverse Subgraphs with Meta-learner
- Shasha Guo, Jing Zhang, Yanling Wang, Qianyi Zhang, Cuiping Li, Hong Chen
- TLDR: We propose a novel approach to model diverse subgraphs with meta-learner (DSM) and show that it improves the performance of knowledge base question generation models.
- RelU-Net: Syntax-aware Graph U-Net for Relational Triple Extraction
- Yunqi Zhang, Yubo Chen, Yongfeng Huang
- TLDR: We propose a unified framework to incorporate syntactic information for relational triple extraction.
- Evidence > Intuition: Transferability Estimation for Encoder Selection
- Elisa Bassignana, Max Müller-Eberstein, Mike Zhang, Barbara Plank
- TLDR: We propose to generate quantitative evidence to predict which pre-trained language models (LMs) will perform best on a target task without having to fine-tune all candidates.
- Chunk-based Nearest Neighbor Machine Translation
- Pedro Henrique Martins, Zita Marinho, André F. T. Martins
- TLDR: We present a chunk-based kNN-MT model which retrieves chunks of tokens from the datastore, instead of a single token.
- FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering
- Akhil Kedia, Mohd Abbas Zaidi, Haejun Lee
- TLDR: We propose a new method for generating high-quality answer scores for open-domain question answering using transformer encoders that use global representation to attend over multiple tokens across samples.
- Inductive Relation Prediction with Logical Reasoning Using Contrastive Representations
- Yudai Pan, Jun Liu, Lingling Zhang, Tianzhe Zhao, Qika Lin, Xin Hu, Qianying Wang
- TLDR: We propose a novel graph convolutional network-based model LogCo with logical reasoning by contrastive representations for inductive relation prediction in knowledge graphs.
- Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity
- Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang
- TLDR: We improve Chinese spelling check with an auxiliary character pronunciation prediction task, studying the effects of its adaptivity and granularity, and propose an adaptive weighting scheme to balance the two tasks.
- MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation
- Anna Currey, Maria Nadejde, Raghavendra Reddy Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu
- TLDR: Gender-balanced, counterfactual data for evaluating gender accuracy in translation from English into eight widely-spoken languages.
- A Span-level Bidirectional Network for Aspect Sentiment Triplet Extraction
- Yuqi Chen, Chen Keming, Xian Sun, Zequn Zhang
- TLDR: We present a span-level bidirectional network which utilizes all possible spans as input and extracts triplets from spans bidirectionally.
- On the Calibration of Massively Multilingual Language Models
- Kabir Ahuja, Sunayana Sitaram, Sandipan Dandapat, Monojit Choudhury
- TLDR: We show that large multilingual language models are not as well-calibrated as expected in the zero-shot setting and propose strategies to improve their calibration.
- Momentum Contrastive Pre-training for Question Answering
- Minda Hu, Muzhi Li, Yasheng Wang, Irwin King
- TLDR: We propose a novel Momentum Contrastive pRe-training fOr queStion anSwering method for extractive Question Answering.
- A Second Wave of UD Hebrew Treebanking and Cross-Domain Parsing
- Amir Zeldes, Nick Howell, Noam Ordan, Yifat Ben Moshe
- TLDR: We present a new, freely available UD treebank of Hebrew covering a wide range of topics, enabling state-of-the-art Hebrew NLP.
- Finding Dataset Shortcuts with Grammar Induction
- Dan Friedman, Alexander Wettig, Danqi Chen
- TLDR: We propose to use probabilistic grammars to characterize and discover shortcuts in NLP datasets and use them to generate diagnostic contrast examples and improve worst-group accuracy.
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
- Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang
- TLDR: We propose a unified framework for retrieval-augmented commonsense reasoning that significantly outperforms other knowledge-enhanced methods.
- Open World Classification with Adaptive Negative Samples
- Ke Bai, Guoyin Wang, Jiwei Li, Sunghyun Park, Sungjin Lee, Puyang Xu, Ricardo Henao, Lawrence Carin
- TLDR: We propose an approach based on Adaptive Negative Samples (ANS) designed to generate effective synthetic open category samples in the training stage and without requiring any prior knowledge or external datasets.
- Re3: Generating Longer Stories With Recursive Reprompting and Revision
- Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- TLDR: We propose a novel approach to generating longer stories of over two thousand words by generating sentences from a general-purpose language model and then reranking continuations for plot coherence and relevance.
- Does Joint Training Really Help Cascaded Speech Translation?
- Viet Anh Khoa Tran, David Thulke, Yingbo Gao, Christian Herold, Hermann Ney
- TLDR: We show that a strong cascaded baseline can diminish any improvements obtained using joint training, and we suggest alternatives to joint training.
- MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
- David Adelani, Graham Neubig, Sebastian Ruder, Shruti Rijhwani, Michael Beukman, Chester Palen-Michel, Constantine Lignos, Jesujoba Alabi, Shamsuddeen Muhammad, Peter Nabende, Cheikh M. Bamba Dione, Andiswa Bukula, Rooweither Mabuya, Bonaventure F. P. Dossou, Blessing Sibanda, Happy Buzaaba, Jonathan Mukiibi, Godson Kalipe, Derguene Mbaye, Amelia Taylor, Fatoumata Kabore, Chris Chinenye Emezue, Anuoluwapo Aremu, Perez Ogayo, Catherine Gitau, Edwin Munkoh-Buabeng, Victoire Memdjokam Koagne, Allahsera Auguste Tapo, Tebogo Macucwa, Vukosi Marivate, Mboning Tchiaze Elvis, Tajuddeen Gwadabe, Tosin Adewumi, Orevaoghene Ahia, Joyce Nakatumba-Nabende, Neo Lerato Mokono, Ignatius Ezeani, Chiamaka Chukwuneke, Mofetoluwa Oluwaseun Adeyemi, Gilles Quentin Hacheme, Idris Abdulmumin, Odunayo Ogundepo, Oreen Yousuf, Tatiana Moteu, Dietrich Klakow
- TLDR: We present the largest to-date human-annotated NER dataset for 20 African languages and show that choosing the best transfer language improves zero-shot F1 scores by 14% over 20 languages as compared to using English.
- Ethics consideration sections in natural language processing papers
- Luciana Benotti, Patrick Blackburn
- TLDR: We present a manual classification of all ethical consideration sections for ACL 2021 and provide a list of the papers that required ethics review by at least one reviewer.
- Continued Pretraining for Better Zero- and Few-Shot Promptability
- Zhaofeng Wu, Robert L Logan IV, Pete Walsh, Akshita Bhagia, Dirk Groeneveld, Sameer Singh, Iz Beltagy
- TLDR: We investigate if a dedicated continued pretraining stage could improve both zero-shot performance with natural language prompts and few-shot prompt tuning, and provide concrete recommendations to optimize promptability for different use cases.
- Less is More: Summary of Long Instructions is Better for Program Synthesis
- Kirby Kuznia, Swaroop Mishra, Mihir Parmar, Chitta Baral
- TLDR: We present a meta-dataset for the program synthesis task and show that summaries of long and complicated programming questions help LMs in understanding a task.
- Is a Question Decomposition Unit All We Need?
- Pruthvi Patel, Swaroop Mishra, Mihir Parmar, Chitta Baral
- TLDR: We investigate if humans can decompose a hard question into a set of simpler questions that are relatively easier for models to answer.
- Discourse-Aware Soft Prompting for Text Generation
- Marjan Ghazvininejad, Vladimir Karpukhin, Vera Gor, Asli Celikyilmaz
- TLDR: We propose a structured design of prefix parameters for conditional text generation that improves the coherence, faithfulness, and relevance of generations.
- ExPUNations: Augmenting Puns with Keywords and Explanations
- Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Tagyoung Chung, Jing Huang, Yang Liu, Nanyun Peng
- TLDR: We present a new dataset of puns with detailed and fine-grained annotations: keywords denoting the most distinctive words that make the text funny, pun explanations describing why the text is funny, and fine-grained funniness ratings.
- SLING: Sino Linguistic Evaluation of Large Language Models
- Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Mohit Iyyer
- TLDR: We present SLING, a benchmark of 38K minimal sentence pairs in Mandarin Chinese covering syntactic and semantic phenomena, used to probe pretrained Chinese language models.
- Context-Situated Pun Generation
- Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Shuyang Gao, Tagyoung Chung, Jing Huang, Yang Liu, Nanyun Peng
- TLDR: We propose a new task for context-situated pun generation, where a specific context represented by a set of keywords is provided, and the task is to first identify suitable pun words that are appropriate for the context, then generate puns based on the context keywords and the identified pun words.
- Retrieval-Augmented Generative Question Answering for Event Argument Extraction
- Xinya Du, Heng Ji
- TLDR: We propose a retrieval-augmented generative QA model for event argument extraction that substantially outperforms prior methods across various settings (i.e., fully supervised, domain transfer, and few-shot learning).
- Concadia: Towards Image-Based Text Generation with a Purpose
- Elisa Kreiss, Fei Fang, Noah Goodman, Christopher Potts
- TLDR: We show that descriptions and captions serve distinct communicative roles and that it is important to distinguish them.
- Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics
- Elisa Kreiss, Cynthia Bennett, Shayan Hooshmand, Eric Zelikman, Meredith Ringel Morris, Christopher Potts
- TLDR: We argue against current referenceless metrics for image accessibility on the grounds that they do not align with the needs of BLV users.
- MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure
- Yinya Huang, Hongming Zhang, Ruixin Hong, Xiaodan Liang, Changshui Zhang, Dong Yu
- TLDR: We propose a comprehensive benchmark to investigate models’ logical reasoning capabilities in complex real-life scenarios.
- Explicit Query Rewriting for Conversational Dense Retrieval
- Hongjin Qian, Zhicheng Dou
- TLDR: We propose a model CRDR that can perform query rewriting and context modelling in a unified framework in which the query rewriting’s supervision signals further enhance the context modelling.
- Efficient Nearest Neighbor Emotion Classification with BERT-whitening
- Wenbiao Yin, Lin Shang
- TLDR: We propose a simple and efficient non-parametric emotion classification method using nearest neighbor retrieval over whitened embeddings (see the sketch below).
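A compact sketch of the pipeline suggested above: whiten sentence embeddings (mean removal plus an SVD-based transform), then classify a query by a nearest-neighbor vote. The embeddings are assumed to come from BERT pooling; k and the cosine retrieval are illustrative choices.

```python
import numpy as np

def whitening(emb: np.ndarray):
    """BERT-whitening parameters: mean mu and transform W from the SVD of the covariance."""
    mu = emb.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(np.cov((emb - mu).T))
    return mu, u @ np.diag(1.0 / np.sqrt(s))

def knn_classify(query_emb, train_emb, train_labels, mu, W, k=5):
    """Nearest-neighbor vote in the whitened, L2-normalized space."""
    norm = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    q, t = norm((query_emb - mu) @ W), norm((train_emb - mu) @ W)
    topk = np.argsort(-(q @ t.T), axis=1)[:, :k]          # top-k by cosine similarity
    return [np.bincount(train_labels[idx]).argmax() for idx in topk]
```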
- FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification
- Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- TLDR: We propose FastClass, a novel weakly-supervised text classification algorithm that uses dense text representation to retrieve class-relevant documents from external unlabeled corpus and selects an optimal subset to train a classifier.
- Neural-Symbolic Inference for Robust Autoregressive Graph Parsing via Compositional Uncertainty Quantification
- Zi Lin, Jeremiah Liu, Jingbo Shang
- TLDR: We study a compositionality-aware approach to neural-symbolic inference informed by model confidence, performing fine-grained neural-symbolic reasoning at the subgraph level (i.e., nodes and edges) and precisely targeting subgraph components with high uncertainty in the neural parser.
- A Speaker-Aware Co-Attention Framework for Medical Dialogue Information Extraction
- Yuan Xia, Zhenhui Shi, Jingbo Zhou, Jiayu Xu, Chao Lu, Yehui Yang, Lei Wang, Haifeng Huang, Xia Zhang, Junwei Liu
- TLDR: We propose a speaker-aware dialogue encoder with multi-task learning and a co-attention fusion network for medical dialogue information extraction.
- Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework
- Yiquan Wu, Yifei Liu, Weiming Lu, Yating Zhang, Jun Feng, Changlong Sun, Fei Wu, Kun Kuang
- TLDR: We propose a novel Rationale-based Legal Judgment Prediction (RLJP) framework which is based on the judge’s real trial logic.
- RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection via Relational Contrastive Learning
- Yi Zhu, Zhaoqing Zhu, Bingqian Lin, Xiaodan Liang, Feng Zhao, Jianzhuang Liu
- TLDR: We propose a simple yet effective visual Relationship prediction framework that transfers natural language knowledge learned from Contrastive Language-Image Pre-training (CLIP) models to enhance the relationship prediction, termed RelCLIP.
- Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation
- Huanran Zheng, Wei Zhu, Pengfei Wang, Xiaoling Wang
- TLDR: We propose a simple but effective method called Candidate Soups, which can obtain high-quality translations while maintaining the inference speed of NAT models.
- Evaluating Parameter Efficient Learning for Generation
- Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro
- TLDR: Parameter efficient learning methods with better performance than finetuning on ROUGE scores.
- McQueen: a Benchmark for Multimodal Conversational Query Rewrite
- Yifei Yuan, Chen Shi, Runze Wang, Liyi Chen, Feijun Jiang, Yuan You, Wai Lam
- TLDR: We propose a multimodal conversational query rewrite task for visual conversations and a multimodal pre-trained model for it.
- Self-supervised Graph Masking Pre-training for Graph-to-Text Generation
- Jiuzhou Han, Ehsan Shareghi
- TLDR: We propose graph masking pre-training strategies that neither require supervision signals nor adjust the architecture of the underlying pre-trained encoder-decoder model.
- Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
- Chenghao Yang, Xuezhe Ma
- TLDR: We propose a new method for fine-tuning PLMs in a top-down manner that achieves consistent improvements in generalization performance, convergence speed, and training stability (a sketch of component-wise clipping follows below).
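A minimal sketch of component-wise gradient norm clipping, assuming "component" means a top-level submodule; the paper's exact granularity (e.g., per layer or per matrix) may differ.

```python
import torch
from torch import nn

def component_wise_clip(model: nn.Module, max_norm: float) -> None:
    """Clip the gradient norm of each top-level component separately,
    instead of clipping one global norm over all parameters."""
    for _, module in model.named_children():
        params = [p for p in module.parameters() if p.grad is not None]
        if params:
            torch.nn.utils.clip_grad_norm_(params, max_norm)

# Usage inside a standard training step (sketch):
#   loss.backward()
#   component_wise_clip(model, max_norm=1.0)
#   optimizer.step()
```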
- Differentially Private Language Models for Secure Data Sharing
- Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schoelkopf, Mrinmaya Sachan
- TLDR: We present a new method for generating synthetic textual datasets that are highly accurate and fluent and are suitable for training classifiers.
- Conditional set generation using Seq2seq models
- Aman Madaan, Dheeraj Rajagopal, Niket Tandon, Yiming Yang, Antoine Bosselut
- TLDR: We propose a novel algorithm for effectively sampling informative orders over the combinatorial space of label orders.
- Analyzing and Evaluating Faithfulness in Dialogue Summarization
- Bin Wang, Chen Zhang, Yan Zhang, Yiming Chen, Haizhou Li
- TLDR: We present a new model-level faithfulness evaluation method for dialogue summarization and show that over 35% of generated summaries are factually inconsistent with the source dialogues.
- Twist Decoding: Diverse Generators Guide Each Other
- Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith
- TLDR: We present a novel and effective language generation algorithm that benefits from diverse models at inference time.
- Exploring Representation-level Augmentation for Code Search
- Haochen Li, Chunyan Miao, Cyril Leung, Yanxian Huang, Yuan Huang, Hongyu Zhang, Yanlin Wang
- TLDR: We propose a general format of representation-level augmentation methods for code search and show that the proposed methods can consistently boost the performance of the studied code search models.
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables
- Erxin Yu, Lan Du, Yuan Jin, Zhepei Wei, Yi Chang
- TLDR: We develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization.
- STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension
- Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir Radev
- TLDR: We propose a novel type of dialogue summarization task that can help pre-trained language models to better understand dialogues and improve their performance on important dialogue comprehension tasks.
- Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality?
- Pei Zhang, Baosong Yang, Hao-Ran Wei, Dayiheng Liu, Kai Fan, Luo Si, Jun Xie
- TLDR: We propose a novel competency-aware NMT by extending conventional NMT with a self-estimator, offering abilities to translate a source sentence and estimate its competency.
- PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training
- Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du
- TLDR: We present PASTA, a novel method for table-based fact verification using synthesized sentence–table cloze questions and a recent pre-trained LM.
- Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis
- Shuai Fan, Chen Lin, Haonan Li, Zhenghao Lin, Jinsong Su, Hang Zhang, Yeyun Gong, JIan Guo, Nan Duan
- TLDR: We propose SentiWSP, a novel Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks.
- Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement
- Hui Liu, Wenya Wang, Haoliang Li
- TLDR: We propose a novel hierarchical framework for multi-modal sarcasm detection that explores both atomic-level congruity, based on multi-head cross attention, and composition-level congruity.
- Efficiently Tuned Parameters Are Task Embeddings
- Wangchunshu Zhou, Canwen Xu, Julian McAuley
- TLDR: We propose to exploit off-the-shelf parameters from early checkpoints as task embeddings for the efficient selection of source datasets for intermediate-task transfer.
- COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
- Hao Peng, Xiaozhi Wang, Shengding Hu, Hailong Jin, Lei Hou, Juanzi Li, Zhiyuan Liu, Qun Liu
- TLDR: We comprehensively evaluate conceptual knowledge of pre-trained language models by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively.
- Capturing Global Structural Information in Long Document Question Answering with Compressive Graph Selector Network
- Yuxiang Nie, Heyan Huang, Wei Wei, Xian-Ling Mao
- TLDR: We propose Compressive Graph Selector Network for long document question answering.
- Structural generalization is hard for sequence-to-sequence models
- Yuekun Yao, Alexander Koller
- TLDR: We show that seq2seq models are not generalizing to syntactic structures that were not seen in training, and that this limitation can often be overcome by neurosymbolic models that have linguistic knowledge built in.
- Contrastive Learning enhanced Author-Style Headline Generation
- Hui Liu, Weidong Guo, Yige Chen, Xiangyang Li
- TLDR: We propose a novel Seq2Seq model called CLH3G (Contrastive Learning enhanced Historical Headlines based Headline Generation) which can use the historical headlines of the articles that the author wrote in the past to improve the headline generation of current articles.
- Multi-Granularity Optimization for Non-Autoregressive Translation
- Yafu Li, Leyang Cui, Yongjing Yin, Yue Zhang
- TLDR: Multi-granularity optimization for non-autoregressive machine translation.
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
- Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Atharva Naik, Arjun Ashok, Arut Selvan Dhanasekaran, Anjana Arunkumar, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Kuntal Kumar Pal, Maitreya Patel, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Savan Doshi, Shailaja Keyur Sampat, Siddhartha Mishra, Sujan Reddy A, Sumanta Patro, Tanay Dixit, Xudong Shen
- TLDR: We benchmark the generalization of NLP models under instructions and show that Tk-Instruct outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller.
- MetaFill: Text Infilling for Meta-Path Generation on Heterogeneous Information Networks
- Zequn Liu, Kefei Duan, Junwei Yang, Hanwen Xu, Ming Zhang, Sheng Wang
- TLDR: MetaFill is a text-infilling-based approach for meta-path generation and graph embedding.
- DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering
- Miao Zhang, Rufeng Dai, Ming Dong, Tingting He
- TLDR: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs.
- AEG: Argumentative Essay Generation via A Dual-Decoder Model with Content Planning
- Jianzhu Bao, Yasheng Wang, Yitong Li, Fei Mi, Ruifeng Xu
- TLDR: We propose a new task for argumentative essay generation based on a dual-decoder Transformer architecture.
- BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Datasets
- Minju Kim, Chaehyeong Kim, Yong Ho Song, Seung-won Hwang, Jinyoung Yeo
- TLDR: We propose a novel framework in which multiple agents, each grounded in a specific target skill, participate in a conversation to automatically curate large-scale multi-skill dialogue datasets.
- Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition
- Jun-Yu Ma, Beiduo Chen, Jia-Chen Gu, Zhenhua Ling, Wu Guo, Quan Liu, Zhigang Chen, Cong Liu
- TLDR: We propose a novel method for efficient and effective zero-shot cross-lingual named entity recognition by combining multiple distillers in the teacher-student distillation framework.
- An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
- Yuxiang Wu, Yu Zhao, Baotian Hu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel
- TLDR: We propose a new memory-augmented model that encodes external knowledge into a key-value memory and exploits the fast maximum inner product search for memory querying.
- Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation
- Xiaohui Song, Longtao Huang, Hui Xue, Songlin Hu
- TLDR: We propose a novel algorithm for emotion recognition in conversation that uses contrastive learning and curriculum learning to address the imbalanced classification problem.
- RuCoLA: Russian Corpus of Linguistic Acceptability
- Vladislav Mikhailov, Tatiana Shamardina, Max Ryabinin, Alena Pestova, Ivan Smurov, Ekaterina Artemova
- TLDR: We present a corpus of Russian sentences labeled for linguistic acceptability, drawn from linguistic publications and generative model outputs, and show that language models still fall behind humans on many grammatical and semantic tasks.
- Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform
- Huiru Xiao, Xin Liu, Yangqiu Song, Ginny Wong, Simon See
- TLDR: We propose to use the representation capacity of complex hyperbolic geometry for multi-relational knowledge graph embeddings, applying the fast Fourier transform as the conversion between the real and complex hyperbolic spaces.
- Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
- Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou
- TLDR: We propose a new domain knowledge bank for text-to-SQL parsing and propose a framework to leverage this knowledge during parsing.
- Should We Ban English NLP for a Year?
- Anders Søgaard
- TLDR: We examine whether English-centric NLP amplifies global inequality and entertain the provocative remedy of banning English NLP for a year.
- LittleBird: Efficient Faster & Longer Transformer for Question Answering
- Minchul Lee, Kijong Han, Myeong Cheol Shin
- TLDR: We propose a novel model based on attention with linear biases for long inputs, which can work well in question answering tasks.
- WeTS: A Benchmark for Translation Suggestion
- Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li, Jie Zhou
- TLDR: We present a benchmark dataset for translation suggestion and show that it can be used to improve the performance of post-editing.
- Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation
- Chen Wang, Yuchen Liu, Boxing Chen, Jiajun Zhang, Wei Luo, Zhongqiang Huang, Chengqing Zong
- TLDR: We propose a novel Discrete cross-modal alignment method for end-to-end speech translation that can match both modalities of speech and text in a shared vocabulary space.
- Abstractive Summarization Guided by Latent Hierarchical Document Structure
- Yifu Qiu, Shay B. Cohen
- TLDR: We propose a hierarchy-aware graph neural network for summarizing text that captures the underlying structure of text and uses it to improve sequence models.
- Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning
- Jianguo Mao, Wenbin Jiang, Xiangdong Wang, Hong Liu, Yu Xia, Yajuan Lyu, QiaoQiao She
- TLDR: We propose a novel method for explainable reasoning using global differentiable learning and a dynamic adaptive reasoning algorithm.
- DuReader-Retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine
- Yifu Qiu, Hongyu Li, Yingqi Qu, Ying Chen, QiaoQiao She, Jing Liu, Hua Wu, Haifeng Wang
- TLDR: We present DuReader-retrieval, a large-scale Chinese dataset for passage retrieval.
- Pair-Based Joint Encoding with Relational Graph Convolutional Networks for Emotion-Cause Pair Extraction
- Junlong Liu, Xichen Shang, Qianli Ma
- TLDR: We propose a novel feature encoding method for emotion-cause pair extraction that can balance the information flow among emotion clauses, cause clauses and pairs.
- Affective Knowledge Enhanced Multiple-Graph Fusion Networks for Aspect-based Sentiment Analysis
- Siyu Tang, Heyan Chai, Ziyi Yao, Ye Ding, Cuiyun Gao, Binxing Fang, Qing Liao
- TLDR: We propose a novel multi-graph fusion network based on latent graph to leverage the richer syntax dependency relation label information and affective semantic information of words to learn sentiment representations.
- IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages
- Aman Kumar, Himani Shrotriya, Prachi Sahu, Amogh Mishra, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M. Khapra, Pratyush Kumar
- TLDR: We present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages.
- Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
- Akshay Batheja, Pushpak Bhattacharyya
- TLDR: We show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems.
- An Anchor-based Relative Position Embedding Method for Cross-Modal Tasks
- Ya Wang, Xingwu Sun, Lian Fengzong, ZhanHui Kang, Chengzhong Xu
- TLDR: We propose a unified position embedding method for cross-modal transformer that uses anchor locating mechanism to bridge the semantic gap and locate anchors from different modalities.
- Norm-based Noisy Corpora Filtering and Refurbishing in Neural Machine Translation
- Yu Lu, Jiajun Zhang
- TLDR: We propose a norm-based noisy corpora filtering and refurbishing method with no external data and costly scorers.
- TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method
- Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiang-Yang Li, Tao Qin, Tie-Yan Liu
- TLDR: We present a two-stage lyric-to-melody generation system that generates melodies from lyrics and music templates without any lyric-to-melody paired data.
- SEEN: Structured Event Enhancement Network for Explainable Need Detection of Information Recall Assistance
- You-En Lin, An-Zi Yen, Hen-Hsen Huang, Hsin-Hsi Chen
- TLDR: We propose a novel event enhancement network for detecting the need for information recall services.
- Rethinking Style Transformer with Energy-based Interpretation: Adversarial Unsupervised Style Transfer using a Pretrained Model
- Hojun Cho, Dohee Kim, Seungwoo Ryu, ChaeHun Park, Hyungjong Noh, Jeong-in Hwang, Minseok Choi, Edward Choi, Jaegul Choo
- TLDR: We propose a novel approach that applies a pretrained language model to the text style transfer framework by restructuring the discriminator and the model itself, allowing both the generator and the discriminator to take advantage of the power of the pretrained model.
- Towards Robust k-Nearest-Neighbor Machine Translation
- Hui Jiang, Ziyao Lu, Fandong Meng, Chulun Zhou, Jie Zhou, Degen Huang, Jinsong Su
- TLDR: We propose a confidence-enhanced k-Nearest-Neighbor machine translation model with robust training.
- Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation
- Yang Yu, Fangzhao Wu, Chuhan Wu, Jingwei Yi, Qi Liu
- TLDR: We propose Tiny-NewsRec, a novel method for improving the efficiency and effectiveness of PLM-based news recommendation.
- TABS: Efficient Textual Adversarial Attack for Pre-trained NL Code Model Using Semantic Beam Search
- YunSeok Choi, Hyojun Kim, Jee-Hyong Lee
- TLDR: We propose a new beam search black-box adversarial attack method for pre-trained models.
- Investigating the Robustness of Natural Language Generation from Logical Forms via Counterfactual Samples
- Chengyuan Liu, Leilei Gan, Kun Kuang, Fei Wu
- TLDR: We propose two approaches to reduce the model’s reliance on the spurious correlation between table headers and the operators of the logical forms, which leads to poor results on the standard test dataset.
- Helping the Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators
- Xinyou Wang, Zaixiang Zheng, Shujian Huang
- TLDR: We propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals for neural machine translation models.
- RACE: Retrieval-augmented Commit Message Generation
- Ensheng Shi, Yanlin Wang, Wei Tao, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Hongbin Sun
- TLDR: We propose RACE, a new retrieval-augmented neural commit message generation method, which learns the semantic similarity between the retrieved and current code diff and leverages it to generate an accurate commit message.
- PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation
- Ao Liu, Haoyu Dong, Naoaki Okazaki, Shi Han, Dongmei Zhang
- TLDR: We propose a novel method for logical table-to-text generation that learns logical inference from table-logic pairs and improves logical fidelity.
- GHAN: Graph-Based Hierarchical Aggregation Network for Text-Video Retrieval
- Yahan Yu, Bojie Hu, Yu Li
- TLDR: Graph-based hierarchical aggregation network for text-video retrieval.
- MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
- Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William Cohen
- TLDR: We propose the first Multimodal Retrieval-Augmented Transformer (MuRAG), which accesses an external non-parametric multimodal memory to augment language generation.
- PHEE: A Dataset for Pharmacovigilance Event Extraction from Text
- Zhaoyue Sun, Jiazheng Li, Gabriele Pergola, Byron Wallace, Bino John, Nigel Greene, Joseph Kim, Yulan He
- TLDR: We present a novel dataset for pharmacovigilance that provides coarse and fine-grained information about patients’ demographics, treatments and (side) effects.
- OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification
- Jie Cao, Yin Zhang
- TLDR: We propose an autoregressive sequence-to-set model for XMTC tasks which outperforms the state-of-the-art Seq2Seq method by 16.34% in micro-F1 score.
- SimQA: Detecting Simultaneous MT Errors through Word-by-Word Question Answering
- HyoJung Han, Marine Carpuat, Jordan Boyd-Graber
- TLDR: We introduce a downstream word-by-word question answering evaluation task to measure whether neural machine translation systems can answer a question word-for-word.
- Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations
- Zhihui Xie, Handong Zhao, Tong Yu, Shuai Li
- TLDR: We present a novel view of projecting away language-specific factors from a multilingual embedding space that primarily encodes information irrelevant to semantics.
- Rethinking the Authorship Verification Experimental Setups
- Florin Brad, Andrei Manolache, Elena Burceanu, Antonio Barbalau, Radu Tudor Ionescu, Marius Popescu
- TLDR: We propose five new public splits over the PAN dataset, specifically designed to isolate and identify biases related to the text topic and to the author’s writing style.
- Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification
- Chunpu Xu, Jing Li
- TLDR: We propose a novel method for multimodal social media classification that leverages visual and linguistic similarity to capture hinting features from user comments.
- Training Language Models with Memory Augmentation
- Zexuan Zhong, Tao Lei, Danqi Chen
- TLDR: We present TRIME, a novel yet simple training approach designed for training language models with memory augmentation.
- Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages
- Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy
- TLDR: We show that a small amount of target-language fine-tuning data is needed to achieve strong performance in under-resourced languages, and that initial fine-tuning on readily-available English data can partially substitute target-data and improve model generalisability.
- Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder
- Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua Li
- TLDR: We propose a Conditional Autoencoder for dense retrieval that can compress the high-dimensional embeddings of dense retrieval and improve the ranking performance of the teacher model.
- Controlled Text Reduction
- Aviv Slobodkin, Paul Roit, Eran Hirsch, Ori Ernst, Ido Dagan
- TLDR: We propose a new approach to summarization that formalizes the task of generating coherent text given pre-selected content.
- Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency
- Yanzhu Guo, Chloé Clavel, Moussa Kamal Eddine, Michalis Vazirgiannis
- TLDR: We present a new dataset for summarization evaluation that improves the quality of popular summarization datasets and provides a valid benchmark for developing and evaluating summarization systems.
- Invariant Language Modeling
- Maxime Peyrard, Sarvjeet Ghotra, Martin Josifoski, Vidhan Agarwal, Barun Patra, Dean Carignan, Emre Kiciman, Saurabh Tiwary, Robert West
- TLDR: We propose a new framework for learning invariant representations in language models that generalize across multiple environments.
- AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
- Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao
- TLDR: We propose AdaMix as a general PEFT method that tunes a mixture of adaptation modules – given the underlying PEFT methods of choice – introduced in each Transformer layer while keeping most of the PLM weights frozen.
- How “Multi” is Multi-Document Summarization?
- Ruben Wolhandler, Arie Cattan, Ori Ernst, Ido Dagan
- TLDR: We propose a metric for evaluating the degree to which a summary is “disperse” in terms of the number of source documents needed to cover its content.
- BioReader: a Retrieval-Enhanced Text-to-Text Transformer for Biomedical Literature
- Giacomo Frisoni, Miki Mizutani, Gianluca Moro, Lorenzo Valgimigli
- TLDR: We present a novel retrieval-enhanced text-to-text model for biomedical natural language processing that uses domain knowledge to augment the input prompt and generate correct predictions.
- T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
- Paul-Ambroise Duquenne, Hongyu Gong, Benoît Sagot, Holger Schwenk
- TLDR: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks.
- LILA: A Unified Benchmark for Mathematical Reasoning
- Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin Kalyan
- TLDR: We propose LILA, a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions: mathematical abilities (e.g., arithmetic, calculus), language format (e.g., question answering, fill-in-the-blanks), external knowledge, and robustness to language perturbation.
- Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding
- Md Mosharaf Hossain, Eduardo Blanco
- TLDR: We present an automated procedure to collect pairs of sentences with negation and their affirmative interpretations, resulting in over 150,000 pairs.
- GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation
- Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai
- TLDR: GraphQ IR is a unified intermediate representation for graph query languages that bridges the semantic gap between natural and formal languages.
- InforMask: Unsupervised Informative Masking for Language Model Pretraining
- Nafis Sadeq, Canwen Xu, Julian McAuley
- TLDR: We propose InforMask, a new unsupervised masking strategy for training masked language models.
- CTRLsum: Towards Generic Controllable Text Summarization
- Junxian He, Wojciech Kryscinski, Bryan McCann, Nazneen Rajani, Caiming Xiong
- TLDR: We present CTRLsum, a generic framework to control generated summaries through a set of keywords.
- Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation
- Max Glockner, Yufang Hou, Iryna Gurevych
- TLDR: We show that existing NLP task definitions for fact-checking cannot refute misinformation as professional fact-checkers do for the majority of claims.
- A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion
- Justin Lovelace, Carolyn Rosé
- TLDR: We explore the suitability of entity embeddings extracted from pre-trained language models for knowledge graph completion and develop a knowledge graph model that significantly outperforms recent neural models.
- Mutual Information Alleviates Hallucinations in Abstractive Summarization
- Liam van der Poel, Ryan Cotterell, Clara Meister
- TLDR: We propose a new algorithm for decoding abstractive summarization models that reduces the probability of hallucinated tokens while maintaining the Rouge and BERT-S scores of top-performing decoding strategies.
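A decoding objective of this kind can be pictured as a pointwise-mutual-information-style correction: subtracting a weighted unconditional language-model log-probability from the summarizer's conditional log-probability, so tokens the model would produce regardless of the source (a hallucination signature) are penalized. The sketch below is a minimal NumPy illustration under that reading; the entropy gate, the weight `lam`, and the toy distributions are assumptions for the example, not the paper's exact formulation.

```python
import numpy as np

def pmi_rescore(logp_cond, logp_uncond, lam=0.5, entropy_gate=1.0):
    """Rescore next-token candidates with a PMI-style objective.

    logp_cond   : log p(token | source, prefix) from the summarizer
    logp_uncond : log p(token | prefix) from an unconditional LM
    The correction is applied only when the summarizer is uncertain
    (high conditional entropy) -- an illustrative gating choice.
    """
    p = np.exp(logp_cond)
    entropy = -(p * logp_cond).sum()
    if entropy > entropy_gate:
        return logp_cond - lam * logp_uncond
    return logp_cond

# Toy example over a 5-token vocabulary.
logp_cond = np.log(np.array([0.30, 0.25, 0.20, 0.15, 0.10]))
logp_uncond = np.log(np.array([0.50, 0.20, 0.15, 0.10, 0.05]))
scores = pmi_rescore(logp_cond, logp_uncond)
print(int(np.argmax(scores)))  # 1: the PMI correction changes the pick from token 0
```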
- Toward the Limitation of Code-Switching in Cross-Lingual Transfer
- Yukun Feng, Feng Li, Philipp Koehn
- TLDR: We propose a novel method for cross-lingual token-sensitive tasks by making the token replacement grammatically consistent during both training and inference.
- Syntactically Rich Discriminative Training: An Effective Method for Open Information Extraction
- Frank Mtumbuka, Thomas Lukasiewicz
- TLDR: We propose several new methods for training neural open information extraction (OIE) models.
- Transformer-based Entity Typing in Knowledge Graphs
- Zhiwei Hu, Victor Gutierrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Pan
- TLDR: We propose a novel Transformer-based Entity Typing approach which uses information about class membership of types to infer plausible entity types.
- NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge
- Revanth Gangi Reddy, Sai Chetan Chinthakindi, Zhenhailong Wang, Yi Fung, Kathryn Conger, Ahmed ELsayed, Martha Palmer, Preslav Nakov, Eduard Hovy, Kevin Small, Heng Ji
- TLDR: We present NewsClaims, a new benchmark for attribute-aware claim detection in the news domain.
- IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces
- Kelly Marchisio, Neha Verma, Kevin Duh, Philipp Koehn
- TLDR: We address the root-cause of faulty cross-lingual mapping: that word embedding training resulted in the underlying spaces being non-isomorphic.
- Adversarial Concept Erasure in Kernel Space
- Shauli Ravfogel, Francisco Vargas, Yoav Goldberg, Ryan Cotterell
- TLDR: We propose a kernelization of the recently-proposed linear concept-removal objective, and show that it is effective in guarding against the ability of certain nonlinear adversaries to recover the concept.
- The Authenticity Gap in Human Evaluation
- Kawin Ethayarajh, Dan Jurafsky
- TLDR: We identify the implicit assumptions in the standard protocol for NLG evaluation that are often violated in practice, and propose a new human evaluation protocol that recovers the expected preferences of annotators.
- BERT in Plutarch’s Shadows
- Ivan Yamshchikov, Alexey Tikhonov, Yorgos Pantis, Charlotte Schubert, Jürgen Jost
- TLDR: We present a BERT language model for Ancient Greek and show that Pseudo-Plutarch is not the author of the Placita Philosophorum.
- Leveraging Locality in Abstractive Text Summarization
- Yixin Liu, Ansong Ni, Linyong Nan, Budhaditya Deb, Chenguang Zhu, Ahmed Hassan Awadallah, Dragomir Radev
- TLDR: We propose a novel approach to summarization tasks by using a novel attention module that is applied to individual pages, which contain parts of inputs grouped by the principle of locality, during both the encoding and decoding stages.
- Salience Allocation as Guidance for Abstractive Summarization
- Fei Wang, Kaiqiang Song, Hongming Zhang, Lifeng Jin, Sangwoo Cho, Wenlin Yao, Xiaoyang Wang, Muhao Chen, Dong Yu
- TLDR: We propose a novel salience guidance for abstractive summarization models that adapts well to articles in different abstractiveness.
- Fine-tuned Language Models are Continual Learners
- Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- TLDR: We show that Continual Learning is a good way to learn new skills without forgetting previous skills.
- Natural Logic-guided Autoregressive Multi-hop Document Retrieval for Fact Verification
- Rami Aly, Andreas Vlachos
- TLDR: We propose a novel retrieve-and-rerank method for multi-hop evidence retrieval that uses less memory than current state-of-the-art methods and is guided by a proof system based on natural logic.
- AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis
- Sabyasachi Kamila, Walid Magdy, Sourav Dutta, MingXue Wang
- TLDR: We present an extremely weakly supervised multi-label Aspect Category Sentiment Analysis framework which does not use any labelled data.
- Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning
- Roshanak Mirzaee, Parisa Kordjamshidi
- TLDR: Synthesis of synthetic data for transfer learning on spatial question answering and role labeling.
- A Survey of Active Learning for Natural Language Processing
- Zhisong Zhang, Emma Strubell, Eduard Hovy
- TLDR: We provide a literature review of active learning for NLP and explore several important aspects of applying active learning to NLP problems.
- Bernice: A Multilingual Pre-trained Encoder for Twitter
- Alexandra DeLucia, Shijie Wu, Aaron Mueller, Carlos Aguirre, Philip Resnik, Mark Dredze
- TLDR: We present Bernice, the first multilingual RoBERTa language model trained from scratch on 2.5 billion tweets with a custom tweet-focused tokenizer.
- CEFR-Based Sentence Difficulty Annotation and Assessment
- Yuki Arase, Satoru Uchida, Tomoyuki Kajiwara
- TLDR: We propose a sentence-level difficulty assessment model for controllable text simplification, together with a corpus annotated with CEFR-based sentence difficulty levels.
- Simple Questions Generate Named Entity Recognition Datasets
- Hyunjae Kim, Jaehyo Yoo, Seunghyun Yoon, Jinhyuk Lee, Jaewoo Kang
- TLDR: We present a new ask-to-generate approach for named entity recognition that outperforms existing models on a number of NER benchmarks.
- TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
- Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Minjoon Seo
- TLDR: We present a lifelong benchmark that measures the ability of language models to adapt to successive snapshots of Wikipedia and Wikidata.
- Bi-Directional Iterative Prompt-Tuning for Event Argument Extraction
- Lu Dai, Bang Wang, Wei Xiang, Yijun Mo
- TLDR: We propose a cloze-style iterative prompt-tuning method for event argument extraction that takes full advantage of entity information and pre-trained language models.
- Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation
- Peiyi Wang, Yifan Song, Tianyu Liu, Binghuai Lin, Yunbo Cao, Sujian Li, Zhifang Sui
- TLDR: We propose a simple adversarial class augmentation mechanism for continual relation extraction that improves the performance of state-of-the-art CRE models on two popular benchmarks.
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering
- Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang
- TLDR: We propose a new large-scale dataset, ConvFinQA, aiming to study the chain of numerical reasoning in conversational question answering.
- A Span-based Multimodal Variational Autoencoder for Semi-supervised Multimodal Named Entity Recognition
- Baohang Zhou, Ying Zhang, Kehui Song, Wenya Guo, Guoqing Zhao, Hongbin Wang, Xiaojie Yuan
- TLDR: We propose a novel span-based multimodal variational autoencoder for semi-supervised named entity recognition on social media.
- R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization
- Guan-Yu Lin, Pu-Jen Cheng
- TLDR: Regularized Teacher-Forcing for summarization.
- Modeling Consistency Preference via Lexical Chains for Document-level Neural Machine Translation
- Xinglin Lyu, Junhui Li, Shimin Tao, Hao Yang, Ying Qin, Min Zhang
- TLDR: We propose a novel approach to improve lexical consistency in document-level neural machine translation by modeling consistency preference for lexical chains.
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
- Weiyan Shi, Ryan Shea, Si Chen, Chiyuan Zhang, Ruoxi Jia, Zhou Yu
- TLDR: We propose a novel framework for fine-tuning language models to achieve selective differential privacy, which is a provable privacy guarantee for large transformer-based models.
- Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents
- Marcio Fonseca, Yftah Ziser, Shay B. Cohen
- TLDR: We propose a novel method that generates abstractive summary views covering salient information in subsets of the input document (document views) and combines them into a final summary, outperforming PEGASUS trained in-domain by a large margin.
- Open-Domain Sign Language Translation Learned from Online Video
- Bowen Shi, Diane Brentari, Gregory Shakhnarovich, Karen Livescu
- TLDR: We present a set of techniques for sign language translation in realistic settings and without glosses.
- Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change
- Zhaochen Su, Zecheng Tang, Xinyan Guan, Lijun Wu, Min Zhang, Juntao Li
- TLDR: We propose a simple yet effective lexical-level masking strategy to post-train a converged language model.
- ULN: Towards Underspecified Vision-and-Language Navigation
- Weixi Feng, Tsu-Jui Fu, Yujie Lu, William Yang Wang
- TLDR: We propose a new setting for vision-and-language navigation that evaluates agents using multi-level underspecified instructions instead of purely fine-grained or coarse-grained instructions.
- Federated Model Decomposition with Private Vocabulary for Text Classification
- Zhuo Zhang, Xiangjing Hu, Lizhen Qu, Qifan Wang, Zenglin Xu
- TLDR: We propose a federated model decomposition method that protects the privacy of vocabularies in federated learning and propose an adaptive updating technique to improve the performance of local models.
- ReCo: Reliable Causal Chain Reasoning via Structural Causal Recurrent Neural Networks
- Kai Xiong, Xiao Ding, Zhongyang Li, Li Du, Ting Liu, Bing Qin, Yi Zheng, Baoxing Huai
- TLDR: We propose ReCo, a reliable causal chain reasoning framework that improves the performance of BERT-based models on four downstream causal-related tasks.
- Video Question Answering: Datasets, Algorithms and Challenges
- Yaoyao Zhong, Wei Ji, Junbin Xiao, Yicong Li, Weihong Deng, Tat-Seng Chua
- TLDR: We present a survey on the recent advances in video question answering and point towards future directions.
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation
- Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
- TLDR: We present a new method to improve existing multilingual sentence embeddings with abstract meaning representation and show that it leads to better state-of-the-art performance on both semantic textual similarity and transfer tasks.
- Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling
- Zhijun Wang, Xuebo Liu, Min Zhang
- TLDR: We present a novel representation method for Chinese characters to break the bottlenecks in neural machine translation.
- Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction
- Yice Zhang, Yifan Yang, Yihui Li, Bin Liang, Shiwei Chen, Yixue Dang, Min Yang, Ruifeng Xu
- TLDR: We propose Boundary-Driven Table-Filling, a novel approach to the aspect sentiment triplet extraction task that fully exploits both word-to-word and relation-to-relation interactions.
- Attention and Edge-Label Guided Graph Convolutional Networks for Named Entity Recognition
- Renjie Zhou, Zhongyi Xie, Jian Wan, Jilin Zhang, Yong Liao, Qiang Liu
- TLDR: We propose a novel approach to better exploit structured information captured by dependency trees for named entity recognition.
- Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset
- Haolin Deng, Yanan Zhang, Yangfan Zhang, Wangyang Ying, Changlong Yu, Jun Gao, Wei Wang, Xiaoling Bai, Nan Yang, Jin Ma, Xiang Chen, Tianhua Zhou
- TLDR: We present Title2Event, a large-scale sentence-level dataset benchmarking Open Event Extraction without restricting event types.
- Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models
- Chaitanya Malaviya, Sudeep Bhatia, Mark Yatskar
- TLDR: We propose tracking annotator heuristic traces, where we tangibly measure low-effort annotation strategies that could indicate usage of various cognitive heuristics.
- Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts
- Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- TLDR: We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion.
- ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
- Fan Yin, Yao Li, Cho-Jui Hsieh, Kai-Wei Chang
- TLDR: We propose to test adversarial example detection methods with far-boundary adversarial examples, i.e., adversarial examples located far from model decision boundaries.
- G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
- Zhongwei Wan, Yichun Yin, Wei Zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu
- TLDR: Domain-adaptive pre-training of general pre-trained language models with domain-specific corpora.
- Towards Unifying Reference Expression Generation and Comprehension
- Duo Zheng, Tao Kong, Ya Jing, Jiaan Wang, Xiaojie Wang
- TLDR: Unified model for REG and REC, which outperforms previous state-of-the-art methods on both tasks.
- Textual Manifold-based Defense Against Natural Language Adversarial Examples
- Dang Nguyen Minh, Anh Tuan Luu
- TLDR: We propose a novel manifold-based defense mechanism for adversarial examples in NLP that learns the embedding space manifold of the underlying language model and projects novel inputs back to the approximated structure before classification.
- Tiny-Attention Adapter: Contexts Are More Important Than the Number of Parameters
- Hongyu Zhao, Hao Tan, Hongyuan Mei
- TLDR: We propose a new adapter architecture for parameter-efficient transfer learning that learns to modify hidden states at each position directly conditioned on the hidden states of the other positions.
- Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives
- Si Sun, Chenyan Xiong, Yue Yu, Arnold Overwijk, Zhiyuan Liu, Jie Bao
- TLDR: We propose a new method for dense retrieval training that improves convergence speed and stability.
- ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
- Akari Asai, Mohammadreza Salehi, Matthew Peters, Hannaneh Hajishirzi
- TLDR: We present a new multi-task, parameter-efficient language model (LM) tuning method that learns to transfer knowledge across different tasks via a mixture of soft prompts—small prefix embedding vectors pre-trained for different tasks.
- Exploration of the Usage of Color Terms by Color-blind Participants in Online Discussion Platforms
- Ella Rabinovich, Boaz Carmeli
- TLDR: We show that red-green color-blind speakers use the “red” and “green” color terms in less predictable contexts, and in linguistic environments evoking mental image to a lower extent, when compared to their normal-sighted counterparts.
- DEER: Descriptive Knowledge Graph for Explaining Entity Relationships
- Jie Huang, Kerui Zhu, Kevin Chen-Chuan Chang, Jinjun Xiong, Wen-mei Hwu
- TLDR: We propose DEER (Descriptive Knowledge Graph for Explaining Entity Relationships) - an open and informative form of modeling entity relationships.
- META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI
- Liangtai Sun, Xingyu Chen, Lu Chen, Tianle Dai, Zichen Zhu, Kai Yu
- TLDR: GUI-based task-oriented dialogue system for mobile phone intelligent assistants.
- Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders
- Minsoo Kim, Sihwa Lee, Suk-Jin Hong, Du-Seong Chang, Jungwook Choi
- TLDR: We provide an in-depth analysis of the mechanism of KD on attention recovery of quantized large Transformers.
- Exploring Mode Connectivity for Pre-trained Language Models
- Yujia Qin, Cheng Qian, Jing Yi, Weize Chen, Yankai Lin, Xu Han, Zhiyuan Liu, Maosong Sun, Jie Zhou
- TLDR: We investigate the mode connectivity of pre-trained language models and propose a new mode connectivity metric to measure the connection of different minima.
- Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks
- Jaehoon Oh, Jongwoo Ko, Se-Young Yun
- TLDR: We propose a cross-lingual fine-tuning algorithm for multilingual sentence classification tasks, which uses SupCon and MixUp jointly and improves the performance.
- Increasing Visual Awareness in Multimodal Neural Machine Translation from an Information Theoretic Perspective
- Baijun Ji, Tong Zhang, Yicheng Zou, Bojie Hu, Si Shen
- TLDR: We propose a novel approach to improve visual awareness of MMT models by using mutual information to quantify visual signals.
- Improving Event Coreference Resolution Using Document-level and Topic-level Information
- Sheng Xu, Peifeng Li, Qiaoming Zhu
- TLDR: We propose a novel event coreference resolution model that uses document-level information and a topic generator to learn topic-level event embeddings.
- Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding
- Rishabh Bhardwaj, Amrita Saha, Steven C.H. Hoi, Soujanya Poria
- TLDR: Vector-quantized Input-contextualized Prompts outperforms soft prompt tuning on various language understanding tasks.
- Boosting Natural Language Generation from Instructions with Meta-Learning
- Budhaditya Deb, Ahmed Hassan Awadallah, Guoqing Zheng
- TLDR: We apply meta-learning to boost natural language generation from instructions, building on recent work showing that language models (LMs) trained with multi-task instructional learning can solve diverse NLP tasks in zero- and few-shot settings.
- Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies
- Eitan Wagner, Renana Keydar, Amit Pinchevski, Omri Abend
- TLDR: We propose a new approach to topical segmentation of Holocaust survivor testimonies, which is challenging due to their unstructured surface level, relative abundance, and the relatively confined domain that they cover.
- Unifying the Convergences in Multilingual Neural Machine Translation
- Yichong Huang, Xiaocheng Feng, Xinwei Geng, Bing Qin
- TLDR: Language-Specific Self-Distillation is a novel training strategy for all-in-one-model multilingual neural machine translation that can alleviate convergence inconsistency during joint training.
- Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field
- Chengyue Jiang, Yong Jiang, Weiqi Wu, Pengjun Xie, Kewei Tu
- TLDR: We propose a neural pairwise conditional random field that models label correlations for ultra-fine entity typing, predicting the categories of entities mentioned in sentences.
- Help me write a Poem - Instruction Tuning as a Vehicle for Collaborative Poetry Writing
- Tuhin Chakrabarty, Vishakh Padmakumar, He He
- TLDR: We present a new approach to training large language models to follow natural language instructions.
- Open Relation and Event Type Discovery with Type Abstraction
- Sha Li, Heng Ji, Jiawei Han
- TLDR: We introduce the idea of type abstraction, a new type-based representation for information extraction that can automatically infer new types from given corpora.
- Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples
- Linlin Liu, Xin Li, Ruidan He, Lidong Bing, Shafiq Joty, Luo Si
- TLDR: We present novel knowledge based multilingual language models trained directly on Wikidata KG triples for knowledge-enhanced language model pretraining.
- Revisiting Grammatical Error Correction Evaluation and Beyond
- Peiyuan Gong, Xuebo Liu, Heyan Huang, Min Zhang
- TLDR: We propose a novel GEC evaluation metric, PT-M2, which achieves the best of both worlds by using PT-based metrics to score only the corrected parts.
- R2D2: Robust Data-to-Text with Replacement Detection
- Linyong Nan, Lorenzo Jaime Flores, Yilun Zhao, Yixin Liu, Luke Benson, Weijin Zou, Dragomir Radev
- TLDR: We propose a new training framework for Data-to-Text generation systems that addresses unfaithful generation and improve the quality of the generated text.
- IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension
- Rifki Afina Putri, Alice Oh
- TLDR: We present a new Indonesian MRC dataset that combines the automatic and manual unanswerable question generation to improve the performance of MRC models.
- XLM-D: Decorate Cross-lingual Pre-training Model as Non-Autoregressive Neural Machine Translation
- Yong Wang, Shilin He, Guanhua Chen, Yun Chen, Daxin Jiang
- TLDR: We present XLM-D, a novel non-autoregressive translation model that achieves state-of-the-art performance on machine translation.
- Cross-stitching Text and Knowledge Graph Encoders for Distantly Supervised Relation Extraction
- Qin Dai, Benjamin Heinzerling, Kentaro Inui
- TLDR: We present a novel cross-stitch bi-encoder architecture for distantly-supervised relation extraction that allows full interaction between the text encoder and the knowledge graph encoder.
- Assist Non-native Viewers: Multimodal Cross-Lingual Summarization for How2 Videos
- Nayu Liu, Kaiwen Wei, Xian Sun, Hongfeng Yu, Fanglong Yao, Li Jin, Guo Zhi, Guangluan Xu
- TLDR: We propose a new task, named Multimodal Cross-Lingual Summarization for Videos (MCLS), which aims to generate cross-lingual summaries from multimodal inputs of videos.
- PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance
- Yang Deng, Wenqiang Lei, Wenxuan Zhang, Wai Lam, Tat-Seng Chua
- TLDR: We present PACIFIC, a new dataset for conversational question answering over hybrid contexts in finance, which combines clarification question generation and CQA.
- Generative Data Augmentation with Contrastive Learning for Zero-Shot Stance Detection
- Yang Li, Jiawei Yuan
- TLDR: We propose a generative data augmentation approach to generate training samples containing targets and stances for zero-shot stance detection, and map the real samples and generated synthetic samples into the same embedding space with contrastive learning, then perform the final classification based on the augmented data.
- Better Few-Shot Relation Extraction with Label Prompt Dropout
- Peiyuan Zhang, Wei Lu
- TLDR: We present a novel approach called label prompt dropout, which randomly removes label descriptions in the learning process, which leads to improved class representations.
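The mechanism reads naturally as a data-processing step: when a support example is rendered into a prompt, its textual label description is randomly omitted with some probability, so the model cannot lean on the description alone. A minimal sketch under assumptions; the template, field names, and dropout rate below are illustrative, not the paper's exact setup.

```python
import random

def build_prompt(sentence: str, relation_desc: str, p_drop: float = 0.25) -> str:
    """Concatenate a relation description with the instance, randomly
    dropping the description during training (label prompt dropout)."""
    if random.random() < p_drop:
        return sentence                      # description dropped
    return f"{relation_desc} : {sentence}"   # description kept

random.seed(7)
for _ in range(4):
    print(build_prompt(
        "Paris is the capital of France.",
        "capital of: the relation between a city and its country",
    ))
```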
- Break it Down into BTS: Basic, Tiniest Subword Units for Korean
- Nayeon Kim, Jun-Hyung Park, Joon-Young Choi, Eojin Jeon, Youjin Kang, SangKeun Lee
- TLDR: We introduce Basic, Tiniest Subword (BTS) units for the Korean language, which are inspired by the invention principle of Hangeul, the Korean writing system.
- The Devil in Linear Transformer
- Zhen Qin, Xiaodong Han, Weixuan Sun, Dongxu Li, Lingpeng Kong, Nick Barnes, Yiran Zhong
- TLDR: We propose a new linear attention that improves performance on text classification and language modeling tasks and outperforms vanilla transformers.
- Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
- Ping Yang, Junjie Wang, Ruyi Gan, Xinyu Zhu, Lin Zhang, Ziwei Wu, Xinyu Gao, Jiaxing Zhang, Tetsuya Sakai
- TLDR: We propose a new paradigm for zero-shot learning that is format agnostic, i.e., it is compatible with any format and applicable to a list of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis.
- Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation
- Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, Xin Jiang
- TLDR: We propose a new Transformer algorithm that outperforms the recent light-weight SOTA methods on three standard translation tasks under different parameter and speed scales.
- FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes
- Chen Liu, Gregor Geigle, Robin Krebs, Iryna Gurevych
- TLDR: We present FigMemes, a dataset for figurative language classification in politically-opinionated memes, and provide comprehensive benchmark results.
- UniRel: Unified Representation and Interaction for Joint Relational Triple Extraction
- Wei Tang, Benfeng Xu, Yuyue Zhao, Zhendong Mao, Yifeng Liu, Yong Liao, Haiyong Xie
- TLDR: We unify the representations of entities and relations by jointly encoding them within a concatenated natural language sequence, and unify the modeling of their interactions with a proposed Interaction Map built upon the off-the-shelf self-attention mechanism within any Transformer block.
- X-FACTOR: A Cross-metric Evaluation of Factual Correctness in Abstractive Summarization
- Subhajit Chaudhury, Sarathkrishna Swaminathan, Chulaka Gunasekara, Maxwell Crouse, Srinivas Ravishankar, Daiki Kimura, Keerthiram Murugesan, Ramón Fernandez Astudillo, Tahira Naseem, Pavan Kapanipathi, Alexander Gray
- TLDR: We present X-FACTOR, a cross-evaluation of three high-performing fact-aware abstractive summarization methods and propose a fact-based filtering mechanism that improves the quality of training data and, consequently, the factuality of these models.
- ParaTag: A Dataset of Paraphrase Tagging for Fine-Grained Labels, NLG Evaluation, and Data Augmentation
- Shuohang Wang, Ruochen Xu, Yang Liu, Chenguang Zhu, Michael Zeng
- TLDR: We propose a novel fine-grained paraphrase annotation schema that labels the minimum spans of tokens in a sentence that don’t have the corresponding paraphrases in the other sentence.
- Factual Accuracy is not Enough: Planning Consistent Description Order for Radiology Report Generation
- Toru Nishino, Yasuhide Miura, Tomoki Taniguchi, Tomoko Ohkuma, Yuki Suzuki, Shoji Kido, Noriyuki Tomiyama
- TLDR: We propose a planning-based radiology report generation system that generates the overall structure of reports as “plans” prior to generating reports that are accurate and consistent in order.
- FLUTE: Figurative Language Understanding through Textual Explanations
- Tuhin Chakrabarty, Arkadiy Saakyan, Debanjan Ghosh, Smaranda Muresan
- TLDR: We present FLUTE, a dataset of 9,000 figurative NLI instances with explanations, spanning four categories: Sarcasm, Simile, Metaphor, and Idioms.
- Precisely the Point: Adversarial Augmentations for Faithful and Informative Text Generation
- Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Sujian Li, Yajuan Lyu
- TLDR: We propose a novel adversarial augmentation framework for improving faithfulness and informativeness of pre-trained Seq2Seq models via enhancing their robustness.
- RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees
- Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
- TLDR: We propose RLET, a Reinforcement Learning based Entailment Tree generation framework, which is trained utilising the cumulative signals across the whole tree.
- Let the CAT out of the bag: Contrastive Attributed explanations for Text
- Saneem Chemmengath, Amar Prakash Azad, Ronny Luss, Amit Dhurandhar
- TLDR: We propose a novel method for contrastive explanations for natural language text data with a novel twist as we build and exploit attribute classifiers leading to more semantically meaningful explanations.
- monoQA: Multi-Task Learning of Reranking and Answer Extraction for Open-Retrieval Conversational Question Answering
- Sarawoot Kongyoung, Craig Macdonald, Iadh Ounis
- TLDR: We propose monoQA, a novel approach to answer the Conversational Question Answering task by using a text generation model with multi-task learning for both the reranker and reader.
- Composing Ci with Reinforced Non-autoregressive Text Generation
- Yan Song
- TLDR: We propose a novel non-autoregressive approach to composing classical Chinese poetry under rigid formats with dynamic rewards.
- MetaTKG: Learning Evolutionary Meta-Knowledge for Temporal Knowledge Graph Reasoning
- Yuwei Xia, Mengqi Zhang, Qiang Liu, Shu Wu, Xiao-Yu Zhang
- TLDR: We propose MetaTKG, a novel temporal meta-learning framework for TKG reasoning that learns evolutionary meta-knowledge.
- mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
- Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, He Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si
- TLDR: We present a novel vision-language foundation model for both cross-modal understanding and generation.
- Q-TOD: A Query-driven Task-oriented Dialogue System
- Xin Tian, Yingzhan Lin, Mengfei Song, Siqi Bao, Fan Wang, Huang He, Shuqi Sun, Hua Wu
- TLDR: We present a query-driven task-oriented dialogue system that can adapt to unseen domains and solve the issue of knowledge base scalability.
- Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings
- Che Liu, Rui Wang, Junfeng Jiang, Yongbin Li, Fei Huang
- TLDR: We propose a novel contrastive learning approach for dialogue embedding based on conversational interactions.
- WR-One2Set: Towards Well-Calibrated Keyphrase Generation
- Binbin Xie, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Xiaoli Wang, Min Zhang, Jinsong Su
- TLDR: We propose WR-ONE2SET which extends ONE2SET with adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism.
- Eeny, meeny, miny, moe. How to choose data for morphological inflection.
- Saliha Muradoglu, Mans Hulden
- TLDR: We present four sampling strategies for morphological inflection using a Transformer model and show that the oracle experiment, presented as a proxy for linguist/language informer feedback, yields the largest improvement.
- An Adaptive Logical Rule Embedding Model for Inductive Reasoning over Temporal Knowledge Graphs
- Xin Mei, Libin Yang, Xiaoyan Cai, Zuowei Jiang
- TLDR: We propose an interpretable model for temporal knowledge graph reasoning based on logical rule embeddings and a one-class augmented matching loss for optimization.
- UniNL: Aligning Representation Learning with Scoring Function for OOD Detection via Unified Neighborhood Learning
- Yutao Mou, Pei Wang, Keqing He, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
- TLDR: We propose a unified neighborhood learning framework for OOD detection that aligns representation learning with the scoring function.
- Open-domain Video Commentary Generation
- Edison Marrese-Taylor, Yumi Hamazono, Tatsuya Ishigaki, Goran Topić, Yusuke Miyao, Ichiro Kobayashi, Hiroya Takamura
- TLDR: We propose a new open-domain video commentary generation task based on transcribed commentary and propose a set of robust baselines for the task.
- One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks
- Manuel Senge, Timour Igamberdiev, Ivan Habernal
- TLDR: We provide an extensive analysis of different privacy preserving strategies on seven downstream datasets in five different NLP tasks with varying complexity using modern neural models based on BERT and XtremeDistil architectures.
- Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario
- Xiao Liu, Yansong Feng, Jizhi Tang, Chengang Hu, Dongyan Zhao
- TLDR: We propose a novel recipe generation task that allows models to learn to modify a base recipe according to the change of an ingredient.
- Tutoring Helps Students Learn Better: Improving Knowledge Distillation for BERT with Tutor Network
- Junho Kim, Jun-Hyung Park, Mingyu Lee, Wing-Lam Mok, Joon-Young Choi, SangKeun Lee
- TLDR: We propose a novel knowledge distillation framework for language models in which a tutor network controls the difficulty of training examples during pre-training.
- Does Corpus Quality Really Matter for Low-Resource Languages?
- Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa
- TLDR: We present a novel approach to filtering and crawling multilingual corpora in low-resource languages using tailored crawling and show that it improves downstream performance on representation learning tasks.
- Unifying Data Perspectivism and Personalization: An Application to Social Norms
- Joan Plepi, Béla Neuendorf, Lucie Flek, Charles Welch
- TLDR: We provide a novel experimental setup that applies personalization methods to the modeling of annotators and compare their effectiveness for predicting the perception of social norms.
- Does Self-Rationalization Improve Robustness to Spurious Correlations?
- Alexis Ross, Matthew Peters, Ana Marasovic
- TLDR: We evaluate robustness to spurious correlations in self-rationalization models and show that explainability can come at the cost of robustness.
- Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking
- Mingyu Lee, Jun-Hyung Park, Junho Kim, Kang-Min Kim, SangKeun Lee
- TLDR: We propose a novel concept-based curriculum masking method to efficiently pre-train a language model.
- Subword Evenness (SuE) as a Predictor of Cross-lingual Transfer to Low-resource Languages
- Olga Pelloni, Anastassia Shaitarova, Tanja Samardzic
- TLDR: We show that languages written in non-Latin and non-alphabetic scripts (mostly Asian languages) are the best choices for improving performance on the task of Masked Language Modelling (MLM) in a diverse set of 30 low-resource languages and that the success of the transfer is well predicted by our novel measure of Subword Evenness (SuE).
- A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss
- Wenbiao Li, Wang Ziyang, Yunfang Wu
- TLDR: We propose a novel BERT-based model with feature projection and length-balanced loss for readability assessment.
- Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
- Zhihao Du, ShiLiang Zhang, Siqi Zheng, Zhi-Jie Yan
- TLDR: We propose SOND, a new formulation of the speaker overlap-aware neural diarization task, which outperforms state-of-the-art methods based on target-speaker voice activity detection.
- GREENER: Graph Neural Networks for News Media Profiling
- Panayot Panayotov, Utsav Shukla, Husrev Taha Sencar, Mohamed Nabeel, Preslav Nakov
- TLDR: We propose a graph neural network model for news media profiling that predicts the factuality and bias of media outlets.
- Graph Hawkes Transformer for Extrapolated Reasoning on Temporal Knowledge Graphs
- Haohai Sun, Shangyi Geng, Jialun Zhong, Han Hu, Kun He
- TLDR: We propose a Graph Hawkes Transformer for temporal knowledge graph reasoning that captures instantaneous structural information and temporal evolution information, together with a new relational continuous-time encoding function that facilitates feature evolution with the Hawkes process.
- UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation
- Yongwei Zhou, Junwei Bao, Chaoqun Duan, Youzheng Wu, Xiaodong He, Tiejun Zhao
- TLDR: We propose a new method for Unified discrete Reasoning over heterogeneous knowledge resources, i.e., table and text, without requiring derivation annotation.
- Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models
- Mozes van de Kar, Mengzhou Xia, Danqi Chen, Mikel Artetxe
- TLDR: We propose an alternative mining-based approach for zero-shot learning that is more flexible and interpretable than prompting, and outperforms it on a wide range of tasks when using comparable templates.
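The mining idea can be pictured as running prompt-like patterns over an unlabeled corpus to harvest pseudo-labeled training pairs, then fine-tuning on them instead of prompting at inference time. Below is a minimal regex sketch of that loop; the pattern, the verbalizer mapping, and the toy corpus are illustrative assumptions, not the paper's exact templates.

```python
import re

# Toy corpus standing in for a large unlabeled collection.
corpus = [
    'Watched "Heat" last night. It was great.',
    'Tried the new diner downtown. It was terrible.',
    "The weather report said rain all week.",
]

# Illustrative pattern and label verbalizer for sentiment mining.
PATTERN = re.compile(r"(.+)\. It was (great|terrible)\.")
LABEL = {"great": "positive", "terrible": "negative"}

mined = []
for doc in corpus:
    m = PATTERN.search(doc)
    if m:
        mined.append((m.group(1), LABEL[m.group(2)]))

print(mined)  # pseudo-labeled pairs to fine-tune a classifier on
```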
- SEMGraph: Incorporating Sentiment Knowledge and Eye Movement into Graph Model for Sentiment Analysis
- Bingbing Wang, Bin Liang, Jiachen Du, Min Yang, Ruifeng Xu
- TLDR: We propose a graph architecture based on sentiment knowledge and eye movement to learn the sentiment expression of the context.
- Cross-lingual neural fuzzy matching for exploiting target-language monolingual corpora in computer-aided translation
- Miquel Esplà-Gomis, Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez
- TLDR: We present a novel neural approach for exploiting monolingual corpora in a computer-aided translation (CAT) environment, and estimate their post-editing effort.
- Multi-Label Intent Detection via Contrastive Task Specialization of Sentence Encoders
- Ivan Vulić, Iñigo Casanueva, Georgios Spithourakis, Avishek Mondal, Tsung-Hsien Wen, Paweł Budzianowski
- TLDR: We propose a novel framework for multi-label intent detection that learns a classifier on top of a fixed sentence encoder and a task specialization module.
- Discovering Language-neutral Sub-networks in Multilingual Language Models
- Negar Foroutan, Mohammadreza Banaei, Rémi Lebret, Antoine Bosselut, Karl Aberer
- TLDR: Language neutrality of multilingual pre-trained language models is a function of the overlap between language-encoding sub-networks of these models.
- Parameter-Efficient Tuning Makes a Good Classification Head
- Zhuoyi Yang, Ming Ding, Yanhui Guo, Qingsong Lv, Jie Tang
- TLDR: We show that the classification head jointly pretrained with parameter-efficient tuning improves the performance on 9 tasks in GLUE and SuperGLUE.
- STGN: an Implicit Regularization Method for Learning with Noisy Labels in Natural Language Processing
- Tingting Wu, Xiao Ding, Minji Tang, Hao Zhang, Bing Qin, Ting Liu
- TLDR: We propose a novel stochastic tailor-made gradient noise for NLP tasks that can effectively mitigate the effect of noisy labels.
- Cross-Modal Similarity-Based Curriculum Learning for Image Captioning
- Hongkuan Zhang, Saku Sugawara, Akiko Aizawa, Lei Zhou, Ryohei Sasano, Koichi Takeda
- TLDR: We propose a simple yet efficient difficulty measurement for image captioning using cross-modal similarity calculated by a pretrained vision–language model.
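The difficulty measure can be pictured as ranking image-caption pairs by the cosine similarity a pretrained vision-language model assigns them and presenting high-similarity (easy) pairs first. The sketch below assumes pair embeddings have already been computed by some VL model (e.g., CLIP); the easy-to-hard ordering and the embedding dimensionality are illustrative assumptions.

```python
import numpy as np

def curriculum_order(img_emb: np.ndarray, cap_emb: np.ndarray) -> np.ndarray:
    """Order training pairs from easy to hard, where difficulty is one
    minus the cosine similarity between an image embedding and its
    caption embedding (both from a pretrained VL model)."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    cap = cap_emb / np.linalg.norm(cap_emb, axis=1, keepdims=True)
    similarity = (img * cap).sum(axis=1)   # per-pair cosine similarity
    return np.argsort(-similarity)         # most similar = easiest first

rng = np.random.default_rng(1)
order = curriculum_order(rng.normal(size=(8, 512)), rng.normal(size=(8, 512)))
print(order)  # indices of pairs in easy-to-hard order
```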
- Debiasing Masks: A New Framework for Shortcut Mitigation in NLU
- Johannes Mario Meissner, Saku Sugawara, Akiko Aizawa
- TLDR: We propose a new debiasing method in which we identify debiased pruning masks that can be applied to a finetuned model.
- Extending Phrase Grounding with Pronouns in Visual Dialogues
- Panzhong Lu, Xin Zhang, Meishan Zhang, Min Zhang
- TLDR: We propose a new phrase grounding task which incorporates both noun phrases and pronouns.
- EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in the Legal Domain
- Dennis Aumiller, Ashish Chouhan, Michael Gertz
- TLDR: We propose a novel dataset for cross-lingual summarization of legal acts from the European Union law platform (EUR-Lex).
- Differentiable Data Augmentation for Contrastive Sentence Representation Learning
- Tianduo Wang, Wei Lu
- TLDR: We propose a novel method for differentiable data augmentation during contrastive learning that improves sentence representation learning under both semi-supervised and supervised settings.
- Text Style Transferring via Adversarial Masking and Styled Filling
- Jiarui Wang, Richong Zhang, Junfan Chen, Jaein Kim, Yongyi Mao
- TLDR: We propose a style transfer model, with an adversarial masking approach and a styled filling technique, which can guarantee diversity and semantic consistency of the transferred text.
- Character-level White-Box Adversarial Attacks against Transformers via Attachable Subwords Substitution
- Aiwei Liu, Honghai Yu, Xuming Hu, Shu’ang Li, Li Lin, Fukun Ma, Yawen Yang, Lijie Wen
- TLDR: We propose the first character-level white-box adversarial attack method against transformer models.
- Query-based Instance Discrimination Network for Relational Triple Extraction
- Zeqi Tan, Yongliang Shen, Xuming Hu, Wenqi Zhang, Xiaoxia Cheng, Weiming Lu, Yueting Zhuang
- TLDR: We propose a novel query-based approach to construct instance-level representations for relational triples.
- Learning Inter-Entity-Interaction for Few-Shot Knowledge Graph Completion
- Yuling Li, Kui Yu, Xiaoling Huang, Yuhong Zhang
- TLDR: We propose a novel FKGC model that learns semantic representations of entity pairs by exploring the inter-entity interaction between head and tail entities.
- Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter
- Megha Sundriyal, Atharva Kulkarni, Vaibhav Pulastya, Md. Shad Akhtar, Tanmoy Chakraborty
- TLDR: We propose a novel claim span identification approach along with a Twitter corpus annotated with token-level claim spans on more than 7.5k tweets.
- ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization
- Jiaan Wang, Fandong Meng, Ziyao Lu, Duo Zheng, Zhixu Li, Jianfeng Qu, Jie Zhou
- TLDR: We present ClidSum, a benchmark dataset for cross-lingual summarization systems on dialogue documents.
- Spectral Probing
- Max Müller-Eberstein, Rob van der Goot, Barbara Plank
- TLDR: We develop a fully learnable frequency filter to identify spectral profiles for any given task.
- QASem Parsing: Text-to-text Modeling of QA-based Semantics
- Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan
- TLDR: We present a unified parsing tool for semi-structured natural language structures, which can be used to generate semantic representations for NLP tasks.
- Keyphrase Generation via Soft and Hard Semantic Corrections
- Guangzhen Zhao, Guoshun Yin, Peng Yang, Yu Yao
- TLDR: We propose a novel correction model, CorrKG, on top of the MLE pipeline, where biases are corrected via optimal transport (OT) and a frequency-based filtering-and-sorting (FreqFS) strategy.
- Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval
- Minjoon Jung, SeongHo Choi, JooChan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
- TLDR: We propose a self-supervised learning framework for video corpus moment retrieval using pseudo queries that exploit both visual and textual information from the selected temporal moments.
- DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models
- Hongyu Zhu, Yan Chen, Jing Yan, Jing Liu, Yu Hong, Ying Chen, Hua Wu, Haifeng Wang
- TLDR: We present a Chinese dataset for evaluating the robustness of question matching models and show that the effects observed with artificial adversarial examples do not carry over to natural texts.
- DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages
- Gabriele Sarti, Arianna Bisazza, Ana Guerberof-Arenas, Antonio Toral
- TLDR: We present a cross-lingual post-editing study of neural machine translation and show that post-editing is consistently faster than translation from scratch.
- Bridging Fairness and Environmental Sustainability in Natural Language Processing
- Marius Hessenthaler, Emma Strubell, Dirk Hovy, Anne Lauscher
- TLDR: We show that knowledge distillation can actually decrease model fairness, contrary to other findings.
- UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition
- Guimin Hu, Ting-En Lin, Yi Zhao, Guangming Lu, Yuchuan Wu, Yongbin Li
- TLDR: Unifying multimodal sentiment analysis and emotion recognition in conversation tasks from features, labels, and models.
- Is the Brain Mechanism for Hierarchical Structure Building Universal Across Languages? An fMRI Study of Chinese and English
- Xiaohan Zhang, Shaonan Wang, Nan Lin, Chengqing Zong
- TLDR: We show that the brain uses different parsing strategies for Chinese and English, with each strategy generating less memory processing load for its language’s structure.
- HashFormers: Towards Vocabulary-independent Pre-trained Transformers
- Huiyin Xue, Nikolaos Aletras
- TLDR: We propose HashFormers, a new family of vocabulary-independent pre-trained transformers that support an unlimited vocabulary (i.e. all possible tokens in a corpus) given a substantially smaller fixed-sized embedding matrix.
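One way to picture a vocabulary-independent embedding layer is to hash each token string into a fixed number of buckets, so every possible token maps to some embedding row without a stored vocabulary. The sketch below is a deliberately simplified illustration of that idea, not the HashFormers architecture itself; the hash function, single-hash design, and bucket count are assumptions for the example.

```python
import hashlib
import numpy as np

N_BUCKETS, DIM = 4096, 64
table = np.random.default_rng(2).normal(size=(N_BUCKETS, DIM))

def embed(token: str) -> np.ndarray:
    """Map any token string to an embedding row via a stable hash,
    so the model needs no fixed vocabulary (unseen tokens included)."""
    h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "little")
    return table[h % N_BUCKETS]

print(embed("transformer")[:4])
print(embed("transformerzzz")[:4])  # never seen, still gets an embedding
```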
- MatchPrompt: Prompt-based Open Relation Extraction with Semantic Consistency Guided Clustering
- Jiaxin Wang, Lingling Zhang, Jun Liu, Xi Liang, Yujie Zhong, Yaqiang Wu
- TLDR: We propose a prompt-based framework for clustering unlabeled relation instances and show that it achieves new SOTA results for OpenRE.
- Improving Aspect Sentiment Quad Prediction via Template-Order Data Augmentation
- Mengting Hu, Yike Wu, Hang Gao, Yinhao Bai, Shiwan Zhao
- TLDR: We propose a simple but effective method to identify the most proper orders, and further combine multiple proper templates as data augmentation to improve the aspect-level sentiment quad prediction task.
- SocioProbe: What, When, and Where Language Models Learn about Sociodemographics
- Anne Lauscher, Federico Bianchi, Samuel R. Bowman, Dirk Hovy
- TLDR: We investigate the sociodemographic knowledge of pre-trained language models and show that they encode sociodemographic aspects of language.
- When does Parameter-Efficient Transfer Learning Work for Machine Translation?
- Ahmet Üstün, Asa Cooper Stickland
- TLDR: Parameter-efficient fine-tuning methods for machine translation.
- Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
- Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder
- TLDR: We propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation.
- Towards Robust Numerical Question Answering: Diagnosing Numerical Capabilities of NLP Systems
- Jialiang Xu, Mengyu Zhou, Xinyi He, Shi Han, Dongmei Zhang
- TLDR: We diagnose the numerical capabilities of Numerical Question Answering systems and datasets and propose perturbations to address their weaknesses.
- Enhancing Joint Multiple Intent Detection and Slot Filling with Global Intent-Slot Co-occurrence
- Mengxiao Song, Bowen Yu, Li Quangang, Wang Yubin, Tingwen Liu, Hongbo Xu
- TLDR: We propose a novel graph neural network to model the interaction between the two subtasks.
- Towards Pragmatic Production Strategies for Natural Language Generation Tasks
- Mario Giulianelli
- TLDR: We propose a new framework for the design of Natural Language Generation (NLG) systems that follow efficient and effective production strategies in order to achieve complex communicative goals.
- LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling
- Dongsheng Chen, Chaofan Tao, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu
- TLDR: We propose a novel video-language model that adapts a pre-trained image-language model, BLIP, into a video-text model directly on downstream tasks, without heavy pre-training.
- Communication breakdown: On the low mutual intelligibility between human and neural captioning
- Roberto Dessì, Eleonora Gualdoni, Francesca Franzon, Gemma Boleda, Marco Baroni
- TLDR: We compare the zero-shot performance of a neural caption-based image retriever and a neural captioner on ImageCoDe, a dataset that contains hard distractors.
- Normalizing Mutual Information for Robust Adaptive Training for Translation
- Youngwon Lee, Changmin Lee, Hojin Lee, Seung-won Hwang
- TLDR: We propose Normalized Pointwise Mutual Information (NPMI), a scoring metric for the importance of target sentences and tokens in translation models; NPMI captures source-target dependence, and NPMI-based token-level adaptive training improves over baselines on En-De, De-En, and En-Ro translation tasks.
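For reference, NPMI has a standard closed form. The following minimal Python sketch computes it from joint and marginal probabilities; it is generic and not the paper's exact estimator from translation-model scores.

```python
import math

def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized pointwise mutual information, bounded in [-1, 1].

    NPMI(x, y) = PMI(x, y) / (-log p(x, y)),
    where PMI(x, y) = log(p(x, y) / (p(x) * p(y))).
    """
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / (-math.log(p_xy))

# Example: co-occurrence slightly above independence gives a small positive score.
print(npmi(p_xy=0.02, p_x=0.1, p_y=0.1))  # ~0.18
```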
- Bilingual Synchronization: Restoring Translational Relationships with Editing Operations
- Jitao Xu, Josep Crego, François Yvon
- TLDR: We present a new translation algorithm that can perform bilingual synchronization tasks with multiple architectures and training regimes.
- Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering
- Helena Bonaldi, Sara Dellantonio, Serra Sinem Tekiroğlu, Marco Guerini
- TLDR: We present a hybrid approach to dialogical data collection that combines the intervention of human expert annotators with machine-generated dialogues obtained using 19 different configurations.
- JANUS: Joint Autoregressive and Non-autoregressive Training with Auxiliary Loss for Sequence Generation
- Xiaobo Liang, Lijun Wu, Juntao Li, Min Zhang
- TLDR: We propose a new sequence generation algorithm based on transformer-based autoregressive and non-autoregressive models.
- Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering
- Jialin Wu, Raymond Mooney
- TLDR: We propose an Entity-Focused Retrieval model for Outside-Knowledge Visual Question Answering that provides stronger supervision during training and recognizes question-relevant entities to help retrieve more specific knowledge.
- Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?
- Ningyu Xu, Tao Gui, Ruotian Ma, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang
- TLDR: We show that the distance between the distributions of grammatical relations induced from multilingual BERT is highly consistent with the syntactic difference in terms of linguistic formalisms.
- “It’s Not Just Hate”: A Multi-Dimensional Perspective on Detecting Harmful Speech Online
- Federico Bianchi, Stefanie Hills, Patricia Rossini, Dirk Hovy, Rebekah Tromble, Nava Tintarev
- TLDR: We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues.
- Long Text Generation with Topic-aware Discrete Latent Variable Model
- Erguang Yang, Mingtong Liu, Deyi Xiong, Yujie Zhang, Yufeng Chen, Jinan Xu
- TLDR: We propose a topic-aware latent code-guided text generation model that learns information about topics.
- TIARA: Multi-grained Retrieval for Robust Question Answering over Large Knowledge Base
- Yiheng Shu, Zhiwei Yu, Yuhan Li, Börje Karlsson, Tingting Ma, Yuzhong Qu, Chin-Yew Lin
- TLDR: We present a new KBQA model, TIARA, which addresses those issues by applying multi-grained retrieval to help the PLM focus on the most relevant KB context, viz., entities, exemplary logical forms, and schema items.
- Structure-Unified M-Tree Coding Solver for Math Word Problem
- Bin Wang, Jiangzhou Ju, Yang Fan, Xinyu Dai, Shujian Huang, Jiajun Chen
- TLDR: We propose a new approach to solving math word problems that takes into account the properties of the binary tree structure of mathematical expressions at the output side.
- FormLM: Recommending Creation Ideas for Online Forms by Modelling Semantic and Structural Information
- Yijia Shao, Mengyu Zhou, Yifan Zhong, Tao Wu, Hongwei Han, Shi Han, Gideon Huang, Dongmei Zhang
- TLDR: We present FormLM to model online forms (by enhancing pre-trained language model with form structural information) and recommend form creation ideas (including question / options recommendations and block type suggestion).
- Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework
- Yiming Chen, Yan Zhang, Bin Wang, Zuozhu Liu, Haizhou Li
- TLDR: We propose a semi-supervised sentence embedding framework that effectively leverages large-scale unlabeled data.
- GPS: Genetic Prompt Search for Efficient Few-Shot Learning
- Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Wang Yanggang, Haiyu Li, Zhilin Yang
- TLDR: We present a genetic algorithm for few-shot learning with prompts, which outperforms manual prompts by a large margin.
- Multitask Instruction-based Prompting for Fallacy Recognition
- Tariq Alhindi, Tuhin Chakrabarty, Elena Musi, Smaranda Muresan
- TLDR: We present a multitask instruction-based prompting approach for fallacy recognition across multiple datasets and show that it improves over approaches built for a specific dataset using models such as T5, BERT, or GPT-3.
- Rethinking Multi-Modal Alignment in Multi-Choice VideoQA from Feature and Sample Perspectives
- Shaoning Xiao, Long Chen, Kaifeng Gao, Zhao Wang, Yi Yang, Zhimeng Zhang, Jun Xiao
- TLDR: We propose a novel multi-modal alignment method for Video Question Answering that improves models' cross-modality correspondence ability.
- Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach
- Miao Chen, Xinjiang Lu, Tong Xu, Yanyan Li, Zhou Jingbo, Dejing Dou, Hui Xiong
- TLDR: We propose a novel approach to table-to-text generation that combines a pretrained language model with a multi-pass decoder framework for deliberating table descriptions.
- Hierarchical Phrase-Based Sequence-to-Sequence Learning
- Bailin Wang, Ivan Titov, Jacob Andreas, Yoon Kim
- TLDR: We present a novel neural transducer that incorporates hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
- Natural Language Deduction with Incomplete Information
- Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett
- TLDR: We propose a new system that can handle the underspecified setting where not all premises are stated at the outset; that is, additional assumptions need to be materialized to prove a claim.
- Character-centric Story Visualization via Visual Planning and Token Alignment
- Hong Chen, Rujun Han, Te-Lin Wu, Hideki Nakayama, Nanyun Peng
- TLDR: We propose a novel approach to story visualization that preserves characters in the images and produces high quality images.
- ASQA: Factoid Questions Meet Long-Form Answers
- Ivan Stelmakh, Yi Luan, Bhuwan Dhingra, Ming-Wei Chang
- TLDR: We propose a new metric for measuring performance on long-form factoid question answering and a novel dataset for measuring it.
- Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs
- Anej Svete, Benjamin Dayan, Ryan Cotterell, Tim Vieira, Jason Eisner
- TLDR: We present more efficient algorithms for computing the pathsum in sparse acyclic WFSAs with failure transitions.
- Towards Better Document-level Relation Extraction via Iterative Inference
- Liang Zhang, Jinsong Su, Yidong Chen, Zhongjian Miao, Min Zijun, Qingguo Hu, Xiaodong Shi
- TLDR: We propose a novel document-level relation extraction model with iterative inference and show that it outperforms other competitive baselines.
- Efficient Adversarial Training with Robust Early-Bird Tickets
- Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
- TLDR: We propose an efficient adversarial training method that can achieve up to 10% more robustness than traditional fine-tuning.
- Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
- Fatemehsadat Mireshghallah, Kartik Goyal, Archit Uniyal, Taylor Berg-Kirkpatrick, Reza Shokri
- TLDR: We propose a new method for quantifying the privacy risks of memorization in masked language models.
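As a hedged illustration of a likelihood-ratio membership test of this general kind, the sketch below flags a sample as a training member when the attacked model's loss is much lower than a reference model's; `loss_target`, `loss_reference`, and `threshold` are hypothetical stand-ins, and the paper's exact attack and calibration may differ.

```python
from typing import Callable

def likelihood_ratio_membership(
    loss_target: Callable[[str], float],     # per-sample loss under the attacked MLM (assumed callable)
    loss_reference: Callable[[str], float],  # per-sample loss under a reference MLM (assumed callable)
    sample: str,
    threshold: float,
) -> bool:
    """Flag `sample` as a likely training member when the target model fits it
    markedly better (lower loss) than the reference model does."""
    log_likelihood_ratio = loss_reference(sample) - loss_target(sample)
    return log_likelihood_ratio > threshold
```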
- SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
- Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
- TLDR: We present a new massively multilingual machine translation model that outperforms previous massively multilingual models on low-resource language pairs.
- TextFusion: Privacy-Preserving Pre-trained Model Inference via Token Fusion
- Xin Zhou, Jinzhu Lu, Tao Gui, Ruotian Ma, Zichu Fei, Yuran Wang, Yong Ding, Yibo Cheung, Qi Zhang, Xuanjing Huang
- TLDR: We propose TextFusion, a novel method for preserving inference privacy.
- Learning to Explain Selectively: A Case Study on Question Answering
- Shi Feng, Jordan Boyd-Graber
- TLDR: We propose learning to explain "selectively": for each decision that the user makes, we use a model to choose the best explanation from a set of candidates and update this model with feedback to optimize human performance.
- ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation
- Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang
- TLDR: We propose a novel transfer learning method for neural machine translation that can improve the inference accuracy of the child model.
- Better Hit the Nail on the Head than Beat around the Bush: Removing Protected Attributes with a Single Projection
- Pantea Haghighatkhah, Antske Fokkens, Pia Sommerauer, Bettina Speckmann, Kevin Verbeek
- TLDR: We present a method for removing protected attributes from embedding spaces with a single targeted projection, which is more efficient than approaches requiring multiple projections.
- IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models
- Chenguang Wang, Xiao Liu, Dawn Song
- TLDR: We introduce a new open information extraction benchmark for pre-trained language models (LMs) that allows us to fully examine the open relational information present in pre-trained LMs.
- ConNER: Consistency Training for Cross-lingual Named Entity Recognition
- Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Luo Si, Chunyan Miao
- TLDR: We propose ConNER, a novel consistency training framework for cross-lingual named entity recognition, which comprises (1) translation-based consistency training on unlabeled target-language data and (2) dropout-based consistency training.
- A Sequential Flow Control Framework for Multi-hop Knowledge Base Question Answering
- Minghui Xie, Chuzhan Hao, Peng Zhang
- TLDR: We propose a simple but effective GRU-inspired Flow Control self-attention mechanism to model sequential logic in the multi-hop reasoning process for KBQA.
- ACENet: Attention Guided Commonsense Reasoning on Hybrid Knowledge Graph
- Chuzhan Hao, Minghui Xie, Peng Zhang
- TLDR: We propose an Attention guided Commonsense rEasoning Network to integrate hybrid knowledge graph into commonsense reasoning.
- Revisiting DocRED - Addressing the False Negative Problem in Relation Extraction
- Qingyu Tan, Lu Xu, Lidong Bing, Hwee Tou Ng, Sharifah Mahani Aljunied
- TLDR: We propose a novel approach to address the false negative problem in the DocRED dataset by adding the missed relation triples back to the original DocRED.
- Towards Summary Candidates Fusion
- Mathieu Ravaut, Shafiq Joty, Nancy Chen
- TLDR: We propose a new paradigm in second-stage abstractive summarization, called SummaFusion, that fuses several summary candidates to produce a novel abstractive summary.
- Multimodal Robustness for Neural Machine Translation
- Yuting Zhao, Ioan Calapodescu
- TLDR: We propose a two-step method, based on composable adapters, to deal with noisy input coming from various modalities, like speech, images, or noisy text extracted from the web.
- TranSHER: Translating Knowledge Graph Embedding with Hyper-Ellipsoidal Restriction
- Yizhi Li, Wei Fan, Chao Liu, Chenghua Lin, Jiang Qian
- TLDR: We propose a novel score function TranSHER, which leverages relation-specific translations between head and tail entities to relax the constraint of hyper-ellipsoid restrictions.
- IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection
- Jingcheng Deng, Hengwei Dai, Xuewei Guo, Yuanchen Ju, Wei Peng
- TLDR: We propose an Implicit Relational Reasoning Graph Network for multi-turn dialogue reasoning, which improves the baseline of four pre-trained language models and achieves state-of-the-art performance.
- Predicting Prerequisite Relations for Unseen Concepts
- Yaxin Zhu, Hamed Zamani
- TLDR: We propose a novel alternating knowledge distillation approach to concept prerequisite learning that improves the performance of existing CPL algorithms.
- Contrastive Learning with Expectation-Maximization for Weakly Supervised Phrase Grounding
- Keqin Chen, Richong Zhang, Samuel Mensah, Yongyi Mao
- TLDR: We propose a novel contrastive learning framework based on the expectation-maximization algorithm that adaptively refines the target prediction.
- Beyond prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations
- Yu Fei, Zhao Meng, Ping Nie, Roger Wattenhofer, Mrinmaya Sachan
- TLDR: We show that pre-trained language models can be used to improve text classification by clustering text in the embedding spaces of PLMs without task-specific fine-tuning.
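A minimal sketch of the general recipe, assuming sentence embeddings have already been extracted from some PLM encoder; the clustering backend here (scikit-learn's KMeans) is an illustrative choice, not necessarily the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans

def pseudo_label_by_clustering(embeddings: np.ndarray, n_classes: int) -> np.ndarray:
    """Cluster sentence embeddings into n_classes groups and return cluster
    ids as pseudo-labels; cluster ids can then be mapped to class names,
    e.g. by inspecting a few examples per cluster."""
    kmeans = KMeans(n_clusters=n_classes, n_init=10, random_state=0)
    return kmeans.fit_predict(embeddings)
```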
- Generalizing over Long Tail Concepts for Medical Term Normalization
- Beatrice Portelli, Simone Scaboro, Enrico Santus, Hooman Sedghamiz, Emmanuele Chersoni, Giuseppe Serra
- TLDR: We propose a novel and effective learning strategy for medical term normalization that improves generalizability of both discriminative and generative models.
- Unsupervised Opinion Summarisation in the Wasserstein Space
- Jiayu Song, Iman Munire Bilal, Adam Tsakalidis, Rob Procter, Maria Liakata
- TLDR: We present WassOS, an unsupervised abstractive summarization model which makes use of the Wasserstein distance to synthesise opinions expressed in a group of documents discussing the same topic to produce a single summary.
- Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
- Colin Leong, Joshua Nemecek, Jacob Mansdorfer, Anna Filighera, Abraham Owodunni, Daniel Whitenack
- TLDR: We present Bloom Library, a linguistically diverse set of multimodal and multilingual datasets for language modeling, image captioning, visual storytelling, and speech synthesis/recognition.
- Disentangling Uncertainty in Machine Translation Evaluation
- Chrysoula Zerva, Taisiya Glushkova, Ricardo Rei, André F. T. Martins
- TLDR: We propose new uncertainty predictors for machine translation metrics that address specific uncertainty causes in MT evaluation, such as low quality references and out-of-domain data.
- Does Your Model Classify Entities Reasonably? Diagnosing and Mitigating Spurious Correlations in Entity Typing
- Nan Xu, Fei Wang, Bangzheng Li, Mingtao Dong, Muhao Chen
- TLDR: We propose a counterfactual data augmentation method to mitigate model biases and improve generalization of entity typing models.
- EDIN: An End-to-end Benchmark and Pipeline for Unknown Entity Discovery and Indexing
- Nora Kassner, Fabio Petroni, Mikhail Plekhanov, Sebastian Riedel, Nicola Cancedda
- TLDR: We present a new entity linking benchmark that detects, clusters, and indexes mentions of unknown entities in context.
- POQue: Asking Participant-specific Outcome Questions for a Deeper Understanding of Complex Events
- Sai Vallurupalli, Sayontan Ghosh, Katrin Erk, Niranjan Balasubramanian, Francis Ferraro
- TLDR: We present a novel crowdsourced dataset of participant-specific outcome questions that probes the impact of salient events in a complex event and their influence on its outcome.
- Measuring the Mixing of Contextual Information in the Transformer
- Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussà
- TLDR: We propose a metric to measure token-to-token interactions in the Transformer architecture and provide input attribution scores for model predictions.
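The paper defines its own interaction metric; as background, a commonly used baseline for aggregating token-to-token mixing across layers is attention rollout, sketched below under the simplifying assumption that residual connections contribute half of each layer's mixing.

```python
import numpy as np

def attention_rollout(attentions: list[np.ndarray]) -> np.ndarray:
    """Multiply per-layer attention matrices (each of shape (n, n), rows
    summing to 1) to estimate how much each input token contributes to
    each output position after all layers."""
    n = attentions[0].shape[-1]
    rollout = np.eye(n)
    for attn in attentions:
        mixed = 0.5 * attn + 0.5 * np.eye(n)  # crude correction for residual connections
        rollout = mixed @ rollout
    return rollout
```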
- Dealing with Abbreviations in the Slovenian Biographical Lexicon
- Angel Daza, Antske Fokkens, Tomaž Erjavec
- TLDR: We propose a new method for identifying unseen abbreviations in a text and propose a method for expanding the identified abbreviations to make them more readable.
- AfriCLIRMatrix: Enabling Cross-Lingual Information Retrieval for African Languages
- Odunayo Ogundepo, Xinyu Zhang, Shuo Sun, Kevin Duh, Jimmy Lin
- TLDR: We present a new dataset for cross-lingual information retrieval research in 15 diverse African languages.
- CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
- Abhilasha Ravichander, Matt Gardner, Ana Marasovic
- TLDR: We present CONDAQA, the first English reading comprehension dataset which requires reasoning about the implications of negated statements in paragraphs.
- Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
- Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussà
- TLDR: We propose a method for tracking input tokens’ attributions for both bilingual and multilingual Transformer models and present insights into their behaviour.
- ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture
- Youssef Mohamed, Mohamed Abdelfattah, Shyma Alhuwaider, Feifan Li, Xiangliang Zhang, Kenneth Church, Mohamed Elhoseiny
- TLDR: ArtELingo is a new dataset for cultural-transfer and multilinguality research.
- Decoding a Neural Retriever’s Latent Space for Query Suggestion
- Leonard Adolphs, Michelle Chen Huebscher, Christian Buck, Sertan Girgin, Olivier Bachem, Massimiliano Ciaramita, Thomas Hofmann
- TLDR: We learn a query decoder that can decode a query from its latent representation and use it to generate query reformulations for MSMarco.
- T-STAR: Truthful Style Transfer using AMR Graph as Intermediate Representation
- Anubhav Jangra, Preksha Nema, Aravindan Raghuveer
- TLDR: We propose a novel model for text style transfer that uses abstract meaning representation as an intermediate representation for TST.
- PromptBERT: Improving BERT Sentence Embeddings with Prompts
- Ting Jiang, Jian Jiao, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Denvy Deng, Qi Zhang
- TLDR: We propose PromptBERT, a novel contrastive learning method for learning better sentence representation.
- Extending Logic Explained Networks to Text Classification
- Rishabh Jain, Gabriele Ciravegna, Pietro Barbiero, Francesco Giannini, Davide Buffelli, Pietro Lio
- TLDR: We propose LENp, an extension of Logic Explained Networks to text classification that improves the quality of local explanations by perturbing input words.
- Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database
- Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou
- TLDR: Unified semantic parser for question answering on structured data.
- RAPO: An Adaptive Ranking Paradigm for Bilingual Lexicon Induction
- Zhoujin Tian, Chaozhuo Li, Shuo Ren, Zhiqiang Zuo, Zengxuan Wen, Xinyue Hu, Xiao Han, Haizhen Huang, Denvy Deng, Qi Zhang, Xing Xie
- TLDR: We propose a novel ranking-oriented lexicon induction model RAPO to learn personalized mapping function for each word.
- On Parsing as Tagging
- Afra Amini, Ryan Cotterell
- TLDR: We propose a unifying pipeline for constituency parsing as tagging and show that it is the most accurate and efficient method.
- Distilled Dual-Encoder Model for Vision-Language Understanding
- Zekun Wang, Wenhui Wang, Haichao Zhu, Ming Liu, Bing Qin, Furu Wei
- TLDR: We propose DiDE, a framework that distills the knowledge of a fusion-encoder teacher model into a dual-encoder student model and show that the student is competitive with the fusion model on vision-language understanding tasks.
- Argument Mining for Review Helpfulness Prediction
- Zaiqian Chen, Daniel Verdi do Amarante, Jenna Donaldson, Yohan Jo, Joonsuk Park
- TLDR: Argument mining on Amazon for product review helpfulness.
- Hierarchical Multi-Label Classification of Scientific Documents
- Mobashir Sadat, Cornelia Caragea
- TLDR: We present a new dataset for hierarchical multi-label text classification of scientific papers and propose a multi-task learning approach for topic classification.
- Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
- Jiacheng Liu, Skyler Hallinan, Ximing Lu, Pengfei He, Sean Welleck, Hannaneh Hajishirzi, Yejin Choi
- TLDR: We present Rainier, or Reinforced Knowledge Introspector, that learns to generate contextually relevant knowledge in response to given questions.
- A Major Obstacle for NLP Research: Let’s Talk about Time Allocation!
- Katharina Kann, Shiran Dudy, Arya D. McCarthy
- TLDR: We show that, in recent years, subpar time allocation has been a major obstacle for NLP research and propose remedies to improve the status quo.
- Towards Inter-character Relationship-driven Story Generation
- Anvesh Rao Vijjini, Faeze Brahman, Snigdha Chaturvedi
- TLDR: We propose Relationships as Latent Variables for Story Generation, a novel approach for generating stories sentence by sentence.
- Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking
- Tim Baumgärtner, Leonardo F. R. Ribeiro, Nils Reimers, Iryna Gurevych
- TLDR: We explore how relevance feedback can be directly integrated into neural re-ranking models by adopting few-shot and parameter-efficient learning techniques.
- ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples
- Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, Dragomir Radev
- TLDR: We develop ReasTAP, a new table reasoning skill injection method for pre-training models that can generate synthetic examples for table reasoning tasks.
- Few-shot Learning with Multilingual Generative Language Models
- Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li
- TLDR: We train multilingual generative language models on a corpus covering a diverse set of languages, and study their few- and zero-shot learning capabilities in a wide range of tasks.
- Are representations built from the ground up? An empirical examination of local composition in language models
- Emmy Liu, Graham Neubig
- TLDR: We show that language models do not build fully compositional representations of longer phrases from the representations of their constituents.
- Detecting Label Errors by Using Pre-Trained Language Models
- Derek Chong, Jenny Hong, Christopher Manning
- TLDR: We show that large pre-trained language models are inherently highly capable of identifying label errors in natural language datasets: simply examining out-of-sample data points in descending order of fine-tuned task loss significantly outperforms more complex error-detection mechanisms proposed in previous work.
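Since the TLDR fully specifies the procedure, a minimal sketch is direct: fine-tune, score held-out (out-of-sample) examples, and review them in descending loss order. The loss array here is assumed to be precomputed.

```python
import numpy as np

def label_error_candidates(losses: np.ndarray, top_k: int = 100) -> np.ndarray:
    """Return indices of the top_k examples by descending out-of-sample loss
    of a fine-tuned model; these are the likeliest label errors to review."""
    return np.argsort(-losses)[:top_k]
```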
- Intriguing Properties of Compression on Multilingual Models
- Kelechi Ogueji, Orevaoghene Ahia, Gbemileke Onilude, Sebastian Gehrmann, Sara Hooker, Julia Kreutzer
- TLDR: We propose a framework to characterize the impact of sparsifying multilingual pre-trained language models during fine-tuning and show that compression can improve robustness over dense models.
- Sequence Models for Document Structure Identification in an Undeciphered Script
- Logan Born, M. Monroe, Kathryn Kelley, Anoop Sarkar
- TLDR: We provide new and independent evidence for the existence of headers in proto-Elamite, a script from 3100-2900 BCE.
- English Contrastive Learning Can Learn Universal Cross-lingual Sentence Embeddings
- Yaushian Wang, Ashley Wu, Graham Neubig
- TLDR: We show that contrastive learning on English data alone can learn universal cross-lingual sentence embeddings without any parallel data.
- Active Example Selection for In-Context Learning
- Yiming Zhang, Shi Feng, Chenhao Tan
- TLDR: We present a novel approach to learn generalizable policies for in-context learning from demonstration examples in language models.
- Improving Factual Consistency in Summarization with Compression-Based Post-Editing
- Alex Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
- TLDR: We propose a novel model for post-editing summaries that improves factual consistency while maintaining ROUGE and informativeness.
- Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
- Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova
- TLDR: We show that fine-tuning all parameters, prompt tuning, and in-context learning can improve compositional generalization in semantic parsing, but not as much as scaling up model size.
- “I’m sorry to hear that”: Finding New Biases in Language Models with a Holistic Descriptor Dataset
- Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams
- TLDR: We present a new, more inclusive bias measurement dataset, HolisticBias, which includes nearly 600 descriptor terms across 13 different demographic axes and use it to explore, identify, and reduce novel forms of bias in several generative models.
- Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense
- Zhecan Wang, Haoxuan You, Yicheng He, Wenhao Li, Kai-Wei Chang, Shih-Fu Chang
- TLDR: We present a Multimodal Evaluation pipeline to test visual commonsense understanding and show that it improves the model’s performance in standard VCR evaluation.
- Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities
- Nianzu Ma, Sahisnu Mazumder, Alexander Politowicz, Bing Liu, Eric Robertson, Scott Grigsby
- TLDR: We propose a novel method for detecting novel and surprising sentences in text that can also characterize the novelty.
- CN-AutoMIC: Distilling Chinese Commonsense Knowledge from Pretrained Language Models
- Chenhao Wang, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao
- TLDR: We propose a large-scale Chinese CKG generated from multilingual PLMs, named as CN-AutoMIC, aiming to fill the research gap of non-English CKGs.
- Calibrating Student Models for Emotion-related Tasks
- Mahshid Hosseini, Cornelia Caragea
- TLDR: We propose a simple yet effective mixup method for knowledge distillation from teacher models and show that it improves the calibration and performance of student language models.
- Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation
- Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant
- TLDR: Parameter-efficient prompt tuning can overcome catastrophic forgetting to enable zero-shot cross-lingual generation.
- Improving Large-scale Paraphrase Acquisition and Generation
- Yao Dou, Chao Jiang, Wei Xu
- TLDR: We present a new Multi-Topic Paraphrase in Twitter corpus consisting of 130k sentence pairs with crowdsourced (MultiPIT_crowd) and expert (MultiPIT_expert) annotations, using two different paraphrase definitions, for paraphrase identification and generation tasks.
- Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal
- Byung-Doh Oh, William Schuler
- TLDR: We propose entropy- and distance-based predictors calculated from GPT-2 attention patterns, incorporating vector norms into attention weights, and show that they predict reading times over and above GPT-2 surprisal.
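As a sketch of the entropy component only (the distance-based predictors and vector-norm weighting are omitted), one can compute the Shannon entropy of each query's attention distribution:

```python
import numpy as np

def attention_entropy(attn: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy of each row of an attention matrix of shape
    (num_queries, num_keys), where rows sum to 1. Higher entropy means
    attention is spread more diffusely over the context."""
    return -(attn * np.log(attn + eps)).sum(axis=-1)
```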
- A Survey of Computational Framing Analysis Approaches
- Mohammad Ali, Naeemul Hassan
- TLDR: We present a comprehensive overview of computational framing analysis methods and propose methodological approaches to explore frames in large-scale datasets.
- Learning Cross-Task Dependencies for Joint Extraction of Entities, Events, Event Arguments, and Relations
- Minh Van Nguyen, Bonan Min, Franck Dernoncourt, Thien Nguyen
- TLDR: We propose a novel model for JointIE that aims to learn cross-task dependencies from data.
- Don’t Copy the Teacher: Data and Model Challenges in Embodied Dialogue
- So Yeon Min, Hao Zhu, Ruslan Salakhutdinov, Yonatan Bisk
- TLDR: We argue that imitation learning and related metrics are misleading and do not align with the goals of embodied dialogue research and may hinder progress.
- ALFRED-L: Investigating the Role of Language for Action Learning in Interactive Visual Environments
- Arjun Akula, Spandana Gella, Aishwarya Padmakumar, Mahdi Namazifar, Mohit Bansal, Jesse Thomason, Dilek Hakkani-Tur
- TLDR: We present evidence that sequence-to-sequence and transformer-based models trained on ALFRED are not sufficiently sensitive to changes in input language instructions.
- Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence
- Chris Callison-Burch, Gaurav Singh Tomar, Lara Martin, Daphne Ippolito, Suma Bailis, David Reitter
- TLDR: We present a dialogue system challenge for Dungeons and Dragons that uses a large language model to generate plausible and interesting conversational output.
- Unsupervised Entity Linking with Guided Summarization and Multiple-Choice Selection
- Young Min Cho, Li Zhang, Chris Callison-Burch
- TLDR: We propose a fully unsupervised model for entity linking that learns to identify relevant entities from a list of candidates.
- Weakly-Supervised Temporal Article Grounding
- Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang
- TLDR: Weakly-Supervised temporal Article Grounding.
- Exploring Dual Encoder Architectures for Question Answering
- Zhe Dong, Jianmo Ni, Dan Bikel, Enrique Alfonseca, Yuan Wang, Chen Qu, Imed Zitouni
- TLDR: We explore dual encoder architectures for question-answering and information retrieval tasks and show that Siamese dual encoders (SDEs) perform better than asymmetric dual encoders (ADEs).
- arXivEdits: Understanding the Human Revision Process in Scientific Writing
- Chao Jiang, Wei Xu, Samuel Stevens
- TLDR: We provide a complete computational framework for studying text revision in scientific writing.
- Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts
- Hongli Zhan, Tiberiu Sosea, Cornelia Caragea, Junyi Jessy Li
- TLDR: We present a novel method for emotion detection and summarization in social media posts related to the COVID-19 pandemic.
- Analogical Math Word Problems Solving with Enhanced Problem-Solution Association
- Zhenwen Liang, Jipeng Zhang, Xiangliang Zhang
- TLDR: We propose to build a novel MWP solver by leveraging analogical MWPs, which advances the solver's generalization ability across different kinds of MWPs.
- Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement
- Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Clark
- TLDR: We propose a teachable reasoning system for question-answering that improves with time, without retraining the model.
- Knowledge Transfer from Answer Ranking to Answer Generation
- Matteo Gabburo, Rik Koncel-Kedziorski, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
- TLDR: We propose to train a GenQA model by transferring knowledge from a trained AS2 model, avoiding the need for expensive human-annotated generation data.
- Perturbation Augmentation for Fairer NLP
- Rebecca Qian, Candace Ross, Jude Fernandes, Eric Michael Smith, Douwe Kiela, Adina Williams
- TLDR: We explore whether training on demographically perturbed data leads to fairer language models.
- Automatic Document Selection for Efficient Encoder Pretraining
- Yukun Feng, Patrick Xia, Benjamin Van Durme, João Sedoc
- TLDR: We propose an automatic document-selection method that constructs a domain-representative corpus for efficient encoder pretraining.
- The Aligned Multimodal Movie Treebank: An audio, video, dependency-parse treebank
- Adam Yaari, Jan DeWitt, Henry Hu, Bennett Stankovits, Sue Felshin, Yevgeni Berzak, Helena Aparicio, Boris Katz, Ignacio Cases, Andrei Barbu
- TLDR: We present a multimodal movie treebank derived from dialog in Hollywood movies which includes transcriptions of the audio-visual streams with word-level alignment, as well as part of speech tags and dependency parses in the Universal Dependencies formalism.
- DEMETR: Diagnosing Evaluation Metrics for Translation
- Marzena Karpinska, Nishant Raj, Katherine Thai, Yixiao Song, Ankita Gupta, Mohit Iyyer
- TLDR: We provide a diagnostic dataset for evaluating the sensitivity of machine translation evaluation metrics to linguistic perturbations spanning semantic, syntactic, and morphological error categories.
- Empowering Language Models with Knowledge Graph Reasoning for Open-Domain Question Answering
- Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Kai-Wei Chang, Yizhou Sun
- TLDR: We propose the knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact collaboratively with a differentiable Knowledge Graph Reasoning module.
- Debiasing Pretrained Text Encoders by Paying Attention to Paying Attention
- Yacine Gaci, Boualem Benatallah, Fabio Casati, Khalid Benabdeslem
- TLDR: We propose a debiasing method for sentence-level text encoders that both reduces social stereotypes, and inflicts next to no semantic damage.
- MEE: A Novel Multilingual Event Extraction Dataset
- Amir Pouran Ben Veyseh, Javid Ebrahimi, Franck Dernoncourt, Thien Nguyen
- TLDR: We propose a novel Multilingual Event Extraction dataset for non-English languages that provides annotation for more than 50K event mentions in 8 typologically different languages.
- RobustLR: A Diagnostic Benchmark for Evaluating Logical Robustness of Deductive Reasoners
- Soumya Sanyal, Zeyi Liao, Xiang Ren
- TLDR: We present RobustLR, a diagnostic benchmark that evaluates the robustness of language models to minimal logical edits in the inputs and different logical equivalence conditions.
- Evaluating and Improving Factuality in Multimodal Abstractive Summarization
- David Wan, Mohit Bansal
- TLDR: We propose CLIPBERTSCORE, a simple weighted combination of CLIPScore and BERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary, respectively.
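The combination the TLDR describes is a convex mixture of two off-the-shelf metrics; a minimal sketch follows, with a hypothetical weight `alpha`, since the paper's exact weighting is not given here.

```python
def clipbertscore(clip_score: float, bert_score: float, alpha: float = 0.5) -> float:
    """Weighted combination of an image-summary score (CLIPScore) and a
    document-summary score (BERTScore); alpha is a hypothetical weight."""
    return alpha * clip_score + (1.0 - alpha) * bert_score
```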
- Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation
- Melanie Sclar, Peter West, Sachin Kumar, Yulia Tsvetkov, Yejin Choi
- TLDR: We present Referee, a novel framework for sentence summarization that can be trained reference-free (i.e., requiring no gold summaries for supervision), while allowing direct control for compression ratio.
- Algorithms for Weighted Pushdown Automata
- Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, David Chiang
- TLDR: We develop novel algorithms for dynamic programming on weighted pushdown automata.
- MABEL: Attenuating Gender Bias using Textual Entailment Data
- Jacqueline He, Mengzhou Xia, Christiane Fellbaum, Danqi Chen
- TLDR: We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
- Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs
- Kyle Richardson, Ronen Tamari, Oren Sultan, Dafna Shahaf, Reut Tsarfaty, Ashish Sabharwal
- TLDR: We propose a new approach to language understanding models that tracks and trains their beliefs at arbitrary points in text.
- Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis
- Changyuan Qiu, Winston Wu, Xinliang Frederick Zhang, Lu Wang
- TLDR: We present multimodal ideology prediction, where a model predicts binary or five-point scale ideological leanings given a text-image pair with political content.
- Leveraging QA Datasets to Improve Generative Data Augmentation
- Dheeraj Mekala, Tu Vu, Timo Schick, Jingbo Shang
- TLDR: We propose CONDA, an approach to further improve GLM’s ability to generate synthetic data by reformulating data generation as context generation for a given question-answer (QA) pair and leveraging QA datasets for training context generators.
- Meta-Learning Fast Weight Language Models
- Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi
- TLDR: We present Fast Weight Layers, a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as linear attention.
- CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
- Róbert Csordás, Kazuki Irie, Juergen Schmidhuber
- TLDR: We present CTL++, a new diagnostic dataset based on compositions of unary symbolic functions.
- Learning with Rejection for Abstractive Text Summarization
- Meng Cao, Yue Dong, Jingyi He, Jackie Chi Kit Cheung
- TLDR: We propose a new objective for abstractive summarization based on rejection learning, which improves the factuality of generated summaries in automatic and human evaluations while increasing the abstractiveness of the generated summary.
- Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation
- Dongkyu Lee, Ka Chun Cheung, Nevin Zhang
- TLDR: We propose an adaptive label smoothing regularizer whose smoothing parameter is set dynamically from the model's own self-knowledge when predicting soft labels.
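For context, standard label smoothing mixes the one-hot target with a uniform distribution; the sketch below shows that fixed-epsilon baseline, whereas the paper makes the smoothing parameter adaptive using the model's self-knowledge.

```python
import numpy as np

def smooth_labels(onehot: np.ndarray, eps: float) -> np.ndarray:
    """Standard label smoothing over K classes:
    y_smooth = (1 - eps) * y_onehot + eps / K."""
    num_classes = onehot.shape[-1]
    return (1.0 - eps) * onehot + eps / num_classes
```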
- Hard Gate Knowledge Distillation - Leverage Calibration for Robust and Reliable Language Model
- Dongkyu Lee, Zhiliang Tian, Yingxiu Zhao, Ka Chun Cheung, Nevin Zhang
- TLDR: We propose a novel knowledge distillation scheme that switches between learning from a teacher model and training data.
- Are All Spurious Features in Natural Language Alike? An Analysis through a Causal Lens
- Nitish Joshi, Xiang Pan, He He
- TLDR: We propose a causal model and probabilities of necessity and sufficiency for spurious features in NLP, which help explain the results of existing debiasing methods on different spurious features and demystify surprising results such as the encoding of spurious features after debiasing.
- Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling
- Vidhisha Balachandran, Hannaneh Hajishirzi, William Cohen, Yulia Tsvetkov
- TLDR: We propose to generate hard, representative synthetic examples of non-factual summaries through infilling language models to improve factual consistency in summarization models.
- Coordinated Topic Modeling
- Pritom Saha Akash, Jie Huang, Kevin Chen-Chuan Chang
- TLDR: We propose a new problem called coordinated topic modeling that imitates human behavior while describing a text corpus.
- Large Dual Encoders Are Generalizable Retrievers
- Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernandez Abrego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, Yinfei Yang
- TLDR: We show that scaling up the model size of dual encoders, while keeping the bottleneck embedding size fixed, yields generalizable retrievers.
- CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
- Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang
- TLDR: We present CRIPP-VQA, a video question answering dataset for reasoning about the implicit physical properties of objects in a scene.
- Entity-centered Cross-document Relation Extraction
- Fengqi Wang, Fei Li, Hao Fei, Jingye Li, Shengqiong Wu, Fangfang Su, Wenxuan Shi, Donghong Ji, Bo Cai
- TLDR: We propose a novel approach for cross-document RE based on entity-based document-context filter and cross-path entity relation attention.
- Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature
- Katherine Thai, Marzena Karpinska, Kalpesh Krishna, Bill Ray, Moira Inghilleri, John Wieting, Mohit Iyyer
- TLDR: We present a dataset of parallel paragraphs from public-domain world literature and show that expert literary translators prefer reference human translations over machine-translated paragraphs 84% of the time.
- Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding
- Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Xianglin Zuo, Daxin Jiang
- TLDR: We propose to model the utterance-slot-word structure in zero-shot cross-lingual SLU and introduce a novel contrastive learning framework to facilitate explicit alignment of utterance, slot, and word representations.
- Polyglot Prompt: Multilingual Multitask Prompt Training
- Jinlan Fu, See-Kiong Ng, Pengfei Liu
- TLDR: We propose a new multilingual prompt-based learning framework for multilingual tasks and show how it can improve multilingual learning.
- VisToT: Vision-Augmented Table-to-Text Generation
- Prajwal Gatti, Anand Mishra, Manish Gupta, Mithun Das Gupta
- TLDR: We present a novel multimodal table-to-text generation task that incorporates signals from both tables as well as associated images to generate relevant text.
- Generative Entity-to-Entity Stance Detection with Knowledge Graph Augmentation
- Xinliang Frederick Zhang, Nick Beauchamp, Lu Wang
- TLDR: We present a novel generative framework to generate canonical names for entities as well as stances among them.
- Symptom Identification for Interpretable Detection of Multiple Mental Disorders on Social Media
- Zhiling Zhang, Siyuan Chen, Mengyue Wu, Kenny Zhu
- TLDR: We present a novel annotation framework for symptom-assisted detection of multiple mental disorders, enabled by symptom prediction, and show that it outperforms strong pure-text baselines.
- Improving Iterative Text Revision by Learning Where to Edit from Other Revision Tasks
- Zae Myung Kim, Wanyu Du, Vipul Raheja, Dhruv Kumar, Dongyeop Kang
- TLDR: We propose a novel iterative text revision system that can iteratively generate helpful edits by explicitly detecting editable spans (where-to-edit) with their corresponding edit intents and then instructing a revision model to revise the detected edit spans.
- CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning
- Zeqiu Wu, Yi Luan, Hannah Rashkin, David Reitter, Hannaneh Hajishirzi, Mari Ostendorf, Gaurav Singh Tomar
- TLDR: We present a query rewriting model for conversational question answering that improves the performance of off-the-shelf passage retrievers.
- Specializing Multi-domain NMT via Penalizing Low Mutual Information
- Jiyoung Lee, Hantae Kim, Hyunchang Cho, Edward Choi, Cheonbok Park
- TLDR: We propose a new objective that penalizes low mutual information between domains and translations, yielding more domain-specific multi-domain NMT.
- A Simple Contrastive Learning Framework for Interactive Argument Pair Identification via Argument-Context Extraction
- Lida Shi, Fausto Giunchiglia, Rui Song, Daqian Shi, Tongtong Liu, Xiaolei Diao, Hao Xu
- TLDR: We propose a simple contrastive learning framework for argument pair identification by extracting valuable information from the context.
- Sentence-level Media Bias Analysis Informed by Discourse Structures
- Yuanyuan Lei, Ruihong Huang, Lu Wang, Nick Beauchamp
- TLDR: We propose a novel system for identifying sentences within an article that can reveal the ideological bias of the entire article.
- Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure
- Xueliang Zhao, Lemao Liu, Tingchen Fu, Shuming Shi, Dongyan Zhao, Rui Yan
- TLDR: We propose a novel dialogue generation model with a transferable latent structure that is easily transferable from the general domain to downstream tasks in a lightweight and transparent way.
- An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks
- Changlong Yu, Tianyi Xiao, Lingpeng Kong, Yangqiu Song, Wilfred Ng
- TLDR: We propose trivial graphs as necessary baselines for knowledge fusion and show that trivial graphs can be used to improve language model pretraining performance in fully-supervised and few-shot settings.
- Unsupervised Non-transferable Text Classification
- Guangtao Zeng, Wei Lu
- TLDR: We propose a novel unsupervised non-transferable learning method for text classification task that does not require annotated target domain data.
- Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Prediction
- Thong Nguyen, Xiaobao Wu, Anh Tuan Luu, Zhen Hai, Lidong Bing
- TLDR: We propose Multi-modal Contrastive Learning for the Multimodal Review Helpfulness Prediction problem, concentrating on mutual information between input modalities to explicitly model cross-modal relationships.
- Adaptive Token-level Cross-lingual Feature Mixing for Multilingual Neural Machine Translation
- Junpeng Liu, Kaiyu Huang, Jiuyi Li, Huan Liu, Jinsong Su, Degen Huang
- TLDR: We propose a novel token-level feature mixing method that enables the model to capture different features and dynamically determine the feature sharing across languages.
- A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach
- Yew Ken Chia, Lidong Bing, Sharifah Mahani Aljunied, Luo Si, Soujanya Poria
- TLDR: We propose CubeRE, a cube-filling model for hyper-relational extraction and cube-pruning, which can be used to improve knowledge graph construction.
- Low-resource Neural Machine Translation with Cross-modal Alignment
- Zhe Yang, Qingkai Fang, Yang Feng
- TLDR: We propose a cross-modal contrastive learning method to learn a shared space for all languages, where both a coarse-grained sentence-level objective and a fine-grained token-level one are introduced.
- Prompt-based Distribution Alignment for Domain Generalization in Text Classification
- Chen Jia, Yue Zhang
- TLDR: We learn domain invariant representation across source and downstream domains by prompting.
- Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering
- Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria
- TLDR: We propose a simple refactoring of multi-choice question answering tasks as a series of binary classifications.
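The refactoring is mechanical; a minimal sketch follows (field names are illustrative): each (question, choice) pair becomes a binary example, and at test time the choice with the highest positive score wins.

```python
def to_binary_examples(question: str, choices: list[str], answer_idx: int) -> list[dict]:
    """Turn one multi-choice question into len(choices) binary examples:
    label 1 for the correct choice, 0 for every distractor."""
    return [
        {"text": f"{question} [SEP] {choice}", "label": int(i == answer_idx)}
        for i, choice in enumerate(choices)
    ]
```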
- HEGEL: Hypergraph Transformer for Long Document Summarization
- Haopeng Zhang, Xiao Liu, Jiawei Zhang
- TLDR: We propose HEGEL, a hypergraph neural network for long document summarization by capturing high-order cross-sentence relations.
- Adapting a Language Model While Preserving its General Knowledge
- Zixuan Ke, Yijia Shao, Haowei Lin, Hu Xu, Lei Shu, Bing Liu
- TLDR: We propose a novel method for domain-adaptive pre-training that uses an unlabeled corpus of a particular domain to adapt the LM so that end-tasks in the domain achieve improved performance.
- Human Guided Exploitation of Interpretable Attention Patterns in Summarization and Topic Segmentation
- Raymond Li, Wen Xiao, Linzi Xing, Lanjun Wang, Gabriel Murray, Giuseppe Carenini
- TLDR: We propose a new pipeline for learning task-specific attention patterns in transformers and show improvements in accuracy and efficiency.
- Continual Training of Language Models for Few-Shot Learning
- Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bing Liu
- TLDR: We propose a continual post-training system for large language models that improves few-shot end-task learning in many NLP applications.
- Dictionary-Assisted Supervised Contrastive Learning
- Patrick Wu, Richard Bonneau, Joshua Tucker, Jonathan Nagler
- TLDR: We introduce dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries when fine-tuning pretrained language models.
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
- Huanru Henry Mao
- TLDR: We propose decaying fast weights as a simple alternative for autoregressive Transformers; it outperforms prior fast-weight methods and retains 99% of attention’s performance on WikiText-103.
- PRO-CS : An Instance-Based Prompt Composition Technique for Code-Switched Tasks
- Srijan Bansal, Suraj Tripathi, Sumit Agarwal, Teruko Mitamura, Eric Nyberg
- TLDR: We propose a novel instance-based prompt composition technique for code-switched tasks that combine language and task knowledge.
- SentBS: Sentence-level Beam Search for Controllable Summarization
- Chenhui Shen, Liying Cheng, Lidong Bing, Yang You, Luo Si
- TLDR: We propose a sentence-level beam search generation method for structured text generation, which improves the agreement between the generated text and the desired structure.
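A hedged sketch of beam search at sentence granularity, where `generate_candidates` (proposes next-sentence continuations) and `score` (rates a partial text against the desired structure) are assumed callables; the paper's candidate generation and scoring will differ.

```python
def sentence_beam_search(generate_candidates, score, prefix: str = "",
                         beam_width: int = 4, max_sentences: int = 5) -> str:
    """Beam search over whole sentences rather than tokens: extend each beam
    with candidate next sentences, keep the best `beam_width` by score."""
    beams = [prefix]
    for _ in range(max_sentences):
        expanded = [b + s for b in beams for s in generate_candidates(b)]
        if not expanded:
            break
        beams = sorted(expanded, key=score, reverse=True)[:beam_width]
    return max(beams, key=score)
```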
- A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification
- Kaifa Zhao, Le Yu, Shiyao Zhou, Jing Li, Xiapu Luo, Yat Fei Aemon Chiu, Yutong Liu
- TLDR: We construct the first Chinese privacy policy dataset for Android applications and provide a robust and representative dataset for sequence labeling tasks and regulation compliance identification between privacy policies and software.
- Saving Dense Retriever from Shortcut Dependency in Conversational Search
- Sungdong Kim, Gangwoo Kim
- TLDR: We show that dense retrievers in conversational search depend on shortcuts in conversational inputs and propose ways to mitigate this dependency.
- Graph-Induced Transformers for Efficient Multi-Hop Question Answering
- Giwon Hong, Jeonghwan Kim, Junmo Kang, Sung-Hyon Myaeng
- TLDR: Graph-Induced Transformer for multi-hop question answering tasks.
- DiscoSense: Commonsense Reasoning with Discourse Connectives
- Prajjwal Bhargava, Vincent Ng
- TLDR: We present DiscoSense, a benchmark for commonsense reasoning via understanding a wide variety of discourse connectives.
- Boosting Document-Level Relation Extraction by Mining and Injecting Logical Rules
- Shengda Fan, Shasha Mo, Jianwei Niu
- TLDR: We propose a logic enhanced framework for document-level relation extraction that improves the relation extraction performance and inference.
- MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective
- Zhe Hu, Hou Pong Chan, Lifu Huang
- TLDR: We propose a novel multi-task training strategy for long text generation grounded on the cognitive theory of writing, which empowers the model to learn essential subskills needed for writing including planning and reviewing besides end-to-end generation.
- Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
- Zhuang Li, Lizhen Qu, Qiongkai Xu, Tongtong Wu, Tianyang Zhan, Gholamreza Haffari
- TLDR: We propose a variational autoencoder with disentanglement priors for task-specific natural language generation with none or a handful of task- specific labeled examples.
- CISLR: Corpus for Indian Sign Language Recognition
- Abhinav Joshi, Ashwani Bhat, Pradeep S, Priya Gole, Shashwat Gupta, Shreyansh Agarwal, Ashutosh Modi
- TLDR: We propose a new dataset CISLR (Corpus for Indian Sign Language Recognition) for word-level recognition in Indian Sign language using videos.
- Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction
- Kai Shen, Yichong Leng, Xu Tan, Siliang Tang, Yuan Zhang, Wenjie Liu, Edward Lin
- TLDR: We propose a simple yet effective masking strategy for text error correction that improves the accuracy consistently and effectively.
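Under the simplifying assumption of length-aligned source and target token sequences, the strategy can be sketched as masking a fraction of the already-correct source tokens so the model cannot learn to solve training purely by copying; the rate `p` is hypothetical.

```python
import random

def mask_correct_tokens(src: list[str], tgt: list[str],
                        mask_token: str = "[MASK]", p: float = 0.3) -> list[str]:
    """Mask source tokens that already match the target with probability p;
    erroneous tokens are kept so the model still learns to correct them."""
    return [
        mask_token if s == t and random.random() < p else s
        for s, t in zip(src, tgt)
    ]
```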
- AMAL: Meta Knowledge-Driven Few-Shot Adapter Learning
- S. K. Hong, Tae Young Jang
- TLDR: Meta-learning-driven low-rank adapter pooling method for leveraging pre-trained language models even with just a few data points.
- Discourse Context Predictability Effects in Hindi Word Order
- Sidharth Ranjan, Marten van Schijndel, Sumeet Agarwal, Rajakrishnan Rajkumar
- TLDR: We investigate the role of discourse predictability in Hindi syntactic priming and show that it influences word order preferences.
- “Covid vaccine is against Covid but Oxford vaccine is made at Oxford!” Semantic Interpretation of Proper Noun Compounds
- Keshav Kolluru, Gabriel Stanovsky, Mausam
- TLDR: We present a new dataset of 22.5K proper noun compounds along with their free-form semantic interpretations and show that adding targeted knowledge, particularly about the common noun, results in performance gains of upto 2.8%.
- Context Limitations Make Neural Language Models More Human-Like
- Tatsuki Kuribayashi, Yohei Oseki, Ana Brassard, Kentaro Inui
- TLDR: We show that constraining the LMs’ context access improves their simulation of human reading behavior.
- A Generative Model for End-to-End Argument Mining with Reconstructed Positional Encoding and Constrained Pointer Mechanism
- Jianzhu Bao, Yuhang He, Yang Sun, Bin Liang, Jiachen Du, Bing Qin, Min Yang, Ruifeng Xu
- TLDR: We propose a novel generative framework that handles all the subtasks of argument mining in an end-to-end fashion.
- Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality
- Pei Zhou, Hyundong Cho, Pegah Jandaghi, Dong-Ho Lee, Bill Yuchen Lin, Jay Pujara, Xiang Ren
- TLDR: We show that current response generation models produce generic and dull responses in dialogues because they act reflexively, failing to explicitly model CG, both due to the lack of CG in training data and the standard RG training procedure.
- FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows
- Jianqiao Zhao, Yanyang Li, Wanyu Du, Yangfeng Ji, Dong Yu, Michael Lyu, Liwei Wang
- TLDR: We propose a new framework for dialogue evaluation featuring dialog act information and demonstrate its effectiveness and other desirable characteristics.
- FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning
- Suvir Mirchandani, Licheng Yu, Mengjiao Wang, Animesh Sinha, Wenwen Jiang, Tao Xiang, Ning Zhang
- TLDR: We propose a novel fashion-specific pre-training framework based on weakly-supervised triplets constructed from fashion image-text pairs.
- MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences
- Wei Han, Hui Chen, Min-Yen Kan, Soujanya Poria
- TLDR: We propose a novel approach named MM-Align to address the missing-modality inference problem.
- Evaluating the Knowledge Dependency of Questions
- Hyeongdon Moon, Yoonseok Yang, Hangyeol Yu, Seunghyun Lee, Myeongho Jeong, Juneyoung Park, Jamin Shin, Minsam Kim, Seungtaek Choi
- TLDR: We propose a novel automatic evaluation metric for multiple-choice question (MCQ) generation that measures an MCQ's answerability given knowledge of the target fact.
- MoSE: Modality Split and Ensemble for Multimodal Knowledge Graph Completion
- Yu Zhao, Xiangrui Cai, Yike Wu, Haiwei Zhang, Ying Zhang, Guoqing Zhao, Ning Jiang
- TLDR: We propose a new method for multimodal knowledge graph completion based on modality-split representation learning and ensemble inference.
- Entropy-Based Vocabulary Substitution for Incremental Learning in Multilingual Neural Machine Translation
- Kaiyu Huang, Peng Li, Jin Ma, Yang Liu
- TLDR: We propose an entropy-based vocabulary substitution method for incremental learning in large-scale multilingual data updating, which only needs to walk through the new language pairs while keeping the vocabulary size unchanged.
- Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation
- Yanyang Li, Jianqiao Zhao, Michael Lyu, Liwei Wang
- TLDR: We treat large pre-trained models as a noisy knowledge source for dialogue generation and propose a new method for eliciting knowledge from them.
- An Unsupervised, Geometric and Syntax-aware Quantification of Polysemy
- Anmol Goel, Charu Sharma, Ponnurangam Kumaraguru
- TLDR: We propose a novel, unsupervised framework to quantify polysemy scores for words in multiple languages and show that syntax is intricately linked to ambiguity/polysemy.
- Reorder and then Parse, Fast and Accurate Discontinuous Constituency Parsing
- Kailai Sun, Zuchao Li, Hai Zhao
- TLDR: We propose a novel reordering method for building fast and accurate discontinuous constituency parsers that operate in a continuous fashion.
- Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature
- Tomas Goldsack, Zhihao Zhang, Chenghua Lin, Carolina Scarton
- TLDR: We present two novel biomedical lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale), each of which contains biomedical journal articles alongside expert-written lay summaries.
- Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference
- Sara Rajaee, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
- TLDR: We investigate the reasons for the emergence of the overlap bias in the NLI models and the role of minority examples in mitigating this bias.
- An Empirical Study on the Transferability of Transformer Modules in Parameter-efficient Fine-tuning
- Mohammad AkbarTajari, Sara Rajaee, Mohammad Taher Pilehvar
- TLDR: We study which Transformer modules most effectively transfer knowledge from a pre-trained model to downstream tasks in parameter-efficient fine-tuning.
- CODER: An efficient framework for improving retrieval through COntextual Document Embedding Reranking
- George Zerveas, Navid Rekabsaz, Daniel Cohen, Carsten Eickhoff
- TLDR: We introduce CODER, a framework for COntextual Document Embedding Reranking that improves dense retrieval performance by a significant margin.
- AdapterShare: Task Correlation Modeling with Adapter Differentiation
- Zhi Chen, Bei Chen, Lu Chen, Kai Yu, Jian-Guang Lou
- TLDR: We propose AdapterShare, an adapter differentiation method to explicitly model the task correlation among multiple tasks.
- Rethinking Task-Specific Knowledge Distillation: Contextualized Corpus as Better Textbook
- Chang Liu, Chongyang Tao, Jianxin Liang, Tao Shen, Jiazhan Feng, Quzhe Huang, Dongyan Zhao
- TLDR: We present a novel approach to knowledge distillation that enables task-specific distillation without general distillation, letting the teacher customize the student model to the desired size under various computation constraints.
- Recovering Gold from Black Sand: Multilingual Dense Passage Retrieval with Hard and False Negative Samples
- Tianhao Shen, Mingtong Liu, Ming Zhou, Deyi Xiong
- TLDR: We propose a novel multilingual dense passage retrieval framework that uses hard and false negative samples to recover and utilize multilingual negative samples.
- The “Problem” of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
- Barbara Plank
- TLDR: We argue that there exists genuine human variation in labeling due to disagreement, subjectivity in annotation or multiple plausible answers.
- Quality Scoring of Source Words in Neural Translation Models
- Priyesh Jain, Sunita Sarawagi, Tushar Tomar
- TLDR: We propose a simple approach to word-level quality scores for input source sentences, based on comparing the difference of probabilities from two language models (an illustrative sketch follows below).
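A toy sketch of the two-model scoring recipe; `logprob_lm_a` and `logprob_lm_b` are assumed callables (stand-ins for whichever two language models are compared), so only the score arithmetic reflects the summary above:

```python
def word_quality_scores(tokens, logprob_lm_a, logprob_lm_b):
    """Score each word by the difference of two LMs' log-probabilities.

    Each scorer maps (tokens, position) -> log p(token | context);
    a large gap between the two models flags a potentially
    problematic source word.
    """
    return [logprob_lm_a(tokens, i) - logprob_lm_b(tokens, i)
            for i in range(len(tokens))]

# Stubbed-out scorers for demonstration only:
lm_a = lambda toks, i: -6.0 if toks[i] == "rare" else -1.0
lm_b = lambda toks, i: -1.5
print(word_quality_scores("a rare token".split(), lm_a, lm_b))
# [0.5, -4.5, 0.5] -- the low score singles out "rare"
```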
- Pneg: Prompt-based Negative Response Generation for Dialogue Response Selection Task
- Nyoungwoo Lee, ChaeHun Park, Ho-Jin Choi, Jaegul Choo
- TLDR: We propose a novel method for synthesizing adversarial negative responses for dialogue response selection, leveraging a large-scale language model.
- Facilitating Contrastive Learning of Discourse Relational Senses by Exploiting the Hierarchy of Sense Relations
- Wanqiu Long, Bonnie Webber
- TLDR: We present a novel approach to implicit discourse relation recognition that uses the sense hierarchy to select negative examples for contrastive learning.
- Simplified Graph Learning for Inductive Short Text Classification
- Kaixin Zheng, Yaqing Wang, Quanming Yao, Dejing Dou
- TLDR: We propose SimpleSTC, a simplified graph-learning approach to inductive short text classification (STC) that leverages word-level information.
- Don’t Stop Fine-Tuning: On Training Regimes for Few-Shot Cross-Lingual Transfer with Multilingual Language Models
- Fabian David Schmidt, Ivan Vulić, Goran Glavaš
- TLDR: We present a spectrum of training regimes for few-shot cross-lingual transfer (FS-XLT) that yield both improved and stable performance across the board.
- Towards Compositional Generalization in Code Search
- Hojae Han, Seung-won Hwang, Shuai Lu, Nan Duan, Seungtaek Choi
- TLDR: We propose a new code search algorithm that uses templates as building blocks for compositional generalization.
- Towards relation extraction from speech
- Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari
- TLDR: We propose the new task of relation extraction from speech, which extracts relations directly from spoken input.
- Structural Constraints and Natural Language Inference for End-to-End Flowchart Grounded Dialog Response Generation
- Dinesh Raghu, Suraj Joshi, Sachindra Joshi, Mausam
- TLDR: We propose Structure-Aware FLONET, a new approach for learning flowchart grounded dialog systems which learns to predict the polarity of indirect Y/N answers.
- SLICER: Sliced Fine-Tuning for Low-Resource Cross-Lingual Transfer for Named Entity Recognition
- Fabian David Schmidt, Ivan Vulić, Goran Glavaš
- TLDR: We propose a simple yet highly effective approach for improving zero-shot transfer for NER to low-resource languages.
- EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
- Tao Ge, Si-Qing Chen, Furu Wei
- TLDR: We introduce EdgeFormer – a parameter-efficient Transformer for on-device seq2seq generation under the strict computation and memory constraints.
- End-to-End Unsupervised Vision-and-Language Pre-training with Referring Expression Matching
- Chi Chen, Peng Li, Maosong Sun, Yang Liu
- TLDR: We propose a novel vision encoder for unsupervised vision-and-language pre-training that learns multimodal representations without parallel image-caption data.
- Faithful Knowledge Graph Explanations in Commonsense Question Answering
- Guy Aglionby, Simone Teufel
- TLDR: We show that faithful graph-based explanations cannot be extracted from existing knowledge-graph-augmented models for commonsense question answering.
- KOLD: Korean Offensive Language Dataset
- Younghoon Jeong, Juhyun Oh, Jongwon Lee, Jaimeen Ahn, Jihyung Moon, Sungjoon Park, Alice Oh
- TLDR: We present a new dataset for offensive language detection in Korean and show that it is effective in offensiveness detection, target classification, and target span detection.
- Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
- Wenhao Li, Xiaoyuan Yi, Jinyi Hu, Maosong Sun, Xing Xie
- TLDR: We propose a novel attention regularization loss that controls the sharpness of the attention distribution in Transformer architectures, improving the diversity and novelty of generated text (an illustrative sketch follows below).
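A generic sketch of an entropy-based attention regularizer, written with PyTorch; the paper's exact loss may differ, and the scaling coefficient `0.1` is an arbitrary illustrative choice:

```python
import torch

def attention_entropy(attn, eps=1e-9):
    """Mean entropy of attention rows; attn holds distributions on its last dim.

    Adding this term to the training loss (with a positive coefficient)
    pushes attention toward sharper, more concentrated distributions.
    """
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)
    return entropy.mean()

attn = torch.softmax(torch.randn(2, 4, 8), dim=-1)  # toy attention maps
loss_regularizer = 0.1 * attention_entropy(attn)
print(loss_regularizer.item())
```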
- The better your Syntax, the better your Semantics? Probing Pretrained Language Models for the English Comparative Correlative
- Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, Hinrich Schütze
- TLDR: We present an investigation of the ability of pretrained language models to classify and understand one of the most commonly studied constructions, the English comparative correlative.
- ProofInfer: Generating Proof via Iterative Hierarchical Inference
- Zichu Fei, Qi Zhang, Xin Zhou, Tao Gui, Xuanjing Huang
- TLDR: We propose a new algorithm for deductive reasoning that generates a proof by iteratively adding nodes to a proof tree.
- ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts
- Rajdeep Mukherjee, Abhinav Bohra, Akash Banerjee, Soumya Sharma, Manjunath Hegde, Afreen Shaikh, Shivani Shrivastava, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal
- TLDR: We present a new dataset with transcripts of earnings calls hosted by publicly traded companies as documents, paired with expert-written, short, telegram-style bullet-point summaries derived from corresponding Reuters articles.
- Cross-domain Generalization for AMR Parsing
- Xuefeng Bai, Sen Yang, Leyang Cui, Linfeng Song, Yue Zhang
- TLDR: We analyze challenges to cross-domain AMR parsing and propose two approaches to reduce the domain distribution divergence of text and AMR features, respectively.
- CiteSum: Citation Text-guided Scientific Extreme Summarization and Domain Adaptation with Limited Supervision
- Yuning Mao, Ming Zhong, Jiawei Han
- TLDR: We propose a simple yet effective approach to automatically extracting TLDR summaries for scientific papers from their citation texts.
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue
- Alon Albalak, Yi-Lin Tuan, Pegah Jandaghi, Connor Pryor, Luke Yoffe, Deepak Ramachandran, Lise Getoor, Jay Pujara, William Yang Wang
- TLDR: We present a benchmark for conversational task transfer in open-domain dialogue and show that most performance trends are model-specific, and span extraction and multiple-choice tasks benefit the most from task transfer.
- Do Children Texts Hold The Key To Commonsense Knowledge?
- Julien Romero, Simon Razniewski
- TLDR: We propose a novel approach to commonsense knowledge compilation based on the hypothesis that children's texts hold the key, since they state commonsense assertions more explicitly.
- On the Limitations of Reference-Free Evaluations of Generated Text
- Daniel Deutsch, Rotem Dror, Dan Roth
- TLDR: We show that reference-free metrics are inherently biased and limited in their ability to evaluate generated text, and we argue that they should not be used to measure progress on tasks like machine translation or summarization.
- Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
- Bryan Eikema, Wilker Aziz
- TLDR: We show that mode-seeking strategies can aid in constructing compact sets of promising hypotheses and that MBR decoding is effective at identifying good translations among them (an illustrative sketch follows below).
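A self-contained toy of the sampling-based MBR objective (argmax of expected utility over sampled hypotheses); the unigram-F1 utility below is a stand-in for metrics like BLEU or chrF, and none of this mirrors the paper's specific approximations:

```python
from collections import Counter

def unigram_f1(hyp, ref):
    """Token-level F1 between two strings (toy stand-in for BLEU/chrF)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(h.values()), overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def mbr_decode(samples, utility=unigram_f1):
    """Return the sample with the highest average utility against all samples,
    i.e. the hypothesis closest to the rest of the sampled set."""
    return max(samples,
               key=lambda cand: sum(utility(cand, ref) for ref in samples))

samples = ["the cat sat on the mat",   # would come from sampling an NMT model
           "a cat sat on a mat",
           "the cat is on the mat"]
print(mbr_decode(samples))
```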
- IndicXNLI: Evaluating Multilingual Inference for Indian Languages
- Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan
- TLDR: We present INDICXNLI, an NLI dataset covering 11 Indic languages.
- Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems
- Neeraj Varshney, Chitta Baral
- TLDR: We present a simple technique for efficient NLP systems that utilizes a collection of models of varying capacities to output predictions accurately yet efficiently (an illustrative sketch follows below).
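A minimal sketch of confidence-based cascading, where cheap models answer easy inputs and expensive models only run when needed; the models, thresholds, and confidence notion here are all illustrative assumptions, not the paper's exact routing criterion:

```python
def cascade_predict(x, models, thresholds):
    """Try models from cheapest to most expensive, stopping once one
    is confident enough; each model maps x -> (label, confidence)."""
    label = None
    for model, tau in zip(models, thresholds):
        label, confidence = model(x)
        if confidence >= tau:
            return label
    return label  # fall back to the last (largest) model's answer

# Stubbed models for demonstration: a cheap one and an expensive one.
small = lambda x: ("positive", 0.62)
large = lambda x: ("negative", 0.91)
print(cascade_predict("some input text", [small, large], [0.80, 0.0]))  # "negative"
```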
- Semantic Simplification for Sentiment Classification
- Xiaotong Jiang, Zhongqing Wang, Guodong Zhou
- TLDR: We propose a novel method for document-level sentiment classification based on simplified clauses and abstract meaning representation.
- XPrompt: Exploring the Extreme of Prompt Tuning
- Fang Ma, Chen Zhang, Lei Ren, Jingang Wang, Qifan Wang, Wei Wu, Xiaojun Quan, Dawei Song
- TLDR: We propose a novel prompt tuning model with an eXtremely small scale (XPrompt) under the regime of the lottery ticket hypothesis.
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
- Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
- TLDR: We show that ground-truth demonstrations are not required for in-context learning, and that other aspects of the demonstrations are the key drivers of end-task performance.
- The Curious Case of Control
- Elias Stengel-Eskin, Benjamin Van Durme
- TLDR: We explore the connections between control and labeling event participants with properties typically associated with agents and patients.
- SHARE: a System for Hierarchical Assistive Recipe Editing
- Shuyang Li, Yufei Li, Jianmo Ni, Julian McAuley
- TLDR: We propose a novel system for controllable recipe editing that produces convincing, coherent recipes that are appropriate for a target dietary constraint.
- IM^2: an Interpretable and Multi-category Integrated Metric Framework for Automatic Dialogue Evaluation
- Zhihua Jiang, Guanghui Ye, Dongning Rao, Di Wang, Xin Miao
- TLDR: We propose a new dialogue metric which combines a large number of metrics which are good at measuring different qualities.
- PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
- Yuan Yao, Qianyu Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
- TLDR: We propose a novel language modeling framework for vision-language pre-training that enables explicit object position modeling without object detectors.
- Pre-training Language Models with Deterministic Factual Knowledge
- Shaobo Li, Xiaoguang Li, Lifeng Shang, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu
- TLDR: We propose to learn the deterministic relationship between the remaining context and the masked content during language model pre-training, and use it to improve model robustness.
- Finding Skill Neurons in Pre-trained Transformer-based Language Models
- Xiaozhi Wang, Kaiyue Wen, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li
- TLDR: We show that the activations of certain neurons within pre-trained Transformers are highly predictive of task labels, and we call these skill neurons.
- Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue
- Yingxiu Zhao, Yinhe Zheng, Zhiliang Tian, Chang Gao, Jian Sun, Nevin L. Zhang
- TLDR: We propose a novel method, prompt conditioned VAE for lifelong learning (PCLL), to enhance generative replay by incorporating tasks’ statistics.
- PreQuEL: Quality Estimation of Machine Translation Outputs in Advance
- Shachar Don-Yehiya, Leshem Choshen, Omri Abend
- TLDR: We present PreQuEL, Pre-(Quality-Estimation) Learning: the task of predicting how well a sentence will be translated, in advance of the actual translation.
- Can Transformers Reason in Fragments of Natural Language?
- Viktor Schlegel, Kamen Pavlov, Ian Pratt-Hartmann
- TLDR: We investigate the ability of transformer-based language models to make formally valid inferences in controlled fragments of natural language for which the satisfiability problem becomes increasingly complex.
- Textless Speech Emotion Conversion using Discrete & Decomposed Representations
- Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu Anh Nguyen, Morgan Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi
- TLDR: We present a novel method for speech emotion conversion that decomposes the speech signal into discrete representations and modifies the expressed emotion while preserving the spoken content.
- Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks
- Yangyi Chen, Fanchao Qi, Hongcheng Gao, Zhiyuan Liu, Maosong Sun
- TLDR: We find two simple tricks that can make existing textual backdoor attacks much more harmful.
- Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP
- Yangyi Chen, Hongcheng Gao, Ganqu Cui, Fanchao Qi, Longtao Huang, Zhiyuan Liu, Maosong Sun
- TLDR: We propose a new benchmark for security-oriented adversarial NLP, along with a simple heuristic-rule-based method that easily fulfills actual adversarial goals, to simulate real-world attack methods.
- Retrieval Augmented Visual Question Answering with Outside Knowledge
- Weizhe Lin, Bill Byrne
- TLDR: We propose a novel approach to training external knowledge visual question answering systems that use differentiable DPR for retrieval and answer generation.
- Instance Regularization for Discriminative Language Model Pre-training
- Zhuosheng Zhang, Hai Zhao, Ming Zhou
- TLDR: We propose to estimate the complexity of restoring the original sentences from corrupted ones in language model pre-training.
- GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation
- Mingzhou Xu, Longyue Wang, Derek F. Wong, Hongye Liu, Linfeng Song, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
- TLDR: We propose a benchmark test set for targeted evaluation of zero pronoun (ZP) translation, along with a new model for studying ZP translation.
- ScienceWorld: Is your Agent Smarter than a 5th Grader?
- Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu
- TLDR: We present ScienceWorld, a benchmark to test agents’ scientific reasoning abilities in a new interactive text environment at the level of a standard elementary school science curriculum.
- Improving Embeddings Representations for Comparing Higher Education Curricula: A Use Case in Computing
- Jeffri Murrugarra-Llerena, Fernando Alva-Manchego, Nils Murrugarra-LLerena
- TLDR: We propose an approach for comparing curricula of study programs in higher education.
- Mitigating Spurious Correlation in Natural Language Understanding with Counterfactual Inference
- Can Udomcharoenchaikit, Wuttikorn Ponwitayarat, Patomporn Payoungkhamdee, Kanruethai Masuk, Weerayut Buaphet, Ekapol Chuangsuwanich, Sarana Nutanong
- TLDR: We propose a causal analysis framework to help debias NLU models.
- End-to-End Neural Discourse Deixis Resolution in Dialogue
- Shengjie Li, Vincent Ng
- TLDR: We adapt Lee et al.’s span-based entity coreference model to the task of end-to-end discourse deixis resolution in dialogue, specifically by proposing extensions to their model that exploit task-specific characteristics.
- Balancing out Bias: Achieving Fairness Through Balanced Training
- Xudong Han, Timothy Baldwin, Trevor Cohn
- TLDR: We propose a simple objective for countering group bias in natural language processing tasks through balanced training.
- Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
- Mengzhou Xia, Mikel Artetxe, Jingfei Du, Danqi Chen, Veselin Stoyanov
- TLDR: We adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models in a wide range of tasks.
- Identifying Physical Object Use in Sentences
- Tianyu Jiang, Ellen Riloff
- TLDR: We present a new task of identifying physical object use in sentences, supporting more intuitive and effective sentence understanding.
- CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation
- Deeksha Varshney, Aizan Zafar, Niranshu Behera, Asif Ekbal
- Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals
- Maarten De Raedt, Fréderic Godin, Chris Develder, Thomas Demeester
- TLDR: We propose a novel solution that only requires annotation of a small fraction (e.g., 1%) of the original training data, and uses automatic generation of extra counterfactuals in an encoding vector space.
- Data-Efficient Playlist Captioning With Musical and Linguistic Knowledge
- Giovanni Gabbolini, Romain Hennequin, Elena Epure
- TLDR: We propose PlayNTell, a data-efficient multi-modal encoder-decoder model for automatic playlist captioning.
- Improved grammatical error correction by ranking elementary edits
- Alexey Sorokin
- TLDR: We propose a two-stage reranking method for grammatical error correction that achieves state-of-the-art quality on the BEA 2019 English dataset, even with a weak BERT-GEC edit generator.
- Improving Tokenisation by Alternative Treatment of Spaces
- Edward Gow-Smith, Harish Tayyar Madabushi, Carolina Scarton, Aline Villavicencio
- GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation
- Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, Daniel Weld
- TLDR: We present a system for running standardized human evaluations of text generation tasks that are reproducible over time and across different populations.
- Attentional Probe: Estimating a Module’s Functional Potential
- Tiago Pimentel, Josef Valvoda, Niklas Stoehr, Ryan Cotterell
- When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems
- Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, Yu Su
- TLDR: We show that as the training dataset grows, the performance of mainstream neural NLU models on a small set of new symbols often decreases.
- Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
- Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Houfeng Wang
- TLDR: We propose a novel model that uses a unified prompt for multilingual PLMs, which can significantly outperform the strong baselines across different languages.
- Three Real-World Datasets and Neural Computational Models for Classification Tasks in Patent Landscaping
- Subhash Pujari, Jannik Strötgen, Mark Giereth, Michael Gertz, Annemarie Friedrich
- TLDR: We present a novel neural model for patent landscape studies that takes into account textual information from the patents’ full texts as well as embeddings created based on the patents’ CPC labels.
- Topic Modeling With Topological Data Analysis
- Ciarán Byrne, Danijela Horak, Karo Moilanen, Amandla Mabona
- TLDR: Unsupervised topic modelling using topological data analysis.
- Predicting Fine-Tuning Performance with Probing
- Zining Zhu, Soroosh Shahtalebi, Frank Rudzicz
- TLDR: We propose a lightweight method for probing deep NLP models to extract a proxy signal widely used in model development – the fine-tuning performance.
- Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers
- Abhijeet Awasthi, Ashutosh Sathe, Sunita Sarawagi
- TLDR: We present ReFill, a framework for synthesizing high-quality and textually diverse parallel datasets for adapting Text-to-SQL parsers.
- Agent-Specific Deontic Modality Detection in Legal Language
- Abhilasha Sancheti, Aparna Garimella, Balaji Vasan Srinivasan, Rachel Rudinger
- TLDR: We present a corpus of English contracts annotated with deontic modalities and a model trained on it that can detect red flags in the legal domain.
- COLD: A Benchmark for Chinese Offensive Language Detection
- Jiawen Deng, Jingyan Zhou, Hao Sun, Chujie Zheng, Fei Mi, Helen Meng, Minlie Huang
- TLDR: We propose a benchmark for Chinese offensive language detection and a baseline detector for generative models.
- Fixing Model Bugs with Natural Language Patches
- Shikhar Murty, Christopher Manning, Scott Lundberg, Marco Tulio Ribeiro
- TLDR: We propose natural language patches that can improve the accuracy of neural network models by up to 7 points on a large dataset, and show that fine-tuning on as many as 100 labeled examples may be needed to match the performance of a small set of language patches.
- WeDef: Weakly Supervised Backdoor Defense for Text Classification
- Lesheng Jin, Zihan Wang, Jingbo Shang
- TLDR: We propose a novel weakly supervised backdoor defense framework WeDef.
- Interventional Training for Out-Of-Distribution Natural Language Understanding
- Sicheng Yu, Jing Jiang, Hao Zhang, Yulei Niu, Qianru Sun, Lidong Bing
- TLDR: We propose a novel interventional training method for NLU tasks that suffer from confounding bias in out-of-distribution settings.
- Pseudo-Relevance for Enhancing Document Representation
- Jihyuk Kim, Seung-won Hwang, Seoho Song, Hyeseon Ko, Young-In Song
- TLDR: We propose a novel multi-vector representation for the bi-encoder approach in dense document retrieval, which reduces latency and memory footprint without compromising effectiveness.
- ZeroGen: Efficient Zero-shot Learning via Dataset Generation
- Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong
- TLDR: We present a novel and efficient zero-shot dataset generation method based on pre-trained language models.
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings
- Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm
- TLDR: We use controlled nearest neighbor sampling over citation graph embeddings for contrastive learning to learn continuous similarity in scientific document representations.
- SPE: Symmetrical Prompt Enhancement for Fact Probing
- Yiyuan Li, Tong Che, Yezhen Wang, Zhengbao Jiang, Caiming Xiong, Snigdha Chaturvedi
- TLDR: We propose a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction.
- Efficient Large Scale Language Modeling with Mixtures of Experts
- Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giridharan Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O’Horo, Jeffrey Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Veselin Stoyanov
- TLDR: We show that autoregressive language models with Mixture-of-Experts layers can outperform dense models in a wide range of settings.
- MedJEx: A Medical Jargon Extraction Model with Wiki’s Hyperlink Span and Contextualized Masked Language Model Score
- Sunjae Kwon, Zonghai Yao, Harmon Jordan, David Levy, Brian Corner, Hong Yu
- TLDR: We present a novel medical jargon extraction algorithm which outperforms existing state-of-the-art NLP models.
- Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections
- Wei-Jen Ko, Cutter Dalton, Mark Simmons, Eliza Fisher, Greg Durrett, Junyi Jessy Li
- TLDR: We present a novel paradigm for discourse comprehension by question answering, which captures both discourse and semantic links between sentences in the form of free-form, open-ended questions.
- Learning to Generate Overlap Summaries through Noisy Synthetic Data
- Naman Bansal, Mousumi Akter, Shubhra Kanti Karmaker Santu
- TLDR: We propose a novel data augmentation technique that creates a large amount of synthetic data for training a seq-to-seq model to perform the SOS task.
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality
- Yichen Jiang, Xiang Zhou, Mohit Bansal
- TLDR: We propose two techniques to address the lack of the systematic generalization ability in standard sequence-to-sequence models and show substantial empirical improvements.
- Directions for NLP Practices Applied to Online Hate Speech Detection
- Paula Fortuna, Monica Dominguez, Leo Wanner, Zeerak Talat
- TLDR: We argue that many conventions in NLP are poorly suited for the problem and encourage researchers to develop methods that are more appropriate for the task.
- Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
- Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti
- TLDR: We propose sentence-level transformer pre-training objectives for answer sentence selection (AS2), which improve transformer performance on the task.
- OpenCQA: Open-ended Question Answering with Charts
- Shankar Kantharaj, Xuan Long Do, Rixie Tiffany Leong, Jia Qing Tan, Enamul Hoque, Shafiq Joty
- TLDR: We present a new task called OpenCQA, in which a model is given a chart and an accompanying article and must answer an open-ended question about the chart with descriptive text.
- A Systematic Investigation of Commonsense Knowledge in Large Language Models
- Xiang Lorraine Li, Adhiguna Kuncoro, Jordan Hoffmann, Cyprien de Masson d’Autume, Phil Blunsom, Aida Nematzadeh
- TLDR: We conduct a systematic and rigorous zero-shot and few-shot commonsense evaluation of large pre-trained language models and show that they are not capable of acquiring commonsense knowledge without task-specific supervision.
- Transforming Sequence Tagging Into A Seq2Seq Task
- Karthik Raman, Iftekhar Naim, Jiecao Chen, Kazuma Hashimoto, Kiran Yalasangi, Krishna Srinivasan
- TLDR: We rigorously study different formats for casting sequence tagging as a seq2seq task and show that our new format is both simpler and more effective than existing ones.
- CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting
- Andrea Iovine, Anjie Fang, Besnik Fetahu, Jie Zhao, Oleg Rokhlenko, Shervin Malmasi
- TLDR: We present CycleKQR, an unsupervised approach to bidirectional keyword-question rewriting that improves the query understanding capabilities of search and QA systems across all surface forms.
- Model Criticism for Long-Form Text Generation
- Yuntian Deng, Volodymyr Kuleshov, Alexander Rush
- TLDR: We propose to use model criticism in latent space to evaluate the high-level structure of language models.
- Improving Faithfulness by Augmenting Negative Summaries from Fake Documents
- Tianshu Wang, Faisal Ladhak, Esin Durmus, He He
- TLDR: We propose a back-translation-style approach to augment negative samples with factual errors to improve faithfulness in abstractive summarization systems.
- Joint Completion and Alignment of Multilingual Knowledge Graphs
- Soumen Chakrabarti, Harkanwar Singh, Shubham Lohiya, Prachi Jain, Mausam
- TLDR: We propose a novel relation representation and loss terms for jointly completing and aligning multilingual knowledge graphs.
- Offer a Different Perspective: Modeling the Belief Alignment of Arguments in Multi-party Debates
- Suzanna Sia, Kokil Jaidka, Hansin Ahuja, Niyati Chhaya, Kevin Duh
- TLDR: We propose a constraint-based model to predict the winning arguments in multi-party interactions in the Reddit Change My View and Intelligence Squared debates datasets.
- A Federated Approach to Predicting Emojis in Hindi Tweets
- Deep Gandhi, Jash Mehta, Nirali Parekh, Karan Waghela, Lynette D’Mello, Zeerak Talat
- TLDR: We propose a new dataset for emoji prediction in Hindi and propose a modification to the federated learning algorithm, CausalFedGSD, which aims to strike a balance between model performance and user privacy.
- Injecting Domain Knowledge in Language Models for Task-oriented Dialogue Systems
- Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour
- TLDR: We present knowledge injection methods for pre-trained language models that can then be fine-tuned on task-oriented dialogue tasks.
- TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack
- Yu Cao, Dianqi Li, Meng Fang, Tianyi Zhou, Jun Gao, Yibing Zhan, Dacheng Tao
- TLDR: We present a novel adversarial attack method for question answering models that produces fluent and grammatical adversarial contexts while maintaining gold answers.
- Improving Low-Resource Languages in Pre-Trained Multilingual Language Models
- Viktor Hangya, Hossain Shaikh Saadi, Alexander Fraser
- TLDR: We propose an unsupervised approach to improve the cross-lingual representations of low-resource languages by bootstrapping word translation pairs from monolingual corpora and using them to improve language alignment in pre-trained language models.
- SCROLLS: Standardized CompaRison Over Long Language Sequences
- Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy
- TLDR: We present SCROLLS, a suite of tasks that require reasoning over long texts.
- PAR: Political Actor Representation Learning with Social Context and Expert Knowledge
- Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Ningnan Wang, Peisheng Yu, Qinghua Zheng, Xiaojun Chang, Minnan Luo
- TLDR: We propose a novel representation learning framework for political actors that leverages social context and expert knowledge for holistic ideological analysis.
- JDDC 2.1: A Multimodal Chinese Dialogue Dataset with Joint Tasks of Query Rewriting, Response Generation, Discourse Parsing, and Summarization
- Nan Zhao, Haoran Li, Youzheng Wu, Xiaodong He
- TLDR: We present a large-scale multimodal multi-turn dialogue dataset with joint tasks of query rewriting, response generation, discourse parsing, and summarization.
- PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings
- Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang
- TLDR: We propose a novel Peer-Contrastive Learning (PCL) with diverse augmentations for unsupervised sentence embeddings.
- Digging Errors in NMT: Evaluating and Understanding Model Errors from Partial Hypothesis Space
- Jianhao Yan, Chenming Wu, Fandong Meng, Jie Zhou
- TLDR: We propose a novel evaluation protocol for neural machine translation (NMT) models that defines model errors via the model's ranking capability over the hypothesis space, and show that state-of-the-art Transformer models face serious ranking issues, performing only at chance level in the top region.
- DialogConv: A Lightweight Fully Convolutional Network for Multi-view Response Selection
- Yongkang Liu, Shi Feng, Wei Gao, Daling Wang, Yifei Zhang
- TLDR: We propose a novel lightweight fully convolutional architecture for response selection in dialogue systems.