ACL 2023 (Toronto, Canada)
- Program Chairs’ Report on Peer Review at ACL 2023
- Anna Rogers, Marzena Karpinska, Jordan Boyd-Graber, Naoaki Okazaki
- TLDR: We present a summary of the efforts to improve conference peer review that were implemented at ACL’23.
- One Cannot Stand for Everyone! Leveraging Multiple User Simulators to train Task-oriented Dialogue Systems
- Yajiao Liu, Xin Jiang, Yichun Yin, Yasheng Wang, Fei Mi, Qun Liu, Xiang Wan, Benyou Wang
- TLDR: We propose a framework that trains task-oriented dialogue systems by leveraging multiple user simulators rather than optimizing toward a single one.
- SafeConv: Explaining and Correcting Conversational Unsafe Behavior
- Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Wenliang Chen, Dong Yu
- TLDR: We present a new dataset for conversational safety and propose a new model for detoxifying chatbots.
- Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better
- David Dale, Elena Voita, Loic Barrault, Marta R. Costa-jussà
- TLDR: We propose a method that evaluates the percentage of the source contribution to a generated translation and uses it to identify hallucinations.
- Explainable Recommendation with Personalized Review Retrieval and Aspect Learning
- Hao Cheng, Shuo Wang, Wensheng Lu, Wei Zhang, Mingyang Zhou, Kezhong Lu, Hao Liao
- TLDR: We propose a novel model for explainable recommendation that combines prediction and generation tasks to produce more persuasive results.
- Binary and Ternary Natural Language Generation
- Zechun Liu, Barlas Oguz, Aasish Pappu, Yangyang Shi, Raghuraman Krishnamoorthi
- TLDR: We present the first ternary and binary transformer models for text generation and machine translation.
- Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking
- Björn Bebensee, Haejun Lee
- TLDR: We propose SPLAT, a novel architecture for schema-guided dialogue state tracking which achieves better generalization and efficiency than prior approaches by constraining outputs to a limited prediction space.
- EM Pre-training for Multi-party Dialogue Response Generation
- Yiyang Li, Hai Zhao
- TLDR: We propose an Expectation-Maximization approach for multi-party dialogue response generation that iteratively alternates expectation steps, which generate addressee labels, and maximization steps, which optimize the response generation model.
- ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER
- Sreyan Ghosh, Utkarsh Tyagi, Manan Suri, Sonal Kumar, Ramaneswaran S, Dinesh Manocha
- TLDR: We present ACLM, a novel data augmentation approach based on conditional generation with attention-map aware keyword selection, to address the data scarcity problem in low-resource complex NER.
- Natural Language to Code Generation in Interactive Data Science Notebooks
- Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Oleksandr Polozov, Charles Sutton
- TLDR: We present a novel algorithm for generating programs for computational notebooks given natural language intents from users.
- Subset Retrieval Nearest Neighbor Machine Translation
- Hiroyuki Deguchi, Taro Watanabe, Yusuke Matsui, Masao Utiyama, Hideki Tanaka, Eiichiro Sumita
- TLDR: We propose subset retrieval nearest neighbor machine translation, which improves the translation performance of trained neural machine translation models by incorporating example search into the decoding algorithm.
- MIL-Decoding: Detoxifying Language Models at Token-Level via Multiple Instance Learning
- Xu Zhang, Xiaojun Wan
- TLDR: We introduce MIL-Decoding, which detoxifies language models at the token level by interpolating their output distributions with a trained multiple instance learning (MIL) network.
- Dependency resolution at the syntax-semantics interface: psycholinguistic and computational insights on control dependencies
- Iria de-Dios-Flores, Juan Garcia Amboage, Marcos Garcia
- TLDR: We show that language models can identify control dependencies in Spanish sentences but, unlike humans, often fail to select the correct antecedents.
- Open-ended Long Text Generation via Masked Language Modeling
- Xiaobo Liang, Zecheng Tang, Juntao Li, Min Zhang
- TLDR: We propose a novel iterative non-autoregressive language model for open-ended long text generation (Open-LTG), together with dynamic sliding window attention and a linear temperature decay algorithm.
- A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces
- Gabriella Chronis, Kyle Mahowald, Katrin Erk
- TLDR: We study semantic construal in grammatical constructions using large language models.
- Holographic CCG Parsing
- Ryosuke Yamaki, Tadahiro Taniguchi, Daichi Mochihashi
- TLDR: We propose a method for formulating CCG as a recursive composition in a continuous vector space.
- Prompts Can Play Lottery Tickets Well: Achieving Lifelong Information Extraction via Lottery Prompt Tuning
- Zujie Liang, Feng Wei, Yin Jie, Yuxi Qian, Zhenghong Hao, Bing Han
- TLDR: We present a novel parameter- and deployment-efficient prompt tuning method for pre-trained language models that can learn new tasks without forgetting old ones.
- Retrieve-and-Sample: Document-level Event Argument Extraction via Hybrid Retrieval Augmentation
- Yubing Ren, Yanan Cao, Ping Guo, Fang Fang, Wei Ma, Zheng Lin
- TLDR: We propose hybrid retrieval methods for document-level event argument extraction (EAE) that use pseudo demonstrations to augment the input from both context and label distribution views.
- WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning
- Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Sujian Li, Yajuan Lyu
- TLDR: We propose WeCheck, a weakly supervised framework for checking whether text produced by generation models is factually consistent with the input.
- AMR-based Network for Aspect-based Sentiment Analysis
- Fukun Ma, Xuming Hu, Aiwei Liu, Yawen Yang, Shuang Li, Philip S. Yu, Lijie Wen
- TLDR: We propose an AMR-based approach to aspect-based sentiment analysis that takes full advantage of semantic structures.
- Text Adversarial Purification as Defense against Adversarial Attacks
- Linyang Li, Demin Song, Xipeng Qiu
- TLDR: We propose a novel adversarial purification algorithm for textual models against word-substitution adversarial attacks.
- SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres
- Shumin Deng, Shengyu Mao, Ningyu Zhang, Bryan Hooi
- TLDR: We propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH) for event-centric structured prediction.
- Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection
- Christopher Clarke, Matthew Hall, Gaurav Mittal, Ye Yu, Sandra Sajeev, Jason Mars, Mei Chen
- TLDR: We present Rule By Example (RBE), a novel exemplar-based contrastive learning approach for learning from logical rules for the task of textual content moderation.
- What about “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns
- Anne Lauscher, Debora Nozza, Ehm Miltersen, Archie Crowley, Dirk Hovy
- TLDR: We show that commercial machine translation systems mishandle gender-neutral third-person (neo-)pronouns, leading to grammatical and semantic translation errors.
- What Is Overlap Knowledge in Event Argument Extraction? APE: A Cross-datasets Transfer Learning Model for EAE
- Kaihang Zhang, Kai Shuang, Xinyue Yang, Xuyang Yao, Jinyu Guo
- TLDR: We propose a novel approach to learning overlap knowledge across datasets and use it to improve event argument extraction (EAE).
- Tailor: A Soft-Prompt-Based Approach to Attribute-Based Controlled Text Generation
- Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Mingfeng Xue, Boxing Chen, Jun Xie
- TLDR: We propose Tailor, a soft-prompt-based approach to attribute-based controlled text generation with parameter-efficient and fluency-aware prompts.
- Knowledge of cultural moral norms in large language models
- Aida Ramezani, Yang Xu
- TLDR: We investigate the extent to which monolingual English language models contain knowledge about moral norms in different countries.
- Songs Across Borders: Singable and Controllable Neural Lyric Translation
- Longshen Ou, Xichu Ma, Min-Yen Kan, Ye Wang
- TLDR: We propose a novel lyric translation system that achieves high quality on length accuracy, rhyme accuracy, and word boundary recall.
- Fantastic Expressions and Where to Find Them: Chinese Simile Generation with Multiple Constraints
- Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Xiangpeng Wei, Zhengyuan Liu, Jun Xie
- TLDR: We present controllable Chinese simile generation with a vehicle retrieval module, Scorer, which obtains an explicable comparison for a given tenor when the vehicle is unknown.
- Revealing Single Frame Bias for Video-and-Language Learning
- Jie Lei, Tamara Berg, Mohit Bansal
- TLDR: We show that a single-frame trained model that does not consider temporal information can achieve better performance than existing methods that use multiple frames for training.
- Learning with Partial Annotations for Event Detection
- Jian Liu, Dianbo Sui, Kang Liu, Haoyan Liu, Zhe Zhao
- TLDR: We propose a new trigger localization formulation using contrastive learning to distinguish ground-truth triggers from contexts, showing a decent robustness for addressing partial annotation noise.
- World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models
- Ziqiao Ma, Jiayi Pan, Joyce Chai
- TLDR: We propose a novel visually-grounded language model that learns unseen words more rapidly and robustly by pre-training on image-text pairs with grounding as an explicit objective.
- A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
- Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schoelkopf, Mrinmaya Sachan
- TLDR: We propose a causal framework for quantifying the robustness and sensitivity of language models' mathematical reasoning to changes in the input.
- Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information
- Kun Zhao, Bohao Yang, Chenghua Lin, Wenge Rong, Aline Villavicencio, Xiaohui Cui
- TLDR: We propose a novel learning-based automatic evaluation metric that robustly evaluates open-domain dialogues by augmenting Conditional Variational Autoencoders with a Next Sentence Prediction objective and employing Mutual Information.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions
- John Chung, Ece Kamar, Saleema Amershi
- TLDR: We explore human-AI partnerships to facilitate high diversity and accuracy in LLM-based text data generation.
- Pruning Pre-trained Language Models Without Fine-Tuning
- Ting Jiang, Deqing Wang, Fuzhen Zhuang, Ruobing Xie, Feng Xia
- TLDR: We propose Static Model Pruning, a first-order pruning method for pre-trained language models that is more parameter-efficient than other methods because it does not require fine-tuning.
- When Does Translation Require Context? A Data-driven, Multilingual Exploration
- Patrick Fernandes, Kayo Yin, Emmy Liu, André Martins, Graham Neubig
- TLDR: We develop a new benchmark for evaluating how well machine translation models capture discourse phenomena that require context.
- Causal Intervention and Counterfactual Reasoning for Multi-modal Fake News Detection
- Ziwei Chen, Linmei Hu, Weixin Li, Yingxia Shao, Liqiang Nie
- TLDR: We propose a causal intervention and counterfactual reasoning based Debiasing framework for multi-modal fake news detection.
- LexSym: Compositionality as Lexical Symmetry
- Ekin Akyurek, Jacob Andreas
- TLDR: We show that a data augmentation scheme based on lexical symmetry imparts a compositional inductive bias on any model trained to solve the same task.
- Layer-wise Fusion with Modality Independence Modeling for Multi-modal Emotion Recognition
- Jun Sun, Shoukang Han, Yu-Ping Ruan, Xiaoning Zhang, Shu-Kai Zheng, Yulong Liu, Yuxin Huang, Taihao Li
- TLDR: We propose a novel multi-modal emotion recognition model that learns representations for each modality independently and then fuses all modalities jointly.
- CASN: Class-Aware Score Network for Textual Adversarial Detection
- Rong Bao, Rui Zheng, Liang Ding, Qi Zhang, Dacheng Tao
- TLDR: We propose a score-based adversarial detection method that uses the log-density of the data distribution to estimate the gap between adversarial and normal samples and thereby detect adversarial inputs.
- Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest
- Jack Hessel, Ana Marasovic, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff, Yejin Choi
- TLDR: We challenge neural networks to match a joke to a cartoon caption and explain why a caption is funny.
- Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation
- Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling
- TLDR: We investigate whether data augmentation techniques can help improve low-resource ASR performance, focusing on four typologically diverse minority languages or language variants (West Germanic: Gronings, West-Frisian; Malayo-Polynesian: Besemah, Nasal).
- CLCL: Non-compositional Expression Detection with Contrastive Learning and Curriculum Learning
- Jianing Zhou, Ziheng Zeng, Suma Bhat
- TLDR: We propose a dynamic curriculum learning framework for non-compositional expressions and demonstrate its effectiveness on idiom usage recognition and metaphor detection tasks.
- Multi-VALUE: A Framework for Cross-Dialectal English NLP
- Caleb Ziems, William Held, Jingfeng Yang, Jwala Dhamala, Rahul Gupta, Diyi Yang
- TLDR: We present a new system for evaluating and achieving English dialect invariance.
- Self-Edit: Fault-Aware Code Editor for Code Generation
- Kechi Zhang, Zhuo Li, Jia Li, Ge Li, Zhi Jin
- TLDR: We propose a generate-and-edit approach named Self-Edit that utilizes execution results of the generated code from LLMs to improve the code quality on the competitive programming task.
- ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
- Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Leshem Choshen
- TLDR: We propose ColD Fusion, a method that provides the benefits of multitask learning but leverages distributed computation and requires limited communication and no sharing of data.
- Test-time Adaptation for Machine Translation Evaluation by Uncertainty Minimization
- Runzhe Zhan, Xuebo Liu, Derek F. Wong, Cuilian Zhang, Lidia S. Chao, Min Zhang
- TLDR: We propose a method for improving the accuracy of neural metrics by minimizing uncertainty during test time.
- Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
- Haw-Shiuan Chang, Ruei-Yao Sun, Kathryn Ricci, Andrew McCallum
- TLDR: We propose Multi-CLS BERT, a novel ensembling method for CLS-based prediction tasks that is almost as efficient as a single BERT model.
- On-the-fly Cross-lingual Masking for Multilingual Pre-training
- Xi Ai, Bin Fang
- TLDR: We present a dynamic and token-wise masking scheme for multilingual pre-training, using a special token for cross-lingual prototype masking.
- How About Kind of Generating Hedges using End-to-End Neural Models?
- Alafate Abulimiti, Chloé Clavel, Justine Cassell
- TLDR: We develop a model of hedge generation based on i) fine-tuning state-of-the-art language models trained on human-human tutoring data, followed by ii) reranking to select the candidate that best matches the expected hedging strategy within a candidate pool using a hedge classifier.
- DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models
- Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover, Duen Horng Chau
- TLDR: We present a large-scale text-to-image prompt dataset for diffusion models and show how to interpret the resulting images.
- From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization
- Arie Cattan, Lilach Eden, Yoav Kantor, Roy Bar-Haim
- TLDR: We present a novel method for organizing a given set of key points into a hierarchy according to their specificity, and show that it helps make sense of a long, flat list of key points.
- When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications
- Kevin Pei, Ishan Jindal, Kevin Chen-Chuan Chang, ChengXiang Zhai, Yunyao Li
- TLDR: We present an empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications.
- Subjective Crowd Disagreements for Subjective Data: Uncovering Meaningful CrowdOpinion with Population-level Learning
- Tharindu Cyril Weerasooriya, Sarah Luger, Saloni Poddar, Ashiqur KhudaBukhsh, Christopher Homan
- TLDR: We present CrowdOpinion, an unsupervised learning based approach that uses language features and label distributions to pool similar items into larger samples of label distributions.
- Post-Abstention: Towards Reliably Re-Attempting the Abstained Instances in QA
- Neeraj Varshney, Chitta Baral
- TLDR: We present a new task that allows re-attempting the abstained instances with the aim of increasing the coverage of the system without significantly sacrificing its accuracy.
- UniLG: A Unified Structure-aware Framework for Lyrics Generation
- Tao Qian, Fan Lou, Jiatong Shi, Yuning Wu, Shuai Guo, Xiang Yin, Qin Jin
- TLDR: We present UniLG, a unified structure-aware framework for conditional lyrics generation.
- FC-KBQA: A Fine-to-Coarse Composition Framework for Knowledge Base Question Answering
- Lingxi Zhang, Jing Zhang, Yanling Wang, Shulin Cao, Xinmei Huang, Cuiping Li, Hong Chen, Juanzi Li
- TLDR: We propose a Fine-to-Coarse Composition framework for KBQA (FC-KBQA) to both ensure the generalization ability and executability of the logical expression.
- Does GPT-3 Grasp Metaphors? Identifying Metaphor Mappings with Generative Language Models
- Lennart Wachowiak, Dagmar Gromann
- TLDR: We show that GPT-3 can identify metaphoric language and predict a metaphor's source domain without any pre-set domains.
- Being Right for Whose Right Reasons?
- Terne Sasha Thorn Jakobsen, Laura Cabello, Anders Søgaard
- TLDR: We present a first-of-its-kind dataset of human rationale annotations with annotator demographics, and show that Transformer-based models align best with rationales from older and/or white annotators.
- ALERT: Adapt Language Models to Reasoning Tasks
- Ping Yu, Tianlu Wang, Olga Golovneva, Badr AlKhamissi, Siddharth Verma, Zhijing Jin, Gargi Ghosh, Mona Diab, Asli Celikyilmaz
- TLDR: We present a benchmark and suite of analyses for evaluating reasoning skills of language models.
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
- Ayyoob ImaniGooghari, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André Martins, François Yvon, Hinrich Schütze
- TLDR: We propose a new multilingual Large Language Model for low-resource languages and show that no single factor explains the quality of multilingual LLM representations.
- Joint Constrained Learning with Boundary-adjusting for Emotion-Cause Pair Extraction
- Huawen Feng, Junlong Liu, Junhao Zheng, Haibin Chen, Xichen Shang, Qianli Ma
- TLDR: We propose a new framework for emotion-cause pair extraction based on joint constrained learning and boundary adjusting.
- Pretrained Bidirectional Distillation for Machine Translation
- Yimeng Zhuang, Mei Tu
- TLDR: We propose a novel method for transferring bidirectional language knowledge from masked language pretraining to NMT models.
- Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
- Kyuyong Shin, Hanock Kwak, Wonjae Kim, Jisu Jeong, Seungjae Jung, Kyungmin Kim, Jung-Woo Ha, Sang-Woo Lee
- TLDR: Language modeling on user history corpora can improve recommender systems.
- Improving Continual Relation Extraction by Distinguishing Analogous Semantics
- Wenzheng Zhao, Yuanning Cui, Wei Hu
- TLDR: We propose a novel continual relation extraction model that distinguishes analogous relations and overcomes the overfitting problem.
- Improving Pretraining Techniques for Code-Switched NLP
- Richeek Das, Sahasra Ranjan, Shreya Pathak, Preethi Jyothi
- TLDR: We present a new method for pretraining language models that are cognizant of language boundaries prior to masking and show that it improves performance on two downstream tasks.
- A Theory of Unsupervised Speech Recognition
- Liming Wang, Mark Hasegawa-Johnson, Chang Yoo
- TLDR: We develop a theory of unsupervised speech recognition (ASR-U), the problem of learning automatic speech recognition (ASR) systems from unpaired speech-only and text-only corpora.
- ThinkSum: Probabilistic reasoning over sets using large language models
- Batu Ozturkler, Nikolay Malkin, Zhen Wang, Nebojsa Jojic
- TLDR: We propose a two-stage probabilistic inference paradigm for large language models that can be used to improve the reasoning capabilities of LLMs.
- NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist
- Iftitahu Nimah, Meng Fang, Vlado Menkovski, Mykola Pechenizkiy
- TLDR: We present metric preference checklist as a framework to assess the effectiveness of automatic metrics in three NLG tasks: Text Summarization, Dialogue Response Generation, and Controlled Generation.
- DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations
- Ang Lv, Jinpeng Li, Yuhan Chen, Gao Xing, Ji Zhang, Rui Yan
- TLDR: We propose DialoGue Path Sampling (DialoGPS) method in continuous semantic space, the first many-to-many augmentation method for multi-turn dialogues.
- TECHS: Temporal Logical Graph Networks for Explainable Extrapolation Reasoning
- Qika Lin, Jun Liu, Rui Mao, Fangzhi Xu, Erik Cambria
- TLDR: We propose a graph convolutional network for extrapolation reasoning on temporal knowledge graphs and a logical decoder for explainability.
- Consistency Regularization Training for Compositional Generalization
- Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Jie Zhou, Yue Zhang
- TLDR: We improve the capability of Transformer on compositional generalization through consistency regularization training, which promotes representation consistency across samples and prediction consistency for a single sample.
- NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
- Shengming Yin, Chenfei Wu, Huan Yang, Jianfeng Wang, Xiaodong Wang, Minheng Ni, Zhengyuan Yang, Linjie Li, Shuguang Liu, Fan Yang, Jianlong Fu, Ming Gong, Lijuan Wang, Zicheng Liu, Houqiang Li, Nan Duan
- TLDR: We propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation.
- Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe
- Xiang Yue, Huseyin Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, Robert Sim
- TLDR: We show that a simple and practical recipe in the text domain is effective: simply fine-tuning a pretrained generative language model with differential privacy enables the model to generate useful synthetic text with strong privacy protection.
- A Close Look into the Calibration of Pre-trained Language Models
- Yangyi Chen, Lifan Yuan, Ganqu Cui, Zhiyuan Liu, Heng Ji
- TLDR: We study the dynamic change in PLMs’ calibration performance in training and show that overconfidence in predictions is a major problem.
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization
- Yu Li, Baolin Peng, Pengcheng He, Michel Galley, Zhou Yu, Jianfeng Gao
- TLDR: We propose DIONYSUS, a pre-trained encoder-decoder model for dialogue summarization that outperforms existing methods on six datasets.
- MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction
- Wang Jing, Aixin Sun, Hao Zhang, Xiaoli Li
- TLDR: We propose a proposal-based method for natural language video localization that samples and learns interactions among a subset of candidate moments for a query.
- Diverse Demonstrations Improve In-context Compositional Generalization
- Itay Levy, Ben Bogin, Jonathan Berant
- TLDR: We propose a method to select diverse demonstrations that cover all of the structures required in the output program, in order to encourage the model to generalize to new structures from these demonstrations.
- Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering
- Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- TLDR: We propose a new principle for in-context learning that improves performance by 40% over the common practice setting.
- On the Efficacy of Sampling Adapters
- Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan Wilcox, Ryan Cotterell
- TLDR: We propose a unified framework for understanding the trade-off between precision and recall in language generation techniques and show that these trade-offs can lead to qualitatively better text.
- Cross-Domain Data Augmentation with Domain-Adaptive Language Modeling for Aspect-Based Sentiment Analysis
- Jianfei Yu, Qiankun Zhao, Rui Xia
- TLDR: We propose a new cross-domain data augmentation approach for aspect-based sentiment analysis based on domain-adaptive language modeling.
- Compositional Data Augmentation for Abstractive Conversation Summarization
- Siru Ouyang, Jiaao Chen, Jiawei Han, Diyi Yang
- TLDR: We present a compositional data augmentation method for generating diverse and high-quality pairs of conversations and summaries.
- PMAES: Prompt-mapping Contrastive Learning for Cross-prompt Automated Essay Scoring
- Yuan Chen, Xia Li
- TLDR: We propose a new method for learning more consistent representations of source and target prompts in cross-prompt automated essay scoring.
- Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
- Myra Cheng, Esin Durmus, Dan Jurafsky
- TLDR: We present Marked Personas, a prompt-based method to measure stereotypes in large language models for intersectional demographic groups without any lexicon or data labeling.
- On Prefix-tuning for Lightweight Out-of-distribution Detection
- Yawen Ouyang, Yongchang Cao, Yuan Gao, Zhen Wu, Jianbing Zhang, Xinyu Dai
- TLDR: We propose an unsupervised prefix-tuning based OOD detection framework that is lightweight, easy-to-reproduce, and theoretically justified.
- GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding
- Konstantin Yakovlev, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya
- TLDR: We propose a novel non-autoregressive approach to Grammatical Error Correction that uses a self-attention weight matrix to find the best permutation of input tokens and a decoder network based on a step-unrolled denoising autoencoder that fills in specific tokens.
- Measuring Progress in Fine-grained Vision-and-Language Understanding
- Emanuele Bugliarello, Laurent Sartran, Aishwarya Agrawal, Lisa Anne Hendricks, Aida Nematzadeh
- TLDR: We investigate four competitive V&L models on four fine-grained benchmarks and find that X-VLM is the best model on each benchmark, outperforming the other models on both fine- and coarse-grained tasks.
- Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information
- Sunjae Kwon, Rishabh Garodia, Minhwa Lee, Zhichao Yang, Hong Yu
- TLDR: We propose an unsupervised VWSD approach that uses gloss information of an external lexical knowledge-base, especially the sense definitions.
- Chain-of-Skills: A Configurable Model for Open-Domain Question Answering
- Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
- TLDR: We propose a modular retriever where individual modules correspond to key skills that can be reused across datasets.
- Elaboration-Generating Commonsense Question Answering at Scale
- Wenya Wang, Vivek Srikumar, Hannaneh Hajishirzi, Noah A. Smith
- TLDR: We finetune smaller language models to generate useful intermediate context, referred to here as elaborations.
- Neural Unsupervised Reconstruction of Protolanguage Word Forms
- Andre He, Nicholas Tomlin, Dan Klein
- TLDR: We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms.
- DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation
- Menglong Lu, Zhen Huang, Yunxiang Zhao, Zhiliang Tian, Yang Liu, Dongsheng Li
- TLDR: We propose a new meta self-training framework for domain adaptation, which improves the performance of BERT and the quality of the meta-validation set.
- On Evaluating Multilingual Compositional Generalization with Translated Datasets
- Zi Wang, Daniel Hershcovich
- TLDR: We show that compositional generalization in multilingual NLP is not as good as previously thought, and that multilingual models still struggle with cross-lingual compositional generalization.
- FAA: Fine-grained Attention Alignment for Cascade Document Ranking
- Zhen Li, Chongyang Tao, Jiazhan Feng, Tao Shen, Dongyan Zhao, Xiubo Geng, Daxin Jiang
- TLDR: We propose a novel approach to jointly optimize a cascade document ranking model by using the attention activations over the passages from the ranker as fine-grained attention feedback to optimize the selector.
- Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models
- Zhong Zhang, Bang Liu, Junming Shao
- TLDR: We study the problem of re-parameterizing and fine-tuning language models from a new perspective: Discovery of intrinsic task-specific subspace.
- Facilitating Multi-turn Emotional Support Conversation with Positive Emotion Elicitation: A Reinforcement Learning Approach
- Jinfeng Zhou, Zhuang Chen, Bo Wang, Minlie Huang
- TLDR: Emotional support conversation (ESC) aims to provide emotional support to improve one's mental state; we propose a reinforcement learning approach that facilitates multi-turn ESC via positive emotion elicitation.
- Query Enhanced Knowledge-Intensive Conversation via Unsupervised Joint Modeling
- Mingzhu Cai, Siqi Bao, Xin Tian, Huang He, Fan Wang, Hua Wu
- TLDR: We propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv.
- Why Aren’t We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts
- Piotr Szymański, Lukasz Augustyniak, Mikolaj Morzy, Adrian Szymczak, Krzysztof Surdyk, Piotr Żelasko
- TLDR: We present the complex relationship between ASR and NER errors which limit the ability of NER models to recover entity mentions from spontaneous speech transcripts.
- Precise Zero-Shot Dense Retrieval without Relevance Labels
- Luyu Gao, Xueguang Ma, Jimmy Lin, Jamie Callan
- TLDR: We propose Hypothetical Document Embeddings (HyDE): generating a hypothetical document that captures relevance patterns, even though it is “fake” and may contain hallucinations, and using its embedding for zero-shot dense retrieval.
- White-Box Multi-Objective Adversarial Attack on Dialogue Generation
- Yufei Li, Zexin Li, Yingfan Gao, Cong Liu
- TLDR: We propose a novel adversarial attack method for dialogue generation systems that can significantly degrade state-of-the-art dialogue generation models with a higher success rate than traditional accuracy-based methods.
- A Cautious Generalization Goes a Long Way: Learning Morphophonological Rules
- Salam Khalifa, Sarah Payne, Jordan Kodner, Ellen Broselow, Owen Rambow
- TLDR: We present a novel approach for automatically learning morphophonological rules of Arabic from a corpus.
- Few-shot Adaptation Works with UnpredicTable Data
- Jun Shern Chan, Michael Pieler, Jonathan Jao, Jérémy Scheurer, Ethan Perez
- TLDR: We show that training on a large number of diverse datasets improves few-shot learning (FSL) performance on new tasks.
- Cross-lingual Science Journalism: Select, Simplify and Rewrite Summaries for Non-expert Readers
- Mehwish Fatima, Michael Strube
- TLDR: We propose a novel cross-lingual simplified science summarization approach, based on SELECT, SIMPLIFY and REWRITE, for cross-lingual science journalism.
- HuCurl: Human-induced Curriculum Discovery
- Mohamed Elgaar, Hadi Amiri
- TLDR: We introduce the problem of curriculum discovery and describe a curriculum learning framework capable of discovering effective curricula in a curriculum space based on prior knowledge about sample difficulty.
- kNN-TL: k-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation
- Shudong Liu, Xuebo Liu, Derek F. Wong, Zhaocong Li, Wenxiang Jiao, Lidia S. Chao, Min Zhang
- TLDR: We propose a new transfer learning method for low-resource neural machine translation that leverages the parent knowledge during the child inference.
- Do language models have coherent mental models of everyday things?
- Yuling Gu, Bhavana Dalvi Mishra, Peter Clark
- TLDR: We propose a benchmark dataset that probes whether language models have coherent mental models of everyday things, their parts, and the relationships between those parts.
- Rogue Scores
- Max Grusky
- TLDR: We find that ROUGE scores in many papers are computed with incorrect or undisclosed configurations, making them irreproducible and incomparable across papers.
- Instruction Induction: From Few Examples to Natural Language Task Descriptions
- Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy
- TLDR: We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples.
- In-Context Analogical Reasoning with Pre-Trained Language Models
- Xiaoyang Hu, Shane Storks, Richard Lewis, Joyce Chai
- TLDR: We use language-based abstractions to support analogical reasoning in AI systems.
- Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering
- Avi Caciularu, Matthew Peters, Jacob Goldberger, Ido Dagan, Arman Cohan
- TLDR: We propose a novel cross-document question answering pre-training objective for multi-document modeling and demonstrate its performance on a variety of multi-document downstream tasks.
- Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation
- Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li
- TLDR: We propose Learning Good Teacher Matters (LGTM), a teacher training technique that incorporates distillation influence into the teacher training process.
- REV: Information-Theoretic Evaluation of Free-Text Rationales
- Hanjie Chen, Faeze Brahman, Xiang Ren, Yangfeng Ji, Yejin Choi, Swabha Swayamdipta
- TLDR: We propose REV, a metric for evaluating rationale-label pairs, which measures the amount of new, label-relevant information in a rationale beyond the information already available in the input or the label.
- ELQA: A Corpus of Metalinguistic Questions and Answers about English
- Shabnam Behzad, Keisuke Sakaguchi, Nathan Schneider, Amir Zeldes
- TLDR: We present ELQA, a corpus of questions and answers in and about the English language.
- Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking
- Qingyue Wang, Liang Ding, Yanan Cao, Yibing Zhan, Zheng Lin, Shi Wang, Dacheng Tao, Li Guo
- TLDR: We present a simple and effective zero-shot transfer learning method for dialogue state tracking, which explicitly disentangles the semantics of seen data, and leverages the performance and robustness with the mixture-of-experts mechanism.
- BIG-C: a Multimodal Multi-Purpose Dataset for Bemba
- Claytone Sikasote, Eunice Mukonde, Md Mahfuz Ibn Alam, Antonios Anastasopoulos
- TLDR: We present BIG-C (Bemba Image Grounded Conversations), a large multimodal dataset for Bemba.
- Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues
- Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai
- TLDR: We propose SG-USM, a novel schema-guided user satisfaction modeling framework for task-oriented dialogue systems evaluation.
- Robust Multi-bit Natural Language Watermarking through Invariant Features
- KiYoon Yoo, Wonhyuk Ahn, Jiho Jang, Nojun Kwak
- TLDR: We propose a multi-bit natural language watermarking framework that embeds information into features invariant to minor corruption, making the watermark robust.
- KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
- Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov
- TLDR: We propose KALM, a language model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding.
- AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction
- Yanzeng Li, Bingcong Xue, Ruoyu Zhang, Lei Zou
- TLDR: We present Attribute Tree, a unified formulation for real-world attribute extraction application, where closed-world, open-world and semi-open attribute extraction tasks are modeled uniformly.
- Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization
- Shiyue Zhang, David Wan, Mohit Bansal
- TLDR: We show that extractive summarization is not necessarily faithful, identifying broad unfaithfulness problems that existing faithfulness evaluation metrics largely fail to detect.
- Improving Translation Quality Estimation with Bias Mitigation
- Hui Huang, Shuangzhi Wu, Kehai Chen, Hui Di, Muyun Yang, Tiejun Zhao
- TLDR: We propose a novel method to mitigate the bias of translation Quality Estimation models and improve estimation performance.
- Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation
- Josef Jon, Ondřej Bojar
- TLDR: We propose a genetic algorithm (GA) based method for evolving machine translation candidates so that they survive and thrive under automated evaluation metrics.
- MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions
- Hao Sun, Zhexin Zhang, Fei Mi, Yasheng Wang, Wei Liu, Jianwei Cui, Bin Wang, Qun Liu, Minlie Huang
- TLDR: We propose a framework for building a moral dialogue system and evaluate its effectiveness.
- Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion
- Shaoxiang Wu, Damai Dai, Ziwei Qin, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
- TLDR: We propose a novel method for fine-grained video multimodal fusion by denoising bottleneck fusion and a mutual information maximization module.
- SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval
- Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei
- TLDR: We propose a novel and efficient pre-training method for dense passage retrieval that learns to compress passage information into a dense vector through a self-supervised representation bottleneck.
- From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained
- Hongliang Dai, Ziqian Zeng
- TLDR: We propose a new approach for entity typing that can avoid the need of creating distantly labeled data whenever there is a new type schema.
- Controlling Learned Effects to Reduce Spurious Correlations in Text Classifiers
- Parikshit Bansal, Amit Sharma
- TLDR: We propose an algorithm that regularizes the learned effect of features on the model's prediction to match the estimated effect of those features on the label.
- What Makes Pre-trained Language Models Better Zero-shot Learners?
- Jinghui Lu, Dongsheng Zhu, Weidong Han, Rui Zhao, Brian Mac Namee, Fei Tan
- TLDR: We propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection).
- Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
- Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh Hajishirzi
- TLDR: We present Z-ICL, a zero-shot method that constructs pseudo-demonstrations for language models and outperforms previous zero-shot methods by a significant margin.
- Learning Optimal Policy for Simultaneous Machine Translation via Binary Search
- Shoutao Guo, Shaolei Zhang, Yang Feng
- TLDR: We present a new method for constructing the optimal translation policy online via binary search.
- Better Simultaneous Translation with Monotonic Knowledge Distillation
- Shushu Wang, Jing Wu, Kai Fan, Wei Luo, Jun Xiao, Zhongqiang Huang
- TLDR: We propose a novel approach that leverages traditional translation models as teachers and employs a two-stage beam search algorithm to generate monotonic yet accurate reference translations for sequence-level knowledge distillation.
- StoryARG: a corpus of narratives and personal experiences in argumentative texts
- Neele Falk, Gabriella Lapesa
- TLDR: We present a corpus of narratives and personal experiences annotated in argumentative texts and show that they can make arguments more effective.
- Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue
- Maksim Eremeev, Ilya Valmianski, Xavier Amatriain, Anitha Kannan
- TLDR: We show how to use knowledge to identify rare tokens that appear in both source and reference sequences and use them to improve factual accuracy of sequence-to-sequence models.
- Sequence Parallelism: Long Sequence Training from System Perspective
- Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You
- TLDR: We propose sequence parallelism, a memory-efficient parallelism that tackles long-sequence training from a system perspective.
- MUSTIE: Multimodal Structural Transformer for Web Information Extraction
- Qifan Wang, Jingang Wang, Xiaojun Quan, Fuli Feng, Zenglin Xu, Shaoliang Nie, Sinong Wang, Madian Khabsa, Hamed Firooz, Dongfang Liu
- TLDR: We propose a novel MUltimodal Structural Transformer that incorporates multiple modalities for web information extraction.
- Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In
- Zichun Yu, Chenyan Xiong, Shi Yu, Zhiyuan Liu
- TLDR: We propose a generic retrieval plug-in that adapts to the preferences of target language models, improving their generalization.
- TableVLM: Multi-modal Pre-training for Table Structure Recognition
- Leiyuan Chen, Chengsong Huang, Xiaoqing Zheng, Jinshu Lin, Xuanjing Huang
- TLDR: We propose a novel multi-modal pre-training model for table structure recognition, which improves performance by up to 1.97% in tree-editing-distance score on ComplexTable.
- Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?
- Jiashu Xu, Mingyu Derek Ma, Muhao Chen
- TLDR: We present NBR, which casts biomedical relation extraction (RE) as a natural language inference formulation to provide indirect supervision.
- Dynamic Routing Transformer Network for Multimodal Sarcasm Detection
- Yuan Tian, Nan Xu, Ruike Zhang, Wenji Mao
- TLDR: We propose a dynamic routing transformer network for multimodal sarcasm detection.
- What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary
- Ori Ram, Liat Bezalel, Adi Zicher, Yonatan Belinkov, Jonathan Berant, Amir Globerson
- TLDR: We propose to interpret the vector representations produced by dual encoders by projecting them into the model’s vocabulary space, and draw connection between them and sparse retrieval.
- Cold-Start Data Selection for Better Few-shot Language Model Fine-tuning: A Prompt-based Uncertainty Propagation Approach
- Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang
- TLDR: We present PATRON, a prompt-based data selection method for pre-trained language model fine-tuning under cold-start scenarios, i.e., no initial labeled data are available.
- Training-free Neural Architecture Search for RNNs and Transformers
- Aaron Serianni, Jugal Kalita
- TLDR: We develop a new training-free metric for neural architecture search that predicts the trained performance of RNN and transformer architectures and significantly outperforms existing training-free metrics.
- CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs
- Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Yuan-Fang Li, Yong-Bin Kang, Rifat Shahriyar
- TLDR: We present CrossSum, a large-scale cross-lingual summarization dataset comprising 1.68 million article-summary samples in 1,500+ language pairs.
- Improving Gradient Trade-offs between Tasks in Multi-task Text Classification
- Heyan Chai, Jinhao Cui, Ye Wang, Min Zhang, Binxing Fang, Qing Liao
- TLDR: We present a novel gradient trade-off approach to mitigate the task conflict problem in multi-task learning.
- Bi-Phone: Modeling Inter Language Phonetic Influences in Text
- Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James Ren, Ambarish Jash, Sukhdeep Sodhi, Aravindan Raghuveer
- TLDR: We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2.
- Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment
- Shengqiong Wu, Hao Fei, Wei Ji, Tat-Seng Chua
- TLDR: We propose to address the irrelevancy and disfluency issues of unpaired cross-lingual image captioning by incorporating the scene graph and syntactic constituency structures.
- Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim
- TLDR: We propose Plan-and-Solve prompting, a new method that improves zero-shot chain-of-thought multi-step reasoning in large language models.
- RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models
- Zheng Liu, Shitao Xiao, Yingxia Shao, Zhao Cao
- TLDR: We propose a novel pre-training method called Duplex masked auto-encoder, a.k.a. DupMAE, which improves the quality of semantic representation of retrieval-oriented language models.
- DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
- Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
- TLDR: We present DecompX, a faithful vector-based explanation method for Transformer-based models that incorporates all model components into the analysis to explain the final predictions.
- Symbolic Chain-of-Thought Distillation: Small Models Can Also “Think” Step-by-Step
- Liunian Harold Li, Jack Hessel, Youngjae Yu, Xiang Ren, Kai-Wei Chang, Yejin Choi
- TLDR: We present a method to train a small language model on rationalizations from a significantly larger teacher model.
- Generating EDU Extracts for Plan-Guided Summary Re-Ranking
- Griffin Adams, Alex Fabbri, Faisal Ladhak, Noémie Elhadad, Kathleen McKeown
- TLDR: We propose a novel plan-guided method for generating and re-ranking abstractive summary candidates that outperforms existing methods on CNN/DailyMail, NYT, and Xsum.
- A Survey on Asking Clarification Questions Datasets in Conversational Systems
- Hossein A. Rahmani, Xi Wang, Yue Feng, Qiang Zhang, Emine Yilmaz, Aldo Lipani
- TLDR: We comprehensively analyse the current research status of asking clarification questions (ACQs), offering a detailed comparison of publicly available datasets and a discussion of the applied evaluation metrics, joined with benchmarks for multiple ACQ-related tasks.
- Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
- Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun
- TLDR: Chain-of-Thought prompting can improve multi-step reasoning in large language models, even with invalid demonstrations.
- Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation
- Jean Maillard, Cynthia Gao, Elahe Kalbassi, Kaushik Ram Sadagopan, Vedanuj Goswami, Philipp Koehn, Angela Fan, Francisco Guzman
- TLDR: We show that creating a few thousand professionally translated sentence pairs for low-resource languages can substantially improve the quality of machine translation models trained on them.
- RMLM: A Flexible Defense Framework for Proactively Mitigating Word-level Adversarial Attacks
- Zhaoyang Wang, Zhiyue Liu, Xiaopeng Zheng, Qinliang Su, Jiahai Wang
- TLDR: We propose a defense framework that aims to mitigate adversarial attacks by confusing attackers and correcting adversarial contexts that are caused by malicious perturbations.
- Gradient-based Intra-attention Pruning on Pre-trained Language Models
- Ziqing Yang, Yiming Cui, Xin Yao, Shijin Wang
- TLDR: We propose GRAIN (gradient-based intra-attention pruning), a structured pruning method that performs task-specific pruning with knowledge distillation and yields highly effective models.
- Learning to Substitute Spans towards Improving Compositional Generalization
- Zhaoyi Li, Ying Wei, Defu Lian
- TLDR: We propose a novel compositional augmentation strategy for neural sequence models that enables multi-grained composition of substantial substructures in the whole training set.
- DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation
- Guanqun Bi, Lei Shen, Yanan Cao, Meng Chen, Yuqiang Xie, Zheng Lin, Xiaodong He
- TLDR: We propose a framework DiffusEmp based on conditional diffusion language model to generate empathetic responses that are controllable and diverse without losing context-relatedness.
- BREAK: Breaking the Dialogue State Tracking Barrier with Beam Search and Re-ranking
- Seungpil Won, Heeyoung Kwak, Joongbo Shin, Janghoon Han, Kyomin Jung
- TLDR: We present a novel framework for dialogue state tracking that achieves outstanding performance on MultiWOZ 2.1-2.4.
- Faithful Low-Resource Data-to-Text Generation through Cycle Training
- Zhuoer Wang, Marcus Collins, Nikhita Vedula, Simone Filice, Shervin Malmasi, Oleg Rokhlenko
- TLDR: We present a cycle training approach for low-resource data-to-text generation that achieves performance comparable to fully supervised approaches using largely unpaired structured data and text.
- Towards Stable Natural Language Understanding via Information Entropy Guided Debiasing
- Li Du, Xiao Ding, Zhouhao Sun, Ting Liu, Bing Qin, Jingshuo Liu
- TLDR: We propose a debiasing method for natural language understanding that improves performance on out-of-distribution samples.
- Dynamic and Efficient Inference for Text Generation via BERT Family
- Xiaobo Liang, Juntao Li, Lijun Wu, Ziqiang Cao, Min Zhang
- TLDR: We propose a fine-tuning method that enables dynamic and efficient inference for text generation with BERT-family pre-trained language models.
- Learning to Generate Equitable Text in Dialogue from Biased Training Data
- Anthony Sicilia, Malihe Alikhani
- TLDR: We provide formal definitions of equity in text generation and prove formal connections between learning human-likeness and learning equity: algorithms for improving equity ultimately reduce to algorithms for learning human-likeness.
- Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification
- Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang
- TLDR: We propose a new hierarchical verbalizer for hierarchical text classification and demonstrate its effectiveness in few-shot settings.
- Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization
- Yunlong Liang, Fandong Meng, Jinan Xu, Jiaan Wang, Yufeng Chen, Jie Zhou
- TLDR: We propose to improve the summary quality through summary-oriented visual features.
- Helping a Friend or Supporting a Cause? Disentangling Active and Passive Cosponsorship in the U.S. Congress
- Giuseppe Russo, Christoph Gote, Laurence Brandenberger, Sophia Schlosser, Frank Schweitzer
- TLDR: We show that active and passive cosponsorship are driven by two different motivations: the backing of political colleagues and the backing of the bill's content.
- TREA: Tree-Structure Reasoning Schema for Conversational Recommendation
- Wendi Li, Wei Wei, Xiaoye Qu, Xian-Ling Mao, Ye Yuan, Wenfeng Xie, Dangyang Chen
- TLDR: We propose a novel Tree structure Reasoning schEmA for causality reasoning in conversational recommender systems.
- CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality
- Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li
- TLDR: We present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality.
- Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques
- Jakub Piskorski, Nicolas Stefanovitch, Nikolaos Nikolaidis, Giovanni Da San Martino, Preslav Nakov
- TLDR: We present a new multilingual multifacet dataset of news articles, each annotated for genre (objective news reporting vs. opinion vs. satire), framing (what key aspects are highlighted), and persuasion techniques (logical fallacies, emotional appeals, ad hominem attacks, etc.).
- Learning Action Conditions from Instructional Manuals for Instruction Understanding
- Te-Lin Wu, Caiqi Zhang, Qingyuan Hu, Alexander Spangher, Nanyun Peng
- TLDR: We propose a weakly supervised approach utilizing automatically constructed large-scale training instances from online instructions, and curate a densely human-annotated and validated dataset to study how well the current NLP models do on the proposed task.
- StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation
- Yulun Du, Lydia Chilton
- TLDR: We present StoryWars, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform.
- Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning
- Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
- TLDR: We systematically study the role of task definitions in instruction learning and propose two strategies to help models better leverage task instructions.
- Do PLMs Know and Understand Ontological Knowledge?
- Weiqi Wu, Chengyue Jiang, Yong Jiang, Pengjun Xie, Kewei Tu
- TLDR: We investigate whether pretrained language models memorize ontological knowledge and can utilize such implicit knowledge in reasoning.
- CORE: Cooperative Training of Retriever-Reranker for Effective Dialogue Response Selection
- Chongyang Tao, Jiazhan Feng, Tao Shen, Chang Liu, Juntao Li, Xiubo Geng, Daxin Jiang
- TLDR: We present a cooperative training of the response retriever and the reranker whose parameters are dynamically optimized by the ground-truth labels as well as list-wise supervision signals from each other.
- Exploring How Generative Adversarial Networks Learn Phonological Representations
- Jingyi Chen, Micha Elsner
- TLDR: We analyze how GANs learn representations of phonological phenomena in French and English and show that the learned representations in neural networks are different from the phonological representations proposed by linguists.
- Interpretable Word Sense Representations via Definition Generation: The Case of Semantic Change Analysis
- Mario Giulianelli, Iris Luden, Raquel Fernandez, Andrey Kutuzov
- TLDR: We propose using automatically generated natural language definitions of contextualised word usages as interpretable word and word sense representations.
- Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing
- Hao Yan, Saurabh Srivastava, Yintao Tai, Sida I. Wang, Wen-tau Yih, Ziyu Yao
- TLDR: We propose a novel feedback evaluator for interactive semantic parsing based on natural language feedback, which can help to boost the error correction ability of a specific parser.
- InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation
- Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin
- TLDR: We propose an Informative Metric for Reference-free Image Caption evaluation that provides informative feedback on the quality of captioning.
- An Invariant Learning Characterization of Controlled Text Generation
- Carolina Zheng, Claudia Shi, Keyon Vafa, Amir Feder, David Blei
- TLDR: We show that controlled generation under distribution shift is poor if the distributions of text in response to user prompts differ from the distributions the predictor was trained on.
- HistRED: A Historical Document-Level Relation Extraction Dataset
- Soyoung Yang, Minseok Choi, Youngwoo Cho, Jaegul Choo
- TLDR: We present a novel historical relation extraction dataset for Korean and Hanja texts and propose a bilingual RE model that outperforms monolingual baselines.
- A Critical Evaluation of Evaluations for Long-form Question Answering
- Fangyuan Xu, Yixiao Song, Mohit Iyyer, Eunsol Choi
- TLDR: We present a careful analysis of experts’ evaluation of long-form question answering and show that no existing metrics are predictive of human preference judgments.
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation
- Hongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang
- TLDR: We propose HyPe, a simple yet effective fine-tuning technique that alleviates such problems by perturbing the hidden representations of Transformer layers (sketched below).
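A minimal sketch of the mechanism, assuming Gaussian noise with a hypothetical scale injected into each Transformer layer's output during training (the paper also considers other noise types and scales):

```python
# Sketch of HyPe-style hidden-representation perturbation via forward hooks.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
eps = 1e-5  # noise scale (hypothetical value)

def perturb_hidden(module, inputs, output):
    # BertLayer returns a tuple whose first element is the hidden states;
    # add noise only in training mode so evaluation is unaffected.
    if module.training:
        hidden = output[0]
        return (hidden + torch.randn_like(hidden) * eps,) + output[1:]
    return output

for layer in model.bert.encoder.layer:
    layer.register_forward_hook(perturb_hidden)
# Fine-tune `model` as usual.
```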
- Generating User-Engaging News Headlines
- Pengshan Cai, Kaiqiang Song, Sangwoo Cho, Hongwei Wang, Xiaoyang Wang, Hong Yu, Fei Liu, Dong Yu
- TLDR: We present a novel framework for generating personalized headlines that meet the needs of diverse readers.
- Word sense extension
- Lei Yu, Yang Xu
- TLDR: We present a paradigm of word sense extension (WSE) that allows words to spawn new senses toward novel contexts.
- PVGRU: Generating Diverse and Relevant Dialogue Responses via Pseudo-Variational Mechanism
- Yongkang Liu, Shi Feng, Daling Wang, Yifei Zhang, Hinrich Schütze
- TLDR: We propose a novel recurrent summarizing variable that can perceive subtle semantic variability observed in multi-turn dialogue in generative chatbots.
- Decoding Symbolism in Language Models
- Meiqi Guo, Rebecca Hwa, Adriana Kovashka
- TLDR: We present a new evaluative framework for language models to decode symbolism and show that conventional symbols are more reliably elicited from LMs while situated symbols are challenging.
- A Survey on Zero Pronoun Translation
- Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi, Zhaopeng Tu
- TLDR: We provide a comprehensive overview of the literature on zero pronoun translation and provide a number of insightful findings.
- We Understand Elliptical Sentences, and Language Models should Too: A New Dataset for Studying Ellipsis and its Interaction with Thematic Fit
- Davide Testa, Emmanuele Chersoni, Alessandro Lenci
- TLDR: We explore the effect of the prototypicality of event participants on the ability of language models to resolve ellipsis and reconstruct the missing element in a sentence.
- MPCHAT: Towards Multimodal Persona-Grounded Conversation
- Jaewoo Ahn, Yeda Song, Sangdoo Yun, Gunhee Kim
- TLDR: We extend persona-grounded dialogue to the multimodal domain and show that incorporating multimodally grounded personas is crucial for improving multimodal dialogue comprehension.
- DOC: Improving Long Story Coherence With Detailed Outline Control
- Kevin Yang, Dan Klein, Nanyun Peng, Yuandong Tian
- TLDR: We propose the Detailed Outline Control framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories.
- Dual-Alignment Pre-training for Cross-lingual Sentence Embedding
- Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang
- TLDR: We propose a novel token-level alignment pre-training task for cross-lingual sentence embedding that incorporates both sentence-level and token-level alignment.
- Exploring Better Text Image Translation with Multimodal Codebook
- Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su
- TLDR: We present a novel multi-stage training framework and a multimodal codebook-based model for text image translation.
- FEDLEGAL: The First Real-World Federated Learning Benchmark for Legal NLP
- Zhuo Zhang, Xiangjing Hu, Jingyuan Zhang, Yating Zhang, Hui Wang, Lizhen Qu, Zenglin Xu
- TLDR: We present the first real-world FL benchmark for legal NLP, coined FEDLEGAL, which comprises five legal NLP tasks and one privacy task based on data from Chinese courts.
- A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning
- Naibin Gu, Peng Fu, Xiyu Liu, Zhengxiao Liu, Zheng Lin, Weiping Wang
- TLDR: We propose a gradient control method to consolidate the backdoor attack effect, which improves the effect of the attack on sentiment classification and spam detection.
- History Semantic Graph Enhanced Conversational KBQA with Temporal Information Modeling
- Hao Sun, Yang Li, Liwei Deng, Bowen Li, Binyuan Hui, Binhua Li, Yunshi Lan, Yan Zhang, Yongbin Li
- TLDR: We propose a history semantic graph approach for conversational KBQA that effectively models long-range semantic dependencies in conversation history while maintaining low computational cost.
- From the One, Judge of the Whole: Typed Entailment Graph Construction with Predicate Generation
- Zhibin Chen, Yansong Feng, Dongyan Zhao
- TLDR: We propose a multi-stage method for generating high-quality and scale-controllable entailment graphs based on predicates and predicate distributions.
- Alleviating Over-smoothing for Unsupervised Sentence Representation
- Nuo Chen, Linjun Shou, Jian Pei, Ming Gong, Bowen Cao, Jianhui Chang, Jia Li, Daxin Jiang
- TLDR: Simple and effective contrastive learning for unsupervised sentence representation.
- Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model
- Yeskendir Koishekenov, Alexandre Berard, Vassilina Nikoulina
- TLDR: We propose a pruning method that enables the removal of up to 80% of experts without further finetuning and with negligible loss in translation quality, making it feasible to run the model on a single 32GB GPU (sketched below).
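A minimal sketch of the selection rule, under the assumption that experts are ranked by the total routing mass they receive on data from the target language pair; the gate statistics below are random stand-ins, and the 20% keep ratio mirrors the paper's 80% pruning:

```python
# Sketch of language-specific expert pruning for an MoE translation model.
import torch

num_experts, num_tokens = 128, 10_000
# gate_probs[t, e]: router probability of expert e for token t (stand-in).
gate_probs = torch.softmax(torch.randn(num_tokens, num_experts), dim=-1)

expert_mass = gate_probs.sum(dim=0)        # total routing mass per expert
keep = int(num_experts * 0.2)              # keep top 20%, prune the rest
kept = torch.topk(expert_mass, keep).indices.sort().values
print(f"keeping {keep}/{num_experts} experts", kept[:10])
# The pruned experts' feed-forward weights can then be dropped from the
# checkpoint, shrinking memory with no further fine-tuning.
```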
- DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue
- William Held, Christopher Hidey, Fei Liu, Eric Zhu, Rahul Goel, Diyi Yang, Rushin Shah
- TLDR: We show that contrastive alignment pretraining improves zero-shot performance of multilingual and codeswitched semantic parsing systems.
- From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
- Li Sun, Florian Luisier, Kayhan Batmanghelich, Dinei Florencio, Cha Zhang
- TLDR: We present a novel open-vocabulary language model that learns word representations from character sequences and uses a hierarchical two-level approach to represent them.
- MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling
- Yu Song, Santiago Miret, Bang Liu
- TLDR: We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text.
- Code4Struct: Code Generation for Few-Shot Event Structure Prediction
- Xingyao Wang, Sha Li, Heng Ji
- TLDR: We propose Code4Struct, which casts structure prediction as code generation and uses it to generate event-argument structures from text.
- GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles
- Tanmay Parekh, I-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, Nanyun Peng
- TLDR: We present a large and diverse EAE ontology and a new benchmarking dataset for event argument extraction.
- Efficient Semiring-Weighted Earley Parsing
- Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, Jason Eisner
- TLDR: We present Earley’s (1970) context-free parsing algorithm as a deduction system, incorporating various known and new speed-ups.
- Tree-Based Representation and Generation of Natural and Mathematical Language
- Alexander Scarlatos, Andrew Lan
- TLDR: We propose a new language model for mathematical language that combines text and math, and show that it outperforms baselines on mathematical expression generation tasks.
- ParaLS: Lexical Substitution via Pretrained Paraphraser
- Jipeng Qiang, Kang Liu, Yun Li, Yunhao Yuan, Yi Zhu
- TLDR: We propose two simple decoding strategies that focus on the variations of the target word during decoding and generate paraphrases from a paraphraser that preserve the sentence’s meaning.
- Peer-Label Assisted Hierarchical Text Classification
- Junru Song, Feifei Wang, Yang Yang
- TLDR: We propose a novel method for hierarchical text classification that uses the latent relevancy of labels in the same level of the hierarchy to improve the classification performance.
- Free Lunch for Efficient Textual Commonsense Integration in Language Models
- Wanyun Cui, Xingran Chen
- TLDR: We improve the efficiency of integrating textual commonsense knowledge into language models by grouping training samples that share commonsense descriptions into the same batch.
- A Probabilistic Framework for Discovering New Intents
- Yunhua Zhou, Guofeng Quan, Xipeng Qiu
- TLDR: We propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables.
- MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset
- Leonhard Hennig, Philippe Thomas, Sebastian Möller
- TLDR: We present a new multilingual dataset for relation extraction and show that machine translation is a viable strategy for transferring RE instances, but also that translation and annotation-projection errors can degrade dataset quality and RE model performance.
- Towards Higher Pareto Frontier in Multilingual Machine Translation
- Yichong Huang, Xiaocheng Feng, Xinwei Geng, Baohang Li, Bing Qin
- TLDR: We propose a new training framework for multilingual neural machine translation that pushes the Pareto frontier outwards rather than making trade-offs.
- Small Pre-trained Language Models Can be Fine-tuned as Large Models via Over-Parameterization
- Ze-Feng Gao, Kun Zhou, Peiyu Liu, Wayne Xin Zhao, Ji-Rong Wen
- TLDR: We propose a new approach for scaling up the parameters of large pre-trained language models.
- Entity Tracking in Language Models
- Najoung Kim, Sebastian Schuster
- TLDR: We investigate the ability of language models to track entities in discourse, and show that only GPT-3 and GPT-3.5 models, which have been pretrained on large amounts of code, exhibit this ability.
- A Textual Dataset for Situated Proactive Response Selection
- Naoki Otani, Jun Araki, HyeongSik Kim, Eduard Hovy
- TLDR: We present a task of proactive response selection based on situational information and show that it is not easy to select appropriate responses to a given request and situation.
- DiffusionNER: Boundary Diffusion for Named Entity Recognition
- Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang
- TLDR: We propose a novel entity generation algorithm that learns to generate named entities from noisy spans and uses a reverse diffusion process to recover the entity boundaries.
- WACO: Word-Aligned Contrastive Learning for Speech Translation
- Siqi Ouyang, Rong Ye, Lei Li
- TLDR: Word-Aligned COntrastive learning for extremely low-resource speech-to-text translation.
- Cross-lingual Continual Learning
- Meryem M’hamdi, Xiang Ren, Jonathan May
- TLDR: We present a principled evaluation paradigm for cross-lingual continual learning and provide insights into what makes multilingual sequential learning particularly challenging.
- Faithful Question Answering with Monte-Carlo Planning
- Ruixin Hong, Hongming Zhang, Hong Zhao, Dong Yu, Changshui Zhang
- TLDR: We propose FAME (FAithful question answering with Monte-Carlo planning) to answer questions based on faithful reasoning steps.
- Unbalanced Optimal Transport for Unbalanced Word Alignment
- Yuki Arase, Han Bao, Sho Yokoi
- TLDR: We show that the family of optimal transport (OT) methods provides a natural and powerful approach to word alignment, including the problem of null alignment.
- Guiding Computational Stance Detection with Expanded Stance Triangle Framework
- Zhengyuan Liu, Yong Keong Yap, Hai Leong Chieu, Nancy Chen
- TLDR: We propose a novel approach to stance detection that characterizes the fundamental ways people express their stance and the relationship between explicit and implicit objects.
- Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast
- Yiduo Guo, Yaobo Liang, Dongyan Zhao, Bing Liu, Nan Duan
- TLDR: We analyze the fine-tuning process and propose a method to address the gap between the performance of the source language and that of the non-source languages.
- Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning
- Ran Zhou, Xin Li, Lidong Bing, Erik Cambria, Chunyan Miao
- TLDR: We propose a novel method for cross-lingual named entity recognition by combining contrastive self-training and pseudo label refinement in one coherent framework.
- MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks
- Letitia Parcalabescu, Anette Frank
- TLDR: We propose MM-SHAP, a performance-agnostic multimodality score based on Shapley values that reliably quantifies in which proportions a multimodal model uses individual modalities.
- Towards Boosting the Open-Domain Chatbot with Human Feedback
- Hua Lu, Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- TLDR: We propose a novel and efficient framework Diamante to boost the open-domain chatbot, where two kinds of human feedback (including explicit demonstration and implicit preference) are collected and leveraged.
- Knowledge-enhanced Mixed-initiative Dialogue System for Emotional Support Conversations
- Yang Deng, Wenxuan Zhang, Yifei Yuan, Wai Lam
- TLDR: We propose a knowledge-enhanced mixed-initiative framework for emotional support conversations and present a novel analysis of mixed-initiative systems.
- UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction
- Hang Yan, Yu Sun, Xiaonan Li, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
- TLDR: We propose a unified token-pair classification architecture for Information Extraction that outperforms task-specific and unified models on all tasks in 10 datasets.
- Social-Group-Agnostic Bias Mitigation via the Stereotype Content Model
- Ali Omrani, Alireza Salkhordeh Ziabari, Charles Yu, Preni Golazizian, Brendan Kennedy, Mohammad Atari, Heng Ji, Morteza Dehghani
- TLDR: We propose a novel method for debiasing stereotypes by using only pairs of terms for each social attribute.
- Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
- Yixin Liu, Alex Fabbri, Pengfei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev
- TLDR: We propose a modified summarization salience protocol based on atomic content units and a robust benchmark for summarization evaluation, which leads to more statistically stable and significant results.
- FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information
- Andrew Zhu, Karmanya Aggarwal, Alexander Feng, Lara Martin, Chris Callison-Burch
- TLDR: We present FIREBALL, a large dataset containing nearly 25,000 unique sessions from real D&D gameplay on Discord with true game state info.
- A fine-grained comparison of pragmatic language understanding in humans and language models
- Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- TLDR: We compare language models and humans on a fine-grained battery of pragmatic phenomena and find that models can approach human performance and mirror human error patterns, suggesting sensitivity to similar linguistic cues.
- Counterfactual Multihop QA: A Cause-Effect Approach for Reducing Disconnected Reasoning
- Wangzhen Guo, Qinkang Gong, Yanghui Rao, Hanjiang Lai
- TLDR: We propose a novel counterfactual multihop QA model that exploits the true multi-hop reasoning instead of shortcuts.
- Causal-Debias: Unifying Debiasing in Pretrained Language Models and Fine-tuning via Causal Invariant Learning
- Fan Zhou, Yuzhou Mao, Liu Yu, Yi Yang, Ting Zhong
- TLDR: We propose a unified debiasing framework Causal-Debias to remove unwanted stereotypical associations in PLMs during fine-tuning.
- Parameter-Efficient Fine-Tuning without Introducing New Latency
- Baohao Liao, Yan Meng, Christof Monz
- TLDR: We present a new sparse parameter-efficient fine-tuning method for language models that achieves state-of-the-art performance and storage efficiency while maintaining inference speed identical to full fine-tuning.
- MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition
- Jinyuan Fang, Xiaobin Wang, Zaiqiao Meng, Pengjun Xie, Fei Huang, Yong Jiang
- TLDR: We propose MANNER, a variational memory-augmented few-shot NER model which can adapt the knowledge learned from the source domain to cross-domain few-shot named entity recognition.
- MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages
- Jack FitzGerald, Christopher Hench, Charith Peris, Scott Mackie, Kay Rottmann, Ana Sanchez, Aaron Nash, Liam Urbach, Vishesh Kakarala, Richa Singh, Swetha Ranganath, Laurie Crist, Misha Britan, Wouter Leeuwis, Gokhan Tur, Prem Natarajan
- TLDR: We present the MASSIVE dataset: a Multilingual Amazon SLU resource package (SLURP) for Slot-filling, Intent classification, and Virtual-assistant Evaluation.
- Distilling Script Knowledge from Large Language Models for Constrained Language Planning
- Siyu Yuan, Jiangjie Chen, Ziquan Fu, Xuyang Ge, Soham Shah, Charles Jankowski, Yanghua Xiao, Deqing Yang
- TLDR: We distill a constrained language planning dataset from large language models and use it to improve the constrained language planning ability of smaller specialized models.
- REDFM: a Filtered and Multilingual Relation Extraction Dataset
- Pere-Lluís Huguet Cabot, Simone Tedeschi, Axel-Cyrille Ngonga Ngomo, Roberto Navigli
- TLDR: We present SREDFM, a large, automatically annotated dataset covering 18 languages, 400 relation types, and 13 entity types, totaling more than 40 million triplet instances, and a smaller, human-revised dataset, REDFM, for seven languages, that allows for the evaluation of multilingual RE systems.
- Modeling Appropriate Language in Argumentation
- Timon Ziegenbein, Shahbaz Syed, Felix Lange, Martin Potthast, Henning Wachsmuth
- TLDR: We propose a new taxonomy of 14 dimensions that determine inappropriate language in online discussions.
- CELDA: Leveraging Black-box Language Model as Enhanced Classifier without Labels
- Hyunsoo Cho, Youna Kim, Sang-goo Lee
- TLDR: We propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves the text classification accuracy with a very weak-supervision signal.
- MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction
- Zhibin Gou, Qingyan Guo, Yujiu Yang
- TLDR: We propose Multi-view Prompting (MvP), which aggregates sentiment elements generated in different orders, leveraging the intuition of human-like problem-solving from different views (the aggregation step is sketched below).
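A minimal sketch of the aggregation step, with toy tuples and an illustrative majority-vote threshold standing in for the model's actual multi-order generations:

```python
# Sketch of MvP-style aggregation: tuples predicted under different element
# orders are combined by majority vote (example tuples are toy stand-ins).
from collections import Counter

views = [
    [("battery", "positive", "lasts long")],
    [("battery", "positive", "lasts long"), ("screen", "negative", "dim")],
    [("battery", "positive", "lasts long")],
]

votes = Counter(t for view in views for t in view)
kept = [t for t, n in votes.items() if n >= len(views) / 2]
print(kept)  # [('battery', 'positive', 'lasts long')]
```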
- ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems
- Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng
- TLDR: We present a new approach to evaluating commonsense in dialogue systems.
- Explanation-based Finetuning Makes Models More Robust to Spurious Cues
- Josh Magnus Ludan, Yixuan Meng, Tai Nguyen, Saurabh Shah, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
- TLDR: We propose explanation-based finetuning as a general approach to mitigate LLMs’ reliance on spurious correlations.
- CAME: Confidence-guided Adaptive Memory Efficient Optimization
- Yang Luo, Xiaozhe Ren, Zangwei Zheng, Zhuo Jiang, Xin Jiang, Yang You
- TLDR: We propose a confidence-guided strategy to reduce the instability of existing memory efficient optimizers and improve performance on BERT and GPT-2 training.
- On Second Thought, Let’s Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
- Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- TLDR: We show that zero-shot CoT reasoning in sensitive domains significantly increases a model’s likelihood to produce harmful or undesirable output, with trends holding across different prompt formats and model variants.
- Solving Math Word Problems via Cooperative Reasoning induced Language Models
- Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Yongfeng Huang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang
- TLDR: We present a cooperative-reasoning-induced PLM for solving math word problems.
- Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model
- Chantal Amrhein, Florian Schottmann, Rico Sennrich, Samuel Läubli
- TLDR: We train a gender-fair rewriting model for German by using a biased machine translation model to generate gender-biased counterparts of real gender-fair text.
- Early Discovery of Disappearing Entities in Microblogs
- Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda
- TLDR: We propose a novel method for detecting disappearing entities from noisy microblog streams in a timely manner.
- DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
- Zhengfu He, Tianxiang Sun, Qiong Tang, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
- TLDR: We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.
- Lifting the Curse of Capacity Gap in Distilling Language Models
- Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song
- TLDR: We propose a mixture of minimal experts (MiniMoE), which adds extra parameters to the student but introduces almost no additional inference compute.
- Towards Faithful Dialogues via Focus Learning
- Yifan Deng, Xingsheng Zhang, Heyan Huang, Yue Hu
- TLDR: We propose Focus Learning, a novel learning approach that adjusts the contribution of each token to the optimization direction by directly scaling the corresponding objective loss.
- Back Translation for Speech-to-text Translation Without Transcripts
- Qingkai Fang, Yang Feng
- TLDR: We present a novel back translation algorithm for speech-to-text translation without transcripts.
- Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation
- Ibrahim Taha Aksu, Min-Yen Kan, Nancy Chen
- TLDR: We propose a new method for zero-shot domain adaptation that uses descriptions of target-domain slots to generate dynamic prefixes concatenated to the keys and values in each layer's self-attention mechanism (sketched below).
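A minimal sketch of prefix-style key/value injection with toy dimensions; the prefix generator and the single-head attention here are simplified stand-ins for the per-layer, multi-head mechanism described in the paper:

```python
# Sketch of Prompter-style dynamic prefixes in self-attention.
import torch
import torch.nn.functional as F

d, prefix_len, seq_len = 64, 5, 10
slot_desc_emb = torch.randn(d)                       # encoded slot description
gen = torch.nn.Linear(d, 2 * prefix_len * d)         # toy prefix generator
pk, pv = gen(slot_desc_emb).view(2, prefix_len, d)   # prefix keys and values

q, k, v = (torch.randn(seq_len, d) for _ in range(3))
k_ext = torch.cat([pk, k], dim=0)                    # prepend to keys ...
v_ext = torch.cat([pv, v], dim=0)                    # ... and to values
attn = F.softmax(q @ k_ext.T / d ** 0.5, dim=-1)
out = attn @ v_ext                                   # (seq_len, d)
```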
- Enhancing Dialogue Generation via Dynamic Graph Knowledge Aggregation
- Chen Tang, Hongbo Zhang, Tyler Loakman, Chenghua Lin, Frank Guerin
- TLDR: We propose a novel framework for knowledge graph enhanced dialogue generation by learning on pseudo nodes and exploiting subgraph patterns within our feature aggregation process.
- Multi-modal Action Chain Abductive Reasoning
- Mengze Li, Tianbao Wang, Jiahe Xu, Kairong Han, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Shiliang Pu, Fei Wu
- TLDR: We introduce a multi-modal action chain abductive reasoning task that infers plausible action chains from incomplete multimodal observations.
- Exploring the Capacity of Pretrained Language Models for Reasoning about Actions and Change
- Weinan He, Canming Huang, Zhanhao Xiao, Yongmei Liu
- TLDR: We propose four essential RAC tasks as a comprehensive textual benchmark and generate problems in a way that minimizes the influence of other linguistic requirements (e.g., grounding) to focus on RAC.
- Unified Demonstration Retriever for In-Context Learning
- Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu
- TLDR: We propose Unified Demonstration Retriever, a single model to retrieve demonstrations for a wide range of tasks.
- Movie101: A New Movie Understanding Benchmark
- Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin
- TLDR: We propose a large-scale Chinese movie benchmark for movie narrating systems that provides external knowledge for better movie understanding and a new metric for movie narration evaluation.
- Enhancing Language Representation with Constructional Information for Natural Language Understanding
- Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Zhilin Gong, Ming Cai, Tianxiang Wang
- TLDR: We propose a new construction grammar framework for natural language understanding, which captures high-order word interactions among constructions and improves language representation.
- Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs
- Siyuan Wang, Zhongyu Wei, Meng Han, Zhihao Fan, Haijun Shan, Qi Zhang, Xuanjing Huang
- TLDR: We propose a structure-modeled textual encoding framework for inductive logical reasoning over incomplete knowledge graphs.
- DimonGen: Diversified Generative Commonsense Reasoning for Explaining Concept Relationships
- Chenzhengyi Liu, Jie Huang, Kerui Zhu, Kevin Chen-Chuan Chang
- TLDR: We propose a novel task for generating diverse sentences describing concept relationships in various everyday scenarios.
- Incorporating Attribution Importance for Improving Faithfulness Metrics
- Zhixue Zhao, Nikolaos Aletras
- TLDR: We propose a simple yet effective soft erasure criterion for feature attribution methods that can be used to measure the importance of each individual token in the input.
- Reward Gaming in Conditional Text Generation
- Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur Parikh, He He
- TLDR: We show that even though learned metrics achieve high performance on the distribution of the data used to train the reward function, the undesirable patterns may be amplified during RL training of the text generation model.
- Hidden Schema Networks
- Ramses Sanchez, Lukas Conrads, Pascal Welke, Kostadin Cvejoski, Cesar Ojeda Marin
- TLDR: We introduce a novel neural language model that enforces, via inductive biases, explicit relational structures which allow for compositionality onto the output representations of pretrained language models.
- Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations
- Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing
- TLDR: We present a novel method that operates on the hidden representations of a PLM to reduce overfitting in the low resource scenarios.
- An Ordinal Latent Variable Model of Conflict Intensity
- Niklas Stoehr, Lucas Torroba Hennigen, Josef Valvoda, Robert West, Ryan Cotterell, Aaron Schein
- TLDR: We propose a probabilistic generative model for measuring conflict intensity that incorporates latent variables that are ordered to indicate higher levels of intensity.
- Multilingual Conceptual Coverage in Text-to-Image Models
- Michael Saxon, William Yang Wang
- TLDR: We propose a benchmark for generative text-to-image systems that measures how well they cover tangible nouns in other languages relative to their training language.
- Pre-Training to Learn in Context
- Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- TLDR: We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models’ in-context learning ability by pre-training the model on a large collection of “intrinsic tasks” in the general plain-text corpus using the simple language modeling objective.
- Ethical Considerations for Machine Translation of Indigenous Languages: Giving a Voice to the Speakers
- Manuel Mager, Elisabeth Mager, Katharina Kann, Ngoc Thang Vu
- TLDR: Ethical considerations for the automatic translation of Indigenous languages.
- Revisiting non-English Text Simplification: A Unified Multilingual Benchmark
- Michael Ryan, Tarek Naous, Wei Xu
- TLDR: We present a new multi-language text simplification benchmark that covers complex-simple sentence pairs in 12 distinct languages and show that few-shot prompting with BLOOM-176b achieves comparable quality to reference simplifications outperforming fine-tuned models in most languages.
- Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
- Yu Gu, Xiang Deng, Yu Su
- TLDR: We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability.
- Privacy-Preserving Domain Adaptation of Semantic Parsers
- Fatemehsadat Mireshghallah, Yu Su, Tatsunori Hashimoto, Jason Eisner, Richard Shin
- TLDR: We propose a new method for generating realistic user utterances synthetically, without compromising the privacy of actual users.
- Guide the Many-to-One Assignment: Open Information Extraction via IoU-aware Optimal Transport
- Kaiwen Wei, Yiran Yang, Li Jin, Xian Sun, Zequn Zhang, Jingyuan Zhang, Xiao Li, Linhao Zhang, Jintao Liu, Guo Zhi
- TLDR: We propose a dynamic many-to-one label assignment strategy for open information extraction based on the intersection-over-union value and a multi-granularity loss.
- Actively Supervised Clustering for Open Relation Extraction
- Jun Zhao, Yongxin Zhang, Qi Zhang, Tao Gui, Zhongyu Wei, Minlong Peng, Mingming Sun
- TLDR: We present a novel setting for actively supervised clustering for Open Relation Extraction, which provides the necessary guidance for clustering without a significant increase in human effort.
- ConvGQR: Generative Query Reformulation for Conversational Search
- Fengran Mo, Kelong Mao, Yutao Zhu, Yihong Wu, Kaiyu Huang, Jian-Yun Nie
- TLDR: We propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers.
- KILM: Knowledge Injection into Encoder-Decoder Language Models
- Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, Dilek Hakkani-Tur
- TLDR: We propose Knowledge Injection into Language Models, a novel approach that injects entity-related knowledge into encoder-decoder PLMs, via a generative knowledge infilling objective through continued pre-training.
- VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions
- Yuxuan Wang, Zilong Zheng, Xueliang Zhao, Jinpeng Li, Yueqian Wang, Dongyan Zhao
- TLDR: We present a new video-grounded dialogue understanding benchmark that captures the intrinsic attributes of multimodal dialogues.
- NLPeer: A Unified Resource for the Computational Study of Peer Review
- Nils Dycke, Ilia Kuznetsov, Iryna Gurevych
- TLDR: We present NLPeer, a multi-domain corpus of peer review data for NLP and its applications.
- IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures
- Mingyu Zheng, Yang Hao, Wenbin Jiang, Zheng Lin, Yajuan Lyu, QiaoQiao She, Weiping Wang
- TLDR: We present a new TQA dataset with implicit and multi-type table structures and propose an RGCN-RCI framework for TQA.
- Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization
- Pengcheng He, Baolin Peng, Song Wang, Yang Liu, Ruochen Xu, Hany Hassan, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang
- TLDR: We present a new pre-trained language model optimized for abstractive text summarization.
- Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models’ Memories
- Shizhe Diao, Tianyang Xu, Ruijia Xu, Jiawei Wang, Tong Zhang
- TLDR: We propose a novel mixture-of-adapters gate to adapt PLMs dynamically to a new domain.
- Unsupervised Graph-Text Mutual Conversion with a Unified Pretrained Language Model
- Yi Xu, Shuqian Sheng, Jiexing Qi, Luoyi Fu, Zhouhan Lin, Xinbing Wang, Chenghu Zhou
- TLDR: We propose INFINITY, a novel unsupervised graph-text mutual conversion method with a unified pretrained language model that does not introduce external annotation tools or additional parallel information.
- Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications
- Han Cheol Moon, Shafiq Joty, Ruochen Zhao, Megh Thakkar, Chi Xu
- TLDR: We propose a novel adversarial robustness framework that combines randomized smoothing (RS) with masked inference (MI) to improve the adversarial robustness and stability of large-scale NLP models.
- SESCORE2: Learning Text Generation Evaluation via Synthesizing Realistic Mistakes
- Wenda Xu, Xian Qian, Mingxuan Wang, Lei Li, William Yang Wang
- TLDR: We propose SEScore2, a self-supervised approach for training a model-based metric for evaluating text generation quality without human-annotated ratings.
- Tokenization and the Noiseless Channel
- Vilém Zouhar, Clara Meister, Juan Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell
- TLDR: We propose measuring tokenizer quality by the Rényi efficiency of the subword distribution, i.e., how close its entropy is to the maximum possible (a worked sketch follows below).
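A worked sketch of the measure: the Rényi entropy of the tokenizer's unigram distribution, normalized by the maximum entropy log|V|. The toy corpus and the alpha value are illustrative assumptions:

```python
# Sketch: Rényi efficiency of a tokenizer's unigram subword distribution.
import math
from collections import Counter

def renyi_efficiency(token_counts, alpha=2.5):
    total = sum(token_counts.values())
    probs = [c / total for c in token_counts.values()]
    if alpha == 1.0:  # Shannon entropy as the limiting case
        h = -sum(p * math.log(p) for p in probs)
    else:
        h = math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)
    return h / math.log(len(probs))  # normalize by max entropy log|V|

tokens = "the cat sat on the mat near the dog".split()  # toy tokenized corpus
print(renyi_efficiency(Counter(tokens)))
```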
- Contextual Distortion Reveals Constituency: Masked Language Models are Implicit Parsers
- Jiaxi Li, Wei Lu
- TLDR: We propose a novel chart-based method for extracting parse trees from masked language models (LMs) without the need to train separate parsers.
- MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection via Meta Learning
- Zhenrui Yue, Huimin Zeng, Yang Zhang, Lanyu Shang, Dong Wang
- TLDR: MetaAdapt is a meta learning based approach for domain adaptive few-shot misinformation detection.
- Tackling Modality Heterogeneity with Multi-View Calibration Network for Multimodal Sentiment Detection
- Yiwei Wei, Shaozu Yuan, Ruosong Yang, Lei Shen, Zhangmeizhi Li, Longbiao Wang, Meng Chen
- TLDR: We propose a novel Multi-View Calibration Network for multimodal post fusion and a novel sentiment-based congruity constraint task to address inconsistent annotated labels.
- COLA: Contextualized Commonsense Causal Reasoning from the Causal Inference Perspective
- Zhaowei Wang, Quyet V. Do, Hongming Zhang, Jiayao Zhang, Weiqi Wang, Tianqing Fang, Yangqiu Song, Ginny Wong, Simon See
- TLDR: We propose a new task to detect commonsense causation between two events in an event sequence (i.e., context) using contextualized commonsense causal reasoning.
- MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization
- Shivam Sharma, Ramaneswaran S, Udit Arora, Md. Shad Akhtar, Tanmoy Chakraborty
- TLDR: We propose a novel task, MEMEX, to extract the relevant background of a meme and its related documents.
- WikiHowQA: A Comprehensive Benchmark for Multi-Document Non-Factoid Question Answering
- Valeriia Bolotova-Baranova, Vladislav Blinov, Sofya Filippova, Falk Scholer, Mark Sanderson
- TLDR: We present WikiHowQA, a new multi-document NFQA benchmark built on WikiHow, a website dedicated to answering “how-to” questions.
- Making Language Models Better Reasoners with Step-Aware Verifier
- Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen
- TLDR: We present DiVeRSe, a novel approach that further enhances the reasoning capability of language models.
- Distributed Marker Representation for Ambiguous Discourse Markers and Entangled Relations
- Dongyu Ru, Lin Qiu, Xipeng Qiu, Yue Zhang, Zheng Zhang
- TLDR: We propose to learn a Distributed Marker Representation for discourse markers and propose a new way to represent discourse relations.
- MISGENDERED: Limits of Large Language Models in Understanding Pronouns
- Tamanna Hossain, Sunipa Dev, Sameer Singh
- TLDR: We comprehensively evaluate popular language models for their ability to correctly use English gender-neutral pronouns (e.g., singular they, them) and neo-pronouns used by individuals whose gender identity is not represented by binary pronouns.
- Reasoning with Language Model Prompting: A Survey
- Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Huajun Chen
- TLDR: We provide a comprehensive survey of cutting-edge research on reasoning with language model prompting.
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation
- Matthieu Futeral, Cordelia Schmid, Ivan Laptev, Benoît Sagot, Rachel Bawden
- TLDR: We present a novel multimodal machine translation model built on a strong text-only MT baseline, together with a contrastive multilingual multimodal evaluation set.
- Hybrid Knowledge Transfer for Improved Cross-Lingual Event Detection via Hierarchical Sample Selection
- Luis Guzman Nateras, Franck Dernoncourt, Thien Nguyen
- TLDR: We propose a novel method for cross-lingual knowledge transfer in event detection under a zero-shot setting, where a model is trained on a source language but evaluated on a distinct target language for which no labeled data is available.
- BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training
- Yiming Yan, Tao Wang, Chengqi Zhao, Shujian Huang, Jiajun Chen, Mingxuan Wang
- TLDR: We analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems.
- Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment
- Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
- TLDR: We show that relation alignment can be enforced by encouraging the language attention from ‘mug’ to ‘grass’ (capturing the semantic relation ‘in’) to match the visual attention from the mug to the grass (capturing the corresponding physical relation).
- Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona
- Yihong Tang, Bo Wang, Miao Fang, Dongming Zhao, Kun Huang, Ruifang He, Yuexian Hou
- TLDR: We combine sparse and dense persona descriptions and dialogue histories to generate personalized dialogue agents.
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
- Yasumasa Onoe, Michael Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi
- TLDR: We study LMs’ ability to make inferences based on injected facts and show that they can propagate those facts.
- Explaining How Transformers Use Context to Build Predictions
- Javier Ferrando, Gerard I. Gállego, Ioannis Tsiamas, Marta R. Costa-jussà
- TLDR: We present a procedure to analyze models for language generation that generate human-like source-target alignments for building predictions.
- DISCO: Distilling Counterfactuals with Large Language Models
- Zeming Chen, Qiyue Gao, Antoine Bosselut, Ashish Sabharwal, Kyle Richardson
- TLDR: We present a new method for generating high-quality counterfactual data for natural language inference and show that it improves generalization and robustness.
- Non-Sequential Graph Script Induction via Multimedia Grounding
- Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji
- TLDR: We propose a novel approach to generate and predict graph scripts for procedural planning using video sequences.
- SCOTT: Self-Consistent Chain-of-Thought Distillation
- Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren
- TLDR: We propose SCOTT, a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger.
- Clinical Note Owns its Hierarchy: Multi-Level Hypergraph Neural Networks for Patient-Level Representation Learning
- Nayeon Kim, Yinhua Piao, Sun Kim
- TLDR: We propose a taxonomy-aware multi-level hypergraph neural network for clinical notes that can effectively utilize the hierarchy information of the patient.
- Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization
- Dongqi Pu, Yifan Wang, Vera Demberg
- TLDR: We propose a novel summarization model that comprehensively incorporates both the types and uncertainty of rhetorical relations in document-level rhetorical structure.
- Evaluating Open-Domain Question Answering in the Era of Large Language Models
- Ehsan Kamalloo, Nouha Dziri, Charles Clarke, Davood Rafiei
- TLDR: We present a comprehensive evaluation of open-domain question answering models and show that lexical-matching metrics significantly underestimate the performance of current top models.
- No clues good clues: out of context Lexical Relation Classification
- Lucia Pitarch, Jordi Bernad, Lacramioara Dranca, Carlos Bobed Lisbona, Jorge Gracia
- TLDR: We show that very simple input formats, without contextual clues, allow PTLMs to perform strongly on lexical relation classification and graded lexical entailment.
- Won’t Get Fooled Again: Answering Questions with False Premises
- Shengding Hu, Yifan Luo, Huadong Wang, Xingyi Cheng, Zhiyuan Liu, Maosong Sun
- TLDR: We show that the PLMs already possess the knowledge required to rebut such questions, and the key is how to activate the knowledge.
- What the DAAM: Interpreting Stable Diffusion Using Cross Attention
- Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, Ferhan Ture
- TLDR: We perform a text-image attribution analysis of Stable Diffusion from a visuolinguistic perspective using cross-attention maps.
- Zero-shot Faithful Factual Error Correction
- Kung-Hsiang Huang, Hou Pong Chan, Heng Ji
- TLDR: We present a zero-shot framework for faithful factual error correction that identifies and corrects factual errors without supervision and aligns better with human judgments.
- Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification
- Sha Li, Ruining Zhao, Manling Li, Heng Ji, Chris Callison-Burch, Jiawei Han
- TLDR: We propose a new method for event schema induction that learns to generate large and complex schemas from large language models.
- Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts
- Mohna Chakraborty, Adithya Kulkarni, Qi Li
- TLDR: We propose a zero-shot method for generating high-quality natural-language prompts for binary sentence-level sentiment classification.
- Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging
- Fabian Schmidt, Ivan Vulić, Goran Glavaš
- TLDR: We propose a simple and effective method that averages different checkpoints (i.e., model snapshots) during task fine-tuning to improve the robustness of ‘true’ cross-lingual transfer setups (sketched below).
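A minimal sketch of the averaging step, assuming a few fine-tuning snapshots saved as PyTorch state dicts; the paths are hypothetical and all entries are assumed to be floating-point parameters:

```python
# Sketch of checkpoint averaging before zero-shot cross-lingual evaluation.
import torch

paths = ["ckpt_step1000.pt", "ckpt_step2000.pt", "ckpt_step3000.pt"]
state_dicts = [torch.load(p, map_location="cpu") for p in paths]

avg = {key: sum(sd[key].float() for sd in state_dicts) / len(state_dicts)
       for key in state_dicts[0]}

# Load the averaged weights into a model of matching architecture, then
# evaluate zero-shot on the target languages.
# model.load_state_dict(avg)
```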
- Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training
- Yan Zeng, Wangchunshu Zhou, Ao Luo, Ziming Cheng, Xinsong Zhang
- TLDR: We introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training with shared architectures and objectives.
- Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars
- Songlin Yang, Roger Levy, Yoon Kim
- TLDR: We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing.
- Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
- Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
- TLDR: We show that randomly initialized Transformers are more biased towards low-sensitivity Boolean functions than recurrent models, and that this simplicity bias comes with better generalization.
- Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generation
- Rishabh Gupta, Shaily Desai, Manvi Goel, Anil Bandhakavi, Tanmoy Chakraborty, Md. Shad Akhtar
- TLDR: We propose a novel framework for intent-conditioned counterspeech generation and demonstrate its effectiveness in combating hate speech.
- DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation
- Suraj Kothawade, Anmol Mekala, D.Chandra Sekhara Hetha Havya, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi
- TLDR: We propose DITTO (Data-efficient and faIr Targeted subseT selectiOn) a speech selection method that uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget.
- Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
- Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- TLDR: We propose Verify-and-Edit framework for CoT prompting, which improves trust and model performance on complex reasoning tasks by post-editing reasoning chains according to external knowledge.
- Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation
- Zhiwei Cao, Baosong Yang, Huan Lin, Suhang Wu, Xiangpeng Wei, Dayiheng Liu, Jun Xie, Min Zhang, Jinsong Su
- TLDR: We propose a method for bridging domain gaps in context representations for k-nearest-neighbor neural machine translation.
- Node Placement in Argument Maps: Modeling Unidirectional Relations in High & Low-Resource Scenarios
- Iman Jundi, Neele Falk, Eva Maria Vecchi, Gabriella Lapesa
- TLDR: We introduce node placement in argument maps and show that it improves performance of both humans and models.
- Towards a Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review
- Fred Philippy, Siwen Guo, Shohreh Haddadan
- TLDR: We provide a systematic review of existing research on the cross-lingual transfer capacity of pre-trained Multilingual Language Models and provide a framework for future research.
- Toward Human-Like Evaluation for Natural Language Generation with Error Analysis
- Qingyu Lu, Liang Ding, Liping Xie, Kanjian Zhang, Derek F. Wong, Dacheng Tao
- TLDR: We present a novel approach that incorporates fine-grained error analysis into pretrained language model metrics, making natural language generation evaluation more human-like.
- Connective Prediction for Implicit Discourse Relation Recognition via Knowledge Distillation
- Hongyi Wu, Hao Zhou, Man Lan, Yuanbin Wu, Yadong Zhang
- TLDR: We propose a novel Connective Prediction via Knowledge Distillation (CP-KD) approach to instruct large-scale pre-trained language models (PLMs) mining the latent correlations between connectives and discourse relations, which is meaningful for IDRR.
- What is the best recipe for character-level encoder-only modelling?
- Kris Cao
- TLDR: We benchmark recent progress in language understanding models that output contextualised representations at the character level.
- Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training
- Zejun Li, Zhihao Fan, Jingjing Chen, Qi Zhang, Xuanjing Huang, Zhongyu Wei
- TLDR: We propose a framework that unifies cross-lingual and cross-modal pre-training and show that the universal multilingual representation learned from texts allows the cross-modal interaction learned in English to transfer to other languages.
- Learning “O” Helps for Learning More: Handling the Unlabeled Entity Problem for Class-incremental NER
- Ruotian Ma, Xuanting Chen, Zhang Lin, Xin Zhou, Junzhe Wang, Tao Gui, Qi Zhang, Xiang Gao, Yun Wen Chen
- TLDR: We propose a novel representation learning method to learn discriminative representations for the entity classes and “O” and propose two effective distance-based relabeling strategies for better learning the old classes.
- Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination
- Hao Fei, Qian Liu, Meishan Zhang, Min Zhang, Tat-Seng Chua
- TLDR: We propose a novel unsupervised multimodal machine translation setup, inference-time image-free UMMT, where the model is trained with source-side text-image pairs and tested with text-only inputs.
- CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition
- Tingting Ma, Qianhui Wu, Huiqiang Jiang, Börje Karlsson, Tiejun Zhao, Chin-Yew Lin
- TLDR: We propose a novel method for cross-lingual named entity recognition in which collaborating models denoise the pseudo labels used by each other.
- Dialect-robust Evaluation of Generated Text
- Jiao Sun, Thibault Sellam, Elizabeth Clark, Tu Vu, Timothy Dozat, Dan Garrette, Aditya Siddhant, Jacob Eisenstein, Sebastian Gehrmann
- TLDR: We propose NANO, a new metric for text generation that incorporates regional and language information to the metric pretraining and improves the correlation between automated metrics and human ratings.
- Understanding and Improving the Robustness of Terminology Constraints in Neural Machine Translation
- Huaao Zhang, Qiang Wang, Bo Qin, Zelin Shi, Haibo Wang, Ming Chen
- TLDR: We study the robustness of two typical terminology translation methods: Placeholder (PH) and Code-Switch (CS), concerning (1) the number of constraints and (2) the target constraint length.
- Language model acceptability judgements are not always robust to context
- Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy, Adina Williams
- TLDR: We show that model judgements are generally robust when placed in randomly sampled linguistic contexts, but are unstable when contexts match the test stimuli in syntactic structure.
- RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations
- Yilun Zhao, Chen Zhao, Linyong Nan, Zhenting Qi, Wenlin Zhang, Xiangru Tang, Boyu Mi, Dragomir Radev
- TLDR: We propose RobuT, a benchmark that measures the robustness of Table QA models to task-specific, human-annotated adversarial perturbations.
- Morphological Inflection: A Reality Check
- Jordan Kodner, Sarah Payne, Salam Khalifa, Zoey Liu
- TLDR: We propose new data sampling and evaluation strategies that better reflect likely use-cases and more accurately measure the generalization abilities of current inflection systems.
- TOME: A Two-stage Approach for Model-based Retrieval
- Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, Haifeng Wang
- TLDR: We propose a novel two-stage model-based retrieval approach called TOME, which makes two major technical contributions: the use of tokenized URLs as identifiers and the design of a two-stage generation architecture.
- Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner
- Frank Palma Gomez, Subhadarshi Panda, Michael Flor, Alla Rozovskaya
- TLDR: We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation (sketched below).
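A minimal sketch of the round-trip step using off-the-shelf Marian OPUS-MT checkpoints; the model names are real Hugging Face checkpoints, while the carrier sentence and the harvest-by-position heuristic in the final comment are assumptions:

```python
# Sketch of round-trip NMT for distractor generation.
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, name, n=1):
    tok = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)
    batch = tok(texts, return_tensors="pt", padding=True)
    out = model.generate(**batch, num_beams=max(4, n), num_return_sequences=n)
    return tok.batch_decode(out, skip_special_tokens=True)

src = ["The committee reached a unanimous decision."]   # carrier sentence
pivot = translate(src, "Helsinki-NLP/opus-mt-en-de")    # English -> German
variants = translate(pivot, "Helsinki-NLP/opus-mt-de-en", n=5)  # and back
# Distractor candidates: the words that fill the position of the target word
# ("decision") across the round-trip variants, filtered against the key.
print(variants)
```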
- Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks
- Moritz Plenz, Juri Opitz, Philipp Heinisch, Philipp Cimiano, Anette Frank
- TLDR: We present a new unsupervised method for constructing Contextualized Commonsense Knowledge Graphs (CCKGs) that extract contextually relevant knowledge from large knowledge graphs (KGs).
- miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings
- Tassilo Klein, Moin Nabi
- TLDR: We present a new contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding.
- Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency
- Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan
- TLDR: We propose a novel framework for non-linguistic skill injection for Math-NLP that enables the learning of strict arithmetic reasoning.
- Forgotten Knowledge: Examining the Citational Amnesia in NLP
- Janvijay Singh, Mukund Rungta, Diyi Yang, Saif Mohammad
- TLDR: We show that most papers cited in NLP work are from the five years immediately preceding publication, and that only about 17% are more than ten years old.
- Measuring the Instability of Fine-Tuning
- Yupei Du, Dong Nguyen
- TLDR: We analyze the fine-tuning instability of pre-trained language models on downstream tasks across random seeds and propose a systematic evaluation framework for the validity of instability measures.
- FairPrism: Evaluating Fairness-Related Harms in Text Generation
- Eve Fleisig, Aubrie Amstutz, Chad Atalla, Su Lin Blodgett, Hal Daumé III, Alexandra Olteanu, Emily Sheng, Dan Vann, Hanna Wallach
- TLDR: We present a dataset of 5,000 examples of AI-generated English text with detailed human annotations covering a diverse set of harms relating to gender and sexuality.
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
- Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Leonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor
- TLDR: We use reinforcement learning with reference-free, textual-entailment rewards to optimize for factual consistency and explore the ensuing trade-offs, as improved consistency may come at the cost of less informative or more extractive summaries.
- SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams
- Te-Lin Wu, Satwik Kottur, Andrea Madotto, Mahmoud Azab, Pedro Rodriguez, Babak Damavandi, Nanyun Peng, Seungwhan Moon
- TLDR: We propose a novel data collection paradigm that captures user–assistant interactions with all of the aforementioned features in VR.
- Multilingual LLMs are Better Cross-lingual In-context Learners with Alignment
- Eshaan Tanwar, Subhabrata Dutta, Manish Borthakur, Tanmoy Chakraborty
- TLDR: We provide the first in-depth analysis of ICL for cross-lingual text classification and propose a novel prompt construction strategy for cross-lingual ICL.
- APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning
- Soumya Sanyal, Yichong Xu, Shuohang Wang, Ziyi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren
- TLDR: We propose APOLLO, a simple adaptive pretraining approach to improve the logical reasoning skills of language models.
- MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering
- Vaishali Pal, Andrew Yates, Evangelos Kanoulas, Maarten de Rijke
- TLDR: We propose a new multi-table QA model that answers questions over multiple tables and generates tabular answers.
- To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion
- Rui Li, Xu Chen, Chaozhuo Li, Yanming Shen, Jianan Zhao, Yujing Wang, Weihao Han, Hao Sun, Weiwei Deng, Qi Zhang, Xing Xie
- TLDR: We present the Vertical Learning Paradigm, a novel approach that enables embedding models to explicitly copy target information from related factual triples for more accurate prediction.
- CoAD: Automatic Diagnosis through Symptom and Disease Collaborative Generation
- Huimin Wang, Wai Chung Kwan, Kam-Fai Wong, Yefeng Zheng
- TLDR: We present a novel disease and symptom collaborative generation framework for automatic disease diagnosis.
- Long-Tailed Question Answering in an Open World
- Yi Dai, Hao Lang, Yinhe Zheng, Fei Huang, Yongbin Li
- TLDR: We propose an OLTQA model that encourages knowledge sharing between head, tail and unseen tasks, and explicitly mines knowledge from a large pre-trained language model (LM).
- Parallel Context Windows for Large Language Models
- Nir Ratner, Yoav Levine, Yonatan Belinkov, Ori Ram, Inbal Magar, Omri Abend, Ehud Karpas, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham
- TLDR: We present Parallel Context Windows, a method that alleviates the context window restriction for any off-the-shelf LLM without further training (sketched below).
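A minimal sketch of the core mechanism, assuming two context windows and a short task suffix: windows attend only within themselves and reuse the same position ids, while task tokens attend to everything. Window sizes and the simplified causal masking are illustrative:

```python
# Sketch of a Parallel-Context-Windows-style attention mask and position ids.
import torch

win_lens, task_len = [4, 4], 3
total = sum(win_lens) + task_len
mask = torch.zeros(total, total, dtype=torch.bool)  # True = may attend

start, position_ids = 0, []
for w in win_lens:
    mask[start:start + w, start:start + w] = True   # within-window attention
    position_ids += list(range(w))                  # windows reuse positions
    start += w
mask[start:, :] = True                              # task tokens see all
position_ids += list(range(max(win_lens), max(win_lens) + task_len))

mask &= torch.ones(total, total, dtype=torch.bool).tril()  # keep causality
```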
- Efficient Transformers with Dynamic Token Pooling
- Piotr Nawrot, Jan Chorowski, Adrian Lancucki, Edoardo Maria Ponti
- TLDR: We propose a dynamic-pooling mechanism for language models that learns to infer segment boundaries in an autoregressive fashion.
- Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction
- Haotian Chen, Bingsheng Chen, Xiangdong Zhou
- TLDR: We propose a new perspective on comprehensively evaluating a model’s understanding and reasoning capabilities.
- ContraCLM: Contrastive Learning For Causal Language Model
- Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang
- TLDR: We present ContraCLM, a novel contrastive learning framework for causal language models that improves both the expressiveness of representations and source-code generation capability.
- Advancing Multi-Criteria Chinese Word Segmentation Through Criterion Classification and Denoising
- Tzu Hsuan Chou, Chun-Yi Lin, Hung-Yu Kao
- TLDR: We show that through a simple yet elegant input-hint-based MCCWS model, we can achieve state-of-the-art (SoTA) performances on several datasets simultaneously.
- Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition
- Haodong Zhao, Ruifang He, Mengnan Xiao, Jing Xu
- TLDR: We propose a prompt-based parameter-efficient framework that infuses hierarchical guidance into prompt tuning for multi-level implicit discourse relation recognition.
- Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model
- Pengwei Zhan, Jing Yang, Xiao Huang, Chunlei Jing, Jingying Li, Liming Wang
- TLDR: We propose a Contrastive learning regularization method using adversarial examples for Alleviating the Pathology (ConAAP), which calibrates the sentence representation of out-of-distribution examples.
- Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children’s Fairy Tales
- Paulina Toro Isaza, Guangxuan Xu, Toye Oloko, Yufang Hou, Nanyun Peng, Dakuo Wang
- TLDR: We propose a computational pipeline that automatically extracts a story’s temporal narrative verb-based event chain for each of its characters as well as character attributes such as gender.
- FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue
- Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu
- TLDR: We propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context using a self-training framework.
- LAMBADA: Backward Chaining for Automated Reasoning in Natural Language
- Mehran Kazemi, Najoung Kim, Deepti Bhatia, Xin Xu, Deepak Ramachandran
- TLDR: We develop a Backward Chaining algorithm for automated reasoning with natural text, which is significantly more efficient than forward reasoning methods.
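Backward chaining starts from the goal and recursively tries to prove the premises of any rule that concludes it. The toy propositional sketch below shows only that control flow; in the paper each step (fact checking, rule selection, goal decomposition) is performed by LM modules, which this stub replaces with exact string matching.

```python
def backward_chain(goal, facts, rules, depth=5):
    """Prove `goal` by working backwards: succeed if it is a known fact,
    otherwise try every rule (body -> head) that concludes it and
    recursively prove each premise in the body."""
    if goal in facts:
        return True
    if depth == 0:
        return False
    return any(
        head == goal
        and all(backward_chain(b, facts, rules, depth - 1) for b in body)
        for head, body in rules
    )

facts = {"rabbit is small", "rabbit is furry"}
rules = [("rabbit is cute", ["rabbit is small", "rabbit is furry"])]
print(backward_chain("rabbit is cute", facts, rules))  # True
```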
- PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives
- Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut
- TLDR: We present a new large-scale persona commonsense knowledge graph, PeaCoK, containing ~100K human-validated persona facts.
- OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment
- Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao
- TLDR: We propose a new training system for speech recognition that achieves zero-shot modality transfer by maintaining, during pre-training, the multi-modality alignment in phoneme space learned from unlabeled multimedia utterances in the high-resource domain.
- Retrieval-free Knowledge Injection through Multi-Document Traversal for Dialogue Models
- Rui Wang, Jianzhu Bao, Fei Mi, Yi Chen, Hongru Wang, Yasheng Wang, Yitong Li, Lifeng Shang, Kam-Fai Wong, Ruifeng Xu
- TLDR: We propose a retrieval-free approach to dialogue models that improves the quality of dialogue models while being cheaper to transfer.
- BERM: Training the Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval
- Shicheng Xu, Liang Pang, Huawei Shen, Xueqi Cheng
- TLDR: We propose a novel method to improve the generalization of dense retrieval by capturing multiple matching signals.
- Multiview Identifiers Enhanced Generative Retrieval
- Yongqi Li, Nan Yang, Liang Wang, Furu Wei, Wenjie Li
- TLDR: Synthetic identifiers for generative retrieval that are generated based on the content of a passage.
- Prompting Language Models for Linguistic Structure
- Terra Blevins, Hila Gonen, Luke Zettlemoyer
- TLDR: We present a structured prompting approach for linguistic structured prediction tasks, allowing us to perform zero- and few-shot sequence tagging with autoregressive PLMs.
- Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis
- Agam Shah, Suvan Paturi, Sudheer Chava
- TLDR: We construct the largest tokenized and annotated dataset of FOMC speeches, meeting minutes, and press conference transcripts in order to understand how monetary policy influences financial markets.
- RE-Matching: A Fine-Grained Semantic Matching Method for Zero-Shot Relation Extraction
- Jun Zhao, WenYu Zhan, Xin Zhao, Qi Zhang, Tao Gui, Zhongyu Wei, Junzhe Wang, Minlong Peng, Mingming Sun
- TLDR: We propose a new semantic matching method for zero-shot relation extraction, which uses the sentence-level similarity score to model the relation semantics.
- SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created through Human-Machine Collaboration
- Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoungpil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park, Jung-Woo Ha
- TLDR: We present a large-scale Korean dataset of 49k sensitive questions, paired with 42k acceptable and 46k non-acceptable responses.
- Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation
- Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyu Tae Kim, Minjoon Seo, Alice Oh
- TLDR: We present a new evaluation benchmark for Korean grammatical error correction and provide a new automatic grammar error annotation system for Korean.
- FLamE: Few-shot Learning from Natural Language Explanations
- Yangqiaoyu Zhou, Yiming Zhang, Chenhao Tan
- TLDR: We present FLamE, a two-stage few-shot learning framework that first generates explanations using GPT-3, and then fine-tunes a smaller model (e.g., RoBERTa) with generated explanations.
- Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning
- Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, Alexander Gray
- TLDR: We propose a modular, NEuro-Symbolic Textual Agent that learns abstract interpretable rules as policies.
- Counterfactual Debiasing for Fact Verification
- Weizhi Xu, Qiang Liu, Shu Wu, Liang Wang
- TLDR: We propose a novel counterfactual debiasing method for fact verification that obtains the final prediction by subtracting the output of a claim-only model from the output of a claim-evidence fusion model.
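The subtraction itself is a one-liner over logits. A hedged NumPy sketch follows, with hypothetical logits and a hypothetical scaling factor alpha; the paper's exact formulation may weight or calibrate the two terms differently.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical 3-way verdict logits (e.g. SUPPORTS / REFUTES / NEI).
fusion_logits = np.array([2.0, 0.5, 0.1])       # claim + evidence model
claim_only_logits = np.array([1.5, 0.2, 0.0])   # captures claim-side bias

alpha = 1.0  # hypothetical subtraction strength
print(softmax(fusion_logits - alpha * claim_only_logits))
```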
- What social attitudes about gender does BERT encode? Leveraging insights from psycholinguistics
- Julia Watson, Barend Beekhuizen, Suzanne Stevenson
- TLDR: We show how word preferences in a large language model reflect social attitudes about gender, using two datasets from human experiments that found differences in gendered or gender neutral word choices by participants with differing views on gender.
- Rethinking Multimodal Entity and Relation Extraction from a Translation Point of View
- Changmeng Zheng, Junhao Feng, Yi Cai, Xiaoyong Wei, Qing Li
- TLDR: We present a multimodal back-translation method for entity and relation extraction based on the cross-modal misalignment issue in text-image datasets.
- Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization
- Rongxin Zhu, Jianzhong Qi, Jey Han Lau
- TLDR: We present the first dataset with fine-grained factual error annotations for dialogue summaries and propose an unsupervised model ENDERANKER via candidate ranking using pretrained encoder-decoder models.
- Improving the Robustness of Summarization Systems with Dual Augmentation
- Xiuying Chen, Guodong Long, Chongyang Tao, Mingzhe Li, Xin Gao, Chengqi Zhang, Xiangliang Zhang
- TLDR: We propose a novel approach to improve the robustness of summarization models against adversarial and noisy perturbations.
- Interpretable Math Word Problem Solution Generation via Step-by-step Planning
- Mengxue Zhang, Zichao Wang, Zhichao Yang, Weiqi Feng, Andrew Lan
- TLDR: We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps.
- TemplateGEC: Improving Grammatical Error Correction with Detection Template
- Yinghao Li, Xuebo Liu, Shuo Wang, Peiyuan Gong, Derek F. Wong, Yang Gao, Heyan Huang, Min Zhang
- TLDR: We propose a novel method for Grammatical Error Correction using sequence-to-edit and sequence-to-sequence models.
- Deep Model Compression Also Helps Models Capture Ambiguity
- Hancheol Park, Jong Park
- TLDR: We propose a novel method for capturing ambiguity in natural language understanding tasks by quantifying the degree of relationship between each sample and its candidate classes.
- Are Experts Needed? On Human Evaluation of Counselling Reflection Generation
- Zixiu Wu, Simone Balloccu, Ehud Reiter, Rim Helaoui, Diego Reforgiato Recupero, Daniele Riboni
- TLDR: Laypeople can be trusted to evaluate the quality of synthetic and human reflections from therapists.
- PairSpanBERT: An Enhanced Language Model for Bridging Resolution
- Hideo Kobayashi, Yufang Hou, Vincent Ng
- TLDR: We present PairSpanBERT, a SpanBERT-based pre-trained model specialized for bridging resolution.
- Compounding Geometric Operations for Knowledge Graph Completion
- Xiou Ge, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo
- TLDR: We propose a new knowledge graph embedding model that compounds translation, rotation, and scaling operations in a composite form.
- Few-shot In-context Learning on Knowledge Base Question Answering
- Tianle Li, Xueguang Ma, Alex Zhuang, Yu Gu, Yu Su, Wenhu Chen
- TLDR: We propose KB-BINDER, a new framework for few-shot in-context learning over knowledge bases.
- Fact-Checking Complex Claims with Program-Guided Reasoning
- Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov
- TLDR: We present Program-Guided Fact-Checking, a novel fact-checking model that decomposes complex claims into simpler sub-tasks that can be solved using a shared library of specialized functions.
- Patton: Language Model Pretraining on Text-Rich Networks
- Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Xinyang Zhang, Qi Zhu, Jiawei Han
- TLDR: Patton is a novel pretraining framework for text-rich networks that captures the inherent dependency between textual attributes and network structure.
- Soft Language Clustering for Multilingual Model Pre-training
- Jiali Zeng, Yufan Jiang, Yongjing Yin, Yi Jing, Fandong Meng, Binghuai Lin, Yunbo Cao, Jie Zhou
- TLDR: We propose XLM-P, a method that contextually retrieves prompts as flexible guidance for encoding instances conditionally.
- Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach
- Nidhi Vakil, Hadi Amiri
- TLDR: We propose a novel approach for curriculum learning for graph neural networks that incorporates graph complexity formalisms (as difficulty criteria) and model competence during training.
- When and how to paraphrase for named entity recognition?
- Saket Sharma, Aviral Joshi, Yiyun Zhao, Namrata Mukhija, Hanoz Bhathena, Prateek Singh, Sashank Santhanam
- TLDR: We explore the influence of paraphrasing on named entity recognition and show that paraphrases generated by larger GPT-3 variants are more capable of generating high quality paraphrased entities.
- UniEvent: Unified Generative Model with Multi-Dimensional Prefix for Zero-Shot Event-Relational Reasoning
- Zhengwei Tao, Zhi Jin, Haiyan Zhao, Chengfeng Dou, Yongqiang Zhao, Tao Shen, Chongyang Tao
- TLDR: We propose a novel unified framework for event relational reasoning tasks that enables zero-shot generalization and efficient knowledge transfer.
- Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-text Rationales
- Brihi Joshi, Ziyi Liu, Sahana Ramnath, Aaron Chan, Zhewei Tong, Shaoliang Nie, Qifan Wang, Yejin Choi, Xiang Ren
- TLDR: We show that the human utility of rationales generated by large language models is not as good as it could be, and we propose a new score to measure their helpfulness in answering similar unseen instances.
- Automatic Annotation of Direct Speech in Written French Narratives
- Noé Durandard, Viet Anh Tran, Gaspard Michel, Elena Epure
- TLDR: We present a unified framework for automatic annotation of direct speech in French and show that the task still requires substantial efforts and emphasise characteristics of each baseline.
- Automatic Creation of Named Entity Recognition Datasets by Querying Phrase Representations
- Hyunjae Kim, Jaehyo Yoo, Seunghyun Yoon, Jaewoo Kang
- TLDR: We present a novel framework for generating pseudo-dictionaries with high-coverage pseudo-words, together with a verification process based on the embedding distance between candidate entity mentions and entity types that reduces false-positive noise in the weak labels these dictionaries produce.
- Dynamic Transformers Provide a False Sense of Efficiency
- Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby Tan, Haizhou Li
- TLDR: We propose a novel adversarial attack framework on multi-exit models, specially tailored to reduce their efficiency.
- Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features
- Ester Hlavnova, Sebastian Ruder
- TLDR: We propose M2C, a morphologically-aware framework for behavioral testing of NLP models.
- Local Byte Fusion for Neural Machine Translation
- Makesh Narsimhan Sreedhar, Xiangpeng Wan, Yu Cheng, Junjie Hu
- TLDR: Byte-based machine translation with byte n-grams and word boundaries.
- Where’s the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
- Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
- TLDR: We present a multilingual punctuation-agnostic sentence segmentation method that outperforms all the prior best sentence-segmentation tools by an average of 6.1% F1 points.
- Multi-target Backdoor Attacks for Code Pre-trained Models
- Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang, Yang Liu
- TLDR: We propose task-agnostic backdoor attacks for pre-trained neural code models that can attack code-related downstream tasks.
- Learning Better Masking for Better Language Model Pre-training
- Dongjie Yang, Zhuosheng Zhang, Hai Zhao
- TLDR: We show that both the ratio and the content of time-invariant masking strategies influence MLM pre-training.
- VisText: A Benchmark for Semantically Rich Chart Captioning
- Benny Tang, Angie Boggust, Arvind Satyanarayan
- TLDR: We present a dataset of 12,441 pairs of charts and captions that describe the charts’ construction, report key statistics, and identify perceptual and cognitive phenomena.
- Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora
- Svanhvít Lilja Ingólfsdóttir, Petur Ragnarsson, Haukur Jónsson, Haukur Simonarson, Vilhjalmur Thorsteinsson, Vésteinn Snæbjarnarson
- TLDR: Byte-level encoding for grammatical error correction improves both grammatical and semantic accuracy.
- Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
- Qianhui Wu, Huiqiang Jiang, Haonan Yin, Börje Karlsson, Chin-Yew Lin
- TLDR: We propose a multi-level knowledge distillation approach for out-of-distribution detection that uses only the texts of in-distribution examples.
- Peeking inside the black box: A Commonsense-aware Generative Framework for Explainable Complaint Detection
- Apoorva Singh, Raghav Jain, Prince Jha, Sriparna Saha
- TLDR: We propose a commonsense-aware unified generative framework for explainable complaint detection, along with a novel benchmark dataset for the task.
- MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation
- Jiazhan Feng, Qingfeng Sun, Can Xu, Pu Zhao, Yaming Yang, Chongyang Tao, Dongyan Zhao, Qingwei Lin
- TLDR: We present a new multi-modal conversation dataset for conversational agents.
- ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
- Jonas Belouadi, Steffen Eger
- TLDR: We propose a novel end-to-end poetry generation algorithm that learns the nuances of poetry from data alone, reducing the degree of human supervision required.
- Envisioning Future from the Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation
- Ang Lv, Jinpeng Li, Shufang Xie, Rui Yan
- TLDR: We propose a hierarchical duality learning for dialogue text that leverages duality to enable interaction between dialogue history and the future.
- DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations
- Duzhen Zhang, Feilong Chen, Xiuyi Chen
- TLDR: We propose Dual Graph ATtention networks for Emotion Recognition in Conversations that incorporate discourse structure and speaker-aware context.
- Consistent Prototype Learning for Few-Shot Continual Relation Extraction
- Xiudi Chen, Hui Wu, Xiaodong Shi
- TLDR: We propose a novel few-shot continual relation extraction method with Consistent Prototype Learning to reduce catastrophic forgetting and improve the stability of the episodic memory.
- Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models
- Myles Foley, Ambrish Rawat, Taesung Lee, Yufang Hou, Gabriele Picco, Giulio Zizzo
- TLDR: We trace back the origin of a given fine-tuned generative large language model to its corresponding pre-trained base model.
- Large Language Models Meet NL2Code: A Survey
- Daoguang Zan, Bei Chen, Fengji Zhang, Dianjie Lu, Bingchao Wu, Bei Guan, Wang Yongji, Jian-Guang Lou
- TLDR: We present a comprehensive survey of 27 existing large language models for NL2Code, and also review benchmarks and metrics.
- When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP
- Jingwei Ni, Zhijing Jin, Qian Wang, Mrinmaya Sachan, Markus Leippold
- TLDR: We show that multi-task learning can work well when tasks are diverse but related, and when the size of the task aggregation and the shared capacity of the model are balanced to avoid overwhelming certain tasks.
- Enhancing Grammatical Error Correction Systems with Explanations
- Yuejiao Fei, Leyang Cui, Sen Yang, Wai Lam, Zhenzhong Lan, Shuming Shi
- TLDR: We propose a large dataset for grammatical error correction systems with explanations and a method for evaluating their explanations.
- Linguistic representations for fewer-shot relation extraction across domains
- Sireesh Gururaja, Ritam Dutt, Tinglong Liao, Carolyn Rosé
- TLDR: We explore the impact of linguistic representations on cross-domain performance in few-shot transfer and show that syntactic representations can be more helpful than semantic ones.
- DarkBERT: A Language Model for the Dark Side of the Internet
- Youngjin Jin, Eugene Jang, Jian Cui, Jin-Woo Chung, Yongjae Lee, Seungwon Shin
- TLDR: We introduce DarkBERT, a language model pretrained on Dark Web data that can help to combat the extreme lexical and structural diversity of the Dark Web that may be detrimental to building a proper representation of the domain.
- MDACE: MIMIC Documents Annotated with Code Evidence
- Hua Cheng, Rana Jafari, April Russell, Russell Klopfer, Edmond Lu, Benjamin Striner, Matthew Gormley
- TLDR: We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents.
- Towards Zero-Shot Multilingual Transfer for Code-Switched Responses
- Ting-Wei Wu, Changsheng Zhao, Ernie Chang, Yangyang Shi, Pierce Chuang, Vikas Chandra, Biing Juang
- TLDR: We propose a new adapter-based framework that allows for efficient transfer by learning task-specific representations and encapsulating source and target language representations.
- One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning
- Guangtao Zeng, Peiyuan Zhang, Wei Lu
- TLDR: We propose ProPETL, a novel method that enables efficient sharing of a single prototype PETL network (e.g. adapter, LoRA, and prefix-tuning) across layers and tasks.
- Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk
- Jianquan Li, XiangBo Wu, Xiaokang Liu, Qianqian Xie, Prayag Tiwari, Benyou Wang
- TLDR: We benchmark various generation approaches, including training-from-scratch Seq2seq models, fine-tuned middle-scale PLMs, and large-scale PLMs (with and without fine-tuning).
- Convergence and Diversity in the Control Hierarchy
- Alexandra Butoi, Ryan Cotterell, David Chiang
- TLDR: We present a hierarchy of language classes whose second member (L2) is generated by tree-adjoining grammars (TAG), linear indexed grammars (LIG), combinatory categorial grammars, and head grammars.
- ConFEDE: Contrastive Feature Decomposition for Multimodal Sentiment Analysis
- Jiuding Yang, Yakun Yu, Di Niu, Weidong Guo, Yu Xu
- TLDR: We propose ConFEDE, a unified learning framework that jointly performs contrastive representation learning and contrastive feature decomposition to enhance the representation of multimodal information.
- Using Domain Knowledge to Guide Dialog Structure Induction via Neural Probabilistic Soft Logic
- Connor Pryor, Quan Yuan, Jeremiah Liu, Mehran Kazemi, Deepak Ramachandran, Tania Bedrax-Weiss, Lise Getoor
- TLDR: Neural Probabilistic Soft Logic Dialogue Structure Induction using symbolic knowledge.
- Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark
- Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie
- TLDR: We propose an Embedding Watermark method for EaaS models that can effectively protect the copyright of large language models.
- Answering Ambiguous Questions via Iterative Prompting
- Weiwei Sun, Hengyi Cai, Hongshen Chen, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren
- TLDR: We present AmbigPrompt, a novel approach to answering ambiguous questions in open-domain question answering that uses less memory and incurs lower inference latency than competing approaches.
- A Dataset of Argumentative Dialogues on Scientific Papers
- Federico Ruggeri, Mohsen Mesgar, Iryna Gurevych
- TLDR: Argumentative dialogues on scientific papers.
- Massively Multilingual Lexical Specialization of Multilingual Transformers
- Tommaso Green, Simone Paolo Ponzetto, Goran Glavaš
- TLDR: We show that multilingual lexical specialization of PLMs with a large number of lexical constraints leads to substantial improvements in type-level lexical tasks.
- RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs
- Afra Feyza Akyurek, Ekin Akyurek, Ashwin Kalyan, Peter Clark, Derry Tanti Wijaya, Niket Tandon
- TLDR: We propose RL4F, a multi-agent collaborative framework for feedback generation and reinforcement learning for language models.
- WebIE: Faithful and Robust Information Extraction on the Web
- Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni
- TLDR: We present WebIE, the first large-scale, entity-linked closed IE dataset consisting of 1.6M sentences automatically collected from the English Common Crawl corpus.
- NormBank: A Knowledge Bank of Situational Social Norms
- Caleb Ziems, Jane Dwivedi-Yu, Yi-Chia Wang, Alon Halevy, Diyi Yang
- TLDR: We present NormBank, a knowledge bank of 155k situational norms.
- DIP: Dead code Insertion based Black-box Attack for Programming Language Model
- CheolWon Na, YunSeok Choi, Jee-Hyong Lee
- TLDR: We propose a high-performance and effective black-box attack method to generate adversarial examples using dead code insertion.
- Modeling Structural Similarities between Documents for Coherence Assessment with Graph Convolutional Networks
- Wei Liu, Xiyan Fu, Michael Strube
- TLDR: We propose a novel GCN-based coherence model that captures structural similarities between documents and achieves state-of-the-art results on discourse coherence and automated essay scoring.
- HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification
- He Zhu, Chong Zhang, Junjie Huang, Junran Wu, Ke Xu
- TLDR: We propose Hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance the text representations with only syntactic information of the label hierarchy.
- Contextual Knowledge Learning for Dialogue Generation
- Wen Zheng, Natasa Milic-Frayling, Ke Zhou
- TLDR: We present a novel approach that makes context and knowledge weighting an integral part of dialogue generation models, yielding significant improvements over six strong baselines and robustness to reduced training set sizes.
- Easy Guided Decoding in Providing Suggestions for Interactive Machine Translation
- Ke Wang, Xin Ge, Jiayi Wang, Yuqi Zhang, Yu Zhao
- TLDR: We propose a novel constrained decoding algorithm for machine translation that improves translation quality and reduces time overhead by 63.4% on benchmark datasets.
- Discourse-Centric Evaluation of Document-level Machine Translation with a New Densely Annotated Parallel Corpus of Novels
- Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell
- TLDR: We present a new dataset with rich discourse annotations for sentence-level machine translation and show that document-level translation is fundamentally different from human translation in terms of their latent discourse structures.
- CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation
- Yan Zhou, Qingkai Fang, Yang Feng
- TLDR: We propose a novel method for end-to-end speech translation that uses optimal transport to overcome the modality gap between speech and text.
- On the Evaluation of Neural Selective Prediction Methods for Natural Language Processing
- Zhengyao Gu, Mark Hopkins
- TLDR: We provide a survey and empirical comparison of the state-of-the-art in neural selective classification for NLP tasks.
- Speech-Text Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment
- Tianshu Yu, Haoyu Gao, Ting-En Lin, Min Yang, Yuchuan Wu, Wentao Ma, Chao Wang, Fei Huang, Yongbin Li
- TLDR: We propose Speech-text Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model.
- Text Style Transfer with Contrastive Transfer Pattern Mining
- Jingxuan Han, Quan Wang, Licheng Zhang, Weidong Chen, Yan Song, Zhendong Mao
- TLDR: We propose a novel approach, contrastive transfer pattern mining, which automatically mines and utilizes inherent latent transfer patterns to improve the performance of text style transfer.
- Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning
- Zhenrui Yue, Huimin Zeng, Mengfei Lan, Heng Ji, Dong Wang
- TLDR: MetaEvent is a meta learning-based framework for zero-shot event detection that can efficiently project output to unseen event types without any prior knowledge.
- Text Style Transfer Back-Translation
- Daimeng Wei, Zhanglin Wu, Hengchao Shang, Zongyao Li, Minghan Wang, Jiaxin Guo, Xiaoyu Chen, Zhengzhe Yu, Hao Yang
- TLDR: We propose a novel method for improving the translation of natural inputs by modifying the style of source-side text.
- Generating Visual Spatial Description via Holistic 3D Scene Understanding
- Yu Zhao, Hao Fei, Wei Ji, Jianguo Wei, Meishan Zhang, Min Zhang, Tat-Seng Chua
- TLDR: We propose a novel 3D scene extractor for visual spatial description, which can produce more spatially-diversified text generation.
- Continual Knowledge Distillation for Neural Machine Translation
- Yuanchi Zhang, Peng Li, Maosong Sun, Yang Liu
- TLDR: We propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest.
- Query Refinement Prompts for Closed-Book Long-Form QA
- Reinald Kim Amplayo, Kellie Webster, Michael Collins, Dipanjan Das, Shashi Narayan
- TLDR: We propose query refinement prompts that encourage large language models to explicitly express the multifacetedness in questions and generate long-form answers covering multiple facets of the question.
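A query refinement prompt of this kind can be as simple as a two-step instruction. The wording below is hypothetical, purely to illustrate the shape of such a prompt; the paper's actual templates may differ.

```python
question = "Who is the best soccer player?"  # an inherently multifaceted query
prompt = (
    f"Question: {question}\n"
    "Step 1: List the distinct facets this question could be asking about.\n"
    "Step 2: Write a long-form answer that covers every facet listed.\n"
    "Answer:"
)
print(prompt)
```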
- CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
- Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, W.k. Chan, Chong-Wah Ngo, Mike Zheng Shou, Nan Duan
- TLDR: We propose CONE, an efficient COarse-to-fiNE alignment framework for long video temporal grounding that accelerates inference time by 2x on Ego4D-NLQ and 15x on MAD while keeping SOTA results.
- Few-Shot Document-Level Event Argument Extraction
- Xianjun Yang, Yujie Lu, Linda Petzold
- TLDR: We present FewDocAE, a few-shot document-level event argument extraction benchmark built from an existing document-level event argument extraction dataset.
- ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation
- Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan
- TLDR: We present a large-scale syntactically diverse paraphrase dataset created by abstract meaning representation back-translation.
- Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation
- Songming Zhang, Yunlong Liang, Shuaibo Wang, Yufeng Chen, Wenjuan Han, Jian Liu, Jinan Xu
- TLDR: We propose a new knowledge distillation method for neural machine translation that makes better use of the teacher model's knowledge.
- Multi-Row, Multi-Span Distant Supervision For Table+Text Question Answering
- Vishwajeet Kumar, Yash Gupta, Saneem Chemmengath, Jaydeep Sen, Soumen Chakrabarti, Samarth Bharadwaj, Feifei Pan
- TLDR: We present a transformer-based TextTableQA system that is explicitly designed to cope with distant supervision along both these axes, through a multi-instance loss objective, together with careful curriculum design.
- HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level
- Haoran Luo, Haihong E, Yuhao Yang, Yikai Guo, Mingzhi Sun, Tianyu Yao, Zichen Tang, Kaiyang Wan, Meina Song, Wei Lin
- TLDR: We propose a novel Hierarchical Attention model for HKG Embedding, which can learn the graphical and sequential structure of HKG and achieve state-of-the-art performance in link prediction tasks on HKG standard datasets.
- ORGAN: Observation-Guided Radiology Report Generation via Tree Reasoning
- Wenjun Hou, Kaishuai Xu, Yi Cheng, Wenjie Li, Jiang Liu
- TLDR: We propose an observation-guided radiology report generation framework that captures multi-formats of each observation and improves the quality of the report.
- Data Curation Alone Can Stabilize In-context Learning
- Ting-Yun Chang, Robin Jia
- TLDR: We show that carefully curating a subset of training data greatly stabilizes ICL performance without any other changes to the ICL algorithm.
- MidMed: Towards Mixed-Type Dialogues for Medical Consultation
- Xiaoming Shi, Zeming Liu, Chuan Wang, Haitao Leng, Kui Xue, Xiaofan Zhang, Shaoting Zhang
- TLDR: We propose a novel human-to-human mixed-type medical consultation dialogue corpus covering diagnosis, recommendation, and chitchat.
- FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning
- Qinyuan Ye, Iz Beltagy, Matthew Peters, Xiang Ren, Hannaneh Hajishirzi
- TLDR: We present a comprehensive study applying three fusion methods, concatenation-based (early fusion), FiD (intermediate fusion), and ensemble-based (late fusion), to improve the efficiency and end-task performance of few-shot in-context learning.
- S2ynRE: Two-stage Self-training with Synthetic data for Low-resource Relation Extraction
- Benfeng Xu, Quan Wang, Yajuan Lyu, Dai Dai, Yongdong Zhang, Zhendong Mao
- TLDR: We propose S2ynRE, a framework of two-stage Self-training with Synthetic data for Relation Extraction.
- DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
- Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng
- TLDR: We propose a framework for resource-efficient fine-tuning of pre-trained language models by leveraging sparsity prior in both weight updates and the final model weights.
- CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation
- Jinfeng Zhou, Chujie Zheng, Bo Wang, Zheng Zhang, Minlie Huang
- TLDR: We propose a novel model for empathetic dialogue generation that combines cognition and affection to generate more empathetically relevant and informative responses.
- Comparative evaluation of boundary-relaxed annotation for Entity Linking performance
- Gabriel Herman Bernardim Andrade, Shuntaro Yada, Eiji Aramaki
- TLDR: We present a case study on the impact of imprecise boundary annotation on Entity Linking performance, showing that extracting exact entity surface forms is not necessary for Entity Linking models.
- Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023?
- Shuheng Liu, Alan Ritter
- TLDR: We show that some NER models generalize well to new data while others do not, and we attempt to disentangle the effects of temporal drift, overfitting due to test set reuse, and temporal mismatch between pre-training corpora and downstream test sets.
- READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises
- Chenglei Si, Zhengyan Zhang, Yingfa Chen, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun
- TLDR: We present a Chinese multi-task benchmark with REalistic and diverse input noises for NLP.
- MAD-TSC: A Multilingual Aligned News Dataset for Target-dependent Sentiment Classification
- Evan Dufraisse, Adrian Popescu, Julien Tourille, Armelle Brun, Jerome Deshayes
- TLDR: We present a new dataset for target-dependent sentiment classification in news and show that it can match English translation performance.
- A New Dataset and Empirical Study for Sentence Simplification in Chinese
- Shiping Yang, Renliang Sun, Xiaojun Wan
- TLDR: We present a new dataset for sentence simplification in Chinese and evaluate several unsupervised and zero/few-shot learning methods on it.
- Factual or Contextual? Disentangling Error Types in Entity Description Generation
- Navita Goyal, Ani Nenkova, Hal Daumé III
- TLDR: We show that language models are often at odds with each other in the task of entity description generation, and that models specifically struggle with accurate descriptions of entities that are less familiar to people.
- Weakly Supervised Vision-and-Language Pre-training with Relative Representations
- Chi Chen, Peng Li, Maosong Sun, Yang Liu
- TLDR: We propose a new WVLP framework based on the relative representations for weakly supervised vision-and-language pre-training.
- HermEs: Interactive Spreadsheet Formula Prediction via Hierarchical Formulet Expansion
- Wanrong He, Haoyu Dong, Yihuai Gao, Zhichao Fan, Xingzhuo Guo, Zhitao Hou, Xiao Lv, Ran Jia, Shi Han, Dongmei Zhang
- TLDR: We propose HermEs, the first approach for spreadsheet formula prediction via HiEraRchical forMulet ExpanSion, where hierarchical expansion means generating formulas following the underlying parse tree structure, and Formulet refers to commonly-used multi-level patterns mined from real formula parse trees.
- ArgU: A Controllable Factual Argument Generator
- Sougata Saha, Rohini Srihari
- TLDR: We present a novel argument generator capable of generating factual arguments from input facts and real-world concepts that can be explicitly controlled for stance and argument structure using Walton’s argument scheme-based control codes.
- Learning Answer Generation using Supervision from Automatic Question Answering Evaluators
- Matteo Gabburo, Siddhant Garg, Rik Koncel-Kedziorski, Alessandro Moschitti
- TLDR: We propose a novel training paradigm for answer generation that uses supervision from automatic QA evaluation models (GAVA), along with a novel algorithm for weighting the generator loss during learning of the GenQA model.
- RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation
- Shuai Liu, Hyundong Cho, Marjorie Freedman, Xuezhe Ma, Jonathan May
- TLDR: We propose a new retrieval-enhanced approach for personalized response generation and show its effectiveness on a real-world dataset.
- Don’t Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span Selection
- Songlin Yang, Kewei Tu
- TLDR: Autoregressive span selection for continuous and discontinuous constituency parsing.
- Laziness Is a Virtue When It Comes to Compositionality in Neural Semantic Parsing
- Maxwell Crouse, Pavan Kapanipathi, Subhajit Chaudhury, Tahira Naseem, Ramon Fernandez Astudillo, Achille Fokoue, Tim Klinger
- TLDR: We introduce a new method for generating logical forms from the bottom up, beginning from the logical form’s leaves.
- AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
- Siyue Wu, Hongzhan Chen, Xiaojun Quan, Qifan Wang, Rui Wang
- TLDR: We present a novel attribution-driven knowledge distillation approach, which explores the token-level rationale behind the teacher model based on Integrated Gradients (IG) and transfers attribution knowledge to the student model.
- (QA)^2: Question Answering with Questionable Assumptions
- Najoung Kim, Phu Mon Htut, Samuel R. Bowman, Jackson Petty
- TLDR: We propose (QA)2 (Question Answering with Questionable Assumptions), an open-domain evaluation dataset consisting of naturally occurring search engine queries that may or may not contain questionable assumptions.
- Attributable and Scalable Opinion Summarization
- Tom Hosking, Hao Tang, Mirella Lapata
- TLDR: We propose a method for unsupervised opinion summarization that encodes sentences from customer reviews into a hierarchical discrete latent space, then identifies common opinions based on the frequency of their encodings.
- Targeted Data Generation: Finding and Fixing Model Weaknesses
- Zexue He, Marco Tulio Ribeiro, Fereshte Khani
- TLDR: We propose Targeted Data Generation, a framework that automatically identifies challenging subgroups, and generates new data for those subgroups using large language models (LLMs) with a human in the loop.
- HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation
- Anchun Gui, Han Xiao
- TLDR: We propose HiFi, a parameter-efficient fine-tuning method in which only the attention heads that are highly informative and strongly correlated for the specific task are fine-tuned.
- CFSum Coarse-to-Fine Contribution Network for Multimodal Summarization
- Min Xiao, Junnan Zhu, Haitao Lin, Yu Zhou, Chengqing Zong
- TLDR: We propose a novel Coarse-to-fine contribution network for multimodal summarization which considers different contributions of images for summarization.
- On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research
- Made Nindyatama Nityasya, Haryo Wibowo, Alham Fikri Aji, Genta Winata, Radityo Eko Prasojo, Phil Blunsom, Adhiguna Kuncoro
- TLDR: We provide a case in point by revisiting the success of BERT over its baselines, ELMo and GPT-1, and demonstrate how, under comparable conditions where the baselines are tuned to a similar extent, these baselines (and even simpler variants thereof) can in fact achieve competitive or better performance than BERT.
- End-to-end Knowledge Retrieval with Multi-modal Queries
- Man Luo, Zhiyuan Fang, Tejas Gokhale, Yezhou Yang, Chitta Baral
- TLDR: We present a retriever model for knowledge retrieval with multi-modal queries and show superior performance on both zero-shot and finetuned datasets.
- AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
- Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao
- TLDR: We present AV-TranSpeech, the first audio-visual speech-to-speech translation model without relying on intermediate text.
- Dual Class Knowledge Propagation Network for Multi-label Few-shot Intent Detection
- Feng Zhang, Wei Chen, Fei Ding, Tengjiao Wang
- TLDR: We propose a novel dual class knowledge propagation network for multi-label few-shot intent detection and a simple yet effective method to predict the intent count of each utterance.
- VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets
- Vageesh Saxena, Nils Rethmeier, Gijs van Dijck, Gerasimos Spanakis
- TLDR: We propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, and link unique vendor accounts across text advertisements (ads) on seven public Darknet markets.
- Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
- Yiming Wang, Zhuosheng Zhang, Rui Wang
- TLDR: We present a novel method for generating concise summaries that correlate with human writing mindset.
- Efficient Shapley Values Estimation by Amortization for Text Classification
- Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang
- TLDR: We present a novel amortized model for predicting Shapley Values for neural text classification models without additional model evaluations.
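For context, the estimator being amortized here is typically a Monte Carlo permutation average of marginal contributions, which costs many model evaluations per input. Below is a self-contained sketch of that expensive baseline, with a toy additive value function standing in for a classifier's score; the amortized model in the paper learns to predict these values directly.

```python
import random

def shapley_mc(n_tokens, value_fn, n_samples=500, seed=0):
    """Monte Carlo permutation estimate of per-token Shapley values:
    average each token's marginal contribution v(S + {i}) - v(S) over
    random orderings. value_fn maps a set of kept token indices to a score."""
    rng = random.Random(seed)
    phi = [0.0] * n_tokens
    for _ in range(n_samples):
        order = rng.sample(range(n_tokens), n_tokens)
        kept, prev = set(), value_fn(frozenset())
        for i in order:
            kept.add(i)
            cur = value_fn(frozenset(kept))
            phi[i] += (cur - prev) / n_samples
            prev = cur
    return phi

# Toy additive scorer: token 0 is worth 0.5, the rest 0.1 each.
toy_value = lambda kept: sum(0.5 if i == 0 else 0.1 for i in kept)
print(shapley_mc(3, toy_value))  # ~[0.5, 0.1, 0.1]
```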
- PeerDA: Data Augmentation via Modeling Peer Relation for Span Identification Tasks
- Weiwen Xu, Xin Li, Yang Deng, Wai Lam, Lidong Bing
- TLDR: We propose a novel span augmentation method for span identification and classification which leverages the Peer (PR) relation to augment the span data for training models.
- Dynamic Regularization in UDA for Transformers in Multimodal Classification
- Ivonne Monter-Aldana, Adrian Pastor Lopez Monroy, Fernando Sanchez-Vega
- TLDR: We propose a novel intra-CLS token fusion model for multimodal machine learning and a dynamic adjustment for the loss function.
- Conflicts, Villains, Resolutions: Towards models of Narrative Media Framing
- Lea Frermann, Jiatong Li, Shima Khanehzar, Gosia Mikolajczak
- TLDR: We present a novel multi-label model of narrative framing in NLP and present a case study on the framing of climate change in news articles from news outlets across the political spectrum.
- bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark
- Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Veselin Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev
- TLDR: We present bgGLUE (Bulgarian General Language Understanding Evaluation), a benchmark for evaluating language models on Natural Language Understanding (NLU) tasks in Bulgarian.
- DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
- Yuxi Feng, Xiaoyuan Yi, Xiting Wang, Laks Lakshmanan, V.S., Xing Xie
- TLDR: Self-training for attribute-controllable language generation.
- What does the Failure to Reason with “Respectively” in Zero/Few-Shot Settings Tell Us about Language Models?
- Ruixiang Cui, Seolhwa Lee, Daniel Hershcovich, Anders Søgaard
- TLDR: We show that language models still lag behind humans in generalizing to the long tail of linguistic constructions.
- BLIND: Bias Removal With No Demographics
- Hadas Orgad, Yonatan Belinkov
- TLDR: We present a method for bias removal without any prior knowledge of the demographics in the dataset.
- How do humans perceive adversarial text? A reality check on the validity and naturalness of word-based adversarial attacks
- Salijona Dyrmishi, Salah Ghamizi, Maxime Cordy
- TLDR: We show that existing text adversarial attacks are impractical in real-world scenarios where humans are involved.
- Soft Alignment Objectives for Robust Adaptation of Language Generation
- Michal Štefánik, Marek Kadlcik, Petr Sojka
- TLDR: We propose novel training objectives for domain adaptation based on the semantic similarity of the predicted tokens to the reference.
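One natural reading of a soft alignment objective is to replace the one-hot target with a distribution that weights each vocabulary item by its embedding similarity to the reference token. The sketch below builds such a soft target in NumPy; the temperature value and the use of cosine similarity are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def soft_targets(emb, gold_id, temperature=0.1):
    """Distribution over the vocabulary proportional to each token's
    cosine similarity to the gold token's embedding (softmax with a
    hypothetical temperature), instead of a one-hot label."""
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    z = (e @ e[gold_id]) / temperature
    p = np.exp(z - z.max())
    return p / p.sum()

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))   # toy vocabulary embeddings
t = soft_targets(emb, gold_id=7)
print(round(t[7], 3), round(t.sum(), 3))  # gold token dominates; sums to 1
```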
- The CRINGE Loss: Learning what language not to model
- Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
- TLDR: We propose a novel procedure for training language models with negative examples, and show its effectiveness on safe generation, contradiction avoidance, and open-domain dialogue.
- Modeling User Satisfaction Dynamics in Dialogue via Hawkes Process
- Fanghua Ye, Zhiyuan Hu, Emine Yilmaz
- TLDR: We propose a new estimator for dialogue systems that models user satisfaction dynamics across turns, using a Hawkes process to estimate user satisfaction.
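The Hawkes process component can be illustrated directly: the intensity at time t is a base rate plus exponentially decaying excitation from earlier events. A toy sketch, with all parameter values hypothetical:

```python
import math

def hawkes_intensity(t, event_times, mu=0.2, alpha=0.8, beta=1.5):
    """Hawkes intensity: base rate mu plus exponentially decaying
    excitation alpha * exp(-beta * (t - t_i)) from each past event t_i."""
    return mu + sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )

# (Dis)satisfaction events at turns 1.0, 3.5 and 4.2; query turn 5.0.
print(hawkes_intensity(5.0, [1.0, 3.5, 4.2]))
```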
- Towards Identifying Fine-Grained Depression Symptoms from Memes
- Shweta Yadav, Cornelia Caragea, Chenye Zhao, Naincy Kumari, Marvin Solberg, Tanmay Sharma
- TLDR: We introduce a new task of identifying fine-grained depressive symptoms from memes, and show that enforcing orthogonal constraints on textual and visual feature representations in a multimodal setting leads the model to learn non-redundant, de-correlated features and better predict fine-grained depression symptoms.
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks
- Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan S Sharma, Wei-Lun Wu, Hung-yi Lee, Karen Livescu, Shinji Watanabe
- TLDR: We present several new annotated speech understanding tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape.
- My side, your side and the evidence: Discovering aligned actor groups and the narratives they weave
- Pavan Holur, David Chong, Timothy Tangherlini, Vwani Roychowdhury
- TLDR: We propose a novel two-step graph-based framework for identifying aligned story actors responsible for sustaining the issue-specific narratives.
- Characterizing and Measuring Linguistic Dataset Drift
- Tyler Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth
- TLDR: We propose three dimensions of linguistic dataset drift that affect model performance, and we modify past performance prediction methods to predict model performance at both the example and dataset level for English sentiment classification and natural language inference.
- WebCPM: Interactive Web Search for Chinese Long-form Question Answering
- Yujia Qin, Zihan Cai, Dian Jin, Lan Yan, Shihao Liang, Kunlun Zhu, Yankai Lin, Xu Han, Ning Ding, Huadong Wang, Ruobing Xie, Fanchao Qi, Zhiyuan Liu, Maosong Sun, Jie Zhou
- TLDR: We present a novel approach to long-form question answering using interactive web search and pre-trained language models.
- Synthesize, Prompt and Transfer: Zero-shot Conversational Question Generation with Pre-trained Language Model
- Hongwei Zeng, Bifan Wei, Jun Liu, Weiping Fu
- TLDR: We propose a multi-stage knowledge transfer framework for zero-shot conversational question generation, which requires no human-labeled conversations for training.
- FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
- Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolay Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister
- TLDR: We introduce a new multimodal graph contrastive learning strategy for form document understanding that unifies multimodality in one loss and achieves state-of-the-art performance on FUNSD, CORD, SROIE and Payment benchmarks.
- MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies
- Shiyue Zhang, Shijie Wu, Ozan Irsoy, Steven Lu, Mohit Bansal, Mark Dredze, David Rosenberg
- TLDR: We propose a new objective for language models that learns with reverse cross-entropy, and show that the resulting models yield better generated text without complex decoding strategies.
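With two dense distributions the mixed objective is straightforward to write down; the subtlety in the paper is that the empirical data distribution is one-hot, so the reverse term requires the approximation derived there. A sketch of the exact mixture for dense p and q:

```python
import numpy as np

def mix_ce(p, q, eta=0.5, eps=1e-12):
    """eta * CE(p, q) + (1 - eta) * CE(q, p). Forward CE pushes q to
    cover p's mass; reverse CE penalizes q for putting mass where p
    has little, discouraging over-generalization."""
    fwd = -(p * np.log(q + eps)).sum()
    rev = -(q * np.log(p + eps)).sum()
    return eta * fwd + (1 - eta) * rev

p = np.array([0.7, 0.2, 0.1])   # "data" distribution (dense for illustration)
q = np.array([0.5, 0.3, 0.2])   # model distribution
print(mix_ce(p, q))
```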
- Knowledgeable Parameter Efficient Tuning Network for Commonsense Question Answering
- Ziwang Zhao, Linmei Hu, Hanyu Zhao, Yingxia Shao, Yequan Wang
- TLDR: We propose a simple knowledgeable parameter efficient tuning network to couple PLMs with external knowledge for commonsense question answering.
- BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric
- Mingda Chen, Paul-Ambroise Duquenne, Pierre Andrews, Justine Kao, Alexandre Mourachko, Holger Schwenk, Marta R. Costa-jussà
- TLDR: We propose a text-free evaluation metric for end-to-end speech-to-speech translation that correlates significantly better with human judgment than text-based metrics.
- NLPositionality: Characterizing Design Biases of Datasets and Models
- Sebastin Santy, Jenny Liang, Ronan Le Bras, Katharina Reinecke, Maarten Sap
- TLDR: We introduce NLPositionality, a framework for characterizing design biases and quantifying the positionality of NLP datasets and models.
- Backpack Language Models
- John Hewitt, John Thickstun, Christopher Manning, Percy Liang
- TLDR: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control.
- WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models
- Virginia Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May
- TLDR: We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community.
- Grounded Multimodal Named Entity Recognition on Social Media
- Jianfei Yu, Ziyan Li, Jieming Wang, Rui Xia
- TLDR: We propose a novel MNER task for social media that uses entity-type-region triples to identify the named entities in text and image.
- Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference
- Junhao Zheng, Qianli Ma, Shengjie Qiu, Yue Wu, Peitian Ma, Junlong Liu, Huawen Feng, Xichen Shang, Haibin Chen
- TLDR: We propose a unified objective for fine-tuning PLMs that can mitigate catastrophic forgetting and preserve knowledge.
- Translation-Enhanced Multilingual Text-to-Image Generation
- Yaoyiran Li, Ching-Yun Chang, Stephen Rawls, Ivan Vulić, Anna Korhonen
- TLDR: We propose Ensemble Adapter, a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance.
- Benchmarking Large Language Model Capabilities for Conditional Generation
- Joshua Maynez, Priyanka Agrawal, Sebastian Gehrmann
- TLDR: We provide an empirical study of the limitations and capabilities of pre-trained large language models in natural language generation tasks along dimensions such as scale, architecture, input and output language.
- lilGym: Natural Language Visual Reasoning with Reinforcement Learning
- Anne Wu, Kiante Brantley, Noriyuki Kojima, Yoav Artzi
- TLDR: We present lilGym, a new benchmark for language-conditioned reinforcement learning in visual environments.
- Unsupervised Melody-to-Lyrics Generation
- Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Gunnar Sigurdsson, Chenyang Tao, Wenbo Zhao, Tagyoung Chung, Jing Huang, Nanyun Peng
- TLDR: We propose a novel method for generating high-quality lyrics without training on any aligned melody-lyric data.
- Causality-aware Concept Extraction based on Knowledge-guided Prompting
- Siyu Yuan, Deqing Yang, Jinxi Liu, Shuyu Tian, Jiaqing Liang, Yanghua Xiao, Rui Xie
- TLDR: We propose a knowledge-guided prompt for concept extraction in knowledge graphs to alleviate concept bias in PLM-based concept extraction.
- Span-level Aspect-based Sentiment Analysis via Table Filling
- Mao Zhang, Yongxin Zhu, Zhen Liu, Zhimin Bao, Yunfei Wu, Xing Sun, Linli Xu
- TLDR: We propose a novel span-level model for Aspect-based sentiment analysis, which aims at identifying the sentiment polarity of the given aspect.
- Limitations of Language Models in Arithmetic and Symbolic Induction
- Jing Qian, Hong Wang, Zekun Li, Shiyang Li, Xifeng Yan
- TLDR: We show that large pretrained language models do well on commonsense reasoning yet still struggle with arithmetic and symbolic induction tasks as simple as addition.
- EEL: Efficiently Encoding Lattices for Reranking
- Prasann Singhal, Jiacheng Xu, Xi Ye, Greg Durrett
- TLDR: We use Transformers to efficiently encode lattices of generated outputs and use token-factored rerankers to extract high-quality hypotheses from the lattices.
- CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training
- Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao
- TLDR: We propose CLAPSpeech, a cross-modal contrastive pre-training framework that learns from the prosody variance of the same text token under different contexts.
- Revisiting Cross-Lingual Summarization: A Corpus-based Study and A New Benchmark with Improved Annotation
- Yulong Chen, Huajian Zhang, Yijie Zhou, Xuefeng Bai, Yueguan Wang, Ming Zhong, Jianhao Yan, Yafu Li, Judy Li, Xianchao Zhu, Yue Zhang
- TLDR: We propose ConvSumX, a cross-lingual conversation summarization benchmark, through a new annotation schema that explicitly considers source input context.
- Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation
- Xiaohang Tang, Yi Zhou, Danushka Bollegala
- TLDR: We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates.
- How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech
- Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy
- TLDR: We show that learning to generalize from text alone requires stronger biases than the general sequence-processing biases of standard neural network architectures.
- GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
- Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li
- TLDR: We propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.
- Log-linear Guardedness and its Implications
- Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell
- TLDR: We show that, in the binary case, under certain assumptions, a downstream log-linear model cannot recover the erased concept.
- Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM’s Translation Capability
- Eleftheria Briakou, Colin Cherry, George Foster
- TLDR: We investigate the role of incidental bilingualism—the unintentional consumption of bilingual signals, including translation examples—in explaining the translation capabilities of large language models, taking the Pathways Language Model (PaLM) as a case study.
- Open Set Relation Extraction via Unknown-Aware Training
- Jun Zhao, Xin Zhao, WenYu Zhan, Qi Zhang, Tao Gui, Zhongyu Wei, Yun Wen Chen, Xiang Gao, Xuanjing Huang
- TLDR: We propose an unknown-aware training method for supervised relation extraction, which can provide the missing supervision signals.
- Learning to Imagine: Visually-Augmented Natural Language Generation
- Tianyi Tang, Yushuo Chen, Yifan Du, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen
- TLDR: We propose a novel plug-and-play fusion layer for visually-augmented natural language generation that makes pre-trained language models learn to imagine relevant scenes based on the text.
- Generating Hashtags for Short-form Videos with Guided Signals
- Tiezheng Yu, Hanchao Yu, Davis Liang, Yuning Mao, Shaoliang Nie, Po-Yao Huang, Madian Khabsa, Pascale Fung, Yi-Chia Wang
- TLDR: We propose a novel generation task for short-form video hashtag recommendation that is based on how hashtags are created naturally.
- NEUROSTRUCTURAL DECODING: Neural Text Generation with Structural Constraints
- Mohaddeseh Bastan, Mihai Surdeanu, Niranjan Balasubramanian
- TLDR: NeuroStructural Decoding is a new language generation algorithm that incorporates syntactic constraints to further improve the quality of the generated text.
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning
- Zhuang Li, Lizhen Qu, Philip Cohen, Raj Tumuluri, Gholamreza Haffari
- TLDR: We propose an active learning approach that exploits the strengths of both human and machine translations by iteratively adding small batches of human translations into the machine-translated training set.
- Ideology Prediction from Scarce and Biased Supervision: Learn to Disregard the “What” and Focus on the “How”!
- Chen Chen, Dylan Walker, Venkatesh Saligrama
- TLDR: We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs.
- Unsupervised Extractive Summarization of Emotion Triggers
- Tiberiu Sosea, Hongli Zhan, Junyi Jessy Li, Cornelia Caragea
- TLDR: We develop new unsupervised learning models that can jointly detect emotions and summarize their triggers.
- Document-Level Event Argument Extraction With a Chain Reasoning Paradigm
- Jian Liu, Chen Liang, Jinan Xu, Haoyan Liu, Zhe Zhao
- TLDR: We present a new chain reasoning paradigm for document-level event argument extraction, which captures long-range interdependence and allows end-to-end learning and generalization of neural networks.
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference
- Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao
- TLDR: We propose to treat the discourse structures of multi-party dialogues as latent variables, then jointly infer them and pre-train the discourse-aware model by unsupervised latent variable inference methods.
- Interpreting Positional Information in Perspective of Word Order
- Zhang Xilong, Liu Ruochen, Liu Jin, Liang Xuefeng
- TLDR: We propose a novel weight concatenation operation for positional encoding in the attention module and show its efficacy in neural machine translation tasks.
- I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
- Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi
- TLDR: We present a novel commonsense distillation algorithm that can learn to generate generics from commonsense facts about everyday concepts, and show that it can outperform the existing language models.
- More than Classification: A Unified Framework for Event Temporal Relation Extraction
- Quzhe Huang, Yutong Hu, Shengqi Zhu, Yansong Feng, Chang Liu, Dongyan Zhao
- TLDR: We show that all relations can be interpreted using the start and end time points of events.
- Multi-Source Test-Time Adaptation as Dueling Bandits for Extractive Question Answering
- Hai Ye, Qizhe Xie, Hwee Tou Ng
- TLDR: We study multi-source test-time model adaptation from user feedback, framing the selection among multiple source models as a dueling-bandits problem for extractive question answering.
- Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery
- Yutao Mou, Xiaoshuai Song, Keqing He, Chen Zeng, Pei Wang, Jingang Wang, Yunsen Xian, Weiran Xu
- TLDR: We propose a decoupled prototype learning framework for pseudo label disambiguation and representation learning.
- DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering
- Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang
- TLDR: We propose a simple yet effective metric for evaluating natural language generation tasks that uses instruction-tuned pre-trained language models to evaluate the quality of generated text.
- Backdooring Neural Code Search
- Weisong Sun, Yuchen Chen, Guanhong Tao, Chunrong Fang, Xiangyu Zhang, Quanjun Zhang, Bin Luo
- TLDR: We present a novel attack on neural code search models which can make them return buggy or even vulnerable code with security/privacy issues.
- Concise Answers to Complex Questions: Summarization of Long-form Answers
- Abhilash Potluri, Fangyuan Xu, Eunsol Choi
- TLDR: We present a user study on summarizing long-form answers to complex questions and propose a new extract-and-decontextualize approach for producing concise answers.
- Towards Better Entity Linking with Multi-View Enhanced Distillation
- Yi Liu, Yuan Tian, Jianxun Lian, Xinlong Wang, Yanan Cao, Fang Fang, Wen Zhang, Haizhen Huang, Weiwei Deng, Qi Zhang
- TLDR: We propose a Multi-View Enhanced Distillation framework for entity linking that can match divergent mentions.
- A Measure-Theoretic Characterization of Tight Language Models
- Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell
- TLDR: We propose a measure-theoretic treatment of language modeling and show that many popular language model families are in fact tight, meaning that they do not leak probability mass onto infinite-length strings.
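As a rough illustration of the "leakage" at stake, here is a toy sketch (our own, not the paper's construction): if a model's per-step end-of-sequence probability decays too quickly, the total probability assigned to all finite strings falls below 1, and the model is not tight.

```python
# Hypothetical toy illustration (not the paper's construction): a language
# model over a one-symbol alphabet where only the EOS probability matters.
# If P(EOS at step t) decays fast enough, the total probability assigned to
# all *finite* strings stays below 1 -- the model "leaks" mass to
# infinite sequences and is therefore not tight.

def finite_mass(eos_prob_at, max_len=10_000):
    """Total probability of generating any string shorter than max_len."""
    mass, alive = 0.0, 1.0  # `alive` = P(not yet terminated)
    for t in range(max_len):
        p_eos = eos_prob_at(t)
        mass += alive * p_eos
        alive *= 1.0 - p_eos
    return mass

print(finite_mass(lambda t: 0.5 ** (t + 1)))  # ~0.71 < 1: leaky, not tight
print(finite_mass(lambda t: 0.5))             # ~1.0: tight
```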
- PAED: Zero-Shot Persona Attribute Extraction in Dialogues
- Luyao Zhu, Wei Li, Rui Mao, Vlad Pandelea, Erik Cambria
- TLDR: We propose a novel hard negative sampling strategy and a contrastive learning- and generation-based model for generalized zero-shot persona attribute extraction from conversations.
- PromptRank: Unsupervised Keyphrase Extraction Using Prompt
- Aobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xiaoyan Bai
- TLDR: We propose a simple yet effective unsupervised approach for keyphrase extraction that prompts a pre-trained language model (PLM) with an encoder-decoder architecture.
- When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
- Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi
- TLDR: We show that large language models struggle with less popular factual knowledge, and that retrieval augmentation helps significantly in these cases.
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information
- Jaehyung Kim, Yekyung Kim, Karin de Langis, Jinwoo Shin, Dongyeop Kang
- TLDR: We present a universal framework for dataset characterization that captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
- SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models
- Akshita Jha, Aida Mostafazadeh Davani, Chandan K Reddy, Shachi Dave, Vinodkumar Prabhakaran, Sunipa Dev
- TLDR: We present SeeGULL, a broad-coverage stereotype dataset, built by utilizing the generative capabilities of large language models such as PaLM and GPT-3, and leveraging a globally diverse rater pool to validate the prevalence of those stereotypes in society.
- Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations
- Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey Kuehl, Erin Bransom, Byron Wallace
- TLDR: We present a dataset of human-assessed summary quality facets and pairwise preferences for multi-document summarization, and show that automated evaluation metrics do not correlate with human-assessed quality.
- Say What You Mean! Large Language Models Speak Too Positively about Negative Commonsense Knowledge
- Jiangjie Chen, Wei Shi, Ziquan Fu, Sijie Cheng, Lei Li, Yanghua Xiao
- TLDR: We show that large language models fail to generate valid sentences grounded in negative commonsense knowledge, yet they can correctly answer polar yes-or-no questions.
- An Inner Table Retriever for Robust Table Question Answering
- Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adria de Gispert, Gonzalo Iglesias
- TLDR: We propose a general-purpose approach for handling long tables in Table Question Answering that extracts sub-tables to preserve the most relevant information for a question.
- SIMSUM: Document-level Text Simplification via Simultaneous Summarization
- Sofia Blinova, Xinyu Zhou, Martin Jaggi, Carsten Eickhoff, Seyed Ali Bahrainian
- TLDR: We propose SIMSUM, which performs document-level text simplification via simultaneous summarization, guided by keywords.
- SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
- Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng
- TLDR: We propose a simple but effective two-stage SimOAP strategy for persona-based dialogue generation that improves the backbone models and outperforms the baseline strategies in both automatic and human evaluations.
- NatLogAttack: A Framework for Attacking Natural Language Inference Models with Natural Logic
- Zi’ou Zheng, Xiaodan Zhu
- TLDR: We propose a novel adversarial attack model based on natural logic for natural language inference.
- Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction
- Ashish Sharma, Kevin Rushton, Inna Lin, David Wadden, Khendra Lucas, Adam Miner, Theresa Nguyen, Tim Althoff
- TLDR: We develop a framework of seven linguistic attributes that can be used to reframe a thought and train a retrieval-enhanced in-context learning model that effectively generates reframed thoughts and controls their linguistic attributes.
- Dating Greek Papyri with Text Regression
- John Pavlopoulos, Maria Konstantinidou, Isabelle Marthot-Santaniello, Holger Essler, Asimina Paparigopoulou
- TLDR: We present a method for estimating the date of documentary Greek papyri via text regression over a dataset of their transcriptions.
- Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
- Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- TLDR: We show that retrieving with just the question is insufficient for knowledge-intensive multi-step question answering, and propose interleaving retrieval with chain-of-thought reasoning.
- Direct Fact Retrieval from Knowledge Graphs without Entity Linking
- Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang
- TLDR: We propose a simple framework that retrieves facts directly from knowledge graphs based on representational similarities, without entity linking.
- DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering
- Ella Neeman, Roee Aharoni, Or Honovich, Leshem Choshen, Idan Szpektor, Omri Abend
- TLDR: We propose a new paradigm in which QA models are trained to disentangle the two sources of knowledge.
- A New Direction in Stance Detection: Target-Stance Extraction in the Wild
- Yingjie Li, Krishna Garg, Cornelia Caragea
- TLDR: We propose a new task Target-Stance Extraction that aims to extract the (target, stance) pair from the text.
- Improved Instruction Ordering in Recipe-Grounded Conversation
- Duong Le, Ruohao Guo, Wei Xu, Alan Ritter
- TLDR: We propose to explore two auxiliary subtasks, namely User Intent Detection and Instruction State Tracking, to support Response Generation with improved instruction grounding.
- Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
- Byung-Doh Oh, William Schuler
- TLDR: We present a method that decomposes autoregressive language model hidden states token-wise to analyze which input tokens drive next-word predictions.
- Document-Level Multi-Event Extraction with Event Proxy Nodes and Hausdorff Distance Minimization
- Xinyu Wang, Lin Gui, Yulan He
- TLDR: We propose an alternative approach for document-level multi-event extraction with event proxy nodes and Hausdorff distance minimization.
- Dialog-Post: Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training
- Zhenyu Zhang, Lei Shen, Yuming Zhao, Meng Chen, Xiaodong He
- TLDR: We propose a novel dialogue-adaptive post-training method that uses multi-level self-supervised objectives and a hierarchical model to model dialogues more comprehensively.
- Language Detoxification with Attribute-Discriminative Latent Space
- Jin Myung Kwak, Minseon Kim, Sung Ju Hwang
- TLDR: We propose an effective yet efficient method for language detoxification using an attribute-discriminative latent space.
- Just Like a Human Would, Direct Access to Sarcasm Augmented with Potential Result and Reaction
- Changrong Min, Ximing Li, Liang Yang, Zhilin Wang, Bo Xu, Hongfei Lin
- TLDR: We develop a novel method for detecting sarcasm in social media text, which outperforms strong baselines on benchmark datasets.
- Adaptive and Personalized Exercise Generation for Online Language Learning
- Peng Cui, Mrinmaya Sachan
- TLDR: We present a novel task of adaptive and personalized exercise generation for online language learning.
- NLP Reproducibility For All: Understanding Experiences of Beginners
- Shane Storks, Keunwoo Yu, Ziqiao Ma, Joyce Chai
- TLDR: We study the experiences of beginners reproducing NLP research and offer recommendations for researchers who open-source their work.
- Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
- Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, Benjamin Van Durme
- TLDR: We present a linguistically-aligned ontology of reasons for ambiguity in visual questions and show that answering them using a question generation objective and rephrasing them for each group reduces ambiguity.
- UMRSpell: Unifying the Detection and Correction Parts of Pre-trained Models towards Chinese Missing, Redundant, and Spelling Correction
- Zheyu He, Yujin Zhu, Linlin Wang, Liang Xu
- TLDR: We propose a novel model, UMRSpell, that learns the detection and correction parts together from a multi-task learning perspective.
- LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction
- Jeremiah Milbauer, Annie Louis, Mohammad Javad Hosseini, Alex Fabrikant, Donald Metzler, Tal Schuster
- TLDR: We propose Layer-Adjustable Interactions in Transformers, a novel architecture for segmented input data that can reduce latency and improve accuracy on NLP tasks.
- Local Interpretation of Transformer Based on Linear Decomposition
- Sen Yang, Shujian Huang, Wei Zou, Jianbing Zhang, Xinyu Dai, Jiajun Chen
- TLDR: We propose a method to interpret neural networks by linear decomposition and show that the ReLU-activated Transformer can be considered as a linear model on a single input.
- DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions
- Vijay Viswanathan, Luyu Gao, Tongshuang Wu, Pengfei Liu, Graham Neubig
- TLDR: We present DataFinder, a dataset for recommending scientific datasets from natural language descriptions, along with a novel bi-encoder retriever for the task.
- Multilingual Event Extraction from Historical Newspaper Adverts
- Nadav Borenstein, Natália da Silva Perez, Isabelle Augenstein
- TLDR: We present a new dataset for multilingual event extraction from historical newspaper adverts.
- BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency
- Zhenyu Lei, Herun Wan, Wenqian Zhang, Shangbin Feng, Zilong Chen, Jundong Li, Qinghua Zheng, Minnan Luo
- TLDR: We propose BIC, a Twitter Bot detection framework with text-graph Interaction and semantic Consistency.
- Do I have the Knowledge to Answer? Investigating Answerability of Knowledge Base Questions
- Mayur Patidar, Prayushi Faldu, Avinash Singh, Lovekesh Vig, Indrajit Bhattacharya, Mausam -
- TLDR: We present a new benchmark dataset for answerability in question answering over knowledge bases and show that state-of-the-art KBQA models are not robust to unanswerable questions.
- Understanding Client Reactions in Online Mental Health Counseling
- Anqi Li, Lizhi Ma, Yaling Mei, Hongliang He, Shuai Zhang, Huachuan Qiu, Zhenzhong Lan
- TLDR: We present a theoretically grounded annotation framework for analyzing how clients react to counselors’ strategies in online mental health counseling.
- Nonlinear Structural Equation Model Guided Gaussian Mixture Hierarchical Topic Modeling
- HeGang Chen, Pengbo Mao, Yuyin Lu, Yanghui Rao
- TLDR: We propose a deep topic model for text corpora that explicitly models hierarchical and symmetric relations between topics through dependency matrices and nonlinear structural equations.
- Revisiting Token Dropping Strategy in Efficient BERT Pretraining
- Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, Dacheng Tao
- TLDR: We propose a simple yet effective semantic-consistent learning method for token dropping in BERT.
- The Benefits of Bad Advice: Autocontrastive Decoding across Model Layers
- Ariel Gera, Roni Friedman, Ofir Arviv, Chulaka Gunasekara, Benjamin Sznajder, Noam Slonim, Eyal Shnarch
- TLDR: We propose a novel approach that utilizes the contrast between layers to improve text generation outputs, and show that it mitigates degenerative behaviors of the model in open-ended generation, significantly improving the quality of generated texts.
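A minimal sketch of the layer-contrast idea, assuming a decoder that exposes per-layer hidden states and a shared LM head (the names and the exact scoring rule below are ours, not the paper's formulation):

```python
import torch

def autocontrastive_logits(hidden_states, lm_head, shallow_layer=4, alpha=1.0):
    """Minimal sketch: score next tokens by the gap between the final
    layer's log-probs and a shallow layer's log-probs, both read out
    through the same LM head. Tokens the shallow ("amateur") layer already
    likes are demoted, sharpening the final layer's contribution."""
    expert = torch.log_softmax(lm_head(hidden_states[-1][:, -1]), dim=-1)
    amateur = torch.log_softmax(lm_head(hidden_states[shallow_layer][:, -1]), dim=-1)
    return expert - alpha * amateur

# e.g., with HuggingFace-style outputs:
# hidden_states = model(input_ids, output_hidden_states=True).hidden_states
```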
- FACTIFY-5WQA: 5W Aspect-based Fact Verification through Question Answering
- Anku Rani, S.M Towhidul Islam Tonmoy, Dwip Dalal, Shreya Gautam, Megha Chakraborty, Aman Chadha, Amit Sheth, Amitava Das
- TLDR: We propose a 5W framework for question-answer-based fact explainability and a baseline QA system for automatic fact verification.
- Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages
- Arnav Mhaske, Harshit Kedia, Sumanth Doddapaneni, Mitesh M. Khapra, Pratyush Kumar, Rudra Murthy, Anoop Kunchukuttan
- TLDR: We present Naamapadam, a large-scale named entity annotated dataset for Indic languages.
- CREPE: Open-Domain Question Answering with False Presuppositions
- Xinyan Yu, Sewon Min, Luke Zettlemoyer, Hannaneh Hajishirzi
- TLDR: We present a dataset of questions with a natural distribution of presupposition failures from online information-seeking forums and provide annotations for these presuppositions and their corrections.
- Joint Document-Level Event Extraction via Token-Token Bidirectional Event Completed Graph
- Qizhi Wan, Changxuan Wan, Keli Xiao, Dexi Liu, Chenliang Li, Bolong Zheng, Xiping Liu, Rong Hu
- TLDR: We solve the challenging document-level event extraction problem by proposing a joint extraction methodology that avoids the inefficiency and error-propagation issues of classic pipeline methods.
- Robust Representation Learning with Reliable Pseudo-labels Generation via Self-Adaptive Optimal Transport for Short Text Clustering
- Xiaolin Zheng, Mengling Hu, Weiming Liu, Chaochao Chen, Xinting Liao
- TLDR: We propose a robust short text clustering model that is able to handle imbalanced and noisy data.
- Multilingual Knowledge Graph Completion with Language-Sensitive Multi-Graph Attention
- Rongchuan Tang, Yang Zhao, Chengqing Zong, Yu Zhou
- TLDR: We propose a novel multilingual knowledge graph completion framework with language-sensitive multi-graph attention and a universal knowledge completion model.
- What are the Desired Characteristics of Calibration Sets? Identifying Correlates on Long Form Scientific Summarization
- Griffin Adams, Bichlien Nguyen, Jake Smith, Yingce Xia, Shufang Xie, Anna Ostropolets, Budhaditya Deb, Yuan-Jyue Chen, Tristan Naumann, Noémie Elhadad
- TLDR: We propose a calibration step for summarization models that exposes a model to its own ranked outputs to improve relevance or, in a separate line of work, contrasts positive and negative sets to improve faithfulness.
- Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution
- Nupoor Gandhi, Anjalie Field, Emma Strubell
- TLDR: We show that adapting mention detection is the key component to successful domain adaptation of coreference models, rather than antecedent linking.
- A Universal Discriminator for Zero-Shot Generalization
- Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, Zhilin Yang
- TLDR: We present a novel approach for training discriminative models for NLP tasks that outperform generative models on a large number of NLP benchmarks.
- Syntax and Geometry of Information
- Raphaël Bailly, Laurent Leblond, Kata Gábor
- TLDR: We propose a new model of syntactic generalization based on the independence of structure and context.
- GreenKGC: A Lightweight Knowledge Graph Completion Method
- Yun Cheng Wang, Xiou Ge, Bin Wang, C.-C. Jay Kuo
- TLDR: We propose a modularized knowledge graph completion method that outperforms SOTA methods in low dimensions and remains competitive with high-dimensional models.
- Unsupervised Open-domain Keyphrase Generation
- Lam Do, Pritom Saha Akash, Kevin Chen-Chuan Chang
- TLDR: We propose a seq2seq model that consists of two modules, namely phraseness and informativeness module, both of which can be built in an unsupervised and open-domain fashion.
- A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment
- Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu
- TLDR: We propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the therapy principle and emotional support strategy.
- Plug-and-Play Knowledge Injection for Pre-trained Language Models
- Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Huadong Wang, Deming Ye, Chaojun Xiao, Xu Han, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
- TLDR: We propose a plug-and-play paradigm that injects knowledge into existing downstream models for NLP tasks without re-training them.
- Two Birds One Stone: Dynamic Ensemble for OOD Intent Classification
- Yunhua Zhou, Jianqiang Yang, Pengyu Wang, Xipeng Qiu
- TLDR: We propose a two-birds-one-stone dynamic ensemble method for OOD intent classification that improves both inference speed and accuracy.
- SWiPE: A Dataset for Document-Level Simplification of Wikipedia Pages
- Philippe Laban, Jesse Vig, Wojciech Kryscinski, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
- TLDR: We propose a dataset for document-level text simplification that reconstructs the document-level editing process from English Wikipedia articles to paired Simple English Wikipedia (SEW) articles.
- Are Message Passing Neural Networks Really Helpful for Knowledge Graph Completion?
- Juanhui Li, Harry Shomer, Jiayuan Ding, Yiqi Wang, Yao Ma, Neil Shah, Jiliang Tang, Dawei Yin
- TLDR: We show that simple MLP models are able to achieve comparable performance to MPNNs, suggesting that MP may not be as crucial as previously believed.
- A dynamic programming algorithm for span-based nested named-entity recognition in O(n^2)
- Caio Corro
- TLDR: Span-based nested named-entity recognition has cubic-time complexity under a variant of the CYK algorithm; we propose a dynamic programming algorithm that reduces this to O(n^2).
- Target-Side Augmentation for Document-Level Machine Translation
- Guangsheng Bao, Zhiyang Teng, Yue Zhang
- TLDR: We propose a target-side augmentation method for document-level machine translation, which improves the MT performance on News and Europarl benchmarks.
- Rethinking Masked Language Modeling for Chinese Spelling Correction
- Hongqiu Wu, Shaohua Zhang, Yuchen Zhang, Hai Zhao
- TLDR: We show that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns.
- A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues
- Yunxin Li, Baotian Hu, Chen Xinyu, Yuxin Ding, Lin Ma, Min Zhang
- TLDR: We propose a novel multi-modal context reasoning approach for conditional inference on joint textual and visual clues.
- Simple and Effective Unsupervised Speech Translation
- Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino
- TLDR: We present a simple and effective approach to build speech translation systems without labeled data by leveraging recent advances in unsupervised speech recognition, machine translation and speech synthesis, either in a pipeline approach, or to generate pseudo-labels for training end-to-end speech translation models.
- Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation
- Xuan Long Do, Bowei Zou, Shafiq Joty, Tran Tai, Liangming Pan, Nancy Chen, Ai Ti Aw
- TLDR: We present SG-CQG, a two-stage framework (what-to-ask and how-to-ask) for answer-unaware conversational question generation.
- CHEER: Centrality-aware High-order Event Reasoning Network for Document-level Event Causality Identification
- Meiqi Chen, Yixin Cao, Yan Zhang, Zhiwei Liu
- TLDR: We propose a novel document-level event causality identification model and a novel event-interaction graph for cross-sentence reasoning.
- f-Divergence Minimization for Sequence-Level Knowledge Distillation
- Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
- TLDR: We propose an f-divergence minimization framework for sequence-level knowledge distillation that decomposes the intractable sequence-level divergence into tractable word-level losses.
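For reference, the f-divergence family that such objectives instantiate (this is the standard definition; the paper's specific word-level decomposition is not reproduced here):

```latex
% f-divergence between teacher p and student q, for convex f with f(1) = 0:
D_f(p \,\|\, q) \;=\; \sum_{y} q(y)\, f\!\left(\frac{p(y)}{q(y)}\right)
% Common special cases used in knowledge distillation:
%   f(t) = t \log t          => forward KL,  KL(p || q)
%   f(t) = -\log t           => reverse KL,  KL(q || p)
%   f(t) = (1/2)\,|t - 1|    => total variation distance
```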
- Supervised Adversarial Contrastive Learning for Emotion Recognition in Conversations
- Dou Hu, Yinan Bao, Lingwei Wei, Wei Zhou, Songlin Hu
- TLDR: We propose a supervised adversarial contrastive learning framework for learning class-spread structured representations in a supervised manner and a sequence-based adversarial learning algorithm for emotion recognition in conversations.
- A Novel Table-to-Graph Generation Approach for Document-Level Joint Entity and Relation Extraction
- Ruoyu Zhang, Yanzeng Li, Lei Zou
- TLDR: We propose a novel table-to-graph generation model for joint extraction of entities and relations at document level.
- A Synthetic Data Generation Framework for Grounded Dialogues
- Jianzhu Bao, Rui Wang, Yasheng Wang, Aixin Sun, Yitong Li, Fei Mi, Ruifeng Xu
- TLDR: Synthetic data generation framework for grounded dialogues.
- MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African languages
- Cheikh M. Bamba Dione, David Ifeoluwa Adelani, Peter Nabende, Jesujoba Alabi, Thapelo Sindane, Happy Buzaaba, Shamsuddeen Hassan Muhammad, Chris Chinenye Emezue, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jonathan Mukiibi, Blessing Sibanda, Bonaventure F. P. Dossou, Andiswa Bukula, Rooweither Mabuya, Allahsera Auguste Tapo, Edwin Munkoh-Buabeng, Victoire Memdjokam Koagne, Fatoumata Ouoba Kabore, Amelia Taylor, Godson Kalipe, Tebogo Macucwa, Vukosi Marivate, Tajuddeen Gwadabe, Mboning Tchiaze Elvis, Ikechukwu Onyenwe, Gratien Atindogbe, Tolulope Adelani, Idris Akinade, Olanrewaju Samuel, Marien Nahimana, Théogène Musabeyezu, Emile Niyomutabazi, Ester Chimhenga, Kudzai Gotosa, Patrick Mizha, Apelete Agbolo, Seydou Traore, Chinedu Uchechukwu, Aliyu Yusuf, Muhammad Abdullahi, Dietrich Klakow
- TLDR: We present a large part-of-speech dataset for 20 typologically diverse African languages and show that choosing the best transfer language(s) in both single-source and multi-source setups greatly improves the POS tagging performance of the target languages, in particular when combined with parameter-efficient fine-tuning methods.
- Semantic Structure Enhanced Event Causality Identification
- Zhilei Hu, Zixuan Li, Xiaolong Jin, Long Bai, Saiping Guan, Jiafeng Guo, Xueqi Cheng
- TLDR: We propose a semantic structure integration model for event causality identification, which captures implicit associations between events in unstructured text and provides possible supports for ECI.
- Weakly-Supervised Spoken Video Grounding via Semantic Interaction Learning
- Ye Wang, Wang Lin, Shengyu Zhang, Tao Jin, Linjun Li, Xize Cheng, Zhou Zhao
- TLDR: Weakly-supervised spoken video grounding with acoustic-semantic pre-training and acoustic-visual contrastive learning.
- Rehearsal-free Continual Language Learning via Efficient Parameter Isolation
- Zhicheng Wang, Yufang Liu, Tao Ji, Xiaoling Wang, Yuanbin Wu, Congcong Jiang, Ye Chao, Zhencong Han, Ling Wang, Xu Shao, Wenqiu Zeng
- TLDR: We study the problem of defying catastrophic forgetting when learning a series of language processing tasks.
- Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification
- Chih Yao Chen, Tun Min Hung, Yi-Li Hsu, Lun-Wei Ku
- TLDR: We propose HypEmo, a novel framework that can integrate hyperbolic embeddings to improve the FEC task.
- Combo of Thinking and Observing for Outside-Knowledge VQA
- Qingyi Si, Yuchen Mo, Zheng Lin, Huishan Ji, Weiping Wang
- TLDR: We propose a novel framework for external-knowledge visual question answering which uses multimodal and textual knowledge to solve the question answering problem.
- AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model
- I-Hung Hsu, Zhiyu Xie, Kuan-Hao Huang, Prem Natarajan, Nanyun Peng
- TLDR: We propose AMPERE, a generation-based EAE model that incorporates abstract meaning representations of input passages into the generation model via AMR-aware prefixes.
- Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships
- David Jurgens, Agrima Seth, Jackson Sargent, Athena Aghighi, Michael Geraci
- TLDR: We present a new approach to identifying inappropriate communication by explicitly modeling the social relationship between the individuals.
- TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation
- Shuo Lei, Xuchao Zhang, Jianfeng He, Fanglan Chen, Chang-Tien Lu
- TLDR: We propose a novel task-adaptive reference transformation network for few-shot text classification and show clear superiority over the state-of-the-art models in all the datasets.
- How Do In-Context Examples Affect Compositional Generalization?
- Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou, Dongmei Zhang
- TLDR: We present CoFe, a test suite to investigate in-context compositional generalization.
- Attractive Storyteller: Stylized Visual Storytelling with Unpaired Text
- Dingyi Yang, Qin Jin
- TLDR: We propose a novel style-specific visual storytelling model based on visual storytelling data and unpaired style corpus, which can generate attractive and coherent stories with different styles such as fairy tale, romance, and humor.
- Multitask Pretraining with Structured Knowledge for Text-to-SQL Generation
- Robert Giaquinto, Dejiao Zhang, Benjamin Kleiner, Yang Li, Ming Tan, Parminder Bhatia, Ramesh Nallapati, Xiaofei Ma
- TLDR: We present a novel approach for learning representations of text, tables, and SQL code that leverages the entire context of the problem.
- WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction
- Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka
- TLDR: We propose a novel weakly supervised word alignment method that improves upon the best supervised baseline by 3.3 points in F1 and 1.1 points in AER.
- Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models
- Junmo Kang, Wei Xu, Alan Ritter
- TLDR: We show that distilling from T5-XXL to T5 Small is almost always a cost-efficient strategy compared to annotating more data to directly train a compact model.
- OD-RTE: A One-Stage Object Detection Framework for Relational Triple Extraction
- Jinzhong Ning, Zhihao Yang, Yuanyuan Sun, Zhizheng Wang, Hongfei Lin
- TLDR: We propose a novel Object Detection framework for Relational Triple Extraction task based on table-filling method and show that it achieves state-of-the-art performance on two widely used datasets.
- I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons
- Pei Zhou, Andrew Zhu, Jennifer Hu, Jay Pujara, Xiang Ren, Chris Callison-Burch, Yejin Choi, Prithviraj Ammanabrolu
- TLDR: We propose a novel task, G4C, to study teacher-student natural language interactions in a goal-driven and grounded environment.
- Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
- Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
- TLDR: We present Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot learning.
- Is GPT-3 a Good Data Annotator?
- Bosheng Ding, Chengwei Qin, Linlin Liu, Yew Ken Chia, Boyang Li, Shafiq Joty, Lidong Bing
- TLDR: We evaluate the performance of GPT-3 as a data annotator for NLP tasks and provide insight into its potential as a general-purpose data annotator.
- Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog
- Fanqi Wan, Weizhou Shen, Ke Yang, Xiaojun Quan, Wei Bi
- TLDR: We propose a multi-grained knowledge retriever that uses supervision signals from the response generator to improve knowledge retrieval performance.
- Few-shot Event Detection: An Empirical Study and a Unified View
- Yubo Ma, Zehao Wang, Yixin Cao, Aixin Sun
- TLDR: We present a unified framework for few-shot event detection and propose a simple yet effective baseline for both prompt-based and prototype-based methods.
- How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases
- Aaron Mueller, Tal Linzen
- TLDR: We show that pre-training on simpler language, such as child-directed speech, induces a hierarchical bias using an order-of-magnitude less data than pre- training on more typical datasets based on web text or Wikipedia; this suggests that in cognitively plausible language acquisition settings, neural language models may be more data-efficient than previously thought.
- ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations
- Valentina Pyatkin, Jena D. Hwang, Vivek Srikumar, Ximing Lu, Liwei Jiang, Yejin Choi, Chandra Bhagavatula
- TLDR: We present ClarifyDelphi, an interactive system that learns to ask clarification questions (e.g., why did you lie to your friend?) in order to elicit additional salient contexts of a social or moral situation.
- HINT: Hypernetwork Instruction Tuning for Efficient Zero- and Few-Shot Generalisation
- Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, Matthew Peters
- TLDR: We present a novel method for parameter-efficient fine-tuning of neural networks using only natural language instructions as guidance.
- Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
- Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He
- TLDR: We investigate the inductive biases of GPT-3 models and show that they exhibit strong feature biases.
- An Inclusive Notion of Text
- Ilia Kuznetsov, Iryna Gurevych
- TLDR: We propose a taxonomy of linguistic and non-linguistic elements that can be used in NLP modeling and explore the role of text in NLP.
- AlignScore: Evaluating Factual Consistency with A Unified Alignment Function
- Yuheng Zha, Yichi Yang, Ruichen Li, Zhiting Hu
- TLDR: We propose AlignScore, a holistic metric based on a unified alignment function that applies to a variety of factual consistency scenarios.
- Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation
- Liqiang Jing, Xuemeng Song, Kun Ouyang, Mengzhao Jia, Liqiang Nie
- TLDR: We propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM.
- Counterfactual Active Learning for Out-of-Distribution Generalization
- Xun Deng, Wenjie Wang, Fuli Feng, Hanwang Zhang, Xiangnan He, Yong Liao
- TLDR: We propose Counterfactual Active Learning, which adaptively selects samples for annotation so that the learned decision boundary generalizes out of distribution.
- Multi-granularity Temporal Question Answering over Knowledge Graphs
- Ziyang Chen, Jinzhi Liao, Xiang Zhao
- TLDR: We present MultiTQ, a large-scale dataset for multi-granularity temporal question answering over knowledge graphs, together with a competitive baseline, MultiQA, which is experimentally shown to be effective for temporal KGQA.
- A New Aligned Simple German Corpus
- Vanessa Toborek, Moritz Busch, Malte Boßert, Christian Bauckhage, Pascal Welke
- TLDR: We present a new sentence-aligned monolingual corpus for Simple German – German.
- Introducing Semantics into Speech Encoders
- Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Bing Liu, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Guan-Ting Lin, Alexei Baevski, Hung-yi Lee, Yizhou Sun, Wei Wang
- TLDR: We propose a task-agnostic unsupervised way of incorporating semantic information from large language model (LLM) systems into self-supervised speech encoders without labeled audio transcriptions.
- Constrained Tuple Extraction with Interaction-Aware Network
- Xiaojun Xue, Chunxia Zhang, Tianxiang Xu, Zhendong Niu
- TLDR: We propose a constrained tuple extraction task, with an interaction-aware network, for knowledge graph construction.
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
- Zhiyang Xu, Ying Shen, Lifu Huang
- TLDR: We present a multimodal instruction tuning benchmark dataset for vision and multimodality tasks and show that fine-tuning the model on a diverse set of tasks and instructions leads to a reduced sensitivity to variations in instructions for each task.
- Single Sequence Prediction over Reasoning Graphs for Multi-hop QA
- Gowtham Ramesh, Makesh Narsimhan Sreedhar, Junjie Hu
- TLDR: We propose a method that predicts a single sequence over a reasoning graph for multi-hop question answering, improving answer accuracy and faithfulness of the reasoning path.
- Contrastive Error Attribution for Finetuned Language Models
- Faisal Ladhak, Esin Durmus, Tatsunori Hashimoto
- TLDR: We present a novel method for detecting and removing low-quality training instances that lead to faithfulness errors in NLG tasks.
- DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications
- Adam Ivankay, Mattia Rigotti, Pascal Frossard
- TLDR: We propose a robustness estimation method for attributions in biomedical datasets that can be used to prevent explanations that are inaccurate but still look convincing in the context of the domain at hand.
- Neural Machine Translation for Mathematical Formulae
- Felix Petersen, Moritz Schubotz, Andre Greiner-Petter, Bela Gipp
- TLDR: We tackle the problem of neural machine translation of mathematical formulae between ambiguous presentation languages and unambiguous content languages.
- Query-Efficient Black-Box Red Teaming via Bayesian Optimization
- Deokjae Lee, JunYeong Lee, Jung-Woo Ha, Jin-Hwa Kim, Sang-Woo Lee, Hwaran Lee, Hyun Oh Song
- TLDR: We propose a new method for black-box red teaming in which a red team generates test cases and interacts with the victim model to discover a diverse set of failures with limited query access.
- SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
- Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov
- TLDR: We present SSD-LM, a novel language model that outperforms autoregressive language models on unconstrained text generation and controlled text generation benchmarks.
- Recall, Expand, and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing
- Chengyue Jiang, Wenyang Hui, Yong Jiang, Xiaobin Wang, Pengjun Xie, Kewei Tu
- TLDR: Ultra-fine entity typing (UFET) predicts extremely free-formed types; we propose a recall, expand, and multi-candidate cross-encode pipeline for fast and accurate UFET.
- MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition
- Yuchen Hu, Chen Chen, Ruizhe Li, Heqing Zou, Eng Siong Chng
- TLDR: We propose an adversarial network to learn the shared representations across modalities to bridge the gap in multi-modality fusion and representation learning.
- Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors
- Liyan Tang, Tanya Goyal, Alex Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryscinski, Justin Rousseau, Greg Durrett
- TLDR: We stratify factual errors in summarization by their underlying summarization model and show that factuality metrics’ performance varies significantly across summarizers, datasets, and error types.
- GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding
- Jia-Chen Gu, Zhenhua Ling, Quan Liu, Cong Liu, Guoping Hu
- TLDR: Graph-induced fine-tuning for universal MPC understanding.
- Hybrid Uncertainty Quantification for Selective Text Classification in Ambiguous Tasks
- Artem Vazhentsev, Gleb Kuzmin, Akim Tsvigun, Alexander Panchenko, Maxim Panov, Mikhail Burtsev, Artem Shelmanov
- TLDR: We propose a new uncertainty estimation method that combines epistemic and aleatoric uncertainty for ambiguous text classification tasks.
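A generic sketch of how the two uncertainty types can be separated and combined from Monte-Carlo dropout passes (this is the common entropy/mutual-information decomposition; the paper's exact combination may differ):

```python
import torch

def hybrid_uncertainty(mc_probs, weight=0.5):
    """Sketch of combining uncertainty types. `mc_probs` holds softmax
    outputs from T stochastic forward passes, shape (T, num_classes).
    - aleatoric: expected entropy of individual passes (data ambiguity)
    - epistemic: mutual information = entropy of the mean prediction
      minus the expected entropy (model uncertainty, a.k.a. BALD)."""
    mean = mc_probs.mean(dim=0)
    total_entropy = -(mean * mean.clamp_min(1e-12).log()).sum()
    expected_entropy = -(mc_probs * mc_probs.clamp_min(1e-12).log()).sum(dim=-1).mean()
    epistemic = total_entropy - expected_entropy
    aleatoric = expected_entropy
    return weight * epistemic + (1 - weight) * aleatoric
```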
- BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
- Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Winata, Stella Biderman, Edward Raff, Dragomir Radev, Vassilina Nikoulina
- TLDR: Language adaptation improves zero-shot prompting performance in multilingual language models.
- Logic-driven Indirect Supervision: An Application to Crisis Counseling
- Mattia Medina Grespan, Meghan Broadbent, Xinyao Zhang, Katherine Axford, Brent Kious, Zac Imel, Vivek Srikumar
- TLDR: We propose a logic-based indirect supervision approach that exploits declaratively stated structural dependencies between both levels of annotation to improve utterance modeling.
- Grounding Characters and Places in Narrative Text
- Sandeep Soni, Amanpreet Sihra, Elizabeth Evans, Matthew Wilkens, David Bamman
- TLDR: We propose a novel spatial relationship categorization task for characters and locations in narrative text and train a model to predict their spatial relationships.
- From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
- Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov
- TLDR: We empirically quantify the effects of political (social, economic) biases in pretraining data on the fairness of high-stakes social-oriented tasks.
- SLABERT Talk Pretty One Day: Modeling Second Language Acquisition with BERT
- Aditya Yadavalli, Alekhya Yadavalli, Vera Tobin
- TLDR: We show that language family distance predicts negative transfer in second language acquisition and show that conversational speech data shows greater facilitation for language acquisition than scripted speech data.
- Contrastive Novelty-Augmented Learning: Anticipating Outliers with Large Language Models
- Albert Xu, Xiang Ren, Robin Jia
- TLDR: We present Contrastive Novelty-Augmented Learning, a method that anticipates outliers with large language models in order to detect and abstain on novel-class examples.
- Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?
- Chengwei Qin, Shafiq Joty, Qian Li, Ruochen Zhao
- TLDR: Meta-learning can improve cross-task generalization in few-shot learning by learning to initialize the prompt embeddings from other relevant tasks.
- Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
- Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth
- TLDR: We investigate the hypothesis that large language models may be under-trained for in-context learning, through an interpretability-based case study at 66 billion parameter scale.
- Question-Answering in a Low-resourced Language: Benchmark Dataset and Models for Tigrinya
- Fitsum Gaim, Wonsuk Yang, Hancheol Park, Jong Park
- TLDR: We present a native question-answering benchmark dataset and models for Tigrinya, a low-resourced language of East Africa.
- ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain
- Mike Zhang, Rob van der Goot, Barbara Plank
- TLDR: We present a language model based on XLM-R-large for the computational job market domain that achieves state-of-the-art results on 6 out of 9 datasets.
- CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval
- Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen
- TLDR: We propose conditional token interaction via dynamic lexical routing for efficient and effective multi-vector retrieval.
- MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
- Bang Yang, Fenglin Liu, Xian Wu, Yaowei Wang, Xu Sun, Yuexian Zou
- TLDR: We present a simple yet effective zero-shot approach MultiCapCLIP that can generate visual captions for different scenarios and languages without any labeled vision-caption pairs of downstream datasets.
- Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge
- Vasudha Varadarajan, Swanie Juhng, Syeda Mahwish, Xiaoran Liu, Jonah Luby, Christian Luhmann, H. Andrew Schwartz
- TLDR: We propose and investigate transfer- and active learning solutions to the rare class problem of dissonance detection through utilizing models trained on closely related tasks and the evaluation of acquisition strategies, including a proposed probability-of-rare-class (PRC) approach.
- In-sample Curriculum Learning by Sequence Completion for Natural Language Generation
- Qi Jia, Yizhu Liu, Haifeng Tang, Kenny Zhu
- TLDR: We propose in-sample curriculum learning for natural language generation: the model is first trained to generate the last few words of a sequence, i.e., to do sequence completion, and the completion window is gradually extended until it covers the whole sequence.
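A minimal sketch of the training-loss side of this idea, with an illustrative growth schedule (the function names and the schedule are ours, not the paper's):

```python
import torch
import torch.nn.functional as F

def completion_loss(logits, targets, num_completion_tokens):
    """Only the last `num_completion_tokens` positions of each target
    sequence contribute to the loss; growing that window over training
    implements the in-sample curriculum."""
    seq_len = targets.size(1)
    loss = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction="none")  # (batch, seq)
    mask = torch.zeros_like(loss)
    mask[:, max(0, seq_len - num_completion_tokens):] = 1.0
    return (loss * mask).sum() / mask.sum()

# Illustrative schedule: complete the last 8 tokens first, then extend.
def window_size(step, seq_len, start=8, grow_every=1000):
    return min(seq_len, start + step // grow_every)
```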
- Product Question Answering in E-Commerce: A Survey
- Yang Deng, Wenxuan Zhang, Qian Yu, Wai Lam
- TLDR: We categorize PQA studies into four problem settings and present existing datasets and evaluation protocols for each setting.
- Towards Domain-Agnostic and Domain-Adaptive Dementia Detection from Spoken Language
- Shahla Farzana, Natalie Parde
- TLDR: We show that domain adaptation techniques can improve generalizability across diverse datasets for dementia detection.
- Generalizing Backpropagation for Gradient-Based Interpretability
- Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alex Warstadt, Ryan Cotterell
- TLDR: We show that the gradient computation of a model is a special case of a more general formulation using semirings.
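A toy illustration of the semiring view (ours, not the paper's implementation): backpropagation sums products of local derivatives over all paths through the computation graph, and swapping the sum-product semiring for max-product recovers the single most influential path instead.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Semiring:
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float

SUM_PRODUCT = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
MAX_PRODUCT = Semiring(max, lambda a, b: a * b, float("-inf"), 1.0)

def aggregate_paths(layers, sr):
    """Aggregate edge weights over all paths through a layered graph.
    With SUM_PRODUCT this matches what backprop computes (sum over paths
    of products of local derivatives); MAX_PRODUCT keeps the best path."""
    values = [sr.one]
    for weights in layers:  # weights[i][j]: edge weight from node i to node j
        nxt = [sr.zero] * len(weights[0])
        for i, v in enumerate(values):
            for j, w in enumerate(weights[i]):
                nxt[j] = sr.plus(nxt[j], sr.times(v, w))
        values = nxt
    return values

layers = [[[0.5, 2.0]], [[1.0], [0.25]]]  # 1 -> 2 -> 1 nodes
print(aggregate_paths(layers, SUM_PRODUCT))  # [1.0]: 0.5*1.0 + 2.0*0.25
print(aggregate_paths(layers, MAX_PRODUCT))  # [0.5]: best single path
```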
- UPPAM: A Unified Pre-training Architecture for Political Actor Modeling based on Language
- Xinyi Mou, Zhongyu Wei, Qi Zhang, Xuanjing Huang
- TLDR: We propose a Unified Pre-training Architecture for Political Actor Modeling based on language (UPPAM).
- Generic Temporal Reasoning with Differential Analysis and Explanation
- Yu Feng, Ben Zhou, Haoyu Wang, Helen Jin, Dan Roth
- TLDR: We present a novel task named TODAY that evaluates whether systems can correctly understand the effect of incremental changes in temporal relations.
- Model-Based Simulation for Optimising Smart Reply
- Benjamin Towle, Ke Zhou
- TLDR: Model-based simulation for smart reply systems that optimises the relevance of the predicted reply set.
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval
- John Wieting, Jonathan Clark, William Cohen, Graham Neubig, Taylor Berg-Kirkpatrick
- TLDR: Generative model for multilingual text embeddings.
- On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
- Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov
- TLDR: We explore the limitations of existing evaluation metrics for text generation and summarization, and propose a new methodology for robustness analysis.
- Dealing with Semantic Underspecification in Multimodal NLP
- Sandro Pezzelle
- TLDR: We argue that semantic underspecification is a crucial language feature that boosts storage and processing efficiency, but show that ignoring it can hurt model performance and lead to harmful consequences in applications.
- Trigger Warning Assignment as a Multi-Label Document Classification Problem
- Matti Wiegmann, Magdalena Wolska, Christopher Schröder, Ole Borchardt, Benno Stein, Martin Potthast
- TLDR: We propose a novel multi-label classification task for trigger warnings and provide a comprehensive taxonomy of trigger warnings.
- WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings
- Wenjie Zhuo, Yifan Sun, Xiaohan Wang, Linchao Zhu, Yi Yang
- TLDR: We present a novel whitening-based contrastive learning method for sentence embedding learning, which improves the uniformity and alignment of sentence embeddings.
- Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms
- Tianshu Zhang, Changchang Liu, Wei-Han Lee, Yu Su, Huan Sun
- TLDR: We propose a novel LOss Reduction Adjusted Re-weighting mechanism for federated learning for semantic parsing, which improves the performance of existing federated algorithms.
- Causality-Guided Multi-Memory Interaction Network for Multivariate Stock Price Movement Prediction
- Di Luo, Weiheng Liao, Shuqi Li, Xin Cheng, Rui Yan
- TLDR: We propose a novel end-to-end deep neural network for stock price movement prediction which, for the first time, models the multi-modality between financial text data and causality-enhanced stock correlations to achieve higher prediction accuracy.
- DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
- SongYang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan
- TLDR: We propose a novel adversarial training procedure that minimizes the expected global loss under adversarial attacks by perturbing the input data’s probability distribution rather than their embeddings.
- A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires
- Hoyun Song, Jisu Shin, Huije Lee, Jong Park
- TLDR: We propose a novel approach that captures the semantic meanings directly from the text and compares them to symptom-related descriptions.
- Downstream Datasets Make Surprisingly Good Pretraining Corpora
- Kundan Krishna, Saurabh Garg, Jeffrey Bigham, Zachary Lipton
- TLDR: We show that pretraining directly on downstream datasets often rivals or outperforms standard pretraining on the BookWiki corpus, especially for structured output prediction tasks.
- Towards Open-World Product Attribute Mining: A Lightly-Supervised Approach
- Liyan Xu, Chenwei Zhang, Xian Li, Jingbo Shang, Jinho D. Choi
- TLDR: We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention.
- XDailyDialog: A Multilingual Parallel Dialogue Corpus
- Zeming Liu, Ping Nie, Jie Cai, Haifeng Wang, Zheng-Yu Niu, Peng Zhang, Mrinmaya Sachan, Kaiping Peng
- TLDR: We provide a multilingual parallel open-domain dialog dataset, XDailyDialog, to enable researchers to explore the challenging task of multilingual and cross-lingual open-dialogue modeling.
- PAL to Lend a Helping Hand: Towards Building an Emotion Adaptive Polite and Empathetic Counseling Conversational Agent
- Kshitij Mishra, Priyanshu Priya, Asif Ekbal
- TLDR: We propose a novel conversational agent for online counseling that is designed to provide a pleasant and empathetic counseling experience to substance addicts and crime victims.
- Bidirectional Generative Framework for Cross-domain Aspect-based Sentiment Analysis
- Yue Deng, Wenxuan Zhang, Sinno Jialin Pan, Lidong Bing
- TLDR: We propose a unified bidirectional generative framework for cross-domain aspect-based sentiment analysis that tackles various cross-domain ABSA tasks and achieves new state-of-the-art results on all of them.
- Contrastive Decoding: Open-ended Text Generation as Optimization
- Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, Mike Lewis
- TLDR: We propose contrastive decoding, a reliable and effective approach for open-ended generation of language models that optimizes a contrastive objective subject to a plausibility constraint.
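A minimal single-step sketch of this scoring rule, following the paper's description of an expert/amateur log-probability gap under a plausibility constraint (function and argument names are ours, not the paper's code):

```python
import torch

def contrastive_decoding_scores(expert_logits, amateur_logits, alpha=0.1):
    """One decoding step of contrastive decoding: restrict to tokens that
    are plausible under the expert (prob >= alpha * max expert prob),
    then rank candidates by the expert-minus-amateur log-prob gap."""
    log_p_exp = torch.log_softmax(expert_logits, dim=-1)
    log_p_ama = torch.log_softmax(amateur_logits, dim=-1)
    # Plausibility constraint (adaptive "head" of the expert distribution).
    cutoff = torch.log(torch.tensor(alpha)) + log_p_exp.max(dim=-1, keepdim=True).values
    scores = log_p_exp - log_p_ama
    return scores.masked_fill(log_p_exp < cutoff, float("-inf"))

# next_token = contrastive_decoding_scores(exp_logits, ama_logits).argmax(-1)
```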
- Resolving Indirect Referring Expressions for Entity Selection
- Mohammad Javad Hosseini, Filip Radlinski, Silvia Pareti, Annie Louis
- TLDR: We present a new dataset of indirect referring expressions for conversational systems and develop models for the disambiguation problem.
- Accelerating Transformer Inference for Translation via Parallel Decoding
- Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, Riccardo Marin, Emanuele Rodola
- TLDR: We reframe standard greedy autoregressive decoding in machine translation as a parallel fixed-point iteration, accelerating inference without sacrificing translation quality.
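A minimal sketch of one such scheme, Jacobi-style iteration (the API is illustrative; the real method also handles shifted decoder inputs, batching, and stopping criteria). Here `model(src, tgt)` is assumed to return, for every position, the logits of the token at that position given the source and the current draft:

```python
import torch

@torch.no_grad()
def jacobi_decode(model, src, max_len, pad_id):
    """Initialize the whole target with padding, then repeatedly recompute
    the greedy token at *every* position in parallel until the sequence
    stops changing. Each sweep is one parallel forward pass; at the fixed
    point the result coincides with greedy autoregressive decoding."""
    tgt = torch.full((1, max_len), pad_id)
    for _ in range(max_len):  # at most max_len sweeps are ever needed
        logits = model(src, tgt)       # (1, max_len, vocab)
        new_tgt = logits.argmax(dim=-1)
        if torch.equal(new_tgt, tgt):  # fixed point reached
            break
        tgt = new_tgt
    return tgt
```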
- Hard Sample Aware Prompt-Tuning
- Yuanjian Xu, Qi An, Jiahuan Zhang, Peng Li, Zaiqing Nie
- TLDR: We propose a hard-sample-aware prompt-tuning framework for few-shot NLP tasks that improves the model’s accuracy and discriminability on hard samples.
- WikiBio: a Semantic Resource for the Intersectional Analysis of Biographical Events
- Marco Antonio Stranisci, Rossana Damiano, Enrico Mensa, Viviana Patti, Daniele Radicioni, Tommaso Caselli
- TLDR: We present WikiBio, a new corpus annotated for biographical event detection, together with baseline models, to support intersectional analysis of biographies.
- Best-k Search Algorithm for Neural Text Generation
- Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- TLDR: We propose a new search algorithm for efficient and diverse text generation that greedily expands the top-k nodes of the search tree at each step.
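A minimal sketch of the search loop as we read the TLDR (the actual algorithm's scoring heuristics and batching details are not reproduced): keep a scored frontier, pop the best k nodes each step, expand them, and push the children back. `expand`, `score`, and `is_final` are placeholders for the model-specific pieces.

```python
import heapq

def best_k_search(root, expand, score, k=4, max_steps=100, is_final=None):
    """Best-first search over the generation tree, popping the k
    highest-scoring frontier nodes per step (batched in practice)."""
    frontier = [(-score(root), 0, root)]  # max-heap via negated scores
    counter, finished = 1, []
    for _ in range(max_steps):
        batch = [heapq.heappop(frontier) for _ in range(min(k, len(frontier)))]
        for neg_s, _, node in batch:
            if is_final and is_final(node):
                finished.append((-neg_s, node))
                continue
            for child in expand(node):  # one decoding step per child
                heapq.heappush(frontier, (-score(child), counter, child))
                counter += 1
        if not frontier:
            break
    return max(finished, key=lambda t: t[0], default=None)
```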
- Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages
- Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreya Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar
- TLDR: We present a new benchmark for Indic languages that aims to test the multilingual zero-shot capabilities of pretrained language models.
- Transforming Visual Scene Graphs to Image Captions
- Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang
- TLDR: We propose to TransForm Scene Graphs into more descriptive Captions (TFSGC).
- Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
- Yun Tang, Anna Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden Tomasello, Juan Pino
- TLDR: We propose a novel model that combines Transducer and attention-based encoder-decoder modeling for speech-to-text tasks.
- Improving Domain Generalization for Prompt-Aware Essay Scoring via Disentangled Representation Learning
- Zhiwei Jiang, Tianyi Gao, Yafeng Yin, Meng Liu, Hua Yu, Zifeng Cheng, Qing Gu
- TLDR: We propose a prompt-aware neural AES model to extract comprehensive representation for essay scoring, including both prompt-invariant and prompt-specific features.
- What’s the Meaning of Superhuman Performance in Today’s NLU?
- Simone Tedeschi, Johan Bos, Thierry Declerck, Jan Hajič, Daniel Hershcovich, Eduard Hovy, Alexander Koller, Simon Krek, Steven Schockaert, Rico Sennrich, Ekaterina Shutova, Roberto Navigli
- TLDR: We show that current NLU benchmarks do not meaningfully measure superhuman performance of pretrained language models and provide recommendations for fairer and more transparent benchmarks.
- PromptNER: Prompt Locating and Typing for Named Entity Recognition
- Yongliang Shen, Zeqi Tan, Shuhui Wu, Wenqi Zhang, Rongsheng Zhang, Yadong Xi, Weiming Lu, Yueting Zhuang
- TLDR: We propose a prompt-based model for NER that locates and types entities via prompt templates with position and type slots, using a dynamic template filling mechanism based on extended bipartite graph matching.
- Hints on the data for language modeling of synthetic languages with transformers
- Rodolfo Zevallos, Nuria Bel
- TLDR: We show that the amount of data needed to train language models depends on the morphological characteristics of the language.
- Neural Machine Translation Methods for Translating Text to Sign Language Glosses
- Dele Zhu, Vera Czehmann, Eleftherios Avramidis
- TLDR: We examine and improve neural machine translation methods, including Transformer-based models and data augmentation, for translating text into sign language glosses.
- Revisiting Event Argument Extraction: Can EAE Models Learn Better When Being Aware of Event Co-occurrences?
- Yuxin He, Jingyue Hu, Buzhou Tang
- TLDR: We propose a new approach to event argument extraction by exploiting co-occurrences.
- HAUSER: Towards Holistic and Automatic Evaluation of Simile Generation
- Qianyu He, Yikai Zhang, Jiaqing Liang, Yuncheng Huang, Yanghua Xiao, Yunwen Chen
- TLDR: We present a holistic and automatic evaluation system for simile generation that is more correlated with human ratings and more efficient than prior automatic metrics.
- Large-scale Lifelong Learning of In-context Instructions and How to Tackle It
- Jisoo Mok, Jaeyoung Do, Sungjin Lee, Tara Taghavi, Seunghak Yu, Sungroh Yoon
- TLDR: We propose a new method for lifelong in-context instruction learning that improves the generalization performance of language models.
- Controllable Text Generation via Probability Density Estimation in the Latent Space
- Yuxuan Gu, Xiaocheng Feng, Sicheng Ma, Lingyuan Zhang, Heng Gong, Weihong Zhong, Bing Qin
- TLDR: We propose a novel control framework using probability density estimation in the latent space for controllable text generation.
- Learning Latent Relations for Temporal Knowledge Graph Reasoning
- Mengqi Zhang, Yuwei Xia, Qiang Liu, Shu Wu, Liang Wang
- TLDR: We propose a novel Latent relations Learning method for temporal knowledge graph reasoning.
- DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-level Value Function
- Haiming Wang, Ye Yuan, Zhengying Liu, Jianhao Shen, Yichun Yin, Jing Xiong, Enze Xie, Han Shi, Yujun Li, Lin Li, Jian Yin, Zhenguo Li, Xiaodan Liang
- TLDR: We propose a novel Dynamic-Tree Driven Theorem Solver that improves the state-of-the-art theorem-proving algorithm by guiding the search procedure with state confidence and proof-level values.
- Unsupervised Selective Rationalization with Noise Injection
- Adam Storek, Melanie Subbiah, Kathleen McKeown
- TLDR: We propose a novel training technique for unsupervised selective rationalization that improves rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark, while maintaining or improving model faithfulness.
- Understanding In-Context Learning via Supportive Pretraining Data
- Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang
- TLDR: We study which pretraining data supports language models’ in-context learning ability.
- ETHICIST: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation
- Zhexin Zhang, Jiaxin Wen, Minlie Huang
- TLDR: We propose a method for targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation, investigating how to recover the suffix in the training data when given a prefix.
- Effective Contrastive Weighting for Dense Query Expansion
- Xiao Wang, Sean MacAvaney, Craig Macdonald, Iadh Ounis
- TLDR: We present a contrastive solution that learns to select the most useful embeddings for semantic search.
- Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore
- Janosch Haber, Bertie Vidgen, Matthew Chapman, Vibhor Agarwal, Roy Ka-Wei Lee, Yong Keong Yap, Paul Röttger
- TLDR: We present a new multilingual dataset of online attacks, and provide fine-grained hierarchical labels for online attacks.
- Reanalyzing L2 Preposition Learning with Bayesian Mixed Effects and a Pretrained Language Model
- Jakob Prange, Man Ho Ivy Wong
- TLDR: We use both Bayesian and neural models to dissect a data set of Chinese learners’ pre- and post-interventional responses to two tests measuring their understanding of English prepositions.
- Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization
- Artidoro Pagnoni, Alex Fabbri, Wojciech Kryscinski, Chien-Sheng Wu
- TLDR: We introduce Socratic pretraining, a question-driven, unsupervised pretraining objective specifically designed to improve controllability in long document controllable summarization tasks.
- MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
- Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Eisenschlos
- TLDR: We propose MatCha (Math reasoning and Chart derendering pretraining) to enhance visual language models’ capabilities in jointly modeling charts/plots and language data.
- MGR: Multi-generator Based Rationalization
- Wei Liu, Haozhao Wang, Jun Wang, Ruixuan Li, Xinyang Li, YuanKai Zhang, Yang Qiu
- TLDR: We propose a novel method for rationalization that combines multiple generators and predictors to solve the two problems of spurious correlation and degeneration.
- BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics
- Liang Ma, Shuyang Cao, Robert L Logan IV, Di Lu, Shihao Ran, Ke Zhang, Joel Tetreault, Alejandro Jaimes
- TLDR: We present a benchmark of unfaithful minimal pairs (BUMP), a dataset of 889 human-written, minimally different summary pairs, where a single error is introduced to a summary from the CNN/DailyMail dataset to produce an unfaithful summary.
- Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection
- Rheeya Uppaal, Junjie Hu, Yixuan Li
- TLDR: We present a study investigating the efficacy of directly leveraging pre-trained language models for OOD detection, without any model fine-tuning on the ID data.
- UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization
- Yulong Chen, Yang Liu, Ruochen Xu, Ziyi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang
- TLDR: Unified few-shot summarization model pre-trained with multiple summarization tasks that can be prefix-tuned to excel at any few-shot summarization task.
- RADE: Reference-Assisted Dialogue Evaluation for Open-Domain Dialogue
- Zhengliang Shi, Weiwei Sun, Shuo Zhang, Zhen Zhang, Pengjie Ren, Zhaochun Ren
- TLDR: We propose a reference-assisted approach to evaluating open-domain dialogue systems that rates candidate responses against reference responses, addressing the one-to-many problem.
- An AMR-based Link Prediction Approach for Document-level Event Argument Extraction
- Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
- TLDR: We propose a novel graph structure for document-level event argument extraction and a novel method for link prediction on AMR graphs.
- PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
- Qingqing Cao, Bhargavi Paranjape, Hannaneh Hajishirzi
- TLDR: We present PuMer: a token reduction framework that uses text-informed Pruning and modality-aware Merging strategies to progressively reduce the tokens of input image and text, improving model inference speed and reducing memory footprint.
- Gloss-Free End-to-End Sign Language Translation
- Kezhou Lin, Xiaohan Wang, Linchao Zhu, Ke Sun, Bang Zhang, Yi Yang
- TLDR: We propose a novel end-to-end sign language translation framework that improves translation performance without gloss annotations.
- TAGPRIME: A Unified Framework for Relational Structure Extraction
- I-Hung Hsu, Kuan-Hao Huang, Shuning Zhang, Wenxin Cheng, Prem Natarajan, Kai-Wei Chang, Nanyun Peng
- TLDR: We propose a sequence tagging model for relational structure extraction tasks that can be used to extract contextualized representations for a given condition.
- Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers
- Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, Alvin Cheung, Jianfeng Gao, Xia Song
- TLDR: We develop a new model, METRO-T0, which is pretrained using the redesigned ELECTRA-Style pretraining strategies and then prompt-finetuned on a mixture of NLP tasks.
- BITE: Textual Backdoor Attacks with Iterative Trigger Injection
- Jun Yan, Vansh Gupta, Xiang Ren
- TLDR: We propose BITE, a backdoor attack that iteratively poisons the training data to establish strong correlations between the target label and a set of “trigger words”, which the adversary can then inject at test time to control the victim model’s predictions.
- A Crosslingual Investigation of Conceptualization in 1335 Languages
- Yihong Liu, Haotian Ye, Leonie Weissweiler, Philipp Wicke, Renhao Pei, Robert Zangenfeind, Hinrich Schütze
- TLDR: We propose Conceptualizer, a method for aligning concepts across a parallel corpus of 1,335 languages via a bipartite directed alignment graph between source language concepts and sets of target language strings.
- Exploring and Verbalizing Academic Ideas by Concept Co-occurrence
- Yi Xu, Shuqian Sheng, Bo Xue, Luoyi Fu, Xinbing Wang, Chenghu Zhou
- TLDR: We propose a framework based on concept co-occurrence for academic idea inspiration, which has been integrated into a research assistant system.
- mCLIP: Multilingual CLIP via Cross-lingual Transfer
- Guanhua Chen, Lu Hou, Yun Chen, Wenliang Dai, Lifeng Shang, Xin Jiang, Qun Liu, Jia Pan, Wenping Wang
- TLDR: We present a novel dual-stream multilingual VLP model trained by aligning the CLIP model and a Multilingual Text Encoder and a novel Triangle Cross-modal Knowledge Distillation method.
- Distantly Supervised Course Concept Extraction in MOOCs with Academic Discipline
- Mengying Lu, Yuquan Wang, Jifan Yu, Yexing Du, Lei Hou, Juanzi Li
- TLDR: We propose a three-stage framework for course concept extraction in MOOCs that uses distant supervision and self-training to reduce reliance on expert-labeled datasets.
- Extrinsic Evaluation of Machine Translation Metrics
- Nikita Moghe, Tom Sherborne, Mark Steedman, Alexandra Birch
- TLDR: We investigate how useful MT metrics are at detecting the segment-level quality of machine translation systems and show that their scores are hard to interpret.
- ExplainMeetSum: A Dataset for Explainable Meeting Summarization Aligned with Human Intent
- Hyun Kim, Minsoo Cho, Seung-Hoon Na
- TLDR: We propose a novel multiple-extractor-guided summarization algorithm based on human-aligned extractive oracles and a new explainability-aware task, E3, which aims to automatically detect all evidence sentences that support a given summary.
- A Cross-Modality Context Fusion and Semantic Refinement Network for Emotion Recognition in Conversation
- Xiaoheng Zhang, Yang Li
- TLDR: We propose a cross-modality context fusion and semantic refinement network for emotion recognition in conversation.
- CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning
- Weiqi Wang, Tianqing Fang, Baixuan Xu, Chun Yi Louis Bo, Yangqiu Song, Lei Chen
- TLDR: We propose a novel semi-supervised learning framework for commonsense reasoning that learns to infer commonsense knowledge from existing knowledge.
- The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research
- Mohamed Abdalla, Jan Philip Wahle, Terry Lima Ruas, Aurélie Névéol, Fanny Ducel, Saif Mohammad, Karen Fort
- TLDR: We explore the industry presence in the NLP community over time and show that it is growing rapidly.
- Language of Bargaining
- Mourad Heddaya, Solomon Dworkin, Chenhao Tan, Rob Voigt, Alexander Zentefis
- TLDR: We propose a novel dataset for studying how the use of language shapes bilateral bargaining and show that when subjects can talk, negotiations finish faster, the likelihood of reaching agreement rises, and the variance of prices at which subjects agree drops substantially.
- Do Question Answering Modeling Improvements Hold Across Benchmarks?
- Nelson F. Liu, Tony Lee, Robin Jia, Percy Liang
- TLDR: We measure the concurrence between 32 QA benchmarks on a set of 20 diverse modeling approaches and find that human-constructed benchmarks have high concurrence amongst themselves, even if their passage and question distributions are very different.
- VLN-Trans: Translator for the Vision and Language Navigation Agent
- Yue Zhang, Parisa Kordjamshidi
- TLDR: We present a new synthetic sub-instruction dataset and train a translator to convert the original instructions into easy-to-follow sub-instruction representations for navigation agents.
- Bridging the Gap between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models
- Qinhong Zhou, Zonghan Yang, Peng Li, Yang Liu
- TLDR: We propose a novel method to estimate logits from the decision distributions of pre-trained language models.
- Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
- Wenxuan Zhou, Sheng Zhang, Tristan Naumann, Muhao Chen, Hoifung Poon
- TLDR: We propose a novel method for low-resource RE by using contrastive learning to improve the pretraining and finetuning of RE models.
- KGA: A General Machine Unlearning Framework Based on Knowledge Gap Alignment
- Lingzhi Wang, Tong Chen, Wei Yuan, Xingshan Zeng, Kam-Fai Wong, Hongzhi Yin
- TLDR: We propose a general unlearning framework for NLP tasks that induces forgetfulness, along with several pertinent unlearning evaluation metrics.
- UniCoRN: Unified Cognitive Signal ReconstructioN bridging cognitive signals and human language
- Nuwa Xi, Sendong Zhao, Haochun Wang, Chi Liu, Bing Qin, Ting Liu
- TLDR: We propose fMRI2text, a novel open-vocabulary task aiming to bridge fMRI time series and human language.
- Dense-ATOMIC: Towards Densely-connected ATOMIC with High Knowledge Coverage and Massive Multi-hop Paths
- Xiangqing Shen, Siwei Wu, Rui Xia
- TLDR: We propose a CSKG completion method called Rel-CSKGC that predicts the relation given the head event and the tail event of a triplet, and train a CSKG completion model on existing triplets in ATOMIC.
- Shrinking Embeddings for Hyper-Relational Knowledge Graphs
- Bo Xiong, Mojtaba Nayyeri, Shirui Pan, Steffen Staab
- TLDR: We present ShrinkE, a geometric hyper-relational KG embedding method aiming to explicitly model the inference patterns of hyper-relational facts.
- CTC-based Non-autoregressive Speech Translation
- Chen Xu, Xiaoqian Liu, Xiaowen Liu, Qingxuan Sun, Yuhao Zhang, Murun Yang, Qianqian Dong, Tom Ko, Mingxuan Wang, Tong Xiao, Anxiang Ma, Jingbo Zhu
- TLDR: We propose a prediction-aware encoding approach for non-autoregressive speech translation and a curriculum learning approach for training it.
- Attention as a Guide for Simultaneous Speech Translation
- Sara Papi, Matteo Negri, Marco Turchi
- TLDR: We propose EDAtt (Encoder-Decoder Attention), an adaptive policy that exploits the attention patterns between audio source and target textual translation to guide an offline-trained ST model during simultaneous inference.
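For intuition, here is a toy sketch of an attention-based read/write policy in the spirit of EDAtt; the attention values, window size, and threshold are illustrative assumptions, not the paper's settings.

```python
# Toy read/write policy: if the decoder's attention mass over the most recent
# audio frames is high, the next token may depend on audio not yet received,
# so the model waits instead of emitting.
import numpy as np

attn = np.array([0.05, 0.10, 0.15, 0.30, 0.40])  # attention over audio frames
LAST_FRAMES, THRESHOLD = 2, 0.6  # illustrative values

recent_mass = attn[-LAST_FRAMES:].sum()
action = "WAIT for more audio" if recent_mass > THRESHOLD else "EMIT token"
print(recent_mass, action)  # 0.7 -> WAIT for more audio
```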
- On Complementarity Objectives for Hybrid Retrieval
- Dohyeon Lee, Seung-won Hwang, Kyungjae Lee, Seungtaek Choi, Sunghyun Park
- TLDR: We propose a new objective, RoC, which captures a fuller notion of complementarity, and show that the improved RoC of our model, in turn, improves the performance of hybrid retrieval.
- C-STANCE: A Large Dataset for Chinese Zero-Shot Stance Detection
- Chenye Zhao, Yingjie Li, Cornelia Caragea
- TLDR: We present C-STANCE, the first Chinese dataset for zero-shot stance detection.
- Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
- Haoli Bai, Zhiguang Liu, Xiaojun Meng, Li Wentao, Shuang Liu, Yifeng Luo, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu
- TLDR: We propose Wukong-Reader, a novel pre-training objective for document textline understanding, which leverages the structural knowledge nested in document textlines.
- PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts
- Yunshui Li, Binyuan Hui, ZhiChao Yin, Min Yang, Fei Huang, Yongbin Li
- TLDR: We propose a unified, structured, compositional multi-modal dialogue pre-training framework that achieves state-of-the-art results on eight multi-modal dialogue benchmarks.
- MVP-Tuning: Multi-View Knowledge Retrieval with Prompt Tuning for Commonsense Reasoning
- Yongfeng Huang, Yanyang Li, Yichong Xu, Lin Zhang, Ruyi Gan, Jiaxing Zhang, Liwei Wang
- TLDR: We propose MultiView Knowledge Retrieval with Prompt Tuning (MVP-Tuning) for commonsense reasoning tasks.
- PEIT: Bridging the Modality Gap with Pre-trained Models for End-to-End Image Translation
- Shaolin Zhu, Shangjie Li, Yikun Lei, Deyi Xiong
- TLDR: We propose PEIT, an end-to-end image translation framework that bridges the modality gap with pre-trained models.
- Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection
- Erik Arakelyan, Arnav Arora, Isabelle Augenstein
- TLDR: We propose Topic Efficient StancE Detection (TESTED), which uses topic-guided sampling to produce a data-efficient multi-domain training set and a contrastive objective for fine-tuning a stance classifier on that set.
- DiSCoMaT: Distantly Supervised Composition Extraction from Tables in Materials Science Articles
- Tanishq Gupta, Mohd Zaki, Devanshi Khatsuriya, Kausik Hira, N M Anoop Krishnan, Mausam -
- TLDR: We present a novel NLP task for extracting compositions of materials (e.g., glasses) from scientific tables in materials science papers.
- Self-Instruct: Aligning Language Models with Self-Generated Instructions
- Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
- TLDR: We present a novel method for tuning language models by bootstrapping off their own generations to learn to follow instructions.
- Disentangled Phonetic Representation for Chinese Spelling Correction
- Zihong Liang, Xiaojun Quan, Qifan Wang
- TLDR: We propose a novel method to learn useful phonetic representations for Chinese spelling correction using only phonetic information.
- Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis
- Ta-Chung Chi, Ting-Han Fan, Alexander Rudnicky, Peter Ramadge
- TLDR: We dissect Transformer length extrapolation via receptive field analysis and propose a positional treatment that preserves perplexity when models are tested on long sequences.
- CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models
- Jiaxu Zhao, Meng Fang, Zijing Shi, Yitong Li, Ling Chen, Mykola Pechenizkiy
- TLDR: We present CHBias, a new dataset for evaluating and mitigating social biases in Chinese conversational language models.
- Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback
- Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston
- TLDR: We study how to improve the skills of open-domain, internet-driven dialogue systems by collecting feedback from humans during deployment.
- Uncovering and Categorizing Social Biases in Text-to-SQL
- Yan Liu, Yan Gao, Zhe Su, Xiaokang Chen, Elliott Ash, Jian-Guang Lou
- TLDR: We propose a method to assess and mitigate social bias in Text-to-SQL models.
- On the Compositional Generalization in Versatile Open-domain Dialogue
- Tingchen Fu, Xueliang Zhao, Lemao Liu, Rui Yan
- TLDR: We propose a sparsely activated modular architecture and multi-task learning for dialogue generation and demonstrate its effectiveness on 9 datasets.
- What is the Real Intention behind this Question? Dataset Collection and Intention Classification
- Maryam Sadat Mirzaei, Kourosh Meshgi, Satoshi Sekine
- TLDR: We present a dataset that captures questions with positive/neutral and negative intentions and propose a classification method for the underlying intention categories.
- Conjunct Resolution in the Face of Verbal Omissions
- Royi Rassin, Yoav Goldberg, Reut Tsarfaty
- TLDR: We propose a conjunct resolution task and method for recovering elements omitted from coordinated sentences due to verbal omissions.
- Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts
- Mounica Maddela, Megan Ung, Jing Xu, Andrea Madotto, Heather Foran, Y-Lan Boureau
- TLDR: We present a novel dataset of unhelpful thought patterns and a novel method for generating positive reframes.
- Learning In-context Learning for Named Entity Recognition
- Jiawei Chen, Yaojie Lu, Hongyu Lin, Jie Lou, Wei Jia, Dai Dai, Hua Wu, Boxi Cao, Xianpei Han, Le Sun
- TLDR: We propose an in-context learning-based NER approach, which can effectively inject in-context NER ability into PLMs and recognize entities of novel types on-the-fly using only a few demonstrative instances.
- Holistic Prediction on a Time-Evolving Attributed Graph
- Shohei Yamasaki, Yuya Sasaki, Panagiotis Karras, Makoto Onizuka
- TLDR: We propose a unified framework that predicts node attributes and topology changes such as the appearance and disappearance of links and the emergence and loss of nodes.
- Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field
- Zixia Jia, Zhaohui Yan, Wenjuan Han, Zilong Zheng, Kewei Tu
- TLDR: We propose a novel approach to model cross-instance information extraction by using binary factors and ternary factors to directly model interactions between not only a pair of instances but also triplets.
- Training Trajectories of Language Models Across Scales
- Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Veselin Stoyanov
- TLDR: We show that perplexity is a strong predictor of in-context learning performance on 74 multiple-choice tasks from BIG-Bench, and this holds independent of the model size.
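The quantity correlated here with in-context learning is ordinary held-out perplexity; a minimal sketch of computing it for a causal LM follows (the model choice is a placeholder).

```python
# Minimal sketch: per-token perplexity of a causal LM on held-out text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Language models are trained to predict the next token."
ids = tok(text, return_tensors="pt")["input_ids"]

with torch.no_grad():
    # With labels=ids, the model returns the mean cross-entropy over
    # next-token predictions; perplexity is its exponential.
    loss = model(ids, labels=ids).loss
print("perplexity:", torch.exp(loss).item())
```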
- A Diverse Set of Freely Available Linguistic Resources for Turkish
- Duygu Altinok
- TLDR: We present a diverse set of freely available linguistic resources for Turkish natural language processing, including corpora, pretrained models and education material.
- Measuring Consistency in Text-based Financial Forecasting Models
- Linyi Yang, Yingpeng Ma, Yue Zhang
- TLDR: We show that the consistency of state-of-the-art NLP models for financial forecasting is poor.
- Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation
- Nuno M. Guerreiro, Pierre Colombo, Pablo Piantanida, André Martins
- TLDR: We propose a novel, unsupervised, plug-in detector for hallucinations in neural machine translation.
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank
- Jiduan Liu, Jiahao Liu, Qifan Wang, Jingang Wang, Wei Wu, Yunsen Xian, Dongyan Zhao, Kai Chen, Rui Yan
- TLDR: We propose a novel approach, RankCSE, for unsupervised sentence representation learning, which incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
- Entailment as Robust Self-Learner
- Jiaxin Ge, Hongyin Luo, Yoon Kim, James Glass
- TLDR: We propose a prompting strategy for entailment training that improves the zero-shot adaptation of pretrained entailment models, together with a robust pseudo-labeling algorithm for self-training.
- ReCode: Robustness Evaluation of Code Generation Models
- Shiqi Wang, Zheng Li, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang
- TLDR: We propose ReCode, a comprehensive robustness evaluation benchmark for code generation models.
- EPIC: Multi-Perspective Annotation of a Corpus of Irony
- Simona Frenda, Alessandro Pedrani, Valerio Basile, Soda Marem Lo, Alessandra Teresa Cignarella, Raffaella Panizzon, Cristina Marco, Bianca Scarlini, Viviana Patti, Cristina Bosco, Davide Bernardi
- TLDR: We present EPIC (English Perspectivist Irony Corpus), the first annotated corpus for irony analysis based on the principles of data perspectivism.
- Dialogue Summarization with Static-Dynamic Structure Fusion Graph
- Shen Gao, Xin Cheng, Mingzhe Li, Xiuying Chen, Jinpeng Li, Dongyan Zhao, Rui Yan
- TLDR: We propose a dynamic graph-based dialogue summarization model that adaptively learns the graph structure in an end-to-end learning fashion.
- Large-Scale Correlation Analysis of Automated Metrics for Topic Models
- Jia Peng Lim, Hady Lauw
- TLDR: We conduct a large-scale correlation analysis of coherence metrics and human judgement.
- U-CREAT: Unsupervised Case Retrieval using Events extrAcTion
- Abhinav Joshi, Akshat Sharma, Sai Kiran Tanikella, Ashutosh Modi
- TLDR: We propose an unsupervised retrieval method-based pipeline U-CREAT for legal case retrieval and show state-of-the-art performance on the IL-PCR and COLIEE corpora.
- ArgAnalysis35K : A large-scale dataset for Argument Quality Analysis
- Omkar Joshi, Priya Pitre, Yashodhara Haribhakta
- TLDR: We present a new dataset for argument quality analysis and a new method for scoring the relevance of arguments.
- Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework
- Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai
- TLDR: We propose a new evaluation framework for dialogue summarization that automatically evaluates the performance of FEC models on different factual error categories.
- Minding Language Models’ (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker
- Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, Yulia Tsvetkov
- TLDR: We propose SymbolicToM, a plug-and-play algorithm that enhances theory of mind of off-the-shelf neural language models without explicit supervision.
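A toy illustration of the explicit multi-character belief tracking that SymbolicToM performs (this is not the authors' implementation): each character's world state is updated only by events they witness, so false beliefs emerge naturally.

```python
# Toy belief tracker: each character keeps their own world state, updated only
# by events they witness, so questions can be answered from the right perspective.
events = [
    ("Sally puts the ball in the basket", {"ball": "basket"}, {"Sally", "Anne"}),
    ("Anne moves the ball to the box",    {"ball": "box"},    {"Anne"}),  # Sally absent
]

beliefs = {"Sally": {}, "Anne": {}}
for _description, state_change, witnesses in events:
    for character in witnesses:
        beliefs[character].update(state_change)

print(beliefs["Sally"]["ball"])  # -> "basket": Sally's (false) belief
print(beliefs["Anne"]["ball"])   # -> "box"
```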
- Don’t Retrain, Just Rewrite: Countering Adversarial Perturbations by Rewriting Text
- Ashim Gupta, Carter Blum, Temma Choji, Yingjie Fei, Shalin Shah, Alakananda Vempala, Vivek Srikumar
- TLDR: We present ATINTER, a language model that intercepts and learns to rewrite adversarial inputs to make them non-adversarial for a downstream text classifier.
- Aggregating Multiple Heuristic Signals as Supervision for Unsupervised Automated Essay Scoring
- Cong Wang, Zhiwei Jiang, Yafeng Yin, Zifeng Cheng, Shiping Ge, Qing Gu
- TLDR: We propose ULRA, a novel unsupervised AES approach that does not require ground-truth essay scores for training.
- Mitigating Label Biases for In-context Learning
- Yu Fei, Yifan Hou, Zeming Chen, Antoine Bosselut
- TLDR: We propose a novel bias calibration method for language models that estimates a language model’s label bias using random in-domain words from the task corpus.
- QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations
- Chaitanya Malaviya, Peter Shaw, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
- TLDR: We present a dataset of natural language queries with implicit set operations that map to a set of entities corresponding to Wikipedia documents.
- Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
- Yujie Wang, Hu Zhang, Jiye Liang, Ru Li
- TLDR: We propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representations of the HKG using a two-stage pruning strategy and knowledge representation learning.
- Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation
- Hao Li, Viktor Schlegel, Riza Batista-Navarro, Goran Nenadic
- TLDR: We propose a two-step abstractive summarisation framework based on neural topic modelling with an iterative clustering procedure, to generate key points which are aligned with how humans identify key points.
- Ambiguous Learning from Retrieval: Towards Zero-shot Semantic Parsing
- Shan Wu, Chunlei Xin, Hongyu Lin, Xianpei Han, Cao Liu, Jiansong Chen, Fan Yang, Guanglu Wan, Le Sun
- TLDR: We propose a zero-shot semantic parsing method that learns from the ambiguous supervision provided by retrieval, using pretrained language models.
- Explicit Syntactic Guidance for Neural Text Generation
- Yafu Li, Leyang Cui, Jianhao Yan, Yongjing Yin, Wei Bi, Shuming Shi, Yue Zhang
- TLDR: We propose a syntax-guided generation schema, which generates the sequence guided by a constituency parse tree in a top-down direction.
- What does a Text Classifier Learn about Morality? An Explainable Method for Cross-Domain Comparison of Moral Rhetoric
- Enrico Liscio, Oscar Araque, Lorenzo Gatti, Ionut Constantinescu, Catholijn Jonker, Kyriaki Kalimeri, Pradeep Kumar Murukannaiah
- TLDR: We propose Tomea, a method to compare a supervised classifier’s representation of moral rhetoric across domains.
- Graph-based Relation Mining for Context-free Out-of-vocabulary Word Embedding Learning
- Ziran Liang, Yuyin Lu, HeGang Chen, Yanghui Rao
- TLDR: We propose a novel graph-based relation mining method for OOV word embedding learning.
- Multimodal Persona Based Generation of Comic Dialogs
- Harsh Agrawal, Aditya Mishra, Manish Gupta, Mausam -
- TLDR: We propose a multimodal persona based architecture for dialogue generation in comic strips and show that it can significantly improve the perplexity score of existing dialogue generation models.
- LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
- Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- TLDR: We present LLM-Blender, an ensembling framework designed to attain consistently superior performance by leveraging the diverse strengths of multiple open-source large language models (LLMs).
- Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
- Weihao Zeng, Lulu Zhao, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu
- TLDR: We propose a prompt-based disentangled controllable dialogue generation model for compositional generalization to unseen multi-attribute combinations.
- Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
- Minghang Zheng, Shaogang Gong, Hailin Jin, Yuxin Peng, Yang Liu
- TLDR: We propose a structure-based Pseudo Label generation method for video sentence localization in a zero-shot setting, which learns with only video data without any annotation.
- IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages
- Ananya Sai B, Tanay Dixit, Vignesh Nagarajan, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra, Raj Dabre
- TLDR: Meta-evaluation of machine translation metrics for Indian languages.
- Weaker Than You Think: A Critical Look at Weakly Supervised Learning
- Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich Klakow
- TLDR: We show that the benefits of weakly supervised learning are significantly overestimated and that the available clean validation samples are not sufficient to train robust models.
- Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases
- Yingji Li, Mengnan Du, Xin Wang, Ying Wang
- TLDR: We propose a novel adversarial training-inspired two-stage debiasing model using Contrastive learning with Continuous Prompt Augmentation to mitigate social biases in PLMs’ encoding.
- Towards Understanding Omission in Dialogue Summarization
- Yicheng Zou, Kaitao Song, Xu Tan, Zhongkai Fu, Qi Zhang, Dongsheng Li, Tao Gui
- TLDR: We propose a new dataset for dialogue summarization that provides high-quality omission labels for dialogue summaries and introduce an omission detection task for dialogue summarization.
- Python Code Generation by Asking Clarification Questions
- Haau-Sing (Xiaocheng) Li, Mohsen Mesgar, André Martins, Iryna Gurevych
- TLDR: We propose a novel and realistic dataset for code generation from text in which under-specified natural language descriptions are resolved by asking clarification questions.
- A Compare-and-contrast Multistage Pipeline for Uncovering Financial Signals in Financial Reports
- Jia-Huei Ju, Yu-Shiang Huang, Cheng-Wei Lin, Che Lin, Chuan-Ju Wang
- TLDR: We propose a novel compare-and-contrast multistage pipeline for identifying financial signals in narrative financial reports.
- Improving the robustness of NLI models with minimax training
- Michalis Korakakis, Andreas Vlachos
- TLDR: We present a training method to reduce the reliance of NLI models on shortcuts and improve their out-of-distribution performance without assuming prior knowledge of the shortcuts being targeted.
- USSA: A Unified Table Filling Scheme for Structured Sentiment Analysis
- Zepeng Zhai, Hao Chen, Ruifan Li, Xiaojie Wang
- TLDR: We propose a novel 2D table-filling scheme for structured sentiment analysis that addresses the kernel bottleneck of previous SSA methods by utilizing 13 different types of relations.
- PAD-Net: An Efficient Framework for Dynamic Networks
- Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, Dacheng Tao
- TLDR: We propose a partially dynamic network, namely PAD-Net, to transform the redundant dynamic parameters into static ones.
- Resolving Ambiguities in Text-to-Image Generative Models
- Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta
- TLDR: We study ambiguities that arise in text-to-image generative models and propose a framework to disambiguate them.
- Knowledge Unlearning for Mitigating Privacy Risks in Language Models
- Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
- TLDR: We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc.
- Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
- Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- TLDR: We present a large dataset of creative and diverse instructions, collected with virtually no human labor; models tuned on it rival those trained on human-curated instruction datasets.
- To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering
- Dheeru Dua, Emma Strubell, Sameer Singh, Pat Verga
- TLDR: We propose a new evaluation set for domain shift evaluation and show that the type of shift in a target dataset is predictive of which data augmentation schemes will be effective for domain adaptation.
- A Survey for Efficient Open Domain Question Answering
- Qin Zhang, Shangsi Chen, Dongkuan Xu, Qingqing Cao, Xiaojun Chen, Trevor Cohn, Meng Fang
- TLDR: We survey the state of the art in efficient open-domain question answering and provide a quantitative analysis of memory cost, query speed, accuracy, and overall performance.
- Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities
- Sina Ahmadi, Antonios Anastasopoulos
- TLDR: Synthetic data for Perso-Arabic scripts can be used to improve the performance of downstream tasks such as machine translation and language identification.
- Compositional Generalization without Trees using Multiset Tagging and Latent Permutations
- Matthias Lindemann, Alexander Koller, Ivan Titov
- TLDR: We propose a tree-free approach to compositional generalization that tags input tokens with multisets of output tokens and orders them via latent permutations predicted with regularized linear programs.
- ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
- Xiao Xu, Bei Li, Chenfei Wu, Shao-Yen Tseng, Anahita Bhiwandiwalla, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan
- TLDR: We propose ManagerTower, a novel VL model architecture that gathers and combines the insights of pre-trained uni-modal experts at different levels.
- Finding the Pillars of Strength for Multi-Head Attention
- Jinjie Ni, Rui Mao, Zonglin Yang, Han Lei, Erik Cambria
- TLDR: We propose a new method for improving Multi-Head Attention by focusing on the most representative and distinctive features with minimum resources.
- Jointprop: Joint Semi-supervised Learning for Entity and Relation Extraction with Heterogeneous Graph-based Propagation
- Yandan Zheng, Anran Hao, Anh Tuan Luu
- TLDR: We propose Jointprop, a novel semi-supervised entity and relation extraction framework that captures the global structure information between individual tasks and exploits interactions within unlabeled data.
- Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering
- Jiajie Zhang, Shulin Cao, Tingjian Zhang, Xin Lv, Juanzi Li, Lei Hou, Jiaxin Shi, Qi Tian
- TLDR: We propose a two-stage XQA framework for explainable question answering that performs probabilistic reasoning over a hierarchical question decomposition tree.
- Faking Fake News for Real Fake News Detection: Propaganda-Loaded Training Data Generation
- Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, Heng Ji
- TLDR: We propose a framework for generating propaganda-loaded training data; fake news detectors trained on the resulting PropaNews dataset are better at detecting human-written disinformation.
- A Length-Extrapolatable Transformer
- Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei
- TLDR: We study length extrapolation, i.e., training on short texts while evaluating longer sequences, and propose a Transformer that extrapolates well to longer lengths.
- A Survey of Deep Learning for Mathematical Reasoning
- Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, Kai-Wei Chang
- TLDR: We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
- A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
- Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor
- TLDR: We study the potential of knowledge distillation techniques for NLG and propose a family of Pseudo-Target augmentation methods for task-specific KD.
- Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation
- Chaoya Jiang, Wei Ye, Haiyang Xu, Songfang Huang, Fei Huang, Shikun Zhang
- TLDR: We propose a novel Semantic-Aware Contrastive Learning framework for cross-modal contrastive learning and show that false negative samples in the cross-modal contrastive loss can decrease the downstream task performance of VLP models.
- Tell2Design: A Dataset for Language-Guided Floor Plan Generation
- Sicong Leng, Yang Zhou, Mohammed Haroon Dupty, Wee Sun Lee, Sam Joyce, Wei Lu
- TLDR: We propose a Sequence-to-Sequence model for language-guided design generation and benchmark it against several text-conditional image generation models.
- Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations
- Bingsheng Yao, Prithviraj Sen, Lucian Popa, James Hendler, Dakuo Wang
- TLDR: We propose a new metric that objectively evaluates the quality of human-annotated explanations where Simulatability falls short.
- Rethinking Annotation: Can Language Learners Contribute?
- Haneul Yoo, Rifki Afina Putri, Changyoon Lee, Youngin Lee, So-Yeon Ahn, Dongyeop Kang, Alice Oh
- TLDR: Language learners can contribute annotations to benchmark datasets for NLP tasks, and annotating can even improve their language proficiency in terms of vocabulary and grammar.
- Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling
- Shengqiong Wu, Hao Fei, Yixin Cao, Lidong Bing, Tat-Seng Chua
- TLDR: We propose a novel framework that simultaneously implements the idea of internal-information screening and external-information exploiting for multimodal relation extraction.
- MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations
- Tao Shi, Shao-Lun Huang
- TLDR: We propose a novel attention-based correlation-aware multimodal fusion framework for emotion recognition in conversations that captures cross-modal mapping relationships across textual, audio and visual modalities based on bidirectional multi-head cross-attention layers.
- Learning Language-Specific Layers for Multilingual Machine Translation
- Telmo Pires, Robin Schmidt, Yi-Hsiu Liao, Stephan Peitz
- TLDR: Language-Specific Transformer Layers for Multilingual Machine Translation.
- Personality Understanding of Fictional Characters during Book Reading
- Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie Zhou
- TLDR: We present a novel dataset for character annotation that mimics the process of reading a novel and show how it can help to improve the accuracy of NLP models.
- StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing
- Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu
- TLDR: We propose a novel generation model for non-parallel text style transfer and content preservation.
- Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models
- Qingyu Tan, Hwee Tou Ng, Lidong Bing
- TLDR: We present a new temporal reasoning dataset for large language models and propose a novel learning framework to improve the temporal reasoning capability of large language models.
- Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings
- Daniel Rotem, Michael Hassid, Jonathan Mamou, Roy Schwartz
- TLDR: We propose SWEET (Separating Weights for Early-Exit Transformers) an adaptive inference method that assigns each classifier its own set of unique model weights, not updated by other classifiers.
- Large Language Models Are Reasoning Teachers
- Namgyu Ho, Laura Schmid, Se-Young Yun
- TLDR: We use large teacher models as reasoning teachers to enable complex reasoning in smaller models and reduce model size requirements by several orders of magnitude.
- Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations
- Wenting Zhao, Justin Chiu, Claire Cardie, Alexander Rush
- TLDR: We propose a novel approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context.
- PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification
- Yau-Shian Wang, Ta-Chung Chi, Ruohong Zhang, Yiming Yang
- TLDR: We present PESCO, a novel contrastive learning framework that substantially improves the performance of zero-shot text classification.
- Visually-augmented pretrained language models for NLP tasks without images
- Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Qinyu Zhang, Ji-Rong Wen
- TLDR: We propose a novel visual knowledge augmentation method for pre-trained language models that can be applied to various PLMs or NLP tasks, without using any retrieved or generated images.
- Using counterfactual contrast to improve compositional generalization for multi-step quantitative reasoning
- Armineh Nourbakhsh, Sameena Shah, Carolyn Rosé
- TLDR: We propose CounterComp, a method that uses counterfactual scenarios to generate samples with compositional contrast.
- A Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization
- Lining Zhang, Simon Mille, Yufang Hou, Daniel Deutsch, Elizabeth Clark, Yixin Liu, Saad Mahamood, Sebastian Gehrmann, Miruna Clinciu, Khyathi Raghavi Chandu, João Sedoc
- TLDR: We show that we can successfully filter out subpar workers before they carry out the evaluations and obtain high-agreement annotations with similar constraints on resources.
- TAVT: Towards Transferable Audio-Visual Text Generation
- Wang Lin, Tao Jin, Wenwen Pan, Linjun Li, Xize Cheng, Ye Wang, Zhou Zhao
- TLDR: We propose a novel transfer learning framework for audio-visual text generation based on audio-visual correlation and destructive counterfactual transformations.
- MeetingQA: Extractive Question-Answering on Meeting Transcripts
- Archiki Prasad, Trung Bui, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Mohit Bansal
- TLDR: We present a new QA dataset for meeting transcripts that can be used to build interactive interfaces on top of long transcripts.
- FERMAT: An Alternative to Accuracy for Numerical Reasoning
- Jasivan Sivakumar, Nafise Sadat Moosavi
- TLDR: We introduce a multi-view evaluation set for numerical reasoning in English, called FERMAT.
- Don’t Forget Your ABC’s: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems
- Sarah E. Finch, James D. Finch, Jinho D. Choi
- TLDR: We present a novel human evaluation method for dialogue systems that can measure many dialogue system behaviors.
- Decoder Tuning: Efficient Language Understanding as Decoding
- Ganqu Cui, Wentao Li, Ning Ding, Longtao Huang, Zhiyuan Liu, Maosong Sun
- TLDR: We propose a novel algorithm for input-side adaptation of pre-trained models with model parameters frozen.
- The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources
- Akshatha Arodi, Martin Pömsl, Kaheer Suleman, Adam Trischler, Alexandra Olteanu, Jackie Chi Kit Cheung
- TLDR: We present a suite of coreference resolution subtasks that require reasoning over multiple facts and show that even the best models struggle to integrate knowledge presented only at inference time.
- CREST: A Joint Framework for Rationalization and Counterfactual Text Generation
- Marcos Treviso, Alexis Ross, Nuno M. Guerreiro, André Martins
- TLDR: We introduce CREST (ContRastive Edits with Sparse raTionalization), a joint framework for selective rationalization and counterfactual text generation, and show that this framework leads to improvements in counterfactual quality, model robustness, and interpretability.
- Towards Unifying Multi-Lingual and Cross-Lingual Summarization
- Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou
- TLDR: We unify multi-lingual and cross-lingual summarization into a single task and propose a model that transfers task knowledge across languages, outperforming existing multilingual summarizers.
- On Improving Summarization Factual Consistency from Natural Language Feedback
- Yixin Liu, Budhaditya Deb, Milagro Teruel, Aaron Halfaker, Dragomir Radev, Ahmed Hassan Awadallah
- TLDR: We present a high-quality dataset of natural language feedback for improving factual consistency in summarization and provide insights into summarization factual errors.
- From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
- Julia Mendelsohn, Ronan Le Bras, Yejin Choi, Maarten Sap
- TLDR: We present the first large-scale computational investigation of dogwhistles and show that they are used to evade both political repercussions and algorithmic content moderation.
- Exploring Large Language Models for Classical Philology
- Frederick Riemenschneider, Anette Frank
- TLDR: We study the strengths of language models for Ancient Greek and Latin and provide benchmarking analysis of existing models.
- LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
- Yi Tu, Ya Guo, Huan Chen, Jinyang Tang
- TLDR: We propose a novel multi-modal pre-training model for visually-rich document understanding that improves the interactions between text and layout modalities in a unified model and produces adaptive and robust multi-modal document representations for downstream tasks.
- Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
- Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiu-Shi Zhu, Eng Siong Chng
- TLDR: We propose a universal viseme-phoneme mapping (UniVPM) approach to improve robustness of audio-visual speech recognition with visual information.
- An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation
- Xuancheng Huang, Zijun Liu, Peng Li, Tao Li, Maosong Sun, Yang Liu
- TLDR: We propose an extensible plug-and-play method for multi-aspect controllable text generation that uses trainable gates to normalize the intervention of prefixes, restraining the interference that grows with the number of aspects.
- Double-Branch Multi-Attention based Graph Neural Network for Knowledge Graph Completion
- Hongcai Xu, Junpeng Bao, Wenbo Liu
- TLDR: We propose a graph attention network-based local aggregator and a snowball local attention mechanism for knowledge graph completion.
- Dual Cache for Long Document Neural Coreference Resolution
- Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
- TLDR: We propose a new hybrid cache that captures global and local entities separately, effectively reducing aggregated cache misses by up to half while improving coreference F1 by 0.7pt.
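A toy sketch of the dual-cache idea under illustrative assumptions (not the paper's implementation): one bounded cache holds frequently mentioned global entities, another holds recently mentioned local entities, with frequency- and recency-based eviction respectively.

```python
# Toy dual cache: a frequency-evicted "global" cache plus an LRU "local" cache.
from collections import Counter, OrderedDict

GLOBAL_SIZE, LOCAL_SIZE = 2, 2  # illustrative capacities
freq = Counter()
global_cache, local_cache = {}, OrderedDict()

def observe(entity: str):
    freq[entity] += 1
    # Local cache: most recently mentioned entities (LRU eviction).
    local_cache[entity] = True
    local_cache.move_to_end(entity)
    if len(local_cache) > LOCAL_SIZE:
        local_cache.popitem(last=False)
    # Global cache: most frequently mentioned entities.
    global_cache[entity] = freq[entity]
    if len(global_cache) > GLOBAL_SIZE:
        del global_cache[min(global_cache, key=global_cache.get)]

for mention in ["Alice", "Bob", "Alice", "Carol", "Alice", "Dan"]:
    observe(mention)
print(sorted(global_cache), list(local_cache))
```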
- Knowledge Transfer in Incremental Learning for Multilingual Neural Machine Translation
- Kaiyu Huang, Peng Li, Jin Ma, Ting Yao, Yang Liu
- TLDR: We propose a knowledge transfer method that can efficiently adapt original MNMT models to diverse incremental language pairs while maintaining performance on the original language pairs.
- DisorBERT: A Double Domain Adaptation Model for Detecting Signs of Mental Disorders in Social Media
- Mario Aragon, Adrian Pastor Lopez Monroy, Luis Gonzalez, David E. Losada, Manuel Montes
- TLDR: We propose a double-domain adaptation of a language model to detect signs of mental disorders on social media.
- Toward Interactive Dictation
- Belinda Z. Li, Jason Eisner, Adam Pauls, Sam Thomson
- TLDR: We study the feasibility of allowing users to interrupt their dictation with spoken editing commands in open-ended natural language.
- CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors
- Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu
- TLDR: We propose to use code-based prompts to generate structured output for information extraction tasks and demonstrate its effectiveness on few-shot learning.
- Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning
- Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song
- TLDR: We propose a novel method for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications.
- Bridging The Gap: Entailment Fused-T5 for Open-retrieval Conversational Machine Reading Comprehension
- Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao
- TLDR: We propose a novel one-stage end-to-end framework, called Entailment Fused-T5 (EFT), to bridge the information gap between decision-making and question generation in a global understanding manner.
- LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming
- Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang
- TLDR: We present a new dataset for open-domain dialogue systems and propose retrieval-based baselines for response modeling and addressee recognition.
- Prompting PaLM for Translation: Assessing Strategies and Performance
- David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh Ratnakar, George Foster
- TLDR: We assess prompting strategies for PaLM on machine translation and provide an analysis of its performance relative to the state of the art.
- Exploring Lottery Prompts for Pre-trained Language Models
- Yulin Chen, Ning Ding, Xiaobin Wang, Shengding Hu, Haitao Zheng, Zhiyuan Liu, Pengjun Xie
- TLDR: We explore instance-level prompts for pre-trained language models and their generalizability.
- A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations
- Wenjie Zheng, Jianfei Yu, Rui Xia, Shijin Wang
- TLDR: We propose a facial expression-aware multimodal multi-task learning framework for emotion recognition in multi-party conversations.
- TeAST: Temporal Knowledge Graph Embedding via Archimedean Spiral Timeline
- Jiang Li, Xiangdong Su, Guanglai Gao
- TLDR: We propose a novel TKGE model that encodes temporal information via an Archimedean spiral timeline and provides interpretability.
- Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition
- Yuwei Bao, Barrett Lattimer, Joyce Chai
- TLDR: We present a computational process for word acquisition that learns to filter out and extract common information for each shared linguistic label.
- Conjunct Lengths in English, Dependency Length Minimization, and Dependency Structure of Coordination
- Adam Przepiórkowski, Michał Woźniak
- TLDR: We show that, in English binary coordinations, left conjuncts tend to be shorter than right conjuncts, regardless of the position of the governor of the coordination.
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development
- Ilias Chalkidis, Nicolas Garneau, Catalina Goanta, Daniel Katz, Anders Søgaard
- TLDR: We analyze the interplay between the upstream, probing, and downstream performance of legal-oriented pre-trained language models and show that probing performance strongly correlates with upstream and downstream performance.
- Revisiting Commonsense Reasoning in Machine Translation: Training, Evaluation and Challenge
- Xuebo Liu, Yutong Wang, Derek F. Wong, Runzhe Zhan, Liangxuan Yu, Min Zhang
- TLDR: We present a comprehensive study aimed at expanding the understanding of commonsense reasoning in neural machine translation and propose a novel entity-aware evaluation method for evaluating NMT models with high CR abilities.
- NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
- Kai Mei, Zheng Li, Zhenting Wang, Yang Zhang, Shiqing Ma
- TLDR: We propose transferable backdoor attacks against prompt-based models, called NOTABLE, which is independent of downstream tasks and prompting strategies.
- Revisiting Relation Extraction in the era of Large Language Models
- Somin Wadhwa, Silvio Amir, Byron Wallace
- TLDR: We evaluate GPT-3 and Flan-T5 on relation extraction tasks under varying levels of supervision and show that GPT-3 is capable of near-SOTA performance, while Flan-T5 is not as capable.
- Pre-trained Language Models Can be Fully Zero-Shot Learners
- Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, Lei Li
- TLDR: We propose a novel nonparametric prompting method that enables PLMs to perform fully zero-shot language understanding.
- Can Large Language Models Be an Alternative to Human Evaluations?
- Cheng-Han Chiang, Hung-yi Lee
- TLDR: We use large language models to evaluate the quality of texts in unseen tasks.
- HyperMixer: An MLP-based Low Cost Alternative to Transformers
- Florian Mai, Arnaud Pannatier, Fabio Fehr, Haolin Chen, Francois Marelli, Francois Fleuret, James Henderson
- TLDR: We propose HyperMixer, a simple MLP-based architecture for token mixing that performs on par with Transformers at substantially lower cost.
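A minimal sketch of MLP-based token mixing with input-dependent weights, roughly in the spirit of HyperMixer; the hypernetwork shapes and dimensions are illustrative assumptions rather than the exact architecture.

```python
# Token mixing whose weights are generated from the tokens themselves, so the
# layer handles variable-length inputs without fixed positional mixing weights.
import torch
import torch.nn as nn

class HyperTokenMixer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        # Position-wise hypernetworks producing the mixing weights.
        self.hyper_in = nn.Linear(d_model, d_hidden)
        self.hyper_out = nn.Linear(d_model, d_hidden)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        w_in = self.hyper_in(x)    # (batch, seq_len, d_hidden)
        w_out = self.hyper_out(x)  # (batch, seq_len, d_hidden)
        # Mix information across the sequence dimension:
        mixed = self.act(w_in.transpose(1, 2) @ x)   # (batch, d_hidden, d_model)
        return w_out @ mixed                         # (batch, seq_len, d_model)

layer = HyperTokenMixer(d_model=64, d_hidden=32)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```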
- UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
- Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino
- TLDR: We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and predicts discrete acoustic units subsequently.
- Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
- Wen Wu, Chao Zhang, Philip Woodland
- TLDR: We propose a Bayesian approach to estimate the uncertainty in emotion attribute labels of utterances and show state-of-the-art results on the mean values and distribution of emotion attributes.
- Annotation-Inspired Implicit Discourse Relation Classification with Auxiliary Discourse Connective Generation
- Wei Liu, Michael Strube
- TLDR: We propose a novel neural model to generate discourse connectives for implicit discourse relation classification.
- Plug-and-Play Document Modules for Pre-trained Models
- Chaojun Xiao, Zhengyan Zhang, Xu Han, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Xiangyang Li, Zhonghua Li, Zhao Cao, Maosong Sun
- TLDR: We propose to encode documents for PTMs in a plug-and-play module for downstream tasks, which is more efficient than conventional encoding-task coupling methods.
- An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models
- Zhongbin Xie, Thomas Lukasiewicz
- TLDR: Parameter-efficient methods in combination with counterfactual data augmentation for bias mitigation.
- Two-Stage Fine-Tuning for Improved Bias and Variance for Large Pretrained Language Models
- Lijing Wang, Yingya Li, Timothy Miller, Steven Bethard, Guergana Savova
- TLDR: We propose a new method for fine-tuning large pretrained neural networks that reduces bias and variance in a fine-tuning setting by using ensemble methods explicitly designed to decrease variance due to optimization.
- A Comparative Study on the Impact of Model Compression Techniques on Fairness in Language Models
- Krithika Ramesh, Arnav Chavan, Shrey Pandit, Sunayana Sitaram
- TLDR: Compression techniques for deep learning can affect fairness measures in language classification.
- Ranking-Enhanced Unsupervised Sentence Representation Learning
- Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh
- TLDR: We propose a novel unsupervised sentence encoder that predicts the semantic meaning of a sentence by leveraging its relationship with other sentences in an external corpus, as well as the input sentence itself.
- To Revise or Not to Revise: Learning to Detect Improvable Claims for Argumentative Writing Support
- Gabriella Skitalinskaya, Henning Wachsmuth
- TLDR: We propose a new sampling strategy based on revision distance to build corpora that help identify argumentative claims in need of specific revisions.
- Human-in-the-loop Evaluation for Early Misinformation Detection: A Case Study of COVID-19 Treatments
- Ethan Mendes, Yang Chen, Wei Xu, Alan Ritter
- TLDR: We present a human-in-the-loop evaluation framework for fact-checking novel misinformation claims and identifying social media messages that support them.
- Composition-contrastive Learning for Sentence Embeddings
- Sachin Chanchani, Ruihong Huang
- TLDR: We propose a novel method for learning textual representations from unlabelled text by maximizing alignment between minimally-perturbed embeddings of the same text, and encouraging a uniform distribution of embeddings across a broader corpus.
- Causes and Cures for Interference in Multilingual Translation
- Uri Shaham, Maha Elbayad, Vedanuj Goswami, Omer Levy, Shruti Bhosale
- TLDR: We identify the main factors that contribute to interference in multilingual machine translation and show that using standard transformer configurations with less than one billion parameters largely alleviates interference and promotes synergy.
- Understanding and Bridging the Modality Gap for Speech Translation
- Qingkai Fang, Yang Feng
- TLDR: We propose a method to bridge the modality gap in speech translation by regularizing output predictions of ST and MT.
- Few-shot Reranking for Multi-hop QA via Language Model Prompting
- Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang
- TLDR: We propose PromptRank, a new method for few-shot reranking for multi-hop QA with open-domain questions.
- DICE: Data-Efficient Clinical Event Extraction with Generative Models
- Mingyu Derek Ma, Alexander Taylor, Wei Wang, Nanyun Peng
- TLDR: We introduce DICE, a robust and data-efficient generative model for clinical event extraction.
- XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations
- Yusen Zhang, Jun Wang, Zhiguo Wang, Rui Zhang
- TLDR: We present XSemPLR, a unified benchmark for cross-lingual semantic parsing on 22 natural languages and 8 meaning representations.
- INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation
- Wenhao Zhu, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen
- TLDR: We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.
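For context, a minimal sketch of the standard kNN-MT interpolation that INK builds on, with toy numbers: neighbor distances become a distribution over retrieved target tokens, which is mixed with the base model's next-token distribution.

```python
# Toy kNN-MT step: distance-weighted neighbor distribution interpolated with
# the base NMT distribution. All numbers are illustrative.
import numpy as np

vocab = 5
p_model = np.array([0.10, 0.55, 0.05, 0.20, 0.10])  # base NMT next-token probs

# Retrieved neighbors: (distance to the query decoder state, target-token id)
neighbors = [(0.2, 1), (0.5, 3), (0.9, 1)]
temperature, lam = 1.0, 0.4  # illustrative hyperparameters

weights = np.array([np.exp(-d / temperature) for d, _ in neighbors])
weights /= weights.sum()
p_knn = np.zeros(vocab)
for w, (_, tok) in zip(weights, neighbors):
    p_knn[tok] += w

p_final = lam * p_knn + (1 - lam) * p_model  # standard kNN-MT interpolation
print(p_final.round(3))
```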
- Uncertainty Guided Label Denoising for Document-level Distant Relation Extraction
- Qi Sun, Kun Huang, Xiaocui Yang, Pengfei Hong, Kun Zhang, Soujanya Poria
- TLDR: We propose a novel uncertainty estimation method for document-level distant relation extraction and propose a new framework for pseudo label denoising.
- Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning
- Shivaen Ramshetty, Gaurav Verma, Srijan Kumar
- TLDR: We propose a robustness evaluation strategy for multimodal deep vision and language models that uses visual attributes of the objects in the image to augment the input text.
- Crosslingual Generalization through Multitask Finetuning
- Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel
- TLDR: We show that multilingual language models can generalize to tasks in languages they have never seen.
- Evaluate AMR Graph Similarity via Self-supervised Learning
- Ziyi Shou, Fangzhen Lin
- TLDR: We propose AMRSim, an automatic AMR graph similarity evaluation metric that improves the correlations with human semantic scores and remains robust under diverse challenges.
- Analyzing Transformers in Embedding Space
- Guy Dar, Mor Geva, Ankit Gupta, Jonathan Berant
- TLDR: We present a theoretical analysis of Transformer-based models and propose a new method for interpreting them in embedding space.
- Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning
- Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Yang Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang
- TLDR: We present a novel approach for data-to-text generation that addresses the limitations of current methods that primarily focus on specific types of structured data.
- FactKG: Fact Verification via Reasoning on Knowledge Graphs
- Jiho Kim, Sungjin Park, Yeonsu Kwon, Yohan Jo, James Thorne, Edward Choi
- TLDR: We present FactKG, a new dataset for fact verification via reasoning on knowledge graphs.
- DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
- Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
- TLDR: We present a study of French biomedical PLMs and show that such models can be trained effectively on both public and private data.
- Discriminative Reasoning with Sparse Event Representation for Document-level Event-Event Relation Extraction
- Changsen Yuan, Heyan Huang, Yixin Cao, Yonggang Wen
- TLDR: We propose a discriminative reasoning model with sparse event representations for document-level event-event relation extraction.
- Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks
- Junyu Lu, Bo Xu, Xiaokun Zhang, Changrong Min, Liang Yang, Hongfei Lin
- TLDR: We propose a new dataset for fine-grained detection of Chinese toxic language and a benchmark model for toxic knowledge enhancement.
- SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
- Paul-Ambroise Duquenne, Hongyu Gong, Ning Dong, Jingfei Du, Ann Lee, Vedanuj Goswami, Changhan Wang, Juan Pino, Benoît Sagot, Holger Schwenk
- TLDR: We present SpeechMatrix, a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings.
- Character-Aware Models Improve Visual Text Rendering
- Rosanne Liu, Dan Garrette, Chitwan Saharia, William Chan, Adam Roberts, Sharan Narang, Irina Blok, Rj Mical, Mohammad Norouzi, Noah Constant
- TLDR: We show that character-aware text encoders outperform character-blind ones on a range of novel text rendering tasks.
- IDRISI-RA: The First Arabic Location Mention Recognition Dataset of Disaster Tweets
- Reem Suwaileh, Muhammad Imran, Tamer Elsayed
- TLDR: We present IDRISI-RA, the first Arabic location mention recognition dataset of disaster tweets, and propose benchmark models for Arabic LMR.
- FSUIE: A Novel Fuzzy Span Mechanism for Universal Information Extraction
- Tianshuo Peng, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao
- TLDR: We propose FSUIE, a novel framework for universal information extraction based on fuzzy span boundaries and fuzzy span length.
- What Do NLP Researchers Believe? Results of the NLP Community Metasurvey
- Julian Michael, Ari Holtzman, Alicia Parrish, Aaron Mueller, Alex Wang, Angelica Chen, Divyam Madaan, Nikita Nangia, Richard Yuanzhe Pang, Jason Phang, Samuel R. Bowman
- TLDR: We present the results of the NLP Community Metasurvey, which contrasts researchers' own opinions on controversial issues with their predictions of the community's views.
- Prototype-Guided Pseudo Labeling for Semi-Supervised Text Classification
- Weiyi Yang, Richong Zhang, Junfan Chen, Lihong Wang, Jaein Kim
- TLDR: We propose a novel semi-supervised framework for text classification with few labeled data and massive unlabeled data.
- LENS: A Learnable Evaluation Metric for Text Simplification
- Mounica Maddela, Yao Dou, David Heineman, Wei Xu
- TLDR: We present LENS, a new learnable metric for evaluating text simplification models.
- MeetingBank: A Benchmark Dataset for Meeting Summarization
- Yebowen Hu, Timothy Ganter, Hanieh Deilamsalehy, Franck Dernoncourt, Hassan Foroosh, Fei Liu
- TLDR: We present MeetingBank, a new benchmark dataset of city council meetings over the past decade, which provides a new testbed for summarization technology and allows the public to gain insight into how council decisions are made.
- UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective
- Yang Ping, JunYu Lu, Ruyi Gan, Junjie Wang, Yuxiang Zhang, Pingjian Zhang, Jiaxing Zhang
- TLDR: We present UniEX, an effective and efficient framework that unifies information extraction tasks under a span-extractive formulation.
- DEplain: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification
- Regina Stodden, Omar Momen, Laura Kallmeyer
- TLDR: We present a new dataset of parallel, professionally written and manually aligned simplifications in plain German, and a transformer-based seq2seq text simplification model.
- A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text
- Yunxin Li, Baotian Hu, Yuxin Ding, Lin Ma, Min Zhang
- TLDR: We propose an end-to-end Neural Divide-and-Conquer Reasoning framework for image retrieval from linguistically complex text.
- RARR: Researching and Revising What Language Models Say, Using Language Models
- Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejasvi Chaganty, Yicheng Fan, Vincent Zhao, Ni Lao, Hongrae Lee, Da-Cheng Juan, Kelvin Guu
- TLDR: We propose RARR (Retrofit Attribution using Research and Revision), a system that automatically finds attribution for the output of any text generation model, and post-edits the output to fix unsupported content while preserving the original output as much as possible.