Findings 2023 (including all conferences)
EACL 2023
- Using Punctuation as an Adversarial Attack on Deep Learning-Based NLP Systems: An Empirical Study
- Brian Formento, Chuan Sheng Foo, Luu Anh Tuan, See Kiong Ng
- TLDR: Punctuation insertions are a general adversarial attack mechanism that degrades deep learning-based NLP systems; studying them helps assess and improve model robustness (a minimal sketch of the perturbation follows below).
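A minimal sketch of the perturbation behind such attacks, assuming random insertion positions (real attacks search for the insertions that flip a target model's prediction, which is omitted here):

```python
import random

PUNCTUATION = list(".,;:!?'\"-")

def punctuation_attack(text: str, n_insertions: int = 3, seed: int = 0) -> str:
    """Perturb `text` by inserting punctuation marks at random character
    positions. A generic illustration of punctuation-insertion attacks,
    not the paper's exact algorithm."""
    rng = random.Random(seed)
    chars = list(text)
    for _ in range(n_insertions):
        chars.insert(rng.randrange(len(chars) + 1), rng.choice(PUNCTUATION))
    return "".join(chars)

print(punctuation_attack("the movie was absolutely wonderful"))
```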
- Self-Supervised Unimodal Label Generation Strategy Using Recalibrated Modality Representations for Multimodal Sentiment Analysis
- Yewon Hwang, Jong-Hwan Kim
- TLDR: We propose SUGRM, which generates unimodal labels from recalibrated modality representations to jointly train multimodal and unimodal tasks for multimodal sentiment analysis.
- Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks
- Pedro Rodriguez, Mahmoud Azab, Becka Silvert, Renato Sanchez, Linzy Labson, Hardik Shah, Seungwhan Moon
- TLDR: We propose to improve the accuracy of text-to-video retrieval benchmarks by correcting false negatives in caption-video pairs.
- Improving Numeracy by Input Reframing and Quantitative Pre-Finetuning Task
- Chung-Chi Chen, Hiroya Takamura, Ichiro Kobayashi, Yusuke Miyao
- TLDR: We propose to mitigate the innumeracy of language models by reframing how numbers are notated in the input and adding a quantitative pre-finetuning task.
- Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
- Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Wang, Miguel Eckstein, William Yang Wang
- TLDR: We propose a novel method for using machine-generated images to guide language models in open-ended text generation.
- ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation
- Wanrong Zhu, Xin Wang, An Yan, Miguel Eckstein, William Yang Wang
- TLDR: We propose ImaginE, an imagination-based automatic evaluation metric for natural language generation.
- Entity-Aware Dual Co-Attention Network for Fake News Detection
- Sin-han Yang, Chung-chi Chen, Hen-Hsen Huang, Hsin-Hsi Chen
- TLDR: We propose a Dual Co-Attention Network for fake news detection, which takes news content, social media replies, and external knowledge into consideration.
- CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm
- Hongming Zhang, Yintong Huo, Yanai Elazar, Yangqiu Song, Yoav Goldberg, Dan Roth
- TLDR: We propose a new commonsense reasoning benchmark to motivate commonsense inference progress from two perspectives: (1) evaluating whether models can distinguish knowledge quality by predicting if the knowledge is enough to answer the question; (2) evaluating whether models can develop commonsense inference capabilities that generalize across tasks.
- Data-Efficient Methods For Improving Hate Speech Detection
- Sumegh Roychowdhury, Vikram Gupta
- TLDR: We propose data augmentation techniques for implicit and explicit hate speech detection, reformulate the task using entailment, and apply cross-learning across five languages.
- Learning the Effects of Physical Actions in a Multi-modal Environment
- Gautier Dagan, Frank Keller, Alex Lascarides
- TLDR: We propose a multi-modal task for predicting the effects of physical actions from realistic sensory inputs.
- FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering
- Weizhe Lin, Zhilin Wang, Bill Byrne
- TLDR: We present a new adversarial variant of the Fact-based Visual Question Answering dataset that exposes the vulnerability of existing systems, and show that training with adversarial samples reduces this vulnerability.
- Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective
- Jongwoo Ko, Seungjoon Park, Minchan Jeong, Sukjin Hong, Euijai Ahn, Du-Seong Chang, Se-Young Yun
- TLDR: We show that existing Intermediate Layer Distillation (ILD) methods are prone to overfitting and propose a simple yet effective consistency-regularized ILD method for knowledge distillation.
- Implicit Temporal Reasoning for Evidence-Based Fact-Checking
- Liesbeth Allein, Marlon Saelens, Ruben Cartuyvels, Marie-Francine Moens
- TLDR: We show that the presence of temporal information and the manner in which timelines are constructed greatly influence how fact-checking models determine the relevance and supporting or refuting character of evidence documents.
- Active PETs: Active Data Annotation Prioritisation for Few-Shot Claim Verification with Pattern Exploiting Training
- Xia Zeng, Arkaitz Zubiaga
- TLDR: We propose Active PETs, a novel weighted approach that utilises an ensemble of Pattern Exploiting Training (PET) models based on various language models, to actively select unlabelled data as candidates for annotation.
- Plan-then-Seam: Towards Efficient Table-to-Text Generation
- Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Binhua Li, Yongbin Li
- TLDR: We propose the first totally non-autoregressive table-to-text model (Plan-then-Seam, PTS) that produces its outputs in parallel with one single network.
- A corpus of metaphors as register markers
- Markus Egg, Valia Kordoni
- TLDR: We present a corpus of metaphors that can serve as register markers and can be reliably identified in annotation.
- Translate First Reorder Later: Leveraging Monotonicity in Semantic Parsing
- Francesco Cazzaro, Davide Locatelli, Ariadna Quattoni, Xavier Carreras
- TLDR: We propose Translator-Reorderer-Translator, a two-step approach that first translates input sentences monotonically and then reorders them to obtain the correct output.
- PePe: Personalized Post-editing Model utilizing User-generated Post-edits
- Jihyeon Lee, Taehee Kim, Yunwon Tae, Cheonbok Park, Jaegul Choo
- TLDR: Personalized automatic post-editing framework for machine translation based on user preferences.
- Infusing Context and Knowledge Awareness in Multi-turn Dialog Understanding
- Ting-Wei Wu, Biing-Hwang Juang
- TLDR: We propose to model multi-turn dynamics in natural language understanding by equipping a BERT-based NLU framework with knowledge and context awareness.
- MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages
- Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig
- TLDR: We present a multilingual dataset for code generation from natural language commands extending beyond English.
- Augmenting pre-trained language models with audio feature embedding for argumentation mining in political debates
- Rafael Mestre, Stuart Middleton, Matt Ryan, Masood Gheasi, Timothy Norman, Jiatong Zhu
- TLDR: We investigate the integration of audio features with text in argumentation mining tasks and show that multimodal features add value in fully supervised scenarios with limited data.
- Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions
- Cuong Hoang, Devendra Sachan, Prashant Mathur, Brian Thompson, Marcello Federico
- TLDR: We propose a novel architecture to control interactions between a source sentence and the top-k fuzzy target-language matches, and compare it to architectures from prior work.
- CALM-Bench: A Multi-task Benchmark for Evaluating Causality-Aware Language Models
- Dhairya Dalal, Paul Buitelaar, Mihael Arcan
- TLDR: We propose CALM-Bench, a multi-task benchmark for evaluating causality-aware language models (CALM).
- ezCoref: Towards Unifying Annotation Guidelines for Coreference Resolution
- Ankita Gupta, Marzena Karpinska, Wenlong Zhao, Kalpesh Krishna, Jack Merullo, Luke Yeh, Mohit Iyyer, Brendan O’Connor
- TLDR: We develop a crowdsourcing-friendly annotation methodology for English coreference datasets that allows annotators to learn to identify and classify common cases that are treated similarly across these datasets.
- PREME: Preference-based Meeting Exploration through an Interactive Questionnaire
- Negar Arabzadeh, Ali Ahmadvand, Julia Kiseleva, Yang Liu, Ahmed Hassan Awadallah, Ming Zhong, Milad Shokouhi
- TLDR: We propose a novel end-to-end framework for generating interactive questionnaires for preference-based meeting exploration.
- Sentence Identification with BOS and EOS Label Combinations
- Takuma Udagawa, Hiroshi Kanayama, Issei Yoshida
- TLDR: We propose a novel sentence identification task that combines beginning-of-sentence (BOS) and end-of-sentence (EOS) labels to identify the most probable sentential units (SUs) and non-sentential units (NSUs) via dynamic programming.
- Gauging the Gap Between Human and Machine Text Simplification Through Analytical Evaluation of Simplification Strategies and Errors
- Daichi Yamaguchi, Rei Miyata, Sayuka Shimada, Satoshi Sato
- TLDR: We present an analytical evaluation of neural text simplification systems and show that the systems can hardly perform information addition operations.
- Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation
- Haoran Yang, Yan Wang, Piji Li, Wei Bi, Wai Lam, Chen Xu
- TLDR: We propose a two-stage framework to alleviate the problem of corrupted sentences with incorrect word order in pre-training and fine-tuning of language models.
- LED: A Dataset for Life Event Extraction from Dialogs
- Yi-Pei Chen, An-Zi Yen, Hen-Hsen Huang, Hideki Nakayama, Hsin-Hsi Chen
- TLDR: We present a dataset containing fine-grained life event annotations on conversational data and propose a novel Conversational Life Event Extraction task.
- Reading and Reasoning over Chart Images for Evidence-based Automated Fact-Checking
- Mubashara Akhtar, Oana Cocarascu, Elena Simperl
- TLDR: We propose a novel task, chart-based fact-checking, and introduce ChartBERT as the first model for AFC against chart evidence.
- Causal Reasoning of Entities and Events in Procedural Texts
- Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch
- TLDR: We propose CREPE, the first benchmark on causal reasoning of event plausibility and entity states.
- Few-Shot Structured Policy Learning for Multi-Domain and Multi-Task Dialogues
- Thibault Cordier, Tanguy Urvoy, Fabrice Lefèvre, Lina M. Rojas Barahona
- TLDR: We show that graph neural networks are better than simulated experts in dialogue frameworks when learning from a small number of dialogues.
- Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
- Jielin Qiu, William Han, Jiacheng Zhu, Mengdi Xu, Michael Rosenberg, Emerson Liu, Douglas Weber, Ding Zhao
- TLDR: We propose an approach for cardiovascular disease diagnosis and automatic ECG diagnosis report generation by transferring knowledge from large language models to clinical Electrocardiography.
- Practical Takes on Federated Learning with Pretrained Language Models
- Ankur Agarwal, Mehdi Rezagholizadeh, Prasanna Parthasarathi
- TLDR: We propose a new hypothesis on the domain adaptation of federated learning in NLP with pre-trained language models.
- Paper Bullets: Modeling Propaganda with the Help of Metaphor
- Daniel Baleato Rodríguez, Verna Dankers, Preslav Nakov, Ekaterina Shutova
- TLDR: Propaganda aims to persuade an audience by appealing to emotions and using faulty reasoning, employing techniques that tune the emotional volume of the message up or down; we model propaganda detection with the help of metaphor.
- Lexical Semantics with Large Language Models: A Case Study of English “break”
- Erika Petersen, Christopher Potts
- TLDR: Large neural language models can be powerful tools for research in lexical semantics.
- SWING: Balancing Coverage and Faithfulness for Dialogue Summarization
- Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma, Wei Xiao, Feng Nan, Nicholas Dingwall, William Yang Wang, Kathleen McKeown
- TLDR: We propose to use natural language inference to improve coverage and factual consistency in dialogue summarization by encouraging the model to generate content in the reference summaries that have not been covered, as well as to distinguish between factually consistent and inconsistent generated sentences.
- Language-Aware Multilingual Machine Translation with Self-Supervised Learning
- Haoran Xu, Jean Maillard, Vedanuj Goswami
- TLDR: We propose a novel and effective co-training approach for multilingual machine translation that improves performance by a large margin over the current state-of-the-art methods.
- Cloze Quality Estimation for Language Assessment
- Zizheng Zhang, Masato Mita, Mamoru Komachi
- TLDR: We propose a novel task called Cloze Quality Estimation, which evaluates cloze tests for language assessment based on two important factors: reliability and validity.
- Bag of Tricks for In-Distribution Calibration of Pretrained Transformers
- Jaeyoung Kim, Dongbin Na, Sungchul Choi, Sungbin Lim
- TLDR: We present empirical studies on confidence calibration for pre-trained language models and propose a combination of calibration techniques for PLMs.
- Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features
- Sishuo Chen, Wenkai Yang, Xiaohan Bi, Xu Sun
- TLDR: We present a general OOD score for textual OOD detection that is effective in both semantic and non-semantic shifts.
- A Question of Style: A Dataset for Analyzing Formality on Different Levels
- Elisabeth Eder, Ulrike Krieg-Holz, Michael Wiegand
- TLDR: We present a corpus of sentences from a wide range of genres assessed on a continuous informal-formal scale via comparative judgments.
- Task-specific Compression for Multi-task Language Models using Attribution-based Pruning
- Nakyeong Yang, Yunah Jang, Hwanhee Lee, Seohyeong Jeong, Kyomin Jung
- TLDR: We propose a novel training-free compression method for multi-task language models based on attribution-based pruning.
- Zero-shot Transfer of Article-aware Legal Outcome Classification for European Court of Human Rights Cases
- Santosh T.y.s.s, Oana Ichim, Matthias Grabmair
- TLDR: We present a novel domain adaptation method for legal judgment prediction that improves the model’s ability to generalize to zero-shot settings.
- Abstractive Document Summarization with Summary-length Prediction
- Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura
- TLDR: We propose a method for enabling the model to understand summarization-specific information by predicting the summary length in the encoder and generating a summary of the predicted length in the decoder during fine-tuning.
- Hierarchical Label Generation for Text Classification
- Jingun Kwon, Hidetaka Kamigaito, Young-In Song, Manabu Okumura
- TLDR:
- Active Learning for Multilingual Semantic Parser
- Zhuang Li, Gholamreza Haffari
- TLDR: We propose a new active learning procedure for multilingual semantic parsing that reduces translation costs and improves parsing performance.
- Joint Word and Morpheme Segmentation with Bayesian Non-Parametric Models
- Shu Okabe, François Yvon
- TLDR: We study the effect of explicitly introducing a hierarchy of units in joint segmentation models and evaluate their validity for language documentation data.
- Cross-Lingual Transfer of Cognitive Processing Complexity
- Charlotte Pouw, Nora Hollenstein, Lisa Beinborn
- TLDR: We use sentence-level eye-tracking patterns as a cognitive indicator for structural complexity and show that the multilingual model XLM-RoBERTa can successfully predict varied patterns for 13 typologically diverse languages, despite being fine-tuned only on English data.
- Does Transliteration Help Multilingual Language Modeling?
- Ibraheem Muhammad Moosa, Mahmud Elahi Akhter, Ashfia Habib
- TLDR: We empirically measure the effect of transliteration into a common script on multilingual language models.
- A Multilingual Dataset of Racial Stereotypes in Social Media Conversational Threads
- Tom Bourgeade, Alessandra Teresa Cignarella, Simona Frenda, Mario Laurent, Wolfgang Schmeisser-Nieto, Farah Benamara, Cristina Bosco, Véronique Moriceau, Viviana Patti, Mariona Taulé
- TLDR: We present a corpus-based study for multilingual racial stereotype identification in social media conversational threads and provide a set of guidelines for the annotation of racial stereotypes in social networks.
- Detecting Contextomized Quotes in News Headlines by Contrastive Learning
- Seonyeong Song, Hyeonho Song, Kunwoo Park, Jiyoung Han, Meeyoung Cha
- TLDR: We present QuoteCSE, a contrastive learning framework that represents the embedding of news quotes based on domain-driven positive and negative samples, to identify contextomized quotes in news headlines.
- Zero-Shot On-the-Fly Event Schema Induction
- Rotem Dror, Haoyu Wang, Dan Roth
- TLDR: We present a new approach in which large language models are utilized to generate source documents that allow predicting, given a high-level event definition, the specific events, arguments, and relations between them to construct a schema that describes the complex event in its entirety.
- BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla
- Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Rifat Shahriyar
- TLDR: We present a comprehensive benchmark for evaluating natural language generation models in Bangla, a widely spoken yet low-resource language.
- It’s about Time: Rethinking Evaluation on Rumor Detection Benchmarks using Chronological Splits
- Yida Mu, Kalina Bontcheva, Nikolaos Aletras
- TLDR: We provide a re-evaluation of rumor detection models on four popular rumor detection benchmarks considering chronological instead of random splits.
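A minimal sketch of the two evaluation protocols compared above, with an illustrative `timestamp` field (generic code, not the paper's pipeline):

```python
import random
from datetime import datetime

def chronological_split(examples, test_fraction=0.2):
    """Every test example is strictly newer than every training example."""
    ordered = sorted(examples, key=lambda ex: ex["timestamp"])
    cut = int(len(ordered) * (1 - test_fraction))
    return ordered[:cut], ordered[cut:]

def random_split(examples, test_fraction=0.2, seed=0):
    """The standard shuffled split, which leaks future rumors into training."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

data = [{"text": f"rumor {i}", "timestamp": datetime(2016 + i, 1, 1)} for i in range(6)]
chrono_train, chrono_test = chronological_split(data)
rand_train, rand_test = random_split(data)
```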
- MUTANT: A Multi-sentential Code-mixed Hinglish Dataset
- Rahul Gupta, Vivek Srivastava, Mayank Singh
- TLDR: We propose a novel task of identifying multi-sentential code-mixed text (MCT) from multilingual articles.
- Bridging the Gap between Native Text and Translated Text through Adversarial Learning: A Case Study on Cross-Lingual Event Extraction
- Pengfei Yu, Jonathan May, Heng Ji
- TLDR: We propose an adversarial training framework that incorporates machine translation to bridge the gap between native and translated text, improving cross-lingual event extraction.
- Scalable Prompt Generation for Semi-supervised Learning with Language Models
- Yuhang Zhou, Suraj Maharjan, Beiye Liu
- TLDR: We propose two methods to automatically design multiple prompts and integrate automatic verbalizer in semi-supervised learning settings without sacrificing performance.
- Novel Feature Discovery for Task-Oriented Dialog Systems
- Vinh Thinh Ho, Mohamed Soliman, Abdalghani Abujabal
- TLDR: We present a technique for discovering novel features for task-oriented dialog systems from user utterances.
- Context Generation Improves Open Domain Question Answering
- Dan Su, Mostofa Patwary, Shrimai Prabhumoye, Peng Xu, Ryan Prenger, Mohammad Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro
- TLDR: We propose a novel closed-book question answering framework which uses a coarse-to-fine approach to extract the relevant knowledge and answer a question without needing external knowledge.
- RedHOT: A Corpus of Annotated Medical Questions, Experiences, and Claims on Social Media
- Somin Wadhwa, Vivek Khetan, Silvio Amir, Byron Wallace
- TLDR: We present Reddit Health Online Talk (RedHOT), a corpus of 22,000 richly annotated social media posts from Reddit spanning 24 health conditions.
- Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
- Henrik Voigt, Jan Hombeck, Monique Meuschke, Kai Lawonn, Sina Zarrieß
- TLDR: We investigate whether a state-of-the-art language and vision model, CLIP, is able to ground perspective descriptions of a 3D object and identify canonical views of common objects based on text queries.
- PLACES: Prompting Language Models for Social Conversation Synthesis
- Maximillian Chen, Alexandros Papangelis, Chenyang Tao, Seokhwan Kim, Andy Rosenbaum, Yang Liu, Zhou Yu, Dilek Hakkani-Tur
- TLDR: We synthesize multi-party conversations by prompting language models with expert-written examples and evaluate the synthesized dialogues on a variety of metrics.
- FedPerC: Federated Learning for Language Generation with Personal and Context Preference Embeddings
- Andrew Silva, Pradyumna Tambwekar, Matthew Gombolay
- TLDR: We propose a new direction for personalization research within federated learning, leveraging both personal embeddings and shared context embeddings.
- A Neural CRF-based Hierarchical Approach for Linear Text Segmentation
- Inderjeet Nair, Aparna Garimella, Balaji Vasan Srinivasan, Natwar Modani, Niyati Chhaya, Srikrishna Karanam, Sumit Shekhar
- TLDR: We propose a novel hierarchical segmentation algorithm based on a neural conditional random field and a data augmentation scheme.
- MultiFin: A Dataset for Multilingual Financial NLP
- Rasmus Jørgensen, Oliver Brandt, Mareike Hartmann, Xiang Dai, Christian Igel, Desmond Elliott
- TLDR: We present a publicly available financial dataset consisting of real-world article headlines covering 15 languages across different writing systems and language families.
- MLASK: Multimodal Summarization of Video-based News Articles
- Mateusz Krubiński, Pavel Pecina
- TLDR: We present a new dataset of video-based news articles paired with a textual summary and a cover picture, all obtained by automatically crawling several news websites.
- Going beyond research datasets: Novel intent discovery in the industry setting
- Aleksandra Chrabrowa, Tsimur Hadeliya, Dariusz Kajtoch, Robert Mroczkowski, Piotr Rybak
- TLDR: We present methods for improving the intent discovery pipeline on publicly available datasets by utilizing conversational structure of real-life datasets.
- DATScore: Evaluating Translation with Data Augmented Translations
- Moussa Kamal Eddine, Guokan Shang, Michalis Vazirgiannis
- TLDR: We propose DATScore, a metric that leverages data augmented translations from large pretrained language models to evaluate machine translation quality.
- How do decoding algorithms distribute information in dialogue responses?
- Saranya Venkatraman, He He, David Reitter
- TLDR: We show that model-generated responses follow the Uniform Information Density Principle to a greater extent than human responses, and that encouraging non-uniform responses is a potential solution to the “likelihood trap” problem.
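One way to quantify uniformity of information density from per-token log-probabilities is the variance of token surprisals; a hedged sketch with made-up numbers (the paper's exact operationalization may differ):

```python
import math

def surprisal_stats(token_logprobs):
    """Given natural-log token probabilities of a response under an LM,
    return mean surprisal (in bits) and its variance; lower variance means
    information is spread more uniformly across the response."""
    surprisals = [-lp / math.log(2) for lp in token_logprobs]
    mean = sum(surprisals) / len(surprisals)
    var = sum((s - mean) ** 2 for s in surprisals) / len(surprisals)
    return mean, var

# Hypothetical log-probs for a 5-token model response.
mean_bits, var_bits = surprisal_stats([-1.2, -0.9, -1.1, -1.0, -1.3])
```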
- Benchmarking Long-tail Generalization with Likelihood Splits
- Ameya Godbole, Robin Jia
- TLDR: We propose a method to create challenging benchmarks that require generalizing to the tail of the distribution by re-splitting existing datasets.
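A minimal sketch of the re-splitting idea: score every example with a held-out model and reserve the least likely ones for the test set, so evaluation targets the tail of the distribution (generic illustration, not the authors' exact procedure):

```python
def likelihood_split(scored_examples, test_fraction=0.2):
    """`scored_examples` are (example, logprob) pairs, where the logprob
    would come from a language model scored on each example."""
    ordered = sorted(scored_examples, key=lambda pair: pair[1], reverse=True)
    cut = int(len(ordered) * (1 - test_fraction))
    train = [ex for ex, _ in ordered[:cut]]  # high-likelihood head
    test = [ex for ex, _ in ordered[cut:]]   # low-likelihood tail
    return train, test

train, test = likelihood_split([("a", -1.0), ("b", -7.5), ("c", -2.3), ("d", -0.4), ("e", -9.1)])
```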
- Exploring Enhanced Code-Switched Noising for Pretraining in Neural Machine Translation
- Vivek Iyer, Arturo Oncevay, Alexandra Birch
- TLDR: We propose a novel approach to denoise synthetic code-switched data in neural machine translation by generating contextual, many-to-many word translations from a base NMT model.
- XQA-DST: Multi-Domain and Multi-Lingual Dialogue State Tracking
- Han Zhou, Ignacio Iacobacci, Pasquale Minervini
- TLDR: We propose a domain-agnostic extractive question answering approach for dialogue state tracking that can efficiently leverage domain-scalable and open vocabulary in DST.
- Improving Prediction Backward-Compatibility in NLP Model Upgrade with Gated Fusion
- Yi-An Lai, Elman Mansimov, Yuqing Xie, Yi Zhang
- TLDR: We propose a novel method, Gated Fusion, that promotes backward compatibility via learning to mix predictions between old and new models.
- AmbiCoref: Evaluating Human and Model Sensitivity to Ambiguous Coreference
- Yuewei Yuan, Chaitanya Malaviya, Mark Yatskar
- TLDR: We present a diagnostic corpus of minimal sentence pairs with ambiguous and unambiguous referents and show that most coreference models are sensitive to such pronominal ambiguity.
- Improving Unsupervised Out-of-domain detection through Pseudo Labeling and Learning
- Byounghan Lee, Jaesik Kim, Junekyu Park, Kyung-Ah Sohn
- TLDR: We propose a two-stage framework for textual out-of-domain detection that leverages latent categorical information via pseudo labeling to improve representation learning.
- How Many Data Samples is an Additional Instruction Worth?
- Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, Chitta Baral
- TLDR: We augment a subset of tasks in the expanded version of NATURAL INSTRUCTIONS with additional instructions and find that it significantly improves model performance (up to 35%), especially in the low-data regime.
- [MASK] Insertion: a robust method for anti-adversarial attacks
- Xinrong Hu, Ce Xu, Junlong Ma, Zijian Huang, Jie Yang, Yi Guo, Johan Barthelemy
- TLDR: We present a simple yet efficient defense against textual adversarial attacks that inserts [MASK] tokens into the input, leveraging masked language modeling (a minimal sketch follows below).
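A minimal sketch of the defense's core operation, assuming random insertion positions (the paper's placement strategy and use of the masked language model may differ):

```python
import random

def mask_insertion(tokens, mask_token="[MASK]", n_insertions=2, seed=0):
    """Insert [MASK] tokens at random positions before classification;
    this tends to disrupt the carefully placed perturbations that
    adversarial inputs rely on."""
    rng = random.Random(seed)
    out = list(tokens)
    for _ in range(n_insertions):
        out.insert(rng.randrange(len(out) + 1), mask_token)
    return out

print(mask_insertion("this fi1m is terrib1e".split()))
```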
- ViDeBERTa: A powerful pre-trained language model for Vietnamese
- Cong Dao Tran, Nhut Huy Pham, Anh Tuan Nguyen, Truong Son Hy, Tu Vu
- TLDR: We present ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, with three versions - ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large - which are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using the DeBERTa architecture.
- NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization
- Junru Lu, Jiazheng Li, Byron Wallace, Yulan He, Gabriele Pergola
- TLDR: We propose a novel two-stage summarize-then-simplify approach for paragraph-level medical text simplification that improves lexical similarity and narrative consistency.
- Long-tailed Extreme Multi-label Text Classification by the Retrieval of Generated Pseudo Label Descriptions
- Ruohong Zhang, Yau-Shian Wang, Yiming Yang, Donghan Yu, Tom Vu, Likun Lei
- TLDR: We propose to generate pseudo label descriptions from a trained bag-of-words classifier, which demonstrates better classification performance under severe scarce data conditions.
- Unsupervised Keyphrase Extraction via Interpretable Neural Networks
- Rishabh Joshi, Vidhisha Balachandran, Emily Saldanha, Maria Glenski, Svitlana Volkova, Yulia Tsvetkov
- TLDR: We propose INSPECT—a novel approach for unsupervised keyphrase extraction based on self-explaining models for identifying influential keyphrases in a document by measuring the predictive impact of input phrases on the downstream task of the document topic classification.
- Large Language Models are few(1)-shot Table Reasoners
- Wenhu Chen
- TLDR: We show that large language models can perform table reasoning tasks with few-shot in-context learning.
- Realistic Citation Count Prediction Task for Newly Published Papers
- Jun Hirako, Ryohei Sasano, Koichi Takeda
- TLDR: We propose a realistic citation count prediction task that uses information available at the time of a paper’s publication.
- “Why do I feel offended?” - Korean Dataset for Offensive Language Identification
- San-Hee Park, Kang-Min Kim, O-Joun Lee, Youjin Kang, Jaewon Lee, Su-Min Lee, SangKeun Lee
- TLDR: We present KODOLI, a novel KOrean Dataset for Offensive Language Identification, together with auxiliary tasks that help identify offensive language.
- Empirical Investigation of Neural Symbolic Reasoning Strategies
- Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui
- TLDR: We investigate and factorize the benefit of generating intermediate steps for symbolic reasoning.
- Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering
- Xanh Ho, Anh-Khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa
- TLDR: We analyze the effectiveness of the underlying reasoning tasks in multi-hop question answering datasets with respect to (1) QA performance, (2) reasoning shortcuts, (3) robustness, and (4) adversarial questions, and show that they can improve QA performance.
- PubMedCLIP: How Much Does CLIP Benefit Visual Question Answering in the Medical Domain?
- Sedigheh Eslami, Christoph Meinel, Gerard de Melo
- TLDR: We present PubMedCLIP, a fine-tuned version of Contrastive Language–Image Pre-training for the medical domain based on PubMed articles.
- Multilingual BERT has an accent: Evaluating English influences on fluency in multilingual models
- Isabel Papadimitriou, Kezia Lopez, Dan Jurafsky
- TLDR: We show that multilingual language models can be biased toward English-like grammatical structures, and show that this bias can reduce average performance on all languages.
- Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
- Aishwarya Agrawal, Ivana Kajic, Emanuele Bugliarello, Elnaz Davoodi, Anita Gergely, Phil Blunsom, Aida Nematzadeh
- TLDR: We show that pretrained multimodal models exhibit poor out-of-distribution generalization on visual question answering tasks, and that the evaluation metrics used to evaluate them are overly stringent.
- Our kind of people? Detecting populist references in political debates
- Christopher Klamm, Ines Rehbein, Simone Paolo Ponzetto
- TLDR: We present a novel cross-lingual dataset for the identification of populist rhetoric in text and present a new method for zero-shot learning.
- SharPT: Shared Latent Space Prompt Tuning
- Bo Pang, Semih Yavuz, Caiming Xiong, Yingbo Zhou
- TLDR: We propose a new soft prompt transfer method for language models that improves performance on a variety of tasks.
- Mini But Mighty: Efficient Multilingual Pretraining with Linguistically-Informed Data Selection
- Tolulope Ogunremi, Dan Jurafsky, Christopher Manning
- TLDR: We propose that training on smaller amounts of data but from related languages could match the performance of models trained on large, unrelated data.
- Long Document Summarization with Top-down and Bottom-up Inference
- Bo Pang, Erik Nijkamp, Wojciech Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong
- TLDR: We propose a novel method for efficient and faithful inference of latent representations of words or tokens in text summarization models.
- Open Information Extraction with Entity Focused Constraints
- Prajna Upadhyay, Oana Balalau, Ioana Manolescu
- TLDR: We exploit domain knowledge about the subject and object in sentences to improve open information extraction.
- Hierarchical3D Adapters for Long Video-to-text Summarization
- Pinelopi Papalampidi, Mirella Lapata
- TLDR: We present a novel multimodal video-to-text summarization method that uses multimodality to improve performance over more memory-heavy and fully fine-tuned textual summarization methods.
- An Intra-Class Relation Guided Approach for Code Comment Generation
- Zhenni Wang, Xiaohan Yu, Yansong Feng, Dongyan Zhao
- TLDR: We present a novel graph-based learning framework to capture various relations among functions in a class file to generate missing comments for code snippets.
- Spelling convention sensitivity in neural language models
- Elizabeth Nielsen, Christo Kirov, Brian Roark
- TLDR: We investigate whether large neural language models learn the potentially long-distance dependency of British versus American spelling conventions, i.e., whether spelling is consistently one or the other within model-generated strings.
- Modelling Language Acquisition through Syntactico-Semantic Pattern Finding
- Jonas Doumen, Katrien Beuls, Paul Van Eecke
- TLDR: We present a computational operationalisation of syntactico-semantic pattern finding and a methodology for learning grammars based on similarities and differences in the form and meaning of linguistic observations alone.
- Benchmark Data and Evaluation Framework for Intent Discovery Around COVID-19 Vaccine Hesitancy
- Shai Gretz, Assaf Toledo, Roni Friedman, Dan Lahav, Rose Weeks, Naor Bar-Zeev, João Sedoc, Pooja Sangha, Yoav Katz, Noam Slonim
- TLDR: We present a novel automatic evaluation framework for intent discovery in the COVID-19 pandemic vaccine use-case.
- Learning Disentangled Representations for Natural Language Definitions
- Danilo Silva De Carvalho, Giangiacomo Mercatali, Yingji Zhang, André Freitas
- TLDR: We propose a novel method for learning disentangled representations of neural models by using recurrent syntactic regularities in text.
- Distinguishability Calibration to In-Context Learning
- Hongjing Li, Hanqi Yan, Yanran Li, Li Qian, Yulan He, Lin Gui
- TLDR: We propose a calibration method for prompt-based language learning which improves interpretability and accuracy.
- Investigating anatomical bias in clinical machine learning algorithms
- Jannik Pedersen, Martin Laursen, Pernille Vinholt, Anne Alnor, Thiusius Savarimuthu
- TLDR: We measure anatomical bias in clinical text classification algorithms and show that clinical text algorithms are highly prone to anatomical bias.
- Topic Ontologies for Arguments
- Yamen Ajjour, Johannes Kiesel, Benno Stein, Martin Potthast
- TLDR: We present a new approach to mapping the argumentation landscape and comparing the topics in the ontology with those in 59 argument corpora.
- Longtonotes: OntoNotes with Longer Coreference Chains
- Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum, Mrinmaya Sachan
- TLDR: We present a corpus of coreference-annotated documents of significantly longer length than what is currently available.
- More Robust Schema-Guided Dialogue State Tracking via Tree-Based Paraphrase Ranking
- Alexandru Coca, Bo-Hsiang Tseng, Weizhe Lin, Bill Byrne
- TLDR: We propose a framework for generating synthetic schemas which optimise lexical diversity and semantic faithfulness for dialogue state tracking.
- Language Model Decoding as Likelihood–Utility Alignment
- Martin Josifoski, Maxime Peyrard, Frano Rajič, Jiheng Wei, Debjit Paul, Valentin Hartmann, Barun Patra, Vishrav Chaudhary, Emre Kiciman, Boi Faltings
- TLDR: We propose a taxonomy of misalignment mitigation strategies for decoding algorithms based on their implicit assumptions about likelihood–utility misalignment, yielding general statements about their applicability across tasks.
- Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents
- Yanfei Dong, Lambert Deng, Jiazheng Zhang, Xiaodong Yu, Ting Lin, Francesco Gelli, Soujanya Poria, Wee Sun Lee
- TLDR: We propose KNN-Former, a novel approach for document entity classification based on the K-nearest-neighbor graph of document entities.
- On the Generalization Ability of Retrieval-Enhanced Transformers
- Tobias Norlund, Ehsan Doostmohammadi, Richard Johansson, Marco Kuhlmann
- TLDR: We show that the performance gains from retrieval-augmented language models such as RETRO are due to overlapping tokens between the database and the test data, suggesting less non-trivial generalization than previously assumed.
- Assessing Monotonicity Reasoning in Dutch through Natural Language Inference
- Gijs Wijnholds
- TLDR: We investigate monotonicity reasoning in Dutch, through a novel Natural Language Inference dataset.
- Noisy Parallel Data Alignment
- Ruoyu Xie, Antonios Anastasopoulos
- TLDR: We study the existing word-level alignment models under noisy settings and aim to make them more robust to noisy data.
- Enhancing Dialogue Generation with Conversational Concept Flows
- Siheng Li, Wangjie Jiang, Pengda Si, Cheng Yang, Qiu Yao, Jinchao Zhang, Jie Zhou, Yujiu Yang
- TLDR: We propose to enhance dialogue generation with conversational concept flows guided by the knowledge graph.
- SMHD-GER: A Large-Scale Benchmark Dataset for Automatic Mental Health Detection from Social Media in German
- Sourabh Zanwar, Daniel Wiechmann, Yu Qiao, Elma Kerz
- TLDR: We present a large-scale, carefully constructed dataset of self-reported mental health diagnoses for German and propose a novel approach for detecting and understanding mental health conditions (MHCs).
- Exploring Data Augmentation for Code Generation Tasks
- Pinzhen Chen, Gerasimos Lampouras
- TLDR: We propose and adapt augmentation methods that yield consistent improvements in code translation and summarization by up to 6.9% and 7.5% respectively.
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking
- Derek Chen, Kun Qian, Zhou Yu
- TLDR: We propose a new meta-learning scheme for dialogue state tracking that stabilizes the ability of the model to perform well under various prompts.
- Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers
- Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto, Goran Glavaš
- TLDR: We investigate whether incorporating demographic factors into Transformer-based language models can improve performance across four languages.
- JBLiMP: Japanese Benchmark of Linguistic Minimal Pairs
- Taiga Someya, Yohei Oseki
- TLDR: We present JBLiMP, a novel dataset for targeted syntactic evaluations of language models in Japanese.
- SMATCH++: Standardized and Extended Evaluation of Semantic Graphs
- Juri Opitz
- TLDR: We propose a new metric for graph distances that is more efficient and more accurate than the current Smatch metric.
- An Extended Sequence Tagging Vocabulary for Grammatical Error Correction
- Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan
- TLDR: We extend a current sequence-tagging approach to Grammatical error correction by introducing specialised tags for spelling correction and morphological inflection using the SymSpell and LemmInflect algorithms.
- Cheating to Identify Hard Problems for Neural Machine Translation
- Proyag Pal, Kenneth Heafield
- TLDR: We identify hard problems for neural machine translation models by analyzing progressively higher-scoring translations generated by letting models cheat to various degrees.
- Model-Agnostic Bias Measurement in Link Prediction
- Lena Schwertmann, Manoj Prabhakar Kannan Ravi, Gerard de Melo
- TLDR: We present a model-agnostic approach for bias measurement leveraging fairness metrics to compare bias in knowledge graph embedding-based predictions (KG only) with models that use pre-trained, Transformer-based language models (KG+LM).
- Divergence-Based Domain Transferability for Zero-Shot Classification
- Alexander Pugantsov, Richard McCreadie
- TLDR: We propose statistical measures that approximate the divergence between domain representations as a means to estimate whether tuning using one task pair will exhibit performance benefits over tuning another.
- EDU-level Extractive Summarization with Varying Summary Lengths
- Yuping Wu, Ching-Hsun Tseng, Jiayu Shang, Shengzhong Mao, Goran Nenadic, Xiao-Jun Zeng
- TLDR: We propose a novel extractive model that operates at the Elementary Discourse Unit (EDU) level and produces summaries of varying lengths.
- “Chère maison” or “maison chère”? Transformer-based prediction of adjective placement in French
- Eleni Metheniti, Tim Van de Cruys, Wissam Kerkri, Juliette Thuilier, Nabil Hathout
- TLDR: We show that transformer-based language models are able to learn the adjective position in noun phrases in French.
- On the Role of Reviewer Expertise in Temporal Review Helpfulness Prediction
- Mir Tafseer Nayeem, Davood Rafiei
- TLDR: We study the role of reviewer expertise in temporal review helpfulness prediction and show that modeling it improves the detection of helpful reviews.
- Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
- Chenxi Whitehouse, Tillman Weyde, Pranava Madhyastha
- TLDR: We propose a multitask learning approach towards a Unified Model for Answer and Explanation generation (UMAE) for visual question answering.
- Machine Translation between Spoken Languages and Signed Languages Represented in SignWriting
- Zifan Jiang, Amit Moryossef, Mathias Müller, Sarah Ebling
- TLDR: We present novel methods for machine translation between spoken and signed languages, where signed languages are represented in SignWriting, a sign language writing system.
- A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models
- Jimin Sun, Patrick Fernandes, Xinyi Wang, Graham Neubig
- TLDR: We provide an empirical comparison of multilingual tokenizer-free and subword-based models and show that subword-based models are the most practical and efficient.
- Neural Ranking with Weak Supervision for Open-Domain Question Answering : A Survey
- Xiaoyu Shen, Svitlana Vakulenko, Marco del Tredici, Gianni Barlacchi, Bill Byrne, Adria de Gispert
- TLDR: We provide a structured overview of standard weak supervision (WS) signals used for training neural ranking models.
- Double Retrieval and Ranking for Accurate Question Answering
- Zeyu Zhang, Thuy Vu, Alessandro Moschitti
- TLDR: We propose a new answer verification step in Transformer-based answer selection models which is based on a double reranking model and a second neural retrieval stage for question and answer pair.
- Evaluating the Diversity, Equity, and Inclusion of NLP Technology: A Case Study for Indian Languages
- Simran Khanuja, Sebastian Ruder, Partha Talukdar
- TLDR: We propose a new evaluation paradigm that assesses NLP technologies across all three dimensions, and propose a novel approach to optimising resource allocation during fine-tuning.
- Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog
- Mayank Mishra, Danish Contractor, Dinesh Raghu
- TLDR: We present a novel knowledge modality-based model for task oriented dialogs that can fuse structured as well as unstructured knowledge to generate responses.
- Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
- Mohammadreza Banaei, Klaudia Bałazy, Artur Kasymov, Rémi Lebret, Jacek Tabor, Karl Aberer
- TLDR: We propose a novel autoencoder-based offline compression method for transformer language models that significantly outperforms classical matrix factorization methods.
- PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation
- Ishan Jindal, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, Hiroshi Kanayama, Marina Danilevsky, Yunyao Li
- TLDR: We propose a new SRL evaluation metric, PriMeSRL, which penalizes actual failures in SoTA SRL models.
- Prompt-based Learning for Text Readability Assessment
- Bruce W. Lee, Jason Lee
- TLDR: We propose a novel adaptation of a pre-trained seq2seq model for readability assessment and show that it can be adapted to discern which text is more difficult from two given texts (pairwise).
- Best Practices in the Creation and Use of Emotion Lexicons
- Saif Mohammad
- TLDR: We present practical and ethical considerations for emotion lexicons, and provide a comprehensive set of relevant information on how to use them.
- The Role of Semantic Parsing in Understanding Procedural Text
- Hossein Rajaby Faghihi, Parisa Kordjamshidi, Choh Man Teng, James Allen
- TLDR: We investigate whether symbolic semantic representations extracted from deep semantic parsers can help reasoning over the states of involved entities in a procedural text.
- Named Entity Recognition in a Very Homogenous Domain
- Oshin Agarwal, Ani Nenkova
- TLDR: We present a dataset of sentences from news articles from the same newspaper in English that are annotated with named entities and show that even in a homogeneous domain, the performance of named entity recognition models varies significantly across news topics.
- Crawling The Internal Knowledge-Base of Language Models
- Roi Cohen, Mor Geva, Jonathan Berant, Amir Globerson
- TLDR: We propose a method for extracting a knowledge-graph of facts from a language model.
- Intent Identification and Entity Extraction for Healthcare Queries in Indic Languages
- Ankan Mullick, Ishani Mondal, Sourjyadip Ray, Raghav R, G Chaitanya, Pawan Goyal
- TLDR: We propose a new approach to detect query intents and entities in healthcare query data and analyze its practical relevance.
- Text-Derived Knowledge Helps Vision: A Simple Cross-modal Distillation for Video-based Action Anticipation
- Sayontan Ghosh, Tanvi Aggarwal, Minh Hoai, Niranjan Balasubramanian
- TLDR: We show how knowledge in pretrained language models can be adapted and distilled into vision based action anticipation models.
- Simple Yet Effective Synthetic Dataset Construction for Unsupervised Opinion Summarization
- Ming Shen, Jie Ma, Shuai Wang, Yogarshi Vyas, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba
- TLDR: We propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets constructed with aspect-related review contents.
- Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation
- Mohammadreza Tayaranian Hosseini, Alireza Ghaffari, Marzieh S. Tahaei, Mehdi Rezagholizadeh, Masoud Asgharian, Vahid Partovi Nia
- TLDR: We use integer arithmetic for both forward and backward propagation to fine-tune language models while preserving their metric performance.
- Data Augmentation for Radiology Report Simplification
- Ziyu Yang, Santhosh Cherian, Slobodan Vucetic
- TLDR: A novel data augmentation approach for text simplification.
- Embedding Recycling for Language Models
- Jon Saad-Falcon, Amanpreet Singh, Luca Soldaini, Mike D’Arcy, Arman Cohan, Doug Downey
- TLDR: We show how a simple embedding recycling technique that caches activations from an intermediate layer of a pretrained model, and learns task-specific adapters on the later layers, is broadly effective.
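A minimal sketch of the recycling pattern, assuming cached layer-k activations of shape (batch, seq, hidden); module names and sizes are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CachedTaskHead(nn.Module):
    """Train only this small head on activations cached from an
    intermediate layer of a frozen pretrained encoder, instead of
    re-running the lower layers on every epoch."""

    def __init__(self, hidden=768, bottleneck=128, n_classes=2):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Linear(hidden, bottleneck), nn.ReLU(), nn.Linear(bottleneck, hidden)
        )
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, cached_states):
        h = cached_states + self.adapter(cached_states)  # residual adapter
        return self.classifier(h.mean(dim=1))            # mean-pool, classify

cached = torch.randn(4, 16, 768)  # pretend these were loaded from the cache
logits = CachedTaskHead()(cached)
```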
- Trained on 100 million words and still in shape: BERT meets British National Corpus
- David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal
- TLDR: We present a new language modeling benchmark on the British National Corpus and show that pre-training on this carefully curated corpus can reach better performance than the original BERT model.
- Generating Synthetic Speech from SpokenVocab for Speech Translation
- Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
- TLDR: We propose SpokenVocab, a simple, scalable and effective data augmentation technique to convert MT data to ST data on-the-fly.
- Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints
- Albert Lu, Hongxin Zhang, Yanzhe Zhang, Xuezhi Wang, Diyi Yang
- TLDR: We present a generic methodology for analyzing and bounding the abilities of open-ended generative models.
- Learning to Retrieve Engaging Follow-Up Queries
- Christopher Richardson, Sudipta Kar, Anjishnu Kumar, Anand Ramachandran, Zeynab Raeesy, Omar Khan, Abhinav Sethy
- TLDR: We present a retrieval based system and associated dataset for predicting the next questions that the user might have.
- Selective-LAMA: Selective Prediction for Confidence-Aware Evaluation of Language Models
- Hiyori Yoshikawa, Naoaki Okazaki
- TLDR: We propose a new evaluation framework for neural language models that takes the confidence of predictions into account.
- Multi-View Source Ablation for Faithful Summarization
- Shuyang Cao, Liang Ma, Di Lu, Robert L Logan IV, Joel Tetreault, Alejandro Jaimes
- TLDR: We present MuFaSSa, a metric for evaluating faithfulness of abstractive summaries, and for guiding training of more faithful summarizers.
- Mining Effective Features Using Quantum Entropy for Humor Recognition
- Yang Liu, Yuexian Hou
- TLDR: We propose QE-Uncertainty and QE-Incongruity, features based on quantum entropy, for humor recognition.
- AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
- Alexandra Chronopoulou, Matthew Peters, Alexander Fraser, Jesse Dodge
- TLDR: We propose a novel method for weight-space averaging of domain-specific adapters in language models that improves performance on new domains without extra training.
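A minimal sketch of weight-space averaging of adapters (AdapterSoup's per-domain selection of which adapters to average is omitted):

```python
import torch

def average_adapters(state_dicts, weights=None):
    """Average several domain-specific adapter state dicts into one
    'soup' used at test time, with no additional training."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {
        key: sum(w * sd[key] for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }

a = {"down_proj.weight": torch.ones(2, 2)}
b = {"down_proj.weight": torch.zeros(2, 2)}
soup = average_adapters([a, b])  # -> 0.5 everywhere
```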
- Towards End-to-End Open Conversational Machine Reading
- Sizhe Zhou, Siru Ouyang, Zhuosheng Zhang, Hai Zhao
- TLDR: We propose an end-to-end framework for open-retrieval conversational machine reading task that achieves state-of-the-art results on both sub-tasks.
- Generative Knowledge Selection for Knowledge-Grounded Dialogues
- Weiwei Sun, Pengjie Ren, Zhaochun Ren
- TLDR: We propose a simple yet effective generative approach for knowledge selection, called GenKS.
- Evaluating the Tradeoff Between Abstractiveness and Factuality in Abstractive Summarization
- Markus Dreyer, Mengwen Liu, Feng Nan, Sandeep Atluri, Sujith Ravi
- TLDR: We analyze the tradeoff between abstractiveness and factuality of generated summaries across multiple datasets and models, using extensive human evaluations of factuality.
- Fairness in Language Models Beyond English: Gaps and Challenges
- Krithika Ramesh, Sunayana Sitaram, Monojit Choudhury
- TLDR: We present a survey of fairness in multilingual and non-English contexts, highlighting the shortcomings of current research and the difficulties faced by methods designed for English.
- Global-Local Modeling with Prompt-Based Knowledge Enhancement for Emotion Inference in Conversation
- Renxi Wang, Shi Feng
- TLDR: We propose a global-local modeling method based on recurrent neural networks (RNNs) and pre-trained language models (PLMs) for emotion inference, utilizing the sequence modeling ability of RNNs and the abundant knowledge in PLMs.
- Headline Token-based Discriminative Learning for Subheading Generation in News Article
- Joonwon Jang, Misuk Kim
- TLDR: We propose a new subheading generation model using topical headline information and show that it outperforms the comparative models on three news datasets written in two languages.
- Decipherment as Regression: Solving Historical Substitution Ciphers by Learning Symbol Recurrence Relations
- Nishant Kambhatla, Logan Born, Anoop Sarkar
- TLDR: We propose a novel algorithm for deciphering substitution ciphertexts using a Transformer-based causal language model and a sequence prediction task.
- A Survey on Recent Advances in Keyphrase Extraction from Pre-trained Language Models
- Mingyang Song, Yi Feng, Liping Jing
- TLDR: We introduce keyphrase extraction and provide a comparative experimental study of popular supervised as well as unsupervised techniques on several datasets.
- Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues}
- Pride Kavumba, Ana Brassard, Benjamin Heinzerling, Kentaro Inui
- TLDR: We show that explanation prompts improve robustness to adversarial perturbations in natural language inference.
- JobXMLC: EXtreme Multi-Label Classification of Job Skills with Graph Neural Networks
- Nidhi Goyal, Jushaan Kalra, Charu Sharma, Raghava Mutharaju, Niharika Sachdeva, Ponnurangam Kumaraguru
- TLDR: We propose a novel skill prediction framework called JobXMLC, which uses graph neural networks with skill attention to predict missing skills from job descriptions.
- ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
- Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu
- TLDR: We present a novel vision-language benchmark for human activity planning based on video clips about their initial activities and intents in text.
- Grammatical Error Correction through Round-Trip Machine Translation
- Yova Kementchedjhieva, Anders Søgaard
- TLDR: We present a new approach to machine translation that uses round-trip translation to guide grammatical error correction.
- Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?
- Hiroto Tamura, Tosho Hirasawa, Hwichan Kim, Mamoru Komachi
- TLDR: We found that pre-training neural machine translation models with artificial data improves translation performance in low-resource situations.
- Performance and Risk Trade-offs for Multi-word Text Prediction at Scale
- Aniket Vashishtha, S Sai Prasad, Payal Bajaj, Vishrav Chaudhary, Kate Cook, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury
- TLDR: We examine the performance and risk trade-offs of multi-word text prediction at scale, discuss the challenges of toxicity and fairness, and present a novel method for evaluating toxicity detection approaches.
- Searching for Better Database Queries in the Outputs of Semantic Parsers
- Anton Osokin, Irina Saparina, Ramil Yarullin
- TLDR: We propose a search algorithm over the outputs of semantic parsers that finds better database queries by checking candidates against an evaluation criterion.
- Style-Aware Contrastive Learning for Multi-Style Image Captioning
- Yucheng Zhou, Guodong Long
- TLDR: Style-aware contrastive learning for multi-style image captioning.
- Strategize Before Teaching: A Conversational Tutoring System with Pedagogy Self-Distillation
- Lingzhi Wang, Mrinmaya Sachan, Xingshan Zeng, Kam-Fai Wong
- TLDR: We propose a unified framework that combines teaching response generation and pedagogical strategy prediction, where a self-distillation mechanism is adopted to guide the teaching strategy learning and facilitate tutor response generation.
- ICA-Proto: Iterative Cross Alignment Prototypical Network for Incremental Few-Shot Relation Classification
- Wangjie Jiang, Zhihao Ye, Bang Liu, Ruihui Zhao, Jianguang Zheng, Mengyao Li, Zhiyong Li, Yujiu Yang, Yefeng Zheng
- TLDR: Iterative Cross Alignment prototypical network for incremental few-shot relation classification.
- A Large-Scale Multilingual Study of Visual Constraints on Linguistic Selection of Descriptions
- Uri Berger, Lea Frermann, Gabriel Stanovsky, Omri Abend
- TLDR: We present a large, multilingual study into how vision constrains linguistic choice, covering four languages and five linguistic properties, such as verb transitivity or use of numerals.
- How Much Syntactic Supervision is “Good Enough”?
- Hiroshi Noji, Yohei Oseki
- TLDR: We explore how much syntactic supervision is “good enough” to make language models (LMs) more human-like.
- Are the Best Multilingual Document Embeddings simply Based on Sentence Embeddings?
- Sonal Sannigrahi, Josef van Genabith, Cristina España-Bonet
- TLDR: We provide a systematic comparison of methods to produce document-level representations from sentences based on LASER, LaBSE, and Sentence BERT pre-trained multilingual models.
- Improving User Controlled Table-To-Text Generation Robustness
- Hanxu Hu, Yunqing Liu, Zhongyi Yu, Laura Perez-Beltrachini
- TLDR: We study user-controlled table-to-text generation under noisy cell selections and show that training with such noise improves robustness on the ToTTo dataset.
- Better Pre-Training by Reducing Representation Confusion
- Haojie Zhang, Mingfei Liang, Ruobing Xie, Zhenlong Sun, Bo Zhang, Leyu Lin
- TLDR: We propose two novel techniques to improve Transformer-based pre-trained language models and identify two different types of information confusion in position encoding and model representations, respectively.
- MAFiD: Moving Average Equipped Fusion-in-Decoder for Question Answering over Tabular and Textual Data
- Sung-Min Lee, Eunhwan Park, Daeryong Seo, Donghyeon Jeon, Inho Kang, Seung-Hoon Na
- TLDR: We propose a fusion-in-decoder and exponential moving average model for long-range reasoning over tables and texts.
- Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis
- Akshita Jha, Adithya Samavedhi, Vineeth Rakesh, Jaideep Chandrashekar, Chandan Reddy
- TLDR: We empirically demonstrate the effectiveness of simple neural models (such as feed-forward networks, and CNNs) and simple embeddings (like GloVe, and Paragraph Vector) over transformer-based models on the task of document matching.
- Simple and Effective Multi-Token Completion from Masked Language Models
- Oren Kalinsky, Guy Kushilevitz, Alexander Libov, Yoav Goldberg
- TLDR: We show that pre-trained neural masked language models can be adapted to produce multi-token completions, with only a modest addition to their parameter count.
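For context, a naive iterative baseline for multi-token completion with an off-the-shelf MLM, filling masks left to right (the paper instead adapts the model with a modest parameter addition rather than decoding iteratively like this):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "The capital of France is [MASK] [MASK]."
for _ in range(2):  # one pass per mask
    inputs = tok(text, return_tensors="pt")
    # position of the leftmost remaining [MASK]
    mask_pos = (inputs.input_ids[0] == tok.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    predicted = tok.decode([logits.argmax().item()]).strip()
    text = text.replace(tok.mask_token, predicted, 1)
print(text)
```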
- A Survey on Dynamic Neural Networks for Natural Language Processing
- Canwen Xu, Julian McAuley
- TLDR: We present three types of dynamic neural networks in NLP and show their performance and challenges.
- Transformers with Learnable Activation Functions
- Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych
- TLDR: We propose a rational activation function based Transformer model that outperforms the baseline model on GLUE and SQuAD.
- The Solvability of Interpretability Evaluation Metrics
- Yilun Zhou, Julie Shah
- TLDR: We show that interpretability evaluation metrics are solvable, in that an explanation can be directly optimized for a metric, and we propose a beam search explainer for neural network predictions that does so.
- Reliable Gradient-free and Likelihood-free Prompt Tuning
- Maohao Shen, Soumya Ghosh, Prasanna Sattigeri, Subhro Das, Yuheng Bu, Gregory Wornell
- TLDR: We develop methods to tune soft prompts in language models with only API access.
- Combining Psychological Theory with Language Models for Suicide Risk Detection
- Daniel Izmaylov, Avi Segal, Kobi Gal, Meytal Grimland, Yossi Levi-Belz
- TLDR: We propose a new language model for automatic suicide detection in low-resource languages, which outperforms a wide range of strong baselines.
- Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension
- Chen Zhang, Yuxuan Lai, Yansong Feng, Xingyu Shen, Haowei Du, Dongyan Zhao
- TLDR: We propose a novel approach for cross-lingual question answering over knowledge base (xKBQA) in reading comprehension paradigm.
- Delving Deeper into Cross-lingual Visual Question Answering
- Chen Liu, Jonas Pfeiffer, Anna Korhonen, Ivan Vulić, Iryna Gurevych
- TLDR: We analyze cross-lingual VQA across different question types and languages, and identify question types that are the most difficult to improve on.
- Bridging Argument Quality and Deliberative Quality Annotations with Adapters
- Neele Falk, Gabriella Lapesa
- TLDR: We propose adapter-fusion as a multi-task learning framework for argument quality dimensions and show that it improves the prediction of quality dimensions.
- Interventional Probing in High Dimensions: An NLI Case Study
- Julia Rozanova, Marco Valentino, Lucas Cordeiro, André Freitas
- TLDR: We investigate the effect of semantic features intermediate to natural logic on NLI classification.
- Program Synthesis for Complex QA on Charts via Probabilistic Grammar Based Filtered Iterative Back-Translation
- Shabbirhussain Bhaisaheb, Shubham Paliwal, Rajaswa Patil, Manasi Patwardhan, Lovekesh Vig, Gautam Shroff
- TLDR: We present a novel chart-based question answering algorithm for reasoning-based queries from chart images.
- Exploiting Language Characteristics for Legal Domain-Specific Language Model Pretraining
- Inderjeet Nair, Natwar Modani
- TLDR: We propose domain-agnostic pretraining objectives for large language models that aim to exploit the domain specific language characteristics.
- Global Constraints with Prompting for Zero-Shot Event Argument Classification
- Zizheng Lin, Hongming Zhang, Yangqiu Song
- TLDR: We propose a novel method that uses global constraints with prompting to effectively tackle event argument classification without any annotation or task-specific training.
- Distillation of encoder-decoder transformers for sequence labelling
- Marco Farina, Duccio Pappadopulo, Anant Gupta, Leslie Huang, Ozan Irsoy, Thamar Solorio
- TLDR: We propose a hallucination-free framework for sequence tagging that is especially suited for distillation.
- Predicting Desirable Revisions of Evidence and Reasoning in Argumentative Writing
- Tazin Afrin, Diane Litman
- TLDR: We develop models to classify desirable evidence and desirable reasoning revisions in student argumentative writing.
- Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues
- Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloe Braud, Giuseppe Carenini
- TLDR: We propose a novel approach to infer latent discourse structures for dialogues based on attention matrices from Pre-trained Language Models.
- Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision
- Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi
- TLDR: We propose a weighted contrastive learning method for distant supervision that uses the supervised data to estimate the reliability of pre-training instances and explicitly reduce the effect of noise.
- CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension
- Zhi Zhang, Helen Yannakoudakis, Xiantong Zhen, Ekaterina Shutova
- TLDR: We propose a novel framework for Commonsense Knowledge Enhanced Transformers (CK-Transformer) which effectively integrates commonsense knowledge into the representations of objects in an image, facilitating identification of the target objects referred to by the expressions.
- Curricular Next Conversation Prediction Pretraining for Transcript Segmentation
- Anvesh Rao Vijjini, Hanieh Deilamsalehy, Franck Dernoncourt, Snigdha Chaturvedi
- TLDR: We pretrain a model for transcript segmentation using next conversation prediction with curriculum learning.