News
Accepted papers at EACL 2024
ATHENE researcher Prof. Iryna Gurevych presented seven papers at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Two of them are directly related to her work in the ATHENE research project "Fake News and Conspiracy Theories" in the research area Secure Digital Transformation in Health Care (SeDiTraH). EACL is one of the leading European conferences in computational linguistics and covers a broad spectrum of research areas concerned with computational approaches to natural language.
The two publications in the ATHENE context are:
Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification
Luke Bates and Iryna Gurevych
More about the paper:
Text classification is the most important tool for natural language processing practitioners, but performant systems are too slow, too cumbersome, or too unpredictable to deploy and use reliably. Practical text classification is essential for content moderation, that is, flagging undesirable text on social media platforms to ensure a safe experience for users. Content moderation is especially difficult because new types of undesirable text emerge constantly, for example novel fake news topics. The researchers address these issues with a text classifier that uses inexpensive distance-based algorithms to modify the input text with text the model already knows, signaling to the model that it has seen similar instances before. Their method not only detects harmful content but is also lightweight and more performant than more expensive systems.
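The idea of enriching an input with text the model already knows can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy bag-of-words cosine similarity in place of learned sentence embeddings, a hypothetical `[SEP]` separator, and made-up example data. The function names are illustrative only.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def augment_with_neighbor(text, labeled_examples):
    """Append the most similar known example and its label to the input."""
    query = bag_of_words(text)
    neighbor_text, neighbor_label = max(
        labeled_examples,
        key=lambda example: cosine(query, bag_of_words(example[0])),
    )
    # The "[SEP]" separator and the "label: text" layout are assumptions.
    return f"{text} [SEP] {neighbor_label}: {neighbor_text}"


# Hypothetical toy data, not taken from the paper.
known = [
    ("Miracle pill cures every disease overnight", "harmful"),
    ("City council approves new bike lanes", "harmless"),
]
augmented = augment_with_neighbor("This pill cures any disease", known)
print(augmented)
```

In a real system, the augmented string would then be passed to a trained classifier. The sketch only shows the distance-based lookup and the modification of the input.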
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Singh Sachdeva, Martin Tutek and Iryna Gurevych
More about the paper:
In recent years, large language models (LLMs) have shown remarkable capabilities at scale, especially in generating text based on a given instruction. In their work, the researchers explore how LLMs can be used to augment the training data of smaller language models (SLMs). To do this, they add automatically generated counterfactual (CF) instances, that is, minimally modified inputs, to the training data. The aim is to improve the performance of SLMs in extractive question answering (QA) outside their training domain.
The researchers show that this data augmentation consistently improves out-of-domain performance across different LLM generators. It also improves model calibration, both for confidence-based and for rationale-augmented calibrator models.
Other papers by Prof. Gurevych's research group that were accepted at EACL are:
Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon
Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych and Timothy Baldwin
Document Structure in Long Document Transformers
Jan Buchmann, Max Eichler, Jan-Micha Bodensohn, Ilia Kuznetsov and Iryna Gurevych
Predicting Client Emotions and Therapist Interventions in Psychotherapy Dialogues
Tobias Mayer, Neha Warikoo, Amir Eliassaf, Dana Atzil-Slonim and Iryna Gurevych
Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting
Tilman Beck, Hendrik Schuff, Anne Lauscher and Iryna Gurevych
The paper received the "Social Impact Award".
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection
Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, Osama Mohammed Afzal, Tarek Mahmoud, Toru Sasaki, Thomas Arnold, Alham Aji, Nizar Habash, Iryna Gurevych and Preslav Nakov
The paper received the "Resource Paper Award".
EACL 2024 took place from March 17 to 22, 2024, in Malta.
More about the ATHENE-project "Fake News and Conspiracy Theories"