Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable the publishing of future-dated posts, edit _config.yml and set future: false.
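In a default Jekyll site this option lives in _config.yml at the repository root; a minimal sketch of the relevant setting, assuming the standard Academic Pages layout:

```yaml
# _config.yml (site root) — minimal excerpt, other settings omitted.
# When false, posts with a date in the future are not rendered at build time.
future: false
```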

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Published in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 2025

We provide JUDGE-BENCH, a collection of 20 NLP datasets with human annotations, and comprehensively evaluate 11 current LLMs, covering both open-weight and proprietary models, for their ability to replicate the annotations. Our evaluations show that each LLM exhibits a large variance across datasets in its correlation to human judgments. We conclude that LLMs are not yet ready to systematically replace human judges in NLP.

Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, Andre Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, and Alberto Testoni. 2025. LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 238–255, Vienna, Austria. Association for Computational Linguistics. https://aclanthology.org/2025.acl-short.20

MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs

Published in COLM, 2024

MBBQ (Multilingual Bias Benchmark for Question-answering) is a carefully curated version of the English BBQ dataset extended to Dutch, Spanish, and Turkish, which measures stereotypes commonly held across these languages. Our results based on several open-source and proprietary LLMs confirm that some non-English languages suffer from bias more than English, and that there are significant cross-lingual differences in bias behaviour for all except the most accurate models.

Vera Neplenbroek, Arianna Bisazza, and Raquel Fernández. 2024. MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs. In the First Conference on Language Modeling (COLM 2024). https://openreview.net/pdf?id=X9yV4lFHt4

Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation

Published in Findings of the Association for Computational Linguistics: ACL 2025, 2025

Finetuning on specialized datasets can mitigate harmful behavior, and doing this in English can transfer to other languages. In this work we also observe this transfer and show that the extent to which transfer takes place can be predicted by the amount of data in a given language present in the model’s pretraining data. However, this transfer of bias and toxicity mitigation often comes at the expense of decreased language generation ability in non-English languages.

Vera Neplenbroek, Arianna Bisazza, and Raquel Fernández. 2025. Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 2805–2830, Vienna, Austria. Association for Computational Linguistics. https://aclanthology.org/2025.findings-acl.145

Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization

Published in arXiv, 2025

In this work we show that LLMs infer the user’s demographic attributes based on stereotypical signals in the conversation, which for a number of groups even persists when the user explicitly identifies with a different demographic group. We effectively mitigate this form of stereotype-driven implicit personalization by intervening on the model’s internal representations using a trained linear probe to steer them toward the explicitly stated identity.

Vera Neplenbroek, Arianna Bisazza, and Raquel Fernández. 2025. Reading Between the Prompts: How Stereotypes Shape LLM's Implicit Personalization. arXiv preprint arXiv:2505.16467. https://arxiv.org/abs/2505.16467

talks

Comparing and Mitigating Bias and Toxicity Across Languages

Published:

Large language models (LLMs) are used by vast numbers of speakers around the world and show remarkable performance in many non-English languages. However, they often receive safety fine-tuning only in English, if at all, and their performance is known to be inconsistent across languages. There is therefore a need to investigate to what extent LLMs exhibit harmful biases and toxic behaviors across languages, and how such harmful behaviors can best be reduced. In this talk I discuss my work showing that the stereotypical bias exhibited by LLMs differs significantly depending on the language they are prompted in. Furthermore, we show that mitigating these stereotypical biases and toxic behaviors in English transfers to other languages, though often at the expense of decreased language generation ability in those non-English languages.

HumanCLAIM Workshop

Published:

Poster presentation of “Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation”.

NatWest Group DS Seminar Series

Published:

Presentation of “Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization”.

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.