Posts by Collection

publications

[Re] Replication study of ‘Data-Driven Methods for Balancing Fairness and Efficiency in Ride-Pooling’

Published in ReScience C, 2022

Replication study evaluating claims related to fairness-based objective functions for ride-pooling matching systems.

Neplenbroek, V., Perdijk, S., and Prins, V. 2022. [Re] Replication study of ‘Data-Driven Methods for Balancing Fairness and Efficiency in Ride-Pooling’. ReScience C 8, 2, #29. https://rescience.github.io/bibliography/Neplenbroek_2022.html

LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks

Published in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 2024

We provide JUDGE-BENCH, a collection of 20 NLP datasets with human annotations, and comprehensively evaluate 11 current LLMs, covering both open-weight and proprietary models, for their ability to replicate the annotations. Our evaluations show that each LLM exhibits a large variance across datasets in its correlation to human judgments. We conclude that LLMs are not yet ready to systematically replace human judges in NLP.

Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, Andre Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, and Alberto Testoni. 2025. LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 238–255, Vienna, Austria. Association for Computational Linguistics. https://aclanthology.org/2025.acl-short.20

MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs

Published in COLM, 2024

MBBQ (Multilingual Bias Benchmark for Question-answering) is a carefully curated version of the English BBQ dataset extended to Dutch, Spanish, and Turkish, which measures stereotypes commonly held across these languages. Our results based on several open-source and proprietary LLMs confirm that some non-English languages suffer from bias more than English, and that there are significant cross-lingual differences in bias behaviour for all except the most accurate models.

Neplenbroek, V., Bisazza, A. and Fernández, R., 2024. MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs. In the first Conference on Language Modeling (COLM) 2024. https://openreview.net/pdf?id=X9yV4lFHt4

Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation

Published in Findings of the Association for Computational Linguistics: ACL 2025, 2024

Finetuning on specialized datasets can mitigate harmful behavior, and doing this in English can transfer to other languages. In this work we also observe this transfer and show that the extent to which transfer takes place can be predicted by the amount of data in a given language present in the model’s pretraining data. However, this transfer of bias and toxicity mitigation often comes at the expense of decreased language generation ability in non-English languages.

Vera Neplenbroek, Arianna Bisazza, and Raquel Fernández. 2025. Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 2805–2830, Vienna, Austria. Association for Computational Linguistics. https://aclanthology.org/2025.findings-acl.145

Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization

Published in Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

In this work we show that LLMs infer the user’s demographic attributes based on stereotypical signals in the conversation, and that for a number of groups this inference persists even when the user explicitly identifies with a different demographic group. We effectively mitigate this form of stereotype-driven implicit personalization by intervening on the model’s internal representations using a trained linear probe to steer them toward the explicitly stated identity.

Vera Neplenbroek, Arianna Bisazza, and Raquel Fernández. 2025. Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 20378–20411, Suzhou, China. Association for Computational Linguistics. https://aclanthology.org/2025.emnlp-main.1029/

One Persona, Many Cues, Different Results: How Sociodemographic Cues Impact LLM Personalization

Published in arXiv, 2026

We compare using different cues to convey the same user persona and find that while cues are overall highly correlated, they produce substantial variance in responses across personas.

Weeber, F., Neplenbroek, V., Batzner, J., & Padó, S. (2026). One Persona, Many Cues, Different Results: How Sociodemographic Cues Impact LLM Personalization. arXiv preprint arXiv:2601.18572. https://arxiv.org/abs/2601.18572

talks

Workshop on New Perspectives on Bias and Discrimination in Language Technology

Published: November 04, 2024

Presentation of “MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs”.

Comparing and Mitigating Bias and Toxicity Across Languages

Published: March 14, 2025

Large language models (LLMs) are used by vast numbers of speakers around the world, and show remarkable performance in many non-English languages. However, they often receive safety fine-tuning only in English, if at all, and their performance is known to be inconsistent across languages. There is therefore a need to investigate to what extent LLMs exhibit harmful biases and toxic behaviors across languages and how such harmful behaviors can best be reduced. In this talk I will discuss my work, which shows that the stereotypical bias exhibited by LLMs differs significantly depending on the language they are prompted in. Furthermore, we show that mitigation of these stereotypical biases and toxic behaviors performed in English transfers to other languages, though often at the expense of decreased language generation ability in those non-English languages.

HumanCLAIM Workshop

Published: March 26, 2025

Poster presentation of “Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation”.

NatWest Group DS Seminar Series

Published: August 28, 2025

Presentation of “Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization”.

Visit to MilaNLP

Published: November 24, 2025

Presentation of “Reading Between the Prompts: How Stereotypes Shape LLM’s Implicit Personalization” and ongoing work.

Social (un)safety of LLMs

Published: January 28, 2026

Amsterdam AI invited me to give a talk as part of their “Technology for people” session at the Deep Tech Day 2026.

How User Sociodemographics Inadvertently Affect LLM-User Conversations

Published: March 06, 2026

Generative Large Language Models (LLMs) personalize responses based on perceived user characteristics, a phenomenon called implicit personalization. This can enhance user experience but also introduce or amplify bias when models infer sociodemographic attributes from subtle conversational cues. In this talk, I present a systematic investigation of how LLMs infer and act on such information when confronted with stereotypical cues. Using controlled synthetic conversations, we show that models extract demographic attributes from stereotypical cues and encode them in latent user representations; notably, for several groups, these inferences persist even when users explicitly identify with a different demographic group. I will also present work in which we examine the methodological foundations of persona-based bias research by comparing six commonly used sociodemographic cues across seven LLMs on a range of tasks. Although outputs generated from different cues are often correlated, we observe substantial variance in how personas are realized, underscoring LLM sensitivity to prompt formulation and cautioning against drawing conclusions from a single cue. Together, our findings highlight both the prevalence and malleability of demographic inference in LLMs and argue for greater transparency, methodological rigor, and user control in personalization research and deployment.
