Evaluating Bias Detection in Lightweight LLMs
Veronika Bryskina
Vytautas Magnus University
Milita Songailaitė
Vytautas Magnus University
Justina Mandravickaitė
Vytautas Magnus University
Published 2026-05-08
https://doi.org/10.15388/LMITT.2026.4

Keywords

bias detection
LLM
benchmarking
open-source LLMs
evaluation

Abstract

This study evaluates the ability of lightweight open-source large language models (LLMs) to detect bias in text. Eleven models from six popular LLM families were tested in a zero-shot setting on a unified dataset of 8,745 sentences drawn from three selected sources, covering gender, race, religion, and appearance bias. None of the models exceeded 70% accuracy, highlighting the limitations of lightweight LLMs and persistent challenges with current bias detection datasets.
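To make the zero-shot setup concrete, the sketch below shows how such an evaluation loop might look in Python. The model choice, prompt wording, and label parsing are illustrative assumptions, not the authors' exact protocol, and the two-sentence dataset is a toy stand-in for the 8,745-sentence benchmark.

```python
# Hypothetical sketch of a zero-shot bias detection evaluation.
# Model name, prompt, and label extraction are assumptions for illustration.
from transformers import pipeline

# Any lightweight open-source instruction-tuned model could stand in here.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

PROMPT = (
    "Does the following sentence contain social bias "
    "(e.g., gender, race, religion, or appearance)? "
    "Answer with 'biased' or 'unbiased'.\n\nSentence: {sentence}\nAnswer:"
)

def predict(sentence: str) -> str:
    """Zero-shot: the prompt contains only the instruction, no examples."""
    out = generator(PROMPT.format(sentence=sentence),
                    max_new_tokens=5, do_sample=False)
    # The pipeline returns prompt + continuation; keep only the continuation.
    answer = out[0]["generated_text"].lower().rsplit("answer:", 1)[-1]
    # Check "unbiased" first, since "biased" is a substring of it.
    return "unbiased" if "unbiased" in answer else "biased"

# Toy stand-in for the labeled benchmark sentences.
data = [("Women are bad at math.", "biased"),
        ("The committee met on Tuesday.", "unbiased")]
accuracy = sum(predict(s) == y for s, y in data) / len(data)
print(f"Accuracy: {accuracy:.2%}")
```

Accuracy here is simple label agreement over the benchmark, matching the metric reported in the abstract.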




This work is licensed under a Creative Commons Attribution 4.0 International License.

