Uncovering Stereotypes in Large Language Models: A Task Complexity-based Approach

Hari Shrawgi, Prasanjit Rath, Tushar Singhal, Sandipan Dandapat

Main: Ethics and NLP Oral Paper

Session 9: Ethics and NLP (Oral)
Conference Room: Carlson
Conference Time: March 20, 09:00-10:30 (CET) (Europe/Malta)
TLDR:
You can open the #paper-263-Oral channel in a separate window.
Abstract: Recent Large Language Models (LLMs) have unlocked unprecedented applications of AI. As these models continue to transform human life, there are growing socio-ethical concerns around their inherent stereotypes that can lead to bias in their applications. There is an urgent need for holistic bias evaluation of these LLMs. Few such benchmarks exist today and evaluation techniques that do exist are either non-holistic or may provide a false sense of security as LLMs become better at hiding their biases on simpler tasks. We address these issues with an extensible benchmark - LLM Stereotype Index (LSI). LSI is grounded on Social Progress Index, a holistic social benchmark. We also test the breadth and depth of bias protection provided by LLMs via a variety of tasks with varying complexities. Our findings show that both ChatGPT and GPT-4 have strong inherent prejudice with respect to nationality, gender, race, and religion. The exhibition of such issues becomes increasingly apparent as we increase task complexity. Furthermore, GPT-4 is better at hiding the biases, but when displayed it is more significant. Our findings highlight the harms and divide that these LLMs can bring to society if we do not take very diligent care in their use.