Abstract
Detecting social bias in text is challenging because of nuance, subjectivity,
and the difficulty of obtaining high-quality labeled datasets at scale, especially
given the evolving nature of social biases and society. To address these
challenges, we propose a few-shot instruction-based method for prompting
pre-trained language models (LMs). We select a few class-balanced exemplars
from a small support repository that are closest to the query to be labeled in
the embedding space. We then provide the LM with an instruction that consists of
this subset of labeled exemplars, the query text to be classified, and a
definition of bias, and prompt it to make a decision. We demonstrate that large LMs used
in a few-shot context can detect different types of fine-grained biases with
similar and sometimes superior accuracy to fine-tuned models. We observe that
the largest model, with 530B parameters, is significantly more effective at
detecting social bias than smaller models (achieving at least a 13% improvement
in AUC over the other models). It also maintains a high AUC (dropping
less than 2%) when the labeled repository is reduced to as few as 100
samples. Large pretrained language models thus make it easier and quicker to
build new bias detectors.
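
The selection and prompting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the embedding model, the distance measure, or the prompt wording, so cosine similarity, the prompt template, the "yes"/"no" label strings, and the names `cosine`, `select_exemplars`, and `build_prompt` are all assumptions made for illustration. Embeddings are passed in as plain lists of floats, and the call to the language model itself is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (assumed metric; not stated in the abstract)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_exemplars(query_vec, support, k_per_class=2):
    """Pick the k nearest support items for each class, so the result is class-balanced.

    `support` is a list of (text, label, embedding) tuples (hypothetical layout).
    Returns a list of (text, label) pairs, grouped by label in sorted order.
    """
    by_label = {}
    for text, label, vec in support:
        by_label.setdefault(label, []).append((cosine(query_vec, vec), text))

    chosen = []
    for label in sorted(by_label):
        # Rank this class's items by similarity to the query, most similar first.
        ranked = sorted(by_label[label], key=lambda pair: pair[0], reverse=True)
        chosen.extend((text, label) for _, text in ranked[:k_per_class])
    return chosen


def build_prompt(definition, exemplars, query):
    """Assemble an instruction from a bias definition, labeled exemplars, and the query.

    The template is illustrative only; the paper's actual wording is not given here.
    """
    lines = [f"Definition: {definition}", ""]
    for text, label in exemplars:
        lines.append(f"Text: {text}")
        lines.append(f"Biased: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Biased:")
    return "\n".join(lines)
```

In use, the query text would be embedded with the same encoder as the support repository, `select_exemplars` would return the nearest labeled examples for each class, and the string from `build_prompt` would be sent to the language model, whose continuation after the final "Biased:" serves as the decision.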