Addressing Age-Related Bias in Sentiment Analysis

Díaz, Mark; Johnson, Isaac; Lazar, Amanda; Piper, Anne Marie; Gergle, Darren

Computational approaches to text analysis are useful for understanding aspects of online interaction, such as opinions and subjectivity in text. Yet recent studies have identified various forms of bias in language-based models, raising concerns about the risk of propagating social biases against certain groups based on sociodemographic factors (e.g., gender, race, geography). In this study, we contribute a systematic examination of the application of language models to the study of discourse on aging. We analyze the treatment of age-related terms across 15 sentiment analysis models and 10 widely used GloVe word embeddings, and we attempt to alleviate bias through a method of processing model training data. Our results demonstrate that significant age bias is encoded in the outputs of many sentiment analysis algorithms and word embeddings. We discuss the models’ characteristics in relation to output bias and how these models might best be incorporated into research.
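
To make the idea of comparing a model's treatment of age-related terms concrete, the following minimal sketch scores sentence templates that differ only in an age term and reports the average gap. It is an illustration under stated assumptions, not the procedure used in this study: the lexicon, the scorer `toy_sentiment`, the helper `age_term_gap`, and the templates are all hypothetical stand-ins, and a real analysis would substitute an actual sentiment model for the toy scorer.

```python
from statistics import mean

# Hypothetical toy lexicon (an assumption for illustration only).
# The age terms are deliberately given non-neutral weights so the
# sketch has a bias to detect; real models learn weights from data.
TOY_LEXICON = {
    "happy": 1.0,
    "great": 1.0,
    "sad": -1.0,
    "tired": -0.5,
    "old": -0.5,
    "young": 0.5,
}


def toy_sentiment(sentence):
    """Average the lexicon scores of known tokens; return 0.0 if none match."""
    tokens = sentence.lower().split()
    scores = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    return mean(scores) if scores else 0.0


def age_term_gap(templates, old_term, young_term, score=toy_sentiment):
    """Mean of score(young variant) - score(old variant) across templates.

    Each template must contain an "{age}" slot. A gap far from zero
    suggests the scorer treats the two terms differently when the rest
    of the sentence is held fixed.
    """
    gaps = [
        score(t.format(age=young_term)) - score(t.format(age=old_term))
        for t in templates
    ]
    return mean(gaps)


# Hypothetical templates; only the age term varies within each pair.
templates = [
    "my {age} neighbor is happy",
    "the {age} teacher seemed tired",
]

gap = age_term_gap(templates, "old", "young")
print(f"mean sentiment gap (young - old): {gap:+.2f}")
```

Passing a different callable as `score` lets the same comparison be run against any sentiment model that maps a sentence to a number, which is one simple way to compare several models on identical inputs.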