Date of Award

Spring 4-28-2023

Document Type

Honors Project

Degree Name

Bachelor of Science

Department

Computer Science

Department Chair or Program Director

Anewalt, Karen

First Advisor

Davies, Stephen

Major or Concentration

Other

Abstract

Polarization in the political sphere, seen in combative communication and stalemate, may have negative social effects on the population. Attempts to measure political polarization among the public through self-reported surveys and interviews can introduce social desirability bias. Classifying thoughts written freely online allows political polarization to be measured more impartially. Reddit is one platform that enables users to share opinions and hold discussions anonymously; this text can be used to gauge the political climate at any given time. Disagreement has grown over the perceived level of polarization in our society. The purpose of my research is to measure the degree of political polarization over time by collecting and classifying threads of dialogue within political communities on Reddit.

I used Reddit APIs to gather threads of text from multiple subreddits. I recruited a team of evaluators to hand-annotate the threads for polarization. I then built a machine learning classifier to predict whether a thread posted on Reddit is polarized, training a multilayer perceptron on the hand-tagged data. Depending on parameter choices, the classifier achieved an accuracy between 75% and 80% on an independent test set. There does not appear to be a strong pattern indicating an overall rise in polarization on Reddit from 2008 to the present. However, some individual subreddits do show evidence of an increase in polarization.
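
As a rough illustration of how per-thread labels could be turned into a trend over time, the sketch below computes the share of threads labeled polarized for each subreddit and year. It is a minimal sketch and not the project's actual code: the function name `polarized_share_by_year`, the `(subreddit, year, is_polarized)` tuple layout, and the toy records are all assumptions made for this example, and the labels stand in for classifier output.

```python
from collections import defaultdict


def polarized_share_by_year(threads):
    """Return {subreddit: {year: fraction of threads labeled polarized}}.

    `threads` is an iterable of (subreddit, year, is_polarized) tuples.
    This layout is an assumption made for the sketch.
    """
    # For each subreddit and year, keep [polarized count, total count].
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for subreddit, year, is_polarized in threads:
        cell = counts[subreddit][year]
        cell[0] += int(is_polarized)
        cell[1] += 1
    return {
        subreddit: {
            year: polarized / total
            for year, (polarized, total) in sorted(years.items())
        }
        for subreddit, years in counts.items()
    }


# Hypothetical toy labels for illustration only, not data from the project.
toy_threads = [
    ("subreddit_a", 2008, False),
    ("subreddit_a", 2008, False),
    ("subreddit_a", 2022, True),
    ("subreddit_a", 2022, False),
    ("subreddit_b", 2008, True),
    ("subreddit_b", 2022, True),
]

shares = polarized_share_by_year(toy_threads)
for subreddit, by_year in shares.items():
    print(subreddit, by_year)
```

Grouping by subreddit as well as by year matters here because an aggregate trend can look flat while individual communities move, which is the distinction the abstract draws.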

Included in

Data Science Commons
