Author

Bryan Holster

Date

Spring 5-1-2015

Document Type

Honors Project

First Advisor

Davies, Stephen

Degree Name

Bachelor of Science

Major or Concentration

Computer Science

Department or Program

Computer Science

Abstract

In this thesis, I explore the identification of non-human Twitter users. I am interested in classifying users by their behavior as either bot or human. My goal in this research is to find an accurate and efficient means of identifying non-human Twitter users and separating them from their human counterparts. I use a two-stage data collection process: I first collect Twitter users suspected of being bots, and then obtain a majority vote on each suspected user to validate the suspicion. I gather, on average, 1000 tweets per user, from which I calculate 40 features characterizing the user. I compare the effectiveness of three different methods for classifying users as either bot or human based on these features. The results of this work show that bots can be classified efficiently and with a high degree of accuracy. I also show that certain features play a larger role in the classification process than others. The applications of Twitter bot identification include: (i) protecting users from malicious content, (ii) spam filtering, and (iii) bot removal from Twitter data for other research.
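The majority-vote validation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how many votes were collected per user or how ties were handled, so the function name `majority_label`, the sample votes, and the rule that a tie yields no label are all assumptions made for illustration, not the thesis's actual procedure.

```python
from collections import Counter


def majority_label(votes):
    """Return the label held by a strict majority of votes, or None if there is none."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Assumption: a tie (no strict majority) leaves the user unlabeled.
    return label if count > len(votes) / 2 else None


# Hypothetical votes for three suspected accounts (illustrative data only).
votes_by_user = {
    "user_a": ["bot", "bot", "human"],
    "user_b": ["human", "human", "human"],
    "user_c": ["bot", "human"],
}

labels = {user: majority_label(votes) for user, votes in votes_by_user.items()}
```

Under these assumptions, users without a strict majority would presumably be dropped or re-examined before features are computed.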

Language

English
