
RESEARCH ARTICLE

Automatic detection of cyberbullying in social media text

Cynthia Van Hee1☯*, Gilles Jacobs1☯, Chris Emmery2, Bart Desmet1, Els Lefever1, Ben Verhoeven2, Guy De Pauw2, Walter Daelemans2, Véronique Hoste

1 Department of Translation, Interpreting and Communication - Faculty of Arts and Philosophy, Ghent

University, Ghent, Belgium, 2 Department of Linguistics - Faculty of Arts, University of Antwerp, Antwerp,

Belgium

☯ These authors contributed equally to this work. * [email protected]

Abstract

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying constitutes a growing problem among youngsters. Successful prevention depends on the adequate detection of potentially harmful messages and the information overload on the Web requires intelligent systems to identify potential risks automatically. The focus of this paper is on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a cyberbullying corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We make use of linear support vector machines exploiting a rich feature set and investigate which information sources contribute the most for the task. Experiments on a hold-out test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1 score of 64% and 61% for English and Dutch respectively, and considerably outperforms baseline systems.

Introduction

Web 2.0 has had a substantial impact on communication and relationships in today’s society.

Children and teenagers go online more frequently, at younger ages, and in more diverse ways

(e.g. smartphones, laptops and tablets). Although most of teenagers’ Internet use is harmless

and the benefits of digital communication are evident, the freedom and anonymity experi-

enced online makes young people vulnerable with cyberbullying being one of the major threats

[1, 2].

Bullying is not a new phenomenon, and cyberbullying manifested itself as soon as digital technologies became primary communication tools. On the positive side, social media

like blogs, social networking sites (e.g. Facebook), and instant messaging platforms (e.g. What-

sApp) make it possible to communicate with anyone and at any time. Moreover, they are a

place where people engage in social interaction, offering the possibility to establish new

PLOS ONE | https://doi.org/10.1371/journal.pone.0203794 October 8, 2018 1 / 22


OPEN ACCESS

Citation: Van Hee C, Jacobs G, Emmery C, Desmet

B, Lefever E, Verhoeven B, et al. (2018) Automatic

detection of cyberbullying in social media text.

PLoS ONE 13(10): e0203794. https://doi.org/

10.1371/journal.pone.0203794

Editor: Hussein Suleman, University of Cape Town,

SOUTH AFRICA

Received: February 6, 2017

Accepted: August 28, 2018

Published: October 8, 2018

Creative Commons Attribution License, which

permits unrestricted use, distribution, and

reproduction in any medium, provided the original

author and source are credited.

Data Availability Statement: Because the actual

posts in our corpus could contain names or other

identifying information, we cannot share them

publicly in a repository. They can, however be

obtained upon request, for academic purposes

solely and via [email protected] or cynthia.

[email protected]. The replication data are

available through the Open Science Framework

repository https://osf.io/rgqw8/ with DOI 10.17605/

OSF.IO/RGQW8. This replication dataset allows

interested researchers to download 1) the feature

vectors of the corpus underlying the experiments

described in this paper, 2) the indices


relationships and maintain existing friendships [3, 4]. On the negative side however, social

media increase the risk of children being confronted with threatening situations including

grooming or sexually transgressive behaviour, signals of depression and suicidal thoughts, and

cyberbullying. Users are reachable 24/7 and are often able to remain anonymous if desired:

this makes social media a convenient way for bullies to target their victims outside the school

yard.

With regard to cyberbullying, a number of national and international initiatives have been

launched over the past few years to increase children’s online safety. Examples include KiVa (http://www.kivaprogram.net/), a Finnish cyberbullying prevention programme, the ‘Non au harcèlement’ campaign in France, and Belgian governmental initiatives and helplines (e.g. clicksafe.be, veiligonline.be, mediawijs.be) that provide information about online safety.

In spite of these efforts, a lot of undesirable and hurtful content remains online. [2] analysed

a body of quantitative research on cyberbullying and observed cybervictimisation rates among

teenagers between 20% and 40%. [5] focused on 12 to 17 year olds living in the United States

and found that no less than 72% of them had encountered cyberbullying at least once within

the year preceding the questionnaire. [6] surveyed 9 to 26 year olds in the United States, Can-

ada, the United Kingdom and Australia, and found that 29% of the respondents had ever been

victimised online. A study among 2,000 Flemish secondary school students (age 12 to 18)

revealed that 11% of them had been bullied online at least once in the six months preceding

the survey [7]. Finally, the 2014 large-scale EU Kids Online Report [8] published that 20% of

11 to 16 year olds had been exposed to hate messages online. In addition, youngsters were 12%

more likely to be exposed to cyberbullying as compared to 2010, which clearly demonstrates

that cyberbullying is a growing problem.

The prevalence of cybervictimisation depends on the conceptualisation used in describing

cyberbullying, but also on research variables such as location and the number and age span of

the participants. Nevertheless, the above studies demonstrate that online platforms are increas-

ingly used for bullying, which is a cause for concern given its impact. As shown by [9–11],

cyberbullying may negatively impact the victim’s self-esteem, academic achievement and emo-

tional well-being. [12] found that self-reported effects of cyberbullying include negative effects

on school grades and feelings of sadness, anger, fear, and depression. In extreme cases, cyber-

bullying could even lead to self-harm and suicidal thoughts.

These findings demonstrate that cyberbullying is a serious problem the consequences of

which can be dramatic. Early detection of cyberbullying attempts is therefore of key impor-

tance to youngsters’ mental well-being. Successful detection depends on effective monitoring

of online content, but the amount of information on the Web makes it practically unfeasible

for moderators to monitor all user-generated content manually. To tackle this problem, intelli-

gent systems are required that process this information in a fast way and automatically signal

potential threats. This way, moderators can respond quickly and prevent threatening situations

from escalating. According to recent research, teenagers are generally in favour of such auto-

matic monitoring, provided that effective follow-up strategies are formulated, and that privacy

and autonomy are guaranteed [13].

Parental control tools (e.g. NetNanny, https://www.netnanny.com/) already block unsuited or undesirable content and some social networks make use of keyword-based moderation

tools (i.e. using lists of profane and insulting words to flag harmful content). However, such

approaches typically fail to detect implicit and subtle forms of cyberbullying in which no

explicit vocabulary is used. This creates the need for intelligent and self-learning systems that

go beyond keyword spotting and hence improve the recall of cyberbullying detection.
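To illustrate why keyword spotting limits recall, the following is a minimal sketch of a lexicon-based flagger. It is not taken from any of the tools named above: the three-word lexicon, the function name `flag_by_keywords` and the example posts are all hypothetical.

```python
import re

# Hypothetical mini-lexicon; real moderation word lists are far larger.
PROFANITY_LEXICON = {"idiot", "loser", "ugly"}

def flag_by_keywords(post, lexicon=PROFANITY_LEXICON):
    """Return True if any lexicon word occurs as a token in the post."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return any(token in lexicon for token in tokens)

posts = [
    "you are such a loser",             # explicit insult: contains a lexicon word
    "nobody would notice if you left",  # implicit: hurtful, but no lexicon word
    "great game last night!",           # harmless
]
flags = [flag_by_keywords(p) for p in posts]
print(flags)
```

The second post is missed because it contains no listed word, which is the kind of implicit message a keyword list cannot retrieve.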

The ultimate goal of this type of research is to develop models that could improve manual

monitoring for cyberbullying on social networks. We explore the automatic detection of


corresponding to instances that were kept

separately to test the experimental design (referred

to as the "hold-out test set" in the paper), 3) a

feature mapping dictionary that allows to trace all

indices in the feature vector files back to the

corresponding feature types (e.g. the feature

indices 0 to 14,230 represent word 3-gram

features). We also share the seed terms that were

used to construct the corpora for our topic model

features. Lastly, we provide an Excel spreadsheet

presenting a results overview of all the tested

systems. All of this information is made available

for both the Dutch and English experiments.

Funding: The work presented in this paper was

carried out in the framework of the AMiCA IWT

SBO-project 120007 project to WD and VH, funded

by the government Flanders Innovation &

Entrepreneurship (VLAIO) agency; http://www.

vlaio.be. The funders had no role in study design,

data collection and analysis, decision to publish, or

preparation of the manuscript.

Competing interests: The authors have declared

that no competing interests exist.


textual signals of cyberbullying, in which cyberbullying is approached as a complex phenomenon that can be realised in various ways (see the Annotation guidelines section for a detailed

overview). While the vast majority of the related research focuses on detecting cyberbullying

‘attacks’ (i.e. verbal aggression), the present study takes different types of cyberbullying into

account, including more implicit posts from the bully, but also posts written by victims and

bystanders. This is a more inclusive conceptualisation for the task of cyberbullying detection

and should aid in moderation and prevention efforts by capturing different and more implicit

signals of bullying.

To tackle this problem, we propose a machine learning method based on a linear SVM

classifier [14, 15] exploiting a rich feature set. The contribution we make is twofold: first, we

develop a complex classifier to detect signals of cyberbullying, which allows us to detect different types of cyberbullying that are related to different social roles involved in a cyberbullying

event. Second, we demonstrate that the methodology is easily portable to other languages, pro-

vided there is annotated data available, by performing experiments on an English and Dutch

dataset.

The remainder of this paper is structured as follows: the next section presents a definition

of cyberbullying and its participant roles and provides an overview of the state of the art in

cyberbullying detection. The Data collection and annotation section describes the corpus construction and annotation. Next, we present the experimental setup and discuss our experimental results for English and Dutch. Finally, the Conclusion and future research section concludes this paper and provides some perspectives for further research.

Related research

Both offline and online bullying are widely covered in the realm of social sciences and psychol-

ogy, and the increasing number of cyberbullying cases in recent years [16] has stimulated

research efforts to detect cyberbullying automatically. In the following section, we present a

definition of cyberbullying and identify its participant roles and we provide a brief overview of

automatic approaches to cyberbullying detection.

Cyberbullying definition and participant roles

A common starting point for conceptualising cyberbullying are definitions of traditional (i.e.

offline) bullying, one of the most influential ones being formulated by [17]. The researcher described bullying based on three main criteria, including i) intention (i.e. a bully intends to

inflict harm on the victim), ii) repetition (i.e. bullying acts take place repeatedly over time)

and iii) a power imbalance between the bully and the victim (i.e. a more powerful bully attacks

a less powerful victim). With respect to cyberbullying, a number of definitions are based on

the above criteria. A popular definition is that of [18, p. 376], which describes cyberbullying as

“an aggressive, intentional act carried out by a group or individual, using electronic forms of

contact, repeatedly and over time, against a victim who cannot easily defend him or herself”.

However, opinion on the applicability of the above characteristics to cyberbullying is very

much divided [19], and besides theoretical objections, a number of practical limitations have

been observed. Firstly, while [17] claims intention to be inherent to traditional bullying, this is

much harder to ascertain in an online environment. Online conversations lack the signals of a

face-to-face interaction like intonation, facial expressions and gestures, which makes them

more ambiguous than real-life conversations. The receiver may therefore get the wrong

impression that they are being offended or ridiculed [20]. Another criterion for bullying that

might not hold in online situations is the power imbalance between the bully and the victim.

This can be evident in real life (e.g. the bully is taller, stronger or older than the victim), but it


is hard to conceptualise or measure online, where power may be related to technological skills,

anonymity or the inability of the victim to escape from the bullying [19, 21]. Also empowering

for the bully are inherent characteristics of the Web: once defamatory or confidential informa-

tion is made public through the Internet, it is hard to remove.

Finally, while arguing that repetition distinguishes bullying from single acts of aggression,

[17] himself states that such a single aggressive action can be considered bullying under certain

circumstances. Accordingly, [21] claim that repetition in cyberbullying is problematic to oper-

ationalise, as it is unclear what the consequences are of a single derogatory message on a public

page. A single act of aggression or humiliation may cause continued distress and humiliation

for the victim if it is shared or liked by a large audience [21]. [22, p. 26] compare this with the

“snowball effect”: one post may be repeated or distributed by other people so that it becomes

out of the control of the initial bully and has larger effects than was originally intended.

Given these arguments, a number of less ‘strict’ definitions of cyberbullying were proposed

by among others [2, 5, 6], where a power imbalance and repetition are not deemed necessary

conditions for cyberbullying.

The above paragraphs demonstrate that defining cyberbullying is far from trivial, and vary-

ing prevalence rates (see the Introduction section) confirm that a univocal definition of the

phenomenon is still lacking in the literature [2]. Based on existing conceptualisations, we

define cyberbullying as content that is published online by an individual and that is aggressive or hurtful against a victim. Based on this definition, an annotation scheme was developed [23] to signal textual characteristics of cyberbullying, including posts from bullies, as well as reactions

from victims and bystanders.

Cyberbullying research also involves the identification of its participant roles. [24] were

among the first to define the roles in a bullying situation. Based on surveys among teenagers

involved in real-life bullying situations, they defined six participant roles: victims (i.e. who are

the target of repeated harassment), bullies (i.e. who are the initiative-taking perpetrators),

assistants of the bully (i.e. who encourage the bullying), reinforcers of the bully (i.e. who rein-

force the bullying), defenders (i.e. who comfort the victim, take their side or try to stop the bul-

lying) and outsiders (i.e. who ignore or distance themselves from the situation). In sum, in

addition to the bully and victim, the researchers distinguish four bystanders (i.e. assistants,

reinforcers, defenders and outsiders). [25], however, do not distinguish between reinforcers

and assistants of the bullying. Their typology includes victims, bullies and three types of

bystanders: i) bystanders who participate in the bullying, ii) bystanders who help or support

the victim and iii) bystanders who ignore the bullying. The cyberbullying roles that are identi-

fied in our annotation scheme are based on existing bullying role typologies, given that tradi-

tional bullying roles are applicable to cyberbullying as well [26, 27]. More details about the

different roles that we take into account are provided in the Data collection and annotation

section.

Bystanders and, to a lesser extent, victims are often overlooked in the related research. As a

result, these studies can be better characterised as verbal aggression detection concerned with

retrieving bully attacks. By taking bystanders into account, we capture different and more sub-

tle signals of a bullying episode. Note that while in this work we did not include classification

of the participant roles as such, they are essential to the conceptualisation of the current detec-

tion task.

Detecting and preventing cyberbullying

As mentioned earlier, although research on cyberbullying detection is more limited than social

studies on the phenomenon, some important advances have been made in recent years. In


what follows, we present a brief overview of the most important natural language processing

approaches to cyberbullying detection, but we refer to the survey paper by [28] for a more

detailed overview.

Although some studies have investigated the effectiveness of rule-based modelling [29], the

dominant approach to cyberbullying detection involves machine learning. Most machine

learning approaches are based on supervised [30–32] or semi-supervised learning [33]. The

former involves the construction of a classifier based on labelled training data, whereas semi-

supervised approaches rely on classifiers that are built from a training corpus containing a

small set of labelled and a large set of unlabelled instances. Semi-supervised methods are often

used to handle data sparsity, a typical issue in cyberbullying research. As cyberbullying detec-

tion essentially involves the distinction between bullying and non-bullying posts, the problem

is generally approached as a binary classification task where the positive class is represented

by instances containing (textual) cyberbullying, while the negative class is devoid of bullying

signals.
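As an illustration of the supervised binary setup described above, the sketch below trains a linear SVM on bag-of-words features with a Pegasos-style subgradient method. It is a toy under stated assumptions, not the implementation of any cited study: the four training posts are invented, examples are visited in a fixed cyclic order rather than sampled at random, there is no bias term, and the names `featurize`, `train_linear_svm` and `predict` are hypothetical.

```python
from collections import defaultdict

def featurize(text):
    """Binary bag-of-words features: {token: 1.0}."""
    return {tok: 1.0 for tok in text.lower().split()}

def train_linear_svm(texts, labels, lam=0.01, epochs=20):
    """Pegasos-style subgradient descent on the L2-regularised hinge loss.

    Labels are +1 (bullying) or -1 (not bullying).
    """
    w = defaultdict(float)
    t = 0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            t += 1
            eta = 1.0 / (lam * t)            # decreasing step size
            x = featurize(text)
            margin = y * sum(w[f] * v for f, v in x.items())
            for f in list(w):                # shrink all weights (regularisation)
                w[f] *= (1.0 - eta * lam)
            if margin < 1.0:                 # hinge loss active: move towards y * x
                for f, v in x.items():
                    w[f] += eta * y * v
    return w

def predict(w, text):
    score = sum(w.get(f, 0.0) * v for f, v in featurize(text).items())
    return 1 if score > 0 else -1

# Invented toy data: +1 = positive (bullying) class, -1 = negative class.
texts = ["you are ugly", "nobody likes you", "nice picture", "happy birthday"]
labels = [1, 1, -1, -1]
w = train_linear_svm(texts, labels)
print([predict(w, t) for t in texts])
```

In practice an off-the-shelf solver and a much larger labelled corpus would be used; the point here is only that the decision is a weighted sum over features, thresholded at zero.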

A key challenge in cyberbullying research is the availability of suitable data, which is neces-

sary to develop models that characterise cyberbullying. In recent years, only a few datasets

have become publicly available for this particular task, such as the training sets provided in

the context of the CAW 2.0 workshop (http://caw2.barcelonamedia.org), a MySpace (https://myspace.com) [34] and Formspring (http://www.formspring.me) cyberbullying corpus annotated with the help of Mechanical Turk [29], and more recently, the Twitter Bullying Traces

dataset [35]. Many studies have therefore constructed their own corpus from social media

websites that are prone to bullying content, such as YouTube [30, 32], Twitter [36, 37], Insta-

gram [38], MySpace [31, 34], FormSpring [29, 39], Kaggle [40] and ASKfm [41]. Despite the

bottleneck of data availability, cyberbullying detection approaches have been successfully

implemented over the past years and the relevance of automatic text analysis techniques to

ensure child safety online has been recognised [42].

Among the first studies on cyberbullying detection are [29–31], who explored the predictive power of n-grams (with and without tf-idf weighting), part-of-speech information (e.g. first- and second-person pronouns), and sentiment information based on (polarity and profanity)

lexicons for this task. Similar features were not only exploited for coarse-grained cyberbully-

ing detection, but also for the detection of more fine-grained cyberbullying categories [41].

Despite their apparent simplicity, content-based features (i.e. lexical, syntactic and sentiment

information) are very often exploited in recent approaches to cyberbullying detection [33,

43]. In fact, as observed by [28], more than 41 papers have approached cyberbullying detec-

tion using content-based features, which confirms that this type of information is crucial for

the task.
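The n-gram features with tf-idf weighting mentioned above can be sketched as follows. This is an illustrative example only: the weighting variant (raw term count times log(N / df)) is one of several in common use and is an assumption here, as are the function names and the three example posts.

```python
import math
from collections import Counter

def word_ngrams(text, n):
    """Return the word n-grams of a whitespace-tokenised, lowercased text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs, n=1):
    """Return one {ngram: weight} dict per document.

    tf is the raw count in the document; idf is log(N / df), where df is
    the number of documents that contain the n-gram.
    """
    grams = [Counter(word_ngrams(d, n)) for d in docs]
    df = Counter(g for counts in grams for g in counts)
    n_docs = len(docs)
    return [
        {g: tf * math.log(n_docs / df[g]) for g, tf in counts.items()}
        for counts in grams
    ]

docs = ["you are ugly", "you are nice", "nice picture"]
weights = tfidf(docs, n=1)
print(word_ngrams("you are ugly", 2))
print(weights[0])
```

In this toy corpus "ugly" occurs in one document and "you" in two, so "ugly" receives the higher weight in the first post.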

More and more, however, content-based features are combined with semantic features

derived from topic model information [44], word embeddings and representation learning

[43, 45]. More recent studies have also demonstrated the added value of user-based informa-

tion for the task, more specifically by including users’ activities (i.e. the number of posts) on a

social network, their age, gender, location, number of friends and followers, and so on [32, 33,

46, 47]. A final feature type that gains increasing popularity in cyberbullying detection are net-

work-based features, whose application is motivated by the frequent use of social media data

for the task. By using network information, researchers aim to capture social relations between

participants in a conversation (e.g. bully versus victim), and other relevant information such as

the popularity of a person (i.e. which can indicate the power of a potential bully) on a social

network, the number of (historical) interactions between two people, and so on. [48] for

instance used network-based features to take the behavioural history of a potential bully into

account. [49] detected cyberbullying in tweets and included network features inspired by


Olweus’ [17] bullying conditions (see supra). More specifically, they measured the power

imbalance between a bully and victim, as well as the bully’s popularity based on interaction

graphs and the bully’s position in the network.

As mentioned earlier, social media are a commonly used genre for this type of task. More

recently, researchers have investigated cyberbullying detection in multi-modal data offered by

specific platforms. For instance [38] explored cyberbullying detection using multi-modal data

extracted from the social network Instagram. More precisely, they combined textual features

derived from the posts themselves with user metadata and image features and showed that

integrating the latter enhanced the classification performance. [37] also detected cyberbullying

in different data genres, including ASKfm, Twitter, and Instagram. They took role information

into account by integrating bully and victim scores as features, based on the occurrence of

bully-related keywords in their sent or received posts.

With respect to the datasets used in cyberbullying research, it can be observed that corpora

are often composed by keyword search (e.g. [43, 44]), which produces a biased dataset of posi-

tive (i.e. bullying) instances. To balance these corpora, negative data are often added from a

background corpus or data resampling [50] techniques are adopted [33, 47]. For this research,

data were randomly crawled across ASKfm and no keyword search was used to collect bullying

data. Instead, all instances were manually annotated for the presence of bullying. As a result,

our corpus contains a realistic distribution of bullying instances.
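For readers unfamiliar with resampling, the sketch below shows random undersampling of the negative class, one simple way of balancing a corpus. It is not necessarily the technique used in the cited studies, and it is not applied to the corpus described here; the function name `undersample`, the 5%/95% toy distribution and the fixed seed are assumptions for illustration.

```python
import random

def undersample(instances, labels, ratio=1.0, seed=0):
    """Randomly drop negatives until at most ratio * n_positive remain.

    Labels are 1 (bullying) or 0 (not bullying).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), int(ratio * len(pos))))
    keep = sorted(pos + keep_neg)            # preserve original order
    return [instances[i] for i in keep], [labels[i] for i in keep]

# Toy corpus with a skewed class distribution: 5 positives, 95 negatives.
labels = [1] * 5 + [0] * 95
posts = [f"post {i}" for i in range(100)]
bal_posts, bal_labels = undersample(posts, labels)
print(len(bal_labels), sum(bal_labels))
```

A balanced training set of this kind no longer reflects how rare bullying posts are in practice, which is why evaluation on a realistic distribution matters.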

When looking at the performance of automatic cyberbullying detection, we see that scores vary

greatly and do not only depend on the implemented algorithm and parameter settings, but

also on a number of other variables. These include the metrics that are used to evaluate the sys-

tem (i.e. micro- or macro-averaged F1, precision, recall, AUC, etc.), the corpus genre (i.e. Face-

book, Twitter, ASKfm, Instagram) and class distribution (i.e. balanced or unbalanced), the

annotation method (i.e. automatic annotations or manual annotations using crowdsourcing

or by experts) and, perhaps the most important distinguishing factor, the conceptualisation of

cyberbullying that is used. More concretely, while some approaches identify sensitive topics

[30] or insulting language [29], others propose a more comprehensive approach by capturing

different types of cyberbullying [41] or by modelling the bully-victim communications

involved in a cyberbullying incident [37].
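The effect of the chosen metric and the class distribution can be made concrete with a small worked example. The sketch below computes per-class, micro-averaged and macro-averaged F1 for a hypothetical majority-class baseline on an invented corpus with 5% positive instances; the numbers are illustrative and do not come from any of the cited studies.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def per_class_counts(gold, pred, cls):
    tp = sum(g == cls and p == cls for g, p in zip(gold, pred))
    fp = sum(g != cls and p == cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    return tp, fp, fn

def macro_f1(gold, pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1(*per_class_counts(gold, pred, c)) for c in classes) / len(classes)

def micro_f1(gold, pred, classes):
    """F1 over counts pooled across all classes."""
    counts = [per_class_counts(gold, pred, c) for c in classes]
    tp, fp, fn = (sum(col) for col in zip(*counts))
    return f1(tp, fp, fn)

gold = [1] * 5 + [0] * 95   # 5% bullying posts
pred = [0] * 100            # baseline that never predicts bullying

positive_f1 = f1(*per_class_counts(gold, pred, 1))
print(positive_f1, micro_f1(gold, pred, [0, 1]), macro_f1(gold, pred, [0, 1]))
```

The baseline detects no bullying at all (positive-class F1 of 0), yet its micro-averaged F1 is 0.95, which is why scores reported under different metrics are hard to compare.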

The studies discussed in this section demonstrated the variety of approaches that have been

used to tackle cyberbullying detection. However, most of them focused on cyberbullying

‘attacks’, or posts written by a bully. Moreover, it is not entirely clear if different forms of

cyberbullying were taken into account (e.g. sexual intimidation or harassment, or psychologi-

cal threats), in addition to derogatory language or insults. In the present study, cyberbullying

is considered a complex phenomenon comprising different forms of harmful online behaviour, which are described in more detail in our annotation scheme [23]. With the aim of facilitating manual monitoring efforts on social networks, we developed a system that automatically

detects signals of cyberbullying, including attacks from bullies, as well as victim and bystander

reactions, the latter of which are generally overlooked in related research.

Most similar to this research is the work by [44], [43, 45], who investigated bullying traces

posted by different author roles (e.g. bully, victim, bystander, assistant, defender, reporter,

accuser, reinforcer). However, they collected tweets using the keywords bully, bullied and bullying. As a result, their corpus contained many reports or testimonials of cyberbullying (example 1), instead of actual cyberbullying. Moreover, their method implies that cyberbullying signals

that are devoid of such keywords are not included in the training corpus.

1. “Some tweens got violent on the n train, the one boy got off after blows 2 the chest. . . Saw him cryin as he walkd away: (bullying not cool” [44, p. 658]


What clearly distinguishes these works from the present is that their conceptualisation of

cyberbullying is not explained. It is, in other words, not clear which type of posts are consid-

ered bullying and which are not. In the present research, we identify different types of bullying

and all are included in the positive class of our experimental corpus.

For this research, English and Dutch social media data were annotated for fine-grained

forms of cyberbullying, based on the actors involved in a cyberbullying incident. After prelimi-

nary experiments for Dutch [41, 51], we currently present an optimised cyberbullying detec-

tion method for English and Dutch and hereby show that the proposed methodology can

easily be applied to different languages, provided that annotated data are available.

Data collection and annotation

To be able to build representative models for cyberbullying, a suitable dataset is required. This

section describes the construction of two corpora, English and Dutch, containing social media

posts that are manually annotated for cyberbullying according to our fine-grained annotation

scheme. This allows us to cover different forms and participants (or roles) involved in a cyberbullying event.

Data collection

Two corpora were constructed by collecting data from the social networking site ASKfm,

where users can create profiles and ask or answer questions, with the option of doing so anon-

ymously. ASKfm data typically consists of question-answer pairs published on a user’s profile.

The data were retrieved by crawling a number of seed profiles using the GNU Wget software

(http://www.gnu.org/software/wget/) in April and October, 2013. After language filtering

(i.e. non-English or non-Dutch content was removed), the experimental corpora comprised

113,698 and 78,387 posts for English and Dutch, respectively.

Data annotation

Cyberbullying has been a widely covered research topic recently and studies have shed light on

direct and indirect types of cyberbullying, implicit and explicit forms, verbal and non-verbal

cyberbullying, and so on. This is important from a sociolinguistic point of view, but knowing

what cyberbullying involves is also crucial to build models for automatic cyberbullying detec-

tion. In the following paragraphs, we present our data annotation guidelines [23] and focus on

different types and roles related to the phenomenon.

Types of cyberbullying

Cyberbullying research is mainly centered around the conceptualisation, occurrence and pre-

vention of the phenomenon [1, 52, 53]. Sociolinguistic studies have identified different types

of cyberbullying [12, 54, 55] and compared these types with forms of traditional or offline

bullying [20]. Like traditional bullying, direct and indirect forms of cyberbullying have been

identified. Direct cyberbullying refers to actions in which the victim is directly involved (e.g.

sending a virus-infected file, excluding someone from an online group, insulting and threaten-

ing), whereas indirect cyberbullying can take place without awareness of the victim (e.g. outing or publishing confidential information, spreading gossip, creating a hate page on social net-

working sites) [20].

The present annotation scheme describes some specific textual categories related to cyber-

bullying, including threats, insults, defensive statements from a victim, encouragements to the

harasser, etc. (see the Data collection and annotation section for a complete overview). All of


these forms were inspired by social studies on cyberbullying [7, 20] and manual inspection of

cyberbullying examples.

Roles in cyberbullying

Similarly to traditional bullying, cyberbullying involves a number of participants that adopt

well-defined roles. Researchers have identified several roles in (cyber)bullying interactions.

Although traditional studies on bullying have mainly concentrated on bullies and victims [24],

the importance of bystanders in a bullying episode has been acknowledged [56, 57]. Bystanders

can support the victim and mitigate the negative effects caused by the bullying [57], especially

on social networking sites, where they hold higher intentions to help the victim than in real life

conversations [58]. [25] distinguish three main types of bystanders: i) those who participate in the bullying, ii) those who help or support the victim and iii) those who ignore the bullying.

Given that passive bystanders are hard to recognise in online text, only the first two types are

included in our annotation scheme.

Annotation guidelines

To operationalise the task of automatic cyberbullying detection, we developed a detailed annotation scheme for cyberbullying that is firmly grounded in the literature and applied it to

our corpora. The applicability of the scheme was iteratively tested. Our final guidelines for the

fine-grained annotation of cyberbullying are described in a technical report [23]. The objective

of the scheme was to indicate several types of textual cyberbullying and verbal aggression, their

severity, and the author participant roles. The scheme is formulated to be generic and is not

limited to a specific social media platform. All messages were annotated in context (i.e. pre-

sented within their original content or conversation event) when available.

Essentially, the annotation scheme describes two levels of annotation. Firstly, the annotators

were asked to indicate, at the message or post level, whether the text under investigation was

related to cyberbullying. If the message was considered harmful and thus contained indications of cyberbullying, annotators identified the author’s participant role. Based on the literature on role-allocation in cyberbullying episodes [25, 59], four roles are distinguished in the annotation scheme: harasser (or bully), victim, and two types of bystanders.

1. Harasser or bully: person who initiates the bullying.

2. Victim: person who is harassed.

3. Bystander-defender: person who helps the victim and discourages the harasser from con-

tinuing his actions.

4. Bystander-assistant: person who does not initiate, but helps or encourages the harasser.
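The post-level step described above can be sketched as a small data structure. This is a minimal illustration only, not the tooling used to annotate the corpus; the names `Role` and `PostAnnotation` are hypothetical, and the check that a role is recorded only for cyberbullying-related posts is an assumption drawn from the description of the first annotation level.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    """The four author participant roles named in the annotation scheme."""
    HARASSER = "harasser"
    VICTIM = "victim"
    BYSTANDER_DEFENDER = "bystander-defender"
    BYSTANDER_ASSISTANT = "bystander-assistant"


@dataclass(frozen=True)
class PostAnnotation:
    """Post-level annotation: a binary label plus an optional author role."""
    text: str
    is_cyberbullying: bool
    role: Optional[Role] = None

    def __post_init__(self) -> None:
        # Assumption: a role is identified only for posts judged to
        # contain indications of cyberbullying.
        if self.role is not None and not self.is_cyberbullying:
            raise ValueError("a role requires a cyberbullying-related post")


# Hypothetical example post, not taken from the corpus.
post = PostAnnotation("leave her alone", True, Role.BYSTANDER_DEFENDER)
```

Making the role optional keeps the binary decision (cyberbullying-related or not) separate from the role decision, which mirrors the order in which annotators were asked to work.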

Secondly, at the sub-sentence level, the annotators were tasked with the identification of

fine-grained text categories related to cyberbullying. In the literature, different forms of cyber-

bullying are identified [12, 54, 55] and compared with traditional bullying [20]. Based on these

forms, the annotation scheme describes a number of textual categories that are often inherent

to a cyberbullying event, such as threats, insults, defensive statements from a victim, encour-

agements to the harasser, etc. Most of the categories are related to direct forms of cyberbullying

(as defined by [25]), while one, namely defamation, is related to outing [25], an indirect form of cyberbullying. Additionally, a number of subcategories were defined to make the annotation scheme as concrete and distinctive as possible (e.g., discrimination as a subcategory of insult). All cyberbullying-related categories in the scheme are listed below, and an example post for each category is presented in Table 1.

• Threat/blackmail: expressions containing physical or psychological threats or indications of

blackmail.

• Insult: expressions meant to hurt or offend the victim.

    • General insult: general expressions containing abusive, degrading or offensive language that are meant to insult the addressee.

    • Attacking relatives: insulting expressions towards relatives or friends of the victim.

    • Discrimination: expressions of unjust or prejudicial treatment of the victim. Two types of discrimination are distinguished (i.e. sexism and racism). Other forms of discrimination should be categorised as general insults.

• Curse/exclusion: expressions of a wish that some form of adversity or misfortune will befall

the victim and expressions that exclude the victim from a conversation or a social group.

• Defamation: expressions that reveal confidential or defamatory information about the victim

to a large public.

• Sexual Talk: expressions with a sexual meaning or connotation. A distinction is made

between innocent sexual talk and sexual harassment.

• Defense: expressions in support of the victim, expressed by the victim himself or by a bystander.

    • Bystander defense: expressions by which a bystander shows support for the victim or discourages the harasser from continuing his actions.

    • Victim defense: assertive or powerless reactions from the victim.

• Encouragement to the harasser: expressions in support of the harasser.

• Other: expressions that contain any other form of cyberbullying-related behaviour than the

ones described here.

It is important to note that the categories were always indicated in the text, even if the post in which they occurred was not considered harmful. For instance, in the post “hi bitches, in for a movie?”, “bitches” was annotated as an insult while the post itself was not considered cyberbullying.
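The second annotation level can be illustrated with a short sketch in which span-level categories are stored independently of the post-level label. The names `CATEGORY_PARENT`, `SpanAnnotation` and `top_level` are hypothetical, character offsets are an assumed way of locating a span, and the parent of each subcategory is inferred from the category definitions listed above (only discrimination is explicitly named as a subcategory of insult).

```python
from dataclasses import dataclass

# Category -> parent category (None for top-level categories).
CATEGORY_PARENT = {
    "threat/blackmail": None,
    "insult": None,
    "general insult": "insult",
    "attacking relatives": "insult",
    "discrimination": "insult",
    "curse/exclusion": None,
    "defamation": None,
    "sexual talk": None,
    "defense": None,
    "bystander defense": "defense",
    "victim defense": "defense",
    "encouragement to the harasser": None,
    "other": None,
}


@dataclass(frozen=True)
class SpanAnnotation:
    """Sub-sentence annotation located by character offsets."""
    start: int  # inclusive
    end: int    # exclusive
    category: str

    def __post_init__(self) -> None:
        if self.category not in CATEGORY_PARENT:
            raise ValueError(f"unknown category: {self.category}")


def top_level(category: str) -> str:
    """Return the top-level category that a (sub)category belongs to."""
    parent = CATEGORY_PARENT[category]
    return category if parent is None else parent


# The example from the text: the span is annotated as an insult,
# while the post as a whole is not labelled as cyberbullying.
post_text = "hi bitches, in for a movie?"
post_is_cyberbullying = False
start = post_text.index("bitches")
span = SpanAnnotation(start, start + len("bitches"), "insult")
```

Storing the spans separately from the post-level label is what allows a post to carry an insult annotation without being counted as a cyberbullying instance.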

Table 1. Definitions and brat annotation examples of more fine-grained text categories related to cyberbullying.

Annotation