#MyPrivacy: How Users Think About Social Media Privacy

Kelly Quinn

University of Illinois at Chicago

Department of Communication

Chicago, IL USA

[email protected]

Dmitry Epstein

University of Illinois at Chicago

Department of Communication

Chicago, IL USA

[email protected]

ABSTRACT

This study explores privacy from the perspective of the user,

leveraging a “framing in thought” approach to capture how users

make sense of privacy in their social media use. Definitions of

privacy collected from 608 social media users are analyzed through

topic modelling and the clustering of word pairs to surface themes

present in the data. Results indicate the dominance of frames

related to horizontal privacy, or privacy vis-à-vis peers, over

vertical privacy (i.e., that from institutions and governments).

Themes relating to economic and legal frameworks had a reduced

level of prominence. These findings suggest that user

conceptualization of privacy reflects a cognate-based approach that

emphasizes control and limits to information access.

CCS CONCEPTS

• Security and privacy → Social aspects of security and

privacy • Human-centered computing → Social

media • Human-centered computing → Empirical studies in

collaborative and social computing

KEYWORDS

Privacy; user definitions; social media; framing

ACM Reference format:

Kelly Quinn and Dmitry Epstein. 2018. #MyPrivacy: How Users

Think About Social Media Privacy. In Proceedings of the

International Conference on Social Media & Society, Copenhagen,

Denmark (SMSociety).1 DOI:10.1145/3217804.3217945

1 INTRODUCTION

Privacy has become the de facto currency of the social media world.

People routinely disclose information, which not too long ago was

considered private, in exchange for digital tools and services. At

the same time, the concept itself is surprisingly fluid [1], its

interpretation and enactment are highly contextual [2], and these

characteristics often do not align, resulting in the so-called “privacy

paradox” [3], [4]. While previous research has looked into the

behavioral aspects of privacy [3], [5] or into the strategic

deployment of its various meanings [6], there is limited inquiry into

how social media users themselves perceive and interpret the idea

of privacy. In this paper we use recently collected survey data in an

attempt to unpack how users of social media frame privacy. Our

goal is to help conceptualize social media privacy in a participant-

centric way, thus further enhancing efforts to theorize privacy and

design privacy-sensitive tools and solutions.

1 Permission to make digital or hard copies of part or all of this work for

personal or classroom use is granted without fee provided that copies are

not made or distributed for profit or commercial advantage and that copies

bear this notice and the full citation on the first page. Copyrights for

third-party components of this work must be honored. For all other uses, contact

the Owner/Author. SMSociety '18, July 18–20, 2018, Copenhagen, Denmark

© 2018 Copyright is held by the owner/author(s).

ACM ISBN 978-1-4503-6334-1/18/07. https://doi.org/10.1145/3217804.3217945

2 LITERATURE REVIEW

In our effort to map the frames of references to privacy used by

social media users we draw on two main bodies of literature. First,

we discuss framing literature, which offers an established

conceptual framework for understanding the importance of

definition for both behavioral and structural outcomes. Second, we

discuss relevant privacy literature in order to situate our current

work in earlier efforts to tease out the plurality and context-driven

fluidity of privacy definitions.

2.1 Framing

Framing is a popular analytical framework, used in a variety of

fields (e.g., psychology, political science, sociology,

communication), across levels of analysis (e.g., individual,

institutional), and with different epistemological approaches [7],

[8]. Goffman originally referred to frames as primary

“schemata of interpretation” that allow “its user to locate, perceive,

identify, and label a seemingly infinite number of concrete

occurrences defined in its terms” and within her personal context

[9]. Gamson and Modigliani explained that frames present “a

central organizing idea […] for making sense of relevant events,

suggesting what is at issue” and thus giving meaning to issues and

ideas [10].

Frames operate at both systemic (macro) and individual (micro)

levels. At the macro level, competing social, political, and cultural

actors engage in constructing, modifying, and disseminating frames

both intentionally and unintentionally. In doing so, the actors

operate with frames in communication as they engage in

competitive behavior around frames of reference that align with

their values, goals, and interests [7], [8]. At the micro level, frames

in thought delineate the realm of the possible or desirable by

restricting and prioritizing a set of available considerations or by

amending their subjective value and applicability in the eyes of an

individual [8], [11]. In other words, at the fundamental level,

“frames in communication” are strategically constructed and

deployed, typically by powerful social actors, in an attempt to

influence audiences to adopt particular interpretive “frames in

thought” when making sense of complex issues [12].

The tension between frames in communication and frames in

thought is at the core of framing research. Gamson and Modigliani

have demonstrated how the gap between these two types of frames

affected policy discourse and public understanding around nuclear

power [10]. Entman has proposed a cascading activation model,

where issue frame construction and diffusion are linked through

mass media portrayals [13]. Framing is especially influential in

areas lacking clear definitions, established frames of reference, or

where mechanisms of influence are obscured from actors. Privacy

is a prime example of one such area.

2.2 Privacy

Conceptualization of privacy is an ongoing debate among scholars.

Westin, in one of the early, hierarchical models of privacy, viewed

it as nested spaces of political, socio-cultural, and personal levels

of analysis with each layer representing a different set of social

structures that both constrain and enable privacy behaviors [14].

Nissenbaum has further developed this idea by referring to privacy

as contextual [2]. Solove called for a pluralistic conceptualization

of privacy as “a set of protections against a plurality of distinct,

but related problems” [15], thus making implicit privacy practices

explicit. Such problems include issues of information collection,

processing, dissemination, and invasion. Taken together, these

conceptual frameworks lend themselves to speaking about a

privacy space where discourse can be examined at the political,

socio-cultural, or personal levels and be viewed as context-

dependent [6].

Within the conceptual privacy space, one can identify a number of

more concrete frames of reference to privacy. Smith, Dinev, and

Xu roughly divide those approaches into value-based and cognate-

based conceptualizations [5]. Central to the value-based

conceptualizations is Warren and Brandeis’ framing of privacy as

a “right to be let alone” [16]. Critics claim, however, that this

framing is rooted in the physical notion of privacy, which does not

transfer well into the digital realm, where privacy is treated as a

commodity [3], [17].

The cognate-based approach harbors the control and limited access

paradigms of privacy. The control paradigm offers a broad

conceptual continuum of what privacy may mean. For some,

control over one’s information equates with privacy itself. For

others, it is viewed as a mediating factor in what constitutes “a

dialectic and dynamic boundary regulation process” [18]. The

limited access paradigm treats privacy just as a state of limited

access to a person or her information [14], [19] and can also be

viewed as a continuum, from absolute to minimal [5].

The rapid adoption of social media and the growing use of private

information as de facto currency for digital (and increasingly

physical) services, have pushed the conceptual work around

privacy to new frontiers with an emphasis on the contextual nature of

privacy and the plurality of meaning placed in the idea of privacy

by different actors. Some argue for privacy context collapse, where

multiple audiences of an actor collapse into one [20], [21], which,

in turn, requires a networked model of privacy determined through

“constellation of audience dynamics, social norms, and technical

functionality that affect the processes of information disclosure,

concealment, obscurity, and interpretation within a networked

public” [22]. Others suggest a distinction between vertical and

horizontal privacy. The former relates to privacy vis-à-vis

institutions (e.g., government, social media platform), while the

latter refers to privacy vis-à-vis peers (e.g., co-workers, social

media connections) [23]. These approaches imply a set of frames

in thought about privacy. Our goal here is to address this question

directly.

2.3 Framing privacy

The rich conceptual debate about privacy is a fertile ground for

framing research. Somewhat surprisingly, however, studies

explicitly tackling the question of framing of privacy are scarce

[24]. Existing research focuses primarily on frames in

communication of elite actors such as policymakers, news media

[6], [24], or technology designers, who often view emphasizing

privacy as a barrier to adoption [25]. Fornaciari attempted to

examine frames in communication of non-elite actors. Studying

privacy framing on Twitter, she identified eight distinct frames

ranging from privacy and technology being the most frequent to

trading privacy being the least frequent frame [26].

Expanding this line of research, the current project seeks to delve

explicitly into frames in thought of non-elite actors. Hence, we

pursue a rather direct research question: How do users of social

media frame social media privacy? We explore this question with

original survey data and by using topic modeling as our analytical

approach. Given our explicit focus on social media, we expect the

frames in thought to reify or challenge more recent conceptual

developments such as a networked view of privacy or the vertical

versus horizontal distinction.

Table 1: Topic identification using ConText

Topic  Weight  Members
1      0.350   share, information, private, privacy, people, post, personal, thing, life, personal_information
2      0.269   friend, post, people, profile, platform_sponsor, website, family, privacy, view, public
3      0.235   information, people, privacy, platform_sponsor, access, personal_information, important, give, secure, account
4      0.149   privacy, platform_sponsor, setting, post, thing, respect, social_media, private, put, person
5      0.116   privacy, information, platform_sponsor, intrude, make, include, company, advertise, permit, personal
6      0.104   safe, protect, secure, hack, identity, personal_information, information, account, steal, feel
7      0.077   information, share, sell, company, consent, safe, address, contact, ability, person
8      0.041   information, address, photo, email, phone_number, show, secure, personal, give, number
9      0.016   business, stay, follow, follower, time, hide, people, boundary, knowledge, put
10     0.009   mine, photo, good, ability, private, word, opinion, work, privacy, limit

3 METHODOLOGY

3.1 Sample

Participants were recruited using the Qualtrics panel service, and

received a small incentive from the platform sponsor. This study was

approved by the Institutional Review Board at the University of

Illinois at Chicago. Data were collected in a self-administered,

web-based survey tool which included questions on attitudes and

behaviors related to social media and privacy. Included in these was

the question, “With respect to [most frequently used social media

platform], what does privacy mean to you?” Participants were

required to supply a definition consisting of a minimum of 135

characters. The resulting corpus included 608 individually

generated definitions of privacy. The underlying sample was

representative of the US population, based on 2010 US Census

demographics, on characteristics of age, gender, and income. Mean

age was 47.8 years (SD=16.7, range=18-90, Mdn=47.0) and gender

was balanced (53.1% female, 46.2% male, 0.7% not reported).

Racial/ethnic composition included: African-American 8.9%

(n=54); Hispanic/Latino 7.6% (n=46); Asian 4.9% (n=30);

Caucasian 77.0% (n=468); Multi-ethnic/Other/Undisclosed 1.2%

(n=7). Participants in the study were actively engaged with social

media, with 90.8% reporting two or more social media

profiles and 81.1% reporting that they accessed their favored social media

site at least once per day.

3.2 Method

Topic modelling is a text-mining approach that uses statistical

probabilities for discovering topics or themes in a collection of

documents. It is based on the notion that documents are collections

of topics that reflect a thematic structure which can be inferred by

examining the probability distribution of words appearing together

[27]. In each topic, different sets of terms have higher probabilities;

topics can be visualized by listing these terms [28]. Topic modeling

algorithms analyze the words of the unstructured texts to discover

the themes that run through them and how those themes may be

connected to each other [29]. In contrast to traditional content

analysis, topic modelling utilizes computer algorithms to identify

patterns of word co-occurrence, and thus is useful for analyzing

large datasets. In addition, because it can be used without a priori

coding structures, topic modelling offers the benefit of not

requiring identification of themes in advance.

ConText [30] is an automated topic modelling tool for analyzing

texts and networks that can be used to analyze a large volume

of texts. ConText leverages the “MAchine Learning for LanguagE

Toolkit” [31] to perform topic modelling, which is based on the

Latent Dirichlet Allocation (LDA) model [32]. LDA assumes

documents are generated by drawing on fixed topic vocabularies

that are composed of words with high probabilities; it then reverses

this generative process to uncover the latent topics within the texts

using probabilistic modelling [29].
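The generative process that LDA assumes can be illustrated with a minimal sketch. This is not the MALLET or ConText implementation; the two toy topics, their word probabilities, and the concentration parameters are hypothetical values chosen only for illustration, using terms that appear in Table 1. Inference in LDA runs this story in reverse, estimating the topics and proportions from the observed words.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate one document under the LDA generative story.

    topics: list of dicts mapping word -> probability (one dict per topic).
    alpha:  Dirichlet concentration parameters, one per topic.
    """
    # 1. Draw this document's topic proportions.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. Pick a topic for this word position.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        # 3. Pick a word from that topic's word distribution.
        vocab = list(topics[k])
        probs = [topics[k][w] for w in vocab]
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


# Hypothetical toy topics; the probabilities are illustrative only.
TOPICS = [
    {"share": 0.4, "friend": 0.3, "post": 0.3},
    {"company": 0.4, "sell": 0.3, "advertise": 0.3},
]

rng = random.Random(0)
theta, doc = generate_document(TOPICS, alpha=[0.5, 0.5], length=12, rng=rng)
print([round(p, 3) for p in theta], doc)
```
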

3.3 Analysis

Initial cleaning of the data (preprocessing) was carried out in

ConText. This included the removal of stopwords: articles,

prepositions, conjunctions, and auxiliary verbs that do not

contribute to the meaning of the text (e.g., if, and, that, a, an, the,

to, is, was, were). Stemming was also carried out in ConText; it

reduced tenses and other forms of the same word to a unified

morpheme. Finally, a codebook was applied to correct spelling,

consolidate n-gram terms such as “social media,” and generalize

specific categories of words, such as “Amazon” and “Facebook”

into a single term of “platform_sponsor”. Topic modelling and

network generation functions were then performed.
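The preprocessing steps described above can be sketched as follows. This is an illustration under stated assumptions, not ConText's pipeline: the stopword list is limited to the examples given in the text, the codebook entries and the suffix-stripping stemmer are hypothetical stand-ins, and the codebook is applied first here (rather than last) so that multi-word terms survive tokenization.

```python
import re

# Hypothetical, abbreviated resources; the study's full stopword list and
# codebook are not given in the text.
STOPWORDS = {"if", "and", "that", "a", "an", "the", "to", "is", "was", "were"}
CODEBOOK = {
    "social media": "social_media",
    "personal information": "personal_information",
    "facebook": "platform_sponsor",
    "amazon": "platform_sponsor",
}
SUFFIXES = ("ing", "ed", "s")  # crude stand-in for a real stemmer


def naive_stem(token):
    """Strip one common suffix; a placeholder, not ConText's stemmer."""
    if "_" in token:  # leave consolidated codebook terms untouched
        return token
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, apply the codebook, drop stopwords, and stem."""
    text = text.lower()
    for phrase, replacement in CODEBOOK.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", replacement, text)
    tokens = re.findall(r"[a-z_]+", text)
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


result = preprocess("Privacy is keeping my personal information safe on Facebook.")
print(result)
# -> ['privacy', 'keep', 'my', 'personal_information', 'safe', 'on', 'platform_sponsor']
```
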

4 RESULTS

To illustrate the themes found within the privacy definitions, we

specified 10 topics and 10 words per topic. Results are summarized

in Table 1, with the words in each topic presented in order of

occurrence within that topic. Topics are ordered by weight within

the document corpus. As shown in Table 1, Topics 1 through 4

suggest themes that reflect horizontal privacy or boundary control

mechanisms. They include words such as “share” and “people,”

and also the types of information typically generated by social

media use, such as “profiles” and “posts”. These topics have the

highest weighting, which suggests their prominence in the corpus

of definitions.

Topics 5 through 7 reflect concepts related to more vertical forms

of privacy. Included in these topics are words relating to the use of

personal information by others, including those such as “company,”

“sell,” and “advertise,” as well as those related to the security of

information such as “hack,” “steal,” and “secure”. Topic 7

specifically references terms that are consistent with an economic

framing of privacy, though the weight of the topic suggests that it

is not a major theme. Notably absent are terms related to privacy

from a legal perspective, such as “right” or “law”. As indicated by

the overall topic weights, these latter topics hold much less

prominence in the corpus of definitions, indicating a substantially

lower level of attention to conceptions of vertical privacy by the

participants in this study. Topics 8 through 10 held little weight in

the corpus, and were omitted from further analysis.
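As a rough check on the relative prominence described above, the topic weights from Table 1 can be summed by group. Treating the weights as additive is our simplifying assumption for illustration; the weights reported by the tool need not sum to one, and this grouping is not an analysis reported in the study.

```python
# Topic weights copied from Table 1.
WEIGHTS = {1: 0.350, 2: 0.269, 3: 0.235, 4: 0.149, 5: 0.116,
           6: 0.104, 7: 0.077, 8: 0.041, 9: 0.016, 10: 0.009}


def group_weight(first, last):
    """Sum the weights of topics first..last (inclusive)."""
    return sum(WEIGHTS[t] for t in range(first, last + 1))


horizontal = group_weight(1, 4)   # Topics 1-4: horizontal privacy
vertical = group_weight(5, 7)     # Topics 5-7: vertical privacy
omitted = group_weight(8, 10)     # Topics 8-10: omitted from analysis

print(f"horizontal={horizontal:.3f} vertical={vertical:.3f} omitted={omitted:.3f}")
print(f"horizontal/vertical ratio={horizontal / vertical:.2f}")
```

Under this assumption, the horizontal group (1.003) carries roughly three times the combined weight of the vertical group (0.297).
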

To provide additional clarity to the themes in the corpus of

definitions, ConText was used to generate a list of co-occurring

word-pairs from the definitions, using a word-distance window of

seven words. These were used to map an undirected network using

Gephi [33] consisting of 112 nodes and 2,946 edges. Using Gephi’s

native community detection algorithm [34], three communities

were detected and individually examined. To simplify, graphs were

scaled to show only nodes with a degree of 45 or higher.
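The word-pair and degree-filtering steps can be sketched as follows. This is a minimal illustration, not ConText's or Gephi's implementation: we assume that two words co-occur when their positions are at most seven tokens apart, the three toy documents are hypothetical, and community detection is omitted.

```python
from collections import Counter, defaultdict


def cooccurrence_pairs(tokens, window=7):
    """Count unordered word pairs whose positions differ by at most `window`."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


def degrees(pairs):
    """Degree of each node in the undirected network implied by the pairs."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(adj) for node, adj in neighbors.items()}


# Hypothetical preprocessed definitions.
docs = [
    ["share", "post", "friend", "family"],
    ["company", "sell", "personal_information"],
    ["friend", "post", "privacy"],
]
pairs = Counter()
for doc_tokens in docs:
    pairs.update(cooccurrence_pairs(doc_tokens))

deg = degrees(pairs)
min_degree = 3  # toy threshold; the study's graphs used 45
kept = sorted(n for n, d in deg.items() if d >= min_degree)
print(kept)
# -> ['family', 'friend', 'post', 'share']
```
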

Figure 1: Horizontal privacy

The first community (Figure 1) depicts word pairs indicative of horizontal privacy: types of information, such as “posts,” “photos,”

and “profiles,” along with those whom such information might be

shared with, such as “friends,” “family,” and even “strangers”. The

second community (Figure 2) evidences word pairs indicative of vertical privacy awareness: “personal information,” “platform

sponsors,” in addition to terms such as “secure,” “safe,” and

“hack”. The final community (Figure 3) describes the types of information that may be vulnerable to exploitation, as well as an

economic framing of privacy: “phone numbers,” “emails,” “sell,” and

“company”.

Figure 2: Vertical privacy

Of note, there is evidence in all of the clusters of cognate-based

approaches to privacy. Words such as “access,” “control,” “block,”

and “setting” suggest that, with respect to social media, users

conceptualize privacy as a boundary-control process whereby

limits on access to information are prioritized.

Figure 3: Information vulnerable to exploitation

5 DISCUSSION

Our results suggest that users’ conceptualizations of privacy

emphasize dimensions of social privacy (i.e., privacy between users

of social media platforms) over conceptualizations of privacy that

emphasize freedom from oversight. While mostly consistent with

prior work on privacy framing by non-elites [26], these results

demonstrate that users view their own social networks as their

primary audiences, as opposed to platform sponsors or other

institutions. Such prioritization of the social aligns with ideas of

networked privacy, and may indicate that user framing of privacy

is more focused on social aspects than has been

assumed by policy makers.

Vertical privacy, while not lost on users, seems to have lower levels

of relevance across user definitions. This lack of attention to

institutional privacy incursions may further reify existing power

imbalances along the lines of what Braman describes as a

“panspectron” society, a condition where information about an

individual is collected continuously [35]. Obvious ways to address

this potential imbalance are for designers to make data

collection more explicit and for policymakers to emphasize the

importance of privacy education for users. Given that this is

work in progress, we acknowledge the need to further refine the

methodology and to more closely interpret the findings. Limitations

of this project include the limited scope of user-supplied

definitions, and the varying level of detail among corpus

definitions. As data collection for these definitions also included

user reports of privacy activities, future work might examine these

definitions in the context of reported behaviors.

REFERENCES

[1] L. R. BeVier, “Information About Individuals in the Hands of Government: Some Reflections on Mechanisms for Privacy Protection,” William Mary Bill Rights J., vol. 4, no. 2, pp. 455–506, 1995.

[2] H. Nissenbaum, Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford, CA: Stanford University Press, 2010.

[3] A. Acquisti, L. Brandimarte, and G. Loewenstein, “Privacy and human behavior in the age of information,” Science, vol. 347, no. 6221, pp. 509–515, 2015.

[4] A. Acquisti and R. Gross, “Imagined Communities: Awareness, Information Sharing, and Privacy on the Facebook,” in Privacy Enhancing Technologies, 2006, pp. 36–58.

[5] H. J. Smith, T. Dinev, and H. Xu, “Information privacy research: An interdisciplinary review,” MIS Q., vol. 35, no. 4, pp. 989–1016, 2011.

[6] D. Epstein, M. C. Roth, and E. P. S. Baumer, “It’s the Definition, Stupid! Framing of Online Privacy in the Internet Governance Forum Debates,” J. Inf. Policy, vol. 4, p. 144, 2014.

[7] D. Scheufele, “Framing as a theory of media effects,” J. Commun., vol. 49, no. 1, pp. 103–122, 1999.

[8] D. Chong and J. N. Druckman, “Framing Theory,” Annu. Rev. Polit. Sci., vol. 10, no. 1, pp. 103–126, 2007.

[9] E. Goffman, Frame Analysis. Cambridge, MA: Harvard University Press, 1974.

[10] W. A. Gamson and A. Modigliani, “Media discourse and public opinion on nuclear power: A constructionist approach,” Am. J. Sociol., vol. 95, no. 1, pp. 1–37, Jul. 1989.

[11] T. E. Nelson, Z. M. Oxley, and R. A. Clawson, “Toward a psychology of framing effects,” Polit. Behav., vol. 19, no. 3, pp. 221–246, 1997.

[12] D. Epstein, E. C. Nisbet, and T. Gillespie, “Who’s responsible for the digital divide? Public perceptions and policy implications,” Inf. Soc., vol. 27, no. 2, pp. 92–104, 2011.

[13] R. M. Entman, Projections of Power: Framing News, Public Opinion, and US Foreign Policy. Chicago, IL: University of Chicago Press, 2004.

[14] A. F. Westin, Privacy and Freedom. New York: Atheneum, 1967.

[15] D. J. Solove, Understanding Privacy. Cambridge, MA: Harvard University Press, 2008.

[16] S. D. Warren and L. D. Brandeis, “The right to privacy,” Harv. Law Rev., vol. 4, no. 5, pp. 193–220, 1890.

[17] Z. Papacharissi, “Privacy as a luxury commodity,” First Monday, vol. 15, no. 8, 2010.

[18] L. Palen and P. Dourish, “Unpacking ‘privacy’ for a networked world,” Proc. Conf. Hum. factors Comput. Syst. - CHI ’03, no. 5, p. 129, 2003.

[19] G. S. Dhillon and T. T. Moores, “Internet Privacy,” Inf. Resour. Manag. J., vol. 14, no. 4, pp. 33–37, Oct. 2001.

[20] A. E. Marwick and danah boyd, “I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience,” New Media Soc., vol. 13, no. 1, pp. 114–133, Jul. 2011.

[21] J. Vitak, “The Impact of Context Collapse and Privacy on Social Network Site Disclosures,” J. Broadcast. Electron. Media, vol. 56, no. 4, pp. 451–470, Oct. 2012.

[22] A. E. Marwick and danah boyd, “Networked privacy: How teenagers negotiate context in social media,” New Media Soc., vol. 16, no. 7, pp. 1051–1067, 2014.

[23] M. Bartsch and T. Dienlin, “Control your Facebook: An analysis of online privacy literacy,” Comput. Human Behav., vol. 56, pp. 147–154, Mar. 2016.

[24] F. Fornaciari, “Mapping the territories of privacy: Textual analysis of privacy frames in American mainstream news,” Proc. Annu. Hawaii Int. Conf. Syst. Sci., pp. 1823–1832, 2014.

[25] J. A. Obar and A. Oeldorf-Hirsch, “Clickwrap Impact: Quick-Join Options and Ignoring Privacy and Terms of Service Policies of Social Networking Services,” in #SMSociety17 Proceedings of the 8th International Conference on Social Media & Society, 2017, p. Article 50.

[26] F. Fornaciari, “iTweet about #privacy: Mapping privacy frames in Twitter conversation,” in ALLDATA 2017: The Third International Conference on Big Data, Small Data, Linked Data and Open Data, 2017, pp. 70–73.

[27] M. Steyvers and T. Griffiths, “Probabilistic topic models,” in Latent Semantic Analysis: A Road to Meaning, T. Landauer, D. McNamara, S. Dennis, and W. Kintsch, Eds. Hillsdale, NJ: Lawrence Erlbaum, 2007.

[28] D. M. Blei, “Topic modeling and digital humanities,” J. Digit. Humanit., vol. 2, no. 1, 2012.

[29] D. M. Blei, “Probabilistic topic models,” Commun. ACM, vol. 55, no. 4, p. 77, Apr. 2012.

[30] J. Diesner, “ConText: Software for the integrated analysis of text data and network data,” Pap. Present. Soc. Semant. Networks Commun. Res. Preconference Conf. Int. Commun. Assoc., 2014.

[31] A. K. McCallum, “MALLET: A machine learning for language toolkit.” 2002.

[32] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet Allocation,” J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.

[33] M. Bastian, S. Heymann, and M. Jacomy, “Gephi: An Open Source Software for Exploring and Manipulating Networks,” Third Int. AAAI Conf. Weblogs Soc. Media, pp. 361–362, 2009.

[34] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” J. Stat. Mech. Theory Exp., vol. 2008, no. 10, p. P10008, Oct. 2008.

[35] S. Braman, Change of State: Information, Policy, and Power. Cambridge, MA: MIT Press.
