A recent paper, Self-Censorship on Facebook, shows that Facebook wants to know why users might abort a post at the last minute, and was able to collect data from 3.9 million users over 17 days to find out.
New research possibilities
The methodology expands the possibilities of sociological research. The authors argue that self-censorship is a practice for designing the best possible face for a perceived audience. The study found that 71% of users “self-censor” at the last minute.
[P]osts are censored more frequently than comments, with status updates and posts directed at groups censored most frequently of all sharing use cases investigated.
[P]eople with more boundaries to regulate censor more; males censor more posts than females and censor even more posts with mostly male friends than do females, but censor no more comments than females; people who exercise more control over their audience censor more content; and, users with more politically and age diverse friends censor less, in general.
These findings are in line with the claims made by influential American sociologist Erving Goffman about impression management. He argued that individuals try to guide the impressions that others form of them, while others try to infer what they can expect from the individual. In that game, controlling what others don’t know about you is as important as controlling what they do know.
The study can’t provide evidence about whether posts are deliberately aborted for reasons of self-presentation or for other reasons such as distraction, accident, or a device crash.
But it does provide a kind of empirical evidence that neither Goffman’s ethnographic methods nor even excellent survey-based research can match.
New ethical challenges
Facebook is arguably denying users the right to informational self-determination, both by failing to obtain their informed consent to participate in the research and by storing this data longer than needed for immediate technical purposes.
Does this kind of study violate user privacy? The authors claim not.
These analyses were conducted in an anonymous manner, so researchers were not privy to any specific user’s activity … the content of self-censored posts and comments was not sent back to Facebook’s servers.
When I contacted the researchers to clarify what was sent, Facebook communications manager Matt Steinfeld responded that the authors saw only a “binary value”.
“We didn’t track how long the post was … we only looked at instances where at least five characters were entered, to mitigate noise in our data,” he said.
“Nevertheless, self-censored posts of 5 characters or 300 characters were treated the same as part of this research.”
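Based only on Steinfeld’s description, the logging logic can be sketched as a simple client-side check. This is a hypothetical reconstruction, not Facebook’s actual code: the function name and the shape of the check are assumptions, but it reflects the two stated behaviours (a five-character minimum to filter noise, and a single binary value rather than any content or length being recorded):

```python
SELF_CENSOR_THRESHOLD = 5  # minimum characters entered, per the study's noise filter

def self_censor_flag(entered_text: str, was_posted: bool) -> bool:
    """Return the binary value the researchers describe: True if the user
    typed at least SELF_CENSOR_THRESHOLD characters and then abandoned the
    post. Only this flag -- never the text or its length -- would be logged."""
    return (not was_posted) and len(entered_text) >= SELF_CENSOR_THRESHOLD

# A 5-character aborted draft and a 300-character one yield the same flag,
# matching the claim that both "were treated the same as part of this research".
assert self_censor_flag("hello", was_posted=False) == self_censor_flag("x" * 300, was_posted=False)
```

Under this reading, the privacy claim rests on the flag being the only datum that leaves the device; the ethical questions below are about whether even that flag should be collected without active consent.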
Steinfeld also asserted that Facebook’s Data Use Policy, which is available to users before they join Facebook (and remains available), constituted informed consent.
Facebook’s Data Use Policy on “Information we receive and how it is used” includes a section on data such as that collected for the study:
We receive data about you whenever you use or are running Facebook, such as when you look at another person’s timeline, send or receive a message, search for a friend or a Page, click on, view or otherwise interact with things, use a Facebook mobile app, or make purchases through Facebook.
In the “How we use the information we receive” section the last bullet point includes uses such as:
internal operations, including troubleshooting, data analysis, testing, research and service improvement.
This is at best a form of passive informed consent. The MIT Technology Review proposes that consent should be both active and real time in the age of big data.
That still leaves two questions about the status of user choice and rights.
First, is passive informed consent to collect data on technical interactions sufficient when users have actively chosen not to make content socially available?
Second, to what extent does a proposed right to be forgotten apply to data on technical interactions even if no social data is posted?
Even if users do not yet appreciate the value of this data in the same sense as their actual posts, Facebook and other entities certainly do. This is a form of metadata.
Australian and US governments use and share – sometimes apparently overshare – metadata for security purposes. Governments tend, rather hypocritically, to claim that metadata is an invaluable resource for surveillance (a claim that is itself highly contestable) while denying that collecting it violates privacy, on the grounds that it is innocuous.
Facebook’s Steinfeld noted that “none of the revelations about government surveillance have extended to the topics covered in this study.”
However, this does not mean requests for such information have not occurred or could not occur in the future under blanket arguments that it is “not content”, “innocuous”, or most concerningly, collected with “informed consent”.
Oversight and informed consent
This seems to be a reasonable corporate research review process, but the highly intrusive Facebook Beacon still made its way through, so corporate interests clearly can win out over ethical standards. Users have also often complained that Facebook makes changes first and asks forgiveness later.
As VentureBeat reports, the US Council for Big Data, Ethics, and Society will convene for the first time in 2014. Its influence on the big data ethics scene should be positive, but its task is hugely complex.
Big data research requires a total overhaul of the concept of informed consent as well as stronger principles aligned to new rights for the digital age.