Is it possible to keep the Internet from realizing that you’re pregnant? That’s the question Princeton sociology professor Janet Vertesi set out to answer in 2013 when she discovered that she was expecting. Her nine-month experiment required her to think like a criminal about how to leave no trace of her bundle of joy in any of her online activity. She had to call family and friends and tell them not to talk about the pregnancy on Facebook. She and her husband bought baby products — like prenatal vitamins — in person, with cash. When she did buy things online, she used Tor to mask her IP address and conceal her identity while browsing, bought items with gift cards from Rite Aid, and had them shipped to an Amazon locker so her home address wouldn’t be associated with the orders.
Vertesi wanted to demonstrate the extreme lengths to which a consumer has to go to keep companies from finding out something very valuable about them: that they’re about to spend lots of money on a new human being. “Big data is getting creepy and it’s invading people’s lives more and more. There’s been this tremendous rise of an invisible layer of the Internet that involves bots, beacons and cookies that build a profile of who we are online,” she said at the time. “People have reasons for privacy that are not terrible ones. They just don’t want everything about them captured by a company and kept.”
It seemed to work. Vertesi never saw a diaper ad. And she is still keeping up the experiment in a way, trying to keep traces of her one-year-old off the Internet. “Family and friends keep hoping I’ll put images of my child on Facebook,” she told me when I called her for episode 5 of the web-doc series Do Not Track, above, which deals with Big Data and algorithms.
In her regular life as a scholar, Vertesi works on how humans and robots interact, specifically how astronauts work with robotic spacecraft teams at NASA — think TARS in Interstellar. So her research has taken her into the realm of how Big Data will be harnessed to help us visit Mars or Saturn. Very sci-fi stuff. But it’s what made her start thinking about a world in which algorithms are increasingly empowered to gather information about us and make decisions about, and for, us.
What disturbs her now is how these powerful tools are being harnessed not to help us survive in outer space, but rather to judge us, put us on watch lists, and sell us stuff. Online and offline data are gathered by powerful third parties – such as Google, Facebook, Apple and Amazon – fueling a $230 billion industry. Data brokers and data-mining companies can cross-correlate a user’s shopping purchases with visited websites, search histories, social media chatter and geo-locations to predict consumers’ potential preferences and behaviors. But these systems are not omniscient. They can make mistakes about us, and because they are so passive, automated, and inscrutable, there’s no way for us to correct them. Ask the women who, after losing a pregnancy to miscarriage, continue to see baby ads mindlessly targeting them and reminding them of a painful event in their lives.
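To make the cross-correlation concrete, here is a deliberately crude sketch of the kind of merging and inference a data broker might perform. All of the data, the baby-related keyword list, and the scoring rule are invented for illustration; real brokers use far more signals and far more sophisticated models.

```python
# Toy sketch: merge per-user signals from separate trackers into one
# profile, then make a crude inference from the combined signals.
# Data, keywords, and the inference rule are all invented.

def build_profile(purchases, searches, locations):
    """Combine independently collected signals into a single profile."""
    profile = {
        "purchases": purchases,
        "searches": searches,
        "locations": locations,
    }
    # Crude inference: flag "likely expecting" if any purchase or search
    # overlaps a hand-picked list of baby-related terms.
    baby_terms = {"prenatal vitamins", "crib", "diapers", "baby names"}
    signals = set(purchases) | set(searches)
    profile["likely_expecting"] = bool(signals & baby_terms)
    return profile

profile = build_profile(
    purchases=["prenatal vitamins", "coffee"],
    searches=["baby names", "weather"],
    locations=["pharmacy", "office"],
)
print(profile["likely_expecting"])  # a single overlapping term triggers the flag
```

Even this toy version shows why such systems misfire: the rule has no idea *why* a term was searched, which is exactly the point the article makes next.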
Algorithms discriminate too
These still very primitive algorithms are used to categorize and classify people in a wide array of situations and are empowered to make increasingly important decisions: deciding who gets a loan or housing, and predicting who will be a more loyal employee or in which neighborhood a future crime is most likely to occur.
Scholars and activists are sounding the alarm about the potentially damaging consequences seemingly objective algorithms can have. Kate Crawford, principal investigator at Microsoft Research, has written and spoken at length about the discrimination inherent in supposedly “objective” algorithms. For example, the city of Boston released a smartphone app that its residents could download that would use the phone’s accelerometer and GPS data to pick up when a person drove over a pothole and instantly report it to the city. “People in lower income groups in the US are less likely to have smartphones, and this is particularly true of older residents, where smartphone penetration can be as low as 16%,” Crawford wrote. So the app would result in the parts of the city through which better-off residents move getting their potholes fixed.
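The mechanics of the app Crawford describes can be sketched in a few lines: the phone's accelerometer registers a vertical jolt, and the app pairs that jolt with the current GPS fix and reports it. The threshold value, the data format, and the function names below are assumptions for illustration, not the actual implementation of Boston's app.

```python
# Minimal sketch of accelerometer-based pothole reporting.
# A reading is (vertical_accel_g, (lat, lon)); the threshold is an
# assumed value, not one taken from Boston's real app.

JOLT_THRESHOLD = 3.0  # assumed spike in g that suggests hitting a pothole

def detect_potholes(samples):
    """Return the GPS fixes where the vertical jolt exceeded the threshold."""
    return [gps for accel, gps in samples if abs(accel) > JOLT_THRESHOLD]

readings = [
    (0.9, (42.3601, -71.0589)),  # smooth road
    (4.2, (42.3611, -71.0570)),  # sharp jolt: likely pothole
    (1.1, (42.3620, -71.0555)),
]
print(detect_potholes(readings))  # only the second reading is reported
```

Note what the sketch cannot see: who owns a phone in the first place. The sampling bias Crawford identifies lives entirely outside the code, in which streets get driven over by phones running the app.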
Crawford told us that when the app designers realized the flaw, they addressed it by asking city employees who spend time in a diversity of neighborhoods, such as police officers and garbage collectors, to carry phones with the app. It was an important reminder of the vigilance required toward the ethical problems inherent in Big Data "solutions."
When the Internet gets it wrong
For data tracking agencies, what we do online is the only thing that matters. All notions of solidarity, curiosity, friendship and sociability are lost. If you are looking up flu symptoms on Google, you’re probably sick with the flu – you’re not reassuring a friend that she isn’t sick, and you’re not helping someone else who might have the flu. If you’re looking at wedding dresses, you’re probably getting married – not curious about changing marital fashions in history. Wrongfully targeted ads can be annoying or laughable. But discriminatory decisions based on those wrong assumptions raise civil and human rights concerns.
So how do you hide from this? How do you avoid sharing personal information that could be used against you? You could choose not to go on the Internet. But you’d still be tracked, by facial recognition systems or friends tagging you on Facebook. Even if you’re not on Facebook, the social network still has a profile on you. Increasingly, there is no opting out of the system. Not participating means, says Vertesi, treating your online and consumer trail the way a (smart) criminal would, going to extreme lengths to hide it. But that, she admits, is not sustainable for a normal human being.
So that’s what we explore in this episode of Do Not Track. How is Big Data judging you, and what will we do about it?
Sandra Rodriguez is an independent documentary filmmaker and a scholar in new media sociology. She's the director of the fifth episode of the Do Not Track web series, a personalized documentary series about privacy and the web economy. Each episode uses your own data to reveal what the web knows about you. The series is produced by Upian in partnership with the National Film Board of Canada, Arte and Bayerischer Rundfunk.