News moves fast on the Internet, and sometimes it's wrong. When the Boston Marathon bombers were at large and unidentified, a group of Redditors took on the role of amateur detectives and tried to zero in on a suspect. They incorrectly named one man, who had been missing for a month, as a possible perpetrator, resulting in vitriol targeted at him and his family. In fact, he was not only innocent, but had died before the attack happened. The Week reported at the time that four people were misidentified as bombing suspects by the media, in large part because of the rapid-fire spread of rumors on the Internet.
Misinformation during a time of crisis can be a security risk. So it comes as little surprise that the Department of Defense (DoD) is funding research into techniques that might minimize the spread of these rumors, by using computers to fact-check lies on the web.
Earlier this month, researchers published a study on an algorithmic fact-checker funded in part by the DoD. They developed a tool that uses our greatest fount of crowdsourced wisdom: Wikipedia. “There is growing concern about the spread and danger of misinformation, from hoaxes to social bots and fake news sites,” report co-author Giovanni Luca Ciampaglia, of the Center for Complex Networks and Systems Research at the Indiana University School of Informatics and Computing, told Fusion by email.
Ciampaglia is not the only researcher who has been at this for the last few years. The Washington Post introduced its political fact-checker app, TruthTeller, back in 2013. The news outlet explained at the time:
“TruthTeller gives you the reporting you need at the moment you need it. If anything the politician says has been fact-checked before, you’ll know immediately. Think of it as a tool to help you sort out when politicians say false or misleading things, and to learn the facts about the situation.”
And others have also tried to figure out ways for computers to work alongside humans to separate fact from fiction. Ciampaglia told Fusion that he and his colleagues were inspired to build their own tool after reading about efforts already underway, and realizing how little work had been done in the field.
Their idea, explained Ciampaglia, was to treat Wikipedia as a huge, existing network of knowledge, rather than a collection of discrete articles. The degrees of separation between two concepts, and the specificity of each connection along the path, help the algorithm judge whether a given statement is true.
Overall, the researchers put together a knowledge graph with 3 million Wikipedia concepts, and 23 million links between them. In a press release, Indiana University explained the algorithm’s success:
“Significantly, the IU team found their computational method could even assess the truthfulness of statements about information not directly contained in the infoboxes. For example, the fact that Steve Tesich — the Serbian-American screenwriter of the classic Hoosier film "Breaking Away" — graduated from IU, despite the information not being specifically addressed in the infobox about him.”
In their paper, the authors made clear how pleased they were with the outcome:
“These results are both encouraging and exciting: a simple shortest path computation maximizing information content can leverage an existing body of collective human knowledge to assess the truth of new statements. In other words, the important and complex human task of fact checking can be effectively reduced to a simple network analysis problem, which is easy to solve computationally.”
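The shortest-path idea described above can be sketched in a few dozen lines. The toy graph, the node names, the degree-based cost, and the scoring formula below are all illustrative assumptions, not the study's actual data or method; the sketch only conveys the intuition that a claim linking two concepts is more plausible when a short, specific path connects them, while paths through generic, high-degree hub concepts carry little information.

```python
import heapq

# Toy knowledge graph: nodes are concepts, edges are Wikipedia-style links.
# The structure and names here are hypothetical, chosen to echo the
# "Breaking Away" example from the press release.
graph = {
    "Steve Tesich": {"Breaking Away", "Indiana University"},
    "Breaking Away": {"Steve Tesich", "Indiana University", "Film"},
    "Indiana University": {"Steve Tesich", "Breaking Away", "Bloomington"},
    "Bloomington": {"Indiana University"},
    "Film": {"Breaking Away", "Casablanca"},
    "Casablanca": {"Film"},
}

def truth_score(graph, source, target):
    """Score a claimed link between two concepts (1.0 = strongest).

    Each intermediate hop costs the degree of the node it passes
    through, so routes via generic hubs are penalized. The score is
    1 for a direct link and decays toward 0 as the cheapest path
    grows longer and more generic. This cost function is an
    assumption for illustration, not the paper's exact formula.
    """
    if source not in graph or target not in graph:
        return 0.0
    # Dijkstra's algorithm over degree-weighted hop costs.
    pq = [(0.0, source)]
    best = {source: 0.0}
    while pq:
        cost, node = heapq.heappop(pq)
        if node == target:
            return 1.0 / (1.0 + cost)
        if cost > best.get(node, float("inf")):
            continue
        for nxt in graph[node]:
            # Reaching the target itself is free; intermediate nodes
            # cost their degree (generic concepts are expensive).
            step = 0.0 if nxt == target else len(graph[nxt])
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(pq, (new_cost, nxt))
    return 0.0

# A directly linked claim scores 1.0; a distant one scores much lower.
print(truth_score(graph, "Steve Tesich", "Indiana University"))
print(truth_score(graph, "Steve Tesich", "Casablanca"))
```

In this sketch, "Steve Tesich graduated from Indiana University" scores highly because the two nodes are directly linked, while a spurious claim tying him to "Casablanca" must route through the generic "Film" hub and scores far lower.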
Of course, that doesn’t mean there aren’t limitations to the program. When asked whether the algorithm could tell the difference between a statement that is technically true but misleading (think of how many everyday symptoms WebMD says could signify cancer) and one that is technically unproven but effectively true (like gravity, which is formally a theory), Ciampaglia said no. “Right now, the algorithm is very simple and cannot deal with those kinds of nuances.”
And the algorithm relies on the knowledge graph to reach its conclusions: It's not going to learn to tell the difference between fact and fiction on its own. Because of this, it's not likely that robot fact-checkers will replace human fact-checkers anytime soon. “We imagine our methods more as [a] tool for aiding journalists — like the grammar and spelling checkers some years ago,” Ciampaglia said, adding, “people will be still in control.”
Plus, Wikipedia is not a perfect source of information. A number of hoaxes get past Wikipedia's editors; Wikipedia's official list of known pranks is quite long. Crowdsourcing works when a topic gets a lot of eyeballs, but it's harder to fact-check bits of knowledge that are obscure or personal. And the open nature of Wikipedia means groups can mess with entries to promote their own agendas (we're looking at you, brands and government employees).
Even more advanced fact-checking concepts — like Google’s effort to use algorithms to show us the most trusted (rather than most linked-to) responses to search — are in early stages. A Google spokesperson explained that a paper outlining these ideas is still, at this point, just research. Google researchers would also need to figure out how to teach the algorithm to cope with a growing, changing body of knowledge, and how to respond to things like jokey memes.
But there's value in working on making these fact-checkers effective. Ciampaglia explained: "The DoD needs to understand how the spread of misinformation online may affect or manipulate events on the ground as part of its mission to prevent strategic surprise. Therefore they have supported research by several labs throughout the country (including ours) on methods that might help identify misinformation or deception campaigns." In addition to keeping rumors in check, the fact-checker could presumably help the government identify credible terrorist threats online.
Until the government or Google gives us a way to detect BS, you'll have to rely on your own skeptical skills while navigating the Internet, keeping a few grains of salt sprinkled on your screen.
Danielle Wiener-Bronner is a news reporter.