Some people have suggested that Hillary Clinton has changed her message to latch on to Bernie Sanders' popularity, especially with younger voters. A "Saturday Night Live" parody even had Clinton slowly morphing into a white-haired politician railing against "millionaihs and billionaihs."
But how much of an effect has Sanders really had on what the other candidates are talking about? Has Clinton really started sounding more like him?
To answer those questions, we used computational linguistics techniques and analyzed the text that each presidential candidate has shared on Twitter and Facebook. We plotted the results to create a visual representation of what they've talked about most.
The upshot: Sanders seems to have had more influence on the minds of voters than on his rival's campaign. Clinton's and Sanders' messaging has remained distinct over time. They stand apart from each other, and from the tangle of overlapping messages competing in the Republican primary.
Data source: Facebook and Twitter posts by the candidates; May 2015 to February 2016. Smoothed scatterplot of semantic topics mentioned by the candidates over a rolling 30-day window.
Each point in the graph represents 30 days of a candidate's Facebook and Twitter posts, plotted using latent semantic analysis, a technique commonly used by search engines to identify the most important topics in a set of documents.
For our analysis, we made a "document" out of each 30-day window of each candidate's social posts, starting last May. Then we used LSA to arrange those documents by similarity. Each recurring topic was given a numeric value at random so that the data could be represented in two dimensions. Those values were plotted on a grid to create an abstract representation of the social media conversation.
We indicated topically important words and the directions to which they correspond. For instance, the more a candidate used the word "billionaire" (or words often associated with the word "billionaire") in a month, the farther left that point is. Referring to topics like "conservative" or "poll" pushes the point downward.
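For readers who want to see the mechanics, here is a minimal sketch of latent semantic analysis on toy data. It is not the code behind the chart: the documents, the candidate labels "A", "B" and "C", the TF-IDF weighting and the use of a truncated singular value decomposition are all assumptions, chosen because they are a common textbook way to run LSA. It needs the NumPy library.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical toy "documents": one string per candidate per 30-day window.
docs = {
    "A / window 1": "billionaire class revolution wall street billionaire",
    "A / window 2": "revolution billionaire income inequality wall street",
    "B / window 1": "conservative poll debate tax conservative",
    "B / window 2": "poll conservative debate border tax",
    "C / window 1": "women families equal pay debate",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


labels = list(docs)
counts = [Counter(tokenize(docs[name])) for name in labels]
vocab = sorted(set().union(*counts))

# Term-document matrix: one row per word, one column per document.
X = np.array([[c[w] for c in counts] for w in vocab], dtype=float)

# Assumed weighting: scale each word by a smoothed inverse document
# frequency so words that appear in every document count for less.
doc_freq = (X > 0).sum(axis=1)
idf = np.log(len(labels) / doc_freq) + 1.0
X = X * idf[:, None]

# LSA step: a singular value decomposition, truncated to two dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_coords = Vt[:k].T * s[:k]  # one (x, y) point per document
term_dirs = U[:, :k]           # how strongly each word pulls along each axis

for name, (x, y) in zip(labels, doc_coords):
    print(f"{name:14s} x={x:+.2f} y={y:+.2f}")

# The words with the largest loadings on each axis play the role of the
# labeled directions on the chart.
for axis in range(k):
    top = np.argsort(-np.abs(term_dirs[:, axis]))[:3]
    print(f"axis {axis}:", [vocab[i] for i in top])
```

In this sketch, documents that share vocabulary land near each other, and a word's loading on an axis shows which way heavy use of that word moves a point. The signs of the axes are arbitrary, so "left" and "down" carry no meaning beyond the words attached to them.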
This technique is based entirely on the words the candidates are using, without taking into account who said them. Even so, the result appears to be a reasonable representation of the race: the Republican candidates at one end, Sanders at the other, and Clinton and Martin O'Malley somewhere in the middle.
Some of our data predates the launch of Sanders' campaign last May. The graph shows that before he entered the race, his social content was relatively similar to the rest of the candidates'. Since his campaign ramped up, though, he has been consistently different.
To see what the Sanders-Clinton difference looks like in practice, we identified the words from each month that had the largest difference in usage between the two candidates:
Data source: Facebook and Twitter posts by the candidates; May 2015 to February 2016. These are the words whose frequency of use differed most between the two candidates each month.
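One simple way to run a comparison like this is sketched below. The toy posts, the function names and the metric itself (the difference in each word's share of a candidate's total words) are assumptions for illustration. The exact measure behind the chart may differ.

```python
import re
from collections import Counter

# Hypothetical toy posts for a single month; not the real data.
posts = {
    "sanders": [
        "billionaire class rigged economy",
        "political revolution against billionaire greed",
        "billionaire donors buy elections",
    ],
    "clinton": [
        "women deserve equal pay",
        "families first women first",
        "proud lgbt families women",
    ],
}


def relative_freq(texts):
    """Return each word's share of all the words in the given texts."""
    words = [w for t in texts for w in re.findall(r"[a-z']+", t.lower())]
    total = len(words)
    return {w: n / total for w, n in Counter(words).items()}


def most_distinctive(freq_a, freq_b, n=3):
    """Return the n words that A uses more than B, and the n that B uses more than A."""
    vocab = set(freq_a) | set(freq_b)
    diffs = {w: freq_a.get(w, 0.0) - freq_b.get(w, 0.0) for w in vocab}
    # Break ties alphabetically so the result is deterministic.
    favors_a = sorted(vocab, key=lambda w: (-diffs[w], w))[:n]
    favors_b = sorted(vocab, key=lambda w: (diffs[w], w))[:n]
    return favors_a, favors_b


sanders_words, clinton_words = most_distinctive(
    relative_freq(posts["sanders"]), relative_freq(posts["clinton"])
)
print("Sanders:", sanders_words)
print("Clinton:", clinton_words)
```

Using shares rather than raw counts keeps a candidate who simply posts more from dominating the comparison. A real analysis would probably also drop common filler words before ranking.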
Sanders' preferred topics are relatively consistent. "Billionaire" appears almost every month, and "revolution" appears four times. Clinton's are more eclectic, including everything from "LGBT" to mentions of people who've endorsed her, riffs on pantsuits, and gefilte fish. Clinton seems more focused on reaching specific constituencies, such as Latinos, women or people with disabilities.
The issues Sanders has raised, such as the income gap, campaign finance reform and the legalization of marijuana, have resonated with millions of voters. These issues have set his campaign apart, but for now he stands alone.
Daniel McLaughlin is a creative technologist exploring the 2016 presidential election. Before joining Fusion, Daniel worked at the Boston Globe and graduated from MIT with a BS in urban studies and planning.