When will computers be able to identify this adorable pug?

You’re the worst. Shut up. You’re full of it. You suck. I can’t stand you. Go fuck yourself.

If anyone said such things to you, you’d probably take offense. But a computer? Maybe not. For a computer to understand language, algorithms have to turn language into math, and in that translation, sometimes meaning can get lost.

Last week, a new startup called Metamind released a set of tools you can play around with to see how well the company’s software understands human communication. One mini-app lets you classify any text as positive or negative, and each prediction comes with a degree of certainty.

We decided to test-drive this sentiment analysis tool with a few crass expressions whose meanings might be a bit ambiguous if you don’t have a real brain.

Metamind deemed ‘Go fuck yourself’ negative, but it was only 42 percent certain of that. It didn’t fare much better with ‘Shut up’ (neutral, 66 percent), ‘You suck’ (neutral, 43 percent), or ‘You’re full of it’ (neutral, 48 percent). A human who speaks American English would have had no trouble deciphering these insults, even without context. The system did better with ‘You’re the worst’ (negative, 98 percent) and with positive phrases like ‘I love you’ and ‘He’s the best.’
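
Metamind hasn’t published the details of its classifier, but the basic shape of such a demo is easy to sketch. The toy Python example below, which uses scikit-learn and a handful of made-up training phrases (both my choices, purely for illustration), shows how a text classifier can attach a degree of certainty to each label. Notice that a phrase made of words the model has never seen leaves it near a coin flip, one plausible reason short insults can stump these systems.

```python
# Toy stand-in for a sentiment demo: train a tiny classifier on labeled
# phrases, then report a label plus a confidence score. This is NOT
# Metamind's model; scikit-learn and the phrases are illustrative choices.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "I love you", "He's the best", "What a great day",
    "You're the worst", "I can't stand you", "This is awful",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

for phrase in ["You're the worst", "Go fuck yourself"]:
    probs = model.predict_proba([phrase])[0]   # one probability per class
    label = model.classes_[probs.argmax()]
    print(f"{phrase!r} -> {label} ({probs.max():.0%} certain)")
    # "Go fuck yourself" shares no words with the training phrases, so the
    # model falls back to roughly 50/50 -- low certainty, as in our test.
```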

We also tested Metamind’s image-labeling capabilities. The goal is to upload some labeled images, then have the software classify new images based on what it’s learned. (This kind of tool is called an image classifier, for obvious reasons.) So I uploaded a few different types of labeled images: a heron I saw in Menlo Park, a gray cat, a stuffed-toy owl, and countless photos of my former co-worker’s precocious pug, Bogart. Then I hit a button marked ‘Train.’ A minute or so later, Metamind had tried to learn the common features of the photos we’d uploaded. What, exactly, made Bogart Bogart?

After training, Metamind’s tool correctly identified images of cats, but it was more certain a cat was a cat if it was gray, like the image I uploaded. (A couple of times, though, it mistook ridiculously cute cats for Bogart.) And when I found random pictures of pugs online, Metamind labeled those as Bogart, too (a trick question, I know).
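
That upload-train-classify loop is the whole workflow. Here is a rough sketch of it, with a simple nearest-neighbors model standing in for Metamind’s actual deep network and random arrays standing in for the uploaded photos (all of it assumed for illustration):

```python
# Sketch of the "drag, drop and learn" loop: fit a model on a handful of
# labeled "photos," then classify one it has never seen. A k-nearest-
# neighbors model over raw pixels stands in for a real deep network.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def fake_photo(brightness):
    """Stand-in for a 32x32 grayscale upload, flattened to a vector."""
    return (rng.random(32 * 32) * brightness).astype(np.float32)

# The "uploads": a few labeled examples per class.
X = [fake_photo(0.3) for _ in range(5)] + [fake_photo(0.9) for _ in range(5)]
y = ["bogart"] * 5 + ["gray cat"] * 5

classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X, y)                    # the "Train" button

new_photo = fake_photo(0.85)            # an image the model has never seen
print(classifier.predict([new_photo]))  # -> ['gray cat']
```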

Metamind’s new tool is still a work in progress, but it lets you play with how these kinds of artificial intelligences learn and process the world. “We want to coin the concept of ‘Drag, drop and learn’,” said Richard Socher, an AI researcher who co-founded Metamind just four months ago. “A lot of deep-learning technology exists inside large companies. We’re really trying to bring this to anybody who can use a web browser.”

Metamind is the latest in a handful of startups hawking artificial intelligence, and more specifically deep learning, as a service. Deep learning refers to a subfield of artificial intelligence that aims to mimic how real brains process information. It involves building multi-layered software, called neural networks, whose layers are modeled after the columns of neurons in the cortex, the region of the brain that handles things like speech and vision. For example, neurons in low-level layers might recognize edges, while those in layers higher up might be able to “see” whole objects.
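
To make that layering concrete, here’s a minimal sketch of such a network in PyTorch (my choice of framework, not one any of these companies has said it uses): convolutional layers are stacked so that early ones respond to local patterns like edges, and later ones pool that evidence into whole-object scores.

```python
# The layered structure described above: early convolutional layers
# respond to local patterns like edges; later layers combine them into
# whole-object evidence that feeds the final classification.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low level: edge-like filters
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # mid level: textures, parts
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 2),                     # high level: "pug" vs. "not pug"
)

image = torch.randn(1, 3, 32, 32)  # one fake 32x32 RGB image
scores = net(image)                # one raw score per class
print(scores.shape)                # torch.Size([1, 2])
```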

Recently, deep learning has been getting the attention of all the big web companies, which have gone on a sort of hiring spree for deep-learning talent because the technique performs remarkably well at tasks like computer vision and speech recognition.

So, in the last year or so, startups like Clarifai, ExpectLabs, Ersatz Labs, BigML, Wise.io and Skymind have launched. Their business models, the markets they’re going after, and the types of algorithms they’re selling differ, but they all have essentially the same mission: to democratize AI.

Joshua Bloom, the founder of Wise.io, likens this trend to what happened with cloud computing a few years ago. Companies used to invest in their own data centers, but then Amazon started offering cloud services on the cheap. This, some say, helped once-small internet startups like Netflix, Uber, and Dropbox mushroom by letting them spend money on talent rather than on setting up their own servers and the buildings to house them. Today, distributed computing is so inexpensive that even consumers have their own digital lockers in the cloud.

Metamind, like Wise.io and others, is going after big enterprise clients, such as financial institutions, insurance companies, and healthcare organizations, to pay the bills.

But it’s also going a bit further with tools like the ones we tested. With these, it’s pointing toward a future where we each have our own artificial neural networks. And that’s what’s potentially more interesting in the long run, because it hints at a truly customizable AI.

Metamind isn’t alone in imagining a future where personalized neural networks are the norm. A decade from now, neural networks will look very different from the ones we have today, said Dave Sullivan of Ersatz Labs, one of the startups selling deep learning as a service. We could very well have “our own little neural networks” on our smartphones and gadgets that help us keep track of the music or people in our lives, he said.

Imagine if you owned a security camera for your home, for example, and you wanted it to detect when certain people—say, a landlord or a gardener—came and went. A company would provide you the basic structure of a neural network, ready-made, but you could upload your own images, like photos of those people, and train the software to recognize them. Or if you were interested in leaves or birds, you could train a neural network that would identify the flora and fauna in your neighborhood.

“It would be like updating an app,” said Eugenio Culurciello, a professor at Purdue University and founder of Teradeep, a startup focused on computer vision.
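
That ready-made-but-retrainable idea maps onto what practitioners call transfer learning. A hedged sketch in PyTorch (again my choice, not a product any of these companies has shipped): keep a network pretrained on millions of generic images frozen, and train only a fresh final layer on your own handful of labels.

```python
# Transfer-learning sketch of the "ready-made network you retrain on your
# own photos" idea. The pretrained backbone is kept frozen; only a new
# output layer is trained on the user's labels.
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)  # the ready-made part

for param in model.parameters():   # freeze what the network already knows
    param.requires_grad = False

# Swap in a fresh output layer for your labels: landlord, gardener, stranger.
model.fc = nn.Linear(model.fc.in_features, 3)

# Only the new layer's weights would be updated when training on your photos.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable, "trainable parameters")
```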

Because most of these tools will be social, communities could also share their neural nets to create local databases. (On Metamind, you can already make your classifier public.) If you lived in a surveillance-happy neighborhood, you could use a block’s worth of home security cameras to train a neural net to identify people who didn’t live on the street.

To be sure, right now all of this is difficult. For one, deep neural networks, like the ones Metamind is developing, thrive on really large amounts of labeled data, more than any one of us has at our disposal. “Deep learning is a really cutting-edge approach, but it’s not a silver bullet for everything. One of the challenges I see is that it shines when you have tremendous amounts of data, Google-scale data,” said Bloom, the Wise.io founder. With that data, Google has been able to come up with a system that’s incredibly good at classifying images and words. But without it, it might be difficult to build personalized neural networks that are accurate. (Metamind does warn users that for good accuracy, they should upload at least five to 10 images for each label.)

Then, there’s the issue of training and setting up the right parameters for the problem you want to solve. Right now, that isn’t exactly an out-of-the-box solution. Fine-tuning all the features and parameters that make an algorithm work, some experts have said, can be something of a dark art.

“You have to think quite carefully about what are the features you want to include [and] how to encode them,” said Andrew McCallum, a computer science professor at the University of Massachusetts, Amherst. “These things can make a huge difference. Sometimes you can’t tell until you put the data in and play around with it a bit.”

If a system like Metamind didn’t provide a way to do this, McCallum added, “I would be concerned.”
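
To see why the tuning feels like a dark art, consider how many knobs even a simple model exposes, and how hard their best settings are to guess in advance. One common, brute-force remedy is to search over candidate settings and let cross-validation pick the winner, as in this scikit-learn sketch on synthetic data (my example, not part of Metamind’s platform):

```python
# Hyperparameter search: try every combination of a few candidate
# settings and keep whichever scores best under cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

search = GridSearchCV(
    SVC(),
    param_grid={
        "C": [0.1, 1, 10],        # how hard to fit the training data
        "gamma": [0.01, 0.1, 1],  # how local the decision boundary is
    },
    cv=5,                         # judge each setting by cross-validation
)
search.fit(X, y)
print(search.best_params_, f"accuracy={search.best_score_:.2f}")
```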

In other words, it might take some time for neural networks to evolve into the types of things Sullivan and others are predicting, but Metamind’s platform might be one step in that direction.

Daniela Hernandez is a senior writer at Fusion. She likes science, robots, pugs, and coffee.
