Two weeks ago, I went on a quest to find the best erotica e-books the internet has to offer. Naturally, I started by typing "best erotica" into Google, which led to multiple lists of must-read literary porn: hard-core, Victorian, LGBTQ, paranormal, werewolf, romance, trashy, and, of course, many different shades of Fifty Shades of Grey.
My goal wasn't to curl up on a couch with my iPad, a candle and a bottle of wine. I wanted to show an algorithm what erotica was so that I could test whether it could learn to write its own. Yes, my smut-search was for science.
A human can learn what erotica is after reading just a few stories. We know it usually involves star-crossed lovers and a lot of c-words. But algorithms don't have those intuitions. They need thousands of examples to be able to pick out patterns obvious to humans right away. I needed to get my dirty little hands on as much erotica as possible to teach an algorithm the intricacies of language and sexual arousal. A coder told me I needed at least 750,000 words, but I was aiming for a million.
I started with the lists Google recommended. Then, I visited SexyFic publishing, an erotic reading smorgasbord, with titles like The Warehouse 2: Hung Up. Two hours into my search, I had tons of great material.
But then I began to wonder whether copying and feeding these stories into an algorithm was allowed. Sure, I could read all these stories and then write something derivative, but how does the law deal with a computer doing the same thing? I asked a copyright lawyer who advised me to stick to erotica with a Creative Commons license.
So I collected DTF ("down to freely-share") books, including a collection of stories dubbed "99 Erotic Notions" which included Supersex 3000, a story about holographic sex. (My erotibot would surely appreciate a tale about technology-mediated sexual encounters.) I sucked up all I could from TxtGasm and TueBl's Creative Commons erotica collection, including Studs: Gay Erotic Fiction, Straight Boy Jock Chronicles, The Immoral Simpsons (yes, those Simpsons), and others. These sites were like buffets of erotic fantasies: bi, trans, gay, lesbian, straight and everything in between. Just the kind of raunch my bot needed to be an equal opportunity love machine.
I also reached out to authors through Twitter and email, asking them if they'd consider "donating" their naughty narratives to my cause. Sex blogger "Lunabelle" and "Leonard Delaney," who penned Invaded by the Apple Watch and other ridiculous stories about a future in which we get intimate with our devices, agreed to send me some of their work.
I amassed about 1.2 million naughty words. Not a Google-sized dataset by any means, but more than my initial goal. It was time to start training.
Some of the best tools available for getting machines to understand language are neural networks, programs that are exceptional at learning patterns from lots of data. More specifically, AI experts typically use recurrent neural networks because they have a built-in short-term memory. That makes them especially good at keeping track of sequences like words and sentences. Google, Microsoft, Facebook and others are using them to build smarter chatbots, for machine translation, and for automatic image captioning.
I've never coded anything, and neural networks are complicated, so I used a pre-baked open-source tool called char-rnn — it's a recurrent neural network that learns to predict the next character in a sequence. Erotibot would read through stories like Supersex 3000 and The Immoral Simpsons, learn the patterns of letters and words and use that to craft its own naughty fiction.
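To make "predict the next character" concrete, here is a minimal Python sketch of how a text file can be sliced into training examples for a character-level model. It is not char-rnn's own code (char-rnn is written in Lua on top of Torch), and the function names `build_vocab` and `next_char_pairs`, the sample sentence and the context length are all assumptions made up for illustration.

```python
def build_vocab(text):
    """Map each distinct character to an integer id (sorted so ids are stable)."""
    chars = sorted(set(text))
    return {ch: i for i, ch in enumerate(chars)}


def next_char_pairs(text, context_len):
    """Yield (context, next_char) pairs: the model sees `context` and must guess `next_char`."""
    for i in range(len(text) - context_len):
        yield text[i:i + context_len], text[i + context_len]


if __name__ == "__main__":
    sample = "hello there"  # stand-in for the real training text
    vocab = build_vocab(sample)
    for context, target in next_char_pairs(sample, context_len=4):
        ids = [vocab[ch] for ch in context]  # characters become numbers for the network
        print(repr(context), ids, "->", repr(target))
```

A network like char-rnn is trained on a very large number of such pairs, adjusting itself until its guesses for the next character match the text more often.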
Lucky for me (and anyone else wanting to birth their own baby AI), other AI-lovers have put together comprehensive instructions on how to make your own. Samim Winiger, a German programmer who has used char-rnn to build bots that give TED Talks and visualize porn, even built a virtual machine on the cloud platform Terminal with all the plug-ins necessary to run char-rnn.
I signed up for Terminal (it's super cheap, $0.006/hour for the least expensive option), uploaded my million-word erotica text file and then put the virtual machine to work processing and "learning" about erotica.
My neural network was up and running, but it's not like the Matrix where skills are downloaded instantaneously. An artificial brain takes a while to be trained. After a couple of days of processing, Erotibot, as I had named it, was ready to share its "fantasies" with the world.
Erotibot is a simple AI, so he needed a little help to get the ball rolling. I asked Fusion colleagues to share the first line of the erotic novel they'd write if they had the time, and used those phrases as Erotibot's prompts, including "Hi, my little nibble" and "she had a strong back, like a teenage boy." (Without a phrase from me to get his creative sparks flying, he's silent.)
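For the curious, the idea of priming a text generator with an opening phrase can be sketched in a few lines of Python. To be clear, this is a deliberately simpler stand-in, not a neural network: it just counts which character tends to follow each short run of characters and then samples from those counts. The names `train_counts` and `generate`, the tongue-twister corpus and the `order` setting are assumptions for illustration only.

```python
import random
from collections import Counter, defaultdict


def train_counts(text, order):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts


def generate(counts, seed, order, length, rng):
    """Extend `seed` one character at a time, sampling by observed frequency."""
    out = seed
    for _ in range(length):
        followers = counts.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out += rng.choices(chars, weights=weights, k=1)[0]
    return out


if __name__ == "__main__":
    corpus = "she sells sea shells by the sea shore. " * 20  # toy training text
    model = train_counts(corpus, order=3)
    print(generate(model, seed="she", order=3, length=60, rng=random.Random(0)))
```

Because the text is assembled letter by letter, a generator like this can wander into strings that look like words but aren't, and it has nothing to say if the seed doesn't match anything it has seen.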
In less than a minute, he was telling me a story about lovers "talking to the sperm at the carefully in love straight off as I started talking about the room." Here's one of the stories Erotibot wrote for us:
If you prefer your automated-erotica audiobook style, here you go. (I've done a little bit of editing on this one.) Put your headphones on, because it's NSFW:
Erotibot's sexual musings were nonsensical, awkward and difficult to read, not unlike the erotic tweets some Twitter bots had written about Taye Diggs last year. He used nonexistent words like "baressical" and "afreutifully." I expected that. Language understanding is one of the most difficult problems in AI. Even with powerful tools like recurrent neural networks, it's difficult to generate written text that always makes sense. That's why most chatbots we interact with aren't up to snuff. They can't keep up with long-running conversations; they lose sight of context and don't get things like slang, idioms, sarcasm or humor. All this makes robo-writing challenging.
Also, my erotica training set was puny compared to what Google, Facebook and Microsoft can amass. In AI, size matters. Neural net guru Geoff Hinton once told me he joined Google because there he could build the biggest neural network humanity had ever seen to decipher language. Neural nets require a lot of computers and data, and Google has plenty.
Getting Erotibot to write stories people might actually want to buy was never my goal. I just wanted to get an appreciation for what went into training a neural network. There are easier-to-use tools for non-experts out there, thanks to companies like Metamind and Microsoft, but using char-rnn gave me a better glimpse into how much work researchers put into developing the AI we take for granted. "Lots of data" is meaningless until you have to gather it yourself.
It also gave me insight into what creeps people out about AI. The erotica authors had written their stories for humans to read, but now my bot was "reading" them too, extracting from them what turns people on: dirty talk, being hand-cuffed, cocks of unlikely lengths (two feet!), celebrity sex…. I understood why some writers I contacted to lend me their stories for this project declined or never wrote back. Theoretically, AI could take their work product and write an endless number of stories, making the human authors unnecessary.
And an AI could even personalize it, pairing it with other data about us: our purchase history, activity logs, food diaries, friendships and relationships. So, theoretically, the Erotibot could write porn made just for you.
My bot couldn't do any of that yet, though. And professional AIs, in general, can't do this well either, but that may be where we're heading.
Erotibot read one of his stories during our first live Real Future show in Los Angeles last week. (His voice came to life thanks to text-to-speech software developed by Acapela Box.) The audience thought he was funny, maybe for all the same reasons we laugh at the tone-deaf kid who sings in public, oblivious to how bad he sounds. Erotibot is rudimentary, and his writing basically sucks:
So I looked around to the table to stay to her right nipples, my life was so not so because that to the balls…. I want to be a little bitch.
But even in its incompetence, there were glimmers of genius, like that last sentence. I want to be a little bitch. I'd be willing to bet a vibrator that a human has written a variation of that line, either for a story or in a text to a lover. That's why it's my favorite phrase in Erotibot's writings. It's probably perfect human-like sentences like this that keep the pros feeling like they're making progress toward more intelligent AI.
Daniela Hernandez is a senior writer at Fusion. She likes science, robots, pugs, and coffee.