Suppose this column was written by a robot. Would you care? Should you?
The Associated Press recently announced that most of its coverage of corporate earnings reports will soon be “produced using automation technology.” It’s just the latest example of algorithm-driven tools that arrange straightforward English sentences into a news story, with little or no human intervention — the bot-ification of journalism.
It’s no surprise that the headlines about this sort of thing sound so jarring — “The AP’s newest business reporter is an algorithm” or “Associated Press Will Use Robots To Write Articles.” Journalists (me included!) have been obsessing for a couple of years about code-driven storytelling as a “potentially job-killing technology.”
But forget for a moment about the plight of journalists. (You probably already have, and you may not even trust us in the first place!) What does auto-journalism mean to the average reader? Can the robo-press produce material that’s useful, readable, relevant?
To find out, I spent a good amount of time reading and scrutinizing the output of computerized journalists. I realized two things. First, it’s already doing a better job than you probably think. And, second, while it’s not going to replace journalism as we know it (sorry, “lamestream media” haters!), it is going to add a lot to contemporary information flow — mostly for the better.
At the moment, bot-news seems to be dominated by just two major players.
One is the company the AP just hired: Automated Insights, which says its Wordsmith product “transforms Big Data into narrative reports by spotting patterns, correlations and key insights in the data and then describing them in plain English, just like a human would.”
The other is called Narrative Science. Its product is Quill: “an artificial intelligence platform” that mines data sets and “delivers meaning and insight in a form that makes natural sense to all of us, as narratives.” As it happens, this algorithm, or robot, or whatever, already writes earnings previews for Forbes.com. Like Automated Insights, it evolved out of sports coverage, starting out as a university project that generated recaps from box scores.
(Disclosure: Automated Insights’ technology is used in Yahoo’s fantasy sports coverage.)
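Neither company publishes its internals, but the pattern both describe — spot the key facts in structured data, then render them through hand-written sentence templates — can be sketched in a few lines. Everything below (the function name, the earnings figures, the phrasing) is invented for illustration, not taken from Wordsmith or Quill:

```python
# Toy sketch of the data-to-narrative pattern: derive a fact from
# structured data (did earnings beat the consensus?), then render it
# through a plain-English template. All names here are invented.

def earnings_sentence(company, eps_actual, eps_expected):
    """Render one plain-English sentence from earnings figures."""
    diff = eps_actual - eps_expected
    if diff > 0:
        verb = "beat"
    elif diff < 0:
        verb = "missed"
    else:
        verb = "met"
    return (f"{company} {verb} analyst expectations, reporting earnings "
            f"of ${eps_actual:.2f} per share against a consensus of "
            f"${eps_expected:.2f}.")

print(earnings_sentence("Acme Corp", 1.12, 0.98))
```

Real systems layer on many more facts, comparisons, and phrasing variations, but the shape — data in, canned sentences out — is the same.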
I sampled the prose generated by both of these firms, and while I wish I could point to stylistic differences, they were pretty similar. And competent.
Certainly there are occasional oddities: “The majority of analysts (100%) rate Steelcase as a buy,” notes this Narrative Science write-up. “All two analysts rate Steelcase as a buy.” All two, eh?
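Glitches like that are what you get when a template picks its quantifier from a ratio without ever looking at the raw count. A guess at the failure mode, and the extra case logic that avoids it — this is speculation about the cause, not Narrative Science’s actual code:

```python
# A plausible source of the "All two analysts" glitch: the template
# checks only whether buys == total, never how large total is.

def naive_quantifier(buys, total):
    if buys == total:
        # With a number-to-word step, this becomes "All two analysts".
        return f"All {total} analysts"
    return f"{buys} of {total} analysts"

def better_quantifier(buys, total):
    if buys == total:
        if total == 1:
            return "The sole analyst covering the stock"
        if total == 2:
            return "Both analysts"
        return f"All {total} analysts"
    return f"{buys} of {total} analysts"

print(naive_quantifier(2, 2))   # "All 2 analysts"
print(better_quantifier(2, 2))  # "Both analysts"
```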
And the game recaps generated on Automated Insights’ StatSheet site tend toward unwieldy sentences like this: “Brian Roberts finished 4-for-5 and David Robertson notched his 20th save of the year as New York stopped any hopes of a rally to top Minnesota, 6-5.”
But those are quibbles. And the one relevant study I’m aware of comparing reader reactions to stories written by bots vs. humans resulted in, basically, a draw. Subjects reportedly found the human-made article more “pleasant” to read — but considered the computer-generated piece more “trustworthy.”
See what you think: Take our quiz to test your skill at telling bot-prose from the human kind.
Most interestingly, even the reaction gaps in that study were marginal. “The lack of difference may be seen as an indicator that the software is doing a good job, or it may indicate that the journalist is doing a poor job,” one researcher commented. “Or perhaps both are doing a good (or poor) job?”
As long as we’re on the subject of the relative merits of journalistic execution: The excitable coverage of robo-writers generally downplays or simply omits the fact that what most of us think of as news stories are a relatively minor element in the business models of Narrative Science and Automated Insights.
Most of Narrative Science’s clients are financial firms, government entities, and corporations using its tools to make sense of unwieldy data sets for internal reasons. Automated Insights pitches its Wordsmith tool for generating marketing return-on-investment reports, or for converting fitness-tracking data into personalized narratives. In other words, these are technologies for coping with “Big Data.”
These goals overlap with journalism, really, in a fairly limited way. It still takes a human to decide what to cover, from what angle, and with what point of view. And that’s what most readers want. While a bot could report the specs of the next iPhone accurately, you’d still want to know from an actual person (even a journalist!) whether it’s worth your money.
Still, in those cases where the data-to-narrative trick really can perform a journalistic function, the point is that while the bots may not be impressive writers, they are very meticulous reporters. Sure, a robo-composed game recap may be marred by awkward phrasing and clichés — but if the data is good, the algorithm won’t mistype a score or botch a batting average.
Plus, these robots don’t mind boring tasks and are very productive. Which is why so far what they are mostly doing is writing stories that would never have been written otherwise. ProPublica, the revered nonprofit news organization, has tapped into Narrative Science’s technology to produce data-derived “narratives” about more than 50,000 public schools — a useful addition to a project scrutinizing education opportunities, and one that’s hard to fathom being executed by humans without a huge budget adjustment.
Some bot-journalism is now originating in traditional-journalism operations. Most notably: The Los Angeles Times devised a “Quakebot” that was evidently the first to “report” the news of an (ultimately very minor) L.A.-area earthquake earlier this year. “Whenever an alert comes in from the U.S. Geological Survey about an earthquake above a certain size threshold,” Will Oremus of Slate explained at the time, Quakebot scoops up USGS data and pours it into “a pre-written template,” which gets reviewed by an editor before publication and can be endlessly revised by other humans afterward.
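The pipeline Oremus describes is simple enough to sketch: an alert arrives, a size threshold gates it, the data fills a pre-written template, and a human editor gets the draft. The field names, threshold, and template wording below are assumptions for illustration, not the Times’ actual implementation:

```python
# Minimal sketch of the Quakebot pattern: gate a USGS alert on
# magnitude, then fill a pre-written template for editor review.
# Threshold, field names, and wording are invented assumptions.

MAGNITUDE_THRESHOLD = 3.0

TEMPLATE = ("A magnitude {mag} earthquake struck {dist} miles from "
            "{place} at {time}, according to the U.S. Geological Survey. "
            "This post was created by an algorithm.")

def draft_from_alert(alert):
    """Return a draft story for editor review, or None if too small."""
    if alert["magnitude"] < MAGNITUDE_THRESHOLD:
        return None
    return TEMPLATE.format(mag=alert["magnitude"],
                           dist=alert["distance_miles"],
                           place=alert["nearest_city"],
                           time=alert["local_time"])

draft = draft_from_alert({"magnitude": 4.4, "distance_miles": 6,
                          "nearest_city": "Westwood, California",
                          "local_time": "6:25 a.m."})
print(draft)
```

The human stays in the loop at exactly the point the machine is weakest: deciding whether the draft is worth publishing at all.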
The news-bot elite make some rather grandiose claims for their code. One auto-journo entrepreneur speculated that, for instance, algorithms will write 90 percent of news stories within 15 years, and win a Pulitzer within five.
I’m not sure that computes — unless we change the definition of a “news story,” or count a scenario where bot-tools play a supporting role in a Pulitzer-worthy project. I also wonder about the reverse scenario: Couldn’t a flawed data set or misguided algorithm result in some journalistic equivalent of a flash crash?
That said, as a news consumer I’m more intrigued than ever about how this form of reporting tool could be used creatively, leading to stories that couldn’t, or wouldn’t, have been told before. And as a journalist? I, for one, welcome my new robot peers.