“Machine learning has moved past the high-school sex phase (everyone talking about it, but no one doing it), to much broader implementations across industries including media.”
Robin Govik, who is responsible for digital strategy at Mittmedia, and Sören Karlsson, CEO of NLG startup United Robots, have worked on some of the most advanced examples of automated content. At DIS 2019 they explained how bots have become editors and looked at the pros and cons of letting these robot journalists take over.
Better user experience
Robin began his presentation at DIS 2019 by talking about his experiences as a journalist. He explained: “I’ve been a journalist for years, and I am now responsible for digital strategy at Mittmedia, a local news publisher with 28 newspapers. When I started in journalism we were told that it was important to have a good gut feeling, because we didn’t really know the readers. But we could base our decisions on guesses, powered by experience.”
Robin explained that today Mittmedia knows its users better than ever, as they are logged in and the company has access to their details and preferences.
“We use the data points to give them a better user experience,” added Robin. “And of course to make more money. Both directly by targeted ads, but also indirectly by producing news products that are tailored to their needs.”
Robin stressed the importance of delivering content that keeps readers loyal and coming back for more. To do this, Mittmedia needs good, clean data to create the right content for its subscribers.
“Reporter resources are slim nowadays, so we need to use them wisely and make data-informed decisions on what content to produce,” added Robin. “But distributing the content is another thing. We need to enhance the relevance of the selection of stories that a user gets when they visit our sites and apps. Until recently we used to present our stories in our digital channels just like we did in the printed newspaper. Everybody got the same selection of stories. But that changed in October for us in Mittmedia.”
Robin then showed the delegates the Mittmedia website, explaining that “only the three top stories on our sites and apps are manually published by our editors. All other links and teasers are automated and personalised.”
“Personalisation is not the goal. The goal is to enhance relevancy, and relevancy is deeply personal. Personalisation is just a tool. And a very effective tool. If you are interested in ice hockey you will now see more ice hockey content. But you won’t only see ice hockey news, because that’s not what you want. Personalisation for us is to enhance the selection, not build an entirely personalised feed. The editorial input is still very important.”
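The mix Robin describes, three editor-chosen top stories followed by an automated, interest-weighted selection, can be illustrated with a minimal sketch. This is not Mittmedia's system: the `Story` fields, the `editorial_score` value, the `boost` weight and the page size are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    title: str
    topic: str
    editorial_score: float  # hypothetical newsroom importance, 0..1


def build_front_page(editor_picks, candidates, interests, boost=0.5, size=10):
    """Pin up to three editor picks, then rank the rest for one user."""
    top = list(editor_picks[:3])
    pinned = {story.title for story in top}
    rest = [story for story in candidates if story.title not in pinned]

    def score(story):
        # Interest only boosts a story; editorial importance still counts,
        # so the page is never filled with a single topic by design.
        return story.editorial_score + boost * interests.get(story.topic, 0.0)

    ranked = sorted(rest, key=score, reverse=True)
    return top + ranked[: max(0, size - len(top))]
```

Because the interest weight is added to the editorial score rather than replacing it, an ice hockey fan sees more ice hockey without losing the rest of the news, which is the "enhance the selection" idea from the talk.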
“We have clustered our users into groups depending on areas of interest. Both from a geographical perspective but also a topical perspective,” continued Robin. “An interesting fact here is that we don’t use administrative boundaries like counties or towns. Instead, we used machine learning to look at real user behavior and found new geographical clusters that we use for our personalisation.”
“From a user perspective, you always want your local newspaper to feel close to you. But when we ask users if they want a personalised experience they say ‘no’. At the same time, they still want to see the news that’s relevant to them. What I believe we have to do here is to listen to the users, but not always do as they say.”
Robin reported that the results since personalisation was launched in October last year are impressive: a 50 percent increase in click-through rate from the start pages, and growing loyalty among users.
Robin then said that one of the main issues for Mittmedia was to increase the amount of local content being created, and this is where automation provides a solution.
“The creation of the bots was the answer to a specific need,” explained Sören. “The publishers in Sweden have to create much more content to deliver personalisation. For local newspapers personalisation starts and ends with geography. We wanted to see what type of story we could create with algorithms for data.”
As Sören explained, they started with sports data articles because the rules of the games are defined and easy to work with. He added that volume is one of the most important features, which makes distribution on the publisher’s side challenging. Sören said his bots produce over 2,000 stories a day.
He then outlined the different types of bots and the content they create.
First up is Sports bot - a simple football bot that started in 2016. A local publisher in Sweden created a whole new site where they published a story for every game played. They had both reviews of the game and reports. The publisher mixed in traditional football coverage and delivered a whole new site, which truly covered everything about football in that region.
A second bot was created together with sports publisher Sportbladet to produce more advanced sports texts. Sören said that they had more data to work with, including historical data, when each goal was scored, what position on the pitch it came from and so on, to make the story more interesting.
Another innovation is to automatically add comments from coaches into the text; this will be rolled out in April. As Sören explained: “The platform automatically sends questions to the coach, which are based on the automated text we are producing at the same time. When the coach replies it goes through several filters and then their comment is published into the text that the robot wrote.”
Sören then mentioned two non-sports bots: a news bot and a property bot. The news bot accesses data from Swedish traffic and weather authorities and then creates stories about traffic accidents and weather warnings, which are then optimised for a local audience.
The property bot automatically takes details of a house sale in Sweden and then creates a story delivering the details. The pictures are taken from Google Street View.
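To make the idea of rule-based sports texts concrete, here is a minimal template sketch. It is not United Robots' implementation: the function name, the wording rules and the `(minute, scorer, team)` goal format are assumptions chosen for illustration.

```python
def match_report(home, away, home_goals, away_goals, goals=()):
    """Turn one structured result into a short text using fixed templates."""
    if home_goals == away_goals:
        headline = f"{home} and {away} draw {home_goals}-{away_goals}"
    else:
        winner, loser = (home, away) if home_goals > away_goals else (away, home)
        high = max(home_goals, away_goals)
        low = min(home_goals, away_goals)
        # The verb depends on the winning margin, a simple editorial rule.
        verb = "edges past" if high - low == 1 else "beats"
        headline = f"{winner} {verb} {loser} {high}-{low}"

    sentences = [headline + "."]
    # Richer data, such as goal minutes, adds detail sentences in order.
    for minute, scorer, team in sorted(goals):
        sentences.append(f"{scorer} scored for {team} in minute {minute}.")
    return " ".join(sentences)
```

The same function can be run over every result in a league table, which is why volume costs so little once the templates exist.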
“We are now experimenting with inserting a list of famous people into the database and then we match the buyer or seller to the article,” said Sören. “So that when a famous person buys or sells a house the newsroom gets an alert. It generates news for the newsroom.”
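The watchlist matching Sören describes could look roughly like the sketch below. The record fields (`address`, `buyer`, `seller`) and the function names are hypothetical, and matching on names alone will produce false positives for common names, so any hit would still need checking by the newsroom.

```python
def normalise(name):
    """Lower-case a name and collapse whitespace so simple variants match."""
    return " ".join(name.lower().split())


def sale_alerts(sales, watchlist):
    """Return (address, role, name) for each sale involving a watched name."""
    watched = {normalise(name) for name in watchlist}
    alerts = []
    for sale in sales:
        for role in ("buyer", "seller"):
            name = sale.get(role, "")
            if normalise(name) in watched:
                alerts.append((sale["address"], role, name))
    return alerts
```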
Sören also said that they are working on stock market data. “If data can tell a story we can analyse it and create content,” he concluded.
Robin then took over the presentation again and talked about the efficiency of automated content.
“Mittmedia’s Text Robot is our most efficient ‘employee’, if you compare it to a human being,” he declared. “It produced 64,000 articles and got three million logged-in page views last year. That’s impressive by our standards.”
Robin added that most of the time readers are unaware they are consuming automated content.
“In a survey that we sent out to users who we knew had read a robot article, two-thirds said they didn’t notice the article was written by a robot. Readers care about the content. Who actually wrote the article is not the important thing.”
Pros and cons of robots
Finally Robin spoke of some of the pros and cons of automated content.
Better journalism. “Some of you perhaps think it’s a bold statement to say that news automation will lead to better journalism. From my perspective, it’s totally logical. We should use humans for tasks humans are good at. Journalists make the difference. They investigate, they ask the hard questions, they use words to paint the reader a picture.
“A bot, a news automation, is a brilliant tool to use to write thousands of stories based on data. Like the examples Sören has made. With the help of robots, we can free up reporters to do stuff that can’t or shouldn’t be automated.”
Better user experience. “We can create content that wasn’t possible before. Like covering all football games or house sales. Topics that readers care about. A story can adapt to your personal interests, something that only a robot-written article can do. That is a really good user experience.”
Increased quality. “There’s a reason we use the expression ‘We are only human’. Because humans fail from time to time. Studies have shown that automated stories have far fewer errors than human-written stories. Of course, errors can occur in robot-written stories, but then we can treat it just like a software bug. We fix it. And robots never make the same mistake twice.”
Bad data – bad robot. “If the data is wrong it can have bizarre consequences. Like when our sports data supplier got the results wrong. A game between two soccer teams ended 1–0. But because the person responsible for reporting the result put it in the wrong field it was reported as 10–1. So the headline read something like ‘Humiliating loss for Gnarp’ when in reality the game was pretty even. To tackle this type of problem it is important that you take measures to enhance the data quality. In this case, our data supplier developed an alert notifier if the result was improbable.”
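An improbable-result check of the kind the data supplier added can be very simple. The sketch below is an assumption about how such a check might work, not the supplier's actual rule; the thresholds `max_goals` and `max_margin` are invented for the example and would need tuning per sport and league.

```python
def is_improbable(home_goals, away_goals, max_goals=8, max_margin=6):
    """Flag a reported score for human review before a story is generated."""
    if home_goals < 0 or away_goals < 0:
        return True  # negative scores can only be data-entry errors
    too_many = max(home_goals, away_goals) > max_goals
    too_lopsided = abs(home_goals - away_goals) > max_margin
    return too_many or too_lopsided
```

With these thresholds the mistaken 10–1 result would be held back for review, while the real 1–0 result would pass straight through.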
Potentially dull and spiritless. “The old way of publishing might not work. Automated content must be published in the right context.
“The strength of automated content is quantity. The cost of producing 10 articles is roughly the same as producing 10,000 articles. But 10,000 articles on a news site is not a good user experience. If you want to start with automated content you must make the user experience a priority.”
Pivot to availability. “Missing data sources could lead to topics being under-reported. We have established that good quality data is essential for making good automated content. This poses a risk from a journalistic perspective. It is a risk that we choose to produce content from topics where we can get good data. Like sports, traffic accidents, weather and so on. While topics that are much harder to quantify might suffer and be less covered. This is why I believe that having human and machine working together is always the best solution.”