American teenager 'devastated' after filling thousands of Scots Wikipedia articles with 'gibberish'

Tens of thousands of Scots Wikipedia articles were written by someone who could not properly write in Scots - AP

An American teenager has said he is “devastated” after users of online encyclopedia Wikipedia learned that he wrote half of the entries in the Scots language version of the site, despite not being able to speak or write in Scots.

Wikipedia users discovered earlier this week that the American teenager had written around 30,000 articles in Scots, the language of Robert Burns and portions of Irvine Welsh's novel Trainspotting.

But other internet users complained that the author made repeated mistakes and accused him of failing to properly write in Scots.

“I think this person has possibly done more damage to the Scots language than anyone else in history. They engaged in cultural vandalism on a hitherto unprecedented scale,” wrote a user of social media forum Reddit who discovered the problem.

On Thursday, a debate was raging among Wikipedia editors on whether to entirely delete the Scots site and start afresh.

Wikipedia and Reddit users have complained that the tens of thousands of Scots Wikipedia entries created by the teenager are merely rewritten from the English language version of the site with occasional Scots words added. Sometimes errors and improper Scots words and phrases were introduced into the articles.

The problem is compounded by the fact that Wikipedia is written and edited by thousands of internet users. These users are able to delete or amend articles by other users, based on consensus and their reputation on the website. 


The US teenager, despite having little knowledge of Scots, was one of the few editors on the Scots Wikipedia site, creating thousands of articles with almost no oversight.

One example saw repeated use of the phrase “an aw”, literally translated as “also”, when in fact it is used to mean “and all”.

Wikipedia users claimed this had led to thousands of articles written in “Scotched English” featuring made-up words and “gibberish”. Some users called for the site to be deleted entirely. “The only real answer is to nuke it,” one said.

Others noted that articles on Scots Wikipedia by other users were also of poor quality and included made-up words such as “pheesicist”. Some called for an academic to be brought in to fix them.

However, others defended the user, arguing that the problem stemmed from a lack of other speakers contributing to the site. They noted that the user had mostly substituted Scots equivalents for English words directly, albeit with grammatical errors.

Scots is recognised as an indigenous language of Scotland. Around 99,000 people report they speak it as their first language, and around 1.5 million people claim to be able to understand or speak it as a second language. 

The unnamed teenager who wrote the articles, who goes by the online handle AmaryllisGardener, apologised on Wikipedia on Tuesday, writing of his “devastation” after the discovery.

“I was only a 12-year-old kid when I started, and sometimes when you start something young, you can't see that the habit you've developed is unhealthy and unhelpful as you get older,” the user added.


The teenager asked internet users to stop harassing him on social media following the incident.

He said: “Honestly, I don't mind if you revert all of my edits, delete my articles, and ban me from the wiki for good. I've already found out that my ‘contributions’ have angered countless people, and to me that's all the devastation I can be given, after years of my thinking I was doing good.”

Many Wikipedia users are now considering a mass deletion of Scots Wikipedia articles.

There are also concerns that the Scots Wikipedia articles could have caused problems in machine learning software which uses Wikipedia as a source of information on languages to teach systems how to detect them.

Robyn Speer, the chief scientist of artificial intelligence business Luminoso, wrote on Twitter: “If you have a multilingual language model, this fakery might be your entire training data for Scots.”

Daria Cybulska, of Wikimedia UK, said: “We do not own or control the Scots-language Wikipedia, which as with all parts of the Wiki community, is edited and managed by volunteers. We are aware of the concerns that have emerged about the content of the Scots-language Wikipedia and are in touch with the Wikimedia Foundation and volunteer editor community to offer support in helping to ensure that these issues are addressed.”