Last Monday, the tech world was abuzz with discussion of Wired’s article “Inside Google’s Internet Justice League and Its AI-Powered War on Trolls” by Andy Greenberg. The article describes how Google subsidiary Jigsaw is tackling online harassment and the self-censorship it causes in its targets. Jigsaw’s solution is Conversation AI, an artificial intelligence that detects abusive language online and eliminates it.
While some of the buzz centers on whether censoring trolls, so that victims of harassment no longer feel pressured to censor themselves, is progress toward or a regression from free speech, we’re more interested in the technical side of this endeavor. How was this program created, and can AIs really process human language well enough to judge people’s intentions in using it?
Well, according to Wired, Conversation AI performs with 92% certainty and a 10% false-positive rate. Jared Cohen, founder and president of Jigsaw, claims that these percentages will improve with time. That’s because this AI was built with machine learning (see our Data Driven video). Jigsaw partnered with The New York Times to gather 17 million user comments, along with data on which ones human moderators had flagged as inappropriate. Jigsaw then recruited crowdsourced volunteers to label a sample of 170,000 conversation snippets for harassment or personal attacks. All of this data was fed to Conversation AI so that it could learn from an immense number of examples what constitutes abusive language.
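To make "learning from labeled examples" a little more concrete, here is a minimal sketch of a supervised text classifier. It is not Jigsaw's method: the naive Bayes technique, the tiny `TRAIN` list, and every function name below are our own assumptions for illustration. It also shows how a false-positive rate is computed: the share of harmless comments that get flagged anyway.

```python
import math
from collections import Counter

# Hypothetical toy data: (comment, flagged_by_moderator).
# These are invented examples, not comments from the Times corpus.
TRAIN = [
    ("you are an idiot", True),
    ("shut up you idiot", True),
    ("you are a stupid troll", True),
    ("nobody wants you here", True),
    ("thanks for the thoughtful article", False),
    ("i disagree with this article", False),
    ("great point thanks for sharing", False),
    ("you make a great point", False),
]

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab

def abuse_probability(text, model):
    """Naive Bayes estimate of P(flagged | words), with add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label in (True, False):
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Convert the two log scores into a probability for the "flagged" label.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])

def false_positive_rate(predictions, labels):
    """Fraction of truly harmless comments that were flagged anyway."""
    false_positives = sum(p and not y for p, y in zip(predictions, labels))
    negatives = sum(not y for y in labels)
    return false_positives / negatives if negatives else 0.0

model = train(TRAIN)
print(abuse_probability("you idiot", model))
print(abuse_probability("thanks for the article", model))
```

With more labeled data the word counts become more reliable, which is the intuition behind the claim that the percentages should improve over time. A real system would use far richer features than single-word counts.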
While these rates are impressive, Greenberg points out that the algorithm has flaws. Insulting words or word combinations taken out of context are rated as abusive by the program when humans wouldn’t consider them so (Greenberg’s examples were “Trump is a moron” and “you suck all the fun out of life”). On the other hand, when he tested a violent threat made against a Twitter user, it rated low on the abuse scale because it was indirect (aimed at “her” instead of “you”).
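Both kinds of mistake are easy to reproduce with a scorer that looks only at individual words. The sketch below is a deliberately naive toy, not how Conversation AI works: the `WORD_WEIGHTS` table, the `THRESHOLD` value, and the third test sentence are all invented for illustration.

```python
import re

# Hypothetical per-word weights, made up for this example.
WORD_WEIGHTS = {"moron": 0.9, "suck": 0.8, "hurt": 0.5, "you": 0.3}
THRESHOLD = 0.7  # hypothetical cutoff above which a comment is flagged

def word_level_score(text):
    """Add up the weights of known words, ignoring context; cap at 1.0."""
    words = re.findall(r"[a-z']+", text.lower())
    return min(1.0, sum(WORD_WEIGHTS.get(word, 0.0) for word in words))

def is_flagged(text):
    return word_level_score(text) >= THRESHOLD

for comment in (
    "Trump is a moron",
    "you suck all the fun out of life",
    "someone should hurt her",
):
    print(comment, "->", word_level_score(comment), is_flagged(comment))
```

The two harmless-to-mild comments clear the threshold because they contain loaded words, while the indirect threat slips under it because nothing in it is addressed to "you". Catching that difference requires a model that weighs how words relate to each other, not just which words appear.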
Google isn’t the only company using machine learning algorithms to teach AIs to make judgments on internet content. Facebook has become notorious recently for false positives produced by its algorithms, most famously for repeatedly deleting the iconic photograph of a nude girl running from a napalm attack during the Vietnam War, which was labeled as obscene due to child nudity. In another Facebook AI fail, the company automated its trending sidebar after its human moderators were accused of political bias, and the AI managing the section promptly began promoting false news sources.
Progress in the fields of natural language processing and computational linguistics has been impressive in the last two decades, aided by advancements in machine learning. Deep neural networks have illuminated pieces of the puzzle that is human language comprehension and have driven, for example, the vast improvements in speech recognition software from Microsoft and IBM. While AIs don’t yet seem able to process the nuances of natural language, Conversation AI still shows promise for squashing trolls, and it is an exciting example of applying new technology for social good.