Fuck you Twitter.
If you’re going to write something like this in a reply to a tweet, the social network might soon ask you to reconsider before sending it.
Last night, the company announced that it’s currently testing this feature with a limited number of users on iOS to gauge their response. Essentially, if you post slurs, epithets, or swear words in a reply, Twitter might reprimand you like a schoolteacher would.
When things get heated, you may say things you don’t mean. To let you rethink a reply, we’re running a limited experiment on iOS with a prompt that gives you the option to revise your reply before it’s published if it uses language that could be harmful.
— Twitter Support (@TwitterSupport) May 5, 2020
There’s no clarity at the moment as to what Twitter considers ‘harmful.’ The company has a hate speech policy in place which tackles broader issues such as harmful imagery and violent threats.
Plus, we have no idea if Twitter’s AI is powerful enough to detect the deliberate misspellings and typos someone might use to keep their foul tweets from being spotted. In 2018, we wrote about how researchers were able to fool Alphabet subsidiary Jigsaw’s toxic comments detection AI with the word ‘love’. Even if Twitter succeeds at detecting such trickery, it will have to account for cultural context and keep an up-to-date list of internet slang to keep pace with global trends.
In January, Twitter’s head of product, Kayvon Beykpour, said in a Q&A that the company is trying to use machine learning to reduce toxicity on the platform. In that session, he said that determining what counts as abusive is tricky:
Basically we’re trying to predict the tweets that are likely to violate our rules. And that’s just one form of what people might consider abusive, because something that you might consider abusive may not be against our policies, and that’s where it gets tricky.
Twitter is not the only platform to play with this idea. Last year, Instagram rolled out a feature that asks you to think twice before posting a foul-mouthed comment. Later, the company expanded it to captions, warning users before they post that a caption looks “similar to what has been reported.”
Curbing toxic or abusive comments is a hard problem for social networks because it spans many cultures and languages. At the moment, there are no right answers. Twitter’s experiment will only be successful if the company manages to curb some abuse while maintaining freedom of expression.