Guys, stop attacking @staff. They are merely victims of the machine uprising.

I’m willing to bet money that the safe mode fiasco is entirely a result of machine learning gone awry, and not because @staff suddenly has an evil homophobic agenda and wants to protect kids from the gays. Why? Well, let’s look at what safe mode is supposed to do: block porn bots.

Imagine you’re a Tumblr IT person and you’re tasked with developing an algorithm to block said porn bots. You’re on your third cup of coffee for the night and you really just want to go home, so you think of the quickest, easiest, dirtiest way to detect sensitive content: set up a neural network and feed it tons upon tons of examples of typical posts porn bots make. Easy-peasy. Except not at all, because robots are stupid and never do what you want them to.

Here’s what this neural network can’t do with much effectiveness:

Identify pornographic nudity
Identify degrading content
Identify hateful rhetoric
Distinguish any of those things from things that aren’t those things

Here’s what it can do:

See that “sensitive posts” are mostly still images and gifs. Block accordingly.
See that “sensitive posts” contain curse words. Block accordingly.
See that images in “sensitive posts” are often black-and-white. There go the aesthetic blogs.
See that the text in “sensitive posts” is usually short and comes with an external link. Artists linking to your own website? Sucks to be you.
And perhaps most importantly, see what “sensitive posts” are tagged as.
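
To make that concrete, here’s a toy sketch in Python. It is absolutely not Tumblr’s actual code: the four features, the training examples, and every name in it are made up for illustration, and it uses a single perceptron (about the dumbest “neural network” possible) rather than whatever @staff really built. The point is only that a model trained on nothing but bot posts and plain text posts ends up treating “is an image” and “is black-and-white” as proof of guilt.

```python
from typing import List, Tuple

# Hypothetical surface features, in this order:
# [image_or_gif, has_curse_word, black_and_white, short_text_with_link]
Features = List[int]

# Made-up training set: label 1 = "sensitive" (bot post), 0 = ordinary post.
# Note that the only ordinary examples are plain text posts.
TRAINING: List[Tuple[Features, int]] = [
    ([1, 1, 1, 0], 1),
    ([0, 0, 0, 0], 0),
    ([1, 0, 1, 1], 1),
    ([0, 1, 0, 1], 1),
    ([0, 0, 0, 0], 0),
]


def train_perceptron(
    data: List[Tuple[Features, int]], epochs: int = 20
) -> Tuple[List[float], float]:
    """Classic perceptron rule: nudge the weights whenever a guess is wrong."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in data:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = label - predicted
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias


def is_blocked(x: Features, weights: List[float], bias: float) -> bool:
    """A post is blocked when the learned score is positive."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


WEIGHTS, BIAS = train_perceptron(TRAINING)

aesthetic_photo = [1, 0, 1, 0]   # black-and-white image, nothing else
artist_post = [1, 0, 0, 1]       # image plus a short caption with a link
plain_text_post = [0, 0, 0, 0]

print(is_blocked(aesthetic_photo, WEIGHTS, BIAS))   # True
print(is_blocked(artist_post, WEIGHTS, BIAS))       # True
print(is_blocked(plain_text_post, WEIGHTS, BIAS))   # False
```

Nothing in there knows what porn is. It only knows what the examples had in common, and the harmless posts happen to share it.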

You’ve probably already guessed, but porn bots usually don’t tag their posts with “porn” or “nsfw” like a courteous person. Here’s what they do tag with:

etc. etc. etc., mixed in with things like “hot” and “sex” and “babes” and a bunch of slurs too.

Now if you were a neural network designed to look primarily at tags to identify what content to block from the kiddies and you only had this example data to work with, what would you think if you came across an lgbt positivity post that has tags that are nearly identical?
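
Here’s roughly how that plays out, as a toy Python sketch. Big caveats: this is a simple smoothed per-tag log-odds score, not a neural network and not anything @staff has published, and the tags are placeholders. `community_tag_1` and `community_tag_2` are hypothetical stand-ins for innocent tags that bots spam, and every function name is invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List

# Made-up example data. The community tags are placeholders for harmless
# tags that spam bots pile onto their posts alongside the obvious ones.
BOT_POSTS: List[List[str]] = [
    ["hot", "sex", "community_tag_1", "community_tag_2"],
    ["babes", "community_tag_1", "community_tag_2"],
    ["hot", "babes", "community_tag_1"],
]
ORDINARY_POSTS: List[List[str]] = [
    ["recipe", "baking"],
    ["cats", "photography"],
    ["baking", "cats"],
]


def tag_log_odds(
    flagged: List[List[str]], ordinary: List[List[str]]
) -> Dict[str, float]:
    """For each tag, compare how often it shows up in flagged vs. ordinary posts.

    Add-one smoothing keeps tags seen on only one side from dividing by zero.
    """
    flagged_counts = Counter(tag for post in flagged for tag in set(post))
    ordinary_counts = Counter(tag for post in ordinary for tag in set(post))
    vocab = set(flagged_counts) | set(ordinary_counts)
    return {
        tag: math.log((flagged_counts[tag] + 1) / (len(flagged) + 2))
        - math.log((ordinary_counts[tag] + 1) / (len(ordinary) + 2))
        for tag in vocab
    }


def score(tags: List[str], log_odds: Dict[str, float]) -> float:
    """Sum the learned evidence; tags never seen in training count as zero."""
    return sum(log_odds.get(tag, 0.0) for tag in tags)


def tags_look_sensitive(tags: List[str], log_odds: Dict[str, float]) -> bool:
    return score(tags, log_odds) > 0


LOG_ODDS = tag_log_odds(BOT_POSTS, ORDINARY_POSTS)

positivity_post = ["community_tag_1", "community_tag_2", "positivity"]
baking_post = ["cats", "baking"]

print(tags_look_sensitive(positivity_post, LOG_ODDS))  # True
print(tags_look_sensitive(baking_post, LOG_ODDS))      # False
```

The positivity post gets blocked purely because its tags only ever appeared on bot posts in the example data. The model never saw a single legitimate post using them, so as far as it can tell, those tags mean “bot”.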

It’s not that I’m not blaming @staff. They still rolled out shitty code without doing any sort of QA against the actual content people put on this site, and messed things up as a result. But I really, really doubt there was any sort of malice involved. It’s just another textbook example of why you can’t rely on half-baked algorithms to do your job for you. Something that I really think we all should have learned by now.





