I thought a bit about it, but I think Tay is basically a software version of a parrot that repeats back what it hears - I don't think it has any commonsense knowledge, or makes any serious attempt to understand that tweets are about a world that exists outside of Twitter. I.e., it has no semantics; it's just a syntax manipulator that uses some kind of probabilistic language model to generate grammatically correct sentences, and a machine learning model to try to learn which kinds of sentences will get the most retweets or will most closely resemble other things people are tweeting about. Tay doesn't know what a "Nazi" actually is. I haven't looked into it in any detail, but I know enough to guess that that's how it works.
As such, the failure of Tay doesn't tell us much about Friendliness, because Friendliness research pertains to superintelligent AIs, which would definitely have a correct ontology/semantics and understand the world.
However, it does tell us that a sufficiently stupid, amateurish attempt to harvest human values using an infrahuman intelligence wouldn't reliably work. This is obvious to anyone who has been "in the trade" for a while, though it does seem to surprise the mainstream media.
It's probably useful as a rude slap-in-the-face to people who are so ignorant of how software and machine learning work that they think friendliness is a non-issue.
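To make the parrot picture from the first paragraph concrete, here is a minimal sketch of a purely syntactic text generator. It only illustrates the guess above and is not Tay's actual implementation, which I haven't seen; the function names, the toy corpus, and the choice of a bigram model are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def train_bigrams(tweets):
    """Map each word to the list of words seen immediately after it."""
    followers = defaultdict(list)
    for tweet in tweets:
        words = tweet.split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def parrot(followers, start, max_words=8, rng=None):
    """Generate text by repeatedly sampling a word that followed the last one.

    Nothing here models what any word refers to; the generator only
    replays word-to-word patterns found in its input.
    """
    rng = rng or random.Random()
    words = [start]
    # Stop at the length limit or when the last word was never followed by anything.
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


# Hypothetical toy corpus standing in for whatever the bot "hears".
corpus = [
    "cats are great",
    "cats are terrible",
    "dogs are great friends",
]
model = train_bigrams(corpus)
print(parrot(model, "cats", rng=random.Random(0)))
```

Whatever ends up in the corpus is what comes back out: feed a generator like this offensive tweets and it will recombine them just as readily as harmless ones, without anything that could be called understanding.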
It might help to take an outside view here:
Picture a hypothetical set of highly religious AI researchers who make an AI chatbot, only to find that the bot has learned to say blasphemous things. What lessons should they learn from the experience?