Well, let's not set the bar too high. E.g., "convinces 90% of a panel of psychologists, cognitive scientists, neuroscientists, and natural language processing researchers in an hour-long interrogation".
Somebody else mentioned Winograd schema testing, which is justified because it targets specific weaknesses of current question answering / NLP approaches.
The chatterbot "Eugene Goostman" has apparently passed the Turing test:
As I kind of predicted, the program passed the Turing test, but does not seem to have any trace of general intelligence. Is this a kind of weak p-zombie?
EDIT: The fact that it was a publicity stunt and that the judges were pretty terrible does not change the fact that Turing's criteria were met. We now know that these criteria were insufficient, but that's because machines like this were able to meet them.