PI MNIST is up to at least 99.43% with Ladder Networks https://arxiv.org/abs/1507.02672. I vaguely remember something around ~99.5% published since (it's been 6 years), but I haven't done the literature crawling to track it down. Another higher-performing result than Maxout is Virtual Adversarial Training at 99.36% https://arxiv.org/abs/1704.03976. The JMLR version of the dropout paper https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf also reports 99.21% from dropout fine-tuning of a Deep Boltzmann Machine.