JoshuaZ comments on Open thread, Jan. 12 - Jan. 18, 2015 - Less Wrong Discussion

6 Post author: Gondolinian 12 January 2015 12:39AM

Comment author: gwern 16 January 2015 01:31:59AM * 14 points

Image recognition, courtesy of the deep learning revolution & Moore's Law for GPUs, seems close to reaching human parity. The latest paper is "Deep Image: Scaling up Image Recognition", Wu et al 2015 (Baidu):

We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. The key components are a custom-built supercomputer dedicated to deep learning, a highly optimized parallel algorithm using new strategies for data partitioning and communication, larger deep neural network models, novel data augmentation approaches, and usage of multi-scale high-resolution images. On one of the most challenging computer vision benchmarks, the ImageNet classification challenge, our system has achieved the best result to date, with a top-5 error rate of 5.98% - a relative 10.2% improvement over the previous best result.

...The result is the custom-built supercomputer, which we call Minwa. It is comprised of 36 server nodes, each with 2 six-core Intel Xeon E5-2620 processors. Each server contains 4 Nvidia Tesla K40m GPUs and one FDR InfiniBand (56Gb/s) which is a high-performance low-latency interconnection and supports RDMA. The peak single precision floating point performance of each GPU is 4.29TFlops and each GPU has 12GB of memory. Thanks to the GPUDirect RDMA, the InfiniBand network interface can access the remote GPU memory without involvement from the CPU. All the server nodes are connected to the InfiniBand switch. Figure 1 shows the system architecture. The system runs Linux with CUDA 6.0 and MPI MVAPICH2, which also enables GPUDirect RDMA. In total, Minwa has 6.9TB host memory, 1.7TB device memory, and about 0.6PFlops theoretical single precision peak performance...We are now capable of building very large deep neural networks up to hundreds of billions parameters thanks to dedicated supercomputers such as Minwa.
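(Not from the paper: a quick sanity check that the cluster totals follow from the per-node figures quoted above. This is only a sketch; the variable and function names are made up for illustration, and it assumes decimal units, i.e. 1 TB = 1000 GB and 1 PFlops = 1000 TFlops.)

```python
# Per-node and per-GPU figures as quoted in the excerpt above.
NODES = 36
GPUS_PER_NODE = 4
TFLOPS_PER_GPU = 4.29   # peak single precision
GB_PER_GPU = 12


def minwa_totals():
    """Return (gpu_count, peak_pflops, device_memory_tb) for the cluster.

    Assumes decimal prefixes (1000 GB per TB, 1000 TFlops per PFlops).
    """
    gpus = NODES * GPUS_PER_NODE
    peak_pflops = gpus * TFLOPS_PER_GPU / 1000
    device_memory_tb = gpus * GB_PER_GPU / 1000
    return gpus, peak_pflops, device_memory_tb


gpus, pflops, mem_tb = minwa_totals()
print(f"{gpus} GPUs, {pflops:.2f} PFlops peak, {mem_tb:.2f} TB device memory")
```

That gives 144 GPUs, roughly 0.62 PFlops and roughly 1.73 TB, which matches the "about 0.6PFlops" and "1.7TB device memory" in the quote.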

...As shown in Table 3, the accuracy has been optimized a lot during the last three years. The best result of ILSVRC 2014, top-5 error rate of 6.66%, is not far from human recognition performance of 5.1% [18]. Our work marks yet another exciting milestone with the top-5 error rate of 5.98%, not just setting the new record but also closing the gap between computers and humans by almost half.

For another comparison, Table 3 on pg9 shows past performance. In 2012, the best performer reached 16.42%; 2013 knocked it down to 11.74%, and 2014 to 6.66% or to 5.98% depending on how much of a stickler you want to be, leaving a gap of ~0.9% to human performance.
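The arithmetic behind those figures is easy to check. A rough sketch, using only the top-5 error rates quoted above (the function names are mine, not the paper's):

```python
# Top-5 error rates (percent) as quoted from the paper.
HUMAN_TOP5 = 5.1     # human performance, the paper's reference [18]
PREV_BEST = 6.66     # best ILSVRC 2014 result
DEEP_IMAGE = 5.98    # Baidu's reported result


def relative_improvement(old, new):
    """Percent reduction in error rate going from old to new."""
    return (old - new) / old * 100


def gap_closed(old, new, human=HUMAN_TOP5):
    """Percent of the old-to-human gap that the new result closes."""
    return (old - new) / (old - human) * 100


remaining = DEEP_IMAGE - HUMAN_TOP5

print(f"relative improvement: {relative_improvement(PREV_BEST, DEEP_IMAGE):.1f}%")
print(f"gap to human closed:  {gap_closed(PREV_BEST, DEEP_IMAGE):.1f}%")
print(f"remaining gap:        {remaining:.2f} points")
```

So the "relative 10.2% improvement" checks out, "almost half" of the gap is about 44%, and what is left is 0.88 percentage points.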

EDIT: Google may have already beaten 5.98% with a 5.5% (and thus halved the remaining difference to 0.4%), according to a commenter on HN, "smhx":

googlenet already has 5.5%, they published it at a bay area meetup, but did not officially publish the numbers yet!

Comment author: JoshuaZ 16 January 2015 01:37:35AM 1 point

That is shocking and somewhat disturbing.