Framework Benchmarks: Neon vs NVcaffe vs Tensorflow

All benchmarks use a batch size of 64, chosen so that all data fits in GPU memory for every card and every network. This keeps the use of the CPU and workstation RAM to a minimum and makes the results comparable. Each result is the average time over 100 iterations.
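The averaging described above can be sketched in a few lines of Python. This is only an illustration, not the code used by the benchmark scripts: the function name `average_time_ms`, the `warmup` parameter and the `dummy_step` workload are assumptions introduced here. With a real GPU framework, `step` would also have to block until the device has finished its work, otherwise only the kernel launch is timed.

```python
import time


def average_time_ms(step, iterations=100, warmup=10):
    """Return the mean wall-clock time of step() in milliseconds.

    `step` is any zero-argument callable, e.g. one forward/backward pass
    on a batch of 64 images. The warm-up runs are an assumption of this
    sketch; they keep one-off setup costs out of the average.
    """
    for _ in range(warmup):
        step()

    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start

    # Convert seconds to milliseconds and average over the timed runs.
    return elapsed * 1000.0 / iterations


def dummy_step():
    """Hypothetical CPU-only stand-in for a network pass."""
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(f"{average_time_ms(dummy_step):.3f} ms per iteration")
```

Timing the whole loop once and dividing by the iteration count, rather than timing each call, keeps the timer overhead out of short measurements.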

Future benchmarks:

  • benchmarking different cards in the same workstation
  • comparing 16-bit and 32-bit precision

Benchmarks by GPU

Training Benchmarks

Total time [msec] (smaller is better) for one forward and backward pass on each GPU: Titan X (Pascal), Titan X (Maxwell), GeForce GTX 1080.

Inference Benchmarks

Total time [msec] (smaller is better) for one forward pass on each GPU: Titan X (Pascal), Titan X (Maxwell), GeForce GTX 1080.

Benchmarks by Neural Networks

Training Benchmarks

Total time [msec] (smaller is better) for one forward and backward pass using the neural networks: VGG-A, OverFeat, AlexNet, GoogLeNet.

Inference Benchmarks

Total time [msec] (smaller is better) for one forward pass using the neural networks: VGG-A, OverFeat, AlexNet, GoogLeNet.
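Because every figure is the time for one batch of 64 images, a total time in milliseconds can be converted to throughput, and two timings can be compared as a speedup. The helper below is a small sketch of that arithmetic. The function names are introduced here for illustration, and the numbers in the example are made up, not measurements from these benchmarks.

```python
BATCH_SIZE = 64  # batch size used for all benchmarks in this article


def images_per_second(batch_time_ms, batch_size=BATCH_SIZE):
    """Convert the time for one batch (in msec) into images per second."""
    if batch_time_ms <= 0:
        raise ValueError("batch_time_ms must be positive")
    return batch_size * 1000.0 / batch_time_ms


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    if baseline_ms <= 0 or candidate_ms <= 0:
        raise ValueError("timings must be positive")
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Illustrative values only: a 128 msec batch is 500 images per second.
    print(images_per_second(128.0))
    # A 100 msec pass is twice as fast as a 200 msec pass.
    print(speedup(200.0, 100.0))
```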

Setup

| | GeForce GTX 1080 | Titan X (Maxwell) | Titan X (Pascal) |
|---|---|---|---|
| CPU | Intel Core i7-5960X 3.00GHz x 16 | Intel Core i7-6850K (unlocked, LGA2011) | Intel Core i7-4930K 3.40GHz x 12 |
| Memory | 32 GB | 64 GB | 64 GB |
| OS | Ubuntu 14.04 | Ubuntu 16.04.1 LTS | Ubuntu 16.04 |
| Driver | nvidia-367 | nvidia-367 | nvidia-361 |
| CUDA | 8.0 | 8.0 | 8.0 |
| cuDNN | 5.1 | 5.1 | 5.1 |
| Neon | 1.6.0+4fb5ff6 | 1.6.0+4fb5ff6 | 1.6.0+4fb5ff6 |
| NVcaffe | 0.15.10 | 0.15.0 | 0.15.10 |
| Tensorflow | 0.10.0 | 0.10.0 | 0.10.0rc0 |

Benchmark tools

To benchmark Neon we used the neon/tests/run_benchmarks.py script that ships with the Neon framework.

For Tensorflow we used the convnet-benchmarks framework, running the scripts convnet-benchmarks/tensorflow/benchmark_alexnet.py, convnet-benchmarks/tensorflow/benchmark_googlnet.py, convnet-benchmarks/tensorflow/benchmark_overfeat.py and convnet-benchmarks/tensorflow/benchmark_vgg.py without modifications.

We also used the convnet-benchmarks framework for Caffe, but modified the script convnet-benchmarks/caffe/run_imagenet.sh to point to our Caffe installation.

Machine Learning Meetup Italy by Addfor

To be the first to know when we publish new benchmarks, join our meetup, Machine Learning Italy.