A Lesson in Buying a Multi-GPU System
What I learned while looking for an on-premise multi-GPU system for deep learning experiments.
Originally published on Medium

This story is about the H word. I mean hardware, or more precisely, a high-end GPU system for heavy-lifting deep learning tasks.
After using GPU cloud instances for a long time, and even writing a PyTorch checkpoint handler that saves and loads model weights and other hyperparameters as a snapshot to make working on spot instances easier, it was about time to get an on-premise machine. I currently focus on video-related tasks and training heavy I3D models, so I needed a solid multi-GPU machine to support such efforts. Using a large mini-batch size requires a lot of GPU memory. The latest and finest NVIDIA GPU is the TITAN RTX, packed with 24GB of memory and sold at $2,499, so why not get four of these babies? I reached out to a few recommended suppliers to get a spec suggestion and a price estimate for such a system. I was a bit surprised to hear one of the suppliers say this cannot be done, while the rest did send a suggestion.
His full answer was something like:
They will heat up without a blower, and the best value for money is the GeForce RTX 2080 Ti TURBO 11G. Since each costs half and has 11GB of memory, just get 8 of them.
I had actually purchased a single-GPU machine recently, with a single GeForce RTX 2080 Ti, so I asked if I could add more GPUs to that machine. He asked for the exact GPU model and again said: “no blower, it will be too hot in the chassis”.
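To put the supplier's suggestion in numbers, here is a rough back-of-the-envelope sketch in Python. The TITAN RTX price and the memory sizes come from the text above. The 2080 Ti price is an assumption: I take "each costs half" literally, so real prices will differ. `GpuOption` is just a hypothetical helper for this comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuOption:
    """One candidate build: a number of identical GPUs."""
    name: str
    count: int
    memory_gb: int     # memory per card, in GB
    unit_price: float  # price per card, in USD

    @property
    def total_memory_gb(self) -> int:
        return self.count * self.memory_gb

    @property
    def total_price(self) -> float:
        return self.count * self.unit_price


# Figures quoted in the post: 24GB, $2,499 per TITAN RTX.
titan = GpuOption("TITAN RTX", count=4, memory_gb=24, unit_price=2499.0)

# Assumption: "each costs half" taken literally, not a quoted price.
turbo = GpuOption(
    "GeForce RTX 2080 Ti TURBO 11G",
    count=8,
    memory_gb=11,
    unit_price=2499.0 / 2,
)

for option in (titan, turbo):
    print(
        f"{option.count} x {option.name}: "
        f"{option.total_memory_gb} GB total, ${option.total_price:,.2f}"
    )
```

Under that assumption the two builds cost about the same and end up with 96GB versus 88GB of total memory. Keep in mind that the memory is split across cards, so the per-card size still limits what fits on a single GPU, and this comparison says nothing about compute speed or cooling.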
So what’s the deal here and what is a blower?
Confused, I started looking for a simple answer, and with Google’s help I found this image and the great blog post it came from.

A picture is worth a thousand words. Image from: NVIDIA RTX Graphics Card Cooling Issues
Yep, on the right: a blower!
GIGABYTE’s website describes its blower design like this:
“GIGABYTE turbo fan cooling system is designed for systems that use multiple graphics cards in space restricted chassis. With the vapor chamber direct touch GPU, blower fan and specially shaped exterior, it provides more airflow and better heat dissipation.”

Image from: GIGABYTE product page
It turns out that gaming-oriented models, like the TITAN RTX, are not meant for multi-GPU systems and are designed for, well, gaming. They cool themselves using an “open-air” design, which releases the hot air into the chassis rather than pushing it out the back of the case.
Assembling a multi-GPU system in a non-server chassis forces you to put the cards close to each other, and the blower’s job is to take care of the heat dissipation.
So next time you are looking for a GPU to handle deep learning tasks, you are better off asking your supplier if the GPU has a blower.
For more reading, I recommend this great blog post: https://www.pugetsystems.com/blog/2019/01/11/NVIDIA-RTX-Graphics-Card-Cooling-Issues-1326/