The New Kid on the Block: GPU-Accelerated Big Data Analytics
Myth #1: GPUs are only good for gamers or supercomputers
Truth: It’s true that the early adopters of GPUs were mostly in the computer gaming industry or makers of supercomputers. However, the massively parallel computing power of GPUs can also be used to speed up machine learning or data mining algorithms that have nothing to do with 3D graphics. Take the Nvidia Titan Black GPU as an example: it has 2880 cores capable of performing over 5 trillion floating-point operations per second (5 TFLOPS). For comparison, a Xeon E5-2699v3 processor can perform about 0.75 TFLOPS, but may cost 4x as much. Besides TFLOPS, GPUs also enjoy a significant advantage over CPUs in terms of memory bandwidth, which is more important for data-intensive applications. The Titan Black’s maximum memory bandwidth is 336 GB/sec, whereas the E5-2699v3’s is only 68 GB/sec. Higher memory bandwidth means more data can be transferred between the processor and its memory in the same amount of time, which is why GPUs can process large quantities of data in a split second.
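To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the peak numbers cited above. The 10 GB dataset size and the helper names are hypothetical, chosen purely for illustration, and the scan times are theoretical lower bounds rather than measured results.

```python
# Peak figures quoted in the text (vendor peak numbers, not benchmarks).
TITAN_BLACK = {"tflops": 5.0, "bandwidth_gb_s": 336.0}
XEON_E5_2699V3 = {"tflops": 0.75, "bandwidth_gb_s": 68.0}


def ratio(gpu, cpu, key):
    """How many times larger the GPU figure is than the CPU figure."""
    return gpu[key] / cpu[key]


def scan_time_seconds(data_gb, bandwidth_gb_s):
    """Lower bound on the time to read data_gb once from local memory."""
    return data_gb / bandwidth_gb_s


compute_ratio = ratio(TITAN_BLACK, XEON_E5_2699V3, "tflops")
bandwidth_ratio = ratio(TITAN_BLACK, XEON_E5_2699V3, "bandwidth_gb_s")

# Hypothetical dataset size, assumed to already sit in each processor's memory.
DATA_GB = 10.0
gpu_scan = scan_time_seconds(DATA_GB, TITAN_BLACK["bandwidth_gb_s"])
cpu_scan = scan_time_seconds(DATA_GB, XEON_E5_2699V3["bandwidth_gb_s"])

print(f"Peak compute ratio:   {compute_ratio:.1f}x")
print(f"Peak bandwidth ratio: {bandwidth_ratio:.1f}x")
print(f"Scanning {DATA_GB:.0f} GB: GPU >= {gpu_scan:.3f} s, CPU >= {cpu_scan:.3f} s")
```

On these peak numbers the GPU has roughly a 6.7x edge in raw compute and a 4.9x edge in memory bandwidth. Real workloads rarely reach peak rates on either processor, so the ratios are best read as an upper bound on the possible speedup.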
"It’s true that GPUs are not as easy to program as their CPU counterparts, due to their unconventional processor designs"
One of the hottest areas of machine learning nowadays is Deep Learning (DL), which uses deep neural networks (DNNs) to teach computers to perform tasks such as machine vision and speech recognition. GPUs are widely used to train DNNs, a process that can take up to a few months on a CPU. With GPU-accelerated DL packages such as Caffe and Theano, the training time is often reduced to a few days.
Myth #2: GPUs are only for small data
Truth: It’s true that GPU cards have limited on-board memory, which cannot be upgraded once they are manufactured, unlike the RAM of a CPU. Furthermore, the maximum memory size of a GPU is typically much smaller than that of a CPU system. For example, the maximum memory currently supported by a single Nvidia GPU is 12 GB, whereas a multi-socket CPU system can have up to a few TB of RAM. The conventional thinking is that GPUs are only suitable for processing small datasets.
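To give a feel for what the 12 GB ceiling means in practice, the following Python sketch estimates how much numeric data fits on a single card. It assumes decimal gigabytes, 4-byte single-precision values, and that the whole card is available for data. The table shapes and function names are hypothetical examples, not figures from any real workload.

```python
GPU_MEMORY_GB = 12       # single-GPU maximum quoted in the text
BYTES_PER_GB = 10**9     # assumption: decimal gigabytes
BYTES_PER_FLOAT32 = 4    # assumption: single-precision values


def max_float32_values(memory_gb):
    """Number of float32 values that fit in memory_gb of memory."""
    return memory_gb * BYTES_PER_GB // BYTES_PER_FLOAT32


def fits_on_gpu(rows, cols, memory_gb=GPU_MEMORY_GB):
    """True if a dense rows x cols float32 table fits in GPU memory."""
    return rows * cols * BYTES_PER_FLOAT32 <= memory_gb * BYTES_PER_GB


print(max_float32_values(GPU_MEMORY_GB))   # 3000000000
print(fits_on_gpu(100_000_000, 20))        # 8 GB table   -> True
print(fits_on_gpu(1_000_000_000, 20))      # 80 GB table  -> False
```

Under these assumptions a single card holds about three billion single-precision values: a 100-million-row table with 20 columns fits, but a billion-row table of the same width does not.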