To answer why we use GPUs for machine learning, we first need to understand how a GPU works and how it differs from a CPU, and then get an overview of ML.
With that background, I hope the answer will come naturally, because you will understand the underlying structure of ML algorithms.
In this article we’ll explain the differences between a GPU and a CPU, some aspects of how each works, and why a GPU is the natural choice for machine learning. We’ll do this with detailed explanations and easy-to-understand analogies.
GPUs and CPUs
A CPU and a GPU share some similarities and have some differences, which is what makes one suitable for a task while the other isn’t. We will discuss the main characteristics that are relevant to ML.
What is a CPU
The central processing unit (CPU) is the main component of every computer or mobile device. It does most of the processing required to perform the operations you expect from a computer, such as managing inputs (keyboard, mouse, etc.), executing program instructions, displaying a GUI on your screen, or even running multiple programs at the same time.
CPUs have several components that help them perform operations fast:
- Core
If the CPU is the engine of the computer, then the core is its cylinder and piston: all the computations occur in the circuits of the core. Nowadays, most CPUs contain multiple cores. This helps offset the slowdown of Moore’s law, the observation that the number of transistors on a chip doubles roughly every two years. As transistors approach the size of a few atoms, shrinking them further gets harder and the speed gains of a single core level off, but multi-core CPUs still deliver improvements.
- Cache
Just as supersonic airplanes need higher-quality structures, fuel, etc., the CPU needs super-fast memory that it can read and write at its own speed. The CPU’s cache is high-speed memory that serves as an intermediate layer between the cores and main memory (RAM). RAM alone can’t keep up, because even the fastest RAM is much slower than the CPU.
Cache sizes in modern CPUs are measured in kilobytes or megabytes. One might ask: if cache is that fast, why not use it instead of RAM? Because it is too expensive to manufacture in large sizes.
- Memory Management Unit
As the name suggests, this is the unit that manages memory accesses: it translates the addresses a program uses into physical memory locations. Think of it as a traffic light system inside the CPU.
- Clock
This is the component that dictates how fast the CPU works. Its speed is measured in hertz (cycles per second). Think of a cycle as one basic step of a computation: the more cycles a CPU can run every second, the faster it is.
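To make the clock description concrete, here is a minimal Python sketch of the arithmetic. The 3 GHz clock speed and the "one cycle per operation" simplification are assumptions chosen for illustration; real CPUs can complete more or fewer operations per cycle.

```python
def cycle_time_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def sequential_time_s(n_ops, clock_hz, cycles_per_op=1):
    """Time to run n_ops one after another on a single core.

    cycles_per_op=1 is a simplifying assumption, not a property of real CPUs.
    """
    return n_ops * cycles_per_op / clock_hz


# Assumed example: a 3 GHz core running three million operations in sequence.
clock_hz = 3e9
print("one cycle lasts", cycle_time_ns(clock_hz), "ns")
print("3 million operations take", sequential_time_s(3_000_000, clock_hz), "s")
```

The second function shows why purely sequential work scales linearly: doubling the number of operations doubles the time, no matter how fast a single cycle is.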
This whole setup lets the CPU perform a wide range of operations efficiently, but it works mostly in a sequential fashion, so repeating the same operation many (say, millions of) times isn’t the best use of a CPU.
What is a GPU
GPU stands for graphics processing unit. It emerged from the need to display high-resolution video and play high-quality games. Playing high-quality video smoothly requires a massive amount of computation: think of how many pixels have to be displayed at the same time. The amount grows further for realistic 3D games, where every object has to be rendered in real time to get a crisp image.
A GPU is a processing unit designed specifically for parallel computation: you feed in the instructions to process thousands of operations for your video or game, and you get all the results back at nearly the same time.
A GPU contains broadly the same kinds of components as a CPU; the difference is in the details. While a CPU has a few powerful, multi-purpose cores, a GPU has thousands of simpler, single-purpose cores, which makes it excel at specific, parallel tasks.
Feeding all those cores at the same time requires moving a lot of data quickly, so a GPU’s memory is built for much higher throughput than a CPU’s main memory. This throughput is measured as the GPU’s memory bandwidth.
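As a rough illustration of why bandwidth matters, the sketch below estimates how long it takes to move one batch of data at two different bandwidths. All the numbers are hypothetical round values picked for the arithmetic; they do not describe any particular CPU or GPU.

```python
def transfer_time_s(num_bytes, bandwidth_bytes_per_s):
    """Idealized time to move num_bytes at a given memory bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


# Hypothetical values, for illustration only.
batch_bytes = 1_000_000_000      # 1 GB of input data
lower_bandwidth = 50e9           # 50 GB/s
higher_bandwidth = 500e9         # 500 GB/s

slow = transfer_time_s(batch_bytes, lower_bandwidth)
fast = transfer_time_s(batch_bytes, higher_bandwidth)
print("lower bandwidth: ", slow, "s")
print("higher bandwidth:", fast, "s")
```

With ten times the bandwidth, the same batch arrives in a tenth of the time, so the cores spend less time waiting for data.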
What are the tasks a GPU is good at?
Basically, these computations are vector and matrix operations. A vector is a set of numbers stacked together, and the same operation is applied to each of its elements independently, so these operations are parallel in nature.
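The following plain-Python sketch shows what "parallel in nature" means for a vector addition. The function names are made up for this example. Python still runs the additions one at a time here; the point is only that the result does not depend on the order, which is what allows a GPU to do all of them at once.

```python
def add_vectors(a, b):
    """Element-wise sum: each output depends on exactly one pair of inputs."""
    return [x + y for x, y in zip(a, b)]


def add_vectors_in_order(a, b, order):
    """Same sum, but visiting the positions in an arbitrary order."""
    out = [None] * len(a)
    for i in order:
        out[i] = a[i] + b[i]
    return out


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

result = add_vectors(a, b)
shuffled = add_vectors_in_order(a, b, [3, 1, 0, 2])
print(result == shuffled)  # the order of the element-wise work does not matter
```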
But how are vectors related to videos?
You can encode the color of any pixel as a set of numbers, depending on the color scheme you are using. For example, in the RGB color scheme you can encode yellow as the vector of three numbers (255, 255, 0), that is, (full red, full green, no blue).
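Here is a small sketch of that encoding: a tiny image stored as rows of RGB vectors, with the same brightness operation applied to every pixel. The 2x2 image and the helper name `scale_brightness` are assumptions for illustration.

```python
def scale_brightness(pixel, percent):
    """Scale each RGB channel to the given percentage (integer arithmetic)."""
    return tuple(channel * percent // 100 for channel in pixel)


yellow = (255, 255, 0)   # full red, full green, no blue
red = (255, 0, 0)
black = (0, 0, 0)

# A hypothetical 2x2 image: a list of rows, each row a list of pixels.
image = [
    [yellow, black],
    [red, yellow],
]

# The same operation is applied to every pixel independently.
dimmed = [[scale_brightness(pixel, 50) for pixel in row] for row in image]
print(dimmed)
```

Because every pixel is processed the same way and independently of its neighbors, this is exactly the kind of work that can be spread across many GPU cores.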
One can also encode objects in 3D games as vectors (or as a matrix, which is a set of vectors) describing every aspect of the object. Motion then becomes a simple set of matrix operations.
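As a sketch of motion expressed as a matrix operation, the example below rotates the corners of a triangle by 90 degrees around the z-axis. The triangle and the choice of rotation are assumptions made so that the numbers stay whole; a real game engine would use many more vertices and combine several transformations.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Rotation by 90 degrees around the z-axis (counterclockwise).
rotate_z_90 = [
    [0, -1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

# A hypothetical object: three vertices of a triangle in 3D space.
triangle = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# Moving the object means applying the same matrix to every vertex.
rotated = [mat_vec(rotate_z_90, vertex) for vertex in triangle]
print(rotated)
```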
CPUs vs. GPUs
An informal analogy between CPUs and GPUs can go like this:
Think of the CPU as a few cannons and the GPU as an army of archers. The cannons are slow to load and fire compared to the archers, but each one can hit a wide variety of targets (multi-purpose). An archer can only hit a small, specific target (single-purpose), but together the archers can cover a very wide range.
Machine Learning
Artificial intelligence (AI) is a relatively new field of computer science, originating in the 1950s. It is the area of study concerned with enabling computers to perform tasks that make them “seem” intelligent. Such tasks seem obvious to a human being but are very hard for a machine. Some areas of AI are computer vision, natural language processing, and real-time decision making.
Machine learning (ML) is a subfield of AI. It is about designing algorithms that enable the computer to learn how to perform a task without being explicitly told the rules for doing it (telling the computer the rules is programming).
A more mathematical definition: ML is about designing algorithms that enable the computer to map a set of inputs to the correct outputs, without being told the rules for doing so.
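To illustrate "mapping inputs to outputs without being told the rule", here is a minimal sketch. The hidden rule (output = 2 × input), the one-parameter model, and the use of gradient descent are all assumptions chosen for this example; they are one simple way to learn from examples, not the only one.

```python
def train(xs, ys, learning_rate=0.01, steps=200):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0  # start with no knowledge of the rule
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Example pairs only; the rule "multiply by 2" is never written in the code.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = train(xs, ys)
prediction = w * 10.0  # apply the learned mapping to an unseen input
print("learned weight:", w)
print("prediction for 10:", prediction)
```

The program is only given examples of inputs and outputs, and it ends up with a weight very close to 2, which it can then apply to inputs it has never seen.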
Let’s dissect the definition of ML:
- The algorithms: in this context, the algorithms are called ML models. Nowadays, many models consist of interconnected nodes called “neurons”, after the neurons of our brain, and their web is therefore called a neural network. A model defines the computer’s capacity to learn, how data is fed in, and how results come out. Although neural networks are named after parts of the nervous system, don’t fall into the misconception that artificial neural networks work like our biological brain, because we don’t yet know exactly how the brain works!
- Inputs: these are the fuel of ML, because without data to learn from, a computer can’t learn anything by itself. Thanks to the internet, data is abundant, yet a challenge remains: how do we feed this data to the model? Models expect numerical values, or more formally a vector or a matrix fed to the input neurons, so any data must be encoded numerically. One possible way is to create a dictionary that maps each word in a text to a number. As discussed previously, images and videos can be nicely encoded with their pixel values.
- Outputs: these are what we want the model to predict, and they are the whole reason for ML. A good model predicts the correct output for a given input.
A classic example is recognizing handwritten digits: the input is the image of a digit, and the output is the digit itself.
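The dictionary-based text encoding mentioned under Inputs can be sketched as follows. The sample sentence and the function names are made up for this example, and real systems usually use more elaborate schemes, but the idea of turning words into numbers is the same.

```python
def build_vocabulary(words):
    """Assign each distinct word the next unused integer, in order of appearance."""
    vocabulary = {}
    for word in words:
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary)
    return vocabulary


def encode(words, vocabulary):
    """Replace every word with its number from the vocabulary."""
    return [vocabulary[word] for word in words]


text = "the cat saw the dog".split()
vocabulary = build_vocabulary(text)
encoded = encode(text, vocabulary)

print(vocabulary)
print(encoded)
```

The resulting list of numbers is a vector that can be fed to a model, in the same way pixel values are.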
How Machine Learning Works Internally
Suppose I hand you a crumpled sheet of paper and ask you what shape it is. You would uncrumple it bit by bit until you could recognize its shape and tell me, right?
The goal of ML is to learn the right way to uncrumple the paper (the input) and give the right answer. In mathematics, the process of uncrumpling the paper is called a transformation.
Transforming one variable (the input) into another (the output) is done with matrix multiplication, or more realistically with tensor operations (a tensor is a generalization of a matrix).
And to train a model quickly, you need to feed it multiple inputs at the same time.
That makes the process even more resource-intensive, but there is good news: the operation is massively parallel (many inputs at the same time), and it consists of matrix (tensor) operations.
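The sketch below shows what feeding multiple inputs at the same time looks like as a single matrix multiplication. The batch and the weight values are arbitrary numbers chosen for illustration; in a real model the weights are learned and the matrices are far larger.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns_of_b = list(zip(*b))
    return [
        [sum(x * y for x, y in zip(row, column)) for column in columns_of_b]
        for row in a
    ]


# A batch of three inputs, each with two features (one input per row).
batch = [
    [1, 2],
    [3, 4],
    [5, 6],
]

# A hypothetical 2x2 weight matrix describing the transformation.
weights = [
    [1, 2],
    [-1, -1],
]

# One multiplication transforms the whole batch at once.
outputs = matmul(batch, weights)

# Transforming each input on its own gives the same rows.
one_by_one = [matmul([row], weights)[0] for row in batch]
print(outputs == one_by_one)
```

Every output row depends only on its own input row, so the rows can be computed independently, which is where the massive parallelism comes from.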
An example of a non-linear transformation of an image, which is a simple version of what ML is about.
Why Are GPUs Well-Suited for Machine Learning?
Machine learning requires massive parallelism and a specific set of operations on the inputs. Those operations are matrix and tensor operations, which is where GPUs outperform CPUs.
Another point in favor of GPUs for ML is that many modern GPUs have cores dedicated to tensor operations, which lets them perform these operations efficiently and quickly.
Also, GPUs of the same model from the same vendor can typically be combined into clusters, which opens opportunities for building efficient hardware for ML.
What to Look for in a GPU for Machine Learning?
- High bandwidth: it lets you feed the cores more input at once, which leads to better performance.
- Tensor cores: they perform tensor operations more efficiently.
- Compatibility: compatible GPUs can be clustered, which enables more parallelism.
Conclusion
In this article we learned a lot about GPUs, CPUs, artificial intelligence, and machine learning in particular, and, more importantly, why GPUs are a natural choice for machine learning.
In the past few years, machine learning has seen many advances, not just in theory but also in hardware. That’s why machine learning scientists moved to more specialized hardware like GPUs. New hardware made solely for machine learning has also emerged, such as Google’s TPUs (tensor processing units), which make the process more efficient.