FPGAs and Deep Machine Learning


The concept of machine learning is not new. Attempts at systems that emulate intelligent behavior, such as expert systems, go back at least as far as the early 1980s. The very notion of modern Artificial Intelligence has a long history: the name itself was coined at a Dartmouth College conference in 1956, but the idea of an “electronic brain” was born together with the development of modern computers. As an idea, AI has accompanied us since the dawn of human history.

Three recent developments are pushing machine learning forward:

  • Powerful distributed processors
  • Cheap and high volume storage
  • High bandwidth interconnection to bring the data to the processors

As in many other fields, machine learning is also seeing the development of algorithms that take advantage of the new hardware capabilities.

There are four types of algorithms used in machine learning:

  • Supervised – The vast majority of systems today. These systems are ‘trained’ on past data in an attempt to predict future outcomes.
  • Unsupervised – These systems try to build models of the analyzed process by themselves, without labeled examples.
  • Semi-supervised – A combination of the first two: a small amount of data is ‘labeled’ (i.e. tagged with known outcomes), and the machine uses it as a seed to label the rest of the data.
  • Reinforcement – The algorithm creates its rules through trial and error.
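
To make the difference between the first two types concrete, here is a minimal sketch in plain Python. The data values and the function names (`train_supervised`, `predict`, `cluster_unsupervised`) are invented for illustration; the supervised part is a simple nearest-mean classifier and the unsupervised part is a two-cluster k-means, chosen only because they are short, not because any system mentioned in this article uses them.

```python
# Toy 1-D data: two groups of measurements (hypothetical values).
samples = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
# Labels are only available in the supervised case.
labels = ["low", "low", "low", "high", "high", "high"]


def train_supervised(xs, ys):
    """Supervised: learn one mean value per label from labeled past data."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(values) / len(values) for y, values in groups.items()}


def predict(model, x):
    """Predict the label whose learned mean is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def cluster_unsupervised(xs, iterations=10):
    """Unsupervised: two-cluster k-means; no labels are used at all."""
    centers = [min(xs), max(xs)]  # simple starting guess
    for _ in range(iterations):
        buckets = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            buckets[nearest].append(x)
        # Move each center to the mean of its bucket (keep it if the bucket is empty).
        centers = [sum(b) / len(b) if b else c for b, c in zip(buckets, centers)]
    return centers


model = train_supervised(samples, labels)
print(predict(model, 4.5))            # label predicted for a new measurement
print(cluster_unsupervised(samples))  # group centers found without labels
```

The supervised model can name its answer ("low" or "high") because it was given names during training; the unsupervised one only discovers that the data falls into two groups.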


According to Wikipedia, Deep Learning is “a part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task”.
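
The two representations mentioned in the quote can be sketched in a few lines of plain Python. The 4x4 image and its values are made up for illustration, and the "edge" representation here is just the difference between horizontally adjacent pixels, a hand-written stand-in for the features a deep network would learn by itself.

```python
# A tiny 4x4 grayscale "image" (hypothetical values): dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Representation 1: a flat vector of intensity values, one entry per pixel.
pixel_vector = [value for row in image for value in row]

# Representation 2: a more abstract view, the strength of the change between
# horizontally adjacent pixels. Large values mark the boundary between regions.
edge_map = [
    [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
    for row in image
]

print(pixel_vector)
print(edge_map)
```

The edge view is smaller and states directly where the dark and bright regions meet, which is the kind of simplification the quote refers to.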

Many Deep Learning solutions have been based on GPUs. However, FPGAs are increasingly seen as a valid alternative to GPU-based Deep Learning solutions.

The main reason is the lower cost and lower power consumption of FPGAs compared to GPUs in Deep Learning applications.

Microsoft adopted Altera Arria 10 devices for its Convolutional Neural Networks (CNNs), estimating that using FPGAs would increase system throughput by roughly 70% at the same power consumption.

A recent article on Next Platform comments on how Baidu has also adopted FPGAs for deep learning solutions.

Teradeep is another company (a startup) developing CNNs, and among the first to adopt FPGAs as an alternative to GPUs. In May this year Xilinx announced that it had invested in Teradeep and would continue working closely with the company to optimize its technology.

For some time Altera has been pushing OpenCL for the implementation of neural networks. New devices from Altera and Xilinx are specifically oriented toward distributed processing applications, efficient integration of FPGAs with high-end processors, and/or high-bandwidth throughput (see our previous articles “Stratix 10MX – High memory bandwidth on SiP package” and “Intel announces Xeon processor with FPGA accelerator”).
