Wednesday, October 27, 2021 • 10:50am - 11:35am CDT
Scalable and Sustainable AI Acceleration for Everyone: Hashing Algorithms Train Billion-parameter AI Models on a Commodity CPU faster than Hardware Accelerators - Auditorium


Current Deep Learning (DL) architectures are growing ever larger to learn from complex datasets. Training and tuning these astronomically sized models is time- and energy-consuming and stalls progress in AI. Industries are increasingly investing in specialized hardware and deep learning accelerators such as TPUs and GPUs to scale up the process. It is taken for granted that commodity CPUs cannot outperform powerful accelerators such as GPUs in a head-to-head comparison on training large DL models. GPUs, however, come with additional concerns: expensive infrastructure changes that only a few can afford, difficulty of virtualization, main-memory limitations, and chip shortages. Furthermore, the energy consumption of current AI training is prohibitively expensive. An article in MIT Technology Review noted that training a single deep learning model can generate a carbon footprint larger than that of five cars over their lifetimes.

In this talk, I will demonstrate the first algorithmic progress that exponentially reduces the computation cost of training neural networks by mimicking the brain's sparsity. We will show how data structures, particularly hash tables, can be used to design an efficient "associative memory" that reduces the number of multiplications required to train neural networks. Implementations of this algorithm challenge the prevailing belief in the community that specialized processors like GPUs are vastly superior to CPUs for training large neural networks. The resulting algorithm is orders of magnitude cheaper and more energy-efficient. Our careful implementations can train billion-parameter recommendation models on refurbished, older-generation CPUs significantly faster than top-of-the-line TensorFlow alternatives on the most potent A100 GPU clusters. I will conclude with the current and future state of this line of work, along with a brief discussion of planned extensions.
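To make the hash-table idea concrete, here is a minimal, hypothetical sketch of how locality-sensitive hashing can act as an "associative memory" that retrieves only a sparse set of likely-active neurons, so multiplications are performed for those neurons alone. The `LSHLayer` class and the SimHash scheme are illustrative assumptions for this abstract, not the speaker's actual implementation:

```python
import numpy as np

class LSHLayer:
    """Toy layer that uses SimHash buckets as an associative memory
    mapping hash codes to neuron ids (illustrative sketch only)."""

    def __init__(self, in_dim, out_dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_dim, in_dim))       # neuron weights
        self.planes = rng.standard_normal((n_bits, in_dim))   # random hyperplanes
        # Build the hash table once: bucket each neuron by the sign
        # pattern of its weight vector under the random projections.
        self.table = {}
        for i, w in enumerate(self.W):
            self.table.setdefault(self._hash(w), []).append(i)

    def _hash(self, x):
        # SimHash code: which side of each random hyperplane x falls on.
        return tuple(bool(b) for b in (self.planes @ x) > 0)

    def forward(self, x):
        # Retrieve only neurons whose weights hash like the input;
        # by LSH, these are the neurons most likely to have large
        # inner products with x.
        active = self.table.get(self._hash(x), [])
        out = np.zeros(len(self.W))
        for i in active:
            out[i] = self.W[i] @ x  # multiply only for retrieved neurons
        return out, active

layer = LSHLayer(in_dim=16, out_dim=1000)
x = np.random.default_rng(1).standard_normal(16)
out, active = layer.forward(x)
```

With 8 hash bits there are only 256 buckets, so a forward pass touches on the order of `out_dim / 256` neurons instead of all 1000, which is the source of the claimed reduction in multiplications.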


Anshumali Shrivastava

Professor, Rice University; Founder, ThirdAI Corp
Anshumali Shrivastava's research focuses on Large Scale Machine Learning, Scalable and Sustainable Deep Learning, Randomized Algorithms for Big-Data and Graph Mining.

