Trends in Deep Learning Hardware
Bill Dally (NVIDIA)
Distinguished Lecture
Thursday, November 9, 2023, 3:30 pm
Abstract
The current resurgence of artificial intelligence, including generative AI like ChatGPT, is due to advances in deep learning. Systems based on deep learning now exceed human capability in speech recognition, object classification, and playing games like Go. Deep learning has been enabled by powerful, efficient computing hardware. The algorithms used have been around since the 1980s, but it has only been in the last decade -- when powerful GPUs became available to train networks -- that the technology has become practical. Advances in DL are now gated by hardware performance. In the last decade, the efficiency of DL inference on GPUs had improved by 1000x. Much of this gain was due to improvements in data representation starting with FP32 in the Kepler generation of GPUs and scaling to Int8 and FP8 in the Hopper generation. This talk will review this history and discuss further improvements in number representation including logarithmic representation, optimal clipping, and per-vector quantization.
Bio
Bill Dally joined NVIDIA in January 2009 as chief scientist, after spending 12 years at Stanford University, where he was chairman of the computer science department. Dally and his Stanford team developed the system architecture, network architecture, signaling, routing and synchronization technology that is found in most large parallel computers today. Dally was previously at the Massachusetts Institute of Technology from 1986 to 1997, where he and his team built the J-Machine and the M-Machine, experimental parallel computer systems that pioneered the separation of mechanism from programming models and demonstrated very low overhead synchronization and communication mechanisms. From 1983 to 1986, he was at California Institute of Technology (CalTech), where he designed the MOSSIM Simulation Engine and the Torus Routing chip, which pioneered "wormhole" routing and virtual-channel flow control. He is a member of the National Academy of Engineering, a Fellow of the American Academy of Arts & Sciences, a Fellow of the IEEE and the ACM, and has received the ACM Eckert-Mauchly Award, the IEEE Seymour Cray Award, and the ACM Maurice Wilkes award. He has published over 250 papers, holds over 120 issued patents, and is an author of four textbooks. Dally received a bachelor's degree in Electrical Engineering from Virginia Tech, a master's in Electrical Engineering from Stanford University and a Ph.D. in Computer Science from CalTech. He was a cofounder of Velio Communications and Stream Processors.
This talk is available on the Allen School's YouTube channel.