This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of deep neural networks to improve key metrics-such as energy-efficiency, throughput, and latency-without sacrificing accuracy or…mehr
This book provides a structured treatment of the key principles and techniques for enabling efficient processing of deep neural networks (DNNs). DNNs are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Therefore, techniques that enable efficient processing of deep neural networks to improve key metrics-such as energy-efficiency, throughput, and latency-without sacrificing accuracy or increasing hardware costs are critical to enabling the wide deployment of DNNs in AI systems.
The book includes background on DNN processing; a description and taxonomy of hardware architectural approaches for designing DNN accelerators; key metrics for evaluating and comparing different designs; features of DNN processing that are amenable to hardware/algorithm co-design to improve energy efficiency and throughput; and opportunities for applying new technologies. Readers will find a structured introduction to the field as well as formalization and organization of key concepts from contemporary work that provide insights that may spark new ideas.
Vivienne Sze received the B.A.Sc. (Hons.) degree in electrical engineering from the University of Toronto, Toronto, ON, Canada, in 2004, and the S.M. and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology (MIT), Cambridge, MA, in 2006 and 2010, respectively. In 2011, she received the Jin-Au Kong Outstanding Doctoral Thesis Prize in Electrical Engineering at MIT. She is an Associate Professor at MIT in the Electrical Engineering and Computer Science Department. Her research interests include energy-aware signal processing algorithms and low-power circuit and system design for portable multimedia applications, including computer vision, deep learning, autonomous navigation, image processing, and video compression. Prior to joining MIT, she was a Member of the Technical Staff in the Systems and Applications R&D Center at Texas Instruments (TI), Dallas, TX, where she designed low-power algorithms and architectures for video coding. She also represented TI in the JCT-VC committee of ITU-T and ISO/IEC standards body during the development of High Efficiency Video Coding (HEVC), which received a Primetime Engineering Emmy Award. Within the committee, she was the primary coordinator of the core experiment on coefficient scanning and coding, and she chaired/vice-chaired several ad hoc groups on entropy coding. She is a co-editor of High Efficiency Video Coding (HEVC): Algorithms and Architectures (Springer, 2014). Prof. Sze is a recipient of the inaugural ACM-W Rising Star Award, the 2019 Edgerton Faculty Achievement Award at MIT, the 2018 Facebook Faculty Award, the 2018 & 2017 Qualcomm Faculty Award, the 2018 & 2016 Google Faculty Research Award, the 2016 AFOSR Young Investigator Research Program (YIP) Award, the 2016 3M Non-Tenured Faculty Award, the 2014 DARPA Young Faculty Award, and the 2007 DAC/ISSCC Student Design Contest Award; and she is a co-recipient of the 2018 VLSI Best Student Paper Award, the 2017 CICC Outstanding Invited Paper Award, the 2016 IEEE Micro Top Picks Award, and the 2008 A-SSCC Outstanding Design Award. She currently serves on the technical program committee for the International Solid-State Circuits Conference (ISSCC) and the SSCS Advisory Committee (AdCom). She has served on the technical program committees for VLSI Circuits Symposium, Micro, and the Conference on Machine Learning and Systems (MLSys); as a guest editor for the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT); and as a Distinguished Lecturer for the IEEE Solid-State Circuits Society (SSCS). Prof. Sze was the Systems Program Chair of MLSys in 2020. Tien-Ju Yang received the B. S. degree in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, in 2010, and the M. S. degree in electronics engineering from NTU in 2012. Between 2012 and 2015, he worked as an engineer in the Intelligent Vision Processing Group, MediaTek Inc., Hsinchu, Taiwan. He is currently a Ph.D.candidate in Electrical Engineering and Computer Science at Massachusetts Institute of Technology, Cambridge, MA, working on energy-efficient deep neural network design. His research interests span the areas of computer vision, machine learning, image/video processing, and VLSI system design. He won first place in the 2011 National Taiwan University Innovation Contest and co-taught a tutorial on "Efficient Image Processing with Deep Neural Networks" at ICIP2019. Joel S. Emer received the B.S. (Hons.) and M.S. degrees in electrical engineering from Purdue University, West Lafayette, IN, USA, in 1974 and 1975, respectively, and the Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign, Champaign, IL, USA, in 1979. He is currently a Senior Distinguished Research Scientist with Nvidia's Architecture Research Group, Westford, MA, USA, where he is responsible for exploration of future architectures and modeling and analysis methodologies. He isalso a Professor of the Practice at the Massachusetts Institute of Technology, Cambridge, MA, USA. Previously he was with Intel, where he was an Intel Fellow and the Director of Microarchitecture Research. At Intel, he led the VSSAD Group, which he had previously been a member of at Compaq and the Digital Equipment Corporation. Over his career, he has held various research and advanced development positions investigating processor micro-architecture and developing performance modeling and evaluation techniques. He has made architectural contributions to a number of VAX, Alpha, and X86 processors and is recognized as one of the developers of the widely employed quantitative approach to processor performance evaluation. He has been recognized for his contributions in the advancement of simultaneous multithreading technology, processor reliability analysis, cache organization, pipelined processor organization, and spatial architectures for deep learning. Dr. Emer is a Fellow of the ACM and IEEE and a member of the NAE. He has been a recipient of numerous public recognitions. In 2009, he received the Eckert-Mauchly Award for lifetime contributions in computer architecture. He received the Purdue University Outstanding Electrical and Computer Engineer Alumni Award and the University of Illinois Electrical and Computer Engineering Distinguished Alumni Award in 2010 and 2011, respectively. His 1996 paper on simultaneous multithreading received the ACM/SIGARCHIEEE-CS/TCCA Most Influential Paper Award in 2011. He was named to the International Symposium on Computer Architecture (ISCA) and International Symposium on Microarchitecture (MICRO) Halls of Fame in 2005 and 2015, respectively. He has had six papers selected for the IEEE Micro's Top Picks in Computer Architecture in 2003, 2004, 2007, 2013, 2015, and 2016. He was the Program Chair of ISCA in 2000 and MICRO in 2017.
Inhaltsangabe
Preface.- Acknowledgments.- Introduction.- Overview of Deep Neural Networks.- Key Metrics and Design Objectives.- Kernel Computation.- Designing DNN Accelerators.- Operation Mapping on Specialized Hardware.- Reducing Precision.- Exploiting Sparsity.- Designing Efficient DNN Models.- Advanced Technologies.- Conclusion.- Bibliography.- Authors' Biographies.