Abstract

The talk will start with an overview of our Neural Cache architecture, which can execute convolutional, fully connected, and pooling layers entirely in-cache and also supports in-cache quantization. I will then present Duality Cache, a versatile Compute Cache architecture that re-purposes cache structures into massively parallel compute units capable of running arbitrary data-parallel workloads, including Deep Neural Networks. Our work takes a holistic approach to building the Compute Cache system stack: techniques for in-cache floating-point and fixed-point arithmetic and transcendental functions, support for the SIMT execution model, a compiler that accepts existing CUDA programs, and the flexibility to adapt to various workload characteristics.

Exposing the massive parallelism available in the Duality Cache architecture improves the performance of GPU benchmarks by 3.6× and of OpenACC benchmarks by 3.2× over a server-class GPU. Re-purposing existing caches provides 72.6× better performance for the CPU at an area cost of only 3.5%. Duality Cache reduces energy by 5.2× over the GPU and 20× over the CPU.

Biography: Reetu Das is a faculty member at the University of Michigan. Prior to this, she was a research scientist at Intel Labs and the researcher-in-residence for the Center for Future Architectures Research. She received her Ph.D. in Computer Science and Engineering from Pennsylvania State University, University Park. Some of her recent projects include in-memory architectures, custom computing for precision health and AI, fine-grain heterogeneous core architectures for mobile systems, and low-power scalable interconnects for kilo-core processors. She has authored over 45 papers, filed 7 patents, served on over 30 technical program committees, and is serving as program co-chair for MICRO-52. She has received IEEE Top Picks awards, an NSF CAREER award, CRA-W's Borg Early Career Award, and a Sloan Foundation Fellowship. Prof. Das has been inducted into the IEEE/ACM MICRO and ISCA Halls of Fame. She also serves as the co-founder and CTO of a precision medicine start-up, Sequal Inc. The works presented in this talk have been recognized with a Best Demo award at C-FAR, selected from 50 projects from leading university research groups, and with an IEEE Micro Top Picks award.

keynote2

Abstract