Presentation

ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems
Event Type
Research Manuscript
Virtual Programs
Hosted in Virtual Platform
Keywords
Embedded System Design Methodologies
Topics
Embedded Systems
Description
Traditional neural network training methods are not latency-oriented. In this work, we propose a novel method for learning compact deep neural networks that directly reduces the time cost of model inference. For efficiency, we introduce a more universal, hardware-customized latency predictor to guide and optimize the training process. Experimental results show that, compared to state-of-the-art model compression methods, our approach fits 'hard' latency constraints well, significantly reducing latency with only a mild accuracy drop. To satisfy a 34 ms latency constraint, we compact ResNet-50 with only a 0.82% accuracy drop; for GoogLeNet, accuracy even increases by 0.3%.
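
The abstract describes latency-oriented training guided by a hardware-customized latency predictor, but gives no implementation details. Below is a minimal, hypothetical PyTorch sketch of that general idea: an L1 penalty on batch-norm scale factors (a common channel-sparsity technique, suggested by the name ZeroBN but not confirmed in this abstract) combined with a simple lookup-table latency predictor that estimates whether the current network meets the latency budget. All class, function, and parameter names here are illustrative assumptions, not the authors' actual method.

```python
import torch
import torch.nn as nn

class LatencyPredictor:
    """Hypothetical lookup-table latency predictor.

    per_channel_ms maps a BatchNorm layer name to the measured latency
    contribution (in ms) of one active channel on the target edge device.
    """
    def __init__(self, per_channel_ms):
        self.per_channel_ms = per_channel_ms

    def predict(self, model, threshold=1e-3):
        # Channels whose BN scale factor is (near) zero are treated as pruned.
        total = 0.0
        for name, m in model.named_modules():
            if isinstance(m, nn.BatchNorm2d) and name in self.per_channel_ms:
                active = (m.weight.abs() > threshold).sum().item()
                total += active * self.per_channel_ms[name]
        return total

def bn_sparsity_penalty(model):
    # L1 penalty on BN scale factors: drives unimportant channels toward zero.
    return sum(m.weight.abs().sum()
               for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def train_step(model, x, y, optimizer, predictor, budget_ms, lam=1e-4):
    # Task loss plus a sparsity term that is applied only while the
    # predicted latency still exceeds the 'hard' latency budget.
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    if predictor.predict(model) > budget_ms:
        loss = loss + lam * bn_sparsity_penalty(model)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the latency predictor only gates the sparsity term on or off; how the actual work couples the predictor to the optimization is not specified in the abstract.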