TAIT: One-Shot Full-Integer Lightweight DNN Quantization via Tunable Activation Imbalance Transfer
Hosted in Virtual Platform
Description
Depthwise convolution reduces the computational density of DNNs without sacrificing much accuracy. However, current quantization methods suffer from accuracy loss, long finetuning time, or only partial quantization. To solve these issues, we have developed a mechanism called Tunable Activation Imbalance Transfer (TAIT), which tunes the uniformity of value ranges between an activated feature map and the weights of the subsequent layer. Moreover, TAIT fully supports full-integer quantization. We demonstrate TAIT on SkyNet and deploy it on an FPGA. Compared to the SOTA, our work achieves 2.2%+ IoU, 2.4× speed, and 1.8× energy-efficiency improvements, without any requirement of finetuning.
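To illustrate the general idea of transferring range imbalance from activations into the following layer's weights, here is a minimal NumPy sketch. The function name, tensor shapes, and the tuning exponent `alpha` are assumptions for illustration, not the paper's actual TAIT algorithm; the sketch only shows why such a transfer preserves the layer's output while making per-channel ranges more uniform for integer quantization.

```python
import numpy as np

def imbalance_transfer(act, w_next, alpha=0.5):
    """Hypothetical sketch: move per-channel range imbalance from an
    activation map into the weights of the layer that consumes it.

    act:    activation feature map, shape (C, H, W)
    w_next: next layer's weights, shape (C_out, C, kH, kW)
    alpha:  tunable exponent in [0, 1]; 0 is identity, 1 fully
            equalizes the per-channel activation ranges.
    """
    # Per-input-channel dynamic range of the activations.
    r = np.abs(act).reshape(act.shape[0], -1).max(axis=1)
    # Tunable scale: how much of the imbalance is shifted into the weights.
    s = (r / r.mean()) ** alpha
    s = np.where(s == 0, 1.0, s)  # guard all-zero channels
    # Shrink imbalanced activation channels and grow the matching weight
    # input channels; convolution is linear per channel, so the layer's
    # output is mathematically unchanged.
    act_eq = act / s[:, None, None]
    w_eq = w_next * s[None, :, None, None]
    return act_eq, w_eq
```

After the transfer, both tensors occupy more uniform per-channel value ranges, so a single integer scale per tensor wastes fewer quantization levels, which is what enables full-integer quantization without finetuning.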