On The Efficiency of Sparse-Tiled Tensor Graph Processing For Low Memory Usage
The memory footprint required to store and process large tensor graphs is a limiting factor for embedded ConvNets. Although many data-driven compression pipelines have proven their efficacy, this work shows that there is still room for optimization at their intersection with compute-oriented optimizations.
We demonstrate that tensor pruning via weight sparsification can cooperate with a model-agnostic tiling strategy, steering ConvNets into a feasible region of the solution space at no cost in accuracy. The collected results show, for the first time, fast MobileNet variants deployed at full scale on an Arm Cortex-M7 core with 512 KB of SRAM and 2 MB of flash.
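The interplay of the two techniques can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: unstructured magnitude pruning zeroes the smallest weights to shrink the model, while a tiled execution loop bounds how much of an output tensor is live at once. The function names are hypothetical, and a NumPy matrix multiply stands in for a real convolution layer.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights
    (unstructured weight sparsification)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def tiled_matmul(x, w, tile_rows):
    """Compute x @ w one row-tile at a time, so only a tile-sized
    slab of work is scheduled per step (bounding peak working memory)."""
    out = np.empty((x.shape[0], w.shape[1]), dtype=x.dtype)
    for start in range(0, x.shape[0], tile_rows):
        stop = min(start + tile_rows, x.shape[0])
        out[start:stop] = x[start:stop] @ w
    return out
```

Pruning reduces the weight storage (sparse formats store only the nonzeros), and tiling caps the transient activation memory; together they determine whether a given network fits the SRAM/flash budget of a microcontroller-class target.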