Pruning In Time (PIT): A Light-weight Network Architecture Optimizer for Temporal Convolutional Networks
Time: Thursday, December 9th, 10:30am - 10:52am PST
Event Type
Research Manuscript
Virtual Programs
Presented In-Person
AI/ML System Design
Description: Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks. One key feature of TCNs is time-dilated convolution, whose optimization requires extensive experimentation. We propose an automatic dilation optimizer that learns dilation factors together with weights in a single training run. Our method reduces the model size and inference latency on the GAP8 SoC by up to 7.4X and 3X, respectively, with no accuracy drop compared to a network without dilation. It also yields a rich set of Pareto-optimal TCNs starting from a single model, outperforming hand-designed solutions in both size and accuracy.
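To make the role of dilation concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' optimizer, and the function names (`dilated_causal_conv1d`, `receptive_field`) and the example weights are assumptions chosen for illustration. It only shows that a dense 9-tap causal kernel in which every tap except each fourth one is zero computes the same output as a 3-tap kernel with dilation 4, which is why choosing a larger dilation can shrink a layer without changing its receptive field.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Causal 1D convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before the start of the sequence are treated as zero.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilation):
    """Number of input time steps spanned by one output sample."""
    return (kernel_size - 1) * dilation + 1


# Hypothetical dense kernel (dilation 1) whose only nonzero taps sit at
# time offsets 0, 4 and 8.
dense = [0.5, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 2.0]

# Keeping every 4th tap gives an equivalent 3-tap kernel with dilation 4.
pruned = dense[::4]

x = [float(i) for i in range(12)]
y_dense = dilated_causal_conv1d(x, dense, dilation=1)
y_pruned = dilated_causal_conv1d(x, pruned, dilation=4)

print(y_dense == y_pruned)                           # same outputs
print(len(dense), len(pruned))                       # stored weights per filter
print(receptive_field(9, 1), receptive_field(3, 4))  # same receptive field
```

In this toy case the dilated layer stores 3 weights instead of 9 while covering the same 9 input time steps. The size and latency figures quoted in the description come from the authors' experiments, not from this sketch.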