Training Acceleration for Deep Neural Networks: A Hybrid Parallelization Strategy
Hosted in Virtual Platform
Time-Critical System Design
Description
Deep Neural Networks (DNNs) are widely investigated due to their state-of-the-art inference accuracy in many applications, but training them efficiently requires multiple accelerators in a distributed setting. However, intra-layer parallelization techniques often face communication bottlenecks, while the performance of inter-layer parallelization techniques depends on how well the model can be partitioned. We present EffTra, a synchronous hybrid parallelization strategy that combines intra-layer and inter-layer parallelism to realize distributed training of DNNs. Our evaluation shows that EffTra accelerates training by up to 1.6x and 2.0x compared to Gpipe and data parallelization techniques, respectively.
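To make the two dimensions of parallelism concrete, the following is a minimal, hardware-free sketch of how a hybrid strategy might partition work; it is not EffTra's actual implementation, and the function names and partitioning policy are assumptions for illustration. Inter-layer parallelism splits the model's layers into contiguous pipeline stages, while intra-layer parallelism shards the mini-batch across replicas within a stage.

```python
# Illustrative sketch only (not EffTra's implementation): combining
# inter-layer parallelism (pipeline stages) with intra-layer parallelism
# (batch sharding across replicas of each stage).

def partition_layers(layers, num_stages):
    """Inter-layer: split layers into contiguous pipeline stages."""
    per_stage = (len(layers) + num_stages - 1) // num_stages
    return [layers[i:i + per_stage] for i in range(0, len(layers), per_stage)]

def shard_batch(batch, num_replicas):
    """Intra-layer: shard one mini-batch across stage replicas."""
    per_replica = (len(batch) + num_replicas - 1) // num_replicas
    return [batch[i:i + per_replica] for i in range(0, len(batch), per_replica)]

layers = [f"layer{i}" for i in range(8)]
stages = partition_layers(layers, num_stages=4)        # 4 pipeline stages
shards = shard_batch(list(range(16)), num_replicas=2)  # 2 replicas per stage
```

In a real system each stage would run on a separate accelerator (or group of accelerators), with activations flowing between stages and gradients synchronized across the replicas of each stage; the scheduling and communication layers are where the actual engineering effort lies.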