A Compute-in-Memory Architecture Compatible with 3D NAND Flash that Parallelly Activates Multi-Layers
Event Type
Research Manuscript
Virtual Programs
Hosted in Virtual Platform
Near-Memory and In-Memory Computing
Embedded Systems
Description
We propose a novel NAND-based architecture that efficiently accelerates vector-matrix multiplication for deep neural networks. The approach is fully compatible with 3D NAND and activates multiple layers of wordline planes in parallel, in contrast to previous layer-by-layer activation. Linear-VT correction and positive-negative weight techniques enable multi-level weight storage and improve computing precision. The feasibility and accuracy of the architecture are verified with TCAD, SPICE, and system-level simulations based on commercial 3D NAND parameters. Major advantages include a 16x to 32x increase in array utilization and a 64x to 128x reduction in read power consumption.
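The positive-negative weight idea can be illustrated with a minimal sketch. This is not the authors' circuit: it assumes a generic differential encoding in which each signed weight is stored as two non-negative values and the result is the difference of two ideal column sums. The function names (`split_signed`, `vmm`, `differential_vmm`) and the example values are hypothetical, and no device effects such as VT nonlinearity are modeled.

```python
def split_signed(weights):
    """Split a signed matrix into two non-negative matrices (assumed encoding)."""
    pos = [[max(w, 0) for w in row] for row in weights]
    neg = [[max(-w, 0) for w in row] for row in weights]
    return pos, neg


def vmm(inputs, matrix):
    """Ideal vector-matrix product: out[j] = sum_i inputs[i] * matrix[i][j]."""
    cols = len(matrix[0])
    return [sum(x * row[j] for x, row in zip(inputs, matrix)) for j in range(cols)]


def differential_vmm(inputs, weights):
    """Compute the signed product as the difference of two non-negative products."""
    pos, neg = split_signed(weights)
    i_pos = vmm(inputs, pos)  # stands in for the summed "positive" column currents
    i_neg = vmm(inputs, neg)  # stands in for the summed "negative" column currents
    return [p - n for p, n in zip(i_pos, i_neg)]


# Hypothetical example values, chosen only for illustration.
weights = [[2, -1], [-3, 4], [0, 1]]
inputs = [1, 0, 1]
result = differential_vmm(inputs, weights)
```

Under these assumptions, `differential_vmm(inputs, weights)` matches a direct signed product `vmm(inputs, weights)`, which is the property a positive-negative storage scheme relies on: only non-negative quantities need to be stored in the array.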