Functional Criticality Classification of Structural Faults in AI Accelerators
Event Type
Special Session (Research Track)
Virtual Programs
Hosted in Virtual Platform
Machine Learning/AI
Description
The ubiquitous application of deep neural networks (DNNs) has led to a rise in demand for AI accelerators. DNN-specific functional criticality analysis identifies faults that cause measurable and significant deviations from acceptable requirements. The criticality of faults can thus be used to guide test grading and quality assessment of AI accelerators at various stages of the product lifecycle. Because the functional criticality of a fault depends on the dataset, the DNN model, and the model-to-hardware mapping, individual catalogs listing critical and benign faults can be pre-generated for every domain-specific use case. This presentation will examine the problem of classifying structural faults, based on their functional criticality, in two promising hardware architectures: (a) a systolic-array accelerator and (b) Resistive Random-Access Memory (ReRAM)-based systems. The systolic-array-based accelerator resembles Google’s TPU architecture and uses floating-point multipliers for inference, while the ReRAM-based architecture implements CNN inference in a pipelined fashion using fixed-point representations. The disparate nature of these two hardware platforms also leads to different mapping policies, which affect how faults impact final inference accuracy. The impact of hardware faults in the processing elements (PEs) of these two architectures will be analyzed, and a multi-tier machine-learning (ML)-based method will be presented to assess the functional criticality of these faults. The speaker will address the problem of minimizing misclassification by using generative adversarial networks and graph convolutional networks, with the systolic array as an example.
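To make the notion of functional criticality concrete, the following is a minimal sketch of brute-force fault injection: a fault is labeled critical when it lowers inference accuracy by more than a chosen tolerance, and benign otherwise. It is not the multi-tier ML method described in the session; it only illustrates the kind of labeling such a method would learn to predict. Everything here is an assumption made for illustration: the toy weights and dataset, the one-PE-per-weight mapping, the stuck-at-1 fault on a single product bit, integer arithmetic in place of floating point, and the `MAX_ACCURACY_DROP` threshold.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fault descriptor: (row, col, bit) means the product computed
# by PE (row, col) has the given bit stuck at 1.
Fault = Tuple[int, int, int]

MAX_ACCURACY_DROP = 0.1  # assumed tolerance; a real use case defines its own


def infer(weights: List[List[int]], x: List[int],
          fault: Optional[Fault] = None) -> int:
    """Matrix-vector product where PE (r, c) computes weights[r][c] * x[c].

    Returns the index of the largest output score (the predicted class).
    """
    scores = []
    for r, row in enumerate(weights):
        acc = 0
        for c, w in enumerate(row):
            prod = w * x[c]
            if fault is not None and (r, c) == (fault[0], fault[1]):
                prod |= 1 << fault[2]  # inject the stuck-at-1 fault
            acc += prod
        scores.append(acc)
    return scores.index(max(scores))


def accuracy(weights, samples, fault: Optional[Fault] = None) -> float:
    """Fraction of samples whose predicted class matches the label."""
    correct = sum(infer(weights, x, fault) == label for x, label in samples)
    return correct / len(samples)


def classify_faults(weights, samples, faults: List[Fault]) -> Dict[Fault, str]:
    """Label each fault by the accuracy drop it causes relative to fault-free."""
    baseline = accuracy(weights, samples)
    labels = {}
    for fault in faults:
        drop = baseline - accuracy(weights, samples, fault)
        labels[fault] = "critical" if drop > MAX_ACCURACY_DROP else "benign"
    return labels


# Toy two-class model and dataset (illustrative values only).
WEIGHTS = [[3, 1], [1, 3]]
SAMPLES = [([5, 1], 0), ([4, 2], 0), ([1, 5], 1), ([2, 4], 1)]
FAULTS = [(0, 0, 0), (0, 0, 6)]  # low-order bit vs. high-order bit, same PE

catalog = classify_faults(WEIGHTS, SAMPLES, FAULTS)
for fault, label in catalog.items():
    print(fault, label)
```

In this toy setup the same PE yields one benign and one critical fault depending on which bit is affected, and the labels would change with a different dataset, model, or mapping. Exhaustive injection of this kind scales poorly with the number of fault sites, which is one reason a learned classifier is attractive.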