|Title||A Novel Architecture Design for Output Significance Aligned Flow with Adaptive Control in ReRAM-based Neural Network Accelerator|
|Publication Type||Journal Article|
|Year of Publication||2022|
|Authors||T Li, N Jing, J Jiang, Q Wang, Z Mao, and Y Chen|
|Journal||ACM Transactions on Design Automation of Electronic Systems|
Resistive-RAM-based (ReRAM-based) computing shows great potential for accelerating DNN inference through its highly parallel structure. Unfortunately, computing accuracy in practice is much lower than expected due to non-ideal ReRAM devices. The conventional computing flow with a fixed wordline activation scheme can effectively protect computing accuracy, but at the cost of significant reductions in performance and energy savings. To resolve this tension among accuracy, performance, and energy, this article proposes a new Adaptive-Wordline-Activation control scheme (AWA-control) and combines it with a theoretical Output-Significance-Aligned computing flow (OSA-flow) to enable fine-grained control over output significances with distinct impacts on the final result. We demonstrate an AWA-control-supported OSA-flow architecture with maximal compatibility with the conventional crossbar, using input retiming and weight remapping with shift registers to enable the new flow. Moreover, in contrast to the conventional computing architecture, the OSA-flow architecture is better able to exploit the data sparsity commonly seen in DNN models, so we also design a sparsity-aware OSA-flow architecture for further DNN speedup. Evaluation results show that the OSA-flow architecture can provide a significant performance improvement of 21.6× and energy savings of 96.2% over the conventional computing architecture with similar DNN accuracy.
|Short Title||ACM Transactions on Design Automation of Electronic Systems|