Lattice: An ADC/DAC-less ReRAM-based processing-in-memory architecture for accelerating deep convolution neural networks

Abstract

Nonvolatile Processing-In-Memory (NVPIM) has demonstrated great potential in accelerating Deep Convolution Neural Networks (DCNN). However, most existing NVPIM designs require costly analog-digital conversions and often rely on excessive data copies or writes to achieve performance speedup. In this paper, we propose a new NVPIM architecture, namely Lattice, which computes the partial sums of the dot products between the feature maps and the weights of network layers in a CMOS peripheral circuit, eliminating the analog-digital conversions. Lattice also naturally offers an efficient data mapping scheme that aligns the data of the feature maps and the weights, and hence avoids the excessive data copies or writes incurred by previous NVPIM designs. Finally, we develop a zero-flag encoding scheme to save the energy of processing zero values in sparse DCNNs. Our experimental results show that Lattice improves the system energy efficiency by 4x to 13.22x compared to three state-of-the-art NVPIM designs: ISAAC, PipeLayer, and FloatPIM.
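
Since the abstract only summarizes the zero-flag encoding idea, the following is a minimal software sketch of how attaching a one-bit zero flag to each activation could let the accumulation of partial sums skip zero values; the function names, data layout, and granularity here are illustrative assumptions, not the Lattice hardware design described in the paper.

    # Illustrative sketch only: a software analogue of a zero-flag encoding scheme.
    # Names and structure are assumptions, not the Lattice hardware implementation.

    def encode_with_zero_flags(values):
        """Attach a 1-bit zero flag to each value; flagged entries need not be read or multiplied."""
        return [(v == 0, v) for v in values]

    def dot_product_skipping_zeros(flagged_activations, weights):
        """Accumulate partial sums, skipping positions whose zero flag is set (models the energy saving)."""
        total = 0
        skipped = 0
        for (is_zero, a), w in zip(flagged_activations, weights):
            if is_zero:
                skipped += 1      # in hardware, this lane's read/compute energy would be avoided
                continue
            total += a * w        # digital multiply-accumulate, as in a CMOS peripheral circuit
        return total, skipped

    activations = encode_with_zero_flags([3, 0, 0, 5, 0, 2])
    weights = [1, 4, -2, 3, 7, -1]
    print(dot_product_skipping_zeros(activations, weights))  # (16, 3): 3 of 6 positions skipped

In a sparse DCNN, the fraction of skipped positions grows with activation sparsity, which is where the energy saving claimed in the abstract would come from.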

DOI
10.1109/DAC18072.2020.9218590
Year
2020