Abstract
As demand for high-resolution image applications increases, image processing systems are becoming increasingly important. Graphics processing units (GPUs) can increase computational capacity through massive parallelism, but they remain constrained by limited memory bandwidth. Near-data processing (NDP) is expected to mitigate the performance and energy overhead of data transfer by performing computations on the logic die of 3D-stacked memory. Although prior studies have demonstrated the advantages of NDP, an NDP solution focused on image processing has not yet been developed. This article proposes a GPU-based NDP architecture and well-matched optimization strategies that consider both the characteristics of image applications and NDP constraints. First, data allocation to the processing unit is addressed to preserve data locality and the data access pattern. Second, a lightweight yet efficient NDP GPU architecture is proposed. By applying a prefetcher that leverages the pattern-aware data allocation, the number of active warps and the on-chip SRAM size of the NDP are significantly reduced. This allows the NDP constraints to be satisfied and a greater number of processing units to be integrated on a logic die. The evaluation results show that the proposed NDP GPU improves performance by 1.85× and consumes 82.7 percent of the energy consumed by the baseline NDP GPU.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 13-26 |
Number of pages | 14 |
Journal | IEEE Transactions on Computers |
Volume | 71 |
Issue number | 1 |
State | Published - 1 Jan 2022 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 1968-2012 IEEE.
Keywords
- Near-data processing
- image processing
- processing-in-memory