Addressing Prolonged Restore Challenge in Further Scaling DRAMs
PhD Dissertation Defense
July 14, 2017 12:00 to 2:00 PM
6106 Sennott Square
University of Pittsburgh Ph.D. Candidate
Continuous scaling has sustained its performance growth and capacity gains, making DRAM (Dynamic Random Access Memory) the de facto memory technology of the past decades and a widely adopted component of modern computing systems. However, scaling DRAM further into the deep sub-micron regime makes it significantly harder to balance density, yield, and performance. Among the resulting issues, prolonged restore time is expected to be one of the major concerns, yet it has received little attention. Targeting this restore scaling issue, this thesis performs pioneering studies to characterize the problem and presents architectural techniques to overcome it.
First, our experimental studies identify significant restore process variations, which affect restore timing constraints. Traditional approaches result in either a low yield rate or large performance degradation. To solve this problem, we propose schemes that expose the variations to the architectural level. By constructing memory chunks with different access speeds and, in particular, exploiting the performance benefits of fast chunks, a variation-aware memory controller can effectively compensate for the performance loss. Going further, we maximize the performance improvement by applying restore-time-aware rank construction and hotness-aware page allocation to make better use of the fast regions.
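The hotness-aware page allocation idea can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `allocate_pages`, the per-chunk restore latencies, and the one-page-per-chunk simplification are all assumptions made for illustration. The sketch ranks pages by access count and pairs the hottest pages with the chunks that have the shortest restore time.

```python
from typing import Dict


def allocate_pages(page_access_counts: Dict[int, int],
                   chunk_restore_ns: Dict[int, float]) -> Dict[int, int]:
    """Map each page to a chunk, hottest page to fastest chunk.

    Hypothetical helper: assumes one page per chunk and that per-chunk
    restore latencies (in ns) have already been profiled.
    """
    if len(page_access_counts) > len(chunk_restore_ns):
        raise ValueError("not enough chunks for the given pages")
    # Hottest pages first; ties broken by page number for determinism.
    hot_first = sorted(page_access_counts,
                       key=lambda p: (-page_access_counts[p], p))
    # Fastest (shortest restore time) chunks first.
    fast_first = sorted(chunk_restore_ns,
                        key=lambda c: (chunk_restore_ns[c], c))
    return dict(zip(hot_first, fast_first))


# Illustrative numbers only: page 1 is the hottest, chunk 11 is the fastest.
mapping = allocate_pages({0: 5, 1: 100, 2: 40},
                         {10: 15.0, 11: 12.0, 12: 18.0})
```

In this toy setting the most frequently accessed page lands in the chunk with the shortest restore latency, which is the effect the allocation scheme aims for.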
Second, beyond simply exposing the timing variations to a higher level, we dig deeper and find that refresh and restore are two strongly correlated operations. Whereas a restore recharges DRAM cells after each normal read or write access, refresh is performed periodically to fully charge the cells, which provides an opportunity to terminate a restore operation early. With this insight, we first propose truncating a restore on the basis of the time distance to the next refresh. Further, to expose more truncation opportunities, we integrate the multirate-refresh concept to shorten the refresh distance by increasing the refresh rate of recently accessed rows.
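A minimal Python sketch of distance-based restore truncation follows. It rests on stated assumptions rather than the thesis's actual design: a 64 ms refresh window (a commonly cited DRAM value), a refresh window split into four equal bins, and hypothetical restore latencies per bin in `RESTORE_CYCLES_BY_BIN`. The function names are likewise illustrative.

```python
REFRESH_WINDOW_MS = 64.0  # assumed baseline refresh period

# Hypothetical restore latencies in memory-clock cycles, one per quarter of
# the refresh window. The closer the next refresh, the shorter the restore.
RESTORE_CYCLES_BY_BIN = [18, 21, 24, 27]


def ms_to_next_refresh(now_ms: float, row_refresh_phase_ms: float,
                       refresh_period_ms: float = REFRESH_WINDOW_MS) -> float:
    """Time until the row's next periodic refresh (0 means it is due now)."""
    return (row_refresh_phase_ms - now_ms) % refresh_period_ms


def restore_cycles(distance_ms: float) -> int:
    """Pick a (possibly truncated) restore latency from the refresh distance."""
    if not 0.0 <= distance_ms <= REFRESH_WINDOW_MS:
        raise ValueError("distance must lie within the refresh window")
    n_bins = len(RESTORE_CYCLES_BY_BIN)
    bin_width = REFRESH_WINDOW_MS / n_bins
    index = min(int(distance_ms / bin_width), n_bins - 1)
    return RESTORE_CYCLES_BY_BIN[index]


# Baseline rate: the next refresh is 34 ms away, so little can be truncated.
base = restore_cycles(ms_to_next_refresh(40.0, 10.0))
# Doubling the refresh rate of a recently accessed row (32 ms period)
# shortens the distance to 2 ms and allows the shortest restore.
fast = restore_cycles(ms_to_next_refresh(40.0, 10.0, refresh_period_ms=32.0))
```

The second call shows why a multirate-refresh scheme exposes more truncation opportunities: a higher refresh rate bounds the distance to the next refresh, so accesses fall into the shorter-restore bins more often.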
Lastly, we move up to the application level, inspired by the observation that a large set of applications can tolerate output accuracy loss and runtime errors, which enables us to exploit approximate computing to trade accuracy for performance. By utilizing the variation in restore timing across the segments of a DRAM row, we reduce the restore time such that only some segments are fully reliable. We then map critical data onto the reliable row segments to keep application-level errors low. On top of this approximation-aware technique, we further generalize it to support both precise and hybrid approximate-precise computing.
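The mapping of critical data onto reliable row segments can be sketched in Python as follows. This is an illustrative sketch only: the function name `place_data`, the per-segment restore requirements, and the data item labels are hypothetical, and the sketch assumes one data item per segment. A segment counts as reliable when its required restore time fits within the reduced restore time.

```python
from typing import Dict, List


def place_data(segment_restore_ns: List[float], reduced_restore_ns: float,
               critical: List[str], approximable: List[str]) -> Dict[str, int]:
    """Assign data items to row segments under a reduced restore time.

    Hypothetical helper: critical items go only to segments that are still
    fully restored; approximable items may use any remaining segment.
    """
    reliable = [i for i, t in enumerate(segment_restore_ns)
                if t <= reduced_restore_ns]
    unreliable = [i for i, t in enumerate(segment_restore_ns)
                  if t > reduced_restore_ns]
    if len(critical) > len(reliable):
        raise ValueError("not enough reliable segments for critical data")
    # Segments left over for error-tolerant data, reliable ones first.
    spare = reliable[len(critical):] + unreliable
    if len(approximable) > len(spare):
        raise ValueError("row is too small for the given data")
    placement = dict(zip(critical, reliable))
    placement.update(zip(approximable, spare))
    return placement


# Illustrative values: with a 13 ns restore, only segments 0 and 2 are reliable.
layout = place_data([12.0, 16.0, 11.0, 15.0], 13.0,
                    critical=["exponent"],
                    approximable=["mantissa_low", "pixels"])
```

Setting `reduced_restore_ns` at or above the slowest segment's requirement makes every segment reliable, which corresponds to fully precise operation in this simplified model.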
Youtao Zhang (Advisor)