Design Guidelines for High-Performance SCM Hierarchies

AI-generated keywords: SCM DRAM Performance Cost Hierarchy

AI-generated Key Points

The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.

  • Integration of emerging storage-class memory (SCM) in servers to improve performance and cost compared to DRAM-only architectures
  • SCM offers high density and access latencies within a few factors of DRAM, but the remaining latency gap poses challenges for latency-sensitive services
  • Proposal of deploying a modestly sized high-bandwidth 3D stacked DRAM cache in an SCM-mostly memory system to mitigate latency issues
  • Identification of key design parameters in the memory hierarchy that impact performance and cost when combining SCM with a 3D stacked DRAM cache
  • Introduction of a methodology for provisioning these parameters based on a target performance/cost goal
  • Demonstration using PCM as a case study, showing that a two bits/cell technology achieves a performance/cost sweet spot, reducing memory subsystem cost by 40% while maintaining performance within 3% of the best performing DRAM-only system
  • Valuable insights into designing high-performance SCM hierarchies in servers and guidelines for integrating SCM effectively while considering cost constraints
  • Contribution to advancing the adoption of emerging SCM technologies in server architectures

Authors: Dmitrii Ustiugov, Alexandros Daglis, Javier Picorel, Mark Sutherland, Edouard Bugnion, Babak Falsafi, Dionisios Pnevmatikatos

Published at MEMSYS'18

Abstract: With emerging storage-class memory (SCM) nearing commercialization, there is evidence that it will deliver the much-anticipated high density and access latencies within only a few factors of DRAM. Nevertheless, the latency-sensitive nature of memory-resident services makes seamless integration of SCM in servers questionable. In this paper, we ask the question of how best to introduce SCM for such servers to improve overall performance/cost over existing DRAM-only architectures. We first show that even with the most optimistic latency projections for SCM, the higher memory access latency results in prohibitive performance degradation. However, we find that deployment of a modestly sized high-bandwidth 3D stacked DRAM cache makes the performance of an SCM-mostly memory system competitive. The high degree of spatial locality that memory-resident services exhibit not only simplifies the DRAM cache's design as page-based, but also enables the amortization of increased SCM access latencies and the mitigation of SCM's read/write latency disparity. We identify the set of memory hierarchy design parameters that plays a key role in the performance and cost of a memory system combining an SCM technology and a 3D stacked DRAM cache. We then introduce a methodology to drive provisioning for each of these design parameters under a target performance/cost goal. Finally, we use our methodology to derive concrete results for specific SCM technologies. With PCM as a case study, we show that a two bits/cell technology hits the performance/cost sweet spot, reducing the memory subsystem cost by 40% while keeping performance within 3% of the best performing DRAM-only system, whereas single-level and triple-level cell organizations are impractical for use as memory replacements.
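The abstract's claim that a DRAM cache can amortize SCM's higher access latency can be illustrated with a simple average-latency model. The sketch below is not the paper's methodology: the function name, the assumption that a miss pays the cache lookup plus one SCM access, and all latency values are hypothetical and chosen only for illustration.

```python
def average_access_latency_ns(hit_ratio, dram_cache_ns, scm_ns):
    """Average memory access latency for a DRAM cache in front of SCM.

    Assumed model (hypothetical, not taken from the paper):
      - a hit costs one DRAM cache access;
      - a miss costs the DRAM cache lookup plus one SCM access.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = dram_cache_ns
    miss_cost = dram_cache_ns + scm_ns
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


if __name__ == "__main__":
    # Illustrative latencies only; real values depend on the technology.
    for ratio in (0.0, 0.5, 0.9, 0.99):
        latency = average_access_latency_ns(ratio, dram_cache_ns=50.0, scm_ns=200.0)
        print(f"hit ratio {ratio:.2f}: {latency:.1f} ns")
```

Under this model, a higher hit ratio pulls the average latency toward the DRAM cache latency, which is the intuition behind pairing a modestly sized cache with a mostly-SCM memory system when the workload has high spatial locality.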

Submitted to arXiv on 20 Jan. 2018


Results of the summarizing process for the arXiv paper: 1801.06726v4

This paper explores the integration of emerging storage-class memory (SCM) in servers to improve overall performance and cost compared to existing DRAM-only architectures. SCM offers high density and access latencies within a few factors of DRAM, but the remaining latency gap poses challenges for latency-sensitive memory-resident services. To mitigate this, the authors propose deploying a modestly sized high-bandwidth 3D stacked DRAM cache in an SCM-mostly memory system. The paper identifies key design parameters in the memory hierarchy that impact performance and cost when combining SCM technology with a 3D stacked DRAM cache. A methodology is introduced to drive provisioning for each of these parameters based on a target performance/cost goal. Using PCM as a case study, the authors demonstrate that a two bits/cell technology achieves a performance/cost sweet spot, reducing the memory subsystem cost by 40% while maintaining performance within 3% of the best performing DRAM-only system. This research provides insights into designing high-performance SCM hierarchies in servers, offering guidelines for integrating SCM effectively and optimizing performance while considering cost constraints. The findings contribute to advancing the adoption of emerging SCM technologies in server architectures.
Created on 19 Sep. 2023
