1 NVMe Blurs the Lines Between Memory and Storage
Alexis Theriault edited this page 2025-11-22 23:43:07 +00:00


Personally, I don't think we are going to see the line between memory and storage get all that blurred any time soon. Sure, 3D XPoint is much more responsive than flash. But flash isn't all that spectacular as is. The typical hard drive has access latencies around 4 ms on average. Flash can reach well below 1 µs latency, but only for very small arrays. That is expensive and mostly seen in microcontrollers that execute directly from their flash. "Enterprise grade" flash that optimizes for cost/GB can have far higher latency, in the few to tens of µs range. 3D XPoint is a bit of a wash. I've seen quoted figures of sub-350 ns write latency, but that is likely for a single cell, not an array. Optane modules from Intel, on the other hand, have typical latencies around 5-15 µs, but that is from a "system" perspective, i.e., protocol and controller overhead comes into play, as well as one's software environment.
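To put those latency figures in perspective, here is a rough back-of-the-envelope conversion into CPU clock cycles. The 4 GHz clock is my own assumption for illustration, and the latency figures are the approximate numbers quoted above, not measurements:

```python
# Rough conversion of the access latencies discussed above into CPU
# clock cycles, assuming a 4 GHz core (0.25 ns per cycle).
CLOCK_HZ = 4e9  # assumed clock speed, 4 GHz

latencies_s = {
    "DRAM (~10 ns)": 10e-9,
    "3D XPoint cell (~350 ns)": 350e-9,
    "Optane module (~10 us)": 10e-6,
    "enterprise flash (~20 us)": 20e-6,
    "hard drive (~4 ms)": 4e-3,
}

for name, seconds in latencies_s.items():
    cycles = seconds * CLOCK_HZ
    print(f"{name}: ~{cycles:,.0f} cycles")
```

Even the single-cell 3D XPoint figure is over a thousand cycles away, and the module-level figure is tens of thousands, which is why the "system" perspective matters so much.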


DRAM, on the other hand, has access latencies around 2-15 ns at present. The problem with latency is that it results in our processor stalling because the data does not arrive in time. One can prefetch, but branches make prefetching harder, since which side should you fetch? Branch prediction partly solves this issue, but from a performance standpoint we would have to fetch both sides. And the more latency we have, the earlier we have to prefetch, risking crossing more branches along the way. In other words, the peak bandwidth required by our processor increases at an exponential rate compared to latency. A rate that is software dependent as well. Caching might sound like the trivial solution to the problem, and to a degree cache is a magic bullet that just makes memory latency disappear. But every time an application needs something that isn't in cache, that application stalls. As long as there are other threads to take its place that do have data to work on, you won't see a performance deficit apart from thread-switching penalties, but if you don't have such threads, then the CPU stalls.
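The exponential-bandwidth argument above can be sketched numerically. If a branch occurs roughly every B ns of executed work and the memory latency is L ns, then a prefetcher that wants to hide the full latency must cover both outcomes of every branch inside the latency window, so the number of speculative fetches in flight grows like 2^(L/B). This is a toy model with assumed numbers, not a measurement:

```python
# Toy model: a branch every BRANCH_INTERVAL_NS of work, and a
# prefetcher that must fetch both sides of every branch it crosses
# while hiding the given memory latency.
BRANCH_INTERVAL_NS = 5.0  # assumed work between branches

def speculative_lines(latency_ns: float) -> int:
    """Fetches that must be in flight to fully hide the latency."""
    branches_in_window = int(latency_ns // BRANCH_INTERVAL_NS)
    return 2 ** branches_in_window  # both sides of every branch

for latency in (10, 20, 40, 80):
    print(f"{latency} ns latency -> {speculative_lines(latency)} fetches in flight")
```

Doubling the latency squares the number of speculative fetches in this model, which is the sense in which required peak bandwidth grows exponentially with latency.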


One can make sure that more threads have their data by simply making the cache bigger, but cache is a lot more expensive than DRAM. In the end, it all comes down to the fact that increasing latency requires an arbitrary amount of extra cache for the same system performance. Going from the few ns of latency of DRAM to the couple of µs of latency of current persistent memory is not realistic as an actual replacement for DRAM; even if its latency were reduced to a 100th, it would still not be impressive as far as memory goes. Though using persistent DIMMs for storage caching or as a "RAM drive" of sorts still has major benefits, for program execution it is laughable.
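The "even a 100th" point is easy to check with the rough figures from the text. Taking ~10 µs as the Optane-class system-level latency and ~10 ns as a typical DRAM latency (both approximate numbers from above):

```python
# Even a 100x latency improvement over today's persistent memory
# leaves a large gap to DRAM.  Figures are the rough numbers from
# the text, not measurements.
persistent_ns = 10_000   # ~10 us, Optane-class "system" latency
dram_ns = 10             # ~2-15 ns DRAM access latency

improved_ns = persistent_ns / 100
print(f"100x better persistent memory: {improved_ns:.0f} ns")
print(f"still ~{improved_ns / dram_ns:.0f}x slower than DRAM")
```

A hundredfold improvement still lands an order of magnitude away from DRAM, which is why the author dismisses it as a replacement for program execution while conceding the storage-caching use case.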