[dupe] What every programmer should know about memory (2007) (lwn.net)
119 points by jimsojim on Nov 20, 2015 | hide | past | favorite | 53 comments




What every programmer should know about memory:

Even though it may not seem so at first, memory is very lossy, especially in old age. So, write comments, write documentation, make notes. Diagrams help too.

It's best to acquire these skills at an early age, since at an older age memory stores also require more cycles.

There are also ways to boost memory. For one, emotionally charged events are remembered very well. Thus, don't be afraid to experiment; the bigger the mistake you make, the better you will remember it.

Edit: Ah, never mind. Apparently this is a different type of memory that programmers have to deal with. Still, I hope this advice was useful.


Obligatory "Latency Numbers Every Programmer Should Know"

http://www.eecs.berkeley.edu/~rcs/research/interactive_laten...


What every programmer should know: large capacity and blazing fast NAND is poised to change everything you used to know. JEDEC is working on standardizing large capacity NVDIMMs [1][2][3], and that will probably mean at the very least a new tier in this hierarchy. And perhaps changes to swap, filesystems, databases, and boot/initialization will come to capitalize on this new tier.

[1] https://www.jedec.org/news/pressreleases/jedec-announces-sup...

[2] http://www.jedec.org/sites/default/files/files/Brett_William...

[3] https://en.wikipedia.org/wiki/NVDIMM


Unless they get CAS latency down (the little I know about flash suggests probably not), I don't know that it's going to change all that much.


This is on a different tier entirely from NVDIMMs, but I've been wondering lately whether all of the cache-locality efforts are going to go to waste, or even become counterproductive, as hardware advances. Effective cache locality requires jumping through a lot of hoops.
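For what it's worth, access order can matter even from a high-level language. A rough sketch (pure Python, so interpreter overhead dampens the effect considerably; the buffer size and stride values here are arbitrary, not from the thread):

```python
import time

N = 1 << 22  # 4 MiB buffer, larger than a typical L2 cache
data = bytearray(N)

def sum_with_stride(buf, stride):
    # Touch every byte exactly once, just in different orders: stride 1
    # walks memory sequentially (cache- and prefetcher-friendly); a large
    # stride hops across cache lines on every access.
    total = 0
    for start in range(stride):
        for i in range(start, len(buf), stride):
            total += buf[i]
    return total

for stride in (1, 4096):
    t0 = time.perf_counter()
    sum_with_stride(data, stride)
    print(f"stride {stride}: {time.perf_counter() - t0:.3f}s")
```

In a compiled language the gap between the two strides is usually far more dramatic; here it mostly illustrates that "same work, different order" is not free.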


I would love to read an updated version of this. The fundamentals won't have changed much, but the details are changing, and Ulrich went into a lot of detail in this doc.

Great read.


A wiki could work for updating this writeup efficiently; a good example is the way Linux booting was documented on GitHub:

https://github.com/0xAX/linux-insides


This is way too low-level for every programmer to know. While there is probably some use for this, the average Java/Python/Ruby/Node.js programmer will never need this information.

When working with embedded systems this information is more useful, but very few programmers actually work in that industry compared to others, e.g. web.


Wrong. Every programmer should know this stuff. You may not be directly manipulating memory in python, but you should know enough about memory (and other scarce resources) to know when you need to optimize your application. How are you going to know when you need to use __slots__ in python? Should java programmers just blindly throw -Xms<size> when they think they need it? How should they justify that?

We have GBs and GBs of memory in our computers today, but all of the "I don't have to worry about memory constraints" attitudes have snowballed. The applications we run today are nowhere near as efficient as they could be otherwise.
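To make the `__slots__` point concrete, a small sketch (the class names are illustrative):

```python
import sys

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ("x", "y")  # fixed attribute layout, no per-instance __dict__
    def __init__(self, x, y):
        self.x = x
        self.y = y

p, sp = Point(1, 2), SlottedPoint(1, 2)
print(hasattr(p, "__dict__"))     # True: every instance carries its own dict
print(hasattr(sp, "__dict__"))    # False: attributes live in fixed slots
print(sys.getsizeof(p.__dict__))  # the per-instance dict overhead you save
```

Multiply that dict overhead by millions of instances and it's the difference between fitting in RAM and not.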


> the average java/python/ruby/node.js programmer will never need this information

That's what every kid in school ever says about a subject they don't care about. They're usually proven wrong.

Just in the first few pages you'll find this gem:

  This leakage is why a DRAM cell must be constantly refreshed. For
  most DRAM chips these days this refresh must happen every 64ms. During
  the refresh cycle no access to the memory is possible. For some workloads
  this overhead might stall up to 50% of the memory accesses (see [highperfdram]).
You don't think a Java, Python, Ruby, or Node programmer might need to know this at some juncture?

Having a deep understanding of how things work helps you visualize everything going on and see potential problems before they happen. This kind of deep thinking is helpful in all kinds of activities.


"Would benefit, eventually, at some point, when butting up against resource limitations, if they're advanced and do this for a while and no one's around to point this out to them in code review" != "everyone needs to know"


I think I get what you're saying here. You don't "need" to know this to write code. Which is true - heck, you don't "need" to know how a tcp/ip connection works, or how disk storage works. But one day you might.

When you put fuel in your car and you turn the ignition and step on the gas, the wheels move. So all you need to know is where the fuel goes and how to turn the ignition and where the gas pedal is. But one day, I guarantee you, the wheels will stop moving.

When your boss storms up to your desk and says "Why aren't the god damn wheels moving?! We're losing ad revenue every minute!!", I hope Code Review Guy is around to tell you how fuel injectors work.


You know, I have a really hard time imagining a situation where your boss storms up to your desk and says "Why aren't the goddamn wheels moving? We're losing ad revenue every minute!!" and you reply "Aww goddamn, I forgot to account for DRAM refresh cycle."

There's knowing how fuel injectors work, and there's knowing how copper crystal in the wire admits an energy band for free electrons.


The point isn't to know the answer to every problem. The point is to be educated enough to even have a guess as to where to start looking for the problem, and then start trying to figure out the answer.


I think there's a big difference between "read once" and "know". When some people say a fact is "known" they mean strictly that you are able to recall that fact in toto. That is probably overkill for a Web programmer.

After you read something once you don't strictly know it. However if you run into a related problem down the road, you have a good chance that you can remember enough of what you read once to put a Google query together and find the original article.

After thinking about it a little more, what they probably mean is that while most programmers will forget most of what they read here (because they don't use it day-to-day) it is reliable information that will help shape their intuition about how memory works. That is probably what they mean about every programmer and this article.

Every programmer ought to read and test their intuition about how memory works against the content in this article. If they are surprised by something they read, they have the chance to adjust their intuition so that it is closer to the truth.


I agree. Don't read it and tell everyone else not to. Keep the competition low for the rest of us ;-).


I agree with you to a degree. I think there are more systems programmers and people who work on performance-sensitive applications than you're implying, but I also think that there's really no such thing as a generic programmer anymore; everyone's specialized into their own domain and programmers from different domains have trouble understanding each other's needs, or understanding why the things they find essentially important are largely ignored by programmers in different domains. We all have a finite number of things that we can be experts in, so we tend to focus on the things that are right in front of us.


I would argue that java programmers need to know this since it is even more important with a garbage collected language. Java wastes so much that the Northbridge is basically saturated with GC traffic. Sticking to a few disciplined principles can make java performance competitive.


Exactly. As the saying goes, whenever I hear "what every programmer needs to know about ...", I unsafety my Browning.


That one is quite the evergreen (and probably needs a '2007'):

https://hn.algolia.com/?query=what%20every%20programmer%20sh...


Article is from 2007. (And as someone else writes, details have changed and the article is rather detailed.)


I wish that were the case. I wouldn't have to worry about DRAM latencies anymore.


[flagged]


I think it is very clear that he means the cited article. The post title should include the year (2007) since technology has changed somewhat in the 8 years since.


I disagree with your judgement about the grammaticality of the utterance. And if you've ever written comments about anything (for example, someone's writing or pictures), you'll probably say things like "me in Peru" or "needs revising" or "word is inappropriate".


I would never say, "word is inappropriate". I would say, "This word is inappropriate". It's clearer and there's almost no cost to using a well-formed sentence.

Similarly, I find that comments where the single word "this" is used as an entire sentence detract from a forum-based site. I can talk to u in allthememes! and SMS abbreviations, but I choose to save that register for chat and look for something a bit more correct on forums. I readily admit that HN has been slowly trending away from that for nearly its entire existence.


I was on the subway and saw "DON'T LEAN ON DOOR" and remembered this thread.

Your comment about "this" and internet shorthand and memes is not related to your grammaticality judgement, unless you are admitting to conflating "clarity" with "propriety".


Please don't. Blasting someone like that is rude at best, and outright mean when English is not their first language, as is the case with so many users here.


If there's anyone sensitive to the plight of a second language speaker, it's me. I can completely empathize with the struggle.

It probably wasn't the best way to phrase it and it's likely a losing battle anyway. I've found native speakers (which I believe the person I was responding to was) using this form more and more on boards such as this. Mistakes with uncountable vs plural distinctions are also increasingly common. E.g., "there's many reasons" instead of "there are many reasons".

As someone who spent years helping people with a variety of English problems including grammar, it's difficult to stop. I'll try to restrain this kind of comment to the articles themselves. There it's in the author's best interest to write in a way that doesn't cause the 30% or so of the population with stronger grammatical sensitivities to discount them.


I am strongly grammatically sensitive. And although I am not a native English speaker, I claim that the sentence is grammatically correct. It's just a shortened sentence form. Which is often appropriate on forums and similar.


Hmm... I wouldn't have put the period between "form" and "which" in your comment either. Doing so made your last sentence a sentence fragment.

What's your native language? Your English appears to be at least very near native.


Yes, I was aware of that.

Dutch, by the way.


I'm in much the same boat as you, and I don't think it's necessary to stop; what's necessary is to err on the side of being helpful as opposed to judgmental or defensive.


On a similar note, for the difference in timings and sizes on a scale you can relate to, is this:

https://plus.google.com/+PeterBurnsrictic/posts/LvhVwngPqSC


Memory product guy here.

Great content & relatively current. FWIW, the subject gets even more interesting if you consider how switches & routers manage the flow of packets. The memory hierarchy gets even more esoteric.

Regarding the "this is too low level for programmers" comment: only if you don't need to understand how latency, bandwidth & power works at a fundamental level. Every programmer I know wants to understand this stuff.


there are just three things i wish programmers knew about memory... none of them are especially low-level or require depth of understanding.

never dynamically allocate it unless forced.

really. never allocate it at run-time unless you are absolutely forced to by your algorithm.

if you absolutely really must allocate memory on the fly, have a budget so you can do one big up-front allocation and carefully reuse it. even better, don't allocate it to start with.
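The "budget once, reuse carefully" idea above can be sketched even in a high-level language (the names and the budget size here are illustrative, not from the thread):

```python
BUDGET = 64 * 1024            # fixed memory budget, decided up front
scratch = bytearray(BUDGET)   # the one big up-front allocation

def process_chunk(chunk, out=scratch):
    # Reuse the preallocated buffer instead of allocating per call;
    # exceeding the budget is an explicit error, not a silent grow.
    if len(chunk) > len(out):
        raise ValueError("chunk exceeds the memory budget")
    out[:len(chunk)] = chunk
    return memoryview(out)[:len(chunk)]

view = process_chunk(b"hello")
print(bytes(view))  # b'hello'
```

The same pattern shows up as `readinto()` in Python's I/O layer: hand the library your buffer rather than letting every call allocate a fresh one.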


I feel like you are projecting your domain specific experience on a larger more general audience than is appropriate.


you are probably right


That will basically prohibit using most of C++ STL, for example. Premature optimization is the root of...


i'm not a fan of that religious statement. it's assuming it is premature.

STL is not too bad either. you can make it work with budgets too... implementing a custom allocator is a pain, but there are lots of ways to not write code that imagines that memory is an infinite resource...


I'll remember that the next time I'm using python.


Or Java or go or ruby or Scala or php or....


Still very much relevant in Java if you don't want your GC to churn like mad.

Heck even some of the cache aware algorithms can be taken advantage of with some smart use of ByteBuffers to get a 10-50x improvement in performance.
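The ByteBuffer trick is Java-specific, but the underlying idea (flat, contiguous typed storage instead of a graph of heap objects) translates. A Python analogue using the stdlib `array` module, with illustrative data:

```python
from array import array

# Structure-of-arrays: coordinates packed contiguously in typed buffers,
# rather than thousands of individually heap-allocated point objects.
n = 1000
xs = array("d", (float(i) for i in range(n)))
ys = array("d", (0.5 * i for i in range(n)))

def centroid(xs, ys):
    # Sequential scans over contiguous memory are exactly what caches reward.
    return sum(xs) / len(xs), sum(ys) / len(ys)

cx, cy = centroid(xs, ys)
print(cx, cy)  # 499.5 249.75
```

The design choice is the same as with ByteBuffers: keep the hot data dense and scan it linearly, and let the objects live at the edges.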


Few applications need this... mostly in the financial space. The GC is quite good at quick temp objects these days.


i'm not sure using a dynamic, late-bound scripting language is an excuse for pretending memory is infinite. :)

but yes... i do have frustration from "highly skilled native programmers" who seem to think that memory is a boundless resource and then complain that something like 32GB is a constrained memory environment when it is the exact opposite.


You do know that function stack frames are dynamically allocated (just not from the heap in most C and C++ implementations).


in a sense yes, but it's not a realistic way to use the term.

what happens is that the stack is dynamically allocated once, up-front, when the thread is created, and an index into that blob of memory (the stack pointer) is incremented and decremented as appropriate.

you can call that allocation, and it would be accurate, but it's not what i meant at all. it also avoids growing your memory consumption and doesn't impact memory budgets no matter how much stack you try to use (you will overflow it instead... and will need to allocate a bigger budget up-front for the stack with a compiler flag).
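That stack-pointer mechanism can be sketched as a toy bump allocator (purely illustrative; a real stack is managed by the compiler and hardware, not a class like this):

```python
class BumpArena:
    """Toy bump allocator: one up-front block, a moving offset, no frees.
    This mirrors how a thread's stack works: 'allocation' is just an
    index increment into memory reserved once at creation time."""

    def __init__(self, size):
        self.buf = bytearray(size)  # the single up-front allocation
        self.top = 0                # the "stack pointer"

    def alloc(self, n):
        if self.top + n > len(self.buf):
            # analogous to stack overflow: the budget is fixed
            raise MemoryError("arena exhausted")
        view = memoryview(self.buf)[self.top:self.top + n]
        self.top += n
        return view

    def reset(self):
        # "popping" everything: reclaim by resetting the offset
        self.top = 0

arena = BumpArena(1024)
a = arena.alloc(16)
b = arena.alloc(32)
print(arena.top)  # 48: two allocations consumed 48 bytes of the budget
```

Allocation and wholesale deallocation are both O(1), which is exactly why stack allocation is so cheap compared to a general-purpose heap.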


Maybe he is coming from the fact that lots of things still leak memory this day and age. And you tend to not leak memory when you avoid dynamic allocation?

Though you can still build up crud depending upon the system.


It would help to give reasons why rather than simply commandments.

I'll give one: performance takes a hit with dynamic allocation. Deallocation also has to occur, and if you use a GC you may experience pauses as this happens. Your GC may vary. The performance impact of dynamic allocation may of course be completely acceptable when traded against productivity gains.


true. i wasn't just thinking about performance but also budgeting.

using memory as an endless resource without thinking seems to be standard practice rather than something that is considered "sloppy".

there is also the practice involved. run-time allocation is a bit of a golden sledgehammer and a double-edged sword... sure, you can write code more quickly, but you are also being enabled to write worse code (i.e. leaks). i don't think we should take the power away, but respect it more...


Definitely start off with a huge preamble about the olden days and document structure and how to report problems. Sheesh


wait.. I have to know ALL of this?! is there a TL;DR?


Again?



