
So what's wrong with 1975 programming? (2008) - nwjsmith
https://www.varnish-cache.org/trac/wiki/ArchitectNotes
======
coffeemug
_So what happens with Squid's elaborate memory management is that it gets into
fights with the kernel's elaborate memory management, and like any civil war,
that never gets anything done._

This quote, much like various quantum mechanics quotes adopted by laymen,
keeps haunting honest systems programmers because people with a little bit of
knowledge read it, misinterpret (or misunderstand) it, and then share it.

Look, I don't know how Squid is designed, but most database systems use this
strategy and it _does not_ get into wars with the kernel for a whole slew of
reasons that aren't addressed in the article. I know, because we've done a ton
of sophisticated benchmarking comparing custom use case cache performance to
general purpose page cache performance. Here are a few of the many, many
reasons why this quote cannot be applied to sensibly designed pieces of
systems software:

1\. If the database/proxy/whatever server is designed correctly, it'll always
use just enough RAM that it won't go into swap. That means the kernel won't
magically page out its memory preventing it from doing its job.

2\. In fact, kernels provide ways to _guarantee_ this (such as mlock); see
the sketch after this list.

3\. Also, if your process misbehaves, modern kernels will just deploy the OOM
killer (depending on how things are configured), so you can't just get into
fights with the page cache without being sniped.

4\. Of course you have to be smart and read from the file in a way that
bypasses the page cache (via O_DIRECT). Yes, it complicates things greatly
for systems programmers (all sorts of alignment issues, journaling
filesystem issues, etc.) but if you want high performance, especially on
SSDs, and have special use cases to warrant it, it's worth it.

5\. If you really know what you're doing, a custom cache can be significantly
more efficient than the general purpose kernel cache, which in turn can have
a significant impact on bottom-line performance. For example, a b-tree aware
caching scheme has to do less bookkeeping, and has more information to base
eviction decisions on, than a general purpose LRU-K cache.
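
To make points 2 and 4 concrete, here is a minimal sketch (Linux-flavored,
error handling mostly elided; "data.db" is a made-up file name):

    /* pin a cache region with mlock, then read one block with O_DIRECT */
    #define _GNU_SOURCE              /* O_DIRECT is a Linux extension */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        /* point 2: pin 64 MB so the kernel never pages it out
           (may require raising RLIMIT_MEMLOCK) */
        size_t cache_sz = 64ul << 20;
        void *cache = malloc(cache_sz);
        if (cache == NULL || mlock(cache, cache_sz) != 0)
            return 1;

        /* point 4: O_DIRECT bypasses the page cache, but buffer,
           offset, and length must be aligned to the logical block
           size (typically 512 or 4096 bytes) */
        int fd = open("data.db", O_RDONLY | O_DIRECT);
        if (fd < 0)
            return 1;
        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0)
            return 1;
        if (pread(fd, buf, 4096, 0) == 4096)
            memcpy(cache, buf, 4096);   /* hand the block to our cache */
        close(fd);
        return 0;
    }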

In fact, it is absolutely astounding how many 1975 abstractions translate
wonderfully into the world of 2012. Architecturally, almost everything that
worked back then _still_ works now, including OS research, PL research,
algorithms research, and software engineering research -- the four pillars
that are holding up the modern software world. Some things are obsolete,
perhaps, but far, far fewer than one might think.

Incidentally, this is also one of the reasons why I cringe when people say
"the world is changing so fast, it's getting harder and harder to keep up". In
matters of fashion, perhaps, but as far as core principles go (in computer
science, mathematics, human emotions/interaction, and pretty much everything
else of consequence) the world is moving at a glacial pace. Shakespeare might
be a bit clunky to read these days because the language is a bit out of style,
but what Hamlet had to say in 1600 is, amazingly, just as relevant today (and
likely much more useful, because instead of actually reading Hamlet, most
people read things like The Purple Cow, The 22 Immutable Laws of Marketing,
The 99 Immutable Laws of Leadership, etc.)

~~~
ChuckMcM
As others have noted, the rant is from 2008, which is interesting because
this was the transition point from 32 bit OSes to widespread adoption of 64
bit aware OSes [1]. You can see lots of folks who are just getting their feet
wet with 64 bit Linux [2], and configured memory sizes are getting up over
4GB of 'real' memory.

One of the wonderful things about 64 bit address spaces? You don't ever have
to re-use an address. Once folks figure that out you can do some amazing
things that would have seemed stupid in the 70's: you can _hard code_ the
address for every single library function on your machine. Can you even
imagine how weird that would be? Linking time would be instantaneous. Calling
printf? It's always at 0x10015081aaf10000. All of libc? Starting at
0x1000000000000 and working up. One giant database of the 'standard place' to
put every single function. Remember when the 32 bit OS would put the kernel
at 0x80000000? You know, right on the 2GB border: above that, kernel space;
below that, user space.

Anyway, I completely concur that abstractions that worked before, work
wonderfully today. But I also don't worry about an array of 100M items on a
server in RAM anymore. Using mmap to map a 1G file into the address space? Not
a problem.
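
For instance (a sketch assuming a pre-existing 1GB "bigfile.bin"; error
checks trimmed):

    /* map a 1 GB file and treat it like an ordinary in-RAM array */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("bigfile.bin", O_RDONLY);
        if (fd < 0)
            return 1;
        size_t len = 1ul << 30;          /* 1 GB */
        uint8_t *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            return 1;
        /* the kernel faults pages in on demand; on 64 bit, address
           space on this scale is essentially free */
        printf("%d\n", p[123456789]);
        munmap(p, len);
        close(fd);
        return 0;
    }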

It's interesting to watch the behavior of systems when they are essentially
all in memory.

[1] <http://www.tomshardware.com/reviews/vista-workshop,1775-3.html>

[2] <http://blekko.com/ws/?q=64+bit+linux+%2Fdate%3D2008>

~~~
mrb
0x80000000 on Windows. Linux i386 sets the boundary at 0xc0000000 :) This is
yet another aspect where Linux made a better choice than Windows, as this
gives processes 3GB of virtual memory instead of only 2GB.

~~~
mauvehaus
Incidentally, you can change this on Windows i386 in the boot.ini [1] file
using the /3GB option. This is for exactly the case where you want your
database software (for instance) to be able to address more user-mode memory.

As you might imagine, it occasionally trips up badly written kernel-mode code
that makes assumptions about what type of address it's dealing with based on
whether the MSB is set or not.

[1] <http://support.microsoft.com/kb/833721>

~~~
marshray
Ah yes, I remember that 6 months of Moore's law where my Windows XP32 machine
had 4GB of RAM. My code project grew to need the /3GB switch in order to
compile successfully (until we just moved to 64 bit dev boxes).

------
jessedhillon
The first few lines mentioned "acoustic delay lines" which piqued my interest.
Wikipedia has a page on this old technology:
<http://en.wikipedia.org/wiki/Delay_line_memory#Acoustic_delay_lines>

It was a pretty amazing hack, before magnetic memory cores. Because sound
moves at a slow rate through a medium like mercury, an acoustic wave (that
is, a sound) could be applied to one side of a volume of mercury and be
expected to arrive at the other end after a predictable, useful delay. So a
column of mercury would be fitted with transducers on both ends, functioning
as speakers and microphones, which in an acoustic medium are the equivalent
of write and read heads!

The system memory would be a collection of these columns, each I guess storing
one bit. The memory would of course have to be refreshed: when the signal
arrived at the other end, it would be fed back into the column, assuming I
suppose that there wasn't a new signal waiting to be written to that bit
instead. The article mentions that this was not randomly accessible memory,
but rather serially accessible. From that and other bits of information, I
gather that the device would visit each bit in sequence, according to some
clock, and produce a signal on the read line corresponding to the value in
that bit. You had to wait for the memory device to read out the particular bit
you were waiting for.
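
If I've got that right, then in software terms each line behaves like a
recirculating shift register. A toy model (my reading, certainly not
hardware-accurate):

    /* toy delay line: bits circulate, and a bit can only be read or
       overwritten at the instant it passes the transducer */
    #include <stdio.h>

    enum { LINE_BITS = 512 };        /* arbitrary line length */
    static unsigned char line[LINE_BITS];
    static int head;                 /* bit currently at the read end */

    /* one clock tick: recirculate (refresh) or replace the bit */
    static int tick(int write, int value) {
        int out = line[head];
        if (write)
            line[head] = (unsigned char)value;
        head = (head + 1) % LINE_BITS;
        return out;
    }

    /* serial access: reading bit n means waiting for it to come by */
    static int read_bit(int n) {
        while (head != n)
            tick(0, 0);
        return tick(0, 0);
    }

    int main(void) {
        while (head != 100)          /* wait for position 100 ... */
            tick(0, 0);
        tick(1, 1);                  /* ... then write a 1 there */
        printf("bit 100 = %d\n", read_bit(100));
        return 0;
    }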

Does anyone know if this is a correct understanding of how this kind of
storage worked? What a cool way to store bits!

~~~
dcminter
The Hodges biography of Turing has lots of in-passing mentions of fascinating
technology like this. I think my favourite was the use of a CRT as a memory
array (by picking up the charge on the fluorescent screen and feeding it back
to the electron gun to refresh it!) which suggested to Turing the idea of
using light to stimulate the feedback cycle and thus writing directly to
memory with a very real "light pen"!

I'm probably borking up the details there but my point that the biog is great
stands.

I guess there are readers of HN who never encountered the later "light pens."
A photodiode picks up the raster on a CRT based monitor and with appropriate
timing logic uses this to decide where to draw pixels. I had a cheap light pen
on an 80s microcomputer before I ever got my hands on a mouse.

Ah, happy (but often frustrating) days...

~~~
ZoFreX
> A photodiode picks up the raster on a CRT based monitor and with appropriate
> timing logic uses this to decide where to draw pixels

Incidentally this is how "light guns" worked, and also why they sadly do not
work on LCD or plasma screens.

~~~
bonyt
Once I made a program that was like duck hunt, except you held a webcam at the
screen. When you fired, the screen flashed black with a patch of white where
the duck was, the camera took a snapshot and looked for how close to the
center of the webcam's frame the white was to decide if you had hit it or not.
It worked surprisingly well.

~~~
ZoFreX
Awesome! I think some light gun games used this method too :)

------
_delirium
Earlier discussion, fwiw (though it was 2 1/2 years ago):
<http://news.ycombinator.com/item?id=1554656>

Among other things, contains an interesting alternate perspective from a
former Squid developer, about some of Squid's design decisions, some of which
were driven by a goal of being maximally cross-platform and compatible with
all possible clients/servers. Others were driven by the fact that Unix VM
systems were still _not_ very good much more recently than 1975 -- as late as
the 1990s.

------
marshray
_Well, today computers really only have one kind of storage, and it is usually
some sort of disk, the operating system and the virtual memory management
hardware has converted the RAM to a cache for the disk storage._

I used to think that too. Specifically Windows NT was said to need a pagefile
at least as large as physical RAM. This was back when a workstation might have
16MB RAM and a 1GB disk. I thought this was because the kernel might be
eliminating the need for some indirection by direct mapping physical RAM
addresses to pagefile addresses. I was wrong.

On the Linux side, you would typically see the recommendation to make a swap
partition "twice the size of RAM". Despite the possibility of using swap
files, most distros still give dire warnings if you don't define a fixed-size
swap partition on installation.

I don't think there was ever a solid justification for this "twice RAM"
heuristic. A better method might be something like "max amount of memory
you're ever going to need minus physical RAM" or "max amount of time you're
willing to be stuck in the weeds divided by the expected disk bandwidth under
heavy thrashing".

Regardless, if your server is actively swapping _at all_ you're probably doing
it wrong. It's not just that swapping is slow; it's that your database or your
web cache has special knowledge about the workload that, in theory, should
allow it to perform caching more intelligently.

I'd prefer to disable swap entirely, but there are occasions where it can make
the difference in being able to SSH into a box on which some process has
started running away with CPU and RAM.

But this guy is a kernel developer so he seems to feel that the kernel should
manage the "one true cache". I like the ease and performance of memory-mapped
files as much as the next guy, but I wouldn't go sneering at other developers
for attempting to manage their disk IO in a more hands-on fashion.

~~~
mauvehaus
On NT, if the kernel bugchecks and it's configured to do a full or kernel
memory dump, it dumps to the pagefile [1] (which gets copied elsewhere after
you reboot). If you were looking for another good reason to have a pagefile
the size of your physical memory on NT, there you go :-)

[1] <http://support.microsoft.com/kb/254649>

~~~
marshray
Yep, me and like 3 other guys I know actually do look at kernel minidumps and
have a big pagefile for that reason.

------
georgemcbay
tl;dr - Premature optimization is (still) the root of all evil.

I'm not familiar with squid, but I'm quite familiar with the idea of
programmers writing their own systems on top of other systems that are
basically a worse implementation of something the underlying system is already
doing.

To my chagrin, I occasionally catch myself doing this sort of thing when I'm
first moving into a new language/API/concept and don't really understand what
is going on underneath.

It is always a good idea to try the simplest thing that could possibly work
first, and then measure it, and only then try to improve it and always make
sure you measure your "improvements" against the baseline. And make sure
you're measuring the right things. I think this is a concept most developers
are aware of but one of those things you have to constantly checklist yourself
on because it is too easy to backslide on.

~~~
qznc
One reason to reimplement something on top is portability. The tradeoffs
between portability and performance are hard.

~~~
fleitz
It's still generally a bad idea; it's much better to write an interface and
then provide multiple implementations that take advantage of the native
features of the OS.

I think we all remember the last language that decided to be so portable it
shipped its own reimplemented GUI toolkit with the standard library.

~~~
Deestan
> it's much better to write an interface and then provide multiple
> implementations

That _is_ a reimplementation.

------
mikeash
One thing to keep in mind with this talk of virtual memory is that current
smartphones and tablets have basically regressed to the 1975 model when it
comes to swap. That is to say, there isn't any. If you take the approach of
"allocate plenty of memory and let the kernel sort out what should go to disk"
then you'll end up killed by the OS if you're running on e.g. an iPhone,
because there's no swap.

~~~
LnxPrgr3
Coming from iOS:

You still get memory-mapped file I/O. There's no swap file, but you can still
map files into virtual memory and the OS can page pieces in and out as
necessary.

Virtual memory isn't just about swap.

~~~
mikeash
I know that VM is more than just swap, but as the term was used in this
article, it pretty much only meant swap.

~~~
LnxPrgr3
"Varnish allocate some virtual memory, it tells the operating system to back
this memory with space from a disk file."

I'm pretty sure the article's talking specifically about mapping a gigantic
file into memory and pretending it's all in RAM, and isn't talking about swap
at all.

~~~
mikeash
It's hard to tell whether they mean that it's done explicitly or not:

"...all we need to have in Varnish is a pointer into virtual memory and a
length, the kernel does the rest."

If you're manually memory mapping stuff, you'd need more than that. In any
case, the first part of the article is definitely talking about swap when it
comes to fighting with the kernel over whether something should be in RAM or
on disk. Explicitly memory mapping a large file will work on iOS, although the
lack of sparse file support on the filesystem would seem to make it painful.
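
The explicit version is pretty small, for what it's worth. Something like
this sketch (not Varnish's actual code; "storage.bin" is made up) gets you a
pointer whose backing store is a disk file:

    /* file-backed storage: after the mmap it's "just memory", and the
       kernel pages it in and out against the file as it sees fit */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t len = 1ul << 30;                      /* 1 GB arena */
        int fd = open("storage.bin", O_RDWR | O_CREAT, 0644);
        if (fd < 0 || ftruncate(fd, (off_t)len) != 0)
            return 1;    /* ftruncate leaves the file sparse where the
                            filesystem supports it */
        char *arena = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (arena == MAP_FAILED)
            return 1;
        memcpy(arena + (100ul << 20), "cached object", 14);
        munmap(arena, len);
        close(fd);
        return 0;
    }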

------
crazygringo
Wait a minute -- I'm not a sysadmin guy, but all the servers I've ever dealt
with had swapping / virtual memory turned off, because you'd rather have a
web request fail than start churning things on disk.

When you're dealing with a web cache, don't you want to explicitly know
whether your cache contents are in memory or on disk, and be able to fine-tune
that? It seems like the last thing you want is the OS making decisions about
memory vs disk for you. Am I missing something?

~~~
wmf
_When you're dealing with a web cache, don't you want to explicitly know
whether your cache contents are in memory or on disk, and be able to fine-tune
that?_

The point of this article is that you think you want to control that, but you
actually don't. The kernel can probably do a fine job. (Of course, PHK
develops the kernel, so when it doesn't do what he wants he can just change
it. Many others are not so lucky.) PHK is telling normal programmers to follow
the rules; extraordinary ninja rockstar programmers are smart enough to know
when to break the rules. If you want extremely high performance you should
manage everything yourself, but this is so difficult to do right that
documenting how to do it would just encourage people to shoot themselves in
the foot.

On a practical note, which is faster, HAProxy or Varnish?

~~~
miah_
HAProxy isn't a cache. It's not a good comparison.

------
javajosh
This article is one of those interesting things that doesn't affect me
directly because I don't do systems programming, but holds a great deal of
fascination. I've often wondered about how the kernel allocates memory and
deals with disk, and how that affects the behavior of an application that may
do its own memory allocation.

In object oriented programming there is a thing called a CRC card[1] where you
list what the responsibilities of important classes are. This helps the
developer visualize and understand how the system works, and to keep things as
orthogonal as practical. Here we have an example of someone pointing out that
the _system-level_ "CRC cards" are stepping on each other's toes. Pretty
compelling stuff.

An aside - would there be any benefit to using `go` rather than `c` for
writing something like varnish if you were starting in 2012?

[1] <http://en.wikipedia.org/wiki/Class-responsibility-collaboration_card>

~~~
SwellJoe
"An aside - would there be any benefit to using `go` rather than `c` for
writing something like varnish if you were starting in 2012?"

go isn't fast yet, from what I can tell. So, you'd be writing a slow proxy,
for the time being.

But, from a simplicity of design perspective, yes, it'd be awesome to work in
a language with really good concurrency primitives. Proxies are an ideal
example of an embarrassingly parallel problem; a thousand simultaneous users
is a thousand independent tasks with very little shared state. go is designed
for exactly this sort of task. And, in five years, by which time go will
probably be really fast, you'll have a simple and fast proxy server.

~~~
jff
Go's pretty damn fast now. It's fast enough that we've got physicists writing
simulations in it. It's fast enough that a lot of the people I know (systems
and HPC guys) just write in Go unless they have to write in anything else.
I've personally used it to write cluster management tools, a cpu-intensive
simulation, a modem emulator, and a decent handful of servers.

If nothing else, it's so easy to do concurrency that I end up hiding a lot of
slowness.

~~~
azth
> It's fast enough that we've got physicists writing simulations in it.

Cool, source?

~~~
jff
Sadly not available yet, sorry. It's a hassle to get new projects open-
sourced, and given that HPC codes traditionally require that you have a
million-dollar supercomputer anyway...

~~~
azth
I meant the source of what you mentioned, not the source code :) Like are
there any articles or papers?

------
halayli
I might be missing something here but assuming varnish is using
mmap()+madvise(), accessing memory might block the thread until the page fault
is served, which is not ideal for a user-facing server.

If you manage your own memory/swap, at least you can use async IO and free up
the thread while the IO request is being served by the OS.
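
The closest you get without managing it all yourself is advisory prefetch;
something like this sketch reduces (but doesn't eliminate) the chance that a
later access blocks on a major fault:

    /* hint the kernel to start paging a region in before a worker
       thread touches it */
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    void prefetch(void *base, size_t off, size_t len) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t skew = off & (page - 1);    /* madvise needs a
                                              page-aligned address */
        madvise((char *)base + (off - skew), len + skew, MADV_WILLNEED);
    }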

~~~
reynolds
Varnish is heavily threaded. It maintains queues where connections are put and
worker threads pull them out. It is expected that a single connection gets a
dedicated thread.

~~~
halayli
Since each thread is an actual kernel thread, this will limit the number of
concurrent connections to the maximum number of threads a kernel can handle,
which isn't that high.

~~~
wmf
Linux can create over 250,000 threads, but that may have been on a 32-bit
system. On 64-bit it should be limited only by RAM.

~~~
halayli
The overhead of context switching becomes pretty high. Some say that context
switching has become cheap, but at the very least you still need to update
the TLB and schedule the next pthread.

~~~
marshray
At least the performance of context switching should scale with the number of
cores, which seems to be the main direction of increased performance in
hardware looking into the future.

------
antirez
Sometimes it is a good idea, but it works only if:

1) You have a threaded implementation; otherwise your single thread blocks
every time you access a page that's on the swap.

2) You have decently sized contiguous objects. If instead a request involves
many fragments of data from many different pages, it is not going to work
well.

There are other issues but probably 1 & 2 are the most important.

------
miah_
Oh joy.

“these days so small that girls get disappointed if they think they got hold
of something else than the MP3 player you had in your pocket.”

An otherwise interesting article.

------
stcredzero
Here's the thing wrong with any kind of programming. The "best" way is highly
contextual. Your situation, the OS, the hardware, the problem domain, the
target market -- these all change the situation and bring their own particular
trade-offs. There will always be something "wrong" with the way most anyone
programs from the point of view of somebody not familiar with a particular
situation.

------
guilloche
I wish everything this guy said were true. I desperately wish a perfect
virtual memory could relieve me from all the pains of caching.

Take an example: in a word processor, can we just keep all possible cursor
positions (for moving the cursor around), all the line-breaking and
page-breaking info, and each character's location in virtual memory?

~~~
jff
Moving the cursor in a word processor is a people-time operation; as long as
you can manage it in 100ms or so, it's "good enough". That shouldn't be hard,
but somehow LibreOffice still manages to take multiple seconds to update the
cursor location...

~~~
guilloche
Let me explain more of the details of moving a cursor:

In a terminal, usually mono-width fonts with a fixed font size are used, so
cursor positions can be computed easily.

But for a word processor, variable-width fonts with variable font sizes are
more common. To move a cursor to its next position, we need to know the width
of the current character; otherwise we don't know where to show the cursor.
Should we cache these widths or compute them on the fly? It is more
complicated in languages where a word renders differently than its characters
would individually. Should we cache character positions for these words?
Should we keep them all in virtual memory?
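
One middle ground I can imagine is caching only per-(character, font, size)
advance widths and re-deriving positions line by line. A rough sketch, where
measure_glyph() stands in for the real font engine (and which ignores the
word-shaping problem above):

    /* memoize advance widths instead of storing every cursor position */
    #include <stdint.h>

    float measure_glyph(uint32_t ch, int font, float size); /* expensive */

    enum { SLOTS = 4096 };
    static struct {
        uint32_t ch; int font; float size, width; int valid;
    } cache[SLOTS];

    float advance_width(uint32_t ch, int font, float size) {
        uint32_t h = (ch * 2654435761u ^ (uint32_t)font) % SLOTS;
        if (!(cache[h].valid && cache[h].ch == ch &&
              cache[h].font == font && cache[h].size == size)) {
            cache[h].ch = ch; cache[h].font = font; cache[h].size = size;
            cache[h].width = measure_glyph(ch, font, size);
            cache[h].valid = 1;
        }
        return cache[h].width;
    }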

A computer may not even have a swap partition allocated; can an application
just refuse to run in that case?

~~~
jff
I understand that a word processor faces greater challenges in cursor
positioning than a simple text editor -- that's why I write my documents in
LaTeX :)

As for swap, I always figure that if you're hitting swap, you're dead. It's
time to buy more RAM. I've had instances where running Firefox and a kernel
compile at the same time on my 4 GB laptop turned it into an absolute
thrashfest -- I have to walk away for 15-30 minutes, because I cannot do
_anything_ at that point. Applications, even plain old xterm, are big enough
that swapping them in and out makes the machine unusable.

~~~
guilloche
+1 for latex.

------
khitchdee
All this abstraction with garbage collection and virtual memory etc. is only
taking us further away from the hardware. In some ways it's good to think
like a 1975 programmer, because you are acknowledging the fact that there's
hardware underneath. If you completely ignore that and rely on the
abstractions provided to you by an OS layer, the end result is a system that
uses resources very wastefully. Look at how much software has bloated in the
last 35 years. A large reason for that is the amount of abstraction of the
hardware and lower layers of software we've started relying on. The more
abstraction you use, the easier your job becomes, but it also results in a
less lean system.

~~~
chris_wot
My goodness, non-sequiturs wrapped in inconsistencies disguised by absurdity!

Virtual memory uses CPU traps to do page faults in x86 land. Abstractions are
useful and in fact even if you go all the way to assembly, you've still got a
thin abstraction over the hardware. If you don't have an operating system,
then no software will be written. If you look at software over the last 35
years, you'll notice it has become easier to use, and it does more complex
things.

~~~
khitchdee
The memory, hard disk, and CPU speed requirements of modern-day PCs have also
gone up substantially. I wouldn't call them lean. That, in my opinion, is
where the future lies, and to get there we have to get closer to the
hardware, not abstract it away further. Most CS-background programmers tend
to leave the details of the hardware to the OS's abstraction layers, thereby
isolating themselves from the hardware designs of the EEs. In the future, we
need design teams that do both. That's driven by the lean requirement.

------
taylorbuley
This is a really interesting talk by the author of this article and program:
<http://archive.org/details/VarnishHttpCacheServer>

------
ccleve
This article is wrong, just wrong.

I would love it if there were just one kind of storage, and my code could
ignore the distinction between disk and memory. But it can't, for three
reasons: 10 ms seek times, RAM that is much smaller than disk, and garbage
collection.

10 ms seek times mean that fast random access across large disk files just
isn't possible: at 10 ms per seek you get only about 100 random reads per
second. There is a vast amount of literature and research devoted to getting
over this specific limitation. And it isn't old, either: all of the recent
work on big data is aimed at resolving the tension between sequential disk
access, which is fast, and random access, which is required for executing
queries.

RAM that is smaller than disk means that virtual disk files don't work very
well when you have large data files. If you try to map more than the amount of
physical RAM you get a mess:
<http://stackoverflow.com/questions/12572157/using-lots-of-mappedbytebuffers-in-read-write-mode-slows-down-windows-7-to-a-cr>

Garbage collection means that it is easy to allocate a bit of memory and then
let it go when the reference goes out of scope. There's no need to explicitly
deallocate it. It's one of the things that makes modern programming
efficient. With disk, you don't get that; if you write something, you've got
to erase it or the disk fills up.

In short, this guy's casual contempt for "1975 programming" is irksome,
because it's clear that he isn't working on the same class of problems that
the rest of us are. He may be able to get away with virtual memory for his
limited application, but the rest of us can't.

~~~
chris_wot
What is a "virtual disk file"?

~~~
ramidarigaz
I think he is referring to mmap? Not entirely sure though...

~~~
chris_wot
Perhaps conflating swap files with virtual memory?

~~~
jlgreco
mmap makes use of the virtual memory system. The author does not appear to be
conflating anything.

~~~
chris_wot
Yeah, that would be great if a. the link was to a question on StackExchange
about mmap, but it's not, and b. if the use of mmap was commonly known as
using a "virtual disk file", but again, it's not.

------
hakaaak
5.2% of the world's top 10,000 websites use it, as of July 11, 2012:
<http://royal.pingdom.com/2012/07/11/how-popular-is-varnish/>

So the question is: if it is so great, why only 5.2%? I'm not being
sarcastic. This is a totally serious question.

~~~
InclinedPlane
Every site is different. Varnish is great for a particular fairly common use
case, but it's not an all-purpose "make site go faster" button. For example,
putting Varnish in front of a CMS is usually a fantastic idea, as is putting
it in front of anything that has a high ratio of reads vs. writes and serves
the same pages to multiple users. That could be anything from a content blog
or site to an online store. However, for other types of sites it doesn't make
as much sense. A site like facebook or twitter would gain almost no advantage
from it, since the overwhelmingly most common use case is for every single
user to receive _different_ pages on every single visit. Similarly, it
doesn't make sense for search engines, or for web mail apps, etc.

Also, most really large sites have probably already developed some other
method of caching if it suits their site needs, so it wouldn't make sense for
them to switch over to varnish all of a sudden.

~~~
dlisboa
Facebook uses Varnish, and so does Twitter. They use it where it makes sense,
where reads are high and content is less dynamic. To say they'd gain almost
no advantage from it is an oversimplification, as they have varied
requirements and some of those do indeed benefit from caching.

------
jwilliams
If you take the premise of this article literally, then since 1975 computers
have gotten inordinately more complex, but we've developed no abstractions to
help programmers deal with it.

~~~
dalore
Isn't it the abstraction (virtual memory) that's creating the problem in the
first place, by programmers not understanding that an abstraction has been
applied?

~~~
chris_wot
No. Virtual memory has been known about since the early '60s; the x86
architecture only picked it up in the mid-'80s with the 386 CPU.

The problem with programmers not knowing how the operating system works is
not the fault of the operating system; it's the fault of the developers.

------
mcfunley
[2008]

------
guilloche
The article is misleading, and the author has no clue at all about the
complexity of user-space memory management. Random on-disk virtual memory
access will be a disaster if we just keep everything in so-called virtual
memory without a complicated cache mechanism.

~~~
dkokelley
Let's not resort to attacks on the author. Consider that this article is at
least 4 years old.

------
martinced
I understand he's a kernel developer, but to me this sounds exactly the same
as what people have kept repeating for years:

"Don't create a ramdisk (a true, fixed size, one, that you prevent from ever
getting to disk) because the (Linux) kernel is so good and so sentient that
you won't gain anything by doing that"

Yet anyone who compiles big projects made of thousands of source files from
scratch knows that it's much faster to write the compiled files to the
ramdisk.

I can't tell you how many times I've seen this argument between "pro 'kernel
is sentient'" and "pro 'compile into a real ramdisk'", but I can tell you
that, from experience (and it's hard to beat that), the ramdisk Just Works
[TM] faster than the 'sentient kernel'.

So how is it different this time?

~~~
dchest
You mean, there's a performance difference between writing to RAM, and writing
to RAM and flushing writes to disk every few seconds? Doh!

------
smegel
Fuck this guy gets on my nerves, acting like he's the only person in the world
who knows what virtual memory is or that paging is some kind of dark magic
only understood by kernel developers, rather than standard subject matter for
any intro to computer architecture/OS concepts course.

~~~
gliese1337
Consider: what proportion of programmers have taken an OS course? Specifically
OSes, 'cause I know that virtual memory was not covered in any of the computer
architecture courses I've taken.

Now, how many of those programmers who know about virtual memory (larger than
the number who have taken a course on operating systems, but still far from
100%, I'd wager) have actually realized "hey, virtual memory can make it so
that I never have to explicitly write stuff to disk?" _I_ certainly hadn't.

Just because you expect people to know basic concepts doesn't mean that an
article explaining what they're useful for is useless. Quite the opposite, in
fact.

~~~
LnxPrgr3
My experience has been that most programmers are more or less clueless about
how virtual memory works. At best, they understand it as letting you swap out
memory to disk when memory runs low.

Even ignoring Kamp's "mmap the world" approach, if I brought up any of his
other ideas in a meeting with most programmers, there'd be immediate cries of
premature optimization (and reinventing the wheel, in the case of using a
single malloc'd chunk for workspaces). Never mind what we're talking about
building or what we already know about the performance of different
approaches, and never mind that a lot of important performance decisions are
architectural and are a lot harder to change later on.

These ideas just aren't on most programmers' radar; to them it's all evil
voodoo to be avoided at all costs.

How many programmers know the effect of a write from one CPU on the next read
from another CPU on the same cache line? How many programmers know the
relative cost of a syscall vs. a function call? How many programmers ever
think about optimizing their use of CPU cache?
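
To make the first one concrete, a sketch (assuming 64-byte cache lines): two
hot counters sharing a line ping-pong between CPUs on every write, and simply
padding them apart removes the false sharing:

    #include <stdint.h>

    struct counters_bad {
        uint64_t a;                        /* written by thread 1 */
        uint64_t b;                        /* written by thread 2; same
                                              64-byte line, so each write
                                              invalidates the other CPU's
                                              copy */
    };

    struct counters_good {
        uint64_t a;
        char pad[64 - sizeof(uint64_t)];   /* push b into its own line */
        uint64_t b;
    };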

Most of the time they get away with ignoring these things because they really
don't matter in context. But sometimes they don't get away with it because
these things do matter, and in those moments I wish more programmers had a
better understanding of their machines and their operating systems.

