
When a disk cache performs better than an in-memory cache - douche
http://www.productiverage.com/when-a-disk-cache-performs-better-than-an-inmemory-cache-befriending-the-net-gc
======
bluejekyll
> [the GC] ...it's keeping you from the misery of manual memory management and
> that you're better to consider it an ally than a foe

I used to think this way, then I tried Rust. After C and C++, I never thought
I'd want to give up the GC, but to avoid the issues that this article talks
about, I wanted to go back to a systems language.

Now I want to use Rust for everything.

~~~
saynsedit
RAII in Rust comes from C++. If you're using new/delete in C++, you're doing
it wrong.

~~~
rcfox
Not everything can live on the stack.

If you're not using new/delete in C++, you're probably not doing something
very complicated. (And hey, that's not necessarily a bad thing!)

~~~
netheril96
The resource manager (like std::vector or std::unique_ptr) can certainly live
on the stack when the data it manages resides on the heap.

------
mamp
The title is a bit misleading: the performance problem was memory/GC related,
and the specific in-memory caches that didn't work well were .NET data
structures that didn't solve the memory problem. He didn't try Redis or
Memcached but ended up writing files to disk.

~~~
barrkel
So he's essentially using the OS's disk cache as his cache, with the
tracking of disk files substituting for manual memory allocation.

In this situation on .NET, I wrote a resource manager that handed out handles
to large byte arrays: in particular, ones big enough to land on the large
object heap (the threshold is 85,000 bytes), which is only collected during
gen2 collections. The handles implemented IDisposable, so handing back a byte
array was no more or less tedious than any other resource you need to manage
explicitly. Internally, the resource manager held the arrays through weak
references, so they could still be collected when gen2 collections actually
happened, but allocating the buffers themselves would never trigger gen2
collections in a steady state.

To turn that into a cache, you'd need another layer with keys, an eviction
policy and an invalidation mechanism. I think it would still beat
round-tripping to disk.

I wrote a different version of the resource manager that used P/Invoke helpers
and unsafe code to allocate from unmanaged memory directly, but it didn't
perform any better - it didn't relieve any pressure on the GC, which was 2% of
CPU usage at full load in any case.

~~~
dom0
> So he's essentially using the OS's disk cache as his cache, with the
> tracking of disk files substituting for manual memory allocation.

And that's often not the worst idea: done correctly, this stuff never
actually hits the disk as long as enough RAM is around.

Plus, life-cycle management is done by the OS, not by you. It also tends to
degrade more gracefully than "not at all" under memory pressure, or when
there isn't much memory in the first place.

~~~
Filligree
If you write a file to disk and then wait fifteen minutes before deleting it,
chances are the OS will have flushed it to disk in the meantime. It won't have
to re-read it if there's still free memory, but it's extra disk I/O.

If you write a lot of them, then even if you have the memory, you may overrun
the size limit of the write buffer and cause application stalls.

Writing to a ramdisk (e.g. tmpfs) is always an option, though.

~~~
dom0
Every OS has flags for this. Windows' CreateFile has a
FILE_ATTRIBUTE_TEMPORARY flag, and Linux has O_TMPFILE.

------
inmemory_net
Did you try doing an LOH garbage collection? We have a .NET in-memory
database and run a scheduled one every 60 minutes or so.

