
Memory Management in Python - blacknessninja
https://rushter.com/blog/python-memory-managment/
======
jakear
I’ve found it’s easy to get much better performance in CPython by simply
replacing its malloc with my own. Mine allocates a large block of memory on
the stack, then pulls from it. Won’t work for large programs, but for smaller
scripts it’s a good amount faster. I think this is because CPython makes
something like 1000 malloc calls just to start up, so if those can be made
~instant you get a lot of benefit.

~~~
jakear
The repo:
[https://github.com/JacksonKearl/cpython/blob/master/README.r...](https://github.com/JacksonKearl/cpython/blob/master/README.rst)

My allocator isn’t included as it is part of a class project, but you can make
your own malloc pretty easily.

Alternatively you can just do some LD_PRELOAD magic, but I don’t know much
about that (and this approach was needed to match the class spec).

~~~
codetrotter
> My allocator isn’t included as it is part of a class project, but you can
> make your own malloc pretty easily.

Was this a class that you took, or one that you TA or tutored?

~~~
jakear
I was a TA.

------
saagarjha
Why does Python use its own allocation abstractions when the C allocator is
likely doing the exact same thing underneath it? Or does it only employ these
when it's directly getting pages of memory from the OS?

~~~
rushter
Author of the article here.

I don't think system allocators are clever enough to handle 100–500k
allocations of very small objects per minute when Python is doing something
very intensive.

It's a pretty standard way to speed up allocation in dynamic languages. Game
developers use similar techniques as well.

I have some stats on Python's allocator:

[https://rushter.com/blog/python-object-allocation-statistics...](https://rushter.com/blog/python-object-allocation-statistics/)

~~~
amelius
Can Python actually return memory pages to the OS, e.g. via sbrk() with a
negative argument?

I'm currently having a problem with this, where I load a large deep learning
model into "CPU" memory then move it to the GPU, but I can't get rid of the
memory reserved by the process.

~~~
CogitoCogito
I can't answer your exact question, but any large allocations/deallocations
should be handled by mmap under the hood and in those cases the memory should
be returned.

In your case you should first consider the possibility that a pointer to
your model's objects is for some reason not being released. Even though you
move your model to the GPU and drop your own references, there may be
internal references to your model's data that are hidden from you. At least
something to consider.

edit: To add to this, I'm now quite sure (though I could be wrong!) that
whether Python does or does not call sbrk with a negative value is outside
Python's control. Python is making use of malloc/free under the hood:

[https://github.com/python/cpython/blob/master/Objects/obmall...](https://github.com/python/cpython/blob/master/Objects/obmalloc.c#L124-L128)

There's some flexibility in that file for wrapping free in different ways,
but it seems like it'll basically always be using free at the core; at least
on my system I just verified that in a debugger. So if it's true that Python
by default uses malloc/free, then whether sbrk with a negative argument ever
comes into play is really a question of how your libc implements
malloc/free.

Of course I might be wrong, but I think you should probably stop worrying
about it at that level and instead look into object references first, as I
detailed above.

------
heroHACK17
Memory management was my favorite topic to study during my time as an
undergrad. I considered going back to school to get my MS in CS solely because
I loved learning about this topic so much. Fascinating to see how it works
under the hood in Python. Bravo!

