
Speeding up function calls with lru_cache in Python - Immortal333
https://hackeregg.github.io/2020/06/03/Speeding-up-function-calls-with-just-one-line-in-Python.html
======
jedberg
This technique is called Memoization [0]

Here is an implementation of a memoize decorator in Python that will support
all inputs [1]. You'd have to modify it to not use all the Pylons framework
stuff.

[0]
[https://en.wikipedia.org/wiki/Memoization](https://en.wikipedia.org/wiki/Memoization)

[1] [https://github.com/reddit-archive/reddit/blob/master/r2/r2/l...](https://github.com/reddit-archive/reddit/blob/master/r2/r2/lib/memoize.py)
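
For flavor, here's a minimal sketch of the same idea (not the reddit
implementation; it keys on the pickled arguments, so even unhashable inputs
like lists work):

    import functools
    import pickle

    def memoize(f):
        cache = {}

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # pickle gives a hashable key even for lists/dicts;
            # assumes all arguments are picklable
            key = pickle.dumps((args, sorted(kwargs.items())))
            if key not in cache:
                cache[key] = f(*args, **kwargs)
            return cache[key]

        return wrapper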

------
Denvercoder9
Caching the result is not speeding up function calls.

~~~
Immortal333
Author here. Sorry if it feels misleading; I have no intention to mislead
anybody. If you can suggest an alternative title, I am happy to change it.

~~~
shankysingh
Thanks for the article.

Just a side note: with Fibonacci, caching turns it into a dynamic programming
problem, so the time complexity drops from exponential to O(n), IIRC.

There is a whole class of problems where recursion + memoization (caching) =
top-down dynamic programming. The other way to increase performance, and
_actually reduce the call stack_, in this class of problems (including
Fibonacci) is bottom-up dynamic programming; see the sketch after the link
below.

Some gists I found on it
[https://gist.github.com/trtg/4662449](https://gist.github.com/trtg/4662449)
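
For concreteness, a sketch of both styles for the nth Fibonacci number:

    import functools

    # top-down: recursion + memoization
    @functools.lru_cache(maxsize=None)
    def fib_top_down(n):
        if n < 2:
            return n
        return fib_top_down(n - 1) + fib_top_down(n - 2)

    # bottom-up: iterate up from the base cases; no recursion,
    # so no call stack growth
    def fib_bottom_up(n):
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a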

~~~
akdas
Plugging my own writing, but I did a deep dive on top-down vs. bottom-up
dynamic programming on my blog: [https://avikdas.com/2019/04/15/a-graphical-introduction-to-d...](https://avikdas.com/2019/04/15/a-graphical-introduction-to-dynamic-programming.html)

The follow-up posts go into even more detail on individual problems that can
be solved using either form of dynamic programming, including some real-world
problems like content-aware image resizing.

~~~
eindiran
Thank you for the link, I enjoyed the post a lot. The drawings you've done add
a lot to the explanation. Do you have any other deep dives you've done in the
same style that you'd recommend?

~~~
akdas
Thanks! I'm glad the drawings were helpful.

Most of my writing can be found at my blog
([https://avikdas.com/](https://avikdas.com/)), but here are some from the
same dynamic programming series:

\- Deep-dive into the Chain Matrix Multiplication Problem: [https://avikdas.com/2019/04/25/dynamic-programming-deep-dive...](https://avikdas.com/2019/04/25/dynamic-programming-deep-dive-chain-matrix-multiplication.html)

\- Real-world applications of DP: [https://avikdas.com/2019/05/14/real-world-dynamic-programmin...](https://avikdas.com/2019/05/14/real-world-dynamic-programming-seam-carving.html) and [https://avikdas.com/2019/07/29/improved-seam-carving-with-fo...](https://avikdas.com/2019/07/29/improved-seam-carving-with-forward-energy.html)

\- Another real-world application, this time in machine learning: [https://avikdas.com/2019/06/24/dynamic-programming-for-machi...](https://avikdas.com/2019/06/24/dynamic-programming-for-machine-learning-hidden-markov-models.html)

If you look on my blog, you'll also see my recent series is on scalability,
things like read-after-write consistency and queues for reliability.

------
initbar
I worked on an open source project called 'safecache' along similar lines. As
others have already commented, @lru_cache does not play well with mutable
data structures, so my implementation handles both immutable and mutable data
structures, as well as multi-threaded operations.

[https://github.com/Verizon/safecache](https://github.com/Verizon/safecache)
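
To illustrate the limitation safecache works around:

    import functools

    @functools.lru_cache(maxsize=None)
    def total(xs):
        return sum(xs)

    total((1, 2, 3))  # fine: tuples are hashable
    total([1, 2, 3])  # TypeError: unhashable type: 'list'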

~~~
vorticalbox
Thanks for this.

------
gammarator
(Fibonacci numbers have a closed form analytic solution that can be computed
in constant time: [https://www.evanmiller.org/mathematical-hacker.html](https://www.evanmiller.org/mathematical-hacker.html))
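
A sketch of that closed form (Binet's formula); the precision caveat raised
below applies:

    import math

    def fib_binet(n):
        # exact only while float precision holds
        # (roughly n <= 70 with IEEE doubles)
        phi = (1 + math.sqrt(5)) / 2
        return round(phi ** n / math.sqrt(5))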

~~~
nimithryn
Does this work in practice for large values of n? You will be limited by
numerical error in phi.

~~~
WJW
The Fibonacci series grows so fast that for "large" values of N you'll need an
arbitrary-size integer implementation anyway, so at that point you might as
well go for a (non-IEEE) arbitrary-size float type and get all the significant
digits you need.

~~~
nimithryn
You’ll still need to compute the digits of Phi though, and then exponentiate
that. My intuition is that using ints is still faster.
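
For what it's worth, one exact integer-only approach is fast doubling (a
sketch, not from the article), which takes O(log n) multiplications and never
touches phi:

    def fib_pair(n):
        # returns (F(n), F(n+1)) via the fast-doubling identities
        # F(2k) = F(k) * (2*F(k+1) - F(k)) and
        # F(2k+1) = F(k)^2 + F(k+1)^2, in pure integer arithmetic
        if n == 0:
            return (0, 1)
        a, b = fib_pair(n // 2)
        c = a * (2 * b - a)
        d = a * a + b * b
        return (c, d) if n % 2 == 0 else (d, c + d)

    fib_pair(10)[0]  # 55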

------
bravura
I recently discovered that joblib can do something similar, both on disk and
in memory:
[https://joblib.readthedocs.io/en/latest/memory.html](https://joblib.readthedocs.io/en/latest/memory.html)
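
A minimal usage sketch (the cache directory name is arbitrary):

    from joblib import Memory

    memory = Memory("./joblib_cache", verbose=0)

    @memory.cache
    def slow_square(x):
        return x * x

    slow_square(4)  # computed, result written to disk
    slow_square(4)  # served from the on-disk cache, even across runs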

~~~
thechao
This memoizes closures of functions, and lets them be executed on other
processors? Does it support dependency tracking between memoized closures
(incremental recomputation), or do I have to roll that myself?

~~~
trombonechamp
Unfortunately you need to roll that yourself in joblib. But it is automatic in
bionic ([https://github.com/square/bionic](https://github.com/square/bionic)),
assuming your workflow can be structured in the way bionic likes.

~~~
thechao
Bionic is what I'm looking for (right on their use-case page): a Make
replacement that can work with intermediate computations, and not just files.

------
anandoza
> As, we can see the optimal cache size of fib function is 5. Increasing cache
> size will not result in much gain in terms of speedup.

Try it with fib(35), curious what you find.

~~~
Immortal333
After reading your comment, I cross-checked the results and found that 3 is
the optimal size. I even ran fib(40) on it. With size 2 there are many misses,
but from 3 onwards the misses stay constant (for fib(40), there are only 40
misses, which emulates the O(N) DP approach).

Why 3 is optimal comes down to how recursion and LRU eviction interact. I wish
I could explain it using an animation.

You can play with it.
[https://repl.it/repls/NocturnalIroncladBytecode#main.py](https://repl.it/repls/NocturnalIroncladBytecode#main.py)

~~~
hyperman1
It makes sense. f(n) depends on f(n-1) and f(n-2). So if the cache is able to
produce these 2 values, you basically get the linear algorithm from the
article. I assume the running f(n) also takes up a cache slot, hence 3 instead
of 2.

If this theory is correct, every recursive function f(n) requiring access back
to f(n-x) should have x+1 as its maximum useful cache size.
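
The theory is easy to check with cache_info (a rough sketch):

    import functools

    @functools.lru_cache(maxsize=3)
    def fib(n):
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    fib(40)
    # expect misses to stay at one per distinct argument,
    # matching the linear algorithm
    print(fib.cache_info())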

------
mac-chaffee
Maybe I'm abusing lru_cache but another use for it is debouncing.

We had a chatbot that polls a server and sends notifications, but due to clock
skew it would sometimes send two notifications. So I just added the lru_cache
decorator to the send(username, message) function to prevent that.
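
Roughly like this, if I remember right (`deliver` is a hypothetical stand-in
for the real notification code):

    import functools

    def deliver(username, message):
        # hypothetical stand-in for the real notification call
        print(f"notify {username}: {message}")

    @functools.lru_cache(maxsize=128)
    def send(username, message):
        # repeated calls with the same (username, message) hit the
        # cache and skip delivery until the entry is evicted by
        # newer calls -- a crude debounce
        deliver(username, message)

    send("alice", "build failed")  # delivered
    send("alice", "build failed")  # cached; no duplicate notification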

~~~
GeoAtreides
That sounds very interesting. Can you detail the problem a bit (I'm not sure I
understand how clock skew affects python code) and how lru_cache decorator
fixed it?

~~~
whalesalad
Essentially the OP is using the decorator to prevent the function call from
ever being run more than once for the same arguments. It's at-most-once
semantics.

~~~
faceplanted
More specifically, it prevents a call being repeated _in quick succession_; if
there are enough calls in between repetitions, it'll fall out of the cache and
be reprocessed.

------
drej
Functools' lru_cache also has good methods for getting more info about the
cache's utilisation (.cache_info(), I think), which is quite helpful to have
in logs.

------
CyberDildonics
One line summary: use the lru_cache decorator.

    import functools

    @functools.lru_cache(maxsize=31)
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

------
andreareina
lru_cache doesn't work with lists or dicts, or indeed any non-hashable data,
so it's not quite a transparent change. About half the time I use a cache I
end up implementing my own.

~~~
raymondh
What approach are you using to efficiently store and look up non-hashable
function arguments?

~~~
andreareina
Converting to tuple has mostly been enough. I want to say always, but I don't
trust my memory that much, haha.
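
A sketch of that approach (hypothetical `tuplecache` helper, positional
arguments only):

    import functools

    def tuplecache(f):
        # convert list arguments to tuples so they become hashable,
        # then delegate to the ordinary lru_cache
        cached = functools.lru_cache(maxsize=None)(f)

        @functools.wraps(f)
        def wrapper(*args):
            return cached(*(tuple(a) if isinstance(a, list) else a
                            for a in args))

        return wrapper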

------
intrepidhero
This is neat and I learned something new. TLDR: use the functools.lru_cache
decorator to add caching to slow functions.

I must admit I was hoping for a general approach to speeding up all function
calls in python. Functions are the primary mechanism for abstraction and yet
calls are relatively heavy in their own right. It would be neat if python had
a way to do automatic inlining or some such optimization so that I could have
my abstractions but avoid the performance hit of a function call (even at the
expense of more byte code).

~~~
riazrizvi
A benefit of Python is that it removes consideration of low-level type
details. You can just code your problem and think at a higher level, in terms
of larger custom classes etc. This is great for developer speed. But to
improve the performance of code, you need to micromanage your types,
understand details about the compiler and the processor, and know what the
machine is doing to the data to service your particular process, so you can
streamline it accordingly.

Is there a magic function? If one crops up, it gets baked into Python,
development of which is extremely active. So magic functions trend obsolete.
The main thing developers can consistently do to improve performance is
micromanage types and algorithms at the real bottlenecks. This is done with a
tool like Cython, but you have to learn the discipline of coding for
performance; you have to learn the details of what the machine is doing with
the data in your particular piece of code. When do you want to replace linked
lists with arrays? When are you doing unnecessary type conversions? How
overweight are your high-frequency objects? How is memory being used? Are
there more efficient ways on my microarchitecture to get this calculation
done... Performance coding is a discipline, like application coding.

Is this a problem with Python? I don't think so, because performance comes
second in the developer process: write the code first, then optimize the real
bottlenecks.
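
To give a flavor of the Cython route, a sketch in Cython's pure-Python mode
(hypothetical file name; this runs as plain Python, and compiling it with
`cythonize` turns the typed loop into C):

    # fib_typed.py -- hypothetical example module
    import cython

    def fib(n: cython.int) -> cython.long:
        # typed locals become C variables when compiled, removing
        # Python object overhead in the loop (beware: C long
        # overflows past fib(92) in the compiled version)
        a: cython.long = 0
        b: cython.long = 1
        i: cython.int
        for i in range(n):
            a, b = b, a + b
        return a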

------
satyanash
What's the Ruby equivalent of `functools`, and specifically
`functools.lru_cache`?

Note that `lru_cache` doesn't just do caching; it also provides convenient
methods on the original function to get cache stats etc. in a Pythonic way.

------
lalos
The last bit about deterministic functions is a main selling point of
functional programming (where you get help from the compiler), or of any
design that promotes writing pure functions.

------
oefrha
> One line summary: Use lru_cache decorator

Okay, at least it has the decency of providing a TL;DR. But if your summary is
literally three words, why not put it in the title: “speed up Python function
calls with functools.lru_cache”?

God I hate clickbait.

------
m4r35n357
Huh? He sped it up much more by rewriting as a loop . . .

------
helloxxx123
Cool

------
fastball
tl;dr – memoization.

------
asicsp
Instead of the last quote in the article, I prefer this one (got it from [0])

>"There are two hard things in computer science: cache invalidation, naming
things, and off-by-one errors." – Martin Fowler

And there's plenty of similar articles, for example [1] [2]

[0]
[https://www.mediawiki.org/wiki/Naming_things](https://www.mediawiki.org/wiki/Naming_things)

[1] [https://dbader.org/blog/python-memoization](https://dbader.org/blog/python-memoization)

[2]
[https://mike.place/2016/memoization/](https://mike.place/2016/memoization/)

~~~
ritter2a
Even in this version, the quote still feels incomplete without someone
shouting "Concurrency!" while it is delivered...

~~~
SCLeo
Concurrency.""There are three hard things in computer science: cache
invalidation, naming things, off-by-one errors, and

------
perfunctory
> @functools.lru_cache

Avoid these tricks if you care about thread safety.

