
IncPy: Automatic memoization for Python - Goosey
http://www.stanford.edu/~pgbovine/incpy.html
======
pbiggar
There are a few responses here from people who seem not to have read the
article or watched the video (note I didn't read the paper).

This is pretty amazing, and totally different from a memoizing decorator, or
from anything which could be done at the application level (unless it did
deep, deep magic; the kind you can't really do in Python).

Firstly, it is fully automatic. You don't need to annotate anything.

Secondly, it works on impure functions! (as opposed to pure functions like
fib.) It tracks data dependencies through the program, including global
variables, external files(!), etc. If they have not changed, it uses a
cached value.

Assuming that it works as he explains it in the video, this is properly
amazing (and I've a PhD in compilers).

In comparison, a memoization decorator:

\- requires you to annotate functions,

\- can be accidentally put on functions which are not pure, which would lead
to bugs and unsafe behaviour,

\- does not (safely) support functions that depend on data beyond their
arguments (global variables, external files, etc.); it only checks the
arguments.
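
To make the last two points concrete, here is a toy sketch (the `memoize`
decorator, `scaled`, and `rate` are all made up for illustration, not from
IncPy) of how a plain argument-keyed decorator silently goes stale once
hidden state changes:

```python
import functools

def memoize(f):
    """Naive memoize decorator: the cache key is the arguments only."""
    cache = {}
    @functools.wraps(f)
    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper

rate = 2  # global state the function silently depends on

@memoize
def scaled(x):
    return x * rate

print(scaled(10))  # 20
rate = 3
print(scaled(10))  # still 20 -- stale: the cache never saw the global change
```

A dependency-tracking interpreter like IncPy would notice that `rate` changed
and invalidate the cached entry instead of returning it.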

Anyway, I'm impressed.

~~~
pgbovine
thanks for such a great summary and 'sales pitch' of my project ;) i couldn't
have put it any better.

if you have any further suggestions, please email me personally.

------
pgbovine
hi everyone ... i'm the creator of IncPy and pleasantly surprised that it
ended up on HN :)

anyways, this is still an early-stage research project where i'm the only
developer, so i can't make many guarantees about its robustness (i can only
test on my own machine with my own research scripts).

that said, though, i am _very interested_ in getting new users and feedback.
please email me at the email address listed on the IncPy website if you have
any questions. i am more than happy to answer them!

------
stevelosh
I just tried to run Mercurial's test suite with it and it segfaults all over
the place, unfortunately. It's too bad -- it seems like it could speed up some
Mercurial operations quite a bit.

~~~
pgbovine
hi stevelosh,

sorry that it didn't work out of the box; as you can tell, it's still a
research prototype :)

i'll add the Mercurial test suite to my TODO list of programs to try. in your
experience, what kind of Mercurial operations do you think can be sped up by
memoization?

------
akkartik
Like <http://www.erikyyy.de/compilercache> but for any python program. Very
cool.

~~~
imurray
Erm, not really. Compiler cache saves compiled versions of files. When
building a large project it notices things that have been built before and
looks up the compiled code from a central cache. If you have to do a "make
clean" (e.g. some scripts for making Debian packages start this way) you still
get fast re-builds.

The linked article is about run-time memoization: rather than caching the
compiled version of a function's code, you cache just the output of the
function for particular inputs.

~~~
akkartik
_"rather than caching the compiled version of a function's code, you cache
just the output of the function for particular inputs."_

Think of the function as the compiler and the inputs as your C files. The
kinds of things the two are doing are very similar, but IncPy's job is much
harder.

------
pizza
You could just use a small memoize decorator like this one

    
    
      class Memoize:
          def __init__(self, f):
              self.f = f
              self.mem = {}
          def __call__(self, *args, **kwargs):
              # sort kwargs so the cache key doesn't depend on keyword order
              key = (args, tuple(sorted(kwargs.items())))
              if key not in self.mem:
                  self.mem[key] = self.f(*args, **kwargs)
              return self.mem[key]
      
      # usage:
      @Memoize
      def fib(n):
          return n if n < 2 else fib(n - 1) + fib(n - 2)

~~~
asmosoinio
Not the same at all: IncPy stores the results to disk, so you can rerun your
script and it will still have correct cached results for long-running
functions.
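
For comparison, here is a minimal sketch of doing that persistence by hand
(the `disk_memoize` decorator and cache path are invented for illustration;
IncPy does this bookkeeping automatically, plus dependency tracking):

```python
import os
import shelve
import tempfile

# illustrative cache location; results written here survive across runs
CACHE = os.path.join(tempfile.gettempdir(), "memo_cache_demo")

def disk_memoize(f):
    """Sketch of a persistent memoizer backed by a shelve database."""
    def wrapper(*args):
        key = repr((f.__name__, args))
        with shelve.open(CACHE) as db:
            if key not in db:
                db[key] = f(*args)
            return db[key]
    return wrapper

@disk_memoize
def slow_square(x):
    return x * x

print(slow_square(12))  # 144, computed once and then served from disk
```

Unlike IncPy, this sketch has no way to tell when its cached results have
become invalid.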

~~~
pgbovine
great point; the persistence of the IncPy cache is one of the biggest
differentiators from an ordinary memoize decorator

------
nirai
why not optimize such scenarios at the application level?

It is possible to implement a trivial and transparent cache as a decorator
and apply it to particular bottleneck functions.

0 development effort, 0 running overhead, 0 maintainability issues

~~~
pbiggar
How could it be less development effort than an automatically memoizing
interpreter?

It's hardly transparent if it's a decorator.

Maintenance issue: applying it to a function which isn't pure. Avoiding that
adds development overhead.

There are other advantages I outlined in my comment.

------
JeanPierre
What about just making a new def-macro for Python? A very simple Common
Lisp implementation would be something like this:

    
    
      (defmacro defmemoize (fn val &body body)
          (with-gensyms (lst hash-code)
          `(let ((,lst (make-hash-table :test #'equal)))
              (defun ,fn ,val
              (let ((,hash-code (list ,@val)))
                  (or
                      (gethash ,hash-code ,lst)
                      (setf (gethash ,hash-code ,lst)
                          ,@body)))))))
    

With this, the standard Fibonacci example would be

    
    
      (defmemoize fib (n)
          (if (<= n 2) ; first two = 1
              1
              (+ (fib (1- n))
                 (fib (- n 2)))))
    

Obviously, it would be something more advanced than this, but it shouldn't
have to be much more advanced.
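
For what it's worth, Python can get the same per-function behaviour without
macros; a sketch using the standard-library `functools.lru_cache` decorator
(available since Python 3.2):

```python
import functools

@functools.lru_cache(maxsize=None)  # unbounded in-memory cache keyed on args
def fib(n):
    return 1 if n <= 2 else fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040, instant despite the exponential naive recursion
```

Like the macro version, this caches in memory only and assumes the function
is pure, which is exactly the gap IncPy is trying to close.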

~~~
joubert
I like the _atom_ construct in Clojure - see a functional implementation of
memoization, instead of one based on macrology: <http://clojure.org/atoms>

