
IncPy: Automatic memoization for Python - unignorant
http://www.stanford.edu/~pgbovine/incpy.html
======
phren0logy
My favorite parts of Python are the ones oriented toward functional
programming (e.g. list comprehensions). As my appetite for functional
programming grew, I drifted away, because it seems at odds with the direction
Python is headed (e.g. Guido wanting to remove map/reduce).

There have been a few libraries posted here on HN recently about adding
functional elements to python, so apparently it's not just me. Maybe it's time
for a "functional fork" of python?

~~~
agentultra
It's a weird synthesis, but I wouldn't frame it as hostile. A functional
programming style has always sat a little uneasily in the Python world. On
one hand, you have functools and itertools, et al.; on the other, classes
and a preference for explicit loops.

Yet even in Python 3, where reduce has been moved into functools, the spirit
of functional programming is still present. The map and filter built-ins now
return lazy iterators instead of lists. This is a real improvement, since you
can now apply them to (theoretically) infinite sequences.

Yes, lambda is the lame, dead horse. It's just a syntactic issue. I'm sure
most people in the Python world would be happy to receive a multi-line lambda
whose body could be more than a single expression. It's just that no one has
been happy with any of the syntax proposals to make it happen.

However, there are other things about lambda in Python that make it difficult
to implement as well. But I think baby steps are important.

Is a fork necessary? Well... it would be nice to see some people experimenting
with getting lambda to work. However I don't think a fork of the interpreter
is necessary just to support a _style_ of programming. Indeed Python prefers
one way to do things, but functional programming has proven practical enough I
think to be an exception to the rule and so it lives on. Sort of. :)

------
yuvadam
In this context it's worth noting that writing a memoizing decorator in Python
is ridiculously easy:

    
    
    class memoized(object):
        """Cache a function's return value per argument tuple;
        repeated calls with the same arguments hit the cache."""

        def __init__(self, func):
            self.func = func
            self.cache = {}

        def __call__(self, *args):
            try:
                return self.cache[args]
            except KeyError:
                value = self.func(*args)
                self.cache[args] = value
                return value
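
To make this concrete, here's a self-contained usage sketch (the decorator
from above applied to a naive recursive Fibonacci, which is exponential
without caching and linear with it):

```python
class memoized(object):
    """Cache a function's return value per argument tuple."""
    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        try:
            return self.cache[args]
        except KeyError:
            value = self.func(*args)
            self.cache[args] = value
            return value

@memoized
def fib(n):
    # Naive recursion: each fib(k) is computed only once thanks to the cache.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075
```

Note this only handles hashable positional arguments; keyword arguments or
list arguments would need extra handling.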

~~~
pgbovine
(IncPy author here ...)

yup agreed, but the programmer needs to figure out:

1.) when it's safe to memoize

2.) when it's worthwhile to memoize

also, your memoization decorator doesn't save data to disk. if you wrote a
persistent memoizer, then you would need to also track all dependencies for
the data you memoized, so that you can know when it's safe to invalidate on-
disk cache entries.

IncPy takes care of all of this automatically ;)
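
(for contrast, a crude persistent variant is easy to sketch with the stdlib's
shelve module - my own illustration, not IncPy's mechanism. note that it
tracks no dependencies at all, so it can never tell when a cached entry has
gone stale:)

```python
import functools
import os
import shelve
import tempfile

def disk_memoized(path):
    """Sketch of a persistent memoizer: cached results survive across
    interpreter runs.  Unlike IncPy, it tracks no dependencies, so it
    never knows when an entry is stale -- you must delete the cache
    file yourself."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = repr(args)  # crude key; assumes repr() is stable
            with shelve.open(path) as db:
                if key in db:
                    return db[key]
                value = func(*args)
                db[key] = value
                return value
        return wrapper
    return decorator

@disk_memoized(os.path.join(tempfile.gettempdir(), "slow_square_cache"))
def slow_square(n):
    return n * n
```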

~~~
newtonapple
I have a question: how does IncPy keep track of data dependencies coming from
the IO layer, e.g. sockets? Say I want to write a simple web crawler. The
function crawls a website, spits out the HTML as a raw string, sleeps for 5
seconds, and then crawls the site again. Will IncPy skip memoizing the
function entirely, will it assume the data coming from the network layer
hasn't changed and return the memoized string immediately, or is it smart
enough to detect when the input (the website) has changed and expire the
cache automatically?


~~~
ippisl
IncPy checks that the functions it caches are pure and deterministic. A
deterministic function is one that, given the same input, always returns the
same output. IncPy tests for a few kinds of non-determinism, and in the
future it will let you declare non-deterministic sources in a config file.
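
A toy illustration of the distinction (my own example, not IncPy's detection
code):

```python
import random

def double_plus_one(x):
    # Deterministic: the same input always yields the same output,
    # so caching the result is safe.
    return x * 2 + 1

def jitter(x):
    # Non-deterministic: the output depends on hidden RNG state, so
    # returning a cached value would silently change program behavior.
    return x + random.random()
```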

------
rix0r
I like it, but isn't the approach a little overkill?

If you designed a declarative system where you declare your datasets and how
they transform into each other, then you could analyze the dependency chain
and do the same thing as a library instead of a separate interpreter.

Sprinkle some transparent pickling, hashing, and timestamping in there and
you get all of the benefits in a much more reusable form.

Am I underestimating the problem?
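
Something like this minimal sketch, say (names and API entirely invented;
version counters decide when a downstream dataset must be recomputed):

```python
class Dataset:
    """A node in a declarative pipeline: it knows its inputs and how to
    compute itself, and recomputes only when an input's version changes."""
    def __init__(self, compute, *inputs):
        self.compute = compute
        self.inputs = inputs
        self.version = 0
        self._cache = None
        self._seen_versions = None

    def value(self):
        vals = [d.value() for d in self.inputs]  # bring inputs up to date first
        versions = tuple(d.version for d in self.inputs)
        if versions != self._seen_versions:      # some input changed: recompute
            self._cache = self.compute(*vals)
            self._seen_versions = versions
            self.version += 1
        return self._cache

class Source(Dataset):
    """A leaf dataset whose value is set directly."""
    def __init__(self, value):
        super().__init__(None)
        self._cache = value
        self._seen_versions = ()

    def set(self, value):
        self._cache = value
        self.version += 1  # downstream nodes will notice the change

raw = Source([3, 1, 2])
clean = Dataset(sorted, raw)
total = Dataset(sum, clean)
print(total.value())  # 6
raw.set([10, 20])
print(total.value())  # 30 -- only the stale stages recompute
```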

~~~
scott_s
_If you designed a declarative system where you declare your datasets and how
they transform into each other, then you could analyze the dependency chain
and do the same thing as a library instead of a separate interpreter._

That sounds like a lot of effort on the part of the programmer. The author's
approach - which I like - is to require as little intervention from the
programmer as possible.

Don't confuse the author's research implementation with how it should look in
practice. I imagine the author implemented a lightweight interpreter that
does the memoization on top of CPython. That's far easier than hacking
CPython itself, which gives him a faster path to a proof-of-concept
implementation and to publishing evaluations of the idea. If the research
gives good results, then maybe this approach could be implemented as a VM
optimization - you wouldn't know it's happening; your programs would just run
faster.

