

Type checking in Python - senko
http://senko.net/type-checking-in-python

======
xiaq
This module doesn't seem to make it possible to type-annotate map and filter,
which are Python builtins.

I believe that dynamic typing results from frustration over generics, rather
than over static typing in general. (Type stuttering is another problem, but
modern languages solve it with local type inference.) I mean, annotating
non-generic functions is always straightforward and doesn't introduce much
complexity:

    
    
        # s should be str
        # returns nothing
        def print(s):
            ...
    
        # just becomes (using Py3 annotation syntax)
        def print(s: str) -> None:
            ...
    

However, it's much harder to write a type annotation for generic functions.
Worse, remember that Python's map is _variadic_.

    
    
        # for any types t0, t1, t2, ... tn:
        # fn is a function that takes arguments of types t0, t1, ... t(n-1)
        # and returns a value of type tn;
        # seqs are iterables of types t0, t1, ... t(n-1);
        # returns a list of type tn.
        def map(fn, *seqs):
            ...
    
        # How to write this signature is non-obvious
    

Many mature static typing systems would allow you to express such types
(usually called parameterized types). But any one of them would require more
than those trivial notations in print.

That's when the dynamic typing people get annoyed and go "fxxk static typing
systems, I can handle this in my mind". Which is about 65% (totally random
estimation) the point of dynamic typing in my opinion.

Any type checker for dynamically typed languages that doesn't seriously try to
solve the generics problem is not genuinely interesting.
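
To make the map example concrete, here is a rough sketch of how such a signature can be approximated with Python's standard `typing` module, using one overload per arity. The name `typed_map` is made up for illustration, only the one- and two-iterable cases are covered, and the annotations are meant for an external checker; nothing is enforced at runtime.

```python
from typing import Callable, Iterable, List, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


# One overload per arity: a static checker picks the matching one.
@overload
def typed_map(fn: Callable[[A], R], xs: Iterable[A]) -> List[R]: ...


@overload
def typed_map(
    fn: Callable[[A, B], R], xs: Iterable[A], ys: Iterable[B]
) -> List[R]: ...


def typed_map(fn, *seqs):
    # Untyped implementation shared by the overloads. Like Python 3's
    # map, it stops at the shortest input.
    return [fn(*args) for args in zip(*seqs)]


lengths = typed_map(len, ["a", "bc"])
sums = typed_map(lambda x, y: x + y, [1, 2], [10, 20])
```

Note that this is exactly the "more than trivial notation" cost described above: five lines of type machinery per arity for a one-line function.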

~~~
gsg
It's not hard to imagine an extension to ML-style types that could cover
variadicity:

    
    
        map : ('a... -> 'b) -> ('a list)... -> 'b list
    

Where <type>... denotes <type1> -> <type2> -> ... -> <typeN> spliced into the
type signature, with each type variable 'a in <type> replaced with 'a1, 'a2,
... 'aN. You could cover tuples too:

    
    
        zip : ('a list)... -> 'a,,, list
    

Bit ugly, and writing a type checker for it could be fun, but it seems
workable.

~~~
tel
Take a look at Typed Racket for a variation on this.

~~~
gsg
[http://docs.racket-lang.org/ts-guide/types.html?q=typed#%28p...](http://docs.racket-lang.org/ts-guide/types.html?q=typed#%28part._varargs%29)

Right, looks very similar.

------
markbnj
As a somewhat new python programmer (~2 years) coming from statically compiled
languages like C#, C++ and Pascal this is something I have thought about a
lot. My initial impulse was to wish I had static type checking. But I have
come to the conclusion that I just hadn't fully appreciated the differences
between interpreted and compiled languages.

The article says that the lack of type checking "lets through" a certain class
of errors. However, let's be specific here: there is no build-time static
analysis phase in the transformation of python source to executable code. So
"lets through" means the same thing whether you have strong typing or not: the
error is going to be found at runtime. What we're really talking about, then,
is converting AttributeError ("YourObject has no attribute 'startswith'") into
something more specific ("Hey, this is supposed to be a string!"). Honestly,
that seems to me to be a pretty minor increase in diagnostic information.

So the bottom line for me is that I feel I didn't really understand duck
typing. I used to joke that the term actually meant "no typing," and that is
in some respects true. But what it really means is "the thing can do what the
method expects it to be able to do." If the thing can't serve in the expected
role, then the existing errors that result are sufficiently explanatory, imo.
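
A small sketch of the comparison being made here, assuming a made-up helper `is_comment`: both versions fail at the same moment (when the call runs), and only the exception type and message differ.

```python
def is_comment(line):
    # Duck typing: anything with a suitable startswith() method works.
    return line.startswith("#")


def is_comment_checked(line):
    # Explicit runtime check: same failure point, different message.
    if not isinstance(line, str):
        raise TypeError("line must be a str, got " + type(line).__name__)
    return line.startswith("#")


def describe_failure(func, value):
    # Return the name of the exception raised by func(value), if any.
    try:
        func(value)
    except Exception as exc:
        return type(exc).__name__
    return None


duck_error = describe_failure(is_comment, 42)
checked_error = describe_failure(is_comment_checked, 42)
```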

~~~
theseoafs
Sure, but you have to wait until runtime to detect those errors. Which means
your codebase can have a nasty bug that you might not even notice if you don't
run a particular test case. Static typing detects the same thing, but
statically at compile time.

~~~
markbnj
Yes, but there is no compile time in python. Are we discussing type checking
in python, or another language, or some ideal hybrid? I don't need to be
convinced of the value of static type checking in compiled languages.

~~~
theseoafs
Sure there is. When do you think parse errors happen? All languages are
compiled in one way or another.

~~~
markbnj
Yes, but that's not a very useful observation. I think most developers
understand that there is a process to transform source code into executable
format. The distinction between interpreted and compiled languages, as
commonly understood, is precisely about when that happens. There is a
"compilation" step in python, but it happens when the code is executed, and
therefore there is not a distinct "compile time" step during which a
programmer has an opportunity to detect errors and prevent them from occurring
at runtime. In any case, we're splitting hairs here, so in defense of my
position I'll simply note that if your definition is sufficient, then there is
no need to distinguish between interpreted and compiled languages at all.

~~~
theseoafs
That's simply not true. There is a compilation step in python that is
completely distinct from the runtime (i.e. when the bytecode files are
compiled). There's nothing special about python that prevents static compile
time analysis from happening.
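
As a hedged illustration of that separate step, the built-in `compile()` turns source into bytecode without executing it. The two source strings below are invented for the example: the first is rejected before anything runs, while the second compiles fine because today's compiler does no type analysis.

```python
def compiles(source):
    # compile() parses and generates bytecode without executing anything.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


syntax_bug = "def f(:\n    pass\n"
type_bug = "def f():\n    return (42).startswith('#')\n"

syntax_ok = compiles(syntax_bug)  # rejected at compile time
type_ok = compiles(type_bug)      # accepted: the bug only surfaces when f() is called
```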

~~~
theseoafs
Oh, and to address your other point: the line between compiled and interpreted
languages has been blurry since day one, and it's only getting blurrier. No
modern, highly-used interpreted language gets by these days without
compilation to bytecode. Rather than "compilation vs. interpretation", the
battle should be "interpretation vs. native execution", which is more
meaningful, but the line there is actually still pretty blurry (see JIT
compilation, which straddles that line).

~~~
markbnj
I understand your point. There is a step that compiles python source to
bytecode, and it does take place before the bytecode is executed. I just think
this is a somewhat pedantic observation. A large part of the value of the
language lies in the immediacy of its "interpreted" nature (I'll put that in
quotes from now on, thanks to you :)). You run the script and it either fails
or executes. In the version of python I use on Ubuntu scripts aren't compiled
to bytecode for the first time until they are imported, so other code in the
project has likely already run before the compiler would have a chance to
review the use of types in the imported module. There isn't a project-wide
process of compiling all the files and making sure everything is used
correctly before any code executes. Maybe it could be made to work that way,
but wouldn't that alter the language and the way it is used? In the end I come
back to: is AttributeError really not descriptive enough? You can have the
last word here.

~~~
theseoafs
Okay. AttributeError is good enough, except when it's not. Python is designed
in such a way that makes you actively seek out AttributeErrors, rather than
having them diagnosed at compile-time. If that works for you, then that's
great, but it doesn't always work. Whether it's in the main branch of CPython,
or some sort of experimental fork, or some completely distinct static analysis
tool, Pythonistas should support efforts to add (potentially optional)
increased static checking to the Python interpreter, because it's a good
thing.

Will it fit in with the current Python ecosystem? Will it have to change the
way the language is used? Maybe, but that doesn't mean we shouldn't experiment
with static analysis. We're hackers, after all, and this is something that
probably deserves to be hacked on.

------
willvarfar
I've been checking types in Python dynamically for years now. My library,
Obiwan
[https://pypi.python.org/pypi/obiwan/1.0.2](https://pypi.python.org/pypi/obiwan/1.0.2)
uses type annotations, which I think are nicer than decorators.

~~~
andybak
As the article mentions - the big advantage of decorators is Python 2.x
compatibility. (not that I want to start another discussion of THAT topic)

EDIT - looks like obiwan will allow decorators as well as annotations. I do
prefer the syntax of typedecorator though. It seems less cluttered.

~~~
andreasvc
It's a matter of taste. I think the decorator approach looks more bolted-on
and ad hoc. I prefer C style declarations over annotations, though, because
they have the least amount of noise.

------
anon4
There's one thing that doesn't sit well with me about adding type-checking to
the CPython implementation of the Python language. Type-checking is generally
used at compile-time to check that your program won't do nasty things, and
then run-time checks are done only when you do something overly dynamic that
the compiler couldn't check (e.g. casting an Object to some concrete class in
Java). However, all type-checking implementations for CPython work purely at
runtime! And type-checks are invoked each time your function is called! This
seems not only wasteful, but also defeats the purpose of type-checking before
running your program.

I would rather see a more generic pre/post-condition contract system, to be
used in select top-level functions, that gives a lot more flexibility in
expressing what is supposed to happen (e.g. I expect to be given a not-None
value with a __str__ method that doesn't throw and will return an iterator
yielding consecutive not-empty, not-None strings), including what assumptions
you're making when you call a function that you got from a random object (e.g.
here I'm calling a function that I got somehow as a parameter and I'm going to
assume it can take a string of the format "ipv4 address:port" and returns an
object that is a database connection), together with a mechanism for
recovering from broken contracts - we are doing runtime checks, so there's no
reason to restrict ourselves to Java-style type constraints, which don't even
work for Python in general. Or, alternatively, a lighter system that can be
checked at import time once before the program is started "for real".
Preferably both of those.
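
A minimal sketch of what such a contract system could look like, under stated assumptions: `contract`, `ContractError`, `looks_like_endpoint` and `connect` are all hypothetical names, the endpoint check is deliberately crude, and there is no recovery mechanism beyond raising an exception.

```python
import functools


class ContractError(Exception):
    """Raised when a pre- or post-condition does not hold."""


def contract(pre=None, post=None):
    # pre receives the call's arguments; post receives the return value.
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError("precondition failed for " + func.__name__)
            result = func(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractError("postcondition failed for " + func.__name__)
            return result
        return wrapper
    return decorate


def looks_like_endpoint(address):
    # Hypothetical check for the "ipv4 address:port" format.
    host, sep, port = address.rpartition(":")
    parts = host.split(".")
    return (
        sep == ":"
        and port.isdigit()
        and len(parts) == 4
        and all(p.isdigit() and int(p) < 256 for p in parts)
    )


@contract(pre=looks_like_endpoint, post=lambda conn: conn is not None)
def connect(address):
    # Stand-in for opening a real database connection.
    return {"address": address}
```

Calling `connect("127.0.0.1:5432")` passes both conditions, while `connect("localhost")` raises `ContractError` before the function body runs.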

------
csirac2
This is the single biggest thing I miss coming from modern Perl. Pythonistas
seem very anti-type validation/checking, but with Moose classes the type stuff
simplifies a lot, is declarative, self documenting, and vastly reduces certain
classes of bugs.

The closest I've seen is the traits thing from Enthought, but there doesn't
seem to be much buy-in from python users.

~~~
doctoboggan
Former Enthought employee here, I always liked programming with Traits.
Although nowadays I use the similar but much more lightweight atom[0], which
was created by another former Enthoughter.

[0]: [https://github.com/nucleic/atom](https://github.com/nucleic/atom)

~~~
csirac2
Awesome pointer, thanks!

------
illumen
I recently wrote this post "statically type checking python"
[http://renesd.blogspot.de/2014/05/statically-checking-python...](http://renesd.blogspot.de/2014/05/statically-checking-python.html)

Turns out with modern tools you _can_ statically type check python :)

------
fx5
Appreciate that it has a "logging only" mode.

I wish there was something like this, but parsing the docstrings with the same
format as pycharm:
[http://www.jetbrains.com/pycharm/webhelp/type-hinting-in-pyc...](http://www.jetbrains.com/pycharm/webhelp/type-hinting-in-pycharm.html)

~~~
senko
I'm currently entertaining the idea of it _outputting_ docstrings (adding
:param:, :rtype: and other applicable tags if they're not already there).

------
cabalamat
I wrote a similar python typechecker some time ago
([https://github.com/cabalamat/ulib/blob/master/debugdec.md](https://github.com/cabalamat/ulib/blob/master/debugdec.md)).
My syntax is somewhat less verbose so this:

    
    
        @returns(int)
        @params(a=int, b=int)
        def add(a, b):
            return a + b
    

becomes:

    
    
        @typ(int, int, ret=int)
        def add(a, b):
            return a + b
    

I was inspired to do it like that by Haskell's very clean type syntax.

Senko's version has the advantage that you can compose types in it, e.g.
{str:int} is a dictionary whose keys are strings and values are integers.
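
To show what composing types like `{str:int}` could mean at runtime, here is a rough sketch of a recursive matcher. It is not taken from either library: `matches` is a hypothetical helper, and the one-item list form `[int]` is an assumption added only to show nesting.

```python
def matches(value, spec):
    # A plain class is checked with isinstance().
    if isinstance(spec, type):
        return isinstance(value, spec)
    # A one-item dict such as {str: int} describes key and value types.
    if isinstance(spec, dict) and len(spec) == 1:
        (key_spec, value_spec), = spec.items()
        return isinstance(value, dict) and all(
            matches(k, key_spec) and matches(v, value_spec)
            for k, v in value.items()
        )
    # A one-item list such as [int] describes a homogeneous list.
    if isinstance(spec, list) and len(spec) == 1:
        return isinstance(value, list) and all(
            matches(item, spec[0]) for item in value
        )
    raise ValueError("unsupported type spec: %r" % (spec,))


ok = matches({"a": 1, "b": 2}, {str: int})
bad = matches({"a": "oops"}, {str: int})
nested = matches([{"a": 1}], [{str: int}])
```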

------
ak217
For Python 3, I've added optional type checking based on function annotations
and decorators to the Ensure library:
[https://github.com/kislyuk/ensure#enforcing-function-annotat...](https://github.com/kislyuk/ensure#enforcing-function-annotations).

    
    
        @ensure_annotations
        def f(x: int, y: float) -> float:
            return x+y
    
        f(1, 2.3)
        >>> 3.3
        f(1, 2)
        >>> ensure.EnsureError: Argument y to <function f at 0x109b7c710> does not match annotation type <class 'float'>
    

I think it works better than other approaches mentioned here because it (1)
doesn't repeat the contents of the function signature, (2) doesn't install any
magic system-wide hooks (and the attendant performance issues), and (3) is
completely optional.
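
For readers wondering how a decorator like this can work in principle, below is a minimal sketch built on the standard `inspect` module. It is an illustration only, not the Ensure library's implementation: `check_annotations` is a hypothetical name, and only annotations that are plain classes are checked.

```python
import functools
import inspect


def check_annotations(func):
    # Hypothetical decorator; not the Ensure library's implementation.
    signature = inspect.signature(func)
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    def is_checkable(annotation):
        # Skip missing annotations and anything that is not a plain class.
        return annotation is not inspect.Parameter.empty and isinstance(
            annotation, type
        )

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = signature.parameters[name]
            if param.kind in variadic or not is_checkable(param.annotation):
                continue
            if not isinstance(value, param.annotation):
                raise TypeError(
                    "argument %s=%r is not of type %s"
                    % (name, value, param.annotation.__name__)
                )
        result = func(*args, **kwargs)
        expected = signature.return_annotation
        if is_checkable(expected) and not isinstance(result, expected):
            raise TypeError(
                "return value %r is not of type %s" % (result, expected.__name__)
            )
        return result

    return wrapper


@check_annotations
def f(x: int, y: float) -> float:
    return x + y
```

As in the example above, `f(1, 2.3)` returns normally and `f(1, 2)` is rejected because `2` is an `int`, not a `float`. The checks run on every call, which is the runtime cost raised elsewhere in this thread.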

------
bsaul
Didn't know Guido thought of static type checking such a long time ago. I'm
thinking now that one solution to two problems (this one, and slow python 3
adoption) would have been to include optional static typing (like in dart) in
python 3.

------
raymondh
Having seen many type-checking initiatives for Python come and go over the
years, I've come to believe that type-checking is like security in that it is
difficult to add after-the-fact and works best when designed-in from day one.

When function annotations were introduced, I ended up removing all their uses
from the standard library because every early attempt to use it was too
simplistic and failed to support any type-system use case except for extra
documentation.

------
nox_
Too bad that this isn't type checking in the traditional sense though. That's
just runtime tag checking.

------
petrounias
Union types would be an incredibly useful extension of this.

~~~
illumen
Indeed! pysonar2 uses them with success.
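
For a runtime checker, the simplest approximation of a union is the tuple form that `isinstance()` already accepts. A tiny sketch, with `Number` and `halve` as made-up names:

```python
# A tuple of classes acts as a runtime "union" for isinstance().
Number = (int, float)


def halve(value):
    if not isinstance(value, Number):
        raise TypeError("expected int or float, got " + type(value).__name__)
    return value / 2
```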

------
artursapek
This is neat, but why not just use Go?

~~~
pekk
Go is neat, but why not just use Haskell? (etc.)

~~~
raverbashing
Monads

Or more specifically, how non-intuitive it is to use the IO stuff

I don't care what it uses, or what it is called, I care about being able to
use it with what I know

So yeah, I'll go for Go instead of Haskell

~~~
Peaker
The IO stuff has almost nothing to do with monads.

It is unfamiliar, but if you use static typing, at least reap the benefits of
better error checking, no runtime null dereferences, etc.

