
A Failed Experiment with Python Type Annotations - ingve
https://mortoray.com/2019/06/11/a-failed-experiment-with-python-type-annotations/
======
joshuamorton
> mypy has no trouble understanding this, but it’s unfortunately not valid
> Python code. You can’t refer to Node within the Node class.

No, the workaround is to stringify "Node".

    
    
        from typing import Sequence

        class Node(object):
            def add_sub(self, sub: 'Node'):
                ...
    
            def get_subs(self) -> Sequence['Node']:  # Or maybe 'Sequence[Node]'
                ...
    

works just fine. As of Python 3.7, this stringification is done automatically
under `from __future__ import annotations`, and is planned to eventually
become the default, at which point the original code will be valid.
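
A minimal sketch of the quoted form, for illustration only: the method bodies below are hypothetical and not from the article. It shows that a stringified annotation is stored as plain text, which is why the class body can mention its own name, and that `typing.get_type_hints` can resolve it afterwards.

```python
from typing import List, Sequence, get_type_hints


class Node:
    def __init__(self) -> None:
        self._subs: List["Node"] = []

    def add_sub(self, sub: "Node") -> None:
        self._subs.append(sub)

    def get_subs(self) -> Sequence["Node"]:
        return self._subs


# The annotation is kept as the string "Node"; nothing is looked up while
# the class body is still being executed.
raw_hint = Node.add_sub.__annotations__["sub"]

# Resolve the string on demand. Passing localns explicitly is an extra
# precaution so this also works when Node is not a module-level name.
resolved_hint = get_type_hints(Node.add_sub, localns={"Node": Node})["sub"]

root = Node()
root.add_sub(Node())
```

Here `raw_hint` is the string and `resolved_hint` is the class itself.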

> This complexity helped drive the introduction of the auto keyword to C++.
> There are many situations where writing the type information isn’t workable.
> This is especially true when dealing with parametric container classes,

This is absolutely wrong! You cannot annotate a function as returning `auto`
in C++. `auto` and its variants in other languages are useful for eliding
redundant type _declarations_. Especially long ones that use
generics/templates. `auto foo = MyContainer<Tuple<int, str>>();` or whatever
is nicer than having to write the type declaration twice. But, in C++ or Java,
if you write a function that returns a MyContainer<Tuple<int, str>>, you have
to write that in the function.

This is an intentional choice: function declarations are your apis, and
explicit and clear APIs are useful for human readers of your code. That's why
you should annotate the public apis[0] even when you can have them be type
inferred. Speaking of which, if you want type inference, check out pytype[1],
it's like MyPy, but does do type inference on unannotated code. But you should
still annotate your public apis. It serves as a sanity check that you aren't
accidentally returning something you don't expect.

And of course, being familiar with the differences between iterables,
iterators, sequences, containers, etc. is not a bad idea.
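
As a quick illustration of those differences (a sketch using the standard `collections.abc` classes; the variable names are made up):

```python
from collections.abc import Iterable, Iterator, Sequence

numbers = [1, 2, 3]                 # list: a Sequence (len, indexing), hence Iterable
unique = {1, 2, 3}                  # set: Iterable, but no positional indexing
doubled = (n * 2 for n in numbers)  # generator: an Iterator, consumed once

checks = {
    "list is Sequence": isinstance(numbers, Sequence),
    "list is Iterator": isinstance(numbers, Iterator),
    "set is Iterable": isinstance(unique, Iterable),
    "set is Sequence": isinstance(unique, Sequence),
    "generator is Iterator": isinstance(doubled, Iterator),
    "generator is Sequence": isinstance(doubled, Sequence),
}

first_pass = list(doubled)
second_pass = list(doubled)  # the iterator is already exhausted here
```

A function annotated as returning `Sequence[...]` promises indexing and `len`; one annotated as returning `Iterator[...]` only promises a single pass.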

[0]:
[https://google.github.io/styleguide/pyguide.html#3191-general-rules](https://google.github.io/styleguide/pyguide.html#3191-general-rules)

[1]: [https://github.com/google/pytype](https://github.com/google/pytype)

~~~
dragonwriter
> But you should still annotate your public apis.

The good thing about inference (especially with a REPL) is you can write it
without the annotation, and then use the inferred type (in Haskell, I usually
find that when I resist the temptation to explicitly annotate types, the
actual types are more general than I would have specified.)

~~~
klodolph
In Haskell, I usually find the quality of error messages to be much worse
without top-level annotations.

~~~
dragonwriter
I agree with that, too. My usual practice with Haskell is leave types off to
leverage the information gained from type inference (with the intent of
annotating signatures when I'm done), but then tell Haskell what I'm thinking
the types should be if things break with impenetrable error messages.

But my coding in Haskell is pretty much personal and toy projects; I like the
approach, but it may not be ideal for coding in anger.

------
rcfox
I recently added type annotations to my Python 3.7 project and found it very
helpful. Just the process of adding the annotations caught a handful of bugs!

Some things I found useful:

1\. from __future__ import annotations

As I mentioned in another comment, this lets you write annotations naturally
without worrying about when something gets defined.

2\. typing.TYPE_CHECKING is only True while type checking.

This allows you to conditionally import files that would otherwise cause a
circular dependency so that you can use the symbols for annotations.
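
A rough sketch of that pattern. `decimal` is only a stand-in here for a module that would otherwise import this one back; the function is hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker follows this import; at runtime it is skipped,
    # so it cannot take part in a circular import.
    from decimal import Decimal


def describe(amount: "Decimal") -> str:
    # The annotation is a string, so the name Decimal is never looked up
    # when this module is executed.
    return f"amount={amount}"


label = describe(3)
```

With `from __future__ import annotations` in effect, the quotes around the annotation should not be needed.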

3\. def foo(bar: str = None) -> str

If an argument defaults to None, then mypy will recognize that its type is
actually Optional[whatever]. So in the example above, bar is an Optional[str].
(I'm not sure if this is mypy-specific.) Optional[T] is equivalent to Union[T,
None].
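
A small sketch of point 3 with the Optional spelled out. Whether a bare `bar: str = None` is accepted depends on the checker and its configuration (worth verifying for your mypy version), so the explicit form is the safer spelling. The function body is hypothetical.

```python
from typing import Optional


def foo(bar: Optional[str] = None) -> str:
    # The checker forces the None case to be handled before bar is
    # treated as a str.
    if bar is None:
        return "default"
    return bar.upper()


without_arg = foo()
with_arg = foo("abc")
```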

4\. You can make typedefs easily.

MyType = Union[Sequence[str], Dict[str, int]]

Makes writing complicated annotations easier.
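
For example, the alias can then be used wherever the full type would go. This is a sketch; `count_items` is a made-up function.

```python
from typing import Dict, Sequence, Union

# The alias is an ordinary assignment; checkers treat the name as the type.
MyType = Union[Sequence[str], Dict[str, int]]


def count_items(data: MyType) -> int:
    # Both alternatives support len(), so no narrowing is needed here.
    return len(data)


from_list = count_items(["a", "b", "c"])
from_dict = count_items({"a": 1, "b": 2})
```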

~~~
yegle
If your code doesn't need to be py 2/3 compatible it's fine to annotate using
str. But if it does, typing.Text and bytes are better.

~~~
quietbritishjim
Am I missing something? Annotations aren't supported by Python 2 so if you're
using them then you never need to worry about Python 2/3 compatibility.

~~~
joshuamorton
Yes and no.

Annotations in comments or in type stubs are supported in python2 (you can
look at typeshed for typing.Text and conditional py2/3 stuff). There's also
some other cases, but they're...unique.
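
A sketch of the comment form, which is plain syntax to the interpreter and therefore parses on both Python 2 and 3. The function is hypothetical.

```python
from typing import List, Text


def join_names(names, sep=u", "):
    # type: (List[Text], Text) -> Text
    # The signature lives in the comment above, so no annotation syntax
    # is needed in the def line itself.
    return sep.join(names)


joined = join_names([u"a", u"b"])
```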

------
Deimorz
The first issue is fixed by just using a string. Yes, you can't do
Sequence[Node], but you can do Sequence["Node"].

The docs talk about this here ("class name forward references"):
[https://mypy.readthedocs.io/en/stable/kinds_of_types.html#class-name-forward-references](https://mypy.readthedocs.io/en/stable/kinds_of_types.html#class-name-forward-references)

~~~
BurningFrog
OK, that works, but it does seem dumb, or even deliberately obtuse.

The language could just as well recognize Node as it recognizes 'Node', but
instead it is given as a burden for the human programmer to handle.

Is it maybe because mypy is not really part of the Python _language_, and
can't really make changes to it?

~~~
_ZeD_
This is not mypy, it's "python-the-language": the annotations are values, and,
in this case, the method annotations are evaluated when encountered - which is
_before_ the end of the class. So you are trying to use a value still not
defined. It's like doing something like

    
    
        def foo() -> Bar:
           ...
    
        class Bar:
           ...
    

Obviously the Python interpreter cannot find a "Bar" definition when the def
statement for foo runs, because the annotation is evaluated before class Bar
exists.
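
A rough probe of this behaviour; the helper name is made up. On interpreters that evaluate annotations when the def statement runs (the Python 3.7 era behaviour discussed here), the probe reports True. Interpreters that defer annotation evaluation would report False instead, so the result is deliberately not hard-coded.

```python
def annotation_is_evaluated_eagerly() -> bool:
    """Return True if an undefined name in an annotation fails at def time."""
    try:
        def foo() -> Bar:  # Bar is intentionally not defined anywhere
            ...
    except NameError:
        # The annotation was evaluated while executing the def statement.
        return True
    # The def statement succeeded, so the annotation was not evaluated yet.
    return False


eager = annotation_is_evaluated_eagerly()
```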

~~~
BurningFrog
You're saying this is how the language works.

I'm saying the language could work differently.

The case of referring to a class inside its definition is quite different from
your "Bar" case. The thing referred to is already declared, it's just not
_fully_ defined yet.

There is no logical reason I can see making it impossible for Python to honor
the obvious intention of the programmer here. It seems it has just chosen not
to do so. Though I'll admit I haven't thought through every weird corner case
:)

------
totalperspectiv
I really like using and writing type annotated Python, but every time I do, I
wonder if I should just use a language where all the annotation time I put in
gets me actual runtime improvements.

~~~
carlmr
And accurate compile time checks.

I recommend checking F# if you want mostly inferred, strong static types and
terse pythonesque Syntax (few braces, significant whitespace). Also much
faster.

Considering you can use all of .NET from F# you can benefit from the work on
C# as well.

------
sametmax
While the first complaint is just a reflection of the lack of good tutorials on
mypy, the second one is very valid: I too wish for automatic type detection
for obvious cases.

What's more, while I found mypy useful myself in several instances, it's still
cumbersome to use:

\- you must know to use the magic command line arguments to avoid mypy
complaining about other libs and imports all the time

\- you need to use List[], Dict[], Set[], Iterable[], Tuple[], etc. instead
of list[], dict[], set[], iter[], tuple[], etc., which means an import in
almost every file and a very unnatural workflow.

\- you need to use Union instead of |. This is ridiculous. I have to do:

    
    
        from typing import Union, List
    
        def mask(...) -> List[Union[bool, int]]
    

Instead of:

    
    
        def mask(...) -> list[bool|int]
    

And Guido explicitly rejected the proposal for those on github despite the
fact most imports are due to those.
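
For reference, a runnable version of the first spelling. The parameters and body of `mask` are hypothetical, since the snippet above elides them with `...`.

```python
from typing import List, Union


def mask(values: List[int], threshold: int) -> List[Union[bool, int]]:
    """Keep values at or above the threshold, replace the rest with False."""
    return [v if v >= threshold else False for v in values]


masked = mask([1, 5, 3], 3)
```

Later Python releases accept the second spelling directly: `list[...]` from 3.9 (PEP 585) and `bool | int` from 3.10 (PEP 604).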

Now, mypy has improved a lot. It's way faster, shows many more things than
before, has far fewer false positives, and has support for duck typing and
dunder methods (named "protocols").
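
A minimal sketch of the protocol support mentioned here, assuming Python 3.8+ where `typing.Protocol` is available. All class and function names are made up.

```python
from typing import Iterable, Protocol


class SupportsClose(Protocol):
    def close(self) -> None: ...


class Connection:
    # Does not inherit from SupportsClose; it matches by shape alone.
    def __init__(self) -> None:
        self.open = True

    def close(self) -> None:
        self.open = False


def close_all(items: Iterable[SupportsClose]) -> int:
    # Any object with a compatible close() method is accepted by the checker.
    count = 0
    for item in items:
        item.close()
        count += 1
    return count


connections = [Connection(), Connection()]
closed = close_all(connections)
```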

But it's sad to think TypeScript is actually easier to use. Having a JS tech
easier to use than a Python tech is a good sign we can improve things.

------
j-e-k
I would have to disagree with this post (first point is actually inaccurate
and the second I could go either way on).

Type annotations/mypy, especially when coupled with dataclasses/pydantic has
been very helpful in maintaining a rather substantial 3.7 codebase.

~~~
trimbo
> rather substantial 3.7 codebase

What size codebase is substantial?

~~~
j-e-k
Excluding tests/patches to third party libs, around 65K LOC

------
ggregoire
Why would you keep Python if you want static typing? I mean, there are several
modern, performant (more than Python actually) statically typed languages.
Rust, Go, Kotlin, Scala, Swift… All of them are mature and have a good library
ecosystem. So, except if you need some math/AI/ML packages available only for
Python, why bother? Choose the right tool for the job, no?

~~~
pacala
Scala: Python + dataclasses [+ mypy] offers a programming experience close to
pragmatic Scala. The tradeoff becomes:

* worse performance

* lack of immutable vectors, alleviated via style conventions

* rarely, awkward lambdas

vs.

* wide pool of people familiar with the language

* lack of JVM lockin: Scala native is not there yet, the library ecosystem is all JVM.

* [good] batteries included

* much better reflection

* access to modern ML

* gradual typing

* no actors

Rust: Manual memory management is verbose and distracting.

Go: Manual exception handling is verbose and distracting.

Kotlin: JVM lock-in.

Swift: Apple lock-in.

~~~
hermitdev
No immutable vectors in Python? Are you not aware of tuples?

Lambdas in Python are quite limited, but offset by natural nested local
functions.

Regarding performance, that's not always an issue, and depending on the use
case can be addressed with async/await or using a multiprocessing pool. I have
had issues with multiprocessing and some DB libs in the past, but recently
have had pretty good success.

Multithreading is still an issue because of the GIL (global interpreter lock)
and really only useful if you're calling into a C/C++ lib that releases the
lock while it does its native things.

I've written a lot of Python in the last 15 years, and I've written a ton of
scripts that are faster than a well-written identical C/C++/C# app, and were
written in 1/10th of the time. It just depends on the use case.

~~~
pacala
Sorry, I meant persistent vectors. Data structures with type Vector[T] instead
of Tuple[T0, T1, ...], amortized O(1) updates and list-like syntax for
literals, e.g. i[x0, x1]. It's a minor annoyance, in most cases using vanilla
Python lists with the convention 'any list that is part of a dataclass / is
passed across function boundaries should never be mutated' is good enough. As
u/joshuamorton noticed in this thread, the convention can be enforced with
mypy and Sequence[T], which is great!
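
A sketch of that convention, with made-up names. The list is still mutable at runtime; it is the `Sequence` annotation that lets a checker flag mutation attempts.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Route:
    stops: Sequence[int]


def total(route: Route) -> int:
    # route.stops.append(4)  # a checker such as mypy would reject this line,
    #                        # because Sequence declares no append() method
    return sum(route.stops)


route = Route([1, 2, 3])
route_total = total(route)
```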

------
nickm12
This is a really poor article with a click-bait headline. I was expecting this
to be a case study where Python type annotations just couldn't work due to
some interesting specifics of the projects. Instead, it's some petty gripes
about the syntax of annotations not being just to the author's liking.

------
davesque
The first point can be rather easily circumvented by using strings as forward
references and it only takes a bit of googling to figure that out. The second
doesn't seem remotely like a deal breaker to me. Half the reason for having
type annotations is to provide a description of what a function or method does
at a glance. This is idiomatic even in hardcore functional languages like
Haskell. With return type inference, you don't get that benefit.

~~~
cjwoodall
It seems to me that type inference would be silly in this case. Type
suggestion during mypy checks might be useful though.

Function x does not specify a type, but returns SomeHelpfulTypeString. We
recommend you use that.

Put it behind a flag and call it a day. Maybe even have a way of requesting
type suggestions for functions during development

------
perlgeek
I think the premise of this blog post is misguided.

Yes, there are two things about introducing types that the author finds
annoying. So don't do those two things, and use type annotations for all the
rest.

The type annotation system and mypy are designed to give you benefits even if
you don't fully annotate everything -- that's the starting point of nearly all
existing python code bases.

------
neuland
Other people have mentioned that the self-referential case from this post is
doable. But what is not doable (to my knowledge) is this kind of self-
reference:

    
    
        class Node(typing.NamedTuple):  # or dataclass
            child: 'Node'
    

The error is _example.py:4: error: Recursive types not fully supported yet,
nested types replaced with "Any"_.

This error has been in there for several years at least. And it always bites
me when I forget. It's particularly cryptic when the self-reference is many
levels deep. I've even had the error message break in these cases where it
prints out a line number from the wrong file. So, then you have to hunt to see
where the self-reference is hiding.

~~~
rcfox
Just tried this myself, and it seems to only be an issue for
typing.NamedTuple. Though, it still manages to do type checking correctly...

    
    
        from __future__ import annotations
        from typing import Optional, NamedTuple
        from dataclasses import dataclass
    
        class Foo:
            def __init__(self, foo: Foo = None) -> None:
                self.child: Optional[Foo] = foo
    
        Foo(Foo(Foo(None))) # Works
        Foo(Foo(Foo(1))) # error: Argument 1 to "Foo" has incompatible type "int"; expected "Optional[Foo]"
    
        @dataclass
        class Bar:
            child: Optional[Bar]
    
        Bar(Bar(Bar(None))) # Works
        Bar(Bar(Bar('bar'))) # error: Argument 1 to "Bar" has incompatible type "str"; expected "Optional[Bar]"
    
        class Hep(NamedTuple): # error: Recursive types not fully supported yet, nested types replaced with "Any"
            child: Optional[Hep]
    
        Hep(Hep(Hep(None))) # Works
        Hep(Hep(Hep({'a': 'b'}))) # error: Argument 1 to "Hep" has incompatible type "Dict[str, str]"; expected "Optional[Hep]"

~~~
lozenge
It looks like the constructor is being typechecked, but probably not attribute
access. `val.child.junk` will pass because `val.child` is `Any` and you can do
any-thing to an Any.

~~~
rcfox
That seems to work too..

    
    
        a = Foo(Foo(Foo(None)))
        b = Bar(Bar(Bar(None)))
        c = Hep(Hep(Hep(None)))
    
        a.child.junk = 1
        # error: Item "Foo" of "Optional[Foo]" has no attribute "junk"
        # error: Item "None" of "Optional[Foo]" has no attribute "junk"
    
        b.child.junk = 'a'
        # error: Item "Bar" of "Optional[Bar]" has no attribute "junk"
        # error: Item "None" of "Optional[Bar]" has no attribute "junk"
    
        c.child.junk = 1.1
        # error: Item "Hep" of "Optional[Hep]" has no attribute "junk"
        # error: Item "None" of "Optional[Hep]" has no attribute "junk"
    
        a.child = 1
        # error: Incompatible types in assignment (expression has type "int", variable has type "Optional[Foo]")
    
        b.child = 1
        # error: Incompatible types in assignment (expression has type "int", variable has type "Optional[Bar]")
    
        c.child = 1
        # error: Property "child" defined in "Hep" is read-only
        # error: Incompatible types in assignment (expression has type "int", variable has type "Optional[Hep]")
    

I'm using mypy 0.701 with Python 3.7.3.

~~~
neuland
Hey, thanks a lot for pointing this out! Yet another reason to use data
classes vs. named tuples.

Looks like just plain class level attribute declaration also works:

    
    
        class Node:
            child: 'Node'
    

I wonder why `typing.NamedTuple` is unique as far as it not working. I know
they don't use `eval` and templating to create it anymore. From looking at the
code [0], they're using metaclass / __new__. But other than the fact that it
uses metaclasses, I'm not sure why it'd have an error. Obviously there's cycle
handling in mypy, otherwise none of the examples would work. If I use an
example with a metaclass, that also typechecks. So, it's not metaclasses that
trigger the error.

    
    
        class NodeBase(type):
            def __new__(cls, name, bases, attrs):
                return super().__new__(cls, name, bases, attrs)
    
    
        class Node(metaclass=NodeBase):
            child: 'Node'  # works
    

Edit: it looks like there is now work going into this and other cases where
recursive types don't work [1][2]. For example: `Callback =
typing.Callable[[str], Callback]` and `Foo = Union[str, List['Foo']]`.

[0]
[https://github.com/python/cpython/blob/3.8/Lib/typing.py#L15...](https://github.com/python/cpython/blob/3.8/Lib/typing.py#L1593)

[1]
[https://github.com/python/mypy/issues/731](https://github.com/python/mypy/issues/731)

[2]
[https://github.com/python/mypy/issues/6204](https://github.com/python/mypy/issues/6204)
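
For illustration, the `Foo` alias from the edit above is valid at runtime, since the string defers the self-reference. Whether a given checker version accepts it is exactly what those issues track, so treat this as a sketch. `flatten` is a made-up helper.

```python
from typing import List, Union

# Recursive alias quoted above; the string "Foo" defers the self-reference.
Foo = Union[str, List["Foo"]]


def flatten(value: Foo) -> List[str]:
    # A string is a leaf; a list is walked recursively.
    if isinstance(value, str):
        return [value]
    flat: List[str] = []
    for item in value:
        flat.extend(flatten(item))
    return flat


leaves = flatten(["a", ["b", ["c"]], "d"])
```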

------
whalesalad
I cannot stand typing in Python. It's never given me any saving grace in a
project, only headaches.

------
jks
What's the status with numpy types? I find two projects,
[https://github.com/machinalis/mypy-data/tree/master/numpy-mypy](https://github.com/machinalis/mypy-data/tree/master/numpy-mypy) and
[https://github.com/numpy/numpy-stubs](https://github.com/numpy/numpy-stubs)
but neither looks very complete and neither has been updated recently.

------
deepsun
Try Kotlin. Syntax is similar to Python (but better), but it's actually just
Java, with all the supporting ecosystem. Kotlin is just syntactic sugar.

~~~
amelius
> Try Kotlin. Syntax is similar to Python (but better)

Well, from [1], I'd say syntax is closer to C.

[1] [https://kotlinlang.org/docs/reference/control-flow.html](https://kotlinlang.org/docs/reference/control-flow.html)

------
jimmaswell
What's possibly so cumbersome about defining return types?

It does seem a little embarrassing that the annotations don't even support
circular references.

~~~
adrianratnapala
Circular references like that are called "recursive types" in formal type
theory. They are one of those big issues that academics like to write papers
about.

Even if you just want to write a practical interpreter, and choose to gloss
over the issues, they will still come back in some disguised form, either
by requiring some sort of implementation kludge, or just by creating weird
edge cases.

~~~
heavenlyblue
And how is that an issue for type checking, would you elaborate?

------
dragonwriter
> Or do you have any idea what type get_closure returns?

It doesn't.

Because of the syntax error.

------
Barrin92
I feel baking type annotations on top of dynamic, late binding languages is
fundamentally misunderstanding what they are about. Python might not go as far
as say, smalltalk, but if you're picking a dynamic language just accept the
runtime dynamism and don't program against it.

~~~
heavenlyblue
Or well, accept the dynamism and use type declarations as automated
declarative test coverage, at the very least.

