
The state of type hints in Python - BerislavLopac
https://www.bernat.tech/the-state-of-type-hints-in-python/
======
thosakwe
I will always wish that Python weren't the lingua franca of sorts for machine
learning.

For me, it's always frustrating to have to guess the types of parameters in
functions without appropriate documentation, especially considering that I
don't know much of the language.

Combined with the fact that static analyzers for Python can very easily get
slow (ex. completions for Tensorflow always take multiple seconds, which IMO
is unacceptable), it's just not something that makes the language seem like a
reasonable choice to me.

I was working on Tensorflow bindings for Dart, which is Java-level statically-
typed now, and in porting functions from Python found myself absolutely lost
as to what any of the code actually did.

I might be the only one who dislikes Python to this degree, though.

~~~
jdonaldson
Nah, I'm with you. Trying to do the same in Haxe. There's not a first class
language for typed tensor ops. Type information could include dimensionality,
and remove a whole bunch of stupid, time consuming errors.

~~~
LolWolf
What about Julia? It’s what we’ve been teaching our Stanford ML class with and
it’s _fast_! Also typed and JIT with multiple dispatch. Arrays here don’t have
the Numpy/Scipy weirdness of matrices vs. ndarrays and linalg is truly first-
class.

This comes from someone who used to despise the language, but it’s truly come
a long way.

~~~
Xcelerate
I'm still amazed that Julia hasn't taken off in six years. It's this great
language that solves most of the problems that other (scientific computing)
languages have, and hardly anyone uses it. I use it all the time for personal
projects, but I use Python at work. Looking forward to the day I can use Julia
for everything.

~~~
aurelian15
My primary issue with Julia is that it has a relatively high latency in a REPL
environment. I and many people I know primarily use REPL environments (e.g.
Jupyter Lab) for scientific computing, so this is a pretty relevant use-case.
For example, if I start Julia and type

    
    
       [1 2 3; 4 5 6; 7 8 9] ^ 2
    

I have to wait about 5 seconds for a response (on a first generation Core i7,
SSD). On the other hand, running the following in a fresh Python interpreter
is almost instantaneous:

    
    
       import numpy as np
       np.linalg.matrix_power([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2)
    

Unfortunately, in most scenarios the actual execution speed (where Julia is
far superior) is secondary. People just tend to run larger experiments over
night; and as long as you can express your code in terms of numpy matrix
operations, Python is fast enough.

~~~
StefanKarpinski
That taking 5 seconds is very strange. I have an early Core M (mobile laptop
chip, much slower than Core i7, which is a desktop chip) and that expression
takes 0.7 seconds at a fresh prompt. That's still much worse JIT compilation
delay than we'd like it to be, but 5 seconds is either a very bad
configuration or perhaps a bit of hyperbole? There are other situations like
time-to-first-plot where compile times do cause a really serious delay that is
a very real problem—and a top priority to fix.

~~~
aurelian15
Tried again this morning after rebooting the computer -- turns out I was low
on RAM yesterday evening. After starting Julia a few times to make sure it is
cached I get the following results:

    
    
      time julia -e '[1 2 3; 4 5 6; 7 8 9] ^ 2'
      real	0m1.629s
    

And for Python/Numpy

    
    
      time python -c 'import numpy as np; print(np.linalg.matrix_power([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2))'
      real	0m0.103s
    

Edit: Julia Version is 0.6.3, installed directly from the Fedora 28
repositories.

~~~
b2gills
And people think Perl 6 is slow

    
    
      time perl6 -e 'say [1, 2, 3;  4, 5, 6;  7, 8, 9] >>**>> 2'
      [(1 4 9) (16 25 36) (49 64 81)]
    
      real	0m0.170s
    

Note that the majority of that time is just loading Perl 6.

    
    
      time perl6 -e 'Nil'
      real	0m0.156s
    

Perhaps someone could create a slang module for Julia in Perl 6, as that would
be a fairly easy way to improve its speed. (Assuming Julia is easy to parse
and doesn't have many features that aren't already in Perl 6)

------
hellofunk
I'm a fan of this multi-language trend of gradual and optional typing. Static
typing has definite benefits, but there are also diminishing returns on
productivity as a language becomes more and more statically safe. Often times,
type safety is a rabbit hole that can easily turn a language into what some
call a "puzzle language," where the efforts to please the compiler become non-
intuitive and take long times to figure out. Some of the classes in Swift for
example have to adopt a hundred protocols just to compile. Rust is an even
more extreme example. You might get extra safety, but at significant
development cost.

Gradual typing in Python is a happy medium that lets the developer decide (to
some degree) on the workflow up front. It's a nice compromise for a wide range
of work.

I love this explanation from a few months ago:

[https://blog.merovius.de/2017/09/12/diminishing-returns-of-s...](https://blog.merovius.de/2017/09/12/diminishing-returns-of-static-typing.html)

~~~
burntsushi
> Rust is an even more extreme example. You might get extra safety, but at
> significant development cost.

This is not the indisputable fact that you seem to be portraying this as. I've
written similar amounts of Rust and Python, and it is much much easier and
faster for me to write robust software in Rust than in Python. Python _might_
let me get to a prototype more quickly in some cases, but the amount of time
spent on maintenance after the fact puts Rust very clearly in the lead in my
experience.

There's no doubt that Rust's learning curve is much much steeper than
Python's, but this is a very different thing than saying something general
about overall development cost.

~~~
hellofunk
Do you not find that lack of a REPL-oriented workflow inhibits productivity? I
sure do. I love C++ and don't use Rust beyond hobby interest, but being able
to do things interactively is definitely an advantage in languages like
Python.

~~~
burntsushi
Is a REPL nice to have? Yes, absolutely! It's a nice quality of life
enhancement. But in terms of development cost, it's _barely_ a blip.

------
MR4D
Fantastic article - well worth the read!

One key takeaway I had is that python is closing in on its own “Perl 6”
moment. It seems to me (and this is just my opinion) that in adding type
hints, the complexity has gone up considerably. (Don’t think about the simple
syntax; rather, think about all the gotchas and edge cases that are going to
cause productivity issues for developers in larger codebases.)

I really wonder (and have for a while - only now even more so), if Python has
gone about it the wrong way.

I don’t develop often anymore because it’s no longer my job, but really do
love it when I get to spend time on it. And python, for many reasons, is my go
to language (sorry for the pun).

But now I’m starting to look at Swift and Rust, figuring if I’m going to go
thru these headaches, I might as well get a more modern language that runs
fast as blazes as a nice side benefit.

I’d love some guidance from anyone who’s further down the road with this
issue.

~~~
oddity
I think Swift might be closer to what you want than Rust.

I like both, but Rust and Swift are really for two very different groups and
solve different problems. Rust is for the C/C++/assembly programmers who would
never even think of using Java or Python. Swift is a modern Java.

If you're coming from Python, I think Swift is closer to what you want since
Rust will probably seem more complex for reasons that don't have benefit for
the Python-style use case.

~~~
Longhanks
The only thing in which Swift relates to Java is syntax. Apart from that, they
couldn't be more different.

And if you start digging deeper, you get to see that Swift is a very complex
language, too. It's easy to get started, probably easier than Rust, but it has
its own kind of quirks.

~~~
akvadrako
Swift does have a lot of syntax, but it's quite well designed and that
contributes to the reference guide being organised in a useful manner. So that
complexity is more manageable than C++ or Rust.

Once they can remove lots of the ObjC compatibility parts, it should get
better too.

------
jlarocco
On one hand it's nice that there's a standard way to specify types in Python
now, but TBH, if I'm going through the trouble of specifying types then I'm
going to use a language that has them built in and fully exploits them. In
particular, not using them for performance tuning seems like a big miss.

~~~
hodgesrm
Agree. At that point it's just better to go back to Java or <insert your
favorite statically typed language here>. Java has a broad solution to type
safety _and_ a standard doc format for code. Sure there are awkward corner
cases around things like type erasure but I find Java for the most part just
works.

~~~
welder
Disagree, performance is a side-effect of type safety... its main use is to
prevent bugs and help devs reading code. Also we use Python for the library
ecosystem, easy to read syntax etc. which Java will probably never match. For
performance, use Go or Rust and get everything Java has plus an active modern
community.

~~~
thermodynthrway
Performance and type safety are absolutely linked. The only major reason JS is
still 5-10x slower than Java is looseness of types preventing optimizations.

The Java community is still vibrant and much larger than Go and Rust put
together. Modern stuff like RxJava, Lombok, MapStruct, streams, Guava, Gson,
Retrofit, DropWizard, Vert.x, etc. makes Java a breeze these days. The
learning curve is just steep.

And the Java ecosystem is easily larger than Python. Python may be nearly as
popular now, but Java has been in the top three for nearly two decades.

A lot of shops use "vanilla Java" out of ignorance without modern libraries
and it's crufty as hell. It's similar to shops still hacking with jQuery vs
those that have awakened and use React and TypeScript, practically a different
language.

~~~
outworlder
> And the Java ecosystem is easily larger than Python. Python may be nearly as
> popular now, but Java has been in the top three for nearly two decades.

And we should not care in 2018. Let me use Rust or whatever, I'll call your
Java-based service if I need to use it.

~~~
thermodynthrway
gRPC is a godsend

------
tyfon
"One of the main selling points for Python is that it is dynamically-typed."

This is actually quite a surprise to me, I am using python despite this as I
work as an analyst and many libraries are written for python. I've never seen
any advantages of dynamic typing, in fact I tend to avoid languages with it if
I can.

Are there any valid reasons to have it in a language?

~~~
zimablue
Most typesystems are limited in what they can express, and as they get more
complicated in what they can express they tend to get more complicated to
understand.

A typesystem is a limitation you place on your code which you accept in the
hope that the things which it proves true (problems it avoids) are worth the
limitations that you've accepted. You can set everything to Object but the
culture that develops around statically typed code tends to mean that people
are really afraid to ever do this or be ostracised.

From a data moving point of view there's a couple of examples I hit in
production where a typesystem isn't helping me:

* DataFrames tend to gain/lose columns on every operation, so to represent
them in types I'd need a zillion types with code duplication or an interface
for each field name. Some operations aren't even possible to type, like
pivot_table. So tons of my code is really not helped by typing.

* Everything turns into a DAG of data flows. You can do this with typing, but
the overhead would mean that no one would come up with something like mdf (I
don't use mdf in prod but it's impressive for specific applications).

If you want a better defense of dynamism google Rich Hickey, a point he made:

There's many choices to be made for typesystems, you can have generics/not,
higher kinded types, dependent typing. Maybe it's better not to enforce these
choices right at the root language level and allow them to happen at a higher
level (be plugged in).
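
To make the pivot_table point concrete, here is a minimal pure-Python sketch. The `pivot` helper and the sample rows are hypothetical and are not the pandas API; they only imitate the idea. The keys of each output row are taken from values in the input, so an annotation can say little more than "a dict of strings to floats":

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def pivot(rows: List[Row], index: str, columns: str, values: str) -> Dict[str, Dict[str, float]]:
    """Hypothetical helper: one output row per `index` value, one key per `columns` value."""
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for row in rows:
        # The inner key comes from the data itself, so it is unknown until runtime.
        table[str(row[index])][str(row[columns])] = float(row[values])
    return dict(table)


rows = [
    {"city": "Oslo", "year": "2017", "sales": 10},
    {"city": "Oslo", "year": "2018", "sales": 12},
    {"city": "Bergen", "year": "2018", "sales": 7},
]
wide = pivot(rows, index="city", columns="year", values="sales")
print(wide)
```

Whether "2017" exists as a key depends entirely on the rows passed in, which is roughly why a per-column static type is hard to write down for this kind of operation.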

~~~
Shoue
> DataFrames tend to gain/lose columns on every operation, so to represent
> them in types I'd need a zillion types with code duplication or an interface
> for each field name. Some operations aren't even possible to type like
> pivot_table.

Couldn't this be done with dependent types?

~~~
zimablue
For pivoting? I don't know about dependent types, sometimes it's possible in
theory to know the type of the pivot columns because you know the possible
combinations of column fields that you're pivoting from, but is that
dependent-typable?

I think there's a kind of Motte and Bailey that goes on with people who think
typesystems are the one true god. They choose the simplest typesystem as their
example of the cognitive/learning overhead, (say C#?)

But if you question the limitations they go to "but it's possible in X obscure
typesystem".

~~~
Derbasti
But that's exactly to the OP's point: You need to _think_ about this. You need
to figure it out beforehand. In Python, you can express this without that
mental overhead--particularly while prototyping.

------
RcouF1uZ4gsC
This article talks about Python getting type hints. There was an HN post
recently about Ruby getting a type checker
([https://news.ycombinator.com/item?id=17217815](https://news.ycombinator.com/item?id=17217815))

I think the debate is largely going in favor of static type checking. The
widespread adoption of type inference in many statically typed languages (var
in Java, auto in C++, let in Rust) has greatly eased the ergonomics of using a
statically typed language.

Unless there is a massive ecosystem that you need to take advantage of (for
example javascript), in 2018 greenfield development should probably be in a
statically typed language.

~~~
brianberns
I agree, and yet dynamic languages such as Python continue to gain popularity
because they are supposedly easier for newbies to pick up. I think we
(experienced software developers) have to do a better job of educating
newcomers about the dangers of dynamic typing.

[https://stackoverflow.blog/2017/09/06/incredible-growth-pyth...](https://stackoverflow.blog/2017/09/06/incredible-growth-python/)

~~~
seanp2k2
The more TypeScript I write, the less I like Python. In a recent project that
involved pulling some data from Hive via Pandas dataframes and merging it with
data from a JSON REST API, then doing some calculations and stuffing that data
into a different JSON REST API (with a totally different shape to the data), I
kept running into little edge cases that Python forces you to handle
explicitly vs Javascript just doing what you want. It also annoys me to an
irrational degree that Python has "dicts" and not objects, "lists" and not
arrays, KeyError everywhere, error looking up dict keys as ints (you can only
look them up by str with no implicit type conversion), can't say
"dict.keyname", have to do dict['keyname'] or usually dict.get('keyname') and
handle things like dicts within dicts where you're not sure if keys exist even
more verbosely, etc

example of something annoying that came up:

    
    
        foo = [1, 2, 3]
        bar = {'1': 1, '2': 2, '3': 3}
        for i in foo:
            print(bar.get(i))
    

What does it output? None x3. In order to get it to do the thing, I did
something like this:

    
    
        foo = [1,2,3]
        bar = {'1': 1, '2': 2, '3': 3}
        for i in foo:
            str_i = str(i)  # even more fun when this throws
            print(bar.get(str_i))
    

I use python because most of my coworkers understand it and it has wide
library + runtime support where I work, but it's definitely not my favorite
thing to work with. Dynamic typing where types still matter and where the
language is picky about implicit type conversions kind of sucks.

I also couldn't do:

    
    
        foo = [1,2,3]
        bar = {'1': 1, '2': 2, '3': 3}
        for str(i) in foo:
            print(bar.get(i))
    

Which was also _eyeroll_. Like, I get it, but...c'mon. Working with Pandas is
also more or less learning another language, with the insanity that is
df.T.whatever and df.loc[]:

    
    
        df.loc[df['shield'] > 6, ['max_speed']]
    

I'm sure that someone will comment about how I'm an idiot and there's a much
better way to do it, but given PEP20:

    
    
        There should be one-- and preferably only one --obvious way to 
        do it. Although that way may not be obvious at first unless 
        you're Dutch.
    

I guess I'm not Dutch enough.
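
For reference, a small sketch of two ways the int-vs-str key mismatch in the examples above could be handled. It assumes every key in `bar` is a valid integer string; `bar_by_int` is a name introduced here for illustration:

```python
foo = [1, 2, 3]
bar = {'1': 1, '2': 2, '3': 3}

# Option 1: convert the loop values, which is what `for str(i) in foo` was reaching for.
for key in map(str, foo):
    print(bar.get(key))

# Option 2: normalize the dict keys once, then look up by int everywhere else.
# int() raises ValueError if a key is not an integer string.
bar_by_int = {int(key): value for key, value in bar.items()}
for i in foo:
    print(bar_by_int.get(i))
```

Option 2 keeps the conversion in one place, which tends to matter more once the same dict is read from several functions.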

~~~
antod
So, in a discussion where most complaints about Python are that it is
dynamically typed (vs statically typed eg Java, Go, Rust etc) you're
complaining that it is strongly typed (eg vs weakly typed eg Javascript)?

I don't have a problem with that - it just seemed kinda incongruous in that
context. I imagine the other critics would be even more unhappy about implicit
type conversions.

------
abakus
I use Numpy NDArray and pandas DataFrames a lot, and often times you want to
validate the shape / columns / dtype of these objects. I have yet to find a
clean and meaningful way to type hint these, and ended up just putting
assertions at the beginning of code.
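
A rough sketch of what such up-front assertions might look like. `check_array` and `check_columns` are hypothetical helpers, not part of NumPy or pandas; they only read the `.shape`, `.dtype` and `.columns` attributes, and the demo uses a stand-in object so the block runs without either library installed:

```python
from types import SimpleNamespace
from typing import Any, Optional, Sequence, Tuple


def check_array(arr: Any, shape: Tuple[Optional[int], ...], dtype: Optional[str] = None) -> None:
    """Assert that an array-like has the expected shape; None in `shape` matches any size."""
    assert len(arr.shape) == len(shape), f"expected {len(shape)} dims, got shape {arr.shape}"
    for axis, (actual, expected) in enumerate(zip(arr.shape, shape)):
        assert expected is None or actual == expected, (
            f"axis {axis}: expected {expected}, got {actual}"
        )
    if dtype is not None:
        assert str(arr.dtype) == dtype, f"expected dtype {dtype}, got {arr.dtype}"


def check_columns(df: Any, required: Sequence[str]) -> None:
    """Assert that a DataFrame-like object has every required column."""
    missing = [name for name in required if name not in list(df.columns)]
    assert not missing, f"missing columns: {missing}"


# Stand-ins exposing the same attributes as an ndarray and a DataFrame.
fake_array = SimpleNamespace(shape=(4, 3), dtype="float64")
fake_frame = SimpleNamespace(columns=["shield", "max_speed"])

check_array(fake_array, (None, 3), "float64")   # any number of rows, exactly 3 columns
check_columns(fake_frame, ["shield", "max_speed"])
```

With the real libraries the calls would presumably look the same, for example `check_array(np.zeros((4, 3)), (None, 3), "float64")`.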

~~~
zeth___
I don't get what's so bad about assertions. It makes it much easier to demand
that a given object has a given method than forcing the object to be of a
certain class.

If anything adding a list of assertions at the top of a method/function/class
is much more readable than the current type hint implementation.

And if you're running out of resources move the assertions to your regression
tests.

~~~
wnoise
> It makes it much easier to demand that a given object has a given method
> than forcing the object to be of a certain class.

You know what's even easier? Calling the method. Gives as much useful
information in the case of a failure, and also only checks at runtime.

I want something that can check earlier than that, in two ways: before
running, and up the call stack.
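
One way to get "has a given method" checked before running is structural typing. The sketch below uses `typing.Protocol`, which is in the standard library from Python 3.8 (earlier it was available through the `typing_extensions` package). The class and function names are made up for illustration:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    """Anything with a close() method matches, regardless of its class."""

    def close(self) -> None: ...


class TempFile:
    def close(self) -> None:
        print("closed")


class NotCloseable:
    pass


def shutdown(resource: SupportsClose) -> None:
    resource.close()


shutdown(TempFile())  # fine: TempFile has close(), no inheritance needed

# A static checker such as mypy is expected to reject the next call
# without running the program, because NotCloseable has no close():
# shutdown(NotCloseable())

# runtime_checkable additionally allows an isinstance() test at runtime.
print(isinstance(NotCloseable(), SupportsClose))
```

The static check also applies to every caller of `shutdown`, which covers the "up the call stack" part.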

~~~
zeth___
Ok. Then why would you use a weakly typed language where the idea of types can
be easily and completely subverted with monkey patching?

~~~
wnoise
Library selection, currently running infrastructure, modifying existing
applications, etc, etc.

It doesn't have to work 100% of the time if the 99% of the time it does work
it's capable of catching serious errors earlier.

~~~
zeth___
As someone who's used python since the 2.1 days in high school, why not create
a language with the type system you want instead of killing python? There is a
reason why python is this good when it specifically chose to not implement the
features that are now being added to it.

------
comesee
The point of adding type hinting to Python is to create a smoother path from
prototyping to production level code. Contrast with prototyping in Python then
rewriting in C++. Soon typed Python code will be automatically transformable
to native code. Being able to _incrementally_ evolve a codebase from a
prototype to production level quality and performance without wholesale
rewrites will revolutionize development practices.

~~~
yorwba
> Soon typed Python code will be automatically transformable to native code.

Do you have a specific project in mind? I know of a few, like Cython, RPython,
Nuitka etc. that already do this, but use their own type annotations that are
partially incompatible with the Mypy type system.

One hurdle to native code generation is that most large Python projects will
contain a bit of metaprogramming that the type checker can't handle. That
isn't a problem if you can check it by hand and use assertions to signal the
type checker that you know what you're doing, but that code will also be
impossible to compile. There are some ways to get around that (e.g. RPython
allows arbitrary Python during initialization, which is executed at compile
time), but they might require a complete rearchitecturing of the program.
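
A minimal sketch of the "check it by hand and use assertions" pattern mentioned above. The `Config` class and `load` function are hypothetical; the point is only that attributes attached with `setattr` are invisible to a checker, so the author vouches for the type with an `isinstance` assertion or `typing.cast`:

```python
from typing import Any, Dict, cast


class Config:
    """Attributes are attached dynamically, so a type checker cannot see them."""


def load(values: Dict[str, Any]) -> Config:
    cfg = Config()
    for name, value in values.items():
        setattr(cfg, name, value)  # metaprogramming: names are only known at runtime
    return cfg


cfg = load({"retries": 3})

retries = getattr(cfg, "retries")   # typed as Any
assert isinstance(retries, int)      # verified at runtime, and narrows the type for the checker
print(retries + 1)

# cast() is only a promise to the checker; nothing is verified at runtime.
timeout = cast(float, getattr(cfg, "timeout", 1.5))
print(timeout)
```

The checker accepts both forms, but a compiler to native code would still have to deal with the `setattr` loop itself.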

~~~
comesee
I've heard a rumor that the mypy team has an alpha python->c++ compiler
working.

------
RussBaz
As an author of enforce.py, I just want to say that working on and with type
hints exhausted me. The inner implementation is so full of corner cases. And
the indifferent at best attitude of python subreddit wasn't very helpful
either (imho). I still don't know if anyone besides project's awesome
contributors ever used it.

To be honest, the biggest issue of python type hinting (from my point of view)
is that it is an abstraction of a totally different typing system underneath.
Type hints do not represent actual types at all. They are just for convenience
sake.
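
A tiny illustration of that gap, using a made-up `double` function: the interpreter stores the annotations but does not enforce them, so a call with the "wrong" type runs without complaint.

```python
def double(x: int) -> int:
    return x * 2


# The hints are just metadata attached to the function object.
print(double.__annotations__)

# No runtime error: str * int is valid Python, so the hint is silently ignored.
print(double("ab"))
```

Only an external tool (a static checker, or a runtime enforcer along the lines of enforce.py) would flag the second call.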

What started as a proof-of-concept for me, soon became a huge monster with
special handlers for almost every single case. It is so messy now, that hardly
anyone besides me can do any maintenance. I have been considering a rewrite
(using ast modifications to avoid huge number of potential corner cases and
performance issues) for more than a year now, but such endeavour is too big
for me to undertake. It would be a fresh start after all.

End of rant :)

------
bitcraft
I don’t like the added burden of importing types in order to use them. So far,
I’ve relied on pycharm’s excellent static analysis and docstrings to add type
hints. One huge pain point about C++ is header files and maintaining two
source files for related code. Python stub files really reduce flexibility
when refactoring or building new code. If you are adding types and maintaining
two source files, the rapid development advantage of using python is nearly
lost, and the type info is hidden while reviewing code, unless you have both
files open. Not great for code review when the editor isn’t able to report
type conflicts.

------
soulnothing
I started using python type hints when you needed to apply patches. This was
years ago. At work we had a database that changed its schema randomly every
night. With type hinting in our build process we were able to determine
whether we could handle the new schema. I have also been developing python for
about eight years now.

I can't really stand the language anymore. It feels to me that unless you have
a large python code base, or a large python employee base. It's just not worth
the hassle. It was game changing when it first came out. But now, outside of
data science, machine learning, and devops I'm not seeing it.

With projects like mypy/type hinting, and pypi. You're one step away from
using a static language, that is either native or running on a VM (CLR, JVM,
BEAM, etc.)

Additionally almost all teams I've worked on, have encountered. Either
maintainability issues. Where a static type checker would aid. I.E. changing a
class or function definition, and not seeing it updated across the board. But
more common is performance. Specifically the more asynchronous/parallel work
stream. We have asyncio, and others. But it's more complex than some of the
static languages.

Since type hinting first came on the scene it has been a requisite in any
project I've worked on. I've led a number of teams, who want to move from
python. To the tune of we complete each sprint developing the project in
python and $LanguageOfChoice. Begging the business to move on, but no dice.
Yes I speak in business language why we should move on. But the business won't
let us.

~~~
oblio
Periods are not commas, your text is hard to follow, like a series of
telegrams.

------
sleavey
Type hinting in Python is so ugly that I stripped it all out after a few days
of using it on some personal projects. They more than double the length of
function signatures and create comment clutter (is this a real comment or some
MyPy flag?).

I wish there was a way to fully separate the hints into separate files. Such
files could act like header files in C. Or, some comment character other than
# to start type hint (how about "!"?), so IDEs can hide them if the user
wants.

~~~
pjtr
Isn't that exactly what is possible with "3. Interface stub files"?
[https://www.bernat.tech/the-state-of-type-hints-in-python/#3...](https://www.bernat.tech/the-state-of-type-hints-in-python/#3interfacestubfiles)

~~~
sleavey
These don't support full separation [1]. Some type hints must be in the main
source file.

[1] [https://stackoverflow.com/questions/47350570/is-it-possible-...](https://stackoverflow.com/questions/47350570/is-it-possible-to-separate-all-type-hinting-checking-infrastructure-in-python)

------
carapace
You can write valid useful Python code that flummoxes these type hints.

Things like e.g. PySonar exist.

Cython exists.

------
wozer
Many great arguments for static typing in this article.

------
rs86
Funny that python is discovering types. All of this is common sense to real
developers that can code in Haskell.

