The authors don't seem to ever discuss the fact that mypy version changes frequently make previously-passing code fail type checks.
I don't think I have ever upgraded mypy and not found new errors. Usually they're correct, sometimes incorrect, but it's a fact of being a mypy user. Between mypy itself and typeshed changes, most mypy upgrades are going to involve corresponding code changes. The larger your code base and the more complicated your types are, the worse it'll be, but it's basically an ever-present issue for any program interesting enough to really benefit from a type checker.
How many of those repositories were "type-correct" but only on particular versions of mypy? I bet it's a lot!
But the big culprit is typeshed. Something will get new/fixed annotations and suddenly you aren't handling all possible return types in your callers, or whatever.
I can't show any of my professional work (no publicly available src) but this side project of mine is locked to 0.790 until I can find time to sort the issues: https://github.com/calpaterson/quarchive/tree/master/src/ser...
It's hard to classify anything as a "false negative" with mypy since it is very liberal (often unexpectedly so, which I think is one of the sharp edges of gradual typing).
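For instance (a small sketch of my own, not from the article): once Any enters the picture, mypy will happily accept assignments that are wrong at runtime, so a missed error is arguably "working as designed" rather than a false negative.

```python
from typing import Any

def fetch(key: str) -> Any:
    # stand-in for a call into an untyped library
    return {"a": "hello"}[key]

n: int = fetch("a")  # mypy accepts this assignment: Any is compatible with int
print(type(n).__name__)  # str -- the annotation was wrong, silently
```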
I think mypy has some problems, but this isn't one of the bigger ones for me.
Though I will say, while most have been fixable in a few minutes, some have been a real chore to fix. Sometimes an innocuous looking error balloons into several hours of reconciling obscure type system behavior errors once you start fixing it. Regardless, it's a small price to pay for proper type checking in Python. I've more than made up the lost time in detecting bugs before they ship.
You can argue that when it removes features it provides a replacement (sometimes), but that does not change the fact that if you have any reasonably large project (>1 million LOC), every standard upgrade will break your app.
One of the main reasons we write all new code in Rust and are migrating the C++ code base step by step to Rust is because Rust offers infinitely better backward compatibility guarantees than C++.
Rust never ever breaks your code, and you can opt-in to newer Rust editions for new code only, and these are ABI compatible with Rust code written using older editions.
One thing I didn't see mentioned that is a particularly annoying false positive in mypy is its enforcement of Liskov substitution for equality checks. All types share the common base type object, and object defines the __eq__ magic method as accepting any other object. That means every custom Python class that wants to define an equality check must either declare its '==' operator as accepting anything as the second operand, or tell mypy to ignore the error. Ignoring it is the obviously saner choice, since you want a type error, not a return of False, when you check equality against a completely unrelated type. Unfortunately, mypy provides no way to express that statically.
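A minimal sketch of the situation (the class name and field are my own):

```python
class Point:
    def __init__(self, x: int) -> None:
        self.x = x

    # mypy demands this LSP-compatible signature: 'other' must be object,
    # not Point, because object.__eq__ accepts any object.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Point):
            return NotImplemented  # falls back to False, never a static error
        return self.x == other.x

print(Point(1) == "unrelated")  # False at runtime; mypy raises no objection
```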
I'm glad they mentioned the default parameter None to avoid the classic footgun of mutable default arguments, though. That has to be the single most annoying false positive: you declare a parameter as Optional and the type checker then complains that it was set to None in the parameter definition but later assigned an actual value. You have to do this because of the totally unintuitive fact that all invocations of a Python function share the same default value object, so mutating a default empty list inside one call means every later call sees the mutated list instead of an empty one.
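The footgun itself, for anyone who hasn't hit it (a standalone sketch):

```python
def append_to(item, target=[]):  # the default list is created once, at def time
    target.append(item)
    return target

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2] -- the same list object is reused across calls
```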
This is probably the single greatest rookie mistake in Python, because of how unexpected it is if you don't deeply understand the object model. Yet applying the solution the books literally recommend to get around it makes the official type checker, endorsed by Guido himself, report a type error.
from typing import Any, List, cast

def f(l: List[int] = cast(Any, None)) -> int:
    if l is None:
        l = []
    return len(l)
What? No I don't?
foo = Foo() if x else 1
foo == 1 # why would I want this to throw? I just want False!
def __eq__(self, other):
    if not isinstance(other, MyClass):
        return NotImplemented
    return self.value == other.value  # whatever your comparison is
This is neat because it allows you to define a class B that can compare equal to an existing class A with the expression A == B, even though class A has no knowledge of B.
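A sketch of what returning NotImplemented buys you (classes A and B are my own illustrative names):

```python
class A:
    def __eq__(self, other):
        if not isinstance(other, A):
            return NotImplemented  # let the other operand try
        return True

class B:
    def __eq__(self, other):
        if isinstance(other, (A, B)):
            return True
        return NotImplemented

# A knows nothing about B, yet A() == B() is True: A.__eq__ returns
# NotImplemented, so Python tries the reflected B.__eq__, which accepts A.
print(A() == B())  # True
```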
In your example, `foo` could be marked (in mypy) a `Union[Foo, int]`, which would be a reasonable value to compare against an `int` and not a type error.
For an example, consider `np.eye(4) == 1`. In this, you're comparing a doubly nested array of floats with a scalar int, but the operation is well defined (vectorized equality comparison).
And yes, it should be feasible to typecheck that operation.
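Concretely (requires numpy):

```python
import numpy as np

result = np.eye(4) == 1  # elementwise comparison, not a scalar bool
print(result.shape)      # (4, 4)
print(result.sum())      # 4 -- True only on the diagonal
```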
Well, actually as long as they implement the right dunder methods, they can totally be.
from typing import Protocol

class ProtocolA(Protocol):
    def method_a(self) -> None: ...

class ProtocolB(Protocol):
    def method_b(self) -> None: ...

class MyClass:
    def method_a(self) -> None: ...
    def method_b(self) -> None: ...

def compare(a: ProtocolA, b: ProtocolB) -> None:
    if a == b:
        print("a == b")
    else:
        print("a != b")

a = MyClass()
b = MyClass()
compare(a, b)
You're saying that an 8-bit number can never be equal to a 16-bit number, or a 32-bit number, or a...
I hope you agree that's ridiculous.
No comparison without explicit casts is a valid design choice. I think Ada and Rust work that way, though I've never used either language and might be mistaken...
Sure, but that wasn't the claim.
How you view the claim itself depends on your thoughts about the level at which equality should operate (should type tags be considered part of a value's identity, is it about the bit representation, or the abstract value encoded by that representation, ...)
There are two forms of equality in Java:
- 'x == y' is an identity check, like 'x is y' in Python (are x and y referencing the same memory?). These must be of the same type, or else it won't compile.
- 'x.equals(y)' is an overridable method, like 'x == y' in Python. Its argument is of type 'Object', so it will accept anything.
I don't do much Java these days, but I do write a lot of Scala (which runs on the JVM). One of the first things I do in a Scala project is load the 'cats' package so I can use 'x.eqv(y)' which refuses to compile if x and y are of incompatible types. This problem also propagates up through different APIs, e.g. 'x.contains(y)' checks whether y appears in the collection x (which could be a list, hashmap, etc.); this doesn't check the types, and I've ended up with bugs that would have been caught if it were stricter. I now prefer to define a safer alternative 'x.has(y)' which uses 'x.exists(z => z.eqv(y))'.
I think it does make sense if you consider the function signature an expression scoped one level above the function definition. I agree it's initially not intuitive, and it's tripped me up a few times before as a beginner, like probably most other beginners.
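That mental model (defaults evaluated once, in the enclosing scope, when def runs) can be shown directly; this toy example is my own:

```python
counter = 0

def current(value=counter):  # default captured now, not at call time
    return value

counter = 100
print(current())  # 0 -- the default was evaluated when def ran
```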
As a more experienced Python developer, I think I'd actually now find it unintuitive if they changed it, due to everything I've internalized about Python scope, values, assignment, expressions, etc. Same for always requiring "self" as the first method argument. Overall, I'd prefer the changed versions, but I think they (or someone else) would basically need to make a whole new spiritual successor language.
What does take a long time is learning about it all. There's the docs, then PEP 484, then at least eight other related PEPs of various impact. It would be a good topic for a small book / website / course, except things change kinda fast.
I do think it could be a hack for a new dev to establish credibility. Spend a few months type hinting code and fixing bugs in popular open source repos. You'll learn way more about real software engineering than following another tutorial. If someone wants to do that and wants some mentoring, hit me up at my last name at gmail.
That said using type annotations made me realize I want to be fluent in a language with a strong static type system (huge productiveness booster imo), so I'm currently studying Typescript and nim.
It is part of mypy, and it compiles type-valid python files to cpython binary extensions, and takes advantage of a lot of shortcuts available to cut execution time. It is still a bit early to advocate for its use everywhere, and it has its drawbacks (extensions compiled with mypyc will also typecheck the inputs and outputs of the module at runtime, which is great for improving code quality and validity of the annotations, but also means you may get TypeErrors in production if you hit a bad edge case).
If there is no shortcut it will fallback to normal, slow, python performance, with some smallish benefits (e.g. vtables for attribute lookups, no parsing time, etc).
How does it compare to Nuitka? https://nuitka.net
For comparison, Racket has gradual typing which gives a speed boost when everything is typed, but can end up much slower if anything remains untyped, e.g. see figure 3 in http://www.ccis.northeastern.edu/home/types/publications/gra...
Python would probably be even worse, since it's "more dynamic" than Racket; e.g. there are many hooks that can run arbitrary code (like dunder methods), and there's little encapsulation so values may get swapped out via monkey-patching at any time.
Not trying to start language wars here, of course, I'm just interested because I find Nim one of those languages that I'd like to try but don't have a project/use case to try on.
Also useful for autocomplete.
For me, most of the value of typing is at the function level: what does this function accept as input and what does it produce as output?
Also mypy gets in the way of using loop constructs, as the iterating variables have to be predeclared with typing before the loop, breaking encapsulation.
output = []
for x in range(1):
    output.append(x)
reveal_type(output)  # note: Revealed type is 'builtins.list[builtins.int*]'
for x in y:  # reusing x for a differently-typed iterable now errors
    ...
Does Pyright behave like MyPy or PyType? Or neither?
Also did Python really create a type annotation system without specifying the semantics of how it is supposed to work? That's crazy.
If the type checkers disagree on stuff defined in pep 484 that’s a bug to report. Beyond that they can vary but feel mostly similar.
Splitting out the mistake checking and the data specification aspects of types would probably make pieces more portable across different languages
- Static + Weak = C
- Static + Strong = Haskell
- Dynamic + Strong = Python
- Dynamic + Weak = JS
and, in particular, explicit vs. implicit type casts of those objects.
Python: strong, dynamic
PHP: weak, dynamic
C: weak, static
Haskell: strong, static
I expect that everybody doing this job has a CS or equivalent degree and knows the difference between strong / weak and static / dynamic typing. Reading other comments here demonstrates this is not the case. This is not a bad thing (luckily we only need a screen and a keyboard) but hopefully they'll have learned something new today.
Btw, there is a lot I don't know despite my degree. No hard feelings.
Practically, untyped languages would include asm and forth.
Unlike void pointers or Object references in Java you can't "forget" an object's type.
return "try again"
> "Dynamic typing" The belief that you can't explain to a computer why your code works, but you can keep track of it all in your head.
— Chris Martin (https://chris-martin.org/2015/dynamic-typing)
Here's my response to that quote:
"Static typing": The belief that if your code type-checks correctly, it will do exactly what you intended. (A variant of the belief that if your code compiles correctly, it will do exactly what you intended.)
Of course, taken literally, that's unfair, but so is the quote you gave. Everything you add to your code has a cost. Time you spend writing type specifications is time you're not spending doing something else that might add more value. Sometimes writing type specifications is the best use you can make of that time; sometimes it isn't. Python at least gives you both options: use type annotations if you want, but you don't have to if you don't want to.
No-one claims that typechecking solves 100% of problems. But it has a better cost-benefit ratio than anything else that's been tried.
> Time you spend writing type specifications is time you're not spending doing something else that might add more value.
So is time you spend thinking about the behaviour of the code in your head - the main difference is that it's slower and more error-prone.
Occasionally you really can do the calculation about what kinds of expressions are valid in which places better than the computer. But that's a rare case that gets rarer every day.
> Python at least gives you both options: use type annotations if you want, but you don't have to if you don't want to.
Not really: your code will almost always be silently unsound. E.g. libraries you're using will usually have incorrect type annotations (because their type annotations aren't checked, and the checker is unsound even if they were).
In contrast if you use say Haskell you genuinely do have both options: you can write code and have it be safely typed, or you can call unsafeCoerce at any point where you don't want typechecking to happen.
I would want to see a lot of data, for a lot of different kinds of programs, to back up this claim.
> So is time you spend thinking about the behaviour of the code in your head
You have to do this anyway; static typing doesn't write your code for you. Even Haskell, which I will freely admit is probably the closest thing to an AI that I've ever seen in a programming language, can't do that. :-)
> your code will almost always be silently unsound
Code that has type annotations has this same problem. That was the point of my response: static typing != sound code.
> libraries you're using will usually have incorrect type annotations
They can't be incorrect if they aren't there. The comparison I'm making is not between code with correct type annotations and code with incorrect ones. It's between code with type annotations and code without any of them at all.
The amount is what matters. If spending x minutes writing down types saves you y minutes of thinking, and y>x, that's a win.
> Code that has type annotations has this same problem. That was the point of my response: static typing != sound code.
Well-typed code has certain soundness properties - they may not fully encode all the properties you want your program to have (that part is up to you), but the properties that you have encoded will be reliable. Optional typing undermines that: even if the types say one thing, you have no guarantee that that thing is true.
> They can't be incorrect if they aren't there. The comparison I'm making is not between code with correct type annotations and code with incorrect ones. It's between code with type annotations and code without any of them at all.
My point is that if the ecosystem is not well-typed then you don't actually have the choice of using types. What I'm arguing against is your claim that Python gives you the choice: it doesn't, because to be able to write well-typed code you need well-typed libraries.
> You have to do this anyway; static typing doesn't write your code for you. Even Haskell, which I will freely admit is probably the closest thing to an AI that I've ever seen in a programming language, can't do that. :-)
I think what he meant to say is that you spend 95% of your time thinking about: "what kind of thing do I get from this method call, how can and should I use it, and what should I return in the end?" Writing down the result of your thoughts takes almost no time in comparison. Not only that, in good statically typed languages you don't have to write down the types explicitly most of the time (but you can still have your IDE show them to you if you want to see them).
And in addition to that, you certainly have to think _less_ in many cases. A good example is functions that return lists. Often you know the list will always have at least one element. With a good type system, that guarantee is indicated in the signature, so when I get the list I know I don't have to handle the empty case.
In python I would often have to think _more_ about it and look into the documentation or maybe even the implementation to understand if it could return an empty list or not.
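Python's type system has no built-in non-empty list, but the idea can be sketched; NonEmpty here is my own illustrative class, not a standard type:

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")

class NonEmpty(Generic[T]):
    """A list guaranteed to hold at least one element, by construction."""
    def __init__(self, head: T, tail: List[T]) -> None:
        self.head = head
        self.tail = tail

    def first(self) -> T:
        return self.head  # no empty-case handling needed

def recent_events() -> "NonEmpty[str]":
    return NonEmpty("startup", [])

print(recent_events().first())  # startup
```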
The issue here isn't static vs. dynamic typing, it's API documentation. In cases where static type declarations work out to be sufficient as API documentation, sure, use them as API documentation. But the "time to think" issue isn't being solved by static typing; it's being solved by having good documentation for the APIs you are using (or writing good documentation for the APIs of the libraries you are writing).
> A good example are functions that return lists. Often you always have an element in the list. Using a good type-system, it will be indicated, so when I get the list, I know that I don't have to handle the case that the list is empty.
Same comment here: whether or not the function can return an empty list is part of the API. The API needs to be documented somehow. A static type declaration might be enough, or it might not. Either way, what's solving the issue isn't static typing, it's API documentation.
To which I say: writing out a type (a few characters) is negligible compared to the time you spend thinking about what the type is (which you have to do in Python as well, maybe even more).
> Same comment here: whether or not the function can return an empty list is part of the API. The API needs to be documented somehow.
If you have proper API documentation in both cases, then you have to think less, that's true. In the real world however, I can testify that I have to think longer in python because 1) APIs are not always well documented and 2) even if they are, a well documented API that uses static types is still easier to use due to the automatic compiler/IDE support.
Writing out a type is only a few characters if it's a type that's already built into your type system. (In which case, as others have already pointed out, you probably won't have to write it anyway in a statically typed language because the type system will automatically infer it. But in such cases, you're not gaining any documentation benefit from it.)
If it's a type you're having to invent as part of writing the code, writing it out everywhere it gets used can be a lot more work.
> I have to think longer in python because 1) APIs are not always well documented
Meaning, not well documented compared to APIs in other languages? That hasn't been my experience; my experience has been that API documentation pretty much sucks everywhere.
> a well documented API that uses static types is still easier to use due to the automatic compiler/IDE support
If you use an IDE, perhaps. (I don't; I find that they cost me more than they save me.)
Right, but if it is not, then you also have write extra code in python (e.g. create a new class).
Of course alternatively you can also just use "String" for everything (you can do that with a static type system too), but I hope you agree with me that this isn't a great idea in any language.
I also have the feeling that you might change your mind if you try out a modern fully fledged IDE with a good statically typed language. The reason why I think this is because of things that you said such as "But in such cases, you're not gaining any documentation benefit from it." which are correct if you are not using an IDE. But if you do, it's actually wrong. IDEs like IntelliJ can be configured to show all (or only certain) types that are not written in text but that the compiler infers - without pressing any key. I like to use this feature a lot when diving into an unknown code base.
> Meaning, not well documented compared to APIs in other languages?
No, just not being well documented. Many popular libraries in python are well documented, but not all are. And the less popular/public ones are often poorly documented. The same is true for other languages as well, but in those you at least often have a "basic" documentation through the types.
In any language, not just Python. A type that isn't already built-in has to be defined in your code no matter what language you are using.
> I hope you agree with me that this isn't a great idea in any language.
Of course it isn't. Different types exist for good reasons.
> I also have the feeling that you might change your mind if you try out a modern fully fledged IDE with a good statically typed language.
I doubt you would have this feeling if you knew how many times I have tried "a modern fully fledged IDE with a good statically typed language". And every time has ended up the same.
> IDEs like IntelliJ can be configured to show all (or only certain) types that are not written in text but that the compiler infers
This is a fair point, but in a language like Python this could be provided as a library function if it were needed. (Python already has the "help" built-in function that shows you the documentation for whatever object you pass it as an argument, at the Python interactive prompt. Inferred types could be handled the same way if Python had them.)
> at least you often have a "basic" documentation through the types
Which might be significant useful information. Or it might not. It depends on what kind of code you are dealing with. It's quite possible that the particular kind of code I have dealt with has simply not been the kind where static typing is much of a help, and that there are other kinds of code where it is. But the original claim that I responded to was "Everyone needs types" (by which was meant "everyone needs static typing"). It is that blanket, general claim that static typing is always better that I was disputing, not the claim that static typing can be helpful in some cases.
Yeah exactly! So it's the same for every language. I just don't understand why you then say that I would have to type more in a statically typed language.
> I doubt you would have this feeling if you knew how many times I have tried "a modern fully fledged IDE with a good statically typed language". And every time has ended up the same.
Everyone is different and that's one reason why people choose different tools and languages. Nothing wrong with this - one of the best developers that I know uses vim for everything. I on the other hand would not be productive without a good IDE.
> But the original claim that I responded to was "Everyone needs types" (by which was meant "everyone needs static typing"). It is that blanket, general claim that static typing is always better that I was disputing
Fair enough, I agree with you on that one. The thing I disagree with is that static typing always requires more effort when writing code. For Java this is totally true, but for many other languages, my experience is that I neither have to type more nor think more.
Not really; you write it out fully once, when creating a type alias/definition, and then just use the name later.
Yeah, I’ve found Python to have a pretty good documentation culture (docstrings especially facilitate this), often better than some statically typed languages with ecosystems where signatures are regularly mistaken for adequate documentation.
In many cases in Python, you don't have to care. For example, if you're going to iterate over the list, Python will iterate over an empty list just fine: it will execute zero iterations. No need to check anything.
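A trivial example of that pattern:

```python
def total(xs):
    result = 0
    for x in xs:  # an empty list simply yields zero iterations
        result += x
    return result

print(total([]))      # 0 -- no emptiness check required
print(total([1, 2]))  # 3
```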
I'm very enthusiastic about static types with global type inference, where I don't have to write a single type annotation if I don't want to. My enthusiasm degrades pretty quickly the more annotations I have to manually clutter my code.
I see way too many code bases that test things that are easy and xfail things that are hard because it turns out that you can’t just test your way to a complex product, but you sure can make yourself feel good by testing the fact that a dict acts like a dict.
I would certainly argue that anyone who writes a library, whose public API will be consumed by hundreds or thousands of people, should always include type annotations.
Python's type annotations leave something to be desired, in terms of how easy they are to include in code and how well the tooling supports them. I'd almost rather see something like "Typethon" with better syntax+soundness that compiles-to-Python.
Types are an optional tool, sure, but they're one of the most valuable tools we have to catch (not eliminate entirely) bugs and hasten development (without introducing errors).
With a little care and a sufficiently strong type system it’s surprising how often this is actually true.
I don't know much at all about the structure of your application, and it's hard to speculate, but what strikes me is your live and snapshotted data have the same shape/type and seem to be interchangeable for the compiler/interpreter, but are very different to you and your customers. The engineer creating that screen may have had the forethought at the outset to say "this should never deal with live data, only snapshots", so that warning probably gets put in a comment, but there's very little to stop someone from later passing in the live data that you definitely don't want.
There's a feature in Haskell (and similar languages e.g. PureScript) called newtype declarations. They're a lightweight way you can tag and optionally limit access to your data. You essentially tell the compiler "this is a new and distinct type that can wrap the original one". The original and newtype are not interchangeable, and an attempt to do so won't compile.
Having that feature I'm in the habit whenever I recognize overloaded meanings encoded with the same type (e.g. live/snapshot) of wrapping one of them in a newtype, and before I even implement application logic, I write out the signatures and specify which type I actually want. They add a bit of tediousness, but they have saved me many times, and I've never regretted using them.
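Python/mypy has a lightweight analogue in typing.NewType. A sketch under assumed names (LiveData and SnapshotData are my own, mirroring the live/snapshot example above); note that unlike Haskell's newtype, Python's NewType is erased at runtime, so only the type checker enforces the distinction:

```python
from typing import NewType

LiveData = NewType("LiveData", dict)
SnapshotData = NewType("SnapshotData", dict)

def render_snapshot_screen(data: SnapshotData) -> None:
    # implementation elided; the signature is the point
    pass

live = LiveData({"rows": [1, 2, 3]})
snap = SnapshotData({"rows": [1, 2, 3]})

render_snapshot_screen(snap)   # OK
# render_snapshot_screen(live) # mypy: incompatible type "LiveData"
```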
That is an administrator console. Sometimes we do have the foresight to understand what our users actually want, but in a large system things like "it was obvious to us" or "after months of use we realized it's better to do it this way" inevitably happen.
I'll give them a read.
> Dynamically typed languages are actually statically typed with precisely one type: hash table.
Because this definition makes C++ and Java look dynamic, given their vtables.