I don’t think that’s strictly true. Python developers are finding there are advantages to having a type system in larger programs such as readability and a reduction in certain classes of bugs.
That said, I’ve found Python’s approach still lets me build things through experimentation and then fix the types once I’m a bit more confident that I’ve built the right thing.
As a comparison, I found Go really clunky when I tried to learn it because it wouldn’t compile if the code wasn’t totally correct. It makes it hard to find the right solution through experimentation.
Python’s problem is that there are multiple third-party type checkers and they all disagree in subtle ways about what the spec says. It’s a mess.
Python is probably an all-round better language than JavaScript. But TypeScript absolutely embarrasses Python’s “gradual” type system with just how good it is in comparison.
I would love to instead have one official upstream runtime type checker which can be enabled per function, class, or module via decorators. Basically, providing opt-in strong typing as a language feature.
Currently, I’m using the third-party Beartype decorators to provide this feature, but I would prefer something built-in and ideally something with more readable error messages for users when type errors occur, which might be easier to provide if it was built into Python.
(I use Pyright too, but I find the runtime type checking to be more robust since it also understands the types returned by highly dynamic untyped code.)
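To make the opt-in idea concrete, here is a minimal sketch of a per-function runtime check. It is a hand-rolled stand-in written only with the standard library, not Beartype's implementation or API. The decorator name `checked` and the function `repeat` are made up for illustration, and the sketch only checks annotations that are plain classes (no generics, no `*args`/`**kwargs`).

```python
import functools
import inspect
from typing import get_origin, get_type_hints


def checked(func):
    """Hypothetical opt-in decorator: check plain-class annotations on each call."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    def verify(label, value):
        expected = hints.get(label)
        # Only plain classes are checked; generics such as list[int] are skipped.
        if expected is None or get_origin(expected) is not None:
            return
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{func.__name__}: {label} should be {expected.__name__}, "
                f"got {type(value).__name__}"
            )

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            verify(name, value)
        result = func(*args, **kwargs)
        verify("return", result)
        return result

    return wrapper


@checked
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat("ab", 2))  # abab
```

Calling `repeat("ab", "2")` raises a `TypeError` that names the offending parameter, which is roughly the kind of readable message being asked for.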
I don’t see that as a problem since you’re not going to run multiple type checkers over your code, much like most C++ codebases pick a compiler and stick to it.
1. You need to understand how to express concepts in a form the type system can understand. Each type system will have patterns it understands, and patterns it does not.
2. Go isn't a great example. It has a Java-circa-2000 type system. Modern type systems are much more expressive.
I picked Go as an example of a language that gets a lot of praise specifically for having strict compiler rules.
I've also tried Scala and found it clunky but I do suspect most of its problems came from the insane operator overloading and opaque build system rather than the type system. I understand it's gotten better since I looked 8 years ago but I've not had a reason to check it out again.
Those people haven't used Java 11+ or Python with types.
Python with types is a double-edged sword. If you can ensure that collaborators and the SDKs you depend on use strict typing, it's a very pleasant experience to write, especially coupled with Pydantic for defining POJO-like structs. For CLI tools, I use Click plus the Poetry build system.
The Java 17+ type system is pretty good, and way better than Go in terms of dev experience. You may need a POJO generation tool (Lombok, AutoValue) and preferably validation libraries. I'd still write Java over Go if efficiency and a single binary are not concerns.
Yeah, the early days of Scala were a bad time for people going crazy with symbolic method names. Thankfully, most of that is in the past. (And Scala 3 is a much cleaner language in every way.)
Sorry this is probably a stupid question - how do you make your CI hard fail if you don't have types? This sounds like the missing piece for me, someone who also prefers just to crack on without types and then add them later.
CI generally has a pipeline, no? Or even at worst a shell script that you built? Just add the equivalent of “exit(1)” when mypy (or whatever typing linter you’re using) fails, and then the dev gets notified he broke the build for everyone. That’ll get them fixing and checking their code before it goes into the main branch. Peer pressure is underestimated these days I think.
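As a rough sketch of that gate: mypy already exits with a non-zero status when it reports errors, so most CI systems fail the step as soon as the command is part of the pipeline. If you want the gate in a script anyway, something like the following would do. The helper name `run_gate` and the `src` directory are assumptions for illustration.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(command: Sequence[str]) -> int:
    """Run one checker command and return its exit status (0 means it passed)."""
    completed = subprocess.run(list(command))
    return completed.returncode


# A CI step could end with something like the line below. It assumes mypy is
# installed and that the code lives in a directory named src/:
#     sys.exit(run_gate([sys.executable, "-m", "mypy", "--strict", "src"]))
```

Because the exit status is passed straight through, the pipeline sees exactly what the checker reported.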
Personally I also find python typing useful in "smaller" programs and I use them pretty much everywhere.
For example, I use it in all my automation scripts, which are usually not much longer than a few pages or a few modules. Adding type hints is a very minor task; it really comes straight out of my brain just like the code I type.
Type hints have given me wonderful completion and documentation which has helped me to write more correct and bug free code. Often little things pop up that would only have been caught at runtime. Now I see those things as soon as I type them.
If it works for you, great. I’ll also do that for the same reasons. What I really love, though, is I don’t need to be totally correct in my types all the time. If the code will run, it’ll run and, in plenty of simple cases, that’s good enough (even if mypy tells me I passed the wrong type).
> It makes it hard to find the right solution through experimentation.
To me this suggests the tooling was insufficient. One of the things I value about type systems is being in a tight feedback loop with the compiler. Strong typing helps me run more experiments and to run them from within my editor. But without a good LSP and such, it would be miserable.
> To me this suggests the tooling was insufficient.
Perhaps but I find that type systems usually introduce friction to getting things working that, for some types of projects, can be worthwhile and, for others, not worth the overhead.
I think what you perceive as "friction" I perceive as "feedback," and it helps me work faster while writing code that's as good or better. When I interact with unfamiliar untyped code, it takes me longer to find my footing. So I don't find that there's overhead.
I'm not dogmatic, type systems aren't the end all. But I think a lot of the trouble people have with them comes from using the approaches they've developed in untyped systems and saying, "this is the same, I just get more warnings and errors." But the opportunity is in developing new approaches and habits that leverage the type system.
For me the "aha" moment was watching Jon Gjengset's YouTube videos, and seeing him intentionally write code with type errors in order to figure out what the type of something was. This helped me make the switch from thinking of errors as drudgery to be avoided to errors as feedback I may want to elicit.
Now I'm able to answer lots of questions within my editor that I'd previously need to use a REPL or consult the docs to answer. It's like I'm able to query my codebase, but the query language is just the language I'm already working in. The magic of strong typing is good tooling.
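Translated to Python, the same trick might look like the sketch below (the variable names are made up). A deliberately wrong annotation makes a checker such as mypy or Pyright report an error that names the type it inferred. Both tools also support `reveal_type(expr)` for the same purpose.

```python
inventory = {"pear": 3, "apple": 5}

# Deliberately wrong annotation: a static checker rejects this assignment and
# its error message names the type it inferred for the right-hand side
# (here, a list of (str, int) tuples).
probe: None = sorted(inventory.items())

# Annotations are not enforced at runtime, so the script still runs.
print(probe)  # [('apple', 5), ('pear', 3)]
```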
If it’s not clear: I like working with typed Python code. It absolutely solves a lot of issues that would require a stupid amount of testing and manual checking otherwise (dict-passing is a personal “favourite” I’ve seen).
We shouldn’t pretend that it’s free. The fact is, sometimes running some code that I know works despite what the type system is telling me is very useful. Comparing to, say, Go, where I need to deal with everything (even if it’s incidental to the actual problem) adds friction. Sometimes I don’t want to deal with edge cases because this code will never leave my machine!
The cost of strong typing is increased compilation time, increased memory use during compilation, and increased complexity in learning the language. Time spent resolving type issues during development is time spent testing and debugging shifted left, and generally faster than chasing down bugs.
I can recall times when I've neglected a special case to write a one-time script faster, but they were always external to the type system. Eg, "this regex is pretty sketchy, but I know it will work with this particular set of logs." So I'm confused on this point, am I missing something?
All programs need to be type checked, even one-off scripts written in dynamic languages, to ensure that they function. The difference is whether the process is manual or automated. The computer is going to do it better and faster than I can, and that will free up mental bandwidth that I can apply to the problem at hand.
Example: Optional types in a type system require you to handle the null case. Plenty of times you know null isn’t possible (just the interface requires it), so you skip the handling and assume it’s never null.
In a statically typed language you’ll get an error in the compiler. In Python it’ll run. Reading a file can be similar. These small frictions add up when I’m just trying to make a simple script work by experimentation.
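A small sketch of the Optional case, with a made-up helper `first_word`. The first call is the "I know it's not null" shortcut: Python runs it, while a static checker would flag it. The second is the version a checker accepts.

```python
from typing import Optional


def first_word(text: str) -> Optional[str]:
    """Return the first whitespace-separated word, or None for blank input."""
    words = text.split()
    return words[0] if words else None


# Runs fine because the input is known to be non-blank, but a static checker
# objects: the result may be None, and None has no .upper().
loud = first_word("hello world").upper()

# The version a checker accepts narrows the type first.
word = first_word("hello world")
assert word is not None
loud_checked = word.upper()

print(loud, loud_checked)  # HELLO HELLO
```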
At the risk of repeating myself, that sounds like insufficient tooling to me. Are you only seeing these issues when you explicitly compile? And then you have to fix a bunch of them? Because I see them in my editor in real time, and often clean them up with code completion. When I'm writing a one-file script, it's probably going to compile and run the first time.
My problem is I don’t want to waste my time fixing them at all. I just want to run/debug the code and move on with my life to the next thing.
At the risk of repeating myself: I think typing is a powerful tool but to pretend it has no trade-offs is 100% disingenuous or naive. If it didn’t, dynamic or untyped languages simply would never exist!
That's understandable. Personally after embracing strong typing I find it hard to go back, and I feel like my time and energy is wasted without good feedback from the LSP and reliably understanding what types I'm dealing with or what errors may occur. Dynamic typing isn't free either.
As far as the origins of dynamic typing go, the amount of compute we can dedicate to tooling today, and the development of language servers in particular, radically changes the tradeoffs we can make. Strongly typed languages are better positioned to leverage tooling, and in my view this has completely upended the landscape of language ergonomics. Strong typing might not be free, but it does buy you a lot.
Does that view really make me naive or disingenuous? Perhaps we've just had different experiences. To be frank, when you say something that I solve in a blink of the eye is a serious friction, I have a similar reaction.
> To be frank, when you say something that I solve in a blink of the eye is a serious friction, I have a similar reaction.
So you're telling me that the following:
data = requests.get(some_link).json()['the_field_i_care_about']
do_something_with(data)
is just as much effort as defining the types and dealing with the error case (depending on the language). You're either vastly more intelligent—and a faster typist—than pretty much every human being I've ever interacted with or, more likely, you're lying to me and/or yourself.
This is both rude and ignorant. There is great value to exploration and experimentation, particularly for beginner programmers. Type systems do get in the way, and I say this as someone with 20+ years experience of modern (Scala, Haskell) type systems.
> There is great value to exploration and experimentation
I agree. Typing in some code and trying to compile it is experimentation. Trying to fix the error in your code is exploration. Discarding the compiler's output as getting in the way means ignoring very, very precious advice.
Expecting an incorrect program to compile and run is a reckless spoiled brat attitude, not "exploration and experimentation".
On the other hand, heavyweight type systems that demand preemptively writing code to cover cases that won't happen and/or aren't important, including cases that make sense but aren't needed yet, can take time away from actual exploration and experimentation, compared to writing only useful code and being sometimes surprised by runtime errors when the program accidentally attempts something not implemented yet.
On the third hand "exploration and experimentation" can suffice for a proof of concept, not for production code: tests and automated checks are necessary to be confident about your program, and even rudimentary tools like Python type annotations and type checkers can be useful.
No. It's that when people who prefer type systems go to use a popular dynamic language, they try to drag in features that they like from other languages.
Very honestly, optional type systems tend to be the worst of all worlds. Because people who don't care, don't need to use it, you don't get safety. But people who enjoy ceremony can inflict verbosity on others. While missing the most important reason to do it.
> No. It's that when people who prefer type systems go to use a popular dynamic language, they try to drag in features that they like from other languages.
Very much. The “problem” people try to solve with things like dependency injection isn’t even a problem in Python. 99.99% of the time you can just import your dependencies everywhere they are needed.
So many times now I’ve had to unteach bulky OOP patterns that people coming to Python from strictly typed languages believe are “good practices” or “needed for reliable code”, and you go from 20 different interdependent classes down to 3 functions that just directly do what they are supposed to do.
And you always end up with someone being disappointed that all of these complex patterns they learned to solve problems that don’t exist in python aren’t needed, rather than someone just happy that you can proceed directly to a value adding solution and skip the crud and design pattern spam.
And then they start arguing crazy stuff like the 200 extra lines of unneeded code make the solution “more readable” or “more reliable”, when in reality it’s just a desire to use the solutions they are used to even when the problems don’t exist.
> The “problem” people try to solve with things like dependency injection isn’t even a problem in python. 99.99% of the time you can just import your dependencies everywhere they are needed.
That's... Just not the problem DI fixes in any language? Sure, in C++ or Java you can just create a global static instance and use that everywhere. That's pretty much equivalent to just importing the same thing everywhere. But the downside of both approaches is testability: it becomes much harder to test units in isolation when they access global resources.
Consider a function with `import datetime`, that gets the current time and does something with it. You want to make sure that this function still does the right thing in the extreme case of the current time being 23:59:59. How do you write this test without DI?
Admittedly, this approach adds _some_ noise to the function signature. On the other hand, it feels more honest. It makes it unmistakably clear that this API relies on, and uses, some external state.
Also, this approach scales poorly with the number of dependencies, and that’s a good thing. If `f` were to also depend on a `db_connection` in addition to `datetime.now`, then this pattern automatically alerts the API designer that it might be time to redesign `f`.
Exactly! And now you've invented Dependency Injection. That's all DI means, passing dependencies to where they're needed instead of pulling them in via global state. You're doing it by hand (and in an impure way) instead of relying on an IoC container for convenience, but it's still the same thing.
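The pattern under discussion, passing the clock in as a parameter, could look like the sketch below. The body of `f` and the parameter name `now` are assumptions for illustration; only `f` and `datetime.now` come from the thread.

```python
from datetime import datetime
from typing import Callable


def f(now: Callable[[], datetime] = datetime.now) -> int:
    """Whole seconds left until midnight, according to the injected clock."""
    current = now()
    elapsed = current.hour * 3600 + current.minute * 60 + current.second
    return 24 * 3600 - elapsed


def one_second_to_midnight() -> datetime:
    """Fake clock for the 23:59:59 edge case."""
    return datetime(2024, 1, 1, 23, 59, 59)


print(f(now=one_second_to_midnight))  # 1
```

Production callers just call `f()` and get the real clock; the test passes a fake one without patching any global state.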
I find that good types help me model the problem and solution with clarity, and the procedures and functions take a back seat. After all, most procedures are just transformations of structured data, so putting the datastructure first makes good sense and keeps the code clean and lean. I used to believe in the “moving fast with no types” thing but in practice I find myself modeling the problem either way, it’s just clumsier without strong types.
Common Lisp was the first language I'm aware of with optional typing. In theory, what you're saying is true. In practice there's a lot of pressure to ship the working prototype, and retrofitting types on it later doesn't actually happen. The result was a reputation for being slow. (This was back when computers were slow enough that the overhead of being dynamic was pretty painful.)
Python type annotations are a good source of lightweight, easy to maintain metadata for many frameworks that provide valuable automatic handling of specific data types (for example defining serialization and deserialization of nice class types with field declarations, like attrs/cattrs or Pydantic, or defining command line interfaces with function parameter declarations, like Appeal).
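To show the general idea of annotations as metadata, here is a standard-library sketch that builds an object from string values by reading the field annotations. It is not how attrs/cattrs, Pydantic, or Appeal are implemented; `Server` and `from_strings` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Server:
    host: str
    port: int
    debug: bool


def from_strings(cls, raw):
    """Build cls from string values, converting each field per its annotation."""
    hints = get_type_hints(cls)
    # bool("false") would be True, so booleans get an explicit converter.
    converters = {bool: lambda text: text.lower() in {"1", "true", "yes"}}
    kwargs = {}
    for field in fields(cls):
        target = hints[field.name]
        convert = converters.get(target, target)
        kwargs[field.name] = convert(raw[field.name])
    return cls(**kwargs)


server = from_strings(Server, {"host": "localhost", "port": "8080", "debug": "true"})
print(server)  # Server(host='localhost', port=8080, debug=True)
```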
Expecting type annotations to be more generally useful is mostly projected, baseless declaration anxiety: tools and methods that are useful in other languages are expected to be relevant in Python, certainly comforting for the type-addicted programmer but not necessarily useful. Declaring types is perceived as normal, as a necessary burden and dynamic typing is perceived as missing important structural elements.
Consider the error pattern of accessing a value as if it had a different type. In C or C++ the consequences are dire (possibly undetected and cascading memory corruption), the likelihood is very high due to specific language features (pointers and references, raw arrays, casts, weak typing in general), and detailed type declarations are a useful mitigation because they turn run-time catastrophes into actionable compile-time reports. In Python, thanks to robust duck typing without dangerous complications, the consequences are mild (reasonable error messages or wrong results rather than memory corruption), the likelihood is low (specific instances of unexpected and malformed data or gross API misunderstandings), and type declarations do nothing.
I’m not into the whole optional typing thing either. I would consider that far down the “worse typesystem” end of the spectrum I mentioned, only slightly better than “types are just arbitrary labels the language doesn’t even know about”, which is maybe even worse than lean dynamic typesystems, because you might make the mistake of believing in the shadows on the wall.
“types are just arbitrary labels the language doesn’t even know about” is ok if you're aware of the limitations. I add typedefs and other JSDoc comments to JavaScript just because it makes it easier to work with in an IDE. It's nice to hover over a variable and see what it's "supposed" to be (if all went well this object should have a .id, might have a .name, etc), sometimes even declared as two types like "this variable is either an int or a boolean". I don't really want to bother learning TypeScript or trying to get coworkers to do so/put it in the pipeline, so it's the best I can do, and it certainly makes life a bit easier.
I've been writing python for a living for almost 20 years. Big projects. Never felt the need for static typing. And I strongly dislike the verbosity and complexity it adds to the language.
Python is beautiful because it's simple, easy to read, easy to understand. The typing tacked on to it looks like an ugly word salad to me.
Unfortunately, when many people hear of static typing they think of C++ and Java, which understandably leads to these kinds of conclusions. If they ever bothered writing typed Python for a little bit, especially in new packages or projects where the surrounding code is also typed, they might realise how useful it is.
There's a world of difference between the experience you get with untyped Python and typed Python that you check with strict mypy or Pyright. After using typed Python for a while, I don't think I could go back to a completely untyped codebase.
> Have you ever tried static typing though? I suspect you don't know what you're missing. Static types make Python way easier to understand.
I'd probably call myself within the "dynamic typing crowd" and I've tried plenty of static typing. It mostly just slows down iterating on something, prevents issues that I/my projects don't really suffer from in the first place, and gets in the way more than it helps.
The statically typed languages I've tried are: C#, Crystal, Elm, Go, Haskell, Haxe, Java, Kotlin, Nim, Rust, TypeScript and probably more I'm forgetting about. Out of those, I've probably written most Rust code. I wouldn't say I despise static typing, but I'm not getting the same value from it that others seem to get.
I still come back to Clojure, ClojureScript or just straight up vanilla JavaScript, as they're much more effective at actually helping me solve the problem I have in my practical day-to-day.
> I'm not getting the same value from it that others seem to get.
Another reason I've seen people not realise how good static types are is if they aren't using a proper IDE with code intelligence.
The benefits of static typing are:
1. Fewer bugs.
2. Makes code easier to understand, because you know what types things are. That gives you a lot of information (even "business domain" things) that usually aren't documented in dynamically typed code.
3. Navigating code in an IDE that understands the language is a lot faster. E.g. you can just ctrl-click something to go to its definition, auto-complete works reliably, you can find all the uses of an item reliably, etc.
4. Refactoring code becomes tractable and easy. You can rename an item and it will automatically update all the usages. Anything you miss will get caught by the type checker.
If you don't use a proper IDE you're missing out on half of that.
Even so, I don't really see how you can say it gets in the way more than it helps. Unless you're working on really small & one-off projects, the amount of time you'll save by not having to deal with type errors or spend ages deciphering code just to figure out what type something has easily offsets any time adding the types.
> If you don't use a proper IDE you're missing out on half of that.
Ok, does Visual Studio Code and/or Visual Studio and/or the various JetBrains IDEs count as "proper IDE"? If so, those are the editors I tried, and while they're nice and all, none of those things you listed got better compared to my experience with dynamically typed languages. In fact, I'd argue that some of those things get worse when using statically typed languages, especially #2 and #4.
> Even so, I don't really see how you can say it gets in the way more than it helps. Unless you're working on really small & one-off projects, the amount of time you'll save by not having to deal with type errors or spend ages deciphering code just to figure out what type something has easily offsets any time adding the types.
I don't have to spend any time figuring out what type something has because that's not a typical problem I have when reading and writing code in for example Clojure. And if I do wonder about the shape of the data or whatever, I evaluate that snippet of code in my editor and it shows me what data is inside of whatever I had selected.
It's OK that we have different ways of working and our brains work differently. Static typing is not objectively better, some things just work better in one way for some people. I really love the feedback cycle of "Read code, evaluate it, change it, evaluate it, write a test, evaluate it, save file" for producing/modifying code, and others want a cycle of "Read a lot, type a bit, run type checker, run unit tests" or whatever, and that's perfectly fine.
> I don't have to spend any time figuring out what type something has because that's not a typical problem I have when reading and writing code in for example Clojure. And if I do wonder about the shape of the data or whatever, I evaluate that snippet of code in my editor and it shows me what data is inside of whatever I had selected.
Of course you have to know the type of an object to manipulate it. How can you not?
Running code to see the type is pretty much the only option for dynamically typed code but it's clearly vastly inferior:
1. It takes way longer.
2. It only shows you one possibility for the type. Falls apart as soon as you have a union or optional.
> In fact, I'd argue that some of those things get worse when using statically typed languages, especially #2 and #4.
How? Static types add extra information that makes code easier to understand. In dynamically typed code it's the stuff that ends up in comments anyway, except now it's complete and correct.
Similarly they make automatic variable renaming actually work (IDEs don't have enough information to do it properly otherwise), and they detect errors when you screw up a refactor.
What's your argument for them making things worse?
> Of course you have to know the type of an object to manipulate it. How can you not?
Not every language requires you to know the strict type of the data to manipulate it. Take `conj` from Clojure as an example. The function adds an entry to a `collection`, so you know it's a collection, but that's all you need to know in most cases. A collection can be a map, vector, list or your own thing, but it "adds a new entry" regardless; the exact behavior depends on which collection you use it with.
Definitely not for everyone, but personally I like the abstracted thinking this leads to.
Besides, what I meant with my comment wasn't "I don't need to know anything" but that confusing what types I'm dealing with isn't a typical problem I have when reading/writing code. I thought I made it clear, but I suppose I could have made it clearer. That's on me.
> Running code to see the type is pretty much the only option for dynamically typed code but it's clearly vastly inferior:
Disagree, it's vastly inferior to rely on types to understand what the data actually is, and evaluating the selected form takes one keyboard shortcut and your editor shows you the actual data + type. Only being able to see the type is clearly vastly inferior to me. I like to be able to see exactly what's going on, not just being able to see kind of what's going on.
You never wanted to be able to select a function call inside of another function and be able to see exactly what that returns, without having to do anything else than selecting that piece of code and hitting a keyboard shortcut in your editor? I wouldn't want to trade being able to do that with having static types any day.
> Similarly they make automatic variable renaming actually work (IDEs don't have enough information to do it properly otherwise), and they detect errors when you screw up a refactor.
I don't know where you get the idea that automatic variable renaming doesn't work outside of statically typed languages. Just because a language isn't statically typed, doesn't mean you can't statically analyze it, see https://clojure-lsp.io/features/ for one example.
Worth repeating: It's great that you seem to be getting good value out of statically typed languages, really. But that doesn't mean it's "objectively the best" or whatever, it just means it works great for you with the tradeoffs you're willing to make. Personally, I make other tradeoffs, so other languages are better for me and what I work on. You seem to argue from the standpoint of "Obviously static typing is the best for everything, no doubt" (like many of the statically typing practitioners) but reality is almost never that black and white.
I love Lua but I spend way more time than is fair hunting through my codebase to find the name of that one thing I was trying to access but that doesn’t seem to be present any more and where and when it was created. (Inb4 “better unit tests!” — typesystems do that bit of thinking for you…)
I feel the same way about Lisps, but more strongly. It’s really fast and fun to sketch stuff up until you’re about 3 or 4 functions deep trying to figure out the shape of that one inner associative array and what made you think this little adventure was a good idea in the first place.
> It’s really fast and fun to sketch stuff up until you’re about 3 or 4 functions deep trying to figure out the shape of that one inner associative array and what made you think this little adventure was a good idea in the first place.
But those are exactly the situations where Lisps shine! Select the form in your editor, evaluate it and your editor tells you exactly what it is, both runtime and compile-time data, pure magic :)
Yes. I like to use type annotations on functions because it's documentation. If you use good names then it makes docstrings redundant. I'm thinking something like this:
    def bounding_box(points: list[Point]) -> Box:
        ...
A decent IDE will be able to quickly jump to definition of those Box and Point etc. if you need it. This is much better than having a docstring and having to look stuff up manually. The fact you can run a static type checker like mypy is a bonus!
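For completeness, a runnable sketch of what that signature could sit on top of. The fields of `Point` and `Box` and the body of `bounding_box` are guesses for illustration; only the signature comes from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    lower: Point  # corner with the smallest x and y
    upper: Point  # corner with the largest x and y


def bounding_box(points: list[Point]) -> Box:
    """Smallest axis-aligned box containing every point (assumes a non-empty list)."""
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    return Box(Point(min(xs), min(ys)), Point(max(xs), max(ys)))


print(bounding_box([Point(0, 2), Point(3, -1)]))
```

With the names doing the documenting, the one-line docstring only has to state the non-obvious assumption.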
I also really like being able to document the imperative code like `def do_thing() -> None`. Of course, it's completely up to the programmer to follow the rule of not doing side effects in routines that return something.
But having to do it everywhere? Ugh. I don't think people realise how powerful duck typing is for doing polymorphic code. I can write something like:
    def mean(things):
        return sum(things)/len(things)
And I don't care what concrete type you pass me as long as it supports `sum` and `len`. What am I going to do, define an ABC or typing.Protocol called `ListLike` or something? Hell no, I've got better things to do.
But of course learning when to define static types vs when not to comes down to experience. Python treats you like an adult. I feel like a lot of people who want static typing everywhere want it as training wheels for other devs they don't trust to make the right judgement calls.
You picked a poor example for `mean` because there's a very easy answer: collections.abc.Sequence – typing.Sequence in earlier versions – an object that supports __getitem__ and __len__. You can check the collections.abc documentation[1] for the available protocols; you might be surprised about how much they cover already. I often use Sequence/Iterable or Mapping as parameter types instead of concrete types like list or dict.
> I feel like a lot of people who want static typing everywhere want it as training wheels for other devs they don't trust to make the right judgement calls.
No, I want it because it makes my life a lot easier. It finds many classes of errors before runtime and improves auto-completion in my editor. For most programs, it's not a lot of effort. The frustrating part is interfacing with untyped libraries while using strict type-checking, but thankfully, more and more libraries are adopting typing. The type system itself is fine: most things you want to express are doable, but it's not at the same level of complexity as TypeScript. It's certainly a better type system than Go's for example.
Hmph, I knew someone would point out that `Sequence` exists or whatever... The point was really that this could be a new type that isn't fully captured by something built in.
As for `collections.abc.Sequence`, using an ABC sucks because then you have to define a class and inherit from it. I don't want to do that. I just want to pass something that conforms to the duck type. Are these now also defined as structural types/Protocols?
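On the structural question, a quick runtime experiment (the class `Samples` is made up): the small ABCs such as `Sized` and `Iterable` recognise any class that has the right method, while `Sequence` only accepts classes that inherit from it or are registered with it. As far as I know, static checkers follow the same split.

```python
from collections.abc import Iterable, Sequence, Sized


class Samples:
    """Has the sequence methods but inherits from nothing."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __iter__(self):
        return iter(self._values)

    def __getitem__(self, index):
        return self._values[index]


samples = Samples([1, 2, 3])
print(isinstance(samples, Sized))     # True: having __len__ is enough
print(isinstance(samples, Iterable))  # True: having __iter__ is enough
print(isinstance(samples, Sequence))  # False: Sequence is not structural

# Registration is the alternative to inheriting.
Sequence.register(Samples)
print(isinstance(samples, Sequence))  # True
```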
> But having to do it everywhere? Ugh. I don't think people realise how powerful duck typing is for doing polymorphic code.
Duck typing is indeed powerful, but it does not need to be a runtime check.
> What am I going to do, define an ABC or typing.Protocol called `ListLike` or something?
Yes, how am I going to know what types I can pass to mean instead? I would go to the extreme of saying that a lot of python code should be type annotated (if annotated at all) with protocols instead of concrete or base classes.
Of course until type checking is properly integrated in the interpreter this is all kind of pointless.
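A minimal sketch of `mean` annotated with a protocol rather than a concrete class. The protocol name `SizedIterable` and the class `Readings` are hypothetical. The point is that `Readings` never inherits from the protocol; a static checker accepts it because the methods match.

```python
from typing import Iterator, Protocol


class SizedIterable(Protocol):
    """Hypothetical structural type: anything with __len__ and __iter__."""

    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[float]: ...


def mean(things: SizedIterable) -> float:
    return sum(things) / len(things)


class Readings:
    """Does not inherit from SizedIterable; it only has the right methods."""

    def __init__(self, *values: float) -> None:
        self._values = values

    def __len__(self) -> int:
        return len(self._values)

    def __iter__(self) -> Iterator[float]:
        return iter(self._values)


print(mean([1, 2, 3]))           # 2.0
print(mean(Readings(1.5, 2.5)))  # 2.0
```

The duck typing is unchanged at runtime; the annotation only tells readers and checkers which ducks are acceptable.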
> Yes, how am I going to know what types I can pass to mean instead?
Oh, if I'm writing it for you (ie. I'm writing a library) then I'll put documentation (probably as type annotations, as I said in the first half of my comment). But a lot of code I write is just for me, or is going to be part of a completely self-contained project and it would be obvious from context what to pass. Like I said, it's a judgement call that Python expects you to be able to make.
As an undergrad, Python was my favorite language. Now it's one of my least favorite because of dynamic typing and the poor dependency management.
Python dicts, when used as composite types or records, are a literal hellscape. Grepping through the code to find out where stringly-keyed fields get written takes way more time than thinking about types ever would. These should be structs.
Static typing has so many advantages:
- It lowers the software defect rate. All type errors are caught for free at compile time instead of runtime. This makes the software strong and rigid instead of brittle, and it removes an entire category of tests you would have to write and maintain.
- Static typing makes code maintainable for other people, including future you. It's self-documenting. You know precisely what things are in the immediate scope.
- Static typing makes bug-free automated refactoring with tools possible. There is no greater pleasure than mutating code via its AST.
Static typing is not hard, either. Most typed languages don't require type declarations except in structs and function declarations - that's really not a lot of effort.
A type system is useless if you have no guarantee that the code you depend on actually uses it, and if any code that declares types can violate them at will.
Optional is equivalent to no type system at all, in my opinion.
Just because a type system is optional doesn't mean it doesn't do anything, it just means that you can control how much checking you get, both in terms of the code that is typechecked and how strictly it is checked. Avoiding some type errors through type checking is much better than not having any type checking. Or put another way, optional seat belts are much better than no seat belts at all.
Except in the real world, seat belts are mandatory. Because not wearing them is always a net negative.
Giving me some confidence is completely useless on a large scale, especially because an optional, unenforced type system could literally be lying even when it does tell me something (an external linter does not count as enforcement, because it is not exhaustive). I still have to write tests for everything to check types, or run some external type checker which is guaranteed to not be able to catch all errors because the language semantics don't even allow for it.
Giving me full confidence is not. In typed languages, if the compiler (not some external tool) says there are no type errors, then there are no type errors. I'm not "reasonably confident". I'm sure.
I also care about the usability of the dependencies for myself, the dev. If the dependency doesn't provide type information, I can miss bugs in my own code.
That misses the primary reasons many people use Python: the ecosystem and network effects. I would not use Python if it weren't the only language my colleagues know (many of them not computer scientists by training) and if it didn't have the best (or second/third best) libraries for almost everything.
I have no inside knowledge. But given the performance and scale ChatGPT runs at (and the caliber of the team), I think it’s safe to assume a lot of their production code is written in C++ / CUDA.
Python is really popular for both training and inference, and in both cases it uses native-compiled libs under the hood to ameliorate performance problems. I mean, maybe they’re chasing the last few % now, but it looks to me like most of their R&D is focused on their models and their interactive capabilities.
There is definitely a push for it; not all of us think it's an improvement. When I see modern Python code, it looks nothing like the (relatively) clean, smallish and easy-to-understand language that it was 25 years ago. I get it, things change. The language now caters to a different crowd and attracts different people.
But I have often wondered, if someone wants static typing, why not just use a statically typed language?
I would and do, but a lot of existing code (especially ML code) is Python and I find myself having to interface with it. It's not an awful lot of fun, especially when there are no annotations to record what the code expects and what it outputs.
> But I have often wondered, if someone wants static typing, why not just use a statically typed language?
A very high proportion of Python devs aren’t using it for the language, but for the libraries and ecosystem. Also lots of them weren’t the ones who picked it.
Meanwhile, not at least having the kind of autocomplete and documentation that type hints provide is kinda hellish on any project of more than 200 or so lines. The time savings from spotting runtime bugs before they happen is just a bonus.
Personally, I almost never need to add a type hint outside high-level definitions and function/method signatures, so they’re not really in the way even when I’m being pretty thorough with them.
Having a type system provides benefits. Having a type system also comes with costs. The trick is whether (you believe) your specific use case gets enough out of those benefits to pay the costs.
I can’t think of a time I have paid more for using a typesystem than I have been empowered by it. If you’re using types properly, it should be very empowering, and the cost not really perceptible. If anything, it reduces cost by moving things out of my brain and tests, into the typesystem.
If you're using dynamic systems properly, it should be very empowering, and the lack of types is not really perceptible. If anything, it reduces the time to code by moving things (having to consider types) out of my brain. I can just imagine what I want the code to do and type it, and most of the time it just kind of works out.
Mind you, I prefer a type system. I'd prefer to use Typescript over Javascript, etc. But I've also used a number of dynamic languages that let me work a lot faster when needed; Tcl, Ruby, and Python are examples of these.
I've also used some type systems that lift a heavier load, letting me pay more attention to the type definitions and know that it will "just work" at the end, because, mathematically, it works. Haskell falls into this category (though I rarely use it for anything other than fun).
I get it, you haven't used a dynamic system in a way that works out for you. But that doesn't mean they're wrong... just that others have different experiences.
Or perhaps a bad typesystem is worse than no typesystem at all? I think typing in Python is obviously getting better; it just feels like something went wrong when they constantly have to fix design mistakes other languages never made to begin with. TypeScript is fun, it's powerful and strict; typing in Python is ugly and frustrating.
If this is the first time people are introduced to static typing in programming, I can understand their frustrations and opposition to it. It's probably the worst type system in any modern, popular language out there, except perhaps when things like Clojure(Script) pretend to have types (shudder).
Python's optional type-hinting is leaps and bounds better than a static typing system (on its own). It's fluid, allows expressibility whilst balancing guard-rails, and is making crazy cool use of "execute-time" or "import-time" concepts (as an alternative to design-time and run-time).
So no, we're not coming around. Static typing as dictated by a heavy-handed and super-strict compiler has its place, but has gimped our industry for decades.
However, we do have to be very careful. Static, compile-time typing has kept the hipsters and junior devs at bay, and kept them from causing too much havoc as we've seen in the JS world. So it's definitely an upcoming hazard for us to navigate around and make sure we don't fall prey to. Otherwise Python will turn into another JS dumpster fire. Luckily, the JS developers are too distracted and enthralled with Node.js to jump ship.
Statically typed languages are in fact so much better nowadays that the industry should drop all dynamic / scripting languages for application development.
That will sadly probably never happen given the momentum they’ve built so the only choice is to retroactively add static typing and begin enforcing it in individual projects.
When starting a new project, I would suggest nobody choose a dynamically typed language. Between Swift, Kotlin, C#, Go, Rust, etc… there’s no need for anything else. As long as front end web is around you might need TypeScript - and as long as ML is Python centric you might benefit from using a bit of it. But I wouldn’t make them the primary language.
Context: Recovering dynamically typed language addict of 12 years. They’re slow, error prone, and don’t scale to a large engineering team.
Whether or not the performance matters...well that's somewhat subjective since Python has a fairly high performance floor which makes performance concerns a bit of a, "Why are you doing it in Python?" question rather than a, "How do I do this faster in Python?" most of the time. That said it _is_ more memory efficient and faster on attribute lookup.
Yes I know the slotted attribute is not in a __dict__, which definitely helps memory usage. But my point is that if the parent structure is itself in a dict, that access will swamp the L1 cache miss in terms of latency. Even the interpretation overhead (and likely cache thrashing) will eliminate L1 cache speedups.
And yes __slots__ improve perf, but it’s about avoiding the __dict__ access, which hits really generic hashing code and then memory probing, more than it is about L1 cache.
Where __slots__ are most useful (and IIRC what they were designed for) is when you have a lot of tiny objects and memory usage can shrink significantly as a result. That could be the difference between having to spill to disk or keeping the workload in memory. E.g., Openpyxl does this with a spreadsheet model, where there could be tons of cell references floating around.
> The __slots__ declaration allows us to explicitly declare data members, causes Python to reserve space for them in memory, and prevents the creation of __dict__ and __weakref__ attributes. It also prevents the creation of any variables that aren't declared in __slots__.
Emphasis:
> prevents the creation of __dict__ and __weakref__ attributes. It also prevents the creation of any variables that aren't declared in __slots__.
In short, if you create a slotted object with __slots__ it sends you down a fairly orthogonal object lifecycle path which does not create or use __dict__ in any way. This obviously has drawbacks/limitations like not being able to add new members to the object like a normal Python object.
From the second article:
> However, if you have __slots__, the descriptor is cached (which contains an offset to directly access the PyObject without doing dictionary lookup). In PyMember_GetOne, it uses the descriptor offset to jump directly where the pointer to the object is stored in memory. This will improve cache coherency slightly, as the pointers to objects are stored in 8 byte chunks right next to each other (I’m using a 64-bit version of Python 3.7.1). However, it’s still a PyObject pointer, which means that it could be stored anywhere in memory. Files: ceval.c, object.c, descrobject.c
Which I think addresses your concern about parent dict access...but I could also be misunderstanding your point.
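To make the quoted behaviour concrete, here is a small sketch. The two point classes are invented for illustration; it only shows the observable difference (no `__dict__`, no ad hoc attributes) and says nothing about cache effects.

```python
class PlainPoint:
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


class SlottedPoint:
    __slots__ = ("x", "y")  # only these attributes may exist on instances

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


plain = PlainPoint(1.0, 2.0)
slotted = SlottedPoint(1.0, 2.0)

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

plain.z = 3.0  # fine: stored in plain.__dict__
try:
    slotted.z = 3.0  # not declared in __slots__
except AttributeError as exc:
    print("rejected:", exc)
```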
I’ve met a lot of teams throughout my career who struggle daily with a badly performing Python codebase. You can write a no-frills web service in C#, Go, Rust or JavaScript. And, so long as you don’t do anything stupid, it’s usually plenty fast enough from day 1 to handle your users. But in my experience, the same isn’t true of Python. I’m sure Python web services can be made to run ok, but because it’s slow by default, I bet a lot more time is spent optimising Python programs around the world than optimising JavaScript.
Good point. It’s more about choosing a good algorithm though.
A brute force O(N) in C++ may be fast enough, in a situation where you need to use O(logN) to get the equivalent speed in Python. Squeezing out a few extra percent from a O(N) in Python by using slots will not be enough.
Of course that doesn’t mean you should leave performance on the table if the optimizations have noticeable effects.
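As a rough illustration of the algorithmic point (the function names are made up, and no timings are claimed here): the same membership test written as a linear scan and as a binary search over sorted data. Micro-optimisations such as slots shave a constant factor off the first; switching to the second changes the growth rate.

```python
from bisect import bisect_left


def contains_linear(sorted_values: list[int], target: int) -> bool:
    """O(N): scan every element until a match is found."""
    for value in sorted_values:
        if value == target:
            return True
    return False


def contains_bisect(sorted_values: list[int], target: int) -> bool:
    """O(log N): binary search; requires the input to be sorted."""
    i = bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target


data = list(range(0, 1_000_000, 2))  # sorted even numbers
assert contains_linear(data, 999_998) == contains_bisect(data, 999_998)
```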
Right. Another way I’ve heard it put is that a Python program running on a modern computer is equivalent to the same Go program running on a computer from 20 years ago.
Depends on the type in question. If you are fetching and operating on a large number of records then it can matter. But otherwise the answer is more often that it does not really matter.
Better than both: use Pydantic. You'll never want to use anything else ever again. It's truly transformative in how you write code. Full type hinting support as well as strong verification that your data actually conforms to the types you set. Full recursive parsing of arbitrarily nested types, and it can parse tagged and untagged unions.
TypedDicts are enormously helpful in defining args a function takes. You can’t do that with either dataclasses / pydantic without passing instantiated objects as args - which is really cumbersome.
I actually have a function for this! I use it all the time and it's super helpful.
    from typing import Callable, ParamSpec, TypeVar

    from pydantic import BaseModel

    T = TypeVar("T")
    U = TypeVar("U")
    V = TypeVar("V")
    P = ParamSpec("P")

    def modelargs(model: Callable[P, U]):
        # Borrow the model constructor's parameters (P) for the wrapped function.
        def _modelargs(func: Callable[[T], V]) -> Callable[P, V]:
            def __modelargs(*args: P.args, **kwargs: P.kwargs) -> V:
                return func(model(*args, **kwargs))  # type: ignore
            return __modelargs  # type: ignore
        return _modelargs

    class MyModel(BaseModel):
        foo: str
        bar: int = 4

    @modelargs(MyModel)
    def test_func(model: MyModel):
        print(model.foo, model.bar)
        return 4

    test_func(foo="Hello", bar=20)  # -> prints Hello 20
If you look in your editor you'll see that the type signature for test_func is `(*, foo: str, bar: int = 4) -> int`. It's unfortunate that you have to write the model type twice but in exchange you don't have to write the args twice.
I think serjester was talking about PEP 692 for typing kwargs with TypedDicts. Your recipe is a bit different.
Pydantic is targeting other use cases. The point of TypedDicts is compile-time safety without run-time overhead. Pydantic is useful for a lot of things, but performance isn't exactly its strong suit (written as of 2.9.2, I was just revisiting it earlier this week).
Anyway, in the same spirit of function signature hacking, I've found the following useful for "inheriting" them:
    from typing import Any, Callable, TypeVar

    import requests

    _T = TypeVar("_T", bound=Callable)

    def inherit_signature(_function: _T) -> Callable[..., _T]:
        # No-op at runtime; it only tells the type checker to reuse _function's signature.
        return lambda f: f

    # Requests for example has some long signatures (via typeshed).
    class CustomSession(requests.Session):
        @inherit_signature(requests.Session.post)
        def post(self, url: str, *args: Any, **kwargs: Any) -> requests.Response:
            ...
Really like Pydantic - assuming you’re happy to step outside the standard library. It does have a few sharp edges, but out of the box it includes all the stuff you eventually end up needing anyway. attrs is the other big Python parsing / validation library. I know nothing about it other than it’s also popular and that it distinguishes itself philosophically from Pydantic in a number of ways.
Don’t forget to use `ConfigDict(frozen=True)` absolutely everywhere!
That said, there's also msgspec [1], which I've not used yet but plan to for my next project. It is supposedly quite a bit faster than Pydantic in de/serialization.
Python very much lives up to its goal of being a language for consenting adults. It's part of the culture that you don't write code that tries to forbid users of your code from doing "wrong" things, and such efforts are mostly futile anyway. You can reach right into a Python object's internal data and modify it directly; taking away __setattr__ is just a suggestion. It's the same as in C where you can modify const variables if you're determined enough.
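A quick sketch of what "just a suggestion" means in practice. This uses a frozen stdlib dataclass rather than a Pydantic model (my substitution, to keep it dependency-free), and the `Settings` class is invented for the example.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Settings:
    retries: int


settings = Settings(retries=3)

try:
    settings.retries = 5  # the polite route is blocked
except FrozenInstanceError:
    print("frozen: normal assignment refused")

# Bypass the class's overridden __setattr__ by calling the base implementation.
object.__setattr__(settings, "retries", 5)
print(settings.retries)  # 5
```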
I love Python and use it extensively. I just find the extensive typing rituals a bit funny. If one needs this much bandaid (essentially for code editors), then Python is not a good choice for the task.
I feel like this needs to be mentioned more: If you don't have some kind of system to enforce types, then TypedDict does nothing at all.
You can store a float in an attribute annotated as a string, and default Python will not stop you, or display any kind of warning. The typing is purely for development, and does nothing once compiled.
If you want typing to actually be enforced, you need to use something like Pydantic.
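Here is a small demonstration of that claim, with made-up `Movie` and `Label` types. Both assignments are errors for a static checker (hence the ignore comments), yet the interpreter runs them without a warning.

```python
from typing import TypedDict


class Movie(TypedDict):
    title: str
    year: int


class Label:
    text: str = ""


# Both assignments are type errors for Pyright/Mypy, yet plain Python runs them.
movie: Movie = {"title": 3.14, "year": "nineteen"}  # type: ignore
label = Label()
label.text = 2.5  # type: ignore

# A TypedDict value is an ordinary dict at runtime, with no validation attached.
print(type(movie) is dict, movie["title"], label.text)
```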
Counter argument: this is actually a myth that gets repeated far too often, and misses the point of what static types do.
If your input data is the correct type (big if #1), and if your types are sound and consistent with each other (big if #2), then you don't need something like Pydantic, because the types guarantee that there will never be a float stored somewhere that's annotated as a string. It cannot happen, because as soon as you try to store a float in a string attribute, your type checker (Pyright, Mypy, etc) will complain. And if you're consistently running that type checker over your code, e.g. in a CI job that runs over your codebase for every push, then you can never have checked-in code where the types are incorrect.
There are the two big caveats above, but these turn out not to be such big deals. Caveat #1 is that you can't rely on external data to match the types you've described internally. This is an ideal use case for Pydantic, like you say: you check external types once, at the boundaries of your program, and then internally you can be confident that the types will always be correct.
Caveat #2 is that you don't break the type checker's limits. This generally means avoiding things like `Any`, and ensuring that your typed code always calls other typed code, and never untyped, un-annotated code. The easiest way to do this is to start by typing the leaf files in your program (the ones that only get imported, and never import anything in turn). These can't import untyped code, so if they're fully typed and the type checker passes, then you can be confident that their types are correct. Now sure, a bad caller could try and call one of these functions with invalid input, but as they say, Python is for adults, and you're free to call a function with the wrong arguments, and you're free to deal with the results.
Now, you can add types to code that only imports typed code, and because you've checked the imported code you know it's correct, so if you also check the importer and it's also correct, then you can be confident at that level too. You can keep going until you've added types to the entire codebase, and now you can be confident that the entire codebase always passes the correct types, and therefore that no runtime enforcement is necessary.
There is a third caveat, which is that this relies on having a powerful enough type system to model all the things that you were originally doing in Python's dynamic type system. This is hard, because you can do some pretty wild things in Python, and my own experience with Python type checkers has been a bit disappointing - they handle more boilerplate-heavy code fairly well, but seem to struggle on more idiomatic code that uses Python's dynamic features. However, that seems to be improving all the time, and Typescript demonstrates that it's certainly possible to model a dynamic language with static types to a very reasonable accuracy.
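The "check once at the boundary" idea from caveat #1 can be sketched without any third-party library. The `Order` type and `parse_order` helper below are hypothetical, and a real project might well use Pydantic for this step instead of hand-written checks.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    item: str
    quantity: int


def parse_order(raw: str) -> Order:
    """Validate untrusted JSON once; everything downstream can trust Order."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    item = data.get("item")
    quantity = data.get("quantity")
    if not isinstance(item, str):
        raise ValueError("item must be a string")
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise ValueError("quantity must be an integer")
    return Order(item=item, quantity=quantity)


def total_units(orders: list[Order]) -> int:
    # No runtime checks here: the type checker vouches for the inputs.
    return sum(order.quantity for order in orders)


orders = [parse_order('{"item": "tea", "quantity": 2}')]
print(total_units(orders))  # 2
```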
Which is why you use a type checker to ensure that it never happens in the codebase. If there is no line of code that ever does something like
    int_field: int = "string"
or
    self.int_field = "string"
then you can be confident that the language will never store a string value in that integer field. And if you expand this to cover all fields and variables in your codebase, you can see that, if you use the type system correctly, the language won't violate those constraints.
There are some exceptions to this, like I said in the previous comment, but I think when getting started with typed Python, it's easier to assume that the types will be valid, because that's the case almost all of the time. Usually, overriding type assertions requires some kind of explicit cast or statement, and this is usually documented.
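One example of such an explicit override is `typing.cast`. This is a sketch of the escape hatch, not a recommendation: it shows that the cast is visible in the source, and that it does nothing at runtime.

```python
from typing import cast

raw: object = "string"

# cast() only informs the type checker; at runtime it returns its argument unchanged.
int_field: int = cast(int, raw)

print(type(int_field).__name__)  # str
```

Because the override has to be spelled out, it is easy to grep for or to ban in review.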
Another way to accomplish this, and also eliminate any and all possibilities of human error, is to make the type checker part of the language. You're still doing the same thing, just with extra steps and implicit rules.
This is the kind of copyediting advice ChatGPT gives. I think everyone gets that the author doesn't know what you, in particular, think about TypedDicts. Read things more charitably; this is not a good use of time to discuss.
I’m not a big fan of the “my opinion is fact” or “your opinion is wrong” headlines. They can be mildly funny in the right context, but it’s been done so much that they’re just a bit boring now. I’m especially bored of seeing this convention in conference presentation titles.
Honestly, right now it looks like you are the one derailing the discussion. And when it didn't go your way, you started quoting the "guidelines" as if to smack us over our fingers for misbehaving. This entire sub-thread at time of writing this is like 20% of the entire discussion; and that's sad.
Yup, using "you" in headlines is a pattern that needs to die.
I get that it's attention-grabbing, but it's because it's rude.
You don't know anything about me. You don't know what I think or what I already know or what I won't believe.
I know it's not a big deal in the grand scheme of things, but it's just one of those little aggravating things that makes life just a little bit worse each time you come across them.
I agree (!), and it was my typo (amid Figaro the rescue kitten helping with the keyboard) and attempt to remark that such titles probably don't even get clicked.
It's a qualifier. In a sense, they do know their audience fairly well because someone who clicks on that link is intrigued and feels they have more to learn about it. Anyone who is pretty knowledgeable about the subject will go "pff .. yeah right" and not even click on it.
That said, it does annoy me too.
Also what annoys me is that I constantly try to play devil's advocate for things like this even if I don't always agree with the conclusion of the advocate.
I’m going to send you some JSON to update your object (an HTTP PATCH request). There are three possible updates I might want to make to the subscription field:
- change it to a new value
- set it to None
- leave it as-is
In the JSON the first two options are specified by sending an object with a "subscription" field, either set to a string or to null. The third case is expressed by omitting the field.
The OP asks how all three cases could be represented in Python, and points out that one could not use subscription=None to represent both case 2 and 3 above.
If you need to be able to represent what looks like a tree of data
    set - no value
        - new value
    do_not_set
(sorry about the horrible tree) then it seems like you should represent the data like that. Pass in action (set/ignore) and value (value for set, no value for ignore). Or even (set/unset/ignore), which allows you to have some sanity checking that, if the action is set, an actual value is provided (and vice versa for unset).
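A minimal sketch of that (set/unset/ignore) shape, with invented names (`Action`, `FieldUpdate`, `apply`); the sanity check mirrors the one described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    SET = "set"
    UNSET = "unset"
    IGNORE = "ignore"


@dataclass(frozen=True)
class FieldUpdate:
    action: Action
    value: Optional[str] = None

    def __post_init__(self) -> None:
        # Sanity check: a value must accompany SET and nothing else.
        if (self.action is Action.SET) != (self.value is not None):
            raise ValueError(f"{self.action.name} used with value={self.value!r}")


def apply(current: Optional[str], update: FieldUpdate) -> Optional[str]:
    if update.action is Action.SET:
        return update.value
    if update.action is Action.UNSET:
        return None
    return current  # IGNORE: leave the field as it was


print(apply("basic", FieldUpdate(Action.SET, "pro")))  # pro
print(apply("basic", FieldUpdate(Action.UNSET)))       # None
print(apply("basic", FieldUpdate(Action.IGNORE)))      # basic
```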
I’ve been in the same boat. The amount of pain you can reduce by pushing type annotations through a codebase like that is spectacular. And although I love dataclasses, the non-totality that you get from TypedDicts is key for modeling the kind of mess you describe.
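The non-totality maps neatly onto the PATCH example from earlier in the thread. In this sketch (the `AccountPatch` type and `apply_patch` helper are hypothetical), a `total=False` TypedDict lets "key absent" and "key present but null" stay distinct.

```python
import json
from typing import Optional, TypedDict


class AccountPatch(TypedDict, total=False):
    """Hypothetical PATCH body: the key may be absent, null, or a string."""

    subscription: Optional[str]


def apply_patch(current: Optional[str], patch: AccountPatch) -> Optional[str]:
    if "subscription" not in patch:
        return current            # omitted: leave as-is
    return patch["subscription"]  # present: new value, or None to clear


print(apply_patch("basic", json.loads('{"subscription": "pro"}')))  # pro
print(apply_patch("basic", json.loads('{"subscription": null}')))   # None
print(apply_patch("basic", json.loads('{}')))                        # basic
```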
While we’re on the topic, a better typesystem is better than a worse typesystem. Thanks for coming to my TED talk.