Ask HN: Let's discuss Python type hints guidelines
12 points by gr33ndata on Feb 10, 2022 | 10 comments
I am starting to appreciate how Python type hints serve as extra documentation, as well as how they are being hacked to make stuff like data classes.

Nevertheless, I feel they are sometimes overused, and maybe we need to think about guidelines, so we can have a better balance between clarity and verbosity.

We learned from experience that clear code doesn't need comments. Functions and variables with good naming do not require comments. For example, is the following comment useful?

    # Adding a and b and putting results in c
    c = a + b

Not saying that comments aren't needed at all. In fact, they are essential for the subtle details that are not that clear. Or as John Ousterhout put it, “Comments augment the code by providing information at a different level of detail.”

Now with type hints, I want to argue the same, obvious stuff doesn't need to be annotated. Here are some examples.

- *args and **kwargs: We all know that these are lists and dictionaries respectively. However, I've seen code where they are annotated still.

- dictionaries: Like 99% of the time, the dictionary keys are strings, yet I've seen them being annotated with Dict[str, float] or such. Sure, maybe you are dealing with a case where the dict keys are something else, but then I'd argue that the reader needs to know more about that dict and what it carries than its types.

- When a function doesn't return anything, should we still annotate it with None, or we are kinda stating the obvious here.

- When the variable name is obvious. Do I still need to tell the type if the variable is called timestamp, days_since_creation, is_valid, num_item_sold, currency, or amount_paid?

- All data scientists use df to refer to DataFrame, the name df should be enough to infer the type and a type hint here will be just noise?

The point is, I want to know what people smarter than me think. Maybe I am wrong about it, and type hints still offer some value that I miss here.

Finally, one might argue that obvious type hints still need to be stated so that they can be enforced by tools like mypy or so. That's a different argument, and personally, I feel like type checking is a way to hack a dynamic language like Python to become Java. So, I'm only interested here in the documentation value of type hints.




For me, a big part of the value of type hints isn’t just the legibility, it’s also helping the interpreter to help me. I can have the interpreter tell me that a string got passed into a function instead of the list of strings that I was expecting, or I can have my code output a single letter instead of a whole word from the array access call in my code.
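To make that concrete, here is a minimal sketch of the string-versus-list mix-up. The function name `first_word` is made up for illustration. At runtime Python accepts the bare string without complaint, while a checker such as mypy would be expected to reject the second call before the code ever runs.

```python
from typing import List


def first_word(words: List[str]) -> str:
    """Return the first element of a list of words."""
    return words[0]


# Intended use: a list of strings goes in, a whole word comes out.
print(first_word(["hello", "world"]))  # hello

# Accidental use: a bare string is indexable too, so this runs without error
# and silently yields a single letter. A static checker should flag this call
# because a str is not a List[str].
print(first_word("hello"))  # h
```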

I’ll figure out the bug eventually either way, but having the compiler give me the clearer warning earlier in the development process (sometimes even from pure static analysis) is a significantly more pleasurable development experience and a potential time saver

And 50X that, especially the time saved, if I’m integrating with code you didn’t author.



Mostly working in languages with strong type inference, I too find many type hints in Python a little bit verbose and unnecessary for a reader. The thing is, though, without giving up on some features that make inference undecidable in general (like arbitrary subclassing), you're not going to get a language that can both infer most types from code and do so in a way that actually gives you some interesting guarantees about it.

You mention that you're not interested in typechecking, but to be honest, why bother with the restricted format of stuff from `typing` if you don't care about automatic processing of it? You could just as well emulate math notation with ASCII, use LaTeX or simply write descriptions. Though at that point, are they actually more than traditional documentation?


Hey there.

I'm the author of Robust Python, and the first quarter of my book is all about type hinting, so suffice it to say I've thought about it a lot. What follows is strictly my opinion on things.

I absolutely agree that there needs to be a balance between verbosity and clarity. Obvious things such as some variable names probably don't need to be annotated, but I think we disagree on what an obvious name is. When I see a variable named timestamp, is that an int, or a datetime? When I see days_since_creation, is that an int or a timedelta? When I see amount_paid, is that an int, float, decimal.Decimal or some custom type? The only way I can answer these questions is to look at how it's used. For variables that may be clear, but for things like parameters in a function it's less clear. I have to look at calling code to see what people can pass in, and I have no guarantee that all the calling code is even visible to me. In these cases, I will happily trade off the verbosity of a type annotation so that I can have a better chance of correctness as more people work on the code over years. Now, if the project is small, maintained by a single developer or something like that, then I would say type hints can just be added noise. (But if at any time you need to change the name of a variable to be more clear, such as "amount_paid" -> "amount_paid_decimal", I would encourage a type annotation instead.)
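As a rough sketch of that trade-off, the function below answers the "which type is it?" questions in the signature itself, so nobody has to go hunting through calling code. The function name `is_overdue`, the `amount_due` parameter and the 30-day threshold are all invented for illustration.

```python
from datetime import timedelta
from decimal import Decimal


def is_overdue(
    days_since_creation: timedelta,
    amount_paid: Decimal,
    amount_due: Decimal,
) -> bool:
    """Treat an invoice as overdue when it is older than 30 days and not fully paid.

    The 30-day threshold is an assumption made up for this example.
    """
    return days_since_creation > timedelta(days=30) and amount_paid < amount_due


# The annotations tell the caller to pass a timedelta and Decimals,
# not an int number of days or a float amount.
print(is_overdue(timedelta(days=45), Decimal("10.00"), Decimal("25.00")))  # True
print(is_overdue(timedelta(days=5), Decimal("10.00"), Decimal("25.00")))   # False
```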

I also think it's a mistake to treat type checking as a hack to feel more Java-like. I feel like the more appropriate counterpart is TypeScript (Guido has talked about it before: https://developers.slashdot.org/story/21/05/22/0348235/what-...). I believe type annotations (along with type checkers) were introduced to Python to solve an absolutely real problem: keeping a codebase robust when you have multiple people working on it (across multiple years, even decades).

All in all, yes, I do not think you should be type annotating everything, but I also am wary about type annotating too little. All of this advice is context-dependent; there is no one right answer, and it depends on project size/value. But if you want to increase communication and lessen the number of mistakes someone can make as they modify your code (especially if they never get the chance to talk to you), I whole-heartedly endorse type checking and type annotations. After all, your code is going to be read much more often than written, and I'd rather optimize for the unfortunate souls who have to read my code later on. I'll happily pay the penalty now for writing out a few more characters.


I just read your book this last week and loved it! I came here to recommend it.

One thing I still struggle with, though your book helped a lot, is when you start trying to get typing right with decorators and other types of misdirection. You pretty quickly start reading through PEPs and forums and using weird hacks that may or may not be a good idea.

For example, I’ve spent a fair amount of time trying to get PyCharm to understand a classmethod property to no avail. It’s hard to know if it’s a problem with my typing, a problem with PyCharm, or something else.

Anyway, thanks for the great book!


Thanks!

If it makes you feel better, I struggle with that too from time to time; it happens to the best of us. It's a tricky thing to balance the use of more complicated features, because sometimes it clarifies things and sometimes it makes them more complicated (and it often depends on the audience).

My main recommendation is to not get too hung up on fighting your tools; at some point, you run into a law of diminishing returns. I've run into bugs in tools and have sucked it up and put in a # type: ignore. But, and this is critical, I try to have a comment for future developers who might have a better way to do it, or even a link to a public bug. Remember, your main goal is communication to the future, not necessarily a strict dogmatic rigor of type adoption.
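Roughly, that habit looks like the sketch below. The scenario is hypothetical: assume the checker reports a false positive on the marked line. The function name `read_port` and the `<link to public bug>` text are placeholders, not a real issue.

```python
from typing import Any, Dict


def read_port(settings: Dict[str, Any]) -> int:
    """Read the port number from a settings mapping."""
    # Hypothetical: suppose the type checker wrongly complains about the next
    # line. The ignore silences it, and the trailing note tells future
    # maintainers why it is there and where to check whether it can be removed.
    port = int(settings["port"])  # type: ignore  # checker false positive, see <link to public bug>
    return port


print(read_port({"port": "8080"}))  # 8080
```

The ignore comment has no effect at runtime; it only changes what the checker reports for that line.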

That being said, I always love a good programming puzzle, so if you get stuck on something like that, feel free to reach out to me on Twitter. No promises I have the answer, but I like learning just as much as the next person.


Relax. As I understand it, you don't need to push yourself so hard. I came from C/C++/Java first, then moved to Python. When I learned Python, everyone said that the good parts of Python were its duck typing and dynamic typing. Now everyone talks about type hints. I can't quite express my feelings about it. In my experience, type hints can improve your code quality if you are working with a huge code base or a complex project. However, if you just write some scripts to automate daily tasks, using type hints is overkill.


“Clear code doesn’t need comments” is a bit of a tricky statement. There are always two aspects of a problem: the math and the physics. Say a+b is clear from a math point of view, but it’s completely ambiguous from a physics point of view. In other words, math is the what/how and physics is the why. I think the why needs good comments, and for the what I agree with you that clean code doesn’t.


Agree about the what vs the why, that's why I see type hints leaning towards the what most of the time


I think that you have a few (pretty common, I would say) misconceptions about what type hints in Python are.

> I want to argue the same, obvious stuff doesn't need to be annotated

The first misconception is that type hints are something optional: they are not. Every single Python variable, function argument and return value is annotated - it's only that, if there is no explicit annotation, it is assumed to be `Any`.

As we know, Python is strongly-typed: every object has a clearly defined type, even if it's the default `object`. However it is also dynamically-typed, meaning that variables do not constrain which types they can contain; so every variable can contain anything.
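A small sketch of both halves of that statement, using made-up variable names: the same name can be rebound to objects of different types (dynamic typing), but each object keeps its own type and mixing incompatible ones fails instead of being silently converted (strong typing).

```python
# Dynamic typing: the name "value" is not tied to one type.
value = 42
first_type = type(value).__name__   # "int"

value = "forty-two"
second_type = type(value).__name__  # "str"

# Strong typing: objects are not implicitly coerced across types.
try:
    "1" + 1
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

print(first_type, second_type, outcome)  # int str TypeError
```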

> Python type hints serve as extra documentation

The second misconception is that the annotations are there to document which types a variable is supposed to contain. I mean yes, that is definitely a welcome effect, but their primary purpose is to enable static analysis, i.e. to validate that there are no unexpected collisions between types when executing a program.

The main difference between Python and statically-typed languages like C++ or Java is that, in Python, the static analysis step is optional - if you don't run `mypy` or another type checker before deploying, your program will still happily execute as if there were no annotations at all.
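Here is a minimal illustration of "still happily execute", with a made-up function `double`. The interpreter stores the annotations as metadata but does not enforce them, so a call that a type checker would be expected to reject still runs.

```python
def double(x: int) -> int:
    return x * 2


print(double(21))    # 42
# The annotation says int, but nothing stops a str at runtime:
print(double("ab"))  # abab

# The annotations are just data attached to the function object.
print(double.__annotations__ == {"x": int, "return": int})  # True
```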

> balance between clarity and verbosity

Another misconception is that type annotations inevitably make your code more verbose. In fact, you can have all the benefits of the typing annotations without having a single type hint in your code.

Python typing specifications allow you to move all your typing annotations into separate files, and keep the main code clean of them. For example imagine a source file, let's call it `foo.py`, which looks like this:

    def foo(bar, baz):
        return f"result: {bar + baz}"

    class Bla:
        def lalala(self):
            return 42
Now you can have a separate file, named `foo.pyi`, which looks like this:

    def foo(bar: int, baz: int) -> str: ...

    class Bla:
        def lalala(self) -> int: ...
Now, when you run `mypy`, it will use that second file to validate the typing information. This is called a "stub file", and you can read more about them in PEP 484 [0].

A few more misconceptions:

> - *args and **kwargs: We all know that these are lists and dictionaries respectively. However, I've seen code where they are annotated still.

Annotating `*args` and `**kwargs` is not about annotating those variables themselves; it is about annotating their contents. So if you annotate `**kwargs` with `dict`, you're telling `mypy` that you expect the value of each keyword argument to be a dictionary, which is probably not what you wanted.
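A short sketch of what that looks like in practice, with a made-up function `total`. The annotation on `*args` describes each positional argument and the annotation on `**kwargs` describes each keyword value; the containers themselves (a tuple and a dict with string keys) are implied.

```python
def total(*args: int, **kwargs: float) -> float:
    # A checker reads this as: args is a tuple of ints,
    # and kwargs is a dict mapping str keys to float values.
    return sum(args) + sum(kwargs.values())


print(type(total))                     # <class 'function'>
print(total(1, 2, tax=0.5, tip=1.5))   # 5.0
```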

> When a function doesn't return anything, should we still annotate it with None, or we are kinda stating the obvious here.

Yes, because without an explicit annotation the default is `Any`, which is different from `None`. Specifically, as explained in the `mypy` documentation [1], an unannotated function is considered to be "dynamically typed", and some internal typing errors might go unchecked.
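A sketch of the difference, with made-up names. Both functions behave the same at runtime. As I understand mypy's defaults (that is, without `--check-untyped-defs`), only the second one counts as annotated, so only its body gets checked.

```python
from typing import Dict

counters: Dict[str, int] = {"visits": 3}


def reset_untyped():
    # No annotations at all: treated as dynamically typed,
    # so mistakes inside this body may go unreported.
    counters.clear()


def reset_typed() -> None:
    # The "-> None" is the only annotation this function can have,
    # and it is what opts the function into checking.
    counters.clear()


print(reset_untyped())  # None
print(reset_typed())    # None
```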

Again, for the purposes of static analysis nothing is "obvious".

> When the variable name is obvious. Do I still need to tell the type if the variable is called timestamp

Even if we look at annotation as simply documentation - does your variable contain a datetime object, a numeric timestamp (e.g. Linux epoch), or maybe an ISO 8601 formatted string?
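For instance, the same moment can reasonably live in any of these three forms. The variable names below are invented, and the date is just the day this thread was posted.

```python
from datetime import datetime, timezone

# Three plausible meanings of a variable called "timestamp":
as_datetime: datetime = datetime(2022, 2, 10, 12, 0, tzinfo=timezone.utc)
as_epoch: float = as_datetime.timestamp()   # seconds since the Unix epoch
as_iso: str = as_datetime.isoformat()       # ISO 8601 string

print(as_iso)  # 2022-02-10T12:00:00+00:00

# All three round-trip to the same moment, but code written for one form
# will not work with the others, which is what the annotation makes explicit.
print(datetime.fromtimestamp(as_epoch, tz=timezone.utc) == as_datetime)  # True
print(datetime.fromisoformat(as_iso) == as_datetime)                     # True
```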

> I feel like type checking is a way to hack a dynamic language like Python to become Java

As described above, the only difference between Python and Java (when it comes to typing) is that the latter enforces static analysis at compile time. Python, being dynamically interpreted, would require this analysis to happen on every run, which would make most programs completely unusable, so this analysis is completely optional. Yes, at runtime a Python function can be passed a type it does not know how to handle; but this is also perfectly possible even in statically typed languages.

Static analysis is a powerful tool, and many static language proponents insist that it gives you enough correctness validation that you don't need unit tests. But it is only a tool, and it is a great benefit to have the option of using it when desired and ignoring it when practical; most languages don't have that luxury.

[0] https://www.python.org/dev/peps/pep-0484/#stub-files

[1] https://mypy.readthedocs.io/en/stable/getting_started.html?h...



