Python: Even a Feature That You Do Not Use Can Bite You (petrzemek.net)
49 points by ingve 31 days ago | 44 comments

Understanding the basic syntax for assignment isn't a high bar to clear. It's perhaps not ideal to use colons both as the "assignment-like" separator between keys and values inside dictionaries and as the marker for type annotations, but introducing any new syntax into a language that's been around for decades is never without compromise. Colons are also used at the end of block statements and in slices, and nobody complains.

In this case you hit the most benign sort of change: Code that was previously syntactically invalid is now syntactically valid but has silly semantics. There’s never going to be a way to avoid this with full generality, and your code never worked — type annotations just delayed your runtime error by one line.

When annotations were introduced in Python 3.0 (PEP 3107), there was no requirement that they be used to annotate types rather than anything else, so any valid Python expression is allowed syntactically. In 3.6 the use of annotations for anything other than types was provisionally deprecated, and in 3.7 it is fully deprecated. In 3.8, non-type usage will be disallowed as a TypeError, and the only allowable non-type annotations will be strings (which will be interpreted as forward references, for backward compatibility with pre-3.7 code that uses them that way). At this point your code will revert to raising an exception on exactly the same line it used to.

The problem is not understanding, it's fat-fingering (especially when dict literals and dict comps use `:`).

> type annotations just delayed your runtime error by one line.

Or it introduces a subtle bug because this was intended as a reassignment or somesuch.

The real problem, as others point out, is not using a linter to catch errors before runtime. Mypy flags this problem through static analysis.

You're right, I'm being uncharitable by saying "understanding". I'm annoyed he bothered writing an accusatory blog post about this aspect of the language without mentioning any of the alternative syntax proposals helpfully discussed in the PEP, or exploring why annotations work this way. The designers have been very transparent and conservative in considering different ways of doing it, and the OP doesn't offer any ideas for improving the interpreter, or note that the behavior he's complaining about will be fixed in 3.8.

> Mypy flags this problem through static analysis.

The problem is that there's not really such a thing as static analysis in python. For example, if I wrap it in a function like this, mypy doesn't complain at all.

    def blah():
        ages = {}
        ages['John']: 42
        return ages

Mypy intentionally does not check annotations inside untyped `def` statements. This is a feature meant to make it easier to gradually introduce types in code that wasn't intended to be typechecked. If you call mypy with the flag `--check-untyped-defs` or just annotate the function with anything at all, you will see mypy checks the inside of the function:

    from typing import Any

    def blah() -> Any:
        ages = {}
        ages['John']: 42
        return ages

I'm not sure what you mean by "there's not really such a thing as static analysis" in Python -- that's exactly what Mypy does.

Cool, I knew I had to be missing something. You're right of course.

There is a way to avoid this kind of problem; have sufficiently complicated syntax that most random permutations caused by "fat-fingering" are incorrect. Python's problem is that the syntax is so minimalist, and the semantics so rich, that any random typo is liable to do something.

For what it's worth, I think Python's approach here is the right one (at least, for Python). As others have noted, the common answer among Python users is to just use a linter to catch these kinds of issues, although I find that unsatisfying: linters can't catch everything, and they require you to have your stuff together well enough to use them (e.g., you can't be a student who is already overwhelmed by the bare necessary stack). There are things that seem like they could be done to alleviate these kinds of problems within the language itself; most obviously, for the particular problem in the OP, a variable annotation expression could require the right-hand side to be a type expression. It's not clear to me why these kinds of moves aren't taken.

> a variable annotation expression could require the right-hand side to be a type expression. It's not clear to me why these kinds of moves aren't taken

Because when annotations were introduced in PEP 3107 they were not meant exclusively for type hinting. We're in a period now where non-type usage is deprecated but allowed for compatibility with old code. In 3.8 a non-type evaluation of the RHS will be disallowed.

This is a weird article to me. There seems to be an implicit condemnation of Python for this new feature resulting in non-error, unexpected behavior under certain typos, despite the fact that that is true for literally any typo that results in valid code in any language (such as, e.g., = instead of ==). I don't understand how it would be possible to add a new syntactic option without being "bit".

Regarding type hints' ability to be arbitrary code, I suspect this is to allow you to use nested type hints from the `typing` library, like `Tuple[int, int]`, as well as more complex or user-defined types. Plus it just gives more flexibility in general.

I'd be interested to hear the author's suggestions on alternative syntax or ways to catch when the programmer intended something other than a type hint, or even arguments that type hints are extraneous and should not have been added - personally, I've found them quite useful recently, in combination with pylint.

I read it as a more general cautionary tale rather than an attack on this particular feature or language. Be aware of new features, because even ones you don't know about and don't use can cause you problems. Or alternatively for language developers, be aware that new features can make the language harder to use even for people who don't use them.

There were some people who thought that trying to bolt a type system on to Python was a bad idea.

A common response to them was "you don't have to use it; you can ignore it and you're no worse off than you were before". I think it's reasonable for him to point out that this isn't quite true.

I dunno. This sort of problem would apply to any new feature with syntax. It doesn't seem fair to blame the type annotations specifically.

But do you think it's reasonable for him to stop the language from progressing because he makes stupid mistakes like these?

I mean technically it's a feature you're using, just unintentionally. Kinda like setting a mutable object as a default argument.
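For anyone who hasn't hit that classic gotcha, a minimal sketch:

```python
def append_item(item, bucket=[]):
    # The default list is created once, at function definition time,
    # and silently shared by every call that omits the second argument.
    bucket.append(item)
    return bucket

first = append_item(1)
second = append_item(2)   # reuses the very same list
assert first is second and first == [1, 2]
```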

But I'll admit I got bit by exactly this recently, wrote up an "assignment" using a colon instead of an equal sign and it didn't do anything, but didn't fail either.

Somewhat frustratingly, pycharm didn't warn about it either. I should probably check their bug tracker to see if somebody already reported it.

I don't think so, because 42 is not a type in Python (it could be one day, though - in TypeScript, 42 can be used as a type, basically an enum with only one value). I think (naively) that it should throw an error.

Annotations were specifically designed as general-purpose, not specifically type annotations. So the ability to put arbitrary expressions in annotations was very much intended.

In fact, the type hints of the typing module are not actually types.
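For example, a subscripted hint like Tuple[int, int] is a typing object, not a class, so it can't be used where a real runtime type is expected:

```python
from typing import Tuple

point = (1, 2)
assert isinstance(point, tuple)         # the built-in type: fine

try:
    isinstance(point, Tuple[int, int])  # a typing hint: not a real type
except TypeError:
    print("subscripted generics can't be used with isinstance")
```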

Ah, I see, good to know.

What I wanted when I found the PEP a few days ago were docstrings for instance variables, which isn't one of the things that this provides, so I stopped reading. I ended up using properties but it's a bit verbose.

It's great that typings are in a module so it can be extended or replaced!

I'm not sure how feasible throwing an error is - type hints in Python can be arbitrary classes or instances thereof, in order to type hint for stuff like nested data structures (`Dict[str, int]`). You could disallow instances of specific built-in classes, but that seems... clunky, especially as some can be used in valid code (strings are used as forward references).
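A quick sketch of both cases (the names here are made up for illustration):

```python
from typing import Dict, List

# Hints are ordinary expressions, so nested structures compose naturally
Inventory = Dict[str, List[int]]

class Node:
    value: int
    next_node: "Node"   # a string is a forward reference to a not-yet-defined name

stock: Inventory = {"bolts": [3, 5]}
```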

I wonder why most people defend this feature. The problem is not the annotation, you can do that in ML, too:

  Poly/ML 5.6 Release
  > 5 : int;
  val it = 5: int

But you cannot just use 42 where a type is expected:

  > 5 : 42;
  poly: : error: <identifier> expected but 42 was found
  Static Errors

It's pretty silly that Python did not develop an actual type system if annotations like that are supported.

In the absence of enforcing a specific one, you allow people the flexibility to do custom things.

For instance, linters may not understand type specifications beyond a certain complexity, but you can still use them in your code, with some custom decorator to check them at run/dev time.

non-clickbait title: Python: Mistyping a character in your code can bite you.

Any IDE or linter worth its salt would catch this for you.

Can you recommend a particular linter? I use flake8, which does not warn for this. I just tested pylint which also does not appear to warn for this.

PyCharm would highlight the expression after : as a type that was not imported correctly (or an expression that is not a type).

I would assume mypy would cause an error.

If I wrap it in a function, mypy doesn't throw an error. I'm not sure if mypy has a function to lint such code without executing it.

In any language that undergoes syntactic extension, there is new syntax that is a syntax error in an old version of the language. Erroneous programs written for the old version which land on that previously bad syntax will undergo a change in behavior.

This is true even in languages that don't undergo syntactic extension. If you call a nonexistent function today, your program bombs. Tomorrow, that function could be added, so now something unexpected happens, perhaps.

Lisps are susceptible. Today (foo bar) is diagnosed; tomorrow it's new syntax (not simply a new function), because foo exists as a macro operator.

However, functions and macros, being named, can be namespaced to curtail such issues. Random, ad hoc read syntax extensions, not so much.

One approach is that files, or sections of files, can support annotations which indicate what version of the syntax is required.

Under GCC, if you want only C89, so that any syntactic extensions from later dialects are invalid syntax, you use '-ansi' on the command line to select that dialect. For instance, support for // comments disappears, and so x //*blah*/ y is x / y (the /*blah*/ part being a comment), rather than x followed by a comment-to-end-of-line.

This is true. I don't think Python is to blame for it, though. Any language has potential for making a mistake that's syntactically valid but doesn't have the intended meaning. It's like when a former manager of mine typed "call me at your continence" when he meant to say "call me at your convenience".

Languages can help manage change, though, providing features such that programmers working against older dialects of the language, who make mistakes, get diagnostics according to the old version, rather than accidentally stepping on new syntax.

It seems odder to me that 42 is a valid type hint than that the LHS can be an expression. But I share with most of the others here the opinion that this really isn’t a problem with Python. Every language makes trade offs, and python gives you incredible flexibility and expressiveness in exchange for a lack of certain guardrails.

Once a coworker came to me with a program that was unexpectedly failing, and it turned out that somewhere in the code there was a function that takes a callable, but instead of

    something(iterable, func)
he had written

    something(iterable, func())
There were no IDE errors/warning (in fact PyCharm tends to induce this error by auto inserting the parens) and the behavior that resulted made it tricky to catch.

Nonetheless, personally I’d take that over most other options for run of the mill progamming tasks.
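A hypothetical reconstruction (names invented) of why the extra parens don't fail at the call site:

```python
def schedule(tasks, callback):
    # Nothing here checks that callback is actually callable
    tasks.append(callback)

def report():
    return "done"

tasks = []
schedule(tasks, report)     # intended: pass the function itself
schedule(tasks, report())   # typo: calls report() now, passes the string "done"

# The failure only surfaces later, far from the buggy line:
for task in tasks:
    if not callable(task):
        print("scheduled a non-callable:", task)
```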

PyCharm would highlight that error as a yellow warning if you correctly define the types of the arguments for "something".

This would also not be accepted by mypy.

IIRC this was pre type hints, so the example might be invalid now, but I'm sure I could come up with other examples that demonstrate the same point.

This isn't a good article to post. It's just one person complaining about a change he didn't know about.

It's really just a syntax change. I probably wouldn't have chosen that particular syntax using colons, or I would have forced another keyword like "set" in front of it, but it is what it is. He created a bug and he likely won't do it again, so problem solved.

I also didn't know about this change but now I do because this article was posted.

OK, I don't really see the issue. You could just as easily mistype x = 42 as x is 42, and you'd have the same issue; it's not really a language problem. If you see a KeyError, you should know that the value didn't properly get added to the dictionary. When investigating why, you should see your error of using the wrong operator. The language is functioning properly.

So how do people defend this one:

  >>> [][0]
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  IndexError: list index out of range
  >>> [][0] : "bug"
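That second line is silent because, in an annotation statement with no assigned value, the subscript on the target is never actually performed:

```python
ages = {}

try:
    ages['John']      # a real lookup: raises KeyError
except KeyError:
    print("plain lookup raised, as expected")

ages['John']: 42      # annotation statement: no lookup, no error
assert ages == {}     # and the dict is untouched
```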

Instead of the weirdly alarmist nature of the article, it would be a fairer point to make that colons are now very overloaded in Python syntax.

The level of assumed ignorance of the reader in this article is a bit high given the subject matter.

Also, annotations end up accumulated in the global __annotations__.

PS: wait until he reads about the := assignment syntax

Local variable type hints are a major reason I refuse to use python versions before 3.6. What is this guy's problem?

Weird article title. This is more neat than it is alarming.


There are common libraries which do things incompatible with the way mypy works. I've never gotten it to do anything but abort when run over a non-trivial program.

mypy is a very cool experimental project (as their own webpage calls it), but I wouldn't fault anyone for not running it on their program.

The question, then, is why that kind of analysis isn't built into the compiler.

> Despite considerable discussion about a standard type parameterisation syntax, it was decided that this should also be left to third-party libraries. ([7], [8], [9]).

> Despite yet more discussion, it was decided not to standardize a mechanism for annotation interoperability. Standardizing interoperability conventions at this point would be premature. We would rather let these conventions develop organically, based on real-world usage and necessity, than try to force all users into some contrived scheme. ([14], [15], [16]).

That doesn't sound really "batteries included" to me.

Is there a single programming language for which static analyzers are a part of the language rather than external tools?

