[flagged] Python 3.8 Makes me Sad Again (ilya-sher.org)
24 points by ilyash on Aug 16, 2020 | 46 comments

So the author is sad because Python 3.8 fixes things... too late for their taste? And they, "as an author of another programming language" would've known better and never made these "mistakes" to begin with? Get off your high horse.

I don’t get the part about unordered maps. Python never promised ordered dicts (until 3.7). It wasn’t a secret, nor did you have any reason to believe that they would be ordered.

I mean they’re ordered now and that provides some benefit in terms of simpler code in some cases, but it was never a problem as such.

Besides, Python already had OrderedDict in the standard library for a decade.

I think it refers to the fact that, if dictionaries do not promise to be ordered but have predictable (and consistent between runs) iteration order (e.g. by insertion order), code can start depending on that.

Some people think that, if code ‘out there’ starts depending on that, it’s better to update the documented API to promise what it actually does.
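For anyone who wants to see what is being promised, here is a minimal sketch of the documented behaviour since 3.7. The keys and values are made up for illustration.

```python
# Since Python 3.7 the language spec guarantees that a plain dict
# iterates in insertion order.
d = {}
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3
keys_in_order = list(d)  # ['zebra', 'apple', 'mango']

# Updating an existing key keeps its position...
d["apple"] = 20
keys_after_update = list(d)  # ['zebra', 'apple', 'mango']

# ...while deleting and re-inserting moves it to the end.
del d["zebra"]
d["zebra"] = 4
keys_after_reinsert = list(d)  # ['apple', 'mango', 'zebra']
```

Code that relied on this before 3.7 was relying on an implementation detail of CPython 3.6; from 3.7 on it is relying on the documented API.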

    “Somebody” ignored the wisdom of Lisp, which was
    “everything is an expression and evaluates to a value”
    (no statements vs expressions), and made assignment a
    statement in Python years ago.
This was the philosophy of Algol. That is why ":=" was invented there. From the Lex Fridman interview I understood that Guido was not educated enough to understand some issues, and tried to fix them afterwards. :-)
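For context, a rough sketch of what the statement-versus-expression distinction looks like in practice on Python 3.8+. The input string and variable names are invented for the example.

```python
import re

line = "points: 24"

# Plain "=" is a statement, so matching and testing take two steps.
m = re.search(r"\d+", line)
if m:
    old_style = int(m.group())

# ":=" (Python 3.8+) is an expression: it assigns and also evaluates
# to the assigned value, so it can sit inside the condition.
if (m2 := re.search(r"\d+", line)) is not None:
    new_style = int(m2.group())
```

Both branches end up with the same number; the difference is only where the assignment is allowed to appear.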

Funny, a very similar sentiment, but inverted, was expressed in a rant about C# shared on here a couple of days ago - typing `foo.bar()` and implicitly discarding the result, because you assumed foo.bar() was a statement.

My first thought was "that's the single largest cause of avoidable bugs in my python code".

There is some valid criticism but, as pointed out, fixing language inconsistencies when there is a corpus of programs already running is not trivial.

Maybe when we decide to make Python 4.

/me ducks

I would sure love it if the packaging folks could come to some conclusion.

The fragmented packaging scene is not a core language problem, but a real source of dismay nonetheless.

As PyPI is not a curated repository, I don't see how that could be done.

I don't understand the parameter section. What fundamental mistake is python trying to remedy with the must-be-positional parameter?

I don't get why you would want to force people to not name a parameter.

Like, why is it so bad that I write:

    myval = mydict.get(mykey, default=mydefault)
Who does it help that this throws an error?

In many languages you can do whatever you want with parameter names because they’re an implementation detail. In Python it’s always been frustrating that perhaps a less experienced developer can rename a parameter “inside the function” and break calling code that happened to refer to the name. (And since it isn’t static/compiled, how long after the change will you see this break?)
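A small sketch of the rename problem and what the `/` marker in 3.8 changes. The `scale_*` functions and the `call_fails` helper are hypothetical, written only for this illustration; `dict.get` is the real built-in.

```python
def scale_v1(value, factor):
    return value * factor

def scale_v2(value, multiplier):
    # Same function after someone renamed the second parameter.
    return value * multiplier

def scale_posonly(value, factor, /):
    # Python 3.8+: everything before "/" can only be passed by position,
    # so the names are never part of the public interface.
    return value * factor

def call_fails(func, *args, **kwargs):
    """Return True if the call raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False

# A caller written against v1 breaks after the rename.
rename_broke_caller = call_fails(scale_v2, 2, factor=3)  # True

# With "/", keyword use is rejected from day one, so a later rename is safe.
posonly_rejects_keyword = call_fails(scale_posonly, 2, factor=3)  # True

# dict.get behaves the same way, which is the error asked about above.
dict_get_rejects_keyword = call_fails({}.get, "k", default=0)  # True
```

So the error does not help the caller directly; it helps the library author, who can rename parameters later without breaking anyone.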

Is the author sad because Python 3.8 is fixing things? What would he rather have them do? Not fix things?

Or is he sad because millions of people around the world use Python instead of his shiny new esolang where he’s managed to get all the things right™?

Hindsight is always 20/20. This kind of snark achieves nothing.

Why is it useful for maps to be sorted by insertion order?

However, the author did not mention that stable order does not come for free. It is not a trivial decision to make, between ordered and unordered dicts.

Actually in this case I believe it did come for free due to a change in the way dict was implemented in 3.6:


It is not as simple as that. It might look "free" in this particular implementation, but it obviously prevents further optimizations.

To follow up on lightgreen's comment, in arijun's own linked-to URI they specifically mention the new level of indirection right away. Indirection is not "free" at all. For very large dict()s in main memory/DRAM it could be 2X slower. A hot loop benchmark where the CPU can perfectly predict its near future work and its prefetcher can mask DRAM latency may not reveal this, but a less "simple" benchmark would.

The primary point of the ordering feature (and insertion ordering) is that Python uses that very same dict implementation for language features such as keyword args (`def foo(**kwargs)`). Changing to hash-order from "source code"-order can be confusing.
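To make the keyword-args point concrete, a minimal sketch (the function body and argument names are invented):

```python
def foo(**kwargs):
    # kwargs is an ordinary dict; since Python 3.6 it is guaranteed to
    # keep the order in which the caller wrote the keyword arguments.
    return list(kwargs)

order = foo(zeta=1, alpha=2, mid=3)  # ['zeta', 'alpha', 'mid']
```

With a hash-ordered dict the returned list could come back in some other order than the one in the source code.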

The reason for the change was more compact dicts. Thus the fact that it also preserved insertion order was a happy accident with no additional cost; what I called "free".

You're right. If a more efficient but unordered dict were to come out, it would have to be in an import and have less comfortable syntax, since the base dict has a requirement for preserving insert order. But python is not C, and speed is not the most important consideration. Python often goes for "efficient enough, and very comfortable to use", for example by using lists as its ordered flat container instead of arrays. I think that making dicts ordered is another step in the right direction for Python.

What Python calls "list" other languages call "array" (or "vector" or "seq" or "dynarray" or ...). So, it is not at all "instead of arrays" as you said, but rather "arrays by another name". People usually say "linked list" or "singly linked list" or "doubly linked list" to refer to the idea/data structure that does pointer hopping during traversal.

Most languages don’t have dynamically resizing arrays, so IMHO vec is a better fit. But my point wasn’t about the name. It was that if you don’t want to incur the costs associated with automatic resizing or with the indirection in python lists (as they are built as arrays of pointers), you have to use a different, less convenient library, like array or numpy. And that’s ok, considering the language goals.
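A quick sketch of the list-versus-`array` trade-off described here, using only the standard library. The sizes are whatever CPython reports on your machine, so none are hard-coded.

```python
import sys
from array import array

n = 1000
as_list = list(range(n))          # array of pointers to int objects
as_array = array("q", range(n))   # signed 64-bit values stored inline

# Each list slot points at a full Python object with its own header.
per_element_object = sys.getsizeof(as_list[-1])

# Each array slot is just the raw machine value.
per_element_raw = as_array.itemsize

# The price of the compact layout: only one element type is accepted.
try:
    as_array.append("not a number")
    array_accepts_str = True
except TypeError:
    array_accepts_str = False

as_list.append("not a number")    # a list takes anything
```

The list is the comfortable default; `array` (or numpy) is the opt-in for when the indirection actually matters.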

I guess if you want a pseudo map/list where ordering matters? But truth is semantics of a 'map' have no implicit ordering (at least mathematically). The fact some languages offer this as a convenience is great but it's hardly fair to complain about languages that do not.

There are 2 questions here:

1. Why should it use a stable ordering at all?

2. Why should that ordering be the insertion order?

#2 isn't a particularly deep question. It has at least one fairly obvious reason: because it's the most intuitive thing to iterate through items and get them back in the same order as you inserted them. Anything else would be less intuitive.

#1, though, is more interesting, and it's one of those experience-based things that can be hard to see the significance of initially. One reason for it is that determinism & reproducibility are quite useful in a variety of use cases (e.g. testing), and while they're incredibly easy to lose, they're also quite hard (and often brittle) to gain back afterward (in the cases where it's even possible). By contrast, it's generally far easier to inject randomization at any point when you really need to (e.g. for security). Another reason can be that it reduces degrees of freedom in your program, which is generally a good thing as it helps when reasoning about program behavior. (This is not only during development, but also when debugging: it's incredibly useful to see what order items were inserted in to arrive at the current state.) There are probably more, but these are what I can think of off the top of my head.

And finally, another overarching reason for both is simply the notion of "information loss" (basically, entropy): it's easy to lose information, but not so easy to get it back. And in some people's experience (including mine), it often pays off in the long run for clients of an API if you go out of your way to minimize unnecessary information loss in your library. (That information in this case being the implicit ordering information.)

Of course, all this hinges on the trade-off being worthwhile in each case. That's why library writers try to test different workloads to see if e.g. the performance trade-off is worth it. In CPython's case, it appeared it was.

You can use the same structure to output data to the screen (or elsewhere) and to find stuff by the key.

In other words, I must pay the cost of maintaining the ordering even in the 99% of cases when it will not be used.

Indeed, if you find e.g. PHP too slow, you can switch to Python for speed. To Python. For speed. Compared to PHP.

It makes execution reproducible.

Almost any hash table implementation will give the same iteration order for the same sequence of insertions and deletions across program executions. So this would already be true even if the order was not the insertion order.

Golang starts iterations in a random position each run to prevent people from relying on the ordering.

Not “almost any”. Good hashtables intentionally mix random numbers into hashes to prevent DoS attacks. And also, even if hashtables are stable, the hashes of the objects might not be (they can depend on addresses, for example).

I think Go does it very well here (since they don't want to guarantee the order) - getting consistent order in almost every case is the worst of both worlds, because people won't learn about the underlying implementation, they'll notice that it seems ordered and will rely on this behaviour.

As lightgreen points out in a sibling comment, it's common recently for languages to use a secure randomized hash like siphash and generate different orderings per program execution. It seems like that's the approach more languages should take.
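CPython itself is an example of the per-execution randomization mentioned here: string hashes are salted per process, and the `PYTHONHASHSEED` environment variable pins the salt. A rough sketch that shows it by launching child interpreters; the helper name and the sample string are made up.

```python
import os
import subprocess
import sys

def str_hash_with_seed(seed):
    """Run a fresh interpreter with a fixed hash seed and return hash() of a string."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('hacker news'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)

# Same seed: the hash is reproducible across processes.
same_seed_matches = str_hash_with_seed(1) == str_hash_with_seed(1)

# Different seeds: the hashes differ (with overwhelming probability).
different_seed_matches = str_hash_with_seed(1) == str_hash_with_seed(2)
```

In CPython this affects things like the iteration order of a set of strings between runs. It does not affect dict iteration, which follows insertion order regardless of the hash values.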

> My perspective is biased towards correctness and “WTF are you doing?”

What an arrogant and pompous thing to say.

That tells everything you need to know, the author thinks of themselves as some sort of intellectual God.

If you're a teapot, it's definitely better.

> HTTP status codes 103 EARLY_HINTS, 418 IM_A_TEAPOT and 425 TOO_EARLY are added to http.HTTPStatus. (Contributed by Dong-hee Na in bpo-39509 and Ross Rhodes in bpo-39507.)

I get why a CI/CD approach is taken with modern languages but you can't revert changes with programming languages like you can with APIs. My view is, long-term supported new features are fine but deprecation of existing features and code-breaking changes should be constrained to a major version change (like 2.7 -> 3) and that should happen at most once in a decade.

I wish more programmers that work outside of the tech industry would contribute to these discussions.

Clarification: NGS sucks differently. That's it.

Don't know how you guys get from "From my perspective, all languages suck, while NGS aims to suck less than the rest for the intended use cases" to how NGS is generally good or generally better than other languages or did/would avoid mistakes.

Note "aims to suck less", not even "sucks less".

Thanks, OP

>This works because regular dicts have guaranteed ordering since Python 3.7

But the API is different! OrderedDict has more methods that have to do with order, and `reversed` works on its `.keys()` and `.values()`. Or did they fix this in 3.8?
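A short sketch of the differences, assuming Python 3.8 or later. As far as I know, 3.8 did add `reversed()` support for plain dicts and their views, while the order-specific methods and order-sensitive equality remain OrderedDict-only.

```python
from collections import OrderedDict

od = OrderedDict(a=1, b=2, c=3)

# Order-manipulating methods that plain dict does not have.
od.move_to_end("a")
after_move = list(od)            # ['b', 'c', 'a']
first = od.popitem(last=False)   # ('b', 2), pops from the front

# Equality between two OrderedDicts is order-sensitive...
ordered_equal = OrderedDict(x=1, y=2) == OrderedDict(y=2, x=1)  # False
# ...while equality between plain dicts ignores order.
plain_equal = {"x": 1, "y": 2} == {"y": 2, "x": 1}              # True

# reversed() on a plain dict and on its views works from 3.8 on.
plain = {"x": 1, "y": 2}
reversed_keys = list(reversed(plain))             # ['y', 'x']
reversed_values = list(reversed(plain.values()))  # [2, 1]

plain_has_move_to_end = hasattr(plain, "move_to_end")  # False
```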

PHP also had ordered maps from the start.

I think the problem is that a lot of people get used to ordered maps in PHP and then expected the maps to be ordered in other languages.

This is fairly low effort, and it doesn't even take much effort to find flaws with Python.

Meh another shit-post for some clout.

Yups. Couple of items are “$thing got fixed. It was wrong. I'm sad” and then state how their own project did it right. Cool story bro

Disagreement with a post doesn’t make it shit. Diversity of opinions is great.

But calling the post shit instead of providing any arguments definitely does not make for a constructive, helpful discussion. Such comments are just noise.

I think parent is using “shit-post” to describe a negative post.

Like this: https://en.m.wikipedia.org/wiki/Shitposting

Does not look obviously negative to me.

Thank you for the link though.
