PyPy is a fantastic achievement and deserves far more support than it gets. Microsoft’s “Faster CPython” team tried to make Python 5x faster but only achieved ~1.5x in four years - meanwhile PyPy has been running at over 5x faster for well over a decade.
On the other hand, I always got the impression that the main goal of PyPy is to be a research project (on meta-tracing, STM etc) rather than a replacement for CPython in production.
Maybe that, plus the core Python team’s indifference towards non-CPython implementations, is why it doesn’t get the recognition it deserves.
Third-party libraries like SciPy, scikit-learn, pandas, tensorflow and pytorch have been critical to Python’s success. Since CPython is written in C and exposes a nice C API, those libraries can leverage it to quickly move from (slow) Python to (fast) C/C++, hitting an optimum between speed of development and speed of runtime.
PyPy’s alternative, CFFI, was not attractive enough for the big players to adopt. And HPy, another alternative that would have played better with Cython and friends, came too late in the game; by that time PyPy development had lost momentum.
Yes. The C API those libraries use is a good fit for CPython, a bad fit for PyPy. Hence CFFI and HPy. Actually, many of the lessons from HPy are making their way into CPython, since its JIT and speedups face the same problems as PyPy. See https://github.com/py-ni
Sorry, can you explain the connection between PyPy and CFFI a bit more (CFFI generates compiled extension modules to wrap an existing C library)? I have never used PyPy, but I use CFFI all the time (to wrap C libraries unrelated to Python so that I can use them from Python).
CFFI is fast on PyPy. The JIT still cannot peer into the compiled C/C++ code, but it can generate efficient interface code since there is a dedicated _cffi_backend module built into PyPy. Originally that was the motivation for the PyPy developers to create CFFI.
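Concretely, a minimal ABI-mode sketch (ffi.dlopen(None) means "the C library itself" and is POSIX-only; API mode compiles a real extension module instead):

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("size_t strlen(const char *s);")  # declare just the signature we need
libc = ffi.dlopen(None)                    # POSIX-only: load the C library itself

print(libc.strlen(b"hello"))               # -> 5; on PyPy this call is JIT-friendly
```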
Thank you for the background info, and sorry for me explaining CFFI (I just wanted to be sure we were talking about the same thing). Being ignorant about PyPy, I honestly had no idea until now that there was a personnel or purpose overlap between CFFI and PyPy. I am very grateful for CFFI (though I only use it in API mode).
The Faster CPython project would’ve got further if Microsoft hadn’t let the entire team go when they made large numbers of their programming language teams redundant last year. All in the name of “AI”. Microsoft basically gave up on core computer science to go chase the hype wave.
You’re right, of course: even Guido seems to have been moved off working on CPython and onto some tangentially-related AI technology.
However, Faster CPython was supposed to be a 4-year project, delivering a 1.5x speedup each year. AFAIK they had the full 4 years at Microsoft, and only achieved what they originally planned to do in 1 year.
To be fair, they suffered a bit from scope creep: mid-project, a second major effort was started to remove the GIL, so the codebase was undergoing two major surgeries at the same time. Hard to believe they could stick to the original schedule under those conditions. Also, GIL removal decreases sequential-execution performance; I imagine some gains from Faster CPython were/will be spent compensating for this hit on GIL-less single-thread performance.
> PyPy is a fantastic achievement and deserves far more support than it gets
PyPy is a toy for getting great numbers in benchmarks and demos, is incompatible in a zillion critical ways, and is basically useless for large-scale development for anything that has to interoperate with "real" Python.
Literally everyone who's ever tried it has the experience that you mock up a trial for your performance code, drop your jaw in amazement, and then run your whole app and it fails. Until there's a serious attempt at real 100% compatibility, none of this is going to change.
Also, none of the deltas are well-documented. My personal journey with PyPy hit a wall when I realized that its GC is lazy instead of greedy. So a loop that relies on the interpreter to free stuff up (e.g. file descriptors needing to be closed) rapidly runs into resource exhaustion in PyPy. This is huge, easy to trip over, extremely hard to audit, and... it's like it's hidden lore or something. No one tells you this, when it needs to be at the top of their front page before you start the port.
"Ask HN: Is anyone using PyPy for real work?" from 2023 contradicts you about PyPy being a toy. The replies are noticeably biased towards batch jobs (data analysis, ETL, CI), where GC and any other issues affecting long-running processes are less likely to bite, but a few replies talk about sped-up servers as well.
Timely management of external resources is what the `with` statement has been for since 2006, when it was added in Python 2.5. To debug these problems Python has ResourceWarning.
Additionally, CPython's GC is also only eager in a best-effort kind of way. If cycles are involved it can take a long time to release memory. This will become even more the case in future versions of CPython, in the free-threaded variants.
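To illustrate both points (the function and its inputs are invented; ResourceWarning and the `with` behavior are real):

```python
import warnings
warnings.simplefilter("error", ResourceWarning)  # surface leaked files during tests

def total_bytes(paths):
    total = 0
    for p in paths:
        # Relying on `open(p).read()` leaves the close() to some future GC run
        # (PyPy, or CPython when cycles are involved). `with` closes it right here.
        with open(p, "rb") as f:
            total += len(f.read())
    return total
```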
Sorry, the `with` statement answer is non-responsive. The question isn't whether you "can" write PyPy-friendly code. Obviously you can.
The question isn't even whether or not you "should" write PyPy-friendly code, it's whether YOU DID, or your predecessors did. And the answer is "No, they didn't". I mean, duh, as it were.
PyPy isn't compatible. In this way and a thousand tiny others. It's not really "Python" in a measurable and important way. And projects that are making new decisions for what to pick as an implementation language for the evolution of their Python code have, let's be blunt, much better options than PyPy anyway.
Strongly disagree. If you're relying on Python garbage collection to free file descriptors in a loop, you have a subtle bug that will rear its head in unexpected and painful ways (and by some unwritten law of software, most notably either at 3 AM or when you have an important demo scheduled). This is true whether you're running in CPython or PyPy. It's not hard to avoid - use `with` or `try...finally`. It's not some newfangled language feature. It's not a surprise - it's well known that you can't write RAII-style code in Python. It's a sign of someone with a poor grasp of the language they're using. If you find things like this, you should fix them, even if you never intend to use PyPy.
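For resources that don't come wrapped in a context manager, `try...finally` gives the same guarantee (a sketch; the filename is made up):

```python
import os

fd = os.open("data.bin", os.O_RDONLY)  # a raw descriptor, no context manager
try:
    header = os.read(fd, 16)
finally:
    os.close(fd)  # runs deterministically on CPython and PyPy alike
```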
> If you're relying on Python garbage collection to free file descriptors in a loop
Again, that's a prescription for how to write Python code for future execution. It's emphatically not a statement about the behavior expected by Python code already in production, which tends to rely on this behavior (along with many other such warts and subtleties) implicitly.
And the fact that PyPy doesn't feel the need to clone it (and all the others) explains why PyPy basically doesn't work for existing Python code.
I mean, me being an idiot python developer in your eyes does nothing to make the ancient code I received run. It just makes you feel smarter. That's a bad trade.
PyPy needs to be compatible before anyone is going to use it. And it isn't. And so people didn't. And so now it's basically dying as no one wants to work on a project no one uses.
Fundamentally, CPUs use 0-based addresses. That's unavoidable.
We can't choose to switch to 1-based indexing - either we use 0-based everywhere, or a mixture of 0-based and 1-based. Given the prevalence of off-by-one errors, I think the most important thing is to be consistent.
The reason many languages prefer `length` to `count`, I think, is that the former is clearly a noun and the latter could be a verb. `length` feels like a simple property of a container whereas `count` could be an algorithm.
`countof` removes the verb possibility - but that means that a preference for `countof` over `lengthof` isn't necessarily a preference for `count` over `length`.
I tend to use numFoos (short for “number of foos”), and only use fooCount when the variable is used for actual counting (like an errorCount variable that is incremented for each error).
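For example (try_once is a hypothetical helper, just to show the two roles):

```python
num_retries = 3          # a fixed quantity: "number of retries"
error_count = 0          # a running counter, incremented as we go
for attempt in range(num_retries):
    if not try_once():   # hypothetical helper
        error_count += 1
```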
Countof is strange, because one doesn’t talk about the “count of something” in English, other than uses like “on the count of three” (or the “count of Monte Cristo” ;)).
Yeah, you could argue that choosing C is just choosing a particular subset of C++.
The main difference from choosing a different subset, e.g. “Google C++” (i.e. writing C++ according to the Google style guide), is that the compiler enforces that you stick to the subset.
When I developed D, a major priority was string handling. I was inspired by Basic, which had very straightforward, natural strings. The goal was to be as good as Basic strings.
And it wasn't hard to achieve. The idea was to use length-delimited strings rather than 0-terminated ones. This meant that slices of strings are themselves strings - a superpower. No more did one have to constantly allocate memory for a slice, and then keep track of that memory.
Length-delimited strings also greatly sped up string manipulation. One no longer had to scan a string to find its length. This is a big deal for memory caching.
Static strings are length delimited too, but also have a 0 at the end, which makes it easy to pass string literals to C functions like printf. And, of course, you can append a 0 to a string anytime.
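A rough Python analogy of that zero-copy slicing, for readers who don't know D (memoryview is the closest stand-in; this is a sketch of the idea, not D's implementation):

```python
data = bytearray(b"The quick brown fox")
view = memoryview(data)

word = view[4:9]                 # a (pointer, length) pair into the same buffer
assert bytes(word) == b"quick"   # taking the slice allocated no new buffer
data[4:9] = b"QUICK"             # writes to the buffer show through the slice
assert bytes(word) == b"QUICK"
```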
I agree on the former two (std::string and smart pointers) because they can't be nicely implemented without some help from the language itself.
The latter two (hash maps and vectors), though, are just compound data types that can be built on top of standard C. All it would need is to agree on a new common library, more modern than the one designed in the 70s.
I think a vec is important for the same reason a string is… because you get a proper length, and standardized ways to push/pop that don’t require manual bounds checking and calls to realloc.
Hash maps are mostly only important because everyone ought to standardize on a way of hashing keys.
But I suppose they can both be “bring your own”… to me it’s more that these types are so fundamental and so “table stakes” that having one base implementation of them guaranteed by the language’s standard lib is important.
You can surely create a std::string-like type in C, call it "newstring", and write functions that accept and return newstrings, and re-implement the whole standard library to work with newstrings, from printf() onwards. But you'll never have the comfort of newstring literals. The nice syntax with quotes is tied to zero-terminated strings. Of course you can litter your code with preprocessor macros, but it's inelegant and brittle.
Because C wants to run on bare metal, an allocating type like C++ std::string (or Rust's String) isn't affordable for what you mean here.
I think you want the string slice reference type, what C++ calls std::string_view and Rust calls &str. This type is just two facts about some text: where it is in memory and how long it is (or equivalently where it ends; storing the length is in practice often slightly faster on real machines, so if you're making a new one, do that).
In C++ this is maybe non-obvious because it took until 2017 for C++ to get this type - WG21 are crazy - but this is the type you actually want as a fundamental, not an allocating type like std::string.
Alternatively, if you're not yet ready to accept that all text should use UTF-8 encoding - and maybe C isn't ready for that yet - you don't want this type, you just want byte slice references: Rust's &[u8] or C++'s std::span<char>.
Automatic memory accounting: construct/copy/destruct. You can't abstract these in C. You always have to call i_copied_the_string(&string) after copying the string, and you always have to call the_string_is_out_of_scope_now(&string) just before it goes out of scope.
For many string operations such as appending, inserting, overwriting, etc., the memory management can be made automatic in C as well, and I think this is the main advantage. Just automatic free at scope end does not work (without extensions).
You can make strings (or bignums or matrices) more convenient than the C default but you can never make them as convenient as ints, while in C++ you can.
Yes, but I do not think this is a good thing. A programming language has to fulfill many requirements, and convenience for the programmer is not the most important one.
The C++ std::string is both very complicated mechanically and underspecified, which is why Raymond Chen's article about std::string has to explain three different types (one for each of the three popular C++ stdlib implementations) and still got some details wrong, resulting in a cycle of corrections.
So that wouldn't really fit C very well and I'd suggest that Rust's String, which is essentially just Vec<u8> plus a promise that this is a UTF-8 encoded string, is closer.
It is when compared with C89; also, ISO C++ requires inclusion of the ISO C standard library.
The differences are the usual ones that occur with guest languages - in this case the origin being UNIX and C at Bell Labs. Eventually each platform goes its own merry way, and compatibility slowly falls apart with newer versions.
With regard to C89, the main differences are struct and union naming rules, () meaning void instead of anything goes, ?: precedence rules, and reduced implicit-cast scenarios (e.g. from void pointers).
Is Lily intended to be (or could it be used as) a statically-typed alternative to Lua?
Personally I'm happy with dynamic typing for scripting - but I suspect many people would welcome a statically-typed option, and there don't seem to be many available.
The Luau author is always on the official Lua mailing list, and Luau has twice as many GitHub stars, so it seems likely to win the long-term popularity contest.
Note that some of those can't run on a regular Lua runtime.
Luau is a separate implementation of a Lua dialect. However, it's backed by Roblox and is increasingly being used in high-budget games such as Alan Wake 2, and in tools like Rive.
And Terra is more of a low-level language embedded in regular Lua for metaprogramming than a statically-typed Lua.
In this vein there's also Pallene, which integrates better with regular Lua on a slightly-patched Lua runtime.
Also it looks like[1] Luau is the official Roblox Studio scripting language, and is based on Lua 5.1 (possibly LuaJIT?), which means it's behind mainstream Lua.
Not sure which Lua versions the others are based on.
That’s enough for INDENT, but for DEDENT you also need a stack of previous indentation levels. That’s how, when the amount of indentation decreases, you know how many DEDENTs to emit.
The requirement for a stack means that Python’s lexical grammar is not regular.
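A toy sketch of that stack discipline (simplified; a real tokenizer also handles tabs, blank lines, comments, and line continuations):

```python
def indent_tokens(lines):
    stack = [0]  # indentation widths of the enclosing blocks
    for line in lines:
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            yield "INDENT"
        while width < stack[-1]:  # emit one DEDENT per level popped
            stack.pop()
            yield "DEDENT"
        if width != stack[-1]:
            raise IndentationError("unindent does not match any outer level")
        yield ("LINE", line.strip())
```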