I'm curious if the JIT developers could mention any Python features that prevent promising JIT features. An earlier Ken Jin blog post [1] mentions how __del__ complicates reference counting optimization.
There is a story that Python is harder to optimize than, say, Typescript, with Python flexibility and the C API getting mentioned. Maybe, if the list of troublesome Python features was out there, programmers could know to avoid those features with the promise of activating the JIT when it can prove the feature is not in use. This could provide a way out of the current Python hard-to-JIT trap. It's just a gist of an idea, but certainly an interesting first step would be to hear from the JIT people which Python features they find troublesome.
It's interesting you mention __del__, because JavaScript not only lacks destructors, but for security reasons (that are above my pay grade) the spec _explicitly prohibits_ implementations from allowing visibility into garbage collection state, meaning that code cannot have any visibility into deallocations.
I think __del__ is tricky though. In theory __del__ is not meant to be reliable. In practice CPython reliably calls it cuz it reference counts. So people know about it and use it (though I've only really seen it used for best effort cleanup checks)
In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it. And that would also generate more pressure to implement code that is performant in "any" system.
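To make the CPython behavior concrete, here is a minimal sketch. The `Tracked` class and `demo` function are made up for illustration, and the ordering shown is a CPython reference-counting detail, not a language guarantee; PyPy may run the finalizer much later.

```python
class Tracked:
    """Hypothetical object that records when its finalizer runs."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        # Runs when the object is reclaimed. On CPython that is the moment
        # the reference count reaches zero; other runtimes may defer it.
        self.log.append(f"finalized {self.name}")


def demo():
    log = []
    obj = Tracked("a", log)
    log.append("created a")
    del obj  # drops the last reference to the Tracked instance
    log.append("after del")
    return log


# On CPython the finalizer entry lands between the other two entries.
print(demo())
```

Code that relies on the "finalized" entry appearing before "after del" is exactly the kind of code that ties itself to reference counting.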
> In practice CPython reliably calls it cuz it reference counts ... In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it
A big part of the problem is that much of the power of the Python ecosystem comes specifically from extensions/bindings written in languages with manual (C) or RAII/ref-counted (C++, Rust) memory management, and having predictable Python-level cleanup behavior can be pretty necessary to making cleanup behavior in bound C/C++/Rust objects work. Breaking this behavior or causing too much of a performance hit is basically a non-starter for a lot of Python users, even if doing so would improve the performance of "pure" Python programs.
> That cleanup can be explicit when needed by using context managers.
It certainly can be, but if a large part of the Python code you are writing involves native objects exposed through bindings then using context managers everywhere results in an incredible mess.
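For reference, the explicit style being discussed looks roughly like the sketch below. `NativeHandle`, `opened`, and `combine` are hypothetical stand-ins, not any real binding's API.

```python
from contextlib import contextmanager


class NativeHandle:
    """Hypothetical stand-in for an object exposed through C/C++/Rust bindings."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        # A real binding would release the underlying native resource here.
        self.closed = True


@contextmanager
def opened(name):
    handle = NativeHandle(name)
    try:
        yield handle
    finally:
        handle.close()  # cleanup is explicit and does not depend on __del__


def combine(a_name, b_name, c_name):
    # Every native object needs its own `with` entry, which is where the
    # nesting and bookkeeping start to pile up in binding-heavy code.
    with opened(a_name) as a, opened(b_name) as b, opened(c_name) as c:
        names = [a.name, b.name, c.name]
        handles = (a, b, c)
    return names, [h.closed for h in handles]
```

With three objects this is tolerable; with dozens of short-lived native temporaries per function, each needing its own scope, it gets unwieldy fast.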
> Mixing resource handling with object lifetime is a bad design choice
It is a choice made successfully by a number of other high-performance languages/runtimes. Unfortunately for Python-the-language, so much of the utility of Python-the-ecosystem depends on components written in those languages (unlike, for example, JVM or CLR languages where the runtime is usually fast enough to require a fairly small portion of non-managed code).
> meaning that code cannot have any visibility into deallocations.
This is more pedantry than a serious question. JavaScript has WeakRef. Sure, it'd be cumbersome and inefficient because you'd need to manually make and poll each thing you wanted to observe, but could it not be said that it does provide a view on deallocations?
Yes, WeakRef and FinalizationGroup both make GC visible (the latter removes the need to poll in your example). So not pedantic at all. They were eventually added after much reluctance from the language designers and implementers, partly because they can lead to code being broken by (valid & correct) engine optimizations, which is a big no-no on the web. But some things simply cannot be implemented without them.
Note that 90% of the uses for them actually shouldn't be using them, usually for subtle reasons. It's always a big cause for debate.
That link itself calls out that conformant implementations can’t be relied on to call callbacks.
> A conforming JavaScript implementation, even one that does garbage collection, is not required to call cleanup callbacks. When and whether it does so is entirely down to the implementation of the JavaScript engine. When a registered object is reclaimed, any cleanup callbacks for it may be called then, or some time later, or not at all.
It's likely that major implementations will call cleanup callbacks at some point during execution, but those calls may be substantially after the related object was reclaimed. Furthermore, if there is an object registered in two registries, there is no guarantee that the two callbacks are called next to each other — one may be called and the other never called, or the other may be called much later.
There are also situations where even implementations that normally call cleanup callbacks are unlikely to call them:
It's supported in all of the major engines. And you also can't rely on the garbage collector to run at a predictable time (or at all!), so the engine never calling finalizers is functionally the same as the garbage collector being unusual.
The only (other) visible effect of GC not running is memory exhaustion. WeakRef/FinalizationGroup not getting triggered can have lots of script-visible effects, so can be much much worse. I wouldn't describe that as "functionally the same".
Oh! While this one does mention that you don't have visibility, this + weak refs seem to change the game
I remember a couple of years ago (well, probably around 2021) reading about GC exposure concerns and seeing some line in some TC39 doc like "users should not have visibility into collection", but if we've shipped weakrefs it sounds like we're not thinking about that anymore.
We still try to limit any additional exposure as much as possible, and WR/FG are specced to keep the visibility as coarse as possible. (Collections won't be visible until the current script execution finishes, though async adds a lot more places where that can happen.)
A proposal to add new ways of observing garbage collection will still be shot down immediately without a damn good justification.
I'd really hope we do live with 4 KB pages forever. Variable page size would make many remapping optimizations (i.e. continuous ring buffers) much harder to do, so we would need more abstraction layers, and more abstraction layers will eat away all the performance gains while also making everything more fragile and harder to understand. Hardware people really love those "performance hacks" that make life more painful for the upper layers in exchange for a few 0.1%s of speed. You could also probably gain some speed by dropping byte access and saying the minimal addressable unit is now 32 bits. Please don't. If you need larger L1 cache - just increase associativity.
The extra L1 cache from a 64k page is on its own a ~5-10% perf improvement (and it decreases power use by reducing the number of times you go out to L2).
Funny, most of what you described sums up the Alpha architecture. 8KB pages + huge pages and, initially, only word-addressable memory, no byte access.
(Of course, it only took a few years for this to be rectified with the byte-word extension, which became required by ~all "real software" that supported Alpha)
It's also one of the only architectures Windows NT supported that didn't have 4KB pages, along with Itanium. I've wondered how (or if?) it handled programs that expect 4KB pages, especially in the x86 translation subsystem.
If I do this on my Mac, I wonder if I am technically violating the HEIC patent license. I suppose it depends on the details in the patent license, plus perhaps rights Apple has acquired for its users. I definitely don't know, but maybe someone on HN does?
I gather from the HN discussion that it's not simple to disable scripting in an SVG, in retrospect a tragically missing feature.
I guess the next step is to propose a simple "noscripting" attribute, which if present in the root of the SVG doc inhibits all scripting by conforming renderers. Then the renderer layer at runtime could also take a noscripting option, so the rendering context could force it if appropriate. Surely someone at HN is on this committee, so see what you can do!
Edit: thinking about it a little more - maybe it's best to just require noscripting as a parameter to the rendering function. Then the browsers can have a corresponding checkbox to control SVG scripting and that's it.
Disabling script execution in svgs is very easy, it's just also easy to not realize you're about to embed an svg. `<img src="evil.svg">` will not execute scripts, a bit like your "noscripting" attribute except it's already around and works. Content Security Policy will prevent execution as well, you should be setting one for image endpoints that blocks scripts.
Sanitizing is hard to get right by comparison (svgs can reference other svgs) but it's still a good idea.
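As a rough sketch of the Content Security Policy suggestion, an image endpoint could attach a script-blocking header to every SVG response. The handler, helper names, and port below are made up for illustration, and a production policy would likely be stricter than the single directive shown.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical SVG payload containing a script we do not want to run.
SVG_BODY = (
    b'<svg xmlns="http://www.w3.org/2000/svg">'
    b"<script>alert(1)</script></svg>"
)


def svg_headers(body):
    """Headers for an SVG response, including a CSP that forbids scripts."""
    return [
        ("Content-Type", "image/svg+xml"),
        ("Content-Length", str(len(body))),
        # script-src 'none' tells the browser not to execute any script
        # when this document is opened directly.
        ("Content-Security-Policy", "script-src 'none'"),
    ]


class SvgHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        for name, value in svg_headers(SVG_BODY):
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(SVG_BODY)


def make_server(port=8000):
    # Assumed local address and port; call .serve_forever() on the result.
    return HTTPServer(("127.0.0.1", port), SvgHandler)
```

Running `make_server().serve_forever()` and opening the URL directly in a browser should show the SVG without the alert firing.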
I had the impression from elsewhere in this thread that if you load the SVG in some other way, you are not protected. This makes a no-brainer "don't run these ever" option in the browser seem appealing.
> This makes a no-brainer "don't run these ever" option in the browser seem appealing.
Firefox has this: svg.disabled in about:config. It doesn't seem to be properly documented, and might cause other problems for the developer (I found it accidentally, and a more deliberate search turns up mainly bug tracker entries.)
I would be surprised to see performance as good as V8, although that would be great. As I recall, the V8 team performed exceptionally well in a corporate environment that badly wanted JS performance to improve, and maybe inherited some HotSpot people at the right time.
I'd be quite delighted to see, say, 2x Python performance vs. 3.12. The JIT work has potential but thus far little has come of it, though in fairness it's still the early days for the JIT. The funding is tiny compared to V8. I'm surprised someone at Google, OpenAI et al isn't sending a little more money that way. Talk about shared infrastructure!
Don't underestimate the nicotine withdrawal for making you feel crabby - I've heard that anecdote many times. The nice thing is ... that one gets better with time.
I'm super curious how this hack worked, but I feel like the story is just about the last step. What did the attacker have such that this last step did it?
My guess is that the attacker had the google password, and also the login for Coinbase was somehow stored in Google, so the attacker getting into google also exposed Coinbase. I just looked at Coinbase, and it does have a "Sign In With Google" feature.
If you want to live the stripped-down TOTP lifestyle, you have to love this 20 line Python solution. Does not depend on weird libs, and the last edit is 4 years ago. Write the seed on a Post-It and you're all set. Not so convenient, but sound sleeping!
https://github.com/susam/mintotp
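For a sense of how little code this takes, here is a minimal standard-library sketch of TOTP following RFC 4226 and RFC 6238. It is an illustration written for this comment, not the linked project's code, and the function names and defaults are assumptions.

```python
import base64
import hmac
import struct
import time


def hotp(key_b32, counter, digits=6, digest="sha1"):
    """HMAC-based one-time password (RFC 4226) from a base32 seed."""
    # Restore any stripped base32 padding before decoding the seed.
    padded = key_b32.upper() + "=" * (-len(key_b32) % 8)
    key = base64.b32decode(padded)
    mac = hmac.new(key, struct.pack(">Q", counter), digest).digest()
    # Dynamic truncation: the low nibble of the last byte picks the offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">L", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 10 ** digits:0{digits}d}"


def totp(key_b32, now=None, step=30, digits=6, digest="sha1"):
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    if now is None:
        now = time.time()
    return hotp(key_b32, int(now // step), digits, digest)


# Base32 encoding of the RFC test seed "12345678901234567890".
RFC_SEED = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(RFC_SEED, now=59))  # → 287082
```

The seed above is the published RFC test value, so never use it for a real account.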
[1] https://fidget-spinner.github.io/posts/faster-jit-plan.html