AFAIK it was implemented in 3.11. That is, all of it except for the GIL removal itself, which actually decreased performance for single-threaded code; the actual improvement came from elsewhere.
I find that difficult to believe, as the "What's New in Python 3.11" release notes (https://docs.python.org/3.11/whatsnew/3.11.html) don't mention "GIL" or "Global Interpreter Lock" at all. A change of that magnitude would definitely get a mention there.
No. Sam Gross’s PoC included a number of optimisations besides the gilectomy. It was those optimisations that made it faster, while the GIL removal slowed it down again, so only the optimisations were implemented.
The "GIL removal" umbrella proposal was twofold. It included (a) removing the GIL and (b) several optimizations to handle issues caused by the GIL being removed and to offset the GIL removal overhead (due to more frequent lock checks, etc.).
The changes and optimizations that assist the GIL removal were merged, but the GIL removal itself was not.