A function decorator that rewrites the bytecode to enable goto in Python (github.com/snoack)
197 points by sea6ear 35 days ago | 88 comments

Along the lines of rewriting bytecode at runtime, https://www.python.org/dev/peps/pep-0302/ lets you arbitrarily hook Python's import process and rewrite .pyc files as desired.
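A minimal sketch of such an import hook, for illustration only. It uses the `find_spec`/`exec_module` form of the protocol (PEP 451), which later superseded PEP 302's `find_module`/`load_module`. The module name `demo_mod`, the `SOURCES` dict and the "add one to every integer literal" rewrite are all made up for the example, and it rewrites the AST before compilation rather than touching a .pyc file.

```python
import ast
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "file system": module name -> source text.
SOURCES = {"demo_mod": "ANSWER = 41\n"}


class RewritingFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object; it only serves names in SOURCES."""

    def find_spec(self, fullname, path, target=None):
        if fullname in SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the normal import machinery handle everything else

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        tree = ast.parse(SOURCES[module.__name__])
        # Toy rewrite: bump every integer literal by one before compiling.
        for node in ast.walk(tree):
            if isinstance(node, ast.Constant) and type(node.value) is int:
                node.value += 1
        code = compile(tree, f"<rewritten {module.__name__}>", "exec")
        exec(code, module.__dict__)


sys.meta_path.insert(0, RewritingFinder())

import demo_mod  # resolved by RewritingFinder, not from disk

print(demo_mod.ANSWER)  # 42, because the literal 41 was rewritten
```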

As a practical application, pytest uses this to take the statement `assert foo == bar` and rewrite the AST into a number of instructions that extract the values of foo and bar and print them if the assertion fails: http://pybites.blogspot.com/2011/07/behind-scenes-of-pytests... (https://docs.pytest.org/en/6.2.x/assert.html for those unfamiliar with the library) - so if you use pytest, you're already relying on these arcane techniques!
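To make the idea concrete, here is a rough sketch of assertion rewriting with the standard `ast` module. It is not pytest's implementation (which is far more thorough); the class name `AssertRewriter` and the helper `run_rewritten` are invented for this example. It only handles a plain `assert a <op> b` with no message, it re-evaluates the operands when building the message, and it needs Python 3.9+ for `ast.unparse`.

```python
import ast
import copy


class AssertRewriter(ast.NodeTransformer):
    """Attach a message showing both operand values to simple asserts."""

    def visit_Assert(self, node):
        test = node.test
        simple = isinstance(test, ast.Compare) and len(test.ops) == 1
        if node.msg is None and simple:
            left, right = test.left, test.comparators[0]
            # Escape braces so operand source text is safe inside str.format.
            names = [ast.unparse(n).replace("{", "{{").replace("}", "}}")
                     for n in (left, right)]
            template = f"{names[0]} = {{!r}}, {names[1]} = {{!r}}"
            # Equivalent to: assert a == b, "a = {!r}, b = {!r}".format(a, b)
            node.msg = ast.Call(
                func=ast.Attribute(value=ast.Constant(value=template),
                                   attr="format", ctx=ast.Load()),
                args=[copy.deepcopy(left), copy.deepcopy(right)],
                keywords=[],
            )
        return node


def run_rewritten(source):
    """Parse, rewrite, compile and execute `source` in a fresh namespace."""
    tree = AssertRewriter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<rewritten>", "exec"), namespace)
    return namespace


try:
    run_rewritten("foo = 1\nbar = 2\nassert foo == bar")
except AssertionError as error:
    print(error)  # foo = 1, bar = 2
```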

Back in 1999 we used similar arcane techniques to ship encrypted Tcl scripts.

And probably still do the same today...

I do the same in my module that enables increment operators in Python by patching the bytecode for two unary plus/minus ops: https://github.com/borzunov/plusplus

Jesus, what a disgustingly clever hack!

I wonder if we could use a similar mechanism to finally add a "continue" to Lua.

Lua has `goto` since 5.2; you don't need a continue keyword. e.g.

  for z=1,10 do
    for y=1,10 do
      for x=1,10 do
        if x^2 + y^2 == z^2 then
          print('found a Pythagorean triple:', x, y, z)
          print('now trying next z...')
          goto zcontinue
        end
      end
    end
    ::zcontinue::
  end

That's nicer than continue really (as someone who doesn't know and has never written Lua) - works for whichever level of nested loops like that that you want, and allows/somewhat forces you to come up with a name that might be clearer what it's for.

In Perl, "next" can take a label:

    LOOP: while (foo()) {
        while (bar()) {
            if (baz()) { next LOOP; }
        }
    }

Similar in Java (there it is called "continue", with "break" to leave the loop entirely, and the optional label is written the same way), although I'd guess most Java developers don't know about it.

I know we can do that, but I still want continue ;)

It's truly a thing of beauty. I can't wait to test what happens when you put labels or gotos in expressions...

Yeah, you can patch calls to continue() to instead be jumps.

There was an equally intriguing Python goto implementation posted last year¹, with some interesting comments² (including some in the gist itself).

This version does seem better, but I'm not really sure that is a good thing.

¹ https://gist.github.com/georgexsh/ede5163a294ced53c3e2369cca...

² https://news.ycombinator.com/item?id=23076859

yes! goto is one of the 2 things python got wrong. The other is the ability to deliver end user executables.

Java actually has an elegant-for-Java feature that removes most credible uses of goto. You can label and break out of standalone code blocks. It makes error handling without exceptions not bad.

    label: {
        break label;
    }

And in most C-style languages that don’t have labelled blocks like this, you can use an infinite loop to achieve the same effect; for example, in Rust where labelled blocks aren’t yet stable:

  'label: loop {
      break 'label;
  }

And if you want to go one further on that, you can break with value because Rust is expression-oriented:

  fn main() {
      println!("Hello, {}!", 'label: loop {
          break 'label "world";
      });
  }

Swift also has (copied?) this...


(And I always forget about it)

You can have the same flow in python raising and catching an exception.
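A small sketch of what that could look like: a "labelled block" built from an exception that is also a context manager. The names `Block` and `parse_port` are invented for this example, and this is just one possible shape, not an established idiom.

```python
class Block(Exception):
    """A labelled block: raising the instance jumps to the end of its `with`."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Swallow only our own "break"; any other exception propagates.
        return exc is self


def parse_port(text):
    port = None
    with Block() as done:
        if not text.isdecimal():
            raise done  # like `break done;` in Java
        value = int(text)
        if not 0 < value < 65536:
            raise done
        port = value
    return port


print(parse_port("8080"))   # 8080
print(parse_port("70000"))  # None
```

Because each `Block` instance only catches itself, nested blocks can be told apart the same way labels are.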

Doesn't Perl also have this? It would be a great feature in Python, although I wouldn't use it very often.

In Perl, any “bare block” is equivalent to a loop that only runs once. Useful in conjunction with redo too.

Oh my goodness! I'm using this right away (yesterday I was lamenting lack of goto for this exact use case.)

This is just Java, right? I don't think I've seen this in the C/C++ specs or compiler extensions.

Labeled break/continue are present in most languages that postdate Java. Go, JavaScript, Kotlin, Nim, Rust, Zig, for example, all have labeled break/continue as well. The only notable post-Java language I'm aware of that doesn't have it is C#. Languages that predate Java generally don't have such a feature--and this includes C/C++.

Well if you're using C, you could just use a proper goto.

One thing I don't like about goto is that it allows the creation of irreducible control flow. Makes all kinds of optimizations and program analysis more difficult. When I have the chance, I advocate for more well behaved alternatives such as labeled break statements. Although they're more restrictive, most of the time they're all you need.

This is true, but I suspect that you're getting downvoted because python doesn't do bytecode optimization.

CPython doesn't. But the other Python implementations (PyPy, GraalVM, et al) don't even use the CPython bytecode.

Also true... but this decorator isn't going to work in that context.

CPython absolutely does, just not much of it (it has a peephole optimiser with a very small set of opts).

Properly implemented tail calls are all you need.

You can get loops and goto etc from there.

(You might still want extra first class support for eg exceptions.)
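CPython does not eliminate tail calls, so the idea can only be simulated there. The following is a sketch using a trampoline; `trampoline` and `count_down` are names made up for the example. Each "jump" is written as a tail call that returns a zero-argument function instead of calling directly.

```python
def trampoline(result):
    """Keep calling returned thunks until a non-callable value comes back."""
    while callable(result):
        result = result()
    return result


def count_down(n, acc=0):
    # Plays the role of a loop label: the "tail call" below is the jump back.
    if n == 0:
        return acc
    return lambda: count_down(n - 1, acc + n)


# Sums 1..100000 without growing the Python call stack.
print(trampoline(count_down(100_000)))  # 5000050000
```

The limitation of this sketch is that the final result must not itself be callable.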

yes they can be used for evil but how are exceptions different?

Python added a GOTO a few years ago, by mistake, with asyncio.ensure_future(): https://vorpus.org/blog/notes-on-structured-concurrency-or-g...

We are still paying for this mistake today: most devs don't scope the lifecycle of their async tasks, which leads to a lot of spaghetti code.

GOTOs are too easy to misuse.

We used to have only GOTOs, no loops, no ifs... But specialized keywords are better: they enforce cleaner code, leverage conventions, play better with tooling and make the whole experience easier to read and debug.

Really? I can't think of why I would want to use a goto in Python, and if I did it would definitely make it not "Pythonic".

Python is 30 years old, is there any popular language created in the last 30 years that has goto?

(And also I emphatically agree on the end user executable thing)

I recently needed to do some "what if" tax scenarios. I wrote functions for each relevant IRS form that pretty much just followed the form. For example, for Schedule D I had one line for each form line.

For example lines 15 through 20 ended up like this:

  D15 = 40000
  D16 = min(D1, D15)
  D17 = min(D14, D16)
  D18 = max(D1-D10, 0)
  D19 = min(D1, 163300)
  D20 = min(D15, D19)

The variable numbers match the form's line numbers, and the code is a straightforward implementation of the instructions for the line.

The instructions include things like "If line X is greater than line Y, go to line Z".

I think the code would have been clearer if I could have used goto to match the instructions instead of having to convert to if blocks. For the forms I was interested in at least the if blocks were all fairly simple and the lines in the code were all in the same order as the lines on the form, but I'm not sure that would be true for all of the IRS forms.

That's what functions are for.

And bonus, you can put them in several files, and import them when your program grows. Which means you can then call them in the shell to play with them.

It will make the whole experience much easier.

Function calls would have been clear enough. They are essentially just gotos with arguments when they are in tail position.

And for the tax stuff, you wouldn't even need to worry too much about having properly optimized tail calls, as the business logic here is unlikely to blow up your stack.
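As a sketch of that style, here is a made-up four-line form (not any real IRS worksheet; the line rules, the 40000 cap and the `form` dict are all invented) where each line is a function and "go to line N" becomes a call in tail position:

```python
def line_1(form):
    form["L1"] = form["gain"]
    # "If line 1 is zero or less, skip lines 2 and 3 and go to line 4."
    if form["L1"] <= 0:
        return line_4(form)
    return line_2(form)


def line_2(form):
    form["L2"] = min(form["L1"], 40000)
    return line_3(form)


def line_3(form):
    form["L3"] = form["L1"] - form["L2"]
    return line_4(form)


def line_4(form):
    # Line 3 is treated as zero when it was skipped.
    form["L4"] = form.get("L3", 0)
    return form


print(line_1({"gain": 50000})["L4"])  # 10000
print(line_1({"gain": -5})["L4"])     # 0
```

The call depth is bounded by the number of lines on the form, so the lack of tail-call elimination does not matter here.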

For what it's worth, I actually think your code is clean as is, as it clearly shows the relationships of the computation, and it's easy to understand the flow of data. I'd just suggest naming the variables as best as possible, e.g. d16_NetIncomeRaw = min(...).

Makes total sense to me.

The code’s logic would better match the flow of the tax form, which often says things like “if no whatever, skip to #farther.down.the.page.”

Sometimes it’s best to let art^h^h^hcode imitate life.

> Python is 30 years old, is there any popular language created in the last 30 years that has goto?

Julia (9 years old) https://sodocumentation.net/julia-lang/topic/5564/-goto-and-...

Golang (11 years old) https://golang.org/ref/spec#Goto_statements

Lua (28 years old) http://lua-users.org/wiki/GotoStatement

Also C# (21 years): https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...

PHP perhaps gets honorable mention for not originally having a goto statement, but adding one something like 10 years later.

Zig moved in the opposite direction. It had one at first, and then removed it: https://github.com/ziglang/zig/issues/630

For Zig, note that there's not really a use case for goto that you can't do with other language features.

To jump backwards, there's labeled continue. To jump forwards, there's labeled break. The use case that computed goto tries to solve is addressed with labeled continue on a switch [1]

[1]: https://github.com/ziglang/zig/issues/8220

Maybe labeled break and continue is what Python needs, not goto.

All that trouble to avoid proper gotos. Essentially introducing castrated gotos.

“Proper” gotos, like continuations, are hard to reason about and optimize.

Restricted versions each covering a major use case are easier to reason about, and to optimize.

Anyhow, pretty much every version of “goto” using the name implemented or proposed for Python or other structured languages is castrated: typically, it takes only a static label and has restrictions on where it can jump based on structure (out of blocks but not into blocks, within the same function, is pretty typical.)

Proper goto can jump anywhere in the program, even dynamically computed at runtime, and if state isn't set up the way the code jumped to expects, too bad. Everything more restricted is just dickering about how much it should be castrated.

Isn't that basically every control flow mechanism?

That was even the whole point behind Dijkstra's paper. To paraphrase, "Goto is a very powerful language feature. Which is exactly the problem. Less powerful features give programmers less latitude to do clever things their colleagues (and the compiler) can't reason about effectively."

Ye sure. I meant that it feels like jumping through hoops to introduce a lot of different mechanisms just to remove goto:s. E.g. "labeled breaks" and using silly loop constructs.

You have it precisely backwards.

These features already have reasons to exist, because they solve control flow scenarios. So there they are. And then we are brought the question of introducing goto, and there's just nothing that it adds. It would make the language more complicated for no reason.

They're not set up in the most elegant way, but the concept of leaving a nested loop is very simple and should definitely be possible without a goto. And switching between states in a state machine, where each state has a single entrance like a normal function, should also be simple.

With goto you can't know at a glance if the code is following those patterns or doing some kind of spaghetti nonsense. It hurts readability.

Wow I'm actually pretty blown away! I use Go on a daily basis and have never seen it used anywhere or had any clue it was in the language.

If you had asked me a few minutes ago if Go had goto I would have said "definitely not and they would never add it because it goes against the Go ethos".

The name of the language should have given it away.

Hopefully Go 2 won't have Goto

Why does it go against the Go ethos? (Asking out of curiosity. I’ve never used the language.)

I feel like they usually prioritize simplicity and readability instead of tricks, syntactic sugar, and one-liners.

For example there's no ternary operator. Instead you'll have to write out a few lines of if/else. There's no function overloading or operator overloading, no built in min/max, and currently no generics (though they're being added).

They definitely try to keep the number of keywords down to a minimum compared to languages like Swift where you could write the same business logic in fewer lines using stuff like try, guard, !, and ?.

An if-statement is essentially sugar for conditional jump which is sugar for compare then jump. Goto is a jump. It’s all sugar.

i can only think of where you want to break out of all nests rather than just the current one.. i still wouldn't use it though.

Go has goto

Surely the ten+ year Python 3-is-incompatible-with-2 saga was also a mistake.

They could have tried to handle this in a better way, but some incompatible changes were useful and probably necessary.

Perhaps files could have declared at the top whether they were Python 2 or 3, and the pyc could have been kept as compatible as possible?

Wouldn’t have been very useful, there were not that many incompatible syntactic changes you could not ignore or paper over (string literals were the main issue until u”” was restored) and it would not have helped any with the semantic & api changes.

Don't know if I agree about the goto thing, but there are actually a number of options now for delivering varying degrees of self-contained Python executable.

When I evaluated the landscape a few years ago, I settled on PEX [1] as the solution that happened to fit my use-case the best— it uses a system-provided Python + stdlib, but otherwise brings everything (including compiled modules) with it in a self-extracting executable. Other popular options include pyinstaller and cx_freeze, which have different tradeoffs as far as size, speed, convenience, etc.

[1]: https://github.com/pantsbuild/pex

I've been using nuitka. but really it should be part of the language by now.

There are a ton of options in Python but none are satisfying compared to eg Golang.

Oh sure, a statically-compiled language is always going to win on this for just having it out of the box.

But that can also be seductive in its own way, like when you get used to just deploying a single binary but then suddenly have an application that also includes data files— do you do the go:embed thing to maintain the glorious single binary, or do the pieces go in an archive together? If it's that one, do you find the data at runtime with a flag, or an envvar, or a relative path, or something else?

Undoubtedly, Go is easier than Python or Ruby, especially for the trivial case. But deployment is overall a non-trivial problem.

This sounds like a case of perfect being the enemy of good. Yes, no deployment system will cover all cases. But that doesn't mean it's impossible to make a system that's good for most cases.

Just... out of curiosity as someone who learned to code in BASIC on a TRS-80... WHY ON EARTH would you ever want goto or gosub statements instead of proper loops and methods? It's not like they're faster. They certainly don't make code more readable. The last time I had to use that type of construct was gotoAndStop() in Flash, which at least had the benefit of queueing up a bunch of art and sound effects. Other than the "neat-o" factor, what's the use case or impetus behind this apparent horror?

I had to use GOTO label once so far, as a way to quickly exit from inside nested loops. You can do it without GOTO, but the point was to be efficient (if you work with the pixel colors of a big image, each extra instruction inside the loops has a visible time effect).

In a situation like that I would usually define a boolean before entering the top loop, and set that true within once a condition is met, causing all the outer loops to break if that condition is true. I can't imagine there would be a major speed difference in some other way of breaking all the loops to jump to a different line of code.

That means once it’s set to true, all the inner loops would need to at least finish that iteration. That could be a huge speed difference at a minimum or a completely wrong implementation.

This means you do an if check on each loop iteration, which is bad if you're doing, say, an image filter/effect where you work with big loops. And you end up with nested ifs and breaks inside the code; it's much more readable to just jump out.

I would argue that having to do that is not in fact a 'proper' loop. Your way and goto are both workarounds for a missing control flow construct.

For gosub, what's the issue? It is a method call.

Python's for loop doesn't take a boolean condition, so using a flag would mean uglifying the code. A goto would simplify the workflow by removing 1-2 lines for each additional loop.
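For comparison, a sketch of the two goto-free options in Python: the flag approach being discussed, and wrapping the loops in a function so that `return` acts as the multi-level exit. The function names and the list-of-rows `image` are assumptions made for the example.

```python
def find_pixel_flag(image, target):
    """Flag version: one extra check per enclosing loop level."""
    found = None
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value == target:
                found = (r, c)
                break  # leaves only the inner loop
        if found is not None:
            break  # repeated for every additional outer loop
    return found


def find_pixel_return(image, target):
    """Function version: `return` leaves all loops at once."""
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value == target:
                return (r, c)
    return None


image = [[0, 1], [2, 3]]
print(find_pixel_flag(image, 2))    # (1, 0)
print(find_pixel_return(image, 2))  # (1, 0)
```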

Is that faster than (or comparable to) raising an Error in the inner loop and catching it outside the outer loop?

Internal error bubbling code is slow and can change from release to release. And it's definitely not intended for the job.

probably faster since it was built for this purpose, throwing an error would have to at least create an exception object, then create stack trace and the other setup stuff.

As i said I code for almost 20 years and I only had to use it once so far, but people that maybe work a lot more with stuff that needs a lot of performance might use it much more often.

Read "Structured programming with goto" by Knuth (a paper). He gives a quite in depth explanation of when gotos could be preferable.

Also, if you generate code then gotos can be useful.

First of all, you wouldn't want it. I am an old programmer, so I still remember being taught to code before structured programming became universal. And I can personally attest that structured programming was a great advance.

In case you haven't heard the term, structured programming is the term that was used for replacing goto with a handful of commonly used control flow structures. A loop, a function with a return statement, and some kind of case statement (or multi branch if) were the basic ones.

If you look at modern languages today almost all of them have the same set of control flow constructs. They have an if statement, a loop, some kind of function with a return statement, most have some kind of case or multi-branch if, and most have some sort of exceptions. (If was present before structured programming; exceptions were added later.)

If you look at the very oldest programming languages they may have some of these control flow structures but they definitely have two: the if statement and the goto. Because the goto statement is strictly more powerful than these other control structures. If you have a language that doesn't support exceptions but does have a goto, then you can write an exception-like flow using goto. If you have a language that doesn't have break or continue for its loops you can build a loop that works that way using goto. If your programming language doesn't have the broken form of case statement (that falls through to the next branch) so you can't build Duff's device [https://en.m.wikipedia.org/wiki/Duff%27s_device], then you can build that using goto.

In fact, the combination of if and goto is powerful enough to create any possible (single threaded) flow control structure including ones you and I have never even thought of.
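One way to see this in a language without goto is to simulate it: keep a "program counter" variable and dispatch on it with plain ifs. The sketch below (the function name and label names are invented) writes Euclid's algorithm as three labelled states where every assignment to `pc` stands in for a `goto`.

```python
def gcd_goto_style(a, b):
    """Euclid's algorithm written as labelled states with explicit jumps."""
    pc = "check"  # the simulated program counter
    while True:
        if pc == "check":
            pc = "done" if b == 0 else "step"  # conditional goto
        elif pc == "step":
            a, b = b, a % b
            pc = "check"  # unconditional goto (backward jump)
        elif pc == "done":
            return a


print(gcd_goto_style(48, 18))  # 6
```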

The reason that structured programming was such an important and powerful step forward is because it made the language strictly less powerful. Because the language was less powerful, it was possible to reason more clearly about what the code could or couldn't do. In particular, it allowed a programmer to make certain assumptions about locality. If you started reading at the top of a function and worked your way down you could reason about what this function did in isolation: perhaps the tangle of loops and ifs and try-except blocks in the function are confusing, but at least you don't have to read the entire rest of the program to be certain what it is doing. With goto in your language that guarantee is gone: you might have to understand the entire rest of the program in order to figure out any one function.

So, we gave up the power to create certain flow control constructs in order to make it possible to reason about our languages. But there is nothing that says that the evolution of programming languages is finished. Exceptions are pretty nifty, and they were not part of the original toolkit. Maybe there is another control flow structure out there waiting to be discovered which is extremely useful for solving certain kinds of problems. Someone with goto in their language could discover and use that control flow structure, while those of us without goto cannot.

> With goto in your language that guarantee is gone: you might have to understand the entire rest of the program in order to figure out any one function.

The same is true, to a greater degree, for function calls.

In eg. C goto:s are function scope.

if (failed) goto retry;

Is much more elegant than a loop, or at least easier to read since you only need to really pay attention to the happy path.

But you can also do that with recursion

if failed: do_thing()

This is covered in the Julia docs, where they argue recursion is more readable than goto


I guess that depends on if the language supports tail recursion and/or you're 100% sure it won't blow the stack. If you need to retry forever, without using any memory to do so, goto is perfect.
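A Python sketch of both variants, with made-up helper names (`retry_loop`, `retry_recursive`, `make_flaky`). CPython does not eliminate tail calls, so the recursive version uses one stack frame per attempt, while the loop version uses constant stack space.

```python
def retry_loop(operation, attempts):
    """Loop version; assumes attempts >= 1."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:
            last_error = error
    raise last_error


def retry_recursive(operation, attempts):
    """Recursive version; each retry adds a stack frame in CPython."""
    try:
        return operation()
    except Exception:
        if attempts <= 1:
            raise
        return retry_recursive(operation, attempts - 1)


def make_flaky(failures):
    """Return an operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("failed")
        return "ok"

    return operation


print(retry_loop(make_flaky(2), 5))       # ok
print(retry_recursive(make_flaky(2), 5))  # ok
```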

Some code is more elegant to write with a goto than a loop.

Javascript has a rare yet similar feature called label


Exceptions are forward gotos. A "while" loop with continue and break is a backward goto. The only thing missing is break from a nested while. Java has one, called break with label, one of the more obscure features of the language. So, in a sense, Python already does have goto.

All control flow constructs in modern programming languages are about codifying the safe uses of goto and setjmp.

For example defer and friends are goto cleanup; and yield and exceptions are setjmp.
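In Python the closest standard-library analogue to `defer` is probably `contextlib.ExitStack`. A sketch of the "goto cleanup" pattern with it follows; the `run` function and the log messages are invented for the example.

```python
from contextlib import ExitStack


def run(fail, log):
    """Acquire two pretend resources; cleanup runs in reverse order on exit."""
    with ExitStack() as cleanup:
        log.append("open A")
        cleanup.callback(log.append, "close A")  # like `defer close(A)`
        log.append("open B")
        cleanup.callback(log.append, "close B")
        if fail:
            raise RuntimeError("step failed")  # the `goto cleanup` moment
        log.append("work done")


ok_log = []
run(False, ok_log)
print(ok_log)  # ['open A', 'open B', 'work done', 'close B', 'close A']
```

The callbacks run on both the normal and the failing path, which is the property the C idiom gets from funnelling every exit through one cleanup label.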

Sometimes a feature of a programming language is about making it really inconvenient to do something you shouldn't be doing

So rust also has goto :)

Great! It's too bad that Python's jump instructions only accept constant targets or constant offsets (which seem like they end up being equivalent), otherwise this approach could also support labels-as-values.
