Python 3.11: “Zero cost” exception handling (python.org)
197 points by rbanffy on Oct 6, 2021 | 90 comments



They weren't zero cost before? In a language where idiomatic control flow uses exceptions? That's crazy!

I've felt weird using exceptions like that but I always assumed that CPython was optimized to minimize overhead of exceptions and exception handlers.


Exceptions always have a cost. What happened with many C++ runtimes is that they moved the execution cost of exceptions almost entirely into the exception raising mechanism so that the path of execution that does not raise exceptions has no overhead due to exceptions. This was not the case with the Python runtime.

The "zero cost" implementation for C++ exceptions in the case of ELF binaries means storing a bunch of static (compile-time generated) data regarding CPU register state changes for each function in a table indexed by function address offset within the DSO. When a "throw" expression is executed, the execution stack is crawled and that table evaluated for each function encountered until an appropriate "landing pad" (eg. a matching "catch" clause) is found, and then the stack is unwound by interpreting the function table contents to step through the various CPU states in reverse order, executing destructors as required. It's just like sufficiently advanced technology. In the case of no "throw" expression, there is no extra CPU cycles spent. This trades off execution time for extra memory storage in the no-exception case and more CPU cycles in the exception case.

As I understand it, the Python "compiler" just generates a bunch of straight-line code and decision statements so that when a function returns, its result is checked and if the result is that an exception was thrown it jumps to the local handling code otherwise it jumps to the local non-exception code. The price for handling exceptions is always paid all the way up the stack for every function call made.

Because exceptions are a fundamental control-flow mechanism in Python (unlike in C++, where they should only be used for exceptional control flow), I'm not sure if there will be a net benefit to "zero cost" exceptions. I guess they should try it and measure the difference under various scenarios and make an informed decision based on evidence.


Excellent post! Python Bytecode is a little more naive / high level than this though, so stuff like exceptions and their nested handlers are actually implemented in the VM itself. The VM has essentially a second stack for exception and context manager blocks. The compiled bytecode essentially looks like this:

    SETUP_TRY 10 # address where exceptions will be handled
    some stuff that might explode
    POP_BLOCK
    JUMP 20  # jump over exception handlers
    LOAD exception type  # address 10
    COMPARE  # check if exception type matches
    ... handler for the type
    stuff after the try-except  # address 20
The corresponding source would look like

    try:
        some stuff that might explode
    except exception type:
         ... handler for the type
    stuff after the try-except
Python's bytecode compiler is generally a 1:1 translation of the AST; it never optimizes, e.g.:

    [value for value in list]
Translates to something like

    LIST_NEW
    FOR_EACH
    STORE
    LOAD
    LIST_APPEND
    JUMP BACK
Note how "value" generated a store-load pair.

The VM checks whether PyErr (pointer to exception) is set after basically everything. Similarly, extension modules check PyErr after every call into the interpreter, e.g.

    PyObject *attr = PyObject_GetAttr(someobj, somepystr);
    if (attr == NULL) {  // equivalently, PyErr_Occurred() will be set
        // handle exception
    }
This gets old pretty fast.


I'm growing more interested in actually understanding some of the internals that you mention here. I know this is a bit of a tangent, but is there a better way to approach understanding python's internals than reading the source (which feels a bit monolithic to me right now)?


The central dispatch of the VM is a good place to start: https://github.com/python/cpython/blob/main/Python/ceval.c#L...


I thought I had a pretty good handle on Python internals, until some time early this year when I took an interest in the generated bytecode. I'd read plenty of the cpython source, written lots of cython extensions, etc., but somehow missed the middle piece.

Fortunately, it's really easy to get at the bytecode, and quite instructive. Random inquiry: how do generator functions work?

  In [1]: def foo():
     ...:     yield from range(10)
     ...:

  In [2]: import dis

  In [3]: dis.dis(foo)
    2           0 LOAD_GLOBAL              0 (range)
                2 LOAD_CONST               1 (10)
                4 CALL_FUNCTION            1
                6 GET_YIELD_FROM_ITER
                8 LOAD_CONST               0 (None)
               10 YIELD_FROM
               12 POP_TOP
               14 LOAD_CONST               0 (None)
               16 RETURN_VALUE
From there, you can read how each of those bytecode instructions is implemented in ceval.c, which formerly_proven links to.

edit: probably nice to have the actual disassembly of a list comprehension, too:

  In [4]: def bar():
     ...:     return [x for x in range(10)]
     ...:

  In [5]: dis.dis(bar)
    2           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f814059f450, file "<ipython-input-4-a7a3cc7e7d7f>", line 2>)
                2 LOAD_CONST               2 ('bar.<locals>.<listcomp>')
                4 MAKE_FUNCTION            0
                6 LOAD_GLOBAL              0 (range)
                8 LOAD_CONST               3 (10)
               10 CALL_FUNCTION            1
               12 GET_ITER
               14 CALL_FUNCTION            1
               16 RETURN_VALUE
  
  Disassembly of <code object <listcomp> at 0x7f814059f450, file "<ipython-input-4-a7a3cc7e7d7f>", line 2>:
    2           0 BUILD_LIST               0
                2 LOAD_FAST                0 (.0)
          >>    4 FOR_ITER                 8 (to 14)
                6 STORE_FAST               1 (x)
                8 LOAD_FAST                1 (x)
               10 LIST_APPEND              2
               12 JUMP_ABSOLUTE            4
          >>   14 RETURN_VALUE


> Python's bytecode compiler is generally a 1:1 translation of the AST; it never optimizes

I thought python had a "peephole optimizer" that optimises?
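
Constant folding, at least, is visible directly in the bytecode (a quick check; exact output varies by CPython version):

    >>> import dis
    >>> dis.dis(lambda: 1 + 2)
      1           0 LOAD_CONST               1 (3)
                  2 RETURN_VALUE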


Note that the generated instructions - even if they are never run as in [1] - still consume cache and might hinder further optimizations. Which is why `noexcept` is becoming more popular. And because there is no GC, code must be written to be exception-safe in all conditions which is often forgotten.

1: https://godbolt.org/z/bKfG14P64 - the difference between `e()` and `n()` is that one is marked `noexcept`. Both `f()` and `g()` run directly to `ret` (return) in the normal case.


Yeah, noexcept(true) is identical to wrapping the function with a try-catch construct in which the catch clause simply calls std::terminate(). Your godbolt example doesn't include the definition of n() so it doesn't show that. Adding noexcept(true) has a cost (because it's a kind of exception handling) but also allows the compiler to optimize out some of that cost under some circumstances.

Nevertheless, an example that demonstrates the exception path doesn't say much about the "zero-cost" non-exception path. In C++, actually throwing and handling an exception is expensive. Outside of that path, exceptions have zero cost. There are no generated instructions. They do not consume cache. They do not hinder further optimizations.


Indeed, in the following case of a 4-deep call stack, each with its own exception, the bulk of the handling code can be moved elsewhere (already marked as cold by the compiler), but nonetheless there are a few instructions which won't matter in most cases but are still required to jump there and thus end up in the instruction cache.

https://godbolt.org/z/eh1d4K1M7


Isn’t exception handling code placed in the cold section of the binary these days so that the impact on cache is nonexistent?

Yup, the generated assembly does this. There are still some minimal extra instructions to set up the frame, but the bulk of the exception handling lives elsewhere.


There is GC in C++/CLI, Unreal C++, and the optional C++11 API, which is getting dropped in C++23, though.


Also, it's not really the case that exceptions (when not thrown) are zero-cost in C++, and not because of the instruction cache or increase to static data size.

The cost is that exception-throwing functions inhibit many of the optimizations performed by compilers, so they generate worse code, even though no "extra" code is actually executed.


Do they? I spend a lot of time dealing with the optimizers in GCC and haven't noticed that happening. In GCC, a throw is implemented as a single function call to the C++ runtime function __cxa_throw(). If a simple single function call will pessimize performance we're screwed as a species.


It does because calling that function adds a new control flow edge out of the block it is in, which means you can no longer prove certain cleanup code can safely be elided.

Additionally, the fact that C++ does not annotate whether a function throws in its signature means you need to generate unwind data for most non-leaf functions, since (short of whole-program LTO) there is no way to safely know that the functions they call don't throw. That means the size of the metadata necessary to support unwinding grows with the size of your binary, not the portion of the code using exceptions.

Finally, C++ exceptions are dynamic, which imposes a higher cost when they are actually used than a static exception model.

In general I suspect that explicit error handling (as is done in the Swift ABI) would result in better code because it would make the control flow edges more optimizable and the optimizer could safely assume many functions don't ever need to be unwound. It would certainly make the binaries smaller due to the reduction in size of unwind tables (although some people, including Stroustrup[1], argue there are ways to do similar optimizations to elide unwind data with existing C++ compilers, but I am not sure if anyone has ever built such an optimizer).

There is some discussion about adding a static exception model (with annotated functions) described in P0709[2]. Various parts of that are controversial, though I hope the committee eventually finds a way to agree to some variant of it, because it would allow unification with other languages (it is semantically equivalent to Swift Error handling), and there are even C proposals that would be interoperable[3].

Full disclosure, I work on a dynamic linker written in C++, so I spend all day long writing C++ code that is built without exceptions or most of the standard library, but is part of the runtime machine that enables exception handling.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p194...

[2]: http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p0709r4....

[3]: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2429.pdf


This was really informative, thanks! Also “it’s just like sufficiently advanced technology” made my morning.


>Because exceptions are a fundamental control-flow mechanism in Python (unlike in C++, where they should only be used for exceptional control flow), I'm not sure if there will be a net benefit to "zero cost" exceptions.

Well, they might be "a fundamental control-flow mechanism", but for every StopIteration (to take one example of a control-flow exception), there are many (up to millions of) traversed elements that didn't throw an exception.


One of the stated benefits of this is

> Calls to Python functions would be faster as frame objects would be considerably smaller. Currently each frame carries 240 bytes of overhead for exception handling.

I guess that’s where this will pay off, even in a language/ecosystem where exceptions aren’t exceptional.


It's a time-space tradeoff. A few bytes less overhead on the stack, a few CPU cycles more overhead when raising an exception. Is it a net win? Show me empirical test results.
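
Something along these lines would be a start (a sketch; the function names are made up, and absolute numbers vary by machine and Python version):

    import time

    def plain(x):
        return x + 1

    def guarded(x):          # try block present, exception never fires
        try:
            return x + 1
        except ValueError:
            return 0

    def raising(x):          # exception raised and caught on every call
        try:
            raise ValueError
        except ValueError:
            return 0

    for fn in (plain, guarded, raising):
        start = time.perf_counter()
        for _ in range(1_000_000):
            fn(1)
        print(fn.__name__, time.perf_counter() - start)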


Thanks for the informative post.

> then the stack is unwound by interpreting the function table contents to step through the various CPU states in reverse order

Why do you need the exception function table contents for this? Presumably, you only want to destroy stack objects, so isn't the call stack necessary and sufficient to unwind all the way back to the function with the catch handler?

Also, what are your thoughts on returning errors via a status vs. exceptions? I love the former: it forces you to somehow handle each error (pass it up or deal with it), whereas with exceptions there is no accountability for the method that actually caused the exception.


You need to restore some execution context before invoking the destructors. The value of the implicit "this" parameter, for example, has to be set correctly before the call to the destructor. Depending on the CPU and calling conventions, other registers may need their values contextually restored.

Then there's the stack crawling itself. Again, depending on the CPU and calling conventions, just finding where the return address is stored for a function can require some gymnastics (aarch64, I'm looking at you here).

As to returning and handling errors via status returns instead of exceptions: you're factoring the cost of error handling into the good path. It's the opposite of zero-cost. I've never understood the argument that every piece of code needs to be able to handle every error that every subroutine it calls could ever encounter. No one actually does that, and the end result is that "handle errors by return codes" is usually faster than using exceptions, because the error handling ends up not being done at all. At least, that's my experience over 40 years of maintaining other people's code.
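
In Python terms, the status-return style looks something like this (a sketch; the function and file name are made up), and the check on the good path is exactly the cost being factored in:

    def read_config(path):
        try:
            with open(path) as f:
                return f.read(), None   # good path still builds a tuple
        except OSError as err:
            return None, err

    data, err = read_config("app.cfg")
    if err is not None:                 # every caller pays this branch,
        data = ""                       # even when errors never happen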

As to the performance of languages that force error status returns to be dealt with vs. exception handling? Profile. Get some numbers. Examine the bias of the person generating the numbers.


What’s a DSO?


Dynamic shared object, or what's usually understood generically (in non-language specific terms) as a [binary] module or in GUI apps a plugin--something loaded and linked dynamically by the program, not by the linker at compile time or at program startup. In the Unix/C ecosystem "DSO" is a common term to describe binary modules, without reference to specific binary formats like ELF or mach-O.

Especially in the land of Unix, where dynamic linking has become very sophisticated and automated, many of these terms are losing their distinctive meanings as the various ways to link and load converge on the same underlying mechanisms, which you can mix and match independently.

But normally the difference between a DSO and a shared library is that shared library symbols (functions, global variables) are by default visible globally (at least on ELF; the mach-O story is more complex), making them implicitly visible to subsequently loaded code, whether a shared library or DSO. On the other hand, you usually want all DSO symbols to remain private to the DSO by default, requiring the application at runtime to explicitly request a pointer to specific symbols, as different DSOs might have symbols of the same name (especially true for plugins), leaving the application logic to decide which to use and when.

Also, especially pertinent to Lua, Perl, Python, Ruby, and similar languages, a DSO usually implicitly relies on the main program to have loaded and exported certain core APIs, rather than the DSO explicitly linking to a shared library to provide them. (This still often requires special compile-time flags when building the DSO, and possibly also special flags to the main binary.) That ensures the VM/interpreter/engine and all DSOs are always calling into the same core runtime implementation, and ensures the main binary controls which that is.

DSOs do often dynamically link to other shared libraries, but because of default symbol visibility semantics this can cause problems, such as two DSOs, or a DSO and the main program, compiled against different releases of OpenSSL with incompatible ABIs. (If you can manage to load two different versions of OpenSSL, you have the reverse issue of accidentally passing objects between the two implementations.) Mitigations and resolutions for that problem differ between ELF-based and mach-O-based systems. (AIX is an outlier, which like Windows uses a PE-derived binary format that is significantly different from ELF or mach-O. Many of the semantic distinctions and jargon that developed around ELF don't make much sense in the context of PE.)
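
For a concrete feel from the Python side, loading a DSO and explicitly requesting a symbol at runtime looks roughly like this (a sketch using ctypes, assuming a glibc Linux system where libm.so.6 exists):

    import ctypes

    # dlopen() the DSO at runtime; nothing was linked at build time
    libm = ctypes.CDLL("libm.so.6")

    # explicitly request the symbol and describe its signature
    libm.cos.restype = ctypes.c_double
    libm.cos.argtypes = [ctypes.c_double]

    print(libm.cos(0.0))  # 1.0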


Dynamic Shared Object (in this context, probably a .so file).


Ah!


> exceptions are a fundamental control-flow mechanism in Python

Where do you have this from?

In almost all Python code I have seen, exceptions are still for, as the name says, exceptions. So in the common cases, there would be no exceptions raised. I would assume that the exception handling code (under `except ...`) will be run only a tiny fraction of times compared to the other code, at least in most cases.

I would argue, if one abuses exceptions for any sort of control flow logic, this is bad design.

See also the list of builtin exceptions: https://docs.python.org/3/library/exceptions.html

From those, yes, there is StopIteration and StopAsyncIteration, which are used for control-flow, but the handling of those is anyway internal in CPython, and so the discussion about zero cost does not apply, as it would not change this (as far as I understand the current proposal).

Otherwise, all other exceptions are not used for control-flow.


The for loop in python is a try/except catching StopIteration in a trench coat.

Also, EAFP, and the context manager protocol.
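
Roughly, what `for x in iterable: body(x)` means at the language level (a sketch; CPython itself shortcuts the exception machinery for for-loops, as discussed elsewhere in the thread):

    def desugared_for(iterable, body):
        # what a for loop means at the language level
        it = iter(iterable)
        while True:
            try:
                x = next(it)
            except StopIteration:   # loop termination is an exception
                break
            body(x)

    desugared_for(range(3), print)  # prints 0, 1, 2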


But that is what I meant: this StopIteration would not be affected by the zero cost optimization. The zero cost optimization is about compiled Python bytecode and the VM interpreter. The StopIteration is handled internally by CPython (not via Python bytecode).

For example, consider this function:

    def foo():
        for i in range(3):
            print(i)
There is no try-except in this code, and neither in its bytecode. So this is not affected by the zero cost optimization from the linked issue.

The bytecode is this:

  2           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (3)
              4 CALL_FUNCTION            1
              6 GET_ITER
        >>    8 FOR_ITER                12 (to 22)
             10 STORE_FAST               0 (i)

  3          12 LOAD_GLOBAL              1 (print)
             14 LOAD_FAST                0 (i)
             16 CALL_FUNCTION            1
             18 POP_TOP
             20 JUMP_ABSOLUTE            8
        >>   22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
The iteration logic is inside the `FOR_ITER` op. And that is purely handled inside CPython.

Also, this is again what I meant: In user code, you rarely would use custom exceptions for control flow. Please show me any example where you have seen otherwise.

So, again, what I wrote: I'm very confident that for most user code, the code path under `except ...:` is only rarely executed. Please show me any big Python project where this would not be the case. I doubt that there is any.

Also, btw, on EAFP, Guido van Rossum disagrees: https://mail.python.org/pipermail/python-dev/2014-March/1331...


> The for loop in python is a try/except catching StopIteration in a trench coat.

That's...horrifying, both from a computational efficiency perspective (due to the overhead of exceptions) and from a "beauty" perspective (exceptions are supposed to be exceptional, darn it!). Why is it implemented this way? Why not just use a normal loop and check a return value from the iterator?


exceptions are supposed to be exceptional, darn it

This feeling is at odds with what is generally considered idiomatic Python, for better or worse.

Upthread BiteCode mentioned EAFP. See, for example: https://devblogs.microsoft.com/python/idiomatic-python-eafp-...

(It's certainly not unanimous. Guido himself is on the record as saying EAFP isn't better than LBYL, but in most of the Python I've cruised through these last umpteen years [that wasn't some other language written in Python], I've certainly seen an overall preference for EAFP. Naturally YMMV.)


I think that implies that either "exceptions" are misnamed in Python (as in every other language with that name, they're meant to be exactly that - exceptional - so whatever Python calls "exceptions" isn't the same as in other languages), or that that particular idiom is wrong, and needs to be rethought.


> Why not just use a normal loop and check a return value from the iterator?

Because an iterator needs to be able to return any python value, and if you did that, there would be at least one python value that an iterator could not actually return.

Unless you required the iterator to wrap real return values in a container (basically, an optional/maybe monad) but then you have unwrapping overhead on every real return value, which is worse.
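
(For the caller-controlled case, `next()` already accepts a default, which skips the exception without reserving any value globally. A sketch, using the 3.8+ walrus operator:)

    _DONE = object()   # fresh, caller-local sentinel; no iterator can yield it
    it = iter([1, 2, 3])
    while (value := next(it, _DONE)) is not _DONE:
        print(value)   # no StopIteration materialized at the Python level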


Couldn't the runtime just allocate a specific Python object representing the case where the iterator is complete? Make that object implementation-specific, hide it as best as you can, write documentation stating that users should not use this value directly, and if they do - they voided the warranty, and should expect evil to befall them. (ultimately, you can't prevent the user from doing really dumb things e.g. editing the Python binary directly)


You do that with one thing, okay, maybe it works.

You do that with two or more of the protocols that Python has that use return values for return values and exceptions for flow-control signals and...you make it a lot harder to work with them, especially in conjunction.


Can you give an example of this? I can't think of any myself - generators also raise a StopIteration exception, for instance, so they could use the same value.


In Python e.g. instead of doing this:

    if key in some_dict:
        foo = some_dict[key]
        ... # A
    else:
        ... # B
you would do this:

    try:
        foo = some_dict[key]
    except KeyError:
        ... # B
    else:
        ... # A


You can do that, but:

- Even in this example, I would assume that branch A is more often executed. So the zero cost optimization (which is about branch A) would improve this code performance.

- I would argue, code which uses `some_dict.get(key, fallback)` or explicitly check `if key in some_dict: ...` is both more clean and more Pythonic. See also e.g.: https://mail.python.org/pipermail/python-dev/2014-March/1331...


This is on a different dimension.

Up to this point, in Python, exceptions were always checked, so they were "free" in the sense you were always paying for them anyhow. I remember many C++ programmers migrating to Python back in the ol' comp.lang.python days having this described to them. You don't have to worry about exception-heavy code, because since you're always paying for them anyhow, you might as well use them.

A lot of making CPython code run well works like that. You want to use as much of the stuff you're already always paying for anyhow, rather than reimplementing any of it in your own code.


> In a language where idiomatic control flow uses exceptions? That's crazy!

Seems like the opposite: if exceptions are extremely rare then you want to optimise the case where they’re not raised, at the cost of the other one.

If exceptions are common then it matters a lot less, you may even want to avoid 0ce depending on the impact on the raised case.


I don't think python has the mindset of "exceptions are extremely rare". That is probably what the OC meant by python being "a language where idiomatic control flow uses exceptions". As an example, every iterator in python signals its end by throwing a StopIteration exception. So, every "for x in iter" has the interpreter throwing and catching an exception.


> I don't think python has the mindset of "exceptions are extremely rare".

No it does not, that's my point: when exceptions are extremely rare, as in C++ or Java, having the "no exception" case be free at additional expense in the "exception" case is an excellent tradeoff, because the latter should happen extremely rarely so even if the costs are ridiculous you'll have more than made them by all the cases where you got the "no exception" run for free.

In Python where exceptions are common however, making one case free at the expense of the other is less interesting a proposition, and is more of a balancing act: you don't want to overly penalise exceptions-heavy code as that is considered normal and idiomatic.


Zero cost refers to the cost when no exception is thrown, not the overhead of exceptions. It may be more expensive throwing an exception under "zero cost" exception model, as throwing an exception may require parsing some data in the executable. (I'm not sure about the implementation, so this is just a may...)


> Zero cost refers to the cost when no exception is thrown, not the overhead of exceptions

There are recent benchmarks where .NET's "zero cost when a try-catch block is present and no exception is thrown" turned out to be significantly slower than the alternative without such a block.

It turns out that the try-catch block is a barrier across which some optimisations and re-organisations (e.g. method inlining) cannot happen, and that the compiler & JIT normally do a lot of these. So it might be "zero cost" but it might also prevent wins.

Edit: see here https://blogs.msmvps.com/peterritchie/2007/06/22/performance...

https://stackoverflow.com/questions/1308432/do-try-catch-blo...


Makes sense. Basically it’s zero-cost for exceptions-unaware code. I don’t know if it’s been changed with more modern jits but used to be chrome was unable to jit functions with try statements (similar to functions using `arguments` or `with` I think).


> I've felt weird using exceptions like that

How should they be used instead? Maybe I don't understand what you mean by "idiomatic control flow uses exceptions" - could you give an example. Maybe there is some use of exceptions that I'm not quite familiar with in Python.


When using a for-loop over an iterator, the iterator protocol in Python says to keep returning elements until you run out, at which point you throw an exception. So every loop over an iterator or iterable object in python throws an exception when it is done. https://docs.python.org/3/library/stdtypes.html#iterator.__n...


While that's true at the Python language level, there are already special optimizations for this in CPython: `tp_iternext` is not required to set an exception. If it returns NULL without setting an exception, that's taken to be the end of iteration.

If you call `next()` in Python, this special case is translated to a `StopIteration` exception. But if you use a for-loop, it can directly stop iterating without ever materializing the `StopIteration` exception. So the overhead of Python raise/try-except is already irrelevant for the for-loop.


TIL, thanks for explaining, indeed would have expected exceptions to be pretty cheap also if I knew this.


This covers it very well: https://devblogs.microsoft.com/python/idiomatic-python-eafp-...

In particular, this is not idiomatic python:

    if "key" in dict_:
        value += dict_["key"]

But this is:

    try:
        value += dict_["key"]
    except KeyError:
        pass

I too hate using the exception handling in this way, and if you aren't careful, you end up papering over other unexpected exceptions in your code, so you have to be (A) very specific in the exception you catch, and (B) keep it in as small a portion of code as possible.

I just think it makes for clumsy code - which of the two look better:

    try:
        value = dict_["key"]
    except KeyError:
        pass
    else:
        do_something(value)

OR

    if "key" in dict_:
        do_something(dict_["key"])
But it might just be me.


The difference in your "which of the two look better" example is:

(a) has one dict operation plus an exception which rarely occurs and is nearly zero cost if it doesn't.

versus

(b) has nearly always two dict operations, plus a possibly incorrect assumption that the dict will not be mutated between the "if key in dict" and "dict[key]" operations.


Doesn't Python pretty much guarantee that since it's single-threaded?


No, you can have multiple threads in a Python app. And while it promises to keep many things atomic, code like that can be interrupted between each line at minimum.


If you don't care about a key existing, then this works.

    do_something(dict_.get("key", None))
I use that a lot for data parsing. Passing the exception is not very clean IMHO. I stick to d[key] nomenclature when I need assurance that all the keys are present in the dictionary, and .get(key,None) when I don't.


.get("key") is enough, as the default is already None.

And if you care about the default value being a particular type, when there may also be None in the input stream, do something like:

x.get("key") or []

or

str(x.get("key") or "") # Guarantee strings and avoid "None"!


The second 'key in dict' option is not thread safe if dict_ is not local (another thread may delete the entry between the existence check and use). This is a good reason why the exception handling approach is a better idiom.

The exception handling approach may also be faster if your code cares (may, because I don't know if the cost of calculating the hash of "key" twice is cheaper than the exception handling overhead).


If I find if's and try's looking ugly for a particular use case I try to figure out how to get rid of them. For your first example I would do this, assuming value is a number.

  value += dict_.get('key',0)
Though I agree with using an if in the second example, if there isn't a better way to do the iteration to avoid looking up keys that don't exist.


> In particular, this is not idiomatic python

According to this article, I think his case is rather weak. The conclusion does not seem to follow from the premise to me.


Nearly all loops are terminated by raising an exception.

https://docs.python.org/3/library/exceptions.html#StopIterat...


The first that came to mind is how `get()` is handled in Django's ORM. The idiomatic way to look for a single object is to use `get`, then catch a `DoesNotExist` exception:

From https://docs.djangoproject.com/en/3.2/ref/models/querysets/#...

  from django.core.exceptions import ObjectDoesNotExist

  try:
      blog = Blog.objects.get(id=1)
      entry = Entry.objects.get(blog=blog, entry_number=1)
  except ObjectDoesNotExist:
      print("Either the blog or entry doesn't exist.")


Right, but the better way to actually write this is something like

  entry = Entry.objects.filter(blog__id=1, entry_number=1).first()
  if entry is None:
      ...  # deal with does not exist
Maybe it's my scala/Java background shining through, but we are big Django users and we ban the "catch exceptions as standard" workflow, because there is almost always a cleaner way...


This fails to raise an error if there is more than one object matching the given filters


Presumably entry_number is unique_together with blog_id. Otherwise the original code is also not handling the MultipleObjectsReturned exception.

Generally speaking, I tend toward the cleanest code being:

    blog = Blog.objects.get(id=1)
    entry = blog.entry_set.filter(entry_number=1).first()
    if entry is None:
        handle_missing_entry()
    handle_entry(entry)
But it does suffer from having the extra DB query in there, which may or may not be helpful, depending on the surrounding code (and whether or not you'll be using the blog instance anywhere else).


> Right, but the better way to actually write this is something like

Maybe for some cases, but it does not do the exact same thing, see the comment here: https://stackoverflow.com/a/29455777/1598080

And if we are talking about idiomatic, I think it is maybe a stretch to count this as idiomatic for Python, but given it is documented for Django I think it is fair to call it idiomatic for Django.


I wouldn't make it an if statement unless it's going to be a part of the standard flow. I think the catchphrase is "leap before you look". Though you're right that a single query is better.

Honestly, I normally just use get and let the exception fly. If it's a celery task I'll see the stack trace in flower, or right in the output if it's dev with debug on. Then I would go out of my way to make sure there was never a circumstance where a user requests something that doesn't exist.


That is the sort of method that I prefer if it's available. I'm a C# dev so this way also feels far cleaner to me.


What are you suggesting is better about that way? More readable?



In python, it's normal to use exceptions in place of type checking, e.g. in polymorphic functions.


But that would not be for control flow.


On a smaller scale, it is.


To me sending wrong arguments is an error condition, not control flow, see this for more info: https://softwareengineering.stackexchange.com/questions/1892...


It's not about sending the wrong arguments. In Python a function may support a variety of types in a single argument. But instead of querying the type and switching based on that, you often want to try to call a method and if you get an exception (because that method does not exist), try to call a different method that gets you what you need.

The advantage to this over checking the type is that you can still use duck typing. If you check the type then you can only support a specific set of classes.
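
A sketch of that pattern (`as_text` is a hypothetical helper, not a real API):

    def as_text(source):
        # try the richest interface first; fall back on the failure
        # rather than checking isinstance() up front
        try:
            return source.read()      # file-like objects
        except AttributeError:
            pass
        return "".join(source)        # any iterable of strings

    # as_text(open("notes.txt")) and as_text(["a", "b"]) both work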


Thanks for the clarification, I have actually seen this now that I think of it and this indeed is closer to control flow than error flow.


This is also known in Python as the "ask for forgiveness not permission" idiom.


Wow! Prior to reading this, I was not aware of "Zero Cost" exception handling. While I am only a Python developer, I always assumed that in any programming language, exception handling costs some CPU cycles regardless of whether an exception is raised. I work at an HFT firm, and they test their changes to equations in Python programs on crypto rather than in C++. So I resorted to using try-except blocks in Python to reduce "branching", i.e. if-elif-else blocks: I would just put all the different conditional functions in a dictionary, manage calls based on keys, and handle the exceptions. I don't know if that's the best way to improve speed, but I would like to check if this has any impact on it.
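
The shape of it is roughly this (a simplified sketch; the handler names are made up):

    def handle_buy(order):  return "buy", order
    def handle_sell(order): return "sell", order

    HANDLERS = {"buy": handle_buy, "sell": handle_sell}

    def dispatch(order):
        try:
            return HANDLERS[order["side"]](order)   # no if-elif chain
        except KeyError:
            # rare path pays the cost; note a KeyError raised inside
            # a handler would also land here
            return "unknown", order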


For me it isn't about the cost. Modern languages like Go and Rust separate the error handling from the conventional logic, and that makes the code more readable. It's my only complaint about Python (outside of performance, of course). In Python, when you see a `try`, you don't know if it's there because there's error handling going on, or because that's the only way to achieve a certain goal, since Python was designed to mingle logic with error handling. After doing projects in Go and Rust, I can see the value in separating the two, and that makes me sad that Python is old now.

Maybe what they're planning to do with this is allow wrappers to hide the places where exception handling is gratuitous, and therefore try to bring Python forward into the world of more modern languages.


> Modern languages like Go and Rust separate the error handling from the conventional logic, and that makes the code more readable.

The (result, error) pattern in Go or the Result<T, E> pattern in Rust usually mixes the two. Unless you're doing and_then in Rust, but Go doesn't have anything like that. If anything, I feel like it's exceptions that separate the error handling from the conventional logic. You have your normal code in try, and your error handling in except.


> Modern languages like Go and Rust separate the error handling from the conventional logic, and that makes the code more readable.

This is a very odd claim. Go and Rust are extreme examples of mixing up error handling with conventional logic. There are excellent reasons for doing it that way, but the fact remains that they do. Exceptions, on the other hand, definitely do separate error handling from conventional logic – that's the whole point of them.

I think you just happen to have seen Python code bases where exceptions are caught very close to where they are thrown, but that's a property of the code you read, not the language feature. And presumably you have also seen Rust/Go code bases where errors are often passed back up the stack, which is easy to do but still requires some code (even just a ? in Rust is still an explicit decision) in a way that allowing exceptions to propagate up does not.


You may enjoy programming in Elixir if you like that style. In Elixir, you only program the “happy path” and just let things fail. Then you rely on supervisor processes to handle the exceptions/errors. Well, at least that is the idea. I think people still do tests and function guards and things. but the “let it fail” idea is definitely part of the Erlang/Elixir world.

The sad thing is that there really isn’t any “learn elixir” book that teaches this idiomatic design. A student of Elixir should set up an umbrella application from the very first hello world, in my opinion.


Well, I don't enjoy the "happy path" programming. Admittedly, this implementation to improve speed feels a bit hacky. I only did it because it had a measurable impact on the computational performance of my program. In my other grunt worker scripts, I actually prefer if-elif-else statements because they make code readability better for other programmers who are not Python "natives", but use the scripts or modify them to suit their use cases.


There are some weird performance optimizations in Python, e.g.,

    item = some_dict.get(key)
    if item is None:
        # key does not exist
Versus

    try:
        item = some_dict[key]
    except KeyError:
        # key does not exist
When I tested these (admittedly, a while ago), which one was faster depended on how often the key was missing. If “missing key” was an expected case, the first one was faster. If “missing key” was uncommon, the second was faster. It sounds like the fast path in the second case is getting faster, so this performance gap may be increasing.
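
A rough way to reproduce the comparison (absolute numbers vary by machine and CPython version):

    import timeit

    setup = "d = {i: i for i in range(100)}"
    cases = {
        "get, hit":  "d.get(50)",
        "get, miss": "d.get(-1)",
        "try, hit":  "try:\n    d[50]\nexcept KeyError:\n    pass",
        "try, miss": "try:\n    d[-1]\nexcept KeyError:\n    pass",
    }
    for name, stmt in cases.items():
        print(name, timeit.timeit(stmt, setup=setup, number=1_000_000))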


Fun fact: all those approaches use multiple dict lookups, just of different dicts.

First approach is looking for `get` in `type(some_dict).__dict__` and then for `key` in `some_dict`. Second approach is looking for `key` in `some_dict`, and then (only if missing) for `KeyError` in the module globals/builtins.

If the performance of hash lookups matters, Python is the wrong language for you.
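
(Though when it does matter a little, hoisting the bound method is the usual micro-fix; a sketch:)

    some_dict = {i: i for i in range(1000)}
    get = some_dict.get          # one attribute lookup, not one per call
    for key in range(2000):
        item = get(key)          # half hits, half misses, no exceptions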


> If the performance of hash lookups matters, Python is the wrong language for you.

Announcement to Python programmers: “Don’t bother trying to improve the performance of your Python code! If performance matters, just completely rewrite your code in a different language!”

I don’t know how to respond to that, except to disagree with the underlying assumptions that (1) there is a “right language”, (2) if performance matters, Python is not a suitable language, or (3) people are generally in a position to choose which language a project is written in.

Even if performance matters, it is not the only thing that matters. When you choose a language, there are necessarily tradeoffs... everything from the skillset of your team, to the ecosystem of libraries available affects that decision. Finally, there are projects already written in Python.


I am going to speculate here, so if I'm wrong please point it out.

Here, the number of steps directly affect the time.

In the first approach, the ".get()" method first analyses the type of "some_dict" and then uses an internal variable (the ones surrounded by double underscores) to try and fetch the value by using the provided key. If the key is present, then the value is returned, if not then a default value is returned. So if the key does not exist, the returning the default value saves 1 step (that of fetching the value from the map)

In the second approach, the exception raises the number of steps because the type of error has to be determined and the stack is traced every time an exception is raised. So the more exceptions are raised, the slower the code gets.

I tested this with 3.9.7 right now and in my testing, the runtime of first approach was virtually unchanged, while the second one was faster if exceptions were raised ~12% of the time or less. (I ran both 10 million times)


The tenacity of people getting excited over micro optimizations in Python for more than two decades is remarkable. Nothing has happened despite monumental speed programs that were broadly advertised and marketed to corporations.

Meanwhile, SBCL has an industrial strength compiler that predates Python and its trademark (the SBCL compiler was called "Python" before the trademark, thereby invalidating it). Python (the language) is mainly good at aggressive marketing.


I think the worst part is them not optimizing CPython earlier, which means that people now rely a lot on CPython internals, which means stuff like Pypy isn't compatible with the Python ecosystem. And then you end up with everyone having their Python optimization: Dropbox, Instagram, Pypy.


It's when you put all these small optimizations together that it leads to something remarkable. It's analogous to video codecs: there are a bunch of individual optimizations that alone don't look that impressive, only saving ~1% here or there. But once they all are working together, you see savings of 10-50%.


I think what's surprising is how small these optimizations are. A few proposals sped up Python a lot, but were partially incompatible. Meanwhile OCaml is removing the GIL from the language, adding multicore support, and added features like Flambda in the last decade, all of that while mostly keeping backwards compatibility, not sacrificing single-core performance, and having far fewer people working on it than Python.


Yeah but Python is on the order of 100x slower than "fast" languages. 50% faster would obviously be nice but it's not really going to make Python a fast language.


I'm looking forward to seeing numbers per Python release for what the optimizations add up to taken together. Mark Shannon is going at it with optimizations now.





