Overload functions in Python (arpitbhayani.me)
69 points by arpitbbhayani on Feb 17, 2020 | 61 comments

This is pretty neat, and shows off some relatively advanced features of the language.

And while it's fine for a personal project or in a specific context, please don't do this in a regular Python project, especially in a professional setting. You'll just confuse new hires, slow down execution time, and make the code more difficult to reason about and debug.

It's a funny thing:

Advocates of dynamic languages tend to claim that the flexibility they offer — dynamic duck typing, dynamic dispatch, runtime reflection, eval — is a major advantage.

And yet every time someone actually tries to meaningfully use those features, they say ‘why would you do that, it's too confusing’ and tell people to stick to writing code that's just as easily expressed in a statically-typed, statically-dispatched, AOT-compiled language, while still paying the costs of their environment supporting those features.

If you're going to write Python like C, why even bother?

Just because you can, doesn't mean you should.

A large part of Python's popularity is due to the fact that there's a reasonably well defined 'pythonic' way to do things, that everyone can learn and then have a decent experience using and reading code produced by others.

You can implement fancy operators, overloading, entire DSLs in Python; but by doing so you break the pythonic contract and make your creation stand alone with a separate learning curve. There are some valid reasons to do this, especially for bespoke in-house tooling, but open source modules intended for mass use have virtually no justification to deviate from the primitives which the entire community is used to.

> Just because you can, doesn't mean you should.

I think this is very much true, but actually I disagree with you when it comes to OSS. For example, Django makes heavy use of metaclasses in order to simplify its API, and I think that's fine, because no junior developer realistically needs to contribute to such a project. They can work on a project which uses Django without needing to understand the internals.

Having said that, I was only introduced to SQLAlchemy a couple of years ago, when already pretty competent at Python. Their filter syntax (ab)uses __eq__ to allow you to write expressions such as `MyModel.my_field == 'query'` which return an expression which can be evaluated dynamically when applied to a SQL query. I did a double take when first looking at this, assuming it at first to be a typo. I then ended up digging into the internals of SQLAlchemy to find out how it all fits together. The upshot was that I explored the SQLA API in great detail. The downside is I spent a few hours doing it :D
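For anyone who hasn't seen the trick, here is a minimal sketch of how overriding `__eq__` can return an expression object instead of a bool. The `Column` and `Expression` classes are hypothetical stand-ins made up for illustration, not SQLAlchemy's actual internals.

```python
class Expression:
    """Hypothetical stand-in for a SQL expression node (not SQLAlchemy's real class)."""

    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value

    def to_sql(self):
        # Return a parameterised fragment plus its bound value.
        return f"{self.column} {self.op} ?", (self.value,)


class Column:
    """Hypothetical column object; comparing it builds an Expression."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Instead of answering True/False, describe the comparison.
        return Expression(self.name, "=", other)

    # Defining __eq__ sets __hash__ to None, so restore identity hashing.
    __hash__ = object.__hash__


my_field = Column("my_field")
expr = my_field == "query"   # an Expression, not a bool
print(expr.to_sql())         # ('my_field = ?', ('query',))
```

The comparison is only recorded here; something else has to turn it into a query later, which is roughly why the syntax looks like a typo at first glance.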

If all we do is write Pythonic code (especially now that "Pythonic" seems to include type hints), what's the benefit of the highly dynamic CPython virtual machine?

Surely a faster VM, or even an ahead-of-time compiler, would be possible if we give up on some dynamism? Is that a direction the community should take?

(I think Guido's answer would be no, based on his apparent dislike of existing "Python compiler" projects such as Nuitka.)

I have never heard anyone claim that eval is a good thing.

I use dynamic duck typing and runtime reflection in some places in the Python I write.

For instance, I might attach an extra attribute to an object and use that later on - e.g. a request object that lives for the duration of the request and is discarded later on.

Or rewrite a certain function into a loop that goes over the attributes and does the same thing to each of them.

But I could live without them, at the cost of some contortions.

I think the really big benefit is not having to spec out unimportant infrastructure between functions in a module. The lack of a spec makes it easier to keep local things local.

eval(/exec) is how dataclasses (and namedtuples before them) work: https://github.com/python/cpython/blob/4c1b6a6f4fc46add0097e... . It is a big gun, but sometimes (usually deep within lib code) big guns are the right answer. Don't use reflection or runtime bytecode emission in Java.. unless you have to. Don't drop to inline asm in C.. unless you have to. And so on.
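A rough sketch of the general idea, assuming a hypothetical `make_init` helper: build the source text of an `__init__` and `exec` it. This is far simpler than what dataclasses actually does, and it assumes at least one field and trusted field names.

```python
def make_init(field_names):
    """Build an __init__ from generated source text (illustrative helper only)."""
    args = ", ".join(field_names)
    body = "\n".join(f"    self.{name} = {name}" for name in field_names)
    source = f"def __init__(self, {args}):\n{body}\n"
    namespace = {}
    exec(source, {}, namespace)   # defines __init__ inside `namespace`
    return namespace["__init__"]


class Point:
    __init__ = make_init(["x", "y"])


p = Point(1, 2)
print(p.x, p.y)   # 1 2
```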

And, of course, if it _is_ the right decision to use such a tool be aware of just how easy it is to use big guns wrong. Even this very article's 'just do it' tone seems to convey a lack of respect for decorators, what I'd consider a 'medium gun' in python. So many intermediate python programmers write decorators that don't properly interoperate with the descriptor protocol and thus either fail to work on instancemethods (as here: https://repl.it/repls/ThreadbareCurlyKeyboard ) or hardcode a 'self' arg in their wrapper and thus don't work on global functions. I'm fine with simplifying it for an article but for production code this is a pet peeve of mine :p
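To make the descriptor point concrete, here is a sketch of a class-based decorator that works on both plain functions and methods. `Logged` is a made-up name; the part that matters is `__get__`, without which the decorated method would never receive `self`.

```python
import functools


class Logged:
    """Hypothetical class-based decorator that counts calls."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        # Descriptor protocol: bind the instance when accessed as a method.
        if instance is None:
            return self
        return functools.partial(self, instance)


@Logged
def double(x):
    return 2 * x


class Greeter:
    def __init__(self, name):
        self.name = name

    @Logged
    def greet(self, greeting):
        return f"{greeting}, {self.name}"


print(double(21))                   # 42
print(Greeter("Ada").greet("Hi"))   # Hi, Ada
```

A decorator written as a nested function gets this binding for free, because functions are already descriptors; the problem mostly bites class-based wrappers.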

> or hardcode a 'self' arg in their wrapper and thus don't work on global functions. I'm fine with simplifying it for an article but for production code this is a pet peeve of mine :p

To be fair, this is such an easy class of mistake to make that the standard library does it. @functools.lru_cache is bugged for instance methods.
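One commonly raised issue (an assumption about what is meant here) is that the cache lives on the function, so `self` becomes part of the cache key and instances are kept alive by the cache. A small sketch on CPython, with a made-up `Report` class:

```python
import functools
import gc
import weakref


class Report:
    @functools.lru_cache(maxsize=None)
    def total(self, n):
        # `self` is part of the cache key, so the shared cache holds a
        # strong reference to every instance the method was called on.
        return n * 2


report = Report()
report.total(3)
ref = weakref.ref(report)

del report
gc.collect()
still_alive = ref() is not None
print(still_alive)     # the cache still references the instance

Report.total.cache_clear()
gc.collect()
print(ref() is None)   # released once the cache is cleared
```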

I think it really depends on both the language and the language culture. Dynamic dispatch and function overloading are the basic way to program in Elixir: every basic tutorial teaches newcomers how to do it, linters complain if your functions have too many conditionals inside instead of outside, and programs elegantly look like state machines at the top level. On the other hand, Elixir also has macros, but they are discouraged and considered an advanced topic, while in languages like Racket and other Lisps they are usually a prominent tool.

What comes into play is the principle of least surprise. If a feature is frequently used then everyone can identify it, know its effects and understand its limitations, and so it stops being confusing. If it's a feature built for the 1% of libraries that require some special DSL syntax, and you can only find it deep in the manual or in advanced books, then it's probably something you should use very sparingly.

Because when you always take advantage of all that flexibility, you end up with Perl.

Everyone was always so excited to show off their mastery of Perl and do fancy things that it was very hard to maintain, and Perl got a reputation as a "write-only"[1] language.

You should follow the principle of least-surprise, and follow the idioms of the language. It's great to have the flexibility when you absolutely need it, but that should be reserved for rare cases and be very well commented.

[1] https://en.wikipedia.org/wiki/Write-only_language

I don't know about "every time"; it's more that they should be used judiciously. Using them to allow you to have two different functions that perform two different calculations but bear the same name wouldn't qualify as judicious use in my book.

There are similar problems with getting a bit too happy with macros in Lisp, or type-level programming in Haskell, or templates in C++, or self-modifying code in assembly language. In all cases, the principle is the same: In general, you should always prefer the simplest way to get the job done.

I phrase it this way:

It's a lot easier to understand, say, decorators in python than annotations in Java.

The other thing is that a lot of the dynamic languages are getting cool features to support static dynamism really well. Protocols and literal types being my two favorites in python, which allow statically-verifiable duck typing, and argument controlled return types respectively.
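A short sketch of both features, assuming Python 3.8+; the names `SupportsArea`, `describe` and `parse` are invented for the example. The Protocol gives duck typing that a static checker can verify, and the Literal overloads let the argument value pick the return type.

```python
from typing import Literal, Protocol, Union, overload


class SupportsArea(Protocol):
    """Any object with an area() method satisfies this; no inheritance needed."""

    def area(self) -> float: ...


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


def describe(shape: SupportsArea) -> str:
    # A static checker accepts Square here because it has the right method.
    return f"area={shape.area():.1f}"


@overload
def parse(text: str, as_type: Literal["int"]) -> int: ...
@overload
def parse(text: str, as_type: Literal["float"]) -> float: ...
def parse(text: str, as_type: str) -> Union[int, float]:
    # The literal value of as_type tells the checker which return type applies.
    return int(text) if as_type == "int" else float(text)


print(describe(Square(3)))   # area=9.0
print(parse("7", "int"))     # 7
```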

That flexibility is very useful when used sporadically. It becomes unreadable when people use it too much or in the wrong places.

> If you're going to write Python like C, why even bother?

I agree with you, but I suspect a lot of people fall into the "I want to be able to do this, but I never want to see anyone else doing it" camp.

The standard library has something similar: https://docs.python.org/3/library/functools.html#functools.s...

Cython also has had a mechanism for this for a long time. In fact if you wanted multiple dispatch in a new pure C program in 2020, just write it in Cython with no use of the CPython API and have Cython generate the pure C library for you.


Ditto for this; it works very well with type hints. I used it in func-analysis [0], a tool I put together to learn about duck typing.

[0]: https://gitlab.com/Seirdy/func-analysis

SQLAlchemy has this too, for hybrid attributes/expressions.


I thought a consensus had emerged that function overloading was a bad idea for a while now? Even in strongly-typed languages, it pushes that extra bit of cognitive load onto the human reader. It also complicates things for tooling. In loosely typed languages it's hard to see the need. As somebody else mentioned here, variable `*args` and `**kwargs` are the more pythonic way to address such concerns. If you want to have different behaviour for different args you can do this explicitly.

I guess this article is a fun discussion and a nice comparison of language features, maybe I'm taking it too seriously.

> I thought a consensus had emerged that function overloading was a bad idea for a while now?

Hm, really? I didn't get the memo :-) After all, there is no semantic difference between function overloading and varying behavior based on args and kwargs.

You can trivially compile any set of overloaded functions into a single function with a bunch of if/else statements at the top level, without any change to the caller side. Basically, you need either variable arguments or function overloading to support a dynamic API. Nearly all languages support either one, many support both. Why is that a bad idea?

Languages like Elixir encourage heavy function overloading (to the point that some people replace nearly every potential if/case by a bunch of overloaded private functions), and I don't think its ecosystem is hurting for it.

Sidenote: I also really like how TypeScript does function overloading: you can only define one function implementation, but you can give it multiple overloaded typed signatures. This lets you say stuff like "if the first parameter is a string, then the second one must be a number". But you still have to implement that top-level if/else by hand, like you would in JavaScript. It's nice and pragmatic!

I actually really like the kwargs approach. It’s nice and verbose.

I think it depends entirely on whether the overloads perform the same conceptual action. If I have a "send value" function that transmits over an existing network connection, I can have overloads for bool, int, string, etc. Each does the same action, but for a different type.

On the other hand, overloads can do entirely arbitrary actions. If there is no consistency between the different overloads, then there is no reason to have them be named the same thing.

I was looking for a Scott Meyers reference for this earlier, and the one I found matches yours: "use function overloading for type conversion"

On one hand - I agree, overloading isn't great. On the other hand, having a `def func(args, kwargs)` is a pain in the ass for everyone involved (people & IDEs): you have no idea what the args or kwargs could be without reading the source.

If you can get away with just a bunch of named kwargs after the arguments that is fine, but I'd take overloading over the `args, kwargs` garbage any day, even if that is the more "pythonic" way.

I think the kwargs approach is fine for when you're reading your code. When you're writing, I think you'll always have to consult the docs or headers. In a strongly typed language an IDE can pick up the hint, but in a more dynamic language like Python it can get confused.

My bad - I meant this, but didn't know how to format code so that it showed the stars:

    def func(*args, **kwargs)

I am in full support of actual kwargs with names; it's the wildcard ones that I don't like.

by "kwargs" the usual approach is to have optional arguments ("keyword arguments") that aren't just kwargs, but are named optional arguments:

    def func(args, flag=False, flag2=True)

I fully agree that named kwargs is perfectly acceptable, but this is the version I think is horrible:

    def func(*args, **kwargs)

is valid and completely ambiguous: you have no idea what it's doing unless you read the source and track down how it branches out. IDEs are stopped dead in their tracks, as they aren't going to parse the logic to figure it out.

And yes it's not the most common approach, but I have seen it enough times to despise it. This overloading approach removes the need for it so I'm all for that.

> This overloading approach removes the need for it so I'm all for that.

Ish, see if your IDE supports it (hint: it won't, but it will support named kwargs).

And it doesn't really need to. Consider this:

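    from functools import singledispatch
    from typing import Union

    # Fallback implementation; also the entry point callers use.
    @singledispatch
    def myfunc(arg:Union[int, str]):
        print("Called with something else")

    # register() reads the type from the annotation (Python 3.7+).
    @myfunc.register
    def _(arg:int):
        print("Called the one with int")

    @myfunc.register
    def _(arg:str):
        print("Called the one with str")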

Everything works as you'd expect, and the IDE's type hints are accurate as long as you actually put the correct type hint

    def myfunc(arg:Union[int, str]):

Like most things, function overloading can be used for good or bad effect. You could for example implement something like multiple dispatch, which seems to be trending somewhat currently (in e.g. Julia and modern C++).

It adds cognitive load when done badly, but reduces it when done well and consistently. The knee jerk dismissal of different approaches is probably the main reason why programming as an art has been by and large stagnant for decades.

C++ does not have multiple dispatch, though.

Historically in C++ it was considered to be more performant to dispatch function calls on the base type of its arguments, not the concrete types.

Consider for simplicity member functions A::f(x...) in C++ as free functions where the first argument is fixed to be of type A, as in f(A, x...). If you want to emulate multiple dispatch for just 2 arguments f(A, B), you already find yourself in visitor pattern-land.

And the other issue with C++ is that it does not allow open functions. In the visitor pattern you necessarily have to implement f(A, B) for _all_ subtypes of A and B.

In Julia open functions + multiple dispatch make life quite a bit easier than in C++:

    julia> abstract type Animal end
    julia> abstract type Color end

    julia> struct Dog <: Animal end
    julia> struct Cat <: Animal end

    julia> struct Brown <: Color end
    julia> struct White <: Color end

    julia> f(::White, ::Cat) = println("Hello white cat")
    julia> f(::Brown, ::Dog) = println("Hello brown dog")

    julia> for (color, animal) in [(White(), Cat()), (Brown(), Dog())]
            f(color, animal)
           end

    Hello white cat
    Hello brown dog

Note that the array type here is Vector{Tuple{Color,Animal}}, which would correspond to type erasure in C++.
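For comparison, a hand-rolled Python sketch of the same example. The `register` decorator and `_registry` dict are invented for illustration, and unlike Julia this version only matches exact argument types; it does not fall back to supertypes.

```python
class Animal: ...
class Dog(Animal): ...
class Cat(Animal): ...

class Color: ...
class Brown(Color): ...
class White(Color): ...

# Maps a tuple of argument types to an implementation.
_registry = {}


def register(*types):
    def decorator(func):
        _registry[types] = func
        return func
    return decorator


def f(*args):
    # Exact-type lookup on all arguments (no subtype fallback).
    impl = _registry.get(tuple(type(a) for a in args))
    if impl is None:
        names = ", ".join(type(a).__name__ for a in args)
        raise TypeError(f"no method matching f({names})")
    return impl(*args)


@register(White, Cat)
def _(color, animal):
    return "Hello white cat"


@register(Brown, Dog)
def _(color, animal):
    return "Hello brown dog"


for color, animal in [(White(), Cat()), (Brown(), Dog())]:
    print(f(color, animal))
```

Because the registry is just a dict, new combinations can be added from any module, which is the "open functions" part.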

Lastly, the modern C++ version with std::variant + std::visit is probably not really multiple dispatch, since it has a union storage type under the hood.

I tend to think of C++ function templates to have (static) multiple dispatch. With runtime polymorphism you do end up with visitors or some other hackery.

In Julia, function overloading is used instead of object-oriented programming, and it's actually quite a powerful abstraction for mathematical manipulation.

Just as an example, automatic differentiation libraries are written using function overloading (in which the arguments of the overloaded functions are a tuple of a value and its derivative).
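The value-plus-derivative idea can be sketched in Python with operator overloading (forward-mode differentiation with dual numbers). The `Dual` class is a toy written for this comment, covering only `+` and `*`; it is not how any particular Julia library is implemented.

```python
class Dual:
    """Toy dual number: a value together with its derivative."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def poly(x):
    # Ordinary-looking code; works on floats and on Dual alike.
    return 3 * x * x + 2 * x + 1


result = poly(Dual(2.0, 1.0))       # seed dx/dx = 1
print(result.value, result.deriv)   # 17.0 14.0
```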

Another example is CUDA array support. And the best thing is that you can combine these 2 libraries (and many others that take advantage of operator overloading)

concepts/traits/type classes or whatever they are called in your language of choice are all forms of overloading (more specifically ad-hoc polymorphism) and are generally understood to be A Good Thing.

At least in C++, most forms of overloading I see are there to enable generic programming.

Of course nobody stops you from naming functions doing different things with the same name, but then again nobody stops you from naming functions badly in the first place. As usual, with more power comes more responsibility.

Python has idiomatic ways of implementing something like overloading.

- Default parameters, so variable numbers of arguments can be passed.
- Run-time type identification. Though not generally recommended, you can use it for slightly different implementations for different argument types.

This is a neat article, and a nice dive into Python.

I think it also taught me, by prompting an immediate negative gut reaction to the basic idea, about an opinion about language features that I didn't know I had, let alone that I didn't know I had so strongly: I think that I officially believe that function overloading should only be used for two reasons: First, you can overload functions of the same arity to mimic dynamic typing in a static language. A print function that takes many types of argument, for example. Second, you can provide multiple arities to mimic optional arguments in a language that doesn't have them.

But the example in the article, where the overload is for providing two different versions that do different things with their arguments, is not something I'd want to see in real code. There's just too much opportunity for confusion. For example, if I were familiar with `float area(int)` as a function that calculates the area of a circle, and then encountered `area(int, int)`, I would guess that the return value is a float, and that the two ints are now the lengths of the semi-major and semi-minor axes of an ellipse.

And I'm having a hard time coming up with a better example for the article. Perhaps because function overloading just isn't a desirable feature in a language like Python.

I don't miss it a lot in Python except for some cases such as:

    area(Circle circle):

    area(Square square):

You could say "Ah! But that could be something that you define in each class, like square.calculate_area()". Yes, but sometimes you don't have access to the class. You could monkey-patch it but that's not something that I like to do.
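One way to cover the "don't have access to the class" case without monkey-patching is `functools.singledispatch`, registering an implementation per type from the outside. In this sketch `Circle` and `Square` are stand-ins for classes that would normally come from someone else's package.

```python
import math
from functools import singledispatch


# Stand-ins for classes you cannot edit (assumed for the example).
class Circle:
    def __init__(self, radius):
        self.radius = radius


class Square:
    def __init__(self, side):
        self.side = side


@singledispatch
def area(shape):
    # Fallback for types with no registered implementation.
    raise TypeError(f"unsupported shape: {type(shape).__name__}")


@area.register(Circle)
def _(shape):
    return math.pi * shape.radius ** 2


@area.register(Square)
def _(shape):
    return shape.side ** 2


print(area(Square(2)))   # 4
print(area(Circle(1)))
```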

The Pythonic way to deal with that would be to just remember that Python is a dynamic language:

  def area(shape):
      if isinstance(shape, Circle):
          ...  # circle-specific calculation
      if isinstance(shape, Square):
          ...  # square-specific calculation

Or, if you want better coverage from your type checker,

  from typing import overload

  @overload
  def area(shape: Circle): ...

  @overload
  def area(shape: Square): ...

  def area(shape):
      if isinstance(shape, Circle):
          ...  # circle-specific calculation
      if isinstance(shape, Square):
          ...  # square-specific calculation

This still isn't real overloading. The last one is the one and only function. All the other two bits do is tell the static type checker what kinds of arguments it's prepared to support.

edit: Scratch that, I think what I'd really go for in a simpler case like this would just be

  def area(shape: Union[Circle, Square]):

Python including functools.singledispatch I think is a strong indicator that function overloading IS pythonic (or at least pythonic enough to core python developers)

I don't know that I've seen singledispatch used anywhere, and the standard library includes a lot of unpythonic code (and entire modules).

As a simple example, the unittest module is entirely unpythonic.

Singledispatch was mostly included to handle specific cases where singledispatch is very clearly useful (the PEP mentions pprint and copy), but not as a generic tool for common end-user code.

One common use for function overloading is when you need the function to work on classes defined across several different packages.

It's used a lot in R for this reason. You might want to have a function that operates on different models, but there are 60 modeling packages it might be used on.

In languages that have named arguments - Python being one - it's also possible to overload based on that. In effect, they become a part of the function name (kinda like Objective-C).
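Python has no built-in support for this, but it can be sketched by dispatching on the set of keyword names. Everything here (`overload_on_names`, `_overloads`, the two area helpers) is hypothetical and only meant to show the idea.

```python
import math

# Maps a frozenset of keyword names to an implementation.
_overloads = {}


def overload_on_names(*names):
    def decorator(func):
        _overloads[frozenset(names)] = func
        return func
    return decorator


def area(**kwargs):
    # The keyword names act like part of the function's name.
    impl = _overloads.get(frozenset(kwargs))
    if impl is None:
        raise TypeError(f"no overload of area() takes {sorted(kwargs)}")
    return impl(**kwargs)


@overload_on_names("radius")
def _circle_area(radius):
    return math.pi * radius ** 2


@overload_on_names("width", "height")
def _rect_area(width, height):
    return width * height


print(area(width=2, height=3))   # 6
print(area(radius=1))
```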

The Wren scripting language supports this kind of "overloading by arity" [0].

Wren therefore allows overloads such as `range(stop)` and `range(start, stop)`. This is more intuitive than Python's `range(start=0, stop)`, which might be the only function in the language that has an optional parameter before a required one.

[0] http://wren.io/method-calls.html

> which is the only(?) function in the language that has an optional parameter before a required one.

The documentation shows it as being overloaded, rather than having default arguments:

    range(stop) -> range object
    range(start, stop[, step]) -> range object

There's also iter, which is overloaded:

    iter(iterable) -> iterator
    iter(callable, sentinel) -> iterator
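A quick illustration of the two forms of `iter`, using an in-memory stream so it runs anywhere:

```python
import io

# One-argument form: get an iterator from an iterable.
letters = list(iter("abc"))
print(letters)   # ['a', 'b', 'c']

# Two-argument form: call the function repeatedly until it returns the sentinel.
stream = io.BytesIO(b"abcdefghij")
chunks = list(iter(lambda: stream.read(4), b""))
print(chunks)    # [b'abcd', b'efgh', b'ij']
```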

That's the thing - it's documented as overloaded, because that's the most intuitive explanation. Wren would allow it to actually be implemented as an overloaded function.

Python doesn't support overloading, and it doesn't support optional arguments before required ones, so the actual implementation in Python is a bit messy - something like:

  def range(start_or_stop, optional_stop=None):
      if optional_stop is None:
          start = 0
          stop = start_or_stop
      else:
          start = start_or_stop
          stop = optional_stop

there is no optional positional in the implementation. range is overloaded in the raw sense of the term as the implementation checks the number of arguments and their types and does the right thing.

it could have been implemented in pure python as well by using `*args, **kwargs`.
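A sketch of what that might look like. `my_range` is a made-up name, and it simply hands the normalised arguments to the built-in `range` rather than reimplementing it; the point is only the manual argument-count handling.

```python
def my_range(*args):
    """Accept (stop), (start, stop) or (start, stop, step), like range()."""
    if len(args) == 1:
        start, stop, step = 0, args[0], 1
    elif len(args) == 2:
        start, stop = args
        step = 1
    elif len(args) == 3:
        start, stop, step = args
    else:
        raise TypeError(f"my_range expected 1 to 3 arguments, got {len(args)}")
    # Delegate to the built-in once the meaning of each argument is known.
    return range(start, stop, step)


print(list(my_range(3)))          # [0, 1, 2]
print(list(my_range(1, 4)))       # [1, 2, 3]
print(list(my_range(0, 10, 5)))   # [0, 5]
```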

Well, however the `range` function is implemented, IMO its API would be better expressed as true overloading - as two functions with the same name.

I could have sworn I have written range(1,stop) many times in Python. Did I misunderstand your argument, or has my memory gone all sideways?

`range(stop)` and `range(1, stop)` are both supported, but without overloading, the implementation of `range` is messy as it has to work out the meaning of each argument manually.

Why is that a problem? I want the standard library to contain all messy stuff so my code doesn't have to.

From the call site there's no difference between Python's optional-first-argument range() function and a hypothetical overloaded one. Any perceived complexity in usage, therefore, can be fixed with better documentation.

`range` is an example. Lack of support for overloading makes it harder to replicate its API in our own functions.

Ah right, totally misunderstood.

Yep, true. Overloading is nice.

    range(a) means range(start=0, end=a)
    range(a, b) means range(start=a, end=b)

It's weird that an article written about this now has no mention of https://www.python.org/dev/peps/pep-0443/ (skrause mentioned the standard library docs for this https://docs.python.org/3/library/functools.html#functools.s... a few minutes ago https://news.ycombinator.com/item?id=22346433 )

It's especially weird to me, since single dispatch generic functions would do quite a lot of what he shows in the article, without having to build it all from scratch. I mean, if you need more than what the standard library's single dispatch tools will let you have, then sure, build your own, but I definitely echo nickserv's sentiments https://news.ycombinator.com/item?id=22346235 ... This kind of hand-rolled alternative to something in the standard library is something you should only end up doing as a last resort; it's usually more trouble than it's worth. When you do need it, you should be documenting the hell out of not just what it does, but why you had to do it yourself.

Edit for general knowledge sharing reasons: I just noticed the nice update to the built-in functools.singledispatch (https://docs.python.org/3/library/functools.html#functools.s...) that came with Python 3.7: register() now supports registering implementations using type annotations. (functools.singledispatchmethod, the method variant, arrived in 3.8.) I can already think of a few places where I could go back and clean up some code by removing a bunch of now unnecessary "if isinstance(foo, str):" checks.

This is very clever. But function overloading is something out of static language territory. It feels un-Pythonic and needlessly complicated. Especially when compared to `*args` and `**kwargs`.

It feels pythonic to me. One thing that I think brings perspective here is the PEP on singledispatch, which is essentially on function overloading, and is implemented in functools!


It removes the need to build multiple branches inside your args/kwargs-based function to handle the case where this or that parameter is of this or that type. I think it clarifies functions that are otherwise difficult to write, or functions whose results have more in common than their actual internals.

Got to the Namespace portion and couldn't help thinking "Namespaces are one honking great idea—let's do more of those!"

meh.. that kind of code would be on a shortlist for refactoring at first sight.

    area(width=1, height=1)
