
Python really needs much better anonymous functions - Lambda expressions just don't cut it.

To be clear, what Python needs is multiline, inline anonymous functions. In JavaScript/ES2015, the fat arrow function syntax is remarkably powerful in its ability to simplify code. I would go so far as to say that the ES2015 fat arrow syntax completely changed the way my programs are written, increasing power and reducing complexity. I think ES2015 fat arrows are possibly the best programming language feature I know.

Async programming in JavaScript benefits massively from the beautifully terse and powerful anonymous functions in ES2015. Python now has powerful async programming, but no equivalent to the terse ES2015 inline anonymous function syntax.

I raised this issue elsewhere once and someone said "there will never be multiline, inline anonymous functions in Python because it would require brackets". If so then it is a great pity that Python is simply not capable of including a language feature that is IMO absolutely critical.

Also, as mentioned elsewhere in this thread, using the word "lambda" to describe anonymous functions is a really really bad decision. Lambda sounds very deep, very computer sciencey and very complex. Probably they should just be called anonymous functions in keeping with the rest of the industry, which also is not a great name but is much better than "Lambda" functions.

Beginners might think along these lines: "Lambda .... lambda calculus? I don't know calculus. I'm outta here."

Names matter.




Python is fundamentally flawed, since it's based on lines (and significant indentation), and above all since it's based on statements, and not only on expressions.

Lambda expressions look like a sore spot in Python, because Python puts so much stress on statements and procedural programming.

IIRC, a paper came through Hacker News recently that had beotians write programs, and only ~10% (or less, IIRC) of their utterances were procedural statements!


I've seen people make the distinction between statements and expressions in the past, but I don't get the significance. I mean I understand what they are but I don't understand why people view the concept as such a big deal.

Are there any resources explaining this which you would recommend?


Expressions can be nested arbitrarily and infinitely, like in math. You can take “x * x” and “3 * x” and add them into “x*x + 3*x”. You can pass that to a function like “log(x*x + 3*x)”. You could then divide that, square root it, on and on. You can combine the expressions however you want.

Statements can also be combined, but in a separate way. You have control flow blocks, and inside of those you have one serial list of statements. Control flow blocks and statements can only go in other control flow blocks, and not in expressions.
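
To make the distinction concrete in Python terms, here is a minimal sketch. The names `y`, `tmp`, `x_expr` and `x_stmt` are made up for illustration. A conditional expression is a value and can be nested inside a call, while an `if` statement can only live in a block:

```python
import math

y = 12

# Expression form: the conditional is a value, so it nests inside a call.
x_expr = math.sqrt((y + 4) if y > 10 else (y * 2))

# Statement form: the same logic needs a block and a temporary name.
if y > 10:
    tmp = y + 4
else:
    tmp = y * 2
x_stmt = math.sqrt(tmp)
```

Both forms compute the same number; the difference is only in where the logic is allowed to appear.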

If you’re coming from languages like C or Python, that might not seem like a huge deal. Why would you want “x = 5” or “while y < 10:” in an expression?

Languages that focus more on expressions have surprisingly simple ways of expressing some things, though.

In Rust, “if” is a full expression, so you can assign its result:

    let x = if y > 10 {
        let a = readLine().parse();
        let b = readLine().parse();
        a + b
    } else {
        y * 2
    };

Another neat feature is that macros can return blocks, and you can still call those macros wherever you like. For example, in the standard library, there is a “try!” macro that checks if a value is successful. It expands basically into:

    if result is successful {
        result.value
    } else {
        return result.error
    }

The “return” there basically causes the error to short-circuit up the stack. In practice, that means that you can use “try!” to simply say “give me the successful value, and if there was an error, just send it up the stack.”

If you wanna get really fancy, languages like Haskell and Elm don’t have side effects, so things like “x = 5” don’t exist. There are only expressions. You can create values that represent doing something (kind of like Redux actions), and there are lots of tools for combining those action values. Because it’s all expressions, you can combine them arbitrarily. If you have a common case for how you combine your effects, you can just make a plain old function for it, because it’s all just expressions.


beotians?


Yes and the fact that they aren’t full closures is a travesty - I can’t tell you how many times a nice dependency injection has been ruined by having to add a bunch of def’s. It bloats code and reduces expressiveness.


Nested functions in Python are "full closures" in every sense I understand it. Is there something I'm missing?

They can read the value of local variables (including arguments) in the parent function, and write to them (using the "nonlocal" statement). If one nested function assigns a new value to a captured variable then, of course, this change is seen by all other nested functions and the enclosing function. Most importantly, if the lifetime of the nested function object is longer than the runtime of the enclosing function (most commonly because you return the nested function object from the enclosing function) then the lifetime of the enclosing function's local variables is extended appropriately.

This applies to both nested functions defined using a "def" statement and nested functions defined using a "lambda" expression (except you can't assign to variables in a lambda expression). I say this because you don't say what you mean by "they", and it's a common misconception that only lambda expressions capture enclosing variables.
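
A small sketch of the behaviour described above. The names `make_counter`, `increment` and `peek` are invented for this example. Both the `def` and the `lambda` capture the same `count`, and the variable outlives the call that created it:

```python
def make_counter():
    count = 0  # local to make_counter, captured by both nested functions

    def increment():
        nonlocal count  # required to rebind the captured variable
        count += 1
        return count

    peek = lambda: count  # the lambda reads the same captured variable

    return increment, peek

# make_counter() has returned, but `count` stays alive through the closures.
increment, peek = make_counter()
increment()
increment()
```

After the two calls, `peek()` sees the value written by `increment()`, because both closures share one variable rather than each holding a copy.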

The only disadvantage of this is that, because name lookup is done at execution time rather than parsing time, all the enclosing variables must be captured and their lifetimes extended even if the nested function doesn't use any of them. In this sense, nested functions are too closurey in Python! This is not really so bad because if you don't need variable capture then you should simply not be using a nested function in the first place; this is best anyway because it means you are sharing one function object between different uses.


Sorry if this wasn't clear - I'm talking about lambdas. I'm basically complaining about a piece you pointed out: you can't assign to variables in a lambda expression, which is very unergonomic IMO. Having to create a def every time I want a function to assign to a variable is annoying. I pretty much want anonymous nested functions.
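
For readers who have not hit this limitation, a minimal sketch follows. The names `make_accumulator`, `add` and `total` are hypothetical. An assignment statement is rejected in a lambda body at compile time, so the `def` plus `nonlocal` form is the usual workaround:

```python
# Compiling source text lets us observe the syntax error without crashing.
try:
    compile("f = lambda: total = 1", "<demo>", "exec")
    lambda_allows_assignment = True
except SyntaxError:
    lambda_allows_assignment = False

def make_accumulator():
    total = 0

    def add(amount):  # a def is needed because the body rebinds `total`
        nonlocal total
        total += amount
        return total

    return add

add = make_accumulator()
add(5)
```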


This is the number one thing that makes it difficult for me to want to switch from JavaScript to Python despite Python otherwise lining up almost perfectly with my programming/syntax preferences.

JavaScript has things like TypeScript (and Babel in general) that add features that are "missing" from the language. Is there anything similar for Python and anonymous functions?


FWIW, in many languages, "lambda expressions" and "anonymous functions" are two similar but different things. Specifically, a lambda expression is a shorthand syntax for writing anonymous functions whose body consists of a single expression.

Which is pretty much what you get in Python.


Lambdas aren't single expression functions in Lisp.


Haha, I had assumed someone would mention that.

But most Lisps are pretty rough-and-ready about these sorts of things. IIRC, Scheme requires the body of the lambda to be a single expression, doesn't it? And then you just cheat your way out of it with (begin ...).


If it did it would be annoying to type progn all the time and someone would make a macro...


> IIRC, Scheme requires the body of the lambda to be a single expression, doesn't it?

No, it doesn't require it.


It used to, though. From the original Scheme paper: Note that in SCHEME, unlike MacLISP, the body of a LAMBDA expression is not an implicit PROGN.


Changed in the early/mid 80s I think. AI Memo 848 (Revised Revised Report) from 1985 has it already as an 'essential special form': (lambda (var1 ...) expr1 expr2 ...)

R1RS does not permit it. And: In the R1RS syntax BNF there is a (DEFINE (<identifier> <identifier list>) <form>) with one form. But later in the document it is a '<form list>'... Then the LAMBDA syntax has a BODY, but it is only a FORM...


Which languages?


C# offers three different ways to denote an anonymous function: anonymous delegates, lambda expressions, and statement-bodied lambdas. Each of them is a true closure and can be used to construct a "method" of a given delegate type (i.e. any one of them can be used as a value for a given variable; interestingly, in any case, an anonymous method must be assigned to a variable (or parameter) before you can use it, unlike, say, the anonymous functions of Lisp, JavaScript, F#, or even the criminally underrated VB).

But they have some important differences:

1) Anonymous delegates (introduced in C# 2.0) allow you to elide the formal parameters when defining a method of a given type if the body of the method doesn't use those parameters. Even though this construct is basically obsolete, I still use it just for this feature.

2) Lambda expressions can be automatically transformed into expression trees, which means that you can use them to construct (awesomely powerful) fexprs[1]. But, they can't contain statements, which means you can't use them for any kind of imperative code. These are the most equivalent to Python's lambda as far as that limitation is concerned.

3) Statement-bodied lambdas can contain statements (as the name implies), but the language won't convert them into expression trees for you (you can do it manually, which is tedious but not at all hard). At the time that the lambda expression feature was introduced (as a member of the constellation of features that made up LINQ), the framework didn't even contain expression classes that corresponded to most of the imperative constructs, but that changed with the advent of C# 4.0 which implemented a meta-object protocol[2].

Another implementation detail (last time I checked, which has been a while) is that anonymous delegates get turned into instances of System.MulticastDelegate, whereas a lambda expression either gets turned into a class with a method that corresponds to the lambda expression (and fields that correspond to closed-over variables), or into an expression tree, depending on the class of the variable to which it's being assigned.

[1] https://en.wikipedia.org/wiki/Fexpr

[2] https://en.wikipedia.org/wiki/Metaobject


Always heard them called lambda expressions in C++.


Agree. Without multiline lambdas in JavaScript, promises would never have been invented.

The braces also make it easier to find the start and end of an anonymous function, even if it's just a one-liner, compared to Python's, where it's only delimited by : and ,


How exactly does not naming your functions make your code more powerful and less complex?


Why does it need that, though? What's wrong with just naming the function?


It's not so much naming the function, but that doing so moves it outside the flow of the logic it is participating in. That is particularly confusing when the function is going to reference local context. There are definitely times when extracting out a function and naming it is the right thing to do. But there are an equal number of times when keeping all the local context together is better. It's probably hard to appreciate this unless you spend some time in languages where closures are seamlessly integrated, like Groovy.


I've used Lisp extensively and I've used closures extensively. In Python you can simply define the function on the line above where you plan to use it. I really don't see the need. Lisp needs lambda because you can't just randomly introduce new variables everywhere (you have to use flet, which actually is a lambda).
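
As a rough illustration of "define it on the line above", here is a sketch with made-up data. The names `summarize`, `order_total` and the order fields are assumptions, not from any real codebase. The helper has a multi-statement body, which a lambda could not hold, yet it still sits right next to its single use:

```python
def summarize(orders):
    # Named helper defined directly above the one place it is used.
    def order_total(order):
        subtotal = sum(price * qty for price, qty in order["items"])
        return subtotal - order.get("discount", 0)

    return sorted(orders, key=order_total)

orders = [
    {"id": "a", "items": [(10, 2)], "discount": 5},  # total 15
    {"id": "b", "items": [(3, 1), (4, 1)]},          # total 7
]
ranked = [o["id"] for o in summarize(orders)]
```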


I find it tremendously helpful to move the function out of the flow of logic so it is modularized. When I read Scala or Haskell code where, as soon as you see map or filter or something, you know you’re gonna get some crazy anonymous function to follow, I get sad because it’s a miserable and confusing way to write code. I would rather pull the function out into a separate definition with documentation, and then have the map or filter part be extremely concise, using only predefined functions.


> I get sad because it’s a miserable and confusing way to write code

you just have to visualize it as a tree


Yes, that is a confusing way to write and read code, rather than a linear flow, like flattening the tree by extracting functions into separate definitions.


> Yes, that is a confusing way to write and read code, rather than a linear flow, like flattening the tree by extracting functions into separate definitions.

I don't understand. It's harder to flatten the tree: you have to unflatten it in your mind afterwards to understand what's happening. It's easier to just visualize, say, this lisp function as a tree directly.

    (defun good-enough-p (guess x)
      (format t "~% Guess =~7,4f     Guess^2 = ~7,4f    Error= ~7,4f" guess 
              (* guess guess) (abs (- (* guess guess) x)))
      (< (abs (- (* guess guess) x)) .001))


There's real readability and clarity to positioning an anonymous function right at the point in the code that it will be used.

When you name and define a function some other place and use it elsewhere, you have increased the overall complexity. What is this named function? Where is it used? Is it accessed by other bits of code? Should it be? All these questions come up when you see a named function. It is in fact less readable and requires more code to define a named function elsewhere and then call it. Inline terse anonymous functions are more straightforward, self explanatory and simple.

It is often the case that you know an anonymous function will only ever be used once, in this particular bit of code so it is much more clean, readable and understandable if the anonymous function lives right there.

Especially valuable in reducing the complexity of async programming.

It's hard to make a strong case just in words, but once you really understand the power of the fat arrow syntax for anonymous inline functions then you use it more and more and it becomes second nature and the programs you write have a completely different style, oriented towards neatly positioned inline anonymous functions everywhere that typically can be read and understood at a glance as you skip through the code.

There is simply no other way to write a function that absolutely cannot be called by some other bit of code. That's the problem with functions that are defined outside their usage context - you may know for sure that the function will only ever be called from this one point in your code, but if you have to declare it as a named function elsewhere then there is always the possibility that in the future someone will come along and make use of that function - it is more complex in the immediate term and leaves open the door for increasing code complexity in the longer term. Terse inline anonymous functions such as the fat arrow syntax solve this perfectly. They are so clear and easy to understand that it's actually rather beautiful.

This is how terse you can make a JavaScript fat arrow function:

  x => x

It is a function that takes a parameter x and returns x. A more useful example might be:

  message => console.log('new message: ', message)
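
For comparison, the closest Python spellings of those two arrows look roughly like this. The names `identity` and `log_message` are chosen here only so the lambdas can be called; this works only because each body is a single expression:

```python
# Python counterpart of `x => x`.
identity = lambda x: x

# print() is a function in Python 3, so it is allowed in a lambda body.
log_message = lambda message: print("new message:", message)
```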

My primary languages are Python and JavaScript .... and coming first and primarily from Python, I know that the Python community feels it has a monopoly on readability. But having gained a fair amount of experience in JavaScript too, I can now say that things like inline anonymous functions and the extremely terse fat arrow syntax greatly increase readability even further. If Python had both then it would be even more readable, terse and powerful. Add destructuring on top of that and programming in Python would be almost a new experience.


I actually think that positioning an anonymous function at the site where it’s used is extremely unreadable and unintuitive. I work in Scala and Haskell a lot so I see it all the time and I’m usually forced to write code that way to stick to local conventions, and I despise it.

It makes no sense at all to break up the wonderful conceptual flow of functional components like map, fmap, filters, folds, monadic operations, with sudden harsh definition of a function that takes your brain out of the context of what was flowing and into the context of what is the function.

I want to see something like

    val someStat = myData.groupBy(thingExtractor)
      .foldLeft(baselineValue)(statAccumulator)

I want the mental task of grokking thingExtractor, baselineValue or statAccumulator to be a wholly separate task from seeing what flows into the calculation of someStat. I want to look elsewhere in the code for those things, kind of like expanding or collapsing a block of text: keep it out of mind when it doesn’t matter.


It comes down to a preference thing.

I want to read code like an essay. I don't want to have to jump to a function definition to figure out what `thingExtractor` does. To me, that's like taking a book and rearranging the chapters in alphabetical order. No, put them in chronological order. Don't define a function three "chapters" ago in your code and then expect me to remember it if you're only going to use it once. I wouldn't read Ikea instructions that way, why should I read code that way?

But again, personal preference. And you don't have to choose either/or -- I love that Javascript's anonymous and named functions are so similar because it means you can switch between the two styles with basically zero cognitive overhead.

I will say, having worked in large enterprise before, I have seen bugs come out of people not inlining code and instead attaching it to classes or leaving it otherwise accessible. Specifically, people will use functions that are intended to be method-specific in other places. Even if those functions are well-written and don't have side-effects (which is often not the case), the original author still doesn't know that other people are now depending on them as an API.

So later on the code gets refactored and those methods get changed or removed, and suddenly you have a broken build. That's something you can partially mitigate by making a function private (although not entirely, because I've also seen it happen in code within the same class/scope). Python doesn't have true privates, but you can also mitigate that problem by just using code reviews to force people to respect `_`. In practice, I often found that it was easier to make the function anonymous, so everyone knew 100% that it was only being used in exactly one place.


I also work in an enterprise situation and very many bugs that I deal with from legacy code and others’ code come directly from in-lining anonymous functions.

Many bugs have to do with utilizing closures to access variables needed in the function body, which then make refactoring harder and make modifying unit tests harder.

This can be even worse in languages like Python, where there is a distinction between early binding and late binding. You have to be aware whether the name you capture by closure may be associated with different data during its lifetime or will never change, because rebinding the name changes what the anonymous function sees at the different times it is called. The classic example is defining functions in a loop that use the loop variable by closure, then being surprised when every one of the functions refers only to the final value of the loop variable.
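
The loop case can be sketched in a few lines. The function names `make_late` and `make_early` are invented for this illustration. The default-argument trick in the second function is one common workaround, not the only one:

```python
def make_late():
    funcs = []
    for i in range(3):
        # Each lambda looks up `i` when it is called, after the loop ended.
        funcs.append(lambda: i)
    return funcs

def make_early():
    funcs = []
    for i in range(3):
        # A default argument is evaluated at definition time, freezing `i`.
        funcs.append(lambda i=i: i)
    return funcs

late_results = [f() for f in make_late()]
early_results = [f() for f in make_early()]
```

Here `late_results` holds the final loop value three times, while `early_results` holds each value the loop variable had when the lambda was created.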

Even in statically typed languages, this makes things much harder to reason about than they should be. On the other hand, making an anonymous function that accepts many arguments for all the data it would try to access by closure is stupid: those arguments need to be documented, and it’s just so much cleaner and maintainable to do that with a regular function definition, which also makes it much clearer what all the conditions are for calling the function.

What’s worse is that these things can be deeply welded into some coding context, like using a flatMap over some TypedPipe in Scala / Scalding, and can result in needing to make in-line functions that are hundreds of lines long with arbitrarily complex function bodies, which then become tied into assumptions about the runtime context you’re embedded in, and then nobody can figure out how to refactor it into a standalone function, so it just grows by attrition to an inline lambda over years and is extremely fragile. Change something seemingly unrelated about the outer context it’s defined in, and suddenly you get unexpected, cryptic compiler errors complaining something’s wrong with the TypedPipe, and you have to dig deeper to understand why it’s related to the anonymous function.

I would say many of the most serious closure-related bugs and bugs related to unrefactorable yet undocumented dependence on an enclosing context that I’ve seen have been largely a direct result of the programming style of in-lining anonymous functions inside functional programming constructs.

I sympathize with your point about “reading code like an essay”, especially because people can be prone to try to use functional programming or overloading the Python data model and operator syntax with cutesy bullshit that they try to pass off as expressiveness.

But I think there’s a middle ground where you think about it not like an essay, but just basic modularity and separation of concerns, and write functions separately except when they are very short and really trivial.


Hrm. I wonder if the differing bugs just come from which one people are doing more.

I often have to fight to get people not to expose huge amounts of state whenever they build a module, so by far the most common bugs that I see are people being too loose with turning things that really shouldn't be modules into very fragile, cumbersome modules that depend on being used only in specific (undocumented) places with specific (undocumented) setup.

If I was in the opposite situation, and everyone I worked with already inlined code all the time, then probably most of the bugs I'd see would be related to people reusing variables, abusing hoisting by making spaghetti references to variables that are defined later in the function, etc... and in that case I could definitely see myself agreeing with you.

I have on occasion wanted the ability to define an anonymous function that didn't inherit variables from the scope that it was defined in. So I'll give you that - I would love for the ability to make an anonymous function that only has access to variables that are explicitly passed in. If I could isolate variables going into a closure as easily as I can isolate variables going out, I suspect a lot of the problems you're talking about would be easy to solve.
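
Python has no anonymous function with an isolated scope either, but something in that spirit can be approximated. This is only a sketch under that assumption, with hypothetical names `_scale` and `scale_all`: a module-level function cannot see the caller's locals, and `functools.partial` hands over exactly the values you choose:

```python
from functools import partial

def _scale(factor, value):
    # Defined at module level, so it cannot see any caller's local variables.
    return factor * value

def scale_all(values):
    factor = 3
    unrelated_state = "not visible to _scale"
    # Only `factor` is handed over, and its current value is bound right here.
    scaler = partial(_scale, factor)
    factor = 100  # rebinding afterwards does not affect `scaler`
    return [scaler(v) for v in values]

scaled = scale_all([1, 2])
```

The cost is that the function body is no longer inline, which is exactly the trade-off being debated in this thread.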


That is a good point that whichever approach displays the most bugs in a given team is likely to just be whatever is the most common approach for that team, by simple base rates.

I’m not sure how we could objectively decide if either of these two approaches is definitively better, but in the specific, restricted case of a framework like Scalding, I’d strongly wager that avoiding in-lining ends up better in the long run. Those cases also have little connection to the weak module design issue you brought up, since it’s usually a module with de facto map-reduce boilerplate and then just a few isolated places with any actual implementation, and when those parts are expressed as huge in-lined anonymous functions inside Scalding data type wrappers, I know right away it’s a bad code smell.


> I’m not sure how we could objectively decide if either of these two approaches is definitively better

The correct approach might just be the opposite of whatever you and your team's predilection is. In your case, you're saying that the teams you work with are using inline functions instead of following a restricted framework, in part because they're using languages that encourage them to just slap a bunch of nested code in instead of writing out the extra boilerplate.

Well, they probably already know to be careful about module design -- so if you encourage them to inline less code, odds are pretty good you won't suddenly wake up in the morning with a codebase with a hundred classes and a bunch of obscure private/public methods named `setupEntityExtractorForCollisionPart3`.

On the other hand, if your team is coming from a Java background and half of them are starting from the position that lambdas are just witchcraft, then it's probably not a bad idea to get them over that fear.

In a setting where everyone is inlining most of their code, probably the tests that are coming out are all integration tests, so... yeah, bias towards creating units so you can unit test. In the opposite situation, I'm actually just trying to get people to stop testing private methods and leaking implementation details into their tests. So I would love if people were testing with a little less granularity.

I know that my initial reaction to you listing off the problems you've run into with people building anonymous functions that couldn't be refactored was just, "yeah, but why the heck would anyone make that mistake? How hard is it to organize the variables in one function?" So I assume that other people might listen to my complaints about improper code reuse and think, "yeah, but why the heck would anyone ever just reuse a method in a class without checking the documentation first?" So my takeaway from that is, "different people struggle with different things."


Python has a convention for "private" functions, which is to prepend "_" to the name.

> It's hard to make a strong case just in words, but once you really understand the power of the fat arrow syntax for anonymous inline functions then you use it more and more and it becomes second nature and the programs you write have a completely different style, oriented towards neatly positioned inline anonymous functions everywhere that typically can be read and understood at a glance as you skip through the code.

This is exactly why I think Python has such limited anonymous functions: because it does not want to promote that style.

> There is simply no other way to write a function that absolutely cannot be called by some other bit of code.

Python also believes that you do not need to defend against this possibility, as long as you make it clear to other developers that they should not call your "private function". This is also why the language allows monkey patching anything at will. If someone wants to ignore good sense, let them.
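
A short sketch of both points, using a made-up `Parser` class: the underscore is only a convention, and any attribute can be replaced at runtime.

```python
class Parser:
    def parse(self, text):
        return self._normalize(text).split()

    def _normalize(self, text):  # leading underscore: "please don't call me"
        return text.strip().lower()

parser = Parser()

# Nothing stops an outside caller from using the "private" method...
direct = parser._normalize("  Hello ")

# ...or from monkey patching it on the class at runtime.
Parser._normalize = lambda self, text: text.upper()
patched = parser.parse("a b")
```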


Private functions do not restrict the usage of that function to one point in the code. There's a world of difference between a coding convention - essentially a comment to say "please don't access this function from outside the current class" versus it simply being impossible to do so. And the "private" convention does not say "don't use this function at any other point except one", it says "please don't access this function from outside the current class", which is completely different to the point I make. Anonymous inline functions reduce lines of code and complexity.

> because it does not want to promote that style.

Why would Python not want to promote a more readable, simple, powerful, reliable and maintainable programming style?

Can you reference anything to back this assertion?


The parent comment about private naming convention is a red herring.

99% of the time in places where you might use a multi-line anonymous function you're already inside another function

So when you define your named function in Python, it is only available within the scope of the current method call and there's no danger of it being abused by other developers.
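
To illustrate the scoping claim, a minimal sketch with hypothetical names `process` and `double`: the nested function is a local of `process`, so no other code can reach it by name.

```python
def process(items):
    def double(item):  # a local name, visible only inside process()
        return item * 2

    return [double(item) for item in items]

result = process([1, 2, 3])

# The helper never becomes a module-level name.
helper_leaked = "double" in globals()
```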

The lack of true private vars in python is a separate issue, and one of the best things about the language (because library authors inevitably get carried away and mark things as private which don't need to be - then you're stuck in copy & paste land... I did loads of ActionScript back in the day and this was super frustrating)


As a user of JavaScript and ES6, I find fat arrows are only necessary if you haven't started using async and await syntax, and I found myself almost never using fat arrows after moving to async/await.

Indeed naming my functions resulted in more testable and clearer code.


Hmmmm. I don't see the relationship between async/await and fat arrows.

Can you explain more about how async/await obviates the need for fat arrows?


Async on its own is callback-based.

You don't need callbacks with async/await. No callbacks, no need for fat arrow.
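
The same contrast exists in Python's asyncio. The following is a sketch with invented coroutine names (`fetch_value`, `with_callback`, `with_await`), where `fetch_value` merely stands in for real I/O:

```python
import asyncio

async def fetch_value():
    # Stand-in for real I/O such as a network request.
    await asyncio.sleep(0)
    return 42

async def with_callback():
    results = []
    task = asyncio.create_task(fetch_value())
    # Callback style: the follow-up logic lives in an anonymous function.
    task.add_done_callback(lambda t: results.append(t.result() + 1))
    await task
    await asyncio.sleep(0)  # yield once more so the callback has surely run
    return results

async def with_await():
    # await style: the follow-up logic is simply the next line.
    value = await fetch_value()
    return [value + 1]

callback_result = asyncio.run(with_callback())
await_result = asyncio.run(with_await())
```

Both coroutines produce the same result; only the callback version needs a function object to hold the continuation.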


Interesting idea.


Getting rid of fat arrows is one thing, but I use async/await pretty much all the time, and I still use anonymous functions in my code.

  await (async function () {
    //Do a thing
  }());

I don't think async changes anything about how useful or problematic an anonymous function is.


That feels like a weird pattern (for one because you can have async fat arrows), but also because in general I find

    await request_some_data()

much clearer than

    await (async () => {
        // just make the async request inline
    })()


When arrow functions were first released, they didn't support the `async` keyword. I guess that wasn't fixed until ES2017? Maybe earlier - I think Babel would compile them pretty early on. Regardless, some people who adopted async patterns early got into the habit of avoiding arrow functions entirely, so it wouldn't be crazy to me to hear someone say, "I used arrow functions before I learned async."

Otherwise, this is just restating the same arguments people already made above.

There's no practical reason why

  (() => {
  
  })();

would be useful but

  await (async () => {
  
  })();

wouldn't be. The scenario you're talking about where async code removes the need to have an anonymous function only makes sense if you were only using anonymous functions for callbacks. In practice, that's not the only (or even primary) use case for people who advocate inlining code in Javascript. They're specifically advising against the pattern where single-use functions are given names and removed from program flow.


Correct, and I fundamentally disagree with that pattern. If it's more than a single expression, it's doing something complex enough to deserve a name or to be mocked in tests, etc.


While I can see async/await replacing a lot of the most common use in JS, anonymous functions have a lot more uses than that (passing a single use function to generic algorithm, like an ordering function to a sorting routine, is one of the more common other uses, across languages.)
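
In Python this is probably the most familiar home for a lambda. A tiny sketch with made-up data, sorting by length and then alphabetically:

```python
words = ["pear", "fig", "banana", "kiwi"]

# The key function is used once, right here, so it stays anonymous.
by_length = sorted(words, key=lambda w: (len(w), w))
```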


Come on. You've defined a function inside another function. It's very clear that you've done it so you can pass it to some higher level function. If someone comes along and starts using your inner function willy nilly in the outer function without anyone noticing then you've got much bigger problems that multiline lambdas are not going to help you with.


> Python also believes that you do not need to defend against this possibility, as long as you make it clear to other developers that they should not call your "private function".

Python has apparently never worked in a large, corporate environment with a multinational team.

Yes, in theory you can just not call private methods. And in my own personal code I don't care much about restricting variable access, because I trust myself. However, it is a tremendous amount of work to get a company culture to start respecting privates if they aren't already. I've had so many phone calls trying to explain why "yes, you can technically call this function, but in reality you can't, and I don't care that your deadline is tomorrow, you still can't. And yes, I know that technically you're on a separate team and not under my jurisdiction, but you can't just remove me from the code review and call all of my library's private functions anyway."

It's taking the path of most resistance. True privates are very helpful on large teams.


You do realise you can define a function inside another function in python, right?


So if I understand correctly, in Javascript a fat arrow is just the syntax sugar for:

    var x = function(x){ return x };


It also changes the behavior of this. [0]

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


The two biggest differences iirc:

- it can never be named

- it inherits `this` lexically from its parent scope (which imo is the sane default)



