authors_of_long_books = set() for book in books: if len(book.pages) > 1000: auth...

tremon · 2025-03-11T17:50:31 1741715431

> also it's natural, clear, and in the right order

That isn't natural to anyone who is not intimately familiar with procedural programming. The language-natural phrasing would be "which of these books have more than a thousand pages? Can you give me their authors?" -- which maps much closer to the parent's LINQ query than to your code.

bdangubic · 2025-03-11T18:01:13 1741716073

> That isn't natural to anyone who is not intimately familiar with procedural programming.

This is not about "procedural programming" - this is exactly how this works mentally. For kicks I just asked my 11-year-old kid to write down the names of all the books behind her desk (20-ish of them) and give me the names of the authors of books that are 200 pages or more. She "procedurally":

1. took a book

2. flipped to last page to see page count

3. wrote the name of the author if the page count was 200 or more

The procedural is natural, it is clear and it is in the right order

skydhash · 2025-03-11T18:10:24 1741716624

That's what happens when you're doing the job, not the mental representation of the solution. I strongly believe that if you asked her to describe the task, she would go:

1. (Take the books)->(that have 200 pages or more)->(and mark down the name of the authors)->(only once)

bdangubic · 2025-03-11T19:18:53 1741720733

I respectfully disagree. And I think one of the core reasons SWEs struggle with the functional style of programming is that it is neither intuitive nor how general-joe-doe’s brain works.

whstl · 2025-03-11T20:29:30 1741724970

I haven't really encountered software engineers who really struggle with functional style in almost 20 years of seeing it in mainstream languages. It's just another tool that one has to learn.

Even the people arguing against functional style are able to understand it.

Strangely, this argument is quite similar to arguments I encounter when someone wants to rid the codebase of all SQL and replace it with ORM calls.

bdangubic · 2025-03-11T22:11:19 1741731079

> Strangely, this argument is quite similar to arguments I encounter when someone wants to rid the codebase of all SQL and replace it with ORM calls.

we must be in completely different worlds cause I have yet (close to 30 years now hacking) to see/hear someone trying to introduce ORM on a project which did not start with the ORM to begin with. the opposite though is a constant, “how do we get rid of ORM” :)

> I haven't really encountered software engineers who really struggle with functional style in almost 20 years. It's just another tool that one has to learn.

I recall vividly when Java 8 came out (not the greatest example but also perhaps not too bad) having to explain over and over the concept of flatMap (wut is that fucking thing?) or even zipping two collections. even to this day I see a whole lot of devs (across several teams I work with) doing “procedural” handling of collections in for loops etc…

whstl · 2025-03-11T22:28:34 1741732114

I'm more talking about projects that do start with an ORM, but have judicious (and correct) usage of inline SQL for certain parts. It's not uncommon to see developers spending weeks refactoring into an ORM-mess.

The argument is always that "junior developers won't know SQL".

But yeah I've also seen the opposite happening once. People going gung-ho on deleting all ORM code "because there's so much SQL already, why do we need an ORM then".

And then the argument is that "everyone knows SQL, the ORM is niche".

I guess it's a phase that all devs go through in the middle of their careers. They see a hammer and a screwdriver in a toolbox, and feel the need for throwing one away because "who needs more than one tool"...

tvier · 2025-03-11T18:05:15 1741716315

You are describing how to execute the procedure, while the gp is describing what the result should be. Both are valuable, but they're very different.

My personal take is that "how to execute" is more useful for lower-level and finer-grained control, while "what the results should be" is better for wrangling complex logic.

dambi0 · 2025-03-12T08:01:03 1741766463

Your daughter may have implemented it procedurally but your description of the task was functional.

syklemil · 2025-03-11T17:10:26 1741713026

fwiw, once Python's introduced there's the third option on the table, comprehensions, which will also be suggested by linters to avoid lambdas:

    authors_of_long_books: set[Author] = {book.author for book in books if book.page_count > 1000}

These are somewhat contentious as they can get overly complex, but for this case it should be small & clear enough for any Python programmer.
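For anyone who wants to run the comparison, here is a minimal sketch. The `Author` and `Book` dataclasses and the sample data are assumptions for illustration, not something from the thread. It just shows that the loop and the comprehension build the same set:

```python
from dataclasses import dataclass


# Hypothetical model, only here to make the snippet runnable.
@dataclass(frozen=True)
class Author:
    name: str


@dataclass(frozen=True)
class Book:
    title: str
    author: Author
    page_count: int


books = [
    Book("A", Author("Ann"), 1200),
    Book("B", Author("Bob"), 300),
    Book("C", Author("Ann"), 1500),
]

# Loop version: start with an empty set and add matching authors.
authors_loop = set()
for book in books:
    if book.page_count > 1000:
        authors_loop.add(book.author)

# Comprehension version: one expression, same result.
authors_comp = {book.author for book in books if book.page_count > 1000}
```

Both sets end up containing only Ann, who appears once even though she wrote two long books.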

itsmeknt · 2025-03-11T19:13:49 1741720429

I tried scaling up the original into an intentionally convoluted, nonsensical problem to see what a more complicated solution would look like for each approach. Do these look right? And which seems the most readable?

  # Functional approach
  
  var favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = books
    .filter(book => 
       book.pageCount > 100 and 
       book.language == "Chinese" and 
       book.subject == "History" and
       book.author.mentions > 10_000
    )
    .flatMap(book => book.author.pets)
    .filter(pet => pet.is_furry)
    .map(pet => pet.favoriteFood)
    .distinct()

  # Procedural approach
  
  favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = set()
  for book in books:
    if (book.pageCount > 100 and
        book.language == "Chinese" and
        book.subject == "History" and
        book.author.mentions > 10_000):
      for pet in book.author.pets:
        if pet.is_furry:
          favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory.add(pet.favoriteFood)

  # Comprehension approach
  
  favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = {
    pet.favoriteFood
    for book in books
    if (book.pageCount > 100 and
        book.language == "Chinese" and
        book.subject == "History" and
        book.author.mentions > 10_000)
    for pet in book.author.pets
    if pet.is_furry
  }

FWIW, for more complex problems, I think the second one is the most readable.

syklemil · 2025-03-11T19:56:45 1741723005

I'm more partial to the first one because it keeps a linear flow downwards, and a uniform structure. The second one kind of drifts off, and reshuffling parts of it is going to be … annoying. IME the dot style lends itself much better to restructuring.

Depending on language you might also have some `.flat_map` option available to drop the `.reduce`.

itsmeknt · 2025-03-11T20:41:42 1741725702

True! Good point on the restructuring, I haven't thought about it in that way.

I think I like the second approach because the loop behavior seems clearest, which helps me analyze the time complexity or when I want to skim the code quickly.

A syntax like something below would be perfect for me if it existed:

  var favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = books[i].author.pets[j].favoriteFood.distinct()
    where i = pagecount > 100,
              language == "Chinese",
              subject == "History",
              author.mentions > 10_000
    where j = is_furry == True

dahauns · 2025-03-13T00:46:05 1741826765

Hm, LINQ query syntax form is kinda going in that direction

  (from book in books
   where book.pagecount > 100 
        && book.language == "Chinese"
        && book.subject == "History"
        && book.author.mentions > 10_000
   from pet in book.author.pets
   where pet.is_furry == true
   select pet.favoriteFood)
  .Distinct()

But it also demonstrates the...erm, chronic "halfassedness" of LINQ's query syntax form with distinct() not available there and having to fall back to method syntax form anyway...

syklemil · 2025-03-11T23:50:15 1741737015

You would likely approach it in any style with some helper functions once whatever's in the parentheses or ifs starts feeling big. E.g. in the dot style you could write:

  fn bookFilter(book: Book) -> bool {
   return book.pageCount > 100 and 
     book.language == "Chinese" and 
     book.subject == "History" and
     book.author.mentions > 10_000
  }
  
  var favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = books
    .filter(bookFilter)
    .flatMap(book => book.author.pets)
    .filter(pet => pet.is_furry)
    .map(pet => pet.favoriteFood)
    .distinct()
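The same extraction should carry over to the loop and comprehension styles. A rough Python sketch, assuming hypothetical `Pet`, `Author`, and `Book` dataclasses with snake_case fields and made-up sample data (none of which are from the thread):

```python
from dataclasses import dataclass, field


# Hypothetical model, only for illustration.
@dataclass
class Pet:
    is_furry: bool
    favorite_food: str


@dataclass
class Author:
    mentions: int
    pets: list = field(default_factory=list)


@dataclass
class Book:
    page_count: int
    language: str
    subject: str
    author: Author


def book_filter(book: Book) -> bool:
    """The big condition, pulled out and named once."""
    return (
        book.page_count > 100
        and book.language == "Chinese"
        and book.subject == "History"
        and book.author.mentions > 10_000
    )


famous = Author(20_000, [Pet(True, "tuna"), Pet(False, "flies"), Pet(True, "tuna")])
obscure = Author(5, [Pet(True, "kibble")])
books = [
    Book(400, "Chinese", "History", famous),   # passes the filter
    Book(400, "Chinese", "History", obscure),  # author not famous enough
    Book(50, "Chinese", "History", famous),    # too short
]

# Loop style with the helper.
foods_loop = set()
for book in books:
    if book_filter(book):
        for pet in book.author.pets:
            if pet.is_furry:
                foods_loop.add(pet.favorite_food)

# Comprehension style with the same helper.
foods_comp = {
    pet.favorite_food
    for book in books
    if book_filter(book)
    for pet in book.author.pets
    if pet.is_furry
}
```

With the predicate named, the nesting in both versions is just about the traversal (books, then pets), which is arguably where the styles actually differ.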

tsss · 2025-03-11T20:29:29 1741724969

Your FP example is needlessly complicated. No one who does FP regularly would write it like that.

  var favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = books
    .filter(book => 
       book.pageCount > 100 and 
       book.language == "Chinese" and 
       book.subject == "History" and
       book.author.mentions > 10_000
    )
    .flatMap(book => book.author.pets)
    .filter(pet => pet.is_furry)
    .map(pet => pet.favoriteFood)
    .distinct()

Or in Scala:

  val favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory = (for {
    book <- books if
      book.pageCount > 100 &&
      book.language == "Chinese" &&
      book.subject == "History" &&
      book.author.mentions > 10_000
    pet <- book.author.pets if
      pet.is_furry
  } yield pet.favoriteFood).distinct

Though, most Scala programmers would prefer higher-order functions over for-comprehensions for this.

Chris_Newton · 2025-03-11T21:48:39 1741729719

I didn’t see the original, but the FP example here looks fairly idiomatic to me.

An alternative, which in FP-friendly languages would have almost identical performance, would be to make the shift in objects more explicit:

    var favoriteFoodsOfFurryPetsOfFamousAuthorsOfLongChineseBooksAboutHistory =
      books
        .filter(book => isLongChineseBookAboutHistory(book))
        .map(book => book.author)
        .filter(author => isFamous(author))
        .flatMap(author => author.pets)
        .filter(pet => pet.isFurry)
        .map(pet => pet.favouriteFood)
        .distinct()

I slightly prefer this style with such a long pipeline, because to me it’s now built from standard patterns with relatively simple and semantically meaningful descriptions of what fills their holes. Obviously there’s some subjective judgement involved with anything like this; for example, if the concept of an author being famous was a recurring one then I’d probably want it defined in one place like an `isFamous` function, but if this were the only place in the code that needed to make that decision, I might inline the comparison.

itsmeknt · 2025-03-11T20:44:48 1741725888

Thanks! I have updated my post to use your code. It is indeed much nicer. And yes, I don't write much FP.

I just improved the comprehension code as well using the same idea as your code, eliminating an entire list!

davidw · 2025-03-11T17:46:40 1741715200

Without syntax highlighting, "book.author for book in books if book.page_count > 1000" requires a lot more effort to parse because white space like newlines is not being used to separate things out.

nicwolff · 2025-03-11T19:14:21 1741720461

    authors_of_long_books: set[Author] = {
        book.author 
        for book in books 
        if book.page_count > 1000
    }

syklemil · 2025-03-11T19:49:26 1741722566

You've had some answers already, but I also think this is a good argument for syntax highlighting. With tools like tree-sitter it's pretty easy these days to get high quality syntax highlighting, which allows us humans to receive more information in parallel. A lot of the information we pick up in our daily lives is carried through color, and being colorblind is generally seen as a disability (albeit often a mild one which can be undetected for decades).

Syntax highlighting in print is more limited because of technological and economic constraints, which might leave just bold, italics and underlines on the table, while dropping color. On screens and especially in our editors where we see the most code, a lack of color is often a self-imposed limitation.

davidw · 2025-03-11T19:53:45 1741722825

That's not the point though. If you need the syntax highlighting to quickly make out the structure, perhaps the visual layout is not as good as it could be.

syklemil · 2025-03-11T20:00:29 1741723229

I consider syntax highlighting to be a part of the _visual_ structure. Visibility is more than just whitespace and placement!

xen0 · 2025-03-11T18:18:06 1741717086

Set comprehensions are normal in mathematics and, barring very long complex ones, I find them the easiest to parse because they are so natural.

They're just a tad more verbose in Python than mathematics because it uses words like 'for' and 'in' instead of symbols.

d0mine · 2025-03-11T18:55:17 1741719317

Set comprehensions are more idiomatic here (explicit syntax), though filter/map are not that bad either:

    {*map(_.author, filter(_.page_count > 1000, books))}

It uses the lambdas package.
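Without the third-party dependency, roughly the same shape can be written with the standard library. A sketch, assuming a hypothetical `Book` dataclass and sample data:

```python
from dataclasses import dataclass
from operator import attrgetter


# Hypothetical model, only for illustration.
@dataclass(frozen=True)
class Book:
    author: str
    page_count: int


books = [Book("Ann", 1200), Book("Bob", 300), Book("Ann", 1500)]

# attrgetter("author") plays the role of `_.author`;
# a plain lambda replaces `_.page_count > 1000`.
authors = {*map(attrgetter("author"), filter(lambda b: b.page_count > 1000, books))}
```

It reads inside-out (filter first, then map), which is part of why the comprehension tends to be preferred in Python.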

mrkeen · 2025-03-12T07:43:52 1741765432

Awesome, giving it a quick scan,

  authors_of_long_books = set()

Now I know that authors_of_long_books is the empty set. Do I need to bother reading the rest?

tvier · 2025-03-11T17:59:11 1741715951

Much of both sides of this argument is opinion, but wrt this comment:

> ... no function call overhead.

This code has more function calls. O(n) vs 3 for the original

khaledh · 2025-03-11T18:23:53 1741717433

That's not true. The lambdas used in the functional version are each called once for every item in the list.
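A quick way to check this, sketched in Python with hypothetical counting callbacks (the `calls` counter, `is_long`, and `author_of` are illustrative names, not from the thread). Whether an optimizing runtime would compile these down to real call instructions is a separate question. This only shows how often each callback body runs:

```python
# Count how many times each callback body is executed.
calls = {"pred": 0, "key": 0}


def is_long(book):
    calls["pred"] += 1
    return book["pages"] > 1000


def author_of(book):
    calls["key"] += 1
    return book["author"]


books = [
    {"author": "Ann", "pages": 1200},
    {"author": "Bob", "pages": 300},
    {"author": "Cy", "pages": 1500},
]

# filter runs the predicate on every book; map runs only on the survivors.
authors = set(map(author_of, filter(is_long, books)))
```

Here the predicate runs once per book and the mapper once per book that passes the filter, so the count grows with the input size.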

stouset · 2025-03-11T18:45:34 1741718734

No sane optimizer is going to emit the functional code as a gajillion function calls.

tvier · 2025-03-11T18:59:27 1741719567

Yeah, if you treat it as JavaScript vs Python they're likely correct (I'm not that familiar with JS). The article and original comment were about functional vs imperative though, so I assumed half-decent runtimes for both.

Spivak · 2025-03-11T20:36:28 1741725388

It's not? How could that possibly work when the lambda could throw and it could throw on the nth invocation and your stack trace has to be correct?

If I run this in the JS console I get two anonymous stack frames. The first being the console itself.

    [1, 2, 3].filter(x => [][0]())

khaledh · 2025-03-11T18:59:00 1741719540

True, but now you're relying on a specific implementation and optimization of the compiler, unless the language semantics explicitly say that lambdas will be inlined.

stouset · 2025-03-11T19:03:34 1741719814

This is true of literally anything and everything your compiler emits. In practice the functional style is much easier to optimize to a far greater degree than the imperative style.

tvier · 2025-03-11T19:10:29 1741720229

This is why you shouldn't get into arguments about performance on the internet without highly specified execution environments.

I'm going to take my own advice and go back to work :)

feoren · 2025-03-11T18:44:52 1741718692

> You are told explicitly at the beginning what the type of the result will be

I would argue that's a downside: you have to pick the appropriate data structure beforehand here, whereas .distinct() picks the data structure for you. If, in the future, someone comes up with a better way of producing a distinct set of things, the functional code gets that for free, but this code is locked into a particular way of doing things. Also, .distinct() tells you explicitly what you want, whereas the intention of set() is not as immediately obvious.

> There are no intermediate results to think about

I could argue that there aren't really intermediate results in my example either, depending on how you think about it. Are there intermediate results in the SQL query "SELECT DISTINCT Author FROM Books WHERE Books.PageCount > 1000"? Because that's very similar to how I mentally model the functional chain.

There are also intermediate results, or at least intermediate state, in your code: at any point in the loop, your set is in an intermediate state. It's not a big deal there either though: I'd argue you don't really think about that state either.

> and no function call overhead

That's entirely a language-specific thing, and volatile: new versions of a language may change how any of this stuff is implemented under the hood. It could be that "for ... in" happens to be a relatively expensive construct in some languages. You're probably right that the imperative code is slightly faster in most languages today, and if it has been shown via performance analysis that this particular code is a bottleneck, it makes sense to sacrifice readability in favor of performance. But it is a sacrifice in readability, and the current debate is over which is more readable in the first place.

> a single pass over books

Another detail that may or may not be true, and probably doesn't matter. The overhead of different forms of loops is just not what's determining the performance of almost any modern application. Also, my example could be a single pass if those methods were implemented in a lazy, "query builder" form instead of an immediately-evaluated form.

In fact, whether this query should be immediately evaluated is not necessarily this function's decision. It's nice to be able to write code that doesn't care about that. My example works the same for a wide variety of things that "books" could be, and the strategy to get the answer can be different depending on what it is. It's possible the result of this code is exactly the SQL I mentioned earlier, rather than an in-memory set. There are lots of benefits to saying what you want, instead of specifying exactly how you want it.
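To illustrate the lazy, single-pass idea, here is a small Python sketch using generators. The `long_books` and `authors` stages and the `log` list are hypothetical, added only to make the evaluation order visible. This is one possible way to get that behavior, not how any particular library implements it:

```python
def long_books(books, log):
    """Lazy filter stage: yields only books over 1000 pages."""
    for book in books:
        log.append(f"filter {book['title']}")
        if book["pages"] > 1000:
            yield book


def authors(books, log):
    """Lazy map stage: yields each book's author."""
    for book in books:
        log.append(f"map {book['title']}")
        yield book["author"]


books = [
    {"title": "A", "author": "Ann", "pages": 1200},
    {"title": "B", "author": "Bob", "pages": 300},
    {"title": "C", "author": "Ann", "pages": 1500},
]

log = []
pipeline = authors(long_books(books, log), log)  # builds the chain, runs nothing
steps_before = list(log)

result = set(pipeline)  # the consumer decides when evaluation happens
```

After building the pipeline the log is still empty. Once it is consumed, the filter and map steps interleave per book instead of running as two separate passes.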

megous · 2025-03-11T22:10:42 1741731042

Set is a well defined container for unique values. It's much clearer what it is than some non-existent .distinct() function with no definition and unclear return value.

Procedural code in JS doesn't say how you want something done any more closely than the functional style variant. for-of is far more generic than .map/.filter() since .map() only works on Array shaped objects, and for-of works on all iterables, even generators, async generators, etc. In any case you're not saying how the iteration will happen with for-of, you're just saying that you want it. Implementation of Set is also the choice of a language runtime. You're just stating what type of container you want.

Sometimes functional style may be more readable, sometimes procedural style may.