
> Keep data transforms and algorithmic calculations in functional style

What annoys me greatly, though, is kids reaching for various 'fancy' functional paradigms for data transformation without realizing their performance implications, still thinking they've done the 'smarter' thing by transforming the data multiple times and turning a simple loop into three or four loops. Example: Array.map.filter.map.reduce. And when you bring it up, they've learned to respond with another fancy term: "that would be premature optimization". :|
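Concretely, the pattern looks something like this (TypeScript, made-up values) - four passes over the data and two or three intermediate arrays where one loop would do:

    const totals = [40, 120, 60, 180];

    const result = totals
      .map(t => t * 2)                  // pass 1: new array
      .filter(t => t > 100)             // pass 2: new array
      .map(t => t + 1)                  // pass 3: new array
      .reduce((sum, t) => sum + t, 0);  // pass 4: fold to a number

    console.log(result);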



This is just an unfortunate consequence of how map and filter are implemented via iterators.

If you work with transducers, the map-filter-map-reduce is still just one single loop.
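A minimal hand-rolled sketch in TypeScript (illustrative only, not a library API; Clojure's transducers are the polished version). Each transducer wraps a reducer, so composing them builds one fused reducer and the whole map-filter-map-reduce runs as a single pass with no intermediate arrays:

    type Reducer<A, X> = (acc: A, x: X) => A;
    type Transducer<X, Y> = <A>(next: Reducer<A, Y>) => Reducer<A, X>;

    // "map" as a reducer transformer
    const mapT = <X, Y>(f: (x: X) => Y): Transducer<X, Y> =>
      next => (acc, x) => next(acc, f(x));

    // "filter" as a reducer transformer
    const filterT = <X>(pred: (x: X) => boolean): Transducer<X, X> =>
      next => (acc, x) => (pred(x) ? next(acc, x) : acc);

    // map -> filter -> map, fused into one reducer
    const xform = <A>(next: Reducer<A, number>): Reducer<A, number> =>
      mapT((n: number) => n * 2)(
        filterT((n: number) => n > 100)(
          mapT((n: number) => n + 1)(next)));

    // a single reduce drives the whole pipeline in one traversal
    const total = [40, 120, 60, 180].reduce(xform((sum: number, n: number) => sum + n), 0);
    console.log(total);

However many steps you stack, there's still exactly one traversal of the input.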


A smart enough compiler can and will fuse your loops / iterators.

GHC does, for example.


Relying on a sufficiently smart compiler to turn bad algorithms into good algorithms is unreliable.

It's much better to just have a way of expressing efficient algorithms naturally, which is what transducers excel at.


That kind of thing really depends on the language. Some of the stronger functional languages like Haskell have lazy evaluation, so that operation won't be as bad as it looks. But then you really need to fully understand the tradeoffs of lazy evaluation too.
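Generators give a rough feel for it in TypeScript terms (illustrative helpers, not a library API, and not the same thing as Haskell's laziness): the pipeline is pull-based, so each element flows through all the steps before the next one is touched, and nothing intermediate is materialized:

    function* mapLazy<X, Y>(xs: Iterable<X>, f: (x: X) => Y): Generator<Y> {
      for (const x of xs) yield f(x);
    }

    function* filterLazy<X>(xs: Iterable<X>, pred: (x: X) => boolean): Generator<X> {
      for (const x of xs) if (pred(x)) yield x;
    }

    let total = 0;
    const pipeline = mapLazy(filterLazy(mapLazy([40, 120, 60, 180], t => t * 2), t => t > 100), t => t + 1);
    for (const n of pipeline) {
      total += n;  // one interleaved pass; an early break would skip the remaining work entirely
    }
    console.log(total);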


or list fusion, but yes Haskell will optimise map x (map y lst) to map (x . y) lst

https://stackoverflow.com/questions/38905369/what-is-fusion-...


I don't know how things are implemented in other languages, but in .NET 9, these operations are optimized.


There are ways to keep functional transformations and immutable data structures efficient. Copy-on-write, unrolling expressions into loops, etc. Proper functional languages have them built into the runtime - your clean map-reduce chain will get translated to some gnarly, state-mutating imperative code during compilation. In non-FP or mixed-paradigm languages, where functional building blocks are just regular library functions (standard or otherwise), map-reduce is exactly what it says on the tin - two loops and a lot of copying; you want it fast, you have to mostly optimize it yourself.

In other words, you need to know which things in your language are considered to be language/compiler/runtime primitives, and which are just regular code.
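For example, if map/filter/reduce are just regular library functions in your language, getting a single pass with no intermediate allocations means writing roughly this yourself (TypeScript sketch of fusing a map -> filter -> map -> reduce chain by hand, made-up values):

    function fusedTotal(totals: number[]): number {
      let sum = 0;
      for (const t of totals) {
        const doubled = t * 2;   // the first "map"
        if (doubled > 100) {     // the "filter"
          sum += doubled + 1;    // the second "map", folded into the "reduce"
        }
      }
      return sum;
    }

    console.log(fusedTotal([40, 120, 60, 180]));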


Most languages don't have these facilities at all - so you need to be really careful what you are doing. This works "fine" with test data, because your test data is usually a few hundred items max. A few years back, people at our firm built all data filtering in the frontend, to keep the "backend clean". That worked fine in testing. In production with 100k rows? Not so much.

Even in C# it depends on the LINQ provider - if you are talking to a DB, your queries should be optimized. LINQ to Objects doesn't do that, and repeated scanning can kill your performance, e.g. repeated filtering on large lists.
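The same trap in TypeScript terms (made-up types): filtering the same big list inside a loop re-scans it once per key, O(n * m), whereas grouping first is one scan plus cheap lookups:

    type Row = { customerId: number; total: number };

    function slowTotals(customerIds: number[], rows: Row[]): Map<number, number> {
      const out = new Map<number, number>();
      for (const id of customerIds) {
        const mine = rows.filter(r => r.customerId === id);  // full scan per id
        out.set(id, mine.reduce((s, r) => s + r.total, 0));
      }
      return out;
    }

    function fastTotals(customerIds: number[], rows: Row[]): Map<number, number> {
      const sums = new Map<number, number>();
      for (const r of rows) {                                // one scan total
        sums.set(r.customerId, (sums.get(r.customerId) ?? 0) + r.total);
      }
      const out = new Map<number, number>();
      for (const id of customerIds) out.set(id, sums.get(id) ?? 0);  // cheap lookups
      return out;
    }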


I am talking about IEnumerable, not IQueryable: https://blog.ndepend.com/net-9-0-linq-performance-improvemen...


Are they? LINQ is usually slower than a for loop


Quite often it compiles to the same IL. Would you like to provide some godbolt examples where it's significantly different?


IL is always different, since it's a high-level bytecode, and the machine code output is different too. That said, with guarded devirtualization of lambdas, LINQ now gets close to open-coded loops in many instances: it is very clever about selecting optimal internal iterator implementations, bypassing deep iterator nesting, having fast paths for popular scenarios, etc., and can even outperform loops that are more naive. Unfortunately we're not there yet in terms of having no baseline overhead, the way iterators in Rust do; there is a selection of community libraries that achieve something close to this, however. I would say LINQ performance today gets as close as it can to "it's no longer something to be worried about" for the vast majority of codebases.


For two or so years I've been developing a library, and I remember cases where switching from something simple like First or FirstOrDefault to a for loop made a difference when using BenchmarkDotNet.

Then I found that it was common knowledge that LINQ is slower, even among people on the C# Discord.


Aren't those all linear operations?


Yes, wrote quickly without thinking. Even if it doesn't change the complexity, it's still three or four times the operations.


Why would your example be O(n³)?


Oh yes, sorry, I meant to write 3 * O(n), which, though it doesn't change the order, is still three times the operations. The example I was remembering was doing filters 'inside' maps.


So... O(n)? Leaving aside the fact that "3 * O(n)" is nonsensical and not defined, recall f(x) is O(g(x)) if there exists some real c such that f(x) is bounded above by cg(x) (for all but finitely many x). Maybe you can say that g(x) = 3n, in which case any f(x) that is O(3n) is really just O(n), because we have some c such that f(x) < c(3n) and so with d = 3c we have f(x) < dn.
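In symbols (same claim, just written out):

    f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c\, g(n) \ \text{for all}\ n > n_0

    f(n) \le c \cdot 3n = (3c)\, n \ \Longrightarrow\ f(n) \in O(n)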

It's not the lower-order terms or constant factors we care about, but the relative rate of growth of space or time usage between algorithms of, for example, linear vs. logarithmic complexity, where the difference in the highest order dominates any other lower order terms or differences.

What annoys me greatly is people imprecisely using language, terminology, and/or other constructs with very clearly defined meanings without realizing the semantic implications of their sloppily arranged ideas, still thinking they've done the "smarter" thing by throwing out some big-O notation. Asymptotic analysis and big-O are about comparing relative rates of growth at the extremes. If you're talking about operations or CPU or wall clock time, use those measures instead; but in those cases you would actually need to take an empirical measurement of emitted instruction count or CPU usage to prove that there is indeed a threefold increase of something, since you can't easily reason about compiler output or process scheduling decisions & current CPU load a priori.


I do understand 3 * O(n) is just O(n), thanks. I was just clarifying my initial typo. However, it's still three/four times the iterations needed - and that matters in performance-critical code. One is terminology; the other is the practical difference in code execution time, which matters more and thus needs to be understood better. You might not 'care about constant factors', but they do actually affect performance :).


> Maybe you can say that g(x) = 3n, in which case any f(x) that is O(3n) is really just O(n)...

In practice 3x operations can make a world of difference.

3x SQL queries, 3x microservice calls, missing batching opportunities, etc.

Sorry but this kind of theoretical reasoning wouldn't move a needle if I'm reviewing your PR.


> Sorry but this kind of theoretical reasoning wouldn't move a needle if I'm reviewing your PR.

If this were a PR review situation I would ask for a callgrind profile or timings or some other measurement of performance. You don't know how your code will be optimized down by the compiler or where the hotspots even are without taking a measurement. Theoretical arguments, especially ones based on handwavey applications of big-O, aren't sufficient for optimization which is ultimately an empirical activity; it's hard to actually gauge the performance of a piece of code through mere inspection, and so actual empirical measurements are required.


Callgrind to measure the impact of performing 3x more database queries or 3x more microservice calls?

I don't block PRs because of micro-optimizations, but my examples aren't that.


I recall looking at New Relic reports of slow transactions that suffered from stacked n+1 query problems because the ORM was obscuring what was actually going on beneath the hood at a lower level of abstraction (SQL).
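For anyone unfamiliar, the shape of the problem, with a hypothetical in-memory stand-in for the database (made-up names, TypeScript):

    type Item = { orderId: number; sku: string };
    const itemsTable: Item[] = [
      { orderId: 1, sku: "A" }, { orderId: 2, sku: "B" }, { orderId: 2, sku: "C" },
    ];
    let queryCount = 0;

    // one simulated round trip per call
    async function itemsFor(orderId: number): Promise<Item[]> {
      queryCount++;
      return itemsTable.filter(i => i.orderId === orderId);
    }

    // one simulated round trip for the whole batch
    async function itemsForAll(orderIds: number[]): Promise<Item[]> {
      queryCount++;
      return itemsTable.filter(i => orderIds.includes(i.orderId));
    }

    async function main() {
      const orderIds = [1, 2];
      for (const id of orderIds) await itemsFor(id);  // N+1 style: one query per order, easy to miss behind an ORM
      await itemsForAll(orderIds);                    // batched: one query total
      console.log(queryCount);                        // 3 round trips, where 1 would have done the job
    }
    main();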

My point is it's often difficult to just visually inspect a piece of code and know exactly what is happening. In the above case it was the instrumentation and empirical measurements of performance that flagged a problem, not some a priori theoretical analysis of what one thought was happening.


That is premature pessimization. They have no idea what premature optimization is, as nobody has done that kind of optimization at all since 1990 or so. Premature optimization is about manually unrolling loops or doing inline assembly - things any modern compiler can do for you automatically.
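To make the distinction concrete, "manually unrolling loops" meant doing by hand what optimizers now do automatically (TypeScript purely for illustration; historically this was C or assembly):

    // straightforward version
    function sum(xs: number[]): number {
      let acc = 0;
      for (let i = 0; i < xs.length; i++) acc += xs[i];
      return acc;
    }

    // hand-unrolled version: four elements per iteration to cut loop overhead
    function sumUnrolled(xs: number[]): number {
      let acc = 0;
      let i = 0;
      for (; i + 4 <= xs.length; i += 4) {
        acc += xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3];
      }
      for (; i < xs.length; i++) acc += xs[i];  // leftover tail
      return acc;
    }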


> Premature optimization is about manually unrolling loops or doing inline assembly - things any modern compiler can do for you automatically.

If compilers consistently did this, projects like ffmpeg wouldn’t need to sprinkle assembly into their code base. And yet.


ffmpeg did that AFTER profiling proved it was needed.

No compiler is perfect; if you need the best performance, you need to run a profiler and see where the real bottlenecks are. I'm not sure if ffmpeg really needs to do that, but I trust they ran a profiler and showed it was helpful (cynically: at least at that time, with the compilers they had then, though I expect they do this often), and thus it wasn't premature optimization.

Regardless, compilers mostly get this right enough that few people bother with that type of optimization. Thus almost nobody even knows what premature optimization is, and so they think it means the wrong thing. Premature optimization is not an excuse to choose an O(n) algorithm when an O(1) one exists, or some such.



