
Does that ever happen?



I've seen examples of actual production systems where there is a general proliferation of slow practices that make the system slow without any one function standing out, yes. In C#, such practices could include using interfaces (for dependency injection), LINQ (for readability), and reflection. All of these have faster alternatives with varying levels of downsides, and the downsides shrink once you are practiced at the alternatives.

Also, when profiling production code, it can be hard to FIND the slow function since the optimizer may inline things.

There is perhaps value in understanding your language and ecosystem very well, so that you have decent intuition for what is fast and can make default-fast choices even if there is a small readability cost. That cost may be made up for by not having to crunch on performance later. As well, many performance choices make code simpler; after all, the goal is to do less, and less is less.


Yes. This is the plague of modern software. It is all over the place. So much software today spends all its time chasing pointers, using the cache poorly, and branching wildly.
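To make that concrete (an illustrative Rust sketch, not taken from any of the systems discussed): the same computation over two data layouts, where the boxed layout forces a pointer chase per element.

```rust
// Illustrative sketch: identical computation, two memory layouts.
struct Point {
    x: f64,
    y: f64,
}

// Contiguous: elements sit next to each other in memory, so
// iteration walks the cache linearly.
fn sum_flat(points: &[Point]) -> f64 {
    points.iter().map(|p| p.x + p.y).sum()
}

// Pointer-chasing: each element lives in its own heap allocation,
// so every iteration step dereferences a pointer to somewhere else.
fn sum_boxed(points: &[Box<Point>]) -> f64 {
    points.iter().map(|p| p.x + p.y).sum()
}
```

Both return the same answer; the difference only shows up in cache behavior, which is exactly why a sampling profiler attributes no single hot spot to it.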

You won't see these aspects of poor design in a sampling profiler. You will see them by running e.g. perf on Linux and finding pitifully low IPC and high cache-miss numbers.


This is one of the reasons I really love Rust: it not-so-subtly nudges you towards avoiding heap allocations by adding extra boilerplate and making them obvious in type signatures (unless you hide them behind wrapper types).

This contributes to Rust programs generally having good performance characteristics without spending time on optimizations.
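For example (a minimal sketch): the heap allocation is visible right in the signature, so reaching for it is a deliberate choice.

```rust
// The allocation shows up in the type: `Box<[i32]>` lives on the
// heap, and the signature says so.
fn boxed_range(n: i32) -> Box<[i32]> {
    (0..n).collect::<Vec<i32>>().into_boxed_slice()
}

// The allocation-free alternative carries no heap marker at all.
fn sum_range(n: i32) -> i32 {
    (0..n).sum()
}
```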


I find the exact opposite for Rust, as it often encourages boxing random objects to satisfy odd lifetimes.

That being said, in almost all of these cases you can of course restructure your program so you don't need to box the values, but if it's not performance critical, why bother? Repeat that a couple dozen times across a large codebase and you have the same pointer-chasing issues.
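A sketch of the pattern (the `EventHandler` name and shape are hypothetical): boxing a closure as `Box<dyn Fn>` because its concrete type can't be named or stored otherwise, at the cost of a heap allocation and dynamic dispatch per call.

```rust
// Hypothetical example: a struct that stores an arbitrary callback.
// Each closure has an unnameable concrete type, so storing one in a
// struct field pushes you toward a boxed trait object.
struct EventHandler {
    callback: Box<dyn Fn(u32) -> u32>,
}

impl EventHandler {
    fn new(f: impl Fn(u32) -> u32 + 'static) -> Self {
        EventHandler {
            callback: Box::new(f), // heap allocation to erase the type
        }
    }

    fn handle(&self, event: u32) -> u32 {
        (self.callback)(event) // dynamic dispatch through the box
    }
}
```

Restructuring with generics (`EventHandler<F: Fn(u32) -> u32>`) would avoid the box, but it infects every type that contains an `EventHandler`, which is exactly the "why bother?" trade-off described above.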


The NLL improvements of last year have improved things quite a bit. It's still not perfect, but in my opinion it has reached a point where this is mostly an issue for Rust beginners.

Some patterns of writing code will be really awkward to realize, but there are usually "more rusty" solutions that you start to apply without even noticing. Once you write code with the desired ownership semantics in mind, it's often (relatively) frictionless.


It is still an issue for UI-related coding, due to the way such systems are commonly designed.


> This is one of the reasons I really love Rust: it not-so-subtly nudges you towards avoiding heap allocations by adding extra boilerplate and making them obvious in type signatures (unless you hide them behind wrapper types).

Could you clarify this? It seems like the opposite to me. Borrowing requires lifetimes in type signatures, but boxing yields owned objects, which can be passed around easily like value types.


In my experience, I’d say it happens more frequently than being able to optimize yourself out of a perf problem.

Too many people have taken the “make it correct then make it fast” advice too far. The definition of “correct” needs to include performance parameters from the beginning because usually the bottlenecks that cause real issues are architectural.


It is my carefully considered experience that this is rarely (read: almost never) the case, and that the far greater risk is radically less maintainable, or entirely unmaintainable, code produced by developers who have incorrect intuitions about what needs to be fast.


I suppose this is why they call it anecdotal evidence, as our experiences don’t agree.

What I will say is that code which is architecturally correct for its performance requirements is generally no less understandable or maintainable than code that is incorrectly designed for its performance context. You don’t end up with the kinds of harder-to-read optimizations you see in this article any more often in the first case than in the second.

Another consideration is that if your architecture is wrong for your performance space, nothing will help but a rewrite. If your code is optimized in the small too early, you can always rewrite it in a way that is more clear.


You're talking about architectural changes; this article is about, essentially, micro-optimizations. Good architecture is often faster than bad.

Your design should indeed take into account performance requirements. But micro-optimizations like these (almost all of these changes are to avoid linear numbers of reallocations, with the exception of the string builder) don't give you order-of-magnitude speedups unless they're in hot loops anyway.
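For reference, the reallocation-avoiding kind of change under discussion looks roughly like this sketch:

```rust
// With a size hint, the vector allocates once up front; without it,
// a growing vector periodically reallocates and copies its contents.
fn squares(n: u64) -> Vec<u64> {
    let mut v = Vec::with_capacity(n as usize); // single allocation
    for i in 0..n {
        v.push(i * i); // capacity is sufficient, so no reallocation
    }
    v
}
```

The point above holds: unless `squares` sits in a hot loop, the saved reallocations are noise in the overall profile.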

Profile then optimize means profile, then optimize. Designing good software isn't optimization; it's designing good software.


It makes a big difference if all your function signatures in a Go program take a string parameter or a []byte parameter, or if they take some other expensive type by value. Refactoring this later can be close to impossible. I would say rather than micro-optimizations these are the choices that must be made correctly in the early phases of development to avoid having an unfixable performance problem later.
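The analogous choice in Rust (sketched here, since the comment above is about Go): a signature taking `String` forces callers to allocate or clone, while `&str` accepts a cheap borrow — and once hundreds of call sites use the owned form, switching is a large refactor.

```rust
// Owned parameter: every caller must hand over (or clone into) a
// heap-allocated String.
fn count_words_owned(s: String) -> usize {
    s.split_whitespace().count()
}

// Borrowed parameter: callers pass a slice of whatever they already
// have, with no allocation at the boundary.
fn count_words_borrowed(s: &str) -> usize {
    s.split_whitespace().count()
}
```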


Literally the thread I’m responding to is asking what the value of optimize then profile is. My experience is that you are likely hosed if you’ve gotten to this point in the 80% case. That isn’t to say these techniques have no value; I’ve used many of them myself. But the question was, “how often does it happen that your optimizations are for a flat profile”. For me, the answer is “most of the time”.


And the broader context is when talking about micro-optimizations versus reasonable architectures.

You shouldn't micro-optimize before profiling, because it likely won't matter. Bluntly, if you have a flat profile, none of the optimizations in this article are relevant anyway. You'll be able to pull out single digit percentage speedups, maybe.

The optimize then profile argument isn't meant to be about architecture. Yes, you should build performant architecture. Yes, you should take time to plan a performant architecture before building[+]. But the question of profile then optimize is never (except in the strange way you're bringing it up) about doing macro-optimizations before you've written a line of code. It's almost always in the context of "don't just try to optimize what you think is slow, because you're almost always wrong".

Big-O style speedups from architectural changes aren't micro-optimizations, they generally sit outside of that conversation entirely.

As an aside, flat profiles are in practice exceedingly rare. Most (useful) programs do the same thing many times. It's very unusual to see a program that isn't, in essence, a loop. And the area inside the loop is going to be hot. The Pareto principle applies to execution time too.

[+]: Maybe startups who gotta ship it to survive as the exception.


> You shouldn't micro-optimize before profiling, because it likely won't matter. Bluntly, if you have a flat profile, none of the optimizations in this article are relevant anyway. You'll be able to pull out single digit percentage speedups, maybe.

Glad we agree.

> The optimize then profile argument isn’t meant to be about architecture

Glad we agree. If only all the people who tell me “correct before performant” agreed with us. In practice, in my experience, this is not the case. People use it in day to day conversations at the earliest parts of conversations about architecture all the time. If they didn’t I wouldn’t have nearly the problems I do with the statement.

> As an aside, flat profiles are in practice exceedingly rare

This seems to be the most controversial part of our disagreement. In my experience, that is flatly untrue, especially when talking about systems whose performance does not meet the requirements. I can count on one hand the number of times I’ve seen systems go from “unacceptable” performance to “acceptable” via micro-optimizations. I’ve never seen one go to “great”. I don’t know how to quantify this, though, so I’m willing to leave it in the realm of my experience being different from yours.

All that is to say, my experience says that systems that don’t treat performance as first class requirements don’t tend to meet their performance expectations.

All of which is neither here nor there based on the article but is directly related to the question of ‘what do you do with a flat profile’?


Kanev[1] disagrees. Flat profiles are the common case in actual practice.

Edited to add: Since apparently you also work at Google, you should walk over to Svilen's desk and just ask him if profiles of production software are generally flat, or if they generally have hot spots.

1: https://static.googleusercontent.com/media/research.google.c...


You're citing a paper that uses, as its example, a binary that has already been heavily hand-optimized and profile-guided optimized.

That's what you get after you profile and optimize.


For me, all the time. But the two points about "write, profile, optimize" that people usually leave out are that profiling/benchmarking is mostly useful as a comparison tool (are your optimizations actually doing anything?) and that part of profiling is asking the question, "how fast can this go?" — which is usually difficult to answer, but when you get really granular you can usually reason about it.
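A minimal sketch of that comparison-driven use of benchmarking (`time_it` is an illustrative helper, not a real benchmarking API):

```rust
use std::time::{Duration, Instant};

// Time a closure once. A real benchmark harness would run many
// iterations and account for warm-up, variance, and optimizer
// effects; this only illustrates the before/after comparison idea.
fn time_it<F: FnMut()>(mut f: F) -> Duration {
    let start = Instant::now();
    f();
    start.elapsed()
}
```

Usage: compare `time_it(|| old_version())` against `time_it(|| new_version())` on the same input — the interesting number is the ratio between runs, not either measurement in isolation.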



