Hacker News new | past | comments | ask | show | jobs | submit login
Sloc Cloc and Code Revisited – Optimizing an Already Fast Go Application (boyter.org)
88 points by ngaut on Dec 25, 2018 | hide | past | favorite | 16 comments



I think the stuff about nested loops performing poorly is a misunderstanding. There's nothing inherently bad about nested loops. The motivation given in the previous post[0] is:

> The big breakthrough I had was after looking at the hot function I realised that storing the game events in a List of Arrays meant that we had a loop in a loop. It was however possible to flatten this into a single array of integers and reset the game every 11 turns with a simple if check.

The issue here wasn't nested loops, it was data structure overhead and poor data locality from having a lot of tiny lists (11 items each, as I understand it).

0: https://boyter.org/2017/03/golang-solution-faster-equivalent...


The author recognizes this for certain cases:

> Turns out that at this small level of nested looping there is no performance to be gained here.


Unsurprisingly, the biggest performance gains came from algorithmical changes, not micro optimisations (like if instead of switch.)

The lesson is always the same: leave micro optimisations for later and spend more time learning about data structures that can improve your code locality.


Correct. However I did find that just working on the codebase getting small wins was enough to put my mind into thinking about it and hence able to make that leap to the larger gains.

For the super tight loops scc uses the micro optimisations still add up though so it wasn’t wasted effort and in the end it all came together nicely.


Once you are somewhat experienced with optimization, the difficult part isn't getting into the mindset of optimizing, but in holding off on it until your features are in.

Before you have the features, you mostly want to find the boundaries on the problem so that you can slice it up into modules and optimize those independently. It's unusual in contemporary programming to have to do a whole-program micro-optimization pass, but that's the only kind you can do within a small program, so it's common to get the wrong idea and start shaving off bytes and cycles before the architecture is ready for it. In a large program, it's specific combinations of features that create performance spikes, and those tend to be addressed by building a new custom code path. When you scale up the data changes, so your view of the problem changes too, and you start being able to optimize by creating a codepath that combines a whole bunch of specific processing steps.


The caveat working that way is that you can end up working towards a local optimum, and lose sight of the forest for the trees when it comes to making the bigger algorithmic changes.


Nobody is perfect :) I still think just getting into the mindset of a problem is a good way to even begin to think about these larger changes though.


Is there some Go specific overhead on loops? If not, why does the author seems so obsessed with removing loops within loops?


No, Go has about the fastest loops you could want, implemented in the obvious way at the asm level. (That is, not that Go's loops are any more extra magical fast than any other compiled language, but they are not slow as in some dynamic languages.) The focus is there because that's where you get wins, in any language.


Depending on the type of loop there might be runtime boundary checks, but they can be avoided in many cases IIRC.


I only skimmed this, but why isn’t ripgrep mentioned? (While other Rust-based tools are.)

By the way, ripgrep author has a good writeup on performance comprison. https://blog.burntsushi.net/ripgrep/


Ripgrep is heavily covered in a previous article: https://boyter.org/posts/sloc-cloc-code/


Ripgrep is a similar but different kind of tool.


99% the reason I didn’t put ripgrep in there. Ripgrep, sift, Ag, pt, ack solve a totally different problem to scc, tokei, loc, polyglot and cloc.

The last 1% was because I didn’t want to have any comparisons made against ripgrep because early in scc’s development I was slightly comparing walk performance and ended up having a chat with burntsushi over twitter. To avoid marking more work for him I was careful to avoid any comparisons, as the last thing we want is him distracted from working on that and other tools.


Of course, silly me, I take loc and ripgrep to be so essential I conflated them. Too much Christmas merriment, not enough proper thinking. Sorry.


It’s all good! Have a great Christmas :)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: