Cyclomatic complexity (wikipedia.org)
91 points by tosh on Feb 9, 2024 | 49 comments



And as always, Goodhart's law applies: "When a measure becomes a target, it ceases to be a good measure"

While the concept of cyclomatic complexity can be a somewhat useful guideline, I've seen way too many CI/CD pipelines enforcing CC goals, which then usually results in an impossible-to-untangle middleware and module mess where any piece of core logic is scattered over 102 code points in 34 modules — instead of just residing in a single, readable and well-commented main function that would neatly self-explain a whole complex task in three screens, even if that function has a CC of 20.


> any piece of core logic is scattered over 102 code points in 34 modules — instead of just residing in a single, readable and well-commented main function that would neatly self-explain

The tragedy of McCabe metrics is that this is a tooling failure. You can and should apply McCabe to the software as a whole, and then all these smol-functions approaches start falling apart. McCabe goes through the roof because the metric doesn't depend on length.

McCabe measures decision points. A 5000-line linear function has a McCabe complexity of 1. It's fine.
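
Rough sketch of what that counting looks like (Python, and not any particular tool's exact rules), approximating McCabe per function as 1 plus the decision points found in the AST:

  import ast

  # Very rough approximation: CC ~= 1 + decision points. Real tools differ on
  # how they count boolean operators, try/except, match, etc.
  DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

  def cyclomatic_complexity(source: str) -> dict[str, int]:
      scores = {}
      for node in ast.walk(ast.parse(source)):
          if isinstance(node, ast.FunctionDef):
              decisions = sum(isinstance(n, DECISION_NODES) for n in ast.walk(node))
              scores[node.name] = 1 + decisions
      return scores

  # A 5000-line linear function still scores 1:
  print(cyclomatic_complexity("def linear():\n" + "    x = 1\n" * 5000))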

An even better measure is architectural complexity. I’ve written about this a lot after reading some papers and a PhD thesis about it [1]. Tried turning it into a book but that didn’t quite happen. It will be a subpoint in a new project :)

Architectural Complexity uses visibility metrics to say how intertwined your software has become. A good rule of thumb is that the more imports/exports you have in a file, the more architecturally complex it is. Best seen with a dependency graph visualizer, but you can also feel it viscerally when you have to bounce around a bazillion files to understand what happens with any code flow.
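
A crude way to get a feel for that rule of thumb, assuming a Python codebase (real visibility metrics use the whole dependency graph, not just per-file counts):

  import ast, pathlib

  # Rank files by how many import statements they contain: a rough proxy for
  # fan-out, not the actual visibility metric from the thesis.
  def import_fanout(root: str) -> list[tuple[int, str]]:
      results = []
      for path in pathlib.Path(root).rglob("*.py"):
          tree = ast.parse(path.read_text(), filename=str(path))
          imports = sum(isinstance(n, (ast.Import, ast.ImportFrom))
                        for n in ast.walk(tree))
          results.append((imports, str(path)))
      return sorted(results, reverse=True)

  # e.g. print the ten most import-heavy files of a (hypothetical) project:
  # for count, path in import_fanout("src/")[:10]:
  #     print(count, path)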

[1] https://swizec.com/blog/why-taming-architectural-complexity-...


I mean, I'm not working on big code bases anymore; more tooling, analytics and management. But both the open/closed principle and some advice from Sandi Metz and others really stuck with me: Don't bother abstracting.

Just write straight line code.

And then introduce indirections and abstractions if good reasons demand it. In fact, if changes to the code demand it.

For example, introducing an abstraction for an external database can improve testing prospects. If you need to fix the same bug in several places, it's usually time to extract a common abstraction. If you need to touch some established piece of code to introduce a choice of which algorithm/strategy/formatting/... to use, that's a decent point to introduce some abstraction and inject a function/object. And if you would have to replicate the exact complex logic from somewhere else, and those places need to move in sync for reasons, moving that to a common place is probably a good idea.
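
A minimal sketch of that first example (Python; all names here are hypothetical): the calling code depends on a tiny interface, so tests can inject a fake instead of a real database connection.

  from typing import Protocol

  # The interface the rest of the code depends on.
  class UserStore(Protocol):
      def find_user(self, user_id: int) -> dict: ...

  def greeting(store: UserStore, user_id: int) -> str:
      user = store.find_user(user_id)
      return f"Hello, {user['name']}!"

  # In tests, a fake stands in for the real database-backed store.
  class FakeStore:
      def find_user(self, user_id: int) -> dict:
          return {"name": "Ada"}

  assert greeting(FakeStore(), 42) == "Hello, Ada!"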

Just following what the code, and the changes and requirements around it, actually need has resulted in much simpler code and far more effective abstractions for me.


I think it's a shame you didn't write the book. I know some people who desperately need to read it.


Same! Also, Swizec's an awesome writer. (Long-time email subscriber, and fan of his posts, the Serverless Handbook, etc.)


CPUs use an icache and a dcache to leverage memory locality; we overlook the effect of that same kind of locality on understanding code.


Old-style Java code often exhibits low CC, and as you indicate this is not necessarily a good thing. The actual logic is often spread across a multitude of helpers, proxies, deep inheritance chains, and unneeded abstractions. The end result is a system that is very difficult to reason about.

Also, in general terms, overuse of DRY can lead you here.

In many cases it is better to “risk” higher CC by concentrating logic where it makes sense.


Yeah, I agree with this. Or, rather, I think there's an argument to be made that what we're really interested in is logic density vs. logic sparsity, and when developing something we have to decide what the right level of that is. If you're a small team, logic density makes sense because it means one person can have a lot of context in one place and make significant changes without doing a lot of context switching. Don't underestimate the productivity of one experienced guy doing old school PHP.

But if you're a large team on a large project, logic sparsity (with appropriate decoupling!) makes sense, because then multiple people can change the code simultaneously without stepping on each others' toes. People can specialise where appropriate without having to understand the whole thing. The "sparsity" approach is obviously what enterprises see as good news, mostly because it means they can have large teams without silos, but in real terms it costs a lot of money to operate this way. Although, as you rightly point out, the Java world has in recent years realised that it possibly went a bit too far in that direction historically.


I hear what you're saying here, but I like to delay going in the second direction as long as feasibly possible.

In theory, the idea that "people can specialise where appropriate without having to understand the whole thing" sounds good. In reality, so many bugs and so much technical debt pile up from developers with a worm's-eye view who don't understand how the whole system works. Eventually you have to make this compromise, but resist it as much as you reasonably can, and always strive to grow developers so they know as much of the system as they possibly can.


Exactly, and this is why I always try to steer teams away from "one metric to rule them all" whether this be "always fail if coverage is less than X percent" or "random code metric like CC is beyond limit". Reality is simply more complicated than that, and it takes experienced engineers to actively manage and balance the tradeoffs appropriately over time. Putting arbitrary one-size-fits-all rules in place is almost never the answer.

Unfortunately, in some (many?) companies there simply aren't enough experienced engineers who have the time to do the active balancing...leading us back to "just stick this rule in the pipeline so the teams are held to _some standard_".


Agreed. I like to do these scans but for informational purposes, not as a gate. Also most tools allow you to annotate code to turn off warnings, which can help when used intelligently.

Of course some teams will overuse such tools and turn off the metrics left and right.

In the end there is no substitute for experienced engineers.


I think that's the joke behind FizzBuzz Enterprise Edition, where FizzBuzz is implemented with rigid adherence to Java-encouraged abstractions:

https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpris...


Indeed, I've used McCabe stuff in the past not as a gate, but as a way to say "this should be looked at before that" type of thing. Works pretty well.

A low score doesn't mean "good", nor a high one "bad", just an indication that something might do well with some more scrutiny.


Or, even more importantly, if the score is increasing as revisions occur, you need to do a full review of that part.

With many measures, you are looking for changes, not absolute values.


OK, this is the first time cyclomatic complexity has seemed like a useful metric to me. Rate of change is more important than any specific measure.

I'd never seen anyone mention that before, but now that I've seen it here in the comments it seems so obvious.


No, it is not the rate of change. It is the change itself. The value should not be increasing.


I used this advice for decades. It works well...


I wonder if there are any open source emulators out there that comply with a strict limit on cyclomatic complexity. Specifically, a common pattern is to use a large switch/case to dispatch on instructions, which usually compiles to a jump table: it's performant and fairly clear (it maps nicely to the architecture docs), but it falls foul of normal cyclomatic complexity rules.

Edit: project concept: Enterprise Gameboy.


> Specifically, a common pattern is to use a large switch/case to dispatch on instructions, which usually compiles to a jump table [...]

Work-around: implement the jump table directly, e.g., an array of function pointers or whatever works in the language of choice.
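
A minimal sketch of that work-around (Python, with toy opcodes rather than a real core): the dispatch function stays at trivial cyclomatic complexity no matter how many opcodes the table holds.

  # Toy instruction handlers; a real emulator would have hundreds of these.
  def op_nop(cpu): pass
  def op_inc_a(cpu): cpu["a"] = (cpu["a"] + 1) & 0xFF
  def op_dec_a(cpu): cpu["a"] = (cpu["a"] - 1) & 0xFF

  HANDLERS = {0x00: op_nop, 0x3C: op_inc_a, 0x3D: op_dec_a}

  def step(cpu, opcode):
      HANDLERS[opcode](cpu)   # one indirect call, no branching in this function

  cpu = {"a": 0}
  step(cpu, 0x3C)
  assert cpu["a"] == 1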


As soon as I saw the title, I was reminded of these comments that I sometimes see in Ruby code:

  # rubocop:disable Metrics/CyclomaticComplexity
Presumably because nobody on the team is brave enough to open a PR that disables it in the config.


This x100.

If you take a function and tell someone it's "too complex" and tell them to "fix it", that person will usually end up splitting it into, say, three functions. And now, you've just multiplied that complexity by all the different possible permutations of how/where the different functions can call each other. Corner cases that you could previously rule out because the control flow made them impossible are now fair game.


Except complexity as an issue is primarily about readability.

Fine, a junior developer has a switch statement that's like 10 cases long, inside a function with several if/else branches, and they refactor to meet complexity limits by moving the switch into its own function.

As long as that function is well named, that's a huge improvement to readability.

In part, complexity thresholds relate to human short-term memory limits of seven plus or minus two.


That's why it's important to pair this one single aspect (cyclomatic complexity) with other aspects that balance it, and combine them into an index.


We have tools that flag methods with high cyclomatic complexity during review.

By “flag”, all I mean is that these tools leave a comment during code review. Humans make the decision to listen to the comment or ignore it. The human in the loop is critical, because ignoring it is sometimes the right decision.


Kolmogorov complexity [1] is a measure of the complexity of an object (say a text), while Cyclomatic complexity is a measure of the complexity of a program. Can we make any statements about the relationship between them?

Let's say we have a text A, and we find a program P that outputs A and whose length L equals the Kolmogorov complexity K(A); i.e., L = K(A). Does it make sense to compare the cyclomatic complexity C(P) with K(A)? Or can we write some equation or inequality that compares C(P) with K(A)?

I guess a first task would be to normalize one or the other so as to get the same units...

[1] https://en.wikipedia.org/wiki/Kolmogorov_complexity


They're not meaningfully related. You can say certain things about them, though; for example, Kolmogorov-optimal programs will generally have higher cyclomatic complexity because they don't follow any fixed set of rules. Put another way, they lack the "simple" underlying structure that would be required.


My random thought for slacker news consideration:

I recently read "The evolutionary origins of modularity"[1] and it talks about NNs that are evolved based on task-performance vs task-performance plus "graph cost", and finds that the latter greatly outperform the former. Graphs in the latter group ended up being more "modular", which allowed them to evolve to meet changing requirements more readily.

Could we analyze computer programs and create graphs representing modules? And links between systems? A node would be a method. An edge would be a call site. Graph-modules – clusters of highly interconnected nodes – would represent the real fault lines of the application, which could be juxtaposed against the developer's conceptual model. The cost of the overall graph could be calculated and used as a metric – aiming for some lower bound.
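
You can sketch the first half of that with off-the-shelf graph tooling; assuming networkx is available and caller/callee pairs have already been extracted somehow, community detection gives candidate graph-modules and a modularity score to use as one possible "graph cost":

  import networkx as nx
  from networkx.algorithms import community

  # Toy call graph: nodes are methods, edges are call sites (hypothetical app).
  calls = [
      ("api.login", "auth.check"), ("api.login", "db.get_user"),
      ("auth.check", "db.get_user"), ("billing.charge", "db.get_invoice"),
      ("billing.charge", "billing.tax"), ("billing.tax", "db.get_invoice"),
  ]
  G = nx.Graph(calls)

  # Detected clusters approximate the "real" module boundaries, which you can
  # compare against the intended package layout.
  clusters = community.greedy_modularity_communities(G)
  print([sorted(c) for c in clusters])

  # Modularity is one candidate for the "graph cost" to optimize against.
  print(community.modularity(G, clusters))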

1: https://royalsocietypublishing.org/doi/10.1098/rspb.2012.286...


As a former biologist and self-taught programmer, your comment and the paper you shared made my day (thank you). I often use biological and evolutionary analogies to prioritize best practices in my head; this one is at the top now.



I used this as a metric of autoflight system complexity!

https://dspace.mit.edu/handle/1721.1/9172


Does this measure make sense for pure functional languages? Can it be defined for terms in the lambda calculus or in combinatory logic?


Probably not. Although you can immediately follow that up by asking whether it makes any sense in any programming language. From the wikipedia link:

https://en.wikipedia.org/wiki/Cyclomatic_complexity#Correlat...

Correlation between CC and defects isn't always present, and in fact lines of code might be a better metric.

I assert that there is no current objective metric where you can plug in source code and get out some sort of comprehensibility value. Lines of code is the best we have at the moment. And it goes without saying (but I'm going to say it anyway) that lines of code isn't a great metric to be stuck with.

Right now the only thing we have are best practices (ie some other people were "successful" in some other context and they happened to be doing this) and code smells (right, the code makes my tummy feel bad, you can just feel the objectivity pouring out).

[It's been pointed out to me in the past that Weyuker's 9 Properties can be used to determine comprehensibility, but when you look at the papers published about it they're all really underwhelming. Basically, people have built a machine that when given source code dumps out a number that correlates to Weyuker, but then nobody takes the next step to see if that number actually correlates to anything that anyone cares about.

People are trying things though. For example, there's this: https://www.sonarsource.com/resources/cognitive-complexity/

I'm not particularly convinced by it, but it does a better job than Weyuker, iirc.]


> I assert that there is no current objective metric where you can plug in source code and get out some sort of comprehensibility value.

Ratio of ternary operators more than 1 level deep to total LoC.

You know you're in for some pain when you randomly open a file and the ternary expressions go 6 levels deep.
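
That particular metric is easy enough to sketch; here's a rough Python version (using Python's conditional expression as the stand-in for a ternary) that reports the deepest nesting in a source string:

  import ast

  def max_ternary_depth(source: str) -> int:
      tree = ast.parse(source)

      # Walk the AST, tracking how many conditional expressions enclose each node.
      def depth(node, current=0):
          current += isinstance(node, ast.IfExp)
          return max([current] + [depth(child, current)
                                  for child in ast.iter_child_nodes(node)])

      return depth(tree)

  nested = "x = a if p else (b if q else (c if r else d))"
  assert max_ternary_depth(nested) == 3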


Sure, but the vast majority of programmers don't do that type of thing.

Often in cases like the above I'm not sure if the metric is correct, or if it's just that you are not used to reading that style. For years I worked at places that banned the ternary expression, so I didn't use them, and it was a lot of work to remember what was going on the rare times I saw them. Then I got a job on a team that did use them all the time (but never more than 1 level deep), and I soon learned to read them; now I use them often where they make sense.


> Ratio of ternary operators more than 1 level deep to total LoC.

Sure. I should have said, "no current general objective metric".

For example, if we detect any symbols longer than 6 billion characters, then we know the code is likely to be hard to grok. Or if the only character used in symbols is the underscore.

And maybe the path forward for determining objectively comprehensible code is to collect every pathological case, and then good code is code without the bad stuff. But I personally hope for a general solution that we can set up which will emergently determine, from first principles, that too many nested ternary operators is bad. I'm more than ready to be disappointed when this turns out not to be possible, but everyone is allowed a dream.


The issue you face here is one of program inputs. Any time you are passing functions as values, you can't guarantee how exactly the body of a function will execute. In a simple example:

  (lambda (a) (a (lambda () (b)) (lambda () (c))))
You can't tell how the calls to b and c will be executed without also knowing something about the value of a. The function a could be a branching construct, or it could just evaluate its inputs directly. You could assume the worst case and assign a high complexity to this code, but all lambda calculus terms are delayed in this way, so every function call would be assigned the worst-case complexity for control flow.

More generally, imagine an interpreter that accepts a string as input and that string contains a program. You can make the program contained in the string as complex as you like without affecting the complexity of the interpreter, and effectively cheat the metric.
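
Tiny illustration of that last point (Python, made-up instruction set): this dispatcher's cyclomatic complexity is fixed and small, yet the "program" it interprets can contain arbitrarily many decisions that a static CC tool pointed at the interpreter will never see.

  def interpret(program, env):
      # A handful of instructions; the interpreter itself barely branches.
      ops = {
          "set": lambda name, value: env.__setitem__(name, value),
          "if": lambda cond, then, other: interpret(then if env[cond] else other, env),
          "print": lambda name: print(env[name]),
      }
      for instr in program:
          ops[instr[0]](*instr[1:])

  # The interpreted program can be as branchy as you like:
  interpret([("set", "x", 1),
             ("if", "x", [("print", "x")], [("set", "x", 0)])], {})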


You still have a number of atoms, composed and selected, in a way that is entirely analogous.

But you'll discover that almost every program in a functional language has an enormous complexity when measured that way. So it makes pretty much as much sense as in the imperative case, and it's also completely useless (instead of only partially useless, I guess).


That's what I was wondering as a woefully uninformed armchair warrior burnout.

The main (only?) difference between functional and imperative programming is mutability. The ideal FP representation is a spreadsheet with no infinitely cyclical references to cells like A1->B1->C1->A1, and the ideal IP representation is a Turing tape reading/writing/moving with no registers. I'm probably wrong about both of these, but that's how I think of them.

So most IP languages like JavaScript eventually reach a state complexity limit where humans can no longer analyze them. It's like trying to remember more than 7 digits in a phone number while someone is yelling more phone numbers at you.

Whereas FP languages work more like piping STDIN/STDOUT/STDERR streams between executables running in isolated processes, and have no scalability limit. We just don't have the hardware to properly run them, because we went the SIMD/GPU route instead of CPU clusters.

Note that most of the FP lingo like currying, closures, lazy evaluation, etc are clever and have their applications, but distract from the core essence separating it from IP.

People a little older than me did a lot of work in the 1990s to move away from mutable state (assembly, C++) and transition to pure functional synchronous blocking deterministic idempotent logic (Lisp, Prolog, VHDL/Verilog, HTTP, etc). Unfortunately corporations unwound most of that progress by reintroducing every anti-pattern in a million-monkeys typing proprietary race for profit. Async/nonblocking code is borrowed from monads and is a workaround for single-threaded code lacking proper scatter-gather arrays and higher order functions that can fork/join other threads.

The only language I know that gets this stuff mostly right is ClojureScript, which attempts to use pure FP by avoiding monads and executing logic as a one-shot process with new input/output passed while suspended waiting for events, similarly to how PHP works. It also has a software transactional memory that works like Redux for storing state, which is also related to copy-on-write arrays in PHP before version 5 introduced pass-by-reference classes and permanently derailed what made it compelling. Haskell and other FP languages mostly miss the point due to their over-reliance on monads and syntactic sugar, which costs them purity and makes them little better than IP languages for use-cases in real life. And of course since ClojureScript is Lisp-based, its parenthesis syntax makes it difficult to read, putting it into the category of write-only languages like Perl, which all but guarantees that it will never reach mainstream adoption.

I'm mostly skeptical of Rust for these reasons: a properly structured program doesn't have much (if any) mutable state, which negates most of the need for the borrow checker. I have a few dozen criteria that I use to judge what a "good" language would look like, and there simply isn't one yet. I would dearly love to write one, but it would take me perhaps 2-5 years, so with just nights and weekends and a few weeks of vacation per year if I'm lucky, that will never happen. The main challenge I face in my daily work is overcoming this condemnation of being forced to work with mediocre tools for the remainder of my life. And I see the 100x overhead cost of poor tooling reflected in the $200/hr wages of programmers whose productivity has fallen so low that a social network is considered a big lift. Whereas if we look at the logic involved, a teenager could write one in a couple of weeks with something like HyperCard/FileMaker/Access.

I'm reminded of the Kirk-Picard conversation that goes "I take it the odds are against us and the situation is grim?" because I've mostly witnessed regression instead of progress the last 25 years. But revolutions happen when things are at their worst, so that keeps me going and inspired.

TL;DR: mutability matters more than cyclomatic complexity.


CC tools applied to JS tend to count each half of an AND or OR operator as its own path, which is super annoying and not useful.


Rather than attempting to create an objective code complexity measure, I just try to implement domain specific rules into composable functions. Then code becomes linear and declarative.

For long-running processes, I switch to MPSC, where each event source is a thread but places its events onto the main thread's queue.
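
Roughly this shape, as a rough Python sketch (the event sources are made up): producers only ever enqueue, and the main thread handles everything in one linear loop.

  import queue, threading, time

  events: queue.Queue = queue.Queue()

  # Each event source runs in its own thread and only puts events on the queue.
  def poll_source(name: str, interval: float) -> None:
      for i in range(3):
          time.sleep(interval)
          events.put((name, i))

  for name, interval in [("timer", 0.01), ("network", 0.02)]:
      threading.Thread(target=poll_source, args=(name, interval), daemon=True).start()

  # Single consumer: the main thread processes events linearly, in arrival order.
  for _ in range(6):
      source, payload = events.get()
      print(f"{source} -> {payload}")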


Love this topic. Early in my computing career I was trying to modernize our software testing. I wrote a small program to compute the cyclomatic complexity of our Fortran code. I was especially pleased after discovering an error while working through the IEEE paper McCabe had published.


Is the "cycl-" pronounced as in "cycle" or as in "cyclical"?


cycle


Ah thanks, looks like I need to correct my pronunciation then...


I've definitely heard both saiclo- and siclo- in a math seminar context!


Amazing, great to know! In what context does this come up in math?


Cyclotomic polynomials are common, and people probably borrow the pronunciation, but I have always heard that pronounced as in "cycle".


If you’re using Ruby I recommend the RubyCritic gem. It’ll give you lots of information on your project.


It's good, but (without aggressive configuration) it's unfortunately easy to split up your code in unhelpful ways just to get the dopamine rush of passing all of the metrics.

Additionally, you can get much of the benefit of RubyCritic via the Rubocop extension in your editor (with the same caveats as above)





