
Casey Muratori knows a lot about optimizing performance in game engines. He then assumes that all other software must be slow because of the exact same problems.

I think the core problem here is that he assumes that everything is inside a tight loop, because in a game engine that's rendering 60+ times a second (and probably running physics etc at a higher rate than that) that's almost always true.

Also the fact that his example of what "everyone" supposedly calls "clean code" looks like some contrived textbook example from 20 years ago strains his credibility.

Edit: come to think of it, the only person I know of who actually uses the phrase "clean code" as if it's some kind of concrete thing with actual rules is Uncle Bob. Is Casey assuming the entire commercial software industry === Uncle Bob? It's like he talked to one enterprise java dev like 10 years ago and based his opinion of the entire industry on them.




The thing that sets him off is that he is using a computer with enormous computing power and everything is slow.

He does have a narrow view, but it does not make his claims invalid.

I liked that his POC terminal, written in anger, made the Windows Terminal faster. But even in that context it was clear that by making some tradeoffs - which the Windows Terminal team cannot make (99.99% of users never hit the issue, but Windows has to support everything) - it could be a lot faster still.

So we live in a world where we cater to many non-overlapping 1% use cases, which slows everyone down.

Many gamedevs build their own tools because they are fed up with how slow iteration is. The same thing is happening at bigger companies: at some point productivity starts to matter and off-the-shelf solutions start to fail.


> The same thing is happening at bigger companies: at some point productivity starts to matter and off-the-shelf solutions start to fail

This is perhaps the third time I've posted this on HN, but what you describe is the circle of life for widely-used software projects. Large tech companies are not immune to it, resulting in frequent component rewrites, deprecations and almost-drop-in replacements that shuffle complexity up or down the stack.

Step 1: A developer is fed up with how slow/bloated the current incumbent is, so they write a fast, lean, and mean project that solves their problems

Step 2: The project becomes popular on its merits, raking in stars on GitHub as people discover how awesome it is

Step 3: Users start discovering limitations for their use cases, issues and pull requests pour in

Step 4: Thousands of PRs later, the project is usable by most people and has "won". It is now the incumbent; it is no longer as fast as it once was, but it ships functionality catering to many niche needs

Step 5: Go to step 1


I've started a project that has the potential to go down this path, but I have a very strict 'implement your own diffs if you want them' policy for this particular project.

Like, feel free to fork away if you like. The core repository needs to be simple and stay true to its goals, and when it updates everyone downstream can update if they want to do that to themselves. But for what it is and does, maybe the project as it is is good enough.

I start feeling almost physically sick when I see the potential for bloat to creep into the software I write. This makes scaled software development with others particularly hard, however.


This is so true, especially when it comes to frontend development, and also for backend frameworks, though to a lesser extent.

But I also think that the product, library, or framework owner should really box in their project, rejecting wild feature growth and preventing over-generalisation of usage.


See Phoenix browser, now Mozilla Firefox.


Oh man, Phoenix, those were the days, 2003. I imagine a lot of HN readers here had only just been born.


Don't forget Firebird!


Lynx on the C64 -- Is more nostalgia possible?


"But even in that context it was clear that by making some tradeoffs"

It was literally a weekend POC, and Casey Muratori even went beyond the POC part and fixed some emoji/foreign language bugs that were present in the Terminal.

Also, his intent was not to replace the Terminal. His intent was to demonstrate that it was possible to do the optimization in the way he suggested. Originally a Microsoft PM dismissed his suggestions and claimed it would be a "doctoral thesis project" or something.

All this "yeah it's a narrow view" is just moving the goalposts more and more. Not only does he have to do a "doctoral thesis project" in a few days, he also has to completely replace a tool that's already written, bells and whistles and all? Where does it stop?


But how much of that slowness is due to code that values "cleanliness" excessively? I bet that if you look at the source of nearly any application on your PC, it will be very much not clean on average.


I think it would certainly be full of the kind of "clean" design patterns (or anti-patterns, as I consider most of them) that object-oriented programming evangelists espouse.


Fine, but is that likely to be the cause of these applications generally being slow?


It wasn't about the tradeoffs that the Windows Terminal team "can not make" - it was about alternative optimizations and performance concerns that they arrogantly refused to consider as being possible, and which ought to have been considered if they were being properly competent.

Engineering tradeoffs are real. But hiding behind them every time, even when it can be demonstrated with concrete evidence that they don't actually apply, is another thing altogether.


> The thing that sets him off is that he is using a computer with enormous computing power and everything is slow.

If that's his complaint, then "clean code" isn't the problem. The problem is capitalism and/or human nature.

Once something performs acceptably well, i.e. well enough to sell, performance isn't going to get any better. Flashy stuff and features get you money; going from 400ms to 100ms gets you...nothing.


> going from 400ms to 100ms gets you...nothing.

According to Amazon [0], that'd be a 3% gain in sales (assuming getting faster works the same in reverse as getting slower, anyway).

[0] https://www.gigaspaces.com/blog/amazon-found-every-100ms-of-...


That's if it's in your sales flow, not if it's in your software.


Quite often the issue isn't 400ms vs 100ms, it's literally seconds vs single-digit ms.

> The problem is capitalism and/or human nature.

Fundamentally, yes.


[flagged]


[flagged]


The industrialization of the Soviet Union was done at the cost of massive famine, the murders of the Great Purge, and aggression toward neighboring countries (I am talking about pre-1939 now; WW2 Soviet atrocities are not even scratched here), resulting in millions of deaths and bringing misery to generations, in some areas until today.


I want my car to have a subscription to nice heated seats when I am stuck in traffic after working unpaid overtime.


>Cars weren't as big a priority for the communists. They were right too, cars and car infrastructure are super inefficient.

How to say that you're a city kid, without saying you are a city kid.


Well yes, rural infrastructure is super inefficient compared to urban infrastructure.


> He does have a narrow view, but it does not make his claims invalid.

I would tend to disagree on this, especially when the claims come from the gamedev world. Games are presented as finished pieces (even when they aren't), and not just as a release milestone. Ideally, a game is a one-off effort where you write a piece of code and if you're lucky, you won't have to touch it again. So doing one-off optimizations instead of focusing on milestones and long-term maintainability of the code is not only a possibility, but actively encouraged. That's why until rather recently (20 or so years ago), assembly optimization was common for critical execution paths, if not for most of the product.

Most other software doesn't work like that. You often implement something that will be maintained, modified, extended and iterated on for years or decades, not by you, but by several other teams with totally different experience and backgrounds. Doing some fancy trick to skip a cleaner, extensible, maintainable design because you shaved off a couple of cycles is literally burning your employer's money and potentially causing huge maintainability issues, as many programs don't get to rely on a happy path the way games do.

The main reason modern systems are slow isn't (just) that programmers are lazy - it's that most software, unlike games, has compatibility and maintainability requirements and, more often than not, a huge legacy-support burden. And in these systems, most development time is actually spent maintaining and extending existing code, not writing new code.

The author's assertion is fundamentally wrong, because software engineering is quite a bit more than performance - even when it matters.

Flashback to the beginning of the 90's, and "every game" used Bresenham's algorithm to skip usage of the (slow or non-existent) div instruction. In some cases, a couple of bitwise shifts would also eliminate mul operations. These implementations were in some cases 2-4x faster than the classical counterparts, on a 12-40MHz machine. Two cpu generations later, the Pentium comes out, and both mul and div take 1 clock cycle. The fancy-pants implementation is now 3-5x slower at the same clock speed. Except now the cpu clock is 4x faster, and shoveling around registers may actually impede parallel execution of code. All of this in a 5-year window.

I envy the relatively stable instruction set of the last decade, where everything is sort-of predictable and assertions about speed can be made about code with a relatively high degree of confidence, but the reality is, silicon is cheap, and for most applications, performance is gained not by throwing away what makes some huge applications barely maintainable, but by deploying hardware: new, fancy, faster, cheaper and more economical hardware. Choosing a single metric (performance) and an instance in time to bitch about something is actually a disservice to the community at large.
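
To make the mul-vs-shift trick concrete, here's a sketch (illustrative only, not code from any actual game): 320 = 256 + 64, so a VGA mode 13h row offset y*320 could be computed without a multiply at all.

  // Illustrative strength reduction, the kind of trick described above.
  unsigned RowOffsetMul(unsigned y)    { return y * 320; }
  unsigned RowOffsetShifts(unsigned y) { return (y << 8) + (y << 6); }  // 256y + 64y

Back then the shift version was typically faster; a modern compiler will generally pick the best form for either spelling anyway, which is exactly why the hand-tuned version ages badly.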


> Two cpu generations later, the Pentium comes out, and both mul and div take 1 clock cycle.

Where are you getting this information? Agner[0] lists DIV as taking 17 cycles at best (8-bit operand already in a register) on the P5, and MUL as taking 11 cycles. Even Tiger Lake takes 6 cycles for DIV.

There are ways [1] to beat that, but I don't think you can get it down to a single cycle.

[0]: https://www.agner.org/optimize/instruction_tables.pdf p.162

[1]: https://lemire.me/blog/2019/02/08/faster-remainders-when-the...


You are completely right, I just had a major brain fart. I probably mixed up some things (or had some bad/incomplete source at the time). More than two decades have passed, so it's probably me with wires crossed.


> “Ideally, a game is a one-off effort where you write a piece of code and if you're lucky, you won't have to touch it again.”

Ideally a game pulls in over a billion dollars per year, every year for over a decade. Think World of Warcraft or Fortnite, not Flappy Bird.


And the amount of money changes the fact that the core engines are written as a one-off effort how, exactly? Updates to the scripting engine to fix playability issues and content updates aren't really heavy software refactoring. Sure, there are usually some actual code bugfixes in the initial releases - more often than not related to someone implementing some really clever trick that raises an exception on some CPU. It's not like they are incrementally rewriting and extending the internal engine for a decade, as happens with, e.g., a browser.


If you think that, you're not familiar enough with modern games as a service. Fortnite lives on the latest version of Unreal Engine, and Unreal Engine has changed a lot from the initial Fortnite development until now, with many new features, rewritten parts, and major refactoring of other parts. It's huge and constantly evolving, so it is similar to, e.g., browsers.


"Games are presented as finished pieces" is an idea that's at least a decade out of date. The industry has gone very hard on the idea of Games as a Service and it's now normal for AAA games to receive years of content updates.


From a software perspective, they are. Most games don't change requirements (or base code) during their maintenance releases; these releases may fix some code bugs, but more often than not they only provide incremental content updates. Compare that with, e.g., intermediate releases of software like OpenOffice.


Casey makes the point that you don't have to hand-tune assembly code, but instead just write the simpler code. It's easier to write, easier to read, and runs faster too!

If there's something wrong with that advice, I can't imagine what it is...


"Easier to write, easier to read" is the part that's wrong with that advice.

It absolutely is, on toy problems like the one described in the article.

It very frequently is not when embedded in much larger domains as part of large projects maintained over years by teams.


As a counterpoint: "Clean Code", at least the variant from the book, is very frequently also extremely difficult to write or to read in larger codebases too.

The claim that "Clean Code" scales better or allows for more maintainable software hasn't been proven by anyone, and everyone with enough experience has worked with several counter examples.

The problem of code maintainability is not solved by this coding philosophy.


Sometimes people forget that "Clean Code" is a book, and it's not exactly stellar. This is a pretty thorough tear-down: https://qntm.org/clean

I'm totally onboard with prioritizing readability over performance for most code, but the style in the book has a lot more tradeoffs than it discloses, and you often don't really appreciate that until you are trying to debug if/how A transitively calls Z in a 10 million line codebase.

It's really hard to have a constructive conversation about this though since it's so subjective, any example is too trivial, and any real system is too large.


Do you have any example of an open source project written in the simple style suggested by Casey Muratori that is very difficult to maintain and that you think would benefit from using "clean code", "SOLID", design patterns, and abstractions on top of abstractions?

If you can't point to an actual example, I don't think you have a solid case.


This. Clean code isn't about writing stuff like this. For what he's talking about, the overhead of polymorphism is a major part of the total cost, and his case is simple enough that polymorphism adds little value.

However, the bigger your task gets, the more value there is in polymorphism, and in general the smaller the percentage of total time that goes to the polymorphism overhead.

And note that his attack is only on polymorphism, not the other aspects of clean code. I strongly suspect the compiler optimizes away much of the clean stuff I do but I have never checked. I also find profiling easier on cleaner code, it makes it very obvious where the time sink must be and thus what warrants expending effort to improve. Profiling almost always shows the vast majority of time going into the unavoidable (say, disk reads) and a small number of other routines. Spend your optimization effort on the spots that need it because 99+% of your code doesn't run often enough for it to matter.


And when you combine inadequate abstractions with programmers who aren't the kind of geniuses brought in to optimize game engines, you get performance problems that are very difficult to fix.

One of the nice things about some of the clean code concepts he uses is that (as he shows) you can tactically step back from them in key, performance critical areas and reap these wins.

If you stay too low level you get lots of tangled spaghetti code with major performance problems and no obvious way forward besides "make it better."


I find it a bit disingenuous to call what Casey Muratori is doing "staying in the low level".

Using procedures/functions is not exactly "low level". Using switch is not low level. Lookup tables are something you have to do in high level code all the time.

Sure he could have used much better variable naming (CTable?) and probably documentation, but code-wise there's nothing that screams low level there.


I'm not sure how it's disingenuous, I sincerely believe what I said and I'm not trying to fool anyone.

I would consider most of his replacements lower level than the typical clean code practices he critiques (especially the ones like iterators that he mentions but avoids in order to steel-man the clean code side a little bit), not the lowest level possible. They take into account how the machine actually works and avoid additional indirection, which is why they perform better.


His code does translate relatively straightforwardly into Haskell. Do you think Haskell is a low-level language, too?

Take Listing 27, getAreaUnion for example:

  f32 const CTable[Shape_Count] = {1.0f, 1.0f, 0.5f, Pi32};
  f32 GetAreaUnion(shape_union Shape)
  {
      f32 Result = CTable[Shape.Type]*Shape.Width*Shape.Height;
      return Result;
  }
Is represented quite straightforwardly:

  {-# LANGUAGE OverloadedRecordDot #-}
  
  data Shape
    = Square    { width :: Float, height :: Float }
    | Rectangle { width :: Float, height :: Float }
    | Triangle  { width :: Float, height :: Float }
    | Circle    { width :: Float, height :: Float }
    
  cTable :: Shape -> Float
  cTable shape = case shape of -- The "lookup table" or "array"
    Square    {} -> 1
    Rectangle {} -> 1
    Triangle  {} -> 0.5
    Circle    {} -> pi
  
  getAreaUnion :: Shape -> Float
  getAreaUnion shape = cTable(shape) * shape.width * shape.height
Although it is typically easier to abstract in a "high level" language, abstraction does not require it. This whole debate is rooted in false assumptions and the need to take a side, imo. Casey has a point, it is just ignored in a typical hand-wavy fashion. "The toy example doesn't scale" is a poor argument, especially when what we can observe is slow software.

The stuff proposed in the post is not rocket science, it is a very straightforward implementation of tagged unions. Instead of fetching a vtable and jumping to a value there, he proposes to branch on the tag. This is essentially dynamic dispatch on a known set of types.
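
For reference, the tagged union behind Listing 27 looks roughly like this (reconstructed from the listing's names; u32/f32 are the article's aliases for uint32_t/float, and the exact definition may differ):

  enum shape_type : u32
  {
      Shape_Square,
      Shape_Rectangle,
      Shape_Triangle,
      Shape_Circle,
      Shape_Count  // sizes CTable
  };

  struct shape_union
  {
      shape_type Type;  // the tag that GetAreaUnion indexes CTable with
      f32 Width;
      f32 Height;
  };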

Additionally, he shows that this can result in speedups greater than a factor of 1. Any program that wants low latency or high throughput can profit from this observation.

This way of programming is by no means the one to rule them all. It has different advantages and drawbacks; none of which have anything to do with the percieved intelligence of the programmer or later consumers, for that matter.

An objective disadvantage of this style is that the program can't interface, as a caller, with code that hasn't been written yet. Another disadvantage is that the size of the tagged union is defined by its largest "subclass".

In the end, what he has shown is that speed is often a compromise made unnecessarily. This doesn't really have to do with clean code anymore, as I can see how a compiler could implement what he is angry about with virtual functions in every situation where his style is applicable.

Casey has had a similar episode with the Windows Terminal, and somewhere in his videos a different, yet arguably worse, problem comes up: a lot of libraries do not care enough about performance. If you write a program and do care, you may run into the problem that the library you use is your bottleneck. If this library is hard to replace (imagine needing a rocket scientist), then you are done for. In that specific case it was DirectWrite and some other Windows API that were slow. So if, for one reason or another, the Windows team was required to use both, they'd have a hard limit on how fast they could go, just due to that. There is no "being smart" or "requiring a genius" involved in a forced/strongly recommended library here.


It can in fact sometimes be easier to refactor low-level code than to dig yourself out of bad, leaky abstractions.

https://caseymuratori.com/blog_0015


> If there's something wrong with that advice, I can't imagine what it is...

It will start getting really annoying when you try to add shape ‘hexagon’ and need to figure out all the places where a shape can potentially be used, just so you can update the switch statements.


Many languages provide unions or sum types along with exhaustiveness checking to make this very easy (frequently not OO-inheritance-based languages, though).


Why would you go with "many languages" into a thread showcasing how C++ sucks? I'm pretty sure that had the author of the video done all the same manipulations in Python, the speed difference would've been negligible.

The author of the video discovered that the C++ compiler is dumb when it comes to optimizing virtual method calls (i.e. that instead of writing bare virtual method calls he had to help the compiler by spelling out the conditions under which those virtual calls could be replaced with static calls). Essentially, all his video is saying is: "virtual calls bad, if-else good". Which is what every C++ game-dev thinks after a few years on the job. Which is amusing in how short-sighted it is, and sometimes even more amusing when you discover the "solutions" created by such C++ game-devs that aim to replace C++ objects, but do it in a way that's even worse than C++'s original design (who would've thought that possible!?)


What happens if a library user wants to extend functionality? They can't inject their code into the library.


If this is going to be a library where that's a desirable feature, architect it for that feature. In the example, one easy way would be to make the coefficient table expandable/replaceable. If you really need to run arbitrary user code, then write an interface that the user will conform to and call their code. You don't even need OOP support to do that easily, just typed function pointers.
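
A minimal sketch of the function-pointer version (hypothetical names, not from the article):

  // The library defines the interface...
  struct shape_ops
  {
      float (*Area)(float Width, float Height);
  };

  float LibraryComputeArea(shape_ops Ops, float Width, float Height)
  {
      return Ops.Area(Width, Height);  // library calls back into user code
  }

  // ...and the user plugs in their own shape without touching the library.
  static float HexagonArea(float Width, float Height)
  {
      return 0.75f * Width * Height;  // coefficient is illustrative only
  }

  // usage: float A = LibraryComputeArea({ HexagonArea }, 2.0f, 3.0f);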


Yeah, it's very awkward. Your best option is to leave a 'hole' case where someone may provide a 'data type' set of functions satisfying an interface, and the library author simply calls them. Effectively you're adding an OO escape clause, but it's ugly and will break user code when you add more functions and grow the interface.

Conversely, in a codebase organized by objects it's not clean to add an extra method to the base class and each subclass. You have to write an external function and switch over every known subclass inside it, which is also very ugly and will also break when you add more subclasses.

The two designs are actually the duals of each other. Someone compared it to rows vs columns and it's a great comparison.

In OO, the methods are columns and each new row is a new subclass implementing them.

In FP, the types are the columns and each new row is a function that switches over the possible types.


Bob Nystrom has a good article on this (the expression problem, it's called)

https://journal.stuffwithstuff.com/2010/10/01/solving-the-ex...

He also discusses it in his book Crafting Interpreters.


Even in C, the compiler will emit warnings for unhandled cases in switch statements, as long as you don't provide a default case (as you shouldn't).
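
For example, with GCC or Clang and -Wall (which includes -Wswitch), something like this gets flagged, but only because there is no default:

  enum class shape_type { Square, Rectangle, Triangle, Circle };

  float Coefficient(shape_type Type)
  {
      switch (Type)
      {
          case shape_type::Square:    return 1.0f;
          case shape_type::Rectangle: return 1.0f;
          case shape_type::Triangle:  return 0.5f;
          // Circle missing -> warning: enumeration value 'Circle' not handled in switch
      }
      return 0.0f;  // reached only for unhandled/invalid values
  }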


Depends on the type of code you’re writing. If your `switch` is tightly coupled to the code that defines the cases and they’ll definitely be changed in lockstep, a default is more likely to be harmful.

If your cases are defined externally, and you need to be forwards compatible, omitting a default is wrong.

The Swift language specifically added `@unknown default` for switching over enums.


Depends on your language I suppose. I haven’t worked with a ton of compiled languages.

But we can just re-up the problem by adding 100 different shapes instead of the one. Now you have switch statements with 104 cases each spread through your codebase.


I prefer to have those 104 cases all in one place (as is the case when it is a switch statement) rather than each in a separate file that I need to jump between (as is the case with polymorphism). This situation is a bit analogous to organising things column-wise vs row-wise. And in practice I find that I need to jump around a lot less with code that uses switches than with code that uses polymorphism. Tangentially, the latter is also more prone to turning into spaghetti, as the whole is obscured by indirection levels between the parts, but you don't see the spaghetti until you try to step through the code, when debugging an issue or just trying to familiarise yourself with a new codebase.


Conversely it's really annoying to add a new method to each shape--you've got to open them all up and add shape-specific code for them with a bunch of boilerplate. With switch statements you just add one more function.

This is the "expression problem": https://en.m.wikipedia.org/wiki/Expression_problem


switch wasn't 'invented' for enums. switch is a low-level construct which mimics several gotos/jumps. It's just as bad, except for the cases where you either a) have multiple behaviors for the same value, or b) need to fall through (not break). b) is the number one reason I hate switches, and my number two reason is that most languages don't support proper enums and will fail when you don't handle all possible values


Switch was invented because it allows replacing several ifs and gotos with a precomputed static jump table, as an optimization trick.
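
Roughly the same thing can be written by hand with a table of function pointers, which is more or less what a compiler often emits for a dense switch (handler names here are made up):

  enum opcode { Op_Add, Op_Sub, Op_Mul, Op_Count };

  static void HandleAdd() { /* ... */ }
  static void HandleSub() { /* ... */ }
  static void HandleMul() { /* ... */ }

  // One table lookup plus an indirect jump, instead of a chain of compares.
  static void (*const Handlers[Op_Count])() = { HandleAdd, HandleSub, HandleMul };

  void Dispatch(opcode Op)
  {
      Handlers[Op]();
  }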


Switch is not bad, far less is there any reason to hate it.


Agree. It's the same as saying don't use for loops or any other basic language constructs. Switch is very useful, please leave switch alone! You will not take switch away from me :)


When do you actually fall through? Only some state machines do that. But even in that case, ifs are clearer and less error-prone, given missing breaks, braces/scoping issues, etc.


> It's easier to write, easier to read, and runs faster too!

It's still possible to get a bottleneck in assembler.

Whatever language is used, executable code still needs to be profiled using tools as described here.

https://en.wikipedia.org/wiki/Profiling_(computer_programmin...


The problem is that Casey has a very particular definition of simple which is problematic to apply in many cases.


Does he? If anything, it is the definition of "Clean Code" that is somewhat special compared to previous usage of OOP and other paradigms.

Casey's definition of simple actually reminds me of the classic line from Rich Hickey: it's simple, but it's not necessarily easy.


I don’t recommend adopting “Clean Code” either for similar reasons.


Fair enough!


> Is Casey assuming the entire commercial software industry === Uncle Bob?

It's uncharitable to take Casey as making absolute blanket statements like that, but still, it would not be unreasonable for him to single out Uncle Bob in particular.

The Amazon rankings for Bob Martin's "Clean Code":

Best Sellers Rank: #5,338 in Books (See Top 100 in Books)

#1 in Software Design & Engineering

#2 in Software Testing

#4 in Software Development (Books)


This comment helped make sense of this whole comment section for me.

I work in game development, largely with optimisation. I mostly work with GPU optimisation, which is a whole different beast. On the CPU, most of the time the issues are either trying to do too much stuff in a hot loop (rendering stuff that could have been culled, putting physics on objects that don't need it, ...) or doing something in a slightly inefficient way in a hot loop. Because everything in the game is indeed a loop consisting of a series of hot loops.

People in this comment section call his example contrived, but it's very similar to one of the biggest performance improvements I've seen in practice.


Hot loops are where you spend your optimizing efforts. If you're going through that list of shapes again and again it very well might be worthwhile to cache some data and provide the objects with a way to update the cache.
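
One common shape of that caching is a dirty flag per object (an illustrative sketch; imagine the computation being pricier than a single multiply):

  struct shape
  {
      float Width = 0, Height = 0;
      mutable float CachedArea = 0;
      mutable bool Dirty = true;

      void SetSize(float W, float H) { Width = W; Height = H; Dirty = true; }

      float Area() const
      {
          if (Dirty)  // recompute only when the inputs actually changed
          {
              CachedArea = Width * Height;
              Dirty = false;
          }
          return CachedArea;
      }
  };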


There are a lot of clever techniques already in play to minimise the amount of data you need to consider.

Still, each triangle's position, shape, and other properties can change each frame, as does that of the camera. So you cannot avoid doing some amount of work for each of the visible triangles and their vertices each frame.

Since you need to update the screen at a consistent frequency (typically 30 or 60 times a second) and the list of triangles that actually need to be rendered each frame is in the millions... Well, that's a lot of work which cannot be avoided.


Bob Martin has made a real effort to tie himself to the specific phrase "clean code." If the author of this article had referred to clean code without using quotation marks, or talked about managing software complexity using any other term, you'd be right, but I think he's specifically talking about Bob Martin's "Clean Code" and the school of object-oriented philosophy that cleaves close to his beliefs.


The fact that you think "everything is inside a tight loop" doesn't apply to all code today already shows your model of code is broken, because you believe the syntactic sugar that modern programming languages and paradigms provide is actually reality. If everything weren't in a loop, your program would halt once it finished whatever it was calculating. Just because you have things like callbacks and things feel lazy doesn't mean they don't really operate in a loop at a deep level; of course they do. You're just insulated from it because you only write hooks and such, and you don't actually see the loop.

Believe it or not, callbacks are not like interrupts; there is a loop somewhere that checks the status of something and then runs the callback. All computer software today involves things that run in loops, you just don't see it. Web browsers do it! Of course they do.
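
A bare-bones sketch of that hidden loop (illustrative only; real event loops block on I/O readiness and handle timers, priorities, etc.):

  #include <functional>
  #include <queue>

  static std::queue<std::function<void()>> PendingCallbacks;

  void Post(std::function<void()> Callback)  // what "registering a callback" boils down to
  {
      PendingCallbacks.push(std::move(Callback));
  }

  void RunEventLoop()
  {
      for (;;)  // the loop you never see from inside a callback
      {
          if (PendingCallbacks.empty()) continue;  // a real loop would sleep/block here
          auto Callback = std::move(PendingCallbacks.front());
          PendingCallbacks.pop();
          Callback();  // the status of something changed, so run the handler
      }
  }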

Moreover, he didn't contrive his example; he said he in fact took a textbook example used by the advocates of polymorphism and such.


You dropped the word "tight", it is important.


How so? What loop isn't "tight"?

EDIT: To expand on this: why is modern software slow? Because people, thinking their code isn't a bottleneck and that performance doesn't matter but adherence to the right abstractions does, write slow code and callbacks assuming everything happens immediately, with layers of abstractions, and those little innocent steps add up when every piece of code written today is written with the same neglect. The callbacks run slowly, queues fill, promises hang, and so on and on we go.

So sure, maybe your code doesn't seem like it needs to run at 60fps. But when everything I do on a computer is written as if it will only run every 10 seconds, it definitely becomes noticeable. That's because I don't just run your program or look at your website; I look at 10 of them or more. If you average all of those over time, then maybe your code should in fact be able to run at 1Hz or so, or I will start to notice.

People were of course right to teach new devs not to over-optimize immediately, but the culture has swung so far in the other direction (especially since you all seem to love complexity so much) that you've managed to make computers that can calculate pi faster than a supercomputer from the 70s crawl while rendering and handling an editable textbox. There has to be a move back in the other direction; you guys need to give a shit about performance, at least a little.


The kind that runs once and then waits geological ages to hit the next iteration. For example:

    command = ""
    while (command != "quit") {
       command = readline()    // blocks here, possibly for geological ages
       handle(command)
    }


handle(command) runs inside another loop at the same time: the scheduler.


New developers are still watching Uncle Bob videos online and taking those strategies as the default for how you craft code, largely because not many people since then have made similarly grandiose claims about how software should be crafted and formed entire companies pushing adoption of those techniques commercially. We even had a young dev leave our company and form a startup around the idea of doing what we do, but as a full "clean code" rewrite. Our software already has major performance issues; I'm not hopeful about the speed of his code after he layers on even more abstractions.


Exactly. The problems of software performance come from decades of poorly/quickly executed evolutionary change resulting in bad systems design. It's all a new abstraction over an older abstraction over an even older abstraction, because some old application still needs to be supported (something Casey has likely never had the problem of worrying about in game development).

Game developers have the luxury of starting from near-scratch every once in a while. That's exactly what his lauded Handmade series is all about. I'm guessing things wouldn't be so clear-cut if he were given a 10-year-old codebase to iterate on.

"Normal" developers see game developers as gods walking amongst us and place far more value on their opinions than they should. The truth is that game developers and "normal" developers face equally as challenging problems, just different problems. As a trivial example, an experienced web developer could probably run circles around Casey in terms of elegantly accounting for browser quirks (conversely, the web developer would probably be stumped about data oriented design). Either could learn the other's discipline, but each would have decades head-start on the other.

The idolization of gamedevs is extremely frustrating, especially when it comes to appeals to their authority.


Even more annoying than the idolization of game devs is when you open up the "Displays" pane of the macOS System Preferences application and it takes several times longer to load than in the previous major OS version (several seconds!), with the only significant change being a different layout, and you're constantly trying to ignore how nearly everything takes so much longer than necessary, wasting so much time and energy.

I agree that not everything is like a game, but it makes me legitimately sad when it seems like nobody cares about performance (aside from a few domains).


I am a developer who has worked on embedded, desktop apps, mobile apps, games, and now on a large microservice-based application. A developer is a developer and can move from one kind of application to another.

I find what Casey says in his videos to be true. And I thought about that stuff before I even watched his videos, which are excellent.

However, I have started not to care. I don't want to start fights inside the company, especially fighting alone against many OOP cultists. It's not my money at stake, so if companies as a whole decide on OOP, clean code, SOLID, design patterns, and abstractions on top of abstractions, making the code bases giant piles of junk while degrading performance, I am not going to go against the crowd.

Code that I write for myself is quite different than code I write for my employers.

I just hope that the industry as a whole will wake up from the whole OOP nightmare.


> I just hope that the industry as a whole will wake up from the whole OOP nightmare.

I agree with that 100%. OOP is a giant mess, no matter where you stand on code clarity or performance. It's objectively worse in both regards.


> something Casey has likely never had the problem of worrying about in game development.

This is simply not true; he has in all likelihood worked on such problems given his work at RAD, whose software has been in use for 20+ years at this point.

> The problems of software performance come from decades of poorly/quickly executed evolutionary change resulting in bad systems design.

This may be true of some code bases, but it's demonstrably false for new software that's created today. Lots of new software gets built and it's slow.


> The idolization of gamedevs is extremely frustrating

Every story needs a Hero; it's inspiring when you're trapped in the CRUD gulag (until you see the TC/WLB).


Fair enough, but never forget that you can be your own hero. I have heard numerous accounts of hobby coding being used as a successful antidote to chore coding.


Then again, things aren't in the critical path until suddenly they are.

Regardless of scenario, I will never willingly do an O(n^2) sort when writing new code, just in case those 10 items suddenly turn into 10,000 one day.


"Regardless of scenario" limit your options.

If you are shipping a binary to your users that will never be able to get updates, your cautiousness would be justified. There are other situations where it needlessly limits your options.

There are situations where I have knowingly written O(n^2) or worse, and put a # xxx dragons marker by it. Quick to write, leave my options open, keep my momentum on the problem I care about.

I will grep for xxx issues at some later time. I may end up throwing out the code before that happens. If I hit big-o issues before then, I can refactor.

I once had a system where the important problems turned out to be a series of IO bottle-necks - nothing to do with computation - but that was obscured because good sense had been burnt at the altar of compute efficiency.


Are you even manually implementing sorts frequently?

Even languages that are notorious for having tiny libraries, like C and JS, have built-in sorts.


It's less about accidentally writing n^2 sorts and more about accidentally creating n^2 algorithms.


Not sorting, but very simple algorithms that can't afford having O(n^2) performance? That's common even in CRUD apps.


Perhaps not sort so much, but certainly with search I've seen people roll their own inefficient search functions many times.


It is the exact same situation, though. Most people can just chain a sort and a binary search to do it, and both are included in most languages. Or just put it into a tree map, if your language has it.
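
E.g. in C++, a sketch of the sort-then-binary-search chain:

  #include <algorithm>
  #include <vector>

  int CountHits(std::vector<int> Values, const std::vector<int> &Needles)
  {
      std::sort(Values.begin(), Values.end());  // O(n log n), done once
      int Hits = 0;
      for (int Needle : Needles)
      {
          // O(log n) per lookup instead of a linear scan
          if (std::binary_search(Values.begin(), Values.end(), Needle)) ++Hits;
      }
      return Hits;
  }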


The moon doesn't fall into the ocean until it does.


Not a fair comparison :P.

The point is that the developers may think O(n^2) is fine because their toy use cases had n=10...100, but then actual users will try to use the software for n=10k, or n=100k, and then either waste their lives working with suddenly slow software, or look for alternatives.

I walked into a case like this the other day. I wanted to do a little semi-collaborative project planning. I found a nice tool, played with it for a moment, figured it has the functionality I need and it's fast enough. Then decided to do the actual plan. Once the number of entries in the system went from 10-20 to 30-40, I started to feel things get a little laggy. 50-60, more laggy. At this point I was committed, so I suffered the tool for couple of months, as its UI kept breaking when handling 100 entries. If I knew this would happen at the start, I'd look for something else. But instead, I walked into a hidden O(n^2) somewhere, that makes me hate the product with a passion now.


It's more than that. The way black box composition is done in modern software, your n=100 code (say, a component) gets reused in another thing somewhere above, and now you're being iterated through m=100 times. Oops, now n=10k

Generally, Casey seems to preach holistic thinking: find the right mental model and just write the most straightforward code (which is harder than it looks; people get distracted in the gigantic state space of solutions all the time). However, this requires 1. a small team of 2. good engineers. Folks argue that this isn't always feasible, which is true, but the point of these presentations is to spread the coding patterns and knowledge to train the next generation of engineers to be more aware of these issues and work toward that smaller-team, better-engineers direction, knowing that we might never reach it. Most modern patterns (and org structures) don't incentivize these two qualities.


> The way black box composition is done in modern software, your n=100 code (say, a component) gets reused in another thing somewhere above, and now you're being iterated through m=100 times. Oops, now n=10k

That doesn't seem quite right, as 100 * (100^2) <<<<< 10000^2


Yeah I was only talking about quantities. Equivalently, assume that it's a linear algorithm in the child and a linear one in the parent. Ultimately it ends up as O(nm) being some big number, but when people do runtime analysis in the real world, they don't tend to consider the composition of these blackboxes since there'd be too many combinations. (Composition of two polynomial runtimes would be even worse, yeah.)

Basically, performance doesn't compose well under current paradigms, and you can see Casey's methods as starting from the assumption of wanting to preserve performance (the cycles count is just an example, although it might not appeal to some crowds), and working backward toward a paradigm.

There was a good quote that programming should be more like physics than math.


There was a fun example in the Julia compiler around a year ago.

Part of the compiler was O(N^2) in `let` block nesting depth. That is

  let x = foo(), y = y, z = 2y
    ...
  end
would be a depth of 3. It didn't seem like that should be a problem, N is never going to be 10, let alone 100, right?

Until suddenly, `N` was in the thousands in some critical generated code spit out by some modeling software, so that handling the scoping introduced by `let` suddenly dominated the compilation time...


The "contrived textbook example from 20 years ago" still has a very real impact today. In my experience there are still lots of development teams that are instructed to develop in a "Clean code" style in the flavour of Uncle Bob. It's especially true in the .net development space, and is almost a cultural problem within .net.

As a former .NET developer who was often pushed into "clean code", my big takeaway from the video was that not using "clean code" techniques such as polymorphism actually made the code so much more readable and easier to grok that the optimisation that followed was completely natural.


There are a lot of people who advocate "clean code" principles without ever having read or knowing about Uncle Bob, because those "enterprise java dev like 10 years ago" folks sort of seeped into the industry.

It's the same thing with TDD zealots. Or any other fad driven development paradigm, which our industry is filled with.


This summarizes my impression of the article. It reads like a freshman CS TA giving a "well, akshually" speech. This is all old news - everyone knows vtables are "slow" in some very loose sense of the term. In general, these optimizations don't make a big enough difference to be worth considering while designing your code. In very particular domains, these ideas are valid, but a lot of that code is written in C so there's no dynamic dispatch anyway.



