
Data-Oriented Design (Why You Might Be Shooting Yourself in The Foot With OOP) - gruseom
http://gamesfromwithin.com/data-oriented-design
======
akkartik
I find it useful to consciously separate input data structures from
intermediate data structures (accidental complexity). I try to structure my
code so it doesn't rely on intermediate data structures; instead, it knows how
to recompute them. When this works it can be very pleasing: just input data
structures with caching all over the place.

Sometimes I think I'm chasing pure functional programming, but in a more
pragmatic form and separated from type-checking.
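
A minimal sketch of the "input data plus caching" idea above, assuming Python
and using the standard `functools.lru_cache`. The order data and the names
`ORDERS`, `totals_by_customer`, and `top_customer` are hypothetical and not
from the comment; this is one possible reading of it, not the commenter's code.

```python
from functools import lru_cache
from types import MappingProxyType

# Input data: the only state the program treats as authoritative.
# (Hypothetical sample data, for illustration only.)
ORDERS = (
    ("alice", 30),
    ("bob", 12),
    ("alice", 8),
)

@lru_cache(maxsize=None)
def totals_by_customer(orders):
    """Intermediate structure derived purely from the input.

    Because it depends only on its argument, the cached copy can be
    discarded at any time and recomputed on demand.
    """
    totals = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0) + amount
    # Return a read-only view so callers cannot corrupt the cached value.
    return MappingProxyType(totals)

def top_customer(orders):
    # Callers ask for the derived data; they never store it themselves.
    totals = totals_by_customer(orders)
    return max(totals, key=totals.get)
```

Calling `totals_by_customer.cache_clear()` drops the intermediate structure,
and the next call rebuilds it from the input. Note that `lru_cache` needs
hashable arguments, which is why the input here is a tuple of tuples.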

~~~
gruseom
That's very interesting. We're doing something similar. I'm curious as to how
you reconcile your "intermediate data structures" with one of the principles
in the OP, that of minimizing the transformations you have to do on your data
in the first place. The latter is a profound insight that I am slowly
digesting. One thing it throws out the door, for example, is layered
architectures. Not a small deal! Yet it makes sense to me, because my
experience with layered architectures has been that the more nicely modular
and well-defined you make each layer, the more bloated and nasty the mappings
between layers become.

 _Sometimes I think I'm chasing pure functional programming_

No question this is more suited to FP than OO.

Edit: this really is a rich subject. It's interesting that a lot of this
discourse is coming out of the game dev world, because that's a section of the
software universe which is relatively free of pseudo-technical bullshit
(probably because it's so ruthlessly competitive and the demands on the apps
are so high).

~~~
nostrademons
I found that minimizing transformations on your data is a principle you apply
when you productionize. For most of the development cycle, you want to keep
things as debuggable as possible (at the possible expense of performance), and
intermediate data products + debugging hooks are a good way to do this.

This brings up a much bigger question of when to productionize, though. Most
programs are never actually "done", but at some point you have to release to
the public and hopefully get millions of users. You need to make the
performance/maintainability tradeoff _sometime_. The later you push it off,
the more productive you can be in the critical early stages, and the better a
product you can bring to market. But if you push it off too long, you miss the
market window entirely and don't get the benefit of user feedback.

~~~
gruseom
But these are fundamental design issues. You can't change fundamental design
when you "productionize"; coming up with that design and implementing it _is_
the development cycle.

~~~
nostrademons
Productionize usually means "rewrite". I think that software engineers in
general have become too averse to rewriting code; as long as you do it with
the same team that wrote the prototype, it's often a good idea to throw away
everything and start from scratch.

The development cycle for me is much more about collecting _requirements_ than
coming up with a design that satisfies those requirements. That's what
iterative design is about - you try something out, see if it works for the
user, see what other features are really necessary for it to work for the
user, and then adjust as necessary. Once you know exactly what the software
should do, coming up with a design that does it is fairly easy.

My current project is nearing its 3rd complete rewrite since September, plus
nearly daily changes that rip out large bits of functionality and re-do them
some other way.

~~~
akkartik
_"software engineers in general have become too averse to rewriting code"_

Fervently agree. I was one of them.

No amount of rewriting is too much - as long as you constantly have a working
app.

------
Shamiq
Why bother with the acronyms? Just look at the problem and figure out a
beautiful solution. It's a lot tougher than just picking a design philosophy,
but the result justifies the mental effort.

~~~
chadaustin
That was a bit of my reaction too. But then I thought:

Object oriented programming solves a great many problems with the construction
of large systems.

However, when you're writing real-time or interactive systems, there's no
escaping the fact that you must understand how CPUs, memory, and caches work.

If your game turns out to be successful and you need to fit its frame updates
in roughly 16 milliseconds (60 frames per second), then you'll need to
optimally map your algorithms to the hardware.

However, most startups and most games fail. So why not optimize for whatever
it takes to prove a product and scale an engineering team? As long as you
understand the optimal capacity of the hardware, is initially writing your
system with OOP so bad? I don't think so.

On the other hand, these types of discussions are a great way to teach people
about the realities of modern hardware.

~~~
chipsy
Why did you conflate startups with games? A game ships once (unless it's
online); a startup ships endlessly.

The article is a bit confusing, but the way I took it when it ran, and now, is
that there aren't just performance benefits to thinking "data flows" vs
"objects"; there are source readability benefits too. If you can define a
bespoke data structure that manages state in _exactly_ the way you want it,
that's far better than a cluster of objects that mostly do the job but need a
little massaging at key points. Better on the hardware, simpler to read, less
likely to cause bugs. A 5% improvement in low-level state management
multiplies many times over, because the management pattern is likely to be
replicated over hundreds or thousands of slightly different game features that
all rely on that data model.
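
To make the "bespoke data structure" point above concrete, here is a rough
sketch, assuming Python and its standard `array` module. The `Particles` type
and its fields are hypothetical and not from the article or the comment. It
keeps each field in its own contiguous array and updates all of them in one
pass, rather than keeping one object per particle. Python itself will not show
the cache benefits; only the layout is being illustrated.

```python
from array import array

class Particles:
    """All particle state in parallel arrays, one array per field."""

    def __init__(self):
        self.x = array("d")   # positions
        self.vx = array("d")  # velocities

    def spawn(self, x, vx):
        self.x.append(x)
        self.vx.append(vx)
        # The index doubles as the particle's handle.
        return len(self.x) - 1

    def step(self, dt):
        # One pass over contiguous data instead of one method call per object.
        for i in range(len(self.x)):
            self.x[i] += self.vx[i] * dt
```

Every feature that needs particle positions reads the same `x` array, so the
state-management pattern is written once rather than per object type.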

~~~
chadaustin
I'm not talking about shipping, but instead about development risk. I've seen
too many teams start projects with lots of low-risk but high-cost "engine"
work like the example given in the article, when they don't even know if the
game will succeed in the market.

... crap, I confused the linked article with a very similar article which I
read today:
http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf

Anyway, my point stands. When starting a project, you should understand the
eventual end state (high-performance algorithms making effective use of cache
and memory) but don't think you need to implement it all up front.

If a data flow or procedural approach is clearer and easier to maintain, then
by all means use it. But don't discount OOP as an intermediate state simply
because you'll eventually have to translate the code to fit better on the
hardware.

That's all. :)

------
duncanj
Once, I had to evaluate a program for a rewrite. It was written in a naive
"object-oriented" style. The program built up a large graph of objects, did a
few transformations, and wrote its stuff out. It ran out of memory on small
subsets of the data it needed to work on.

I evaluated the program's data usage and rewrote it with the metaphor that I
had to process the whole thing from a tape drive. It was still
object-oriented, but the memory needs were now bounded.

tl;dr: I don't see the dichotomy.
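
The tape-drive metaphor above might look something like the following sketch,
assuming Python generators. The record format (`name,value` lines) and the
function names are hypothetical; the original program's data and language are
not given in the comment.

```python
import io

def read_records(stream):
    """Yield one parsed record at a time, like reading from a tape."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            name, value = line.split(",")
            yield name, int(value)

def total_value(records):
    # Keeps only a running total, so memory use does not grow with input size.
    total = 0
    for _name, value in records:
        total += value
    return total

# Hypothetical sample input standing in for a large file.
sample = io.StringIO("a,1\nb,2\n\nc,3\n")
result = total_value(read_records(sample))
```

Nothing here builds a graph of the whole input: each record is read,
consumed, and dropped before the next one is parsed.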

~~~
eru
"tl;dr:" doesn't go down well around here. (Perhaps you should have just
posted the first part of your comment. That part's good.)

~~~
klipt
Ironic, considering it isn't a comment on the OP but a summary of their own
post, and could easily be replaced by something like "In other words..."

~~~
eru
Oh, you are right. I assume the down-voters did not recognize the colon,
either.

------
lukifer
I've always thought OOP was an overused pattern. If you don't need inheritance
or information hiding, what does OO give you that can't be accomplished more
easily with functions and arrays/hashtables?
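
As a small illustration of the "functions and arrays/hashtables" style asked
about above, here is a sketch assuming Python. The shape records and the
`AREA` table are hypothetical examples, not anything from the thread.

```python
import math

# Records are plain dicts; behaviour lives in ordinary functions.
def circle(radius):
    return {"kind": "circle", "radius": radius}

def rect(width, height):
    return {"kind": "rect", "width": width, "height": height}

# A hashtable from kind to function stands in for method dispatch.
AREA = {
    "circle": lambda s: math.pi * s["radius"] ** 2,
    "rect": lambda s: s["width"] * s["height"],
}

def area(shape):
    return AREA[shape["kind"]](shape)
```

Adding a new kind of shape means adding one constructor and one table entry;
no class hierarchy is involved.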

~~~
xtho
Whether inheritance is an essential quality of OOP is IMHO debatable. This
leaves us with data abstraction and polymorphism.

~~~
eru
And Haskell solves those two problems pretty nicely without OOP.

