Entity–component–system (ECS) back and forth (skypjack.github.io)
77 points by skypjack 32 days ago | 61 comments

Catherine West's RustConf keynote gives a good discussion of why you might want to organise "a program where objects interact" using an ECS. She shows how you might write it with standard OOP, covers some drawbacks of that approach, and iterates towards an ECS. https://kyren.github.io/2018/09/14/rustconf-talk.html

I like that both this and TFA focus on the data, with "rows"/"columns" roughly comparable to (say) an SQL table. It's an interesting way of thinking about objects/entities.

Thanks for the link. I haven't read it before. It looks really interesting indeed.

And the response by Jonathan Blow: https://youtu.be/4t1K66dMhWk

One of his points is that you still have to solve synchronization for allocations and deletions from the arrays.

I feel like blogs about ECS are needlessly verbose when the idea is really simple.

Problem: Most games are built with a game loop ticking frames through many gameobjects/actors. Many actors share some functionality like physics collision, but there is also a lot of unique functionality on the same objects.

Naive solution: Use inheritance to build out your core functionality and override functionality as needed.

-This leads to a mess of interacting parts and a massive type tree with many permutations of functionality.

-The deep type hierarchies blow out cache locality, as the natural way is to tick through all functionality for a single object as you loop through all objects.

ECS: Break functionality apart into discrete systems. Use composition to build the permutations of functionality.

-It's easier to manage what you intended and also easier to experiment with functionality through composition than inheritance.

-The natural way to loop through functionality is one system at a time. You loop over your entities many times, but because you can run the same system code in a tight loop it really speeds things up.

-Systems provide singleton like functionality to store global state in a private way. In a heavy actor model we have to rely on static fields, which can be ok, but now you have a large cumbersome hierarchy of functionality with global access to state.


-Composition over inheritance is good in all walks of coding.

-Many discrete tight loops of systems is usually better than one loop of complex logic.
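
A minimal sketch of the "one system at a time" loop described above, in JavaScript to match the other examples in this thread. The names (`framesLeft`, `heights`, `lifetimeSystem`, `driftSystem`, `tickFrame`) are invented for illustration and do not come from any real engine; the point is only that each system is its own tight loop over flat arrays, and the game loop ticks systems rather than objects.

```javascript
// Component data in parallel arrays, indexed by entity id (illustrative layout).
const framesLeft = [3, 1, 2]; // e.g. particles that expire after some frames
const heights = [10, 20, 30];

// System 1: one tight loop that only decrements lifetimes.
function lifetimeSystem(framesLeft) {
  for (let id = 0; id < framesLeft.length; id++) {
    if (framesLeft[id] > 0) framesLeft[id] -= 1;
  }
}

// System 2: another tight loop; particles that are still alive drift upward.
function driftSystem(heights, framesLeft) {
  for (let id = 0; id < heights.length; id++) {
    if (framesLeft[id] > 0) heights[id] += 1;
  }
}

// The game loop ticks systems in a fixed order instead of ticking objects one by one.
function tickFrame() {
  lifetimeSystem(framesLeft);
  driftSystem(heights, framesLeft);
}

tickFrame();
```

The order of the calls inside `tickFrame` is what defines the behaviour of a frame; here the particle with one frame left expires before it gets a chance to drift.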

What's funny is that that's roughly how I wrote my first game when I only knew structured programming (and not that well).

Game objects were Pascal records. There was 1 global table with all the records. There were many functions called from the game loop, each processing all the records in some way. One function updated orientation, velocity and position; another did collision detection between objects; another did collision detection with the map. There was a function for possible reactions to collisions (if you collided with a powerup you got 100 health etc).

It was all data driven: objects had integer indices referencing tables with predefined attributes. For example, your ship record had "engineNo", which referenced a table of engines with power, turning radius, fuel consumption, and other parameters for each kind of engine. Same with guns, armor, power generators.

The map was a 2D array of records which had tileType, tileImage, xParam, yParam. If tileType was teleport, it teleported any object that collided with it to (xParam, yParam). If tileType was switch, it triggered the tile at (xParam, yParam) every time something collided with it. There were also tileTypes that damaged anything that collided with them to various degrees, and tileTypes to jump to the next level. That was like 3000 lines of code in 2 files (because Turbo Pascal refused to compile it when it was in one file :) ). Everything was global. There were no classes. And it was pretty easy to understand and modify, and allowed a lot of design possibilities.

Well, it would have been easy if I had used proper variable names and constants. As it was, you had to remember what tileType==10 meant, and I had a 3-screen-long function that was a nested if-then-else with all possible tileType-vs-objectType interactions. Like: bullets slow down when entering water tiles and disappear when hitting walls. Napalm particles change into smoke particles which expire after X frames. Napalm particles glue to walls and slowly move down. Grenades explode when touching walls.

There were lots of such rules, but putting them in one place was the right thing now that I think about it.

I'd argue even with abysmal naming practices it was more understandable than most of modern codebases, where this logic would be separated into 20 small classes.

Then I learnt OO programming and tried to make a proper object oriented game architecture, and it was never as simple as that naive approach :)

I don't think the idea is difficult actually. What I've found difficult when I decided to implement my own tool (https://github.com/skypjack/entt) was that technical details on how to design something that was both easy to use and with good performance were scattered all around the web. I'm just trying to summarize what I've discovered so far for the "future me". :-)

Sadly the article doesn't mention how ECSs typically solve the "holes" problem, where your arrays might be quite sparse, which leads to inefficiency.

Do you happen to know how that's usually done?

Yeah, I'm the author of EnTT C++ ECS (https://github.com/skypjack/entt) and I can guarantee you there are no holes there. :-) The next post (part 2 of the series) will be more or less all about this point. I hope to publish it as soon as possible. The idea is exactly to guide the reader through different models, from the easiest to implement to the ones that are probably hardest to develop but have no holes, perfect matches, higher performance and so on. Stay tuned.

In my ECS I do this by storing the actual data in a dense array, alongside a hash that maps entity IDs to array indexes.

If you want to see the (javascript) innards: https://github.com/andyhall/ent-comp
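
To illustrate the general idea of a dense array plus an ID-to-index map, here is a small sketch. It is not taken from ent-comp or EnTT; the class name `DenseStore`, its fields, and the swap-with-last removal strategy are assumptions chosen for the example, one common way of keeping the array free of holes.

```javascript
// Hypothetical dense component store: values are packed in an array,
// and a Map translates entity ids to positions in that array.
class DenseStore {
  constructor() {
    this.data = [];           // packed component values, no holes
    this.entities = [];       // entities[i] owns data[i]
    this.indexOf = new Map(); // entity id -> index into data
  }

  set(entity, value) {
    const i = this.indexOf.get(entity);
    if (i !== undefined) {
      this.data[i] = value; // overwrite an existing component
      return;
    }
    this.indexOf.set(entity, this.data.length);
    this.data.push(value);
    this.entities.push(entity);
  }

  get(entity) {
    const i = this.indexOf.get(entity);
    return i === undefined ? undefined : this.data[i];
  }

  // Removal moves the last element into the freed slot so the array stays packed.
  remove(entity) {
    const i = this.indexOf.get(entity);
    if (i === undefined) return false;
    const last = this.data.length - 1;
    if (i !== last) {
      this.data[i] = this.data[last];
      this.entities[i] = this.entities[last];
      this.indexOf.set(this.entities[i], i);
    }
    this.data.pop();
    this.entities.pop();
    this.indexOf.delete(entity);
    return true;
  }
}

const hpStore = new DenseStore();
hpStore.set(7, 10);
hpStore.set(42, 25);
hpStore.set(99, 5);
hpStore.remove(7); // entity 99's value is moved into slot 0
```

Systems can then iterate `hpStore.data` front to back without skipping empty slots; the trade-off is that removal reorders elements, so array indices are not stable handles.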

> ”Many discrete tight loops of systems is usually better than one loop of complex logic.”

A related low-level design choice for high-performance data structures is “Prefer structures of arrays over arrays of structures.”
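
As a rough illustration of the two layouts, assuming a made-up particle with `x`, `y` and `hp` fields (none of these names come from the article): the array-of-structures version keeps one object per particle, while the structure-of-arrays version keeps one typed array per field.

```javascript
const COUNT = 4;

// Array of structures: one object per particle, fields stored together.
const particlesAoS = [];
for (let i = 0; i < COUNT; i++) {
  particlesAoS.push({ x: i, y: 0, hp: 100 });
}

// Structure of arrays: one contiguous typed array per field.
const particlesSoA = {
  x: new Float32Array(COUNT),
  y: new Float32Array(COUNT),
  hp: new Int32Array(COUNT),
};
for (let i = 0; i < COUNT; i++) {
  particlesSoA.x[i] = i;
  particlesSoA.hp[i] = 100;
}

// A system that only needs x walks a single packed array
// and never touches y or hp.
function shiftX(xs, dx) {
  for (let i = 0; i < xs.length; i++) {
    xs[i] += dx;
  }
}

shiftX(particlesSoA.x, 10);
```

Whether this is actually faster depends on the engine and the access pattern, so it is worth measuring rather than assuming.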

Man this is interesting. This is exactly how I write my react/redux apps. I wonder if this is an architecture that works especially well for UI in general, since games are very similar in some respects (lots of disparate components, mostly reactive, lots of compromises when structuring code to meet requirements for visual design). It's an architecture that I have to spend a lot of time explaining to my coworkers, but once they learn to think that way they can suddenly jump in and do anything since they know exactly where to look for every feature.

ECS is one of those things you "get religion" about, especially as a game developer. Like, "Oh, I've been fighting the class hierarchy and overcomplicating my code for so long... it doesn't have to be this way!" I've written about this as well, whilst developing my own game: https://nullawesome.tumblr.com/post/146127692039/entities-an...

I went through a very similar thing writing my own toy/hobby game-project. It started out as a classical straightforward OOP design, but once I started adding extra stuff to it like animation, a scripting engine, etc. I really started to feel a lot of inertia from the design and the awkward forced hierarchies I had built. Over time I've been incrementally flattening the design to end up with the kind of hybrid/half-way design mentioned in the article, which borrows some things from ECS, but still has distinct game objects implemented as classes.

I'm not fully on-board with the (between-the-lines) assessment in the article that this is somehow inferior to, or should just be a stepping stone towards a full-blown ECS though. At least not for a simple game where performance is not really a concern. My view is that a hybrid approach can actually be better than a 'real' ECS for those kinds of cases, taking the best of both worlds. In my case I don't use any inheritance, only composition, I don't have to fight OOP, but I can still write very clear and concise code to access and manipulate properties of my game objects. It may not be cache-efficient and whatnot, but that really is no concern for my simple game at all, having at most 20 to 40 lightweight entities active at the same time. Physics updates (which could drag down performance if too inefficient) are treated like in a traditional ECS, simply advancing the full physics sim for all entities completely separate from the game objects.

Looking forward to the next article!

Good point. I put it in wrong wording probably. I didn't want to say that it's a shitty thing, just let the reader know that it's not a great improvement in terms of performance. However, I point out also how the halfway approach already is worth it and brings in some benefits, the same you nicely described in your comments. ;-)

Functionality composition is tangential to ECS. You can choose composition over inheritance in all types of architectures.

Composition is only one of the reasons why it makes sense to consider ECS. Structuring your game state as pure data makes it easier to serialize, to double-buffer, and to arrange in memory how you see fit. There are non-ECS ways of doing this but ECS is the easiest way I've found to write general code that can operate the same over many different kinds of game objects.

> ECS is one of those things you "get religion" about, especially as a game developer.

That's what worries me to be honest. I'm right there with you that ECS is just the right way, but whenever I can't come up with good arguments against my choices I feel like I'm blinded by faith.

My main argument for why I wouldn't use ECS in a greenfield game project is that a lot of existing tools don't play nice with it. That's about it.

To be honest I use composition over inheritance because it fits better with my mental patterns. I've nothing against OOP in general and I'm pretty sure we can have code with good performance also without a component-based model. The fact is just that I'm not as good at designing things with OOP in mind as I am when I design them with components in mind. But this is me, not a golden rule.

I like a good balance of composition and inheritance, but I feel like the current hype for ECS is a bit overwhelming. I've noticed many beginners are confused and try to shoehorn ECS into EVERY aspect of their game project, in an "OOP/inheritance is lava" way. I also see people obsessing about the performance of ECS vs OOP, when it's not relevant for most indie game projects, which have a much smaller scope than a high-end AAA game project.

I agree that component-based models should not be taken as all-about-performance solutions. I'm using ECS as the core of some software of mine, but of course OOP is still present to handle other aspects and problems that don't fit well with ECS in general. I mean, such an architectural pattern solves a specific problem in an elegant way, but it doesn't solve everything in, let's say, a game. Trying to put everything into components is a common mistake that sometimes leads to poor code. However, for those parts where it fits, I like to use this architectural pattern because it's closer to the way I think and thus easier for me to work with.

But deep down, it is a performance principle: luxuries like pointers, virtual functions etc. are suitable for complicated but rarely executed code, while important loops should access arrays of components (or something very close) without unnecessary cache misses.

If memory accesses were as cheap as 40 years ago, inheritance vs. composition would be a lofty debate about elegance and architecture and programming language semantics and taste; now it has become a much more practical and detailed tradeoff between difficult design and bad performance.

> if memory accesses were as cheap as 40 years ago

To clarify for readers not familiar with the motivations here:

Memory access is in fact cheaper than it used to be, as you might expect. The concern is that it did not get cheaper nearly as fast as processors got faster, meaning the relative cost of accessing memory became much higher when you could be doing more calculations in that time instead.

Specifically, what allows for better performance is accessing memory in a pattern that can be predicted and prefetched. Linear access is best, because looking up a single byte in memory has approximately the same overhead as looking up a contiguous chunk of bytes.

Think Wile E. Coyote picking up train tracks from behind himself and putting them down in front. If your memory access looks like that then you are less likely to suffer from memory access bottlenecks.

> I've nothing against OOP in general and I'm pretty sure we can have code with good performance also without a component-based model

Without a component-based model, sure. But with OOP specifically (which usually just means object hierarchies) you're chasing a lot of pointers and usually (though not always) doing a lot of null checks and other branching inside loops.

I've been wondering for a long time about how ECS can be cache friendly.

If you look at Overwatch's ECS ( https://youtu.be/W3aieHjyNvw?t=326 ), there are many systems that can read from 10+ different components. For every entity in this system, at least 10 reads with no locality whatsoever are done. That seems crazy slow to me.

The point of data oriented patterns is to structure data how it is used. So if these lookups were bottlenecks you'd collapse components together if they were always used together.

The other idea is that where there is one access pattern there will be more of the same. So by updating by System you keep as much as possible hot in cache by doing all the similar lookups together. So whilst the first lookup might have ten complete misses the next one probably won't.

At a gameplay level there's a lot of chaos going on and entities are not generally doing things that are easy to organize in a way that avoids cache misses anyway. On top of which you are juggling ease of change with optimization. At which point you just need to be fast enough. I'd also bet most bottlenecks for Overwatch were not in gameplay code.

Mostly though people should be thinking in a data oriented way rather than grabbing an ECS framework and expecting that to magically make things cache efficient.

I saw the video, but they don't go into much detail on the actual implementation, so it's hard to say where, how and if things are optimized or not. Moreover, the speaker also stresses that ECS is used for code organization in most cases and that they benefit a lot from this aspect.

Consider instruction cache locality as well. That system runs sequentially, executing the same code for all entities that need it, so that code is likely to stay cached.

Whereas if you tick each entity separately and run all the logic, each new entity tick is following on to so much unique code having run that it is probably starting all over on uncached instruction fetches.

It's ten reads, but each of the ten reads is typically just an index into a big sequential array. Also, when you're reading components, you can run that system in parallel with any other systems that do not write to those components.
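
One way to reason about which systems may overlap is to have each system declare the component types it reads and writes, and then check for conflicts. The sketch below only shows that check; the system names and the `reads`/`writes` metadata are hypothetical, and actual parallel execution (worker threads, job systems) is left out.

```javascript
// Each system declares which component types it reads and writes (hypothetical metadata).
const systems = {
  movement: { reads: ['velocity'], writes: ['position'] },
  render: { reads: ['position', 'sprite'], writes: [] },
  audio: { reads: ['sound'], writes: [] },
};

// Two systems conflict if either one writes a component the other reads or writes.
function conflicts(a, b) {
  const touches = (s, c) => s.reads.includes(c) || s.writes.includes(c);
  return a.writes.some((c) => touches(b, c)) || b.writes.some((c) => touches(a, c));
}

const movementVsRender = conflicts(systems.movement, systems.render); // movement writes position
const renderVsAudio = conflicts(systems.render, systems.audio);       // both only read
```

Under these declarations `render` and `audio` could run side by side, while `movement` has to be ordered relative to `render`.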

Here is a talk by Blizzard about Overwatch which uses ECS: https://youtu.be/W3aieHjyNvw

This is interesting. The vertical slicing the author talks about sounds a lot like the conventional way to structure a functional component based UI. Components + state tree + selectors + reducers in a typical react app fit together in this way and it makes it very difficult to apply traditional OOP design patterns. This is a huge source of contention on my team between the front end and the back end developers who are all c# dotnet guys. If anyone has any suggestions on where to find more information about functional, flow-based application architecture I would really like to learn more.

I'd suggest reading a bit on the Elm Architecture. You don't have to use Elm to appreciate or use the architecture.

I really like the following article on ECS, as it shows the complete setup in a very compact way, making it not only easy to understand, but actually quick to implement yourself: https://blog.therocode.net/2018/08/simplest-entity-component...

One day I'll write a blog post myself about how I went from ECS to OOP.

The short version is that ECS works against creativity. I want my entities to be unique: they move differently, they react to damage differently, they die differently, etc... In an ECS world, this means having at least 5 different components and 5 different systems per... type of entities. With just 10 entities you already have 100 different classes... that are never going to be reused. Also, because I reimplemented my game in OOP, I was able to compare the performance and noticed that the ECS performed worse (although maybe my implementation was not good?).

> In an ECS world, this means having at least 5 different components and 5 different systems per... type of entities.

Shouldn't it be more data-driven than that? That is, instead of 5 components and 5 systems, you have one or two components and one or two systems, and the components declaratively describe the differing behaviours in a data-driven manner.

In my personal experience, if you have objects or inheritance to define every possible behaviour, the codebase becomes a huge tangle and performance will suck and actually creating the content becomes a pain. I'd personally much rather have components that define the general flavour of the behaviour and then use data to fine-tune it.

And for the edgy edge cases.. have a script component that runs Lua or whatever.

Of course, you can do that with OOP too and you can certainly write high performance OOP, I just personally find it much harder due to how it encourages distributing state across the codebase, especially when you want to do it in a multithreaded environment. But if you've had better experiences with OOP, then more power to you, use what works for you. I would, however, not write my own ECS, but use a well designed and well tuned existing one, like the article author's EnTT.
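
A small sketch of what "components that declaratively describe the differing behaviours" might look like. Everything here is invented for the example (`damageResponse`, `hitPoints`, the `armor` and `onDeath` fields): a single component type and a single system, with the per-entity differences living in the data.

```javascript
// One "damageResponse" component type; the data, not the class, varies per entity.
const damageResponse = new Map([
  [1, { armor: 0, onDeath: 'explode' }], // fragile, explodes when destroyed
  [2, { armor: 5, onDeath: 'vanish' }],  // armored, simply disappears
]);
const hitPoints = new Map([[1, 10], [2, 10]]);

// One system handles every entity; behaviour differences come from the component data.
function applyDamage(entity, amount, events) {
  const response = damageResponse.get(entity);
  const dealt = Math.max(0, amount - response.armor);
  const remaining = hitPoints.get(entity) - dealt;
  hitPoints.set(entity, remaining);
  if (remaining <= 0) {
    events.push({ entity, effect: response.onDeath });
  }
}

const deathEvents = [];
applyDamage(1, 12, deathEvents); // no armor: drops to -2 and queues "explode"
applyDamage(2, 12, deathEvents); // armor absorbs 5: survives with 3 hp
```

A new kind of entity then means a new row of data rather than a new component class and system; truly unusual cases could still fall back to something like the script component mentioned above.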

Maybe. However it makes sense to stick with the paradigm with which you are more comfortable. There is no shame with using OOP instead of component-based models or the other way around. I think that knowing both of them can help sometimes, because they fit different problems and can work side-by-side tho. That said, if you'll ever write such an article, ping me!! I'm pretty sure it will be an interesting point of view to think of.

I'd be interested to read a follow-up on how to incorporate other data structures such as quad trees, spatial hashes or scene graphs into an ECS.

I look at it as EC and S. It's a datastore keyed by Entity handles that lets you look up Components. Data transformation is then driven by Systems. The behavior of a game is then driven by the tick order of the Systems.

From this view of the world it's fairly obvious that other structures need to reference entities through their handle. Or in the case of middleware with its own view of the world a translation layer keeps things synchronized.
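
For example, a spatial structure can store nothing but entity handles and leave the component data where it is. The sketch below uses a simple uniform grid as a stand-in for a spatial hash; `CELL_SIZE`, `positionOf`, `cellKey`, `buildGrid` and `entitiesInCellAt` are illustrative names, not part of any particular library.

```javascript
// Hypothetical uniform grid: each cell holds entity ids, never component data.
const CELL_SIZE = 10;
const positionOf = new Map([
  [1, [3, 4]],
  [2, [8, 1]],
  [3, [25, 25]],
]);

function cellKey(x, y) {
  return Math.floor(x / CELL_SIZE) + ',' + Math.floor(y / CELL_SIZE);
}

// Rebuild the grid from the position components.
function buildGrid(positions) {
  const grid = new Map();
  for (const [entity, [x, y]] of positions) {
    const key = cellKey(x, y);
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(entity);
  }
  return grid;
}

// A query returns handles; callers look up whatever components they need.
function entitiesInCellAt(grid, x, y) {
  return grid.get(cellKey(x, y)) || [];
}

const grid = buildGrid(positionOf);
const nearby = entitiesInCellAt(grid, 5, 5); // entities 1 and 2 share cell "0,0"
```

Because the grid holds only ids, it can be rebuilt or updated by its own system without the rest of the game knowing how it is laid out.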

Stay tuned then. I was already planning to write something about scene graphs and component-based models. Nothing stops me from writing about the rest as well.

Is this right?

'entities' ~= object references (ids of objects)

'components' ~= properties (instance/member variables) of objects (maps of entity ids to components)

'systems' ~= algorithms that operate on lists of components

A definition of "ECS" would have been nice.

To add a quick summary to the given acronym expansion, I think of Entity-Components as a native, in-memory database where Entities are your Primary Keys and Components are your Tables. Systems are a way of organizing and running operations on your entities over time.

native: native to your programming language, ie no serializing / deserializing types

Here's my simple explanation. Making a game with object hierarchies, you might do the equivalent of this:

    monster = {}              // create an entity
    monster.hp = 10           // give it properties
    monster.pos = [0,0,0]

Using ECS, you instead do the equivalent of this:

    currID = 1                // ECS init
    hpData = {}
    posData = {}

    id = currID++             // create an entity
    hpData[id] = 10           // give it properties
    posData[id] = [0,0,0]

Obviously neither version is really implemented that way, but hopefully it shows the key insight - instead of storing properties on the object they describe, you store them in tables full of like data, keyed by the ID of the object they describe.

Showing the how without showing the why is probably not helpful IMO. Why is important here.

Using ECS is somewhat like treating your game state like a normalized database. It lets you perform operations on all the items of a certain type at once.

Say in my game I need to recenter my origin on a regular basis. In an open world game, you want to make sure that players don't start to see errors caused by floating point arithmetic, and the further away their position gets from 0, the worse this gets. To fix it, one option is to transform all positions by the same amount to keep the player's absolute position near (0, 0, 0).

To do this in the hierarchy style, the code might look like this: (Using JS because I was writing it today so that's where my brain is)

    var gameObjects = world.getAllGameObjects();
    var offset = [-1000, 0, 500]; // Pretend it's a real vector

    // Iterate through every single object in the game
    // (Not using foreach or filtering in either example, sue me)
    for(var gameObject of gameObjects) {

        // Branch on every game object just to see if it's positioned in the world
        if(gameObject.position !== null) {
            // Do what I really wanted to do
            // (add() is a pretend in-place vector add, like the pretend vector above)
            gameObject.position.add(offset);
        }
    }

Meanwhile, the ECS version would look more like this:

    // This can vary greatly based on implementation
    // I'm going to go with an option that reads easily
    // This should just give back the list of components, which we'll say is an array.
    // Lookup time in this fictional system is the cost of a dictionary lookup.
    var positions = world.getComponents('position');
    var offset = [-1000, 0, 500]; // Pretend it's a real vector

    // Iterate through only the things we care about
    // Not using map here, sue me
    for(var position of positions) {
        // Do what I really wanted to do
        // (add() is a pretend in-place vector add)
        position.add(offset);
    }

Not only is the second example shorter, but it's also doing a lot less work. We're adding a vector to an array of vectors. No branching, no touching objects we don't care about, nothing extra.

Many ECS systems will allow you to filter down to the set of entities that have two or more components at the same time. Obviously more expensive, but still not as wasteful as checking every object or adding an update function to run every tick on every relevant entity.

The above code was written just before bedtime and not against a real ECS library or OO hierarchy, so please forgive any obvious errors.
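
The multi-component filtering mentioned above could be sketched like this, treating each component type as a table keyed by entity ID. The tables and the `query` helper are hypothetical; real ECS libraries implement this in very different (and usually much faster) ways.

```javascript
// Component tables keyed by entity id (hypothetical storage).
const positionTable = new Map([[1, [0, 0]], [2, [5, 5]], [3, [9, 9]]]);
const velocityTable = new Map([[1, [1, 0]], [3, [0, -1]]]);

// Yield every entity present in all given tables,
// driving the scan from the smallest table to limit the lookups.
function* query(...tables) {
  const [smallest, ...rest] = [...tables].sort((a, b) => a.size - b.size);
  for (const entity of smallest.keys()) {
    if (rest.every((table) => table.has(entity))) {
      yield entity;
    }
  }
}

const moving = [...query(positionTable, velocityTable)]; // entity 2 has no velocity
```

This is essentially an inner join on the entity ID, which fits the normalized-database view of game state.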

> like treating your game state like a normalized database

This makes me curious, how bad would it be to actually store game state in say, an in-memory sqlite database? I feel like this could get you some nice things, like auto-destroying all components of an entity via foreign key deletion, being able to merge/join entities with the intersection of multiple components etc. Assuming the clunkiness of interacting with sqlite is abstracted away, I wonder what the real performance differences would be.

Compared to iterating over an array of structs (or a struct of arrays if you want to be fancy) it's probably multiple orders of magnitude slower.

And let's be honest, half of my performance problems in webapps are caused by the overhead of sending a query; the database is rarely bottlenecked by inserting or updating the data on the physical storage device if I do not use bulk insert or bulk update methods.

I assume you mean sending a query over the network, as most webapps do; sending it to a local sqlite file should be far less impactful.

But still not free.

Please try it out though, I would love to see a proof of concept that took the ECS-as-database approach. Like I said earlier in the thread I expect it would not do well, but I'd be interested to see the results either way.

I've wondered that as well.

My gut feeling is that when crossing the SQL boundary you would lose a lot of the performance you would otherwise be gaining from using an ECS to begin with. Especially when talking to the GPU.

I'd love to be proven wrong if someone has the time though.

My biggest problem with ECS is that most articles talk about why and on a very abstract level how it works, but there are rarely if ever useful examples of how to implement an ECS.

My goal is exactly to give details on how to implement different component-based models, from the easiest ones up to the hardest ones to develop, and provide links to real-world implementations. In the last part, I'd like to go in depth into one or two of them, but I'm doing all of this in my free time, so it will take a while to publish everything. I hope you'll stay tuned and keep reading, because feedback like this is invaluable.

Yep, that is definitely an issue.

If I made a living in games I'd probably try to fill that hole myself, but it's a hobby for me, and one that doesn't get much time allocated.

> Showing the how without showing the why is probably not helpful IMO. Why is important here.

I was replying to someone who asked about the What.

Sure, but when introducing someone to a new paradigm, the why is part of the what.

The why is what TFA is about. If you think the person I replied to was asking for an introduction to the paradigm, you could probably reply to them directly instead of dragging me about it.

Incidentally, FWIW I think your post about the why completely misses the point of ECS. Looping over entities with a particular property is easily done with OOP hierarchies or anything else. The benefits of ECS are elsewhere - that you can freely add or remove an entity's properties (without type issues or hot code becoming polymorphic), that the data you frequently iterate over can be stored in friendly contiguous chunks, the usual benefits of composition over inheritance, and so forth.

If I came across as harshly critical I apologise, that was not my intention.

I meant to say that the why is to treat your data like a normalized database, which provides those properties you list at the bottom.

In fact my example demonstrates at least the following properties:

> (without type issues or hot code becoming polymorphic), that the data you frequently iterate over can be stored in friendly contiguous chunks

My example was not to show that you can loop over every entity. It was to show how easy it is to loop over only what you care about without doing extraneous work, by iterating over a contiguous chunk of exactly what you need to update.

I was going to add an example of creating a regenerating health component and system, but I didn't for a couple of reasons. One, it was already past bedtime. Two, I would have compared it against adding a tick handler and so it would need to be a larger example with more explanation of the way the imagined engine works. The post was already pretty long in the first place so I decided against it.
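
For readers who want a picture of it anyway, a regenerating-health component and system might look roughly like the sketch below. This is an illustration only, not the example the commenter had in mind; the `health` and `regen` tables and their fields are made-up names.

```javascript
// Components: plain data keyed by entity id. All names are illustrative.
const health = new Map([
  [1, { current: 40, max: 100 }],
  [2, { current: 95, max: 100 }],
  [3, { current: 10, max: 50 }],
]);
// Only entities 1 and 2 regenerate; entity 3 has no regen component.
const regen = new Map([
  [1, { perSecond: 5 }],
  [2, { perSecond: 10 }],
]);

// System: visits only entities that have a regen component.
function regenSystem(dt) {
  for (const [entity, { perSecond }] of regen) {
    const h = health.get(entity);
    if (h === undefined) continue; // regen without health: nothing to do
    h.current = Math.min(h.max, h.current + perSecond * dt);
  }
}

regenSystem(1); // one second of game time
```

Turning regeneration on or off for an entity is then just `regen.set(...)` or `regen.delete(...)`, with no per-entity tick handler involved.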

> In fact my example demonstrates ... loop over only what you care about without doing extraneous work, by iterating over a contiguous chunk of exactly what you need to update.

Your example demonstrated iterating over a filtered list instead of an unfiltered list. The "contiguous chunks of data" benefit of ECS refers to how the underlying data is stored in memory, not to which data you iterate over (and both of those are orthogonal to types/polymorphism).

My goal here is not to argue; I don't think we disagree on this concept. I should have more carefully considered how my starting words would be interpreted. Based on your replies, I fear they came off more pointed than I intended in my sleep-addled state. I did not mean to start off by putting you on the defensive.

> Your example demonstrated iterating over a filtered list instead of an unfiltered list. The "contiguous chunks of data" benefit of ECS refers to how the underlying data is stored in memory, not to which data you iterate over

I apologize if this came across unclearly, but that's why I started with a block of comments explaining my fictional example ECS system. I did specifically call out that I was retrieving the backing array for the positions component, implying that components were flat arrays in a dictionary where the key was the component type. I hope we can agree that an array should be expected to be a contiguous chunk of memory when giving a quick example on the internet.

The relevant part of my previous code example:

    // This can vary greatly based on implementation
    // I'm going to go with an option that reads easily
    // This should just give back the list of components, which we'll say is an array.
    // Lookup time in this fictional system is the cost of a dictionary lookup.
    var positions = world.getComponents('position');

Perhaps I should have been more clear that I meant looking up a pointer to the positions array would cost a dictionary lookup.

> (and both of those are orthogonal to types/polymorphism)

I probably should have given an example in a language other than Javascript, since it's not statically typed.

This reminds me of J2ME games (which had to do this, because using actual objects was prohibitively expensive in terms of RAM on those devices).

Entity-Component-System (Which I think is more correct than Entity-Component system)

It's a pattern of organizing data and code that consists of:

Entities: Conceptually these are like objects; in memory they might just be an index.

Components: Conceptually these are like fields on objects; unlike normal fields, they can typically be added and removed. If entities are being represented as indices, each different type of component is generally stored as some form of Map<Entity, Component>.

Systems: These are where you store your logic; they typically iterate over all entities with a given set of components and do something. For example, iterate over all entities with position and velocity components, and do position += velocity * dt.
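
Put together, the three pieces above can be sketched in a few lines of JavaScript. This is a toy under stated assumptions (integer entities, one `Map` per component type standing in for Map<Entity, Component>, and a hand-written `movementSystem`), not a description of how any particular ECS library works.

```javascript
// Entities are just integers.
let nextEntity = 0;
function createEntity() {
  return nextEntity++;
}

// One Map per component type, standing in for Map<Entity, Component>.
const position = new Map();
const velocity = new Map();

const ship = createEntity();
position.set(ship, { x: 0, y: 0 });
velocity.set(ship, { x: 2, y: 1 });

const rock = createEntity();
position.set(rock, { x: 10, y: 10 }); // no velocity component, so it never moves

// System: for every entity with both position and velocity, position += velocity * dt.
function movementSystem(dt) {
  for (const [entity, v] of velocity) {
    const p = position.get(entity);
    if (p === undefined) continue;
    p.x += v.x * dt;
    p.y += v.y * dt;
  }
}

movementSystem(0.5);   // ship moves to (1, 0.5)
velocity.delete(ship); // components can be removed at runtime
movementSystem(0.5);   // nothing moves now
```

Removing the velocity component changes what the ship does without touching its type, which is the "can be added and removed" property of components in practice.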

"Entity-Component system"?

Plural "Systems" in ECS architecture are the separate explicit and implicit modules (functions, objects, processing passes, etc.) that do something with the data represented by Entities and Components, and thus represent the third equal leg of the architecture, like in other three letter acronym architectures (MVC, BDI...); it isn't just an architecture organized as a system (singular) of Entities and Components, even if some kind of framework to manage them is necessary. Systems could be managed by a framework too, for example to process them automatically in a correct order.
