Show HN: Open-source shooter which made it to AC: Valhalla and Skydio drones (github.com/teamhypersomnia)
270 points by geneotech on June 25, 2023 | 45 comments
So just for fun, I wrote a complete multiplayer game in pure C++. I even wrote my own texture atlas packer, which is now used by Assassin's Creed, two scientific publications, and a drone manufacturing company - each of these basically mentions the name of my game. There is even a claim that Unity patented some of the ECS ideas that originate from this project. See https://github.com/TeamHypersomnia/Hypersomnia#tech-highligh... for details.

By the way, the game is pretty darn good. 10 people connected yesterday to test a new map: https://www.youtube.com/watch?v=CHLPzZqANlM

It took me well over 10 years to code it all by hand, but the journey was truly worth it.




I didn’t realize that getting 100% deterministic behavior for floats would require using the same compiler, even when using the same math functions.

Is floating point arithmetic not fully specified or something?

Edit: should have searched the web first: https://stackoverflow.com/questions/49471943/floating-point-...


As I understand it, it's more that the mathematical functions aren't specified by the standard.

There is no sqrt instruction. So you start with a Taylor approximation, then maybe do a few rounds of Newton's method to refine the answer. But what degree is the Taylor series? Where did you centre it (can't be around 0)? How many rounds of Newton's method did you do? IEEE 754 might dictate what multiplication looks like, but for sqrt, sin, cos, tan you need to come up with a method of calculation, and that's in the standard library, which usually comes from the compiler. You could make your own implementations but...
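
To make it concrete, a toy sketch (nothing like a real libm; just to show that every implementation choice changes the low bits):

    #include <cstdio>

    // Toy sqrt: crude initial guess refined by Newton's method.
    double toy_sqrt(double x, int newton_rounds) {
        double guess = x * 0.5 + 0.5; // arbitrary starting point
        for (int i = 0; i < newton_rounds; ++i)
            guess = 0.5 * (guess + x / guess); // Newton step for g*g - x = 0
        return guess;
    }

    int main() {
        // Two "implementations" that differ only in iteration count
        // already disagree in the printed digits:
        std::printf("%.17g\n", toy_sqrt(2.0, 2));
        std::printf("%.17g\n", toy_sqrt(2.0, 3));
    }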

Floats are not associative: (a+b)+c != a+(b+c). So even something like reordering fundamental operations can cause divergence.
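
A minimal demonstration of the non-associativity:

    #include <cstdio>

    int main() {
        double a = 1e16, b = -1e16, c = 1.0;
        std::printf("%g\n", (a + b) + c); // prints 1: a and b cancel first
        std::printf("%g\n", a + (b + c)); // prints 0: c is absorbed into b
    }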


> There is no sqrt instruction. So you start with a Taylor approximation, then maybe do a few rounds of Newton's method to refine the answer. But what degree is the Taylor series? Where did you centre it (can't be around 0)? How many rounds of Newton's method did you do? IEEE 754 might dictate what multiplication looks like, but for sqrt, sin, cos, tan you need to come up with a method of calculation, and that's in the standard library, which usually comes from the compiler. You could make your own implementations but...

This is just plain wrong. IEEE 754 defines many common mathematical functions, sqrt and trig included, and recommends that they return correctly rounded values (to the last ulp). Most common CPUs have hardware instructions for those, although Intel's implementation is notoriously bad.

Furthermore, there are some high-quality FP libraries out there, like SLEEF and RLIBM, and the CORE-MATH project, which aims to improve the standard libraries in use.


> recommends

That word is really important because it means you can’t rely on it for real determinism.


Your general point stands, but with corrections:

- sqrt actually can be a single instruction on x86_64, for example. Look at how musl implements it; it's just a single-line inline asm statement, `__asm__ ("sqrtsd %1, %0" : "=x"(x) : "x"(x));`: https://github.com/ifduyue/musl/blob/master/src/math/x86_64/... Of course, not every x86_64 implementation must use that instruction, and not every architecture must match Intel's implementation. I've never looked into it, but I wouldn't be surprised if even Intel and AMD have some differences.

- operator precedence is well-defined, and IEEE 754-compliant compilation modes respect the non-associativity. In most popular compilers, you need to pass specific flags to allow the compiler to change the associativity of operations (-ffast-math implies -funsafe-math-optimizations, which implies -fassociative-math, which actually breaks strict adherence to the program text). A somewhat similar issue does arise with the floating-point environment, though, as statements may be reordered, function argument evaluation order may differ, etc.

The fact that compilers respect the non-associativity of the program text is a huge reason why compilers are very limited in how much auto-vectorization they will do for floating point. The classic example is a sum of squares, where the compiler bars itself from even loop unrolling with multiple accumulators, never mind SIMD with FMADDs. To do any of that, you have to fiddle with compiler options that are often problematic when enabled globally, or __attribute__((optimize("..."))) or __pragma(optimize("...")), or, probably best of all, explicitly vectorize using intrinsics.
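
To illustrate, a hand-written sketch (my own, not from any project) of the strict loop next to the manually reassociated form the compiler won't produce on its own:

    #include <cstddef>

    // Strict version: the compiler must keep this serial dependency
    // chain, since reassociating the adds would change the rounding.
    double sum_squares(const double* v, std::size_t n) {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            s += v[i] * v[i];
        return s;
    }

    // Manually reassociated: four independent accumulators. This is
    // the transformation the compiler refuses to do without e.g.
    // -fassociative-math, and it vectorizes/pipelines far better -
    // but it can return slightly different results than the loop above.
    double sum_squares_reassoc(const double* v, std::size_t n) {
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        std::size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            s0 += v[i]     * v[i];
            s1 += v[i + 1] * v[i + 1];
            s2 += v[i + 2] * v[i + 2];
            s3 += v[i + 3] * v[i + 3];
        }
        double s = (s0 + s1) + (s2 + s3);
        for (; i < n; ++i)
            s += v[i] * v[i];
        return s;
    }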


I'm not quite sure they are guaranteed to be deterministic across architectures either.

I think rounding up/down can vary across architectures. Mixing ARM/AMD/Intel here could potentially result in non-deterministic behavior. https://en.wikipedia.org/wiki/IEEE_754#Reproducibility

Ah... I see. The game is using a software implementation of floating-point maths to avoid this issue :). Smart. That, in combination with the unified compiler, should get you out of most of the trouble.


There's a rather long article here from Sun (now owned by Oracle) - probably easier to start from the conclusion and work back!

https://docs.oracle.com/cd/E77782_01/html/E77791/z4002282485...


Float determinism is one of those things where, in theory, floats should behave deterministically, but in practice it's the Wild West. One big problem is that in C (and in many other languages), common operators are not explicitly mapped to specific FP operations, which gives compilers quite a lot of leeway in what they can do.

Another issue is that FP uses some amount of invisible global state, e.g. rounding modes. IIRC, DirectX, for example, likes to change the flags behind your back.
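
A quick sketch of that global state using <cfenv> (GCC/Clang may additionally want -frounding-math for this to be fully reliable):

    #include <cfenv>
    #include <cstdio>

    int main() {
        volatile double x = 1.0, y = 3.0; // volatile defeats constant folding
        std::fesetround(FE_DOWNWARD);
        double down = x / y;
        std::fesetround(FE_UPWARD);
        double up = x / y;
        std::printf("%.17g\n%.17g\n", down, up); // last bit differs
        std::fesetround(FE_TONEAREST);           // restore the default
    }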


Even using the same compiler, a different -march can give you substantially different results.


I think it's just about Intel's intermediate 80-bit precision. Anything else?


There are lots of different ways floating point determinism can get lost. The obvious one, as you mentioned, is the 80-bit x87 unit. Lots of 32-bit compilers will compile floating point math down to the x87, which is slow, but the same compiler compiling in 64-bit mode will use SSE2 instructions.

Floating point arithmetic is not associative. That is, (a+b)+c != a+(b+c) and (a*b)*c != a*(b*c).

Multiplying by the reciprocal is not equal to dividing by the value. That is, x*(1/y) != x/y. An optimizing compiler, when you attempt to divide by a constant, may optimize that into multiplying by the reciprocal instead; that is, if you have code that divides by the constant 3, it may multiply by 0.33333333333333333, because it's a lot faster.
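
A tiny search program (my own sketch) makes the double rounding easy to find:

    #include <cstdio>
    #include <cstdlib>

    int main() {
        const double recip = 1.0 / 3.0;    // rounded once here...
        for (int i = 0; i < 1000000; ++i) {
            double x = std::rand() / static_cast<double>(RAND_MAX);
            if (x / 3.0 != x * recip) {    // ...and rounded again here
                std::printf("x = %.17g\nx / 3       = %.17g\nx * (1.0/3) = %.17g\n",
                            x, x / 3.0, x * recip);
                return 0;
            }
        }
        std::printf("no mismatch in this sample\n");
    }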

FMA (fused multiply-add) instructions are more precise and therefore not equal to the same calculation without FMA. That is, a*b+c != a*b+c if one compiler outputs FMA instructions and the other does not. (This will be true even in the same compiler with different flags.)
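
For example, std::fma from <cmath> forces the fused form, and the two results differ:

    #include <cmath>
    #include <cstdio>

    int main() {
        double a = 1.0 + 1e-8, b = 1.0 - 1e-8, c = -1.0;
        double separate = a * b + c;          // a*b rounded, then the add rounded
        double fused    = std::fma(a, b, c);  // one rounding of the exact a*b + c
        std::printf("%.17g\n%.17g\n", separate, fused);
    }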

Special functions are fucky. sqrt, sin, cos, etc. might not always give equal values - or even the same compiler might differ if minor alterations are made to the code. A compiler might use one algorithm to compute sin(x), but a different algorithm if it needs to compute both sin(x) and cos(x) at the same time.

Floating point rounding mode is a thing. Sometimes a plugin changes your floating point rounding mode. Sometimes this plugin is inserted into your runtime without your knowledge, such as by an antivirus program, or by malware that hijacks your browser to give "better"/"customized" shopping/search recommendations. There was a bug writeup about a crash in Chrome several years back, but I can't find it.

Basically, you should assume that floating point math is non-deterministic. If you think you need deterministic floating point math, try to reformulate the problem so that you don't. If you *really* need deterministic floating point math, understand that you're signing up for a lot of pain.


AFAIK, the first three things you listed would be deterministic across compilers unless you enabled -ffast-math (which isn't what this is talking about), precisely for that reason. I believe the same applies to intrinsics and rounding modes, but I'm not sure, especially since the website talks about GCC vs Clang differences for the latter [1].

[1] https://stackoverflow.com/questions/55974090/clang-gcc-only-...


Any reorderings or optimizations of floating point code could potentially change the results even without 80-bit precision active.


AFAIK, only if you have -ffast-math enabled. Otherwise, IEEE 754 very specifically lays out all the optimizations the compiler is and isn't allowed to do.


Constant folding of floats may depend on the compiler that compiled your compiler (e.g. 0.5 + x + 0.6 + 0.7), the compiler itself may be non-deterministic (std::unordered_map<float> constants), and it may depend on the CPU your compiler is running on.


> Networking is based on cross-platform simulation determinism.

Springrts [0] has been doing everything described here, for the same reasons, since the late 2000s - and I'd guess countless RTS games have too; they just weren't open source. Not to say it isn't an achievement, because it is indeed damn hard.

[0] https://springrts.com/


That is mentioned in the details below the heading you quoted. The difference here is that RTS games don't use deterministic floating-point physics, so Hypersomnia requires a novel solution.


That's exactly what it's doing, though - deterministic floats, down to using the exact same GCC version for all supported OSes. (Well, it used to; I haven't checked in a while.)


Oh man, I put so many hours into Spring via Balanced Annihilation. It filled that missing part of my soul that TA and SupCom left, and that SupCom 2 (an otherwise really good, but casual and much smaller RTS) didn't.

Also, we don't speak about PA.

BAR looks really promising, also: https://www.beyondallreason.info


Zero-K is the ultimate TA-like, IMO. BAR is really good, but much less ambitious and much closer to the original TA gameplay; Zero-K has really cool terrain-deformation stuff and interesting units.


This looks fascinating but - by god, was your post title hard to parse!


Same here. I still find it a bit misleading too.

The game didn't make it to AC: Valhalla; a texture-packing library that was made during its creation did [1]. (Not wanting to downplay that achievement - it's worthy of a post in its own right, IMO.)

It's also a bit of a stretch to claim it was the source of the ECS system used in Unity. I was using a game engine that used ECS prior to 2010. In the game-engine world it seems to originate with Operation Flashpoint [2] (post from 2007).

[1] https://github.com/TeamHypersomnia/rectpack2D

[2] https://t-machine.org/index.php/2007/09/03/entity-systems-ar...


Hey there, a little clarification: I don't claim that I came up with ECS itself, only a particular implementation of it that is very memory-efficient, and Unity's patent concerns a specific implementation as well - sorry if that was a bit misleading. As for the title, I just couldn't resist making the pun, but maybe you're right that the packer should be talked about separately.


Am I crazy, or is your approach to ECS similar to a data-oriented ECS?

In a fully, brutally data-oriented ECS, you would allocate all components of the same type in the same memory block, so you'd effectively end up with a std::vector<TransformComponent>, a std::vector<PhysicsComponent>, etc. Each entity then holds ids, such as a pair of a `what` (component type) and an `index` into the respective vector. This ensures that, like your approach, the cache is used well and memory fragmentation is minimal.
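
Roughly the layout I mean (all names invented for illustration):

    #include <cstdint>
    #include <vector>

    // Hypothetical component types, purely for illustration:
    struct TransformComponent { float x = 0, y = 0, rotation = 0; };
    struct PhysicsComponent   { float vx = 0, vy = 0, mass = 1; };

    // One contiguous block per component type, so a system that
    // only touches physics walks memory strictly linearly:
    struct World {
        std::vector<TransformComponent> transforms;
        std::vector<PhysicsComponent>   physics;
    };

    // An entity is nothing but indices into those vectors
    // (UINT32_MAX marking "this component is absent"):
    struct Entity {
        std::uint32_t transform = UINT32_MAX;
        std::uint32_t physics   = UINT32_MAX;
    };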

I find your project(s) very, very inspiring, as I love working on similar problems, among others (github.com/lionkor). Thank you for sharing!


Thank you! The approach you describe is indeed the "final form" of a brutally data-oriented ECS, but my method is a bit simpler - e.g. if the game has:

struct player { components::transform transform; components::health health; };

struct box { components::transform transform; components::physics physics; };

Then my ECS simply holds a std::vector<player> and a std::vector<box>, rather than a vector per component type. I also use integer identifiers like you describe, but with a twist: there is one more indirection involved, so that deleting from the vectors does not invalidate existing indices pointing to vector elements. It's still exactly O(1). See the section on the memory pool in the game repo's README.
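
In spirit it looks something like this (a simplified slot-map-style sketch, not the exact code from the repo):

    #include <cstdint>
    #include <utility>
    #include <vector>

    // Dense storage, stable ids, O(1) insert/erase.
    template <typename T>
    struct Pool {
        struct Id { std::uint32_t slot; std::uint32_t version; };

        std::vector<T>             objects;     // dense, iterate this
        std::vector<std::uint32_t> slot_to_obj; // the extra indirection
        std::vector<std::uint32_t> obj_to_slot; // reverse mapping
        std::vector<std::uint32_t> versions;    // detects stale ids
        std::vector<std::uint32_t> free_slots;

        Id insert(T value) {
            std::uint32_t slot;
            if (!free_slots.empty()) {
                slot = free_slots.back();
                free_slots.pop_back();
            } else {
                slot = std::uint32_t(slot_to_obj.size());
                slot_to_obj.push_back(0);
                versions.push_back(0);
            }
            slot_to_obj[slot] = std::uint32_t(objects.size());
            obj_to_slot.push_back(slot);
            objects.push_back(std::move(value));
            return { slot, versions[slot] };
        }

        T* get(Id id) {
            if (id.version != versions[id.slot])
                return nullptr; // the object behind this id was deleted
            return &objects[slot_to_obj[id.slot]];
        }

        void erase(Id id) {
            if (id.version != versions[id.slot])
                return;
            const std::uint32_t obj = slot_to_obj[id.slot];
            // swap-and-pop keeps storage dense; patch the moved element's slot
            slot_to_obj[obj_to_slot.back()] = obj;
            std::swap(objects[obj], objects.back());
            std::swap(obj_to_slot[obj], obj_to_slot.back());
            objects.pop_back();
            obj_to_slot.pop_back();
            ++versions[id.slot]; // ids pointing here are now stale
            free_slots.push_back(id.slot);
        }
    };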


This is pretty much how games outside of general-purpose engines have always done it, with some exciting variations, like having one monster struct that has every property for every entity in the game, so you just have a single array of entities. Or having each struct contain an ID field so you can pass around pointers and cast them based on the ID value.


Indeed, such a rigid struct-based "ECS" is basically how most non-ECS games do it, but using components to compose game objects - if you, say, put them into a std::tuple - makes it easy to later call e.g. a generic lambda only on vectors of objects that contain given components. This makes for excellent statically polymorphic code that doesn't care what particular type the game object has (is it a character? is it a tree?) as long as it has e.g. physics + health. In my 2013 article I also describe how to apply the same concept if we don't hardcode a specific collection of "structs of components" but add/remove components to entities at runtime.
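
Sketching the idea in C++20 (simplified names, not the actual code):

    #include <tuple>
    #include <type_traits>
    #include <vector>

    struct transform { float x = 0, y = 0; };
    struct physics   { float vx = 0, vy = 0; };
    struct health    { int hp = 100; };

    // Two hardcoded "structs of components":
    struct player { transform t; physics p; health h; };
    struct tree   { transform t; health h; };

    std::tuple<std::vector<player>, std::vector<tree>> world;

    // Calls f only on objects whose type has both physics and
    // health members - resolved entirely at compile time.
    template <typename F>
    void for_each_physical(F&& f) {
        std::apply([&](auto&... vecs) {
            (..., [&](auto& vec) {
                using T = typename std::decay_t<decltype(vec)>::value_type;
                if constexpr (requires(T& o) { o.p; o.h; })
                    for (auto& obj : vec)
                        f(obj);
            }(vecs));
        }, world);
    }

    // usage: for_each_physical([](auto& o) { o.p.vx = 0; o.h.hp -= 1; });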


Yeah, this is the nice thing about writing a game without a general-purpose game engine: you only need to go as deep into these things as the game needs. Spend a lot of time with general-purpose engines and it's always a bit of a pain when their world representation is a little bit 'wrong' for your game.


Columnar stores have been in use in games since at least the mid-90s.

Very common way of optimizing for CPU caches.


You can avoid invalidating indices by using a sparse array (like a flat_map) instead of a vector.
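
e.g. (with std::map standing in for a flat_map):

    #include <cstdint>
    #include <map>

    struct Physics { float vx = 0, vy = 0; };

    int main() {
        // Keyed by a stable entity id rather than a position in
        // a vector, so erasing one entry invalidates nothing else:
        std::map<std::uint32_t, Physics> physics;
        physics[7]  = {1, 0};
        physics[13] = {0, 1};
        physics.erase(7); // id 13 is still a valid handle afterwards
    }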


This isn’t as ‘brutal’ as you can go because the point of data oriented programming is to organise data coherently in memory based on how it’s accessed. Linear arrays as blobs of data are fine if you have strictly linear access, but you can go further by considering how systems access data. So you end up with ideas like archetypes as ways to structure memory to minimise cache misses.


I just love these posts and projects and am very grateful that they are provided open source. I'd love to do something like that one day when I have the time (i.e., when I retire), and it's projects like these that will most likely be my schooling in this world.


This looks rad! Reminds me of ARC (Attack Retrieve Capture) back in the late 90s -- now on life support as Armor Critical

https://en.wikipedia.org/wiki/Attack_Retrieve_Capture http://armorcritical.com


Incredible achievement. The game looks great.

To the author:

What else have you been working on commercially to support the project?

Do you plan to release it on Steam?

How does the persistent world affect gameplay?

With your clever rebuild-the-physics-when-a-new-player-joins technique, what does that functionality look like for existing players? Do bullets shift position, for example?


Hey there, thank you so much for the kind words!

- I have done various gamedev jobs over the years, the most recent being gameplay programming for this minigame in PUBG: https://www.youtube.com/watch?v=tSP5P0QGWa4 (I also managed to invest my savings successfully, which let me focus entirely on Hypersomnia.)

- Yes, I'd love to release it on Steam before 2024.

- For now there is no persistent world at all, I just mentioned it as a teaser for where the game might be heading if it becomes successful.

- Rebuilding the physics should not shift positions at all. What is rebuilt is merely the "hot state", like contacts and trees. The only moment where hot state matters is while advancing the simulation from one state to the next; it does not influence how state is later rendered, etc. The only impact on existing players could be a single framerate drop in case the physics world to be rebuilt is massive, but in practice all our maps are still rather small, best suited for 3v3 matches.


Thanks for the answers. You are an inspiration!


Rebuilding the physics state on client connect is a cool solution to synchronization!


Firstly, this looks absolutely unreal; things like this make me really appreciate how little I actually know about CS.

Secondly, I've caught a big ol' bug for you on MacOS. When on the pause screen/settings, mouse input is offset significantly - to the point that the main menu options aren't even clickable. I've gathered a screen capture of the problem here for you, with the logs and a system report.

https://youtu.be/O4OoMdeFAt0


Love to hear it! Sorry about the MacOS incident - I haven't launched it in ages, as we play on Windows and Linux pretty much 100% of the time. I'll see what I can do to fix it :)


Pfft! No apologies necessary. I think I speak for everyone with a pulse and more than 2 tired but functioning brain cells when I say: thanks for supporting MacOS in the first place, cheers for your absolute galaxybrain efforts.


Heck! I just realized I've been following some of your posts over the years without intending to. Especially about ECS and networking.

It's nice to see it so advanced. Congrats!


Thank you! That's interesting - I did write an article on SE about ECS back in 2013, but I think this is the first time I've published a proper piece on networking. I do link to an important Glenn Fiedler blog post when describing my deterministic architecture in the README, but just in case: I'm not him. I hope I wasn't too misleading :)


The gameplay reminds me of this old top-down multiplayer shooter from the early 2000s called Infantry Online:

https://www.freeinfantry.com/

https://en.m.wikipedia.org/wiki/Infantry_(video_game)


This looks really cool - however, the MacOS build guide seems to be a dead link...


Oh, thanks a lot for the catch! Fixed it immediately.



