I recently built out training and inference at a FinTech (for fraud and risk) using Elixir and tried this very approach…
We’re now using Python for training and forking Ortex (ONNX) for inference.
The ecosystem just isn’t there, especially for training. It’s a little better for inference but still has significant gaps. I will eventually have time to push contributions upstream but Python has so much momentum behind it.
Livebooks are amazing though, and a better experience than anything Python offers, libraries aside.
Do you know if Telkom offered ISDN from around ’95 through ’01?
My parents had “two lines” back then so we could always have access to the Internet, even while talking on the phone. I vaguely recall using dial-up, but we may have had an ISDN line. In part because my Dad wanted to order LaserDiscs from the US while my Mom was chatting away to family.
One caveat about LISTEN/NOTIFY is that channels are not first-class objects, so there's no authorization associated with them; anyone who can log in can also NOTIFY any payload to any channel.
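A quick sketch of that caveat using psycopg2 (connection strings, role names, and the channel name are all placeholders): any role that can connect can publish to any channel, since there's no GRANT for channels.

```python
import select

import psycopg2
import psycopg2.extensions

# Connection strings and the channel name are placeholders.
listener = psycopg2.connect("dbname=app user=app_reader")
listener.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
listener.cursor().execute("LISTEN billing_events;")

# Any other role that can connect may publish to the same channel;
# there is no GRANT that restricts NOTIFY.
intruder = psycopg2.connect("dbname=app user=lowly_intern")
intruder.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
intruder.cursor().execute("NOTIFY billing_events, 'forged payload';")

# The listener receives the forged payload like any other notification.
if select.select([listener], [], [], 5) != ([], [], []):
    listener.poll()
    while listener.notifies:
        note = listener.notifies.pop(0)
        print(note.pid, note.channel, note.payload)
```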
Supabase realtime (especially if you want a managed backend) or other streaming CDC setups (like Decodable, which is Flink/Debezium under the hood) are also great choices for logical replication. Streaming tech will continue to get more cost-effective and simpler to implement in the coming year(s).
I should note: I haven't used Decodable in production yet, I'm just a fan of Flink :)
A similar data structure that is also very useful uses sparse-dense arrays but encodes a free-list in the unused memory. It's an ideal data structure if your access patterns are sporadic reads/writes bookended by frequent complete iterations of the dense array. You can insert and remove (swap 'n' pop) in O(1) and iterate in O(n) without polluting your cache or having to skip over empty slots.
It's frequently used in Entity-Component Systems (ECS), which introduce safety by associating a counter with each entry in the sparse array (usually called a "generation") and incrementing it on deletion. You can also verify cross-linkage between the sparse and dense arrays, like the data structure described in the article, but that comes at the cost of another indirection and fetch from memory.
At a quick glance, this is very close. However, the proposal goes through several contortions to maintain pointer stability. If you preallocate sufficient storage for your worst case (which can be unmapped virtual memory) then it makes implementation a lot simpler.
Neat that the committee is looking to introduce it. Too bad we eschew the STL (and even CRT) in games.
There are some articles floating around that touch on the idea. Bitsquid comes to mind. [1]
To give a slightly more detailed description: you allocate two arrays of equal extent. One is the dense array of whatever type, say T, and the other is the sparse array of 32- or 64-bit integers. The sparse array stores indices into the dense array for extant objects. For extinct objects, however, the indices instead point into the sparse array itself to encode a free-list.
To insert, in O(1), you check the free-list for an unused slot. If there is none, then you grow the sparse array (increment a counter) and use the newly introduced slot. Then you insert (or construct in place) your object at the end of the dense array. Finally, you map that slot in the sparse array to the end of the dense array.
To delete, in O(1), you swap the object you wish to delete with the last in the dense array and decrement the counter (swap and pop). Then you add the index in the sparse array to the free-list. You’ll need to know the index assigned to the object you swapped with to do this, which can either be stored on the object itself or in another array that maintains an inverse mapping.
To iterate, you can treat the dense array like any other array.
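Here's a minimal sketch of the above in Python (names are mine; in a game you'd do this in C/C++ with flat arrays, but the bookkeeping is identical):

```python
class SparseSet:
    """Sparse array hands out stable handles; dense array stays packed."""

    FREE_END = -1  # sentinel terminating the free-list

    def __init__(self):
        self.sparse = []           # handle -> dense index, or free-list link
        self.dense = []            # packed objects, iterated front to back
        self.dense_to_sparse = []  # inverse mapping: dense index -> handle
        self.free_head = self.FREE_END

    def insert(self, obj):
        """O(1): reuse a free slot if any, else grow the sparse array."""
        if self.free_head != self.FREE_END:
            handle = self.free_head
            self.free_head = self.sparse[handle]  # pop the free-list
        else:
            handle = len(self.sparse)
            self.sparse.append(0)
        self.sparse[handle] = len(self.dense)
        self.dense.append(obj)
        self.dense_to_sparse.append(handle)
        return handle

    def remove(self, handle):
        """O(1): swap-and-pop the dense array, push the slot onto the free-list."""
        i = self.sparse[handle]
        last = len(self.dense) - 1
        if i != last:
            # Move the last object into the hole and repoint its handle.
            self.dense[i] = self.dense[last]
            moved = self.dense_to_sparse[last]
            self.dense_to_sparse[i] = moved
            self.sparse[moved] = i
        self.dense.pop()
        self.dense_to_sparse.pop()
        self.sparse[handle] = self.free_head  # link the slot into the free-list
        self.free_head = handle

    def __iter__(self):
        return iter(self.dense)  # O(n), cache-friendly, no empty slots
```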
This scheme poses some challenges:
- You lose pointer stability because you are relocating your objects. You must access them through indices. If you can defer your deletions, say to a frame boundary, you can assume your pointers are stable in between.
- Indices, without further work, are no safer than pointers. Usually you don’t expect more than N objects, so you can use log2(N) bits for the forward and reverse mappings between sparse and dense arrays and use the leftover bits to identify stale indices. You can, and should, encapsulate this in a type-safe handle system (sketched after this list). [2][3]
- High churn can cause stale-index detection to fail (once a generation counter wraps), and under such workloads this structure performs worse than alternatives.
- “Swap and pop” requires T to be trivially copyable or, ideally, trivially relocatable.
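A hedged sketch of such a handle system, using the 24-bit index / 8-bit generation split described in footnote [3] below (field widths and names are illustrative):

```python
INDEX_BITS = 24
INDEX_MASK = (1 << INDEX_BITS) - 1
GEN_MASK = 0xFF

def make_handle(index, generation):
    """Pack index and generation into one 32-bit handle."""
    return ((generation & GEN_MASK) << INDEX_BITS) | (index & INDEX_MASK)

def resolve(handle, generations, sparse):
    """Return the dense index, or None if the handle is stale."""
    index = handle & INDEX_MASK
    generation = handle >> INDEX_BITS
    # The slot's generation is bumped whenever the slot is freed, so a
    # mismatch means this handle outlived the object it referred to.
    if generations[index] != generation:
        return None
    return sparse[index]
```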
It’s also worth mentioning that you can reorder the dense array to maintain custom invariants. This is really useful if you have dependencies between objects and want to iterate in order.[4]
[3]: You can usually use 24 bits for the index and 8 bits for a discriminator (counter) in games. Handles (that you pass around) can be wider to exploit register widths and increase safety, since you then have 32 spare bits for further runtime checks.
[4]: By doing more work at insertion and deletion, a constant factor on O(1), you can save work on iteration or updates, a constant factor on O(n). A great example of this being a worthwhile optimization is for scene graphs or transform hierarchies. If you partially order by depth in the hierarchy, you can guarantee correct computation without chasing indexes or pointers. If you maintain additional information about boundaries, you can trivially parallelize computation of transforms or reduce redundant work.
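To make [4] concrete, here's a hedged sketch (numpy, illustrative names): if the dense array is partially ordered so a parent always precedes its children, world transforms fall out of one linear pass with no pointer chasing.

```python
import numpy as np

def update_world_transforms(local, parents):
    """local: (n, 4, 4) matrices; parents[i] < i by construction, -1 marks a root."""
    world = np.empty_like(local)
    for i in range(len(local)):
        p = parents[i]
        # The partial order guarantees world[p] is already up to date.
        world[i] = local[i] if p < 0 else world[p] @ local[i]
    return world
```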
This echoes my experiences, especially with the move to the “live services” model for games. The teams working on the services for multiplayer games are often very small, so it’s all about leverage. They’re also embedded within a culture that doesn’t value or understand the practices and tools taken for granted in webdev, mostly because they’re hard to transfer to the other domains in gamedev.
I’m glad to see the open source development of these tools. Personally, there’s a lot of stuff colleagues and I have built that I would love to see open sourced. Otherwise we just end up implementing the same services over and over again.
Another example is patch delivery. It’s a solved problem, yet everyone keeps rolling their own, or only shipping to Steam. Of course, it's all managed with some Jenkins scripts written by someone who’s no longer at the company. I want Fastlane for games: something that makes it easy to target the various distribution channels across desktop/mobile/console.
The lack of a standardized patch system for games is mind-boggling! Sure, there are some for desktop software executables, but those just ship the whole new version every time; such apps are rather small, while games can be several GB. We had to roll our own for our game, and I couldn’t believe there wasn’t a universal standard when I went searching for one last year. Butler from itch.io looked interesting, but I couldn’t decipher how to use it in the end. Rolling patches manually seemed crazy to me, so we ended up using a library similar to rsync. It works pretty well and makes deploying patches easier, but it's still not perfect.
I've been meaning to implement SGP4 from scratch as a learning exercise. What I found really interesting is how the USAF/NORAD tracks and reports objects in LEO: they publish Two-line Element Sets (TLEs), which are a fixed-width ASCII format derived from punch cards.[1] The format is pretty easy to parse.[2]
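To illustrate [2], a hedged sketch of pulling a few fields out of TLE line 2 by column (offsets follow the classic fixed-width layout; the sample line is illustrative and its checksum is omitted):

```python
def parse_tle_line2(line2):
    """Slice out line-2 fields; slices are 0-indexed versions of the spec columns."""
    return {
        "catalog_number": int(line2[2:7]),
        "inclination_deg": float(line2[8:16]),
        "raan_deg": float(line2[17:25]),
        "eccentricity": float("0." + line2[26:33]),  # leading "0." is implied
        "arg_perigee_deg": float(line2[34:42]),
        "mean_anomaly_deg": float(line2[43:51]),
        "mean_motion_rev_per_day": float(line2[52:63]),
    }

# Illustrative ISS-style line (epoch/rev fields made up, checksum omitted):
print(parse_tle_line2(
    "2 25544  51.6400 208.9163 0006317  69.9862  25.2906 15.4918539312345"
))
```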
Just want to give props to the skyfield Python library. I have no idea what any of the numbers in the TLE mean, but just by following some examples I've been able to plot the location of Starlink satellites over time very easily.
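For anyone curious, a hedged sketch of that kind of skyfield usage (the CelesTrak URL is one possible source; any TLE file works, and in practice you'd sample times near the TLE epoch):

```python
from skyfield.api import load, wgs84

# Fetch current Starlink TLEs (URL is an assumption; any TLE file works).
sats = load.tle_file(
    "https://celestrak.org/NORAD/elements/gp.php?GROUP=starlink&FORMAT=tle"
)
ts = load.timescale()

sat = sats[0]
# Sample every 10 minutes; pick times near the TLE epoch for accuracy.
for t in ts.utc(2024, 1, 1, 12, range(0, 90, 10)):
    subpoint = wgs84.subpoint(sat.at(t))
    print(t.utc_strftime("%H:%M"),
          round(subpoint.latitude.degrees, 2),
          round(subpoint.longitude.degrees, 2))
```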
Windows bundles software decoders (for H.264 and H.265) that are available through MFT and DXVA.[1][2] Chromium already leverages both.[3] It's also a low-effort way to support hardware-accelerated decoding, as vendors (Intel, AMD, NVIDIA) register MFTs that wrap their proprietary SDKs (Quick Sync, AMF, NVDEC).
These provided MFTs have to meet certain minimum requirements for certification by Microsoft, which amounts to "good enough." Results vary in practice and can be incredibly frustrating (especially encode). It's often better to use the underlying SDKs but that comes with a lot of hassle.
(I've been working on a video capture and sharing app inspired by Shadowplay.)
Windows 10 is also rolling out support for AV1, though not bundled.[4]
Worth noting that you pass the data in a proprietary binary format. [1]
I've done it, not out of performance concerns, but because transcoding between proprietary binary formats with types is a lot saner than the alternative.
One caveat to keep in mind when using the binary format is that arrays of custom types are not portable across databases because the serialized array contains the OID of the custom type, which may be different on the other end.
The other thing to keep in mind is that text or CSV can be much more compact for data sets with many small integers or NULLs. On the other hand, the binary format is much more compact for timestamps and floating point numbers. In general, binary format has lower parsing overhead.
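A hedged sketch of what that framing looks like from Python, hand-rolling a single int4 column (table and connection details are placeholders; the byte layout follows the documented COPY BINARY framing):

```python
import io
import struct

import psycopg2

# Framing per the documented COPY BINARY layout: an 11-byte signature,
# int32 flags, int32 header-extension length, then per tuple an int16
# field count and length-prefixed fields, and an int16 -1 trailer.
# All integers are in network byte order.
buf = io.BytesIO()
buf.write(b"PGCOPY\n\xff\r\n\x00")          # signature
buf.write(struct.pack("!ii", 0, 0))          # flags, extension length

for value in (1, 2, 3):
    buf.write(struct.pack("!h", 1))          # one field in this tuple
    buf.write(struct.pack("!i", 4))          # field byte length
    buf.write(struct.pack("!i", value))      # the int4 itself
    # A NULL would be struct.pack("!i", -1) with no data bytes after it.

buf.write(struct.pack("!h", -1))             # end-of-data marker
buf.seek(0)

conn = psycopg2.connect("dbname=app")        # placeholder connection
with conn, conn.cursor() as cur:
    cur.copy_expert("COPY t (n) FROM STDIN WITH (FORMAT binary)", buf)
```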