Vulkan Memory Allocator (gpuopen.com)
126 points by fctorial 8 days ago | 108 comments





This is a fairly well known project which fixes one of Vulkan's greatest shortcomings (some might say the lack of resource memory management is one of Vulkan's greatest features though), but I wonder if there are alternatives which provide most of the critical features but with a much smaller footprint. VMA is around 20kloc, which is about the same as jemalloc (23kloc). A general purpose allocator like jemalloc is overkill for many situations, but there are much smaller (yet slower) alternatives like Emscripten's emmalloc (which is just 1.4 kloc: https://github.com/emscripten-core/emscripten/blob/main/syst...).
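
(For context, the happy path with VMA looks roughly like the snippet below; this is a from-memory sketch, so check the VMA docs for the exact signatures. One call replaces vkAllocateMemory/vkBindBufferMemory plus the sub-allocation bookkeeping you'd otherwise write yourself, and it assumes instance/physical_device/device have already been created.)

    #include "vk_mem_alloc.h"

    // Create one allocator per device (error handling omitted in this sketch).
    VmaAllocatorCreateInfo allocator_info = {};
    allocator_info.instance = instance;
    allocator_info.physicalDevice = physical_device;
    allocator_info.device = device;
    VmaAllocator allocator;
    vmaCreateAllocator(&allocator_info, &allocator);

    // One call creates the buffer, picks a memory type, sub-allocates and binds.
    VkBufferCreateInfo buffer_info = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
    buffer_info.size = 64 * 1024;
    buffer_info.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
    VmaAllocationCreateInfo alloc_info = {};
    alloc_info.usage = VMA_MEMORY_USAGE_GPU_ONLY;
    VkBuffer buffer;
    VmaAllocation allocation;
    vmaCreateBuffer(allocator, &buffer_info, &alloc_info, &buffer, &allocation, nullptr);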

Are there similar smaller alternatives for VMA?

As for the motivation: my 3D API wrapper around OpenGL, D3D11, Metal and WebGPU clocks in at 15kloc for all 3D backends, and I'm hesitant to add a Vulkan backend exactly because of problems like doing my own memory management for Vulkan resources. If I were to integrate VMA, this would more than double the line count just for the memory management of a single 3D backend, which simply doesn't seem "right". See: https://github.com/floooh/sokol/blob/master/sokol_gfx.h


V-EZ [1] was kinda supposed to be that, a wrapper that makes Vulkan easier to use in non-über-performance-critical applications, but it seems to be dead [2]... Well, maybe not smaller, but at least standardized to the point where you would expect it to be present as a system dependency.

[1] - https://github.com/GPUOpen-LibrariesAndSDKs/V-EZ

[2] - https://github.com/GPUOpen-LibrariesAndSDKs/V-EZ/issues/73


If performance is not critical, why not just use something simpler like OpenGL or D3D11?

One problem is that tooling support around OpenGL and D3D11 is quickly rotting away, because the GPU vendors and Microsoft focus on Vulkan and D3D12.

D3D/DirectX is Windows-only, while Vulkan is supposed to be replacing OpenGL, not complementing it. Maintaining three different graphics APIs (DirectX, OpenGL, Vulkan) is always harder than maintaining only two (DX+Vulkan (eventually)).

You can use OpenGL through libraries like Angle regardless of the underlying APIs. That's what web browsers use to support WebGL on all platforms.

Five actually, you are missing NVN and GNM/GNMX.

I wasn't trying to enumerate every existing graphics API, nor is that relevant to my point.

I consider Vulkan to have repeated the same design-by-committee mistakes as OpenGL, plus a disregard for graphics programming newbies of the kind that the proprietary APIs (including the console ones) don't have.

Ironically, WebGPU is what Vulkan should have been in the first place, from my point of view.


Vulkan was never intended to be for "graphics programming newbies". It's intended to be the thing you would build a more developer-friendly API on top of.

Look at the state of Vulkan drivers in practice, the "fastest" ones in benchmarks of actual AAA games do tons of work that was supposed to be done in userspace (like, that was the whole point!). I agree in principle that it's important to have an API expose all the ugly low level details, but doing so and then telling people not to actually use it is pretty much obviously going to result in a suboptimal compromise like what we have today. Of course people are going to try to code against vanilla Vulkan; I think things would be different if it had shipped side by side with something like a robust implementation of WebGPU, so people who weren't able to use big name game engines had something to fall back on, but that's not what happened.

> the "fastest" ones in benchmarks of actual AAA games do tons of work that was supposed to be done in userspace

Can you give some examples, preferably with links to sources?


I think the counter-intuitive "hindsight-is-20/20" fact is that a higher level, more abstract API provides more wiggle room for GPU vendors to implement optimizations for their specific GPU architecture. "Abstract API" doesn't mean it has to be a mess like what OpenGL became after 1.x, but one problem is that many people think Vulkan was the only possible alternative future to OpenGL, while ignoring APIs like D3D11 and Metal which would have been a better starting point for an OpenGL successor.

Idk, as an "intermediate" graphics programmer, the explicitness and lack of abstraction in Vulkan is one of the killer features for me. "Wiggle room" in higher-level APIs like OpenGL is one of the things that made it impossible to deliver consistent results with them. With Vulkan there's more typing, but once you get something working it's going to work the same pretty much everywhere.

I kind of see it as a non-issue. It seems like Vulkan's target audience was specifically elite developers who wanted the most control possible, and based on the results which have been achieved in products like Doom 2016/Eternal it seems like it's working for them.

I suspect WebGPU will be the OpenGL successor for everyone working with graphics who isn't at an AAA game studio.


As I wrote above, OpenGL is an exceptionally bad counterexample to Vulkan because it's just a massive soup of knobs and dials. D3D11 and Metal had this specific problem fixed already, and should have been used as a base for an OpenGL successor instead of going all radical with Mantle.

Vulkan is basically a GPU-programming API with some support for triangle rendering, great for compute workloads, not so great for rendering triangles. At the very least, Khronos should provide additional layers on top of Vulkan to simplify traditional rendering tasks, basically providing one or more optional API layers that are closer to Metal and D3D11.


I would recommend watching some of the talks from around the time Vulkan was announced. The API was released very much in response to developer demand to just give them access to the GPU in an unopinionated way, and largely it has delivered on that promise.

Yeah I was around at the time. As I remember it, the "developer demand" was a very small but very vocal cabal of AAA game rendering engineers :)

I think the general idea was the right one, the execution of that idea was the problem (e.g. Metal is the successful execution of the same idea).


Metal gives no choice of a lower level. Something higher level built on Vulkan, with the ability to still use the lower level for those who need it, would be the successful execution. Metal isn't that, besides being Apple-only anyway.

I.e. I'd agree with your idea if Metal was actually built by Apple on top of Vulkan as a convenience abstraction and you could use Vulkan there directly too.


The first Metal version didn't provide much low-level access, but the later Metal versions gradually added more low-level, more explicit features. The key point is that those features can be ignored if the higher level and more convenient features are good enough.

But I think the parent comment's point is just that if you start with low-level, you can always build on top of it, but the other way around you have to wait for the API vendor to give you the low level access you need.

I hear that you are saying Metal does a better job of being a convenient graphics API than Vulkan. I am just having trouble parsing why this would be an objective problem with Vulkan, rather than you just not being the target audience of this API.


"Wiggle room" means, for example, that OpenGL drivers might look at the name of your executable to apply different hacks and profiles. It means that you write perfect code, and then who knows what will happen. Thanks, but no thanks.

Not what I meant. Besides, if you read NVIDIA and AMD driver release notes carefully, you will find that these also contain game-specific fixes in the Vulkan drivers. It's just rarer because there are not many games using Vulkan.

If you look at Khronos slideware, that is kind of the official message, so naturally that is what those people end up thinking, especially if they are newbies looking into learning portable 3D APIs.

radv is one of the fastest ones. What does it do that it shouldn't?

radv is one of the ones I'm referring to; it's open source so we can see the tricks they use. One example: it spawns a thread behind your back to execute command buffers so that queue submit appears nonblocking (or at least fast). This is completely contrary to the spirit of the Vulkan spec, which deliberately doesn't provide any convenience mechanisms for things like callbacks on command buffer completion, precisely because these would require runtime threads to be around. The user has no control over this thread, including where it spawns or its priority (how could it, when it's an implementation detail of the driver?), nor can they take advantage of this knowledge to improve synchronization on the CPU side (e.g. by removing external synchronization around queues).
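
To make it concrete, the effect is roughly equivalent to the application doing something like the untested sketch below itself, except that here the application would at least own the thread (all names are mine, this is not radv code):

    #include <vulkan/vulkan.h>
    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <thread>

    // A submission that has been handed off; the referenced command buffers must
    // stay alive until 'fence' signals.
    struct PendingSubmit {
        VkSubmitInfo info;
        VkFence      fence;
    };

    // Makes submit() return immediately by doing the real vkQueueSubmit on a
    // dedicated worker thread. Only this thread ever touches the VkQueue, which
    // satisfies Vulkan's external synchronization requirement for queues.
    class SubmitThread {
    public:
        explicit SubmitThread(VkQueue queue) : queue_(queue) {}
        ~SubmitThread() {
            { std::lock_guard<std::mutex> lock(mutex_); done_ = true; }
            cv_.notify_one();
            worker_.join();
        }
        void submit(const PendingSubmit& s) {
            { std::lock_guard<std::mutex> lock(mutex_); pending_.push(s); }
            cv_.notify_one();
        }
    private:
        void run() {
            for (;;) {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !pending_.empty(); });
                if (done_ && pending_.empty()) return;
                PendingSubmit s = pending_.front();
                pending_.pop();
                lock.unlock();
                vkQueueSubmit(queue_, 1, &s.info, s.fence);  // the actual blocking call
            }
        }
        VkQueue queue_;
        std::mutex mutex_;
        std::condition_variable cv_;
        std::queue<PendingSubmit> pending_;
        bool done_ = false;
        std::thread worker_{[this] { run(); }};
    };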

Contrast this with the situation on Metal (explicit runtime that spawns threads, and explicit callbacks to be able to take advantage of that) or DirectX 12 (explicit runtime that works for the average case, with extremely fine grained control provided over the threading model, priorities, etc., including being able to turn them off completely if you really need it). Both of these are clearly much better models because they are actually exposing the detail that seems obvious in hindsight (applications need queue submission to be nonblocking) and are able to provide much more flexible, efficient, and useful APIs as a result.

By leaving everything up to the user, Vulkan in practice ends up underperforming its competitors unless the drivers patch up the work--at which point you have the unfortunate situation where you have people targeting a low-level API running on a high-level runtime. It's this, not some dumb digression about extensions, that IMO makes Vulkan kind of disappointing--it's definitely not a failure, but it could've been much better than it was.


As long as it doesn't violate the spec, I'd look at actual performance results of it. I.e. if it interferes with usage or creates some kind of performance degradation for the applications, then it's indeed a problem. Otherwise this requirement should have been an explicit part of the spec.

I.e. I'd imagine if this is a real problem, there should be some RFC for the updated version of the API where this is prohibited or some requirements are added about how to control it.


It creates a performance degradation for applications that are correctly using Vulkan. And from conversation with Khronos people they do definitely consider this a failure (either of the spec or of the driver); among other things, it makes benchmarking rather difficult if new threads are spawning constantly as you submit command buffers. But my assertion (backed up by radv benchmarks) is that effectively nobody can use Vulkan APIs "correctly" which is why implementations like that are helpful in practice.

Then I'd expect someone to actually propose to address this in future versions. The question is how easily they can do that, given there are some backward compatibility guarantees, I suppose.

I'm not sure what their approach for that is. Maybe they break it at some point, and then the driver can offer more knobs for such features to be optional or behave differently in the newer versions.


You don't need to be a graphics programming veteran to learn Vulkan either. The official spec is quite approachable even for newbies.

Yeah it is a very well written spec. I've been working on an LSP implementation recently, and I wish the spec was even 10% as clear as Vulkan's. Also the official tutorial is really excellent.

What I think is unapproachable about Vulkan is just the complexity of the problem space. There are a lot of concepts to be aware of just to get the first triangle on the screen (swap-chain, synchronization, render-passes etc.), so in my experience it took quite a few hours of working with it to build an intuition about how everything fits together. And that was coming from an OpenGL background where 70% of the concepts were familiar. I can imagine if you were coming from something like web front-end development, Vulkan could seem pretty inscrutable.

But I will say, since I did cross that bridge, Vulkan became much more intuitive to me than OpenGL. The fact that everything's explicit means there's no mysteries, and the declarative style becomes quite guessable after some time.


As verbose as it often is, the lack of an OpenGL-style state machine in Vulkan (or WGPU/Metal/DX12 for that matter) makes it so much easier to reason about. About 70% of the bugs I've had to deal with in OpenGL involved me forgetting to unbind some buffer object or shader program from OGL's global state. In my experience the "simplicity" of OpenGL is often negated by having to write wrappers around every OpenGL resource type for safe resource management via RAII.
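
e.g. the kind of wrapper I mean (a minimal sketch, names made up; assumes a loader like glad has set up the GL function pointers):

    #include <glad/glad.h>

    // Owns one GL buffer object: created on construction, deleted on destruction,
    // and never left bound after use, so nothing lingers in the global state.
    class GlBuffer {
    public:
        GlBuffer()  { glGenBuffers(1, &id_); }
        ~GlBuffer() { glDeleteBuffers(1, &id_); }
        GlBuffer(const GlBuffer&) = delete;             // GL names aren't copyable
        GlBuffer& operator=(const GlBuffer&) = delete;

        void upload(const void* data, GLsizeiptr size) {
            glBindBuffer(GL_ARRAY_BUFFER, id_);
            glBufferData(GL_ARRAY_BUFFER, size, data, GL_STATIC_DRAW);
            glBindBuffer(GL_ARRAY_BUFFER, 0);           // unbind so global state stays clean
        }
        GLuint id() const { return id_; }
    private:
        GLuint id_ = 0;
    };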

Not to mention, error handling is a huge improvement in Vulkan. In OpenGL, errors accumulate internally and get popped by calling glGetError. A rookie mistake is only calling the function once, when you really have to call glGetError until it returns GL_NO_ERROR every time you check, to catch all errors. By contrast, Vulkan just returns a VkResult code from every function that can fail.
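
Roughly the difference, as a sketch (the Vulkan half assumes a device and a filled-in VkBufferCreateInfo already exist):

    #include <vulkan/vulkan.h>
    #include <glad/glad.h>
    #include <cstdio>

    // OpenGL: one glGetError call is not enough, drain until GL_NO_ERROR.
    void check_gl_errors() {
        GLenum err;
        while ((err = glGetError()) != GL_NO_ERROR) {
            std::fprintf(stderr, "GL error: 0x%x\n", err);
        }
    }

    // Vulkan: every fallible call hands back a result code immediately.
    VkResult create_buffer_checked(VkDevice device, const VkBufferCreateInfo& info,
                                   VkBuffer* buffer) {
        VkResult res = vkCreateBuffer(device, &info, nullptr, buffer);
        if (res != VK_SUCCESS) {
            std::fprintf(stderr, "vkCreateBuffer failed: %d\n", static_cast<int>(res));
        }
        return res;
    }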


When Khronos states that OpenGL is done, newbies are going to try to learn an API that is actually getting updates, regardless of its intended target.

And if I market a laser cutter, and you decide to use it to cut your steak at dinner, am I to blame if you find it too complicated to operate?

If you let the knives rust, I have no other option left once they're gone.

A rare moment where we completely agree... Vulkan drivers end up implementing all sorts of workarounds that undermine the "low level" nature of the API because people can't use them efficiently. e.g. the Mesa driver spawning a thread per queue to actually perform the submission despite the spec being intended to have the user control threading.

> I consider Vulkan to have redone the same design by committee mistakes from OpenGL

Which are? (I'm genuinely interested)


Extension spaghetti for starters, https://vulkan.gpuinfo.org/listextensions.php

Followed by Khronos' insistence that it isn't part of their job to define an SDK, so each newbie has to go through the rite of passage of learning how to get OS abstraction libraries to show a 3D-accelerated window, a math library, a font handling library, a texture and image loading library, shader compiling infrastructure, a scene graph to handle meshes, ...

Now there is the LunarG SDK, which still only offers a subset of these kinds of features.

If it wasn't for NVidia's early C++ efforts, Vulkan would still be C only.

Also, Vulkan only exists because AMD was kind enough to contribute Mantle as starting ground, otherwise Khronos would most likely still be arguing about how OpenGL vNext was supposed to look.

Really, in the 21st century if you want to write portable 3D code, just use a middleware engine with a plugin-based backend.


> so each newbie

Vulkan isn't for newbies. Really, if you're new to 3D graphics then you're not Vulkan's target audience; pretending otherwise only results in pain. That's like complaining that a modern CPU's privileged instructions are too complicated for people new to assembly – yes, they are, and no, that's not a design flaw.

> Also Vulkan only exists because AMD was so kind to contribute Mantle as starting ground

That's maybe a point against Khronos, not against Vulkan.

> Really, in the 21st century if you want to write portable 3D code, just use a middleware engine with a plugin-based backend.

Which is exactly what people should be doing. And those middleware engines can be written using Vulkan, because it is designed the way it is.


My degree thesis was porting a particle engine framework from NeXTSTEP/OpenGL/Objective-C into Windows/OpenGL/C++ with MFC.

Yet that doesn't mean I don't care about newbies in 2021, especially when Khronos says OpenGL isn't going to move beyond version 4.6, thus everyone new in the field feels like Vulkan is what they should learn instead.


> Really, in the 21st century if you want to write portable 3D code, just use a middleware engine with a plugin-based backend.

Isn't this why Vulkan was designed the way it was? It's lower level, giving more control over things like memory. In this way I view it somewhat like the ASM of the GPU (even though there are lower levels still).

I'm curious if anyone with a lot of experience writing graphics "middleware engine" backends agrees.


The idea was probably "build it and they will come" (the library authors who provide the easier-to-use wrapper libraries). The problem is that these libraries are hobbyist/volunteer work (with the notable exception of native WebGPU implementations), which on its own isn't a bad thing, but hobbyists can't afford a testing lab running hundreds of GPU/driver/OS combinations to make sure that their libraries are as robust as expected.

The companies who have these resources (for instance the GPU vendors), chickened out by designing Vulkan and thus offloading those QA tasks to the API users (simplified, but that's what it is in the end).

Which leaves Unity, Epic, and a handful of AAA game developers as potential Vulkan users, which in turn are not enough to test Vulkan implementations, because just a handful of API users isn't enough to cover all the dusty corners (same situation as back in the bad old days with MiniGL).


> Which leaves Unity, Epic, and a handful AAA game developers as potential Vulkan users

??? For example, there are a substantial number of emulators that have already implemented or are in the middle of implementing a Vulkan backend: RPCS3, Dolphin, Yuzu, Cemu, Ryujinx, PPSSPP... these emulators are developed by small teams but often push the hardware hard, bringing bugs to light in the process.


Also DXVK and vkd3d.

Yeah exactly. Vulkan is meant to be the lowest level interface possible to the hardware. It's for programmers who are hitting a wall in terms of optimization because of the overhead imposed by OpenGL or DX11. It's not intended for everyday programmers, and it would not make sense to make concessions in performance to accommodate usability concerns.

But the thing is that Vulkan isn't the lowest possible hardware interface; it's still built on top of a virtual GPU abstraction which matches some GPUs better than others. There are a ton of compromises to accommodate mobile GPUs, for instance. The only realistic way to achieve the goal of an actually low-level API is to have one API per major GPU architecture, and to create new APIs when new GPU architectures emerge. Attempts to abstract over radically different GPU architectures will never result in an actually explicit low-level API.

Ok yeah that's technically true, but there's always going to be a balance point. You can't expect graphics programmers to write a separate renderer for each possible hardware target. That kind of works in the console space, but in the PC space you have to account for a wide variety of hardware.

There's always going to be a balance point, and Vulkan's priority is much more about getting as low-level as possible within the constraints than it is about being approachable for developers.


> it's still built on top of a virtual GPU abstraction which matches some GPUs better than others.

Doesn't Khronos contain people from a bunch of different corps (AMD, Nvidia, Intel, Qualcomm) for this exact reason?


Yeah, but that doesn't seem to have helped much to keep the Vulkan API small and tidy, instead there are dozens (or maybe hundreds by now?) of vendor-specific extensions.

I think this is vastly overstated. I've done quite a bit of work for Vulkan, and have done just fine without reaching for extensions.

And how else do you propose accessing hardware features which are vendor specific?


> Extension spaghetti for starters

Extensions are a PITA to deal with, but they generally represent real, disparate hardware features offered by individual GPUs, which vary by manufacturer, model, and date. This reminds me of CPU features like SSE2, AVX, AES, etc., which general-purpose binary programs are forced to query for at runtime in order to either take advantage of them or fall back to a software implementation. But GPUs have even more architectural change velocity than CPUs.

It seems like a hard problem in general. How do you think this could be done better?
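
For what it's worth, on the Vulkan side the runtime query is at least mechanical; something like this sketch, very much in the spirit of checking CPUID bits:

    #include <vulkan/vulkan.h>
    #include <cstring>
    #include <vector>

    // Returns true if the physical device advertises the named extension.
    bool device_has_extension(VkPhysicalDevice gpu, const char* name) {
        uint32_t count = 0;
        vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, nullptr);
        std::vector<VkExtensionProperties> exts(count);
        vkEnumerateDeviceExtensionProperties(gpu, nullptr, &count, exts.data());
        for (const auto& e : exts) {
            if (std::strcmp(e.extensionName, name) == 0) return true;
        }
        return false;
    }

    // e.g. fall back to a compute-based path when ray tracing isn't available:
    // bool rt = device_has_extension(gpu, VK_KHR_RAY_TRACING_PIPELINE_EXTENSION_NAME);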


Yeah exactly. How else can you support the wide range of hardware features out there without being stuck with a lowest-common-denominator?

I think what pjmlp is trying to suggest is that the API, or Khronos, instead of allowing extensions with limitless possibilities, should do what Metal and DirectX are doing and set tiers of GPU feature levels. You either support ray tracing plus N other features and can call yourself a Vulkan 2.0 GPU, or you stay at Vulkan 1.x.

This is exactly how it has been working in the PC world, and it is working quite well as far as I can tell.


Apple, Microsoft, Sony and Nintendo have such a hard time delivering high quality graphics with their proprietary APIs, pity them.

"How do you support a wide variety of hardware with different features without API extensions?"

"Look at these companies that offer an API without extensions by not even trying to support hardware with different features."


PC, iOS and console AAA game studios are doing just fine with those APIs.

How does DX12 do feature discovery for things like ray tracing which are supported on only some hardware?

Haven't checked DX12 yet, but both D3D11 and Metal have a small number of "tiers" with guaranteed feature sets, which basically correspond to GPU generations. Usually you pick the lowest tier you can afford and write your code against that feature set.

In D3D11, the "tier" is basically the minor version number (D3D11.1, .2, etc...), while in Metal you have this handy reference:

https://developer.apple.com/metal/Metal-Feature-Set-Tables.p...

Both solutions prevent the "combinatorial explosion" of OpenGL or Vulkan extensions.


Ok I didn't know that, but it seems like a very sensible approach. I suppose you would lose some specificity in terms of very specific features, but probably this would be sufficient in most cases.

But couldn't you do something very similar in Vulkan? I.e. essentially bucket your render-paths into a couple tiers by checking for a set of extensions required to support each one?
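
Something like this hypothetical bucketing, say (assuming an extension-check helper along the lines of the one sketched further up the thread):

    #include <vulkan/vulkan.h>

    // Wraps vkEnumerateDeviceExtensionProperties, as sketched earlier.
    bool device_has_extension(VkPhysicalDevice gpu, const char* name);

    // Illustrative tiers; pick the highest one whose required extensions are present.
    enum class RenderTier { Basic, Bindless, RayTraced };

    RenderTier pick_tier(VkPhysicalDevice gpu) {
        if (device_has_extension(gpu, VK_KHR_ACCELERATION_STRUCTURE_EXTENSION_NAME) &&
            device_has_extension(gpu, VK_KHR_RAY_TRACING_PIPELINE_EXTENSION_NAME))
            return RenderTier::RayTraced;
        if (device_has_extension(gpu, VK_EXT_DESCRIPTOR_INDEXING_EXTENSION_NAME))
            return RenderTier::Bindless;
        return RenderTier::Basic;
    }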


By requiring DX12 Ultimate support, done.

This is a very low-quality argument. Game developers quite clearly cannot only target DX12 Ultimate and have their product only work for a small percentage of users on PC.

If you want to target PC, you have to deal with hardware fragmentation. You can pretend this is a Vulkan issue, but it's just a reality of the platform.


Nice story, except DirectX owns the PC market, period.

pjmlp I have been on this forum long enough to know you would eventually move the goalposts. Your claim was about the merit of extensions, I'm not sure why you're talking about marketshare now.

What goalposts? The APIs that own the games market are doing just fine without Khronos mess.

This is not feature discovery, though - this is feature constraining, which is rather different, and doesn't work at all if you want to get a game out to a large audience while also taking advantage of different hardware configurations.

It has done wonders for PC gaming so far.

Feature discovery has done wonders for PC gaming. Not feature constraint, like you're advocating.

Like the proprietary APIs do it: there is a limited feature set, and it is set, done.

With extension spaghetti, not only are we back in GLAD and GLEW land, each set of extensions is yet another possible code path.


Extensions are the stick with which hardware vendors beat platform companies (in practice, Microsoft) into innovation. That's how we got raytracing, for example.

Hardware raytracing was designed by NVidia and Microsoft together while creating DirectX 12 Ultimate, shown to the world in an Unreal Engine based demo of a Star Wars lift scene; it has zero to do with Vulkan.

I doubt either of us were in the relevant rooms, but the API is very obviously a DX-ification of Nvidia's OptiX (the software-based raytracing solution that they released long before HW raytracing). Furthermore, Nvidia released OpenGL and Vulkan extensions for raytracing long before the release of DX12 Ultimate. Do you think Microsoft would have allowed that if it was a genuine co-development by the two companies?

> Followed by Khronos' insistence that it isn't part of their job to define an SDK, so each newbie has to go through the rite of passage of learning how to get OS abstraction libraries to show a 3D-accelerated window, a math library, a font handling library, a texture and image loading library, shader compiling infrastructure, a scene graph to handle meshes,

That seems to me like a reasonable decision: the graphics API does not try to encompass and duplicate many other unrelated APIs, and stays focused on graphics.


I completely agree. I recently decided to try learning Vulkan. I built an SDL window framework and got an OpenGL triangle in maybe 6 lines of OpenGL code. For Vulkan you literally cannot draw anything until you reimplement an entire rendering engine, from probing hardware, to figuring out queuing strategies, to resource management, etc.; you are forced to focus on HOW rather than WHAT to draw. I think they set the barrier of entry way too high. From a newbie perspective I'd rather have the library make smart decisions by default so I can just init and draw. But if I wanted to do more, I could.
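
(For reference, the kind of legacy snippet I mean is roughly the one below; it needs a compatibility-profile context and is deprecated in core profiles, but it really is only a handful of calls once SDL has the window and GL context up.)

    #include <GL/gl.h>

    // Old-school immediate mode: no buffers, no shaders, just push vertices.
    void draw_triangle() {
        glClear(GL_COLOR_BUFFER_BIT);
        glBegin(GL_TRIANGLES);
        glVertex2f(-0.5f, -0.5f);
        glVertex2f( 0.5f, -0.5f);
        glVertex2f( 0.0f,  0.5f);
        glEnd();
    }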

It takes dozens of lines of code to draw a triangle in modern OpenGL. You were presumably using immediate mode, which is ancient and deprecated because it has horrible performance on modern hardware. Given that the whole point of an API for hardware accelerated graphics is better performance, that's a big problem.

Immediate mode OpenGL is not even a particularly friendly API, either. I found its implicit global mutable state for things like the matrix stack to be deeply confusing. There are much better drawing libraries out there if what you're looking for is ease of use.


The removal of immediate mode from OpenGL is one of many examples of modern OpenGL's stupidity, partially responsible for creating the mess we're in:

https://www.jwz.org/blog/2012/06/i-have-ported-xscreensaver-...


Immediate mode still exists and is usable to this day. The compatibility profile never went away.

JWZ implemented old-school OpenGL in a library on top of a more modern API, and I would entirely agree that's a good approach to compatibility. I'd have advocated for doing the same thing, just without calling everyone idiots.


Immediate mode makes no sense from a performance standpoint. But you can still use it in 2021 if you want. I believe it's still supported mostly for CAD programs which haven't changed their core software since the '80s.

And if you like the immediate mode API, just use something like raylib's rgl[1] that emulates an immediate-mode API on modern OpenGL. Aside from performance benefits, you can port your code to GLES or WebGL, where immediate-mode isn't present.

[1] https://github.com/raysan5/raylib/blob/master/src/rlgl.h


I'm willing to bet that much of that extra code you talk about is the boilerplate code that every Vulkan tutorial tells you is boilerplate code. After that boilerplate is cut and pasted in and you have your first triangle, what is the incremental cost of drawing a second triangle?

I kind of wish Project Fahrenheit had succeeded, and SGI had scrapped OpenGL back in the 90s in favor of Direct3D.

Can we please just make Direct3D the standard and get the open source community to support it instead of Khronos khruft?


Microsoft had plenty of time, before and after they started their "Microsoft loves open source" marketing campaign, to open source it or standardize it in some form. It's core to their vendor lock-in strategy so it will never happen. Makes you wonder why they are buying so many Vulkan related companies and game studios.

Yeah exactly - MS has zero incentive to make it easier to port games to other platforms.

From the sound of things, it is really more that Microsoft has little interest in setting up some form of standards committee over the API. They could potentially provide the source for the Linux version of DirectX Core and DirectX 12, but realistically that does little good.

Long story short (read on for details), there is a LOT of work needed beyond just releasing the source of the DirectX libraries to make anything usable on Linux. Microsoft would need to be able to justify to shareholders the spending of all the time and effort needed, which means they need to get something of value. Since the Linux port of the DirectX libraries shares its source with the Windows version, it seems risky for them to accept outside contributions, since that will either fork the codebase or pull those changes into the Windows version, which feels like a potentially massive hole for people to try to get patented technologies into the Windows implementation and then sue Microsoft. I'm not sure the advantage of enabling game studios to more easily port their games to Linux (the main other benefit to Microsoft I see) is worth it, especially since Proton exists, making "porting" mostly a QA exercise, plus possibly adding some Proton-specific optimizations.

The existing DirectX libraries communicate with the DirectX User Mode Driver over a custom interface. Without the GPU makers officially compiling and supporting the User Mode Driver, nothing useful happens. Further, it uses an interface to the kernel that is totally alien to Linux. Proper kernel drivers that understand this interface and can drive the physical device would be needed, or the libraries changed to communicate over the existing DRM interfaces.

One major problem is that for a non-exclusive mode, the code would need to interoperate with the GL/Vulkan user mode drivers.

On Windows my understanding is that alternate APIs are implemented by having the surface creation code call into DirectX's version, but after that point the remaining calls are basically directed to the very same DirectX User Mode Driver, which has extra code specific to those APIs to convert those calls into whatever underlying commands need to be sent to the GPU, and sends them along to the hardware exactly as it does for DirectX. The low level commands for the GPU are basically a black box to the DirectX stack, so it does not care what triggered the User Mode Driver to issue them.

For open source drivers, I'm not sure that the MESA/Gallium stack is designed to be able to handle a fully separate user mode driver also talking directly to the GPU via DRI. My very strong suspicion is most such drivers are not designed for that, so the MESA/Gallium driver would need to become unified with the DirectX User Mode driver for an open source route. While possible since Gallium was designed to be able to support multiple APIs, it is basically certain that the way that is approached will not mesh perfectly with the way the DirectX libraries try to interface with the User Mode Driver.

For the NVIDIA proprietary blob driver route, things might be easier, as it is probably feasible to merge the DirectX User Mode Driver with the user mode portions of the existing driver, given that NVidia can unilaterally change the internal architecture of their proprietary drivers without needing to coordinate with anyone, as long as they maintain the CUDA/OpenGl/Vulkan/etc ABIs that apps directly interface with.


A lot of excuses when we know true love knows no boundaries. If they wanted to make it happen, it would be a standard by now. Open source folks would have worked for free to make it happen (after they made sure it wasn't yet another attempt to destroy open source and covered all their bases).

Why don't we all just adopt Vulkan which is:

1. Already open source

2. Well documented

3. Extremely powerful, as demonstrated by the work at id Tech


Now that ZeniMax Media is Microsoft-owned, let's see how long id Tech will be around.

Next titles will be most likely XBox exclusives anyway.


Even Carmack later corrected his point of view on DirectX, around the version 8 timeframe.

How complex is your renderer? It's not that hard to do memory management manually with Vulkan for simple rendering pipelines. I think it gets harder when you want to have a lot of dynamism going on.

VMA doesn't solve a lot of hard problems around memory management, which really come from synchronization issues. You need to make sure that the GPU is done with a resource before you can destroy it on the CPU side. What it brings is a host of allocation algorithms for very specific use cases and a lot of code for tracing/debugging memory usage. Plus, it shields the users from some positively braindead quirks that the Vulkan spec has gathered (some through extensions that create traps when they aren't enabled - just present; I had to scream when I pieced that together...).
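
The basic shape of that problem, as a rough sketch of my own (deliberately not VMA's API): defer the destroy until the fence covering the resource's last use has signaled.

    #include <vulkan/vulkan.h>
    #include <vector>

    struct DeferredDelete {
        VkBuffer       buffer;
        VkDeviceMemory memory;
        VkFence        last_use;  // fence submitted with the last batch that used the buffer
    };

    // Call once per frame: destroy only what the GPU is provably done with.
    void collect_garbage(VkDevice device, std::vector<DeferredDelete>& pending) {
        for (auto it = pending.begin(); it != pending.end();) {
            if (vkGetFenceStatus(device, it->last_use) == VK_SUCCESS) {  // signaled
                vkDestroyBuffer(device, it->buffer, nullptr);
                vkFreeMemory(device, it->memory, nullptr);
                it = pending.erase(it);
            } else {
                ++it;  // still in flight, try again next frame
            }
        }
    }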

I have my own slightly anemic memory manager code that implements a basic scheme with no frills, avoids the pitfalls and fits into about 500 lines of code. The only thing I really might want to improve is some of the free list handling. The rest should be good for quite a while.


Yeah I don't even really have a system for this; I basically have the mostly statically managed memory which is living for the length of the scene/program, and for per-frame stuff I just wait for fences on a cleanup thread.

It's a somewhat general 3D API wrapper which exposes an API that's similar to Metal and WebGPU, but with a number of restrictions because it needs to support GLES2/WebGL backends as the worst case.

One idea I'm playing with is to provide callback hooks so that resource allocation and management can be delegated to the API user (so they can for instance integrate VMA themselves), and only provide a rudimentary default solution (which probably would be enough for many simple use cases).
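
As a purely hypothetical illustration (nothing like this exists in the library today), such hooks for a Vulkan backend could look something like:

    #include <vulkan/vulkan.h>

    // User-provided memory hooks; the default implementation would be the
    // rudimentary built-in scheme, but an application could route these into VMA.
    struct gfx_allocator_hooks {
        void* user_data;
        VkResult (*alloc_buffer_memory)(void* user_data, VkDevice device,
                                        const VkMemoryRequirements* reqs,
                                        VkDeviceMemory* out_memory,
                                        VkDeviceSize* out_offset);
        void (*free_buffer_memory)(void* user_data, VkDevice device,
                                   VkDeviceMemory memory, VkDeviceSize offset);
    };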


Why do you care so much about LOC of third party code?

Long story short: because of the terrible dependency management situation in the C/C++ world. Also once you commit to an external dependency, it becomes your own maintenance problem, the less code the better in this case.

Somewhat related I hope. Does anyone know a resource guide to learn methodically about GPUs? Let me see if I can explain my frustrations:

1. The usual recommended books for beginners, although good, miss what I need. Yes, I love building ray-tracers and rasterizers, but I can finish the book and not have the slightest idea about how a GPU actually works.

2. Books like H&P, although excellent, treat GPUs as an after-thought in 1 extra chapter, and even that content is like 5-10 years behind.

3. The GPU gems series are too advanced for me, I get lost pretty quickly and quit in frustration

4. Nvidia, AMD resources are 50% advertising, 50% hype and proprietary jargon.

I suppose what I want does not exist. I want a guide that, starting from a somewhat basic level (let's say assuming the reader took an undergraduate course in computer architecture), methodically explains how the GPU evolved into a completely separate type of computing architecture, how it works in the nitty gritty details, and how it has been used in different applications (graphics, ML, data processing, etc.)


I agree strongly with you about the need for good resources. Here are a few I've found that are useful.

* A trip through the Graphics Pipeline[1] is slightly dated (10 years old) but still very relevant.

* If you're interested in compute shaders specifically, I've put together "compute shader 101"[2].

* Alyssa Rosenzweig's posts[3] on reverse engineering GPUs casts a lot of light on how they work at a low level. It helps to have a big-picture understanding first.

I think there is demand for a good book on this topic.

[1]: https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-...

[2]: https://github.com/googlefonts/compute-shader-101

[3]: https://rosenzweig.io/


Thank you, I will check them out. I remember having read an article by that smart young lady; I didn't understand half of it, hopefully I will get more of it this time.

Here's a somewhat recent blog post I stumbled over which might be helpful:

https://rastergrid.com/blog/gpu-tech/2021/07/gpu-architectur...

There isn't a lot of actual under-the-hood information though, because GPUs are closed IPs. So the information needs to be pieced together from the occasional conference talks, performance optimization advice from GPU vendors and what enthusiasts reverse engineer by poking GPUs through the 3D APIs.


Thanks. OK, hear me out, because here comes naive time. GPU demand will continue to grow exponentially in this decade (VR, crypto or whatever remains of it, ML, data engineering, Steam Deck, laptops, etc.). Wouldn't it be possible for some multi-country/university/company effort to create a totally open GPU specification? Has that happened already? I understand we are talking about a long research effort and billions of $$, but I think the benefits for all would be incredible. Open hardware, open libraries, open drivers. Imagine a world with no Linux, a totally closed x86 fully owned by IBM, a closed WebGL. Where can I read more about efforts in this direction, if they exist?

What would be the point, besides an unprecedented, enormously ambitious hardware design learning project?

Executing shaders better than Nvidia and AMD is not likely.

Selling good graphics adapters at competitive prices to concrete users is even less likely.

Experiments with APIs would have a fatal adoption problem.

Avoiding DRM, if legally feasible, would be less useful than spending the same resources to support Sci-Hub or improve laws.

And of course for the more practical purpose of writing mere software, including Vulkan implementations, specifications are complete and open enough.


An open GPU design would be great for the RaspberryPi for instance, even if performance wouldn't be competitive with NVIDIA or AMD (it really doesn't need to be). I think a "RISC-V, but for GPUs" would make a lot of sense, e.g. RISC-V met similar scepticism in the beginning, yet it seems to have quickly gained steam in the last few years.

I found this very helpful. It's a 140-page book that explains, at a relatively abstract level, how modern GPU hardware works and the programming model.

General-Purpose Graphics Processor Architectures (Tor M. Aamodt)

https://skos.ii.uni.wroc.pl/pluginfile.php/28568/mod_resourc...

> 4. Nvidia, AMD resources are 50% advertising, 50% hype and proprietary jargon.

I found the Nvidia CUDA C Programming Guide very helpful...


Thanks, I was previously unaware of this reference. On a quick skim, it seems to be more of an outline of interesting research directions for GPU architecture than a synthesis of where things are, targeted at programmers. But it has lots of detail and is likely to be useful to lots of people!

The book looks pretty great! Pretty much in the spirit of what I wanted. It is very slim, but it has a big bibliography, so it is perfect as an initial roadmap.

Is this page completely broken for anyone else? After the fade-in animations, the whole page vanishes. Using Chrome, btw.

I hope this is not how the allocator works.. Allocate, memory gone


+1, though I don't see a fade, just a white stripe at the top and a long empty grey page below it.

It worked for me earlier, on FF 78, Linux.

Loading it just now, I'm seeing what you see.


refreshed and it's gone, darn




