An actual 3D block-based format would be useful; currently, on PC at least, the only block-compressed formats are for 2D textures. They do work for 3D textures, but only in 2D slices, so the result is not as good as it could be. Also, the only single-channel format is BC4, which is 0.5 bytes per entry and has two encoding modes, but the second mode is mostly useless for an SDF, so those bits are wasted.
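For context, a BC4 block packs a 4x4 tile into 8 bytes: two 8-bit endpoints plus sixteen 3-bit palette indices. A rough C sketch of the palette construction (following my reading of the BC4_UNORM spec; the names are mine) shows where the second mode goes to waste on an SDF:

```c
#include <stdint.h>

/* A 4x4 BC4 block: 2 endpoint bytes + 48 bits of 3-bit indices = 8 bytes. */
typedef struct {
    uint8_t red0, red1;  /* endpoints */
    uint8_t idx[6];      /* 16 x 3-bit palette indices, packed */
} BC4Block;

static void bc4_build_palette(const BC4Block *b, float pal[8]) {
    float r0 = b->red0 / 255.0f, r1 = b->red1 / 255.0f;
    pal[0] = r0; pal[1] = r1;
    if (b->red0 > b->red1) {
        /* Mode 1: six interpolated values. The useful mode for an SDF,
           since the whole palette spans the block's own value range. */
        for (int i = 0; i < 6; i++)
            pal[2 + i] = ((6 - i) * r0 + (i + 1) * r1) / 7.0f;
    } else {
        /* Mode 2: four interpolated values plus hard 0.0 and 1.0.
           The "mostly useless for an SDF" mode: two of eight palette
           slots are pinned to extremes a distance field rarely needs. */
        for (int i = 0; i < 4; i++)
            pal[2 + i] = ((4 - i) * r0 + (i + 1) * r1) / 5.0f;
        pal[6] = 0.0f; pal[7] = 1.0f;
    }
}
```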
I'd prefer a 1-byte-per-entry option for higher detail. No need for any fancy modes; it needs to be simple and fast to encode so it can be done in realtime.
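Purely for illustration, here's what such a format could look like: a hypothetical 4x4x4 block with per-block min/max endpoints and one 8-bit weight per entry, encodable in two linear passes. The layout and every name below are made up, not any existing format:

```c
#include <stdint.h>

enum { BLOCK_N = 4 * 4 * 4 };

/* Hypothetical block: two endpoints plus one byte per entry,
   decoded as d = lerp(lo, hi, w / 255). */
typedef struct {
    float   lo, hi;       /* block min/max distance */
    uint8_t w[BLOCK_N];   /* per-entry weights */
} SdfBlock;

/* No mode search, just two linear passes - cheap enough for realtime. */
static void sdf_block_encode(const float src[BLOCK_N], SdfBlock *out) {
    float lo = src[0], hi = src[0];
    for (int i = 1; i < BLOCK_N; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    out->lo = lo;
    out->hi = hi;
    float scale = (hi > lo) ? 255.0f / (hi - lo) : 0.0f;
    for (int i = 0; i < BLOCK_N; i++)
        out->w[i] = (uint8_t)((src[i] - lo) * scale + 0.5f);
}
```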
IDK about this post though; separating the block min/max into a separate memory location from the indices doesn't make much sense - you could just access a less detailed mip if that's what you want.
Apple's M-series chips have ASTC if you count those as desktop, but yeah, neither AMD's nor Nvidia's discrete GPUs have it. Desktop and console chips usually only have BC, mobile chips usually only have ASTC, and the few chips that straddle the line between those worlds have both (e.g. Nvidia Tegra and Apple M).
> This makes SDF amenable to implementation in GPU hardware, though to date no hardware has been designed specifically to accelerate SDF.
This isn't entirely correct. Nvidia GPUs can perform interpolation on 3D textures, which is specifically useful for performing finer distance queries on a 3D SDF.
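For illustration, this is roughly what that hardware trilinear fetch computes, written out on the CPU for a dense distance grid (a sketch of the math, not of how any particular GPU implements it):

```c
#include <math.h>
#include <stddef.h>

/* One filtered read of a dense nx*ny*nz distance grid at texel-space
   coordinate (x, y, z): the reconstructed distance varies smoothly
   between stored samples, which is what makes the query "finer". */
static float sdf_sample_trilinear(const float *grid, int nx, int ny, int nz,
                                  float x, float y, float z)
{
    /* clamp-to-edge addressing */
    x = fmaxf(0.0f, fminf(x, (float)nx - 1.0f));
    y = fmaxf(0.0f, fminf(y, (float)ny - 1.0f));
    z = fmaxf(0.0f, fminf(z, (float)nz - 1.0f));

    int x0 = (int)x, y0 = (int)y, z0 = (int)z;
    int x1 = x0 + 1 < nx ? x0 + 1 : x0;
    int y1 = y0 + 1 < ny ? y0 + 1 : y0;
    int z1 = z0 + 1 < nz ? z0 + 1 : z0;
    float fx = x - (float)x0, fy = y - (float)y0, fz = z - (float)z0;

#define AT(i, j, k) grid[((size_t)(k) * ny + (j)) * nx + (i)]
    /* lerp along x, then y, then z */
    float c00 = AT(x0, y0, z0) + (AT(x1, y0, z0) - AT(x0, y0, z0)) * fx;
    float c10 = AT(x0, y1, z0) + (AT(x1, y1, z0) - AT(x0, y1, z0)) * fx;
    float c01 = AT(x0, y0, z1) + (AT(x1, y0, z1) - AT(x0, y0, z1)) * fx;
    float c11 = AT(x0, y1, z1) + (AT(x1, y1, z1) - AT(x0, y1, z1)) * fx;
#undef AT
    float c0 = c00 + (c10 - c00) * fy;
    float c1 = c01 + (c11 - c01) * fy;
    return c0 + (c1 - c0) * fz;
}
```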
Aren't the TMUs on AMD GPUs still capped at FP16, at least for advertised throughput?
NVIDIA TMUs historically offered higher precision and about double the throughput, and handled edge cases such as sub-pixel sampling much more effectively than AMD's. And, recalling some failure states in low-level APIs, AMD seems to do certain things in shaders that the driver loads when particular API functions are called, rather than having native instructions for them.
Or maybe we're talking past each other - bandwidth and interpolation requirements mean that lower precision does tend to be faster, even on Nvidia. unorm8 is often the quoted "texels/second" precision, as rgb888 is to this day the most common texture format in games/apps/desktop environments. I think GCN had fp16/int16 filtering at 1/2 that rate (I think RDNA increased it to 1/1), fp32/int32 at 1/4, and fp64 at 1/8. But that's pretty much the same as Nvidia's texture filtering rates, and I don't think there's a "hard cap" or internal limit of fp16 precision?
Other GPUs do have 3D texture support, but Nvidia specifically uses its 3D texture units for SDF interpolation in its simulation work. That's about as close to hardware-for-SDFs as it gets, and well beyond "whatever hardware was developed for triangle rasterization has been repurposed for SDF by end users" (arguably, 3D texture support _predates_ the modern programmable shader pipeline).