Minimal Cross-Platform Graphics (zserge.com)
201 points by rrampage on Jan 24, 2023 | 64 comments



I used to use a timer to limit the frame rate to 60 Hz too. But on Windows, I found that DwmFlush() seems to act like WaitVSync(), so I've been using that in preference for years now. I think this is undocumented behaviour. I guess what it actually does is wait until the compositor is ready for another frame.

To be able to call that function, I LoadLibrary("dwmapi.dll"), and then GetProcAddress(dwm, "DwmFlush").
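Roughly like this (a sketch of the approach just described, not the exact code; error handling kept minimal):

    #include <windows.h>

    typedef HRESULT (WINAPI *DwmFlushFn)(void);
    static DwmFlushFn dwm_flush;

    /* DwmFlush() takes no arguments; loading it at runtime avoids a
       hard link-time dependency on dwmapi.dll. */
    void wait_for_compositor(void) {
        if (!dwm_flush) {
            HMODULE dwm = LoadLibraryA("dwmapi.dll");
            if (dwm) dwm_flush = (DwmFlushFn)GetProcAddress(dwm, "DwmFlush");
        }
        if (dwm_flush) dwm_flush(); /* blocks until the next present */
    }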


> Issues a flush call that blocks the caller until the next present, when all of the Microsoft DirectX surface updates that are currently outstanding have been made.

DWM always runs with vsync so presents never happen more frequently than screen refreshes.


Why not just use VSync?


If I am not much mistaken, because on some hardware (mobile) there is no vsync feedback.

Vsync feedback support has to be queried (look at the Vulkan/Wayland APIs). Usually it is called "presentation" something.


How would I do that? I don't think there is any such function in the Win32 API.


Yes, I think you basically have to use a DirectX swap chain to get vsync presentation.
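For example (a sketch, assuming a swap chain has already been created, e.g. via D3D11CreateDeviceAndSwapChain; the names are mine):

    #define COBJMACROS
    #include <windows.h>
    #include <dxgi.h>

    /* SyncInterval = 1 synchronizes presentation to the next vertical
       blank, so Present() doubles as the vsync wait. */
    void present_vsynced(IDXGISwapChain *swapchain) {
        IDXGISwapChain_Present(swapchain, 1, 0);
    }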



Perfect companion for this C graphics library: https://github.com/tsoding/olive.c


Is there any similar tutorial for XCB? Most documentation only covers libx11, and trying to extrapolate libxcb usage from that is tricky at best and heisenbug-prone at worst.


Very nice! I like that it's a single header. I wrote something similar, although it's not single header, doesn't do audio, and has no X11 backend: https://github.com/samizzo/pixie. I use mine as a backend for Windows ports of MS-DOS demos that I make.


A similar x-platform library called TiGR (Tiny GRaphics):

https://github.com/erkkah/tigr

Brief discussion:

https://news.ycombinator.com/item?id=34310208


Very cool!

This example I think needs editing:

    memset(f->buf, rgb, f->width*f->height*sizeoof(uint32_t));
it's described as a way to "fill the complete framebuffer with a solid colour", but 'rgb' is shown in the previous snippet to be 'uint32_t'. That is not how 'memset()' [1] works; it will only use the least significant 8 bits of the 'int'-typed value argument. So it would only use whatever blue bits were in 'rgb'.

For clearing (all zero) it's usable, and then I would recommend writing it as:

    memset(f->buf, 0, f->width * f->height * sizeof *f->buf);
this both fixes the typo ("sizeoof" sounds like a C lecturer being punched in the guts), and avoids duplicating the type by instead inferring it from the actual object, which is more DRY and a style I've been advocating for ... years.
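For an arbitrary colour a plain loop is needed instead, e.g. (a sketch, reusing the names from the snippet above):

    /* memset() can't replicate a 4-byte value, so fill word by word: */
    for (size_t i = 0; i < (size_t)f->width * f->height; i++) {
        f->buf[i] = rgb; /* rgb is the full uint32_t colour */
    }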

[1]: https://man7.org/linux/man-pages/man3/memset.3.html


Fine, but at least fix this bug:

    .. (sizeof *f->buf) ..
Because otherwise that's one stray space char away from a bad day ..


Or ... sizeof(*f->buf)? I’d forgotten that sizeof even allows dropping the parens, but isn’t that asking for trouble? “sizeof a <op> b” is sometimes parsed as “sizeof(a) <op> b” and sometimes as “sizeof(a <op> b)”, depending on the precedence of operator <op>. Is there any downside to always using parens and writing sizeof() like a function call?
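For example (illustrative only):

    char buf[64];
    size_t a = sizeof buf + 1;    /* binary +: parsed as (sizeof buf) + 1 == 65 */
    size_t b = sizeof buf[0] + 1; /* postfix []: the [0] binds to the operand,
                                     the + 1 does not, so this is 1 + 1 == 2 */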


Sure, many people like that style, I don't.

No, I don't think there are any downsides, but there is this notion that things should look like what they are, which is popular in C.

With parens, I would suggest spaces to make it

    sizeof (*f->buf)
that makes it look less like a function call in many styles.


since everyone else is posting links to their similar libraries, i thought i'd post mine too, https://gitlab.com/kragen/bubbleos/-/tree/master/yeso

it's probably a bit more efficient than fenster on x-windows because it uses shared memory, and i think the programming interface is a little nicer (see the readme above for code examples)

apps i've written in yeso include an adm-3a terminal emulator, a tetris game, a raytracer, an rpn calculator, a fractal explorer, and so on

i haven't ported yeso to win32/64, macos, android, or the browser canvas yet, just x-windows, the linux framebuffer (partly), and a window system implemented in yeso called wercam

it includes bindings for c, python (via cffi), and lua (via luajit's ffi), and presumably you could use it from zig or rust in the same way as fenster, but i haven't tried


I like BubbleOS. :)


yay :)


I think this needs much more complexity to be useful.

For the rendering, ideally it needs GPU support.

Input needs much more work; here's an overview for Windows: http://blog.ngedit.com/2005/06/13/whats-broken-in-the-wm_key...

Windows' Sleep() function has a default resolution of 15.6 ms, which is not enough for realtime rendering and is relatively hard to fix; ideally you need a modern OS and a waitable timer created with the high-resolution flag.

Here's my attempt at making something similar, from a couple of years ago: https://github.com/Const-me/Vrmac


> Windows' Sleep() function has a default resolution of 15.6 ms, which is not enough for realtime rendering and is relatively hard to fix

Very easy to fix. Call this at the start of your real-time process:

https://learn.microsoft.com/en-us/windows/win32/api/timeapi/...

Use an argument of 1, and Windows will come knocking at ~1 kHz. This is extremely effective in my experience. It allows you to do some pretty crazy stuff on one thread.
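i.e. something like this (a sketch; timeBeginPeriod/timeEndPeriod live in winmm, so link against it):

    #include <windows.h>

    int main(void) {
        timeBeginPeriod(1);   /* request 1 ms timer resolution for the process */
        /* ... real-time loop; Sleep(1) now wakes after ~1 ms instead of ~15.6 ms ... */
        timeEndPeriod(1);     /* restore the previous resolution before exiting */
        return 0;
    }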


> For the rendering, ideally it needs GPU support.

It's a 2D framebuffer library for placing single pixels like in mode 13h *eyerolling* (GPUs are hardly useful for this type of stuff).

If you want something more complete, check out the sokol headers (shameless plug): https://floooh.github.io/sokol-html5/

...but that can hardly be called 'minimal' anymore.


2D graphics are rendered on the GPU nowadays, and rightly so. Even SDL uses the GPU wherever it can, by default.


That's a silly statement. If you want to place single pixels, a GPU won't help in any way. The most efficient way would be to write pixels with the CPU into a mapped GPU texture, and then render this texture as a fullscreen quad. That's hardly more efficient than going through the system's windowing system.

For applications like emulators, or making vintage games (like Doom) run on modern platforms, that approach makes a lot of sense.
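The per-frame upload part of that path looks roughly like this in OpenGL (a sketch; it assumes an existing GL context, a width x height GL_RGBA texture `tex`, and a CPU-side uint32_t pixel buffer `pixels` — all hypothetical names):

    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    /* ...then draw a fullscreen textured quad with this texture. */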


Most vintage systems use something way more complex than a pure CPU-controlled framebuffer for graphics. They generally have some sort of pre-defined "tiles" used to implement a fixed-width character mode, with the addition of a limited number of "sprites" overlaid in hardware. These video modes could be implemented efficiently by modern GPUs.


Only if you don't have a cycle-correct emulator in between. Those old-school systems relied on hard real-time timing down to the clock cycle to let the CPU control the color palette, sprite properties, etc. at the right raster position within a frame. Modern GPUs don't allow such tight synchronization between the CPU and GPU, so the best way is to run the entire emulation single-threaded on the CPU, including the video decoding.

(the resulting framebuffer can then of course be dumped into a GPU texture for rendering, but that just offers a bit more flexibility, e.g. embedding the emulated system into a 3D rendered world)


It depends on what you mean by "hard real time". In theory, user input you get while scanning out pixel x might change pixel x + 1, and this leaves you with no choice but to render single pixels in a strictly serial way. In practice, no existing emulator cares about that.


It's not about user input, but about the CPU writing video hardware registers at just the right raster position mid-frame (to recycle sprites, change the color palette, or even the resolution). Home computer emulators for systems like the C64 or the Amstrad CPC need to do this at exactly the right clock cycle; otherwise modern (demo scene) demos wouldn't render correctly.

PS: of course one could build a GPU command list to render such a video frame somehow, but I bet just building this command list is more expensive than just doing the video decode with the CPU. It would basically come down to one draw command per (emulated system) pixel in the worst case.


But wouldn't the GPU help if you were mapping e.g. 256x192 virtual pixels to, say, 1024x768? I.e., each of the pixels from the low-res space being represented by an NxM patch of actual screen pixels, like a Win32 GDI StretchBlt() call.

If you had a frame buffer for the actual screen and you tried to do even a 1-pixel-to-2x2 expansion on the CPU's time, that would have to be a serious speed hit. Presumably GPU hardware can do that sort of thing.
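Something like this with plain GDI (a sketch; memdc holds the 256x192 offscreen bitmap and hdc is the window DC — the names are assumptions):

    SetStretchBltMode(hdc, COLORONCOLOR);   /* cheap, blocky scaling */
    StretchBlt(hdc, 0, 0, 1024, 768,        /* destination rectangle */
               memdc, 0, 0, 256, 192,       /* source rectangle      */
               SRCCOPY);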


Yes, for fancy upscaling or applying pixel shader effects like CRT filters, doing the final pass on the GPU definitely makes sense. This would no longer be a "minimal" library though.

Window system compositors should also be able to upscale bitmaps on their own though, and nothing prevents them from using the GPU for this.


Doing this through the GPU is actually somehow A LOT slower, at least on Windows, than going through the windowing system.


> That's hardly more efficient than going through the system's windowing system.

My intuition says otherwise but I admit I don't have math on hand to back it up.


OK, it's most likely slightly slower, but not enough that it matters. Frame latency might actually be lower though, if the result doesn't need to go through a swapchain AND the window system compositor.


No direct access to the command buffers of the GPU? No matrix transform routines? No colorspaces besides RGB? Absolute non-starter.


> direct access to the command buffers of the GPU

Impossible given the platform limitations at that time; on Linux that code works on top of GLES 3.1. I didn't want too many Windows-only features there.

> No matrix transform routines?

There's no need for that. That library is for C#, which has included these routines in its standard library for many years now, since .NET Core 1.0: https://learn.microsoft.com/en-us/dotnet/api/system.numerics...


My remark was about Fenster, and a critique of your comment that it somehow isn't good enough because it lacks GPU acceleration and that sort of thing. The point of Fenster is to be minimal and simple yet still allow programmers to get stuff on the screen in a way that's easy and fun. It performs superbly at the role it was conceived for. We can list features we want/think are table stakes for a modern graphics pipeline or whatever, and there are plenty of libraries that are up to those tasks. This one is doing something different.


> No direct access to the command buffers of the GPU?

Lol, show me one library that does this without going through an abstraction layer like Vulkan. Details like this are not even documented by GPU vendors and you'd need to reverse engineer every supported GPU architecture yourself.


I was being sardonic.

Guy writes a library with the design goals of making basic, retro-like graphics functionality easy and fun with a minimum of code, and certain Hackernews dogpile him because it can't be integrated with an AAA pipeline or whatever.


Ah, apologies then :)


There’s simple and there’s naive.

SDL already walked this path. So did pygame.

It turns out without shaders, and all their complexity, you basically can’t do anything useful at high resolutions.

/shrug

It’s too slow; compositing, transformations, all the things people want to do are very much harder to implement in software on top of a naive graphics stack.

Simple doesn’t mean naive; you can have a simple api that is excellent.

…but this is a naïve implementation, and it won’t work in any useful way at scale.

People complain about ballooning complexity, but they often forget that this stuff (e.g. specialised graphics hardware) wasn't invented by a bunch of idiots.

It was invented because the naive approach that was used before was found, in practice, to be fundamentally limiting, inferior, and unable to meet people's expectations.

Of course, if you vastly lower your expectations and target, say, 320x240 at 8-bit colour, you can happily have a naïve implementation that works just fine.

(Don't believe me? Quote:

> Having this we can now draw complex polygons and would probably need a “flood fill” algorithm. Typically it is implemented using a queue of pixels to check and paint, but we can use recursion, as long as the filled area remains small enough to not overflow the stack

^ This is the very definition of a naive implementation, and https://github.com/zserge/fenster/blob/e71d493fa6d544243dd60... (setting one pixel at a time) is too. Delightfully charming as it might be to write your own functions that set one pixel at a time, it's a joke, really, if you expect to do anything serious)
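For reference, the recursive fill the quoted passage describes is roughly this (a sketch using the f->buf/f->width/f->height layout discussed earlier in the thread, not the library's actual code); the recursion depth grows with the filled area, hence the stack-overflow caveat:

    void fill(struct fenster *f, int x, int y, uint32_t from, uint32_t to) {
        if (x < 0 || y < 0 || x >= f->width || y >= f->height) return;
        uint32_t *p = &f->buf[y * f->width + x];
        if (*p != from || from == to) return;
        *p = to;
        fill(f, x - 1, y, from, to); /* one recursive call per neighbour, */
        fill(f, x + 1, y, from, to); /* so the call depth is bounded only */
        fill(f, x, y - 1, from, to); /* by the size of the filled region  */
        fill(f, x, y + 1, from, to);
    }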


For any graphics-intensive application it would obviously be necessary to use a GPU.

But for quick hacking / porting old demos / writing emulators and also text-based UIs it can be fast enough.

With the added benefit of small footprint, high compatibility and fast startup time.

The Lite editor (https://github.com/rxi/lite) is using pure software rendering (on top of SDL) in a rather naïve fashion, but it still renders full 32-bit colors at full resolution at more than 60 FPS on my computer. Not the best solution, but still surprisingly fast given the simplicity of the renderer.

This is typically a case where simple/naïve can beat a juggernaut like Electron.


> is using pure software rendering (on top of SDL) in a rather naïve fashion

https://github.com/rxi/lite/blob/master/src/rencache.c#L4

I think you'll find that they found the naive approach was sufficiently poor, performance-wise, that additional optimizations had to be applied on top.

> But for quick hacking / porting old demos / writing emulators and also text based UI it can be fast enough.

/shrug

If you want to use it, use it. It's 'good enough'...

> if you vastly lower your expectations


If anyone is interested in getting into Zig, we made a library that has similar goals (and can run on WASM in the browser): https://github.com/ibebrett/zigzag



Wonderful. I really wanted something like this to do graphics the "hard way", also the most fun way... I was trying to use Tcl/Tk for that, but this seems a lot lower level; perfect for what I wanted to do, which is to write a small GUI toolkit for tiny apps.


Writing pixels into an RGBA framebuffer in main memory is fun, but it's also the easy and slow way to do graphics. The hard way these days is to figure out how to use a modern GPU to do the rendering you need. It's almost always several orders of magnitude faster.


writing pixels into an rgba framebuffer in main memory is still plenty fast to do a full-screen animation at 60 hertz tho
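(for example, a 1920x1080 framebuffer at 4 bytes per pixel refreshed 60 times a second is only about 0.5 gb/s of writes, a small fraction of typical main-memory bandwidth)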

the benefit of the gpu is no longer (since maybe late last millennium) that you don't have the memory bandwidth to your framebuffer; it's that you can do a lot of computation per pixel

typically the gpu advantage is only about an order of magnitude tho


I did something similar to load an image on Windows/Linux: https://imadr.me/cross-platform-window-in-c/


"raylib" comes to mind, as it is a graphics library also inspired by the Borland BGI graphics library.

- Anyone with experience in it / any possible synergies?

[https://en.wikipedia.org/wiki/Raylib]


Very, very cool. Already played Doom on it. Maybe I can get my nephew interested in playing with this!


Pro-tip: Doom is available on Steam for $5. Just in case you want to avoid the DMCA nastygram I got from my ISP this morning. *cringe*


This is very cool, but if I were doing such things I'd just use SDL (https://www.libsdl.org/)


Not so minimal, but if you need GPU rendering I wrote this: https://github.com/jacereda/glcv


And a couple of wrappers (probably bitrotten) https://github.com/search?q=user%3Ajacereda+glcv


Basically a bare-metal SDL2? Well done.


This is extremely useful, and the page is beautifully written. Some people write absurd electron behemoths, and then there's this jewel of elegance.

Could this be compiled as an αpε? Such a thing would make many heads explode.


I think the Xlib dependency would prevent compilation as an αpε. I put some effort into doing X11 from scratch (without Xlib or XCB) to make this possible. Or at least, my aim was to be able to build with musl libc and generate a single executable that would run on many different Linuxes.

https://github.com/abainbridge/deadfrog-lib/blob/master/src/.... It is janky though.


Super interesting!

> It is janky though

So what? It can be improved :)

Would you like to try to make an APE out of it? It would then run not just on different Linuxes, but also macOS and Windows, a bit like https://x410.dev/ or the free software VcXsrv: https://sourceforge.net/projects/vcxsrv/

We have discord. You should join us!

FWIW I'm currently working on a DNS server with nice extra features like multicast for mDNS/Bonjour: it could be used to publish the X server DISPLAY and make it accessible via service discovery.


> We have discord. You should join us!

Who is "you", here?


People who try to hack stuff with APE - here abainbridge but hopefully, you too: the more the merrier :)


One advantage of speaking X directly for small projects is that one can allocate fixed ids.


Alpha Rho Epsilon? Alpha Pee Epsilon? Google isn't helping me here...


I think they're referring to this:

https://justine.lol/ape.html

One of many HN discussions about the topic here:

https://news.ycombinator.com/item?id=26273960


Thank you! When I encounter an unknown term on HN, I try to find out for myself what it is before bugging everyone, but it doesn't always work.


Yup! Besides being hard to search, others have commented that the name might be bad for accessibility:

https://news.ycombinator.com/item?id=24257503



