More

jstimpfle · 2026-03-10T09:57:17 1773136637

The hidden compile-time cost of <insert almost any C++ feature>

SuperV1234 · 2026-03-10T11:15:58 1773141358

Nah, many great features are extremely cheap. E.g. constexpr, templates, fold expressions, equality operator defaulting, concepts...

JamesTRexx · 2026-03-10T12:09:38 1773144578

Until you try to add / modify a feature of the software and run into confusing template or operator or other C++ specific errors and need to deconstruct a larger part of the code to find (if possible) out where it comes from and spend even more time trying to correct it.

C++ is the opposite of simplicity and clarity in code.

kreco · 2026-03-10T11:31:53 1773142313

Yes, but they are still all hidden.

direwolf20 · 2026-03-10T12:29:53 1773145793

constexpr means running code at compile time. template means duplicating lots of code lots of times. these are not cheap.

jeffbee · 2026-03-10T12:50:17 1773147017

Yeah this is the very first time I am hearing that templates are "extremely cheap". Template instantiation is pretty much where my project spends all of its compilation time.

SuperV1234 · 2026-03-10T18:04:46 1773165886

It depends on what you are instantiating and how often you're doing so. Most people write templates in header files and instantiate them repeatedly in many many TUs.

In many cases it's possible to only declare the template in the header, explicitly instantiate it with a bunch of types in a single TU, and just find those definitions via linker.

jeffbee · 2026-03-10T18:22:41 1773166961

On the few times that I have looked at clang traces to try to speed up the build (which has never succeeded) the template instantiation mess largely arose from Abseil or libc++, which I can't do much about.

jstimpfle · 2026-03-04T18:47:41 1772650061

> which means you have subtle display artifacts.

No. 150÷ just means 96dpi * 1.5

lateforwork · 2026-03-04T20:05:38 1772654738

At 150% scaling, one logical pixel maps to 1.5 physical pixels. When a 1px grid line is drawn, the renderer cannot light exactly 1.5 pixels, so it distributes the color across adjacent pixels using anti-aliasing. Depending on where the line falls relative to device-pixel boundaries, one pixel may be fully colored and the next partially colored, or vice versa. This shifts the perceived center of the line slightly. In a repeating grid, these fractional shifts accumulate, making the gaps between lines appear uneven or "vibrating."

Chromium often avoids this by rendering 1px borders as hairlines that snap to a single device pixel, even when a CSS pixel corresponds to 1.5 device pixels at 150% scaling. This keeps lines crisp, but it also means the border remains about one device pixel thick, making it appear slightly thinner relative to the surrounding content.

For some people such artifacts are not noticeable for others they are.

jstimpfle · 2026-03-05T14:26:18 1772720778

I'm one of those people who are super sensitive to the issues you describe, and let me tell you this: Scaling value (like 150%) is just an integer number.

For the most part, non-ancient renderers (3D but also to a large degree 2D renderers), do not care about physical pixels, and when they do, they care the same amount no matter what the DPI is.

Raster data has a fixed number of pixels, but is generally not meant to be displayed at a specific DPI. There are some rare applications where that might be true, and those are designed to work with a specific display of a given size and number of pixels.

It's especially older applications (like from the 90s and 00s) that work in units of physical pixels, where lines are drawn at "1 pixel width" or something like that. That was ok in an age where targetted displays were all in the range of 70-100 DPI. But that's not true anymore, today the range is more like 100 to 250 or 300 DPI.

One way to "fix" these older applications to work with higher DPIs, is to just scale them up by 2 -- each pixel written by the app results in 2x2 pixels set on the HiDpi screen. Of course, a "200%" display i.e. a display with 192 DPI should be a good display to do exactly that, but you can just as well use a 160 DPI or a 220 DPI screen and do the same thing.

It's true that a modern OS run with a "scaling" setting of 150% generally scales up older applications using some heuristics. Important thing to notice here is that the old application never considered the DPI value itself. It's up to the OS (or a renderer) how it's doing scaling. It could do the 2x2 thing, or it could do the 1.5x thing but increase font sizes internally, to get sharp pixel-blitted fonts when it has control over typesetting. And yeah, some things can come out blurry if the app sends final raster data to the OS, and the OS just does the 1.5x blur thing. But remember, this is an unhappy situation just for old applications, and only where the OS receives raster data from the App, instead of drawing commands. Everything else is up to the OS (old apps) or the app itself (newer, DPI-aware apps).

For newer applications, e.g. on Windows, the scaling value influences nothing but the DPI value (e.g. 150% or 144 DPI) reported to the application, everything else is up to the app.

lateforwork · 2026-03-05T19:52:59 1772740379

Sorry none of that makes any sense to me. Go to BestBuy where they have Surface laptops on display. Open the browser and go to a website where a grid with 1px horizontal lines are displayed. I immediately notice that the lines are disproportionately thin. You may not notice it and that's fine.

jstimpfle · 2026-03-06T13:57:04 1772805424

I've started out with a longer reply, but let's try and condense it a little: I concede you can still find this issue, especially on less than 4k displays, but it's becoming less and less of an issue, because of improving software. Where you see the issue, it's simply a software problem -- CSS or Chrome or the website/app should be fixed.

I don't see much of this issue anymore, on my 27" 4K screen, set to 175% scaling. It's logical that if you want to do arbitrary scaling or zooming, and want to keep all distance ratios perfectly intact, you will experience some ugliness from antialiasing (less and less noticeable as you go to ~4K and beyond). That will be so regardless of scaling, even when it's set to 100%.

So if it ought to look good, software simply needs to be written more flexibly! 1px in CSS doesn't mean 1 physical pixel but it is a (quite arbitrary) physical distance, defined as 1/96th of an inch. It's all up to the app and the software stack deciding line widths and how they will actually come out on a screen in terms of pixels lit. They should _respect_ the scaling setting (like 150%) but they also are in full control, in principle, to make it look good.

<hr> lines come out are perfectly fine on my screen (175% 4K) with Firefox and Chrome.

1.5px width lines will come out quite bad with 100% scaling, but will look perfect with 150% scaling obviously.

Notice that vector fonts generally look better if you have a reasonably high-dpi display. But on average, it doesn't matter if you test font sizes of 20pt or 21pt or 17pt or whatever. Why is that? Because font rasterizers already snap to pixels. They properly quantice geometric objects. They don't make arbitrary stubborn choices like "it must be exactly 1/96th of a virtual inch wide" but they are a little flexible to fight antialiasing.

And the more higher DPI monitors there are, the less software will be making stubborn choices.

lateforwork · 2026-03-04T20:27:24 1772656044

Here are some links that explain the issue:

https://patrickbrosset.com/articles/2024-06-21-invasion-of-t...

https://flutterawesome.com/sharp-looking-flutter-application...

https://tanalin.com/en/articles/integer-scaling/

jstimpfle · 2026-03-01T19:40:39 1772394039

I've had a keyboard like that and with it, xterm (and nothing else) felt like it was displaying the characters even slightly before I had pressed them. It was a weird sensation (but good)

nickjj · 2026-03-02T01:29:20 1772414960

Yes, I know this feeling, it's like typing on air. The Windows Terminal has this same feeling. 8 years ago I opened this issue https://github.com/microsoft/terminal/issues/327 and the creators of the tool explained how they do it.

xterm in X11 has this feeling, ghostty does not. It's like being stuck in mud but it's not just ghostty, all GPU accelerated terminals on Linux I tried have this muddy feel. It's interesting because moving windows around feels really smooth (much smoother than X11).

I wish this topic was investigated in more depth because inputting text is an important part of a terminal. If anyone wants to experience this with Wayland, try not booting into your desktop environment straight into a tty and then type. xterm in X11 and the Windows Terminal feel like this.

jstimpfle · 2026-03-01T19:36:01 1772393761

If you're typing just one character per second you'll still feel the difference. Latency is stress inducing.

jstimpfle · 2026-02-06T14:20:17 1770387617

Assuming that the two `mv` commands are run in sequence, there shouldn't be any possibility for a and d to be observed "at once" (i.e. first d and then afterwards still a, by a single process).

duped · 2026-02-09T04:57:27 1770613047

You'd be wrong

jstimpfle · 2026-02-09T13:30:20 1770643820

How so?

With regards to Linux kernel implementation, I could map you out a list of sequence points caused in the kernel by above sequential "shell script", proving that no other process could observe d and then a using two subsequent "stat" calls. I could imagine that a single directory listing (getdents) could happen to first visit a and then d, though -- I don't know the internals of that.

With regards to POSIX specification, I don't know the details but anything less than that guarantee would be very a huge robustness problem.

jstimpfle · 2026-02-06T09:32:37 1770370357

I don't think that's happening in practice, but 1) it may not be specified and 2) What you say could well be the persisted state after a machine crash or power loss. In particular if those files live in different directories.

You can remedy 2) by doing fsync() on the parent directory in between. I just asked ChatGPT which directory you need to fsync. It says it's both, the source and the target directory. Which "makes sense" and simplifies implementations, but it means the rename operation is atomic only at runtime, not if there's a crash in between. It think you might end up with 0 or 2 entries after a crash if you're unlucky.

If that's true, then for safety maybe one should never rename across directories, but instead do a coordinated link(source, target), fsync(target_dir), unlink(source), fsync(source_dir)

jstimpfle · 2026-02-06T14:16:47 1770387407

why is this being downvoted? If there's something wrong, explain?

jstimpfle · 2026-01-24T09:50:02 1769248202

std::string is not UTF-8 and can't be made UTF-8. It's encoding agnostic, its API is in terms of bytes not codepoints.

jhasse · 2026-01-24T15:03:17 1769266997

Of course it can be made UTF-8. Just add a codepoints_size() method and other helpers.

But it isn't really needed anyway: I'm using it for UTF-8 (with helper functions for the 1% cases where I need codepoints) and it works fine. But starting with C++20 it's starting to get annoying because I have to reinterpret_cast to the useless u8 versions.

jstimpfle · 2026-01-25T22:26:16 1769379976

First, because of existing constraints like mutability though direct buffer access, a hypothetical codepoints_size() would require recomputation each time which would be prohibitively expensive, in particular because std::string is virtually unbounded.

Second, there is also no way to be able to guarantee that a string encodes valid UTF-8, it could just be whatever.

You can still just use std::string to store valid encoded UTF-8, you just have to be a little bit careful. And functions like codepoints_size() are pretty fringe -- unless you're not doing specialized Unicode transformations, it's more typical to just treat strings as opaque byte slices in a typical C++ application.

jhasse · 2026-01-25T22:50:24 1769381424

Perfect is the enemy of good. Or do you think the current mess is better?

jstimpfle · 2026-01-27T20:00:26 1769544026

std::string _cannot_ be made "always UTF-8". Is that really so contentious?

You can still use it to contain UTF-8 data. It is commonly done.

jhasse · 2026-01-28T08:20:40 1769588440

I never said always. Just add some new methods for which it has to be UTF-8. All current functions that need an encoding (e.g. text IO) also switch to UTF-8. Of course you could still save arbitrary binary data in it.

jstimpfle · 2026-01-28T13:03:39 1769605419

> That's where the standard should come in and say something like "starting with C++26 char is always 1 byte and signed. std::string is always UTF-8" Done, fixed unicode in C++.

> I never said always

Yes you totally did. And regarding "add some new methods for which it has to be UTF-8", there is no need at all to add UTF-8 methods to std::string. It would be a bad idea. UTF-8 is not bound to a particular type (or C++ type). It works on _any_ byte sequence.

jhasse · 2026-02-04T15:15:05 1770218105

You're right, sorry.

Good point, so maybe the standard should just add some functions that take std::string_view. Definitely not add a whole new class like std::u8string ...

jstimpfle · 2026-01-07T11:07:12 1767784032

What type of filter do you mean? Unless I'm misunderstanding/missing something, the approach described doesn't go into the details of how coverage is computed. If the input image is only simple lines whose coverage can be correctly computed (don't know how to do this for curves?) then what's missing?

I'd be interested how feasible complete 2D UIs using dynamically GPU rendered vector graphics are. I've played with vector rendering in the past, using a pixel shader that more or less implemented the method described in the OP. Could render the ghost script tiger at good speeds (like 1-digit milliseconds at 4K IIRC), but there is always an overhead to generating vector paths, sampling them into line segments, dispatching them etc... Building a 2D UI based on optimized primitives instead, like axis-aligned rects and rounded rects, mostly will always be faster, obviously.

Text rendering typically adds pixel snapping, possibly using byte code interpreter, and often adds sub-pixel rendering.

dahart · 2026-01-07T16:23:26 1767803006

> What type of filter do you mean? […] the approach described doesn’t go into the details of how coverage is computed

This article does clip against a square pixel’s edges, and sums the area of what’s inside without weighting, which is equivalent to a box filter. (A box filter is also what you get if you super-sample the pixel with an infinite number of samples and then use the average value of all the samples.) The problem is that there are cases where this approach can result in visible aliasing, even though it’s an analytic method.

When you want high quality anti-aliasing, you need to model pixels as soft leaky overlapping blobs, not little squares. Instead of clipping at the pixel edges, you need to clip further away, and weight the middle of the region more than the outer edges. There’s no analytic method and no perfect filter, there are just tradeoffs that you have to balance. Often people use filters like Triangle, Lanczos, Mitchell, Gaussian, etc.. These all provide better anti-aliasing properties than clipping against a square.

jlokier · 2026-01-07T14:17:39 1767795459

> If the input image is only simple lines whose coverage can be correctly computed (don't know how to do this for curves?) then what's missing?

Computing pixel coverage accurately isn't enough for the best results. Using it as the alpha channel for blending forground over background colour is the same thing as sampling a box filter applied to the underlying continuous vector image.

But often a box filter isn't ideal.

Pixels on the physical screen have a shape and non-uniform intensity across their surface.

RGB sub-pixels (or other colour basis) are often at different positions, and the perceptual luminance differs between sub-pixels in addition to the non-uniform intensity.

If you don't want to tune rendering for a particular display, there are sometimes still improvements from using a non-box filter

An alternative is to compute the 2D integral of a filter kernel over the coverage area for each pixel. If the kernel has separate R, G, B components, to account for sub-pixel geometry, then you may require another function to optimise perceptual luminance while minimising colour fringing on detailed geometries.

Gamma correction helps, and fortunately that's easily combined with coverage. For example, slow rolling tile/credits will shimmer less at the edges if gamme is applied correctly.

However, these days with Retina/HiDPI-style displays, these issues are reduced.

For example, MacOS removed sub-pixel anti-aliasing from text rendering in recent years, because they expect you to use a Retina display, and they've decided regular whole-pixel coverage anti-aliasing is good enough on those.

jstimpfle · 2025-12-28T02:07:28 1766887648

In C++, in particular when restricting to a C like subset, I prefer looking at an expression like

    foo->bar.baz

instead of (in Rust and other modern languages that decided to get rid of the distinction)

    foo.bar.baz

For example, the former lets me easily see that I can copy foo->bar and I now have a copy of baz (and indeed bar). In a newer language, it's harder to see whether we are copying a value or a reference.

saghm · 2025-12-28T02:44:44 1766889884

I see what you're saying but I'd argue that this is mostly an unnecessary thing to worry about because with the exception of types explicitly opted into being cheaply copyable, you're going to be moving it if you're not accessing it via a reference. The idea is that if you're worried about expensive copies, it shouldn't be possible to copy implicitly in the first place; you'd either explicitly `clone` or you wouldn't be copying it at all.

jstimpfle · 2025-12-28T16:36:07 1766939767

I'm not worried about expensive copies. I'm worried about being able to understand my systems code. The solution isn't adding more abstractions (like move semantics on top). I don't want to move anything. I want to be clear about taking a reference or making an actual copy, these are deeply, semantically, different. This difference is important for single threaded code but also for concurrency -- not only with mutable data types but also with immutable ones.

Performance is mostly a consequence of clear and direct code. You mostly don't achieve performance by saving individual copies, but by being in control of the code and architecture.

ibotty · 2025-12-29T11:12:45 1767006765

I don't think your run often into these things, because of Rust's ownership enforcement. But I might be misunderstanding you, because it's all pretty abstract and I might not have the whole context.

jstimpfle · 2025-12-18T17:40:50 1766079650

He wrote this whole game in it. Apart from that, a couple dozen or hundreds of beta-testers. Not sure whether the language ever gets released, maybe he's too worried about having to maintain it and not being able to change it anymore.