A few hours later another programmer came up with a prototype of a much faster terminal renderer, proving that for an experienced programmer a terminal renderer is a fun weekend project, far from being a multi-year research undertaking.

I have no idea if this is the case here, and I suspect it might not be, but pretty much every time I've seen a developer complain that something is slow and then 'prove' it can be faster by making a proof-of-concept, the only reason theirs is faster is that it doesn't implement the important-but-slow bits and ignores most of the edge cases. You shouldn't automatically assume something is actually bad just because someone shows a better proof-of-concept 'alternative'. They may have just ignored half the stuff it needs to do.




This particular case was discussed at length on Reddit and on YC News. The general consensus was that the Microsoft developers simply didn't have performance in their vocabulary, and couldn't fathom it being a solvable problem despite having a trivial scenario on their hands with no complexity to it at all.

The "complaining developer" produced a proof of concept in just two weekends that notably had more features[1] and was more correct than the Windows Terminal!

RefTerm 2 vs Windows Terminal in action: https://www.youtube.com/watch?v=99dKzubvpKE

Refterm 2 source: https://github.com/cmuratori/refterm

One of the previous discussions: https://news.ycombinator.com/item?id=27775268

[1] Features relevant to the debate at any rate, which was that it is possible to write a high-performance terminal renderer that also correctly renders Unicode. He didn't implement a lot of non-rendering features, but those are beside the point.


And the experienced developer is Casey Muratori who is somewhat well known for being a very experienced developer. That makes it less likely that he doesn't know what he's talking about and is skipping over hard/slow features.


And he had a condescending tone from the beginning (as he always does). Maybe if he was more respectful / likable, the developers would have responded better.


Where? He started completely neutral here:

https://github.com/microsoft/terminal/issues/10362

We live in a time where every competent developer is slandered in public if he isn't fully submissive to the great corporate powers.


I think the comment below in the github thread sums up the attitude of the developer. It's definitely not a "neutral" attitude. It's somewhat chip-on-shoulder and somewhat aggressive.

  > Setting the technical merits of your suggestion aside though: peppering your comments with clauses like “it’s that simple” or “extremely simple” and, somewhat unexpectedly “am I missing something?” can be read as impugning the reader. Some folks may be a little put off by your style here. I certainly am, but I am still trying to process exactly why that is.

But by any reasonable reading, the guy wasn't "slandered".


Man, if we start taking issue with "Am I missing something?", how can we have productive, good-faith discussions? The only attitude I can associate with that is openness to learn, a genuine curiosity.

How is a yes/no question aggressive? At that point the maintainers had two possible responses:

1. Yes you are missing that ...

2. No that is the complete picture.

But they chose to sidestep to a third possibility, "we are put off by your questioning!". Excuse me, what?


> How is a yes/no question aggressive?

Have you stopped beating your wife?

More relevantly, when the question is asked genuinely then - as you say - it's expressing an openness to learn.

Sometimes it is asked rhetorically, dripping with sarcasm and derision. In that case, it is clearly not furthering our interest in productive, good-faith discussions.

Far more often, it falls somewhere between those two and - especially in text - is often ambiguous as to which was intended. While we should exercise charity and hope our conversational partners do likewise, it makes sense to understand when some phrases might be misconstrued and perhaps to edit accordingly.


If you're going to read emotional content into that "Am I missing something?", I think sarcasm and derision are not the most plausible options. In this case, it seems like incredulity is the more likely and appropriate reaction: because it seemed like the person asking the question was putting a lot more thought and effort into the discussion than the Microsoft developers who were not willing to seriously reconsider their off-the-cuff assumptions.


Oh, I didn't mean that sarcasm and derision is how the Microsoft developers interpreted the phrase. I was speaking to the notion that the question was necessarily innocent and could only be interpreted thusly.

I would say that incredulity falls within the range between "completely inoffensive" and "outright hostile", and very much toward the former side of the scale. It can be hard to distinguish from feigned incredulity, which (while still far from "sarcastic and derisive") makes its way toward the other side somewhat.


"Feigned incredulity" can be every bit as caustic as outright hostility.

It's all a matter of perception and context, of course. And though you say there's only one way to interpret it, even you describe it as a continuum.

Sadly, this is all just a missed opportunity.

MS could have been less defensive and more open to possible solutions. The genius programmer could have slowed his roll a bit and could have been more collaborative.


I do get the sense that the "feel" in his writing eventually becomes more like "what are you guys smoking, this should be simple!"

It's not just "Am I missing something?"

It's:

"Am I missing something? Why is all this stuff with "runs of characters" happening at all? Why would you ever need to separate the background from the foreground for performance reasons? It really seems like most of the code in the parser/renderer part of the terminal is unnecessary and just slows things down. What this code needs to do is extremely simple and it seems like it has been massively overcomplicated."

Perhaps frustrated that they don't really seem to be on the same technical page?

I tend to think these things can go both ways. I feel pointing out someone's frustration in writing tends to make things worse. Personally I would just ignore it in this case.


That exact case seems a very appropriate scenario for clarifying? Microsoft kept saying something was difficult, whilst Casey knew that it was not, so really he was being polite by first confirming that there wasn't something he'd overlooked?


There's a difference between "inherently difficult" and "difficult to update this software package". My reading of this thread is that the MS devs are saying this will take them a lot of effort to implement in this app, not that a new implementation couldn't be simpler than the existing one. Asking them to rearchitect the application is an involved process which would take a lot of back-and-forth to explain the tradeoffs. The new architecture can be simple, but evaluating a new architecture and moving to it are not.

There's a point at which you've moved from "fix this bug" or "evaluate this new component" to "justify the existing design" and "plan a re-architecture".


Whether or not you see his behavior as polite, I guess, is a matter of how you read people and the context of the situation. That said, he did literally admit he was being "terse". I think it was counterproductive at best and rather mean at worst.

As for whether it really is "difficult", one has to ask for whom? For someone that is intimately familiar with C++, DirectX, internationalization, the nature of production-grade terminal shell applications and all their features and requirements?

And even if it is "easy", so what? It just means Microsoft missed something and perhaps were kind of embarrassed, that's totally human, it happens. It's not so nice when this stuff is very public with harsh judgement all around.

This all rubs me the wrong way. I have found the Microsoft folks to be very helpful and generous with their attention on Github Issues. They've helped me and many others out, and it has been genuinely refreshing. What this guy did might discourage participation and make folks more defensive, to avoid losing face in a big public way over a mistake or silly gotcha.


Some people prefer to communicate with fewer words? This is something that crops up often when people from different cultures work on a single issue.

As for difficult, the context is very much set from it being a Github Issue on their own repo, meaning there is a certain assumption of skill.

You're cutting Microsoft a lot of slack here, and it feels like you're forgetting that out of this whole transaction MS ends up with free labour and bug-fixes? They chose for the setting to be very public, and they chose to let their employees reply directly to customers with quotes like[1]: ["I will take your terse response as accepting the summary.", "somewhat combatively", "peppering your comments with clauses like", "impugning the reader."]. All of which is corporate passive-aggression and (in my mind) vastly more antagonistic than Casey ever was?

1. https://github.com/microsoft/terminal/issues/10362


One non sequitur deserves another. Just call his mother a cunt and move on.


Casey is in fact perpetually annoyed with and disdainful of microsoft. Anyone who is familiar with him knows this.

He's been like this for years, and that's fine when you are hanging out with your buddies over a beer, but now Casey is a public figure.

Being a public figure means you are not 'every competent developer'. The reason this was made so public wasn't ms employees, it was Casey's followers.

The sequence of events he started here ended with his fans and friends on discord feeling justified (because Casey, not them, was right on a technical level) in brigading volunteers and microsoft employees alike until at least one of them quit open source.

A truly ugly conclusion that could have been avoided with a more circumspect approach.


The problem wasn't that the Microsoft devs were wrong technically. The problem was that the tone of the Microsoft developers got much worse than Casey's tone; they should have just closed the bug rather than ridiculing him at the end. If they had done that, the issue wouldn't have been a big deal.


I've found people sometimes take a neutral tone, especially from someone (me for example!) who is sometimes more than a bit openly opinionated, as being passive aggressive (or passive condescending if that is a thing). Perhaps that is what has happened in this case?


For those curious, what was the outcome of this closed issue? Did Casey make a working terminal on Windows, beyond just a text renderer? Did Microsoft incorporate his feedback?

My worry is that Casey did this technical performance for the benefit of his followers, and nothing of value was gained, except of course Casey's growing fame.


Well, given how absurdly big the difference is, and that the main thing he did was render on demand instead of at 7000fps, I think he has a good reason to be condescending, and they totally deserve it for wasting millions of people's time with this shit.


See also: the blinking cursor in Visual Studio Code.

Here's a thread on it with other examples: https://news.ycombinator.com/item?id=13940014

They fixed it, but it was a sign of the times. Everything we've used over the decades had to be re-implemented for the web and stuff like Electron, and the people doing the implementing use such powerful machines that they don't even notice a simple blinking cursor devouring CPU or think about its impact on normal computers.


This! Developers at MS (edit: and elsewhere) should be forced to use the products of their brainwork on low-end machines at least two days a week.

Or not - regardless of what the MS employee claimed, Linux terminals' performance is more than adequate.

Edit: I am speaking of Linux, not WSL, of course.


Yes, the open source volunteers and random employees deserve it. They are responsible for all of microsoft's many sins, and we should find them online and tell them they are trash tier developers until they learn their lesson, right?

Ok, sarcasm off. This attitude is utterly toxic. People who are ignorant of how fast their software could be do not deserve abuse from strangers.


> People who are ignorant of how fast their software could be do not deserve abuse from strangers.

That's not the only valid way to frame the situation. At some level, professional software developers have a responsibility to be somewhat aware of the harm their currently-shipping code is doing.


Taking responsibility (which the developers later did by the way, even in this thread) and enduring abuse (which is also well documented here and elsewhere) should not be put on the same level.

More broadly, I'd much rather endure a slightly slow terminal made by developers acting in the open and in (mostly) good faith than the intentionally malicious software produced by actual bad actors within Google, Facebook, Microsoft et al.


"Abusive" is probably the the best one-word description of the way Microsoft and its software interacts with users. But I thing we'd agree it's a bit of a stretch to apply that to the case of a slightly slow terminal. However, it is absolutely fair to call it abusive when Microsoft tries to deny their problems or lie to their users that those problems are not Microsoft's fault and are something the users must simply put up with.

It's also important to keep in mind the vast asymmetry here. When Microsoft deploys problematic software, even a relatively minor problem will be responsible for many man-hours of frustration and wasted time. Far more man-hours than are ruined when a few developers have bad things said about them online. One doesn't excuse the other, but you can't ignore one of the harms simply because it's more diffuse.


The person that quit the project (and possibly the internet at large) wasn't a microsoft employee.

In my mind, there are two asymmetries.

* Microsoft v. Users

and

* Casey's network v. a 3-4 man open source team within microsoft

I don't disagree that the former is abusive.

However, it's my contention that this incident is primarily about the latter.

Casey, rightly, already had some pent up rage about the former asymmetry as well.

But it was a human manager/dev? within that small team, not Microsoft writ large, that got defensive about the software he was responsible for.

I believe I'd feel embarrassed and defensive too if something I'd worked on turned out to be flawed in a painfully obvious way. I can understand avoiding the grim truth by denying that the problem has truly been solved by ~700 lines of C.

Something else that I'll note here is that the vast majority of "Your software is too slow, here's how to fix it, I could do it in a weekend, and by the way this whole problem space is actually super simple." tickets do not end up realizing software speedups. Without proper context, they just sound patronizing, making the argument easier to dismiss.


That said, there's only so much patience one can have...


condescending: having or showing a feeling of patronizing superiority.

In this case he also demonstrated his superiority with working code.

Better to learn from it than to pout about it.


However, his experience, in games and game development tools AFAIK, might not be fully applicable to the development of mainstream commercial software that has to try to be all things to all people, including considerations like internationalization, accessibility, and backward compatibility. The performance difference that he demonstrated between Windows Terminal and refterm is certainly dramatic, but I wouldn't be surprised if there's something he's overlooking.


When I saw this mentioned on HN I immediately knew this kind of comment would be here, because something along the lines of (I'm paraphrasing) "it's probably fast because it's not enterprise enough" was repeated in every place refterm was shared, by different people, multiple times, even after all the proof in the world that it's in fact the opposite. They almost refused to believe that software can be that much better than its standard today, even to the point of bringing up arguments like a 16fps terminal being better than a 7500fps one because so many fps would probably consume too many resources. Before, I found Casey's tone when criticizing bad software off-putting, but now I understand that after many years of such arguments it takes a toll on you.


Seconding. It takes doing some low-level gamedev[0] stuff, or using software written by people like Casey, to realize just how fast software can be. There's an art to it, and it hits diminishing returns with complex software, but the baseline of popular software is so low it doesn't take much more than a pinch of care to beat it by an order of magnitude.

(Cue in the "but why bother, my app is IO bound anyway" counterarguments. That happens, but people too often forget you can design software to avoid or minimize the time spent waiting on IO. And I don't mean just "use async" - I mean design its whole structure and UX alike to manage waits better.)

--

[0] - Or HFT, or browser engine development, or a few other performance-minded areas of software.


I feel obliged to point out the destructive power of Knuth's statement, "Premature optimization is the root of all evil."

I have encountered far too many people who interpret that to mean, "thou shalt not even consider performance until a user, PM or executive complains about it."


The irony is that the very paragraph in which Knuth made that statement (and the paper, and Knuth's programming style in general) is very much pro-optimization. He used that statement in the sense of "Sure, I agree with those who say that blind optimization everywhere is bad, but where it matters…".

Here's the quote in context:

> There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.…

( https://pic.plover.com/knuth-GOTO.pdf , http://www.kohala.com/start/papers.others/knuth.dec74.html For fun, see also the thread around https://twitter.com/pervognsen/status/1252736617510350848)

And from the same paper, an explicit statement of his attitude:

> The improvement in speed from Example 2 to Example 2a [which introduces a couple of goto statements] is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by pennywise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, …


Knuth's statement was basically "use a profiler and optimize the hot path instead of trying to optimize with your intuition", which is great advice. Most people heard "don't optimize at all". Something that you can derive from that advice is "have a hot path to optimize". I've seen a few programs that aren't trivial to optimize because the work is distributed everywhere.


I'm what most people call an FPGA Engineer, though I work all the way from boards/silicon to cloud applications. The number of times I've been asked to consult on something in the software world on performance, where the answer to how to do it right was me telling them "rm -rf $PROBLEMATIC_CODE" and then go rewrite it with a good algorithm, is way too damn high. Also, the number of times someone asked me to accelerate something on an FPGA only for me to go implement it to run on a GPU in about 2-3 days using SYCL + OpenCL is insane. Sure, I could get another 2x improvement... or we can accept the 1,000x improvement I just gave you at a much lower price.


Which of course, never really happens, because PM’s and execs always want more features, and performance is never a feature for them until it becomes so noticeably bad that they begrudgingly admit they should do the minimum to make users stop complaining.


Agreed. As a young performance-oriented coder I've often been looked down on by people who used Knuth's almost god-like authority to dress up all sorts of awful engineering.

And of course most people don't know the full quote and they don't care about what Knuth really meant at the time.


>> I've been often looked down by people who used Knuth almost god-like authority to dress up all sorts of awful engineering.

Quick quips don't get to trump awful engineering. Just call Knuth a boomer and point to the awful aspects of the actual code. No disrespect to Knuth; just dismiss him as easily as people use him to dismiss real problems.


I feel like that quote spoke to a particular time. Nowadays I'd point at premature abstraction as the fount of evil.


> I feel obliged to point out the destructive power of Knuth's statement, "Premature optimization is the root of all evil."

Except that line was written in a book (Volume 1: Art of Computer Programming) that was entirely written from the ground up in Assembly language.

It's been a while since I read the quote in context. But IIRC it was about the question of saving 1 instruction between a for-loop that counts up vs a for-loop that counts down.

"Premature optimization is the root of all evil" if you're deciding to save 1-instruction to count from "last-number to 0", taking advantage of the jnz instruction common in assembly languages, rather than "0 to last-number". (that is: for(int i=0; i<size; i++) vs for(int i=size-1; i>=0; i--). The latter is slightly more efficient).

Especially because "last-number to 0" is a bit more difficult to think about and prove correct in a number of cases.
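For illustration, here are the two loop shapes from that example as compilable C (a sketch of the idea only, not Knuth's original MIX example):

  #include <stdio.h>

  /* Counting up: the loop test compares i against `size` every iteration. */
  static int sum_up(const int *data, int size) {
      int sum = 0;
      for (int i = 0; i < size; i++)
          sum += data[i];
      return sum;
  }

  /* Counting down: the loop test is a compare against zero, which many
     instruction sets get almost for free from the decrement's own flags
     (the jnz-style trick mentioned above) - at the cost of being slightly
     harder to read and to prove correct. */
  static int sum_down(const int *data, int size) {
      int sum = 0;
      for (int i = size - 1; i >= 0; i--)
          sum += data[i];
      return sum;
  }

  int main(void) {
      int data[] = {1, 2, 3, 4};
      printf("%d %d\n", sum_up(data, 4), sum_down(data, 4));
      return 0;
  }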


> Except that line was written in a book

I recall it being from his response to the debates over GOTO, and some googling seems to agree.

Not that that takes away from your overall point.


In the office today with my copy of Literate Programming (which contains the essay in question) I can confirm that the sentence does appear in "Structured Programming with goto Statements" (it appears on page 28 of my copy). Here it is in a general context, not pertaining to a single particular example.

In support of your overall point, though, having just said "[w]e should forget about small efficiencies, about 97% of the time", the next paragraph opens: "Yet we should not pass up our opportunities in that critical 3%."


It's hilarious to me that people quote the person that wrote TAOCP to justify not thinking about performance at all.


I'm not an experienced programmer but if I took all these maxims seriously...

If I don't think about performance and other critical things before committing to a design, I know that in the end I will have to rewrite everything or deliver awful software. Being lazy and a perfectionist, those are two things I really want to avoid.


... the best usage of this phrase I've encountered is using it to shut down a requirements discussion


We should forget about small efficiencies, say about 97% of the time.

Yet we should not pass up our opportunities in that critical 3%.


I find it striking that a decent modern laptop would have been a supercomputer 20 years ago, when people used Office 97, which was already feature-complete IMO. I can't help this constant cognitive dissonance with modern software; do we really need supercomputers to move Windows out of the box?


We need some extra processing power to support larger screens and refresh rates. Arguably, security benefits of managed code / sandboxing are worth it - but the runtimes seem to be pretty-well optimized. Other than that, I don't see anything reasonable to justify the low performance of most software.


   "support larger screens and refresh rates"
Uh, yeah, 4k etc., but most modern machines are still 1920x1080@60Hz. Which is only 8% more pixels than 1600x1200, which wasn't an uncommon resolution in the late 1990s, usually running at 75Hz or better over analog VGA cables. So it's actually _LESS_ bandwidth, which is why many of us cried about the decade+ of regression in resolution/refresh brought on by the LCD manufacturers deciding computer monitors weren't worthy of being anything but overpriced TV screens. It's still ongoing, but at least there are some alternatives now.
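For what it's worth, a quick back-of-the-envelope check of that pixel-rate comparison (raw pixel rate only, ignoring blanking intervals), using the refresh rates stated above:

  #include <stdio.h>

  int main(void) {
      /* raw pixels per second pushed by each mode */
      long long modern  = 1920LL * 1080 * 60;  /* 1920x1080 @ 60 Hz */
      long long late90s = 1600LL * 1200 * 75;  /* 1600x1200 @ 75 Hz */
      printf("1920x1080@60Hz: %lld pixels/s\n", modern);   /* ~124.4 million */
      printf("1600x1200@75Hz: %lld pixels/s\n", late90s);  /* ~144.0 million */
      return 0;
  }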

It is possible to get office97 (or for that matter 2003, which is one of the last non sucky versions) and run it on a modern machine. It does basically everything instantly, including starting. So I don't really think resolution is the problem.

PS, I've had multiple monitors since the late 80's too, in various forms, in the late 1990's driving multiple large CRTs at high resolution from secondary PCI graphics cards, until they started coming with multiple ports (thanks matrox!) for reasonable prices.


I'd imagine software is bloated and grown until it is still-just-about usable on modern hardware. Making it faster than that is probably seen as premature optimisation.

I'd imagine perhaps this is how product teams are assessed - is the component just-about fast enough, and does it have lots of features. So long as MS Office is the most feature-rich software package, enterprise will buy nothing else, slow or not.


It doesn't even need to be the most feature-rich any more. Microsoft has figured out that the key is corporate licensing.

Is Teams better than Zoom? No, but my last employer ditched Zoom because Teams was already included in their enterprise license package and they didn't want to pay twice for the same functionality.


It really is feature-complete. And I still use it for writing! Word 97 beats anything else I've tried in both polish and performance.


I think there's a story in here that most are missing, but your comment is closest to touching on. This was not a performance problem. This was a fundamental problem that surfaced as a performance issue.

The tech stack in use in the Windows Terminal project is new code bolted onto old code, and no one on the existing team knows how that old code works. No one understands what it's doing. No one knows whether the things that old code needed to do are still needed.

It took someone like Casey, who knew gamedev, to know instinctually that all of that stuff was junk and you could rewrite it in a weekend. The Microsoft devs, if they wanted to dive into the issue, would be forced to Chesterton's Fence every single line of code. It WOULD have taken them years.

We've always recommended that programmers know the code one and possibly two layers below them. This recommendation failed here, and it failed during the GTA loading-times scandal. It has failed millions of times, and the ramifications of that failure are chaos in the form of performance issues.

I've come to realize that much of the problems that we have gotten ourselves into are based on what I call feedback bandwidth. If you are an expert, as Casey is, you have infinite bandwidth, and you are only limited by your ability to experiment. If your iteration time is a couple of seconds, you will be able to create projects that are fundamentally impossible without that feedback.

If you need to discuss something with someone else, that bandwidth drops like a stone. If you need a team of experts, all IMing each-other 1 on 1, you might as well give up. 2 week Agile sprints are much better than months to years long waterfall, but we still have so much to learn. If you only know if the sprint is a success after everyone comes together, you are doomed. The people iterating dozens of times every hour will eat your shorts.

I'm not saying that only a single developer should work on entire projects. But what I am saying is that when you have a Quarterback and Wide Receiver that are on the same page, talking at the same abstraction level, sometimes all it takes is one turn, one bit of information, to know exactly what the other is about to do. They can react together.

Simple is not easy. Matching essential complexity might very well be impossible. Communication will never be perfect. But you have to give it a shot.


Thanks for the “know the code one and possibly two layers below them” point, haven’t seen it written out explicitly before, but it sure puts into perspective why I consider some people much better programmers than others!


I started off programming doing web development, working on a community-run asynchronous game where we needed to optimize everything to run in minimal time and power to save on cost and annoyance. It was a great project to work on as a high schooler.

Then in college, I studied ECE and worked in a physics lab where everything needed to run fast enough to read out ADCs as quickly as allowed by the datasheet.

Then I moved to defense doing FPGA and SW work (and I moonlighted in another group consulting internally on verification for ASICs). Again, everything was tightly controlled. On a PCI-e transfer, we were allowed 5 us of maximum overhead. The rest of the time could only be used for streaming data to and from a device. So if you needed to do computation with the data, you needed to do it in flight, and every data structure had to be perfectly optimized. Weirdly, once you have data structures that are optimized for your hardware, the algorithms kind of just fall into place. After that, I worked on sensor fusion and video applications for about a year where our data rates for a single card were measured in TB/s. Needless to say, efficiency was the name of the game.

After that, I moved to HFT. And weirdly, outside of the critical tick-to-trade path or microwave stuff, this industry has a lot less care around tight latencies and has crazy low data rates compared to what I'm used to working with.

So when I go look at software and stuff is slow, I'm just suffering because I know all of this can be done faster and more efficiently (I once shaved 99.5% of the run time off of a person's code with better data packing to align to cache lines, better addressing to minimize page thrashing, and automated loop unrolling into threads all in about 1 day of work). Software developers seriously need to learn to optimize proactively... or just write less shitty code to begin with.
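To give a flavor of what "better data packing to align to cache lines" can mean in practice, here's a minimal, hypothetical C sketch (not the actual code from that anecdote): split the hot, per-iteration fields from the cold ones, and align the hot struct to the 64-byte cache line size common on x86.

  #include <stdalign.h>
  #include <stdio.h>

  /* Hot fields that the inner loop touches every iteration live together in
     one 64-byte-aligned struct, so iterating an array of them streams whole
     cache lines instead of dragging in unrelated cold data. */
  typedef struct {
      alignas(64) float position[3];
      float velocity[3];
      unsigned flags;
  } HotParticle;

  /* Cold, rarely-read fields are kept in a separate parallel array. */
  typedef struct {
      char   debug_name[48];
      double spawn_time;
  } ColdParticle;

  int main(void) {
      printf("sizeof(HotParticle) = %zu, alignof = %zu\n",
             sizeof(HotParticle), alignof(HotParticle));
      return 0;
  }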


> There's an art to it

while that's true, in this particular case with Casey's impl, it's not an art. The one thing that drastically improved performance was caching. Literally the simplest, most obvious thing to do when you have performance problems.


That's part of the art. It's obvious to anyone who knows it and mysterious and ineffable to anyone who doesn't.

The meta-point is that corporate developers have been through the hiring machine and are supposed to know these things.

Stories like this imply that in fact they don't.


The hiring machine merely ensures that they can leetcode their way out of an interview and into the job. It doesn't care about what they're supposed to know :)


Even something like a JSON parser is often claimed to be IO bound. It almost never is, because few parsers could keep up with modern IO, and some cannot even keep up with old HDDs.


Third'ing: the current crop of new developers have no freaking idea how much power they have at their fingertips. Us greybeard old game developers look at today's hardware and literally cream our jeans in comparison to the crap we put up with in the previous decades. People have no idea, playing with their crappy interpreted languages, just how much raw power we have if one is willing to learn the low-level languages to access it. (Granted, Numpy and BLAS do a wonderful job for JIT languages.)


I'd say it is almost the other way around. We have so much wonderful CPU power that we can spare some for the amazing flexibility of Python etc.

Also it's not that simple. At one place I worked (a scientific-computation kind of place), we'd prototype everything in Python, and production would be carefully rewritten C++. Standards were very high for prod, and we struggled to hire "good enough" modern C++ developers, with endless debates about "ooh, template meta-programming or structs or bare-metal pointers" kind of stuff.

3 times out of 4, the Python prototype was faster than the subsequent C++. It was because it had to be: the prototype was run and re-run and tweaked many times in the course of development. The C++ was written once, deployed, and churned daily, without anyone caring for its speed.


Python has nice optimised libs for that, so it's not completely a surprise for that kind of application.

If you're doing generic symbol shuffling with a bit of math, Python is fast-ish to code and horribly slow to run. You can easily waste a lot of performance - and possibly cash - trying to use it for production.

Whether or not you'll save budget by writing your own optimised fast libs in some other lang is a different issue, and very application dependent.


Worth bearing in mind that Casey has a long history of unsuccessfully trying to nudge Microsoft to care about performance and the fact that he's still being constructive about it is to his credit.


I highly respect Casey but given his abrasive communication style I sometimes wonder if he is not trying to trigger people (MS devs in this case) to push him back so he can make his point.


Honestly, I felt like the ones to start with the condescending tones were the Microsoft devs who kept talking down to Casey about You Don't Understand How Hard This Is, when they also readily admitted they didn't understand the internals very well.


I don't think they're actually contradicting themselves there. They know enough about how hard text rendering is to conclude that they're better off delegating it to the team that specializes in that particular area, even though it means they have to settle for a good-enough abstraction rather than winning at a benchmark.


Enterprise deployment of a Somebody Else's Problem field can really harm innovation:

“Any object around which an S.E.P. is applied will cease to be noticed, because any problems one may have understanding it (and therefore accepting its existence) become Somebody Else's Problem.”


Agree. Rendering text well really is hard, if you sit down and try to do it from scratch. It’s just that dealing with all of the wonderful quirks of human languages doesn’t have to make it _slow_. That’s their mistake.

And you’re right; all refterm really does is move the glyph cache out to the GPU rather than copying pixels from a glyph cache in main memory every frame.
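For anyone wondering what "a glyph cache on the GPU" amounts to, here's a minimal, hypothetical sketch of the general idea in C (not refterm's actual code; the rasterization and texture-upload calls are stubbed out):

  #include <stdint.h>
  #include <stdio.h>

  /* Direct-mapped cache: codepoint -> slot in a glyph atlas texture that
     lives on the GPU. Rasterize and upload only on a miss; after that,
     drawing a cell is just "draw atlas slot N at cell (x, y)". */
  #define ATLAS_SLOTS 4096

  typedef struct {
      uint32_t codepoint;   /* 0 = empty slot */
      uint16_t slot;        /* which tile of the atlas holds the pixels */
  } GlyphEntry;

  static GlyphEntry glyph_cache[ATLAS_SLOTS];

  /* Stub standing in for real shaping/rasterization + texture upload. */
  static void rasterize_and_upload(uint32_t codepoint, uint16_t slot) {
      printf("cache miss: rasterizing U+%04X into slot %u\n",
             (unsigned)codepoint, (unsigned)slot);
  }

  static uint16_t glyph_slot_for(uint32_t codepoint) {
      uint32_t h = (codepoint * 2654435761u) % ATLAS_SLOTS;  /* cheap hash */
      GlyphEntry *e = &glyph_cache[h];
      if (e->codepoint != codepoint) {   /* miss (or collision): redo the slot */
          e->codepoint = codepoint;
          e->slot      = (uint16_t)h;
          rasterize_and_upload(codepoint, e->slot);
      }
      return e->slot;                    /* hit: no per-frame rasterization */
  }

  int main(void) {
      glyph_slot_for('A');   /* miss */
      glyph_slot_for('A');   /* hit  */
      return 0;
  }

As I understand it, refterm actually keys its cache on whole cell contents (so combining characters and complex clusters work too), but the principle is the same: glyph pixels get produced once, not every frame.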


In my experience as a former game dev who moved to enterprise apps, game dev techniques are broadly applicable and speed up enterprise apps without compromising on functionality.

Consider memory management techniques like caching layers or reference pools. Or optimizing draws for the platform's render loop. Or being familiar with profiler tools to identify hotspots. These techniques are all orthogonal to functionality. That is, applying them when you see an opportunity to will not somehow limit features.
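As a concrete example of the "reference pools" point, here's a minimal sketch of a fixed-capacity object pool in C (names and sizes are made up for illustration):

  #include <stddef.h>

  /* Instead of malloc/free per object, slots come from a free list. This
     avoids allocator churn and keeps live objects contiguous in memory. */
  #define POOL_CAP 256

  typedef struct {
      float x, y;
  } Particle;

  typedef struct {
      Particle items[POOL_CAP];
      int      free_list[POOL_CAP];
      int      free_count;
  } ParticlePool;

  static void pool_init(ParticlePool *p) {
      p->free_count = POOL_CAP;
      for (int i = 0; i < POOL_CAP; i++)
          p->free_list[i] = i;
  }

  static Particle *pool_acquire(ParticlePool *p) {
      if (p->free_count == 0)
          return NULL;                              /* pool exhausted */
      return &p->items[p->free_list[--p->free_count]];
  }

  static void pool_release(ParticlePool *p, Particle *obj) {
      p->free_list[p->free_count++] = (int)(obj - p->items);
  }

  int main(void) {
      ParticlePool pool;
      pool_init(&pool);
      Particle *a = pool_acquire(&pool);   /* grab a slot  */
      pool_release(&pool, a);              /* hand it back */
      return 0;
  }

Nothing about a pool like this changes what the app can do, which is exactly the point about these techniques being orthogonal to features.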

So why aren't the enterprise apps fast, if it's so easy? I think that boils down to incentives. Enterprise apps are sales- or product-led, and the roadmap only accommodates functionality that makes selling the software easier. Whereas in games, the table-stakes level you need to reach for graphics is not achievable by naively pursuing game features.

Put another way, computers and laptops are way stronger than consoles and performance is a gas. Enterprise devs are used to writing at 1 PSI or less and game devs are used to writing at 1 PSI or more.


With enterprise apps, I also have the budget to throw more computers at a problem. If it's between 2 weeks of my time, or to throwing another core at a VM, the extra core wins most of the time.


I actually have a lot of respect for old school game programmers because they have two traits that many of us who develop mainstream commercial software often lack: a) they care about performance and not in the abstract, but performance as evaluated by an actual human (latency issues in a messaging app are tolerable, a game with latency issues is simply not fun to play) and b) they can sit down without much fuss and quickly write the damn code (the ability that slowly atrophies as one works on a multi-year-old codebase where every change is a bit of a PITA). Sure, the constraints are different, but a lot of it is simply learned helplessness.


> might not be fully applicable to the development of mainstream commercial software that has to try to be all things to all people, including considerations like internationalization, accessibility, and backward compatibility.

Windows Terminal has none of that. And his refterm already has more features implemented correctly (such as proper handling of Arabic etc.) than Windows Terminal. See feature support: https://github.com/cmuratori/refterm#feature-support

Also see FAQ: https://github.com/cmuratori/refterm/blob/main/faq.md


Internationalization and accessibility are very important in game development. A lot of time is invested in this and larger studios have dedicated UI/UX teams which spend a lot of time on these issues.

The same is true of backwards compatibilty. As an example, making sure old save data is compatible with new versions is an important consideration.

Source: I'm a game programmer working mainly with graphics and performance, but I previously spent five years working on the UI team at a AAA studio.


How is it not applicable when the thing in question is rendering text, and rendering is the core of game development? This argument is stupid. Do you have to be a slowpoke to develop commercial apps?


My point is that a UI that meets the needs of as many users as possible, including things like internationalization and accessibility, is much more complex than a typical game UI. That complexity drives developers to use abstractions that often make it much more difficult to optimize. And in the big picture, support for these things that add complexity is often more important than top-speed rendering.


Games are typically much better at internationalization and accessibility than developer tooling, though. For example, this new Windows console doesn't have either, but all big games get translated into, and handle text from, languages all over the world.


Video games often have an international audience and go to great lengths to support accessibility and multiplatform use, e.g. supporting both tablet and desktop. It's laughable how bad many enterprise UIs are - failing to handle different locales, having issues displaying right-to-left text, and assuming everyone is using an English-speaking standard desktop environment - whereas many indie games manage to handle these things very well.


Games usually handle internationalization and accessibility much better than most software.

This includes audio localization (something no 'Enterprise' software has ever needed AFAIK), and multiple colour palettes for different types of colour blindness.

Sometimes video games are the only software with reasonable localizations I ever find installed in a particular computer.


Can't recall the last time I played a game with no internationalization support


Backward compatibility is a huge one here.

There is a newer version of component X but we can't leverage that due to dependency Y.


I found it very funny that the Hindi sample text on display, in the YouTube refterm demo, means “You can wake up someone who is sleeping, but how do wake up someone who is hell bent on pretending to sleep?”.


A bit off topic, but has anybody followed the performance issues of Microsoft Flight Simulator 2020? For more than half a year it struggled with performance because it was CPU-heavy, only loading one core, etc. It barely ran on my i5 6500. Fast forward half a year: they want to release it on Xbox, so MS/Asobo move a lot of computation onto the GPU, and the game starts running smoothly on the very same i5 with maxed-out quality settings.

You just begin to wonder how these things happen. You would think top programmers work at these companies. Why would they not start with the right approach, loading the GPU first, etc.? Why did it take them so much time to finally do it correctly? Why waste time by not doing it right at the beginning?


It's pretty straightforward case of prioritization. There are always more things to do on a game project than you have people and time to do.

The game runs well enough, so the people who could optimize things by rewriting them from CPU to GPU are doing other things instead. Later, performance becomes a noticeable problem to the dev team, from customer feedback and the need to ship in more resource-constrained environments (VR and Xbox), and then those people can do the work to improve performance.

It's also handy to have a reference CPU implementation both to get your head around a problem and because debugging on the GPU is extremely painful.

To go further down the rabbit hole it could be that they were resource constrained on the GPU and couldn't shift work there until other optimizations had been made. And so on with dependencies to getting a piece of work done on a complex project.


Makes sense, and then it kinda agrees with the parent comment that the "Microsoft developers simply didn't have performance in their vocabulary".

Yes, there is no doubt that "there are always more things to do on a game project than you have people and time to do". However, how is there time to first build a "main thread on a single core" monster and then redo it according to game development 101 - make use of the GPU?

It is no joke - the GPU was barely loaded while the CPU was choking. On a modern game released by a top software company proudly presenting its branding in the very title.


> However, how is there time to first build a "main thread on a single core" monster and then redo it according to game development 101 - make use of the GPU?

The single threaded code was taken from Microsoft Flight Simulator X, the previous version of this game from 2006. It was not done from scratch for the new game, and it still hasn't been replaced. They've just optimized parts of it now.

Another important performance bottleneck is due to the fact that the game's UI is basically Electron. That's right, they're running a full-blown browser on top of the rest of the graphics for the UI. Browsers are notoriously slow because they support a million edge cases that aren't always needed in a game UI.

For anyone interested in learning more about Microsoft Flight Simulator 2020 optimizations I can recommend checking out the Digital Foundry interview with the game's technical director. https://www.youtube.com/watch?v=32abllGf77g


It was actually made by a third-party game development studio rather than Microsoft.

Also the assumption that the culture of the Windows Terminal team is the same as the team building a flight simulator is a bit far fetched. Large organisations typically have very specific local cultures.

Rewriting stuff from CPU to GPU also isn't game development 101, both because it's actually quite hard and because not all problems are amenable to GPU compute - some would actually slow down.


I work within game development. Mainly on graphics programming and performance. There's always a number of things I know would speed various systems up but that I don't have time to implement because there are so many bigger fires to put out.

Also, I have rewritten CPU implementations to run on the GPU before and it's often nontrivial. Sometimes even in seemingly straightforward cases. The architectures are very different, you only gain speed if you can properly parallelize the computations, and a lot of things which are trivial on the CPU become a real hassle on the GPU.


It may sound oversimplified, but IME PC games are only optimized to the point where they run well on the development team's beefy PCs (or, in the best case, on some artificial 'minimal requirement PC with graphics details turned to low', but this minimal-requirement setup usually isn't taken very seriously by the development team).

When porting to a game console you can't simply define some random 'minimal requirement' hardware, because the game console is that hardware. So you start looking for more optimization opportunities to make the game run smoothly on the new target hardware, and some of those optimizations may also make the PC version better.


I'd like to second this and add that this is partly why I feel it's important to develop games directly for all your target platforms rather than target one or two consoles and then port the game.


Because a rule of thumb is to not focus too much on performance in the beginning of a project. Better a completed project with some performance issues than a half-finished product with hyper speed. The key thing with development is to find some kind of balance between all these attributes (stability, performance, look, reusability etc). In the case of Flight Simulator, I'm not sure what the motives were. Surely they had some serious time constraints. I think they did an acceptable job there.


Agreed. As someone who spends most of their time on performance related issues, it's important to keep in mind that sometimes performance issues are strongly tied to the architecture of the program. If the data structures aren't set up to take advantage of the hardware, you'll never get the results you're hoping for. And setting up data structures is often something that needs to be thought about at the beginning of the project.


Completely agree on the rule of thumb, and I can't doubt they had their motives. It is in no way that simple.

Then again, isn't it like game development 101?

Imagine releasing a game that looks stunning, and the industry agrees that it pushes the limits of a modern gaming PC (hence runs poorly on old machines). Fast forward some time - "oh BTW, we did it incorrectly (we had our motives), now you can run it on an old machine just fine, no need to buy an i9 or anything".


Because it's never that obvious, it's not really 101 of gamedev to move everything to the GPU.

You know your target specs but where the bottlenecks are and how to solve them will be constantly shifting. At some point the artists might push your lighting or maybe it's physics now, maybe it's IO or maybe networking. Which parts do you move to the GPU?

Also, a GPU is not a magic bullet. It's great for parallel computation, but not all problems can be solved like that. It's also painful to move memory between the CPU and the GPU, and it's a limited resource - you can't have everything there.


True. Performance is not an issue if no one is playing it.


I'll reiterate a rant about Flight Simulator 2020 here because it's on-topic.

It was called "download simulator" by some, because even the initial installation phase was poorly optimised.

But merely calling it "poorly optimised" isn't really sufficient to get the point across.

It wasn't "poor", or "suboptimal". It was literally as bad as possible without deliberately trying to add useless slowdown code.

The best equivalent I can use is Trivial FTP (TFTP). It's used only in the most resource-constrained environments where even buffering a few kilobytes is out of the question. Embedded microcontrollers in NIC boot ROMs, that kind of thing. It's literally maximally slow. It ping-pongs for every block, and it uses small blocks by default. If you do anything to a network protocol at all, it's a step up from TFTP. (Just adding a few packets' worth of buffering and windowing dramatically speeds it up, and this enhancement thankfully did make it into a recent update of the standard.)
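To make the TFTP comparison concrete, here's roughly what lockstep transfer looks like (illustrative C, not a real TFTP implementation; `send_block` and `wait_for_ack` are hypothetical stand-ins, stubbed so the sketch compiles):

  #include <stddef.h>

  enum { BLOCK_SIZE = 512 };   /* TFTP's classic default block size */

  /* Hypothetical transport hooks, stubbed out for illustration. */
  void send_block(const unsigned char *data, size_t len) { (void)data; (void)len; }
  void wait_for_ack(void) { /* in real TFTP: block until the ACK arrives */ }

  /* One block in flight at a time: every 512 bytes pays a full round trip,
     so throughput is capped at roughly BLOCK_SIZE / RTT no matter how fast
     the link is. Windowing (several blocks in flight before an ACK is
     required) is what removes that cap. */
  void transfer_lockstep(const unsigned char *data, size_t len) {
      for (size_t off = 0; off < len; off += BLOCK_SIZE) {
          size_t n = (len - off < BLOCK_SIZE) ? (len - off) : BLOCK_SIZE;
          send_block(data + off, n);
          wait_for_ack();
      }
  }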

People get bogged down in these discussions around which alternate solution is more optimal. They're typically arguing over which part of a Pareto frontier they think is most applicable. But TFTP and Microsoft FS2020 aren't on the Pareto frontier. They're in the exact corner of the diagram, where there is no curve. They're at a singularity: the maximally suboptimal point (0,0).

This line of thinking is similar to the "Toward a Science of Morality" by the famous atheist Sam Harris. He starts with a definition of "maximum badness", and defines "good" as the direction away from it in the solution space. Theists and atheists don't necessarily have to agree with the specific high-dimensional solution vector, but they have to agree that that there is an origin, otherwise there's no meaningful discussion possible.

Microsoft Terminal wasn't at (0,0) but it was close. Doing hilariously trivial "optimisations" would allow you to move very much further in the solution space towards the frontier.

The Microsoft Terminal developers (mistakenly) assumed that they were already at the Pareto frontier, and that the people that opened the Github Issue were asking them to move the frontier. That does usually require research!


I thought it looks more like 'no one touches it at all after the functional part is finished'. It is insane that someone still does decompression and downloading in the same thread, blocking the download while unzipping the resource, in 2021. Even a 2000s bash programmer knows you'd better not do them in sequence or it will be slow...


It also downloads hundreds of thousands of tiny files over a high-latency connection.

At least initially, it didn't use the XBox CDN either.


The damn downloader in MSFS is the most infuriating thing. In Canada, on either of the main ISPs, I top out at 40-ish Mbps, whereas Steam and anything else really does close to the full 500Mbps. It also only downloads sequentially, pausing to decrypt each tiny file. And the updates are huge, so it takes a good long while to download 2+GB.


Sometimes you don't know you have a performance problem until you have something to compare it to.

Microsoft's greatest technical own goal of the 2010s was WSL 2.

The original WSL was great in most respects (authentic to how Windows works; just as Windows NT has a "Windows 95" personality, Windows NT can have a "Linux" personality) but had the problem that filesystem access went through the Windows filesystem interface.

The Windows filesystem interface is a lot slower for metadata operations (e.g. small files) than the Linux filesystem interface and is unreformable because the problem is the design of the internal API and the model for security checking.

Nobody really complained that metadata operations in Windows were slow, they just worked around it. Some people, though, were doing complex build procedures inside WSL (building a Linux kernel, say), and it was clear then that there was a performance problem relative to Linux.

For whatever reason, Microsoft decided this was unacceptable, so they came out with WSL 2 which got them solidly into Kryptonite territory. They took something which third party vendors could do perfectly well (install Ubuntu in a VM) and screwed it up like only Microsoft can (attempt to install it through the Windows Store, closely couple it to Windows so it almost works, depend on legitimacy based on "it's from Microsoft" as opposed to "it works", ...)

Had Microsoft just accepted that metadata operations were a little bit slow, most WSL users would have accepted it, the ones who couldn't would run Ubuntu in a VM.


WSL2 worked for me in a way that WSL1 did not and it had to do with build times while doing tutorial projects. I am not an expert, but my own experience was that it was a massive improvement.


Parent is not refuting that WSL2 performed better than WSL1, they're arguing that a reasonable response to WSL1 giving you slow build times might have simply been to use a VM instead.

Microsoft being Microsoft, they didn't want people like you to hop to VMware or VirtualBox and use a full, branded instance of Fedora or Ubuntu, because then you would realize that the next hop (moving to Linux entirely) was actually quite reasonable. So they threw away WSL1 and built WSL2. Obviously WSL2 worked better for you than WSL1, but you also did exactly what Microsoft wanted you to do, which is to their benefit, and not necessarily to yours.


Additionally, the problems that held back WSL1 performance still exist, and WSL1 wasn't their only victim. So Microsoft has abandoned WSL1, but they still need to address those underlying problems. Only now, if they do successfully deal with those issues (most likely as part of their DirectStorage effort and cloning io_uring), we're unlikely to see WSL1 updated to benefit—even though that might result in WSL1 becoming a better user experience than WSL2.


What's wrong with the performance? I don't notice any performance issues in Windows terminal.


Do you use colored text tags? The performance issues only happened when that was on.


Yes


I just launched "dir"

https://i.imgur.com/lkbOR3i.png

can't even print properly the decimal separator.

maybe it wasn't that easy.


Now what? What now? How about now? Just built it from source.

https://imgur.com/dGI5W2S


Oh and you can change font

Here's with Courier New: https://imgur.com/t88wwTx

EDIT: And here's with JetBrains Mono: https://imgur.com/HYEYtGb

EDIT: Do I need to make a font parade?

EDIT: Consolas! https://imgur.com/klPufrd Where can we go without the classic? IBM Plex Mono https://imgur.com/e5tQ0NB


The font, m8, the font. Use a monospace font and it works like a charm.


Does the GP's chosen font work correctly in Windows Terminal, though? If so, then that proves that there is indeed more to a fully functional terminal renderer than refterm covers.


Refterm is not a full-featured terminal in terms of configurability, but it has all the features needed for rendering. Configuration like choosing fonts, choosing colors, tabs, whatever - those are misc features, which are unrelated to rendering. The case here is about rendering, and that's exactly what is shite in every terminal emulator. I don't understand why everyone is arguing about it anyway. Refterm provides a fix; Windows Terminal should simply implement it. Is this about dignity or what? Are you not engineers? Should you not prioritise software quality above everything else?

EDIT: Removed the argument about ease of development, because refterm is easy.

This bugs me every time. "Wow, this software works so good, but we are not gonna make our software like that, no, we'll stick to our shite implementation."


The fact that Casey Muratori's proposed approach requires the terminal to reimplement the process of correctly mapping characters to glyphs - including stuff like fallbacks to other fonts - is a huge part of the argument for why it's much harder to implement and more complicated than he claims. If it really doesn't do that right for something as simple as a decimal separator for the font some random HN commenter happened to use, that does tend to suggest the Microsoft employees are in the right here.


The "font some random HN commenter happened to use" is some f****** proportional Calibri. I want to see someone use it in any terminal emulator. Refterm defaults to Cascadia Code, but, fair enough, it doesn't have fallback yet.

Its description also says: "Reference monospace terminal renderer". "Monospace" is there for a reason.

It's worth mentioning, though, that Windows Terminal also defaults to Cascadia Code, and Cascadia Code was installed automatically on my machine, so it's de facto the new standard monospace font on Windows starting from 10.


> some f*** proportional Calibri Refterm defaults to Cascadia Code

cascadia is as broken as the other font, so what now?

https://i.imgur.com/WeV8Ror.png

maybe writing a unicode renderer isn't that easy? maybe drop the attitude?


>maybe drop the attitude?

You came here with an attitude.

> I just launched "dir" https://i.imgur.com/lkbOR3i.png can't even print properly the decimal separator. maybe it wasn't that easy.


Turns out it defaults to Cascadia Mono, my bad. Still, your argument is wrong because, while it doesn't work on your machine, on mine it does.


Which is exactly the point: the number of shortcuts taken to convert code points to fast graphics makes it just a nice hack thrown together, and the devs were right to prefer the slow but correct approach.


That is a terminal bug and not a rendering bug though, since the problem was that the terminal didn't properly fetch your user settings here. Feeding the same character into the other renderer would cause the same issue.

Nobody said he made a fully functional better terminal, just that the terminal rendering was better and functional. Doing everything needed for a fully functional terminal is a lot of work, but doing everything needed for terminal rendering isn't all that much work.


> Nobody said he made a fully functional better terminal

> The "complaining developer" produced a proof of concept in just two weekends that notably had more features[1] and was more correct than the Windows Terminal!

easily falsifiable bullshit found to be false.

https://news.ycombinator.com/item?id=28744084


I have to say, though, that the responder to the first comment did a bad job of conveying what exactly refterm is... Apparently people mistakenly think that refterm is a terminal emulator and use "terminal" and "terminal renderer" interchangeably, when it isn't one.

If you were to open refterm once, you'd see text that explicitly states "DO NOT USE IT AS A TERMINAL EMULATOR", or something like that (can't open it now to copy the exact wording).


You can't use a non-monospace font on a tiled space. That literally makes no sense. Of course it won't look right. This is like asking why you can't use `out` parameters on an inherently async function.


Whether the font is monospace or not isn't really the problem - that causes some aesthetically ugly spacing, but that's to be expected and it's still readable. The big issue is that the code has completely failed to find a glyph for one of the characters used in something as commonplace as a directory listing from the dir command, and people expect better than this from font rendering in modern applications.


If you copy and paste, the characters are there. The problem is just that the glyph is rendered off the tile.


Then again, refterm doesn't f*ck it up. It still renders the proportional font, just aligned to tiles. It's essentially doing the same thing it does for emoji.


As far as I understood from the video about refterm, the speedup is mostly due to not rendering literally every frame in sequence (after all, who needs that), which is what Windows Terminal seems to be doing.

That seems like it would be unaffected by correcting font rendering.


> That seems like it would be unaffected by correcting font rendering.

It is.

"... extremely slow Unicode parsing with Uniscribe and extremely slow glyph generation with DirectWrite..."

Glyph generation is about rasterising the text; you can't just hand the GPU the font file.


This is some schoolyard level stuff right here. The GP isn't using a monospaced font. Who, in the history of terminal emulators, has wanted to use a non-monospaced font in their terminal?


This is exactly the sort of "we can just skip that feature to make it faster!" edge case that I was talking about in my post.


But it isn't an edge case! It's not an edge case if it isn't a use case!

This is an edge case as much as building a rasterizer directly into the terminal is an "edge case".


> But it isn't an edge case! It's not an edge case, if it isn't a use case!

The fact that a random commenter on HN used a non-monospaced font with refterm actually makes it a use case.

I do, however, agree that it is an edge case with a very low probability.


Because it's such an improbable edge case, it seems like it's not relevant to the more general discussion of "does refterm's speed and features actually show that the rendering problem is far easier than the Microsoft developers made it out to be".

The Microsoft terminal doesn't render monospaced fonts, the overwhelmingly common case, nearly as fast as refterm. If rendering variable-width fonts is somehow intrinsically, insanely expensive for some reason (which I haven't seen anyone provide good evidence for), then a good implementation would still take refterm's fast monospaced rendering path for monospaced fonts and fall back to a slower implementation for variable-width fonts.

That is - refterm's non-existent variable-width font rendering capabilities do not excuse the Windows terminal's abysmal fixed-width font rendering capabilities.


Agreed. It doesn't seem like it is relevant. My comment was more addressing that it _is_ an edge-case, albeit a very unlikely one.


Wait, what? Your edge case is something that no one would ever (should ever?) do? Are you going to complain about it not rendering script fonts correctly either?

Also, it's worth noting that this isn't a compelling argument in the first place because the windows terminal doesn't even come close to rendering readable Arabic, it fucks up emoji rendering, etc – all cases that Casey was able to basically solve after two weekends of working on this.


In my 25 years of software development I've found that I'm rarely able to enumerate what an application should or shouldn't do on my own. Unless the app is extremely simple and has very few options, it's incredibly unusual for any individual to understand everything about it. While I don't have a use case where I'd want to use a variable-width font in a terminal, that doesn't mean such a case doesn't exist for anyone. Maybe some people want that feature.

Windows Terminal, for some reason, gives users the option to change their font to one that isn't monospaced, so I'd argue that it should render them correctly if the user chooses to do that.


Would you actually...?



doesn't work even with the proper font https://i.imgur.com/WeV8Ror.png


You're not arguing in good faith.

Casey threw something together in a matter of days that had 150% of the features of the Windows Terminal renderer, but none of the bug fixing that goes into shipping a production piece of software.

That screenshot you keep parading around shows a small issue with a quick fix. It's not like Casey's approach is inherently unable to deal with punctuation!

You don't discard the entire content of a scientific journal paper because of a typo.

"Sorry Mr Darwin, I'm sure you believe that your theory is very interesting, but you see here on page 34? You wrote 'punctuted'. I think you meant 'punctuated'. Submission rejected!"


In this case, the bug fixing is probably the lion's share of the work though - there's a huge amount of subtle edge cases involved in rendering text, and the Microsoft employees almost certainly know this. And the example that broke it isn't even something particularly obscure. We're literally talking about the output of the dir command, one of the first things someone is likely to do with a terminal window, not displaying correctly. He basically did the easy part of the work and lambasted some Microsoft employees as idiots because they thought it was more complex than that.


In Casey's defense (I'm ambivalent on this one), while the dir command itself isn't obscure, one could argue that using a no-op Unicode character as the digit group separator is an obscure case, at least for an American programmer. But I think your overall point still stands.


You've lost the original point: everyone was pretending this refterm was ready to replace the terminal app, criticizing Microsoft for taking the slow-but-sure approach:

> The "complaining developer" produced a proof of concept in just two weekends that notably had more features[1] and was more correct than the Windows Terminal!

But now, apparently, pointing out that "MS was right not to want to take shortcuts in Unicode rendering" has morphed into "criticizing refterm in bad faith for not being production-ready".

Who's not arguing in good faith here?


>refterm was ready to replace the terminal app

Considering Casey himself puts front and center the disclaimer that this is solely intended to be a reference, and goes into as much detail in his videos, I don't know where you got this from. I don't think anyone is under the illusion that this could replace the actual terminal. It's just meant to show that there's a minimum level of performance to be expected for not a huge amount of effort (a couple of weekends' worth) and there is no excuse for less.


> everyone was pretending this refterm was ready to replace the terminal app

Who said that? Refterm isn't a fully functional terminal, it is just a terminal renderer bundled with a toy terminal.


I haven't changed anything, just downloaded and launched it; that's the result. If the terminal only works with one font, why is the software picking a random one from the system config?


It looks like refterm is hard-coded to use Cascadia Mono, which isn't included in-box with Windows 10. So I don't know what happens if you don't have that font. If that's the only issue, then I think we can let that one go, as refterm is clearly only a proof of concept, and one-time logic for choosing the correct font at startup would presumably have no effect on rendering speed.


https://i.imgur.com/WeV8Ror.png

No, it doesn't work with Cascadia either.

so what now?


I suspect an i18n issue. What locale are you using, and what is the decimal separator character supposed to be in your locale?


https://i.imgur.com/WrsVwFz.png

absolutely standard


It looks like your "Digit grouping symbol" field is empty. I'm sure that's standard in some locales, though not for US English. I don't know how to make that field empty; when I try, Windows says it's invalid. So I wonder if your locale sets that separator to some kind of Unicode character that, in a proper renderer, is equivalent to no character at all. If that's the case, then I'm guessing refterm could handle that character as easily as it handles VT escape codes. But this does lend some weight to the position that Casey was oversimplifying things a bit.


Well, whatever is different in your settings, mine renders normally: https://imgur.com/dGI5W2S

EDIT: I am sorry for the attitude, changed "wrong" to "different"


Their settings aren't wrong, just different, likely because of differing standards for digit grouping across locales. So this is a case that refterm clearly doesn't support. This case by itself doesn't invalidate refterm's approach to rendering, but I can see why the team at Microsoft, knowing that there are many such cases, would favor abstraction over the absolute best possible speed.


which is exactly the point.


A pretty one-sided view. I use Windows Terminal because it supports multiple tabs - multiple cmds and some WSL bashes.

I don't care at all if this or that terminal uses a bit more RAM or is a few milliseconds faster.


Then they should have said so on the issue - "We don't value performance and won't spend any resources to fix this" - rather than do the dance of giving bullshit reasons for not doing it.

Anyway, resource usage matters quite a lot: if your terminal uses CPU like an AAA game running in the background, you will notice fan noise, degraded performance, and potentially crashes due to overheating everywhere else on the computer.


> I don't care at all if this or that terminal uses a bit more RAM or is a few milliseconds faster

Did you watch the video? The performance difference is huge! 0.7 seconds vs 3.5 minutes.


> The "complaining developer" produced a proof of concept in just two weekends...

That developer was also rather brusque in the github issue and could use a bit more humility and emotional intelligence. Which, by the way, isn't on the OP blog post's chart of a "programmer's lifecycle". The same could be said of the MS side.

Instead of both sides asserting (or "proving") that they're "right" could they not have collaborated to put together an improvement in Windows Terminal? Wouldn't that have been better for everyone?

FWIW, I do use windows terminal and it's "fine". Much better than the old one (conhost?).


> could they not have collaborated to put together an improvement in Windows Terminal?

My experience with people that want to collaborate instead of just recognizing and following good advice is that you spend a tremendous amount of effort just to convince them to get their ass moving, then find out they were not capable of solving the problem in the first place, and it’s frankly just not worth it.

Much more fun to just reimplement the thing and then say “You were saying?”


Haha, I just saw the YouTube video with the developer demoing his project, and the text in his terminal literally reads in Hindi: "you can wake up someone who is asleep, but how do you wake up someone who is just closing his eyes and pretending to be asleep?"

The developer surely was having a tonne of fun at the expense of Microsoft. Perhaps a little too much fun imo.


> Much more fun to just reimplement the thing and then say “You were saying?”

The thing is, NO ONE likes to lose face. He could have still done what he did (and enjoyed his "victory lap") but in a spirit of collaboration.

To be fair, the MS folks set themselves up for this, but the smart alec could have handled it with more class and generosity.


This was purely self-inflicted.


It’s often easier to put together something from scratch, if you’re trying to prove a point, than it is to fix a fundamentally broken architecture.


That sounds like you've never seen performance of a heavily worked-on subsystem increase by 10x because one guy who was good at that kind of stuff spent a day or two focused on the problem.

I've seen that happen at least 10 times over my career, including at very big companies with very bright people writing the code. I've been that guy. There are always these sorts of opportunities in all but the most heavily optimized codebases; most teams either a) just don't have the right people to find them, or b) have too much other shit on fire to let anyone except a newbie look at stuff like performance.


More generally, in my experience performance isn't looked at because it's "good enough" from a product point of view.

"Yes it's kinda slow, but not enough so customers leave so who cares." Performance only becomes a priority when it's so bad customers complain loudly about it and churn because of it.


There's a bit of incentive misalignment where commercial software performance is concerned. If we presume customers with tighter budgets tend to be more vocal and require more support, and customers on slower machines are often customers on tighter budgets, the business as a whole might actually not mind those customers leaving, as it'd mean fewer support resources spent on customers who are harder to upsell to.

Meanwhile, the majority of customers with faster machines are not sensitive enough to feel or bother about the avoidable lag.


Though we're now in a situation where lots of software (see: Word and Excel) is painfully slow even on high-end desktop hardware.


That is probably why there's a law, Wirth's law, about the Wintel ecosystem: "What Andy giveth, Bill taketh away."

Or Gates's law: "The speed of software halves every 18 months."


There's also sometimes an incentive to slow things down: if it is too fast, the client will perceive that they paid too much money for an operation that takes no time, i.e. one so instant it seems unimportant, as if it doesn't exist.


It would be a shame for a truly artistically designed busy-wheel not to get to turn once or twice, regardless of whether it was actually needed!


I've made such optimizations and others made them in my code, so:

c) slow code happens to everyone; sometimes you need a fresh pair of eyes.


Absolutely: what I should have said is that I've been the one to cause performance problems, I've been the one to solve them, I've been the manager who refused to allocate time to them because they were not important enough, and I've been the product owner who made the call to spend eng hours on them because they were. There are many systemic reasons why this stuff does not get fixed and it's not always "they code like crap", though sometimes that is a correct assessment.

But show me a codebase that doesn't have at least a factor of 2 improvement somewhere and does not serve at least 100 million users (at which point any perf gain is worthwhile), and I'll show you a team that is being mismanaged by someone who cares more about tech than user experience.


"Need" especially, because often it's just that those fresh eyes don't have any of the political history, so doesn't have any fallout from cutting swaths through other people's code that would be a problem for someone experienced in the organisation.


I’ve seen it at least as many times, too. Most of the time, the optimization is pretty obvious. Adding an index to a query, or using basic dynamic programming techniques, or changing data structures to optimize a loop’s lookups.

I can’t think of a counter example, actually (where a brutally slow system I was working on wasn’t fairly easily optimized into adequate performance).


It is nice when a program can be significantly sped up by a local change like that, but this is not always the case.

To go truly fast, you need to unleash the full potential of the hardware, and doing that can require re-architecting the system from the ground up. For example, both postgres and clickhouse can do `select sum(field1) from table group by field2`, but clickhouse will be 100x faster, and no amount of micro-optimization in postgres will change that.
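
A toy illustration of the point, assuming nothing about either database's internals: the same `sum(field1) ... group by field2` over a row layout versus a column layout. The columnar loop only touches the two fields it needs, which is the kind of advantage no local micro-optimization of the row layout can recover. All names and sizes below are made up for the sketch.

  /* Toy illustration of row vs column layout for `sum(field1) group by field2`.
     Assumes nothing about postgres or clickhouse internals; it only shows that
     the columnar loop drags far less memory through the cache per summed value. */
  #include <stdint.h>
  #include <stdio.h>

  #define N 100000
  #define GROUPS 16

  typedef struct { int64_t field1; int32_t field2; char payload[52]; } Row;

  static Row rows[N];                   /* row store: 64 bytes per row */
  static int64_t col_field1[N];         /* column store: 8 bytes per value */
  static int32_t col_field2[N];

  int main(void)
  {
      for (int i = 0; i < N; i++) {
          rows[i].field1 = col_field1[i] = i;
          rows[i].field2 = col_field2[i] = i % GROUPS;
      }

      int64_t sums_row[GROUPS] = {0}, sums_col[GROUPS] = {0};

      /* Row layout: every row drags its unused payload through the cache. */
      for (int i = 0; i < N; i++)
          sums_row[rows[i].field2] += rows[i].field1;

      /* Column layout: two dense arrays, friendly to caches and SIMD. */
      for (int i = 0; i < N; i++)
          sums_col[col_field2[i]] += col_field1[i];

      printf("%lld %lld\n", (long long)sums_row[0], (long long)sums_col[0]);
      return 0;
  }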


No argument from me. I'm just pointing out that it's wrong to assert that programmers are generally incorrect when they say something can be optimized. Many times in my career, optimizations were trivial to implement, and the impact was massive. There have been a few times when the optimization was impossible without a major rearchitecture, but those times were rare.


Yah, I was going to say something like this. I've fixed these problems a few times, and I don't really think any of them were particularly hard. That's because if you're the first person to look at something with an eye to performance, there is almost always some low-hanging fruit that will gain a lot of perf. Being the 10th person to look at it, or attacking something that is widely viewed as algorithmically at its limit, OTOH, is a different problem.

I'm not even sure it takes an "experienced" engineer: one of my first Linux patches simply removed a goto, which dropped an exponential factor from something and changed it from "slow" to imperceptible.


I don't know about you, but I was really laughing out loud reading that GitHub conversation.

GPUs: able to render millions of triangles with complex geometrical transformations and non-trivial per-pixel programs in real time

MS engineers: drawing colored text is SLOW, what do you expect

P.S. And yes, I know, text rendering is a non-trivial problem. But it is a largely solved problem. We have text editors that can render huge files with real-time syntax highlighting, browsers that can quickly lay out much more complex text, and, obviously, Linux and Mac terminal emulators that somehow have no issue whatsoever rendering large amounts of colored text.


To be fair to the MS engineers, from their background experience with things like DirectWrite, they would have an ingrained rule of thumb that text is slow.

That's because it is slow in the most general case: if you have to support arbitrary effects, transformations, ligatures, subpixel hinting, and smooth animations simultaneously, there's no quick and simple approach.

The Windows Terminal is a special case that doesn't have all of those features: no animation and no arbitrary transforms dramatically simplify things. Having a constant font size helps a lot with caching. The regularity of the fixed-width font grid placement eliminates kerning and any code path that deals with subpixel-level hinting or alignment. Etc...

It really is a simple problem: it's a grid of rectangles with a little sprite on each one.
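
To make that concrete, here is a minimal sketch (not refterm's or Windows Terminal's actual code) of the glyph cache such a grid enables. rasterize_glyph() is a stand-in for whatever the platform text stack would provide; the point is that each glyph is rasterized once, and every later frame is just lookups and blits.

  /* Minimal sketch of a glyph cache for a fixed-cell terminal renderer.
     rasterize_glyph() is hypothetical; a real renderer would call the OS
     text stack (DirectWrite, FreeType, ...) there, once per new glyph. */
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  #define CELL_W 8
  #define CELL_H 16
  #define CACHE_SLOTS 1024            /* direct-mapped: codepoint % CACHE_SLOTS */

  typedef struct {
      uint32_t codepoint;             /* 0 = empty slot */
      uint8_t  pixels[CELL_W * CELL_H];
  } CachedGlyph;

  static CachedGlyph cache[CACHE_SLOTS];
  static int cache_misses;

  /* Stand-in rasterizer: just fills the tile with a value derived from cp. */
  static void rasterize_glyph(uint32_t cp, uint8_t *out)
  {
      memset(out, (int)(cp & 0xFF), CELL_W * CELL_H);
  }

  static const uint8_t *lookup_glyph(uint32_t cp)
  {
      CachedGlyph *slot = &cache[cp % CACHE_SLOTS];
      if (slot->codepoint != cp) {    /* miss: rasterize once, reuse afterwards */
          slot->codepoint = cp;
          rasterize_glyph(cp, slot->pixels);
          cache_misses++;
      }
      return slot->pixels;
  }

  int main(void)
  {
      /* Redraw an 80x24 grid of the same few characters many times:
         after the first frame every lookup is a cache hit. */
      for (int frame = 0; frame < 1000; frame++)
          for (int cell = 0; cell < 80 * 24; cell++)
              lookup_glyph((uint32_t)('A' + (cell % 26)));
      printf("glyphs rasterized: %d (out of %d lookups)\n",
             cache_misses, 1000 * 80 * 24);
      return 0;
  }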


Supercomputer or not, it's a terminal.

In the real-CRT-terminal days of the 1970s and 1980s, the interface to the local or remote mainframe or PC was of course I/O-bound, but not much else could slow it down.

UI elements like the keyboard/screen combo have been expected to perform at the speed of light for decades using only simple hardware to begin with.

The UX of a modern terminal app should ideally be not much different from a real CRT unit, unless the traditional keyboard/display UI can actually be improved in some way.

Even adding a "mouse" didn't slow down the Atari 400 (which was only an 8-bit personal computer) when I programmed it to use the gaming trackball to point & click plus drag & drop. That was using regular Atari Basic, no assembly code. And I'm no software engineer.

A decade later once the mouse had been recognized and brought into the mainstream it didn't seem to slow down DOS at all, compared to a rodent-free environment.

Using modern electronics, surely there should not be any perceptible lag compared to non-intelligent CRTs over dial-up.

Unless maybe the engineers are not as advanced as they used to be decades ago.

Or maybe the management/approach is faulty, all it takes is one non-leader in a leadership position to negate the abilities of all talented operators working under that sub-hierarchy.


Exactly. Fast terminal rendering on bitmap displays has been a solved problem for at least 35 years. Lower resolutions, sure, but also orders of magnitude slower hardware.


It's more subtle than that. What the Microsoft engineers are saying is that the console's current approach to drawing text is inherently slow in this particular case, due to the way the text drawing library it's based on uses the GPU. The proposed solution requires the terminal to have its own text drawing code specific to the task of rendering a terminal, including handling all the nasty subtleties and edge cases of Unicode, which must be maintained forever. This is not trivial at all; every piece of code ever written to handle this seems to end up having endless subtle bugs involving weird edge cases (remember all those stories about character strings that crash iPhones and other devices - and the open source equivalents are no better). It's relatively easy to write one that seems to work for the cases that happen to be tested by the developer, but that's only a tiny part of the work.


I'm fairly sure no one in question wrote a new font renderer; they just rendered the fonts up front with a system library, uploaded the result to the GPU, and let it be used as a bitmap.

Text rendering is still done mostly on the CPU side in the great majority of applications, since vector graphics are hard to do efficiently on GPUs.


Simply shaping text using state-of-the-art libraries (like HarfBuzz) can take an INCREDIBLE amount of time in some cases. If you're used to rendering text in western character sets you may think it can always be fast, but there are cases where it's actually quite slow! You get a sense for this if you try to write something like a web browser or a word processor and have to support people other than github posters.

Of course in this case it seems like it was possible to make it very fast, but people who think proper text rendering is naturally going to be somewhat slow aren't always wrong.

Saying that text rendering is "largely solved" is also incorrect. There are still changes and improvements being made to the state of the art and there are still unhappy users who don't get good text rendering and layout in their favorite applications when using a language other than English.


> naturally going to be somewhat slow

Naturally slow for a text renderer might mean it renders in 4 ms instead of 0.1 ms.


Yes, and for a terminal like the one discussed in the article, 4 ms is considered unacceptably slow.


I dunno what framerate you expect your terminal to run at, but 250 Hz should be enough for everyone.


You are right in the general case. But terminals are a specific niche that doesn't require the full range of text-rendering edge cases a browser, WYSIWYG editor, etc. have to handle. A terminal renders "strictly"* monospaced fonts, which makes it trivial to cache and parallelize.

* as was brought up, one might use a non-monospace font, but that case can just take the slow path and let "normal" people have a fast terminal


I understand the scepticism about such claims, but Casey's renderer is not a toy, and it handles a number of quite difficult test cases correctly. He solicited feedback from a sizeable community to try to break his implementation. The code is available here: https://github.com/cmuratori/refterm


From the refterm README:

refterm is designed to support several features, just to ensure that no shortcuts have been taken in the design of the renderer. As such, refterm supports:

* Multicolor fonts

* All of Unicode, including combining characters and right-to-left text like Arabic

* Glyphs that can take up several cells

* Line wrapping

* Reflowing line wrapping on terminal resize

* Large scrollback buffer

* VT codes for setting colors and cursor positions, as well as strikethrough, underline, blink, reverse video, etc.


The really hard part of writing a terminal emulator, at least from my experience working on Alacritty, is fast scrolling with fixed regions (think vim).

Plenty of other parts of terminal emulators are tricky to implement performantly; ligatures are one Alacritty hasn't got yet.


Thanks for the insight.

I have never written a terminal emulator, so could you maybe summarize why fast scrolling with fixed regions is so hard to implement?


Reading the thread itself, it’s a bit of both. Windows Terminal is complex, ClearType is complex and Unicode rendering is complex. That said… https://github.com/cmuratori/refterm does exist, does not support ClearType, but does claim to fully support Unicode. Unfortunately, Microsoft can’t use the code because (a) it’s GPLv2 and (b) it sounds like the Windows Terminal project is indeed a bit more complicated than can be hacked on over a weekend and would need extensive refactoring to support the approach. So it sounds a bit more like a brownfield problem than simply ignoring half the things it needs to do, though it probably does that too.


> Unfortunately, Microsoft can’t use the code

As good as Casey Muratori is, Microsoft is more than big enough to have the means to take his core ideas and implement them themselves. It may not take them a couple of weekends, but they should be able to spend a couple of experienced man-months on this.

The fact that they don't can only mean they don't care. Maybe the people at Microsoft care, but clearly the organisation as a whole has other priorities.

Besides, this is not the first time I've seen Casey complain about performance in a Microsoft product. Last time it was about boot times for Visual Studio, which he uses to debug code. While reporting performance problems was possible, the form only had "less than 10s" as the shortest boot time you could tick. Clearly, they considered that if VS booted in 9 seconds or less, you didn't have a performance problem at all.


> Unfortunately, Microsoft can’t use the code

I commented on a separate issue re: refterm

--- start quote ---

Something tells me that the half-a-dozen to a dozen of Microsoft developers working on Windows terminal:

- could go ahead and do the same "doctoral research" that Casey Muratori did and retrace his steps

- could pool together their not insignificant salaries and hire Casey as a consultant

- ask their managers and let Microsoft spend some of those 15.5 billion dollars of net income on hiring someone like Casey who knows what they are doing

--- end quote ---


> Unfortunately, Microsoft can’t use the code because (a) it’s GPLv2

One thing to remember is that it is always possible and acceptable to contact the author of a GPL-licensed piece of code to enquire whether they would consider granting you a commercial license.

It may not be worthwhile, but if you find exactly what you're looking for, and it would take you months to develop yourself, then it may very well be.


Not always. GPL-licensed code does not have to have a single "author". There may be hundreds of copyright holders involved (IIRC, Netscape(?) spent years looking for people who had to agree when it planned to change its license, and rewriting parts written by people who didn't).


Why talk in such generalities? Look at the github repo: there are only three committers to Casey's repo. I'm sure Microsoft could manage to contact them. I'm also quite sure that Microsoft has the money to secure a commercial license if they so wished.


> Look at the github repo. There are only three committers to Casey's repo. I'm sure Microsoft could manage to contact them.

Microsoft's attitude towards the code seems a little odd. [0]

  > Unfortunately the code is intentionally GPLv2 licensed and we'll honor this wish entirely. As such no one at Microsoft will ever look at either of the links.

Given that WSL exists I can't imagine this is a universal policy towards reading GPLv2 code at Microsoft.

[0] https://github.com/microsoft/terminal/issues/10462#issuecomm...


Yeah, the attitude doesn't really make any sense. How does the license preclude them from looking at the code? They can download it, compile it, and even run it _without_ accepting the license. They only need to care about the license if they decide to distribute it.


Blanket policy to prevent claims of "you looked at our code and stole $importantDetail": only let people look at code you can safely admit to using in your product.


Because the comment I replied to made the generic claim (emphasis added) “One thing to remember is that it is always possible and acceptable to contact the author of a GPL-licensed piece of code”.


Sure, 'the author' may be a number of people collectively, and in that case it's probably not worth bothering.


> (a) it’s GPLv2

Why is that a problem? A GPLv2 terminal would not be a business problem for Microsoft. People would still have to buy licenses for Windows. Maybe they would lose a little face, but arguably they have already done so.

At least it's not GPLv3, which this industry absolutely and viscerally hates (despite having no problem with Apache 2.0 for some reason; Theo de Raadt is at least consistent).


If Microsoft embedded the GPLv2 terminal into Windows, Windows would have to be released under the GPLv2 (or a compatible license). I assume they don't want that.

They can alternatively buy a commercial license, as another user said below.


You should read up on the "mere aggregation" clause of the GPLv2. It allows an OS to include a GPLv2 program without having to put the entire OS under the GPLv2. If the GPLv2 did function the way you seem to think it does, then almost every Linux distro would be in violation, too.


Thanks, I think this is a very important point that I totally missed.


> Unfortunately, Microsoft can’t use the code because (a) it’s GPLv2

That's not unfortunate. Having people who work on competing Free Software is a good thing. It would be even better if Microsoft adopted this code and complied with the terms of the GPL license. Then we wouldn't have to deal with problems like these, because they'd be nipped in the bud. And we would set a precedent for taking care of a lot of other problems, like malware, telemetry, and abuse of users' freedoms.


It's the hardest thing about building perf-related PoCs. Every time I've built a prototype to prove out an optimization, I've spent the entire duration of the project watching the benefit shrink, stressing that it would dwindle to nothing by the end. So far, I've been lucky and careful enough that I haven't convinced a team to expend massive resources on a benefit that turned out to be fictional, but I've had it happen enough times at the scale of a person-day or two that it always worries me.


Counterexample: WireGuard. Turns out OpenVPN was massive and slow for no reason and it only took one (talented and motivated) man to make a much better version.


> > for an experienced programmer a terminal renderer is a fun weekend project and far away from being a multiyear long research undertaking.

> You shouldn't automatically assume something is actually bad just because someone shows a [vastly] better proof-of-concept 'alternative'.

Apparently you should. I can confirm that the first quote is an appropriate assessment of the difficulty of writing a terminal renderer. Citation: I did pretty much exactly the same thing for pretty much exactly the same reasons when (IIRC gnome-)terminal was incapable of handling 80*24*60 = 115200 esc-[-m sequences per second, and I am still using the resulting terminal emulator as a daily driver years later.
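
For anyone wondering what "esc-[-m sequences" means here: those are the SGR colour escapes, and parsing them is a tiny state machine. A minimal sketch, not the code of my emulator or of any shipping one, recognising only the SGR final byte:

  /* Minimal sketch of parsing ESC [ <params> m colour sequences.
     Real emulators handle many more final bytes and states; this is SGR only. */
  #include <stdio.h>

  static void apply_sgr(int param) { printf("SGR %d\n", param); }

  static void feed(const char *bytes, int len)
  {
      enum { TEXT, ESC, CSI } state = TEXT;
      int param = 0;
      for (int i = 0; i < len; i++) {
          char c = bytes[i];
          switch (state) {
          case TEXT: if (c == 0x1b) state = ESC; break;
          case ESC:  state = (c == '[') ? CSI : TEXT; break;
          case CSI:
              if (c >= '0' && c <= '9') { param = param * 10 + (c - '0'); }
              else if (c == ';')        { apply_sgr(param); param = 0; }
              else { if (c == 'm') apply_sgr(param); param = 0; state = TEXT; }
              break;
          }
      }
  }

  int main(void)
  {
      const char input[] = "\x1b[31;1mred bold\x1b[0m";
      feed(input, (int)sizeof input - 1);   /* prints SGR 31, SGR 1, SGR 0 */
      return 0;
  }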


>> I have no idea if this is the case here, and I suspect it might not be, but pretty much every time I've seen a developer complain that something is slow and then 'prove' that it can be faster by making a proof-of-concept the only reason theirs is faster is because it doesn't implement the important-but-slow bits and it ignores most of the edge cases.

Even in those cases it usually turns out that the handling of edge cases was considered reason enough to sacrifice performance rather than finding a better solution to the edge case. Handling edge cases probably should not cost 10x average performance.


This is addressed in the repo itself; see the "feature support" section [1].

That being said, is anyone aware of a significant missing feature that would impact performance?

[1]: https://github.com/cmuratori/refterm#feature-support


Screen reader support[0] may have a noticeable performance cost.

[0] https://github.com/microsoft/terminal/issues/10528#issuecomm...


Can you explain how screen reader support could possibly have a noticeable performance cost?

The screen reader code should be doing absolutely nothing if it's not enabled - and even if it is, I can't imagine how it could affect performance anyway. For plain text, such as a terminal, all it does is grab text and parse into words (and then the part where it reads the words, but that's separate from the terminal) - I don't see how this is any more difficult than just taking your terminal's array of cell structs, pulling out the characters into a dynamic array, and returning a pointer to that.
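
Concretely, a minimal sketch of that idea, with a made-up Cell struct and visible_text() function standing in for whatever the terminal actually stores; the extraction is one linear pass and costs nothing unless a screen reader asks for it:

  /* Minimal sketch of handing a screen reader the visible text: one pass over
     a hypothetical cell grid, copying characters into a flat buffer. */
  #include <stdio.h>
  #include <stdlib.h>

  typedef struct { char ch; unsigned char fg, bg; } Cell;   /* assumed layout */

  static char *visible_text(const Cell *cells, int cols, int rows)
  {
      char *out = malloc((size_t)rows * (cols + 1) + 1);
      char *p = out;
      for (int r = 0; r < rows; r++) {
          for (int c = 0; c < cols; c++)
              *p++ = cells[r * cols + c].ch;
          *p++ = '\n';
      }
      *p = '\0';
      return out;                 /* caller frees */
  }

  int main(void)
  {
      Cell grid[2 * 4];
      const char *demo = "dir C:\\ ";
      for (int i = 0; i < 8; i++) { grid[i].ch = demo[i]; grid[i].fg = grid[i].bg = 0; }
      char *text = visible_text(grid, 4, 2);
      printf("%s", text);
      free(text);
      return 0;
  }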


Not necessarily. Oftentimes, especially in big corporations, programmers are incentivized to deliver things quickly rather than to provide the optimal solution. Not because they are bad at programming, but because they have quotas and deadlines to meet. Just remember the story of how, in the first version of Excel, a dev hard-coded some of the cell dimension calculations because they were under pressure to close as many tasks as fast as possible.


The one example that comes to mind is file system search.

I am writing an application that displays the file system in the browser in a GUI much like Windows Explorer or OSX Finder. It performs file system search substantially faster than Windows Explorer, even though Windows Explorer is written in a lower-level language with decades of usage and experience behind it, while my application is a one-man hobby project written in JavaScript (TypeScript).

The reason why the hobby project is so substantially faster than a piece of core technology of the flagship product of Microsoft is that it does less.

First, you have to understand how recursive tree models work. You have a general idea of how to access nodes on the tree, but you have no idea what's there until you are close enough to touch it. File system access performance is limited by both the hardware on which the file system resides and the logic of the particular file system type. Those constraints erode away some of the performance benefits of using a lower-level language. Whatever operations you wish to perform must be individually applied to each node, because you have no idea what's there until you are touching it.

Second, because the operations are individually applied to each node, it's important to limit what those operations actually are. My application only searches for a string fragment, the absence of the string fragment, or a regular expression match. Wildcards are not supported, and other extended search syntax is not supported. If you have to parse a rule each time before applying it to a node's string identifier, those are additional operations performed at each and every node in the designated segment of the tree.

For those familiar with the DOM in the browser: it has the same problems because it's also a tree model. This is why querySelector calls are so incredibly slow compared to walking the DOM with the boring old static DOM methods.
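
A minimal sketch of that per-node idea, in C rather than the TypeScript the project actually uses, and over an in-memory tree instead of a real file system (the Node type and search() function are illustrative, not the project's API): the predicate is fixed up front, so each node pays for exactly one substring test.

  /* Minimal sketch of searching a recursive tree with one cheap per-node test. */
  #include <stdio.h>
  #include <string.h>

  typedef struct Node {
      const char *name;
      struct Node *children;
      int child_count;
  } Node;

  /* Apply exactly one operation per node: a substring match decided up front,
     rather than re-parsing a wildcard/rule at every node. */
  static void search(const Node *n, const char *fragment)
  {
      if (strstr(n->name, fragment))
          printf("match: %s\n", n->name);
      for (int i = 0; i < n->child_count; i++)
          search(&n->children[i], fragment);
  }

  int main(void)
  {
      Node docs[] = { { "notes.txt", NULL, 0 }, { "report.pdf", NULL, 0 } };
      Node root  = { "home", docs, 2 };
      search(&root, "not");          /* prints "match: notes.txt" */
      return 0;
  }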


> pretty much every time I've seen a developer complain that something is slow and then 'prove' that it can be faster by making a proof-of-concept the only reason theirs is faster is because it doesn't implement the important-but-slow bits and it ignores most of the edge cases

It's still a good place to start a discussion though. In such a case, apparently someone believes strongly that things can be made much faster, and now you can either learn from that person or explain to them what edge cases they are missing.


Maybe that's what usually happens, but it doesn't apply in this case.


This.

My time library is so much faster and smaller than yours. Timezones? Nah, didn't implement it.

My font rendering is so much simpler and faster than yours. Nah, only 8-bit encodings. Also no RTL. Ligatures? Come on.

The list goes on.


Don't assume. Casey Muratori produced a highly correct Unicode renderer with correct VT code support.

Watch the video: https://www.youtube.com/watch?v=99dKzubvpKE


Except that in this case it's the other way around. This weekend project has better support for things like ligatures, combinations of Arabic characters, RTL, etc.



