I have no idea if this is the case here, and I suspect it might not be, but pretty much every time I've seen a developer complain that something is slow and then 'prove' it can be faster with a proof of concept, the only reason theirs is faster is that it doesn't implement the important-but-slow bits and ignores most of the edge cases. You shouldn't automatically assume something is bad just because someone shows a better proof-of-concept 'alternative'. They may have just ignored half the stuff it needs to do.
The "complaining developer" produced a proof of concept in just two weekends that notably had more features and was more correct than the Windows Terminal!
RefTerm 2 vs Windows Terminal in action: https://www.youtube.com/watch?v=99dKzubvpKE
Refterm 2 source: https://github.com/cmuratori/refterm
One of the previous discussions: https://news.ycombinator.com/item?id=27775268
Features relevant to the debate, at any rate, which was whether it is possible to write a high-performance terminal renderer that also correctly renders Unicode. He didn't implement a lot of non-rendering features, but those are beside the point.
We live in a time where every competent developer is slandered in public if he isn't fully submissive to the great corporate powers.
> Setting the technical merits of your suggestion aside though: peppering your comments with clauses like “it’s that simple” or “extremely simple” and, somewhat unexpectedly “am I missing something?” can be read as impugning the reader. Some folks may be a little put off by your style here. I certainly am, but I am still trying to process exactly why that is.
How is a yes/no question aggressive? At that point the maintainers had two possible responses:
1. Yes you are missing that ...
2. No that is the complete picture.
But they chose to side channel to a third possibility, "we are put-off by your questioning!". Excuse me what?
Have you stopped beating your wife?
More relevantly, when the question is asked genuinely then - as you say - it's expressing an openness to learn.
Sometimes it is asked rhetorically, dripping with sarcasm and derision. In that case, it is clearly not furthering our interest in productive, good-faith discussions.
Far more often, it falls somewhere between those two and - especially in text - is often ambiguous as to which was intended. While we should exercise charity and hope our conversational partners do likewise, it makes sense to understand when some phrases might be misconstrued and perhaps to edit accordingly.
I would say that incredulity falls within the range between "completely inoffensive" and "outright hostile", and very much toward the former side of the scale. It can be hard to distinguish from feigned incredulity, which (while still far from "sarcastic and derisive") makes its way toward the other side somewhat.
It's all a matter of perception and context, of course. And though you say there's only one way to interpret it, even you describe it as a continuum.
Sadly, this is all just a missed opportunity.
MS could have been less defensive and more open to possible solutions. The genius programmer could have slowed his roll a bit and could have been more collaborative.
It's not just "Am I missing something?"
"Am I missing something? Why is all this stuff with "runs of characters" happening at all? Why would you ever need to separate the background from the foreground for performance reasons? It really seems like most of the code in the parser/renderer part of the terminal is unnecessary and just slows things down. What this code needs to do is extremely simple and it seems like it has been massively overcomplicated."
Perhaps frustrated that they don't really seem to be on the same technical page?
I tend to think these things can go both ways. I feel pointing out someone's frustration in writing tends to make things worse. Personally I would just ignore it in this case.
There's a point at which you've moved from "fix this bug" or "evaluate this new component" to "justify the existing design" and "plan a re-architecture".
As for whether it really is "difficult", one has to ask for whom? For someone that is intimately familiar with C++, DirectX, internationalization, the nature of production-grade terminal shell applications and all their features and requirements?
And even if it is "easy", so what? It just means Microsoft missed something and perhaps were kind of embarrassed, that's totally human, it happens. It's not so nice when this stuff is very public with harsh judgement all around.
This all rubs me the wrong way. I have found the Microsoft folks to be very helpful and generous with their attention on Github Issues. They've helped me and many others out, and it has been genuinely refreshing. What this guy did might discourage participation and make folks more defensive, to avoid losing face in a big public way over a mistake or silly gotcha.
As for difficult, the context is very much set from it being a Github Issue on their own repo, meaning there is a certain assumption of skill.
You're cutting Microsoft a lot of slack here, and it feels like you're forgetting that out of this whole transaction MS ends up with free labour and bug-fixes. They chose for the setting to be very public, and they chose to let their employees reply directly to customers with quotes like ["I will take your terse response as accepting the summary.", "somewhat combatively", "peppering your comments with clauses like", "impugning the reader."]. All of which is corporate passive-aggression and (in my mind) vastly more antagonistic than Casey ever was.
He's been like this for years, and that's fine when you are hanging out with your buddies over a beer, but now Casey is a public figure.
Being a public figure means you are not 'every competent developer'. The reason this was made so public wasn't ms employees, it was Casey's followers.
The sequence of events he started here ended with his fans and friends on discord feeling justified (because Casey, not them, was right on a technical level) brigading volunteers and microsoft employees alike until at least one of them quit open source.
A truly ugly conclusion that could have been avoided with a more circumspect approach.
My worry is that Casey did this technical performance for the benefit of his followers, and nothing of value was gained, except of course Casey's growing fame.
Here's a thread on it with other examples: https://news.ycombinator.com/item?id=13940014
They fixed it, but it was a sign of the times. Everything we've used over the decades had to be re-implemented for the web and stuff like Electron, and the people doing the implementing use such powerful machines that they don't even notice a simple blinking cursor devouring CPU or think about its impact on normal computers.
Or not - regardless of what the MS employee claimed, Linux terminals' performance is more than adequate.
Edit: I am speaking of Linux, not WSL, of course.
Ok, sarcasm off.
This attitude is utterly toxic. People who are ignorant of how fast their software could be do not deserve abuse from strangers.
That's not the only valid way to frame the situation. At some level, professional software developers have a responsibility to be somewhat aware of the harm their currently-shipping code is doing.
More broadly, I'd much rather endure a slightly slow terminal made by developers acting in the open and in (mostly) good faith than the intentionally malicious software produced by actual bad actors within Google, Facebook, Microsoft et all..
It's also important to keep in mind the vast asymmetry here. When Microsoft deploys problematic software, even a relatively minor problem will be responsible for many man-hours of frustration and wasted time. Far more man-hours than are ruined when a few developers have bad things said about them online. One doesn't excuse the other, but you can't ignore one of the harms simply because it's more diffuse.
In my mind, there are two asymmetries.
* Microsoft v. Users
* Casey's network v. a 3-4 man open source team within microsoft
I don't disagree that the former is abusive.
However, it's my contention that this incident is primarily about the latter.
Casey, rightly, already had some pent up rage about the former asymmetry as well.
But it was a human manager/dev? within that small team, not Microsoft writ large, that got defensive about the software he was responsible for.
I believe I'd feel embarrassed and defensive too if something I'd worked on turned out to be flawed in a painfully obvious way. I can understand avoiding the grim truth by denying that the problem has truly been solved by ~700 lines of C.
Something else that I'll note here is that the vast majority of
"Your software is too slow, Here's how to fix it, I could do it in a weekend, and by the way this whole problem space is actually super simple."
tickets do not end up realizing software speedups.
Without proper context, they just sound patronizing, making the argument easier to dismiss.
In this case he also demonstrated his superiority with working code.
Better to learn from it than to pout about it.
(Cue in the "but why bother, my app is IO bound anyway" counterarguments. That happens, but people too often forget you can design software to avoid or minimize the time spent waiting on IO. And I don't mean just "use async" - I mean design its whole structure and UX alike to manage waits better.)
- Or HFT, or browser engine development, or a few other performance-minded areas of software.
I have encountered far too many people who interpret that to mean, "thou shalt not even consider performance until a user, PM or executive complains about it."
Here's the quote in context:
> There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.…
( https://pic.plover.com/knuth-GOTO.pdf , http://www.kohala.com/start/papers.others/knuth.dec74.html For fun, see also the thread around https://twitter.com/pervognsen/status/1252736617510350848)
And from the same paper, an explicit statement of his attitude:
> The improvement in speed from Example 2 to Example 2a [which introduces a couple of goto statements] is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today's software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by pennywise-and-pound-foolish programmers, who can't debug or maintain their "optimized" programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn't bother making such optimizations on a one-shot job, but when it's a question of preparing quality programs, …
And of course most people don't know the full quote and they don't care about what Knuth really meant at the time.
Quick quips don't get to trump awful engineering. Just call Knuth a boomer and point to the awful aspects of the actual code. No disrespect to Knuth; just dismiss him as easily as people use him to dismiss real problems.
Except that line was written in a book (Volume 1: Art of Computer Programming) that was entirely written from the ground up in Assembly language.
It's been a while since I read the quote in context. But IIRC it was about saving one instruction between a for-loop that counts up and a for-loop that counts down.
"Premature optimization is the root of all evil" if you're deciding to save 1-instruction to count from "last-number to 0", taking advantage of the jnz instruction common in assembly languages, rather than "0 to last-number". (that is: for(int i=0; i<size; i++) vs for(int i=size-1; i>=0; i--). The latter is slightly more efficient).
Especially because "last-number to 0" is a bit more difficult to think about and prove correct in a number of cases.
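A minimal sketch of the two loop shapes being described (the example Knuth actually discussed was in assembly; the array-summing functions here are just an illustration, and modern compilers will often make this transformation on their own anyway):

```c
#include <stddef.h>

/* Counting up: each iteration compares i against `size`. */
long sum_up(const long *a, size_t size) {
    long total = 0;
    for (size_t i = 0; i < size; i++)
        total += a[i];
    return total;
}

/* Counting down: the loop test becomes "did we hit zero?", which on many
   instruction sets falls out of the decrement itself (a single jnz-style
   branch), saving one compare per iteration. Slightly harder to read and
   to prove correct, which is exactly the trade-off being discussed. */
long sum_down(const long *a, size_t size) {
    long total = 0;
    for (size_t i = size; i-- > 0; )
        total += a[i];
    return total;
}
```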
I recall it being from his response to the debates over GOTO, and some googling seems to agree.
Not that that takes away from your overall point.
In support of your overall point, though, having just said "[w]e should forget about small efficiencies, about 97% of the time", the next paragraph opens: "Yet we should not pass up our opportunities in that critical 3%."
If I don't think about performance and other critical things before committing to a design I know that in the end I will have to rewrite everything or deliver awful software. Being lazy and perfectionist those are two things I really want to avoid.
Yet we should not pass up our opportunities in that critical 3%.
"support larger screens and refresh rates"
It is possible to get office97 (or for that matter 2003, which is one of the last non sucky versions) and run it on a modern machine. It does basically everything instantly, including starting. So I don't really think resolution is the problem.
PS, I've had multiple monitors since the late 80's too, in various forms, in the late 1990's driving multiple large CRTs at high resolution from secondary PCI graphics cards, until they started coming with multiple ports (thanks matrox!) for reasonable prices.
I'd imagine perhaps this is how product teams are assessed - is the component just-about fast enough, and does it have lots of features. So long as MS Office is the most feature-rich software package, enterprise will buy nothing else, slow or not.
Is Teams better than Zoom? No, but my last employer ditched Zoom because Teams was already included in their enterprise license package and they didn't want to pay twice for the same functionality.
The tech stack in use in the Windows Terminal project is new code bolted onto old code, and no one on the existing team knows how that old code works. No one understands what it's doing. No one knows whether the things that old code needed to do are still needed.
It took someone like Casey, who knew gamedev, to know instinctually that all of that stuff was junk and you could rewrite it in a weekend. The Microsoft devs, if they wanted to dive into the issue, would be forced to Chesterton's Fence every single line of code. It WOULD have taken them years.
We've always recommended that programmers know the code one and possibly two layers below them. That recommendation failed here, just as it failed during the GTA loading-times scandal. It has failed millions of times, and the ramification of that failure is the chaos of performance issues we see everywhere.
I'm come to realize that much of the problems that we have gotten ourselves into is based on what I call feedback bandwidth. If you are an expert, as Casey is, you have infinite bandwidth, and you are only limited by your ability to experiment. If your ability to change is a couple seconds, you will be able to create projects that are fundamentally impossible without that feedback.
If you need to discuss something with someone else, that bandwidth drops like a stone. If you need a team of experts, all IMing each-other 1 on 1, you might as well give up. 2 week Agile sprints are much better than months to years long waterfall, but we still have so much to learn. If you only know if the sprint is a success after everyone comes together, you are doomed. The people iterating dozens of times every hour will eat your shorts.
I'm not saying that only a single developer should work on entire projects. But what I am saying is that when you have a Quarterback and Wide Receiver that are on the same page, talking at the same abstraction level, sometimes all it takes is one turn, one bit of information, to know exactly what the other is about to do. They can react together.
Simple is not easy. Matching essential complexity might very well be impossible. Communication will never be perfect. But you have to give it a shot.
Then in college, I studied ECE and worked in a physics lab where everything needed to run fast enough to read out ADCs as quickly as allowed by the datasheet.
Then I moved to defense doing FPGA and SW work (and I moonlighted in another group consulting internally on verification for ASICs). Again, everything was tightly controlled. On a PCI-e transfer, we were allowed 5 us of maximum overhead. The rest of the time could only be used for streaming data to and from a device. So if you needed to do computation with the data, you needed to do it in flight, and every data structure had to be perfectly optimized. Weirdly, once you have data structures that are optimized for your hardware, the algorithms kind of just fall into place. After that, I worked on sensor fusion and video applications for about a year where our data rates for a single card were measured in TB/s. Needless to say, efficiency was the name of the game.
After that, I moved to HFT. And weirdly, outside of the critical tick-to-trade path or microwave stuff, this industry has a lot less care around tight latencies and has crazy low data rates compared to what I'm used to working with.
So when I go look at software and stuff is slow, I'm just suffering because I know all of this can be done faster and more efficiently (I once shaved 99.5% of the run time off of a person's code with better data packing to align to cache lines, better addressing to minimize page thrashing, and automated loop unrolling into threads all in about 1 day of work). Software developers seriously need to learn to optimize proactively... or just write less shitty code to begin with.
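For readers unfamiliar with the cache-line trick mentioned above, here is a rough sketch of the kind of repacking involved. The struct fields, the 64-byte line size, and the GCC/Clang alignment attribute are illustrative assumptions, not the actual code from that anecdote:

```c
#include <stdint.h>

/* Hot and cold fields interleaved: touching the hot fields drags the
   rarely-used label into cache with them, and adjacent entries updated by
   different threads can end up sharing a cache line (false sharing). */
struct sample_unpacked {
    double   value;       /* hot */
    char     label[48];   /* cold: only read when logging */
    uint64_t timestamp;   /* hot */
};

/* One possible repacking: keep the hot fields together, align each entry to
   a full 64-byte line so entries never share one, and move the cold data to
   a separate, parallel array that is only touched when actually needed. */
struct sample_hot {
    double   value;
    uint64_t timestamp;
} __attribute__((aligned(64)));   /* GCC/Clang; _Alignas(64) in C11 */

struct sample_cold {
    char label[48];
};
```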
While that's true, in this particular case with Casey's implementation it's not an art. The one thing that drastically improved performance was caching: literally the simplest, most obvious thing to do when you have performance problems.
The meta-point is that corporate developers have been through the hiring machine and are supposed to know these things.
Stories like this imply that in fact they don't.
Also it's not that simple. One place I worked at (scientific computation-kind of place), we'd prototype everything in Python, and production would be carefully rewritten C++. Standards were very high for prod, and we struggled to hire "good enough", modern C++, endless debates about "ooh template meta-programming or struct or bare metal pointers" kind of stuff.
3 times out of 4, the Python prototype was faster than the subsequent C++. It was because it had to be: the prototype was run and re-run and tweaked many times in the course of development. The C++ was written once, deployed, and churned daily, without anyone caring about its speed.
If you're doing generic symbol shuffling with a bit of math, Python is fast-ish to code and horribly slow to run. You can easily waste a lot of performance - and possibly cash - trying to use it for production.
Whether or not you'll save budget by writing your own optimised fast libs in some other lang is a different issue, and very application dependent.
“Any object around which an S.E.P. is applied will cease to be noticed, because any problems one may have understanding it (and therefore accepting its existence) become Somebody Else's Problem.”
And you’re right; all refterm really does is move the glyph cache out to the GPU rather than copying pixels from a glyph cache in main memory every frame.
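A rough sketch of that idea, as I understand it (the names and the hash-table layout here are mine, not refterm's): rasterize a glyph at most once, keep it in an atlas that lives on the GPU, and afterwards each cell only needs the atlas coordinates rather than a fresh copy of pixels every frame.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint16_t atlas_x, atlas_y; } GlyphSlot;

typedef struct {
    uint64_t  key;    /* hash of codepoint(s) + font + attributes */
    GlyphSlot slot;   /* where this glyph already lives in the GPU atlas */
    bool      valid;
} GlyphCacheEntry;

#define CACHE_SIZE 4096
static GlyphCacheEntry cache[CACHE_SIZE];

/* Stub for illustration: a real implementation would rasterize the glyph
   (e.g. via DirectWrite) and copy the bitmap into the atlas texture. */
static GlyphSlot rasterize_and_upload(uint64_t key) {
    (void)key;
    GlyphSlot s = {0, 0};
    return s;
}

GlyphSlot get_glyph(uint64_t key) {
    GlyphCacheEntry *e = &cache[key % CACHE_SIZE];
    if (!e->valid || e->key != key) {   /* miss: pay the slow path once */
        e->slot  = rasterize_and_upload(key);
        e->key   = key;
        e->valid = true;
    }
    return e->slot;                     /* hit: no per-frame pixel copies */
}
```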
Consider memory management techniques like caching layers or reference pools. Or optimizing draws for the platform's render loop. Or being familiar with profiler tools to identify hotspots. These techniques are all orthogonal to functionality. That is, applying them when you see an opportunity to will not somehow limit features.
So why aren't the enterprise apps fast, if it's so easy? I think that boils down to incentives. Enterprise apps are sales or product led and the roadmap only accommodates functionality that makes selling the software easier. Whereas in games the table stakes point you need to reach for graphics is not achievable by naively pursuing game features.
Put another way, computers and laptops are way stronger than consoles and performance is a gas. Enterprise devs are used to writing at 1 PSI or less and game devs are used to writing at 1 PSI or more.
Windows Terminal has none of that. And his refterm already has more features implemented correctly (such as proper handling of Arabic etc.) than Windows Terminal. See feature support: https://github.com/cmuratori/refterm#feature-support
Also see FAQ: https://github.com/cmuratori/refterm/blob/main/faq.md
The same is true of backwards compatibilty. As an example, making sure old save data is compatible with new versions is an important consideration.
Source: I'm a game programmer working mainly with graphics and performance, but I previously spent five years working on the UI team at a AAA studio.
This includes audio localization (something no 'Enterprise' software has ever needed AFAIK), and multiple colour palettes for different types of colour blindness.
Sometimes video games are the only software with reasonable localizations I ever find installed in a particular computer.
There is a newer version of component X but we can't leverage that due to dependency Y.
You just begin to wonder how these things happen. You would think top programmers work at these companies. Why would they not start with the good approach, loading the GPU first, etc.? Why did it take them so much time to finally do it correctly? Why waste time by not doing it right at the beginning?
The game runs well enough so the people who could optimize things by rewriting them from CPU to GPU are doing other things instead. Later performance is a noticeable problem to the dev team, from customer feedback and need to ship in more resource constrained environments (VR and XBox) and that person then can do work to improve performance.
It's also handy to have a reference CPU implementation both to get your head around a problem and because debugging on the GPU is extremely painful.
To go further down the rabbit hole it could be that they were resource constrained on the GPU and couldn't shift work there until other optimizations had been made. And so on with dependencies to getting a piece of work done on a complex project.
Yes, there is no doubt that "there are always more things to do on a game project than you have people and time to do". However, somehow there is time to first build a "main thread on a single core" monster and then redo it according to game development 101: make use of the GPU.
It is no joke: the GPU was barely loaded while the CPU was choking, in a modern game released by a top software company proudly presenting its branding in the very title.
The single threaded code was taken from Microsoft Flight Simulator X, the previous version of this game from 2006. It was not done from scratch for the new game, and it still hasn't been replaced. They've just optimized parts of it now.
Another important performance bottleneck is due to the fact that the game's UI is basically Electron. That's right, they're running a full-blown browser on top of the rest of the graphics for the UI. Browsers are notoriously slow because they support a million edge cases that aren't always needed in a game UI.
For anyone interested in learning more about Microsoft Flight Simulator 2020 optimizations I can recommend checking out the Digital Foundry interview with the game's technical director. https://www.youtube.com/watch?v=32abllGf77g
Also the assumption that the culture of the Windows Terminal team is the same as the team building a flight simulator is a bit far fetched. Large organisations typically have very specific local cultures.
Rewriting stuff from CPU to GPU also isn't 101 game development. Both because it's actually quite hard and because not all problems are amenable to GPU compute and would actually slow things down.
Also, I have rewritten CPU implementations to run on the GPU before and it's often nontrivial. Sometimes even in seemingly straightforward cases. The architectures are very different, you only gain speed if you can properly parallelize the computations, and a lot of things which are trivial on the CPU becomes a real hassle on the GPU.
When porting to a game console you can't simply define some random 'minimal requirement' hardware, because the game console is that hardware. So you start looking for more optimization opportunities to make the game run smoothly on the new target hardware, and some of those optimizations may also make the PC version better.
Then again isn't it like 101 of game development?
Imagine releasing a game that looks stunning, industry agrees that it pushes the limits of modern gaming PC (hence runs poorly on old machines). Fast forward some time - "oh BTW we did it incorrectly (we had our motives), now you can run it on old machine just fine, no need of buying i9 or anything".
You know your target specs but where the bottlenecks are and how to solve them will be constantly shifting. At some point the artists might push your lighting or maybe it's physics now, maybe it's IO or maybe networking. Which parts do you move to the GPU?
Also, a GPU is not a magic bullet: it's great for parallel computation, but not all problems can be solved like that. It's also painful to move memory between the CPU and the GPU, and it's a limited resource, so you can't have everything there.
It was called "download simulator" by some, because even the initial installation phase was poorly optimised.
But merely calling it "poorly optimised" isn't really sufficient to get the point across.
It wasn't "poor", or "suboptimal". It was literally as bad as possible without deliberately trying to add useless slowdown code.
The best equivalent I can use is Trivial FTP (TFTP). It's used only in the most resource-constrained environments where even buffering a few kilobytes is out of the question. Embedded microcontrollers in NIC boot ROMs, that kind of thing. It's literally maximally slow. It ping-pongs for every block, and it uses small blocks by default. If you do anything to a network protocol at all, it's a step up from TFTP. (Just adding a few packets' worth of buffering and windowing dramatically speeds it up, and this enhancement thankfully did make it into a recent update of the standard.)
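To make the stop-and-wait cost concrete: with only one block outstanding at a time, throughput is capped at one block per round trip no matter how fast the link is. A toy calculation, with the 50 ms RTT purely as an illustrative assumption:

```c
#include <stdio.h>

int main(void) {
    /* Stop-and-wait: one 512-byte block in flight per round trip. */
    const double block_bytes = 512.0;
    const double rtt_seconds = 0.050;   /* illustrative 50 ms RTT */

    double ceiling = block_bytes / rtt_seconds;   /* bytes per second */
    printf("throughput ceiling: ~%.1f KB/s, regardless of link speed\n",
           ceiling / 1024.0);

    /* Allowing N blocks in flight (windowing) raises that ceiling roughly
       N-fold, which is why the small windowing addition helps so much. */
    return 0;
}
```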
People get bogged down in these discussions around which alternate solution is more optimal. They're typically arguing over which part of a Pareto frontier they think is most applicable. But TFTP and Microsoft FS2020 aren't on the Pareto frontier. They're in the exact corner of the diagram, where there is no curve. They're at a singularity: the maximally suboptimal point (0,0).
This line of thinking is similar to the "Toward a Science of Morality" by the famous atheist Sam Harris. He starts with a definition of "maximum badness", and defines "good" as the direction away from it in the solution space. Theists and atheists don't necessarily have to agree with the specific high-dimensional solution vector, but they have to agree that that there is an origin, otherwise there's no meaningful discussion possible.
Microsoft Terminal wasn't at (0,0) but it was close. Doing hilariously trivial "optimisations" would allow you to move very much further in the solution space towards the frontier.
The Microsoft Terminal developers (mistakenly) assumed that they were already at the Pareto frontier, and that the people that opened the Github Issue were asking them to move the frontier. That does usually require research!
At least initially, it didn't use the XBox CDN either.
Microsoft's greatest technical own goal of the 2010s was WSL 2.
The original WSL was great in most respects (authentic to how Windows works; just as Windows NT has a "Windows 95" personality, Windows NT can have a "Linux" personality) but had the problem that filesystem access went through the Windows filesystem interface.
The Windows filesystem interface is a lot slower for metadata operations (e.g. small files) than the Linux filesystem interface and is unreformable because the problem is the design of the internal API and the model for security checking.
Nobody really complained that metadata operations in Windows were slow; they just worked around it. Some people, though, were doing complex build procedures inside WSL (e.g. building a Linux kernel), and it was clear then that there was a performance problem relative to Linux.
For whatever reason, Microsoft decided this was unacceptable, so they came out with WSL 2 which got them solidly into Kryptonite territory. They took something which third party vendors could do perfectly well (install Ubuntu in a VM) and screwed it up like only Microsoft can (attempt to install it through the Windows Store, closely couple it to Windows so it almost works, depend on legitimacy based on "it's from Microsoft" as opposed to "it works", ...)
Had Microsoft just accepted that metadata operations were a little bit slow, most WSL users would have accepted it, the ones who couldn't would run Ubuntu in a VM.
Microsoft being Microsoft, they didn't want people like you to hop to VMware or VirtualBox and use a full, branded instance of Fedora or Ubuntu, because then you would realize that the next hop (moving to Linux entirely) was actually quite reasonable. So they threw away WSL1 and built WSL2. Obviously WSL2 worked better for you than WSL1, but you also did exactly what Microsoft wanted you to do, which is to their benefit, and not necessarily to yours.
Can't even print the decimal separator properly.
Maybe it wasn't that easy.
Here's with Courier New: https://imgur.com/t88wwTx
EDIT: And here's with JetBrains Mono: https://imgur.com/HYEYtGb
EDIT: Do I need to make a font parade?
EDIT: Consolas! https://imgur.com/klPufrd Where can we go without the classic? IBM Plex Mono https://imgur.com/e5tQ0NB
EDIT: Remove argument about ease of development, because refterm is easy.
This bugs me every time. "Wow, this software works so well, but we are not gonna make our software like that; no, we'll stick to our shite implementation."
Its description also says: "Reference monospace terminal renderer". "Monospace" is there for a reason.
It's worth mentioning though, that Windows Terminal also defaults to Cascadia Code and Cascadia Code was installed automatically on my machine, so it's de-facto the new monospace standard font on Windows starting from 10.
Cascadia is as broken as the other font, so what now?
Maybe writing a Unicode renderer isn't that easy? Maybe drop the attitude?
You came here with an attitude.
> I just launched "dir" https://i.imgur.com/lkbOR3i.png can't even print properly the decimal separator. maybe it wasn't that easy.
Nobody said he made a fully functional better terminal, just that the terminal rendering was better and functional. Doing everything needed for a fully functional terminal is a lot of work, but doing everything needed for terminal rendering isn't all that much work.
> The "complaining developer" produced a proof of concept in just two weekends that notably had more features and was more correct than the Windows Terminal!
easily falsifiable bullshit found to be false.
If you were to open refterm once, you'd see text that explicitly states "DO NOT USE IT AS A TERMINAL EMULATOR", or something like that (I can't open it now to copy the exact wording).
That seems like it would be unaffected by correcting font rendering.
"... extremely slow Unicode parsing with Uniscribe and extremely slow glyph generation with DirectWrite..."
Glyph generation is about rasterising text, because you can't just feed it the font file.
This is an edge case as much as building a rasterizer directly into the terminal is an "edge case".
The fact that a random commenter on HN used a non-monospaced font with refterm actually makes it a use case.
I do, however, agree that it is an edge case with a very low probability.
The Microsoft terminal doesn't render monospaced fonts, the overwhelmingly common case, nearly as fast as refterm. If rendering variable-width fonts is somehow intrinsically insanely expensive for some reason (which I haven't seen anyone provide good evidence for), then a good implementation would still just take refterm's fast monospaced rendering implementation and use it for monospaced fonts, and a slower implementation for variable-width fonts.
That is - refterm's non-existent variable-width font rendering capabilities do not excuse the Windows terminal's abysmal fixed-width font rendering capabilities.
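A hypothetical sketch of the dispatch being argued for here; every name in it is made up, and it exists only to make the shape of the argument concrete:

```c
#include <stdbool.h>
#include <stdio.h>

/* All of this is hypothetical scaffolding; only the dispatch matters. */
typedef struct { bool monospaced; } Font;
typedef struct { Font font; } TerminalState;

static void render_grid_fast(const TerminalState *s)    { (void)s; puts("fast cell-grid path"); }
static void render_with_shaping(const TerminalState *s) { (void)s; puts("general shaping path"); }

void render_frame(const TerminalState *state) {
    if (state->font.monospaced)
        render_grid_fast(state);      /* the overwhelmingly common case */
    else
        render_with_shaping(state);   /* variable-width fonts, if supported */
}
```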
Also, it's worth noting that this isn't a compelling argument in the first place because the windows terminal doesn't even come close to rendering readable Arabic, it fucks up emoji rendering, etc – all cases that Casey was able to basically solve after two weekends of working on this.
Windows Terminal, for some reason, gives users the option to change their font to one that isn't monospaced, so I'd argue that it should render them correctly if the user chooses to do that.
Casey threw something together in a matter of days that had 150% of the features of the Windows Terminal renderer, but none of the bug fixing that goes into shipping a production piece of software.
That screenshot you keep parading around is a small issue with a quick fix. It's not like Casey's approach is inherently unable to deal with punctuation!
You don't discard the entire content of a scientific journal paper because of a typo.
"Sorry Mr Darwin, I'm sure you believe that your theory is very interesting, but you see here on page 34? You wrote 'punctuted'. I think you meant 'punctuated'. Submission rejected!"
But now apparently pointing out that "MS was right not to want to take shortcut in unicode rendering" morphed into "criticizing in bad faith refterm for not being production ready"
Who's not arguing in good faith here?
Considering Casey himself puts front and center the disclaimer that this is solely intended to be a reference and goes into as much detail in his videos I don't know where you got this from. I don't think anyone is under the illusion that this is could replace the actual terminal. It's just meant to show that there's a minimum level of performance to be expected for not a huge amount of effort (a couple weekend's worth) and there is no excuse for less.
Who said that? Refterm isn't a fully functional terminal, it is just a terminal renderer bundled with a toy terminal.
no, doesn't work with cascadia either.
so what now?
EDIT: I am sorry for the attitude, changed "wrong" to "different"
I don't care at all if this or that terminal uses a bit more RAM or is a few milliseconds faster.
Anyway, resource usage matters quite a lot, if your terminal uses CPU like an AAA game running in the background you will notice fan noises, degraded performance and potentially crashes due to overheating everywhere else in the computer.
Did you watch the video? The performance difference is huge! 0.7 seconds vs 3.5 minutes.
That developer also was rather brusque in the github issue and could use a bit more humility and emotional intelligence. Which, by the way, isn't on the OP blog post's chart of a "programmers lifecycle". The same could be said of the MS side.
Instead of both sides asserting (or "proving") that they're "right" could they not have collaborated to put together an improvement in Windows Terminal? Wouldn't that have been better for everyone?
FWIW, I do use windows terminal and it's "fine". Much better than the old one (conhost?).
My experience with people that want to collaborate instead of just recognizing and following good advice is that you spend a tremendous amount of effort just to convince them to get their ass moving, then find out they were not capable of solving the problem in the first place, and it’s frankly just not worth it.
Much more fun to just reimplement the thing and then say “You were saying?”
The developer surely was having a tonne of fun at the expense of Microsoft. Perhaps a little too much fun imo.
The thing is NO ONE likes to lose face. He could have still done what he did (and enjoy his "victory lap") but in a spirit of collaboration.
To be fair, MS folks set themselves up for this but the smart-alec could have handled it with more class and generosity.
I've seen that happen at least 10 times over my career, including at very big companies with very bright people writing the code. I've been that guy. There are always these sorts of opportunities in all but the most heavily optimized codebases, most teams either a) just don't have the right people to find them, or b) have too much other shit that's on fire to let anyone except a newbie look at stuff like performance.
"Yes it's kinda slow, but not enough so customers leave so who cares." Performance only becomes a priority when it's so bad customers complain loudly about it and churn because of it.
Meanwhile, the majority of customers with faster machines are not sensitive enough to feel or bother about the avoidable lag.
Or Gates's law "The speed of software halves every 18 months"
c) slow code happens to everyone, sometimes you need fresh pair of eyes.
But show me a codebase that doesn't have at least a factor of 2 improvement somewhere and does not serve at least 100 million users (at which point any perf gain is worthwhile), and I'll show you a team that is being mismanaged by someone who cares more about tech than user experience.
I can’t think of a counter example, actually (where a brutally slow system I was working on wasn’t fairly easily optimized into adequate performance).
To go truly fast, you need to unleash the full potential of the hardware and doing it can require re-architecting the system from the ground up. For example, both postgres and clickhouse can do `select sum(field1) from table group by field2`, but clickhouse will be 100x faster and no amount of microoptimizations in postgres will change that.
I'm not even sure it takes an "experienced" engineer, one of my first linux patches was simply to remove a goto, which dropped an exponential factor from something, and changed it from "slow" to imperceptible.
GPUs: able to render millions of triangles with complex geometrical transformations and non-trivial per-pixel programs in real time
MS engineers: drawing colored text is SLOW, what do you expect
P.S. And yes, I know, text rendering is a non-trivial problem. But it is a largely solved problem. We have text editors that can render huge files with real-time syntax highlighting, browsers that can quickly layout much more complex text, and, obviously Linux and Mac terminal emulators that somehow have no issue whatsoever rendering large amount of colored text.
That's because it is slow in the most general case: If you have to support arbitrary effects, transformations, ligatures, subpixel hinting, and smooth animations simultaneously, there's no quick and simple approach.
The Windows Terminal is a special case that doesn't have all of those features: No animation and no arbitrary transforms dramatically simplifies things. Having a constant font size helps a lot with caching. The regularity of the fixed-width font grid placement eliminates kerning and any code path that deals with subpixel level hinting or alignment. Etc...
It really is a simple problem: it's a grid of rectangles with a little sprite on each one.
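Something like the following sketch, with illustrative types and a stubbed-out draw call; a real renderer would batch all of these quads into a single GPU submission rather than "draw" them one at a time:

```c
#include <stdint.h>

typedef struct { uint32_t glyph_id; uint32_t fg, bg; } Cell;

/* Stub: in a real renderer this would append a textured quad to a vertex
   buffer; it exists here only so the sketch is complete. */
static void draw_quad(int x, int y, int w, int h,
                      uint32_t glyph_id, uint32_t fg, uint32_t bg) {
    (void)x; (void)y; (void)w; (void)h;
    (void)glyph_id; (void)fg; (void)bg;
}

void render_grid(const Cell *cells, int rows, int cols,
                 int cell_w, int cell_h) {
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            const Cell *cell = &cells[r * cols + c];
            /* Fixed-width grid: the position is just row/column times the
               cell size; no layout, kerning, or shaping decisions needed. */
            draw_quad(c * cell_w, r * cell_h, cell_w, cell_h,
                      cell->glyph_id, cell->fg, cell->bg);
        }
    }
}
```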
In the real-CRT-terminal days of the 1970's & 1980's of course the interface to the local or remote mainframe or PC was IO-bound but not much else could slow it down.
UI elements like the keyboard/screen combo have been expected to perform at the speed of light for decades using only simple hardware to begin with.
The UX of a modern terminal app would best be not much different than a real CRT unit unless the traditional keyboard/display UI could actually be improved in some way.
Even adding a "mouse" didn't slow down the Atari 400 (which was only an 8-bit personal computer) when I programmed it to use the gaming trackball to point & click plus drag & drop. That was using regular Atari Basic, no assembly code. And I'm no software engineer.
A decade later once the mouse had been recognized and brought into the mainstream it didn't seem to slow down DOS at all, compared to a rodent-free environment.
Using modern electronics there surely should not be any perceptible lag compared to non-intelligent CRTs over dial-up.
Unless maybe the engineers are not as advanced as they used to be decades ago.
Or maybe the management/approach is faulty, all it takes is one non-leader in a leadership position to negate the abilities of all talented operators working under that sub-hierarchy.
Text rendering is still done mostly on the CPU side in the great majority of applications, since vector graphics are hard to do efficiently on GPUs.
Of course in this case it seems like it was possible to make it very fast, but people who think proper text rendering is naturally going to be somewhat slow aren't always wrong.
Saying that text rendering is "largely solved" is also incorrect. There are still changes and improvements being made to the state of the art and there are still unhappy users who don't get good text rendering and layout in their favorite applications when using a language other than English.
Naturally slow for a text renderer might mean it renders in 4ms instead of 0.1.
* as it was brought up, one might use a non-monospace font, but that case can just use the slow path and let the “normal” people use a fast terminal
refterm is designed to support several features, just to ensure that no shortcuts have been taken in the design of the renderer. As such, refterm supports:
* Multicolor fonts
* All of Unicode, including combining characters and right-to-left text like Arabic
* Glyphs that can take up several cells
* Line wrapping
* Reflowing line wrapping on terminal resize
* Large scrollback buffer
* VT codes for setting colors and cursor positions, as well as strikethrough, underline, blink, reverse video, etc.
Plenty of other parts of terminal emulators are tricky to implement performantly; ligatures are one Alacritty hasn't got yet.
I have never written a terminal emulator, so could you maybe summarize why fast scrolling with fixed regions is so hard to implement?
As good as Casey Muratori is, Microsoft is more than big enough to have the means of taking his core ideas and implement them themselves. It may not take them a couple weekends, but they should be able to spend a couple experienced man-months over this.
The fact they don't can only mean they don't care. Maybe the people at Microsoft care, but clearly the organisation as a whole as other priorities.
Besides, this is not the first time I've seen Casey complain about performance in a Microsoft product. Last time it was about boot times for Visual Studio, which he uses to debug code. While reporting performance problems was possible, the form only had "less than 10s" as the shortest boot time you could tick. Clearly, they considered that if VS booted in 9 seconds or less, you didn't have a performance problem at all.
I commented on a separate issue re: refterm
--- start quote ---
Something tells me that the half-a-dozen to a dozen of Microsoft developers working on Windows terminal:
- could go ahead and do the same "doctoral research" that Casey Muratori did and retrace his steps
- could pool together their not insignificant salaries and hire Casey as a consultant
- ask their managers and let Microsoft spend some of those 15.5 billion dollars of net income on hiring someone like Casey who knows what they are doing
--- end quote ---
One thing to remember is that it is always possible and acceptable to contact the author of a GPL-licensed piece of code to enquire whether they would consider granting you a commercial license.
It may not be worthwhile but if you find exactly what you're looking for and that would take you months to develop yourself then it may very well be.
Microsoft's attitude towards the code seems a little odd. 
Unfortunately the code is intentionally GPLv2 licensed and we'll honor this wish entirely. As such no one at Microsoft will ever look at either of the links.
Given that WSL exists I can't imagine this is a universal policy towards reading GPLv2 code at Microsoft.
Why is that a problem? A GPLv2 terminal would not be a business problem for Microsoft. People would still have to buy licenses for Windows. Maybe they would lose a little face, but arguably they have already done so.
At least it's not GPLv3, which this industry absolutely and viscerally hates (despite having no problem with Apache 2.0 for some reason; Theo de Raadt is at least consistent).
They can alternatively buy a commercial license, as another user said below.
That's not unfortunate. Having people who work on competing Free Software is a good thing. It would be even better if Microsoft adopted this code and complied with the terms of the GPL license. Then we won't have to deal with problems like these because they'd be nipped in the bud. And we would set the precedent to take care of lot of other problems like the malware, telemetry, abuse of users' freedoms.
> You shouldn't automatically assume something is actually bad just because someone shows a [vastly] better proof-of-concept 'alternative'.
Apparently you should. I can confirm that the first quote is an appropriate assessment of the difficulty of writing a terminal renderer. Citation: I did pretty much exactly the same thing for pretty much exactly the same reasons when (IIRC gnome-)terminal was incapable of handling 80*24*60 = 115200 esc-[-m sequences per second, and am still using the resulting terminal emulator as a daily driver years later.
Even in those cases it usually turns out that the handling of edge cases was considered reason enough to sacrifice performance rather than finding a better solution to the edge case. Handling edge cases probably should not cost 10x average performance.
That being said, is anyone aware of a significant missing feature that would impact performance?
The screen reader code should be doing absolutely nothing if it's not enabled - and even if it is, I can't imagine how it could affect performance anyway. For plain text, such as a terminal, all it does is grab text and parse into words (and then the part where it reads the words, but that's separate from the terminal) - I don't see how this is any more difficult than just taking your terminal's array of cell structs, pulling out the characters into a dynamic array, and returning a pointer to that.
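A minimal sketch of that operation, assuming a simplified cell struct with one byte per character (a real terminal stores Unicode codepoints, but the shape of the work is the same):

```c
#include <stdlib.h>

/* Hypothetical cell layout: one character plus attributes per screen cell. */
typedef struct { char ch; unsigned char fg, bg, flags; } Cell;

/* Copy just the characters out of the cell grid into a newly allocated,
   NUL-terminated buffer the screen reader can consume. Caller frees. */
char *extract_text(const Cell *cells, size_t count) {
    char *text = malloc(count + 1);
    if (!text)
        return NULL;
    for (size_t i = 0; i < count; i++)
        text[i] = cells[i].ch;
    text[count] = '\0';
    return text;
}
```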
The reason why the hobby project is so substantially faster than a piece of core technology of the flagship product of Microsoft is that it does less.
First, you have to understand how recursive tree models work. You have a general idea of how to access nodes on the tree, but you have no idea what's there until you are close enough to touch it. File system access performance is limited by both the hardware on which the file system resides and the logic of the particular file system type. Those constraints erode away some of the performance benefits of using a lower-level language. Whatever operations you wish to perform must be applied individually to each node, because you have no idea what's there until you are touching it.
Second, because the operations are individually applied on each node it’s important to limit what those operations actually are. My application is only searching against a string fragment, absence of the string fragment, or a regular expression match. Wildcards are not supported and other extended search syntax is not supported. If you have to parse a rule each time before applying it to a string identifier of a node those are additional operations performed at each and every node in the designated segment of the tree.
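A toy illustration of that node-by-node pattern: a recursive directory walk that applies a single substring test to every entry it touches (POSIX opendir/readdir; d_type/DT_DIR is a common extension rather than strict POSIX, and error handling is omitted for brevity):

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Visit every node; the only per-node work is a substring test. Even this
   minimal rule must be applied at every single node, because you don't know
   what's there until you touch it. */
static void walk(const char *dir, const char *fragment) {
    DIR *d = opendir(dir);
    if (!d)
        return;

    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        char path[4096];
        snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);

        if (strstr(entry->d_name, fragment) != NULL)
            puts(path);                   /* match: report this node */

        if (entry->d_type == DT_DIR)
            walk(path, fragment);         /* recurse into the subtree */
    }
    closedir(d);
}

int main(int argc, char **argv) {
    walk(argc > 1 ? argv[1] : ".", argc > 2 ? argv[2] : "");
    return 0;
}
```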
For those familiar with the DOM in the browser it also has the same problems because it’s also a tree model. This is why querySelectors are so incredibly slow compared to walking the DOM with the boring old static DOM methods.
It's still a good place to start a discussion though. In such a case, apparently someone believes strongly that things can be made much faster, and now you can either learn from that person or explain to them what edge cases they are missing.
My time library is so much faster and smaller than yours. Timezones? Nah, didn't implement it.
My font rendering is so much simpler and faster than yours. Nah, only 8 bit encodings. Also no RTL. Ligatures? Come on.
The list goes on.
Watch the video: https://www.youtube.com/watch?v=99dKzubvpKE