Arranging Invisible Icons in Quadratic Time (randomascii.wordpress.com)
376 points by ingve 11 months ago | 111 comments



> Laying out icons on a grid should be an inherently linear operation, but somehow it was written as quadratic and was executed even when the icons were not being displayed.

I could sort of understand this sloppiness if it was some random app from a hot startup, but this is a Windows issue affecting, presumably, all Windows users... Somehow this gets past code review and QA and into production, where it remains broken and nobody has fixed it. Depressing.

I would argue part of the problem is that the majority of developers are computer science illiterate. They just hack things together, gluing libraries on top of frameworks, with very little understanding about what goes on underneath. Even when you point out a problem like this, people seem to be unwilling or incapable of understanding it.

I have personal experience with this: I raised doubts about an unoptimized algorithm (the time complexity was something like O(n^2 log n), where the typical case n was supposed to be 10^6). I'm paraphrasing, but the response I got was roughly "don't worry, we can just run it on a powerful VM" and also the classic "developer time is more valuable than CPU time". When people realized this problem can't be solved by throwing money at it, the next suggestion was to artificially constrain n to be small enough that the program completes. Again I raised doubts, because this doesn't make sense, but everybody else thought it was fine. So I spent time running a series of experiments to demonstrate that the results we were getting were nonsense. It was at this point that I optimized the algorithm so it ran in O(n log n) or something similar. Something that we should have done from the start.


I would argue that the majority of developers are juggling half a dozen tasks that need to be done ASAP; a simple quadratic solution was fast enough for the use case they had in mind, and there definitely was no time to optimize for the case of invisible icons.


From the perspective of an individual developer that may be the case, but it's inexcusable for Microsoft as an organization.


There are still a lot of bugs and features not working at all in Windows. So a feature that works, but slowly, under conditions that are extraordinary for most users, is not prioritized.

If you want a bug-free os, you'll pay an extreme price in features or in dollars.


> There are still a lot of bugs and features not working at all in Windows. So a feature that works, but slowly, under conditions that are extraordinary for most users, is not prioritized.

Please don't pretend like Microsoft has rationally prioritized the feature/bug work on Windows. Microsoft is spending resources pushing ridiculously niche use cases like "3D Objects". Microsoft is also doing a bunch of duplicate work, for example, maintaining 2 different UIs for Control Panel even though the old one was fine. Maybe it's time for Microsoft to stop the endless re-designs just for the sake of "new is better", and maybe focus on fixing bugs?


You may be correct in the larger sense, but there will be multiple teams working on Windows, with non-interchangeable staff. Assuming anybody found this bug before, somebody on the Explorer team almost certainly will have rationally prioritized it.


Why? Would someone rationally get a bonus for fixing this instead of whatever quarterly promise they made to their boss? What's rational about fixing a bug that isn't threatening revenue?


They wouldn't. The person rationally prioritizing this bug will have put it somewhere down the bottom of the list. And that's how we find ourselves here, with the issue unfixed.


> If you want a bug-free os, you'll pay an extreme price in features or in dollars.

It isn't like the Windows division doesn't make billions of dollars every year. Lack of resources is hardly an excuse.


I suspect in both features & dollars!


I've never really understood this mentality. It's probably a privilege thing, but I always argue for doing things the right way. I got fired for it once, but I live below my means just in case that happens anyways. I don't need much, and don't take on debt.

IMO, the kind of devs who do this shit are like union scabs: they make things worse for everyone else because they don't have the ability to stand up for themselves.


I can think of two reasons why this happened in the first place.

- Either it was good enough on low-resolution screens, the algorithm was just considered mature, and nobody thought about it after that.

- Or it was implemented in a "first commit which meets requirements wins" competition, and it just didn't get the attention it needed.

I'm like you, a big fan of "do it once, do it right", but not everyone understands how polishing something presumably small can ripple through everything, like a butterfly effect.

Some people just want to tick all the boxes and go home. There's nothing inherently wrong with this, but when combined with the wrong constraints and incentives, these things happen.


This affects visible icons as well; the icons being invisible in the original report is just the icing on the cake.


I would argue that the majority of developers are juggling half a dozen tasks that need to be done ASAP, so they skim a bug report and mark it as irrelevant to the use case they had in mind, and there definitely was no time to read the bug report carefully and understand the core issue it was describing.


This attitude is why 98% of software is absolute dogshit. Have some pride.


I applied for a job at Microsoft recently. I had to take an online test which consisted of two medium and one hard leetcode problems. In one hour.

It took nearly 15 minutes just to read the problems and make a game plan. That left about 15 minutes per problem to solve and code them up. Have you ever done a hard leetcode problem? In 15 minutes? If that’s the bar at Microsoft, I doubt their developers are computer science illiterate. That test was harder than any of the other FAANG tests I’ve done, online or in person. Clearly the icon code was delegated to some intern or perhaps a child of one of the developers, because the bar is stupid high there just to have a conversation with a hiring manager. And they don’t even pay that well.


Check out some of Dan Luu's writing. https://danluu.com/algorithms-interviews/ comes to mind:

> At the start of this post, we noted that people at big tech companies commonly claim that they have to do algorithms interviews since it's so costly to have inefficiencies at scale. My experience is that these examples are legion at every company I've worked for that does algorithms interviews. Trying to get people to solve algorithms problems on the job by asking algorithms questions in interviews doesn't work.

He goes on to talk about why this happens.

In my own experience [edit to clarify: not at Microsoft], if you lay out a piece of code as an algorithm problem that needs to be optimized, most folks will make short work of it. But it's quite common for folks to not realize something will be slow enough that they need to apply that kind of thinking. That's especially true if part of the algorithm is hidden behind some indirection: whether just a function call, a call into a library that feels "far away" in the codebase / is owned by a different team, or dependency injection.


Good article. I liked this quote:

"though big companies try to make sure that the people they hire can solve algorithms puzzles they also incentivize many or most developers to avoid deploying that kind of reasoning to make money."


Great blog, thanks for sharing.

It’s true that leetcode is an unfortunate fad of the times, and a lazy way to select candidates. I also know a number of people at Microsoft who couldn’t answer any of the leetcode problems I was given, yet who are extremely talented developers nonetheless. Fortunately, because the system is automated, you can just reapply and take another test. Seems the questions are plucked randomly, so with some luck you’ll pull a triple-easy set.


That interview process doesn’t just select for people who truly understand algorithms at a fundamental level. It also admits people who are good at memorizing the solutions to hundreds of leetcode-style problems. It is likely better at selecting more of the latter than the former.

When I ask algo questions during interviews they’re always based on real problems that have arisen in our code. Making the question actually relevant to the job is both a better interview process overall and has the added benefit of instantly filtering out the memorizers.


> Have you ever done a hard leetcode problem?

Yes. I used to compete on CodeForces regularly.

> Clearly the icon code was delegated to some intern or perhaps a child of one of the developers, because the bar is stupid high there just to have a conversation with a hiring manager.

You're implying that the developers of Microsoft can't be computer science illiterate, because a computer science illiterate couldn't possibly pass through the recruitment funnel that you experienced. But we know this code exists. Somebody wrote it. Somebody reviewed it. Somebody tested it. Everybody involved in this process thought "yeah, this is good enough", when it's clearly not good enough. Anyone with as little as CS101 experience would not write crap like this in the first place, so it's really hard to argue that the developers at Microsoft are computer science literate.


The code was likely written in the early 90s when the barrier to entry was much lower. It was a time when thousands of icons on your desktop wasn’t a use case worth considering, much less testable. That seems more plausible than all the developers being illiterate.


That's not true at all.

Many many people save everything they touch to the Desktop, so it builds up cruft over time. There are ancient jokes about it. And computers were slower then, so brute-forcing past bad big-O complexity was less feasible.

And the "barrier to entry" gets lower as an org gross and needs to higher thousands of people and get less picky.

I'm sure that MS doesn't require people to ace the hard leetcode problem to get an offer. It's just there to provide a broad picture of algorithm memorization and coding speed (which is a stupid bar, not a "high" one).


> The code was likely written in the early 90s when the barrier to entry was much lower.

You're probably right about that.

> That seems more plausible than all the developers being illiterate.

I'm not saying all developers are CS illiterate, but most are.


>The code was likely written in the early 90s

This is easily falsifiable by running a test case (create 1000 icons on the desktop) in Windows 7, 2000, XP, and 98.


It is of course good that MS has an interview process that favors really smart people. But it doesn't mean that all the work at MS will automatically be flawless. There is much more to it than that.

And people are even joking recently about this: https://i.redd.it/jhr5tv5ls2e61.jpg


Being good at "leetcode" problems and being good at writing good production algorithms certainly have some overlap... but it's not very large, and they're actively at odds with each other quite often.


> Being good at "leetcode" problems and being good at writing good production algorithms certainly have some overlap... but it's not very large, and they're actively at odds with each other quite often.

Except in this particular example, where someone managed to write a quadratic algorithm for arranging icons.


It's the leetcoder who would have written the O(n^2) algorithm and figured it's OK because it passes all the cases thrown at it. A leetcoder doesn't optimize everything, just the thing for the problem they're facing today.

It's the person used to writing production code who would have stopped to think about the fact that the cases we tested with today may not be representative of the entire problem space.

Leetcode is throwaway code. Writing tons and tons of throwaway code is, like I said, not completely unrelated to writing good production code. To get good at coding, as with anything, it is necessary but not sufficient to code a lot. But this is the sort of thing that I'd expect a leetcoder to do in practice; to miss the real problems, probably because they were off hyperoptimizing something else that was irrelevant.


I don't understand why you're trying to make this argument.

You said:

> being good at writing good production algorithms certainly have some overlap... but it's not very large

I agreed with you, and noted that this specific case is an exception:

> Except in this particular example, where someone managed to write a quadratic algorithm for arranging icons.

It seems to me like you're trying to take the most extreme position that's available in this discussion: you're trying to claim that being good at optimizing algorithmic time complexity ("good at leetcode") does not help someone at optimizing algorithmic time complexity in production, because there's some kind of mythical distinction between "throwaway code" and "production code". That distinction is not relevant here. Someone who is good at optimizing algorithmic time complexity in throwaway code is certainly good at optimizing algorithmic time complexity in production code. Just like someone who is good at riding horses is also good at riding horses on a sunny day. Or someone who is good at painting with acrylic is also good at painting with watercolors, if you compare to another person who doesn't have any experience painting.


> I would argue part of the problem is that the majority of developers are computer science illiterate.

Sometimes, but it’s certainly only part of the problem. I’ve seen these kinds of bugs from people who would “know better”, who understand what O(n^2) is and are entirely willing and capable of fixing these things when you point them out…it’s just that they don’t see them when they are writing the code, or they mispredict the value of n that the code will be used under. And this seems to be prevalent no matter how “production grade” the software is; I’ve diagnosed these in my web browser, filesystem, text editor…in many of these cases I know who the developers are and they are exceptionally talented, and yet these kinds of bugs still exist. I don’t think we can solve these issues without some sort of concerted effort to uncover them, which I would argue is a strangely rare skill.


> I would argue part of the problem is that the majority of developers are computer science illiterate.

While I wouldn’t argue against that...

Sometimes people who do know what they’re doing choose slow algorithms knowingly.

I implemented a wayfinding algorithm that works in O(n^4) a while back. Because I knew the specific use case we needed it for had a max of around 200 for n, and that I could run a brute force of all possible routes for that in less than a minute in unoptimised Python on my laptop, so that I could then cache all the results and just do a very fast lookup from a ~40,000 row database table in production. I was very very clear to everybody concerned that this solution was not a scalable general solution, but it worked like magic for what our customer needed.
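
The precompute-and-cache pattern, as a rough sketch (hypothetical names, not my actual code; I'm using a simple O(n^3) all-pairs pass here rather than the O(n^4) brute force, but the idea is the same: do the slow thing once, offline, then only do lookups in production):

    INF = float("inf")

    def all_pairs_costs(direct):
        # direct: n x n matrix of direct edge costs (INF if no direct link).
        # Deliberately simple Floyd-Warshall style relaxation, O(n^3).
        n = len(direct)
        d = [row[:] for row in direct]
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        return d

    def build_cache(direct):
        # ~n^2 rows (200 nodes -> ~40,000 entries); in reality a DB table.
        d = all_pairs_costs(direct)
        n = len(d)
        return {(i, j): d[i][j] for i in range(n) for j in range(n)}

    def route_cost(cache, src, dst):
        # Production path: a single lookup, no graph search at request time.
        return cache[(src, dst)]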


You applied global optimization, which I wish we could do more often.

Part of the type system should encode how long the fn call is allowed to take.


Couldn’t agree more. I’m saving this post to send to the next person who expresses disapproval that I ask computational complexity questions during my interviews.


How has that person not gotten in trouble for writing terribly slow software at work yet?


I've met some developers with a good intuitive understanding of fast vs slow code and who simply don't have the vocabulary (Big-O and whatnot) to perform well in that kind of an interview.

If a person really couldn't write fast code though, they could also just never have been assigned anything performance critical, the issues could have been addressed before they had any impact (e.g., in code review), or there could be a host of organizational reasons why despite pushing excessively slow code to production the consequences to any one developer are minor, especially in the short term (I've seen all of the above).

Mildly off-topic, most of the accidentally quadratic code I've seen has arisen by building on top of something that wasn't designed to support it via some kind of miscommunication or leaky abstraction (and the rest from building the algorithm never imagining N could grow large enough to matter). E.g., somebody designed a type which could be instantiated with sorted data and then passed to other places with demanding enough performance guarantees to need sorted data, and the type checker would ensure that an unsorted array wasn't accidentally being passed places it shouldn't be. Somebody else saw a sorted data type providing an Add() method and thought that was perfect -- without any exotic performance needs they just blissfully assumed any of the many well-known average-case logarithmic algorithms was used for the implementation. Fast forward a little bit, and suddenly the instantiation of a large sorted data type is taking way too long because...drumroll please...it's bubble sorting!

That isn't to say that nobody was at fault per se, but lack of comp sci knowledge wasn't the culprit; it was some combination of poor communication, ambiguous type names, leaky abstractions, not performance testing, not verifying the behavior of the components you're building on, etc....
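
To make that concrete, here's a toy version of the trap (a hypothetical class, not the real code): the name promises "sorted", the caller assumes adding is cheap, and the implementation quietly does linear work per insert.

    class SortedValues:
        """Keeps its contents sorted after every add(). The interface looks
        innocent; the cost of each add() is O(n)."""
        def __init__(self):
            self._items = []

        def add(self, value):
            # Bubble the new value into place: O(n) shifting per insert.
            self._items.append(value)
            i = len(self._items) - 1
            while i > 0 and self._items[i - 1] > self._items[i]:
                self._items[i - 1], self._items[i] = self._items[i], self._items[i - 1]
                i -= 1

    def build(values):
        # The caller's blissful assumption: roughly n log n overall.
        # Reality: n adds * O(n) shifting = O(n^2), visible only once n is large.
        s = SortedValues()
        for v in values:
            s.add(v)
        return s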


> When people realized this problem can't be solved by throwing money at it, the next suggestion was to artificially constrain n to be small enough that the program completes... It was at this point that I optimized the algorithm so it ran in O(n log n) or something similar. Something that we should have done from the start.

I had the exact same situation at my previous job. Heck, the CEO even sold it as an advantage of our system, highlighting how little data we needed to get good results. Except our results were crap.

I fixed that problem and many others, and left the job three years ago, but they are still selling their product the same way. Sometimes you really need a large amount of data to get good results.


Well, it's probably been there since Windows 3.0 or so.


Not 3.0, but 95 or IE4 Active Desktop easily. I experienced this slowdown back then, and learned not to store many icons or files in the desktop folder.


There used to be a quadratic algorithm in character input in readline, the library that many common shells use to parse commands. So if you copy-pasted a somewhat long list into something like Postgres or the Python shell, it would take a lot of time, because it had to check a couple of things on each input, and that somehow involved checking the existing characters in the input (I've forgotten the details).

I think that's a really classic case of O(n^2): you have a linear operation (n), and then you run it once for each character you insert (n), so n^2.
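
Roughly this shape, as a sketch (invented names, not readline's actual code):

    def char_width(c):
        # Stand-in for whatever per-character bookkeeping the real code did.
        return 2 if ord(c) > 0x1100 else 1

    def insert_char(buffer, ch):
        # The per-keystroke work scans the whole existing line: O(len(buffer)).
        _line_width = sum(char_width(c) for c in buffer)
        buffer.append(ch)

    def paste(text):
        # Pasting n characters triggers that O(n) scan n times: O(n^2) total.
        buffer = []
        for ch in text:
            insert_char(buffer, ch)
        return "".join(buffer)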

It bothered me for several years, until I eventually did some profiling and contacted the readline (and bash) maintainer who fixed it.

This was about 5 years ago.

There's also another readline-related problem in several shells, where they don't notice when you resize the terminal window. I can't remember if I raised that too. It sometimes amazes me that very few people appear to ever look at or debug our common infrastructure.


There are a lot of those.

When Larry Wall wrote the map operation for Perl, he just took the grep operation, added the function call, and added a resize if it was needed. The result is that the common "turn a list into a hash with map" operation was quadratic because it did the resize on every input element. I fixed that. I later noticed that unshift did the same thing for a different reason and fixed that. And then I looked at Ruby's source code and saw all the same performance mistakes.

Exactly as the blog says, n^2 is fast enough to make it into production and slow enough to fall over once there. We focus on making it right, and only return to making it fast when there is need. And most of the time the cause is a hidden n^2 that we never thought about.
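
The same failure mode in sketch form (a toy dynamic array, nothing to do with Perl's or Ruby's actual internals): growing the storage by exactly one slot per element copies everything on every insert, while growing geometrically keeps the total work linear.

    def build_exact_resize(values):
        # Resize to exactly the needed size on every insert: each step copies
        # everything so far, so n inserts cost O(n^2) in total.
        backing = []
        for v in values:
            new_backing = [None] * (len(backing) + 1)
            new_backing[:len(backing)] = backing
            new_backing[-1] = v
            backing = new_backing
        return backing

    def build_doubling(values):
        # Grow geometrically: copies are rare, total work is O(n) amortized.
        backing = [None] * 4
        count = 0
        for v in values:
            if count == len(backing):
                bigger = [None] * (len(backing) * 2)
                bigger[:count] = backing
                backing = bigger
            backing[count] = v
            count += 1
        return backing[:count]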


`java` used to have quadratic command line parsing, which nobody much noticed as long as unixes had fixed ARG_MAX, but it wasn't very long after Linux got arbitrary-length arguments that the first person discovered that java might take a week to parse a really big command line. I personally reported this defect and I don't know if they ever fixed it.

https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6728845


Is that related to checkwinsize sometimes becoming unset? I’ve never managed to find a pattern to when it happens.


> I don’t know how many people this bug affects (anyone with 200-300 icons is hitting a modest version of this, and it gets progressively worse with more) and I have no power to fix it. So, I filed a bug. I am not hopeful that it will be fixed. My last quadratic-in-Windows bug has had zero comments since it was filed a few months ago.

At some point I realized that there is a sweet spot when it comes to how popular a piece of software is, and it's way below the most popular applications.

Using popular apps like Chrome, Firefox, Windows, Fedora, Debian, etc., I don't have a snowball's chance in a hot place at even getting my issue looked at... by anyone. I'm lucky if my bug report gets a WONTFIX -- after I spend an hour filling out some elaborate bug report form which makes a tax form look easy.

However, once I started using less popular software, browsers and GNU distros which only have hundreds or thousands of users, I get way more attention. I may even get as fortunate as a developer (!) holding my hand through an issue and fixing a bug on the spot, probably from just asking a question on IRC.

It's a luxury which not everyone can afford, requiring both having technical knowledge and being willing to get my hands dirty. But the quality of experience I get in the end is so much better and more stable than anything else I experienced in the past.


In Firefox, at least, this doesn't appear to be true, or at least not always.

Recently I noticed that on my main PC, if I have uBlock Origin enabled, all network requests are delayed significantly. So I made a bug report on uBlock Origin's Github. They couldn't find the issue and they eventually asked if I could try it with another ad-blocking addon, and sure enough, the same bug. So the issue was with Firefox, not a specific addon.

So I made a performance profile and made a Firefox bug report. It took some time but they eventually figured that it was because I run a 240Hz monitor, which was causing some events not to fire until a timeout expires: https://bugzilla.mozilla.org/show_bug.cgi?id=1684703.


> Using popular apps like Chrome, Firefox, Windows, Fedora, Debian, etc., I don't have a snowball's chance in a hot place at even getting my issue looked at..

Is this based on experience? In building a browser game engine, over the past few years I've filed ~3 Chrome bugs and they all got fixed. Also, the experience was frankly nicer than what I'm used to with GitHub repositories - nobody on the Chrome team closed my issue for departing from the bug report template! In fact they didn't even ask for repro files; basically as long as I described the issue clearly, they took it from there.

Granted the process took a looong time (months IIRC). But overall I found it very painless.


I wonder which versions of Windows have this bug, since I don't think it was there when I was using 98SE with a nearly-full desktop (~400 icons) and hardware from around 2 decades ago (single core)...

> WPA just looked at the debug information for all of the EXEs and PE files and downloaded symbol files from Microsoft’s symbol servers

I've always found it funny that MS would publicly distribute these, since they get you probably 90% of the way to actual source code from the perspective of understanding how Windows works.

...so I pulled out the shell32.pdb from some version of XP (not sure which one, might be SP1 or SP2) and it has no BatchPositionChangesHelper, CCollectionSink, CListViewHost, nor IconLayoutEngine. It does have CDesktopBrowser and SHDesktopMessageLoop. Also, "windows.storage.dll" definitely isn't present in XP, given that it's a WinRT thing.

In other words, I'm inclined to believe this bug was not there originally, but was introduced during one of their horrid shell rewrites that happened after XP --- someone who has easy access to the Vista or newer symbols can see when those functions were introduced.

I also bet it was partially due to the insane amounts of abstraction that seem to be the norm in code today, where algorithms can become accidentally quadratic or worse without being noticed ("it's just a loop so it must be linear"... forgetting that in that loop's body, there's another loop somewhere down in the long chain of function calls).
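
The shape of that trap, as a sketch (hypothetical names, nothing to do with the actual shell32 code): the caller's loop looks linear, and the inner loop hides a couple of calls down.

    def position_is_free(positions, candidate):
        # Looks like a cheap helper, but it scans every existing position: O(n).
        return all(existing != candidate for existing in positions)

    def next_free_position(positions, start):
        candidate = start
        while not position_is_free(positions, candidate):
            candidate += 1
        return candidate

    def lay_out(items):
        # "It's just one loop, so it must be linear"... except every iteration
        # calls into an O(n) scan, making the whole layout O(n^2) or worse.
        positions = []
        for i, _ in enumerate(items):
            positions.append(next_free_position(positions, i))
        return positions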

...and speaking of source code, someone who isn't bothered by the legal aspects could probably get to the bottom of this bug: https://news.ycombinator.com/item?id=24586708


> where algorithms can become accidentally quadratic or worse without being noticed ("it's just a loop so it must be linear"...)

This is a takeaway message for me. With some luck I’ll remember that sometime.


I distinctly remember helping out a friend many many years ago, whose Windows box would simply lock up as soon as it had booted into the GUI. Probably around Windows 98.

Some random program had dumped temp files in the "Desktop" folder to the tune of 60000+ files. Even booting in safe mode would crash since it would still try to render the desktop icons.


> I've always found it funny that MS would publicly distribute these, since they get you probably 90% of the way to actual source code from the perspective of understanding how Windows works.

Most of the public symbols only include function names, with some notable exceptions.


It is stunning to see this brought out into the open - and in such a clear and seemingly easily reproducible way. Kudos to the OP for digging into this as much as he did. In a world of 8 billion people, I don't think anyone else would have dug this deep into a problem affecting such a large portion of the population. I'd love to see this fixed. I hope Microsoft pays attention and fixes this, rather than ignoring it like they do all the Feedback Hub/UserVoice feedback.


> Kudos to the OP for digging into this as much as he did

You should read the rest of his blog posts.

These are really good ones:

https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and...

https://randomascii.wordpress.com/2019/10/20/63-cores-blocke...



His posts on floating point maths are very interesting too.


You'd think Microsoft would have automated perf/regression tests running with scenarios for large numbers of files.

But then, there's so much variability with Windows that it's impossible to think of all scenarios to test for.


It may be too hard to do with a source base that old. Who knows what's required to even build it!


I think it has more to do with Microsoft shifting towards a more Google-like strategy, where more time is spent coming up with ways to extract value from customers, rather than provide value to customers.

Like when they fired all of those quality assurance people a while back, I can't imagine they made that decision to improve Windows 10 for their customers.


This post is honestly rocking my world.

I've always been telling people that having lots of icons on your desktop will maybe use a little bit more RAM, but it won't slow your computer down, because computers are super fast.

How many other assumptions of mine are false? Would I have faster browser experience if I deleted some bookmarks, or if I deleted my browsing history? Would my computer run more quickly if I have fewer folders on my disk? Would gmail be faster if I had hundreds of emails instead of tens of thousands of emails?


It is stunning that the problem was in such a highly visible location, but the idea that your operating system becomes slow after years of use has always been a thing. Reinstalling Windows was something I would do every couple of years; I even made money in high school doing house calls fixing people's computers.

It took about an hour or two, most of which was spent drinking tea with a random family's dad. I'd back up their data, reinstall Windows, and replace IE with Firefox and a decent free antivirus. The computer would go from being a nightmare to being a joy to use.

In that time there were definitely more dangers looming: they would accumulate search bars and other adware and trojans and such. There wasn't much value to be extracted from PCs then, so having a virus would usually not affect their lives much. It was common to get an email from your provider saying they'd disable your connection if you didn't remove the DDoS bot from your machines.


Windows has had problems with desktop icons causing slowness for years. In fact, back in Windows 95 it was notable how moving the entire contents of the desktop into a folder on the desktop really sped stuff up.

I always assumed it was the overhead of loading every icon (there is substantial overhead in loading icons - in many cases requiring many disk seeks per icon). There are also a lot of shell extensions that get involved in the process for thumbnailing documents (very expensive for some document types), and determining status for shortcuts and offline files (which might involve network requests).

Overall, icons on windows are just slow. Avoid them.


Anecdotally, in a certain Chrome version somewhere in the past years I noticed something like this: typing the first 2 characters in my omnibox caused a terrible slowdown. So likely the history used for autocomplete kept growing, or there was no sane index on the data.

Clearing my history made it _instant_ again (even with a few days of history), and it took a few weeks to become very slow again.

So yes, I'd say it's very likely that some of our assumptions might be false around such matters :)

Luckily this was fixed at some point, but unfortunately that seems to coincide with Chrome's autocomplete not showing literally anything anymore.


All browsers seem to slow down with an old profile.

I suspect it's because all the unit and regression tests are run with fresh profiles, and nobody tests performance with a crufty 10-year-old profile that's been through a lot of file format migrations, with highly fragmented indexes etc.


Yeah, this is likely. Not sure about other browsers, but I haven't noticed any such issues myself anymore. And hey, if it ain't broke... It would be nice for browser makers to implement such a test suite though, but until then it's just too easy to clear data. The average user has all their data synced to their respective cloud accounts anyway, or doesn't care enough.


I believe that modern web browsers limit your history to 3 months, so it's almost like you never have a profile older than 3 months :)


I got quite upset when three years of browser history were wiped out by an update. A Firefox update, of all things! I still can't find the setting to restore the old behaviour.


I've never experienced this.

I just checked (in Firefox) my history panel, sorted by date... Today, Yesterday, Last 7 days, This Month, a named month for the prior 5, and Older than 6 months.

And some of the randomly sampled stuff in that giant last category is quite old.


If the slowdown is sufficiently bad and it's happening on Windows then you can always record an ETW trace (https://support.google.com/chrome/a/answer/9025467) and share it with me. My DMs are open: https://twitter.com/brucedawson0xb I've resolved a few Chrome performance issues due to strangers on the internet sharing ETW traces with me, and analyzing those traces is actually my day job.


Sorry for the late response. This was on Windows yes but it must've been fixed by now as I very rarely reset profiles and haven't experienced this since. I'll keep your offer in the back of my head tho, thanks!


I've noticed it started taking a long time to download files, they'd "download" to 100%, then stall. Moving all my "Downloads" folder into an "Old Downloads" folder fixed the problem. I suspect there was a quadratic algorithm at play.


Yes. At a previous job where we were making a desktop Win32 application we experienced this. Our automatic pipeline was: the build server compiles all the .EXEs and .DLLs, creates an .EXE installer and sends it for testing. A test agent grabs the installer, installs the product and runs various test scenarios (pokes buttons, menu items, etc). One of those scenarios, HTML help (a .CHM file) displayed inside the Windows built-in help viewer, started to time out. The solution was to go to Control Panel, Internet Options, Clear Cookies. Also the product's check-for-updates-in-background feature on this machine was very slow, because cookies. Both features use Windows' (or IE's) high-level HTTP library, which contains functions such as fetch-this-url-and-figure-out-TLS-and-redirects-automagically.


> Would I have faster browser experience if I deleted some bookmarks, or if I deleted my browsing history?

In some ways, probably.

Using the "Forget about this site" command in Firefox will generally be quite slow and cause high CPU usage for ten seconds or more.


"Arranging Invisible Icons in Quadratic Time"

That title is pure poetry. The post could have ended with "Upon hearing this, the programmer was enlightened." :^)

(reference to https://simple.wikipedia.org/wiki/Hacker_koan)


"How much time does it take to sort icons if nobody is there to see it?"


"If an algorithm runs in a machine and no output device is around to show it, does it make a computation?"


My thoughts exactly when I program Chrome, in headless mode and with no UI, to scroll and click on a button.


Thanks - I was inordinately pleased with that title


Well there goes my standard tech support line:

"No, don't worry the icons don't lag your computer, just the programs that are currently running."

I do feel silly now.


Well, technically the Desktop and its icons belong to Explorer, which has to be running. So not too silly. That being said, I'm not sure how many people actually know that the task bar, desktop, and improved Alt+Tab experience with previews are part of Explorer, not just the very visible windows that show a folder.


This reminds me of the time, about a month after Napster went offline, my boss nuked a co-worker's hard drive because "the music downloads folder was making the computer slow." He typically claimed zero knowledge of computers, and left such matters up to me or his buddy that owned our ISP... but on a weekend, he took matters into his own hands and destroyed a huge compilation of irreplaceable tracks. I'm still offended by the senseless destruction to this day, even if I don't personally value Phish to that degree.


So, I wonder if I'm suffering from selection bias because of these excellent articles I keep reading on randomascii.

I keep reading these articles about terrible quadratic complexity for things on Windows; it gives me the impression that performance on Windows can be pretty bad, and frankly, I don't have the best opinion of Windows.

Is performance worse in general on Windows? I haven't used it for a fair few years. Is it just generally a worse OS? I used to somewhat resent having to use Windows when I needed to develop on it (developing for games consoles), but there were some nice parts (I have to say, Visual Studio is really nice as an integrated debugger, even if I prefer other editors).

I'm trying to check my biases. Is Windows actually a fairly poor OS nowadays?

> If you are not on Windows – I’m very sorry.

The writer of the article definitely seems to think that it's a nice one, despite the constant issues they run into. Is this just because they've used the system so long and have developed such expertise with it, or are there things to love that I'm missing?


For most users, most of the time, Windows performance is just fine. This is especially true on modern 64-bit hardware with SSDs when using the latest Windows 10 builds.

In the past? It was a bit of a mixed bag. TCP performance and network latency in general was always worse on Windows than Linux, largely for silly reasons. Similarly, the 32-bit versions had a lot of fixed sized memory regions for things like file caching that were set in stone in the NT4 era and not changed until Windows Vista. Towards the end when all 32-bit machines had 4GB of memory, this made disk access unnecessarily slow.

There are also design space differences between Linux and Windows. On Linux a process is very lightweight, basically a fat function call. On Windows the equivalent need is typically met by RPC calls to an already running process. Threads are cheap on Windows and threading has been very well supported since the 1990s; on Linux, not as well, even now. For example, even NT4 had async scatter/gather IO, which SQL Server used back in v7. I believe that Linux is still struggling with this.

Conversely, Windows beats the pants off Linux in some areas, even now. Try using a Linux desktop with a flaky DNS server. Hooo-boy. Timeout city! On Windows this is almost unnoticeable, because it fails over to the secondary DNS server in milliseconds and caches all results as expected. Linux failover and caching seems to require a significant effort to set up by the users, for some mysterious reason.

Similarly, Windows SMB v3 over RDMA is crazy fast. If you have the right NIC, you can flatline a 40 Gbps pipe with a single file copy. In the Windows Explorer GUI! No special tools needed. Copy, paste, and next thing you know you're doing 5 gigabytes per second. That takes real effort to match in Linux.

Etc...

Basically, everyone is used to the warts on their preferred system, and has learned to work around them. But when exposed to something unfamiliar, all the little issues seem to be always in the way.


> TCP performance and network latency in general was always worse on Windows

I would disagree. Unless you're trying to write "cross-platform" network code using select and other POSIX-style interfaces, in my experience Windows with IOCP or RIO will beat any other user mode application TCP stack for I/O rates and latency.

Vista's stack had latency issues in the beginning but Windows 8 with RIO was released a long time ago.

Note I said "user mode application". Linux has easier kernel bypass and well-supported DPDK for applications that really need it. You obviously cannot beat that with user-mode code.


Yeah, IME Windows really excels if you build a TCP server using the async winsock2 APIs + IOCP.

Where people get into trouble with it is when they try to code against it like it's another Unix variant.


On a day-to-day basis the only times that Windows feels slow are:

1) Git operations. These run faster on Linux, and I waste a measurable number of seconds/minutes each day due to this slowdown, presumably due to slower file operations.

2) Start bar menus. I frequently right-click an icon on the task bar and, even after they fixed the problem I reported in 2019, there is still a noticeable delay. If I then have to right-click on the application name to bring up the sub-menu then there is a secondary delay which is even longer. These delays happen even if I repeatedly right-click on the same icon. I find it bizarre that "waiting for a menu" is a thing in 2021. Blog post is here:

https://randomascii.wordpress.com/2019/09/08/taskbar-latency...


> If you are on Windows then make sure you are using symbol servers. If you are not on Windows – I’m very sorry.

I thought this was a particularly bizarre take, given that in the very same post the author has lamented not having access to the source code of the program he is trying to debug.

You can keep your symbol servers, I'll take the source code please.


I think the point is that the stuff on the symbol servers corresponds to the binaries used in the wild by the people who first encounter these issues.

On Linux, you might run into the problem of a distro shipping stripped binaries, which means having the source code doesn't help much in parsing a stack or profiler trace. Instead, you need to track down the debug info for the binaries as compiled and shipped by that particular distro. Even in the best case, that will be more work than Microsoft's single-source symbol server.


File system performance is arguably terrible on Windows, when working with thousands of files.

Stuff like "npm install", large git operations are many times slower than doing the same operation in a linux virtual machine running on that same Windows. Hard to tell if it's Windows itself, or maybe npm/git not using the "proper ways"

Other than that, Windows performance is just fine, and arguably smoother than a Linux desktop (no mouse freezing, for example).


Yep, file system performance is a continued source of frustration, most noticeable when using git. Using Windows Defender exclusions is necessary but not sufficient - it's still slow.


Consider adding your code directory to exclusions list for windows defender, if you haven't done so already. File metadata related operations are slower on windows, so it definitely feels worse with git.


> The writer of the article definitely seems to think that it's a nice one, despite the constant issues they run into.

The Stockholm syndrome is a very real thing.


It could be Stockholm syndrome for sure. I have not used Linux enough lately or OSX basically at all to know what they are like. I know the advantages and the warts of Windows very well, and I know neither for other operating systems.

The "I'm very sorry" comment was meant to be strictly in relation to symbol servers where Windows is clearly ahead (although Linux may be making progress in this area).


First off, thanks for the article! I always enjoy reading them.

Secondly, I think it's a very good point you raised. Sad to see the situation hasn't progressed much in the Linux world; reading about Microsoft's symbol servers, they seem incredibly useful!


It seems that Edge also has N^2 complexity rendering SVG files. I've been using SVGchart to render time series simulation data to SVG and viewing it in the browser. Putting out 4 traces sampled at 40KHz over 1 second produces a slow but tolerable rendering. 2 seconds of data kills it, while 0.5s is fairly snappy.

The issue at this particular threshold of time complexity is IMHO that you often need to create a new data structure with multiple fast operations (insert and query) in order to reach O(n log n). Many people will punt at that point, questioning whether it's really worth the trouble.
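
As one small illustration of the "you need both fast insert and fast query" point (a generic sketch, nothing to do with Edge's SVG code): Python's heapq gives O(log n) insert and cheap extract-min, versus re-scanning a plain list each time.

    import heapq

    def drain_in_order_naive(samples):
        # O(n^2): every step rescans the remaining list for the smallest element.
        remaining = list(samples)
        out = []
        while remaining:
            smallest = min(remaining)     # O(n) query
            remaining.remove(smallest)    # O(n) removal
            out.append(smallest)
        return out

    def drain_in_order_heap(samples):
        # O(n log n): the heap supports cheap insert and cheap extract-min.
        heap = list(samples)
        heapq.heapify(heap)               # O(n)
        return [heapq.heappop(heap) for _ in range(len(heap))]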


I got the «no desktop icons» advice in the early XP days and had no idea why it was so slow. Now I know, at last. I’m guessing this algorithm is at least from the Win2k days.


> I don’t know how many people this bug affects (anyone with 200-300 icons is hitting a modest version of this, and it gets progressively worse with more) and I have no power to fix it. So, I filed a bug. I am not hopeful that it will be fixed. My last quadratic-in-Windows bug has had zero comments since it was filed a few months ago.

What’s the way to report a bug in Windows? The closest thing I know is the Feedback Hub, which is an unmoderated public forum full of spam and dunces.


The phrase "I filed a bug" links to a bug database which Microsoft has started to use for performance bugs, especially developer facing. I have been using that quite a bit.

They prefer that you use Feedback Hub for most bug reports but...


This is not surprising to me at all. At work, we had this weird behavior where opening an “open file” dialog took minutes! The solution was to delete dead shortcuts (whose target programs had been removed) from the desktop… After that the dialog would open again in seconds.

I still can’t see the connection though.


Oh, that's an infamous one, too. Mark Russinovich did a live demo of troubleshooting it in one of his talks. ( I think it was "Case of the unexplained" 2013? https://channel9.msdn.com/events/TechEd/2013/WCA-B306 )


I want to develop a habit for looking at my code through the lens of time complexity.

What are some resources (articles, public knowledge bases, books, courses, etc.) that could make one more fluent in this matter? Something that could help build up time complexity intuition and learn to assess it faster.


Watch for nested loops. That's pretty much it. Well, that and be aware that if you have two nested loops, then one of them had better be looping over log(n) of the other's elements, or fewer.

The most common quadratic algorithms have two loops that both loop 'n' times, or the inner loop runs between 0 and n times. Both are quadratic - the second case is twice as fast, but still quadratic. In some cases the inner loop is not on 'n' at all but is on something that is typically proportional to 'n'. That's just as bad. That is, O(n*n/100) is still O(n^2).
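
A quick sketch of the pattern to look for (a generic example, not from any particular codebase): the quadratic version has two loops over n, and the fix makes the inner lookup cheap.

    def has_duplicates_quadratic(items):
        # Two nested loops over n items: O(n^2).
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                if items[i] == items[j]:
                    return True
        return False

    def has_duplicates_linear(items):
        # One loop, with an O(1) average-case lookup inside: O(n).
        seen = set()
        for item in items:
            if item in seen:
                return True
            seen.add(item)
        return False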


The one exception to this is if you have nested loops not because you're going over your data multiple times, but simply because that's the most natural way to iterate over your data given how it's organized. If you have e.g. a 2D array, then doubly-nested loops are usually the most straightforward way to iterate over the elements and aren't necessarily a red flag: many of the useful things you can do to a 2D array require touching each element at least once. But if you have triply-nested loops working on a 2D array rather than a 3D array, it's worth a second look.
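
For instance (a generic sketch): a doubly-nested loop over a grid is linear in the number of cells, so it isn't a red flag by itself.

    def total_brightness(grid):
        # Nested loops, but each cell is visited exactly once:
        # O(rows * cols), i.e. linear in the number of elements.
        total = 0
        for row in grid:
            for cell in row:
                total += cell
        return total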


Just be glad no-one thought of running Internet Explorer as the desktop background....oh wait....


Findings like this are great.

Performance bugs cost thousands of kilowatt-hours.

Computing energy isn't the largest contributor to climate change, but still: optimizing for energy efficiency will become more and more important, even for Microsoft.


> That works poorly for efficient testing, and it also seems to have worn out the external monitor connection on my personal laptop.

I have the same problem. Since the MacBook Pro 15"/16" runs the dGPU at 10-20W at idle whenever an external monitor is connected, which thermally throttles the CPU, I have to ration my external monitor usage throughout the day. So now my USB cable/port (I hope it's the cable!) is worn down and the connection arbitrarily drops when plugged in.


> If you are not on Windows – I’m very sorry.

You mean OS X, right? Last time I had an issue of this kind on Linux, I just got the source, built it in debug, and actually fixed the issue.


Sure, and that is a wonderful ability to have. But what would you do if you got a perf recording from a user who was running a different version of Linux? I was able to load this other person's ETW trace and analyze it with basically zero effort, and that (common, for me) case would be much harder on Linux.

Once I had a repro the Linux case would be arguably easier, but I couldn't get a repro until I could analyze the OP's trace.


IIRC there's an "invention" in Windows 95 to draw icons in folders/on the desktop from the bottom up, because to human perception it appears faster than drawing from top to bottom.

Wouldn't surprise me if that code survived with this paradoxical result.


Yes. Some hard facts to knock down the theory that today's computers are slow because they do a lot more stuff than 20 years ago.

No, they do a lot more of _nothing_. Something that would have been obvious and unacceptable on a 600MHz single core machine.


I assume that Microsoft never noticed this problem because there are so many ways to hang Explorer (mine is hung right now) that they'd never get to the "1000 icons" case.


Maybe I just can't read it, but it looks to me like ~BatchPositionChangesHelper doesn't have any callees. Is the layout of this inclusive profile just confusing?


The layout can be quite confusing at first. It's well designed, I think, but it takes a while before you can read it automatically.

One decision they made (visible at the top of the call stack also) is to not indent for children if there is only one child. So, ~BatchPositionChangesHelper calls FinishAndSaveBatchPositionChanges (always, in the samples) which then calls _SetRedraw, SetExtendedStyle, and FlushDataModelAndSave.

Not indenting in the single-child-case is good because it reduces the width needed for the call stack.



