While perhaps not so great from a defensive programming perspective, Mac OS feels like a different case since it's only designed to run on specific hardware.
Modern Mac OS also has all sorts of "bugs" that Hackintosh users need to patch or otherwise work around. Since we're doing something that was never intended, I don't really see these as flaws in the OS.
I would still consider them to be timebomb bugs, though. Even if you're developing for a restricted set of hardware, newer versions of that hardware could very easily violate some corner-cutting assumptions in the future. I would rather spend a little more time now to get something right and future-proof, rather than pass the problem onto future-me, who likely won't have the context anymore to find the issue quickly, or, worse, future-someone-else, who doesn't have the context at all.
Yeah, over a long enough time window I think portability and correctness will always come back to bite you. Apple could've saved time by making Darwin only handle one processor nicely, but then the Intel transition and ARM additions (iOS is still Darwin, after all) would've hurt more. Windows coasted on x86 for a while, but now that they're targeting ARM I'll bet they're pretty glad that it was originally built to be portable. Code that only works on the exact system you need today might be good enough sometimes, but if you survive long enough you'll want it to be more flexible than all that.
EDIT: I should add, this applies to applications, not just OSs. If you're an Android dev - will your app run on Android 15? Will it work on ChromeOS? Will it run on Fuchsia? If you're writing Windows software - will it run on ARM? If you're making webapps - do they work on Firefox? And maybe it's not worth the effort, especially if you don't plan to be selling the same software in 5 years, or maybe you think you can just deal with those things when you get there, but if you plan to still be in business in a decade then you should plan accordingly.
For my sanity would you mind calling it Mac OS X/OS X/macOS? I’m not too picky about you matching them all up to the right release but the moment I see Mac OS my mind jumps to the old one without memory protection ;)
Sorry about that—I was actually trying to use the name that implies a common product lineage (ie "Mac OS X is just the tenth version of Mac OS."), since we were comparing with the original Mac. Probably just ended up being confusing though.
I’m about to build my third hackintosh, although it will be my first on OpenCore. Can you expand upon why you call these “bugs” and which patches you are referring to?
Well, one specific thing I was thinking of was the limit of 15 USB ports in El Capitan and later. There's no reason for that to exist in an absolute sense, but no real Macs have enough ports to run into trouble.
Do the cards actually have USB controllers on them though? I thought all the USB-C ports on the Mac Pro were routed through the motherboard in order to support using any port for displays irrespective of what GPU is driving it. Or is this one of those weird Thunderbolt vs USB things?
Oh—I have no idea then, sorry! Actually, as far as I know, Apple could have fixed the port limit bug in Catalina, since I've never set up a Hackintosh on that OS. Kind of hoping a proper Hackintosh developer will chime in here because I'm not really qualified!
No, only ports on the motherboard. I think the limit is technically per-controller, but I'm not sure and I don't want to say something wrong. If you add ports via a PCIe card, those don't count against the limit either.
That said, the limit is more problematic than it initially appears, because USB 3 ports count twice—once for USB 2 devices, and once for USB 3 devices. Some motherboards also use USB under the hood for things like Bluetooth (as do real Macs, btw), and even USB headers which aren't connected to anything will take up space if you don't explicitly exclude them.
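To make the double-counting concrete, here's a tiny sketch of how a board can blow past the 15-port cap faster than the physical port count suggests. The port mix is invented for illustration, not taken from any real motherboard:

```python
# Hypothetical sketch of how the 15-port limit fills up. Each USB 3 port
# exposes both a USB 2 and a USB 3 "personality", so it counts twice.

def effective_port_count(usb2_only, usb3, internal):
    """Count ports the way macOS's limit sees them: USB 3 ports count
    twice, and internal/unused headers count unless excluded."""
    return usb2_only + 2 * usb3 + internal

# A board with 4 USB 2 ports, 6 USB 3 ports, and 2 internal headers
# (Bluetooth plus an unconnected front-panel header):
total = effective_port_count(usb2_only=4, usb3=6, internal=2)
print(total)        # 18 -- already over the 15-port cap
```

So only 12 physical connectors can still exceed the limit, which is why Hackintosh guides spend so much time on USB port maps.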
Holy shit. I feel like this neatly explains why Windows 95 was an utter crash-fest on the computer I bought just before my freshman year of high school. With an AMD K6-2 running at 350 MHz, it was the first computer I had that was all new components instead of the franken-systems built from a dumpster-dive base with other dumpster-dived components grafted on. The shop I bought it from initially put 95 OSR2 on it. And it did like to crash. It wasn't until I started using Windows 98SE that I started to see anything resembling stability, and not need to re-install every other month.
If only I had known about AMDK6UPD.EXE back then and been able to understand the reasons behind the crash and why the patch fixed things.
I have to admit I find this type of article about old computers/software quite interesting, as recently I discovered a backup of mine that contained source code I wrote in 1993. I was writing assembly language back then, using a really great library called Spontaneous Assembly. First version 2.0 and then 3.0. Spontaneous Assembly 3.0 added support for easily writing TSR (Terminate and Stay Resident) code.
Back in the early 1990s I was in college and working in the computer lab. So I wrote various little DOS utilities to help us better manage the computers and the interaction with Novell Netware.
Due to this reminiscing I have even purchased a few tech books from that time. MS-DOS Encyclopedia, Peter Norton's Programmers Guide to the IBM PC, and some others.
I only wish I still had a copy of SpontaneousAssembly 3.0 as it would be fun to recompile some of my old code!
Yeah I have emailed Kevin Crenshaw but he stated the project is "abandonware" at this point. He said to search for it online. So far no luck with version 3.0. It would have been fun to try playing with some old code in a VM :)
I don't live anywhere close to the Library of Congress, so it's not easy for me to get a copy there :(
Anyone can get a reader card at the Library of Congress if they have a photo id and are at least 16 years old. I'm not sure how you access computer files there, though. The reader card has to be obtained in person. With the Library of Congress closed to visitors because of COVID-19, I imagine it's not possible right now.
I go to the Library of Congress every once in a while and could ask for this. In my experience, to get items like this you use the Ask a Librarian link on the right and they'll work with you from there. Send me an email at the address here and I'll try this the next time I go: http://trettel.org/contact.html
(Unfortunately my next Library of Congress trip might not be for a year or more at this point due to COVID-19 and life.)
I could make disk images with GNU ddrescue, or another software if you prefer. Note that I don't have a floppy drive at the moment but will ask them if they have an external USB drive.
>"It was somewhat unfortunate that this was called an “AMD fix” (the file containing the solution was called AMDK6UPD.EXE), even though Microsoft was clear that this was not a problem in AMD CPUs but rather in their own code."
I'll bet the AMD name was suggested by producers and/or management at the protest of engineering, with the argument that the public knows this as an AMD problem and so it's better to call it that regardless of the technical reality. I've seen this logic many times in my career and do understand there's some rationale to it.
Could it be simply because since the bug primarily affected AMD CPUs at the time it would make it easier for everyone if the update was called the “AMD update” as opposed to some cryptic name like “network stack delay loop update”?
Faced with the Great Satan of Software's apparent refusal to admit its mistake and eliminate the charge, AMD has made the fix available from its Web site free of charge.
A lot of games had similar problems too. I remember spending ages downloading the demo of Screamer 2 after getting our new computer and "the internet", and being disappointed that it crashed on startup for the same reason.
The turbo button originates from Taiwanese "Turbo XT" clones that would run an 8088 or V20 at 8, 10, 12 or even 16 MHz with turbo engaged and 4.77 with it off.
Later 386 and 486 systems implemented turbo logic in different ways. Some by reducing bus speed, some by disabling CPU caches, some by inserting wait states for memory access.
I recall having a turbo button on a 386-based computer that halved the speed - I don't recall if it was from 40 MHz to 20 or from 80 MHz to 40, something like that. I also recall seeing computers of that era which had a MHz display on the front to show the current processor speed.
The highest clocked 386 CPU, and one of the most popular ones, was the Am386DX40, so your computer would likely have had a 40 MHz part.
It is a common misconception that the MHz displays from this era had any kind of communication with the CPU or motherboard. They don't -- they are dumb devices that can switch between showing two arbitrary patterns on the LED display and are "programmed" by painstakingly setting jumpers on the back. Often when using 2-digit displays for computers with 3-digit clock speeds they would be programmed to display "HI" and "LO" instead of a number.
So when your display showed "20" that doesn't mean the CPU was running at 20 MHz. It might have been, because 386 CPUs always ran at the bus speed and 20 was a common 386 speed, but things get a lot more complicated when you move to the 486 platform with internal clock multipliers.
My 486 DX4/100 (33 MHz bus speed, 3x multiplier) has a turbo button that when disengaged lowers the effective speed of the system to something roughly like a 486 DX50. But this is not an exact science and does not in fact mean that the CPU is running at 50 MHz.
Maybe initially, but there definitely were still 486s with Turbo buttons that would throttle the machine to some frequency much higher than 4.77MHz (or just disable the cache). I had one! And apart from that, CPU generations have vastly different speed profiles if you kept them at the same frequencies (which is mostly theoretical, clocking a 486 at 4.77MHz, while not necessarily impossible, might turn out to be quite a project on consumer hardware).
Turbo buttons were always a shaky proposition. They might have worked okay-ish with the original AT to slow the machine down into a somewhat fitting range to play older games, but probably quickly devolved into some show-offy marketing ploy ("look how fast it goes if I press this!").
(which is mostly theoretical, clocking a 486 at 4.77MHz, while not necessarily impossible, might turn out to be quite a project on consumer hardware).
Quite a project indeed, but possible with the right motherboard -- as an amusing side note, there is a very strange sub-sub-sub-genre of computer enthusiasts who enjoy the challenge of installing various Windows versions on the slowest possible systems that will run them:
They've managed feats like running Windows XP on a 4 MHz Pentium Overdrive and Windows ME on a 3 MHz 486SL (that one takes 1 hour and 10 minutes to even boot).
It depends. On later computers--around the time frame we're talking about where the Turbo Pascal CRT bug was showing up--the turbo button, where it still existed on computers of the day, often just enabled/disabled the L2 cache near the processor.
> According to the Pentium Processor Family Developer’s Manual Volume 3: Architecture and Programming Manual (Intel order no. 241430), the LOOP instruction’s absolute best case when the branch is taken is 6 clock cycles. The Intel manual notes that “[t]he unconditional LOOP instruction takes longer to execute than a two-instruction sequence which decrements the count register and jumps if the count does not equal zero”.
This makes me wonder about three-instruction sequences of increment / decrement / jump-if-nonzero, and one-instruction sequences of jump-if-nonzero. What's the point of having the unconditional LOOP instruction in the first place?
We still face a related class of problem today. The x86 PAUSE instruction has wildly varying throughput. On most Intel parts it is 1/8 or so, but on Skylake Xeon it's 1/141. On Ryzen it's 1/3. I've seen code that makes assumptions about how much real time must have passed based on PAUSE loops.
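To put rough numbers on it, here is a back-of-the-envelope sketch (cycle counts are approximate figures for illustration, and the 3 GHz clock is assumed) showing how far apart the same fixed-iteration PAUSE loop lands in real time on different parts:

```python
# Illustrative per-PAUSE cycle costs; real values vary by stepping.
PAUSE_CYCLES = {"pre-Skylake Intel": 8, "Skylake Xeon": 141, "Zen": 3}

def spin_time_us(iterations, cpu, ghz=3.0):
    """Approximate wall time of `iterations` back-to-back PAUSEs,
    assuming a fixed `ghz` clock (a simplification)."""
    cycles = iterations * PAUSE_CYCLES[cpu]
    return cycles / (ghz * 1000)  # microseconds

# The same 10,000-iteration backoff loop spans a ~47x range:
times = {cpu: spin_time_us(10_000, cpu) for cpu in PAUSE_CYCLES}
```

Any code that treats "N PAUSEs" as a unit of real time is therefore baking in a factor-of-50 assumption, which is exactly the kind of thing that breaks fairness or timeout logic when the hardware changes underneath it.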
Yes, private code. Basically there was some mutex fairness thing that was written on a SKX and on a Zen CPU where PAUSE is 50x faster it didn't have good fairness, it was too tight.
The last time I benchmarked it, which was at the beginning of the i7 era, LOOP was just as fast (within the margin of error) as dec/jnz - Intel probably doesn't want to be seen as slower than AMD and didn't care about that timing loop anymore.
That's not how microcode works, but yes - the instruction decoders and retirement logic are capable enough these days that you can have it decode into the same sequence of μops, which is almost certainly what happens.
Note that it's not exactly the same thing, because an interrupt can happen between the decrement and jump for the two-instruction case, but not for the LOOP case.
i love this site. os/2 was such a huge part of my life in the 90s and the sole reason i love computers back then. it's great that this site has preserved so much history of it.
an operating system made through a joint venture between microsoft and ibm. it was the predecessor to WinNT. it could run dos, win16, win32, posix as well as os/2 native apps. it really was an amazing operating system at the time with a VERY passionate community behind it. watch some of the videos for a good take on it:
i learned about OS/2 from the 'Doing Windows' series. really recommend it, it's a great read about the history behind all this stuff, the computing landscape of that period, and why OS/2 was a huge achievement.
the road to "run DOS stuff [without being DOS]" was very long, and paved with many gravestones... i think OS/2 comes in in part 5 or 6, but i really recommend reading the whole thing.
tl;dr A graphical OS developed by IBM that succeeded DOS and competed with Windows. Notably, it featured pre-emptive multitasking before Windows did. It was not a success in the home market but was reasonably successful in big business, especially finance, for a short amount of time.
As a tween/teen, I learnt a lot from OS/2. Up until then I had only used DOS and Windows 3.x. And then my Dad bought me a copy of OS/2 2.0, and also the Walnut Creek Hobbes OS/2 CD-ROM. And I discovered EMX (the OS/2 equivalent of Cygwin). And I started playing with bash, EMACS, GCC, etc. Next thing you know, I was installing Slackware Linux. At which point I largely lost interest in OS/2. But EMX was an important stepping-stone for me in getting in to Linux.
I think it's important to note (even in a tl;dr) that for a time OS/2 was a joint venture between IBM and Microsoft, and that MS sabotaged that relationship while secretly working on WinNT.
On a related note, "Showstopper!: The Breakneck Race to Create Windows NT and the Next Generation at Microsoft" is a surprisingly entertaining story, and reads more like a novel than a documentary/memoir.
As a result of a feud between the two companies over how to position OS/2 relative to Microsoft's new Windows 3.1 operating environment, the two companies severed the relationship in 1992 and OS/2 development fell to IBM exclusively.
Windows 3.0 was eventually so successful that Microsoft decided to change the primary application programming interface for the still unreleased NT OS/2 (as it was then known) from an extended OS/2 API to an extended Windows API. This decision caused tension between Microsoft and IBM and the collaboration ultimately fell apart.
Those were the days when people owned their own software and DRM had not made its way into games, since the internet has enabled PC game theft on a massive scale, by Valve, EA and Activision.
OS/2 was an alternative Operating system oriented towards businesses that could run apps from different operating systems under one unified framework.
I've written this kind of code myself, where you measure a time delta and divide something by the delta. It's always something that sticks out though, that you might divide by zero (especially if you did it in Java!).
The article says it would have been picked up in code review, and I agree. But it just seems odd that it wasn't changed right there. Why not just write the loop so that it keeps looping as long as the divisor is below some number like 10ms? You also want to minimise the estimation error, which is easier to do if you divide by a slightly larger number. Consider a loop that takes between 1 and 2ms to finish: your estimate will be either x or 2x.
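The "keep looping until the window is big enough" idea can be sketched in a few lines. This is my own illustration of the pattern, not the actual Windows code; the 10 ms threshold and batch doubling are assumptions:

```python
import time

def calibrate(min_window=0.010):
    """Spin in ever-larger batches until the measured window is at
    least `min_window` seconds. The divisor can then never be zero,
    and the relative measurement error shrinks as the window grows."""
    iterations = 1_000
    while True:
        start = time.perf_counter()
        for _ in range(iterations):
            pass
        elapsed = time.perf_counter() - start
        if elapsed >= min_window:
            return iterations / elapsed  # iterations per second
        iterations *= 2  # window too short: retry with a bigger batch

rate = calibrate()
```

No matter how fast the machine gets, the loop just doubles its batch a few more times before returning, instead of dividing by zero.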
ATOMIC CONTEXT: You must use the delay family of functions. These functions use the jiffie estimation of clock speed and will busy wait for enough loop cycles to achieve the desired delay
Would it have been possible for Microsoft to test for something like this, or would it be possible today? For example, is it feasible to slow down time to simulate an impossibly-fast CPU?
You can do that for Linux userspace apps using the "faketime" utility. It just intercepts the calls that try to find out the actual system time. Not sure how that would affect kernelspace, since the kernel is sort of the thing that decides what time actually is.
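The same interception trick can be done in-process for testing. Here's a toy sketch (my own illustration, not how faketime is implemented) that wraps Python's clock so it advances 1000x slower, which makes calibration-style code behave as if the CPU were 1000x faster:

```python
import time

real_perf_counter = time.perf_counter

def slowed_clock(factor=1000):
    """Return a clock that advances `factor`x slower than real time."""
    origin = real_perf_counter()
    def fake():
        return origin + (real_perf_counter() - origin) / factor
    return fake

# Patch it in, roughly the way faketime intercepts libc time calls:
time.perf_counter = slowed_clock()
start = time.perf_counter()
for _ in range(100_000):
    pass
elapsed = time.perf_counter() - start   # ~1000x smaller than reality
time.perf_counter = real_perf_counter   # restore the real clock
```

Any code that divides by a measured delta can then be exercised against deltas that round to zero, without needing impossibly fast hardware.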
We once got a couple of high end motherboards for AMD processors, back in the Win98 era, and I tried to install Win95 on one of them. The box said that Win98 was required, but it couldn't hurt to try, right? Maybe there would not be drivers for some of the peripherals, but all I needed was the CD-ROM and hard disk to work.
Install is going fine until it is time for it to reboot. That failed. It didn't even get to the BIOS initialization screen. It appeared to just be dead.
Figuring we just got unlucky and got a defective motherboard or processor, I tried installing on the other one so I could get on with my work.
That one died too.
Eventually I found something about this on the motherboard maker's support site. The problem was with the device probing during install.
What I'm about to say is not made up. As unbelievable as this might sound to people who grew up with more modern PCs, at one time they really did work as I'm about to describe.
The early PC buses had no built-in way for the host to identify what cards were plugged into the expansion slots. Typically, an expansion card would have a set of jumpers or DIP switches that could choose between several sets of possible addresses for the card's registers to appear in I/O space.
The user was expected to keep track of the settings of all cards they installed, and adjust the jumpers appropriately to avoid conflicts, and record the settings in a config file that the drivers would read to find out where their card was.
Later buses, such as EISA and later PCI, provided ways for the host to find out what is there and how it is configured. But operating systems still needed to support the old bus, and they wanted to make this as user friendly as possible.
So systems like Win95 would have a device probe during install that would try to identify what is on your old bus. They would do this by very carefully probing the I/O address space.
For example, suppose you know that a particular network card, if present, has to be at one of 8 addresses, and you know that after power-on or reset certain bits will be set in its status register and certain bits will be clear. You can read those 8 possible addresses, looking for the right bit pattern. If you don't find it, that particular network card is not present. If you do find it, you can do more tests to confirm it.
Some of those other tests might involve writing to the device registers, and seeing if it responds the way that network card should.
This is obviously risky. What if it is not that network card, but rather a disk controller card that just happens to have a register that after reset has the same bit pattern you expect in the network card status register? The thing you write then to verify it is the network card might be the "FORMAT DISK" command to that disk controller.
And so you had to be very careful with these probes. They had to be done in a safe order. You'd need to probe for that disk controller before you probed for that network card.
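A toy model makes the hazard concrete. Everything here is invented (the addresses, bit patterns, and the "FORMAT" command value); the point is only what happens when a probe's verify-write lands on the wrong device:

```python
# Toy model of ISA-era probing: no device enumeration, just reads and
# writes to I/O addresses and hoping for the best.

class FakeIOSpace:
    def __init__(self):
        # A disk controller whose post-reset status register happens to
        # match the pattern the network-card probe is looking for.
        self.regs = {0x1F0: 0b1010}
        self.formatted = False

    def read(self, addr):
        return self.regs.get(addr, 0xFF)

    def write(self, addr, value):
        if addr == 0x1F0 and value == 0x50:  # our fake FORMAT command
            self.formatted = True            # oops: disk wiped

def probe_network_card(io, candidates=(0x1F0, 0x300)):
    for addr in candidates:
        if io.read(addr) == 0b1010:   # pattern the NIC should show
            io.write(addr, 0x50)      # "verify" write -- dangerous!
            return addr
    return None

io = FakeIOSpace()
probe_network_card(io)
print(io.formatted)   # True: the NIC probe just "formatted a disk"
```

Probing for the disk controller first (and marking its address as taken) is exactly the kind of careful ordering the install-time probe had to get right.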
Those new motherboards contained peripherals that Win95 did not know about, and so the Win95 probe procedure did not know how to avoid doing bad things to them.
One of those peripherals was the built in interface for flashing the BIOS EEPROM. The Win95 device probe ended up overwriting the BIOS.
It's unfortunate (though understandable) that you bricked both before discovering the root cause. I remember hot-swapping EEPROM chips to repair a bricked motherboard from a similar era.
While attempting to upgrade the BIOS, something went wrong. Most likely there was a bad sector in the boot floppy I used. The result was an unbootable machine. Solution? Swap in a working EEPROM chip from a compatible motherboard. Boot to a floppy disk that has a BIOS imaging utility and image file. Hot-swap the bad EEPROM chip back in. Re-flash the BIOS. Or, if you had money, you could sometimes purchase a pre-flashed replacement EEPROM chip.
I don't miss those days, but am fortunate to have experienced them. It forced us to learn more about how a computer really works.
>"The issue also illustrates how seemingly solid assumptions made by software and hardware engineers sometimes aren’t. Software engineers look at the currently available CPUs, see how the fastest ones behave, and assume that CPUs can’t get faster by a factor of 100 anytime soon."
Disagree. Where I've worked (Oculus/Facebook and EA) we would never allow such assumptions in code reviews, regardless of how unlikely the failure may be. You never allow div/0 unless it's mathematically provable to be impossible. I'm sure other orgs have the same code review policy, and static analysis these days would also catch it.
That's simplifying things a little. The 90s were a completely different time in computing, still somewhat pioneer when it came to "modern" operating systems in personal computing. What came before on home computers was usually tied to the actual hardware and its implementation in a very thorough way, where way more outrageous (but at the time, widely accepted) assumptions were made. For example, what memory location to write into for direct display on the screen from your application code. A few years earlier, the absolute time that a particular instruction takes.
Computers became more powerful and more diverse, we added abstractions, we abolished assumptions.
And still I'm pretty sure that even in Oculus (to pick up your example, I know nothing about that), there are bound to be a great deal of assumptions in the code that cease to be valid with later versions of the products.
By the way, it just dawned on me that preventing the division by 0 is not even solving the problem. What then, just set the delay to the biggest representable delay? But on a machine with a 1000x faster CPU, that can still be off by an order of magnitude or two. And depending on what the delay is used for, that could then cause much harder to debug problems later on. Some assumptions about reasonable ranges had to be made, just like the assumption that 32 bit was a reasonable address size back then. But a more obvious error message would have been nice (something the article mentions as well).
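A rough numeric sketch of why clamping doesn't rescue the calibration. The tick length matches the classic 18.2 Hz PC timer, but the loop counts and per-iteration times are invented for illustration:

```python
# The calibration counts elapsed ~55 ms timer ticks while running a
# fixed number of loop iterations, then divides.

TICK_MS = 55

def loops_per_tick(loop_time_us, n_loops=1_000_000):
    elapsed_ticks = (n_loops * loop_time_us / 1000) // TICK_MS
    if elapsed_ticks == 0:
        elapsed_ticks = 1   # "fix": clamp away the division by zero
    return n_loops / elapsed_ticks

slow = loops_per_tick(1.0)      # ~1 us per loop: 18 ticks elapse
fast = loops_per_tick(0.001)    # 1000x faster CPU: clamped to 1 tick
```

On the fast CPU the whole run fits inside one tick, so the clamped result claims 1,000,000 loops per tick when the true figure is about 55,000,000 -- the crash is gone, but every derived delay is off by a factor of roughly 55.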
> we would never allow such assumptions in code reviews
Right.
Today we have the benefit of hindsight, we know how fast processors have become. In the Win3.1 era, no one sane would have predicted this. Even Moore's Law applied to transistor counts, not processor speeds.
What you should ask is: what other assumptions are you implicitly making that you are not currently aware of?
> In the Win3.1 era, no one sane would have predicted this.
That's a bold claim!
We went from 4-8MHz 286 chips to 20-50MHz 486 chips in the decade leading up win3.1's first release. By the time we were approaching windows 95, pentiums were up to 133MHz.
Those chips already had really fast branch instructions.
So you're already staring down the barrel of calibration taking 15 milliseconds. It's a reasonably obvious step to consider LOOP being a cycle faster than adding and branching, which takes you all the way down to 7 milliseconds.
So taking that all together, x86 clock speeds have doubled 3-4 times in the last dozen years. A chip could come out tomorrow that takes 15 or even 7 milliseconds on the calibration loop. Your code breaks if it hits 2.
I think someone sane could have predicted the problem.
Also, even into the early 2000s the majority of programmers were self-taught to varying degrees. University training, boot camps, ubiquitous internet access to reference materials etc. have vastly increased the amount of information available to a budding programmer. Back in the 90s you just hacked on something until it worked.
This reminds me of some discussion about the evolution of games (can't find it right now, it was probably about ID Software).
Computers today are literally 1000x better than PCs 30 years ago. 1000x (even more) faster, 1000x more RAM, not to mention storage and other capabilities.
Huh, my 1st computer had 640KB of RAM (does it count as a computer?), the 3rd one had either 4 or 8 MB. My current one has 16GB, so you're right, that is actually 2048 (or 4096) times more...
I somehow cannot believe that nobody has mentioned the first significant instance of a similar crash, at least from the DOS days: Borland's CRT.TPx/conio.a, which consistently caused division by zero on anything faster than a Pentium MMX. At the time the wisdom was that anything over 200 MHz is too fast, but in reality anything AMD with a "Pentium Rating" over 200 was fast enough to cause consistent crashes, while a 266 MHz Pentium MMX was slow enough that it mostly worked (and anything i686 consistently crashes).
I have used Windows 98 SE on CPUs up to various Pentium 3s. There was a problem with big (above 512MB) RAM volumes but it was easy to solve.
I was only forced to switch to Windows XP when I upgraded to a Pentium M (Dothan) - besides the Safe Mode I could find no solution to run Windows 98 on it.
I would gladly return to Windows 98 now if my hardware and software supported it.
> I would gladly return to Windows 98 now if my hardware and software supported it.
Not if you would like to browse web. HTTPS - sorry. But maybe you should give ReactOS a shot with the classic theme: https://reactos.org/ (WinNT era I think)
It was much more a clean-and-simple design. The less complexity and useless (to me) features an OS has - the more I like it. Of course I mean the way it works, not the way it looks.
It also took so little RAM and HDD it would altogether fit in a humble corner of my RAM today - I don't really get why modern software needs so much more.
My views on 98 vs XP are the opposite. I ran 98 for a while on my HP Vectra VL with a ~233MHz Pentium II and, if I remember correctly, 256MB of RAM. It was dog slow, took at least 10 seconds just to open an Explorer window.
Decided to try XP on it, and I was blown away at just how much better it performed. Explorer windows opened up instantly and the whole system just ran smoother.
Not to discount this, but I do wonder how much might have been the "clean Windows install" effect. What would have happened if you did a clean install of 98?
"Clean-and-simple design." OK. So you like an OS with no meaningful memory segmentation (you could hop into kernel mode by modifying a register), awful multiprocessing support, and absolutely no defense in depth against the truckloads of malware that are all over the Internet?
It was totally enough for a personal computer. Certainly not enough for a server. I only ran trusted software and could do whatever I wanted. As for the Internet - it's a browser's job to sandbox the JavaScript.
If you are implementing an emulator, you must insert some jitter to the emulated floppy drive.
Because if there is no jitter, the ROM's calibration code does a division by zero and crashes.
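A minimal sketch of the fix on the emulator side. The base interval corresponds to a 300 RPM drive (200 ms per revolution); the jitter magnitude is an assumption, as is the idea that the ROM divides by the difference between two measurements:

```python
import random

def emulated_index_pulse_interval(base_us=200_000.0, jitter_us=150.0):
    """Interval between index pulses for an emulated 300 RPM floppy,
    with a little random jitter like a real mechanical drive."""
    return base_us + random.uniform(-jitter_us, jitter_us)

# The ROM's calibration divides by the drift between two measured
# intervals. A perfectly exact emulated drive makes that drift exactly
# zero every time -- hence the division by zero. With jitter:
a = emulated_index_pulse_interval()
b = emulated_index_pulse_interval()
drift = a - b   # virtually never exactly zero now
```

It's a nice inversion of the article's bug: here the hardware being *too perfect* violates the software's assumptions, rather than being too fast.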