
The 640K memory limit of MS-DOS - ingve
https://www.xtof.info/blog/?p=985
======
lowkeyokay
Those were the days...I remember getting a brand new game, loading it and
getting "You need at least 587 KiB of conventional memory to run this game".
Check the autoexec.bat and config.sys for anything I didn't need. Do I really
need to use the mouse? No. Remove the mouse.com. CTRL-ALT-DELETE. Became very
familiar with that combo. Type MEM. Still only 577 KiB... What else?

LOADHIGH and DEVICEHIGH - Now we're in business! 595 KiB - Awesome!

Oh no. My dad's coming home and he won't allow me to mess with the config
files. What did I change...

Then I learned about boot disks. And with DOS/4GW all problems were gone.

Never knew the technical reasons for the limitations but this article brought
back a lot of memories.

~~~
DrJokepu
> Then I learned about boot disks. And with DOS/4GW all problems were gone.

Actually, I think MS-DOS 6.0 added CONFIG.SYS boot menus, which made it
unnecessary to mess around with boot disks (but it didn't help with your dad
discovering that you have been messing with the CONFIG.SYS).

~~~
gerdesj
"Actually, I think MS-DOS 6.0 added CONFIG.SYS"

No, config.sys was a thing from day 1. Getting the bloody memory setup on MS
stuff right was a nightmare and wasted a lot of time.

~~~
akavel
"...added config.sys _boot menus_ ", is what the parent comment says

~~~
gerdesj
I have an idea about what I'm on about:

[https://www.novell.com/coolsolutions/tools/14053.html](https://www.novell.com/coolsolutions/tools/14053.html)

(I'm Jon)

~~~
jackfraser
Yeah, no, I'm sure you do, but you didn't read the comment. He's talking about
the native INI-style syntax for boot menus baked into the DOS 6.0 kernel that
allows you to specify multiple different CONFIG.SYS sections, and lets you
pass an option to your AUTOEXEC.BAT containing the config name so it can
finish the job off.

A must for anyone who ran a different config for different scenarios. I often
ran without CD-ROM drivers to get a couple K more memory for the pickiest
programs.
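From memory, the DOS 6 multi-config syntax looked roughly like this (the
section names, drivers, and paths below are illustrative, not an exact config
from the era):

```
[menu]
menuitem=GAMES, Games (no CD-ROM, max conventional memory)
menuitem=FULL, Everything loaded
menudefault=FULL, 10

[common]
DEVICE=C:\DOS\HIMEM.SYS
DOS=HIGH,UMB

[GAMES]
REM no CD-ROM driver here, saving a couple K of conventional memory

[FULL]
DEVICEHIGH=C:\CDROM\CDROM.SYS /D:MSCD001
```

AUTOEXEC.BAT could then branch on the %CONFIG% variable, which DOS set to the
chosen menu item's name, to finish the job (e.g. only loading MSCDEX for the
FULL config).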

------
DrScump
"A solution might have been to switch back and forth from Protected Mode. But
entering Protected Mode on the 286 is a one way ticket: if you want to switch
back to Real Mode you have to reset the CPU! Bill Gates, who understood very
early all the implications, is said to have called the 286 a “brain dead chip”
for that reason."

But extenders like DOS/16M[0] did this switching transparently and fast, by
having a real-mode process handle DOS interactions.

Given the proliferation of IBM "compatible" BIOSes in PC clones of that era
(many not all that "compatible"), DOS/16M did a fantastic job of running on
almost all of them. I worked with it a lot when Informix was first ported to
extended-memory DOS usage using DOS/16M. There was a large deployment for a
branch of the US armed forces using the product.

[0]
[http://www.tenberry.com/dos16m/faq/basics.html](http://www.tenberry.com/dos16m/faq/basics.html)

When DOS/16M was first released, the company name was Rational Systems.

~~~
takeda
If I understand it correctly, it uses Virtual Mode, which was added in the
386. In the FAQ question about speed it also mentions 386 onward[1]. I don't
think DOS/16M contradicts anything said there.

[1]
[http://www.tenberry.com/dos16m/faq/basics.html#10](http://www.tenberry.com/dos16m/faq/basics.html#10)

~~~
amluto
And virtual 8086 mode is still supported on modern Linux kernels! (Only 32-bit
kernels, though. AMD decided to prevent it from being used by a 64-bit
kernel.)

v8086 mode, like many aspects of the x86 architecture, is a big mess.

------
narag
_" Windows 95 made those memory headaches a thing of the past as MS-DOS became
progressively irrelevant."_

Hmmm... The key word is "progressively". Windows 95 was still an extender
started from DOS and using its file access subsystem.

I clearly remember making some tweaks so that the computer started in DOS and
I could choose between launching Windows 3 or Windows 95 from the command
line. And after shutting down Windows, it returned to DOS.

The default was to launch Windows 95 (from autoexec.bat?) and power down the
machine when shutting down 95, hiding the fact that DOS was still there.

One note: every time I have told that story, someone insisted that 95 wasn't
just eye candy. Of course it wasn't: it switched to protected mode, so DOS
wasn't active. But the file subsystem was not different, and it showed.

~~~
acqq
> file subsystem was not different

Windows 95 used its own 32-bit file system code whenever possible:

[https://en.wikipedia.org/wiki/32-bit_file_access](https://en.wikipedia.org/wiki/32-bit_file_access)

It had to fall back to calling the BIOS only in special cases, as backward
compatibility for some older hardware/drivers. So even if you launched it from
DOS, on most newer hardware DOS was just a bootstrap phase.

~~~
narag
My memories about that are blurry but I remember there was some general
problem with file access that persisted until XP adopted NT core for consumer
versions. Maybe something about concurrency, not sure, sorry.

~~~
acqq
Windows 95 didn't have concurrency problems different from NT, unless you used
it on old hardware that depended on DOS drivers. The main benefits NT brought
were various protection mechanisms, which were convenient for programmers, but
95 surely wasn't slower, especially with less RAM. The major factor that
slowed NT adoption was its significantly higher RAM needs, which translated
into visibly more expensive computers. NT was also slower for all graphical
routines for a while:

[https://en.wikipedia.org/wiki/Architecture_of_Windows_NT](https://en.wikipedia.org/wiki/Architecture_of_Windows_NT)

"The Windows NT 3.x series of releases had placed the GDI component in the
user-mode Client/Server Runtime Subsystem, but this was moved into kernel mode
with Windows NT 4.0 to improve graphics performance.[23]"

Note that NT 4.0 appeared a year after Windows 95. Programmers had the benefit
of using Windows NT even before 95 appeared, but the advantages of 95 were
clear for the "normal users" who didn't think they needed the "protection"
aspects: the most popular software of the time (e.g. Microsoft Word) was
unstable enough to crash from time to time even on NT. The more stable Word
versions appeared only after Gates' famous 2002 memo started to be applied to
the Microsoft Office products:

[https://en.wikipedia.org/wiki/Trustworthy_computing#Microsof...](https://en.wikipedia.org/wiki/Trustworthy_computing#Microsoft_and_Trustworthy_Computing)

Windows 95 was for a time a good trade-off for many practical uses.

~~~
narag
I meant problems with file access specifically.

~~~
acqq
And I specifically know there weren’t such with native 32-bit drivers existing
in Windows 95.

------
cpmouter
Good post, but believe me, you can write a post without ending every paragraph
with an exclamation mark. It's possible, but only if you try!

------
kabdib
Intel made the paragraph size of the x86 16 bytes, or four bits of the
physical address. The documentation I read at the time claimed that this was
to make sharing pieces of segments easier (or perhaps they wanted to encourage
people to make zillions of very small segments, e.g., one per procedure).

I think that was simply done without much thought and then post-facto
justified by the hardware engineers. They couldn't afford another 4 pins of
physical address space, so they effectively made that pin limitation
architectural.

If the paragraph size had been 256 bytes (just shift another four bits) then
the x86 would have had a logical address space of 16MB, just like the 68K at
the time. This probably would have changed the face of the personal computing
industry (no 640K limit, for starters).
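The arithmetic behind that checks out: with a 16-bit segment register, the
paragraph size alone sets the reach of the address space (a Python sketch just
to illustrate, ignoring the 64k of overlap at the very top):

```python
def reach(paragraph_bytes: int) -> int:
    """Span covered by 2**16 segment values at the given granularity."""
    return 0x10000 * paragraph_bytes

assert reach(16) == 1 << 20    # 16-byte paragraphs: 1 MiB, hence the 640K era
assert reach(256) == 1 << 24   # 256-byte paragraphs: 16 MB, like the 68000
```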

~~~
whoopdedo
I wasn't there, but building entire systems from zillions of small procedures
sounds familiar. Wasn't this a guiding principle behind languages such as
SmallTalk and Forth? The first system to be implemented on 8086 was Forth with
its structure of loading code from 1K blocks.

But I can see the chicken-and-egg here. Did languages at the time think in
terms of small contained blocks of code because the hardware they ran on was
limited? Or did CPU engineers restrict their designs to match the segmented
demands of the programming languages?

~~~
kabdib
I've wondered if the 16 byte paragraph size came from prejudices of engineers
who were involved with the iAPX 432, an "object oriented" processor that was
canceled after Intel realized it had significant, architecture-level
performance flaws.

------
phkahler
I can't take someone seriously who writes "Of course, PCs were much more
powerful than, say, an Amiga 500."

The Amiga had a MC68000 - essentially a 32-bit instruction set with a 16-bit
implementation. None of these segmented memory issues and 640K or 16M
limitations existed on that superior processor of the day. The whole post is
about inferior features of x86.

~~~
ajross
> None of these segmented memory issues and 640K or 16M limitations existed on
> that superior processor of the day.

Uh... the 68000 had, in fact, exactly 16MB of addressable space (no MMU and a
24 bit address bus).

And, I dunno, arguing about the benefits of a flat memory model on a device
that shipped with a non-upgradable 512 KB of memory is a little spun.

The 68k was a more forward-looking ISA for sure. In consumer computers of
1987, that didn't buy it much. Look to the '020 machines Sun and others were
shipping for examples that make it more attractive.

~~~
icedchai
You could upgrade the Amiga 500 to 1 meg, using the trapdoor expansion. This
was a very popular configuration.

I had a SCSI and RAM expansion for my Amiga 500. It gave me 3 megs and a
whopping 30 meg hard drive!

The Amiga was so far ahead of early DOS machines, it was pathetic. Of course,
by the time 486, SVGA, etc. came along it was pretty long in the tooth...

------
royjacobs
I started getting into PC programming around 1995 after spending time on the
c64 and Amiga, and it was right at that time that DOS extenders became
available.

DOS4GW was part of the Watcom C compiler, but I seem to recall that Borland
Pascal also had some protected mode going on. I'm pleased I've never had to
spend too much time with the segmented memory model.

~~~
WalterBright
DOS extenders were available in the later 1980s.

~~~
JdeBP
But xe is nonetheless right that DOS/4GW was part of Watcom C/C++. That's what
the "W" at the end means. The full DOS extender was DOS/4G. And Watcom C was
available from the middle 1980s, too.

~~~
WalterBright
Zortech C++ came with both a 286 extender and a 386 extender.

~~~
JdeBP
I don't know off the top of my head whether Watcom C/C++ ever targeted the
16-bit Phar Lap or Rational Systems extenders. It certainly targeted the
32-bit ones, and others besides.

------
gesman
Lots of clever ideas were born due to limitations!

Back in the day I was writing "virtual machines" in C, emulating the x86
instruction set including memory tracking.

It was a fun way to detect maliciously acting programs and even detect
new/unknown computer viruses - including ways to recover files from
infections.

------
bitwize
It's amazing to think how hamstrung the early PC was by stuff like the 640 KiB
limit.

The contemporary Tandy 2000 could be kitted out with up to 768 KiB of memory
(896 KiB with an aftermarket add-on). Because it loaded its BIOS from disk
instead of having it in ROM, and because there were no hardware restrictions
on contiguous access to memory except perhaps the video framebuffer, the Tandy
2000 could use nearly all that RAM, and was for a brief shining moment
preferable to the IBM PC, XT, or AT for memory-intensive jobs like Lotus
spreadsheets. That and its hi-res color display made it sought after as an
inexpensive CAD workstation.

It was not 100% PC compatible, but PC business apps often needed only a few
small patches to work on the 2000, so official and unofficial ports were
available.

------
ghaff
The effort that so many went to in the DOS world, grabbing a few K of memory
here and there to get some game to load or to otherwise cram something into a
limited address space, was remarkable.

In a way, it was even worse than that because the DOS COM file format had even
more limits around the 64K segments.

~~~
Narishma
The EXE format has none of those limitations and was available from day 1.

------
ajross
I dunno, this kind of dance always seems unfair to me.

The 8086 shipped in 1978 into a market where the only successful 32 bit
architectures were mainframes (the VAX broke open the 32 bit world in
minicomputers the same year).

It wasn't supposed to be a world-beater in the late 80's; it was designed to
be a better/cleaner and much cheaper PDP-11. And it was!

~~~
tomcam
I was there, and as near as I can tell it was meant to be something closer in
concept to an enterprise-level Apple ][ that could run VisiCalc.

~~~
ajross
That's where IBM positioned the 5150 in the market in 1981. I'm talking
about where Intel aimed the architecture in 1976-78.

No one really wants to hear this, but in the world of 16 bit architectures the
8086 was clean, straightforward, easy to code for, easy to optimize, and
_vastly_ cheaper than the minicomputers competing in the same space at only
moderately higher performance levels. It wasn't until the mid-80's and the
mess that was 286 protected mode[1] that people started to sour on it.

[1] They had a bunch of new transistors to spend, and could pick any two of
"performance", "address stretch" and "memory protection". They did the MMU
before the stretch, and paid dearly. The 68k made, in some sense, the opposite
choice and had fewer growing pains in the late 80's. By 1990 no one cared
anymore.

~~~
tomcam
I concede. That makes a lot of sense to me. I did greatly enjoy the
instruction set.

------
danso
Getting Ultima VII to successfully run, which required its Voodoo Memory
Manager [0], was probably how I became comfortable hacking computer settings.

[0]
[http://wiki.ultimacodex.com/wiki/Voodoo_Memory_Manager](http://wiki.ultimacodex.com/wiki/Voodoo_Memory_Manager)

------
partycoder
Back in the day there was also DR-DOS, which was very similar, to some extent
compatible, and in some aspects better. Until the arrival of Windows, that is.

Have not used it in decades, though, so I couldn't really provide details
about it. In fact, back then a lot of people still used monochrome displays.

------
fapjacks
I could either choose to load the mouse and MS Windows. Or I could choose to
load the modem and soundcard and stick to running DOS. And now you know why I
never felt like using Windows was necessary for me.

Incidentally, in my circle of friends on the BBSes I used to dial, programming
was one of the things the cool kids knew. So I taught myself C using nothing
but the K&R book. As a loner and an outsider, I didn't have any other
resources available. The result of this -- cutting my teeth on C pointers with
nothing but that book for help -- is that when people in job interviews ask me
about the most difficult technical thing I've ever worked on, I don't even
hesitate to tell them about those years learning that stuff about C. What a
fun and extremely intellectually challenging time!

~~~
fapjacks
What in the hell about this post got it downvoted? How bizarre.

------
tzs
> A solution might have been to switch back and forth from Protected Mode. But
> entering Protected Mode on the 286 is a one way ticket: if you want to
> switch back to Real Mode you have to reset the CPU! Bill Gates, who
> understood very early all the implications, is said to have called the 286 a
> “brain dead chip” for that reason.

Even if you never wanted to go back to real mode, the 80286 was brain dead
because of the way they laid out the bits in selectors. One small change to
the layout would have prevented so much pain for programmers.

In protected mode, like in real mode, the 286 used a segmented addressing
scheme.

In real mode, an address consists of two 16-bit parts:

    
    
      +----------------+  +----------------+
      |     segment    |  |    offset      |
      +----------------+  +----------------+
    

The segment and the offset are 16 bits. The mapping from segment:offset to
physical address is simple: physical = segment * 16 + offset.
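That mapping is trivial to sketch (Python here just to illustrate the
arithmetic):

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """8086 real mode: physical = segment * 16 + offset, a 20-bit result
    (the 8086 wraps around at 1 MiB)."""
    return ((segment << 4) + offset) & 0xFFFFF

# F000:FFF0 is the famous reset vector near the top of the 1 MiB space.
assert real_mode_physical(0xF000, 0xFFF0) == 0xFFFF0
# Many segment:offset pairs alias the same physical byte:
assert real_mode_physical(0x1234, 0x0010) == real_mode_physical(0x1235, 0x0000)
```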

In protected mode, an address looks like this:

    
    
      +-------------+-+--+  +----------------+
      |   selector  |T|RL|  |     offset     |
      +-------------+-+--+  +----------------+
    

The mapping to physical address uses a lookup table, indexed by the selector.
The table entry, which is called a "descriptor", contains the physical address
of the base of the segment. The offset is added to that base to get the
physical address. The descriptor also contains information about the segment,
such as its length and access permissions.

There are two lookup tables available to the program at any given instant. The
T bit selects which is used. If T is 0, it uses a table called the Global
Descriptor Table (GDT). If T is 1, it uses a table called the Local Descriptor
Table (LDT).

The GDT, as the name suggests, is global. There is a single GDT shared by all
code on the system. Typically it is used to map the segments containing the
operating system code and data, and is protected so that user mode code cannot
access it.

Each process has its own LDT.

The RL bits specify the Requested Privilege Level (RPL). The 286 has four
privilege levels, 0-3, with lower numbers being higher privilege. To determine
if access is allowed, the processor looks at RPL, the privilege level given
for the segment in the descriptor (DPL), and the privilege level the processor
is currently running at (CPL). The check is max(CPL, RPL) <= DPL. It's OK for
user mode code to set RPL to whatever it wants, because it runs at CPL == 3,
so RPL has no effect.
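A sketch of that selector layout and check (Python; the helper names are
mine):

```python
def split_selector(sel: int):
    """Break a 286 selector value into (index, T, RPL) per the layout above:
    index in bits 15..3, T (table indicator) in bit 2, RPL in bits 1..0."""
    return sel >> 3, (sel >> 2) & 1, sel & 0b11

def access_allowed(cpl: int, rpl: int, dpl: int) -> bool:
    """The 286 check: max(CPL, RPL) <= DPL, lower number = more privileged."""
    return max(cpl, rpl) <= dpl

assert split_selector(0x000C) == (1, 1, 0)      # index 1, LDT, RPL 0
assert access_allowed(cpl=3, rpl=3, dpl=3)      # user code, user segment: OK
assert not access_allowed(cpl=3, rpl=3, dpl=0)  # user code, kernel segment: no
```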

Let's look at this from the point of view of a user mode C program running on
a protected mode operating system. The address space as seen by the C program,
viewed as a 32-bit unsigned value, looks like this:

    
    
      00040000 - 0004FFFF  first 64k
      000C0000 - 000CFFFF  second 64k
      00140000 - 0014FFFF  third 64k
      ...
    

Note that if you are stepping through your address space, every 64k the
address jumps by 0x70001 instead of the more usual 1. This makes pointer
arithmetic take several more instructions than it would on a machine with a
flat address space.
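The jump is easy to verify by treating selector:offset as one 32-bit value (a
sketch; the helper name is mine, with T = 1 for LDT and RPL = 0 as in the
address list above):

```python
def flat_view(index: int, offset: int, T: int = 1, rpl: int = 0) -> int:
    """selector:offset seen as one 32-bit number, selector in the high 16 bits."""
    selector = (index << 3) | (T << 2) | rpl
    return (selector << 16) | offset

assert flat_view(0, 0xFFFF) == 0x0004FFFF   # last byte of the first 64k
assert flat_view(1, 0x0000) == 0x000C0000   # first byte of the second 64k
# Stepping one byte across the boundary moves the 32-bit value by 0x70001:
assert flat_view(1, 0x0000) - flat_view(0, 0xFFFF) == 0x70001
```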

To avoid making all programs incur that performance hit, the compiler writers
gave us several different memory models. If you compiled a program with "small
model" [1], the compiler would limit it to a single 64k code segment and a
single 64k data segment. It could load the CS and DS segment registers once at
launch, and treat the address space as a 16-bit
flat space. In "large model" it would do multiple code and data segments, and
incur the overhead of dealing with pointer arithmetic crossing segment
boundaries. There were mixed models, that had multiple code segments but only
one data segment, or vice versa.

The C compilers for 286 protected mode added new keywords, "near", "far", and
"huge" that could modify pointer declarations. If you declared a point "near",
it was just an offset within a segment. It had no selector. A "far" pointer
had a selector and offset, but the selector would NOT be changed by pointer
arithmetic--incrementing past the end of a segment wrapped to the beginning of
the segment. A "huge" pointer had a selector and offset, and would include the
selector in pointer arithmetic, so that you could increment through to the
next segment.
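The difference between far and huge arithmetic is easy to model (a sketch;
real compilers did this in a few machine instructions):

```python
def far_add(selector: int, offset: int, n: int):
    """Far pointer: arithmetic wraps within the 64k segment, selector untouched."""
    return selector, (offset + n) & 0xFFFF

def huge_add(selector: int, offset: int, n: int):
    """Huge pointer: a carry out of the offset advances the selector to the
    next descriptor (index + 1 means selector + 8)."""
    total = offset + n
    return selector + 8 * (total >> 16), total & 0xFFFF

assert far_add(0x000C, 0xFFFF, 1) == (0x000C, 0x0000)   # wrapped around
assert huge_add(0x000C, 0xFFFF, 1) == (0x0014, 0x0000)  # moved to next segment
```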

Now throw in needing to link to libraries that might have been compiled using
a different model from the model your code is compiled with, and that might be
expecting different kinds (near, far, huge) pointers than you are using, and
all in all it was a big freaking pain in the ass.

They could have avoided all that pain with a simple change. Instead of
addresses in protected mode looking like this:

    
    
      +-------------+-+--+  +----------------+
      |   selector  |T|RL|  |     offset     |
      +-------------+-+--+  +----------------+
    

they could have made them look like this:

    
    
      +-+--+-------------+  +----------------+
      |T|RL|  selector   |  |     offset     |
      +-+--+-------------+  +----------------+
    

Then the address space would look like this, from the point of view of a user
mode C program looking at addresses as 32-bit unsigned values:

    
    
      80000000 - 9FFFFFFF  524288k flat address space
    

If they had done this and also flipped the meaning of the T bit, so 0 means
LDT and 1 means GDT, we'd have the address space look like this:

    
    
      00000000 - 1FFFFFFF  524288k flat address space
    

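With T and RL moved to the top bits, consecutive selector indices become
consecutive 64k blocks of the 32-bit value, so pointer arithmetic just works.
A sketch of this proposed, never-built layout (T = 0 meaning LDT, per the
flipped convention):

```python
def proposed_addr(index: int, offset: int, T: int = 0, rpl: int = 0) -> int:
    """Proposed layout: T in bit 15 and RL in bits 14..13 of the selector,
    13-bit index below them, offset in the low 16 bits."""
    selector = (T << 15) | (rpl << 13) | index
    return (selector << 16) | offset

# No more 0x70001 jump: the byte after one segment is the next segment.
assert proposed_addr(0, 0xFFFF) + 1 == proposed_addr(1, 0x0000)
assert proposed_addr(0x1FFF, 0xFFFF) == 0x1FFFFFFF  # top of the 512 MB space
```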
A C compiler might still have offered near, far, and huge pointers, but now
that would just be an optimization programmers might turn to if they needed
the last possible bit of speed or needed to squeeze their code into the
smallest possible amount of memory, and were willing to go to the pain of
arranging things so pointer arithmetic would never cross a 64k boundary in
order to do so. 99% of programmers would be able to simply pick "flat" model
and be done with it.

Why didn't they do this?

I've heard one theory. The descriptors in the GDT and LDT were 8 bytes long,
so the physical address of the descriptor is 8 * selector +
base_of_descriptor_table. Note that the selector in the segment register:

    
    
      +-------------+-+--+
      |   selector  |T|RL|
      +-------------+-+--+
    

is already shifted left by 3 (i.e., multiplied by 8), so the descriptor lookup
can be done as (segment_register & 0xFFF8) + base_of_descriptor_table. The
theory is that they put the selector where they did so they could do it that
way instead of having to shift it.
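That is, the lookup needs only a mask (sketch):

```python
def descriptor_address(segment_register: int, table_base: int) -> int:
    """Descriptors are 8 bytes and the selector sits in bits 15..3, so it is
    already 'pre-multiplied' by 8; masking off T/RL gives the table offset."""
    return table_base + (segment_register & 0xFFF8)

assert descriptor_address(0x000C, 0x10000) == 0x10008  # second descriptor
```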

My CPU designer friends tell me that this is unlikely. They tell me that when
you have an input to some unit (e.g., the selector input to the MMU) that is
always going to be shifted a fixed amount, you can build that in essentially
for free.

Note also if they had done it this way:

    
    
      +-+--+-------------+
      |T|RL|  selector   |
      +-+--+-------------+
    

and had gone with T == 0 for LDT, then a protected mode operating system could
allocate up to 196592 bytes of consecutive physical memory, and set up
consecutive descriptors to point into that memory offset from each other by 16
bytes, and the address space would then look like the first 196592 bytes of
real mode address space. That might have allowed writing an emulator that
could run many real mode binaries efficiently in protected mode.
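The 196592 figure follows from descriptors spaced 16 bytes apart: the highest
of the 8192 LDT descriptors would have base 8191 * 16, plus a 64k segment on
top of that base:

```python
descriptor_count = 1 << 13                # a 13-bit selector index: 8192 entries
highest_base = (descriptor_count - 1) * 16
assert highest_base + 0x10000 == 196592   # matches the figure above
```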

[1] I might be mixing up the names of the various memory models.

~~~
JdeBP
Your CPU designer friends are right, and your "emulator" is pretty much how
real mode and v8086 protected mode addressing were actually architected. The
descriptor parts of segment registers effectively had fixed contents derived
from what was loaded into the selector parts, rather than from descriptor
tables. Yes, this indeed allowed the 80386 to run real mode programs. (-:

------
vidanay
This, combined with doing the ISA-slot IRQ shuffle, makes me very glad for
modern PC architecture and OSes.

