Motivation was the first problem, because the Hurd was solving a problem that didn't need to be solved by the people working on it. Specifically, they had access to other UNIX systems but "wanted" one that was open source. Linus, on the other hand, was a kid without access to such operating systems; he wanted to learn about them and play with them. His motivation was simply to build something he could play with.
Egos were another interesting part of the problem. At the time the great microkernel / monolithic kernel debate was going strong; "big brains" were arguing one side or the other, and it was an active area of research. So on the Hurd, which had been thrust very much into the local spotlight, if it was going to be "big" then the people working on it wanted to be "right", and arguing over the right way to do it consumed an amazing amount of time and energy.
And finally, effectiveness. By far the most effective thing to come out of the free software project is gcc (and by proxy the binary utilities, binutils). These were tools built so that the "real project" could be built, and as such they were just re-implementing an existing thing (a C and C++ compiler), little more than a senior undergraduate project at most universities. But gcc was useful because people got an operating system when they bought hardware, while the industry was really trying to make money by selling them the compiler tools.
So gcc/binutils came into the world with little ego (it wasn't changing the world, just re-implementing it), was highly effective, and people were motivated to use it because it was an alternative to an expensive compiler suite. But the Hurd was no better, and significantly worse in that it was less developed, than the OS that came with the hardware these folks bought (workstations); it was just that it came with source, and you probably didn't have source to the OS you were using (or you did but you couldn't share it).
Linux, on the other hand, while still a toy, was a better OS than MS-DOS in that it was already more 'unix like' than DOS was, and more useful than Minix because rather than just teaching concepts, Linus was using them to build the other parts of the OS as well. So the motivation was there to try it, Linus was not an egotistical "expert" who didn't listen to alternate ideas, and it was at least as effective an OS as MS-DOS if you didn't care too much about applications. As a result it was much more successful.
For perspective, I've programmed since well before the first personal computers. When they came out I had to buy compilers and development tools from Microsoft, Borland, and others (for the Apple systems, MS-DOS, and early Windows) in order to program at home. I estimate that it cost me thousands of dollars over the years; I even had to buy Emacs-like editors so I could program on a PC. Around 1985 I bought Epsilon from Lugaru Software; it's a nice Emacs-like editor still sold today!! I always wonder how profitable that product has been over the last thirty years.
Because I wanted to do document preparation on something other than Word, I ended up buying a number of word processors (XyWrite, WordPerfect, MS Word for MS-DOS); finally, I found an MS-DOS markup-based system for PCs by Mark of the Unicorn (which now sells high-end MIDI audio gear). Mark of the Unicorn's markup system was based on Scribe (described in Brian Reid's influential 1980 CMU PhD dissertation, for which he won the ACM Grace Murray Hopper Award).
My first personal Unix system was a DEC workstation; I think it cost me over $20,000 in 1990 for the disk drive (maybe 300MB), monitor, and computer!! I put it to good use doing consulting and development, but nevertheless I am so, so grateful to the open-source community for gcc, FreeBSD, Linux, TeX, and every other tool I use, not just because it doesn't cost me as much to try out new things but because it has made it possible for people all over the world to access the power of computers.
It attracted people to the project, just because it was fun and wasn't initially trying to displace anything in particular. I find it interesting that the FSF still wants to own the entire stack, given the movement they started with the GPL.
GNU got a kernel, it's Linux.
In my opinion software is a fluid thing. No matter what the design decisions were, if some change is necessary over time it will happen. So despite being an idealist (and therefore wishing for a more idealistic kernel as well), I don't feel bad at all about Linux making these decisions early on. But of course I may be wrong. Feel free to correct me.
At the extreme this meant just resource allocation and messaging between processes. I would say that it then becomes a matter of opinion how much more one can add to the kernel before it stops being a microkernel.
The advantage of a microkernel is that it should be possible to make it more robust and secure than its more monolithic cousins, because the code that is, in principle, capable of taking out the whole system is small and slow-moving. On that basis, saying that Linux is not considered a microkernel is a fairly safe statement.
Where I believe general-purpose microkernels have struggled is in the performance of the messaging (often because of the increased access checking). More targeted uses have been successful, probably the standout being QNX, which has been used extensively in the embedded space.
Jochen Liedtke's work on the L4 kernel family showed that microkernels can be really fast.
QNX's problem is ownership. The business has been sold twice (to Harman, an audio company, and then to RIM/BlackBerry). The code has gone from closed source, to a free version being available, to closed source, to open source, to closed source with no free version. Third-party developers were so fed up they stopped developing for it. Now QNX no longer supports itself as self-hosted; you have to cross-compile from Windows.
The problem with the Hurd is that they tried to take a big OS, Mach or BSD, and make a microkernel out of it. Mach itself started with BSD and tried to make a microkernel out of it. Wrong approach. You have to start from nothing and build a minimal microkernel.
All QNX has in the kernel is CPU dispatching, memory management, timers, and message passing. Everything else is in user space. To get things started, the boot image can contain user programs which are loaded during boot and then started. That's how the initial drivers and file systems get loaded. For many embedded applications, that's all that gets loaded - there may not be a file system. For bigger systems, the initial drivers are just enough to allow loading more stuff from disk.
It's annoying that nobody else, with the possible exception of the L4 people, seems to be able to get this right. The QNX kernel is only about 60K bytes of code.
All the scheduling power of QNX derives directly from having all non-essential code in user processes rather than in the kernel, which does only one thing and does that very well (message passing).
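The send/receive/reply pattern described above can be sketched in plain Python. To be clear, this is a toy illustration with made-up names (`Channel`, `fs_server`), not the real QNX API, and threads stand in for the separate address-spaced processes a real microkernel would use; the point is only the synchronous shape of the exchange, where a client blocks until the server replies.

```python
import queue
import threading

class Channel:
    """Toy server endpoint: clients send(), the server receive()s and replies."""
    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, msg):
        # The client blocks until the server replies -- the defining
        # property of synchronous message passing.
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put((msg, reply_box))
        return reply_box.get()

    def receive(self):
        # The server blocks waiting for the next message.
        return self._inbox.get()

def fs_server(chan):
    # A "file system" running outside the kernel: it owns its own state
    # and is reachable only via messages.
    files = {"motd": "hello from the fs server"}
    while True:
        (op, name), reply_box = chan.receive()
        if op == "read":
            reply_box.put(files.get(name))
        elif op == "stop":
            reply_box.put("ok")
            break

chan = Channel()
threading.Thread(target=fs_server, args=(chan,), daemon=True).start()

result = chan.send(("read", "motd"))
print(result)                        # the server's reply
stop_reply = chan.send(("stop", None))
```

Everything the client "knows" about the file system comes back through the reply; there is no shared state, which is exactly what keeps the kernel itself down to dispatching and message passing.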
I re-implemented a 32-bit version of QNX long ago; maybe one day I'll attempt to move it to the 64-bit era, if I can figure out a way to automatically convert Linux device driver source code into an 'administrator' (QNX parlance for a daemon). That should take the sting out of re-writing each and every driver under the sun.
I strongly believe that Linux not being microkernel-based (and not in a 'minix' sense, but a real microkernel with all services as user processes) is one thing that is currently holding us back.
Many real world interactions require hard real time and with monolithic kernels you are just not going to be able to meet your guarantees.
Lots of hard real-time systems (e.g. on microcontrollers) use an RTOS where there is no distinction at all between user and system mode, so one cannot even talk about microkernel vs. monolithic. From a software engineering perspective this is not a good design, but it is very fast (so the timing guarantees you can give are tighter) and it is sometimes easier to meet runtime guarantees. And it is easier to write an RTOS where everything runs in system mode (which does not imply it is easier to debug or maintain).
Hard real time seems like a luxury until you've worked with it in practice and then you wonder why it isn't a basic requirement of any half decent operating system.
Modern OSes run each process in its own address space via CPU-level support for virtual memory. Older (or simpler) OSes that don't use virtual memory simply had one big address space that all processes shared. This was Windows before Windows 95, and Mac OS before OS X. Your word processor could crash and not only take down other applications (since they were all in the same address space), but could also take down (for example) the graphics driver, because the system software lived in this same unprotected address space.
This type of older system is much less robust, because the stability of the entire system is effectively equal to the stability of the least reliable program being run on it. On modern OSes with monolithic kernels, process separation, and preemptive multitasking, an unreliable program simply crashes without affecting anything else on the system.
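That separation is easy to demonstrate. The sketch below is POSIX-only (it uses `os.fork`, so it won't run on Windows): after the fork, the child has its own copy of the parent's memory, so a "misbehaving program" trashing its state cannot touch the parent's copy, which is precisely the robustness that the pre-95 shared-address-space systems lacked.

```python
import os

state = {"doc": "important data"}
r, w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: scribble over everything reachable, then report what it sees.
    state["doc"] = "garbage"
    os.write(w, state["doc"].encode())
    os._exit(0)

# Parent: wait for the child, then read its report from the pipe.
os.waitpid(pid, 0)
child_view = os.read(r, 100).decode()

print(child_view)      # "garbage" -- what the child saw in its own copy
print(state["doc"])    # "important data" -- the parent is untouched
```

The child genuinely did corrupt "its" memory; the parent never notices, because the only thing that crossed the process boundary was the one explicit pipe write.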
In a sense though, the monolithic kernel still has the same problem that userland processes had before. It's only as stable (and secure) as the least stable driver or kernel component. If anything goes wrong at the kernel level, it could easily take down the entire kernel. There's nothing to prevent a misbehaving graphics driver (for example) from overwriting memory being used by other drivers or the base kernel itself. Likewise, any exploit that can inject code into a driver/kernel module/whatever then has complete access to everything on the system.
Microkernels close the loop by chopping up the kernel itself into separate processes. The heart of the kernel is still a (much, much smaller) blob of code providing virtual memory management and a scheduler, plus a very low-level and efficient IPC mechanism (efficient relative to other IPC mechanisms, that is). The rest of the kernel is divvied up into separate processes: device drivers and other standalone kernel modules built on top of the core microkernel all get moved into their own processes/address spaces, and communicate with each other (and userland) via whatever the core IPC primitive is.
The benefit is that a lousy graphics driver can no longer take down the entire system or serve as an attack surface for malware to take over the entire system. In theory at least, you could restart a misbehaving driver just like you can relaunch an app when it crashes.
That's awesome, but it comes at a cost. The cost is that userland/kernelspace transitions are no longer plain old syscall instructions. Now they become client/server interactions via the IPC mechanism. Inside the kernel, many things that used to be simple function calls also turn into client/server interactions via IPC.
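A rough, unscientific sketch of that cost: compare an in-process function call against a round trip through a Unix pipe. The pipe here is only a stand-in for kernel-mediated IPC (real microkernel IPC, like L4's, is far leaner than a pipe), but it shows the shape of the problem: each "call" now pays for syscalls and copying instead of a jump.

```python
import os
import timeit

def service(x):
    # In a monolithic kernel, one component asking another for something
    # is (roughly) just a function call like this.
    return x + 1

direct = timeit.timeit(lambda: service(41), number=10_000)

# With IPC, even the cheapest round trip costs kernel entries plus
# copying; here each "call" is a write + read through a pipe to
# ourselves, standing in for a message to a server process.
r, w = os.pipe()
def ipc_call(x):
    os.write(w, bytes([x]))
    return os.read(r, 1)[0] + 1   # pretend the "server" did the work

via_ipc = timeit.timeit(lambda: ipc_call(41), number=10_000)

print(f"direct: {direct:.4f}s, via pipe: {via_ipc:.4f}s "
      f"(~{via_ipc / direct:.0f}x slower)")
```

The exact ratio varies wildly by machine and kernel; the only robust takeaway is that the IPC path is consistently the slower one, which is the perf hit the next paragraph is weighing.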
So, you get something that's an obvious win: bringing kernels into the modern era with hardware-enforced process separation, instead of doing it only for userland processes. But if the performance cost is too high, it's not worth it. The cost/benefit analysis surrounding microkernels has so far generally favored performance, so microkernels were rarely used: monolithic kernels were "good enough" and didn't have the perf hit.
Take this all with a grain of salt; I'm really only familiar with Mach and Mach-era arguments about the feasibility of microkernels. More modern microkernels like L4 and QNX have presumably solved at least some of these problems by introducing a different set of tradeoffs. But AFAIK the basic concepts are unchanged, and the goals of microkernel architectures are the same, just with different implementation strategies.
IBM 386? Did they mean to say Intel 386, or did Linus actually have a computer that used an IBM 386SLC, which was an 80386 that IBM designed and manufactured under license from Intel?
Not sure there were all that many 386 systems that weren't, though I'm sure they exist...
CMOS versions of the 386 also showed up in some mobile phones and other embedded systems.