Where does one start with such a thing? I mean, where do I read about how to emulate a CPU (or at least how a CPU works, so I could figure out how to emulate it myself), how PS/2 works, and how to boot an image from the BIOS?
If the reference manuals are overwhelming, I'd recommend starting with Code: The Hidden Language of Computer Hardware and Software by Charles Petzold. It covers how computers work for a general audience, from logic gates and boolean algebra, up to assembly, opcodes on the Intel 8080 processor, and how operating systems work. It was one of the most consequential books I've read, and I'd recommend it to both professional programmers and anyone else who is intellectually curious.
Learning about CPU internals was a huge part of your first year in CS? Where did you go to school?
At my school and every other school I'm remotely familiar with, the knowledge needed to emulate a CPU would be covered in upper division computer engineering courses, and not covered by CS undergrads at all, certainly not in their first year.
Bologna, Italy (from the school of Science, not Engineering). At the time (16+ years ago, under prof. Renzo Davoli), the introductory "Computer Architecture" course [1] was almost entirely dedicated to CPU architecture, including building basic ones (IIRC).
This kind of course (computer architecture) is typical of the cross CS/EE requirements and can usually be taken during any year of an undergraduate degree in CS. Prerequisites are minimal if any.
I taught that class once. It does have some prerequisites, but not a ton. I think it could be taken as early as second year, first semester at that particular school.
Back in 2007, my first semester in a technical CS school had plenty of assembly, binary, MMU emulation, OSI / TCP/IP, boolean math and so on.
I know this is mostly gone at this point, but it should not be. The fundamentals are essential. I often meet people who have CS degrees and are totally clueless about how a computer works.
"How can you do C, it is so old? By now, we must have invented faster languages. Computers changed so much recently". Sure... Binary is now expressed with emoji.
Those aren't the fundamentals of CS. The fundamentals of CS are data structures and algorithms, as the Good Book says. Hardware is important, practically, but don't let practicality blind you to the actual core of the field: Practicality has a way of becoming obsolete, and an actual education is for life.
Meh. It's like saying that the fundamentals of film-making are writing and acting -- except writers and actors are nothing without an actual camera, and if you don't know how to place the camera and how to cut the film produced by such a camera, you will never get a film done. Cameras change, but you will never have a camera that miraculously materializes in all the right places to take exactly the shots you imagined. In the same way, some things will never change on the hardware side: you will always have inputs, outputs, memory, storage, human interfaces, power management, booting and so on.
The difference is that when people talk about hardware, they're talking about the equivalent of film, not abstract "this is central to being a camera which works with visible-spectrum light" concepts.
> you will always have [...]
> memory, storage
First, making a distinction between memory and storage is not an "at all times, in all places" kind of thing. It's more of a "this is what we do now; in the past, on some systems, it was different, and it may be different in the future" kind of thing. Single-level storage is not currently popular, but it was in the past and may yet come back. It already has, in some limited contexts.
Second, we've already seen home system storage go from paper tape to magnetic tape to magnetic disk to metallic magnetic disk to solid-state NAND Flash or close equivalent. Each has vastly different performance characteristics and differs from the others in every detail.
> human interfaces
I'm sure there are some iron-clad universals in HID. I don't know which of those translate from CLIs to touch-screens to gestural interfaces to speech recognition to pupil tracking to...
> power management, booting
Two things which have changed quite a bit even in the lifetime of "vaguely IBM PC-derived" desktop computers, and even more so if you widen your scope up and down the power curve to include handheld systems and, you know, Real Computers What Do Real Work.
> we've already seen home system storage go from [...]
Yes, but the point is that there will always be a requirement to manage and persist the data you are working on somehow, and how you go about this somehow dramatically impacts (or should impact) the choices you make at the more abstract level of data structures. It is a fundamental concept that you will be forced to consider in one way or the other. You can have the fastest algo in the world crunching huge amounts of data, but if you then take an inordinate amount of time to store and retrieve results, it's as bad as having blazing-fast storage and crappy algos.
> I'm sure there are some iron-clad universals in HID
I agree that is traditionally considered a subclass of I/O, but I think in recent years we have seen that it's much more important than previously understood. Good software with mediocre UI is ignored while mediocre software with good UI can change the world. This is one of the few real discoveries in our field since the '80s.
>> power management, booting
> Two things which have changed quite a bit
... but are still there in some shape or form, and will forever be there. They are changing the world because people put effort and thought into them as fundamental parts of computing experiences, not one-offs that can be simply ignored as "constant time".
Those would also be the fundamentals of theatre, though, and film as a medium is essentially defined by its divergence from theatre. The fundamentals of film are cinematography and editing, but even this is theoretical compared to the craft of film-making. Just as CS includes the behaviors of Von Neumann machines with tapes, which is abstract compared to the science of e.g. processor die doping.
It was a technical school with very practical teachers. The lack of "theorists" probably helped a lot. Most of them were either from the video game industry or '80s embedded developers. In Quebec (Canada), a technical school teacher is unlikely to have a PhD. Technical schools and universities are two different "levels". You can take technical school (CEGEP) as a pre-university degree (or you can end there and get the equivalent of an associate degree). When taking a CS one instead of science, you "lose" 1 year, but it is much more fun. Too bad they replaced the system programming courses with web ones a couple of years ago.
All that to say that sometimes, having real industry veterans as teachers really influences the teaching point of view.
Don't get me wrong, I certainly think that functional programming has its place in the world, but as a first year uni student all fired up about finally learning "real" programming after years of teaching myself (back before the internet laid everything out on a platter), I was not impressed.
I don't think functional paradigms can really be appreciated by 1st/2nd year undergrads. At that age you are fundamentally impatient to make your mark in a practical sense, so your approach will be instinctively imperative. You have to hit the wall (scaling / parallelism / thread management / complexity etc) before you start to really appreciate the upsides of functional paradigms.
Unfortunately, a lot of professors are actually terrible educators (after all, they did not get there by teaching but by researching) and think the learning process is as linear as house-building: "place bricks here and there so that your next row will be this way and that way". They also think people should enjoy programming for programming's sake, whereas a lot of people are motivated by a creative process driven by outcomes.
Not necessarily (based on being an assistant in lab sessions for first year students learning Haskell).
But what it did do was put everyone on the same level, including the arrogant students who "already knew how to code" and hadn't listened to (or attended) the lectures.
I think they chose a functional language to start with good habits for thinking about what to implement, not how to implement it. If you don't know what the problem is, you should work on that, rather than bashing out some Java...
I got a solid understanding of CPU operations in my undergraduate computer architectures class, in my CS degree. Of course, it was probably a third year class. First year sounds a little unusual.
We covered it in the first quarter of the CS program at The Evergreen State College. After a few weeks of being introduced to digital logic, each student had to draw (with an application called Logisim) a simulated simple-as-possible Von Neumann machine up from logic gates and wires. Then we had to write short math programs for them directly into the RAM. Fortunately, Logisim lets you save components as something like functions or macros so it wasn't too repetitive. This project demo video by someone who took the same course shows what the result looks like:
It was challenging, but it was awesome (and finding that video to illustrate my comment is a blast of nostalgia). It wasn't any harder than most other CS or other sciences courses. And after digital logic, the other CS topics aren't really prerequisite or especially helpful in learning how simple processors work. I really appreciated getting straight to the foundations of how computers work and building up from there.
It wasn't like emulating x86 in javascript, but it was CPU internals. Up until I read your comment, I just assumed this was standard CS stuff.
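For anyone who hasn't done that kind of "up from logic gates" exercise, here is a rough flavor of it in Python rather than Logisim. This is only a sketch: the function names (`nand`, `full_adder`, `add8`) are made up for illustration and are not from the course. Everything is built from a single NAND primitive, and the 8-bit adder is just eight full adders chained by their carry lines (a ripple-carry adder).

```python
# Sketch: build an 8-bit ripple-carry adder from a single NAND primitive.
# Bits are plain Python ints, 0 or 1. All names here are illustrative.

def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    return and_(or_(a, b), nand(a, b))

def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out

def add8(x, y):
    """Add two 8-bit values bit by bit; return (8-bit result, final carry)."""
    carry = 0
    result = 0
    for i in range(8):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry

print(add8(5, 7))
print(add8(200, 100))  # overflows 8 bits, so the final carry is set
```

The same pattern (small components wrapped up and reused as bigger ones) is roughly what saving components in a circuit simulator buys you.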
I was equally shocked at the state of CS education at University of New South Wales here in Australia. They don't seem to cover many fundamentals (like CPU, algorithms and data structures, operating systems) compared to what I am used to in Europe. These topics are either not in the undergrad curriculum at all or appear only very late.
I studied math at university, and did CS as a minor. They made me take data structure and algorithm classes for both.
The mathematician's version was half as long, but covered the material in more depth: i.e. they proved every result. The CS version was dumbed down and full of fluff. (And even those CS people did operating systems and compilers as undergrads.)
First year, second semester we learned binary logic, system bus, how a CPU works, etc. I think it was seen as the foundation so you actually understand how a computer works.
Me too, but there's a huge gap between the often simplified CPUs they teach you on at uni and programming for the quirks in x86, or any of the buses/peripherals/interfaces.
I didn't learn much more than assembly 101. I doubt I could emulate much real hardware. Certainly not a PC of all things... A gameboy looks more accessible...
I've not really enjoyed my time in higher-ed (and I have little to show for it), but sometimes it's really the only place where unfashionable but fundamental topics are covered in depth. My comment was meant as a pointer (i.e. "check out lecture recordings etc") rather than a quip.
0xffff2 - I can't reply directly to you for some reason but I went to UC Berkeley many years ago and the third undergrad course, CS 61C, laid the basic groundwork for beginning to understand CPUs. Here's the syllabus from the most recent semester. http://www-inst.eecs.berkeley.edu/~cs61c/sp16/
My university didn't touch on hardware until the second half of second year, and that was only one paper (strictly speaking, a computer engineering paper).
Personally, I had some courses on computer architectures, and those covered the theory of how CPUs are constructed, how they run programs, etc. I decided to write an NES emulator. So:
- Find Wikipedia articles, and learn that it used a variant of the MOS Technology 6502, which was used in a lot of computers in the 80s.
- Find some digitized assembly programming manuals from the time (I think the one I used was distributed with the Commodore 64, and ended up having several typos introduced by OCR).
- Write a tool to recognize, decode, and print out an operation when you feed it a little data
- You basically need to set up a loop of fetching instructions, interpreting them, then doing what they say. An actual CPU runs in a similar loop, and it generally doesn't stop until power is removed.
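
The last two steps above can be sketched roughly like this in Python. This is a minimal illustration, not a real NES emulator: it handles only six 6502 opcodes, ignores cycle timing, and treats BRK as "halt" (on real hardware BRK triggers an interrupt). The names `OPCODES`, `disassemble` and `CPU` are made up for this sketch, and the load address 0x8000 is an arbitrary choice.

```python
# Sketch: decode and run a handful of 6502 opcodes.
# Assumptions of this sketch: BRK simply halts, timing is ignored,
# and only the zero and negative flags are tracked.

# opcode -> (mnemonic, addressing mode)
OPCODES = {
    0xA9: ("LDA", "imm"),  # load accumulator with an immediate byte
    0xAA: ("TAX", "imp"),  # copy accumulator to X
    0xE8: ("INX", "imp"),  # increment X
    0x8D: ("STA", "abs"),  # store accumulator at a 16-bit address
    0xEA: ("NOP", "imp"),
    0x00: ("BRK", "imp"),
}
LENGTH = {"imp": 1, "imm": 2, "abs": 3}  # instruction size in bytes

def disassemble(memory, pc):
    """Decode the instruction at pc; return (text, length in bytes)."""
    opcode = memory[pc]
    if opcode not in OPCODES:
        return f"??? ${opcode:02X}", 1
    name, mode = OPCODES[opcode]
    if mode == "imm":
        return f"{name} #${memory[pc + 1]:02X}", 2
    if mode == "abs":
        addr = memory[pc + 1] | (memory[pc + 2] << 8)  # little-endian
        return f"{name} ${addr:04X}", 3
    return name, 1

class CPU:
    def __init__(self, program, origin=0x8000):
        self.memory = bytearray(0x10000)  # 64 KiB address space
        self.memory[origin:origin + len(program)] = program
        self.pc = origin
        self.a = 0
        self.x = 0
        self.zero = False
        self.negative = False
        self.halted = False

    def _set_flags(self, value):
        self.zero = value == 0
        self.negative = bool(value & 0x80)

    def step(self):
        opcode = self.memory[self.pc]                    # fetch
        if opcode not in OPCODES:
            raise ValueError(f"unimplemented opcode ${opcode:02X}")
        _, mode = OPCODES[opcode]                        # decode
        length = LENGTH[mode]
        operand = self.memory[self.pc + 1:self.pc + length]
        self.pc = (self.pc + length) & 0xFFFF
        if opcode == 0xA9:                               # execute
            self.a = operand[0]
            self._set_flags(self.a)
        elif opcode == 0xAA:
            self.x = self.a
            self._set_flags(self.x)
        elif opcode == 0xE8:
            self.x = (self.x + 1) & 0xFF
            self._set_flags(self.x)
        elif opcode == 0x8D:
            self.memory[operand[0] | (operand[1] << 8)] = self.a
        elif opcode == 0x00:
            self.halted = True
        # 0xEA (NOP) intentionally does nothing

    def run(self, max_steps=1000):
        steps = 0
        while not self.halted and steps < max_steps:
            self.step()
            steps += 1

# LDA #$C0 ; TAX ; INX ; STA $0200 ; BRK
program = bytes([0xA9, 0xC0, 0xAA, 0xE8, 0x8D, 0x00, 0x02, 0x00])
cpu = CPU(program)
cpu.run()
print(disassemble(cpu.memory, 0x8000)[0], hex(cpu.a), hex(cpu.x))
```

Keeping the decoder separate from `step` is handy in practice: you can print each instruction as it executes and compare the trace against a known-good emulator.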
I think that after the classes I took, I read a lot of what other emulator writers said. This article is a basic look at the structure of an emulator, the theory behind them, and some different designs: http://fms.komkon.org/EMUL8/HOWTO.html
One of my friends at uni wrote a Z80 emulator/assembler in PDP-10 assembler as a hobby project - and he was studying chemistry, not CS...
I don't understand why a basic understanding of CPU architectures isn't a CS fundamental everywhere.
Even if you have no interest in emulating a CPU or an OS, you really do need to know what registers are, how caches work, what interrupts do, and how basic IO happens.
At the very least it's a practical demonstration of one particular kind of VM, and - if you want to - you can generalise from that to VMs of your own design.
For web apps, not understanding these things can get expensive. Cycles, even cloud cycles, aren't free, and if you take zero interest in optimisation and efficiency you're literally throwing money away.
Agreed. It's important to understand how the hardware works, at least at the theoretical level. If you don't understand what the machine is doing, it's hard to say that you really understand how your program works.