ajenner's comments | Hacker News

I wrote a program called xtce-trace (https://www.reenigne.org/software/xtce_trace.zip ) to do just this (albeit non-interactively - you just give it a program and it will generate a cycle-by-cycle trace of which lines of microcode are executed). GloriousCow aka Daniel Balsom recently fixed some of its bugs and turned it into an actual emulator (https://github.com/dbalsom/XTCE-Blue ), though it's not finished yet so there's no binary release at the moment.


xtce_trace sounds fantastic — exactly what I need right now while debugging my Verilog 8086 core. Thank you also for the microcode disassembly. It’s been fun to work through.


It's not - the "DA" means that the write happens to the ES segment, while "DS" means that the read happens from the DS segment. The use of different segments for the source and destination gives these instructions extra flexibility.


Yes, it pops the stack value into the CS register. But it doesn't update the PC as well - it'll continue to point to the offset (in the old CS) of the instruction after the "POP CS". So whatever code is in the new CS, it has to be "compatible" with the code in the old CS, in terms of the instructions starting in the right place. However, it's even more complicated than that because the prefetch queue is not flushed, so the exact address where it switches over will be unpredictable. For certain very specific scenarios it could potentially be useful to speed up conditional execution by eliminating those prefetch queue flushes, though.


A more recent example of a finite loop with no exit condition can be found in the credits section of the 2015 demoscene production "8088 MPH". The loop (which can be found at https://www.reenigne.org/blog/8088-pc-speaker-mod-player-how... under "v:") has two jumps (one conditional, one unconditional), both backwards, and runs with interrupts disabled. The CPU's instruction pointer stays within that block until the end of the routine - there's no wraparound to make a forward jump from a backward one.


So... How does it work then?


There's a POP instruction in the loop that pops to a memory location addressed by a register. When that register contains the address of the final JMP instruction, the latter gets overwritten by a forward JMP.


There are NOPs at the beginning of the loop that get written over.


I used the DOSBox debugger to debug several of the effects during the making of Area 5150, but other than that I used real hardware. 86Box is probably the most accurate one at the moment but still isn't accurate enough to run the entire demo correctly.


Most of the tricks in the demo should work unmodified (or could be made to work) on most CGA implementations of the era. Some clones had slightly different font ROMs so it would look a little bit wrong on those. The final lake effect (and the ripply picture shortly before it) use very tight cycle counting so probably won't work on anything except a genuine IBM PC/XT and CGA. Some effects (like the radial fire effect and the voxel landscape) should work on just about anything. The Amstrad PC1512 will have trouble with any effect that modifies the CRTC timing registers as it doesn't have a fully programmable CRTC and always generates a 15.7kHz/59.92Hz 640x200 image. I don't have personal experience with the Tandy 1000 and don't know how compatible it is off the top of my head.


Thank you for your reply, it is comforting to know the tricks are solid. Maybe it will help some retro-programmed CGA games appear with more colors and effects.

I have a VGA XT clone (4.77/8 MHz) and an ATI Small Wonder CGA clone board. I will try putting them together and running Area 5150.


The ATi Small Wonder is not entirely cycle-accurate. When I studied the CGADEMO by Codeblasters, I found that the scroller didn't look entirely correct, because the Small Wonder was slightly slower: https://scalibq.wordpress.com/2014/11/22/cgademo-by-codeblas... It appears to insert a few extra waitstates compared to an IBM card. This may trip up some effects in Area 5150.


As I remember it, the term "XT class" was used back in the day to distinguish between "AT class" PCs and those prior - i.e. to mean an IBM PC, XT or comparable machine. So (weirdly enough) the 5150 was considered XT class despite predating the XT (they're almost identical as far as software is concerned anyway). The term "PC class" wasn't a thing because PC came to be shorthand for any IBM PC/XT/AT or later x86 machine. Some more powerful machines like the Amstrad PC1512 (with an 8MHz 8086) were also considered XT class - these machines could run games like Bruce Lee and Digger, which were designed for the PC/XT (and which were too fast on AT and later machines), though the gameplay ran quicker than intended, so those games were extra-challenging.


Yeah, that seemed to be the usage in most contemporary literature/magazines. There are also those other architectural factors like the bus width, the number of PICs, the keyboard subsystem and so on, which is why you could have "XT-class" 8086 and even 286 machines (like a Tandy 1000 model or two).


What may also have had an impact was that clones were generally XT-clones (using the newer, smaller ISA card layout, and no cassette port), not PC-clones. And indeed, they were specifically advertised as 'XT-clone', 'XT-class' and such. So the 5150 is really the oddball here. Back in the old days I understood the usage as follows: 'PC' was a catch-all term for all IBM PC-compatible machines that ran DOS. 'XT' was the term used for machines with an 8088 (or sometimes V20) CPU (and of course there was the 'Turbo XT' subclass for CPUs running at more than 4.77 MHz). 'AT' was the term for 286 machines. After that era, people just identified PCs by the CPU used, so you had '386es', '486es', 'Pentiums' etc.


The C000 segment was used for the EGA/VGA extension ROM. I'm guessing that using D000-EFFF would be unnecessary (because of the planar addressing squeezing 256kB of video memory into a 64kB address space), inconvenient (because the addresses wouldn't be contiguous - EGA and VGA were designed to coexist with either CGA or monochrome adapters in B000-BFFF) and (for VGA) insufficient - you'd still not have enough to map the entire 256kB of VRAM linearly. I also expect that IBM's engineers didn't want to take up all the extension ROM space because then it wouldn't be possible to add EMS cards, network cards, and whatever else ended up being mapped there. Though 192kB of write-only video memory in that space would be an interesting design!


The programmatic transitions between still images use linear RGB space, which is the correct way to interpolate between two colours. The maths behind it is pretty simple - essentially just reversing the gamma correction from normal (0-255) sRGB space before the interpolation and redoing it for the final colour (no need to get into the hairy areas of LAB or perceptual colour spaces for this). Once we know the colour we want, we choose the character (from a list of 6) and attribute which most closely matches that colour. Of course, it's all done using lots of lookup tables so that we can process several hundred character cells per frame.
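
As a very rough sketch of that idea (Python, with made-up function names and a tiny toy palette rather than the demo's actual lookup tables):

  # Blend two colours in linear RGB (simple 2.2 gamma for brevity), then pick
  # the closest entry from a precomputed palette of character/attribute combos.

  def srgb_to_linear(c):
      return (c / 255.0) ** 2.2

  def linear_to_srgb(c):
      return round(255.0 * (c ** (1.0 / 2.2)))

  def mix(colour_a, colour_b, x):
      # Interpolate in linear RGB, then convert back to 8-bit sRGB values.
      return tuple(
          linear_to_srgb(srgb_to_linear(a) * x + srgb_to_linear(b) * (1.0 - x))
          for a, b in zip(colour_a, colour_b)
      )

  def nearest(palette, target):
      # Pick the palette entry closest to the target colour in linear RGB.
      def dist(entry):
          return sum((srgb_to_linear(p) - srgb_to_linear(t)) ** 2
                     for p, t in zip(entry[1], target))
      return min(palette, key=dist)

  # Hypothetical palette of (character/attribute id, perceived colour) pairs.
  # The real thing has 6 characters * 16 * 16 attributes = 1536 combinations,
  # all precomputed into lookup tables.
  palette = [(0, (0, 0, 0)), (1, (0, 0, 170)), (2, (170, 85, 0)), (3, (255, 255, 255))]

  wanted = mix((255, 255, 85), (85, 85, 255), 0.5)
  print(nearest(palette, wanted))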


My point is that sRGB linear interpolation does not accurately model the blending of color that happens in nature, which is something that should happen with CP-437 shade blocks. Here's a web page I set up so you can see for yourself: https://justine.lol/color.html Even within the context of sRGB itself, it's not "correct" at all. For example, yellow and blue should make pink, but sRGB predicts grey. Depending on how good your monitor is, the shade blocks should make pink, though even that is usually thwarted somewhat by subpixel layout. For example, if you drag the window around on an LCD, the coloring of the shade blocks will change weirdly as the subpixels move. Subpixel layout shouldn't be an issue with IBM PC CRT monitors; I wish I had one to confirm what it does. In general this is just one of the weaknesses of the sRGB colorspace. If you use something like CIELAB, it does predict pink, but that space has weaknesses of its own. Those problems are nowhere near as challenging as shade blocks, though, because linear interpolation just chooses a new color and doesn't need to predict anything. To choose colors for shade blocks you're applying your model to predict a more natural phenomenon.


There's a difference between linear interpolation of sRGB values (what you're doing on your page) and linear interpolation of linear RGB values (which gives better results). This is the explanation that made it all fall into place for me: http://www.ericbrasseur.org/gamma.html?i=1 . Linear RGB is also just a linear (matrix) transformation of the CIE XYZ space, so doing the interpolation in linear RGB is equivalent to transforming to XYZ space, doing the interpolation there, and then transforming back.

However, for the purposes of Area 5150 I think the differences between sRGB interpolation and linear RGB interpolation would have been too subtle to notice since there are only 6 * 16 * 16 = 1536 dithered colour/pattern combinations to choose from in the first place - the error introduced by that quantisation is likely larger than the sRGB vs. linear RGB difference. But I used linear RGB anyway, just to be correct about it.


I agree that gamma correction helps, but we're not talking about gamma, we're talking about color. Here's a version of the page I shared earlier with ITU REC BT.701 gamma correction applied: https://justine.lol/color-gamma.html It still has the same color problems. In fact they should be easier to see now that the white/dark issue is fixed.

Could you explain how linear interpolation is different from sRGB interpolation? I would have thought they were the same thing. If by sRGB you mean interpolating but being lazy about gamma, I'll be the first to admit that's just plain old incorrect, even though laziness is sometimes a virtue.

Also are you one of the demo authors? If so we could probably move this conversation to Discord or email and we could try some more blending methods!


Yes, I'm reenigne from the demo. Feel free to contact me at andrew@reenigne.org.

I'm not sure what you're doing with the colour mixing on that page but I'm wondering if you're just applying a gamma curve to the mixed result. This is what I meant:

sRGB interpolation:

  r_final = r_1*x + r_2*(1-x)
  g_final = g_1*x + g_2*(1-x)
  b_final = b_1*x + b_2*(1-x)

linear RGB interpolation:

  r_final = 255*((((r_1/255)^2.2)*x + ((r_2/255)^2.2)*(1-x))^(1/2.2))
  g_final = 255*((((g_1/255)^2.2)*x + ((g_2/255)^2.2)*(1-x))^(1/2.2))
  b_final = 255*((((b_1/255)^2.2)*x + ((b_2/255)^2.2)*(1-x))^(1/2.2))

The real gamma correction formula is actually slightly more complicated than that, because it's linear up until the sRGB value is about 10 and then follows a ^2.4 curve, but the difference is too small to notice.
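
For reference, a rough Python sketch of that exact piecewise curve and the interpolation (the function names here are just illustrative):

  def srgb_decode(v):
      # 8-bit sRGB value -> linear light: linear segment below s = 0.04045
      # (an sRGB value of about 10), then ((s + 0.055) / 1.055)^2.4 above it.
      s = v / 255.0
      return s / 12.92 if s <= 0.04045 else ((s + 0.055) / 1.055) ** 2.4

  def srgb_encode(l):
      # Linear light -> 8-bit sRGB value (inverse of the above).
      s = 12.92 * l if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055
      return round(255.0 * s)

  def mix_channel(v1, v2, x):
      # Linear RGB interpolation of one channel using the exact sRGB curve.
      return srgb_encode(srgb_decode(v1) * x + srgb_decode(v2) * (1.0 - x))

  print(mix_channel(255, 0, 0.5))   # ~188, not the 128 that sRGB interpolation gives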


Thanks! The 3D hills are no less 3D than any other 3D scene displayed on a 2D monitor. It's a heightmap (of a real place) rendered with correct perspective.
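
The real effect is of course nothing like this, but the general principle (march away from the camera along each screen column and project each height sample with a 1/z perspective divide) can be sketched in a few lines of Python - everything below (names, dimensions, data layout) is made up purely for illustration:

  SCREEN_W, SCREEN_H, HORIZON, SCALE = 80, 50, 25, 60.0

  def render(heights, colours, cam_x, cam_y, cam_alt):
      # heights/colours: dicts keyed by integer (x, y) map coordinates (assumed).
      screen = [[' '] * SCREEN_W for _ in range(SCREEN_H)]
      for col in range(SCREEN_W):
          dx = (col - SCREEN_W / 2) / (SCREEN_W / 2)  # ray direction for this column
          y_clip = SCREEN_H                           # top of the filled part of this column
          for z in range(1, 64):                      # march front to back
              mx, my = int(cam_x + dx * z), int(cam_y + z)
              h = heights.get((mx, my), 0)
              # perspective: apparent height above the horizon falls off as 1/z
              y = int(HORIZON - (h - cam_alt) * SCALE / z)
              for row in range(max(y, 0), y_clip):    # draw only what isn't occluded
                  screen[row][col] = colours.get((mx, my), '#')
              y_clip = min(y_clip, max(y, 0))
      return '\n'.join(''.join(row) for row in screen)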

Getting the samples to output at the correct time during the end titles required a painstaking amount of cycle-counting, measuring, and iteration.


Wow, it's amazing that it could be done efficiently enough to get such a fluid animation. Thanks for explaining!

