I'm just getting started on my hardware journey by working through the nand2tetris course. This has its own simplified hardware description language which works in the simulator they provide.
Looking through the linked site, it's nice to see that the verilog snippets are very similar to the simplified HDL in nand2tetris - almost identical in how pins and parts are composed.
I'm looking forward to getting hold of an ice40 board as mentioned (with the SRAM). There's a project out there which implements nand2tetris on ice40, and these seem available at the moment for a reasonable price.
So now with this VGA project, that's two reasons to order that nice ice40 board :-)
I’ve done nand2tetris a couple times (taking good notes the second time) and coincidentally played through nandgame.com for the first time this weekend. I’ve been trying out what people have done with bare metal Forth on a raspberry pi using a frame buffer and using C libraries to get input from a USB keyboard. I don’t really like having such dependencies so I was just thinking I should get out my Arty FPGA with the same VGA module he uses and just make my own computer instead. It’s hard to find stuff that works how I want and is in stock and is actually understandable.
This is definitely going to be my next project. It even uses Arty and the VGA module for it which I already happen to have (it only has a few bits for R, G, and B color though so it’s not perfect for images and gradients). There was example code[1] to use with it and it generates an image but I wasn’t sure how to make a frame buffer for it which this tutorial has!
I recently reviewed graphics rendering techniques (Computer Graphics from Scratch from No Starch Press) so that fits too. This’ll be fun!
On second thought, I’d rather stick with understandable VGA and hook up a keyboard as a matrix… I could use some other microcontroller and interface between the two with fewer pins, just not USB. Or a PS/2 keyboard… (Ben Eater has a tutorial, as always.)
This sounds very cool - it's interesting to hear what projects lie further down this trail.
I was initially looking at the Arty board, as it was referenced by a text book - all the exercises were based on it. It's a bit more pricey than the ice40 (enough to force a comparison against other boards), but it certainly looks more capable.
The icebreaker looks cool too, and small. I see they sell an HDMI output PMOD for it too. I wonder how that works (not that it's necessarily related to this VGA output example, anyway).
The HDMI PMOD is based around the TI TFP410 HDMI transmitter [1], or a compatible part from Silicon Labs. You feed it a VGA-style signal (parallel R/G/B signals, horizontal/vertical sync) with a pixel clock, and it generates an equivalent HDMI output. It's basically a drop-in replacement for most VGA designs.
Hello, I'm the author of the Project F blog. I've almost finished a complete overhaul of this series: animation and double-buffering are coming in October.
Nice work :) I think implementing a VGA controller seems a lot nicer in Verilog/VHDL than on an MCU.
The TI chip you're using for DVI looks interesting too; I hadn't heard of that before.
It looks like you're going to use the FPGA's BRAM for double buffering? I started implementing double buffering for LED strips in VHDL, but need to get back to finishing the SPI controller for it.
I've also got designs that generate DVI on the FPGA with TMDS encoding (no external IC required). I've never polished or written them up, but you can see an example here:
I'm using BRAM for framebuffers as it allows me to focus on the graphics rather than memory controllers and access. BRAM gives you dual ports and true random I/O; DRAM is much more complex.
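As a rough illustration (not the actual Project F code; the sizes and names below are made up), an inferred dual-port BRAM framebuffer can be as small as:

    module framebuffer #(
        parameter WIDTH  = 160,
        parameter HEIGHT = 120,
        parameter DATAW  = 4,                   // e.g. 4-bit colour index
        parameter DEPTH  = WIDTH * HEIGHT,
        parameter ADDRW  = $clog2(DEPTH)
    ) (
        input  wire             clk_sys,        // drawing clock
        input  wire             clk_pix,        // display clock
        input  wire             we,
        input  wire [ADDRW-1:0] addr_write,
        input  wire [ADDRW-1:0] addr_read,
        input  wire [DATAW-1:0] colr_in,
        output reg  [DATAW-1:0] colr_out
    );
        reg [DATAW-1:0] mem [0:DEPTH-1];        // inferred as block RAM

        always @(posedge clk_sys)               // port A: drawing logic writes
            if (we) mem[addr_write] <= colr_in;

        always @(posedge clk_pix)               // port B: scan-out reads
            colr_out <= mem[addr_read];
    endmodule

Each port gets its own clock, so the drawing logic and the scan-out can sit in different clock domains, which is exactly the "dual ports and true random I/O" convenience mentioned above.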
Thank you. I can't promise to get to the new design until early 2023 as I have many hardware designs I want to finish this year.
Once you've got a design working in Verilator, I strongly recommend running it on an actual board if you can: nothing beats the feeling of running on hardware :)
The big disadvantage of not having a real FPGA is that you won't be conscious of the very real LUT/gate limits. A simulation will happily allow you to apply all sorts of nice compartmentalizations and abstractions, without making you understand that they would cost significant money if you tried to find an FPGA to fit them into.
This was my biggest shock when first working with FPGAs, while naively using a software mentality. Most everything had to be re-written once the simulations were done.
Have you tried writing a spec-compatible HDMI/DP controller in HDL? It's insanely complex.
VGA is a walk in the park that's sane and easy to understand for beginners. You just need to modulate the 3 primary colors between 0-255 (8 bits per channel through a simple DAC) and generate the two sync signals (horizontal and vertical), and the monitor will display a picture.
VGA is also very simple to debug: as long as there's some signal on the color and sync lines, the monitor will display something, letting you visually see what's wrong. With a digital link, if some part of the handshake fails you get no picture at all, so there's no way to figure out what's wrong without specialized knowledge and equipment.
A CRT is basically its own oscilloscope for the signals on the VGA cable, making development a joy.
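For concreteness, a minimal sync generator for the standard 640x480@60 mode is about all it takes (a sketch with illustrative names, assuming a ~25.175 MHz pixel clock):

    module vga_sync (
        input  wire clk_pix,              // ~25.175 MHz pixel clock
        output wire hsync,                // active low for this mode
        output wire vsync,                // active low for this mode
        output wire de,                   // high during visible pixels
        output reg  [9:0] x = 0,          // current position in the frame
        output reg  [9:0] y = 0
    );
        // standard timing: visible, front porch, sync, back porch
        localparam H_VIS = 640, H_FP = 16, H_SYN = 96, H_BP = 48, H_TOT = 800;
        localparam V_VIS = 480, V_FP = 10, V_SYN = 2,  V_BP = 33, V_TOT = 525;

        always @(posedge clk_pix) begin
            if (x == H_TOT - 1) begin
                x <= 0;
                y <= (y == V_TOT - 1) ? 10'd0 : y + 1;
            end else begin
                x <= x + 1;
            end
        end

        assign hsync = ~(x >= H_VIS + H_FP && x < H_VIS + H_FP + H_SYN);
        assign vsync = ~(y >= V_VIS + V_FP && y < V_VIS + V_FP + V_SYN);
        assign de    = (x < H_VIS) && (y < V_VIS);
    endmodule

Drive the three color pins with your pixel value whenever de is high (most hobby boards just use a resistor-ladder DAC for this), hold them at zero during blanking, and the monitor locks onto it.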
DVI is almost literally digitized VGA, with the analog signals replaced by differential signalling and symbol encoding to keep it DC-balanced. HDMI extends it by encoding audio and extra metadata (e.g. the colorspace used) inside the blanking intervals (yes, they still use the blanking intervals that were originally required for CRT displays). It doesn't do any handshake. The main difficulty is just the high clocks required, and the need to dynamically pick symbols (unless you only use a few preselected DC-balanced values). DisplayPort and HDMI 2.1 are different, with packet-based transmission.
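To give a sense of what "dynamically pick symbols" involves, here is a rough transcription of the standard DVI 8b/10b TMDS encode for one channel (a sketch only, not verified in hardware; the control period is reduced to a single fixed code and the 10:1 serialiser is omitted):

    module tmds_encode (
        input  wire       clk_pix,
        input  wire       de,       // high during active video
        input  wire [7:0] d,        // one 8-bit colour component
        output reg  [9:0] q = 0     // 10-bit symbol for the serialiser
    );
        // stage 1: XOR or XNOR chain, whichever gives fewer transitions
        wire [3:0] ones_d = d[0]+d[1]+d[2]+d[3]+d[4]+d[5]+d[6]+d[7];
        wire use_xnor = (ones_d > 4) || (ones_d == 4 && d[0] == 1'b0);
        wire [8:0] q_m;
        assign q_m[0] = d[0];
        genvar i;
        generate
            for (i = 1; i < 8; i = i + 1) begin : xorchain
                assign q_m[i] = use_xnor ? ~(q_m[i-1] ^ d[i]) : (q_m[i-1] ^ d[i]);
            end
        endgenerate
        assign q_m[8] = ~use_xnor;

        // stage 2: send q_m as-is or inverted, whichever keeps the
        // running disparity (cnt) closer to zero (the DC balancing)
        wire [3:0] ones_q  = q_m[0]+q_m[1]+q_m[2]+q_m[3]+q_m[4]+q_m[5]+q_m[6]+q_m[7];
        wire [3:0] zeros_q = 4'd8 - ones_q;
        reg signed [4:0] cnt = 0;

        always @(posedge clk_pix) begin
            if (!de) begin
                cnt <= 0;
                q   <= 10'b1101010100;  // control code for {vsync,hsync}=00;
                                        // a real encoder picks one of four
            end else if (cnt == 0 || ones_q == zeros_q) begin
                q   <= {~q_m[8], q_m[8], q_m[8] ? q_m[7:0] : ~q_m[7:0]};
                cnt <= q_m[8] ? cnt + (ones_q - zeros_q) : cnt + (zeros_q - ones_q);
            end else if ((cnt > 0 && ones_q > zeros_q) || (cnt < 0 && zeros_q > ones_q)) begin
                q   <= {1'b1, q_m[8], ~q_m[7:0]};
                cnt <= cnt + {3'b0, q_m[8], 1'b0} + (zeros_q - ones_q);
            end else begin
                q   <= {1'b0, q_m[8], q_m[7:0]};
                cnt <= cnt - {3'b0, ~q_m[8], 1'b0} + (ones_q - zeros_q);
            end
        end
    endmodule

The 10-bit symbols then leave the FPGA through DDR/serdes primitives at 10x the pixel clock per channel, which is where the "high clocks required" part bites.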
HDMI 2.0 requires bidirectional communication between source and sink through SCDC registers to set parameters such as scramble_enable, clock ratio, and to read status flags such as clock detection, channel lock status, bit error rates etc.
I think from a hardware perspective this would be non-trivial, despite sounding like an obvious thing in principle.
In sending the entire frame with every refresh you get pixel addressing "for free", since you just send the data for each pixel sequentially in a predefined order. If you only wanted to update a single pixel, you would need to effectively send an instruction saying "set pixel x to rgb: a, b, c" or whatever, including both the pixel colour values and the pixel address. You'd also need some sort of edge detection for when a pixel is supposed to change, which adds a delay.
This is fine for one pixel, but if every pixel on the screen changes at once then suddenly you are going to be sending a hell of a lot of instructions that contain not only the colour values, as in the "old CRT style" method, but also the address of every individual pixel, which will then have to be decoded by the screen-side hardware.
All in all, you'll be using a lot more bandwidth and will need a much faster clock to do all of that in the same period of time. In the old school way, you just have a pixel clock that matches the pixel rate and some serdes for serialisation/deserialisation on each end which imo is considerably simpler.
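To put rough numbers on that (back-of-envelope, assuming 24-bit colour and no compression): 1920x1080 at 24 bits per pixel is about 6.2 MB per frame, or roughly 373 MB/s of pixel data at 60 Hz. Addressing pixels individually needs at least 21 bits of address per pixel (2^21 > 2,073,600 pixels), so a full-frame update in an addressed scheme carries close to double the data of a plain raster scan, before any command framing overhead.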
If you don't send the entire buffer, then a monitor would need at least 6,220,800 bytes of RAM dedicated to the frame buffer (1920x1080 at 24 bits per pixel), do its own refresh of the panel (standard 60Hz), and accept commands to overwrite said memory with new data, partially or completely.
That solution is far from what we have now, and much more like a serial LCD controller.
I think people can come up with more efficient protocols than sending x,y,rgb for every pixel. What you call "edge detection" is not necessary if you use shadow buffers. Yes, the display would need some memory, but this is measured in tens of megabytes which is not much by today's standards.
* a sink-side memory of tens of megabytes is either on-chip memory (very costly) or on-PCB DRAM (still costly). For high refresh rate monitors, the memory would need to be high BW too. Note that there aren't any modern memory standards that are high BW but low storage. DRAMs with a capacity of just a few MB would be very much pad limited.
* source side, you'd need a shadow buffer as well, and double the read BW to detect the difference between the previous and the current frame.
All of that is technically achievable, but none of it matters for desktop monitors: the power savings are just too small.
Laptops are a different matter. Many laptop LCD panels already support self-refresh. (Google "Panel Self Refresh")
But that's for cases where the screen is static: the user is staring at the screen, not moving their mouse, the cursor is static. The benefit is not just putting the link to sleep but also the GPU.
That's the low hanging fruit. Partial screen update doesn't save a lot more, because you'd need to power up the GPU for that.
Seems like having memory embedded in the display for a shadow buffer, and the diffing algorithm you’re proposing, could easily undo any power savings you’d get, and then some. Why are you certain that saving power is simple? How much power does the data protocol consume compared to the display itself? Isn’t the data transmission power in the noise margin of the power requirements for an active display, CRT or LCD?
VGA is incredibly easy to generate - you can do it by bitbanging (carefully turning on and off) GPIO pins on an Arduino [1], simply because the tolerances are insanely huge. A step above is SD-SDI [2], which uses fewer pins but has stricter requirements on timing, and equipment accepting SDI is usually only found in the professional-grade TV production space.
DVI, HDMI, DisplayPort or heaven forbid Thunderbolt are a hot mess - to generate a signal for these without dedicated chips, you need a ton of logic and parts: output format negotiation, extremely strict timing requirements that completely rule out bitbanging, signal conditioning, specially shielded cables, shielded traces on the circuit board.
> Why are we still generating video signals as if there is a CRT on the other end?
Because it's extremely simple electronically. It was designed to be simple to work with very slow electronic hardware made of discrete components. So it's basically child's play to generate it with a modern FPGA as there are huge error bars even if you make some small mistakes.
Also blanking areas are still important because at the end of a pixel row there is still some amount of extra electronic stuff happening that is slightly slower than jumping to the next pixel. Even on LCDs or OLEDs.
Even 4K HDR HDMI 2.1 still has something akin to blanking intervals.
Because VGA is fundamentally an analogue standard built around how a CRT works. Each part of the signal serves a well-defined purpose based on how a CRT, and thus a vacuum tube, works. This is also why VGA is so crazy to capture digitally: there are a lot of things that are "VGA" yet well outside the IBM standard. Monitors often don't care, because as long as they can adjust their pixel clock (something they can do by detecting the blanking intervals) they can display the signal; CRTs don't fundamentally have a "resolution", they have a "dot pitch", which is different. But a digital monitor like an LCD has to detect and verify the stability of the signal, then try to decode it into something that makes sense, and that can take multiple attempts to get right. Most hardware also makes assumptions based on the common resolutions, because there is no resolution information in the signal... just blanking. So without EDID it can be a real challenge to deal with VGA capture. IIRC you can have multiple resolutions at approximately the same pixel clock; because of how VGA works that's fine, a CRT doesn't care... but digital capture needs to know how many times per second to sample so it can make pixels.
The purpose of this site is to teach the fundamentals of creating graphics and having them show up on an external display. Driving a VGA signal is extremely straightforward compared to the complexity of HDMI. Generating an HDMI signal is arguably not the main goal of the site, so using VGA makes total sense imo.
DisplayPort has two features that allow this: Panel Self-Refresh and Panel Replay. Otherwise, if you don't send the entire frame every sync cycle, the display will lose signal and go blank.
Around 10-12 years ago someone with industry experience started a project to create a modern open source GPU from scratch. I believe he even had a sponsor for a first round of ASIC production, but was targeting FPGA for the first demos.