For the sprites, did you consider a part on dealing with multiple overlapping sprites? (I just skimmed, so apologies if you did this somewhere and I missed it!) There are at least two approaches here, and I think the design trade-offs between them are interesting to explore.
The two approaches I can see are:
1. Load the relevant line of each sprite into flops during the blanking interval. When drawing, determine which sprites produce a pixel for a given output pixel, then apply ordering, taking transparent pixels into account, to determine which one wins. This needs a simultaneous read of all of the sprite flops for each output pixel.
2. Pre-render all of the sprite pixels for a line into a memory buffer during the blanking interval. If you read multiple sprite pixels per cycle from memory and take advantage of the fact that your FPGA can probably run at some multiple of the pixel clock (say 100 MHz vs the ~25 MHz pixel clock for 640x480), you can do quite a lot in the blanking interval.
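To make the two approaches concrete, here is a rough software model in Python rather than HDL. It is only a sketch: the `(start_x, pixels)` sprite format, the use of `None` for a transparent pixel, and both function names are my own assumptions, not anything from the posts. It only shows that the two orderings produce the same scanline.

```python
TRANSPARENT = None  # assumption: None marks a transparent sprite pixel


def compose_per_pixel(sprites, width):
    """Approach 1: for every output pixel, look at all sprites and pick the winner."""
    line = []
    for x in range(width):
        colour = TRANSPARENT
        # sprites are listed highest priority first
        for start_x, pixels in sprites:
            offset = x - start_x
            if 0 <= offset < len(pixels) and pixels[offset] is not TRANSPARENT:
                colour = pixels[offset]
                break  # first opaque pixel wins
        line.append(colour)
    return line


def compose_line_buffer(sprites, width):
    """Approach 2: pre-render sprites into a line buffer (done during blanking)."""
    buffer = [TRANSPARENT] * width
    # draw lowest priority first so higher-priority sprites overwrite it
    for start_x, pixels in reversed(sprites):
        for offset, colour in enumerate(pixels):
            x = start_x + offset
            if 0 <= x < width and colour is not TRANSPARENT:
                buffer[x] = colour
    return buffer


# Hypothetical test data: two overlapping sprites on an 8-pixel line.
sprites = [
    (2, [1, None, 1]),   # highest priority, with a transparent hole
    (3, [2, 2, 2, 2]),   # lower priority, shows through the hole
]
print(compose_per_pixel(sprites, 8))
print(compose_line_buffer(sprites, 8))
```

In hardware the inner loop of approach 1 becomes parallel logic evaluated every pixel, while approach 2 spends its cycles during blanking and then only needs one buffer read per pixel.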
I reckon 2 wins when you're wanting more sprites per line and higher bit-depths per sprite pixel (8-bit 256 colour sprites vs 2/4/8 colour sprites).
I've got a ~60% complete version of 2; maybe I should follow your lead, finish it off, and write about how it works.
The first sprites post, https://projectf.io/posts/hardware-sprites/ does include 4-bit sprites, but again no overlapping.
I am planning another sprite post with overlapping coloured sprites combined with a framebuffer background. I will probably use approach 2, as you describe, but with hardware design, I’ve found that the best algorithm can surprise you in practice.
SV is still a hardware description language, so you think and code in terms of modules and signals (wires) to be connected. But if you want to create structures that use variables and have a more "procedural" style, I find that SV is more agile than VHDL for this specific purpose.
If you want real high level languages for hardware there are tools that provide synthesis from C/C++, Python and more software oriented languages.
That sounds extremely interesting!
If that doesn't matter, then go for it!
A functional language like Haskell would probably be easier to compile to hardware.
I’ve not used VHDL in anger, so can’t give you a useful opinion there.
Here are a few sites that helped me learn:
* https://1bitsquared.com/pages/chat (Discord)
And a few interesting blogs:
None of the introductory FPGA books have impressed me, so I don’t have any recommendations there. Maybe some other HN readers will have some suggestions?
 - https://www.elsevier.com/books/digital-design-and-computer-a...
Forthcoming posts will cover simple 3D modelling with texturing and lighting in hardware. But that’s still a long way from a GPU.
Did you look at the sprite implementation in the various C64 and Amiga FPGA re-implementations? I haven't, but I'm curious about how they do the pixel merging (the trivial implementation, I think, is a prioritized 8-to-1 mux controlled by some priority selection logic).
The second post on Pong https://projectf.io/posts/fpga-pong/ includes collision checking between the ball and paddles at the drawing level.
I plan to cover full sprite collisions and overlap in a future post. In the meantime, there is a post that covers sprites with transparency: https://projectf.io/posts/hardware-sprites/
I haven’t looked at the FPGA re-implementations, but I am familiar with Amiga hardware: http://amigadev.elowar.com/read/ADCD_2.1/Hardware_Manual_gui...
However, from the point of view of these designs, the PLL is an internal black box that allows us to generate an (almost) arbitrary clock frequency.
We choose a frequency to meet I/O or performance requirements. In this case a pixel clock of ~25 MHz: we have to meet this to generate a valid 640x480 video signal.
None of the Exploring FPGA Graphics designs uses a CPU: the logic is all in hardware, often a finite state machine. The trick is ensuring the hardware logic completes within one clock cycle. For 25 MHz, each clock cycle is 40 ns. If a design is too complex to meet timing, we can break it into multiple steps, similar to pipelining on a CPU.
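The arithmetic behind that can be sketched in a few lines of Python. The function names are made up for illustration, and the stage estimate is deliberately naive: it assumes the logic splits evenly and ignores register setup and clock-to-output overhead, which real timing analysis would include.

```python
import math


def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def pipeline_stages(logic_delay_ns, freq_mhz):
    """Naive estimate of how many clock cycles a block of logic must be split over."""
    return math.ceil(logic_delay_ns / clock_period_ns(freq_mhz))


print(clock_period_ns(25))       # 40.0 ns per cycle at 25 MHz
print(pipeline_stages(30, 25))   # 1: fits in a single cycle
print(pipeline_stages(90, 25))   # 3: 90 ns of logic needs three 40 ns steps
```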
You can also run different parts of a design at different clock frequencies, but this introduces the challenge of clock domain crossing.
For a low-end FPGA, like an Artix-7, you can expect to run a reasonably complex design at 100-200 MHz.
This is also how PC BIOSes let you configure various CPU and DRAM speeds from a single physical crystal.
They're accurate enough for most practical purposes. They can take a large number of clock cycles to acquire "lock" and start producing the correct frequency, which rarely matters unless you need to go from 0 to X00MHz very quickly.
> system clock doesnt run fast enough to keep up with 60hz
It doesn't actually say that as far as I can tell?
In this instantiation of the concept, a frequency divider (dividing by a whole number) is something that's easy to build: it's a counter. For example, to divide by 100, count input clock cycles from 0 to 49; when you would increment from 49, wrap around to 0 instead and toggle the output.
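Here is that counter modelled in Python, one loop iteration per input clock edge. It is a behavioural sketch only; `divide_clock` is a name I made up, and it assumes an even divide ratio so the output has a 50% duty cycle.

```python
def divide_clock(num_input_cycles, divide_by=100):
    """Simulate a toggle-on-wrap counter; returns the output level after each input edge."""
    half = divide_by // 2   # assumption: divide_by is even
    count = 0
    out = 0
    outputs = []
    for _ in range(num_input_cycles):
        if count == half - 1:   # e.g. incrementing from 49 when dividing by 100
            count = 0
            out ^= 1            # toggle the output
        else:
            count += 1
        outputs.append(out)
    return outputs


out = divide_clock(400)
rising_edges = [i for i in range(1, len(out)) if out[i - 1] == 0 and out[i] == 1]
print(rising_edges)  # [49, 149, 249, 349]: one output cycle per 100 input cycles
```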
Putting that divider in a negative feedback loop gives you a multiplier.
Another common instantiation: put a voltage divider (easy to build with a couple of resistors) in the feedback path of an operational amplifier and you get a voltage multiplier.
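As a worked instance, assume an ideal op-amp in the non-inverting configuration, with the divider made of $R_1$ (output to inverting input) and $R_2$ (inverting input to ground). The resistor names are mine, not from the comment above. The divider feeds back a fraction $\beta$ of the output, and the loop settles where that fraction equals the input:

```latex
\beta = \frac{R_2}{R_1 + R_2}, \qquad
V_{\text{out}} = \frac{V_{\text{in}}}{\beta} = V_{\text{in}}\left(1 + \frac{R_1}{R_2}\right)
```

So dividing by $1/\beta$ in the feedback path multiplies by $1/\beta$ overall.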
Another instantiation with op-amps: Put a capacitor in the feedback path, and you "get" an inductor. This one is also a big deal because, practically speaking, it's easier to engineer close-to-ideal capacitors than it is inductors, which are heavy and lossy.
And if you really want to take a ride on this conceptual train, consider that it's easy to square a number. Need the square root? Use negative feedback: https://en.wikipedia.org/wiki/Newton%27s_method#Square_root
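A minimal Python sketch of that idea: square the current guess, compare it with the target, and feed the error back to correct the guess. This is the Newton iteration from the linked article; note that it also needs a division for the correction step, so it is not squaring alone. The function name and starting guess are my own choices, and it assumes a non-negative input.

```python
def sqrt_by_feedback(n, iterations=20):
    """Approximate sqrt(n) for n >= 0 by repeatedly correcting a guess x until x*x matches n."""
    x = n if n > 1 else 1.0          # arbitrary positive starting guess
    for _ in range(iterations):
        error = x * x - n            # the "easy" forward operation: squaring
        x = x - error / (2 * x)      # feed the error back (Newton step)
    return x


print(sqrt_by_feedback(2.0))
print(sqrt_by_feedback(144.0))
```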
A phase-locked loop can use a reference clock (generated outside the chip by an oscillating crystal, like in a watch) to generate many other clocks of different frequencies, each of which is corrected and guided by this reference clock.
In an FPGA there is no "system clock": each module can use its own clock at the frequency it needs, all defined and routed by the designer.
Either way the distance is considerable. FPGAs do allow you to do unique hardware interfacing things - good luck getting a PCI microcontroller - but even there they're catching up. The RP2040 can bitbang DVI with its peripherals.
If you’re into hardware, please consider the applications that FPGAs could help make transformative.
If any hardware hackers are out there, please contact me at firstname.lastname@example.org. To clarify, I’m obviously not Satoshi Nakamoto but I know who she is.