Because we didn't need to learn a new language/IDE/environment at the same time that we learned a new paradigm, we were able to keep our feet on solid ground while working things out; we were familiar with the syntax, so as soon as we realized how to "wire something up," we could do so with minimal frustration and no need/ability to Google anything. Of course, it was left to a subsequent course to learn HDL and load it on real hardware, but for a theoretical basis, this was a perfect format. Much better than written tests!
 http://www.cs.princeton.edu/courses/archive/fall10/cos375/de... - see links under Design Project, specifically http://www.cs.princeton.edu/courses/archive/fall10/cos375/Cp...
You're right that it helps a LOT when it comes to implementing the actual hardware.
(I also put it through Vivado HLS, but I wasn't able to sneak that past the professor, rats! :)
I wish I had known about Verilator back then; I could have compiled the Verilog into C++ and run my simulator test suite against it!
From the perspective of an electrical engineer and computer scientist: asynchronous circuits can, in theory, be faster and more efficient. Without the constraint of a clock slowing an entire circuit down to its slowest component, asynchronous circuits can operate as soon as data is available, while spending less power on overhead such as generating the clock and powering components that are not changing state. However, asynchronous circuits are largely the plaything of researchers, and the vast majority of today's circuits are synchronous (clocked).
The reason we use synchronous circuits, which may also be the reason many students learning circuits try to build them without clocks, is abstraction. Clocked circuits let individual components/stages be developed and analyzed separately. Problems that don't pertain to the function of a stage, such as data availability and stability, are left to the clocking of the overall circuit (clk-to-q delay, hold time, etc.), so you can focus on the functionality within an individual stage. Components can also be analyzed by the tools we've built to automate the difficult parts of circuit design: routing, power supply, heat dissipation, and so on. This makes developing complex circuits with large teams of engineers "easier." The abstraction of synchronous circuits sits one step above asynchronous circuits. Without a clock, a circuit's outputs can be briefly wrong due to race conditions; synchronous design prevents this by holding the information between stages stable until everything is ready to go.
The article's point that hardware design begins with the clock is useful when you are teaching software engineers, who are used to thinking in a synchronous, ordered manner, about practical hardware design, which is done almost entirely with clocks. It is not the complete picture, however, if you are trying to build an understanding of electrical engineering from the ground up. Synchronous circuits are built from asynchronous circuits, which are built from our understanding of E&M physics. Synchronous circuits in turn build the ASICs, FPGAs, and CPUs that power our routers and computers, which run instructions from ISAs that we compile down to from higher-level languages. It's hardly surprising that engineers learning hardware design build clockless circuits; they aren't wrong for designing something "simple" and correct, even if it isn't currently practical. They're just operating at the wrong level of abstraction, one they should have a cursory knowledge of so that synchronous circuits make sense to them.
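To make the stage-isolation idea above concrete, here is a minimal Verilog sketch (module and signal names are invented for illustration): two clocked stages, where the pipeline register means each stage can be designed and timed without knowing anything about the other's internals.

module two_stage (
  input  wire       clk,
  input  wire [7:0] in_a,
  input  wire [7:0] in_b,
  output reg  [9:0] result
);
  reg [8:0] stage1_q;  // pipeline register between the two stages

  // Stage 1: the adder's output may glitch while it settles, but only
  // the value present at the clock edge is captured.
  always @(posedge clk)
    stage1_q <= in_a + in_b;

  // Stage 2: sees a stage1_q that is held stable for a full clock
  // period, regardless of what stage 1's logic is doing internally.
  always @(posedge clk)
    result <= stage1_q + 1;
endmodule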
NO CLOCKS: Most computing devices have one or more clocks that synchronize all operations. When a conventional computer is powered up and waiting to respond quickly to stimuli, clock generation and distribution are consuming energy at a huge rate by our standards, yet accomplishing nothing. This is why "starting" and "stopping" the clock is a big deal and takes much time and energy for other architectures. Our architecture explicitly omits a clock, saving energy and time among other benefits.
Wouldn't (bitcoin) miners be a good fit for async circuits? Simple logic, low power, high performance.
This chip has other good properties in addition to "no clock".
Does this confusion typically happen to engineers who are trying to teach themselves hardware design, or is it just an indication of a terribly-designed curriculum?
It doesn't help that Verilog can be used as an imperative language executing statements in order, nor that it has two different assignment operators, only one of which is appropriate in any given context (see the sketch below), nor that the simulator does not enforce realistic restrictions.
Nor does it help that the practical way to write Verilog/VHDL is to decide what output you want the compiler to give, then work backwards to the HDL input. (I've met actual design companies that turned this into a rigid waterfall flow: full block diagram before you write a line of HDL.)
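On those two operators, a hedged sketch (names are mine): read imperatively, line by line, both versions below look like "copy d into q", yet they describe different circuits.

module shift2 (
  input  wire clk,
  input  wire d,
  output reg  q
);
  reg stage1;

  // Non-blocking (<=): both registers update from their pre-edge
  // values, so this is a two-flip-flop shift register.
  always @(posedge clk) begin
    stage1 <= d;
    q      <= stage1;
  end

  // Had this used blocking (=) assignments instead:
  //   stage1 = d;
  //   q      = stage1;   // reads the NEW stage1, so q gets d directly
  // the same-looking code would synthesize to a single flip-flop.
endmodule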
Your curriculum seems very sensible to me, you learned the basics first. After all HDLs are just describing the underlying circuit, you need to know how to design it first.
In my experience, the root of most confusion is that for most people (including pure electronics engineers), their "first contact" with anything digital is through software programming, which, despite the recent trend toward multicore and parallel execution, is highly sequential.
When you try an HDL for the first time after years of programming CPUs, it's natural to approach it with the same thought process. It just takes time to adapt to a new "way of thinking," nothing more. I'm sure someone who learns HDL design without prior software experience will face the same issues when he/she later decides to jump to software engineering.
Maybe engineers need to be introduced to the synthesis tools at the same time as the simulator tools.
Simulating RTL is only an approximation of reality, so emphasizing RTL simulation is bad. You see it over and over, though: people teach via RTL simulation.
Synthesis is the main concern: can the design be synthesized into hardware and meet the constraints? All the combinatorial logic gets transformed into something quite different in an FPGA.
> The reality is that no digital logic design can work “without a clock”. There is always some physical process creating the inputs. These inputs must all be valid at some start time – this time forms the first clock “tick” in their design. Likewise, the outputs are then required from those inputs some time later. The time when all the outputs are valid for a given set of inputs forms the next “clock” in a “clockless” design. Perhaps the first clock “tick” is when the last switch on their board is adjusted and the last clock “tick” is when their eye reads the result. It doesn’t matter: there is a clock.
Put another way, combinatorial systems (the AND/OR/etc. logic gates that form the hardware logic of the chip) have a physical propagation delay: the time it takes for the input signals at a given state to propagate through the logic and produce a stable output.
Do not use the output signal before it is stable. That way lies glitches and the death of your design.
Clocks are used to tell your logic: "NOW your inputs are valid".
The deeper your combinatorial logic (the more gates in a given signal path), the longer the propagation delay. And the maximum propagation delay across your entire chip determines your minimum clock period (and thus your maximum clock speed).
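A worked example with made-up numbers: if the slowest register-to-register path has 4 ns of combinational propagation delay, plus 0.5 ns of clk-to-q delay and 0.5 ns of setup time for the capturing flip-flop, the minimum clock period is 4 + 0.5 + 0.5 = 5 ns, i.e. a 200 MHz maximum clock.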
There exist clockless designs, but they get exponentially more complicated as you add more signals and the logic gets deeper. In a way, clocks let you "compartmentalize" the logic, simplifying the design.
 What's the most widespread fundamental gate in the latest fab processes nowadays? Is it NAND?
Or at least per clock domain.
Clockless designs usually use some other mechanism for marking ready signals. Many of those mechanisms compose linearly and can even interconnect better than clocked designs, because they don't need to care about clock domains.
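One such mechanism, sketched very loosely in behavioral Verilog (simulation-only; this shows the protocol, not a synthesizable asynchronous circuit, and all names are mine): a two-phase bundled-data handshake, where a req/ack pair marks data as ready instead of a clock edge.

module async_receiver (
  input  wire       req,   // sender toggles this when `data` is valid
  input  wire [7:0] data,
  output reg        ack    // receiver echoes req once data is consumed
);
  // Two-phase (transition) signaling: any change on req means "new data".
  // The sender must hold `data` stable until it sees ack == req
  // (the bundling constraint). No clock anywhere.
  always @(req) begin
    $display("consumed data = %0d", data);
    ack = req;
  end
endmodule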
Typical standard cell libraries have hundreds of cells, including cells representing logic such as muxes, full adders, and other frequently occurring clusters of gates. Logic is mapped to these in the way the tool finds most optimal. So I don't think it's right to say that there is a fundamental gate in modern processes.
Unless you are looking at it from the electrical perspective, in which case the fundamental gate is the inverter.
It's been a while since I last talked about this with my HW Eng friends :)
Another way I try to explain hardware design to people coming from a software background:
You get one chance to put down in hardware as many functions as you want. You cannot change any of them later. All you can do later is sequence them in whatever order you need to accomplish your goal.
If you think of it this way, you realize that the clock is critical (that's what makes sequencing possible), and re-use of fixed functions introduces you to hardware sharing, pipelining, etc.
But it's hard to grasp.
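A rough Verilog sketch of that mental model (everything here is invented for illustration): one adder is put down in hardware, and the clock sequences its reuse over three cycles to sum four inputs. A synthesis tool can implement the three mutually exclusive additions below with a single shared adder and a mux.

module seq_sum (
  input  wire       clk,
  input  wire       start,
  input  wire [7:0] in0, in1, in2, in3,
  output reg  [9:0] acc,
  output reg        done
);
  reg [1:0] step;

  always @(posedge clk) begin
    if (start) begin
      acc  <= in0;
      step <= 2'd1;
      done <= 1'b0;
    end else if (!done) begin
      case (step)
        2'd1: acc <= acc + in1;
        2'd2: acc <= acc + in2;
        2'd3: begin
                acc  <= acc + in3;
                done <= 1'b1;   // the fixed function, sequenced to the goal
              end
      endcase
      step <= step + 2'd1;
    end
  end
endmodule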
It is a PDF, though, which can be slower.
This is not true.
"HDL based hardware loops are not like this at all. Instead, the HDL synthesis tool uses the loop description to make several copies of the logic all running in parallel."
This is not true as a general statement. There are for loops in HDLs that behave exactly like software loops. And there are generative for loops that make copies of logic.
Also, "everything happens at once" is not true either. In fact, without the delay between two events, synchronous digital design would not work (specifically, flip-flops would not work).
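For contrast with software-style loops, here is a hedged Verilog sketch of the generative kind (parameter and names are mine): a generate-for that elaborates into N physical copies of a full adder, rather than executing anything at runtime.

module ripple_adder #(parameter N = 8) (
  input  wire [N-1:0] a, b,
  input  wire         cin,
  output wire [N-1:0] sum,
  output wire         cout
);
  wire [N:0] carry;
  assign carry[0] = cin;

  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : gen_bit
      // Each iteration elaborates into its own physical full adder.
      assign sum[i]     = a[i] ^ b[i] ^ carry[i];
      assign carry[i+1] = (a[i] & b[i]) | (carry[i] & (a[i] ^ b[i]));
    end
  endgenerate

  assign cout = carry[N];
endmodule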
HDLs do have loops, but they are for testbench purposes only; in a hardware implementation, non-generate loops would not be implemented!
I think by saying "everything happens at once" the author meant that all your code executes at once. He is obviously trying to get the one-line-at-a-time sequential mindset out of people used to software.
Here is a perfectly fine, synthesizable parity generator using a non-generative for loop (VHDL):
library ieee;
use ieee.std_logic_1164.all;

entity for_parity is
  port (input : in  std_logic_vector (7 downto 0);
        even  : out std_logic;
        odd   : out std_logic);
end for_parity;

architecture for_arch of for_parity is
  signal even_parity : std_logic;
begin
  calc : process (input) is
    variable temp : std_logic;
  begin
    temp := '0';
    -- XOR-reduce the input; synthesis unrolls this into a chain of XOR gates.
    for i in 0 to 7 loop
      temp := temp xor input(i);
    end loop;
    even_parity <= temp;
  end process calc;

  even <= even_parity;
  odd  <= not even_parity;
end for_arch;
The GA144, for example, is a practical computer implemented completely in asynchronous logic.
Its asynchronous nature is one of its key features; benefits include lower power consumption, higher speed, and lower electromagnetic interference.
I think the author means something more along the lines of: all of your edge-triggered state machines in the same clock domain execute at once. It is unlike a traditional programming language, where statements execute sequentially.
asynchronous digital logic forms part of any reasonable digital circuit design course :)
It's not an argument against clockless design, but an observation on the limitations of doing everything in a single clock tick of an already clocked chip.
We can surely build a circuit without any clock, but what challenges do we run into? How does imposing discrete clock steps help? What exactly is a clock? I would have liked to see the discussion drop a level or two of abstraction to EE or something.
"I’ll spare him the embarrassment of being named, or of linking to his project. Instead, I’m just going to call him a student. (No, I’m not a professor.) This 'student'..."
Why was this needed? What purpose did it serve the greater article except for the author to espouse their own superiority?
This kind of stuff needs to get the hell out of engineering. It turns away potentially many brilliant people that could join the field but fear the rejection by peers.
Do you have any reason to believe that arrogance and proficiency are correlated?
I'm actually the only one who has posted on the site.
It looks like the Si are the mostly-combinational logic for each stage and the Pi are the pipeline registers between stages (nearly all signals between stages should be buffered by pipeline regs). IO is omitted, but it's the same overall architecture.
Clk ----------+----------------+------ ...
              v                v
  +-> S0 --> |P0| --> S1 --> |P1| --> ... --+
  |                                         |
  +-----------------------------------------+
Understanding the basics is important but I'm held up before the basics even start mattering.