The one difference this modification might make is that if it is in a multi-chip system, and another chip is sending inputs to this multiplier, you have to wait for the cross-chip delay AND the multiply before you can clock the registers. If you put a register stage on the inputs, you shorten that delay.
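As a rough illustration of the input register stage described above, here is a minimal sketch. The module name `mult_regin`, the clock name `clk`, the output name `p`, and the 8-bit default width are all assumptions made for the example; they do not come from the assignment.

```verilog
// Hypothetical module: names and the default width W = 8 are assumptions.
module mult_regin #(parameter W = 8) (
    input  wire           clk,
    input  wire [W-1:0]   a,   // assumed to arrive from another chip
    input  wire [W-1:0]   b,
    output reg  [2*W-1:0] p
);
    reg [W-1:0] a_r, b_r;      // input register stage

    always @(posedge clk) begin
        a_r <= a;              // first edge: capture the off-chip inputs
        b_r <= b;
        p   <= a_r * b_r;      // next edge: multiply the registered values
    end
endmodule
```

With this arrangement the cross-chip delay and the multiply fall in separate clock periods, so the clock only has to cover the longer of the two rather than their sum. The price is one extra cycle of latency.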
Maximum Points: 6
Using non-blocking assignments ensures that all inputs are sampled before any of the outputs are assigned. When we change to blocking assignments, each assignment samples its input and writes its output immediately, so the next assignment reads the value that was just written. The pipeline therefore just passes values through without holding them for 4 clock cycles. If we were to invert the order of the assignments, i.e.

    c = c3;
    c3 = c2a * c2b;
    c2b = c1b;
    c2a = c1a;
    c1b = b;
    c1a = a;

then the Verilog code would work properly, since we are forcing each assignment to sample its input before that input is overwritten.
My additional note: In this case, you can guarantee the sequence of evaluation through the begin/end statements. If there are multiple begin/end blocks that may all be evaluated simultaneously, relying on the ordering of blocking assignments is risky. That's why the language was extended with non-blocking assignments. Use them.
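For comparison, the non-blocking version recommended above might look like the following sketch. The register names `c1a`, `c1b`, `c2a`, `c2b`, `c3`, and `c` are taken from the solution; the module name `pipe_mult`, the clock name `clk`, and the 8-bit default width are assumptions for illustration only.

```verilog
// Hypothetical wrapper: module name, clk, and W = 8 are assumptions.
module pipe_mult #(parameter W = 8) (
    input  wire           clk,
    input  wire [W-1:0]   a,
    input  wire [W-1:0]   b,
    output reg  [2*W-1:0] c
);
    reg [W-1:0]   c1a, c1b;    // stage 1: registered inputs
    reg [W-1:0]   c2a, c2b;    // stage 2: delayed inputs
    reg [2*W-1:0] c3;          // stage 3: product

    always @(posedge clk) begin
        // Every right-hand side is sampled before any left-hand side
        // is updated, so the textual order of these lines does not matter.
        c1a <= a;
        c1b <= b;
        c2a <= c1a;
        c2b <= c1b;
        c3  <= c2a * c2b;
        c   <= c3;             // stage 4: registered output
    end
endmodule
```

Because the assignments are non-blocking, the statements can be listed in either order and a value still takes 4 clock edges to travel from `a` and `b` to `c`.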
Maximum Points: 7
Extra Credit: 4
Working: 15 pts.
Proximity to Convex Hull: 0 - 15 pts.
MOPS/CLB: > 0.20 10 pts. > 0.15 5 pts. > 0.10 5 pts. > 0.05 5 pts.
Maximum Points: 55
Separate Multiplier Criteria:
Working: 10 pts.
MOPS/CLB: > 1.0 5 pts. > 0.75 5 pts. > 0.5 5 pts. > 0.25 5 pts.
Total possible: 30 pts each * 2 = 60

Combined Multiplier Criteria:
Working: 20 pts.
MOPS/CLB: > 0.5 5 pts. > 0.4375 5 pts. > 0.3750 5 pts. > 0.3125 5 pts. > 0.2500 5 pts. > 0.1875 5 pts. > 0.1250 5 pts. > 0.0625 5 pts.
Total possible: 60 pts.

Maximum Points: 60
Extra Credit: 4