Eight-bit ALU
Testing Results
Outline:
Project Overview
Testing Equipment
Connections
Pin Map
Tests Performed
Results
Conclusion
The custom-made 8-bit ALU chip was designed last semester to perform bitwise
logical operations, various shifts, two's complement negation, addition,
decrement, and increment. The chip has two input registers and one output
register, which has three extra bits (flags) that indicate the status of
the previous operation. The chip operates on two clocks that are 180 degrees
out of phase, and the outputs are synchronized to one of those clocks. In
addition, the chip has an Internal Testing Module, which allows monitoring of
certain internal busses. For more details see the report describing the 8-bit ALU.
Five chips were fabricated and tested this semester, and the results show
that all five chips are fully functional with a 4-bit clock at rates up to
and including 34 MHz, that is, at a clock frequency of 34 MHz / 4 = 8.5 MHz
(higher frequencies could not be tested with this clock because of
instrument limitations).
The main instrument that was used to test the chips was the Omnilab logic
analyzer. This device is capable of sampling 48 channels at 34 MHz, while
generating signals for 24 channels of outputs. In addition, the logic analyzer
provides a power supply, which was more than sufficient for our needs.
For the testing of the 8-bit ALU, 32 channels were used to monitor both
the outputs and the inputs of the chip; thus every pin was monitored except
for the 6 power supply pins and 2 unused pins. The 17 inputs were generated
by the Omnilab instrument. A breadboard was used to hold the chip, data
acquisition connector, stimulus connector, and the wires. To make sure
that the signals monitored by the logic analyzer were as close as possible
to the signals received by the chip, the wires connecting the data
acquisition connector to the generated signals were inserted into the
holes next to the pins of the chip, instead of being connected directly to
the data generation connector. This significantly reduced the possibility of
the chip receiving corrupted data unnoticed because of poor wiring.
The connections between the chip and the Omnilab instrument were made
according to the assignments generated by irsim's conversion utility.
Since the Omnilab software does not support labels longer than 5 characters,
short names were assigned to the signals.
The connections are listed in the following format:
Pin number. Full name of the signal/Short name-Analyzer pin-Stimulus
pin (if connected)
 1. C0/C0-C2                   21. INPUT7/IN7-D0-1.5
 2. C1/C1-C3                   22. INPUT6/IN6-E7-1.6
 3. C2/C2-C4                   23. INPUT5/IN5-E6-1.7
 4. C3/C3-C5                   24. INPUT4/IN4-E5-2.0
 5. Vdd                        25. GND
 6. CONTROL0/CONT0-C0-1.3      26. INPUT3/IN3-E4-2.1
 7. CONTROL1/CONT1-C1-1.4      27. INPUT2/IN2-E3-2.2
 8. OVERFLOW/OVER-C7           28. INPUT1/IN1-E2-2.3
 9. SHBITOUT/SHBIT-C6          29. INPUT0/IN0-E1-2.4
10. GND                        30. Vdd
11. ZERO/ZERO-F0               31. INST0/INST0-D1-3.0
12. OUT7/OUT7-E0               32. INST1/INST1-D2-2.7
13. OUT6/OUT6-F7               33. INST2/INST2-D3-2.6
14. OUT5/OUT5-F6               34. INST3/INST3-D4-2.5
15. Vdd                        35. GND
16. OUT4/OUT4-F5               36. RESTART/TRIG-D5-1.2
17. OUT3/OUT3-F4               37. CLKA/CLKA-D7-1.0
18. OUT2/OUT2-F3               38. not used
19. OUT1/OUT1-F2               39. not used
20. OUT0/OUT0-F1               40. CLKB/CLKB-D6-1.1
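The listing above can be cross-checked against the channel counts quoted earlier (32 monitored channels, 17 generated inputs, 6 power pins, 2 unused pins). The following is a minimal sketch in Python, not part of the original test setup: the entries are transcribed from the listing, while the regular expression and the helper name parse_listing are our own assumptions about how to read the entry format.

```python
import re

# Pin entries transcribed from the connection listing in this report.
PIN_LISTING = """
1. C0/C0-C2
2. C1/C1-C3
3. C2/C2-C4
4. C3/C3-C5
5. Vdd
6. CONTROL0/CONT0-C0-1.3
7. CONTROL1/CONT1-C1-1.4
8. OVERFLOW/OVER-C7
9. SHBITOUT/SHBIT-C6
10. GND
11. ZERO/ZERO-F0
12. OUT7/OUT7-E0
13. OUT6/OUT6-F7
14. OUT5/OUT5-F6
15. Vdd
16. OUT4/OUT4-F5
17. OUT3/OUT3-F4
18. OUT2/OUT2-F3
19. OUT1/OUT1-F2
20. OUT0/OUT0-F1
21. INPUT7/IN7-D0-1.5
22. INPUT6/IN6-E7-1.6
23. INPUT5/IN5-E6-1.7
24. INPUT4/IN4-E5-2.0
25. GND
26. INPUT3/IN3-E4-2.1
27. INPUT2/IN2-E3-2.2
28. INPUT1/IN1-E2-2.3
29. INPUT0/IN0-E1-2.4
30. Vdd
31. INST0/INST0-D1-3.0
32. INST1/INST1-D2-2.7
33. INST2/INST2-D3-2.6
34. INST3/INST3-D4-2.5
35. GND
36. RESTART/TRIG-D5-1.2
37. CLKA/CLKA-D7-1.0
38. not used
39. not used
40. CLKB/CLKB-D6-1.1
"""

# Assumed reading of the format:
# pin. FULL/SHORT-ANALYZER, optionally followed by -STIMULUS
ENTRY = re.compile(
    r"^(?P<pin>\d+)\.\s+(?P<full>\w+)/(?P<short>\w+)"
    r"-(?P<analyzer>\w+)(?:-(?P<stimulus>[\d.]+))?$"
)


def parse_listing(text):
    """Split the listing into signal entries and non-signal pins."""
    signals, other = [], {}
    for line in text.strip().splitlines():
        line = line.strip()
        match = ENTRY.match(line)
        if match:
            signals.append(match.groupdict())
        else:
            pin, label = line.split(". ", 1)
            other[int(pin)] = label
    return signals, other


signals, other = parse_listing(PIN_LISTING)
monitored = len(signals)
stimulated = sum(1 for s in signals if s["stimulus"])
power = sum(1 for label in other.values() if label in ("Vdd", "GND"))
unused = sum(1 for label in other.values() if label == "not used")
# The Omnilab software limits labels to 5 characters.
too_long = [s["short"] for s in signals if len(s["short"]) > 5]
analyzer_channels = {s["analyzer"] for s in signals}

print(monitored, stimulated, power, unused, too_long)
```

Under these assumptions the counts agree with the text, no short name exceeds 5 characters, and each analyzer channel is used exactly once.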
The stimulus pattern output by the Omnilab machine was generated from the
irsim command file. Two such files were written: one to test the shifts and the logical
operators, and the other to test the arithmetic operators. The load instructions were tested
in both files, since these instructions are essential for the operation of the ALU.
Overall, every instruction was tested at least once, with most being tested at least twice,
and some being tested three or more times. In general, the more complicated the
instruction and the more chip modules it involved, the more tests were performed on that
particular instruction. This strategy was driven by the fact that a defect is much
more likely to occur in a large and convoluted area of a chip than in a small and
simple region. In addition, a defect is more difficult to detect when the operation
is complex and the error becomes apparent in only a few cases.
Each file contains 20 instructions (20 clock cycles), two of which (first one and the
last one) can only be used to detect gross errors -- the type of defects that would render
the chip completely inoperable. The other 18 instructions can be used to test the finer
details of the chip. One cycle was used to test the NO OPERATION instruction.
Therefore, a total of 2 x 18 - 1 = 35 vectors were used to test 15 actual instructions.
The test instructions for the bitwise logical operations and shifts involved loading
certain test patterns into the registers and performing the operations. These test
patterns involved both alternating and adjacent 1's and 0's, to verify that none of the bits
were stuck at zero, stuck at one, or cross-connected (shorted). In addition, the first and
last bits of the test patterns were chosen to test the functioning of the SHBITOUT
flag during the shift operations. The four possible shift instructions were called 7 times
to verify the functionality of the shift registers. The three possible bitwise logical
operations (AND, OR, NOT) were called 5 times to check their functionality. One of the
instructions was designed to produce an all-zeros output, so that the activation of the
ZERO flag could be monitored.
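To illustrate why both alternating and adjacent patterns are useful, here is a minimal Python sketch of a fault model. It is not the procedure used on the chips: the four patterns, the wired-AND model of a short, and all function names are our own assumptions, since the report does not list the actual test values.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

# Hypothetical patterns: alternating bits and adjacent pairs, with complements.
ALTERNATING = [0b01010101, 0b10101010]
ADJACENT = [0b00110011, 0b11001100]


def stuck_at(value, bit, level):
    """Value read back when one bit is stuck at 0 or 1."""
    if level:
        return value | (1 << bit)
    return value & ~(1 << bit) & MASK


def shorted(value, bit_a, bit_b):
    """Value read back when two bits are shorted (modeled as wired-AND)."""
    both = (value >> bit_a) & (value >> bit_b) & 1
    cleared = value & ~((1 << bit_a) | (1 << bit_b)) & MASK
    return cleared | (both << bit_a) | (both << bit_b)


def detects(patterns, fault):
    """A fault is detected if at least one pattern reads back changed."""
    return any(fault(p) != p for p in patterns)


# Every single-bit stuck-at fault is caught by the alternating patterns.
stuck_ok = all(
    detects(ALTERNATING, lambda v, b=b, lvl=lvl: stuck_at(v, b, lvl))
    for b in range(WIDTH)
    for lvl in (0, 1)
)

# Shorts between neighboring bits are caught by the alternating patterns.
neighbor_ok = all(
    detects(ALTERNATING, lambda v, b=b: shorted(v, b, b + 1))
    for b in range(WIDTH - 1)
)

# Shorts between bits two positions apart are missed by the alternating
# patterns alone, but caught once the adjacent patterns are added.
skip_missed = not any(
    detects(ALTERNATING, lambda v, b=b: shorted(v, b, b + 2))
    for b in range(WIDTH - 2)
)
skip_ok = all(
    detects(ALTERNATING + ADJACENT, lambda v, b=b: shorted(v, b, b + 2))
    for b in range(WIDTH - 2)
)

print(stuck_ok, neighbor_ok, skip_missed, skip_ok)
```

In this model the alternating patterns alone cover stuck bits and neighboring shorts, while the adjacent patterns are needed to expose shorts between bits that carry the same value in an alternating pattern.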
The second irsim file, written specifically for verification of the addition/negation
unit, called 4 instructions (INCREMENT, DECREMENT, NEGATE, ADD) on 8 different
occasions. These instructions were designed to verify basic addition and negation, as well
as more complicated situations, such as an overflow resulting in a zero, so that the correct
activation and de-activation of the OVERFLOW and ZERO flags could be verified. In addition, a
sequence of instructions was designed to invoke the suspected longest path in the chip
while the conditions of the internal bus are at their worst: all of the bits on the bus must
change as a result of the instruction. This testing pattern is very useful in the evaluation
of the maximum clock rate of the ALU, while performing verification of the functionality.
Note that the generators of +1 and -1 for the INCREMENT and DECREMENT operations
were also verified during the testing of the addition/negation unit.
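The flag behavior described above can be sketched with a small reference model in Python. This is an illustration only: the report does not define the OVERFLOW flag precisely, so the model assumes signed two's complement overflow, and the operand values and function names are hypothetical rather than taken from the irsim files.

```python
MASK = 0xFF


def add8(a, b):
    """Return (result, zero, overflow) for an 8-bit addition.

    OVERFLOW is modeled as signed two's complement overflow, which is
    an assumption about the chip, not a fact stated in the report.
    """
    result = (a + b) & MASK
    zero = result == 0
    # Overflow: both operands share a sign that differs from the result's.
    overflow = bool(~(a ^ b) & (a ^ result) & 0x80)
    return result, zero, overflow


def negate8(a):
    """Two's complement negation: invert all bits, then add one."""
    return add8(~a & MASK, 1)


def increment8(a):
    return add8(a, 0x01)  # uses the +1 generator


def decrement8(a):
    return add8(a, 0xFF)  # uses the -1 generator


def bits_changed(before, after):
    """Number of bus lines that toggle between two bus values."""
    return bin((before ^ after) & MASK).count("1")


print(add8(0x80, 0x80))          # overflow that results in a zero
print(increment8(0xFF))          # wraps to zero without signed overflow
print(negate8(0x80))             # -128 has no positive counterpart
print(bits_changed(0xFF, 0x00))  # worst case: every bus line toggles
```

With these assumed operands, 0x80 + 0x80 exercises OVERFLOW and ZERO together, and a transition from 0xFF to 0x00 is one way to make all eight lines of the bus change at once.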
The four possible load operations were tested numerous times in both irsim files,
as a consequence of loading various test patterns into the registers so that
other instructions could be verified. As a result, we can be very certain of the functionality
of the internal busses and of the internal registers. A total of 16 load operations were
performed, with a wide variety of test patterns. However, we must remember that
this testing is by no means excessive: the internal busses and registers are the most
fundamental blocks of the ALU, and a small defect in any of those components can
produce a faulty output on every operation, unlike a defect in a single operation block,
which affects only that operation.
The tests described above were performed at various clock rates, up to and
including 34 MHz, with both 4-bit and 2-bit clocks. Higher clock frequencies could not be
achieved because of the limitations of the Omnilab logic analyzer. The results of the tests
showed that all five chips function flawlessly with the 4-bit clock at 34 MHz. The
comparison of the collected data with the irsim simulations shows perfect agreement.
The maximum clock frequency predicted by the SPICE models was 11 MHz, which is
close to the measured maximum frequency of (at least) 34 MHz / 4 = 8.5 MHz. In
addition to the maximum clock frequency, the delay between the rising edge of the clock and
the change in output was measured. To perform that measurement, the clock rate was
set to a low value, so that the next change in output did not come too soon. The
sampling rate on the logic analyzer was set to the maximum value of 34 MHz. However,
the delay between the rising edge of the clock and the change in output was less than
one sample, so we know only that the delay is 30 nanoseconds or less.
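The 30 nanosecond bound follows from the sample period at 34 MHz. A minimal Python sketch of that arithmetic is given here; the function names are ours, and the bounds assume that each event is recorded at the first sample taken after it occurs.

```python
SAMPLE_RATE_HZ = 34e6  # maximum Omnilab sampling rate quoted in this report


def sample_period_ns(sample_rate_hz=SAMPLE_RATE_HZ):
    """Time between two consecutive samples, in nanoseconds."""
    return 1e9 / sample_rate_hz


def delay_bounds_ns(samples_apart, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bounds on a delay whose two events are seen samples_apart samples apart.

    Each event is only known to within one sample period, so the true
    delay lies between (n - 1) and (n + 1) periods, clipped at zero.
    """
    period = sample_period_ns(sample_rate_hz)
    lower = max(0, samples_apart - 1) * period
    upper = (samples_apart + 1) * period
    return lower, upper


# Clock edge and output change seen in the same sample: delay under one period.
low, high = delay_bounds_ns(0)
print(f"{low:.1f} ns to {high:.1f} ns")
```

One sample period at 34 MHz is about 29.4 ns, which is where the rounded figure of 30 nanoseconds comes from.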
High speed testing. The ALU chip was designed to operate on two clocks,
shifted out of phase by 180 degrees, each having a 25% duty cycle, so that there is a
certain amount of time between the cycles when neither clock is high. The Omnilab
instrument can generate this type of clock signal at a maximum frequency of
34 MHz / 4 = 8.5 MHz. If the duty cycle is increased to 50%, so that there is no time at
which neither clock is high, the Omnilab can generate frequencies of up to
34 MHz / 2 = 17 MHz, because only 2 bits are needed to represent the clock (instead of 4).
This type of clocking is somewhat dangerous, because there is a high possibility of
fighting on the internal busses; however, if the clock is not very fast, the fighting
stops and the values on the busses stabilize.
The testing of the ALUs using this type of clocking showed that all of the chips worked
perfectly at 17/2 MHz, but at 34/2 MHz a single test pattern produced an error: the one
that involved the suspected longest path. Since all five chips failed that test in
exactly the same manner (perfect operation except at the point where the longest path is
involved), the explanation lies in the design of the chip, not in the manufacturing process.
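The relation between the number of bits per clock period and the achievable clock frequency can be sketched in Python as follows. The bit patterns are our assumption of how a 25% and a 50% duty-cycle two-phase clock would be written as stimulus samples; the report itself only gives the pattern lengths (4 and 2 bits).

```python
SAMPLE_RATE_HZ = 34e6  # maximum Omnilab rate quoted in this report

# Assumed stimulus patterns, one character per sample, repeated indefinitely.
FOUR_BIT = {"CLKA": "1000", "CLKB": "0010"}  # 25% duty cycle, with gaps
TWO_BIT = {"CLKA": "10", "CLKB": "01"}       # 50% duty cycle, no gaps


def clock_frequency_hz(pattern, sample_rate_hz=SAMPLE_RATE_HZ):
    """One clock period takes one sample per bit of the pattern."""
    return sample_rate_hz / len(pattern)


def duty_cycle(pattern):
    """Fraction of the period during which the clock is high."""
    return pattern.count("1") / len(pattern)


def dead_samples(clocks):
    """Samples per period in which neither clock is high."""
    pairs = zip(clocks["CLKA"], clocks["CLKB"])
    return sum(1 for a, b in pairs if a == "0" and b == "0")


for name, clocks in (("4-bit", FOUR_BIT), ("2-bit", TWO_BIT)):
    freq_mhz = clock_frequency_hz(clocks["CLKA"]) / 1e6
    print(name, freq_mhz, duty_cycle(clocks["CLKA"]), dead_samples(clocks))
```

Under these assumptions the 4-bit clock tops out at 8.5 MHz with two idle samples per period, while the 2-bit clock reaches 17 MHz but leaves no time in which both clocks are low.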
A more thorough analysis of the error shows that only two outputs are incorrect:
the lowest bit of the output is 1 instead of 0, and the ZERO flag is not activated because
of that lowest bit. Since the ALU chip has an Internal Testing Module, which allows us to
monitor the internal output bus directly, we can see the underlying details of what is
going on. The internal output bus is used to deliver information from the operator
modules to the output register. Only one operator module is allowed to write to the bus
at a time, and a set of transmission gates separates each operator module from the bus.
Normally, only one set of transmission gates is activated, so that there is no fighting
on the bus. With the "squashed" clocks, fighting is a lot more likely to occur, and the
settling time becomes longer. In addition, the set of transmission gates that connects
the operator modules to the output bus has a structure that is inherently slower on the
left, or low-bit, side, with the lowest bit being the slowest (see
picture).
Structure of a set of transmission gates.
By setting CONTROL1 and CONTROL0 to zero, we can monitor the lowest 4 bits of
the internal output bus, including the lowest bit, which is the cause of the problem.
34/2 MHz clock: Test File 2 erroneous results, monitored using Internal Testing Module
The data collected from the internal bus shows that the lowest bit stays high for half a
cycle too long. The erroneous result is then stored in the register, which is why the
lowest bit stays high for another whole cycle.
Results: 6.8/4 MHz Clock
Test File 1 correct results
Test File 2 correct results
Results: 34/4 MHz Clock
Test File 1 correct results
Test File 2 correct results
Results: 34/2 MHz Clock
Test File 1 correct results
Test File 2 erroneous results
From the data acquired during the testing of the ALU chip, we can conclude that,
overall, the project was a success. With the additional information learned from the
Internal Testing Module, we now know what slows the chip down, and, if we had an
opportunity, we would be able to redesign the set of transmission gates that control the
output of the addition/negation unit so that a higher maximum clock speed could be
achieved.
Note: Thank you to Dr. Cavallaro, who gave us the idea of testing the chip using a 2-bit
clock instead of a 4-bit clock. His suggestion led us to see possible improvements to
the design of the chip.