Reducing ATE Test Time by Voltage and Frequency Scaling
by
Praveen Venkataramani
A dissertation submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
Auburn, Alabama
May 4, 2014
Keywords: ATE, external testing, power constrained test, test programming, test time
reduction, VLSI testing
Copyright 2014 by Praveen Venkataramani
Approved by
Vishwani D. Agrawal, James J. Danaher Professor of Electrical and Computer Engineering
Fa Foster Dai, Professor, Electrical and Computer Engineering, Associate Director,
AMSTC
Adit D. Singh, James B. Davis Professor of Electrical and Computer Engineering
Abstract
During wafer sort, the fabricated chips are subjected to tests that verify whether they meet
the design specification. Test application time plays a critical role when a large
volume of dice must be verified in a given period of time. These tests are carried out on automatic test
equipment (ATE). The time spent on the ATE directly affects the final cost of the device.
Hence it is paramount to reduce test application time such that the device can be verified
reliably while keeping the test time to a minimum. Power dissipation must also be
considered when reducing test time: it is often traded off against the test frequency
and becomes a major limiting factor.
One of the major approaches to test time reduction during circuit design is to implement
multiple scan chains. This approach reduces test time drastically when compared to the same
device implemented using a single scan chain. Other approaches involve manipulating test
hardware and test patterns to reduce test time and testing many dice in parallel.
The objective of this thesis is to find an optimum solution to this trade-off and to examine the
feasibility of such approaches, which can lead to new test methods in hardware and software.
The problem is approached in two ways: (i) by scaling the supply voltage, and (ii) by scaling
the test frequency. Additionally, the two methods can be combined to reduce test time
further. These methods can be used in tandem with existing methods to provide additional
gain in test time reduction.
The proposed methodologies are verified by simulation and through experiments. The
experiments were carried out on the Advantest T2000GS ATE located at Auburn University,
Alabama. The simulations were performed using ISCAS'89 benchmark circuits and results
show up to 50% reduction in test time.
Acknowledgments
First and foremost I would like to thank Prof. Vishwani Agrawal for his invaluable
guidance throughout my work. Our scheduled meetings helped me coarse-tune my approaches
in research and also extend them to my personal life. His enthusiasm in research will always
inspire me. I would also like to thank Prof. Adit Singh and Prof. Foster Dai for being my
committee members. Prof. Adit Singh's VLSI Testing and VLSI Design classes, and Prof.
Foster Dai's class on analog circuits, helped me understand the basics of testing methods and
transistor behavior, which served as the foundation of my work. I would like to thank Prof.
Victor Nelson whose course on Computer Aided Design helped me learn the tools used for
experiments in this work. I would like to thank Prof. Stuart Wentworth for his class on RF
and Microwave Devices, and Prof. Stanley Reeves for his class on Digital signal processing,
both of which gave me an opportunity to gain knowledge in the areas I was least exposed
to. I would like to thank Prof. Sanjeev Baskiyar for agreeing to be my external reader.
I would like to give special thanks to Prof. Prathima Agrawal and Ms. Shelia
Collins for managing my travel to several conferences and workshops, and to Ms. Jo Ann Loden
for helping me with all the registrations and paperwork when I was delayed in India during
my visa extension.
I would like to thank all my friends, especially Gisel, Madhukar, Mahalingam and
Senthuran for their emotional and financial support during hard times, and Ravi Tej, Ravi Kanth,
Suraj, Swathi and Sindhu for helping me debug issues and accompanying me during off hours in the
Broun 310 lab.
Finally I would like to thank my parents and my sister for their immeasurable support
throughout my studies, for which I am ever grateful, and dedicate this work and efforts to
them.
Table of Contents
Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
1.1 VLSI Testing
1.1.1 Levels of Testing
1.1.2 Fault Models
1.2 Designs for Test
1.2.1 Scan-Based Tests
1.2.2 Built-In Self Test
1.2.3 Compressor-Decompressor
1.2.4 SerDes
1.2.5 Analog Bus
1.3 Prior Work
2 VLSI Test Equipment and Procedure
2.1 Advantest T2000GS ATE
2.1.1 Test Programming
2.1.2 Test Data Analysis
2.2 Time and Cost Relationship
3 Test Time Theorem and Applications
3.1 Test Time Theorem
3.2 Applications of Test Time Theorem
3.2.1 Periodic Clock Test
3.2.2 Aperiodic Clock Test
4 Scaling Supply Voltage to Reduce Periodic Clock Test Time
4.1 Low Voltage Tests
4.2 Reduced Supply Voltage Test
4.3 Optimum Supply Voltage
4.3.1 SPICE Experiment
4.3.2 Polynomial Equation to Obtain Minimum Supply Voltage
4.3.3 Solving for VDDopt, fopt and TTopt
4.4 Results
4.5 Peak Power and Critical Path Frequency Measurements
4.5.1 Hardware Setup
4.5.2 Peak Power and Frequency Measurements
4.5.3 Minimizing Test Time for Given Peak Power Limit
5 Dynamic Scaling of Test Clock Period
5.1 Aperiodic Clock Test
5.1.1 A Circuit Example
5.1.2 Simulation Results
5.1.3 Test Programming on ATE at Nominal Voltage
5.2 Optimum Voltage for Aperiodic Clock Test
5.2.1 Simulation Results
6 Discussion
6.1 Adapting to At-Speed Testing
6.2 Adapting to Process-Voltage-Temperature Variations
7 Conclusion
Bibliography
List of Figures
1.1 Illustration of a sequential circuit with 4 flip-flops.
1.2 Illustration of a sequential circuit with 4 flip-flops connected into a serial shift
register.
1.3 Illustration of a compressor-decompressor logic connected to multiple scan chains.
1.4 Illustration of a decompressor logic built using multiplexor [8].
1.5 Illustration of a compressor built using XOR logic [8].
2.1 Advantest T2000 ATE at Auburn University, Alabama.
2.2 Mainframe of Advantest T2000GS at Auburn University.
2.3 Test head of the Advantest T2000GS with an FPGA on the loadboard.
3.1 Minimum test time as a function of supply voltage (VDD) for N-cycle periodic
clock test. For a minimum test time TTsync supply voltage is Vsync which is lower
than the nominal voltage Vnom.
3.2 Illustration of test power and test energy for every test cycle using periodic clock.
The test clock period is determined by the cycle dissipating the maximum power.
3.3 Illustration of test power and test energy for every test cycle using aperiodic clock.
The test clock period for every cycle is determined by the power dissipated during
that cycle.
3.4 Minimum test time as a function of supply voltage (VDD) for N-cycle aperiodic
clock test. For a minimum test time TTsync supply voltage is Vsync which is lower
than the nominal voltage Vnom.
4.1 Comparison of the test time measured using SPICE simulations with the delay
calculated using the α-power law using s298 ISCAS'89 benchmark circuit. The test
clock period is chosen as the functional period assuming that the test is not power
constrained.
4.2 Simulation and experimental test time plots to find the optimum voltage for s298
benchmark circuit.
4.3 Simulated and calculated curves using test period and functional period at various
voltages. The direct approach using MATLAB (circled) matches the cross point of
the curves obtained analytically using the periods calculated from equations (4.3)
and (4.4) and the results obtained from SPICE ("plus" data points) in [66].
4.4 Test setup for measuring peak power per cycle and maximum test frequency for
an Altera DE2 FPGA board (with all its peripherals) using the NI ELVIS II+
bench-top prototyping board.
4.5 Measured values of maximum power consumed per cycle (in blue) and maximum
test frequency (in green) plotted as a function of the supply voltage for the
Altera DE2 FPGA board tested using NI ELVIS II+ bench-top prototyping board.
Switching power is dominated by the CMOS circuitry contained on the board.
The FPGA itself is programmed with the function of s298 benchmark with scan.
5.1 Periodic and aperiodic clock simulation of 450-cycle scan test of ISCAS'89
benchmark circuit s298. Periodic test clock frequency is 240 MHz and test time is 1.87 µs.
Aperiodic clock test time is 1.31 µs.
5.2 Aperiodic clock for 540-cycle scan test of s298 for a power budget of 1.23 mW.
Horizontal broken lines indicate four test clock periods available from the T2000GS
ATE. Period used for a test cycle was the nearest higher ATE clock period.
5.3 Periodic clock: ATE result for 540-cycle scan test of s298 benchmark circuit.
Waveform shows 33 test cycles (cycles 13 through 46) of 500 ns clock. Signals
shown are scan-out, scan-in, scan enable, three primary outputs and clock. Green
triangles under scan-out waveform are matching strobes.
5.4 Aperiodic clock test: ATE result for 540-cycle scan test of s298 benchmark circuit.
Waveform shows 58 test cycles (cycles 13 through 71) taking the same time as
taken by 46 cycles of periodic clock test in Figure 5.3. Clock periods used were
200, 300, 410 and 500 ns as shown in Figure 5.2. Signals shown are scan-out,
scan-in, scan enable, three primary outputs and clock. Green triangles under
scan-out waveform are matching strobes.
5.5 Aperiodic clock test time as a function of supply voltage showing the minimum
test time voltage, Vasync.
5.6 Minimum periodic and aperiodic clock test times for s298 circuit after selecting
suitable supply voltages.
List of Tables
2.1 State of art IMA Tester Cost Analysis Data [62]
4.1 Parameter values for s298 benchmark synthesized in 180 nm CMOS technology
(VDD = 1.8 V, VTH = 0.39 V, Critical Delay = 0.77 ns).
4.2 Optimum VDD for reduced test time of ISCAS'89 benchmark circuits.
4.3 Analytically obtained VDDopt and fopt for minimum scan test time of ISCAS'89
circuits in 180 nm CMOS (α = 2, VTH = 0.39 V).
5.1 Scan test time for ISCAS'89 circuits in TSMC 180 nm technology.
5.2 Optimum voltage VDDopt for minimum aperiodic clock scan test time of ISCAS'89
circuits in 180 nm CMOS (α = 2, VTH = 0.39 V).
5.3 Test times for various methods normalized with respect to that of the conventional
method (nominal 1.8 V supply and periodic clock).
Chapter 1
Introduction
1.1 VLSI Testing
An abstract form of testing is to observe the response of a device to known inputs
under desired environmental conditions. For instance, consider a test to see how well a
microwave oven works; here the oven is the device under test (DUT). To
test the operation of the oven, one might try to cook a dish in the microwave oven
for a chosen duration, where the dish is the test input and the time is the
duration of the test. If the food is cooked well, then the microwave oven operates as desired
and thus passes the test. If the food is not cooked well, then either the microwave oven is faulty
(if the food or the container is not hot), or it needs more time (if the food is warm but
seems undercooked), i.e., the test input is insufficient, or the type of food cannot be cooked under
the current conditions (if a bowl of rice is cooked without adding water), i.e., there is an error in the
process. Similarly, in very large scale integrated circuit testing (or simply VLSI testing),
once the circuit is designed, it is tested for correctness of operation. If the circuit fails
the test, it may be due to a fault in the design. Alternatively, the test may have been wrong to
begin with, or, if the design is fabricated, there may have been an error in the process, or the design
could have been wrong, or the conditions under which it was tested may have been wrong. The role
of testing is to verify that the design is free of manufacturing defects, and the role of diagnosis
is to identify the source of the failure [15].
1.1.1 Levels of Testing
The testing and diagnosis of the device can be classified into four types based on the
purpose of each test [15]. The order in which they are performed is called the test flow
and is normally considered a standard procedure to make sure that the design performs
as intended and to capture any anomalies early.
Simulation
This is the first stage of testing, where the design netlist is verified using computer
programs called simulators. The netlist is a file that contains the structure of the design
written in languages such as SPICE or VHDL. The simulators verify the correctness of these
netlists by applying inputs and observing the outputs.
Characterization Test
The characterization test is done during the initial part of the design after fabrication,
called the "silicon bring-up", before it is sent to production. In this phase the company
designing the device obtains a sample of chips from a foundry that fabricates them. During
this step the design is verified and debugged for correctness of operation, and the
initial sample is checked against all the specifications of the manufacturer. Functional tests are run and
elaborate AC and DC measurements are performed. The thoroughness of the tests during
this phase may involve probing internal nodes and the use of specialized electron-beam
tools to observe the activity within the device. This step helps in identifying the correct
operating limits for the device. These are obtained by performing tests under various voltage
and frequency ranges and plotting the results as a Shmoo plot, which provides a graphical
display of the pass/fail behavior of the sample over the operating range [15].
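As a rough illustration of a Shmoo plot, the sketch below sweeps supply voltage and clock frequency and prints a pass/fail grid. The device model in `passes` and all of its constants (`vth`, `alpha`, `k`) are hypothetical values chosen only for illustration; on a real tester each cell would come from running the test program at that operating point.

```python
def passes(vdd, freq_mhz, vth=0.39, alpha=2.0, k=150.0):
    """Hypothetical pass/fail model: the device passes when the clock
    frequency does not exceed a voltage-dependent maximum frequency."""
    if vdd <= vth:
        return False
    f_max = k * (vdd - vth) ** alpha / vdd  # MHz, illustrative only
    return freq_mhz <= f_max


def shmoo(voltages, freqs):
    """One text row per frequency (highest first); '*' = pass, '.' = fail."""
    rows = []
    for freq in sorted(freqs, reverse=True):
        cells = "".join("*" if passes(v, freq) else "." for v in voltages)
        rows.append(f"{freq:5d} MHz |{cells}")
    return rows


voltages = [1.0 + 0.1 * i for i in range(11)]  # 1.0 V to 2.0 V
freqs = list(range(50, 251, 50))               # 50 MHz to 250 MHz
for row in shmoo(voltages, freqs):
    print(row)
```

The boundary between the passing and failing regions of the grid marks the operating limits that characterization is meant to identify.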
Production Test
Once the chips pass the characterization test, they are sent to be produced in mass
volumes. In the production test phase, the tests are less extensive but they still have to
meet the manufacturer's specifications. During this phase the test time, and hence the test
cost, plays a paramount role. To minimize the test time, a high coverage of faults is targeted
with the minimum number of vectors possible. Since the chip is already designed and fabricated in mass
volume, the diagnosis of a failing chip is not performed, and only the pass/fail decision is
made [15]. Binning of the dice based on the failing specifications is also done in this stage
to maximize yield.
Burn-In Test
The next phase in the test flow is the burn-in test phase. The main idea of this test is to
accelerate the aging of the device. Due to various process variations the produced chips may
not be identical. Though the process is controlled, some of these variations are unavoidable.
It is found that once the devices are produced some fail early while others do not. Burn-in
helps to accelerate the life of the chip by putting the device under test (DUT) at very high
temperatures. During this test, the production tests are performed at very high temperatures
and voltages. The test targets two types of failures, namely, infant mortality failures and
freak failures. In infant mortality failure, the DUT fails very early due to weak resistive lines
that burn out easily in a slightly accelerated environment. In freak failure, the DUT works
properly as a good chip during normal conditions but fails after a very long time. Such
devices are identified by putting the DUT through long hours of burn-in [15].
Incoming Inspection
Once the chips pass the burn-in tests they are sold to various system manufacturers
who integrate several components together on a board into one system. During that process,
each component is tested to check correctness of operation, and the thoroughness of the test
can vary based on the system designed. The main idea during this phase is to minimize the
effort of replacing an individual defective component after it is integrated into the system [15].
For instance, a faulty graphics card that was integrated into a laptop and shipped to customers
ended in a recall, and a significant loss was incurred by the device manufacturer and the component
manufacturer [6].
At each step of the tests described above, the design is verified for certain mismatches. These
mismatches can be termed as defects, errors, or faults. A defect is defined as the unintended
difference in the hardware structure from the actual design, an error is a mismatch in the
output signal caused by the defect present in the design, and a fault is the abstract form of
the defect causing that error [15]. For instance, a device could have a process defect such
as a weak interconnect between two logic gates, which may produce an error in the output
signal, and the test engineer will conclude that some fault in the design caused this erroneous
output.
1.1.2 Fault Models
There are two basic types of testing, called functional and structural testing. In
functional testing, the circuit is tested for correct functional operation by giving functional vectors
and verifying if the circuit works. In a structural test, the circuit is verified for any structural
anomalies due to fabrication process errors. The structural tests are performed at every level
of test described earlier and the test patterns may not have a functionally meaningful output.
The structural test is performed based on certain fault models which describe the types of
faults targeted. These fault models help to develop algorithms that enable the structural
test. Some fault models generally used are given below.
Gate Level Stuck-at Fault Model
In a gate level stuck-at fault model, an input or an output signal line is considered to
be stuck at a value 0 or 1 due to the defect in the design. In these tests the signal line is
driven with a value opposite to the value being tested. For instance, to test an input line for
a value stuck at 0, an input of 1 is applied on that line and the output is observed. If the line
is not stuck at 0 then the output will be the expected value for the input value of 1 [15,29].
The work in this thesis mainly uses the stuck-at fault model for all experiments since it is
the simplest fault model.
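The stuck-at test described above can be sketched with a tiny fault simulator. The three-input circuit y = (a AND b) OR c and the line names are hypothetical, chosen only to keep the example small; a vector detects a fault when the fault-free and faulty outputs differ.

```python
from itertools import product


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c.
    `fault` is an optional (line_name, stuck_value) pair to inject."""
    def line(name, value):
        # Force the stuck value if this line carries the injected fault.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a, b, c = line("a", a), line("b", b), line("c", c)
    n1 = line("n1", a & b)
    return line("y", n1 | c)


def detecting_vectors(fault):
    """All input vectors whose output differs between good and faulty circuit."""
    return [v for v in product((0, 1), repeat=3)
            if simulate(*v) != simulate(*v, fault=fault)]


print(detecting_vectors(("a", 0)))  # [(1, 1, 0)]
print(detecting_vectors(("a", 1)))  # [(0, 1, 0)]
```

To detect line a stuck at 0, the vector drives a to 1 (the opposite value) and sets b = 1, c = 0 so that the effect reaches the output.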
Transition Delay Fault Model
Delay fault models check if the DUT meets the timing specifications. Resistive opens
or shorts on interconnects can cause the signal to transition after a delay. In a transition
delay fault model, the delay of a transition is measured by forcing a transition on the desired
pin and observing that transition at the output. The transition fault model is similar to
the stuck-at fault model in the sense that a stuck-at fault takes an infinite amount of
time to transition. The difference between the two fault models is that, unlike a stuck-at
fault, a transition delay fault is detected by a vector pair in which the first vector
initializes the pins to a value while the second vector causes the transition. In the transition
fault model the transition is observed along any path the signal takes, irrespective of whether
the path is a short or long path, and the test passes if the transition is observed at the output
within the specified timing threshold [15,29].
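A minimal sketch of the vector-pair idea follows. It assumes a simplified single-line timing model in which the output is sampled one clock period after the second vector is applied; the function name and the delay values are hypothetical.

```python
def captured_value(v1, v2, path_delay_ns, extra_delay_ns, period_ns):
    """Apply the vector pair (v1, v2) to one line and return the value
    sampled at the output one clock period after v2 is launched."""
    if v1 == v2:
        raise ValueError("the vector pair must launch a transition")
    arrival_ns = path_delay_ns + extra_delay_ns
    # A late transition means the old value is still seen at capture time.
    return v2 if arrival_ns <= period_ns else v1


# Slow-to-rise test: initialize to 0, then launch a rising transition.
print(captured_value(0, 1, 2.0, 0.0, 4.0))           # 1, fault-free
print(captured_value(0, 1, 2.0, 3.0, 4.0))           # 0, delay fault detected
print(captured_value(0, 1, 2.0, float("inf"), 4.0))  # 0, behaves like stuck-at-0
```

The last call illustrates the remark that a stuck-at fault can be viewed as a transition that takes infinitely long.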
Path Delay Fault Model
Path delay models are also called lumped delay fault models. In this fault model, the
delays of the individual gates along a path are summed, and their cumulative effect may delay
a signal traveling through that path. In contrast to the transition fault model, the test engineer has
the flexibility of choosing the path the signal should take. This could ensure that every path
in the design meets the timing constraints [15,29].
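The cumulative nature of a path delay fault can be sketched as follows. The gate names, delays, and clock period are hypothetical numbers: each gate is slowed by only a small amount, yet the summed delay of the chosen path exceeds the clock period.

```python
def path_delay(gate_delays_ns, path):
    """Sum the delays of the gates along the selected path."""
    return sum(gate_delays_ns[gate] for gate in path)


def meets_timing(delay_ns, period_ns):
    return delay_ns <= period_ns


nominal = {"g1": 0.20, "g2": 0.25, "g3": 0.30}      # ns, illustrative
slow = {g: d + 0.05 for g, d in nominal.items()}    # small per-gate slowdown
path = ["g1", "g2", "g3"]                           # path chosen by the test engineer
period_ns = 0.80

print(meets_timing(path_delay(nominal, path), period_ns))  # True
print(meets_timing(path_delay(slow, path), period_ns))     # False
```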
Bridging Fault Model
A bridging fault represents a short between two or more signal lines. The logic value
on the signal can be modeled as a 1-dominant (OR bridge), where a signal value of '1' on
one line forces the signal on the neighboring line to be '1', or a 0-dominant (AND bridge),
where a signal value of '0' on one line forces the signal on the other line to be '0' [15,29].
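The two bridging models can be written as a small function over a pair of shorted lines. This is only a logic-level sketch; the function name and the `kind` argument are assumptions made for illustration.

```python
def bridge(a, b, kind):
    """Return the values seen on two shorted lines driven with a and b.
    kind="OR"  models a 1-dominant bridge,
    kind="AND" models a 0-dominant bridge."""
    if kind == "OR":
        value = a | b
    elif kind == "AND":
        value = a & b
    else:
        raise ValueError("kind must be 'OR' or 'AND'")
    return value, value


print(bridge(1, 0, "OR"))   # (1, 1): the '1' dominates
print(bridge(1, 0, "AND"))  # (0, 0): the '0' dominates
```

In either model the fault is only visible when the two lines are driven to opposite values, which is what a test for the bridge has to arrange.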
Transistor Level Stuck-at Fault Model
In a transistor level stuck-at fault model the DUT is verified for stuck-open or
stuck-short faults in the transistors. A stuck-open transistor fault would cause the gate to behave
as a dynamic level-sensitive latch, while a stuck-short fault would produce a direct path
between the power supply line VDD and the ground line. These faults are not accurately
modeled in the gate stuck-at fault model due to the complementary structure of nMOS and
pMOS transistors in complementary metal oxide semiconductor (CMOS) circuits. To
detect stuck-short faults, the steady-state supply current is monitored; this test is termed the IDDQ
test. To detect a stuck-open fault two vectors are applied. The first vector sensitizes the
line to the opposite value, while the second vector propagates the value to an observing
point [15,29].
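The latch-like behavior of a stuck-open fault, and the need for two vectors, can be sketched with a switch-level model of a two-input CMOS NAND gate. The choice of gate and of the faulty transistor (the pMOS driven by input a) is an assumption made for illustration.

```python
def nand_cmos(a, b, prev_out, p_a_open=False):
    """Switch-level 2-input CMOS NAND.
    p_a_open=True models a stuck-open pMOS transistor on input a."""
    pull_up = (a == 0 and not p_a_open) or (b == 0)  # parallel pMOS network
    pull_down = (a == 1 and b == 1)                  # series nMOS network
    if pull_up:
        return 1
    if pull_down:
        return 0
    return prev_out  # output floats and retains its previous value


def apply_vectors(vectors, p_a_open=False, initial=0):
    out, outputs = initial, []
    for a, b in vectors:
        out = nand_cmos(a, b, out, p_a_open)
        outputs.append(out)
    return outputs


test = [(1, 1), (0, 1)]  # first vector sets output to 0, second expects 1
print(apply_vectors(test))                 # [0, 1] fault-free
print(apply_vectors(test, p_a_open=True))  # [0, 0] stuck-open detected
```

Applying only the second vector would not be a reliable test, because the floating output could happen to hold a 1 from an earlier state.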
1.2 Designs for Test
A circuit under test can be combinational or sequential. A combinational circuit
has no registers, such as D-type flip-flops (DFFs), to hold a previous value and hence
is a simpler design. Due to the absence of state-sensitive registers, testing a combinational
circuit is straightforward and the automatic test pattern generator (ATPG) will generate test
patterns that will sensitize and propagate the fault to the output. In contrast, sequential
circuits have registers that hold previous values until there is a change in the values and
update the output on the next clock pulse, as shown in Figure 1.1, where PI and PO are
the primary inputs and primary outputs. The combinational logic consists of logic gates and
the DFFs are D-type flip-flops.
1.2.1 Scan-Based Tests
Since registers are state sensitive, the correct output depends on their current state,
which is determined by past values. An ATPG will have to create time frames with the
preferred past states to generate sequential ATPG vectors, which can be cumbersome.
Figure 1.1: Illustration of a sequential circuit with 4 flip-flops.
For this reason a sequential circuit is generally converted into a combinational circuit by
connecting the registers serially into a serial shift register. The user can then set the register
in the preferred state by shifting in the input values, and observe the response by shifting
out the values. This type of test methodology is known as scan-based test; it helps to
convert a complex sequential circuit into a more manageable combinational circuit. Figure 1.2
illustrates this method, where the registers are connected serially using a multiplexer with
one input coming from the combinational logic and the other input coming from the scan
input (SI) or from the previous register. The test vectors are serially shifted (scanned) in
through the scan input (SI) pin and serially shifted (scanned) out through the scan output
(SO) pin. The DUT operates in the normal mode when the scan-enable (SE) pin is '0' and is
switched to test mode by driving the SE pin high [29]. The work presented in this thesis uses
scan-based methodology to synthesize the register transfer level (RTL) benchmark circuits.
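The scan-in, capture, and scan-out sequence can be sketched as below for a chain of four flip-flops. The combinational logic used here (inverting every state bit) is a hypothetical stand-in, and the functions model only logic values, not timing.

```python
def shift(chain, scan_in_bits):
    """SE = 1: shift bits in through SI; return new state and bits seen at SO."""
    chain = list(chain)
    scan_out = []
    for bit in scan_in_bits:
        scan_out.append(chain[-1])   # the last flip-flop drives SO
        chain = [bit] + chain[:-1]   # SI feeds the first flip-flop
    return chain, scan_out


def capture(chain, logic):
    """SE = 0 for one clock: flip-flops load the combinational logic outputs."""
    return list(logic(chain))


def scan_test(pattern, logic):
    n_ff = len(pattern)
    chain, _ = shift([0] * n_ff, pattern)      # scan in the test pattern
    chain = capture(chain, logic)              # one functional clock
    _, response = shift(chain, [0] * n_ff)     # scan out the response
    return response


def invert_all(state):
    """Hypothetical combinational logic: invert each state bit."""
    return [1 - bit for bit in state]


print(scan_test([1, 0, 1, 1], invert_all))  # [0, 1, 0, 0]
```

Each pattern costs n shift clocks in, one capture clock, and n shift clocks out for a chain of n flip-flops, which is why chain length dominates scan test time.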
1.2.2 Built-In Self Test
Though scan-based test is widely used, one of the disadvantages of scan-based test
methods is that, for a given fault coverage, the volume of test patterns generated by the
ATPG can be very large for industrial circuits. Since automatic test equipment (ATE)
Figure 1.2: Illustration of a sequential circuit with 4 flip-flops connected into a serial shift
register.
has limited storage size, very large volumes of patterns can incur serious overhead. The
built-in self test (BIST) helps to mitigate this problem by incorporating a pattern generator
along with the device under test (DUT), with little area overhead. It comprises a test
pattern generator (TPG) built using a linear feedback shift register (LFSR), and an output
response analyzer (ORA) built using a multiple input signature register (MISR) [60]. The
ORA compacts the output responses from the DUT to form a "signature" which is compared
with the "signature" from a known good circuit. The patterns are generated by providing
an initial seed to the LFSR. Though the patterns generated by the LFSR may not be as
random as in a scan-based test, they can be changed by varying the seed supplied to the
LFSR. Random patterns help to capture hard-to-detect faults by "chance", thereby providing
high fault coverage in a short time [29].
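A minimal sketch of the TPG and ORA follows, assuming a 4-bit Fibonacci LFSR with feedback polynomial x^4 + x^3 + 1 and a MISR that reuses the same feedback. The register width, the polynomial, and the response values are illustrative assumptions, not the structures of any particular BIST implementation.

```python
def lfsr_step(state):
    """Advance a 4-bit LFSR (taps on bits 3 and 2, i.e. x^4 + x^3 + 1)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def lfsr_patterns(seed, count):
    """Generate `count` pseudo-random 4-bit patterns starting from `seed`."""
    patterns, state = [], seed
    for _ in range(count):
        patterns.append(state)
        state = lfsr_step(state)
    return patterns


def misr_signature(responses):
    """Compact a sequence of 4-bit responses into one 4-bit signature."""
    signature = 0
    for response in responses:
        signature = lfsr_step(signature) ^ (response & 0xF)
    return signature


patterns = lfsr_patterns(seed=0b0001, count=15)
print(len(set(patterns)))  # 15: every nonzero 4-bit state appears once

good = [0x3, 0xA, 0x7, 0x1]          # hypothetical fault-free responses
bad = [0x3, 0xA ^ 0x1, 0x7, 0x1]     # one response bit flipped
print(misr_signature(good) != misr_signature(bad))  # True
```

Changing `seed` changes where in the 15-state sequence the patterns start; the all-zero seed must be avoided because the LFSR would stay at zero.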
1.2.3 Compressor-Decompressor
In the current VLSI trend, designs have hundreds of thousands of sequential elements.
When these elements are connected together to form a single shift register, also known as
Figure 1.3: Illustration of a compressor-decompressor logic connected to multiple scan chains.
a single scan chain, the number of cycles needed to shift the input values through the scan
chain could be large. To minimize this, the single scan chain is normally broken down into
multiple smaller scan chains. Though the time to shift the values through the scan chains can
be significantly reduced, the number of scan-in pins is increased. Since an ATE has a limited
number of pins available, in order to drive the large number of pins on the chip, designs
include a hardware circuit pair called compressor-decompressor, or in short codec. The job
of the decompressor is to expand and broadcast the input values from the ATE or LFSR
to multiple scan inputs within the circuit. The job of the compressor is to get the values
from multiple scan out pins and shorten the length of the pattern before sending it to the
ATE for storing or analysis. Figure 1.3 illustrates the compressor-decompressor architecture
of the Synopsys DFTMAX adaptive scan technology [8]. The decompressor is built using
a multiplexer which can be used to switch between a 1:1 mode and a broadcast mode, as shown in
Figure 1.4, while the compressor is built using an exclusive OR (XOR) tree as shown in
Figure 1.5. As the number of input pins for the decompressor increases the test circuit
will have more controllability and hence high test coverage can be obtained with fewer
Figure 1.4: Illustration of a decompressor logic built using multiplexor [8].
Figure 1.5: Illustration of a compressor built using XOR logic [8].
test patterns. As the number of compressor outputs increases, the length of the fault signature
increases, thus providing better resolution to the signature, thereby reducing undesirable
aliasing, where a faulty signature resembles the good signature [8,29].
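Two points from this subsection can be sketched numerically: the drop in shift cycles when one chain is split into several, and the way more compressor outputs reduce aliasing. The XOR grouping used in `xor_compress` (chain i feeds output i modulo the number of outputs) is a simplifying assumption and is not the actual DFTMAX compressor structure.

```python
from math import ceil


def shift_cycles(n_flops, n_chains):
    """Cycles needed to load one pattern when flip-flops are split evenly."""
    return ceil(n_flops / n_chains)


def xor_compress(chain_outputs, n_out):
    """XOR the bits from many scan chains down to n_out output bits."""
    out = [0] * n_out
    for i, bit in enumerate(chain_outputs):
        out[i % n_out] ^= bit
    return out


print(shift_cycles(100_000, 1))    # 100000
print(shift_cycles(100_000, 100))  # 1000

good = [1, 0, 1, 1]    # fault-free bits from four scan chains
faulty = [0, 1, 1, 1]  # errors on chains 0 and 1

# One output: the two errors cancel, so the faulty response aliases.
print(xor_compress(good, 1) == xor_compress(faulty, 1))  # True
# Two outputs: the errors land on different outputs and are observed.
print(xor_compress(good, 2) == xor_compress(faulty, 2))  # False
```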
1.2.4 SerDes
Though codecs are used with partitioned scan chains to minimize pin count, there may
be further pin limitations, like having only one SI pin per chip. This calls for
a separate functional block that takes in the test pattern serially and drives the scan inputs
of the chip in parallel. This functional block is known as a serializer-deserializer logic, or
in short, SerDes. The SerDes functional block contains a serial to parallel converter and a
parallel to serial converter. Besides other applications, the use of SerDes has been suggested
for reducing the hardware area and power required for on-chip communication [33, 34]. In
test application, patterns are normally shifted at high speed through the deserializer shift
registers and then shifted at a slower speed through the scan chain. Likewise, the patterns
in the serializer registers are shifted out at a high speed [16,41,52].
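The serial-to-parallel and parallel-to-serial conversions can be sketched on bit lists as follows. The function names and the four-chain width are illustrative assumptions; the sketch models only data ordering, not the two clock domains.

```python
def deserialize(bits, width):
    """Serial-to-parallel: group a serial bit stream into `width`-bit words,
    one word per slow scan-shift cycle (bit i of a word drives scan chain i)."""
    if len(bits) % width != 0:
        raise ValueError("stream length must be a multiple of the width")
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def serialize(words):
    """Parallel-to-serial: flatten per-cycle scan-out words into one stream."""
    return [bit for word in words for bit in word]


stream = [1, 0, 1, 1, 0, 0, 1, 0]
words = deserialize(stream, width=4)
print(words)                       # [[1, 0, 1, 1], [0, 0, 1, 0]]
print(serialize(words) == stream)  # True
```

Because each scan-shift cycle consumes `width` serial bits, the serial side has to run `width` times faster than the scan chains to keep them fed, which matches the high-speed and slower-speed shifting described above.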
1.2.5 Analog Bus
Similar to SerDes, test data can also be transmitted as an analog signal [61].
It has been suggested that n-bit digital data can be converted into an analog voltage by a
digital-to-analog converter (DAC), transmitted over a single wire, and then converted back
to n digital bits by an analog-to-digital converter (ADC). Such a scheme, as suggested for
on-chip communication, reduces hardware and is shown to reduce power as well, though it
must be carefully designed to limit noise-related errors.
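The DAC and ADC round trip, and its sensitivity to noise, can be sketched with ideal converters. The reference voltage, the bit width, and the noise amplitudes are hypothetical values chosen for illustration.

```python
def dac(code, n_bits, vref=1.8):
    """Ideal DAC: map an n-bit code to a voltage between 0 and vref."""
    return vref * code / (2 ** n_bits - 1)


def adc(voltage, n_bits, vref=1.8):
    """Ideal ADC: map a voltage back to the nearest n-bit code."""
    levels = 2 ** n_bits - 1
    code = round(voltage / vref * levels)
    return max(0, min(levels, code))


n_bits = 4
lsb = 1.8 / (2 ** n_bits - 1)  # 0.12 V between adjacent codes

code = 5
print(adc(dac(code, n_bits), n_bits))         # 5: noise-free round trip
print(adc(dac(code, n_bits) + 0.04, n_bits))  # 5: noise below half an LSB
print(adc(dac(code, n_bits) + 0.08, n_bits))  # 6: noise above half an LSB
```

Each added bit halves the spacing between levels, so the tolerable noise shrinks as more bits are packed onto the single wire.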
1.3 Prior Work
Most digital VLSI circuits today are tested using the scan-based method [15]. This
reduces the complexity of testing sequential circuits to that of testing combinational circuits.
As mentioned earlier in the scan method, flip-flops are loaded and unloaded through a shift
register mechanism for testing faults in the combinational logic. Custom system-on-chip
(SoC) designs containing microprocessors, digital signal processors and memories use large
numbers of clock cycles during scan-based tests. This directly impacts the final cost of the
chip [15]. In the era of low power devices that contain more than a billion gates, long test
times have become a critical concern.
While the large size of a device is one reason for long test times, the main limiting
factor for test speed is the power dissipated during test due to signal transitions in the
circuit. Test power dissipation is known to be 2× the functional power dissipation in central
processing units (CPU) [47] and 4× the functional power dissipation in graphics processing
units (GPU) [69]. If the power dissipated during test goes beyond the rated power of the
device, then it is possible for a good device to fail or even be damaged. Several approaches
have been investigated and implemented to reduce the total power dissipation of the device
under test (DUT); however, these methods generally lengthen the test time [42]. Hence, in
the current semiconductor industry, where devices continue to get denser and smaller, both
test power and test time must be addressed together.
Earlier approaches to reduce test time used pattern overlapping [20, 23] and reusable scan
chains [36], which exploit similarities between patterns to eliminate redundant scan shift
operations. The reduction in test time depends on the availability of such patterns.
Scan chain partitioning also reduces test time significantly but increases the number of
scan input pins. Bonhomme et al. [13] and Chalkia et al. [17] proposed methods that can
overcome this problem while achieving test time reduction similar to that of multiple scan
chains.
Test time reduction for multi-core SoC designs requires power-constrained scheduling of
tests [21, 22, 37]. Recent proposals by Sheshadri et al. [57–59] optimize SoC test schedules
by selecting supply voltage and clock frequency.
Shanmugasundaram and Agrawal [53–56] proposed a technique to reduce test time in
power-constrained built-in self test (BIST) circuits. They implemented an activity monitor
that increases the clock frequency if the monitor records low activity in the chain, otherwise
it decreases the frequency. The method achieves 20–50% reduction in test time in BIST
circuits with little area overhead. Hashempour et al. [32] implemented a system that uses
both BIST and ATE in an effort to reduce test time on the ATE. The methodology identifies
all "easy-to-detect" faults using BIST and then uses ATE to identify the "hard-to-detect"
faults.
Implementing parallel testing, where multiple dice are tested in parallel, has also reduced
test time when testing large volumes of dice. One noted disadvantage of such methods is
that the time between two tests, also called indexing time, becomes an overhead when
one tester probe has to wait until the other probe completes its test. This can be mitigated
by employing an aperiodic probe method, where in a dual-probe tester system, when one
probe detects a faulty die it has the flexibility to check other dice for a good die by gross
fault tests until the other probe finishes its lengthy tests [27, 45].
The work presented in this thesis focuses on reducing the test application time and
can work in tandem with the above-mentioned procedures. The test time reduction is achieved
by two methods: (1) scaling the supply voltage and (2) scaling the test frequency.
The methods are investigated mathematically to obtain dependencies and then each method
is verified through simulations and experimentation on test equipment: the Advantest
T2000GS entry level ATE for the method using scaled frequency and the National Instruments
ELVIS [2] board test equipment for the method using scaled supply voltage. The research as
it appears here has been presented as posters [63, 64] and discussed at technical forums
[10, 11, 65–68].
Chapter 2
VLSI Test Equipment and Procedure
An undergraduate or a graduate student majoring in Electrical and Computer Engineering
would have several lab sessions dealing with digital and/or analog circuits. During those
lab courses the student would use numerous transistors and integrated circuits to understand
the practical applications of what they studied. Students implement the circuit on a circuit
board by plugging in and interconnecting components. They then supply input values
through a computer or a voltage source connected to the board and observe the output on
an oscilloscope or simply on an LED. The output is then verified on the monitor against the
output they pre-calculated according to the instructions in their lab manual. If the output
matches then they claim that the circuit works and move on, else their circuit does not work
and they analyze it closely to fix the faults. This is the most basic form of testing, where
for a given circuit a set of input patterns is applied and the output response of that circuit
is verified by comparing with the known response. If the response matches then the circuit is
good, else the circuit is bad and the test engineer then finds the source of the problem and
attempts to fix it using diagnostic tools.
In industry, testing happens from the day the chips are designed using a hardware
description language (HDL) until the day the chips are shipped out. In the first half of the
chip's design life, software tools play a major role in test and debug of the design.
Here the design is constantly verified for different process corners, such as power, voltage,
and timing, using a transistor-level model. However, once the design is fabricated, it is more
challenging to meet the required process corners for which the chip is being designed. During
this second half of the chip's life in the industry, automatic test equipment (ATE), or simply
the tester, plays an important part in making sure that the chip has the expected design and
is capable of operating with the desired performance. The basic function of the ATE is to
drive the inputs with the test patterns and then monitor the output response from the chip.
Figure 2.1: Advantest T2000 ATE at Auburn University, Alabama.
2.1 Advantest T2000GS ATE
Auburn University, Alabama, houses an Advantest Model T2000GS automatic test equipment
(ATE), shown in Figure 2.1. The ATE is an entry level system manufactured by
Advantest and can perform digital, mixed signal and RF tests. The test equipment consists
of three units: the mainframe, a user interface console and a test head.
Mainframe
The mainframe, shown separately in Figure 2.2, supplies the main system power. It
also houses the system controller, site controller and the bus matrix. The system controller
provides the tools and applications required by the test engineer to verify and debug the
device under test (DUT) placed on the test head. It controls the user interface, such as the
keyboard and mouse connected to the system controller. Any information related to the test
plan, such as test patterns and the test program, is stored in the system controller. The site
controller communicates with the different modules placed in the test head and executes the
test program on the DUT or test site.
Figure 2.2: Mainframe of Advantest T2000GS at Auburn University.
Test head
The test head, shown in Figure 2.3, consists of different modules used to test the DUT.
The modules in the test head include a 500mA device power supply module (DPS 500mA), a
250MHz digital module (250MDM) and a sync generator. The sync generator provides the
capability of generating multiple time domains or frequencies for the digital module. It
generates the synchronization clock to synchronize the clock with the patterns. The DPS
500mA module has 32 channels and supplies the power to the DUT. The 250MDM consists
of 32 I/O channels of digital logic to drive and observe the signals on the DUT I/O pins.
Figure 2.3: Test head of the Advantest T2000GS with an FPGA on the loadboard.
2.1.1 Test Programming
Once the device under test (DUT) is fabricated, it is tested to sort good and bad chips
in the wafer. For the tester to accurately test the DUT, the test engineers have to provide
the tester with three main inputs, namely the test program, the test vectors and the analog
test waveform [15]. The production test patterns can be generated by test software tools
such as Mentor Graphics Tessent FastScan [7] or Synopsys TetraMAX. Most of the testers
in the industry are compatible with the Standard Test Interface Language (STIL) [3, 40]
test pattern file format. It is the language used to define the test vectors
applied to the DUT. The STIL files contain the following information required for the ATE:
1. Definition of each signal pin in the signal block,
2. Timing and waveform information in the timing block,
3. DC signal levels which are applied and expected, and
4. Definition of test patterns.
The test program contains a sequence of instructions that describes the test flow, the
patterns to be used and the test environment conditions. Once the test program is loaded
the tester uses its test pattern generator (TPG) and the frame processor to generate test
patterns and the clock, respectively.
The Advantest T2000GS ATE uses a native Open Architecture Test System (OPENSTAR)
Test Programming Language, or OTPL for short. It is a modular programming
solution that enables the user to write procedures dealing with various aspects of the test
individually that can later be used with the test plan to obtain a complete test program.
Apart from the T2000's OTPL, the test plan can also be written using C++, though this requires
complete knowledge of the test system and the test object model.
To test a device on the T2000GS ATE the user will have to describe the DUT and the
type of test to be performed. Unlike the STIL file, in OTPL the timing information and the
definition of each signal are defined separately from the pattern file. To take advantage of
the modular programming of the T2000, several files are created. These are described next.
Pin Description File
This file defines the signal and power pins available on the DUT and associates each pin
with the resource type in the test system. For instance, any I/O signal pin is specified with
the digital pin resource or "dpin" and the power pins are specified with the device power
supply 500mA or "dps500mA" resource. Within each resource definition, the pins can be grouped
and labeled for better readability. This file is typically named with a .pin extension.
Socket File
This file specifies the mapping between the DUT pins and the ATE connectors. This
file does not offer any grouping of the pins and does a general mapping of each pin with
the ATE connectors. Every signal pin is specified in this file and the corresponding ATE
connectors can be found in the resource folder.
Specification File
The device specifications, such as the supply voltage range, current range, timing and
slew rate are specified in this file. The specifications are defined as variables with a data
type that indicates the type of specification (voltage, current or time) to provide more
readability and consistency across other files. Each variable can be specified with a range
of values, such as min, max and typical. The range is user defined, and the values for each
specification are specified in the same order, separated by commas.
Levels File
The levels file specifies the voltage and current levels at each signal and power pin. The
values at each pin can be either fixed or assigned a variable name from the specification file.
In this file the levels can be specified common to a group in the pin description file or to an
individual pin.
Timing and Timing Map File
The timing description related to the test clock period and the behavior of the signal
(waveform) at each pin or pin group is specified in the timing file. Each timing group can be
specified with 4 periods and up to 8 waveforms. The input values specified in the pattern file
must correspond to the values specified in the timing waveform, i.e., if input values
such as 'X' or 'Z', indicating a don't care bit or high impedance, respectively, are specified
in the pattern file, the signal behavior for those values must be specified in the timing file.
Test Condition File
The preferred operating condition for a given test is specified in the test condition
file. It includes the type of voltage levels and their specifications, and the timing information
for the test. Every test can be provided with a unique test condition, which can either
define new specifications or use the range from the specification file.
Pattern File
The pattern file has the test patterns or test vectors to be applied during functional
tests. The OTPL allows several types of pattern descriptions, such as algorithmic pattern
generation (ALPG), SCAN pattern generation, or a simple pattern list that specifies the
values at each input and output pin. The tester's pattern generator will generate the signals
based on the values specified in the pattern file and the timing behavior provided in the
timing file.
Test Plan File
This is the main file that organizes the test flow and calls all the test condition and
resource files. Every test flow can also be provided with an option of binning, which logs the
failed device and the levels at which the failure occurred. The patterns that each test
uses are also called from this file.
2.1.2 Test Data Analysis
Analyzing the test data helps to identify or sort the good chips from the bad ones. From
the bad chips the test engineer can understand the fabrication process and fine tune the
process for the next design to minimize the defects. The analysis also provides information
about the design weaknesses [15]. The data also provide information on the quality of the
test, which indicates how thorough the test has been in sorting good chips. A chip that fails
can easily be sorted as faulty; however, a chip that passes may have passed only
for the given test model and can still fail in some other scenario.
Process variations play a key role in the discrepancies that occur during fabrication. It
is quite possible that in the same wafer different chips may have, say, different operating
frequencies due to process variations. Failure mode analysis of the failing chips can provide
information to improve the fabrication process. Normally chips failing due to process
variations have similar failing patterns.
The T2000GS offers several graphical user interface (GUI) tools, of which the logic analyzer
and the oscilloscope are used in this work. The logic analyzer provides a digital
representation of the signal activity during the test and the oscilloscope provides an analog
representation of the signal. The oscilloscope provides the tools to measure signal
characteristics such as rise and fall times, voltage and current values. These tools also provide
indication of the expected and observed values and the time at which the event is observed.
2.2 Time and Cost Relationship
Test time depends on the type of test conducted on the ATE. Two categories of tests
are performed: parametric tests and functional tests. Parametric
tests are performed at slow speeds and the test time depends on the number of pins that
are tested. Functional tests are performed at higher speeds than parametric tests and the
test time depends on the number of vectors applied and the frequency at which they are
applied. Testing cost can be defined as the cost incurred for the amount of time spent on
the tester. This cost can be quantified as [15],
Running cost = Depreciation + Maintenance + Operating Cost
Table 2.1: State-of-the-art IMA tester cost analysis data [62].
Parameter                  Value
ATE purchase price         $985K
Depreciation               20% [27]
Maintenance                4%
Operating cost             10% [15]
Production weeks/yr        52
Production days/week       7
Production shifts/day      3
Production hours/shift     8
Devices per slot           7000
Good devices test time     5 seconds
Bad devices test time      0.3 seconds
Yield                      98%
Consider the test data example given in Table 2.1. Using the data in the table we can
calculate the test cost for a single chip.
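$$\begin{aligned}
\text{Running cost} &= \$985{,}000 \times (0.2 + 0.04 + 0.1) = \$334{,}900 \\
\text{Tester usage} &= \text{weeks/yr} \times \text{days/week} \times \text{shifts/day} \times \text{hours/shift} \times 3600\ \text{seconds} \\
&= 52 \times 7 \times 3 \times 8 \times 3600\ \text{seconds} = 31{,}449{,}600\ \text{seconds} \\
\text{Testing cost} &= \frac{\text{Running cost}}{\text{Tester usage}} = \frac{\$334{,}900}{31{,}449{,}600\ \text{seconds}} \approx 1.065\ \text{cents/second} \\
\text{Total test time} &= \text{Total time for good devices} + \text{Total time for bad devices} \\
&= 7000 \times (0.98 \times 5 + 0.02 \times 0.3)\ \text{seconds} = 34{,}342\ \text{seconds} \\
\text{Total cost} &= \text{Total test time} \times \text{Testing cost} \\
&= 34{,}342\ \text{seconds} \times 1.065\ \text{cents/second} \approx 36{,}570\ \text{cents} \\
\text{Cost per die} &= \frac{\text{Total cost}}{\text{Number of good dice}} = \frac{36{,}570\ \text{cents}}{7000 \times 0.98} \approx 5.3\ \text{cents}
\end{aligned}$$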
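The chain of formulas above can also be written as a short Python sketch, which makes it
easy to repeat the calculation for other tester prices, yields or test times. The function
and parameter names are illustrative; the inputs are the Table 2.1 values.

```python
def cost_per_good_die(price, depreciation, maintenance, operating,
                      weeks, days, shifts, hours,
                      devices, t_good, t_bad, yield_fraction):
    """Return (cents per second, total test seconds, cents per good die)."""
    running_cost = price * (depreciation + maintenance + operating)   # dollars/year
    tester_usage = weeks * days * shifts * hours * 3600                # seconds/year
    cents_per_second = 100.0 * running_cost / tester_usage
    total_time = devices * (yield_fraction * t_good
                            + (1.0 - yield_fraction) * t_bad)          # seconds
    total_cost = total_time * cents_per_second                         # cents
    good_dice = devices * yield_fraction
    return cents_per_second, total_time, total_cost / good_dice


cents_per_second, total_time, per_die = cost_per_good_die(
    price=985_000, depreciation=0.20, maintenance=0.04, operating=0.10,
    weeks=52, days=7, shifts=3, hours=8,
    devices=7000, t_good=5.0, t_bad=0.3, yield_fraction=0.98)
print(f"testing cost  : {cents_per_second:.3f} cents/second")
print(f"total time    : {total_time:.0f} seconds")
print(f"cost per die  : {per_die:.2f} cents")
```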
Though parallel testing dominates the industry, the cost of testing will still be significant
owing to the volume of chips produced and the number of parallel sites available to run these
tests. Having many parallel sites also adds to the cost of the tester as a whole and involves
maintenance costs. Hence test time reduction is still a major concern in testing.
Chapter 3
Test Time Theorem and Applications
In the previous chapter we saw how test time affects the cost of a single chip. In this
chapter we lay the foundation for the proposed methods by stating a theorem for minimum
test time.
3.1 Test Time Theorem
Theorem. For power constrained testing where the peak power during any clock cycle must
not exceed $P_{PEAKfunc}$, the test time ($TT$) has a lower bound,
$$\frac{E_{TOTAL}}{P_{PEAKfunc}} \le TT = \frac{E_{TOTAL}}{P_{AVG}} \tag{3.1}$$
where $E_{TOTAL}$ is the total energy and $P_{AVG}$ is the average power consumed by the test.
Proof:
Consider a test that runs for $N$ clock cycles and for cycle $i$, we define:
$T_i$ as period of the clock cycle,
$E_{di}$ as dynamic energy consumed during the cycle,
$P_{li}$ as leakage power dissipated during the cycle, and
$E_i$ as total energy consumed during the cycle.
Then, test time and total energy are given by,
$$TT = \sum_{i=1}^{N} T_i \tag{3.2}$$
$$E_{TOTAL} = \sum_{i=1}^{N} E_i = \sum_{i=1}^{N} \left(E_{di} + T_i P_{li}\right) \tag{3.3}$$
In particular, for a periodic clock test, $T_i = T_{test}$, i.e., all clock cycles have the same
period $T_{test}$,
$$TT = N\, T_{test} \tag{3.4}$$
The equality in equation (3.1) follows from the standard definitions of energy and power.
$P_{AVG}$ is the rate of energy usage, averaged over the test duration $TT$. Therefore, total
energy is $E_{TOTAL} = TT \times P_{AVG}$.
To prove the lower bound, the power constraint that each clock cycle must satisfy is
examined. The clock cycles are assumed to have different periods and thus a conventional
periodic clock would be a special case. Thus,
$$\frac{E_{di}}{T_i} + P_{li} \le P_{PEAKfunc}, \quad \forall\ 1 \le i \le N \tag{3.5}$$
or
$$T_i \ge \frac{E_{di} + T_i P_{li}}{P_{PEAKfunc}}, \quad \forall\ 1 \le i \le N \tag{3.6}$$
Hence, from equations (3.2) and (3.3),
$$TT \ge \frac{1}{P_{PEAKfunc}} \sum_{i=1}^{N} \left(E_{di} + T_i P_{li}\right) = \frac{E_{TOTAL}}{P_{PEAKfunc}} \tag{3.7}$$
This proves the lower bound on test time in equation (3.1).
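The bound can be checked numerically on a small example. The Python sketch below evaluates
equations (3.2) and (3.3) for a four-cycle periodic test and compares the test time with
the lower bound of equation (3.1). The cycle energies, leakage power and peak power are
hypothetical values chosen only for illustration.

```python
def time_and_energy(periods, dyn_energy, leak_power):
    """Test time (3.2) and total energy (3.3) for per-cycle data."""
    tt = sum(periods)
    e_total = sum(ed + t * pl
                  for t, ed, pl in zip(periods, dyn_energy, leak_power))
    return tt, e_total


# Hypothetical per-cycle data: dynamic energy in joules, leakage power in watts.
dyn_energy = [2.0e-9, 5.0e-9, 1.0e-9, 4.0e-9]
leak_power = [0.01, 0.01, 0.01, 0.01]
p_peak = 0.5   # assumed peak power limit in watts

# Periodic clock: one period, long enough for the most power-hungry cycle.
t_test = max(ed / (p_peak - pl) for ed, pl in zip(dyn_energy, leak_power))
periods = [t_test] * len(dyn_energy)

tt, e_total = time_and_energy(periods, dyn_energy, leak_power)
p_avg = e_total / tt
lower_bound = e_total / p_peak
print(f"TT = {tt:.4e} s, E_TOTAL/P_PEAK = {lower_bound:.4e} s, P_AVG = {p_avg:.4f} W")
```

The gap between the two printed times comes from the cycles that dissipate less than the
peak power at the common clock period.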
Leakage power plays an interesting role. Notice that in inequality (3.6), $T_i$ appears
on both sides. For a given $P_{PEAKfunc}$, as clock period $T_i$ is increased to satisfy the power
constraint, the right hand side also increases, though at a slower rate because of small $P_{li}$.
The minimum period for the $i$th clock cycle is,
$$T_i = \frac{E_{di}}{P_{PEAKfunc} - P_{li}} \tag{3.8}$$
To determine $T_i$ we must know dynamic energy $E_{di}$ and leakage power $P_{li}$, both of which are
functions of the input vector applied to the circuit in clock cycle $i$. For now, let us neglect
the leakage power and thus equation (3.8) will take a simpler form,
$$T_i = \frac{E_{di}}{P_{PEAKfunc}} \approx \frac{E_i}{P_{PEAKfunc}} \tag{3.9}$$
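The remark that $T_i$ appears on both sides of (3.6) can be illustrated with a small Python
sketch. It compares the closed form of equation (3.8) with a fixed-point iteration on the
equality form of (3.6), starting from the leakage-free estimate of (3.9). The numerical
values are hypothetical.

```python
def min_period(e_dyn, p_peak, p_leak=0.0):
    """Minimum clock period from (3.8); p_leak = 0 gives the simpler form (3.9)."""
    if p_leak >= p_peak:
        raise ValueError("leakage power must be below the peak power limit")
    return e_dyn / (p_peak - p_leak)


def min_period_iterative(e_dyn, p_peak, p_leak, iterations=50):
    """Iterate T <- (E_d + T * P_l) / P_peak, the equality form of (3.6)."""
    t = e_dyn / p_peak   # leakage-free starting point, equation (3.9)
    for _ in range(iterations):
        t = (e_dyn + t * p_leak) / p_peak
    return t


e_dyn, p_peak, p_leak = 5.0e-9, 0.5, 0.01   # hypothetical: joules, watts, watts
t_closed = min_period(e_dyn, p_peak, p_leak)
t_iter = min_period_iterative(e_dyn, p_peak, p_leak)
t_no_leak = min_period(e_dyn, p_peak)
print(t_closed, t_iter, t_no_leak)
```

With leakage much smaller than the peak power, the iteration converges quickly and the
leakage-free period is only slightly shorter than the exact one.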
For a given set of test patterns generated by an automatic test pattern generator
(ATPG), the total energy consumed during test remains unchanged irrespective of how
tests are applied. The total test time is dependent only upon the average power consumed.
In order to reduce the test time, it is required that the test be run with the smallest clock
period possible while dissipating power less than the rated power. Since the minimum pe-
riod is limited by the critical path delay of the DUT, test time is dependent on both the
rated power and the structural delay of the circuit. The two constraints that determine the
minimum test clock period can be defined as follows,
1. Power Constraint - A test is power constrained if the minimum test clock period
is limited by the maximum rated power for the circuit. We define this period as
$T_{power} = E_{MAX(test)}/P_{PEAKfunc}$, where $P_{PEAKfunc}$ is the maximum power dissipated
during functional operation or the rated maximum for the DUT and $E_{MAX(test)}$ is the
maximum energy dissipated during any test cycle.
2. Structure Constraint - A test is structure constrained if the minimum test clock period
is limited by the structural (critical path) delay of the DUT. We define the fastest clock
as $f_{structure} = 1/T_{structure}$, where $T_{structure}$ is the structure constrained clock period.
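Based on the above definitions, the minimum test clock period would have to satisfy both
power and structure constraints, i.e.,
$$T_{test} = \max\{T_{structure},\ T_{power}\} \tag{3.10}$$
In a power constrained test, $T_{power} > T_{structure}$ and the test clock period is,
$$T_{test} = T_{power} = \frac{E_{MAX(test)}}{P_{PEAKfunc}} \tag{3.11}$$
Substituting equation (3.11) in equation (3.4) we get the total test time for a power
constrained test as,
$$TT_{min} = N\, \frac{E_{MAX(test)}}{P_{PEAKfunc}} \tag{3.12}$$
Equation (3.12) can also be represented as
$$TT_{min} = \frac{E_{TOTAL}}{E_{AVG}} \times T_{sync} = \frac{E_{TOTAL}}{P_{AVG}} \tag{3.13}$$
where $E_{AVG} = E_{TOTAL}/N$ is the average energy per cycle and $T_{sync}$ is the period of the
periodic test clock, so that $P_{AVG} = E_{AVG}/T_{sync}$.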
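Equations (3.10) to (3.13) translate directly into a few lines of Python. The sketch below
uses hypothetical cycle energies in nJ, power in W and time in ns (so that energy divided by
power is already in ns), and confirms that the two forms (3.12) and (3.13) agree.

```python
def periodic_test_time(cycle_energies, p_peak, t_structure):
    """Return (clock period, total test time) for a periodic clock test."""
    t_power = max(cycle_energies) / p_peak     # power constrained period, (3.11)
    t_test = max(t_structure, t_power)         # equation (3.10)
    return t_test, len(cycle_energies) * t_test


# Hypothetical 8-cycle test: energies in nJ, peak power in W, delay in ns.
energies = [1.0, 1.0, 1.0, 4.0, 9.0, 3.0, 8.0, 2.0]
t_test, tt_min = periodic_test_time(energies, p_peak=1.0, t_structure=2.0)

# Equation (3.13): the same test time expressed through the average power.
e_total = sum(energies)
p_avg = (e_total / len(energies)) / t_test
print(t_test, tt_min, e_total / p_avg)
```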
3.2 Applications of Test Time Theorem
In section 3.1, it was shown that for a given rated power, test time is limited by the
total energy dissipated during test. Conventionally, energy can be reduced by modifying
the test vectors. For instance, to increase the probability of identifying a fault with a given
pattern set, the automatic test pattern generator (ATPG) uses \0?s" and \1?s" randomly to
 ll the \don?t care" bits during pattern generation. However it causes excessive switching
in the scan chain during scan shift and thus increases the shift power. This e ect can be
avoided by conservatively  lling the \don?t care" bits with adjacent  ll, where the \don?t
care" bits are  lled with the same value of the bits adjacent to it, or with only \0?s" or only
\1?s". Since this procedure is done mainly to reduce power, for a given allowable power the
ATPG normally increases the number of test patterns to achieve the desired test coverage.
27
Thus this increases test time and often is the trade o . Now the question is whether test
time can be reduced using a given set of vectors, rated power, and the critical period for the
device under test (DUT). Based on the theorem stated earlier, we describe two scenarios in
this section and examine the feasibility of reducing test time with the given constraints.
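The effect of the fill strategy on switching can be sketched in Python by counting
transitions between neighboring bits of a filled pattern, used here only as a rough proxy
for scan shift activity. The pattern and helper names are hypothetical, and production ATPG
tools apply these fills internally.

```python
import random


def adjacent_fill(pattern):
    """Fill each 'X' with the nearest specified bit on its left.

    Leading 'X' bits take the first specified bit; an all-'X' pattern becomes zeros.
    """
    specified = [b for b in pattern if b != "X"]
    last = specified[0] if specified else "0"
    filled = []
    for bit in pattern:
        if bit != "X":
            last = bit
        filled.append(last)
    return "".join(filled)


def constant_fill(pattern, value):
    """Fill every 'X' with the same value ('0' or '1')."""
    return pattern.replace("X", value)


def random_fill(pattern, rng):
    """Fill every 'X' with a random bit."""
    return "".join(rng.choice("01") if b == "X" else b for b in pattern)


def transitions(bits):
    """Count positions where neighboring bits differ."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


pattern = "1XX0XX1X"          # hypothetical scan pattern, 'X' = don't care
rng = random.Random(0)        # fixed seed so the run is repeatable
for name, filled in [("random", random_fill(pattern, rng)),
                     ("0-fill", constant_fill(pattern, "0")),
                     ("1-fill", constant_fill(pattern, "1")),
                     ("adjacent", adjacent_fill(pattern))]:
    print(f"{name:9s} {filled} transitions={transitions(filled)}")
```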
3.2.1 Periodic Clock Test
The first scenario considers a test using a fixed clock period for every cycle during test.
This is the conventional method of testing, which we call the periodic clock test, where
every cycle has the same period as its neighboring cycle. To minimize the test time of a
periodic clock test, let us assume a test with $N$ clock cycles with a period $T_{test}$ and
frequency $f_{test} = 1/T_{test}$. As described in Section 3.1, the test clock period is constrained
by the rated power and the critical path delay of the circuit. Based on equations (3.10) and
(3.11) the test clock period is limited by,
• $T_{structure}$ - the critical path delay, which limits the minimum period,
• $E_{MAXtest}$ - the maximum cycle energy dissipated for a given set of vectors, and
• $P_{PEAKfunc}$ - the maximum allowable rated power for the device under test (DUT).
In a power constrained test, the maximum power that any cycle can dissipate is limited
to $P_{PEAKfunc}$, hence $P_{PEAKfunc}$ can be assumed to be a constant. Then based on equation
(3.11) we infer that the only way to minimize $T_{test}$ is to minimize the numerator $E_{MAXtest}$.
Since for a given test the test vectors are practically unchanged, the switched capacitance
during the test will not vary and thus the energy dissipated during any cycle will be
proportional to the square of the supply voltage applied to the DUT during test. So reducing
the supply voltage can significantly reduce the energy during test. Doing so, we now have
a lot of headroom between the power dissipated during test and the allowable peak power. If
we want to maintain the same power dissipation, $P_{PEAKfunc}$, the frequency of the test must
be increased. The new test clock period $T_{test}$ can be obtained using equation (3.11) with
the energy dissipated at the new supply voltage. This way the test time can be reduced
using the new power constrained test clock period at the new voltage.
The idea of using a low supply voltage and increasing the frequency would work very
well if not for one caveat: when the voltage is reduced the gates tend to switch slower due
to the increased time in charging the load capacitance. This indicates that the critical
path delay can increase and, in the worst case, the critical path itself can change. Assuming
that there is no change in the critical path, when the voltage is reduced the critical path
delay increases. From equation (3.10), the test clock period is structure constrained if
$T_{structure} > T_{power}$, and any reduction in voltage will increase the delay and hence the
test clock period must increase. Hence it should be ensured that the voltage is not so low
that the power constrained test clock period is shorter than the structural delay. The optimum
supply voltage should be such that the test clock period $T_{test} = T_{power} = T_{structure}$.
Thus for periodic clock testing at the optimum voltage,
$$P_{PEAKfunc} = \frac{E_{MAXtest}}{T_{structure}} \tag{3.14}$$
and the minimum test time for a periodic clock test is given by
$$TT_{sync} = N\, T_{structure} \tag{3.15}$$
Figure 3.1 illustrates the minimum test time as a function of supply voltage. If a test
is performed at the nominal supply voltage, e.g., 1.8V for 180nm CMOS technology, the
test clock period is limited by the maximum power dissipated by the DUT during any clock
cycle. If the rated power is lower than the maximum power dissipated during test, the test
clock period must be wide enough to ensure that the test power does not exceed the rated
power. If we reduce the voltage then $E_{MAXtest}$ reduces and $T_{structure}$ increases. Based
on equation (3.11), if the power dissipated is held constant at $P_{PEAKfunc}$ then the test clock
period decreases. Repeating the experiment several times at each voltage level, as long as
the test is still power constrained, we achieve a reduction in test time. At a certain supply
voltage $V_{sync} < V_{nom}$, the energy dissipated becomes low enough that the test is no longer
power constrained and the structural delay of the circuit starts to dominate the test clock
period. Thus, Figure 3.1 can be partitioned into two regions: the region on the right side
indicates that the test time is power constrained and the region on the left side indicates
that the test time is structure constrained. The minimum value of test time occurs at the
boundary of the two regions. The voltage at this boundary is the optimum voltage at which
the test will be fastest. Any reduction in voltage beyond $V_{sync}$, i.e., in the structure
constrained region, will increase the test time significantly.
Figure 3.1: Minimum test time as a function of supply voltage ($V_{DD}$) for N-cycle periodic
clock test. For a minimum test time $TT_{sync}$ supply voltage is $V_{sync}$ which is lower than the
nominal voltage $V_{nom}$.
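The shape of this curve can be reproduced with a simple voltage sweep in Python. The sketch
below is only a model under stated assumptions: cycle energy scales with the square of the
supply voltage, as described above, while the critical path delay follows an alpha-power-law
form $V/(V - V_{th})^{\alpha}$, which is an assumption introduced here for illustration and is
not a model taken from this chapter. All numerical values are hypothetical.

```python
def t_power(v, e_max_nom, v_nom, p_peak):
    """Power constrained period (3.11), with energy scaling as V squared."""
    return e_max_nom * (v / v_nom) ** 2 / p_peak


def t_structure(v, t_nom, v_nom, v_th, alpha):
    """Critical path delay under an ASSUMED alpha-power law, normalized at v_nom."""
    def shape(x):
        return x / (x - v_th) ** alpha
    return t_nom * shape(v) / shape(v_nom)


# Hypothetical values: energy in nJ, power in W, time in ns, voltage in V.
V_NOM, V_TH, ALPHA = 1.8, 0.5, 1.5
E_MAX_NOM, P_PEAK, T_NOM = 1.0, 0.1, 2.0

voltages = [0.80 + 0.01 * k for k in range(101)]        # 0.80 V to 1.80 V
periods = [max(t_structure(v, T_NOM, V_NOM, V_TH, ALPHA),
               t_power(v, E_MAX_NOM, V_NOM, P_PEAK))     # equation (3.10)
           for v in voltages]

t_opt, v_opt = min(zip(periods, voltages))
print(f"nominal voltage: period = {periods[-1]:.2f} ns")
print(f"optimum voltage ~ {v_opt:.2f} V: period = {t_opt:.2f} ns")
```

In this model the minimum falls where the two constraints cross, which mirrors the boundary
between the power constrained and structure constrained regions.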
Figure 3.2: Illustration of test power and test energy for every test cycle using periodic clock.
The test clock period is determined by the cycle dissipating the maximum power.
3.2.2 Aperiodic Clock Test
In Section 3.2.1 we considered the scenario where the clock period was fixed and thus
the power constrained test clock period was determined by the maximum power dissipated
during test. According to the theorem in Section 3.1, the periodic clock test then serves as
the upper bound of test time. In the second scenario, the goal is to achieve the lower bound
for test time in the theorem.
Consider the illustration in Figure 3.2, which shows the energy and power dissipated
during a given test of $N$ cycles ($N = 8$, here). The power constrained test clock period in
a periodic clock test is determined by the cycle that consumes the most power. Though
the maximum power is now limited within the allowed rated power for the DUT, there will
be some cycles that dissipate lower power than the maximum power. Hence, in a power
constrained test scenario equation (3.13) may not be the optimum solution for the minimum
test time, since the denominator $P_{AVG}$ can be small if there are many cycles consuming
lower power. This means that the power constrained test time can be reduced if the
denominator can be made larger. The arithmetic mean of a set of positive values is maximum
when every value equals the largest allowed value. Thus, we infer that in order to increase
the value of $P_{AVG}$, every cycle should consume the same power, equal to the rated maximum
power $P_{PEAKfunc}$ for the device. This is
achieved by using an aperiodic clock test, where the period of each clock cycle can be unique
and may differ from the period of the neighboring cycle. This is illustrated in Figure 3.3,
where every cycle has a unique period that is determined by the amount of power dissipated
during that cycle.
Figure 3.3: Illustration of test power and test energy for every test cycle using aperiodic
clock. The test clock period for every cycle is determined by the power dissipated during
that cycle.
Though the period of each cycle is determined by the power dissipated during that cycle,
the resulting period must not cause any setup or hold time violations. Hence the minimum
clock period allowed is limited to the critical path delay of the circuit. The period for each
cycle in an aperiodic clock test will then be given by,
$$T_i = \max\left\{T_{structure},\ \frac{E_i}{P_{PEAKfunc}}\right\} \tag{3.16}$$
where $E_i$, $i = 1, 2, 3, \cdots, N$, is the energy consumption during the $i$th clock cycle, and
$T_i$ is the test clock period of the $i$th cycle, which must not be shorter than $T_{structure}$.
Notice that since the energy is independent of the chosen time period, the device still
dissipates the same amount of energy for the given test vectors as in the periodic clock test.
Equation (3.16) indicates that each cycle can be structure constrained or power constrained
based on the energy dissipated during that cycle, i.e., the cycle is structure constrained if
$E_i \le P_{PEAKfunc} \times T_{structure}$ and the cycle is power constrained if
$E_i > P_{PEAKfunc} \times T_{structure}$.
For instance, in Figure 3.3 since energies $E_5$ and $E_7$ are high, cycles $T_5$ and $T_7$ will
definitely be power constrained. However, because energies $E_1$ to $E_3$ are low the
corresponding cycles could be structure constrained. Revisiting equation (3.6), we can notice
that in an aperiodic clock test the leakage energy during the cycles with shorter time period
will be lower. The test time for an aperiodic clock test is bounded by,
$$\sum_{i=1}^{N} \max\left\{T_{structure},\ \frac{E_i}{P_{PEAKfunc}}\right\} \le TT_{async} \tag{3.17}$$
$$TT_{async} \le TT_{sync} = N\, \frac{E_{MAXtest}}{P_{PEAKfunc}} \tag{3.18}$$
Equation (3.17) holds when there is a mix of low power and high power test cycles,
and the equality in equation (3.18) occurs when all the cycles dissipate the same amount of
energy. From equation (3.18) we can conclude that at any given voltage, as long as the test
is power constrained, the time taken by an aperiodic clock test
will be no longer than the time taken by a periodic clock test. So, as described for the
periodic clock test, there should be an optimum voltage at which the aperiodic clock test is
fastest. The optimum voltage for an aperiodic test can in fact be inferred by back tracing
from the optimum voltage of a periodic test.
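A small Python sketch makes the comparison between the two clocking schemes concrete. It
applies equation (3.16) cycle by cycle, labels each cycle as structure or power constrained,
and compares the total with the periodic test time of equation (3.18). The eight cycle
energies are hypothetical values loosely patterned on Figure 3.3 (high in cycles 5 and 7,
low in cycles 1 to 3); energies are in nJ, power in W and time in ns.

```python
def aperiodic_periods(cycle_energies, p_peak, t_structure):
    """Per-cycle clock periods from equation (3.16)."""
    return [max(t_structure, e / p_peak) for e in cycle_energies]


def cycle_types(cycle_energies, p_peak, t_structure):
    """Classify each cycle as 'structure' or 'power' constrained."""
    return ["structure" if e <= p_peak * t_structure else "power"
            for e in cycle_energies]


# Hypothetical 8-cycle test: energies in nJ, peak power in W, delay in ns.
energies = [1.0, 1.0, 1.0, 4.0, 9.0, 3.0, 8.0, 2.0]
P_PEAK, T_STRUCTURE = 1.0, 2.0

periods = aperiodic_periods(energies, P_PEAK, T_STRUCTURE)
tt_async = sum(periods)
tt_sync = len(energies) * max(T_STRUCTURE, max(energies) / P_PEAK)

for i, (t, kind) in enumerate(zip(periods, cycle_types(energies, P_PEAK, T_STRUCTURE)), 1):
    print(f"cycle {i}: T = {t:.1f} ns ({kind} constrained)")
print(f"TT_async = {tt_async:.1f} ns, TT_sync = {tt_sync:.1f} ns")
```

When every cycle dissipates the same energy the two totals coincide, which corresponds to
the equality case of equation (3.18).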
Consider the plot in Figure 3.4, which is an extension of the illustration in Figure 3.1.
Here the point "A" indicates the optimum voltage Vsync at which the periodic clock test is
fastest. If we increase the voltage from point "A" then the test will become power constrained,
and hence, as we discussed earlier, a periodic clock test will have a mix of low and high
power cycles and the clock period will be set by the cycle that consumes the most power. If
we use an aperiodic clock test beyond point "A", because there is a mix of low power and
high power cycles, the low power cycles will run periodically at the structural period,
while the cycles with higher power will be expanded aperiodically to dissipate the
same amount of power. In the region between point "A" and point "B" there will be a mix of
structure constrained and power constrained cycles and the test is mostly dominated by the
structure constrained cycles. The minimum test time for an aperiodic clock test will be at the
supply voltage at which there are more structure constrained cycles than power constrained
cycles, and the structural delay is at its minimum. This point is shown in the figure as
Vasync, which is the optimum aperiodic supply voltage, and Vasync > Vsync always. From
Figure 3.4 we can infer that the periodic clock test at the optimum voltage is a special
case of the aperiodic test in which every cycle of the aperiodic clock test is structure
constrained.

Figure 3.4: Minimum test time as a function of supply voltage (VDD) for N-cycle aperiodic
clock test. For a minimum test time TTsync supply voltage is Vsync which is lower than the
nominal voltage Vnom.
In the following chapters we discuss applications of the theorem with an experimental
example on a benchmark circuit, and support the theorem with transistor level simulation
results for several benchmark circuits.
Chapter 4
Scaling Supply Voltage to Reduce Periodic Clock Test Time
4.1 Low Voltage Tests
Testing at low voltage has several advantages. Hao and McCluskey [31] have shown
that manufacturing defects such as interconnect bridging and gate-oxide shorts become more
visible (testable) at reduced voltage. Such defects are the main causes of early life failures
and reliability issues in circuits, but they often escape the test at nominal voltage [18,19,31].
When the voltage is reduced, the resistance of the short does not change and the voltage
drop across these resistive shorts becomes high. According to Chang and McCluskey [18,19],
the voltage at which these defects are detected lies between 2VTH and 2.5VTH. Roehr [49]
indicates that for a reasonable yield, the voltage can be obtained through statistical analysis
of min-VDD tests on a large sample of chips. Reducing the supply voltage has a quadratic effect
on the dynamic power dissipation; hence it is an attractive option in testing, especially
during scan shift operation [24]. For instance, a test pattern set that causes a lot of signal
transitions in the device under test (DUT), due to random fill used to obtain better fault
coverage with fewer vectors, can be applied at a lower supply voltage to keep the power
dissipation from exceeding the rated power of the DUT.
A cited disadvantage of reduced voltage testing is the possible change in critical paths
[18], which can force an increase in the test clock period. Qian et al. [46] have suggested
novel timing tests as an alternative solution to the conventional logic tests to identify gate
oxide defects at very low power supply.
4.2 Reduced Supply Voltage Test
As indicated in Section 4.1, testing at low voltages has its advantages and disadvantages.
It was mentioned in Section 3.1 of Chapter 3 that the speed of a test is constrained
by the power dissipated by the DUT during test and by the structural delay of the DUT. With
regard to power, reducing the supply voltage has a significant advantage over lowering the test
power. In fact, even a slight reduction in voltage can significantly reduce dynamic
power dissipation, and reduce gate oxide and sub-threshold leakage power
dissipation even more [35]. With respect to test time reduction, reducing power could enable
us to increase the speed of testing while maintaining the same power dissipation. However, at
a reduced supply voltage the gates switch more slowly, which increases the critical path delay
and sometimes changes the critical path. Hence, the question arises: how low can the voltage
be reduced? In this section we examine the lowest possible voltage that does not change
the critical path.
The operational speed of a circuit is characterized by the time taken for a signal to
propagate from one register to the next through a combinational path. The accumulated
delays of individual gates in a path through which the signals propagate determine the total
delay of that path. The path that has the longest delay becomes the critical path, and
any path with a delay less than the critical path is considered a non-critical path. The
propagation delay of a gate represents the time to charge and discharge the load capacitor.
When the gate switches, it operates in the saturation region and the drain current is directly
proportional to the square of the difference between the gate-source voltage and the threshold
voltage. More generally, in the saturation region, the drain current can be shown to be directly
proportional to (VGS − VTH)^α [51], where α is the velocity saturation index. The relation
between gate delay and supply voltage is shown quantitatively by Sakurai and Newton [51]
and by Nose and Sakurai [43]. A simplified proportionality relation between delay and supply
voltage is given by Sakurai [50] and is shown below,
$$t_d \propto K \cdot \frac{V_{DD}}{(V_{DD} - V_{TH})^{\alpha}} \tag{4.1}$$
According to Sakurai and Newton [51], the velocity saturation index α ranges from 1 to
2 based on the channel length. Several methods [14, 51] can be used to find a value for
α. For the work presented in this thesis, the value of α is found to be near 2. However,
the experiments can be performed for any value between 1 and 2 based on the available
technology.
To determine the accuracy of the delay calculated using equation (4.1), the delay
obtained by triggering the critical path of a DUT can be compared with the calculated
value. In this experiment, the s298 ISCAS'89 benchmark circuit is synthesized
in 180nm CMOS technology and we assume that the test is not power constrained,
i.e., the test is limited only by the structural delay of the circuit. It was observed that the
critical path determined by the Leonardo Spectrum [7] static timing analysis (STA) tool was
a false path, and hence another path was chosen between the two registers of the critical path.
Using an ATPG tool, such as Mentor Graphics Fastscan [5], a path delay vector set was obtained
for a path with 6 out of 7 gates specified in the critical path. The initial path delay, measured
using SPICE, was used to calculate the value of the constant K in equation (4.1). With
the assumption that the critical path will not change as the voltage is reduced, the
delay was calculated using equation (4.1) at every voltage from the nominal voltage
of 1.8V down to 0.6V in steps of 0.1V. The calculated value was used as the new clock period and
the SPICE simulation was performed again. If the expected transitions occurred in the chosen
path, then the path delay was noted for that voltage. If the expected transitions did not
occur, then the test clock period was increased and the test was repeated.
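The calculation side of this procedure can be sketched as follows. This is only an illustration of equation (4.1): it assumes the 0.77ns path delay, the 0.39V threshold voltage and α = 2 that are reported later in this chapter, and the function names are hypothetical. The SPICE check at each voltage is not modeled.

```python
def calibrate_k(t_measured, vdd, vth, alpha):
    """Solve equation (4.1) for K from one measured delay."""
    return t_measured * (vdd - vth) ** alpha / vdd


def alpha_power_delay(k, vdd, vth, alpha):
    """Path delay predicted by the alpha-power law at supply voltage vdd."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * vdd / (vdd - vth) ** alpha


V_NOM = 1.8      # nominal supply voltage (V)
V_TH = 0.39      # threshold voltage (V), value quoted later in the chapter
ALPHA = 2.0      # velocity saturation index found for this technology
T_NOM = 0.77e-9  # path delay at nominal voltage (s), value quoted later

k = calibrate_k(T_NOM, V_NOM, V_TH, ALPHA)

# 1.8 V down to 0.6 V in 0.1 V steps, as in the experiment.
voltages = [round(V_NOM - 0.1 * i, 1) for i in range(13)]
delays = {v: alpha_power_delay(k, v, V_TH, ALPHA) for v in voltages}

for v in voltages:
    print(f"VDD = {v:.1f} V -> predicted period = {delays[v] * 1e9:.2f} ns")
```

Each predicted period would then serve as the first clock period tried in simulation at that voltage.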
Figure 4.1 shows the comparison of the calculated and measured values for minimum
test time at each voltage step. The measurement assumes that the test is only structure
constrained and hence the test runs at the functional speed. From the results it was observed
that the delay calculated using the α-power law, equation (4.1), agreed with
the measured values for supply voltage reductions of up to half of the nominal supply
voltage, beyond which the clock period had to be increased to obtain the expected results.
So the experiment provides evidence that it is safe to assume that the critical path will not
change for small reductions in supply voltage, and that for given values of K and α the delay
can be found using the approximation in equation (4.1). As described in the following
sections, this conclusion helps to obtain the optimum voltage for test time reduction in
a periodic clock test.

Figure 4.1: Comparison of the test time measured using SPICE simulations with the delay
calculated using the α-power law for the s298 ISCAS'89 benchmark circuit. The test clock period
is chosen as the functional period assuming that the test is not power constrained.
4.3 Optimum Supply Voltage
4.3.1 SPICE Experiment
In Chapter 3 it was stated that in a power constrained test, the test clock period is
limited by the maximum allowable power of the circuit. In general, the test clock period can be
related to the power as
$$P_{MAXtest} = \frac{E_{MAXtest}}{T_{power}} \;\Rightarrow\; T_{power} = \frac{E_{MAXtest}}{P_{MAXtest}} = \frac{C_L \cdot V_{DD}^2}{P_{MAXtest}} \tag{4.2}$$
where Tpower is the test clock period at a given peak power limit PMAXtest, EMAXtest is
the maximum energy dissipated in any clock cycle during the entire test, and CL is the
total capacitance switched by rising signal transitions in the clock cycle that consumes
the most energy. Since the technique is implemented for stuck-at fault tests, the signal
transitions in both scan shift and capture are accounted for to find the cycle with maximum
switching activity.
The maximum allowable power of the device is usually the maximum power dissipated
during its functional operation, for which the hardware is designed. Hence, in a power
constrained test, the maximum allowable power during test must not exceed the maximum
power dissipated during functional operation, i.e., PMAXtest ≤ PPEAKfunc (written PMAXfunc
in the equations below). The power constrained test clock period Tpower is,
$$T_{power} \ge \frac{C_L \cdot V_{DD}^2}{P_{MAXfunc}} \tag{4.3}$$
The leakage power dissipation depends on the current flow in the circuit when it is in
the steady state. Hence the power dissipation due to leakage will remain the same during
test as during functional operation [29]. Since the strategy is to lower the voltage and shrink
the test clock period, the net effect will be to reduce the leakage power as well as leakage
energy per cycle during the test. In the following analysis, the dynamic power, which is a
function of both signal transitions and short circuit power, is considered to dominate the
total power dissipation.
In this section we aim to find the best voltage at which a power constrained test can
run with minimum test time without exceeding the peak power or violating the critical path
delay constraint of the circuit. As mentioned in the previous sections, the test time can be
reduced while limiting the power by reducing the supply voltage. However, there exists a
point where the voltage is not sufficient to charge the output load capacitance within the
clock period, and the value at the output will then be wrong. At this point the circuit
is considered structure constrained and the test time depends on the critical path
delay of the circuit. The gate delay of the circuit can be characterized by using the α-power
law delay model in equation (4.1) proposed by Sakurai [50, 51]. This allows expressing the
smallest structure constrained test clock period as,
$$T_{critical} \ge K \cdot \frac{V_{DD}}{(V_{DD} - V_{TH})^{\alpha}} \tag{4.4}$$
where K is a proportionality constant, which depends upon the critical path structure, timing
margin, etc., and α is the velocity saturation index. If the test is only structure constrained
then the total test time can be given as
$$TT_{structure} = N \cdot T_{critical} \tag{4.5}$$
where TTstructure is the total test time using the structure constrained clock period Tcritical
in an N cycle test.
To minimize the test time we find the smallest test clock period, Topt, that will satisfy
the power constraint (4.3) and critical path constraint (4.4). Thus, at any given voltage the
optimum test period is given by
$$T_{opt} = \max\{\min T_{power},\ \min T_{critical}\} \tag{4.6}$$
then the minimum test time will be,
$$TT_{min} = \max(TT_{power},\ TT_{structure}) \tag{4.7}$$
The optimum voltage at which a power constrained test will run with the fastest clock
and in the least overall test time will be the voltage at which TTpower and TTstructure are
equal.
An experiment to identify the optimum voltage was performed on the s298 sequential
benchmark circuit. In order to observe the peak power dissipated by the circuit during test,
as well as the critical path delay, scan vectors generated by the ATPG for stuck-at fault
tests were combined with the vectors generated to trigger the critical path. The critical path
obtained by the static timing analysis (STA) tool for the DUT was found to be a false path;
the next longest path was found to have a delay of 0.77ns, including the
setup time. This path was used in the experiments and the value of K was calculated
accordingly. Simulations were performed using the Nanosim SPICE simulator [4] by
varying the voltage from 1.8V down to 0.6V in steps of 0.1V. The peak energy dissipated and
the critical path delay were measured at each voltage point. Using equations (4.2) and (4.5),
the values for test times were calculated, and the maximum of the two values is the total
test time as given by equation (4.7).
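The calculated side of this sweep can be sketched in a few lines. The sketch assumes the s298 parameter values that the chapter reports later in Table 4.1 (with K taken in units of 10^-9), treats them as constant over voltage, and uses hypothetical function names; it does not reproduce the SPICE measurements.

```python
def t_power(c_l, vdd, p_max):
    """Power constrained clock period, equation (4.3) taken at equality."""
    return c_l * vdd ** 2 / p_max


def t_critical(k, vdd, vth, alpha):
    """Structure constrained clock period, equation (4.4) taken at equality."""
    return k * vdd / (vdd - vth) ** alpha


def sweep_test_time(n_cycles, c_l, p_max, k, vth, alpha, voltages):
    """Return (voltage, total test time) pairs following equations (4.6)-(4.7)."""
    results = []
    for v in voltages:
        t_clk = max(t_power(c_l, v, p_max), t_critical(k, v, vth, alpha))
        results.append((v, n_cycles * t_clk))
    return results


# Assumed s298 values (Table 4.1 and the 498-cycle test set).
N, C_L, P_MAX = 498, 2.04e-12, 1.2e-3
K, V_TH, ALPHA = 0.85e-9, 0.39, 2.0

voltages = [round(1.8 - 0.1 * i, 1) for i in range(13)]  # 1.8 V ... 0.6 V
results = sweep_test_time(N, C_L, P_MAX, K, V_TH, ALPHA, voltages)
best_v, best_tt = min(results, key=lambda r: r[1])

print(f"best grid voltage = {best_v} V, test time = {best_tt * 1e6:.3f} us")
```

On a 0.1V grid the minimum falls at the grid point nearest the crossover, so a finer grid or a direct solution of the crossover condition is needed to pin down the optimum more precisely.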
Figure 4.2 shows the simulated and calculated results used to find the optimum
voltage. The point labeled minimum VDD is the cross point at which the power-constrained
test clock period and the critical path delay of the circuit are approximately
the same. Reducing the voltage below this point increases the test time, as the critical
path delay rises above the power-constrained clock period. Hence, the test must slow
down. The dotted line, labeled "measured", shows the result from simulation. The best
voltage is 1.07V, with a total test time of approximately 1 µs and a test clock frequency of
532MHz, which is the same as the functional frequency. The minimum of the "measured" curve
exactly coincides with the cross-over point of the two calculated curves from equations (4.2)
and (4.5). This validates that the best supply voltage for minimum test time
can be obtained from equation (4.7). Thus, the SPICE simulation, which is expensive for a large
circuit, is not required at many voltages. Once EMax and tpd have been obtained at one
voltage (say, the nominal voltage, 1.8V in our example), both equations (4.2) and (4.5) can
be characterized. The optimum voltage can also be obtained directly for given values of K,
α, switched capacitance CL and the rated power for the device PPEAKfunc.

Figure 4.2: Simulation and experimental test time plots to find the optimum voltage for s298
benchmark circuit.
4.3.2 Polynomial Equation to Obtain Minimum Supply Voltage
From equation (4.3) we observe that Tpower decreases as the voltage is reduced, whereas from
equation (4.4) Tcritical increases as the voltage is reduced. Thus, if we plot equations (4.3)
and (4.4) with respect to voltage, the two functions will cross each
other at a point as the voltage is reduced. The voltage VDDopt at which the test time is
minimum must satisfy:
$$T_{opt} = T_{power} = T_{critical} \tag{4.8}$$
This relation was evident in Figure 4.2. To obtain a straightforward solution for the optimum
voltage problem we make the following assumptions in our analysis:
1. Variation in threshold voltage VTH due to changes in supply voltage is not drastic and
   VTH is assumed to be constant for the supply voltage interval of interest.
2. The critical path remains unchanged as supply voltage changes. Thus, the value of K is
   assumed to be independent of the supply voltage.
We equate the right hand sides of equations (4.3) and (4.4) according to (4.8) and substitute
VDD = VDDopt:
$$T_{opt} = \frac{C_L \cdot V_{DDopt}^2}{P_{MAXfunc}} = \frac{K \cdot V_{DDopt}}{(V_{DDopt} - V_{TH})^{\alpha}} \tag{4.9}$$
We make two useful observations about the test conducted at supply voltage VDDopt that
satisfies equation (4.9):
• For shortest test time, the test clock period Topt is the minimum allowed by the critical
  path delay at VDDopt.
• The maximum power for a test cycle, CL × VDDopt^2 / Topt, equals the peak power
  specification PMAXfunc.
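These observations help us experimentally find the optimum test time parameters. To
analytically obtain VDDopt we derive a polynomial equation:
$$V_{DDopt}^{\frac{1}{\alpha}+1} - V_{TH} \cdot V_{DDopt}^{\frac{1}{\alpha}} - \left(\frac{K \cdot P_{MAXfunc}}{C_L}\right)^{\frac{1}{\alpha}} = 0 \tag{4.10}$$
or
$$V_x^{\alpha+1} - V_{TH} \cdot V_x - \beta = 0 \tag{4.11}$$
where $V_x = V_{DDopt}^{1/\alpha}$ and $\beta = \left(\frac{K \cdot P_{MAXfunc}}{C_L}\right)^{1/\alpha}$.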
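The intermediate algebra between equations (4.9) and (4.11) is not spelled out in the text. One way to fill it in, assuming VDDopt > VTH so that the α-th root is well defined, is:

```latex
\begin{align*}
\frac{C_L \, V_{DDopt}^2}{P_{MAXfunc}}
  &= \frac{K \, V_{DDopt}}{(V_{DDopt} - V_{TH})^{\alpha}}
  && \text{equation (4.9)} \\
V_{DDopt} \, (V_{DDopt} - V_{TH})^{\alpha}
  &= \frac{K \, P_{MAXfunc}}{C_L}
  && \text{cross-multiply, divide by } C_L V_{DDopt} \\
V_{DDopt}^{1/\alpha} \, (V_{DDopt} - V_{TH})
  &= \left(\frac{K \, P_{MAXfunc}}{C_L}\right)^{1/\alpha}
  && \text{take the } \alpha\text{-th root} \\
V_{DDopt}^{\frac{1}{\alpha}+1} - V_{TH} \, V_{DDopt}^{\frac{1}{\alpha}}
  - \left(\frac{K \, P_{MAXfunc}}{C_L}\right)^{1/\alpha}
  &= 0
  && \text{equation (4.10)} \\
V_x^{\alpha+1} - V_{TH} \, V_x - \beta
  &= 0
  && \text{with } V_x = V_{DDopt}^{1/\alpha} \text{, equation (4.11)}
\end{align*}
```

The last step uses $V_{DDopt} = V_x^{\alpha}$, so that $V_{DDopt}^{1/\alpha + 1} = V_x^{\alpha + 1}$.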
Since α = 1 when the device is completely velocity saturated and α = 2 if the device
has no velocity saturation [50,51], equation (4.11) is a polynomial of degree three or lower,
which is solvable. Knowing the voltage VDDopt for the shortest test time, the corresponding
shortest test clock period can be obtained from (4.9) as,
$$T_{opt} = \frac{C_L \cdot V_{DDopt}^2}{P_{MAXfunc}} \tag{4.12}$$
The optimum test frequency is then
$$f_{opt} = \frac{1}{T_{opt}}$$
Here fopt is the highest power constrained test frequency.
The polynomial equation (4.11) can be solved using any mathematical solver such as
MATLAB. The values for K, α, PMAXfunc and the effective maximum switched load capacitance
CL during any test cycle can be obtained through simulation at nominal voltage.
4.3.3 Solving for VDDopt, fopt and TTopt
The optimum voltage VDDopt is the minimum voltage at which the test can run
fastest without exceeding the maximum power limit of the device and without becoming
structurally constrained by the increase in critical path delay caused by scaling the supply
voltage.
Let us solve for the optimum voltage VDDopt using equation (4.11) for the s298
ISCAS'89 sequential benchmark circuit synthesized for scan test in TSMC 180nm technology.
The DUT is synthesized using the Mentor Graphics Leonardo Spectrum tool [7]. The nominal
voltage for this technology is 1.8V and the threshold voltage is 0.39V. The critical path
delay obtained through static timing analysis (STA) using Leonardo Spectrum [7] was 1.5ns,
corresponding to 666MHz. To find VDDopt using equation (4.11) we need values for the
proportionality constant K, the maximum allowable power limit PMAXfunc and the maximum
switched capacitance CL, which together determine β.
The alpha power law model given in equation (4.4) is an approximate method to find
the critical path delay for any circuit for a given supply voltage and threshold voltage. The
value for α, the velocity saturation index, in equation (4.4) ranges between 1 and 2 [50,51]
and can be found using methods described in [14] and [51]. It can also be obtained from
a simple curve fit to delay values at different voltages for a chain of inverters. In our
experiment for 180nm technology, the value for α was found to be 2 using the latter method.
We can now rewrite equation (4.4) to find the value for K as follows:
$$K = \frac{T_{critical} \cdot (V_{DD} - V_{TH})^{\alpha}}{V_{DD}}$$
To trigger the critical path for observing the delay we obtained a path delay vector set using
Mentor Graphics Fastscan [5] for the path used in Section 4.3.1. The STA delay for this path
was 0.77ns including setup time. Post synthesis timing simulation of the DUT using
Mentor Graphics Modelsim with a period of 0.77ns was found to pass the test. The value
of the proportionality constant K for this path was calculated to be 0.85 (in units of 10^-9,
as listed in Table 4.3). The value of K depends on the critical path of the circuit; hence,
based on assumption 2 in Section 4.3.2, the value of K is kept constant.
The maximum allowable power limit for a circuit is normally given as a specification in
the datasheet. In a power-constrained test the power dissipated during test must be kept
under that limit. In the absence of a known power limit for our DUT, we determined the
maximum allowable power by simulating 1,000 random patterns in functional mode and
measuring the power dissipated per cycle using the Synopsys Nanosim transistor level simulator
at the nominal voltage of 1.8V and a frequency of 500MHz. The maximum power over the
entire functional operation is assumed to be the upper bound for the power during test. For
the DUT in this example the upper bound was measured as 1.2mW.
The next unknown in equation (4.10) is the maximum switched capacitance CL. It is
defined as the effective switched load capacitance of the circuit during the maximum rising
signal transitions caused by any test cycle. The energy consumed during that cycle is,
$$E_{MAXtest} = C_L \cdot V_{DD}^2$$
where CL is the maximum switched capacitance of the test pattern that causes the most rising
signal transitions. Therefore,
$$C_L = \frac{E_{MAXtest}}{V_{DD}^2}$$

Table 4.1: Parameter values for s298 benchmark synthesized in 180nm CMOS technology
(VDD = 1.8V, VTH = 0.39V, Critical Delay = 0.77ns).
Parameter     Value
PMAX(func)    0.0012W
CL            2.04pF
K             0.85
α             2

The value of EMAXtest can be obtained by simulating the test patterns at any arbitrary
(slow) frequency f and measuring the maximum power PPEAKfunc for a clock cycle, i.e.,
$$E_{MAXtest} = \frac{P_{PEAKfunc}}{f}$$
where f is any frequency slower than the maximum allowed by the critical path.
Once the value for EMAXtest is obtained, CL can be obtained from the equation above.
For the DUT in this example the value for CL is obtained as 2.04pF. Table 4.1 summarizes
the values obtained in this manner. Substituting these values into the expression for β
following equation (4.11) we get β = 0.7158 and the equation becomes,
$$V_X^3 - 0.39\,V_X - 0.7158 = 0$$
We use a numerical solver in MATLAB to find the roots for VX. We obtain 3 roots, two
complex and one real. Since the supply voltage is a real number, we consider
only the real root and discard the two complex roots. Solving for VDDopt from VX we get
VDDopt = 1.0727V. This is the optimum voltage at which the test can run at maximum speed.
Since at this voltage the test is still power constrained, we can calculate Topt
from equation (4.3) with Ttest = Topt, which gives us Topt = 1.95ns ⇒ fopt ≈ 512MHz.

Figure 4.3: Simulated and calculated curves using test period and functional period at
various voltages. The direct approach using MATLAB (circled) matches the cross point of
the curves obtained analytically using the periods calculated from equations (4.3) and (4.4)
and the results obtained from SPICE ("plus" data points) in [66].

The total test time for the DUT can be calculated as
$$TT_{opt} = N \cdot T_{opt}$$
where N is the total number of test cycles. For the DUT in this example N = 498, hence
the total test time is TTopt = 0.971 µs.
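As a cross-check on this worked example, the crossover condition of equation (4.9) can also be solved numerically. The sketch below is not the MATLAB polynomial solver used in the text: it assumes a simple bisection on V × (V − VTH)^α = K × PMAXfunc / CL, uses the Table 4.1 values with K taken as 0.85 × 10^-9, and the function name solve_vdd_opt is hypothetical. With these rounded inputs K × PMAXfunc / CL evaluates to 0.5, so β comes out near 0.707 rather than the 0.7158 quoted above, presumably because of rounding in the tabulated values; the root nevertheless lands close to the reported 1.0727V.

```python
def solve_vdd_opt(k, p_max, c_l, vth, alpha, v_hi, tol=1e-9):
    """Solve V * (V - vth)**alpha = k * p_max / c_l for V in (vth, v_hi]."""
    target = k * p_max / c_l

    def g(v):
        return v * (v - vth) ** alpha - target

    if g(v_hi) < 0:
        # Power limit is not binding even at v_hi: no crossover in range.
        raise ValueError("no crossover at or below v_hi")
    lo, hi = vth, v_hi  # g(lo) = -target < 0 and g(hi) >= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed s298 values (Table 4.1), N = 498 test cycles.
K, P_MAX, C_L = 0.85e-9, 1.2e-3, 2.04e-12
V_TH, ALPHA, N = 0.39, 2.0, 498

vdd_opt = solve_vdd_opt(K, P_MAX, C_L, V_TH, ALPHA, v_hi=1.8)
t_opt = C_L * vdd_opt ** 2 / P_MAX  # equation (4.12)
f_opt = 1.0 / t_opt
tt_opt = N * t_opt

print(f"VDDopt = {vdd_opt:.4f} V, Topt = {t_opt * 1e9:.3f} ns")
print(f"fopt = {f_opt / 1e6:.1f} MHz, TTopt = {tt_opt * 1e6:.3f} us")
```

Bisection is used here because the left-hand side increases monotonically for V > VTH, which also covers non-integer values of α where equation (4.11) is no longer a polynomial.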
Figure 4.3 shows the calculated test time plots using equations (4.3) and (4.4), at various
voltages for the s298 benchmark circuit. The circled data point indicates the optimum voltage
value obtained from the numerical analysis. The values measured from SPICE at various
voltages in Section 4.3.1 are shown in the curve "Spice Measurement". It is readily observed
from the graph that the numerical analysis to obtain the optimum voltage is in accordance
with the SPICE measurement.

Table 4.2: Optimum VDD for reduced test time of ISCAS'89 benchmark circuits.
Circuit  Total scan   Peak per-cycle  Nominal voltage (1.8V)  Optimum test voltage             Test time
name     test cycles  power (W)       Freq. (MHz)  Time (µs)  VDD (V)  Freq. (MHz)  Time (µs)  reduction (%)
s298     450          0.0012          216.3        2.08       1.1      562          0.8        61.5
s298     498          0.0012          187          2.66       1.07     512          0.971      63.0
s298     540          0.0012          184          2.92       1.07     529          1.02       65.0
s382     704          0.0029          292          2.41       1.44     457          1.54       36
s713     810          0.0015          137.28       5.9        1.33     249.23       3.25       45.0
s1423    6975         0.0045          135.9        51.3       1.6      148.4        47.0       13.0
s1423    6975         0.0030          90.58        77         1.49     132.1        52.7       31.5
s13207   62237        0.0213          168          369        1.62     208.8        298        19.2
s15850   101708       0.240           67.35        1510       1.30     128.9        789        47.7
s38584   224113       0.350           88.54        2531       1.30     172.3        1300       48.6

4.4 Results
The procedure described in this section was repeated for several ISCAS'89 benchmark
circuits and the results are tabulated in Table 4.2. It gives the results based on Nanosim
transistor level [4] simulation. The values differ slightly from [66]; the difference lies in
the number of cycles and the peak power used. The vectors and the power values are chosen
such that they are consistent throughout this document. Though the results vary, both results
are valid for the given peak power and vectors. In the SPICE simulations, the optimum voltage
is obtained through transistor level simulations at closely spaced voltages to find the point
before the circuit becomes structure constrained. In Table 4.2 it was observed that, if the
chosen PPEAKfunc is close to the power dissipated during test, then the reduction in test time
obtained using reduced voltage is small. This is because, when the power dissipated
by the test is close to the rated power, the test runs at a speed close to the functional
speed and any reduction in supply voltage makes the test structure constrained. This is
seen in circuits s1423 and s15850. On the other hand, if the power dissipated during test is
significantly greater than the rated power then a significant reduction in test time is observed,
as in s298 and s382. In most circuits today the test power is 2× the functional power in
CPUs [47] and 4× the functional power in GPUs [69]; hence a significant reduction in test
time is attainable.
Using the polynomial method explained in Section 4.3.2, the optimum voltage was
obtained analytically for the same benchmark circuits used in the SPICE experiments
described above. The results, shown in Table 4.3, correlate well with the simulation results
in Table 4.2. Thus, knowing the design specifications, namely the velocity saturation index
α, the proportionality constant K, the maximum allowable power PPEAKfunc, and the maximum
switched capacitance CL obtained from the maximum energy cycle, the optimum voltage can be
obtained in less time using the polynomial equation (4.11). Note that to solve the
polynomial equation (4.11) for a given PMAXfunc, we need to simulate the circuit only once
at the nominal voltage to find the constants. For instance, if the optimum voltage using
SPICE simulations is found after 10 simulations, the time taken to obtain the optimum
supply voltage and test time using the numerical analysis is reduced to 1/10 of that.
It should be noted that since the test patterns generated for the periodic test were able to
find all faults in the optimum VDD tests, the defect and fault coverage for stuck-at tests
should be the same as in the periodic test at nominal voltage.
4.5 Peak Power and Critical Path Frequency Measurements
The scenario described in this chapter was used to perform experiments on an FPGA
configured with a benchmark circuit. The motive of this experiment is to observe
experimentally the effect of scaling supply voltage on power and frequency. Due to the absence
of a method to measure power on the fly using the Advantest T2000GS ATE, bench test
equipment by National Instruments [2] is used.

Table 4.3: Analytically obtained VDDopt and fopt for minimum scan test time of ISCAS'89
circuits in 180nm CMOS (α = 2, VTH = 0.39V).
Circuit  K          CL       N        PMAXfunc   Nominal voltage (1.8V)  Optimum voltage                    Test time
name     (×10^-9)   (pF)     (cycles) (W)        Freq. (MHz)  Time (µs)  VDDopt (V)  fopt (MHz)  Time (µs)  reduction (%)
s298     0.85       1.76     450      0.0012     216.3        2.08       1.1         562         0.8        61.5
s298     0.85       2.04     498      0.0012     187.0        2.66       1.073       512         0.971      63.0
s298     0.85       2.06     540      0.0012     184.0        2.92       1.07        529         1.02       65.0
s382     1.75       3.07     704      0.0029     292.0        2.41       1.44        434         1.54       36.0
s713     2.79       3.36     810      0.0015     132.28       5.90       1.34        249.23      3.25       45.0
s1423    6.38       10.22    6975     0.0045     135.9        51.30      1.66        158.5       44         14.2
s1423    6.38       10.22    6975     0.0030     90.58        77.00      1.49        131.7       52.9       31.2
s13207   4.64       38.9     62237    0.0213     168          369        1.62        208.8       298        19.2
s15850   5.12       109      101708   0.0240     67.35        1510       1.31        128.9       789        47.7
s38584   4.03       156.8    224113   0.0350     88.54        2531       1.31        172.4       1300       48.6

4.5.1 Hardware Setup
National Instruments Educational Laboratory Virtual Instrumentation Suite II+ (NI ELVIS) [2]
serves equally well as a bench-top test instrument and prototyping board. We used NI ELVIS
to measure peak power per cycle and the maximum circuit test frequency for a given supply
voltage. The circuit used for the measurements was the Altera DE2 Field Programmable
Gate Array (FPGA) board [12]. The DE2 board houses an Altera Cyclone-II 2C35 FPGA.
Benchmark circuit s298 was programmed on this FPGA. Figure 4.4 shows the test setup
for the power and maximum test frequency measurements. The DE2 board was powered
through the variable power supply available on NI ELVIS. For the s298 circuit, all inputs
and outputs, including scan-in and scan enable, were configured as external pins of the DE2
board. These pins were in turn connected to the programmable digital Input/Output (IO)
pins available on NI ELVIS. The test program was written in LabVIEW [1] on a PC, and
the test patterns were sent to NI ELVIS through a Universal Serial Bus (USB) connection.
Stored test patterns were then applied to the circuit under test (in our case the DE2 board)
from NI ELVIS, and the response was captured and compared for every test vector.
Figure 4.4: Test setup for measuring peak power per cycle and maximum test frequency for
an Altera DE2 FPGA board (with all its peripherals) using the NI ELVIS II+ bench-top
prototyping board.
4.5.2 Peak Power and Frequency Measurements
Figure 4.5 shows the peak power per cycle and maximum test frequency, plotted as
a function of the supply voltage. As the DE2 board comprises a number of peripherals,
like the seven-segment display, several LEDs, several different IO drivers, etc., the absolute
power numbers measured from the supply voltage and current product will be dominated by
these peripheral components rather than the actual circuitry on the FPGA. We, therefore,
corrected the measured supply power by removing the steady state power component in
each cycle. The remaining power component, which is the switching (or dynamic) power, is
presumably dominated by CMOS circuitry on the FPGA. The dynamic power curve is shown
in blue with circular markers at the measured voltage points on the graph in Figure 4.5. We
found that the peak dynamic power per cycle increases as the square of the supply voltage in
the range of 1.8V to 5.4V, in good agreement with theory. For supply voltages below 1.8V, even
very low test frequencies result in erroneous outputs, which is plausible since the nominal
voltage specified for the board is 3.3V, and one or more of the IO drivers may not be
operational at voltages below 1.8V. Even though the commonly used nominal supply voltage for
CMOS logic circuits at the 90nm technology node is about 1.2V, we could only control the supply
to the DE2 FPGA board in the range 1.8V to 5.4V. Because these tests were intended for the
s298 circuit implemented on the FPGA chip but were applied through edge connectors and
other logic on the board, the whole process ran essentially like a board test rather than a
chip test.

Figure 4.5: Measured values of maximum power consumed per cycle (in blue) and maximum
test frequency (in green) plotted as a function of the supply voltage for the Altera DE2
FPGA board tested using NI ELVIS II+ bench-top prototyping board. Switching power is
dominated by the CMOS circuitry contained on the board. The FPGA itself is programmed
with the function of s298 benchmark with scan.
The maximum test frequency, in practice, is limited by the structural critical path delay
of the circuit; however, in the current setup, it is limited by the speed of the IO drivers on the
FPGA board and the maximum allowable sampling frequency of NI ELVIS. The maximum
test frequency at each supply voltage also corresponds to the frequency at which maximum
power per cycle is dissipated. This curve is shown in green with diamond markers at the
measured voltage points in Figure 4.5. The maximum operating frequency at each supply
voltage step was found by starting at an initial frequency and increasing it until the
circuit output no longer matched the ideal output. The highest frequency at which
the circuit output matches the ideal output is taken as the peak operating frequency.
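The frequency search described above can be sketched as a simple loop. This is an illustration only: the passes_at callback stands in for applying the test patterns on the bench setup and comparing responses, and the start frequency, step size and upper limit are hypothetical parameters, not values from the experiment.

```python
def max_passing_frequency(passes_at, f_start, f_step, f_limit):
    """Raise the test frequency until the output mismatches.

    Returns the highest tested frequency that still passed, or None if
    the circuit fails even at f_start.
    """
    best = None
    f = f_start
    while f <= f_limit and passes_at(f):
        best = f
        f += f_step
    return best


# Stand-in for the hardware: a hypothetical device that passes up to 16.4 kHz.
def fake_dut(f_hz):
    return f_hz <= 16.4e3


result = max_passing_frequency(fake_dut, f_start=1e3, f_step=100.0, f_limit=50e3)
print("peak operating frequency:", result, "Hz")
```

Repeating this search at each supply voltage step would give one point per voltage on the frequency curve.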
4.5.3 Minimizing Test Time for Given Peak Power Limit
For a circuit under test with a given peak power limit, PMAXfunc, the experimental
data of Figure 4.5 readily gives both the supply voltage VDDopt and test frequency fopt
that minimize the test time of the power constrained test. This is done by using the two
observations made following equation (4.9). For example, suppose we have a peak power
limit PMAXfunc = 0.5mW. At the nominal supply voltage of 3.3V, the test power dissipation
is 1.428mW and the maximum structural clock frequency is 16.4kHz. To keep the test power
under 0.5mW, the test must be run at 16.4 × 0.5/1.428 = 5.74kHz. From Figure 4.5, for
PMAXtest = PMAXfunc = 0.5mW, we should lower VDD to VDDopt = 2.5V, which gives a test
frequency of fopt = 12.5kHz. Thus, the test time is reduced to a fraction 5.74/12.5 = 0.46
of its value at the nominal voltage.
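The arithmetic of this example can be reproduced in a few lines. The sketch below assumes, as the example does, that at a fixed supply voltage the test power is proportional to the clock frequency; the function name is ours, and the 12.5kHz value is simply the reading quoted from Figure 4.5.

```python
def power_limited_frequency(f_max_hz: float, p_test_w: float, p_limit_w: float) -> float:
    """Scale the clock down so that power, taken as proportional to frequency
    at a fixed supply voltage, just meets the limit."""
    return f_max_hz * min(1.0, p_limit_w / p_test_w)


# Values quoted in the example above (nominal 3.3 V supply).
f_nominal = power_limited_frequency(16.4e3, 1.428e-3, 0.5e-3)   # about 5.74 kHz
f_opt = 12.5e3                                                   # read from Figure 4.5 at 2.5 V

# Test time is inversely proportional to the test frequency.
time_ratio = f_nominal / f_opt                                   # about 0.46
```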
Chapter 5
Dynamic Scaling of Test Clock Period
5.1 Aperiodic Clock Test
In Chapter 4 it was proposed that, for a periodic clock test where the test clock period
is held constant throughout the test, the test time can be reduced by choosing an optimum
supply voltage that is lower than the nominal supply voltage. While using a periodic clock
test, at any voltage the dynamic power dissipation for a given test pattern set is not constant
and randomly varies throughout the test, based on the switching activity caused by the test
patterns during each cycle. If every cycle in a periodic clock test can be modified such
that the power remains constant and within the allowed limit for the entire test by choosing
unique clock periods, then it will be possible to reduce test time significantly. This method of
testing is termed the aperiodic clock test, where the period of each cycle can be different
from the period of its neighboring cycle. This was briefly described in Section 3.2.2. In
an aperiodic clock test, it is possible for every clock cycle in a test to be either structure
constrained or power constrained. The test clock period of an aperiodic test can be given
by,
\[
T_i = \max\left\{ T_{structure},\; \frac{E_i}{P_{PEAKfunc}} \right\} \qquad (5.1)
\]
For the stuck-at fault tests, the capture cycle will also be at the same clock period as
the scan shift cycles; hence the period for the capture cycle can also be reduced based on the
power dissipated during that cycle. When considering delay testing, whether we
use single clock capture (launch-on-shift) or double clock capture (launch-on-capture), the test
cycle period for the shift cycles can be modified and the capture cycles can be left unaltered,
since the capture cycles use the functional clock period to identify delay faults.
Using equation (5.1) the total test time for an aperiodic clock test can be given as,
\[
TT_{asynch} = \sum_{i=1}^{N} \max\left\{ T_{structure},\; \frac{E_i}{P_{PEAKfunc}} \right\} \qquad (5.2)
\]
where TTasynch is the aperiodic test time for a test with N cycles.
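Equations (5.1) and (5.2) translate directly into code. The following is a minimal sketch with made-up per-cycle energies; the function names are ours, and the periodic test time is taken here as N times the longest per-cycle period, which is our reading of the periodic clock test rather than a formula quoted from this section.

```python
from typing import Sequence


def aperiodic_periods(
    cycle_energy_j: Sequence[float],
    t_structure_s: float,
    p_peak_func_w: float,
) -> list[float]:
    """Per-cycle clock periods from equation (5.1): max(T_structure, E_i / P_PEAKfunc)."""
    return [max(t_structure_s, e / p_peak_func_w) for e in cycle_energy_j]


def aperiodic_test_time(cycle_energy_j, t_structure_s, p_peak_func_w) -> float:
    """Total aperiodic test time from equation (5.2): the sum of the per-cycle periods."""
    return sum(aperiodic_periods(cycle_energy_j, t_structure_s, p_peak_func_w))


def periodic_test_time(cycle_energy_j, t_structure_s, p_peak_func_w) -> float:
    """Periodic test time (assumption): every cycle uses the period of the worst-case cycle."""
    return len(cycle_energy_j) * max(
        aperiodic_periods(cycle_energy_j, t_structure_s, p_peak_func_w)
    )


# Hypothetical data: four cycles, 1 mW power limit, 2 ns critical path.
energies = [1e-12, 4e-12, 6e-12, 3e-12]      # joules per cycle
periods = aperiodic_periods(energies, 2e-9, 1e-3)
tt_aperiodic = aperiodic_test_time(energies, 2e-9, 1e-3)
tt_periodic = periodic_test_time(energies, 2e-9, 1e-3)
```

In this toy data the first cycle is structure constrained (its power-limited period of 1ns is shorter than the 2ns critical path), while the other three are power constrained.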
5.1.1 A Circuit Example
We examine the proposed aperiodic clock test using an ISCAS'89 sequential benchmark
circuit. For simplicity, let us choose the s298 benchmark circuit that contains 14 flip-flops,
3 primary inputs and 6 primary outputs. The circuit is synthesized using Mentor Graphics
Leonardo Spectrum [7] with TSMC 180nm technology. The Spectrum tool also provides the
critical path delay via static timing analysis (STA) of the circuit. More accurate critical path
delay information can be obtained after the routing of the circuit with inserted scan chains.
Statistical static timing analysis (SSTA) can also be used to consider process variations
during delay calculations [9,28]. All flip-flops in the circuit were daisy chained to form a single
scan chain, using Mentor Graphics DFT Advisor. Once the scan chain was inserted, a set
of deterministic ATPG test patterns for stuck-at faults was generated using Mentor
Graphics Tessent Fastscan [5]. A transistor level simulation was performed using Synopsys
Nanosim [4] at the nominal voltage of 1.8V. The transistor level description of the netlist
was generated using Mentor Graphics Design Architect and the SPICE file was imported
into Nanosim. Using Nanosim the energy dissipated per cycle during the entire test was
measured. Based on the report obtained through transistor level simulation, we determined
the test period for each cycle. For each cycle the test period would be constrained both
by structure, as given by STA, and by maximum rated power. The maximum rated power
depends on the functional characteristics, physical design, packaging, etc., and is part of the
specification of the circuit. In the absence of available data, for our analysis we measured
the maximum power in functional mode through simulation of 1,000 random vectors, which
was 1.23mW. Once the time period for each cycle was obtained, the circuit was simulated
again to calculate the power dissipated during each test cycle.

Figure 5.1: Periodic and aperiodic clock simulation of 450-cycle scan test of ISCAS'89 benchmark
circuit s298. Periodic test clock frequency is 240MHz and test time is 1.87µs. Aperiodic
clock test time is 1.31µs.

Figure 5.1 shows the simulation results for the s298 benchmark circuit. The plot compares
the test performed using periodic (fixed) and aperiodic (varying) clock periods. The
x-axis shows time as the test was run and the y-axis shows the power dissipated during each
test cycle. As observed from the figure, when a periodic clock is used the power dissipated
during most cycles does not reach the maximum rated power. Hence the test
clock periods for cycles dissipating less power can be safely reduced until the cycle power
is close to the rated power. This effect is seen in the simulation results using the aperiodic
clock. When a particular cycle dissipates low power, the period is reduced such that the
power for that cycle increases to a value closer to the rated power. However, if
the period would become shorter than the critical path delay, then the period is set to the
value of the critical path delay. Thus, we ensure that the power constrained period is as small
as possible without violating any timing constraints. This limitation on the minimum period forces
the circuit to dissipate significantly less than the rated power in such cycles, hence the "dips" in the
aperiodic clock plot.
In this example, the total test time with the periodic clock was 1.87µs and the test
time with the aperiodic clock was approximately 1.31µs. This represents a reduction of 30%. Greater
reduction is achievable if the average power of the entire test is significantly lower than the
maximum power. Thermal analysis [70] and characterization of test power can be performed
to determine a safe operating point for testing, and the test can be modified appropriately
so that, if thermal issues are a concern, the method can be used at the safe operating
point.
5.1.2 Simulation Results
Table 5.1 shows the simulation results for several ISCAS'89 benchmark circuits using
the procedure described in Section 5.1.1. All circuits were synthesized using TSMC 180nm
technology. The nominal supply voltage for this technology is 1.8V. For s298, three different
sets of test patterns were used to observe the effect of test power while
reducing test time. This is discussed later in this section.
Table 5.1: Scan test time for ISCAS'89 circuits in TSMC 180nm technology.

Circuit  Total scan   Per cycle   Max per cycle  Total energy  Periodic   Periodic    Asynchronous  Test time
name     test clock   peak power  energy         of test       clock      clock test  clock test    reduction
         cycles       PPEAKfunc   EMAX(test)     ETOTAL        frequency  time        time
                      (W)         (pJ)           (nJ)          (MHz)      (µs)        (µs)          (%)
s298     450          0.0012      5.71           1.83          216.34     2.08        1.48          28.5
s298     498          0.0012      6.61           1.83          187.2      2.66        1.49          44.0
s298     540          0.0012      6.68           1.89          184.9      2.92        1.54          47.2
s382     704          0.0029      9.96           4.69          100        2.41        1.62          32.7
s713     810          0.0015      10.9           4.22          137.28     5.9         2.83          52.0
s1423    6975         0.0045      33.14          166.68        135.96     51.3        39.8          22.4
s1423    6975         0.0030      33.14          166.68        90.58      77          55.6          28.0
s13207   62237        0.0213      126.3          660           168.66     369         312.2         15.0
s15850   101707       0.0240      213.8          2610          67.35      1510        1088.0        27.8
s38584   224112       0.0350      609.8          9470          88.54      2531        2101.5        17.0

Column 2 of Table 5.1 shows the number of scan test clock cycles used for each circuit.
This is determined by the number of flip-flops in the scan chain and the total number
of vectors, along with one cycle per vector for capture. Since vectors were generated for
stuck-at faults, only one capture cycle is used for response capture at the end of each test.
The maximum rated power (PPEAKfunc) shown in column 3 is normally given in the circuit
datasheet. However, for these benchmark circuits we obtained a value by simulating the DUT
in functional mode at its fastest frequency for 100 random vectors. In some cases the power
value thus obtained might be close to the power calculated during test, but the benefit of employing an
aperiodic clock to reduce test time can still be shown. Column 4 shows the maximum energy
(EMAXtest) dissipated due to signal transitions in the clock cycle that consumes the most
energy. Column 5 shows the total energy (ETOTAL) consumed by the entire test. These were
obtained by simulation as discussed in Section 5.1.1. For s1423 an additional experiment
was performed with a value of 3.0mW, which is less than the simulated peak value, to
observe the effect of power on test time reduction without a change in the energy dissipation.
Columns 6 and 7 give the test frequency and test time for the periodic clock test.
The periodic clock period TPOWER is obtained from equation (3.11), using the data from
columns 3 and 4. The test frequency in column 6 is 1/TPOWER. The total test time for the
periodic clock, in column 7, is calculated using equation (3.15). Column 8 shows the total
test time taken when an aperiodic clock is used, and the corresponding test time reduction
over that of column 7 for the periodic clock is given in column 9.
An interesting observation here is that the aperiodic to periodic test time ratio for power
constrained testing is the ratio of the average energy to the maximum energy per cycle. For
example, consider s298 in the first row of Table 5.1. The average energy per cycle is EAVG =
1.83nJ/450 = 4.067pJ. The ratio EAVG/EMAX(test) = 4.067/5.71 = 0.71 is about the
same as the test time ratio 1.48/2.08. In cases where a significant number of clock cycles are
structure constrained, the test time ratio may move toward unity. If most cycles consume
significantly less energy than a few cycles that consume very high energy, then it is
possible to achieve a large reduction in test time. This is because, based on equation (5.1),
all low energy cycles will only be limited by the critical path delay and only those cycles
that consume high energy will run at the slowest clock period. On the other hand, if the
number of cycles consuming very high energy is significantly larger than the number of cycles
consuming low energy, then the reduction in test time will be less. This effect was examined
for the s298 circuit in Table 5.1. Using alternative sets of vectors with one test pattern
having high energy consuming cycles and the rest of the patterns consuming low energy, the
test time reduction improved from 28% to 47% for s298.
Once again, since the test patterns generated for the periodic test were able to find all faults
in the aperiodic test, the defect and fault coverage for the stuck-at fault test should be the same as
in the periodic test.
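The energy-ratio observation above can be checked numerically. The sketch below uses the first s298 row of Table 5.1 and assumes that every cycle is power constrained, so that each aperiodic period is E_i/P and the periodic period is EMAX/P; the function name is ours.

```python
def power_constrained_time_ratio(e_total_j: float, n_cycles: int, e_max_j: float) -> float:
    """Aperiodic/periodic test time ratio when every cycle is power constrained.

    Aperiodic time = sum(E_i) / P and periodic time = N * E_max / P, so the
    power limit P cancels and the ratio is E_avg / E_max.
    """
    e_avg = e_total_j / n_cycles
    return e_avg / e_max_j


# First s298 row of Table 5.1: E_TOTAL = 1.83 nJ, 450 cycles, E_MAX = 5.71 pJ.
energy_ratio = power_constrained_time_ratio(1.83e-9, 450, 5.71e-12)   # about 0.71
measured_ratio = 1.48 / 2.08                                           # about 0.71
```

Structure constrained cycles lengthen the aperiodic test without changing the periodic one, which is why the measured ratio can only drift toward unity from this value.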
5.1.3 Test Programming on ATE at Nominal Voltage
Experimental Setup
The aperiodic clock technique was experimentally verified on the Advantest T2000GS
ATE at Auburn University. The ATE can be operated at a maximum speed of 250MHz and
has 128 bi-directional tester channels. The power supply to the DUT is provided by the
ATE through a digital power supply module, DPS500mA, which has a supply voltage range of
-2 to 8V and an output current range up to 500mA. The test plan is programmed using the
native Open architecture Test system Programming Language, OTPL for short. Provisions to
place a chip on the tester head are available. For our experiments with benchmark circuits,
we used a Xilinx Spartan 3 FPGA XC3S50 soldered on a printed circuit board. The DUT
used for our experiment was the s298 benchmark circuit with daisy chained mux-type scan
flip-flops configured on the FPGA. The FPGA is configured on the fly by the ATE using
the bit file generated by the Xilinx ISE tool [39].
The ATE has a frame processor and a pattern generator, which are synchronized with
the rate generator. The rate generator generates a fixed rate clock pulse and triggers the
pattern at the start of each pulse. Based on the waveform set by the frame processor and
the corresponding pattern value, the pattern is applied to the DUT mounted on the tester
head. The test plan for the FPGA consists of three steps. The first two steps account for
the configuration of the FPGA using the ATE. In the first step, the FPGA is powered by
the ATE with a supply voltage of 2.5V and the configuration memory is cleared during this
process. The second step downloads the bit file generated by Xilinx ISE using the slave serial
mode. In this mode, the configuration data is provided through the DIN input pin of the
FPGA and clocked externally using the ATE. A successful configuration of the FPGA is
indicated by a High output value on the DONE pin. The third step performs the functional
test on the DUT now configured on the FPGA.
External Test
The clock period required for the scan-based functional test is determined prior to the
external testing. The tester framework allows only four unique clock
periods for each test flow, which limits the granularity of the period variations. Hence,
the periods for each test cycle are obtained through simulations and split into four groups.
The latency of the analog measurement modules is included in the selected period. The
longest cycle period corresponds to the pulse width determined by the cycle during which
we achieve maximum switching. The shortest period corresponds to the lowest test period
using which we achieve significant reduction in test time. Each test cycle is assigned to the
period that is closest to, but not less than, the required period for that cycle.
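The assignment rule in the last sentence above amounts to rounding each required period up to the next available tester period. A minimal sketch follows; the function name and the required periods are made up for illustration, and the four available values are the ones reported later in this section for the ATE experiment.

```python
from bisect import bisect_left
from typing import Sequence


def quantize_periods(required_s: Sequence[float], available_s: Sequence[float]) -> list[float]:
    """Map each required period to the smallest available period that is not shorter."""
    levels = sorted(available_s)
    assigned = []
    for t in required_s:
        idx = bisect_left(levels, t)      # first level >= t
        if idx == len(levels):
            raise ValueError(f"required period {t} exceeds the longest available period")
        assigned.append(levels[idx])
    return assigned


# Hypothetical required periods (ns) against four available tester periods (ns).
ate_periods_ns = [200, 300, 410, 500]
required_ns = [120, 200, 305, 410, 499]
assigned_ns = quantize_periods(required_ns, ate_periods_ns)
total_ns = sum(assigned_ns)
```

Rounding up keeps every cycle within its power and timing limits; the cost is the gap between the required and the assigned period, which is what limits the achievable test time reduction on the tester.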
Based on the periods obtained earlier, synchronization with the rate generator is controlled
by specifying the periods in the test program using a timing block. The timing block
has information about the rates at which the pattern should be applied at each input and
the behavior of the signal at each pin corresponding to the value in the pattern file. Since
the patterns are applied at the start of each period, the pulse provided by the rate generator
is not used as a clock to the scan circuit of the DUT, but instead it is used to synchronize the
pattern generation. The clock pin is considered as an input pin and the duty cycle is set to
50% of the period set by the rate generator. This way we avoid any race conditions caused
during the application of the inputs at the start of each period of the rate generator. The
pattern for each cycle contains the signal value needed at each input pin and the response
to be observed at each output pin. The period for each cycle is specified by mapping the
cycle with the waveform information that is uniquely defined to match one period.
ATE Results
The proposed method was applied by the ATE to the s298 benchmark circuit configured
on the Xilinx FPGA. We applied the 36 deterministic combinational ATPG patterns used for
simulation of the s298 circuit in row 3 of Table 5.1. The cycle times required for each period
were obtained through a Perl script based on the energy consumption per cycle reported
by Nanosim [4]. Though in this work the energy is obtained using Nanosim, power can be
calculated per cycle during the actual test on the ATE by implementing a microcontroller
on the test head. The minimum clock period that was used with the DUT was 100ns. For
clarity of our experiment, the clock periods obtained through simulation were multiplied by
100. Four unique clock periods were then obtained such that we achieve significant reduction
in test time. Figure 5.2 shows the test clock periods on the y-axis for each corresponding
test cycle on the x-axis. The horizontal broken (red) lines show the four unique test cycle
periods. A test cycle uses the clock period given by the broken line just above its required
period, as shown in Figure 5.2. For the periodic clock test the maximum period in Figure 5.2 is used as
the fixed clock period.
Figure 5.2: Aperiodic clock for 540-cycle scan test of s298 for a power budget of 1.23mW.
Horizontal broken lines indicate four test clock periods available from the T2000GS ATE.
Period used for a test cycle was the nearest higher ATE clock period.
The four clock periods used in this experiment were determined from a visual inspection
of the plot in Figure 5.2 and are not optimal. It is possible to algorithmically find the best
clock periods for any given number of periods that an ATE may support [30].
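One way such a selection could be computed is by dynamic programming over the sorted distinct required periods. The sketch below is our own illustration and is not necessarily the algorithm of [30]; it assumes that each cycle runs at the smallest chosen period that is not shorter than its required period, and the function name and data are hypothetical.

```python
from collections import Counter
from typing import Sequence


def best_clock_periods(required: Sequence[float], k: int) -> tuple[float, list[float]]:
    """Choose at most k clock periods minimizing total test time.

    Every cycle runs at the smallest chosen period that is not shorter than
    its required period, so the longest required period is always chosen.
    Returns (total_time, chosen_periods).
    """
    if k < 1 or not required:
        raise ValueError("need at least one cycle and one clock period")
    counts = Counter(required)
    values = sorted(counts)                       # distinct required periods
    m = len(values)
    prefix = [0]                                  # prefix[j] = cycles with period < values[j]
    for v in values:
        prefix.append(prefix[-1] + counts[v])

    inf = float("inf")
    k = min(k, m)
    # cost[n][j]: best time for all cycles <= values[j] using n periods, the largest being values[j]
    cost = [[inf] * m for _ in range(k + 1)]
    prev = [[-1] * m for _ in range(k + 1)]
    for j in range(m):
        cost[1][j] = values[j] * prefix[j + 1]
    for n in range(2, k + 1):
        for j in range(n - 1, m):
            for i in range(n - 2, j):
                c = cost[n - 1][i] + values[j] * (prefix[j + 1] - prefix[i + 1])
                if c < cost[n][j]:
                    cost[n][j], prev[n][j] = c, i

    n_best = min(range(1, k + 1), key=lambda n: cost[n][m - 1])
    chosen, j = [], m - 1
    for n in range(n_best, 0, -1):                # walk the back-pointers
        chosen.append(values[j])
        j = prev[n][j]
    return cost[n_best][m - 1], chosen[::-1]


# Hypothetical required periods in ns; two periods allowed.
total, periods = best_clock_periods([100, 100, 120, 400, 410], 2)
```

The run time grows as k times the square of the number of distinct required periods, which is small for a test of a few hundred cycles.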
The waveforms for the ATE tests are shown in Figures 5.3 and 5.4, as viewed in the
logic analyzer of the Advantest T2000GS system. The two figures have the same time scale.
Figure 5.3 shows 33 cycles (13 to 46), which account for two scan sequences of the periodic
clock test using a 500ns clock. The cycle number is indicated in the first row, followed by
the period for each cycle as indicated above the first waveform. The labels on the left side
of each waveform correspond to the scan out, scan in, scan enable, three primary input and
clock pins. The values expected at the scan out signal are indicated by X, L or H at the
beginning of each period, and the strobe instants at which the output response is verified
are indicated by downward/upward triangles placed at the end of each period. The strobe
points are located such that there is enough time for the signal to settle after a clock pulse
is applied. The input waveforms are indicated along with the pattern that is applied at
the start of that period. A '1' pattern for the clock during each period indicates that the
clock is turned on and, based on the 50% duty cycle for the clock during that period, the
corresponding waveform is generated by the frame processor. The test pattern set used in this
experiment is a deterministic set generated by Fastscan ATPG for stuck-at faults,
having more low power cycles than high power cycles. For the periodic clock test of Figure 5.3,
which used a fixed clock period of 500ns for the entire test, the total time for 540 cycles was
270µs.

Figure 5.3: Periodic clock: ATE result for 540-cycle scan test of s298 benchmark circuit.
Waveform shows 33 test cycles (cycles 13 through 46) of 500ns clock. Signals shown are scan-out,
scan-in, scan enable, three primary inputs and clock. Green triangles under the scan-out
waveform are matching strobes.

Figure 5.4 shows the ATE waveforms using an aperiodic clock with periods of 500, 410,
300 and 200ns, as selected for each cycle based on the corresponding activity it produces in
the DUT. The test clock period is determined from Figure 5.2. Thus, the peak activity in
the DUT is the same for both periodic and aperiodic clock tests. Both Figures 5.3 and 5.4
show the waveforms for a time interval of 16.5µs. Because the aperiodic clock test runs at
varying clock periods, more cycles are run in this time. Hence, in Figure 5.4 we observe 58
cycles (13 to 71) within the same time frame of 16.5µs as 33 cycles (13 to 46) for the
periodic clock test. The total test time for 540 cycles is now 157.7µs, which corresponds to
a reduction of approximately 38% over the periodic clock test time.

Figure 5.4: Aperiodic clock test: ATE result for 540-cycle scan test of s298 benchmark
circuit. Waveform shows 58 test cycles (cycles 13 through 71) taking the same time as
taken by 33 cycles of the periodic clock test in Figure 5.3. Clock periods used were 200, 300, 410
and 500ns as shown in Figure 5.2. Signals shown are scan-out, scan-in, scan enable, three
primary inputs and clock. Green triangles under the scan-out waveform are matching strobes.

This test time reduction depends on the relative clock schedules of the periodic
and aperiodic clock tests and hence can be compared with the 47.2% reduction reported for the
540 cycle test of s298 in Table 5.1. There are two reasons for the ATE time saving being lower.
First, the granularity of the clocks, i.e., four ATE clocks versus an individual clock for each vector,
and second, the selection of the four ATE clock periods was ad hoc; we believe a better
selection can improve the test time reduction.
5.2 Optimum Voltage for Aperiodic Clock Test
We saw in the previous section that by using an aperiodic clock test, it is possible to
reduce test time at nominal voltage. However, is it possible to reduce the test time further?
In this section we examine the possibility of an optimum supply voltage for the aperiodic clock test
at which the test can run the fastest.
The reduced voltage approach discussed in Chapter 4 can be extended to further reduce
the aperiodic clock test time. From equation (4.2), the period for a power constrained cycle is
proportional to the voltage used for test. Hence the width of the power constrained period
can be reduced to improve the test time by reducing the voltage for the test. However,
there are a few points to consider when using aperiodic clock tests. First, some cycles in the
aperiodic clock test may have already been compressed to the minimum period permitted by
the structure constraint, and as the supply voltage is reduced for test the critical path delay
increases. Thus, as the voltage is reduced, more cycles become structure constrained. With
further reduction in the supply voltage, the test starts to lose
its aperiodic clock property and becomes fully periodic. However, at some voltage before the
test becomes periodic, most of the cycles will be structure constrained except for a very few
cycles that are power constrained. The point at which the test still retains the aperiodic clock
property will give the optimum test time for the aperiodic clock test. Figure 3.4 illustrates this
effect and is an extension of Figure 3.1. Based on equations (3.17) and (3.18), the aperiodic
clock test time is bounded by the slowest frequency of the periodic clock test, limited by the
power constraint, and the fastest frequency, limited only by the structure constraint. Point A
represents the optimum voltage for the periodic clock test. From point A, if the supply voltage is
increased in steps such that V > Vsynch, then some cycles start becoming power constrained,
so those cycles have to be expanded to accommodate the power dissipation within the rated
power. An optimum voltage for the aperiodic clock test, Vasynch, will lie in the region A-B
at a point where the structural delay is lower and the test has more structure constrained
cycles than power constrained cycles.
Analysis of this method is performed on the s298 benchmark circuit. The circuit was
synthesized in 180nm technology using Leonardo Spectrum [7] and the scan chains were
inserted using Mentor Graphics DFT Advisor. We used deterministic vectors generated by
Fastscan [5] for stuck-at faults test, which also include one path delay vector to trigger the
critical path of the device. The aperiodic clock test was performed for decreasing power
values and the corresponding test times were noted. Figure 5.6 shows the results, plotted
with test time on the y-axis and supply voltage on the x-axis. At each voltage we estimate
the critical path delay using the alpha power law approximation [50,51]:
\[
T_{structure} = K \, \frac{V_{DD}}{(V_{DD} - V_{TH})^{\alpha}} \qquad (5.3)
\]

Figure 5.5: Aperiodic clock test time as a function of supply voltage showing the minimum
test time voltage, Vasync.

A few assumptions were made when solving for the critical path delay:
1. The critical path does not change as the voltage is reduced; this was found valid for small voltage
changes.
2. The threshold voltage does not vary.
The maximum rated power was found by simulating the circuit at nominal voltage for 1000
random vectors in the functional mode. The resulting maximum power was noted to be
1.23mW. The value for the velocity saturation index α was found to be 2 by curve fitting
the delay of a chain of inverters at different voltages with those obtained through simulation.
Once the value of α is known, the value of K for the s298 benchmark is found using the
delay obtained through STA in Section 4.3.1 at nominal voltage. The value for K is
found to be 0.85. We can now find the delay of the critical path at every voltage step.
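The calibration step described above can be written out by rearranging equation (5.3) at the nominal voltage. In the worked form below, V_nom denotes the nominal supply (1.8V here) and T_structure(V_nom) the STA delay; both symbols are introduced only for this illustration.

```latex
\[
K = T_{structure}(V_{nom}) \, \frac{(V_{nom} - V_{TH})^{\alpha}}{V_{nom}},
\qquad
T_{structure}(V_{DD}) = T_{structure}(V_{nom}) \,
\frac{V_{DD}}{V_{nom}}
\left( \frac{V_{nom} - V_{TH}}{V_{DD} - V_{TH}} \right)^{\alpha}
\]
```

The second expression follows by substituting K back into equation (5.3), so the delay at any voltage step depends only on the STA delay at nominal voltage, V_TH and α, under the two assumptions listed above.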
Figure 5.6: Minimum periodic and aperiodic clock test times for s298 circuit after selecting
suitable supply voltages.
The method to calculate the aperiodic clock period at each voltage was based on the
explanation provided in Section 5.1. At the optimum voltage of Vasync = 1.25V, the corresponding
minimum aperiodic clock test time using this method is found to be TTasynch = 0.77µs,
which is a 71% reduction in test time compared to the test time (TTNominal) of 2.66µs using
the periodic clock at the nominal voltage of 1.8V, as shown in Figure 5.6.
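The voltage search can be sketched as a sweep that recomputes the structure constraint from equation (5.3) and the power constrained period at each step. This is an illustration under stated assumptions, not the simulation flow used for the results above: the per-cycle energies are made up, the energy per cycle is assumed to scale with the square of the supply voltage, K is taken as 0.85 × 10^-9 with the delay in seconds, and all function names are ours.

```python
from typing import Sequence


def critical_path_delay(vdd: float, k: float, vth: float, alpha: float) -> float:
    """Alpha-power law delay of equation (5.3): K * VDD / (VDD - VTH)**alpha."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * vdd / (vdd - vth) ** alpha


def aperiodic_time_at(vdd, energies_nom_j, v_nom, p_limit_w, k, vth, alpha) -> float:
    """Aperiodic test time at supply vdd.

    Assumption: switching energy per cycle scales as (vdd / v_nom)**2.
    """
    t_struct = critical_path_delay(vdd, k, vth, alpha)
    scale = (vdd / v_nom) ** 2
    return sum(max(t_struct, e * scale / p_limit_w) for e in energies_nom_j)


def best_voltage(voltages: Sequence[float], energies_nom_j, v_nom, p_limit_w, k, vth, alpha):
    """Return (voltage, test_time) with the smallest aperiodic test time in the sweep."""
    times = [(aperiodic_time_at(v, energies_nom_j, v_nom, p_limit_w, k, vth, alpha), v)
             for v in voltages]
    t_best, v_best = min(times)
    return v_best, t_best


# Hypothetical per-cycle energies at 1.8 V; K, VTH and alpha as quoted for s298.
energies = [2e-12, 4e-12, 6e-12, 3e-12]
sweep = [round(1.0 + 0.05 * i, 2) for i in range(17)]        # 1.00 V ... 1.80 V
v_opt, t_opt = best_voltage(sweep, energies, 1.8, 1.23e-3, 0.85e-9, 0.39, 2.0)
t_nominal = aperiodic_time_at(1.8, energies, 1.8, 1.23e-3, 0.85e-9, 0.39, 2.0)
```

Lowering the voltage shrinks the power constrained periods but stretches the critical path delay, so the minimum falls where most, but not all, cycles have become structure constrained.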
5.2.1 Simulation Results
Table 5.2 shows the optimum voltage for the aperiodic clock test obtained through simulation.
All circuits were synthesized in TSMC 180nm CMOS technology using Mentor
Graphics Leonardo Spectrum [7]. The scan chains were inserted and stitched into a single
scan chain using DFT Advisor. The patterns used in this experiment were the same set of patterns
used in the periodic clock test experiment in Chapter 4 and for the aperiodic experiment in
Table 5.1. The patterns were generated for the stuck-at fault model using Mentor Graphics
Tessent Fastscan [5]. The critical path delay for the structure constraint was obtained using
the static timing analysis (STA) tool of Leonardo Spectrum. The α-power law model in
equation (5.3) was then used to calculate the path delay value at each supply voltage to set
the structure constraint for the aperiodic clock period. The critical path is assumed to be
constant and hence the initial value for the proportionality constant was obtained at the
nominal voltage with the velocity saturation index α = 2.

Table 5.2: Optimum voltage VDDopt for minimum aperiodic clock scan test time of ISCAS'89
circuits in 180nm CMOS (α = 2, VTH = 0.39V).

Circuit  Proportionality  Maximum      Total scan  Peak per     Nominal (1.8V)  Nominal (1.8V)  Optimum    Optimum    Test time
name     constant K       switched     test        cycle power  test clock      test time       supply     test time  reduction
         (×10^-9)         capacitance  cycles N    PMAXfunc     freq. (MHz)     (µs)            VDDopt     (µs)       (%)
                          CL (pF)                  (W)                                          (Volts)
s298     0.85             1.76         450         0.0012       216.0           2.08            1.20       0.72       65.4
s298     0.85             2.04         498         0.0012       187.0           2.66            1.25       0.77       71.0
s298     0.85             2.06         540         0.0012       184.9           2.92            1.25       0.81       72.2
s382     2.18             3.07         704         0.0029       292             2.41            1.59       1.37       43.5
s713     3.31             3.36         810         0.0015       137.28          5.9             1.62       2.49       57.7
s1423    6.38             10.22        6975        0.0045       135.9           51.3            1.82       39.7       26.7
s1423    6.63             10.23        6975        0.0030       90.6            77              1.55       49.5       35.7
s13207   4.64             38.9         62237       0.0213       168             369             1.7        281        23.8
s15850   5.22             109          101707      0.0240       67.35           1510            1.43       702        53.5
s38584   4.03             156.8        224112      0.0350       88.54           2531            1.41       1290       49.0

Column 1 of Table 5.2 shows the ISCAS'89 benchmark circuits used for this experiment, and
column 2 gives the proportionality constant used to calculate the delay at each voltage using
the α-power law delay model. The maximum allowable power used in these experiments is shown
in column 5. The maximum power chosen here is measured by simulating each circuit in
normal mode with 1000 random patterns and measuring the maximum power over the entire
test. Columns 6 and 7 show the test clock frequency and total test time for a conventional
test using a periodic clock period at the nominal voltage of 1.8V. Column 8 shows the
optimum test supply voltage for an aperiodic clock test and the corresponding test time at
the optimum voltage is given in column 9. Finally, the reduction in test time by using an
aperiodic clock under reduced supply voltage over the periodic clock at nominal voltage is
shown in column 10. The results show significant reduction in test time when compared
to the methods discussed earlier. Since the calculated delay is pessimistic, in a practical
setting more reduction in test time is possible.
Table 5.3: Test times for various methods normalized with respect to that of the conventional
method (nominal 1.8V supply and periodic clock).

         Nominal voltage 1.8V     Periodic clock             Aperiodic clock
Circuit  Periodic   Aperiodic     Optimum      Reduction     Optimum       Reduction
name     clock      clock         voltage      ratio         voltage       ratio
s298     1          0.52 - 0.71   1.07 - 1.1   0.35 - 0.38   1.20 - 1.25   0.27 - 0.34
s382     1          0.67          1.44         0.63          1.59          0.56
s713     1          0.48          1.33         0.55          1.62          0.42
s1423    1          0.72          1.49         0.68          1.55          0.64
s13207   1          0.84          1.62         0.80          1.70          0.76
s15850   1          0.72          1.30         0.52          1.43          0.46
s38584   1          0.83          1.30         0.51          1.41          0.49

Table 5.3 summarizes the test time reduction using the proposed methods, normalized
with respect to the conventional method using a periodic clock at the nominal voltage of 1.8V for CMOS
180nm technology. Summarizing the results, it is evident that for most of the circuits the
test time is reduced significantly (> 30%) by either reducing the supply voltage or by scaling
the frequency.
Chapter 6
Discussion
6.1 Adapting to At-Speed Testing
Timing related defects are often targeted during at-speed testing. This necessitates
clock pulses generated at functional speeds. Due to the high cost of automatic
test equipment (ATE) and growing clock speeds, implementing at-speed testing using
ATEs will not be economical [15]. For this reason, at-speed clocks during capture cycles
are generated internally using phase locked loops (PLLs) that generate clock pulses with a fixed
frequency that can be a multiple of a slow reference clock [38]. On-chip PLLs also serve
another purpose: to eliminate clock skews [48].
In the case of at-speed testing, the use of the aperiodic clock can be restricted to
improving scan shift timing, since the majority of test time is taken while shifting. Besides,
during at-speed testing the capture cycle (and the launch cycle for launch-on-capture) already runs at
functional speed. The slow clock during scan shift can be supplied by the ATE or generated
internally using a divide-by-N counter. In the case where the clock for scan shift is generated
within the chip, modifications can be made to the architecture such that the tester chooses
the clock to be used. To describe the feasibility of this procedure let us consider the serdes
architecture used in [52], where a deserializer is used to shift patterns at high speed from
the tester and the internal scan chain shifts patterns at a slow speed. The shift clock to the
internal scan chains is a slow clock with a frequency that is a fourth of the ATE clock. This
slow clock could be generated by implementing a simple divide-by-N counter. With an aperiodic
clock, the divide-by-N counter could be designed to produce the required four clocks. The
ATE can then be used to send the bits that set the value for N, and the corresponding
output is used as the clock. When the pattern is shifted out, the output from the scan chains is
fed to the serializer module, which shifts out the pattern to the ATE at high speed. Since the
patterns from the ATE and to the ATE are shifted at the same speed, there should not be
any conflict due to the internal aperiodic clock when comparing the output with the expected
value. It is to be noted that, while choosing the clock period for the aperiodic test, any
additional delay constraints due to long interconnects during scan shift that are considered
for timing closure will be added to the structure constraint in the aperiodic clock. Further
analysis to explore this feasibility will be done in future work.
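As a rough illustration of how the tester might pick the divider setting for each shift cycle, the sketch below rounds each required period up to a whole number of ATE clock periods. The function name, the 50ns ATE clock and the required periods are assumptions for illustration only and do not come from [52].

```python
from typing import Sequence


def divider_settings(required_periods_ns: Sequence[int], ate_period_ns: int) -> list[int]:
    """Smallest divide-by-N value per cycle so that N * ate_period_ns >= required period.

    Integer nanoseconds are used so that the ceiling division is exact.
    """
    if ate_period_ns <= 0:
        raise ValueError("ATE clock period must be positive")
    return [max(1, -(-t // ate_period_ns)) for t in required_periods_ns]


# Hypothetical numbers: 50 ns ATE clock, shift cycles needing 200, 300, 410 and 500 ns.
n_values = divider_settings([200, 300, 410, 500], 50)
shift_periods_ns = [n * 50 for n in n_values]
```

Because the divider can only produce multiples of the ATE clock period, a required period such as 410ns is stretched to the next multiple, which costs some of the test time saving.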
6.2 Adapting to Process-Voltage-Temperature Variations
As technology scales to smaller feature sizes, intra-die and inter-die variations in the
operational parameters of chips become more pronounced. These variations can be due to random or
systematic defects in the fabrication process and can often affect the timing during voltage
and temperature changes. Testing the circuit under test (CUT) at different
process-voltage-temperature (PVT) corners, namely worst, typical, and best cases, identifies the
reliability of the chip and its operating range [44]. Such testing normally checks the
functionality of the device at each corner. Designers and process engineers often determine
these corners, and the corresponding delays are characterized into the chip's structural
constraints [25, 26]. The CUT is tested at these corners for its functionality, and therefore
corner testing is more critical during timing-based tests. Hence, during corner testing the
concern is for the CUT to meet the timing specification of the capture cycle. The shift cycle
(and the launch cycle in the case of launch-on-shift) must include the interconnect delay and the
setup and hold time constraints imposed by these variations. Using the proposed method
with corner testing would account for the same constraints when determining the structural
constraint for the test clock period and while finding the optimum voltage during scan shift
at each of these corners. The voltage can then be switched to the corner voltage during the
capture cycles, either internally if the chip has a dynamic voltage regulator [24] or externally
by the ATE.
Future extensions of this work will analyze the effect of process variations on the
optimum clock periods and voltage of the aperiodic test at each corner, and will
explore the feasibility of dynamically increasing the supply voltage for the capture
cycle using the ATE.
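As a rough illustration of how a corner-specific structural constraint would enter the choice of clock periods, the sketch below assumes that each cycle's period is bounded below both by the structural delay at the corner and by the time needed to dissipate that cycle's energy without exceeding the power limit. All corner delays, cycle energies, and the power limit are hypothetical values, and the simplification that the cycle energies are the same at every corner is an assumption made only to keep the example short.

```python
def cycle_period_ns(energy_pj, power_limit_mw, structural_ns):
    """Lower bound on one clock period: the larger of the structural
    delay and the power-limited period (pJ / mW = ns)."""
    power_bound_ns = energy_pj / power_limit_mw
    return max(structural_ns, power_bound_ns)


# Hypothetical structural constraints (ns) at three PVT corners.
corner_structural_ns = {"worst": 12.0, "typical": 10.0, "best": 8.0}

# Hypothetical per-cycle energies (pJ) and power limit (mW);
# assumed identical at every corner for simplicity.
cycle_energy_pj = [50.0, 200.0, 120.0]
POWER_LIMIT_MW = 10.0

# Total shift time per corner when every cycle uses its own period.
total_ns = {
    corner: sum(cycle_period_ns(e, POWER_LIMIT_MW, t_struct)
                for e in cycle_energy_pj)
    for corner, t_struct in corner_structural_ns.items()
}
print(total_ns)
```

In this sketch only the low-energy cycles are affected by the corner, since the high-energy cycles remain limited by power rather than by structure.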
Chapter 7
Conclusion
Advanced technologies for CMOS VLSI design for low power applications require power
constrained tests that could result in longer test time and high testing costs. New methods
are required to reduce test time while conforming to the allowable power. In this work,
employing scan-based test, ISCAS'89 benchmark circuits were simulated for the maximum
energy dissipated using a periodic clock. Different methodologies, such as scaling down
the supply voltage and scaling up the frequency, were shown to help reduce test time. Results
have shown that a reduction of more than 50% is attainable in some cases. Maximum reduction
in test time is observed when the peak energy dissipated by the circuit is significantly
greater than the average energy dissipated. The feasibility of aperiodic clock test was
demonstrated on an ATE. The results presented in this work suggest that a test engineer can have
multiple options to reduce test time based on the available hardware, and that improvements in
tester framework and hardware can help reap the full benefits of the proposed methods.
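The dependence of the attainable reduction on the ratio of peak to average energy can be illustrated with a simplified model. The sketch below assumes that only the power limit constrains the clock period (the structural constraint is ignored), so that a periodic test uses one period set by the peak-energy cycle while an aperiodic test gives each cycle a period proportional to its own energy. The energy profiles are hypothetical.

```python
def periodic_test_time(energies, power_limit):
    """Every cycle uses the single period set by the peak-energy cycle."""
    return len(energies) * max(energies) / power_limit


def aperiodic_test_time(energies, power_limit):
    """Each cycle uses its own period, set by its own energy."""
    return sum(energies) / power_limit


def reduction(energies, power_limit=1.0):
    """Fractional test time saved by the aperiodic clock
    (equals 1 - average energy / peak energy in this model)."""
    t_periodic = periodic_test_time(energies, power_limit)
    t_aperiodic = aperiodic_test_time(energies, power_limit)
    return 1.0 - t_aperiodic / t_periodic


# Hypothetical per-cycle energies in arbitrary units.
spiky = [10, 10, 10, 50]   # peak far above average
flat = [40, 45, 50, 45]    # peak close to average

print(round(reduction(spiky), 3), round(reduction(flat), 3))
```

Under these assumptions the saving depends only on the ratio of average to peak energy, which is why a profile with an isolated peak benefits far more than a nearly flat one.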
It was shown that by reducing the power supply voltage at which a DUT is tested
the test time can be reduced. Future investigations should involve the use of the proposed
method for delay testing, with a focus on correlating timing measurements made at low
voltage with the delay at the actual nominal voltage. Effects of leakage power that occur
in advanced technologies will also have to be investigated in the future. We believe that the
methods presented here will remain beneficial. The dynamic power considered in this work
was the cycle-average peak power, which also includes the instantaneous peak within a clock
cycle. Although smoothing due to suitable design of the power grid and decoupling capacitors
may justify the use of the cycle average, any possible issues related to power supply noise and
coupling effects will have to be examined in the future.
Future analysis would include process variations when finding aperiodic clock periods for
a DUT. Such an analysis would be beneficial for the proposed method, and we believe that process
variation will not be a limitation in determining optimum clock periods. System-on-chip (SoC) testing
could have severe test time and power problems when multiple cores are tested in parallel.
There will be benefits if core tests are optimized by aperiodic clock periods. However, the
distribution of clocks through the test access mechanism (TAM) of the SoC and the test program
details have to be analyzed in the future. Such limitations may not be encountered when
using reduced power supply voltage with periodic clock test. Since the netlist used in this
work did not have information on wiring delay after place and route, the work presented in
this thesis assumes the critical path delays on the scan path and the functional path to be
similar. However, although the critical path for scan mode may have very little logic, it is not
optimized for delay in the physical design; future work will have to examine this issue in
depth.
The work presented in this thesis uses an ad hoc method to group the clock periods used
on the ATE; future work can investigate different non-linear algorithms that could provide
the optimum values for the clock periods. The test programming used in the Advantest
T2000GS ATE is not a conventional one, and the method has to be adapted to the STIL
format [3, 40], which is more commonly used in the industry. Some circuits include
dual-voltage logic that can operate at different voltage levels. With that in mind, a dynamic
voltage mechanism can be devised in the same way as the aperiodic clock, where the voltage
for each cycle is determined by the amount of energy dissipated during that cycle. At
the cost of some added complexity in the test setup, the added degree of freedom will further
reduce the test time. It is suggested that such a scheme be studied in the future.
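To make the grouping problem concrete, the sketch below shows one simple way of mapping per-cycle minimum periods onto a small number of tester clock periods. It is not the ad hoc method used in this work, only an assumed equal-width grouping chosen for illustration; the function name group_periods, the limit of four periods, and the period values are hypothetical.

```python
from bisect import bisect_left


def group_periods(periods, k):
    """Assign each cycle one of at most k clock periods, never shorter
    than the cycle requires. The k levels are evenly spaced between the
    smallest and largest required period, rounded up to whole units."""
    lo, hi = min(periods), max(periods)
    span = hi - lo
    # Integer ceiling of span * (j + 1) / k; the last level equals hi.
    levels = sorted({lo - (-span * (j + 1) // k) for j in range(k)})
    # Each cycle takes the smallest level that is >= its required period.
    return [levels[bisect_left(levels, p)] for p in periods]


# Hypothetical per-cycle minimum periods (ns) and four tester periods.
required = [10, 12, 15, 22, 30, 38, 40, 50]
assigned = group_periods(required, 4)

single_period_time = len(required) * max(required)   # periodic clock
grouped_time = sum(assigned)                          # four clock periods
ideal_time = sum(required)                            # one period per cycle
print(assigned, single_period_time, grouped_time, ideal_time)
```

The grouped test time falls between the periodic and the fully aperiodic cases. An optimizing algorithm of the kind suggested above would choose the levels to minimize the sum of assigned periods instead of spacing them evenly.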
The clock signal during test may be generated in several possible ways. The clock may
be supplied directly by the ATE. Alternatively, it may be generated by a phase-locked loop
(PLL) circuit on the device under test (DUT) with a synchronizing signal provided from the
ATE. In the first case, the control of the clock frequency would be done by the test program.
However, the manipulation of the test clock frequency would be more complex in the case of test
compression, where the test data transmission rate between the ATE and the DUT would differ from
the internal clock on the DUT. Such schemes require serious study. Another important issue is
the distribution of the clock on the DUT. Typically, a clock tree is designed to correctly transmit
the functional clock or any fixed-rate slower clock. However, a faster clock, as may be used for
reduced-voltage periodic test, or the variable clock rates used for aperiodic test would require a
careful study of the time constant of the clock distribution network.
Bibliography
[1] "LabVIEW System Design Software, National Instruments." http://www.ni.com/labview/
(accessed Oct. 26, 2012).
[2] "NI ELVIS: Educational Design and Prototyping Platform, National Instruments."
http://www.ni.com/nielvis/ (accessed Oct. 26, 2012).
[3] "IEEE Standard Test Interface Language (STIL) for Digital Test Vector Data," IEEE Std
1450-1999, pp. i–, 1999.
[4] Nanosim User Guide. Synopsys, San Jose, CA, 2008.
[5] ATPG and Failure Diagnosis Tools. Mentor Graphics Corp., Wilsonville, OR, 2009.
[6] "NVIDIA Takes Charge for Faulty Graphics," Aug. 2009. Computerworld, IDG News Service,
http://www.computerworld.com.
[7] Leonardo Spectrum User Guide. Mentor Graphics Corp., Wilsonville, OR, 2011.
[8] DFTMAX Compression User Guide. Synopsys Incorporated, Mountain View, CA, 2013.
[9] A. Agarwal, D. Blaauw, and V. Zolotov, "Statistical Timing Analysis for Intra-Die Process
Variations With Spatial Correlations," in Proc. International Conf. Computer Aided Design,
2003, pp. 900–907.
[10] V. D. Agrawal, "Pre-Computed Asynchronous Scan (Invited Talk)," in 13th IEEE Latin
American Test Workshop, Quito, Ecuador, Apr. 2012.
[11] V. D. Agrawal, "Reduced Voltage Test Can be Faster," in Proc. International Test Conf.,
Nov. 2012. Elevator Talk.
[12] Altera, "DE2 Development and Education Board."
http://www.altera.com/education/univ/materials/boards/de2/unv-de2-board.html (accessed Oct. 26, 2012).
[13] Y. Bonhomme, T. Yoneda, H. Fujiwara, and P. Girard, "An Efficient Scan Tree Design for
Test Time Reduction," in Proc. 9th IEEE European Test Symp., 2004, pp. 174–179.
[14] K. A. Bowman, B. L. Austin, J. C. Eble, X. Tang, and J. D. Meindl, "A Physical Alpha-Power
Law MOSFET Model," IEEE Journal of Solid-State Circuits, vol. 34, no. 10, pp. 1410–1414,
Oct. 1999.
[15] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and
Mixed-Signal VLSI Circuits. Springer, 2000.
[16] K. Chakravadhanula, V. Chickermane, D. Pearl, A. Garg, R. Khurana, S. Mukherjee, and
P. Nagaraj, "SmartScan - Hierarchical Test Compression for Pin-limited Low Power Designs,"
in Proc. International Test Conference, 2013, pp. 1–9. Paper 4.2.
[17] M. Chalkia and Y. Tsiatouhas, "The Leafs Scan-Chain for Test Application Time and Scan
Power Reduction," in Proc. 19th IEEE International Conf. Electronics, Circuits and Systems,
Dec. 2012, pp. 749–752.
[18] J. T. Y. Chang and E. J. McCluskey, "Detecting Delay Flaws by Very-Low-Voltage Testing,"
in Proc. International Test Conference, Oct. 1996, pp. 367–376.
[19] J. T. Y. Chang and E. J. McCluskey, "Quantitative Analysis of Very-Low-Voltage Testing,"
in Proc. 14th IEEE VLSI Test Symposium, 1996, pp. 332–337.
[20] M. Chloupek, O. Novak, and J. Jenicek, "On Test Time Reduction Using Pattern
Overlapping, Broadcasting and On-Chip Decompression," in Proc. IEEE 15th International Symp. on
Design and Diagnostics of Electronic Circuits Systems (DDECS), Apr. 2012, pp. 300–305.
[21] R. M. Chou, K. K. Saluja, and V. D. Agrawal, "Power constraint scheduling of tests," in Proc.
7th International Conference VLSI Design, Jan. 1994, pp. 271–274.
[22] R. M. Chou, K. K. Saluja, and V. D. Agrawal, "Scheduling Tests for VLSI Systems Under
Power Constraints," IEEE Trans. VLSI Systems, vol. 5, no. 2, pp. 175–185, June 1997.
[23] W. Daehn and J. Mucha, "Hardware Test Pattern Generation for Built-In Testing," in Proc.
International Test Conf., 1981, pp. 110–120.
[24] V. R. Devanathan, C. P. Ravikumar, R. Mehrotra, and V. Kamakoti, "PMScan: A
Power-Managed Scan for Simultaneous Reduction of Dynamic and Leakage Power During Scan Test,"
in Proc. IEEE International Test Conf., Oct. 2007. Paper 13.3.
[25] L. G. e Silva, J. Phillips, and L. M. Silveira, "Effective Corner-Based Techniques for
Variation-Aware IC Timing Verification," Computer-Aided Design of Integrated Circuits and Systems,
IEEE Transactions on, vol. 29, no. 1, pp. 157–162, Jan. 2010.
[26] L. G. e Silva, L. M. Silveira, and J. R. Phillips, "Efficient Computation of the Worst-Delay
Corner," in Proc. Design, Automation Test in Europe Conference Exhibition, 2007. DATE
'07, Apr. 2007, pp. 1–6.
[27] A. C. Evans, "Applications of Semiconductor Test Economics, and Multisite Testing to Lower
Cost of Test," in Proc. International Test Conference, 1999, pp. 113–123.
[28] C. Forzan and D. Pandini, "Why We Need Statistical Static Timing Analysis," in Proc. 25th
International Conf. Computer Design, 2007, pp. 91–96.
[29] P. Girard, N. Nicolici, and X. Wen, Power Aware Testing and Test Strategies for Low Power
Devices. New Jersey: Prentice-Hall, second edition, 2004.
[30] S. Gunasekar, Algorithms for Finding Optimum Frequencies for Aperiodic Clock Testing.
Master's thesis, Auburn University, Auburn, Alabama, USA, May 2014. In preparation.
[31] H. Hao and E. J. McCluskey, "Very-Low-Voltage Testing for Weak CMOS Logic ICs," in Proc.
International Test Conference, Oct. 1993, pp. 275–284.
[32] H. Hashempour, F. J. Meyer, and F. Lombardi, "Test Time Reduction in a Manufacturing
Environment by Combining BIST and ATE," in Proc. 17th IEEE International Symposium
on Defect and Fault Tolerance in VLSI Systems, 2002, pp. 186–194.
[33] A. Kedia, Design of a Serialized Link for On-chip Global Communication. Master's thesis,
University of British Columbia, Canada, 2006.
[34] A. Kedia and R. Saleh, "Power Reduction of On-Chip Serial Links," in Proc. IEEE
International Symp. Circuits and Systems, 2007, pp. 865–868.
[35] R. K. Krishnamurthy, A. Alvandpour, V. De, and S. Borkar, "High-Performance and
Low-Power Challenges for Sub-70 nm Microprocessor Circuits," in Proc. IEEE Custom Integrated
Circuits Conference, 2002, pp. 125–128.
[36] W.-J. Lai, C.-P. Kung, and C.-S. Lin, "Test Time Reduction in Scan Designed Circuits," in
Proc. 4th European Conference on Design Automation, Feb. 1993, pp. 489–493.
[37] E. Larsson, Introduction to Advanced System-on-Chip Test Design and Optimization. Springer,
2005.
[38] X. Lin, R. Press, J. Rajski, P. Reuter, T. Rinderknecht, B. Swanson, and N. Tamarapalli,
"High-frequency, At-Speed Scan Testing," IEEE Design & Test of Computers, vol. 20, no. 5,
pp. 17–25, Sept. 2003.
[39] P. Mangilipally and V. P. Nelson, "Emulation of Slave Serial Mode to Configure the Xilinx
Spartan 3 XC3S50 FPGA Using Advantest T2000 Tester," Technical report, Auburn
University, 2011.
[40] G. A. Maston, T. R. Taylor, and J. N. Villar, Elements of STIL: Principles and Applications
of IEEE Std. 1450. Springer, 2003.
[41] J. Moreau, T. Droniou, P. Lebourg, and P. Armagnat, "Running Scan Test on Three Pins:
Yes We Can!," in Proc. International Test Conference, 2009, pp. 1–10. Paper 18.1.
[42] N. Nicolici and B. M. Al-Hashimi, Power Constrained Testing of VLSI Circuits. Springer,
2002.
[43] K. Nose and T. Sakurai, "Optimization of VDD and VTH for Low Power and High-Speed
Applications," in Proc. Asia and South Pacific Design Automation Conf., Jan. 2000,
pp. 469–474.
[44] S. Pasricha, Y.-H. Park, F. J. Kurdahi, and N. Dutt, "Incorporating PVT Variations in
System-Level Power Exploration of On-Chip Communication Architectures," in Proc. 21st
International Conference on VLSI Design, Jan. 2008, pp. 363–370.
[45] R. J. Powers, "Throughput Advantages of Asynchronous Prober Control," IEEE Design &
Test of Computers, vol. 5, no. 3, pp. 56–63, 1988.
[46] X. Qian, C. Han, and A. D. Singh, "Detection of Gate Oxide Defects with Timing Tests at
Reduced Power Supply," in Proc. 30th IEEE VLSI Test Symposium, 2012, pp. 120–126.
[47] S. Ravi, "Power-Aware Test: Challenges and Solutions," in Proc. International Test Conf.,
Oct. 2007, pp. 1–10. Lecture 2.2.
[48] B. Razavi, Design of Analog CMOS Integrated Circuits. Tata McGraw Hill, 2002.
[49] J. L. Roehr, "Very-Low Voltage (VLV) and VLV Ratio (VLVR) Testing for Quality, Reliability,
and Outlier Detection," in Proc. International Test Conference, Oct. 2006, pp. 1–6. Paper
31.1.
[50] T. Sakurai, "Alpha Power-Law MOS Model," Solid-State Circuits Society Newsletter, vol. 9,
pp. 4–5, Oct. 2004.
[51] T. Sakurai and A. R. Newton, "Alpha Power-Law MOS Model," IEEE Journal of Solid State
Circuits, vol. 25, pp. 584–593, Oct. 1990.
[52] A. Sanghani, B. Yang, K. Natarajan, and C. Liu, "Design and Implementation of a
Time-Division Multiplexing Scan Architecture Using Serializer and Deserializer in GPU Chips," in
Proc. 29th IEEE VLSI Test Symposium, 2011, pp. 219–224.
[53] P. Shanmugasundaram, Test Time Optimization in Scan Circuits. Master's thesis, Auburn
University, Auburn, Alabama, USA, Dec. 2010.
[54] P. Shanmugasundaram and V. D. Agrawal, "Dynamic Scan Clock Control for Test Time
Reduction Maintaining Peak Power Limit," in Proc. 29th IEEE VLSI Test Symposium, May
2011, pp. 248–253.
[55] P. Shanmugasundaram and V. D. Agrawal, "Dynamic Scan Clock Control in BIST Circuits,"
in Proc. 43rd IEEE Southeastern Symp. System Theory, Mar. 2011, pp. 237–242.
[56] P. Shanmugasundaram and V. D. Agrawal, "Externally Tested Scan Circuit with Built-In
Activity Monitor and Adaptive Test Clock," in Proc. 25th International Conf. VLSI Design,
Jan. 2012, pp. 448–453.
[57] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Optimal Power-Constrained SoC Test
Schedules With Customizable Clock Rates," in Proc. IEEE International SOC Conf. (SOCC), Sept.
2012, pp. 271–276.
[58] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Optimum Test Schedule for SoC with Specified
Clock Frequencies and Supply Voltages," in Proc. 26th International Conf. VLSI Design, Jan.
2013, pp. 267–272.
[59] V. Sheshadri, V. D. Agrawal, and P. Agrawal, "Power-Aware SoC Test Optimization Through
Dynamic Voltage and Frequency Scaling," in Proc. 21st IFIP/IEEE International Conf. Very
Large Scale Integration (VLSI-SoC), (Istanbul, Turkey), Oct. 2013, pp. 105–110.
[60] C. E. Stroud, Designers Guide to Built-in Self Test. Springer, 2002.
[61] F. N. Taher, A Low-Power Analog Bus for On-Chip Communication. Master's thesis, Auburn
University, Auburn, Alabama, USA, June 2013.
[62] L. Van Eck, "IMA: Cost Effective Testing in the New Era," Nov. 2010. http://www.ltxc.com.
[63] P. Venkataramani and V. D. Agrawal, "Reducing ATE Time for Power Constrained Scan Test
by Asynchronous Clocking," in Proc. International Test Conf., Nov. 2012. Poster P13.
[64] P. Venkataramani and V. D. Agrawal, "Test-Time Reduction in ATE Using Asynchronous
Clocking," in Proc. 6th IEEE International Workshop on Design for Manufacturability and
Yield, June 2012. Poster.
[65] P. Venkataramani and V. D. Agrawal, "ATE Test Time Reduction Using Asynchronous Clock
Period," in Proc. International Test Conference, Sept. 2013. Paper 15.3.
[66] P. Venkataramani and V. D. Agrawal, "Reducing Test Time of Power Constrained Test by
Optimal Selection of Supply Voltage," in Proc. 26th International Conf. VLSI Design, Jan.
2013, pp. 273–278.
[67] P. Venkataramani, S. Sindia, and V. D. Agrawal, "A Test Time Theorem and Its Applications,"
in Proc. 14th IEEE Latin American Test Workshop (LATW), 2013, pp. 1–5.
[68] P. Venkataramani, S. Sindia, and V. D. Agrawal, "Finding Best Voltage and Frequency to
Shorten Power-Constrained Test Time," in Proc. 31st IEEE VLSI Test Symposium, 2013,
pp. 19–24.
[69] B. Yang, A. Sanghani, S. Sarangi, and C. Liu, "A Clock-Gating Based Capture Power Droop
Reduction Methodology for At-Speed Scan Testing," in Proc. Design, Automation Test in
Europe Conf. and Exhibition, 2011, pp. 1–7.
[70] C. Yao, K. K. Saluja, and P. Ramanathan, "Thermal-Aware Test Scheduling Using On-chip
Temperature Sensors," in Proc. 24th International Conf. VLSI Design, Jan. 2011, pp. 376–381.