Controlled Transition Density Based Power Constrained Scan-BIST with
Reduced Test Time
by
Farhana Rashid
A thesis submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Master of Science
Auburn, Alabama
May 7, 2012
Keywords: Scan-BIST, Transition density, Weighted Random Patterns, Test time
Copyright 2012 by Farhana Rashid
Approved by
Vishwani D. Agrawal, Chair, James J. Danaher Professor of Electrical and Computer
Engineering
Adit D. Singh, James B. Davis Professor of Electrical and Computer Engineering
Victor P. Nelson, Professor of Electrical and Computer Engineering
Abstract
Controlling power dissipation in large circuits during test sessions is one of the major
concerns in VLSI testing. The reason behind the high power dissipation during test is
because unlike normal mode operation of the system correlation between consecutive test
patterns does not exist in test mode.
To increase the correlation between consecutive vectors during testing, several tech-
niques have been proposed for creating low transition density in the pattern sets and thus
control the power dissipation. However, this in turn increases the test application time as
the test has to run for longer test sessions to reach sufficient fault coverage. Increase in test
time is also undesirable.
This research aims to provide a common way to deal with both the problems by optimiz-
ing test lengths for power constraint scan BIST circuits and reduce required test application
time. It has been shown that a specific weight or transition density results in producing
effective test with shortest test length for a given fault coverage. Thus the test length is
optimized to reduce test application time. .Test time is further reduced by adapting the scan
clock dynamically based on the transition density of the pattern set staying within power
budget. A new pattern generator has been proposed to produce the test patterns of desired
properties. Finally we propose a greedy algorithm for mixing various transition densities to
reduce the test application time further without sacrificing the fault coverage. Time saving
up to 43% has been seen in this proposed method in ISCAS89 circuits.
ii
Acknowledgments
I would like to convey my heartiest gratitude to my supervisor, Dr. Vishwani D. Agrawal,
the James J. Danaher Professor at Electrical Engineering Department. Without his patience
and guidance this thesis would not have been possible. I honestly appreciate his not giving
up on me when I myself lost confidence while working through the research.
I thank my advisory committee members, Dr. Victor P. Nelson and Dr. Adit D.
Singh for their valuable suggestions and concern towards my work. I highly appreciate Dr.
Charles Stroud for helping me running his AUSIM simulator on Auburn University?s High
Performance Cluster Computer and letting me audit BIST course.
I sincerely appreciate Auburn University Network Services for partly supporting my
graduate study and my colleagues Shannon Price, Zebediah Whitehead, Jeffery Walker for
always being so supportive. It was a great learning experience working with these amazing
people.
Graduate study at Auburn University has been a pleasure. I am thankful to my friends
and colleagues for always being ready to discuss and give opinions whenever I was in doubt.
Without them my graduate experience would not have been so enjoyable. I am indebted to
my parents, brothers and sisters for all their love and support. Also I am very thankful to
my husband, Abdullah Al Owahid, for being with me in every step of life.
Finally, I thank everyone who directly or indirectly helped me during the course of
graduate study here in Auburn.
iii
Table of Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Thesis Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 Transition Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Static Signal Probability . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.3 Transition Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Power Dissipation During Test . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Design for Testability (DFT) Techniques . . . . . . . . . . . . . . . . . . . . 7
2.3.1 Scan Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.2 Built-In Self-Test (BIST) . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Types of Test Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3 Previous Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1 Reducing Test Power in BIST Circuits . . . . . . . . . . . . . . . . . . . . . 11
3.1.1 New Test Pattern Generators . . . . . . . . . . . . . . . . . . . . . . 11
3.1.2 Test Scheduling Algorithms . . . . . . . . . . . . . . . . . . . . . . . 14
iv
3.1.3 Toggle Suppression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.4 LFSR Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.5 Vector Filtering BIST . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Reduction in Test Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 Transition Density and Its Effect on Fault Coverage . . . . . . . . . . . . . . . . 19
4.1 Weighted Random Pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.2 Computing Best Case Transition Density from Best Case Weight . . . . . . . 20
4.3 Effect of Controlled Transition Density on Fault Coverage . . . . . . . . . . 21
5 Adapting Scan Clock Based On Transition Density . . . . . . . . . . . . . . . . 25
5.1 Dynamic Control of Scan Clock in a BIST Circuit . . . . . . . . . . . . . . . 25
5.2 Estimation of Scan-in Time Reduction . . . . . . . . . . . . . . . . . . . . . 28
5.3 Time Reduction and Power Consumption . . . . . . . . . . . . . . . . . . . . 31
6 Controlled Transition Density Patterns for BIST . . . . . . . . . . . . . . . . . . 34
6.1 BIST-TPG Circuit for Controlled Transition Density . . . . . . . . . . . . . 34
6.2 Randomness of Weighted Random Patterns . . . . . . . . . . . . . . . . . . 37
6.3 Dynamic Control of Scan Clock in BIST Circuit with Modified TPG . . . . . 38
6.4 Fault Coverage by the Modified TPG . . . . . . . . . . . . . . . . . . . . . . 39
7 A Greedy Algorithm to Apply Tests with Different Transition Densities . . . . . 41
7.1 Analysis of Fault Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
7.2 Algorithm to Apply an Optimal Set of Vectors . . . . . . . . . . . . . . . . . 42
7.3 Implementation of Controlled Mixed Transition Density Based TPG in BIST
Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
8 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
8.1 Fault Coverage Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
8.2 Dynamic Scan Clock Implementation . . . . . . . . . . . . . . . . . . . . . . 50
8.3 Power Consumption Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.4 Greedy Algorithm Implementation . . . . . . . . . . . . . . . . . . . . . . . 54
v
9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
vi
List of Figures
2.1 Architecture of a sequential circuit. . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Basic BIST circuitry. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.1 Number of test-per-scan vectors for 95% coverage in s1269 when 1-probability of
scan-in bits was weighted. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 Number of test-per-scan vectors for 95% coverage in s1269 for various transition
densities of scan-in bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3 Numbers of weighted random and transition density vectors for 95% fault cover-
age in several ISCAS89 circuits. . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.1 BIST circuitry for non-adaptive scan test clock. . . . . . . . . . . . . . . . . . . 26
5.2 BIST circuitry for adaptive scan test clock. . . . . . . . . . . . . . . . . . . . . 26
5.3 Inactivity monitor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.4 The inactivity counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.5 Power per test clock in s1196. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.6 Power per test clock for first 25 cycles. . . . . . . . . . . . . . . . . . . . . . . . 33
6.1 Hardware implementation of TPG. . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.2 Hardware implementation of TPG for M scan chains. . . . . . . . . . . . . . . . 36
vii
6.3 Distribution of 1s in weighted random patterns. . . . . . . . . . . . . . . . . . . 38
6.4 Adaptive scan clock scheme with modified TPG. . . . . . . . . . . . . . . . . . 39
6.5 Performance of transition density and weighted random patterns of s510. . . . . 40
6.6 Performance of transition density and weighted random pattern of s1512. . . . . 40
7.1 Detected faults vs. number of transition density vectors obtained from fault
simulation of s510 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
7.2 Fault coverage by transition density vectors obtained by simulation of s510 . . . 43
7.3 Detected faults vs. number of vectors in s510 for best case transition density
vectors and mixed transition density vectors. . . . . . . . . . . . . . . . . . . . . 45
7.4 Flow chart of proposed algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7.5 Hardware Implementation for controlling a mix of various transition densities. . 47
8.1 Per clock power consumption with and without adaptive schemes for s1196. . . 54
8.2 Performance of greedy algorithm for s298 and s820. . . . . . . . . . . . . . . . . 55
8.3 Performance of greedy algorithm for s382 and s1196. . . . . . . . . . . . . . . . 56
viii
List of Tables
4.1 Best case weighted random and transition density vectors for 95% fault coverage
in ISCAS89 circuits obtained from fault simulation experiments. . . . . . . . . . 22
5.1 Determination of clock cycle range for different frequencies. . . . . . . . . . . . 30
5.2 Scan-in time reduction in ISCAS89 benchmark circuits. . . . . . . . . . . . . . . 32
6.1 Estimation of randomness in generated 1000 random patterns. . . . . . . . . . . 37
7.1 Performance of mixed transition density vectors. . . . . . . . . . . . . . . . . . 46
8.1 Test lengths for random and best-case weighted random (WRP) and transition
density (TDP) patterns for 95% fault coverage in ISCAS89 circuits. . . . . . . . 49
8.2 Test lengths for random and best-case weighted random (WRP) and transition
density (TDP) patterns for 90% fault coverage in ISCAS89 circuits. . . . . . . . 50
8.3 Reduction in scan-in time for conventional random patterns of weight 0.5. . . . 51
8.4 Reduction in scan-in time for best-case weighted random patterns (WRP). . . . 51
8.5 Reduction in scan-in time for best-case transition density patterns (TDP). . . . 52
8.6 Comparing test times for 90% coverage by conventional random (R), weighted
random (WRP) and transition density (TDP) patterns when adaptive scan clock
is used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.7 Mixing transition densities selected by Greedy Algorithm based on partial fault
coverage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
ix
List of Abbreviations
ASIC Application Specific Integrated Circuit
ATPG Automatic Test Program Generator
BIST Built-in-Self-Test
BS-LFSR Bit-Swapping LFSR
CA Cellular Automata
CMOS Complementary Metal Oxide Semiconductor
CUT Circuit Under Test
DFT Design for Testability
DS-LFSR Dual-speed LFSR
IC Integrated Circuit
ISCAS International Symposium on Circuits and Systems
LFSR Linear Feedback Shift Register
LT-LFSR Low Transition LFSR
PRESTO Pre-Selected Toggling
RI-LFSR Random Bit Injection LFSR
SAR Signature Analysis Register
SoC System-on-a-Chip
x
TPG Test Pattern Generator
VLSI Very Large Scale Integration
xi
Chapter 1
Introduction
Test power and test application time are the two major challenges in today?s VLSI
design and test area. Time spent in the expensive tester machines directly contributes to the
cost of a chip. Also for self testing circuits shorter test application time is desirable. Due to
advancement of technology circuit size has increased which naturally claims longer test time.
On the other hand, the test process causes higher power dissipation in the circuits compared
to the power dissipated in the normal mode of the circuit. The excess power dissipation
gives rise to many problems like hot spots, chip failure, performance degradation; testing
may even dramatically shorten the battery life when on-line testing is involved [15, 20].
Many techniques have proposed to tackle these two issues separately by DFT engineers.
For reducing test power one of the widely used technique is to decrease the clock frequency
used for testing. This is very often used for scan testing, a more popular method in DFT.
This method, however, is responsible for making the test application time longer. One other
method for reducing power consumption in the test session is to use test vectors created that
aim for lower switching activity in the circuit during testing. This method is widely used if
circuits are being self-tested and the vectors are produced on chip. The down side of this
technique is that it requires longer test sequences to achieve targeted fault coverage. This,
once again, results in longer test time.
More DFT techniques, like test compression are needed to make the test process faster
by reducing test vector application time [41].
A method that can contribute to both the causes mentioned above is the motivation
behind this work.
1
1.1 Problem Statement
The aim of this work is to:
? Analyze the effect of transition density on fault coverage
? Deploy an effective test generation process using the information from the analysis
? Adapt the scan frequency to the transition density for power constrained testing
1.2 Thesis Contribution
The unique contribution of the work presented in this thesis is:
? Determination of best transition density in a vector set to achieve a target fault cov-
erage with the shortest test length
? Adjust the scan frequency according to the transition density for a power constrained
scan-BIST circuit to speed up the test in multiple scan chains
? Deployment of a variable transition density test pattern generator in a BIST circuit
that is capable of producing pre-selected transition density vectors
? Reduction of test application time further by adapting the scan clock to the pre-selected
transition density
? A greedy algorithm to construct a vector set with mixed transition density in a con-
trolled way, aimed to effectively apply test in a power constrained setup for scan -BIST
with reduced test application time
1.3 Thesis Organization
Chapter 2 of the thesis introduces readers to various concepts that are relevant for
understanding the significance of the problems solved by the proposed work. Chapter 3
2
describes briefly the prior work done to reduce both test application time and power dissi-
pation during test. Chapter 4 analyzes the effect on fault coverage when transition density
in a vector set is modified from the conventional transition density that is present in ran-
dom or pseudo-random patterns. Chapter 5 presents a technique to dynamically adjust the
scan clock based on the transition density present in the vector sets with test compression
technique. Chapter 6 utilizes the information from chapter 4 and chapter 5 to combine
the benefits in a BIST implementation. Here we present a modified test pattern generator
(TPG) that has a capability to produce a desired transition density in a vector set. Thus
a refined scheme to adjust the scan clock adaptively to reduce the test application time is
described in this chapter. Chapter 7 proposes an algorithm to select among transition den-
sities to construct a controlled transition density based vector set for further reduction of
test time. Chapter 8 discusses the experimental results obtained from the implementation
of the algorithm on different benchmark circuits. Finally, chapter 9 concludes the work with
suggestion for future research.
3
Chapter 2
Background
This chapter discusses the background information for a better understanding of the
work presented in this thesis. The first section provides some definitions of the terms used
in this work, the second section explains the power dissipation during test and its conse-
quences, the third section briefly discusses DFT techniques and finally the last section gives
an overview of different test pattern generation methods.
2.1 Definitions
2.1.1 Transition Density
The transition density T, of a logic signal N(t) is defined as number of transitions per
unit time, i.e., T = N(t)t . For a continuous signal, transition density T = Limt??N(t)t [30].
Thus, the transition density of a clock signal is two, according to this definition, as
there are two transitions, one rising and the other falling in a unit time (which is one clock
period).
2.1.2 Static Signal Probability
Viewing a signal as a random process [31] and observing it for a time interval t0 + t1,
where signal remains 1 for duration t1 and the signal remains 0 for duration t0, then the
probability of the signal being 1, is given by
p1 = t1t1 +t0 (2.1)
4
And the probability of the signal being 0 is given by
p0 = t0t1 +t0 = 1?p1 (2.2)
2.1.3 Transition Power
The power consumption in CMOS circuits can be classified into static and dynamic
power [46]. Leakage current or other current that is drawn continuously from power supply
causes static power dissipation. Dynamic dissipation occurs when switching occur either due
to short circuit current or charging and discharging load capacitance.
The average energy consumed at node i due to switching is given by E= 12CiV 2dd, where
Ci is the equivalent output capacitance and Vdd is power supply voltage. Therefore, transition
power dissipation is given by,
P = E?f = 12CiV 2dd?f (2.3)
where ? is transition density and f is clock frequency.
2.2 Power Dissipation During Test
This section describes different power consumptions in a CMOS circuit that is relevant
to testing and then discusses the reason why test power is needed to be controlled during
testing.
Let us assume that the energy consumption of a circuit after application of successive
input vectors (Vk?1,Vk) is Evk = 12CV 2dd?k, where, ?k is the number of switching in the
circuit that occurred due to application of the vector Vk. Therefore, for a pseudorandom
test sequence of length L, where the test length is determined from the number of vectors to
5
reach a targeted fault coverage, the total energy consumed in the circuit during application
of the complete test sequence is given by, Etotal = 12CV 2ddsummationtext?k.
If fck denotes the clock frequency then instantaneous power consumed in the circuit
after application of vectors (Vk?1,Vk) is given by, Pinst(Vk) = Evkfck. This is because, by
definition, the instantaneous power is the consumed power during one clock period [16, 20].
The peak power consumption corresponds to the maximum instantaneous power con-
sumed during the test session. It therefore, corresponds to the highest energy consumed
during one clock period, multiplied by clock frequency.
Also, the average power consumed during the test session is the total energy multiplied
by the test time.
Pavg= Etotal * fckLength,L
According to the expressions of power and energy consumption mentioned above and
assuming a given CMOS technology and supply voltage for the circuit design, number of the
switching in the circuit caused by applying a test vector is the only parameter that affects
the energy, peak power and average power consumption. The clock frequency used during
testing affects both the peak power and average power and the test length which is the
number of the patterns applied to the circuit under the test (CUT) affects only the total
energy consumption.
It is important when dealing with high density systems such as modern ASICs and
SOCs, to design tests for these circuits that are non-destructive . But excessive switching
activity during tests leads to increased current flow in the circuit, resulting in circuit failures
due to altered electromigration, increase in cost for packaging, decreased circuit reliability,
and autonomy of battery powered remote and portable system.
For circuits that have BIST circuitry incorporated within them, switching activity during
test session is a major concern. Therefore, many low power or low transition based BIST
techniques, especially for scan BIST have been proposed by researchers.
6
Figure 2.1: Architecture of a sequential circuit.
Low transitions in test vectors, on the other hand, tends to increase the test length
resulting in increased test time that is required to apply the longer test sequence.
Therefore we propose in our work a test scheme for controlled transition based test
application that will optimize test length and speed up test without exceeding the power
budget of the circuit.
2.3 Design for Testability (DFT) Techniques
This section briefly describes the two DFT techniques that are widely used. As the size
of the circuit increases, test complexity also increases. Their internal nodes become harder
to test. Circuits are therefore modified so that they can be tested effectively [8].
2.3.1 Scan Design
Sequential circuits are harder to test than combinational circuits. This is because the
presence of memory elements, as shown in Figure 2.1, which creates internal states during
circuit operation. An exhaustive test would involve application of all possible input vectors
at all possible states of the memory elements.
7
Figure 2.2: Basic BIST circuitry.
For a circuit with n inputs there are 2n possible input combinations. As n increases the
number of possible input vectors increases exponentially. This phenomenon is even more
severe for sequential circuits.
The DFT technique that seeks to improve testability of sequential circuits is scan de-
sign [8] or its partial scan variations [5, 8]. Here the sequential circuit is modified such that
it can operate in test mode. When the circuit is in test mode, the flip-flops in the circuit
are chained together to form one or more shift registers. The flip-flops serve as a point of
controllability and observability and help achieve better test coverage.
2.3.2 Built-In Self-Test (BIST)
BIST is a DFT technique in which additional hardware is added to the circuit to be
tested so that it can test itself [6, 8, 38]. The basic BIST circuitry is shown in Figure 2.2.
The patterns required for test are generated using various techniques. Among them, the use
of a Linear Feedback Shift Register (LFSR) that generates pseudorandom pattern sets is
most common.
A large number of outputs are received from the circuit under test. It is necessary to
store the correct values of all those bits without adding a lot of extra hardware. This in turn
8
calls for some more design techniques. A LFSR, most commonly known as Signature Analysis
Register (SAR) or Multiple Input Signature Register (MISR) is used for this purpose.
Test-Per-Clock BIST Systems: In this type of system, a test is applied every clock cycle,
i.e., a new set of faults is tested in every clock cycle. This type of system has short pattern
lengths. A major concern for BIST is the simulation time required to compute good circuit
behavior. It is therefore advantageous to have short pattern lengths.
Test-Per-Scan BIST Systems: In test-per-scan BIST, each test comprises scan-in of one
input vector, one clock to conduct the test and scan-out of output responses. This type of
system therefore requires a larger test time. Also, it involves larger simulation time than in
test-per-clock BIST systems due to the longer pattern lengths.
2.4 Types of Test Patterns
This sections briefly describes types of test patterns as based on the test pattern gener-
ators are classified [38].
1. Deterministic test patterns are developed to detect specific faults and/or structural
defects for a given CUT. An example of hardware for applying deterministic vectors
would include a ROM with a counter for addressing thr ROM. This type of approach
has limited applicability for BIST. This approach is often referred to as stored test
patterns in the context of BIST applications.
2. Algorithmic test patterns are similar to deterministic test patterns in that they are
specific to a given CUT and are developed to detect specific fault models in the CUT.
However, because of repetition and/or sequence typically associated with algorithmic
test patterns, the hardware for generating algorithmic vectors is usually a finite state
machine. There are considerable applicability of this test pattern generation approach
to BIST for regular structure such as RAMs.
9
3. Exhaustive test patterns produce every possible combination of input test patterns.
In case of an N-input combinational logic circuit where an N-bit counter produces
all possible 2N test patterns and will detect all detectable gate level stuck-at faults.
Exhaustive test patterns are not practical for large N.
4. Pseudo-exhaustive test patterns are an alternative to exhaustive test patterns. In this
case, each partitioned combinational logic sub-circuit will be exhaustively tested. Each
K-input sub circuit receives all 2K possible patterns, where K < N.
5. Pseudo-random patterns are most commonly produced patterns by TPG hardware
found in BIST applications. These patterns have properties similar to those of random
pattern sequences but the sequences are repeatable.
6. Weighted-random test patterns are good for circuits that contain random pattern re-
sistant faults. This type of pattern generation uses an LFSR or cellular automata (CA)
to generate pseudo-random test patterns and then filters the patterns with combina-
tions of AND/NAND gates or OR/NOR gates to produce more logic 0s or logic 1s in
the test patterns applied to the CUT. The number of 1s (or 0s) are referred to as the
weight of the vectors.
7. Lastly, random patterns have frequently been used for external functional testing of
microprocessors as well as in ATPG software.
10
Chapter 3
Previous Work
This chapter presents previous work that has been done in the area of low power testing,
especially techniques involved in reducing transitions or toggles to reduce switching activity
during the test. The first section summarizes the work done for controlling power dissipation
during testing circuits with BIST circuitry. The second section gives brief descriptions of
the work done to reduce the test application time in scan testing.
3.1 Reducing Test Power in BIST Circuits
Reduction of test power is a widely recognized problem, as described earlier, and there-
fore a number of solutions have been presented. Girard summarizes the different techniques
proposed for low-power testing of VLSI circuits [16, 20]. They are broadly classified into
low-power external testing techniques and low-power BIST techniques. This section focuses
on the low power techniques proposed for BIST.
3.1.1 New Test Pattern Generators
Test pattern generators have been modified to reduce the power that is generated during
test because of low correlated test vectors. This section discusses some of those techniques
briefly.
Tehranipur et al. proposed a low transition BIST pattern generator called LT-LFSR,
to reduce average and peak power of a circuit during test by reducing transitions within
random test patterns and between consecutive patterns [39]. The proposed LT-LFSR reduced
transitions by inserting intermediate vectors between two consecutive vectors generated by
the LFSR. This was done by combining properties of two different LFSRs, the Bipartite
11
LFSR and the Random Bit Injection LFSR (RI-LFSR). The experimental results showed
four ISCAS benchmark circuits up to 77% and 49% reduction in average and peak power
respectively. However, the test length increased to achieve targeted fault coverage while
using this method.
Abu-Issa and Quigley proposed a novel low transition LFSR, called Bit-Swapping LFSR
(BS-LFSR) which consisted of an LFSR and 2-to-1 multiplexer [1]. They have showed that
when BS-LFSR was used to generate test patterns for scan-based BIST, it reduced the
number of transitions that occur at the scan chain input during scan shift operations by 50%
when compared to those patterns produced by a conventional LFSR. Thus they reduced
the overall switching activity in the circuit under test during test application. They also
combined the BS-LFSR with a scan chain ordering algorithm that ordered the cells in a way
that reduced the average and peak (scan and capture) in the test cycle or while scanning out
a response to a signature analyzer. Result showed up to 65% and 55% reduction in average
and peak power, respectively, with negligible effect on fault coverage or test application time.
Wang proposed a low hardware overhead test pattern generator (TPG) for scan-based
BIST that reduced switching activity along with achieving very high fault coverage with
a reasonable test length [43]. The proposed TPG comprised two TPGs LT-RTPG(low
transition random TPG) and 3-weight WRBIST (weighted random BIST) TPG, where the
LT-RTPG generated patterns for easy to detect faults and test patterns generated by the
3-weight WRBIST detects faults that remained undetected after LT-RTPG patterns was
applied. Close to 100% fault coverages for ISCAS benchmark circuits were seen with signif-
icantly reduced activity during test sessions.
Rajski et al. proposed a pseudorandom test pattern generator with pre-selected toggling
(PRESTO) activity that comprised a finite state machine, a pattern generator, appropriate
phase shifter. The experimental results for eight industry standard circuits showed reduced
switching activity with a cost of increased test length [25].
12
Wang and Gupta proposed a test pattern generator for BIST called dual-speed LFSR
(DS-LFSR), aiming to reduce heat dissipation during test application [44]. As the name
implies, the dual-speed LFSR (DS-LFSR) consisted two LFSRs, a slow LFSR and a normal
speed LFSR. The inputs of the circuit under test were provided through the slow LFSR in
order to reduce the transition density at the inputs, which resulted in reduced heat dissipation
during test. A procedure was introduced to design a DS-LFSR such that high fault coverage
was achieved through unique and uniformly distributed patterns. New methods of selecting
inputs driven by the slow LFSR and increasing the number of inputs driven by the slow
LFSR were presented. Reductions of 13% to 70% in the number of transitions were observed
for ISCAS benchmark circuits without loss of fault coverage using this method.
A Cellular Automata based Test Pattern Generator (TPG) was proposed by Corno et
al. to test combinational circuits. The TPG was designed to reduce power consumption
while achieving high fault coverage [12]. An algorithm was presented here that selected an
optimal non-linear hybrid cellular automaton (HCA) based on power consumption for given
coverage and test length constraints. Experimental results showed an average test power
reduction of 34% without affecting fault coverage, test length and area overhead.
Girard et al. presented a low power test-per-clock BIST test pattern generator (TPG)
that generated test vectors capable of reducing the switching activity during test [19]. The
technique was based on a modified clock scheme and the clock tree feeding the TPG. There-
fore, this method reduced test power in the TPG and clock tree in addition to power reduction
in the circuit under test (CUT). Reductions of up to 60% and 61% were noted in power and
energy when the proposed technique was implemented on ISCAS benchmark circuits.
Zhang, Roy and Bhawmik proposed a modified LFSR by adding weight sets to tune
the pseudorandom vector?s signal probability in order to achieve increased fault coverage
but with reduced energy consumption [48]. A tool, POWERTEST was developed which
used a genetic algorithm based search to determine optimal weight sets at primary inputs
13
to minimize energy dissipations. Results on ISCAS benchmark circuits showed an energy
reduction of up to 97.82% while still achieving high fault coverage.
Wang and Gupta presented a new BIST TPG design, called low-transition random
TPG (LT-RTPG) that comprised an LFSR, a k-input AND gate, and a T flip-flop [45].
The LT-RTPG generated test patterns for test-per-scan BIST that decreased the number of
transitions that occurred during scan shifting and thus decreased the heat dissipation during
testing. The new TPG reduced the number of transitions in ISCAS89 benchmark circuits
by 23% to 59%.
Gizopoulos et al. proposed low power BIST schemes for datapath architectures built
around multiplier-accumulator pairs, based on deterministic test patterns [21]. They have
also proposed two alternatives based on whether the design is low energy dissipation or
low power dissipation during a BIST session. Both methods are based on modified binary
counters, operating as Gray counters. The technique offers up to 78.33% energy saving and
up to 82.22% power saving compared with pseudorandom BIST.
3.1.2 Test Scheduling Algorithms
Test scheduling techniques have been proposed by different researchers to control the
test power for complex ICs.
Zorian presented a technique which consists of a distributed BIST control scheme [21].
The process included a BIST control methodology that implemented the BIST schedule
with a highly modular architecture. The control architecture provided an autonomous BIST
activation and a diagnostic capability to identify failed blocks. The technique reduces average
power and hence avoids temperature related problems but with the cost of increased test
time.
Chou, Saluja and Agrawal presented an optimum test scheduling algorithms for equal
and unequal test length cases under power constraints [11, 10, 26]. Their algorithms find the
optimum solution by first constructing a test compatibility graph from a resource graph, and
14
then using the compatibility graph to identify time compatible tests with power information
associated with each test, followed by identifying power compatible tests among the time
compatible tests. Finally the optimal scheduling of the tests was found using a minimum
cover table approach. This algorithm reduces the average power consumption.
Iyengar and Chakrabarty proposed an integrated framework to determine optimal SOC
test schedules [24]. They also proposed a new algorithm that used preemption to obtain
optimal test schedules in polynomial computation time.
3.1.3 Toggle Suppression
The toggle reduction technique involves the suppression of toggles in the circuit during
test. This reduces the net activity and hence the power dissipation during test.
Hertwig and Wunderlich introduced a low power technique for scan-based BIST archi-
tectures that modified the scan-path structure?s scan cells such that the inputs to the CUT
remained unchanged during shift operations [23]. Energy savings of up to 90% were seen in
a standard, scan-based BIST architecture.
3.1.4 LFSR Tuning
Girard et al. proposed a technique to minimize the energy required to test combinational
circuits with BIST without altering fault coverage [18]. They have analyzed the impact of
the polynomial and seed selection of the LFSR used as TPG on the energy consumed by
the circuit and found that appropriate selection of the seed of the LFSR can contribute to
energy reduction whereas the polynomial selection does not affect the power consumption.
A heuristic based on a simulated annealing algorithm was proposed to decrease the energy
consumption of BIST runs.
15
3.1.5 Vector Filtering BIST
Not all the patterns generated by the TPG contribute to the fault detection. Therefore,
a number of works aimed to reduce the energy consumed during test by filtering out the
non-detecting vectors. Girard et al. proposed a test vector inhibiting technique to tackle the
increased activity during test operation [17]. A mixed solution based on a reseeding scheme
and the vector inhibiting technique was also proposed in order to deal with hard-to-test
circuits that contain pseudo-random resistant faults. The technique reduced the total energy
consumption during test and allowed the test at system speed in order to achieve high delay
fault coverage. Experimental results showed weighted switching activity reductions ranging
from 18.5% to 78.5% without loss of stuck-at fault coverage.
As the test progresses the detection efficiency of the pseudo-random vectors decreases.
The number of pseudo-random vectors that will not detect previously undetected faults
increases. These vectors consume energy without contributing to fault coverage. This fact
was used by Manich et al. to propose two techniques to reduce the energy and average power
consumption of the system [28]. The first technique filters all the non-detecting subsequences
and the second technique uses reseeding which is an extension of the technique used in [18].
Energy and average power consumption savings up to 90% have been observed while applying
these two techniques in ISCAS benchmark circuits.
Gerstend?orfer and Wunderlich used the technique of filtering non-detecting patterns
for scan-based BIST architectures, combined with Hertwig and Wunderlich to avoid scan
path activity during scan shifting [14, 23]. The modules and modes with the highest power
consumption were identified and design modifications to reduce power consumption were
proposed. The proposed modifications reduced the test power by several orders of magnitude
with nominal cost in terms of area and performance penalties.
16
3.2 Reduction in Test Time
Shanmugasundaram and Agrawal proposed a dynamic scan clock control scheme in scan
testing to reduce test time while maintaining peak power limit [33, 35, 34, 36]. Per cycle
scan activity is monitored in the scan chain to speed up the scan clock for low activity cycles
without exceeding the specified peak power budget.
The scheme was based on the fact that not every vector has the highest activity and
hence can be scanned-in using a faster test clock without exceeding the power budget. The
power P dissipated at a node is given by
P = 12CV 2?f (3.1)
where C is the capacitance of the node, V is supply voltage, f is clock frequency and ? is
node activity factor.
In the worst case, scan clock frequency ftest can be can be determined based on the
maximum activity ? = 1, so that the test power can never exceed the power limit. Therefore,
Pbudget = 12CV 2ftest (3.2)
and,
ftest = 2PbudgetCV 2 (3.3)
In general, the worst case assumption can be modified for any value ?node. All vectors
are scanned in and scanned out at this frequency. However, most vectors do not cause the
maximum activity in the circuit and will dissipate much lower power than the allowed limit.
It is possible to scan in these vectors at higher clock frequencies without exceeding power
budget.
17
Excepting the worst case, if the number of transitions in the circuit reduces to fraction
1
i of the maximum number of transitions, then power consumption is reduced. Given that
the power should not exceed Pbudget, we can increase the test clock frequency to ftest ? i.
That is,
Actual Power = 12CV 2ftest ? 1i ? Pbudget (3.4)
Since the capacitance and the voltage are constant for a node, the power is proportional to
the product of activity and frequency.
The authors showed that if the activity reduces to iith fraction of the maximum ?
= 1, then the scan frequency can be increased to i times of the test frequency without
causing the power to exceed power budget. The analytical results showed that for very
low activity (? ? 0.0) test application time can potentially reduce by 50%. Experimental
results showed up to 19% reduction in time in the largest ISCAS 89 circuit with 2-3% area
overhead [33]. These experiments add scan chain activity monitoring and clock frequency
adjustment hardware to test-per-scan BIST circuits with a single scan chain.
The work presented in this thesis uses the above method, further extending it to a test
compression technique by breaking a single scan chain into multiple scan chains to reduce
the test application time to limit the power consumption during scan activity.
18
Chapter 4
Transition Density and Its Effect on Fault Coverage
To keep the power consumption low while testing, low transition density test vectors
are applied [7, 47]. This, in general, increases the test application time for achieving a
target fault coverage. To study the effect of transition density of vectors on fault coverage a
detailed analysis has been done. This chapter describes the variation in fault coverage due
to different transition density selection and compared to fault coverage attained by weighted
random patterns. A best case transition density is also determined from that analysis.
4.1 Weighted Random Pattern
Weighted random patterns have been used before to reduce test length for combinational
circuits [2, 3, 4, 22, 32, 47]. Proper selection of the input probability can increase the
efficiency of test vectors in detecting faults, resulting in reduced test time [27]. Therefore,
to achieve higher fault coverage with shorter test lengths weighted pseudo random patterns
are used [13]. For demonstrating the effectiveness of weighted pseudorandom test patterns,
fault simulation was done on ISCAS89 benchmark circuits.
A Matlab [29] program was written to construct different test vector sets. Each of the
sets contained 10,000 vectors but with different weights. Here, the weights are defined as
the probability of a bit being 1 in a vector. The weights are varied from 0.1 to 0.95 at 0.05
intervals. Thus a total 18 sets of vectors are constructed for the weights 0.1, 0.15, 0.2, etc,
up to 0.95.
Targeted fault coverage was set to 95% of the total faults and then fault simulation
was done using the 18 different vector sets as mentioned earlier. In each case the number
of vectors needed to reach the target fault coverage by each vector set was recorded. For
19
0 0.2 0.4 0.6 0.8 10
500
1000
1500
2000
2500
weights of random patterns
number of vectors
Figure 4.1: Number of test-per-scan vectors for 95% coverage in s1269 when 1-probability
of scan-in bits was weighted.
every circuit that was simulated there exists one specific weight that resulted in shortest test
length. The number of vectors obtained in this experiment for s1269 circuit as a function of
the weight (probability of 1 in the scan-in bits) is shown in Figure 4.1. For this circuit the
minimum is 22 vectors for a weight of 0.6.
4.2 Computing Best Case Transition Density from Best Case Weight
This section deals with the assumption of a best case transition density from the best
case weighted random patterns. The transition density in an uncorrelated-bit sequence that
has a 0-bit probability of p0 and 1-bit probability of p1 is given by p0p1 + p1p0 since a
transition occurs when a 1 follows a 0 or a 0 follows a 1. However, p0 = 1 ? p1, thus, the
transition density can be calculated as:
TD = (1?p1)p1 +p1(1?p1) = 2p1(1?p1) (4.1)
20
Hence, from Figure 4.1, for circuit s1269, if best case weighted random pattern has a 1-bit
probability of 0.6 then the corresponding transition density will be 2?0.6?0.4 = 0.48.
This implies that if a test vector set is constructed to have a transition density of 0.48,
then that vector set will generate an effective test for the circuit with shortest test length.
In other words it can be assumed that a vector set of average transition density of 0.48 will
result in detecting more faults with fewer vectors when compared to the numbers of vectors
applied with transition densities higher or lower than 0.48.
4.3 Effect of Controlled Transition Density on Fault Coverage
If bits are generated randomly, the probabilities of generating a 1 or a 0 are equal, i.e.,
p0 = p1 = 0.5. Hence the transition density of the bit stream is also 0.5. To generate a
transition density higher or lower than 0.5, bits must be generated with negative or positive
correlation, respectively. Therefore, the bit stream will contain shorter runs of consecutive
1s or 0s for a transition density higher than 0.5 and longer runs of consecutive 1s or 0s for
a transition density lower than 0.5.
A Matlab [29] program was written to generate test vector sets, each set containing
10000 vectors but with different transition densities. Here also the transition density was
varied from 0.1 to 0.95, with 0.05 intervals. The vector set generated for 0.1 transition
density has longer runs of 1s and 0s in consecutive bit positions. Likewise the vector set
having transition density of 0.95 has very short runs of 1s and 0s in consecutive bit positions.
Target fault coverage was set to 95% of the total faults and then fault simulation was
done using the 18 different vector sets as mentioned above. In each case number of vectors
needed to reach the target fault coverage by each vector set was recorded. For every circuit
that was simulated, there existed a best transition density (TD) that resulted in the shortest
test length.
The same set of ISCAS89 benchmark circuits was again used for fault simulation. Ta-
ble 4.1 shows the best case results obtained from fault simulation using AUSIM [37]. The
21
Table 4.1: Best case weighted random and transition density vectors for 95% fault coverage
in ISCAS89 circuits obtained from fault simulation experiments.
Circuit Target Weighted random vectors Transition density vectors
name FC (%) p1 No. of vectors 2p1(1?p1) TD No. of vectors
S298 77.1 0.6 18 0.48 0.55 423
S382 95 0.3 56 0.42 0.45 124
S510 95 0.4 136 0.48 0.5 152
S635 95 0.9 97 0.18 0.1 1883
S820 95 0.45 2872 0.495 0.45 5972
S1196 95 0.55 1706 0.495 0.45 2821
S1269 95 0.6 22 0.48 0.5 24
S1494 98.8 0.5 4974 0.5 0.45 3158
S1512 95 0.75 538 0.375 0.2 338
table shows the numbers of vectors that achieved 95% fault coverage. The third column
gives the weighted random bit probability (p1) that required minimum number of vectors
shown in column 4. In column 5, the probability p1 of column 3 is used to compute the
transition density from equation 4.1.
The last two columns of Table 4.1 give the best case transition density (TD) and the
corresponding number of vectors obtained from simulation. The differences in the transition
densities of columns 5 and 6 can be because the two were obtained from two different
statistical test samples. Also, equation 4.1, used for computing TD in column 5, assumes
uncorrelated neighboring bits, an assumption that is yet to be validated.
Figure 4.2 shows a bar chart of the number of transition density vectors obtained from
fault simulation experiments to reach 95% fault coverage in circuit s1269. A vector set
generated with 0.5 transition density has the best fault detecting capability with smallest
number (only 24) as compared with the other transition density vector sets.
However, unlike highly efficient weighted random patterns the patterns constructed
based on transition density were not able to detect 100% of faults for some circuits. As
shown in Figure 4.3, the weighted random patterns and the transition density based vectors
do not always have the same effectiveness. Which is better, often depends upon the circuit.
22
0 0.2 0.4 0.6 0.8 10
50
100
150
200
250
300
transition density
number of vectors
Figure 4.2: Number of test-per-scan vectors for 95% coverage in s1269 for various transition
densities of scan-in bits.
While the generation of weighted random patterns is well understood, transition density
patterns need further study.
Note weighted random bits have a transition density of their own. But our transition
density patterns generated by the toggle flip-flop always have equal number of 0s and 1s.
Though the transition density of weighted random bits for any p1 can never be higher than
0.5, the toggle flip-flop can produce transition densities greater than 0.5. Such patterns
will produce high power consumption, which can be lowered by the adaptive test clock
procedures [33, 34, 35, 36] as discussed in a later chapter, if the vectors gave accelerated
fault coverage. This aspect needs additional study.
A more detailed analysis has been done on the fault coverage of ISCAS89 circuits and
the results has been tabulated in a later chapter on experimental results. It may be noted
that many techniques has been used to detect hard to detect faults in scan-BIST, such as
vector reseeding, test point insertion, etc. We show here that to reach certain fault coverage,
a specific weight (probability of a bit being 1) or a specific transition density (per vector
transition probability that determines the number of transitions in a bit stream) can produce
23
1 2 3 4 5 6 7 8 90
1000
2000
3000
4000
5000
6000
s298 s382 s510 s635 s820 s1196 s1269 s1494 s1512
number of vectors
weighted random
transition density
Figure 4.3: Numbers of weighted random and transition density vectors for 95% fault cov-
erage in several ISCAS89 circuits.
an effective test with the minimum number of vectors as compared to arbitrary weights or
transition densities.
If the vectors are scanned in with a fixed test clock frequency, then the number of vectors
in a test sequence determines the test time. The shorter the test length the lesser time it
will take to apply the test. Hence, finding the best case weight or the best case transition
density is useful for constructing an effective test sequence for scan-BIST circuits in order
to optimize the test application time.
The test time can be further reduced by adapting the test frequency dynamically de-
pending on the transition density in the vectors, which will be described in the next chapter.
24
Chapter 5
Adapting Scan Clock Based On Transition Density
This chapter describes the scheme of reducing test application time for scan testing
by adapting the test clock according to the transition density of the test vectors without
exceeding the power limit. It is assumed that the test power limit is set by the maximum
activity in the scan chain during capture cycles. Therefore, by monitoring the transition
density of the scan in vectors, the scan clock can be dynamically adjusted [33, 34, 35, 36].
5.1 Dynamic Control of Scan Clock in a BIST Circuit
The circuit model chosen for the analysis is the test-per-scan multiple scan chain based
BIST model [38]. To implement this model, flip-flops are added to primary inputs and
primary outputs of the sequential circuit under test (CUT). All flip-flops are converted into
scan flip-flops and partitioned into multiple scan chains. A test pattern generator (TPG), a
multiple input signature analysis register (MISR) and a BIST controller are also added. A
frequency divider module is added, which provides either the scan clock or the system clock,
based on the mode of operation of the circuit. If the circuit is in the system mode, the BIST
circuitry that consists of TPG, MISR and BIST controller are kept idle and the circuit runs
with the system clock provided by the control clock select block. Likewise if the circuit is in
test mode then the BIST circuitry is active and the control clock selects the scan clock, as
shown in Figure 5.1. Here the frequency of the scan clock is determined by the peak power
consumption of the circuit, which is assumed to be the power consumption in the scan chains
having activity ? = 1.
Figure 5.2 shows the hardware implementation of an adaptive scheme for the test clock
in a test-per-scan BIST model with multiple scan chains. A larger frequency divider divides
25
Figure 5.1: BIST circuitry for non-adaptive scan test clock.
Figure 5.2: BIST circuitry for adaptive scan test clock.
the system clock into n different frequencies, where the fastest clock is the system clock and
the slowest frequency is the test clock, based on the peak power consumption as described
earlier. An n:1 multiplexer MUX is also added to select from the range of frequencies
generated by the frequency divider block.
Figure 5.3 shows an inactivity monitor block implementation with monitors attached to
the first scan flip-flop of each chain. The inactivity monitors are simple XNOR gates that
produce a 1 whenever inactivity enters the attached scan chain and produces a 0 when an
activity enters the scan chain. We feed the output of all monitors to a counter. Depending
26
Figure 5.3: Inactivity monitor.
on the number of lines that are logic 1 at the output of the XNOR gates counter adds from
0 to n (number of scan chains) per clock. Hence all the inactivity that has entered the entire
scan chains per clock has been accounted for. If no inactivity enters any of the scan chains,
then the counter stays in its previous state by adding 0. If 1 inactivity enters one of the scan
chains, the counter counts up by 1, and so on. Therefore, if inactivity enters every one of
the n scan chains, the counter adds n to the previous count.
While counting up, if the counter reaches a certain threshold it signals the frequency
selector MUX to deploy a higher frequency and hence dynamically adapts the scan frequency
according to the inactivity in the chain.
The counter shown in Figure 5.3 is a simplified block diagram. The actual hardware
consists of an adder, a combinational block with a register and a MUX. At every clock,
if a non-activity enters a scan chain, the inactivity monitor attached to the first flip-flop
of the scan chain becomes high. The inactivity monitor from every scan chain feeds to a
combinational block. The output of the combinational block is connected to a separate select
line of a MUX. The inputs of the MUX are 0, 1, 2... n, where n is the number of scan chains
in the design. The inputs to the adder are the previous state of the register and the output
from the MUX. The hardware is shown in Figure 5.4.
27
Figure 5.4: The inactivity counter
If 1 output of the inactivity monitor is high, the output of the combinational block will
be 01 (assuming the number of scan chains in the design to be 4). the first input to the
adder is the present state of the register and the second input to adder is the output to
the mux. For our example, 1 will be added to the current contents of the of the register.
Hence, at every clock the contents of the register wil be updated according to the number
of inactivities that entered the scan chains. Similarly, If 1 inactivity enters in each of the
scan chains, i.e., if the total number of inactivity is 4, then the combinational circuit output
will be 11. This in turn will choose the second input of the adder to be 4. Hence, 4 will
be added to the current state of the register. However, If no inactivity enters in any of the
scan chains, the combinational circuit will produce 00 output. Hence, 0 will be added to the
current state of the register.
5.2 Estimation of Scan-in Time Reduction
The reduction in scan-in time for this multiple scan chain based circuit can be estimated
almost similarly as for a single chain implementation [34]. It is assumed here that the
28
captured vector in the scan chain prior to the scan in vector has an activity factor of 1. That
is, the scan chain is filled up with alternating 0s and 1s.
Let N be the total number of flip-flops in the circuit, A be the non-transition density
(A = 1 ? ?), v be the number of frequencies, T be the time period corresponding to the
fastest clock and m be the number of scan chains. Hence, the largest scan chain will have Nm
flip-flops. The time reduction for scan-in of vectors will be dominated by the length of the
longest scan chain.
The time period of the fastest clock is v times faster than the slowest clock. Therefore,
the time period for the slowest clock is given by vT. If bits are scanned in with the slowest
clock, the total scan-in time is given by NmvT. The number of non-transitions in the input
vector equals AN. These AN non transitions occur in Nm cycles. Therefore, a non-transition
occurs every 1mA cycles. Hence, z transitions will occur in zmA cycles.
The cumulative number of non-transitions that can be held by all the scan chains is equal
to N and to be able to deploy the speed up mechanism for all ranges of non-transitions, the
frequency is increased only after counting Nv non-transitions. Since a non-transition occurs in
every 1mA cycles, Nv non-transitions occur in NmAv cycles. Thus, the frequency is not increased
until Nv non-transitions occur in about NmAv cycles. The counter keeps on counting until it
reaches 2Nv non-transitions in the next NmAv cycles, before the next step up in frequency is
signaled by the counter. Therefore, the first Nv non-transitions are scanned in using a scan
clock, which has a period of vT. Then the next Nv non-transitions are scanned in using a
scan clock which has a period of (v ? 1)T, and so forth. The clock period can reach the
maximum of T, i.e., vth frequency, when the scan chain is filled with only non-transitions.
As shown in Table 5.1, the ith frequency corresponds to a clock period of (v ?i + 1)T
when the largest scan chain has between (i?1)Nv and iNv non-transitions. The ith frequency is
employed between clock cycles (i?1)NmAv and iNmAv. The scan clock initially has a period of vT
in cycle 1. The scan clock period is decreased in steps until the Nmth clock. Therefore, the
29
Table 5.1: Determination of clock cycle range for different frequencies.
Clock Number of non-transitions Clock cycles
period Lower limit Upper limit Lower limit Upper limit
1 vT 0 Nv 0 NmAv
2 (v ?1)T Nv 2Nv NmAv 2NmAv
. . . . . .
. . . . . .
i (v ?i)T (i?1)Nv iNv (i?1)NmAv iNmAv
. . . . . .
. . . . . .
v T (v?1)Nv vNv (v?1)NmAv vNmAv
clock cycle corresponding to the last scan clock frequency is Nm. If the maximum number of
frequencies the scan clock will speed up to is given by x, then
N
mAv ?x =
N
m (5.1)
x = Av (5.2)
Thus, the number of scan clock frequencies employed for a scan-in vector with non-transition
density of A is Av.
The total scan-in time per vector is the sum of scan-in times at each frequency. The
scan-in time at each frequency is given by the product of the number of cycles run at that
frequency and the time period of the clock, as shown in Table 5.1 The total scan-in time per
vector is given by
Avsummationdisplay
i=1
(ceilingleft iNmAvceilingright?ceilingleft(i?1)NmAv ceilingright)?(v ?i+ 1)T (5.3)
30
Substituting Nm = ?N we get
Avsummationdisplay
i=1
(ceilinglefti
?N
Avceilingright?ceilingleft
(i?1) ?N
Av ceilingright)?(v ?i+ 1)T (5.4)
The above equation simplifies to
Avsummationdisplay
i=1
?N
Av(v ?i+ 1)T (5.5)
If v and ?N are chosen to be powers of 2, then the total scan-in time is
?N
v (v ?Av ?Av(Av +
1
2) +Av)T (5.6)
So, the reduction in scan-in time in largest scan chain is
=
?NvT ? ?Nv (v.Av ?Av(Av + 12) +Av)?T
?Nvt
= A2 ? 12v
= 1??2 ? 12v (5.7)
Hence maximum reduction in scan-in time can be reached only if non-transitions are scanned
in and maximum reduction is close to 50% if the number of frequencies chosen is very high.
For random patterns, where non-activity is 0.5, the highest reduction that can be reached is
around 25%.
5.3 Time Reduction and Power Consumption
This section gives experimental results on reduction in scan-in time for random pat-
terns with activity 0.5 for ISCAS89 benchmark circuits. Table 5.2 summarizes the results.
The transition density in random patterns or pseudo random patterns is approximately 0.5.
Theoretically, using the dynamic scan clock control scheme we get 50% reduction in test
31
Table 5.2: Scan-in time reduction in ISCAS89 benchmark circuits.
Circuit No. of FF No. of gates No. of vectors Time savings (%)
s298 23 282 19 29.23
s382 30 361 90 32.52
s510 32 447 138 29.55
s820 42 655 3455 27.55
s1196 46 885 2528 27.7
s38417 443 31834 1000 27.1
application time if transition density is 0, that is if the vector is filled with all 1s or all 0s.
When patterns are applied from a pseudo random TPG the transition density will be close
to 0.5 if not exactly 0.5. From the table, it is to be noted that it takes a while for the TPG
to start becoming random and the activity is less than 0.5 in the beginning. Therefore, for
a small number of vectors and for small circuits the saving in scan-in time dose not conform
to the theoretical analysis. But as larger numbers of vectors are used for we see the time
reduction conforming to 25% saving as transition density approaches 0.5.
The power while applying these tests is kept under a peak power constraint. Here the
peak power is assumed to be the power consumed when the scan chain has activity 1, that
is, the scan chain is filled with alternating 1s and 0s. The power graphs in Figures 5.5
and 5.6 show the power consumption for circuit s1196 with fixed clock and adaptive clock
scan, respectively.
32
0 20 40 60 80 1003.6
3.8
4
4.2
4.4
4.6
4.8
5
5.2 x 10
4
No. of Clock Cycles
Per Cycle Power (uW)
Fixed frequency power
Adaptive frequency power
Power budget
Figure 5.5: Power per test clock in s1196.
0 5 10 15 20 250
1
2
3
4
5
6 x 10
4
No. of Clock Cycles
Per Cycle Power (uW)
Fixed frequency power
Adaptive frequency power
Power budget
Figure 5.6: Power per test clock for first 25 cycles.
33
Chapter 6
Controlled Transition Density Patterns for BIST
This chapter presents a hardware implementation of the controlled transition density
based vector generation by a BIST-TPG. The first section describes the hardware used
to implement the pattern generator, the second section estimates the randomness of the
generated vectors and the last section describes the implementation of the TPG in the
adaptive scan clock scheme described earlier.
6.1 BIST-TPG Circuit for Controlled Transition Density
The test pattern generator (TPG) chosen for the analysis is a 28 bit external LFSR
using the polynomial p(x) = x28 +x3 +1. Its combinational part consists of only AND gates
and inverters, an eight input MUX to select from eight different probability of a bit being 1,
a simple finite state machine (FSM) to control the MUX, and a toggle flip-flop. Figure 6.1
shows the circuitry for the test pattern generator. The combinational network generates
eight different weighted random bit sequences. The weights are constructed by ANDing two
or more outputs from non-adjacent cells of the LFSR. Here it is to be noted that two cells
in an n-bit LFSR are adjacent if the output of one cell feeds the input of the second directly,
without an intervening XOR gate.
As shown in Figure 6.1, eight weights for the probability of a bit being 1 are 0.125, 0.25,
0.375, 0.4375, 0.5, 0.625, 0.75 and 0.875, respectively. The probability of a bit being 1 or 0
at the output of any cell of the TPG is 0.5. This weight is directly fed to one of the inputs
of the MUX. Two outputs from two non-adjacent cells were ANDed to produce a weight of
0.25, three outputs from three non-adjacent cells were ANDed to produce a weight of 0.125,
and inverting these two weights we get weights of 0.75 and 0.875, respectively.
34
Figure 6.1: Hardware implementation of TPG.
For generating weight of 0.375, weight 0.75 is again ANDed with another cell output
that is not adjacent to any of those two cells that are used in creating the 0.75 weight.
Similarly for generating weight of 0.4375, weight 0.875 is ANDed with another non-adjacent
cell output. Finally, to construct a weight of 0.625, weight 0.375 is inverted.
The different weight lines are the inputs to the 8:1 MUX. An FSM controls the select
lines of the MUX to choose the intended probability.
A toggle generating flip-flop constructed with a D-flip-flop and an XOR gate is added
to produce the required transition density in the vectors that are to be fed to scan chain as
shown in Figure 6.1. Through the select lines of the MUX a weight is selected and the bit
sequence fed to one of the inputs of the XOR gate; the other input line of the XOR gate is
the output of the D flip-flop.
Once a weight is selected, the corresponding bit sequence will then control the transition
at the output of the XOR gate. A 1 in the bit sequence will produce a transition at the
35
Figure 6.2: Hardware implementation of TPG for M scan chains.
output of the XOR gate and a 0 will produce no transition. Thus the resulting transition
density in the bit stream at the output of the XOR gate will have the same weight (i.e., the
probability of a transition to occur) as the weight selected from the MUX.
The output of the D type flip-flop is fed back to the input with an XOR gate to generate
the toggling of the bits. Whenever a 1 is encountered at the input of the XOR gate from
the output of the MUX there is a toggle. Whenever a 0 is encountered the bit remains
unchanged. The generated bit stream is then fed to the scan chains.
This bit stream then feeds the scan chain input. For multiple chain implementations,
only the combinational part to generate weights is copied multiple times and all the same
weighted lines are fed to the corresponding input of the MUX. Also, toggle flip-flops are
added for each output line of the MUX and fed to individual XOR gates to produce separate
bit streams. Thus separate bit streams are used to feed each separate scan chain, as shown
in Figure 6.2.
36
Table 6.1: Estimation of randomness in generated 1000 random patterns.
Avg. no. of 1s in each bit position p1 summationtextnj=1(p0j ?log2 p0j +p1j ?log2 p1j)
1255 0.1255 15.25
2508 0.2508 22.74
3746 0.3746 26.74
4378 0.4378 27.68
5007 0.5007 27.99
6254 0.6254 26.71
7491 0.7491 18.21
8750 0.875 15.21
6.2 Randomness of Weighted Random Patterns
Entropy has been used by many researchers as a measure of randomness metric
rsummationdisplay
i=1
pi ?log2 pi (6.1)
where pi is the probability that the signal is in state 1 and r denotes the total number of
states [9, 40]. This metric can quantify how the quality of randomness deviates if there is a
bias in the random number generator. For an n-bit perfect random number generator, r = 2n
and pi = 0.5n and hence entropy will be Hmax = n, reflecting the maximum randomness.
For any biased random bit generator, entropy will be 0 ? H ? n. For an easier computation
in an n-bit LFSR, if p1j(p0j) denotes probability of having 1(0) in bit bj then entropy can be
approximated by adding the entropies of individual bits:
nsummationdisplay
j=1
(p0j ?log2 p0j +p1jtimeslog2 p1j) (6.2)
To verify the weighted randomness in case for the above mentioned weights for a vector
set of 10000 vectors of n = 28 bits, the number of 1s was counted for each weight. Table 6.1
summarizes the randomness of the intended weight.
37
0 5 10 15 20 25 301000
2000
3000
4000
5000
6000
7000
8000
9000
distribution of 1s
Number of 1s
wrp=0.1255
wrp=.2508
wrp=0.3746
wrp=.4378
wrp=0.5007
wrp=.6254
wrp=7491
wrp=875
Figure 6.3: Distribution of 1s in weighted random patterns.
As the toggle flip-flop controls the transitions depending on probability of the input
signal being 1, the transition density will follow the randomness value closely. Also, Figure 6.3
shows the distribution of 1s generated from each of the weights.
6.3 Dynamic Control of Scan Clock in BIST Circuit with Modified TPG
We replaced the conventional test pattern generator with the modified test pattern
generator, along with the finite state machine that selects transition density from the test
pattern generator as shown in Figure 6.4. The finite state machine (FSM) takes the number
of the patterns applied as inputs from the BIST controller and controls the transition density
of the test vectors applied.
From the analysis of circuits in chapter 4, we can pre-determine the best case transition
density with the modified TPG and run the whole test session with a pre-selected transition
density. Also the circuitry to dynamically adapt the scan clock will help speed up the
test clock by monitoring the inactivity and, therefore, keep the whole test session power-
constrained.
38
Figure 6.4: Adaptive scan clock scheme with modified TPG.
6.4 Fault Coverage by the Modified TPG
The newly proposed TPG gives similar fault profiles as the simulation results. Fig-
ures 6.5 and 6.6 show the number of vectors needed to reach fault coverage of 95% in circuits
s510 and s1512, respectively, when test vectors were applied from the new modified TPGs.
It is to be noted that, in circuit s510 the best case transition density is 0.4375 whereas the
best case transition density for s1512 is 0.25. This actually conforms to the best case weights
for random pattern, 0.4375 and 0.875, respectively, as described earlier.
From both the graphs it can be seen, that the proposed LFSR is capable of producing
vectors with the desired weight (probability of a bit being 1) and the desired transition
density (number of transitions in a vector bit stream). Fault coverage obtained by the two
39
1 2 3 4 5 6 7 80
1000
2000
3000
4000
5000
6000
7000
w=0.125 w=0.25 w=0.375 w=0.4375 w=0.5 w=0.6375 w=0.75 w=0.875
number of vectors
weighted random
transition density
Figure 6.5: Performance of transition density and weighted random patterns of s510.
1 2 3 4 5 6 7 80
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
w=0.125 w=0.25 w=0.375 w=0.4375 w=0.5 w=0.6375 w=0.75 w=0.875
number of vectors
weighted random
transition density
Figure 6.6: Performance of transition density and weighted random pattern of s1512.
different kind of vectors followed the trends of the vectors produced by the Matlab [29]
program that was described earlier.
40
Chapter 7
A Greedy Algorithm to Apply Tests with Different Transition Densities
This chapter describes a scheme to reduce test application time by mixing vectors of
different transition densities in a controlled way. Here we propose the scheme through a
greedy algorithm to find an optimal vector set that consists of various transition densities
but in a controlled manner.
7.1 Analysis of Fault Profiles
From Chapter 4, it can be seen that to achieve certain lower fault coverage the low
transition densities are almost as good as the best case transition densities in terms of
number of vectors to reach that fault coverage. Figure 7.1 shows the fault coverage in circuit
s510 by vectors with transition densities from 0.1 to 0.5. According to our analysis in Chapter
4 we see that the best case transition density is 0.5.
The steep rise in the fault coverage indicates that to achieve a lower fault coverage the
lower transition density based vectors are almost as good as the best case transition density
vectors. So even if the lower transition density vectors need a few vectors more to reach the
partial fault coverage using that transition density, with the adaptive scan clock scheme we
can reduce the test application time. Figure 7.2 shows a logarithmic scale plot of the fault
coverage in s510 from the data of Figure 7.1.
The idea is to break down the target fault coverage into a number of partial fault
coverages and apply a greedy algorithm to select a best case transition density in each
partial target fault coverage stage.
41
0 2000 4000 6000 8000 10000 12000 14000 16000 18000
200
300
400
500
600
700
td=0.1
td=.15
td=0.2
td=.25
td=0.3
td=.35
td=0.4
td=0.45
td=0.5
Figure 7.1: Detected faults vs. number of transition density vectors obtained from fault
simulation of s510
7.2 Algorithm to Apply an Optimal Set of Vectors
The greedy algorithm described here finds a locally optimal solution for reaching partial
target fault coverage in each stage based on the number of vectors to reach that partial test
coverage multiplied by the time taken to apply those vectors in adaptive scan clock scheme,
to reach a globally optimal vector set to achieve the final target fault coverage.
The time needed to apply each vector in adaptive scheme for a given transition density
can be computed from an equation described here. Using adaptive scan clock scheme for a
transition density ? and the number of frequencies available to adapt from v, the reduction
in scan-in time is given by,
1
2(1??)?
1
2v (7.1)
42
0 0.5 1 1.5 2 2.5 3 3.5 410
20
30
40
50
60
70
80
90
100
No of Vectors in Log scale
% of Faults Detected
td=0.1
td=.15
td=0.2
td=.25
td=0.3
td=.35
td=0.4
td=0.45
td=0.5
Figure 7.2: Fault coverage by transition density vectors obtained by simulation of s510
Now in a setup where the number of frequencies to adapt from is fixed as v, the reduction
in time varies with ?. Therefore, for simplicity we can take reduction in time to be,
1
2(1??) (7.2)
Hence, the time taken to scan in vectors with a given transition density can be found by
deducting the scan-in time reduction from the total time taken to scan-in N vectors at a
fixed frequency.
The proposed algorithm considers all transition densities and the number of vectors
needed in each transition density to reach the fault coverage at each stage and chooses the
transition density that will result in the minimum time to scan-in vectors for that stage.
Algorithm :
Given for a circuit a target fault coverage FC, fault profiles data table for each transition
43
density to reach target fault coverage FC, a fixed number of vectors and partial fault coverage
pfc1, pfc2, pfc3 . . . pfcn find an optimal set of vectors of various transition densities.
? Step 1: Get the best case transition density - x
? Step 2: Get the fault coverage achieved by the best case transition density - y
? Step 3: Set FC = y
? Step 4: Set the number of vectors - z
? Step 5: Divide the FC into n partial fault coverages Pfc1 Pfc2....Pfcn
? Step 6: For each partial fault coverage starting from the lowest partial fault coverage
up to FC
1. Compute the time taken to scan-in the number of vectors needed to reach that
partial fault coverage
2. Choose the minimum value
3. Select that transition density as optimum for that partial fault coverage
4. Set the number of vectors
For illustrating the algorithm, the circuit s510 is chosen. 10000 thousand vectors were
generated for each the eight different transition densities from the LFSR described earlier.
Running fault simulation using those vectors it was observed that transition density 0.5 is
the best. We then divide the target fault coverage into 6 partial fault coverages (Pfc) and run
the algorithm to find a vector mix that will reduce the transition density in the vectors in a
controlled way so that the final target fault coverage can be achieved without lengthening the
test session and without loss of fault coverage. In addition, if a transition density lower than
the best case transition density is chosen in order to reach some of the partial fault coverages
then that will reduce the test application time using the scan clock adaptive scheme.
44
0 100 200 300 400 500100
200
300
400
500
600
700
No of Vectors
No of Faults Detected
best case td
mixed td
Figure 7.3: Detected faults vs. number of vectors in s510 for best case transition density
vectors and mixed transition density vectors.
It can be seen from Figure 7.3 and Table 7.3 that even if few more vectors compared
with the best case transition density are required, we can speed up the test by using optimal
transition density in each partial case without increasing total number of vectors and without
sacrificing fault coverage. Figure 7.4 shows the flow chart of the algorithm described in this
section.
7.3 Implementation of Controlled Mixed Transition Density Based TPG in
BIST Circuit
The same adaptive scheme described in Chapter 6 (Section 3) is used to implement
the controlled mix of transition densities with a small modification in the FSM as shown
in Figure 7.5. In the best case transition density setup the FSM selected the best case
transition for the whole test session. It keeps track of number of patterns applied from the
BIST controller. In place of running the whole test session in one transition density now
45
Table 7.1: Performance of mixed transition density vectors.
Best case TD Mix TD
FC 99% 99%
Number of vectors 592 592
Transition density 0.5 Pfc1 70% ? TD 0.25 vectors 16
Pfc2 80% ? TD 0.25 vectors 14
Pfc3 85% ?TD 0.4 vectors 14
Pfc4 90% ?TD 0.5 vectors 32
Pfc5 95%?TD 0.5 vectors 76
Pfc6 99%?TD 0.4 vectors 404
the FSM is programmed to select different transition density as the pattern count reaches
certain limit. The FSM can be programmed to change the selection according to the Greedy
algorithm described earlier. For example, from the Table 7.1, the FSM can be programmed
to run first 30 vectors in 0.25 transition density and hence speed up the test application
time by adapting the scan clock. Also running 32 vectors and later 404 vectors in transition
density 0.4 we can still speed up as compared running all the vectors in 0.5 transition density.
Thus the hardware provides a controlled mixing of transition densities in the vector
sets. Using vectors that have lower transitions at the beginning of the test sequence helps to
detect the easy to detect faults also facilitates the speeding up the scan clock. The transition
density of the vector sets can be tuned to higher transition density by the use of a simple
FSM later towards the end of the test sequence. The power is held under the power budget
by dynamically adapting to a slower scan clock.
46
Figure 7.4: Flow chart of proposed algorithm.
Figure 7.5: Hardware Implementation for controlling a mix of various transition densities.
47
Chapter 8
Experimental Results
This chapter describes the entire procedure followed in experiments, followed by results
obtained. ISCAS 89 circuits were chosen to run the experiments.
8.1 Fault Coverage Analysis
The benchmark circuits from ISCAS89 suite were used to do the analysis of fault cov-
erage based on three different types of test vectors. A Matlab [29] program was written to
generate all the different vector sets. Scripts were written to convert the generated vector
into scan patterns for AUSIM, an Auburn University fault simulation program [37].
Matlab [29] and AUSIM [37] programs were run on Auburn University?s High Perfor-
mance Compute Cluster (HPCC) [42].
Number of vectors needed to reach target fault coverage has been recorded for each set
of vectors with a different weight or a different transition density. The number of vectors
was thus chosen for the experiments and the BIST controller was set accordingly.
The newly proposed TPG was implemented in the previous netlists in place of con-
ventional 28 bit LFSR. It was again simulated in Modelsim and the randomness of the
generated vectors based on weights were measured after capturing the vectors Modelsim
transcript. This transcript files were fed to another Matlab [29] program to compute the
randomness. Those vectors were simulated by AUSIM [37] to obtain their fault coverage.
Table 8.1 shows the comparison between the number of vectors needed to achieve 95%
fault coverage for ISCAS89 circuits. Also Table 8.2 shows the comparison between the
number of vectors needed for the same circuits to reach 90% of fault coverage. These fault
48
Table 8.1: Test lengths for random and best-case weighted random (WRP) and transition
density (TDP) patterns for 95% fault coverage in ISCAS89 circuits.
Circuit DFFs Gates PI+
PPI
No. of
vectors
(0.5)
No. of
vectors
(WRP)
No. of
vectors
(TDP)
Weight TD
S27 4 10 7 16 7 35 0.15 0.8
S298 14 119 17 113 73 10000** 0.45 0.0
S382 21 158 24 122 56 124 0.3 0.45
S386 6 159 13 1042 540 869 0.4 0.6
S344 15 160 24 82 68 10000 0.4 0.0
S510 6 211 32 138 136 152 0.4 0.5
S420 16 218 34 10000 644 735 0.9 0.2
S635 32 286 34 10000 97 1883 0.9 0.1
S832 5 287 23 7840 3771 10000 0.45 0.0
S820 5 289 23 2872 2827 5972 0.5 0.45
S641 19 379 54 175 175 270 0.5 0.7
S713 19 393 54 10000 10000 10000 0.75 0.55
S967 29 394 45 2390 934 4346 0.4 0.35
S953 29 395 45 3088 1214 4502 0.35 0.35
S838 32 446 66 10000 1902 2848 0.95 0.1
S1238 18 508 32 10000 10000 10000 0.55 0.45
S991 19 519 84 100 76 62 0.55 0.1
S1196 18 529 32 2528 1706 2821 0.55 0.45
S1269 37 569 55 48 22 24 0.6 0.5
S1494 6 647 14 1090 1046 562 0.55 0.4
S1488 6 653 14 1002 937 512 0.55 0.4
S1423 74 657 91 86 86 104 0.6 0.2
S1512 57 780 86 5556 538 338 0.75 0.2
S13207 669 7951 700 15000* 15000 15000 0.35 0.3
S15850 597 9772 611 15000 15000 15000 0.5 0.3
simulations were done with 10000 vectors for each case of weight or transition density. There-
fore, for the circuits (marked as ** in table) that could not reach 90% or 95% fault coverage
within those 10000 vectors, the best case was taken as 10000 vectors. The bigger circuit
s13207 and s15850 were simulated for faults with 15000(*) vectors.
The first four columns show the circuit name, number of flip-flops, gates , primary and
pseudo-primary inputs (PPI) in the respective circuits. The next column shows number of
vectors needed to reach the target fault coverage with conventional random test patterns.
The sixth and seventh column indicate the number of vectors needed to reach the same target
49
Table 8.2: Test lengths for random and best-case weighted random (WRP) and transition
density (TDP) patterns for 90% fault coverage in ISCAS89 circuits.
Circuit DFFs Gates PI+
PPI
No. of
vectors
(0.5)
No. of
vectors
(WRP)
No. of
vectors
(TDP)
Weight TD
S27 4 10 7 15 5 12 0.15 0.6
S298 14 119 17 52 52 10000 0.5 0.0
S382 21 158 24 42 30 16 0.3 0.4
S386 6 159 13 621 317 609 0.35 0.4
S344 15 160 24 50 34 10000 0.45 0.0
S510 6 211 32 70 84 76 0.4 0.5
S420 16 218 34 10000 300 263 0.9 0.1
S635 32 286 34 10000 3 13 0.85 0.15
S832 5 287 23 1380 1126 2128 0.45 0.55
S820 5 289 23 962 806 1524 0.45 0.45
S641 19 379 54 38 31 41 0.7 0.55
S713 19 393 54 109 131 139 0.75 0.55
S967 29 394 45 746 268 656 0.4 0.45
S953 29 395 45 746 312 458 0.4 0.35
S838 32 446 66 10000 488 491 0.95 0.15
S1238 18 508 32 2506 1759 2598 0.55 0.45
S991 19 519 84 34 34 23 0.5 0.75
S1196 18 529 32 675 537 712 0.65 0.3
S1269 37 569 55 14 10 12 0.6 0.85
S1494 6 647 14 460 416 266 0.6 0.4
S1488 6 653 14 438 410 247 0.6 0.55
S1423 74 657 91 12 6 11 0.85 0.2
S1512 57 780 86 46 46 78 0.5 0.6
S13207 669 7951 700 4262 2127 1490 0.35 0.3
S15850 597 9772 611 2463 2463 3293 0.5 0.3
fault coverage but with specific weight (probability of it being 1) and specific transition
density respectively. These are denoted as best case weight and best case transition density.
8.2 Dynamic Scan Clock Implementation
Verilog netlists of ISCAS 89 circuits were used for simulation purpose. The dynamic
scan clock control scheme was implemented in circuits with multiple scan chains as described
in Chapter 5. The simulation tool from MentorGraphics Modelsim, was used to simulate the
circuit with and without the dynamic scan clock circuitry. Time needed to apply the test
sequences in the both cases were recorded.
50
Table 8.3: Reduction in scan-in time for conventional random patterns of weight 0.5.
Circuit Patterns Test time
Fixed
Freq.
(ns)
Test time
Adap. Freq.
(ns)
Test time
reduc
-tion
(%)
s298 52 14605 10050 31.18
s382 42 15165 10320 31.94
s510 70 25245 18200 27.9
s820 962 461805 348392 24.55
s953 746 537165 418073 22.17
s1196 675 351045 264652 24.61
s1488 438 175245 124572 28.91
s13207 4262 36482765 31565011 13.47
s15850 2463 18915885 16341260 13.61
Table 8.4: Reduction in scan-in time for best-case weighted random patterns (WRP).
Circuit Best Case
Weight
Patterns Test time
Fixed
Freq
(ns)
Test time
Adap. Freq
(ns)
Time
reduc
-tion
(%)
s298 0.5 52 14605 10050 31.18
s382 0.3 30 10845 6661 38.57
s510 0.4 84 30285 19570 35.38
s820 0.4 806 386925 268971 30.48
s953 0.4 312 224685 162371 27.73
s1196 0.6 537 279285 221416 20.72
s1488 0.6 410 1 64045 117901 28.12
s13207 0.35 2127 18207165 16180025 11.13
s15850 0.5 2463 18915885 16341260 13.61
The synthesis tool from MentorGraphics Leonardo Spectrum was used to analyze the
area of the circuits with and without the dynamic scan clock circuitry. To insert the scan
flip-flops in the basic verilog netlists MentorGraphics DFT Advisor tool was used.
In this section scan in time reduction has been shown for various circuits from ISCAS89
suite. Tables 8.3, 8.4 and 8.5 show the scan-in time reduction to reach 90% fault coverage
by using conventional vectors, best case weighted random patterns and best case transition
density patterns respectively. In Table 8.3, the test application time required to complete the
test sequence in fixed scan frequency is tabulated in third column and the test application
51
Table 8.5: Reduction in scan-in time for best-case transition density patterns (TDP).
Circuit Best Case
TD
Patterns Test time
Fixed
Freq
(ns)
Test time
Adap. Freq
(ns)
Time
reduc
-tion
(%)
s298 0.5 10000 2800045 1974026 29.5
s382 0.4 32 11565 8287 24.72
s510 0.4 76 27405 19852 28.34
s820 0.4 1524 731565 504453 31.04
s953 0.3 458 329805 231833 29.7
s1196 0.3 712 370285 262350 29.14
s1488 0.5 247 98845 72831 26.31
s13207 0.3 1490 12754445 10149712 20.42
s15850 0.3 3293 25290285 20109065 20.48
time required to complete the test sequence in dynamically adaptive scan frequency is shown
in fourth column. The last column in the table shows the reduction in scan-in time by
deploying the latter method over the first one. Similarly, Table 8.4 shows the test application
time and the test time reduction when weighted random patterns are used by deploying the
modified TPG as described in Chapter 6. The second column shows the best case weight
(probability of a bit being 1) for the respective circuit. Also, Table 8.5 shows the test
application time and reduction in test time when transition density patterns are applied
using the proposed TPG. The second column in this table shows the best case transition
density for the respective circuit.
Table 8.6 provides a comparison between required test time to reach 90% fault coverage
with adaptive scan clock scheme deployed in conventional random, weighted random and
transition density patterns. We note that the pattern choice for best (minimum) test time
is circuit dependent.
52
Table 8.6: Comparing test times for 90% coverage by conventional random (R), weighted
random (WRP) and transition density (TDP) patterns when adaptive scan clock is used.
Circuit Test time (R)
Adaptive
Frequency
(ns)
Weight
(WRP)
Test time (WRP)
Adaptive
Frequency
(ns)
Transition
density
(TDP)
Test time (TDP)
Adaptive
Frequency
(ns)
s298 10050 0.5 10050 0.5 1974026
s382 10320 0.3 6661 0.4 8287
s510 18200 0.4 19570 0.4 19852
s820 348392 0.4 268971 0.4 504453
s953 418073 0.4 162371 0.3 231833
s1196 264652 0.6 221416 0.3 262350
s1488 124572 0.6 117901 0.5 72831
s13207 31565011 0.35 16180025 0.3 10149712
s15850 16341260 0.5 16341260 0.3 20109065
8.3 Power Consumption Analysis
Both the synthesized netlists with adaptive scheme and without adaptive scheme were
used to generate SPICE netlists using MentorGraphics tool Design Architect using TSMC018
um.
Sysnopsys Tool NanoSim was used to do the SPICE simulation for capturing power
dissipation for the per clock peak power consumption shown in Figures 5.4 and 5.5. Figure 8.1
shows the power consumption for the circuit s1196 for some clock cycles. The peak power
consumption is decided by the peak power consumed by the circuit while running in the
fixed clock, which is also the slowest clock of the adaptive scheme. The figure plots per clock
power consumption against number of test clock cycles. The power budget is set by the
peak power consumption by the fixed frequency. It is shown in the figure that though the
per clock peak power increases when the scan clock is dynamically adapted but it remains
below the power budget line. However, the average power will increase as the time required
to apply the test decreases.
53
0 10 20 30 40 50 603.8
4
4.2
4.4
4.6
4.8
5
5.2 x 10
4
No. of Clock Cycles
Per Cycle Power (uW)
Fixed frequency power
Adaptive frequency power
Power budget
Figure 8.1: Per clock power consumption with and without adaptive schemes for s1196.
8.4 Greedy Algorithm Implementation
For the circuits that have best case transition density around 0.5 or 0.4 the greedy
algorithm was implemented on them to reduce scan in time further.
The fault profiles generated by AUSIM [37] were converted into tables for each circuit. A
Matlab [29] program was written to take the table as input and apply the greedy algorithm
to give the transition density needed to be applied for each partial fault coverage stage.
Figures 8.2 and 8.3 shows that the algorithm helps to detect the same number of faults with
the same number of vectors as the best case transition density if the vectors were applied
with the transition density determined by the algorithm for each partial fault coverage stage.
As seen from the graphs, it is possible to run the test even faster using the controlled
transition density mixing in the vector set during the application of lower transition density
patterns.
54
0 50 100 150 200 250 300 350 400 4500
100
200
300
circuit s298
0 1000 2000 3000 4000 5000 60000
200
400
600
800
1000
circuit s820
No of Vectors
No of Faults Detected
best case td
mixed td
Figure 8.2: Performance of greedy algorithm for s298 and s820.
The lower transition density vectors are as good as the best case transition density
until certain percentage faults are detected. This can be seen from the graphs 8.2 and 8.3.
Keeping the total number of vectors fixed and mixing the lower transition density vectors in
a controlled manner the same fault coverage can be reached.
Table 8.6 shows the increase in time reduction if transition density is chosen by the
greedy method in cases where best case transition density lies around 0.4-0.5. The second
column in the table shows the best case transition density for the respective circuit. The
fifth column shows the number of vectors needed to reach a certain fault coverage for that
circuit. The third column shows the reduction in test application time when the scan clock
is dynamically adapted as compared with keeping the scan clock at fixed frequency. The
fourth column shows test time reduction when transition density is varied according to the
55
0 50 100 150 200 250 300 350 400 450250
300
350
400
450
circuit s382
0 500 1000 1500 2000 2500 30000
500
1000
1500
circuit s1196
No of Vectors
No of Faults Detected
best case td
mixed td
Figure 8.3: Performance of greedy algorithm for s382 and s1196.
algorithm. The sixth column indicates the partial fault coverages (pfc). It is to be noted
that, for circuit s298 the test has been applied to reach 77.1% . But for the rest of the
circuits, test has been applied to reach above 90% fault coverage.
The seventh and eighth columns show the transition density and the number of vectors
needed to run in that transition density to reach the partial fault coverage as indicated in
the sixth column.
Running initially in low transition density and accordingly adapting the scan frequency
to the reduced transition density allows to reduce the over all test application time.
56
Table 8.7: Mixing transition densities selected by Greedy Algorithm based on partial fault
coverage.
Circuit TDP time red.(%) Alg. time red (%) Pattern Pfc % TD mix Vector
s298 0.5 12.93 37.5 435 70 0.65 53
77 0.35 382
s382 0.45 18.8 43.53 442 70 0.1 4
80 0.2 6
85 0.4 6
90 0.4 16
95 0.35 94
98 0.4 316
s510 0.5 18.41 42.22 492 70 0.25 16
80 0.25 14
85 0.4 14
90 0.5 32
95 0.5 76
99 0.4 310
s820 0.45 24.32 26.51 5972 70 0.4 110
80 0.4 211
85 0.4 323
90 0.45 880
95 0.45 4448
s1196 0.45 28.78 35.56 2962 70 0.25 83
80 0.3 169
85 0.3 148
90 0.3 312
95 0.25 2250
57
Chapter 9
Conclusion
For scan testing it is important to note that both power and test time contribute to the
test cost as well as quality of the test. This work proposes to strike a balance between these
two factors and aims to reduce test application time as much as possible without sacrificing
the fault coverage while keeping the test power controlled the entire time.
The main ideas forwarded in this thesis are, transition density can be effectively selected
for any circuit analogous to weighted random patterns to generate test session with shorter
test length. Once the transition density is known the test application time can be further
reduced by dynamically controlling the test clock keeping the test power controlled.
At the beginning of the test session, any transition density is capable of reaching a
fault coverage that is lower than 80% faults with almost same number of vectors. The lower
transition density based vectors though need more number of vectors but the difference
between numbers of the vectors needed to detect those faults is small. Thus a lower transition
density can be chosen deterministically to reach that partial coverage while speeding up the
scan clock without crossing the power budget.
Thus using a control mix of transition density, the test session is kept optimal. Power
dissipation can be controlled by dynamically adjusting the scan clock with the transition
density chosen. The experimental results show a further speed up of 10-15% when transition
densities are mixed.
In the future, more sophisticated methods for obtaining the controlled transition density
mixing in the vector set by using linear programming should be examined to balance the
test time and test power more efficiently.
58
Bibliography
[1] A. S. Abu-Issa and S. F. Quigley, ?Bit-Swapping LFSR and Scan-Chain Ordering: A
Novel Technique for Peak and Average-Power Reduction in Scan-Based BIST,? IEEE Trans.
Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 5, May 2009.
[2] P. Agrawal and V. D. Agrawal, ?On Improving the Efficiency of Monte Carlo Test Generation,?
in Digest of 5th International Fault Tolerant Computing Symp., (Paris, France), June 1975,
pp. 205?209.
[3] P. Agrawal and V. D. Agrawal, ?Probabilistic Analysis of Random Test Generation Method
for Irredundant Combinational Networks,? IEEE Trans. Computers, vol. C-24, pp. 691?695,
July 1975.
[4] P. Agrawal and V. D. Agrawal, ?On Monte Carlo Testing of Logic Tree Networks,? IEEE
Trans. Computers, vol. C-25, pp. 664?667, June 1976.
[5] V. D. Agrawal, K.-T. Cheng, D. D. Johnson, and T. Lin, ?Designing Circuits with Partial
Scan,? IEEE Design & Test of Computers, vol. 5, no. 2, pp. 8?15, Apr. 1988.
[6] V. D. Agrawal, C. R. Kime, and K. K. Saluja, ?A Tutorial on Built-In Self-Test, Part 1:
Principles,? IEEE Design & Test of Computers, vol. 10, pp. 73?82, Mar. 1993.
[7] F. Brglez, D. Bryan, and K. Kozminski, ?Combinational Profiles of Sequential Benchmark
Circuits,? in Proc. Int. Symp. Circuits and Systems, May 1989, pp. 1929?1934.
[8] M. L. Bushnell and V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and
Mixed-Signal VLSI Circuits. Springer, 2000.
[9] S. Chiu and C. Papachristou, ?A Design for Testability Scheme with Applications to Datapath
Synthesis,? in Proc. Design Automation Conf., June 1991, pp. 271?177.
[10] R. M. Chou, K. K. Saluja, and V. D. Agrawal, ?Power Constraint Scheduling of Tests,? in
Proc. 7th International Conference VLSI Design, Jan. 1994, pp. 271?274.
[11] R. M. Chou, K. K. Saluja, and V. D. Agrawal, ?Scheduling Tests for VLSI Systems Under
Power Constraints,? IEEE Trans. VLSI Systems, vol. 5, no. 2, pp. 175?185, June 1997.
[12] F. Corno, M. Rebaudengo, M. S. Reorda, G. Squillero, and M. Violante, ?Low Power BIST
via Non-Linear Hybrid Cellular Automata,? in Proc. IEEE 18th VLSI Test Symp., May 2000,
pp. 29?34.
[13] E. B. Eichelberger and E. Lindbloom, ?Random-Pattern Coverage Enhancement and Diagnosis
for LSSD Logic Self-Test,? IBM Jour. Research and Development, vol. 27, no. 3, pp. 265?272,
May 1983.
[14] S. Gerstend?orfer and H.-J. Wunderlich, ?Minimized Power Consumption for Scan-Based
BIST,? Jour. Electronic Testing: Theory and Applications, vol. 16, no. 3, pp. 203?212, 2000.
59
[15] P. Girard, ?Low Power Testing of VLSI Circuits: Problems and Solutions,? in Proc. First
IEEE symp. Quality Electronic Design (ISQED), Mar. 2000, pp. 173?179.
[16] P. Girard, ?Survey of Low-Power Testing of VLSI Circuits,? IEEE Design & Test of Comput-
ers, vol. 19, no. 3, pp. 80?90, May-June 2002.
[17] P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, ?A Test Vector Inhibiting Tech-
nique for Low Energy BIST Design,? in Proc. IEEE 17th VLSI Test Symp., Apr. 1999, pp.
407?412.
[18] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, J. Figueras, S. Manich, P. Teixeira,
and M. Santos, ?Low Energy BIST Design: Impact of the LFSR TPG Parameters on the
Weighted Switching Activity,? in Proc. International Symp. Circuits and Systems, June, pp.
110?113.
[19] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, and H. J. Wunderlich, ?A Modified
Clock Scheme for a Low Power BIST Test Pattern Generator,? in Proc. IEEE 19th VLSI Test
Symp., May 2001, pp. 306?311.
[20] P. Girard, N. Nicolici, and X. Wen, editors, Power-Aware Testin and Test Stargegies for Low
Power Devices. Springer, 2009.
[21] D. Gizopoulos, N. Krantitis, A. Paschalis, M. Psarakis, and Y. Zorian, ?Low Power/Energy
BIST Scheme for Datapaths,? in Proc. IEEE 18th VLSI Test Symp., May 2000, pp. 23?28.
[22] J. Hartmann and G. Kemnitz, ?How to Do Weighted Random Testing for BIST,? in Proc.
IEEE/ACM International Conf. Computer-Aided Design, Nov. 1993, pp. 568?571.
[23] A. Hertwig and H.-J. Wunderlich, ?Low Power Serial Built-In Self-Test,? in Proc. European
Test Workshop, May 1998, pp. 49?53.
[24] V. Iyengar and K. Chakrabarty, ?Precedence-Based, Preemptive, and Power Constrained Test
Scheduling for System on a Chip,? in Proc. IEEE 19th VLSI Test Symp., May 2001, pp. 368?
374.
[25] J. Rajski et al., ?Test Generator with Preselected Toggling for Low Power Built-In Self-Test,?
in Proc. IEEE 29th VLSI Test Symp., 2011.
[26] E. Larsson, Introduction to Advanced System-on-Chip Test Design and Optimization. Springer,
2005.
[27] A. Majumder, ?On Evaluating and Optimizing Weights for Weighted Random Pattern Test-
ing,? IEEE Trans. Computers, vol. 45, no. 8, pp. 904?916, Aug. 1996.
[28] S. Manich, A. Gabarro, M. Lopez, J. Figueras, P. Girard, L. Guiller, C. Landrault, S. Pravos-
soudovitch, P. Teixeira, and M. Santos, ?Low Power BIST by Filtering Non-Detecting Vec-
tors,? in Proc. European Test Workshop, May 1999, pp. 165?170.
[29] MathWorks, ?MATLAB - The Language of Technical Computing.? http://www.mathworks
.com/products/matlab/, accessed on Mar. 6, 2012.
[30] F. Najm, ?Transition Density: A New Measure of Activity in Digital Circuits,? IEEE Trans.
CAD, vol. 12, pp. 310?323, Feb. 1993.
[31] K. P. Parker and E. J. McCluskey, ?Probabilistic Treatment of General Combinational Net-
works,? IEEE Trans. Computers, vol. C-24, no. 6, pp. 668?670, June 1975.
60
[32] H. D. Schnurmann, E. Lindbloom, and R. G. Carpenter, ?The Weighted Random Pattern
Genrerator,? IEEE Trans. Computers, vol. C-24, no. 7, pp. 695?700, July 1975.
[33] P. Shanmugasundaram, ?Test Time Optimization in Scan Circuits,? Master?s thesis, Auburn
University, Dec. 2010.
[34] P. Shanmugasundaram and V. D. Agrawal, ?Dynamic Scan Clock Control for Test Time
Reduction Maintaining Peak Power Limit,? Proc. 29th IEEE VLSI Test Symp., pp. 248?253,
May 2011.
[35] P. Shanmugasundaram and V. D. Agrawal, ?Dynamic Scan Clock Control in BIST Circuits,?
in Proc. Joint IEEE Int. Conf. on Industrial Electronics and 43rd Southeastern Symp. on
System Theory, Mar. 2011, pp. 237?242.
[36] P. Shanmugasundaram and V. D. Agrawal, ?Externally Tested Scan Circuit with Built-In
Activity Monitor and Adaptive Test Clock,? in Proc. 25th International Conf. VLSI Design,
Jan. 2012, pp. 448?453.
[37] C. E. Stroud, ?AUSIM - Auburn University SIMulator.? http://www.eng.auburn.edu/users/
strouce/ausim.html, accessed on Mar. 6, 2012.
[38] C. E. Stroud, A Designer?s Guide to Built-In Self-Test. Springer, 2002.
[39] M. Tehranipoor, M. Nourani, and N. Ahmed, ?Low Transition LFSR for BIST-Based Appli-
cation,? in Proc. IEEE 14th Asian Test Symposium, 2005.
[40] K. Thearling and J. A. Abraham, ?An Easily Computed Functional Level Testability Mea-
sure,? in Proc. International Test Conf., Aug. 1989, pp. 381?390.
[41] N. A. Touba, ?Survey of Test Vector Compression Techniques,? IEEE Design & Test of Com-
puters, vol. 23, no. 4, pp. 294?303, Apr. 2006.
[42] vSMP HPCC, ?Virtual Symmetric Multiprocessing High Performance Compute Cluster.?
http://www.eng.auburn.edu/ens/hpcc/index.html, accessed on Mar. 6, 2012.
[43] S. Wang, ?Generation of Low Power Dissipation and High Fault Coverage Patterns for Scan-
Based BIST,? in Proc. International Test Conf., Dec. 2002, pp. 834?843.
[44] S. Wang and S. K. Gupta, ?DS-LFSR: A New BIST TPG for Low Heat Dissipation,? in Proc.
International Test Conf., Nov. 1997, pp. 848?857.
[45] S. Wang and S. K. Gupta, ?LT-RTPG: A New Test-Per-Scan BIST TPG for Low Heat Dissi-
pation,? in Proc. International Test Conf., Sept. 1999, pp. 85?94.
[46] F. M. Wanlass and C. T. Sah, ?Nanowatt Logic using Field-Effect Metal-Oxide-Semiconductor
Triodes,? in IEEE International Solid-State Circuits Conf. Digest, volume IV, Feb. 1963, pp.
32?33.
[47] H.-J. Wunderlich, ?On Computing Input Probabilities for Random Tests,? in Proc. Design
Automation Conf., June 1987, pp. 392?398.
[48] X. Zhang, K. Roy, and S. Bhawmik, ?POWERTEST: A Tool for Energy Conscious Weighted
Random Pattern Testing,? in Proc. 12th International Conf. VLSI Design, Jan. 1999, pp.
416?422.
61