Built-In Self-Test of the Programmable Interconnect in Field Programmable Gate Arrays

Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information.

Bobby Earl Dixon Jr.

Certificate of Approval:
Victor P. Nelson, Professor, Electrical and Computer Engineering
Charles E. Stroud, Chair, Professor, Electrical and Computer Engineering
Adit D. Singh, Professor, Electrical and Computer Engineering
George T. Flowers, Dean, Graduate School

A Thesis Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Master of Science

Auburn, Alabama
December 19, 2008

Permission is granted to Auburn University to make copies of this thesis at its discretion, upon the request of individuals or institutions and at their expense. The author reserves all publication rights.

Signature of Author
Date of Graduation

Vita

Bobby Earl Dixon Jr., son of Bobby E. and Martha J. Dixon, was born in Dothan, Alabama on September 15, 1984. In the Spring of 2006, he graduated with a Bachelor of Electrical Engineering majoring in Computer Engineering from Auburn University. Upon graduating he immediately began working on his Master of Science degree at Auburn University under the advisement of Dr. Charles E. Stroud.

Thesis Abstract

Bobby Earl Dixon Jr.
Master of Science, December 19, 2008
(B.S., Auburn University, 2006)
90 Typed Pages
Directed by Charles E. Stroud

Testing programmable interconnect resources in Field Programmable Gate Arrays (FPGAs) is difficult because of the large number of wire segments and switches that must be tested. The adoption of Built-In Self-Test (BIST) for programmable interconnect testing has proven to be an effective method for ensuring the fault-free status of the interconnect network for previous FPGA architectures. The BIST approaches used in previous FPGA interconnect testing relied on several assumptions to obtain adequate fault coverage within the device. With the advancement of technology and complexity of next generation FPGAs, the assumptions used in previous work can no longer be applied. New BIST approaches must be developed that alleviate these assumptions, yet still obtain high fault coverage.

BIST approaches used in previous FPGA interconnect testing are modeled and simulated for gate-level stuck-at and bridging fault coverage. New BIST approaches are proposed and also modeled and simulated. The fault simulation results are used to compare and evaluate the fault detection capabilities and effectiveness of these BIST approaches for testing programmable interconnect resources in FPGAs. A cross-coupled parity approach that best suits the Xilinx Virtex-4 FPGA architecture is chosen and implemented for routing BIST of the global double lines in the interconnect network.

Acknowledgments

First and foremost I would like to thank God for giving me the strength and determination to achieve my graduate degree. I would like to thank Dr. Stroud for his support and advice during my tenure at Auburn University during both my undergraduate and graduate studies. I would also like to thank Dr. Nelson and Dr. Singh for their contribution to this thesis by serving on my graduate committee. To all of the professors and staff that ever helped me big or small, I am forever grateful.
To my research colleagues, Daniel, Lee, Jie, Brad, Jia, Mary, and Brooks, I am thankful for all your help throughout my research. To my parents, sister, and family I thank you for all the support, encouragement, and ever constant prayer that inspired me to always do my best. To my godparents, Ben and Mary, I would have never come to Auburn without your help. I owe you my education. I thank all of my friends that have been there for me throughout both the good and bad times. Without you giving me a break from research, I would have never made it to the end. Lastly, I want to thank all of my students. I learned more from you than I ever taught you. If I can give you one last piece of moral fiber I would say to stay good, study hard, and make me proud. Always remember: "A great engineer is not one that knows everything, but is the one that is willing to learn anything." -Bobby Dixon

WAR EAGLE

Style manual or journal used: Journal of Approximation Theory (together with the style known as "aums"). Bibliography follows IEEE Transactions.

Computer software used: The document preparation package TeX (specifically LaTeX) together with the departmental style-file aums.sty. Tables were generated using Microsoft Excel and figures were drawn in Microsoft Visio and Microsoft Word.
TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION
  1.1 Field Programmable Gate Arrays
  1.2 Programmable Routing Resources in FPGAs
  1.3 FPGA Programming Technologies
  1.4 The Testing Problem
  1.5 Built-In Self-Test
  1.6 BIST for Programmable Interconnect
  1.7 Thesis Statement
2 BACKGROUND
  2.1 Xilinx Virtex-4 Architecture
    2.1.1 Virtex-4 Programmable Logic Blocks
  2.2 Virtex-4 Programmable Interconnect
    2.2.1 Virtex-4 Programmable Interconnect Switch Matrix
    2.2.2 Virtex-4 Global, Local, and Dedicated Routing Resources
  2.3 BIST for FPGAs
  2.4 BIST for Programmable Interconnect
    2.4.1 Routing BIST Fault Models
    2.4.2 Previous Routing BIST Approaches
      2.4.2.1 Comparison-Based Counter Approaches
      2.4.2.2 Parity-Based Approaches
    2.4.3 Previous Routing BIST Approach Assumptions
  2.5 Thesis Restatement
3 ROUTING BIST ANALYSIS
  3.1 Parity-Based Routing BIST Approaches
    3.1.1 Cross-Coupled Parity-Based Approach
  3.2 Linear Feedback Shift Register Based TPG/ORA Combination Approach
    3.2.1 4-bit LFSR with Internal Feedback
    3.2.2 4-bit LFSR with External Feedback
    3.2.3 8-bit LFSR with Internal Feedback
    3.2.4 8-bit LFSR with External Feedback
  3.3 Cellular Automata Register TPG/ORA Combination Approach
    3.3.1 4-bit CAR with all 150 Rules
    3.3.2 8-bit Cyclic Boundary CAR
    3.3.3 8-bit Maximal Length Sequence CAR
  3.4 Routing BIST Approach Fault Simulations
    3.4.1 Parity-Based Approaches
    3.4.2 Counter-Based Approaches
    3.4.3 LFSR TPG/ORA Combination Approaches
    3.4.4 CAR Simulation Results
  3.5 Summary
4 VIRTEX-4 ROUTING BIST IMPLEMENTATION
  4.1 Virtex-4 Routing BIST Approach
  4.2 PLB Column Double Lines
  4.3 Non-PLB Column Double Lines
  4.4 Virtex-4 FX Device Configurations
  4.5 Virtex-4 Routing BIST Results
5 SUMMARY AND CONCLUSIONS
  5.1 Summary of Routing BIST Approaches
  5.2 Summary of Virtex-4 Routing BIST
  5.3 Future Work
BIBLIOGRAPHY

LIST OF FIGURES

1.1 Simple FPGA Architecture
1.2 Simple PLB
1.3 PIP Types
1.4 BIST Process
2.1 Example Virtex-4 Architecture
2.2 Virtex-4 Programmable Logic Block [17]
2.3 Virtex-4 SLICEL [19]
2.4 Virtex-4 Global Routing Resources
2.5 Global and Local Routing Resources of the PLB
2.6 Stuck-at 0 and 1 Fault Models
2.7 DOM, DAND, and DOR Fault Models
2.8 Single 2-bit Counter Approach
2.9 Dual 2-bit Counter Approach
2.10 Single 4-bit Counter Approach
2.11 Dual 4-bit Counter Approach
2.12 Original Parity-Based Approach [11]
2.13 Modified Parity-Based Approach
3.1 Cross-Coupled Parity Approach with Numbered WUTs
3.2 Internal Feedback 4-bit LFSR with Numbered WUTs
3.3 External Feedback 4-bit LFSR with Numbered WUTs
3.4 Internal Feedback 8-bit LFSR with Numbered WUTs
3.5 External Feedback 8-bit LFSR with Numbered WUTs
3.6 4-bit CAR with all 150 Rules with Numbered WUTs
3.7 8-bit Cyclic Boundary CAR with Numbered WUTs
3.8 8-bit Maximal Length Sequence CAR with Numbered WUTs
3.9 Parity-Based Approach Results
3.10 Counter-Based Approach Results
3.11 LFSR TPG/ORA Combination Approach Results
3.12 CAR Approach Results
4.1 Cross-Coupled Parity Implementation in Virtex-4
4.2 Example North Double Line Implementation
4.3 Example East Double Line Implementation
4.4 END Terminal Connection Shift
4.5 Non-Connecting Terminals
4.6 Example North Non-PLB Double Line Implementation
4.7 Loopbacks Used to Ensure East MID Testing
4.8 Example South Non-PLB Double Line Implementation
4.9 Loopbacks Used to Ensure West MID Testing
4.10 FX Configuration Generation Process

LIST OF TABLES

2.1 Virtex-4 Family Device Characteristics [17]
3.1 CAR Rules [2]
4.1 North Double Line Configuration 1
4.2 North Double Line Configuration 2
4.3 South Double Line Configuration 1
4.4 South Double Line Configuration 2
4.5 East Double Line Configuration 1
4.6 East Double Line Configuration 2
4.7 West Double Line Configuration 1
4.8 West Double Line Configuration 2
4.9 Summary of Virtex-4 Routing BIST Global Double Line Configs

CHAPTER 1
INTRODUCTION

Since the invention of the first transistor, the electronics industry has been a hotbed of technological growth. Every new generation of integrated circuits (ICs) is in some way better than the previous generation. In 1965, Gordon E. Moore, who later cofounded Intel, noticed this trend and predicted the roadmap that has guided technology improvements for the past four decades. He observed that the number of transistors that can be inexpensively put on an IC doubles at a regular interval, commonly quoted as about every two years [13]. Since then, industry has kept up the trend and has gone from ICs consisting of only a few thousand transistors to ICs with over a billion transistors. The major problem with this type of growth is the overall reliability of the IC. Larger chips with smaller transistors have caused new types of defects to emerge and have increased the chance of faults occurring in the chip [2]. The more complex the chip, the more complex and extensive the testing must be to achieve acceptable defect levels.

1.1 Field Programmable Gate Arrays

Some of the largest ICs currently on the market are Field Programmable Gate Arrays (FPGAs). A simple FPGA, as seen in Figure 1.1, is a two-dimensional array of programmable logic blocks (PLBs) that are interconnected by a network of programmable routing resources and Input/Output blocks (IOBs) [1]. FPGAs are ideal in applications where the logic of the chip might need to be changed over time [3].
Unlike an application specific integrated circuit (ASIC), an FPGA is not fabricated for a specific task. Instead it is user programmed, often many times, with the function needed at the current time. This ability to be reprogrammed repeatedly is one of the biggest advantages of FPGAs because it eliminates the design and development cost of fabricating a custom ASIC for each new function. An FPGA loses its advantage over an ASIC in operating speed and chip area [12]. Typically an FPGA will also draw more power than an ASIC [21]. Another disadvantage is unit cost: more often than not, an FPGA will be more expensive than an ASIC. For this reason most system designs utilizing an FPGA are limited to low volume production or prototype designs [12].

Figure 1.1: Simple FPGA Architecture (labels: PLB, embedded component, routing, I/O buffer)

The PLBs perform most of the combinational and sequential logic functions within the chip. The routing resources connect the PLBs to other PLBs, the IOBs, and embedded dedicated components such as the random access memories (RAMs) and digital signal processors (DSPs). The IOBs handle all signal communications between the PLBs and the package pins of the chip [11]. A typical PLB, as seen in Figure 1.2, consists of logic gates, multiplexers, flip-flops, and Look-up Tables (LUTs) [15]. The logic gates and multiplexers perform combinational logic and signal path selection within the PLB. The flip-flops can sometimes be configured as latches and handle the sequential logic functions of the PLB. LUTs, or function generators, are programmed with the truth table of the desired combinational logic function to be performed by the PLB. They can also be configured as small RAMs in some cases [15].
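The LUT description above can be illustrated with a minimal Python sketch. It assumes a 4-input LUT and an odd-parity example function; the names make_lut and lut_read are hypothetical and chosen only for illustration, not taken from any vendor tool.

```python
from itertools import product


def make_lut(func, n_inputs=4):
    """Build the truth table (configuration bits) of an n-input LUT.

    Entry k holds func evaluated on the n-bit binary representation of k,
    with the first input as the most significant address bit.
    """
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=n_inputs)]


def lut_read(table, *inputs):
    """Evaluate the LUT: the inputs form the address of one stored bit."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return table[address]


# Hypothetical example function: 4-input odd parity (XOR of all inputs).
parity_lut = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Reading the table with lut_read(parity_lut, 1, 0, 1, 1) looks up a single stored bit, which is also why the same structure can serve as a small RAM: the inputs act as an address.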
Figure 1.2: Simple PLB (labels: inputs, LUT, FF, outputs)

1.2 Programmable Routing Resources in FPGAs

The programmable interconnect of an FPGA is typically a horizontal and vertical mesh of wire segments that are interconnected by electronically programmable switches called programmable interconnect points (PIPs) [12]. The number and length of the wire segments vary among device types and manufacturers. A balanced distribution of wire segment types is sought to maximize the functional density of the FPGA. If the balance is heavily weighted towards long wires, or global routing, then local interconnections suffer from wasted chip area and increased signal delay. Conversely, a heavy weight of short wires, or local routing, causes long routes to become congested with many PIPs. Local routing resources are used to connect PLBs to adjacent PLBs or the global routing resources. Global routing resources are used to connect PLBs to IOBs, non-adjacent PLBs, and other embedded components.

The PIPs themselves are simple transmission gates that are controlled by configuration memory bits [5]. The FPGA is composed of several different types of PIPs: the break-point PIP, the cross-point PIP, and the multiplexer (MUX) PIP. A break-point PIP, illustrated in Figure 1.3a, connects two wire segments within the same plane, either vertical or horizontal. A cross-point PIP, illustrated in Figure 1.3b, connects vertical and horizontal wire segments. The MUX PIP, illustrated in Figure 1.3c, can be either decoded or non-decoded. A decoded MUX PIP is a group of 2^n cross-point PIPs connected to a single output wire and configured by n configuration bits [5]. A non-decoded MUX PIP has one configuration bit for every wire segment, so n configuration bits control n segments. Only one configuration bit can be active in a given configuration for a non-decoded MUX PIP. Most current FPGA routing resources are primarily made up of buffered non-decoded MUX PIPs to prevent signal degradation [5].
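The difference between the two MUX PIP styles described above can be sketched behaviorally in Python. This is only an illustrative model: the function names are hypothetical, and returning None for an undriven output is an assumption of the sketch, not a statement about real device behavior.

```python
def decoded_mux_pip(inputs, config_bits):
    """Decoded MUX PIP: n configuration bits select one of 2**n input wires."""
    n = len(config_bits)
    if len(inputs) != 2 ** n:
        raise ValueError("a decoded MUX PIP with n bits needs 2**n inputs")
    # Interpret the configuration bits as a binary select value (MSB first).
    select = int("".join(str(b) for b in config_bits), 2)
    return inputs[select]


def non_decoded_mux_pip(inputs, config_bits):
    """Non-decoded MUX PIP: one configuration bit per input wire."""
    if len(inputs) != len(config_bits):
        raise ValueError("need one configuration bit per input wire")
    if sum(config_bits) > 1:
        raise ValueError("only one configuration bit may be active")
    for value, enabled in zip(inputs, config_bits):
        if enabled:
            return value
    return None  # assumption of this sketch: no active bit means undriven
```

With four input wires, the decoded form needs two configuration bits while the non-decoded form needs four, which shows the trade between configuration memory and decode logic.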
Figure 1.3: PIP Types (a: break-point PIP, b: cross-point PIP, c: MUX PIP)

1.3 FPGA Programming Technologies

Over time several FPGA programming methods have been implemented in chips. These methods include fuse/anti-fuse, floating gate, and static RAM (SRAM). SRAM-based FPGAs tend to be the most popular due to their fast re-programmability and standard fabrication process [12]. They are also optimal for Built-In Self-Test (BIST) implementations because of their ability to be reconfigured in-circuit an arbitrarily large number of times [1] [12]. The set of all programming bits makes up the logic configuration of the PLBs [1]. These bits also configure the I/O blocks and routing resources. An SRAM-based FPGA is programmed by writing these configuration bits to the on-chip configuration memory [4]. Being able to configure FPGAs repeatedly has made them a major implementation medium in the digital design industry [15].

1.4 The Testing Problem

FPGAs of today consist of millions of transistors. With technology improvements, they are achieving greater logic capacities and higher operating frequencies [3]. The increase in FPGA functionality has made them front runners in mission critical and fault tolerant applications [10]. Due to such popularity, thorough testing is a must for these devices. The problem is that FPGA testing becomes harder and more costly as FPGA architectures become more complex. Every new generation of FPGAs usually requires more test configurations to detect all faults [3]. An increase in test configurations, in turn, increases the testing time. Longer testing times cost more money, which is then reflected in the increased price of the FPGA. The use of automatic test equipment (ATE) to externally apply test vectors to FPGAs is only applicable at device-level testing [4]. Therefore, a solution must be implemented for in-system testing and fault tolerant applications.
1.5 Built-In Self-Test

One solution for in-system FPGA testing is Built-In Self-Test (BIST) [1]. BIST, as seen in Figure 1.4, usually consists of four components: a test pattern generator (TPG), a circuit under test (CUT), an output response analyzer (ORA), and a BIST controller [2]. In the case of BIST for FPGAs, the TPG consists of one or more PLBs configured to drive one or more CUTs in parallel with test vectors [8]. Embedded components such as block RAMs (BRAMs) and DSPs have also been used to store and generate test vectors that are applied to the CUTs [15]. The CUT is the part of the FPGA that is being tested, whether it is other PLBs, IOBs, embedded components, or routing resources. The ORA is one or more PLBs configured to analyze the output of the CUT and report a pass or fail status [2]. The BIST controller oversees the testing of the CUT by initializing the architecture and issuing the starting and ending commands. These BIST components are programmed into the FPGA to test its resources, thereby eliminating the need for external ATE. Since the BIST circuitry is on-chip, BIST also allows for at-speed testing [1].

Figure 1.4: BIST Process (labels: TPG, CUT, ORA, BIST controller)

1.6 BIST for Programmable Interconnect

Testing the programmable interconnect network of an FPGA is a difficult process. The overall goal in routing BIST is to minimize the number of test configurations by maximizing the number of wires under test (WUTs) in a given configuration [1]. Previous work in routing BIST has produced several approaches that achieve adequate fault coverage for different FPGA architectures. The first BIST approach used to test the interconnect of FPGAs utilized a counter-based TPG to drive the WUTs with test patterns that were then compared by the ORAs [1]. Another successful approach was a parity-based BIST approach that was proposed for Xilinx 4000 series FPGAs [11].
This approach also used a counter-based TPG to exhaustively drive the WUTs with test patterns, but in addition generated a parity bit that was sent to the ORA, where a parity check function was used to detect the faulty wires. Through the years, modifications have been made to both approaches to exploit specific FPGA architectures more efficiently [5] [6] [8] [9].

1.7 Thesis Statement

The goal of this thesis is to develop and implement a BIST architecture that will perform BIST on the programmable interconnect of the Xilinx Virtex-4 FPGA. This will be achieved by performing a comparative analysis of prior routing BIST approaches. Based on this analysis, several new approaches will be proposed and evaluated. The main goal of this analysis is to choose an approach that will maximize the number of WUTs per test configuration, which will in turn minimize the overall number of test configurations. Minimizing test configurations will ultimately decrease testing time and cost. Based on the analysis results, a suitable approach will be chosen and implemented as the routing BIST approach for Virtex-4 FPGAs.

This thesis is organized as follows. In Chapter 2, a background of previous routing BIST approaches will be given along with the theory behind programmable interconnect testing of FPGAs and the Virtex-4 architecture. Chapter 3 will discuss how multiple routing BIST approaches have been modeled and simulated for stuck-at and bridging fault coverage. The results will then be analyzed and evaluated to determine the best approach for routing BIST for the Virtex-4 FPGA. Chapter 4 will discuss how the chosen approach will be implemented for Virtex-4 routing BIST. Chapter 5 will summarize all work presented in this thesis and give insight towards future work.

CHAPTER 2
BACKGROUND

This chapter presents an overview of the Virtex-4 architecture used in this research. The primary emphasis is on the programmable interconnect within this architecture.
The idea behind Built-In Self-Test (BIST) will be presented, mainly focusing on programmable interconnect testing. A description of previous programmable interconnect testing techniques will also be presented along with testing assumptions that were made during their development.

2.1 Xilinx Virtex-4 Architecture

The Xilinx Virtex-4 FPGA was released in June of 2004 [16]. Claimed to be the world's most advanced FPGA at the time, the Virtex-4 FPGA incorporated a triple-oxide 90 nanometer CMOS fabrication technology with 11-layer metal interconnect [16]. Developed on the Advanced Silicon Modular Block (ASMBL) architecture, the Virtex-4 promised to deliver twice the density and performance of any FPGA on the market at the time [16]. It was the first FPGA family to offer multiple domain-optimized platforms, as presented in Table 2.1: Virtex-4 LX for functions requiring lots of logic, Virtex-4 SX for signal processing applications, and Virtex-4 FX for embedded processing and high-speed serial applications.

Table 2.1: Virtex-4 Family Device Characteristics [17]

Device      Row x Col   Slices   DSPs   BRAMs   PPCs   I/O
XC4VLX15    64 x 24      6,144     32      48      -   320
XC4VLX25    96 x 28     10,752     48      72      -   448
XC4VLX40    128 x 36    18,432     64      96      -   640
XC4VLX60    128 x 52    26,624     64     160      -   640
XC4VLX80    160 x 56    35,840     80     200      -   768
XC4VLX100   192 x 64    49,152     96     240      -   960
XC4VLX160   192 x 88    67,584     96     288      -   960
XC4VLX200   192 x 116   89,088     96     336      -   960
XC4VSX25    64 x 40     10,240    128     128      -   320
XC4VSX35    96 x 40     15,360    192     192      -   448
XC4VSX55    128 x 48    24,576    512     320      -   640
XC4VFX12    64 x 24      5,472     32      36      1   320
XC4VFX20    64 x 36      8,544     32      68      1   320
XC4VFX40    96 x 52     18,624     48     144      2   448
XC4VFX60    128 x 52    25,280    128     232      2   576
XC4VFX100   160 x 68    42,176    160     376      2   768
XC4VFX140   192 x 84    63,168    192     552      2   896

All three families of the Virtex-4 FPGA utilize a column-based architecture as illustrated in Figure 2.1. Almost every device in the Virtex-4 product line has a different number of columns comprising its device size.
All devices have a master center column that contains most of the secondary embedded components such as boundary scan, power management, and digital clock manager modules [17]. Moving outward from the center column are columns of digital signal processors (DSPs), block RAMs (BRAMs), and programmable logic blocks (PLBs) [18]. The edges of the LX and SX devices hold the input/output buffer blocks (IOBs) [18]. The FX family also incorporates one or two IBM PowerPC blocks, depending on device size, to the left of the center column, and RocketIO gigabit transceivers on the edges, with the regular IOBs relocated to the eighth column from the edges [17].

Figure 2.1: Example Virtex-4 Architecture (labels: PLBs, BRAMs, DSPs)

2.1.1 Virtex-4 Programmable Logic Blocks

The Virtex-4 PLB (Figure 2.2) consists of four SLICEs: two SLICELs and two SLICEMs [17]. A SLICEL is made up of two 4-input LUTs, two flip-flops/latches, and some secondary logic gates and multiplexers (Figure 2.3) [18]. The LUTs contain the truth table of the combinational logic functions configured into a SLICE. The flip-flops/latches are used for any sequential logic functions that need to be performed by the SLICE. The secondary logic gates and multiplexers control the internal signal routing within the SLICE as well as specialized logic functions such as the fast carry logic of adders [20]. A SLICEM is a more complex SLICEL. It incorporates all of the internal characteristics of a SLICEL with added functionality that lets it be used as a shift register or small RAM. In this thesis, PLBs will function as the TPGs and ORAs required to test the programmable interconnect of the FPGA.
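The four-SLICE PLB structure can be cross-checked against Table 2.1 with a little arithmetic. The sketch below assumes the "Row x Col" column counts PLBs, so that an array with no embedded blocks displacing PLBs would hold rows x cols x 4 slices; that reading, the function names, and the selection of devices are assumptions made for illustration.

```python
SLICES_PER_PLB = 4   # two SLICELs and two SLICEMs per PLB
LUTS_PER_SLICE = 2   # two 4-input LUTs per slice

# (rows, cols, slices) copied from Table 2.1 for a few devices.
DEVICES = {
    "XC4VLX15": (64, 24, 6144),
    "XC4VLX25": (96, 28, 10752),
    "XC4VSX25": (64, 40, 10240),
    "XC4VFX12": (64, 24, 5472),
}


def slices_from_array(rows, cols):
    """Slices expected if every array position held a full PLB."""
    return rows * cols * SLICES_PER_PLB


def check(device):
    """Return (expected, listed) slice counts for one device."""
    rows, cols, listed = DEVICES[device]
    return slices_from_array(rows, cols), listed


for name in DEVICES:
    expected, listed = check(name)
    print(f"{name}: expected {expected}, listed {listed}, "
          f"LUTs {listed * LUTS_PER_SLICE}")
```

Under this reading the LX and SX entries match exactly, while the FX entry lists fewer slices than rows x cols x 4. A plausible explanation, not confirmed by the text, is that embedded blocks such as the PowerPC occupy part of the array in FX devices.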
Figure 2.2: Virtex-4 Programmable Logic Block [17] (switch matrix plus four slices: SLICEM (0) at X0Y0, SLICEL (1) at X1Y0, SLICEM (2) at X0Y1, SLICEL (3) at X1Y1; SLICEM: logic, distributed RAM, or shift register; SLICEL: logic only)

Figure 2.3: Virtex-4 SLICEL [19]

2.2 Virtex-4 Programmable Interconnect

The Virtex-4 FPGA houses an 11-layer metal interconnect [20]. The basic purpose of this interconnect network is to connect the PLBs, IOBs, and embedded components together to allow for data transfer. Typically, 80% of the configuration bits in a given bitstream control the many wire segments and programmable switches that make up the programmable interconnect network [15]. This network is made up of global routing, dedicated routing, local routing, and switch matrices. These resources are made of wire segments and programmable interconnect points (PIPs).

2.2.1 Virtex-4 Programmable Interconnect Switch Matrix

The programmable interconnect switch matrix is the gateway between all global and local routing resources within the FPGA. Every component has one or more switch matrices associated with it [19]. All signals that transfer from one PLB, IOB, or embedded component to another must route through a switch matrix. The switch matrix itself is composed of a dense mesh of wire segments and multiplexer PIPs. There are a total of 164 multiplexers within a switch matrix, ranging from 1 to 37 inputs [19]. As a whole, a single switch matrix contains 3,312 multiplexer PIPs [19]. These internal segments and PIPs are the connections between the external global routing and the local routing of the associated component [17].

2.2.2 Virtex-4 Global, Local, and Dedicated Routing Resources

The global, local, and dedicated routing resources are the wire segments within the FPGA that are not part of the switch matrix itself. Local routing resources are used to connect PLBs to adjacent PLBs or the global routing resources [15].
Global routing resources are used to connect PLBs to IOBs, non-adjacent PLBs, and other embedded components [15]. Global routing resources can be broken down into three different wire segment types: double lines, hex lines, and long lines (Figure 2.4). The main difference among the three types of wire segments is the distance from where they begin to where they terminate within the FPGA. The main similarity between the double and hex lines is that each has three connections into switch matrices along its span: a beginning (BEG) terminal, a middle (MID) terminal, and an ending (END) terminal [19]. Each long line has a total of five connections into switch matrices along its span. The double and hex lines source in all four directions of the FPGA from a given switch matrix: north, south, east, and west. As can be seen in Figure 2.5, every direction has 10 BEGs, 10 MIDs, and 10 ENDs for a total of 240 double and hex wire segments per switch matrix [19]. The ten long lines, five horizontal and five vertical, are bidirectional and source in all four directions.

The double global routing resources span three rows or columns of components, including the starting component, with switch matrix connections at all three components (Figure 2.4) [19]. The hex global routing resources span six rows or columns of components, not including the starting component (BEG), with switch matrix connections at the third (MID) and sixth (END) components (Figure 2.4) [19]. The long lines span twenty-four rows or columns of components, not including the starting component, with switch matrix connections at every sixth component as shown in Figure 2.4.

Figure 2.4: Virtex-4 Global Routing Resources (double and hex lines with BEG, MID, and END terminals; long line with terminals LH0, LH6, LH12, LH18, and LH24, with five PLBs between consecutive terminals)

Local routing resources are the lines that connect the switch matrices to the actual PLBs and embedded components (Figure 2.5).
The signals coming into the PLBs and embedded components usually connect through the input multiplexers (IMUXs) to their associated pinwire within the SLICE or component [19]. The signals leaving the PLBs and embedded components usually connect to outbound wires and enter the switch matrix to be routed onto the global resources. The outputs of the flip-flops and latches within a PLB usually have to connect first to an output multiplexer (OMUX) before they can be routed through the switch matrix to other routing resources [19].

Figure 2.5: Global and Local Routing Resources of the PLB

2.3 BIST for FPGAs

BIST for FPGAs usually involves developing test configurations in which test pattern generation and output response analysis together completely test all programmable resources within an FPGA. Typically, the testable programmable resources are broken down into two groups: logic and routing resources [15]. These two programmable resource groups can be further broken down into their traditional BIST classifications. Programmable logic is usually split into PLBs, IOBs, and specialized cores, while programmable routing is split into local and global resources [15]. The work presented in this thesis will concentrate on the global routing resources within the programmable interconnect network of the FPGA.

Since early FPGAs consisted mainly of an array of PLBs and routing resources, the most common and easiest to understand BIST is that for PLBs, also known as logic BIST [15]. In the general architecture for logic BIST, multiple identical TPGs are configured to source test patterns to multiple identically configured PLBs that serve as blocks under test (BUTs). The output responses are then sent to one or more comparison-based ORAs. The BUTs are repeatedly reconfigured to be tested in various modes of operation until they have been completely tested. The process is then repeated with the roles swapped, so that the PLBs that acted as TPGs and ORAs become the BUTs and the PLBs that were BUTs become the TPGs and ORAs.
After all PLBs have been tested as BUTs, the logic BIST is said to be complete.

2.4 BIST for Programmable Interconnect

BIST for programmable interconnect, often referred to as routing BIST, is one of the most complex BISTs for FPGAs. This complexity is due to the sheer number of routing resources that have to be tested compared with the number of resources that are tested in other types of BIST. Routing BIST complexity is also higher because over 80% of the configuration bits loaded into the configuration memory of the FPGA control the routing resources [15].

With this much complexity, ensuring that the routing resources of an FPGA are fault-free is a difficult task. The most apparent issue is the enormous number of wire segments and programmable switches that have to be tested. This in turn makes the total number of configurations required to test an entire FPGA quite large. For this reason, the main goal in developing a routing BIST approach is to maximize the number of wires under test (WUTs) per test configuration, thereby minimizing the total number of configurations required to test the FPGA.

One of the main focuses of the work presented in this thesis is the analysis and evaluation of previously implemented and newly proposed routing BIST approaches in order to determine the most feasible approach for the Xilinx Virtex-4 FPGA architecture. The criterion for this analysis is that every approach be modeled and simulated to obtain its gate-level stuck-at and bridging fault coverage. The data is then evaluated against the data for other BIST approaches to determine which approaches have the highest fault coverage along with the most WUTs per configuration.

2.4.1 Routing BIST Fault Models

The typical fault models used in FPGA interconnect testing include wire segments stuck-at 0 and 1, shorted wires, and open wires [2].
Stuck-open (stuck-off) and stuck-closed (stuck-on) PIP faults are also included because they correspond to stuck-at faults in the configuration memory bits that control the PIPs [2]. For the work presented in this thesis, only hard faults are considered; therefore, delay faults are not part of this analysis. Delay faults are faults that allow the circuit to function normally but only at a reduced clock rate [3].

The stuck-at 0 and 1 fault models are simply faults in which one end of the wire segment behaves as if disconnected and tied low to Vss or high to Vdd, respectively (Figure 2.6 a,b) [2]. Shorted wires are characterized by bridging faults. These fault models most commonly include the dominant (DOM), dominant-AND (DAND), and dominant-OR (DOR) fault models (Figure 2.7 a,b,c) [2]. A DOM bridging fault occurs when two adjacent wires incur a low-resistance short and the stronger driving gate dominates the short for all logic values. DAND and DOR bridging faults occur when two adjacent wires incur a low-resistance short and the stronger driving gate dominates the short for only one logic value, thus resembling an AND or an OR logic gate connection, respectively. Since the number of possible bridging faults in an FPGA is exponential if exhaustively tested, most bridging fault testing focuses on WUTs that are adjacent to each other [7].

Figure 2.6: Stuck-at 0 and 1 Fault Models ((a) wire A tied to Vss, (b) wire A tied to Vdd)

Figure 2.7: DOM, DAND, and DOR Fault Models ((a) DOM, (b) DAND, (c) DOR, each shown on a pair of wires A and B)

Detecting these hard faults in FPGA programmable interconnect is straightforward [5]. Every wire segment and PIP put under test must be able to transmit both a 0 and a 1. Every pair of wire segments that could possibly short must be able to transmit both (0, 1) and (1, 0) pairs.
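The need for both the (0, 1) and (1, 0) pairs can be illustrated with a small behavioral sketch. The function names below are hypothetical and are not taken from the thesis or from AUSIM; the models assume that wire A is the dominant wire, so only wire B can be corrupted.

```python
from itertools import product

def dom(a, b):
    """Dominant bridge (assumed A dominant): B follows A for both logic values."""
    return a, a

def dand(a, b):
    """Dominant-AND bridge (assumed A dominant): B observes A AND B."""
    return a, a & b

def dor(a, b):
    """Dominant-OR bridge (assumed A dominant): B observes A OR B."""
    return a, a | b

def detecting_pairs(fault):
    """Return the applied (A, B) pairs whose observed values differ from the applied ones."""
    return [(a, b) for a, b in product((0, 1), repeat=2) if fault(a, b) != (a, b)]

if __name__ == "__main__":
    for name, fault in (("DOM", dom), ("DAND", dand), ("DOR", dor)):
        print(name, detecting_pairs(fault))
```

Under these assumptions the DAND bridge is exposed only by (0, 1) and the DOR bridge only by (1, 0). Because either wire of a pair may be the dominant one, both pairs have to be applied to cover all three models.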
By applying both logic values to one end of a wire segment and observing those values at the other end, any stuck-at fault on a wire segment to which the test is applied can be detected. This method also detects any open wire, as well as a stuck-open fault in any activated PIP along the wire segment under test. Applying opposite logic values to the wire segments on either side of a deactivated PIP while monitoring both sides of the PIP will reveal a stuck-closed fault in that PIP.

Testing the MUX PIPs requires a separate test configuration for every one of their inputs [5]. The selected input has both logic 0 and logic 1 applied to it while the non-selected inputs receive the opposite values. This method detects a stuck-open fault in the PIP that connects the selected input to the output of the MUX PIP. It also detects stuck-closed faults in the PIPs associated with the non-selected inputs.

2.4.2 Previous Routing BIST Approaches

Throughout the past work in routing BIST, several assumptions have been made to achieve adequate fault coverage for the target FPGA architectures. Since the TPGs and ORAs are constructed from PLBs, and routing resources are needed to connect the logic that forms them, the biggest assumption has been that the routing resources used are fault-free when testing PLBs and that the PLBs used are fault-free when testing routing resources [22]. In considering possible bridging fault sites, knowledge of the physical positions of wire segments was assumed. These assumptions make routing BIST easier to comprehend, but they raise the question of how accurate routing BIST interconnect testing really is.

One important issue to consider in previous routing BIST approaches is the way in which ORA results are retrieved. There are many methods for retrieving the test results from the ORAs [2], but only two were widely used in previous routing BIST approaches.
A comparison-based ORA with an integrated scan chain was used in ORCA [5] interconnect testing. The scan chain was used to shift the test results out of the ORAs at the end of the BIST cycle. While this method of result retrieval was very effective and fast, its major drawback was the additional multiplexer logic and routing resources required to construct and operate the scan chain. The increase in resources needed for the integrated scan chain within the ORA decreased the number of routing resources that could be tested in a given configuration, thereby increasing the number of test configurations required for complete testing. The only cases in which the integrated scan chain ORA did not increase the number of test configurations were the Cypress Delta39K, where a built-in scan chain was already associated with the flip-flops [7], and Atmel [6], where the shift register was constructed via dynamic partial reconfiguration at the end of the BIST sequence.

The second method used for routing BIST results retrieval is configuration memory readback. This approach was used for the Xilinx 4000 and Spartan series FPGAs [8]. Configuration memory readback in essence requires that the entire configuration memory be read for every BIST configuration to extract the results from the flip-flops within the ORAs. The drawback of configuration memory readback is that it roughly doubles testing time, since reading back the memory takes about the same amount of time as writing a configuration to it. The advantage is that the comparison-based ORAs no longer need a scan chain and use fewer logic and routing resources, thereby freeing these resources to be tested in fewer configurations.

The growing trend in routing BIST results retrieval for new FPGAs is partial configuration memory readback [9]. This capability allows only the portions of the configuration memory that contain the ORA results to be read.
This method of reconfiguration and results retrieval has been shown to reduce download time by a factor of 4 and readback time by a factor of 2 [9]. The work presented in this thesis uses this method of ORA results retrieval for the Xilinx Virtex-4 FPGA.

2.4.2.1 Comparison-Based Counter Approaches

The first routing BIST approach for FPGA interconnect testing used counter-based TPGs to generate the test patterns. The patterns were transmitted along the WUTs and compared by the comparison-based ORAs [1]. Many devices used this approach [1] [8] [9] [10]. The architecture of the device being tested was usually the deciding factor in how many counters, and of what size, would compose the TPGs. The counter-based TPGs were typically composed of single or dual 2-bit or 4-bit counters.

The single 2-bit counter TPG worked by using the current and next state of the counter as the test pattern sequence to provide four signals with opposite logic values between any pair of signals (Figure 2.8) [8]. A total of eight wires were under test using the single 2-bit counter TPG because the four signals were fanned out to the WUTs before routing to the ORAs.

Figure 2.8: Single 2-Bit Counter Approach (TPG, WUTs, and ORA)

The dual 2-bit counter TPG approach (Figure 2.9) was also used in FPGAs [8], and worked on the same principle as the single 2-bit counter TPG approach. The current and next states served as the test pattern sequence. The only difference was that the four signals did not have to be fanned out to the eight WUTs since the dual counters provided the eight sources independently.

Figure 2.9: Dual 2-Bit Counter Approach (TPG built from counters CNTR0 and CNTR1, WUTs, and ORA)

Single 4-bit counter TPGs were used in FPGAs where the PLBs contained four flip-flops [1] [9] (Figure 2.10).
They worked by using only the current state of the counter as the test pattern sequence to provide the four needed signals with opposite logic values between any pair of signals. As in the single 2-bit counter TPG approach, the single 4-bit counter also had its signals fanned out to the WUTs before routing to the ORAs, for a total of eight wires under test.

Figure 2.10: Single 4-Bit Counter Approach (TPG, WUTs, and ORA)

The dual 4-bit counter TPG approach (Figure 2.11) was also used in larger FPGAs [1] [9], and worked on the same principle as the single 4-bit counter TPG approach. Only the current state of the counter was used as the test pattern sequence. The difference between the 4-bit counter approaches was the same as that between the 2-bit counter approaches: in the dual approach, the test patterns were not fanned out to the WUTs.

Figure 2.11: Dual 4-Bit Counter Approach (TPG built from counters CNTR0 and CNTR1, WUTs, and ORA)

2.4.2.2 Parity-Based Approaches

Another routing BIST approach for FPGA programmable interconnect testing was a parity-based approach. It was proposed for the Xilinx 4000 series FPGAs in [11] (Figure 2.12). The idea behind this approach was that an N-bit counter TPG would exhaustively source test patterns across N wires under test that were connected to ORAs. The TPG would also generate a parity bit that would be sent to the ORAs. The ORAs would then perform a parity check on the values received from the WUTs, so that any fault producing a parity mismatch would be detected. The major disadvantage of this approach was that it assumed that the parity bit was being transmitted over fault-free routing resources.

Figure 2.12: Original Parity-Based Approach [11] (TPG with parity code generator, WUTs, separate parity line, and ORA)

This assumption was not realistic, so the approach was modified in [6].
The modified approach sends the parity bit over a WUT, for a total of N+1 WUTs (Figure 2.13). This modification also allowed for greater flexibility in using different types of TPGs and ORAs. The parity approach proved most useful in testing FPGAs that had small PLBs.

This new parity approach was used in the testing of the Atmel AT40K and AT94K series of FPGAs [6]. In those tests, a 2-bit up counter, initialized to all 0s and generating even parity, was combined with a 2-bit down counter, initialized to all 1s and generating odd parity, to facilitate the testing of six wires at once per TPG/ORA combination. The count and parity signals were routed to opposite sides of deactivated PIPs to detect the stuck-closed faults within those PIPs. The opposite logic values produced between any pair of the six signals detected bridging faults between adjacent wire segments as well. This approach will be referred to as "dual parity" in this thesis.

Figure 2.13: Modified Parity-Based Approach (Count 0 through Count N-1 and Parity all routed over WUTs)

2.4.3 Previous Routing BIST Approach Assumptions

All of the previous routing BIST approaches shared several common assumptions in order to justify test validity [15]. One of these assumptions was that the logic resources used to create the TPGs and ORAs had been previously tested and found to be fault-free. The problem with this assumption is that in testing the logic resources for faults, faulty routing resources could be used to connect the logic components. This in turn required that the same assumption be made about the routing resources being used. A second assumption made in all of these routing BIST approaches was that the feedback routing used by the counter TPGs was fault-free. A third assumption, especially with older FPGAs, was that the relative positions of the wire segments to be tested were known, which guaranteed that the same signal was not transmitted over adjacent wire segments.
The first and second assumptions must be made for the single counter TPGs of the comparison-based and parity-based routing BIST approaches. These assumptions can be alleviated depending on the method and procedure of ORA results retrieval. For instance, a logic fault that prevents a single counter TPG from counting will go undetected, along with any faults it would otherwise detect in the WUTs. However, by using configuration memory readback to retrieve both the ORA test results and the current state of the TPGs after completing the entire BIST sequence and advancing past the starting state, any logic fault that could have occurred will be detected. Logic faults are not a problem when dual counters are used with the comparison-based approach because the TPG signals are not fanned out to the WUTs, so the ORAs will catch any differences in state between the two counters.

The second and third assumptions were tolerable for the targeted FPGAs under test because they were small devices by today's standards, and they all had dedicated feedback routing. The number of wire segments and PIPs per PLB was very low. The number of wire segments ranged from 42 to 77 per PLB [15]. The number of PIPs ranged from 128 to 206 per PLB [15]. Since these devices were relatively small, information about the positions of the routing resources could be obtained from the device datasheets and physical design graphical editors. In contrast, today's FPGAs do not have dedicated feedback routing resources. Many have around 406 wire segments and 4,100 PIPs per PLB [15]. Therefore, it is imperative that new routing BIST approaches be developed to eliminate the typical testing assumptions so that future, more complex FPGAs will be testable.

2.5 Thesis Restatement

The goal of this thesis is to develop and implement a BIST architecture that will perform BIST on the programmable interconnect of the Xilinx Virtex-4 FPGA.
This will be achieved by performing a comparative analysis of prior routing BIST approaches. The validity of these previous approaches was limited by the assumptions that had to be made to achieve adequate fault coverage. The work presented in this thesis will propose new approaches that minimize these assumptions while still maximizing the number of WUTs per test configuration and achieving high fault coverage.

Chapter 3 will introduce the new approaches that were proposed for Virtex-4 routing BIST. It will also present the simulations and fault coverage of some of the approaches discussed in this thesis, along with the comparative analysis and evaluation of that data. Chapter 4 will show the chosen routing BIST approach for Virtex-4 global interconnect testing. It will also present the actual implementation and application of the global routing BIST for Virtex-4. Chapter 5 will conclude with a summary of the work presented and suggestions for future work.

CHAPTER 3
ROUTING BIST ANALYSIS

New approaches for BIST of programmable interconnect in FPGAs will be proposed in this chapter. Some of the previous and new routing BIST approaches were modeled and simulated with the Auburn University Simulator (AUSIM) [23] to determine gate-level stuck-at and bridging fault coverage. This information is analyzed and discussed in this chapter to determine the best approach for Virtex-4 routing BIST implementation.

3.1 Parity-Based Routing BIST Approaches

The original parity-based routing BIST approach was proposed in [11], and later modified in [6] to allow the wire being used for parity transmission to also be a WUT. With the growing complexity of FPGAs, the assumptions that were used in testing the programmable interconnect with these approaches are quickly becoming unrealistic.
The most impractical assumption in the case of recent FPGAs was that the routing resources used for the feedback of the TPGs were fault-free. For example, since Virtex-4 FPGAs have no dedicated feedback routing resources, the resources that are assumed to be fault-free are in essence part of the same wires that are to be tested. For that reason, the parity-based routing BIST approach had to be rethought to better test these new complex FPGAs.

3.1.1 Cross-Coupled Parity-Based Approach

With the absence of dedicated feedback routing resources in Virtex-4 FPGAs, a fault that inhibits the count sequence could go undetected if the parity bit retains the correct value. A workaround for this issue is to read the current state of the TPGs along with the ORA results, using partial configuration memory readback. However, by cross-coupling the parity lines to the ORAs, the need to read the current state of the TPGs can be eliminated entirely, since a fault in the TPGs will be detected by the ORAs, as will be shown in this chapter.

The cross-coupled parity-based routing BIST approach is similar to the approach used in the Atmel series FPGA programmable interconnect testing [6]. It consists of one 2-bit up-counter initialized to all 0s and one 2-bit down-counter initialized to all 1s, each driving two ORAs (Figure 3.1). The next state of the most significant bit of each counter serves as its parity and is cross-coupled to the ORAs that analyze the count values of the other counter. This cross-coupling of the parity bit allows any fault that would hinder the count of one of the counters to be detected by the ORAs, thereby eliminating the assumption of fault-free feedback routing resources within the construction of the TPG. This added detection of faults within the feedback of the counters of the TPG also allows the resources being used as feedback to be considered as WUTs.
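The effect of cross-coupling can be illustrated with a minimal behavioral sketch. It is not the ASL model used in this thesis: the function names are hypothetical, the ORA is reduced to a single sticky fail flag, and a counter fault is modeled simply as the up-counter holding its state while its parity logic still computes the next-state MSB from that held state.

```python
def msb(q):
    """Most significant bit of a 2-bit value."""
    return (q >> 1) & 1

def xor_bits(q):
    """XOR of the two count bits of a 2-bit value."""
    return (q ^ (q >> 1)) & 1

def run_bist(cross_coupled, up_stalled=False, cycles=4):
    """Return True if any parity check fails during the BIST sequence."""
    up, down = 0b00, 0b11            # up-counter starts at all 0s, down-counter at all 1s
    failed = False
    for _ in range(cycles):
        par_up = msb((up + 1) & 3)       # next-state MSB of the up-counter (even parity of its count)
        par_down = msb((down - 1) & 3)   # next-state MSB of the down-counter (odd parity of its count)
        if cross_coupled:
            ok_up = (xor_bits(up) ^ par_down) == 1    # up count checked against down parity
            ok_down = (xor_bits(down) ^ par_up) == 0  # down count checked against up parity
        else:
            ok_up = (xor_bits(up) ^ par_up) == 0      # each count checked against its own parity
            ok_down = (xor_bits(down) ^ par_down) == 1
        failed = failed or not (ok_up and ok_down)
        if not up_stalled:               # assumed fault: the up-counter never advances
            up = (up + 1) & 3
        down = (down - 1) & 3
    return failed

if __name__ == "__main__":
    print("own parity, stalled up-counter detected:", run_bist(False, up_stalled=True))
    print("cross-coupled, stalled up-counter detected:", run_bist(True, up_stalled=True))
```

In this sketch a stalled counter always agrees with its own parity bit, so the fault is missed without cross-coupling, whereas the parity arriving from the other, still-counting counter exposes the stall within a few clock cycles.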
This set of feedback WUTs includes both the feedback path of each counter to its driving LUT and the path to the next bit of the respective counter. The feedback WUTs give three WUTs per counter within the construction of the TPGs. These six feedback WUTs, along with the six other WUTs being driven by the TPGs, allow the cross-coupled parity-based routing BIST approach to test 12 wire segments within a single test configuration. Since two ORAs will fit into a single Virtex-4 slice, the TPG signals could be fanned out to drive six more WUTs, bringing the total number of WUTs possible to 18.

Figure 3.1: Cross-Coupled Parity Approach with Numbered WUTs (TPGs and ORAs, with WUTs numbered 1 through 12)

3.2 Linear Feedback Shift Register Based TPG/ORA Combination Approach

In modeling the parity-based approaches, it was observed that the feedback routing resources of counter-based TPGs can be considered wires under test. One idea is that the counter itself can be considered as both a TPG and its own ORA. At the end of the designated BIST sequence, typically after cycling through the complete count sequence and continuing to a count value different from the starting value, the current state of the counter can be read via partial configuration memory readback. The counter must then be clocked several more times, and the current state read again, to determine the pass or fail status of the test.

To eliminate the need to read the configuration memory twice to retrieve BIST sequence results, a primitive polynomial linear feedback shift register (LFSR) can be used as the basis of a TPG. The reasoning is that by reducing the number of configuration memory readbacks, total testing time can be reduced.

3.2.1 4-bit LFSR with Internal Feedback

An example LFSR-based TPG/ORA combination approach is a 4-bit primitive polynomial LFSR with internal feedback (Figure 3.2).
It is clocked for an arbitrary number of clock cycles before the BIST results are read back. The main consideration is that the number of clock cycles has to be at least enough to let every bit of the LFSR toggle its value. This approach yields five WUTs. However, since two LFSRs can be placed in a single Virtex-4 PLB, the total number of WUTs would be ten, as long as the shift register mode of the LUT is not used to construct the LFSR.

Figure 3.2: Internal Feedback 4-bit LFSR with Numbered WUTs (WUTs numbered 1 through 5)

3.2.2 4-bit LFSR with External Feedback

It is possible to modify the approach to use a 4-bit primitive polynomial LFSR with external feedback (Figure 3.3). Again, the LFSR has to be run for enough clock cycles to allow all bits to have toggled before the BIST sequence results can be retrieved. With external feedback, the total number of wires under test is still five, with the potential of ten in a single Virtex-4 PLB as long as the shift register does not use dedicated routing.

Figure 3.3: External Feedback 4-bit LFSR with Numbered WUTs (WUTs numbered 1 through 5)

3.2.3 8-bit LFSR with Internal Feedback

To increase the number of WUTs, one can increase the size of the LFSR to an 8-bit primitive polynomial LFSR with internal feedback (Figure 3.4). The reasoning is that increasing the size of the LFSR increases the number of feedback paths, thereby increasing both the number of WUTs and the fault coverage. While this approach does increase the number of wires being tested to 11 for a single LFSR, it requires a full PLB and as a result offers little advantage over the dual 4-bit LFSR, which tests ten wires in the same PLB.
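The idea of an LFSR acting as both TPG and ORA can be sketched as follows. The thesis does not state which primitive polynomial was modeled, so the polynomial x^4 + x^3 + 1, the external-feedback shift direction, the seed, and the stuck-at fault on the feedback wire are all assumptions made for illustration, and the function names are hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1

def lfsr_step(state, feedback_stuck_at=None):
    """Advance a 4-bit external-feedback LFSR for x^4 + x^3 + 1 by one clock."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1   # XOR of the two tapped bits
    if feedback_stuck_at is not None:              # hypothetical fault on the feedback wire
        feedback = feedback_stuck_at
    return ((state << 1) | feedback) & MASK

def signature(seed, clocks, feedback_stuck_at=None):
    """State that would be read back after the given number of clocks."""
    state = seed
    for _ in range(clocks):
        state = lfsr_step(state, feedback_stuck_at)
    return state

def period(seed):
    """Number of clocks before the fault-free LFSR returns to its seed."""
    state, count = lfsr_step(seed), 1
    while state != seed:
        state, count = lfsr_step(state), count + 1
    return count

def clocks_until_all_toggled(seed):
    """Fewest clocks after which every bit has held both a 0 and a 1."""
    seen_one, seen_zero = seed, ~seed & MASK
    state, clocks = seed, 0
    while seen_one != MASK or seen_zero != MASK:
        state, clocks = lfsr_step(state), clocks + 1
        seen_one |= state
        seen_zero |= ~state & MASK
    return clocks

if __name__ == "__main__":
    print("period:", period(0b0001))
    print("clocks until all bits toggled:", clocks_until_all_toggled(0b0001))
    print("good vs. faulty readback:", signature(0b0001, 3), signature(0b0001, 3, 0))
```

The final state read back serves as the test result: a fault in the feedback path changes the sequence, so the state after a fixed number of clocks no longer matches the expected fault-free value. The toggle count gives a lower bound on how long the LFSR should be clocked before a single readback.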
Figure 3.4: Internal Feedback 8-bit LFSR with Numbered WUTs (WUTs numbered 1 through 11)

3.2.4 8-bit LFSR with External Feedback

It is possible to modify the approach to use an 8-bit primitive polynomial LFSR with external feedback (Figure 3.5). The number of wires under test remains 11. This is because the polynomial has not changed even though the routes have.

Figure 3.5: External Feedback 8-bit LFSR with Numbered WUTs (WUTs numbered 1 through 11)

3.3 Cellular Automata Register TPG/ORA Combination Approach

A cellular automata register (CAR) TPG/ORA combination approach was also investigated. Much like the LFSR, the CAR uses pseudo-random test pattern generation as the basis of its BIST sequence. Whereas the architecture of an LFSR is governed by a polynomial, the architecture of a CAR is governed by rules. The most widely used rules in the construction of CARs are rules 90 and 150, as described in Table 3.1, where $Q^+$ denotes the next state of a bit and $\oplus$ denotes XOR.

Table 3.1: CAR Rules [2]

Rule   Bit 1 (LSB)                 i-th Bit                                      Bit N (MSB)
150    $Q_1^+ = Q_1 \oplus Q_2$    $Q_i^+ = Q_i \oplus Q_{i-1} \oplus Q_{i+1}$   $Q_N^+ = Q_N \oplus Q_{N-1}$
90     $Q_1^+ = Q_2$               $Q_i^+ = Q_{i-1} \oplus Q_{i+1}$              $Q_N^+ = Q_{N-1}$

As can be seen in the table, rule 150 allows for three wires under test per bit while rule 90 allows for two. With that in mind, the best way to maximize WUTs is to maximize the use of rule 150 in the construction of the CAR.

3.3.1 4-bit CAR with all 150 Rules

Consider, for example, a 4-bit CAR with all 150 rules (Figure 3.6). In theory this approach allows the testing of 24 wires in a single Virtex-4 PLB, the highest thus far. Due to the nature of CARs, this approach only produces two unique test patterns before repeating, depending on the initialization vector. It was also observed that several initialization vectors would lock up and never cycle.
With only two patterns, dual readback would be required to achieve high fault coverage. Even though the two patterns contained different values, there was always one bit of the CAR that would never toggle. The location of this bit changes depending on the initialization vector.

Figure 3.6: 4-bit CAR with all 150 Rules with Numbered WUTs (WUTs numbered 1 through 12)

3.3.2 8-bit Cyclic Boundary CAR

To account for the lack of unique test patterns generated by the 4-bit CAR approach, an 8-bit CAR with cyclic boundary conditions was investigated (Figure 3.7). This approach does not produce all possible test patterns, but it may produce enough to achieve high fault coverage. As can be seen in the figure, 150 rules were used on the ends of the CAR to account for the cyclic boundary conditions while still giving a total of 21 WUTs.

Figure 3.7: 8-bit Cyclic Boundary CAR with Numbered WUTs (mix of rule 150 and rule 90 cells; WUTs numbered 1 through 21)

3.3.3 8-bit Maximal Length Sequence CAR

The only way to achieve all possible test pattern combinations with a CAR is to assume null boundary conditions [2]. With that in mind, the 8-bit cyclic boundary CAR was modified into an 8-bit maximal length CAR with null boundary conditions (Figure 3.8). This modification gives a set of $2^N - 1$ test vectors to be applied to the WUTs. This modification also reduces the total number of WUTs from 21 to 19.
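The behavior of CARs built from rules 90 and 150 can be explored with a short sketch based on Table 3.1. The function names are hypothetical, and treating the 4-bit all-150 CAR as having cyclic boundary conditions is an assumption (one reading of the 12 numbered WUTs in Figure 3.6), not something stated in the text. The rule assignment of the 8-bit CARs is not given here, so none is reproduced.

```python
from itertools import product

def car_step(state, rules, cyclic):
    """Apply one clock to a CAR; state is a tuple of bits (index 0 is bit 1),
    rules holds 90 or 150 per cell."""
    n = len(state)
    nxt = []
    for i, rule in enumerate(rules):
        if cyclic:
            left, right = state[i - 1], state[(i + 1) % n]
        else:                                    # null boundary: missing neighbors read as 0
            left = state[i - 1] if i > 0 else 0
            right = state[i + 1] if i < n - 1 else 0
        own = state[i] if rule == 150 else 0     # rule 150 includes the cell itself, rule 90 does not
        nxt.append(left ^ own ^ right)
    return tuple(nxt)

def cycle_length(seed, rules, cyclic):
    """Length of the repeating pattern sequence eventually reached from seed."""
    seen, state, t = {}, tuple(seed), 0
    while state not in seen:
        seen[state] = t
        state = car_step(state, rules, cyclic)
        t += 1
    return t - seen[state]

if __name__ == "__main__":
    rules = [150, 150, 150, 150]
    for seed in product((0, 1), repeat=4):
        print(seed, cycle_length(seed, rules, cyclic=True))
```

Under the cyclic assumption, every initialization vector of the 4-bit all-150 CAR either holds a single pattern or alternates between two, which is consistent with the two-pattern and lock-up behavior described in Section 3.3.1. The same functions can be called with `cyclic=False` and a mixed rule list to experiment with null boundary configurations.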
Figure 3.8: 8-bit Maximal Length Sequence CAR with Numbered WUTs (mix of rule 150 and rule 90 cells; WUTs numbered 1 through 19)

3.4 Routing BIST Approach Fault Simulations

Some of the routing BIST approaches mentioned in this thesis were modeled and simulated to analyze their gate-level stuck-at and bridging fault coverage. These simulations provide insight not only into the total fault coverage obtainable by each approach, but also into which wires can be considered under test and whether or not the assumption of fault-free logic resources is valid.

The simulations were conducted by first writing a model of each selected routing BIST approach in the Auburn Simulation Language (ASL). The ASL models were then used by two of the AUSIM commands, FLTGEN and BFTGEN, to generate the gate-level stuck-at and bridging fault lists, respectively [23]. Once the fault lists were generated, the ASL models were simulated with the SIMUL8 command to obtain the fault-free circuit output results [23]. Using these output results and the fault lists, the FLTSIM and BFTSIM commands were used to perform the gate-level stuck-at and bridging fault simulations, respectively [23]. The fault coverage results were collected and are analyzed in the following sections. It should be noted that stuck-at refers to gate-level stuck-at fault coverage in all results presented. It should also be noted that the XOR gates used in the models were gate-level subcircuits consisting of an AND gate and two NOR gates.

3.4.1 Parity-Based Approaches

The fault simulation results for some of the parity-based routing BIST approaches are shown in Figure 3.9. If constructed to use an entire Virtex-4 PLB, the dual parity [6] approach would be able to drive two ORAs, thereby doubling the number of WUTs to 12.
Under the assumption that the feedback paths of the counters could also be considered WUTs, the dual parity [6] approach would be able to test 18 wires in a single Virtex-4 PLB. However, the increase in overall fault coverage of the cross-coupled parity approach over the dual parity [6] approach is due to the fact that any fault within the counter TPG that can hinder the count, but still provide correct parity, will be detected by cross-coupling the parity bits. Even with the same number of WUTs, the dual parity [6] approach could never match the gate-level stuck-at fault coverage of the cross-coupled parity approach.

Figure 3.9: Parity-Based Approach Results (bar chart of fault coverage (%) for Dual Parity and Cross-Coupled Parity; series: Stuck-At, Dominant, Dominant-AND/OR; bar labels: 18 WUTs, 6 WUTs)

3.4.2 Counter-Based Approaches

The fault simulation results for counter-based approaches with comparison-based ORAs are shown in Figure 3.10. The results labeled stuck-at fault coverage with ORAs were obtained with all of the gate-level stuck-at faults generated from a model that included the comparison-based ORA. The stuck-at fault coverage without ORAs was obtained by removing from the fault list all gate-level faults associated with the comparison-based ORA. As can be seen in the figure, the single counter approach was unable to detect any stuck-at faults in the TPG. This was because all the faults detected by the single counter approach were within the ORA itself. Without the ORA faults, the dual counter approach was able to achieve 100% stuck-at fault coverage. This observation is what first led to the investigation of TPG/ORA combination approaches.
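One plausible reading of the single counter result is that a comparison-based ORA only flags a mismatch between the two signals it observes, so a fault inside a TPG that drives both signals corrupts them identically and goes unnoticed. The sketch below illustrates that reading under simplifying assumptions: the counter is modeled behaviorally, the fault is a stuck-at on one counter output bit, and the ORA is a sticky mismatch latch. It is not the ASL model used in the thesis, and all names are hypothetical.

```python
def counter_sequence(width, cycles, stuck_bit=None, stuck_value=0):
    """Outputs of an up counter, optionally with one output bit stuck."""
    seq = []
    for t in range(cycles):
        value = t % (1 << width)
        if stuck_bit is not None:
            if stuck_value:
                value |= 1 << stuck_bit      # stuck-at-1 on that output
            else:
                value &= ~(1 << stuck_bit)   # stuck-at-0 on that output
        seq.append(value)
    return seq


def comparison_ora(stream_a, stream_b):
    """Sticky mismatch latch: True (fail) if the streams ever differ."""
    fail = False
    for a, b in zip(stream_a, stream_b):
        fail = fail or (a != b)
    return fail


good = counter_sequence(width=2, cycles=8)
faulty = counter_sequence(width=2, cycles=8, stuck_bit=0, stuck_value=0)

# Single counter: the same faulty TPG feeds both ORA inputs.
single_detects = comparison_ora(faulty, faulty)
# Dual counter: an independent fault-free TPG feeds the second input.
dual_detects = comparison_ora(faulty, good)

print(single_detects)  # False
print(dual_detects)    # True
```

In this toy model the single counter arrangement never sees a mismatch, while the dual counter arrangement latches a failure on the first clock in which the stuck bit should have been 1.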
Figure 3.10: Counter-Based Approach Results (bar chart of fault coverage (%) for Single Counter (Figure 2.8) and Dual Counter (Figure 2.9); series: Stuck-At w/ ORA, Stuck-At w/o ORA, Dominant, Dominant-AND/OR; all test 8 WUTs)

3.4.3 LFSR TPG/ORA Combination Approaches

The fault simulation results for LFSR TPG/ORA combination approaches with one configuration memory readback are given in Figure 3.11. All four approaches have adequate fault coverage, but are lacking in the total number of wires under test in a single PLB. As can be seen in the figure, the dominant bridging fault coverage for the 4-bit external LFSR was lower due to the external XOR feedback gate. This caused the dominant faults for the LSB of the LFSR to be only potentially detected.

Figure 3.11: LFSR TPG/ORA Combination Approach Results (bar chart of fault coverage (%), axis from 86 to 100, for 4-bit External, 4-bit Internal, 8-bit External, and 8-bit Internal; series: Stuck-At, Dominant, Dominant-AND/OR; bar labels: 10 WUTs, 11 WUTs, 11 WUTs, 10 WUTs)

3.4.4 CAR Simulation Results

The fault simulation results for the CAR approaches can be seen in Figure 3.12. The greatest advantage of the CAR approaches is the total number of wires under test in a single PLB. The 8-bit cyclic CAR did not produce enough test patterns to allow for adequate fault coverage using a single BIST result readback. Reading the configuration memory twice, however, gave this approach 100% gate-level stuck-at fault coverage and slightly higher dominant and dominant-AND/OR fault coverage.

Figure 3.12: CAR Approach Results (bar chart of fault coverage (%) for 4-bit All 150, 8-bit Cyclic, and 8-bit Max Length; series: Stuck-At, Dominant, Dominant-AND/OR; bar labels: 19 WUTs, 21 WUTs, 24 WUTs)

Of all the CAR approaches, the 8-bit maximal length sequence approach had the best results. Notably, this approach achieved 100% fault coverage from a single configuration memory readback in only 18 clock cycles.
It was also observed that the ending vector could be used as the initialization vector for a second phase of fault simulations lasting another 18 clock cycles while still achieving high fault coverage. The expectation was that this ability could be used for quick dynamic partial reconfiguration of the Virtex-4 to minimize testing time.

3.5 Summary

Looking over the analysis of these approaches, it is clear that every approach has its strengths and weaknesses. Consideration has to be given to total fault coverage, the number of possible WUTs, and the number of BIST result readbacks. The LFSR approaches had the fewest WUTs, but very high stuck-at and bridging fault coverage. However, the high fault coverage was only obtained by reading the configuration memory twice. The CAR approaches had the highest number of WUTs and fairly good fault coverage. All of the CAR approaches had to read back the configuration memory twice, except the 8-bit maximal length CAR. Among the parity approaches, the cross-coupled parity approach outperformed the dual parity approach in all respects. It had a high number of WUTs and excellent fault coverage while only having to retrieve the BIST results once. Of all the approaches, both the cross-coupled parity and the 8-bit maximal length CAR would be good candidates to implement for Virtex-4 routing BIST.

CHAPTER 4
VIRTEX-4 ROUTING BIST IMPLEMENTATION

The implementation of the cross-coupled parity BIST approach for the programmable interconnect of the Xilinx Virtex-4 FPGA is presented in this chapter. Specific focus is placed on the testing of the global interconnect double lines for both PLB and non-PLB columns.

4.1 Virtex-4 Routing BIST Approach

The cross-coupled parity and 8-bit maximal length CAR were the two approaches that appeared to be the best candidates for Virtex-4 routing BIST implementation.
Upon further investigation, it was observed that the limitations of the Virtex-4 architecture would not allow the 8-bit maximal length CAR to be constructed within a single PLB. The cause of this issue was mainly the number of feedback connections the CAR required. Most flip-flop feedback has to enter the switch matrix through OMUXs. The Virtex-4 FPGA has a limited number of OMUXs per switch matrix; therefore, the 8-bit CAR could not be constructed within a single PLB. To test the global routing resources, every bit of the CAR would have to be placed in a separate PLB. The complexity of ensuring proper CAR construction and uniform replication throughout the entire array was judged to be too great. For those reasons, cross-coupled parity became the implementation approach for Virtex-4 routing BIST.

The approach itself was tailored to the Virtex-4 architecture (Figure 4.1). Two slices were populated with an even and an odd parity generating TPG, while the other two slices were each populated with both an even and an odd parity-based ORA. This adaptation allowed for full utilization of a single PLB and produced a total of six test signals to be applied to the double lines. The six signals were fanned out to drive two ORAs, putting 12 wires under test. By including the six feedback WUTs, the Virtex-4 routing BIST implementation tested a total of 18 wires.

Figure 4.1: Cross-Coupled Parity Implementation in Virtex-4

To ensure opposite logic values on adjacent WUTs for bridging fault testing, the TPGs are alternated every other row or column, depending on the direction of the wires being tested. The implementation of cross-coupled parity routing BIST for the global routing double lines is presented in the following sections.

4.2 PLB Column Double Lines

There are 30 wire segments for each direction (north, south, east, and west) of the global routing double lines per switch matrix in the Virtex-4 FPGA.
These segments are divided into three groups of terminals (BEG, MID, and END) for every switch matrix, with 10 terminals per group. To ensure adequate fault coverage, all 10 terminals must be tested for each group of a switch matrix. The test configurations are separated into groups according to the directions they test.

The north double line configurations were the first to be developed using the cross-coupled parity approach (Figure 4.2). Two slices of each PLB are populated with the 2-bit up and down counters throughout the entire array. The remaining two slices of each PLB are populated with even and odd parity-based ORAs throughout the entire array. The six test signals, four count bits and two parity bits, are routed onto six north double line segments via the BEG terminals. These signals are then routed off the north double lines at the MID terminals of the northerly adjacent switch matrix. Within the switch matrix, the test signals are routed to their respective parity-based ORA. This process is repeated at the END terminals of the double lines.

Figure 4.2: Example North Double Line Implementation (terminal labels: B, M, E; edge loopbacks)

When the signals reach the edge of the array, edge loopback segments are utilized to route the signals onto their corresponding south direction lines. They are routed back to the opposite edge of the array using south double lines. When an END terminal is reached, it is reconnected to the BEG terminal within the switch matrix to continue down the array. Edge loopback segments are also utilized at the southern edge of the array to route the test signals back onto their original northbound double lines. This allows the final MID and END ORA terminals to be routed, thereby completing the circular test architecture. One point to note is that there are only six test signals, yet 10 double lines need to be tested.
Due to the limited supply of output multiplexers within a switch matrix, the six test signals cannot be fanned out to drive all 10 double lines at once. Two test configurations are therefore required to test all 10 lines (Tables 4.1 and 4.2). In each table, the slice numbers are paired with their respective TPG or ORA (Te, To, Oe, and Oo). The first column shows which line number is being tested. The BEG column shows which test signal is connected to the BEG terminal via its respective slice output, and the third column lists any PIPs required to complete that first connection. The MID and END columns show which ORA LUT input in the respective slice is connected to the MID and END terminals, and the fifth and seventh columns list any PIPs needed to complete those connections. The feedback entries show the output-to-LUT-input feedback paths that the TPG and ORA require.

Table 4.1: North Double Line Configuration 1
Slice assignment: Oe(0), Oo(1), Te(2), To(3)
Wire | BEG | PIP | MID | PIP | END | PIP
3 | X3(CU0) | - | F3-1 | - | G3-0 | -
4 | X2(CD0) | - | G3-1 | - | F3-0 | -
5 | Y2(peven) | - | G2-1 | - | F2-0 | -
6 | YQ2(CU1) | OMUX9, G3-2 | F4-1 | Byp_Int_B1 | G2-0 | -
7 | YQ3(CD1) | OMUX11, W2BEG6, G3-3 | G1-1 | - | F1-0 | -
8 | Y3(podd) | - | F1-1 | - | G1-0 | -
Feedback (TPG):
XQ2 >> OMUX6 >> F2-2
XQ2 >> OMUX6 >> Bounce3 >> G4-2
XQ3 >> OMUX13 >> F4-3
XQ3 >> OMUX13 >> Byp_Int_B7 >> G2-3
Feedback (ORA):
XQ0 >> OMUX2 >> F4-0
YQ0 >> OMUX4 >> W2BEG2 >> Byp_Int_B4 >> G4-0
XQ1 >> OMUX15 >> E2BEG9 >> Byp_Int_B5 >> F2-1
YQ1 >> OMUX5 >> Bounce0 >> G4-1

Table 4.2: North Double Line Configuration 2
Slice assignment: Oe(0), Oo(1), Te(2), To(3)
Wire | BEG | PIP | MID | PIP | END | PIP
0 | Y2(peven) | OMUX0 | F3-1 | Byp_Int_B2 | G4-0 | -
1 | X2(CD0) | - | F4-1 | - | G2-0 | Byp_Int_B0
2 | YQ2(CU1) | OMUX4, W2BEG2, G2-2 | G3-1 | - | F3-0 | -
3 | X3(CU0) | - | G4-1 | Byp_Int_B4 | F2-0 | E2BEG4, Bounce2
8 | Y3(podd) | - | G1-1 | - | F1-0 | -
9 | YQ3(CD1) | OMUX15, E2BEG9, G4-3 | F1-1 | - | G1-0 | -
Feedback (TPG):
XQ2 >> OMUX6 >> F2-2
XQ2 >> OMUX6 >> Bounce3 >> G4-2
XQ3 >> OMUX13 >> F4-3
XQ3 >> OMUX13 >> Byp_Int_B7 >> G2-3
Feedback (ORA):
XQ0 >> OMUX2 >> F4-0
YQ0 >> OMUX5 >> Bounce1 >> G3-0
XQ1 >> OMUX11 >> W2BEG6 >> F2-1
YQ1 >> OMUX9 >> G2-1

The south double lines are similar to the north double lines. They too require two test configurations to test all 10 lines (Tables 4.3 and 4.4). As with the north double lines, the test signals are first routed onto the wire segments via the BEG terminals and are then sourced to the parity-based ORAs via the MID and END terminals along the wire segments. At the southern edge of the array, the test signals are routed onto the corresponding north double lines using the edge loopback segments. When the signals have traveled back to the northern edge of the array, they are routed back onto the original south double lines to complete the circular MID and END terminal connections.

Table 4.3: South Double Line Configuration 1
Slice assignment: Oe(0), Oo(1), Te(2), To(3)
Wire | BEG | PIP | MID | PIP | END | PIP
2 | YQ2(CU1) | OMUX4, G1-2 | F4-1 | - | G4-0 | -
3 | XQ2(CU0) | OMUX6, F2-2, G2-2 | F3-1 | - | G3-0 | -
4 | X2(CD0) | - | G3-1 | - | F3-0 | -
5 | Y2(peven) | - | G2-1 | - | F4-0 | E2BEG3, Byp_Int_B4
6 | Y3(podd) | - | F2-1 | - | G2-0 | -
7 | YQ3(CD1) | OMUX11, W2BEG6, G3-3 | G1-1 | Byp_Int_B3 | F2-0 | -
Feedback (TPG):
XQ3(CD0) >> OMUX9 >> Byp_Int_Bounce1 >> F1-3 >> G1-3
Feedback (ORA):
XQ1 >> OMUX13 >> F1-1
YQ1 >> OMUX2 >> G4-1
XQ0 >> OMUX5 >> Bounce1 >> Byp_Int_Bounce6 >> F1-0
YQ0 >> OMUX15 >> E2BEG9 >> G1-0

Table 4.4: South Double Line Configuration 2
Slice assignment: Oe(0), Oo(1), Te(2), To(3)
Wire | BEG | PIP | MID | PIP | END | PIP
0 | XQ2(CU0) | OMUX0, G1-2, F1-2 | G4-1 | - | F4-0 | -
1 | XQ3(CD0) | OMUX2, F1-3, G1-3 | F4-1 | - | G4-0 | -
2 | YQ2(CU1) | OMUX4, S2BEG2, Byp_Int_B0, G3-2 | G3-1 | W2BEG2 | F3-0 | Byp_Int_B2
7 | Y2(peven) | - | F2-1 | - | G2-0 | -
8 | Y3(podd) | - | G1-1 | - | F1-0 | -
9 | YQ3(CD1) | OMUX15, E2BEG9, G4-3 | F1-1 | - | G1-0 | -
Feedback (ORA):
XQ1 >> OMUX6 >> F3-1
YQ1 >> OMUX9 >> G2-1
XQ0 >> OMUX11 >> W2BEG6 >> F2-0
YQ0 >> OMUX5 >> Bounce1 >> G3-0

In testing the east double lines, the non-PLB columns must be taken into account (Figure 4.3). This is because the east double lines are row based instead of column based.
As in the north and south double line configurations, the east double line test signals are routed onto the wire segments via the BEG terminals to source parity-based ORAs via the MID and END terminals along the wire segments. At the eastern edge of the array, the signals use loopback segments to travel back across the array on the west double lines. When they reach the western edge of the array, they route back onto their respective east double lines and finish the MID and END terminal connections, thereby completing the circular test architecture. The main difference is the handling of the non-PLB columns. The MID terminal connections of the non-PLB column switch matrices are not used, since there are no PLB slices to connect them to. The END terminals in non-PLB columns, however, are reconnected to the BEG terminals so that the next adjacent PLBs can be sourced with test patterns. By doing this, the non-PLB columns in essence emulate the behavior of a TPG, thereby providing the proper parity and count signals required by the adjacent PLBs. Again, two configurations were required to test all 10 lines (Tables 4.5 and 4.6).
Figure 4.3: Example East Double Line Implementation (terminal labels: B, M, E; non-PLB column)

Table 4.5: East Double Line Configuration 1
Slice assignment: Oe(0), To(1), Te(2), Oo(3)
Wire | BEG | PIP | MID | PIP | END | PIP
1 | Y1(podd) | - | F2-3 | Byp_Int_Bounce2 | G1-0 | Byp_Int_Bounce0, Byp_Int_Bounce3
2 | YQ2(CU1) | OMUX6, G2-2 | F1-3 | - | G4-0 | -
3 | X2(CD0) | - | G2-3 | - | F4-0 | Byp_Int_Bounce4
4 | Y2(peven) | - | G4-3 | Bounce3 | F3-0 | -
5 | X1(CU0) | - | F3-3 | - | G2-0 | -
6 | YQ1(CD1) | OMUX11, N2BEG7, G1-1 | G3-3 | - | F2-0 | -
Feedback (TPG):
XQ1(CD0) >> OMUX9 >> F2-1
XQ1(CD0) >> OMUX9 >> G2-1
XQ2(CU0) >> OMUX2 >> F1-2
XQ2(CU0) >> OMUX2 >> G1-2
Feedback (ORA):
XQ3 >> OMUX13 >> F4-3
YQ3 >> OMUX0 >> S2BEG0 >> G1-3
XQ0 >> OMUX15 >> N2BEG9 >> F1-0
YQ0 >> OMUX5 >> Bounce1 >> G3-0

Table 4.6: East Double Line Configuration 2
Slice assignment: Te(0), To(1), Oe(2), Oo(3)
Wire | BEG | PIP | MID | PIP | END | PIP
0 | Y0(peven) | - | F1-3 | - | G1-2 | -
1 | Y1(podd) | - | G1-3 | - | F1-2 | -
6 | YQ0(CU1) | OMUX9, G2-0 | G3-3 | - | F4-2 | Byp_Int_Bounce3
7 | X1(CU0) | - | G4-3 | N2BEG7 | F3-2 | -
8 | X0(CD0) | - | F3-3 | Byp_Int_Bounce5 | G4-2 | -
9 | YQ1(CD1) | OMUX15, E2BEG9, G1-1 | F4-3 | - | G2-2 | Byp_Int_Bounce7
Feedback (TPG):
XQ0(CU0) >> OMUX13 >> F1-0
XQ0(CU0) >> OMUX13 >> G1-0
XQ1(CD1) >> OMUX5 >> Bounce0 >> F4-1
XQ1(CD1) >> OMUX5 >> Bounce0 >> G4-1
Feedback (ORA):
XQ3 >> OMUX6 >> F2-3
YQ3 >> OMUX4 >> W2BEG2 >> G2-3
XQ2 >> OMUX2 >> Byp_Int_Bounce2 >> F2-2
YQ2 >> OMUX0 >> S2BEG0 >> Byp_Int_Bounce0 >> G3-2

The west double lines are similar to the east double lines, in that careful attention must be paid to the non-PLB columns. The count and parity test signals are routed onto wire segments via BEG terminals and monitored at MID and END terminals by their corresponding parity-based ORAs. At the western edge of the array, the test signals are routed onto their respective east double lines using loopback segments. As they travel back to the eastern edge, END to BEG connections are made where needed to continue the route. At the eastern edge of the array, they are routed back onto their corresponding west double lines to complete the final MID and END ORA connections.
In the non-PLB columns, the END terminals are connected to the BEG terminals so that the two westerly adjacent PLB columns are sourced with the proper test signals. Again, the MID terminals of the non-PLB columns are not used. The west double lines also require two test configurations to test all 10 lines (Tables 4.7 and 4.8).

Table 4.7: West Double Line Configuration 1
Slice assignment: Oe(0), Te(1), To(2), Oo(3)
Wire | BEG | PIP | MID | PIP | END | PIP
3 | X2(CU0) | - | G2-3 | - | F3-0 | -
4 | Y1(peven) | - | F2-3 | - | G3-0 | -
5 | X1(CD0) | - | F3-3 | - | G2-0 | -
6 | YQ1(CU1) | OMUX11, W2BEG6, G2-1 | G3-3 | - | F2-0 | -
7 | Y2(podd) | - | G4-3 | - | F1-0 | -
8 | YQ2(CD1) | OMUX13, G4-2 | F4-3 | - | G1-0 | -
Feedback (TPG):
XQ1 >> OMUX2 >> F4-1
XQ1 >> OMUX2 >> G4-1
XQ2 >> OMUX9 >> F3-2
XQ2 >> OMUX9 >> G3-2
Feedback (ORA):
XQ0 >> OMUX0 >> S2BEG0 >> F4-0
YQ0 >> OMUX4 >> S2BEG2 >> G4-0
XQ3 >> OMUX5 >> Bounce0 >> F1-3
YQ3 >> OMUX6 >> Byp_Int_B4 >> G1-3

Table 4.8: West Double Line Configuration 2
Slice assignment: Te(0), Oe(1), To(2), Oo(3)
Wire | BEG | PIP | MID | PIP | END | PIP
0 | Y0(peven) | - | F1-3 | - | G4-1 | -
1 | YQ0(CU1) | OMUX2, G4-0 | G3-3 | Byp_Int_B0 | F4-1 | -
2 | XQ2(CD0) | OMUX6, F2-2, G2-2 | F2-3 | - | G2-1 | N2BEG4, Bounce2
3 | X2(CU0) | - | G2-3 | - | F3-1 | -
8 | YQ2(CD1) | OMUX13, G4-2 | F3-3 | Byp_Int_B5 | G1-1 | -
9 | Y2(podd) | OMUX15 | G4-3 | - | F1-1 | -
Feedback (TPG):
XQ0 >> OMUX9 >> F2-0
XQ0 >> OMUX9 >> G2-0
Feedback (ORA):
XQ1 >> OMUX11 >> W2BEG6 >> F2-1
YQ1 >> OMUX5 >> Bounce1 >> G3-1
XQ3 >> OMUX0 >> S2BEG0 >> Byp_Int_B2 >> Byp_Int_B6 >> F4-3
YQ3 >> OMUX4 >> S2BEG2 >> G1-3

4.3 Non-PLB Column Double Lines

Testing the non-PLB column double lines is more involved than testing the PLB column double lines. This is because the TPGs and ORAs must be placed in the surrounding PLB columns and the test signals must be routed to the lines to be tested. More routing resources are required to realize the BIST architecture for these wires. It helps that the entire array does not have to be populated with the design; otherwise, routing congestion and conflicts would surely occur.
One advantage the non-PLB column double line configurations have over the PLB column double line configurations is that two sets of directional wires are tested in one configuration instead of only one. Since the non-PLB column BEG and END terminals were used in testing the east and west double lines, they are already considered tested. By using the non-PLB column east and west MID terminals as gateways into the non-PLB column switch matrices to source the north and south double lines, the MID terminals can also be considered tested, thereby completing the east and west wire testing of non-PLB columns.

In theory, this advantage should reduce the number of test configurations for non-PLB column double lines by eliminating the configurations needed for two directions. This was not the case, however, due to the connections of wire segments between different directions. When leaving the non-PLB column switch matrix to connect with the ORA, the test signal must be routed onto its respective east or west wire segment via the BEG terminal. The north and south MID terminal connections have no problem with this routing. However, the END terminal connections shift and connect to the next wire in the group (e.g., N2END0 connects to E2BEG1) (Figure 4.4). This shift causes routing conflicts in the adjacent rows. For that reason, MID and END terminal testing must be separated into their own configurations, thus requiring four configurations for each direction: two for MID and two for END terminal testing.

Figure 4.4: END Terminal Connection Shift (N2END0 to E2BEG1)

Another disadvantage of non-PLB column double line testing is that not all segments have the ability to connect with east and west lines to leave the non-PLB column and connect to their ORA. This problem occurs on the N2END9, S2END0, and S2END1 terminals (Figure 4.5). The only feasible way to test these segments is to develop an additional configuration.
The signals are routed onto the desired wire segments and routed to the edges of the array. At the edges, they use the loopback segments to route onto the corresponding opposite direction double lines. The signals are then routed to their respective ORA.

Figure 4.5: Non-Connecting Terminals (S2END1, S2END0, and N2END9 have no east/west connections)

The north non-PLB double line BIST configurations also test the remaining untested east non-PLB double line MID terminals. TPGs are placed in the column to the left of the non-PLB column, while ORAs are placed in the column to the right (Figure 4.6).

Figure 4.6: Example North Non-PLB Double Line Implementation (labels: TPG, ORA; terminals B, M)

This process is repeated throughout the entire array except for the eastern and western edges. On the edges, the TPGs and ORAs are placed in the same column. The test signals are routed onto the east wire segments via the BEG terminals. At the non-PLB column switch matrix, they are routed from the east MID terminals onto the north wire segments via the BEG terminals. Depending on the configuration, the test signals are rerouted back onto the east wire segments via the BEG terminals from either the north MID or north END terminals. They are then routed to the adjacent PLB column and connect to the parity-based ORA via the MID terminals. On the east edge of the array, instead of being rerouted back onto east wire segments to connect to the ORA, the signals are routed onto west wire segments via the BEG terminals from the north MID and END terminals. On the west edge of the array, the test signals are first routed onto the west wire segments via BEG terminals. The loopback segments are then used to route the signals onto their corresponding east lines, where they are connected to the non-PLB column switch matrix from the east MID terminal (Figure 4.7). They then follow the same pattern as the other routes, running north and then east.
Figure 4.7: Loopbacks Used to Ensure East MID Testing (east terminals)

The south non-PLB column double line configurations are similar to the north configurations, and also test a second direction: the west MID terminals (Figure 4.8). The TPGs are placed in the column to the right of the non-PLB column, while ORAs are placed in the column to the left. The process is repeated throughout the entire array except for the eastern and western edges. As in the north configurations, the edge TPGs and ORAs are placed in the same column. The test signals are routed onto the west wire segments via BEG terminals to connect to the non-PLB column switch matrix. There they are rerouted onto south wire segments. Depending on the configuration, the test signals are rerouted back onto the west wire segments of the non-PLB column from either the south MID or south END terminals. They are then connected to their respective ORA by the west MID terminals of the adjacent PLB column.

Figure 4.8: Example South Non-PLB Double Line Implementation (labels: TPG, ORA; terminals B, M)

On the eastern edge of the array, the signals first have to be routed on the east wire segments to the loopback connections at the edge. There they connect to their corresponding west lines, from which they are connected to the non-PLB column switch matrix by the west wire segments via MID terminals (Figure 4.9). From there they follow the same path as the other routes, running south and then west. On the western edge of the array, the signals follow the same paths as the other routes except where they are rerouted to the ORAs. At that point they are routed onto east wire segments to connect to their respective ORA by east MID terminals. Like the north and east configurations, the south and west configurations require four configurations to test all 10 lines. This brings the total configuration count to nine for non-PLB column wires.
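The configuration counts stated in Sections 4.2 and 4.3 can be tallied with a few lines of arithmetic. The sketch below simply restates those counts (two per direction for PLB columns, four each for the north/east and south/west non-PLB groups, and one additional non-PLB configuration); the dictionary names are hypothetical.

```python
# Configurations per direction for PLB column double lines (Section 4.2).
plb_configs = {"North": 2, "South": 2, "East": 2, "West": 2}

# Configurations for non-PLB column double lines (Section 4.3).
non_plb_configs = {
    "North/East": 4,   # two for MID and two for END terminal testing
    "South/West": 4,   # two for MID and two for END terminal testing
    "Additional": 1,   # N2END9, S2END0, and S2END1 via edge loopbacks
}

plb_total = sum(plb_configs.values())
non_plb_total = sum(non_plb_configs.values())
grand_total = plb_total + non_plb_total

print(plb_total)      # 8
print(non_plb_total)  # 9
print(grand_total)    # 17
```

The non-PLB subtotal of nine agrees with the count given above, and the overall total of 17 agrees with the total reported in Table 4.9.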
Figure 4.9: Loopbacks Used to Ensure West MID Testing (west terminals)

4.4 Virtex-4 FX Device Configurations

The Virtex-4 FX devices contain one or two Power PC modules in the left half of the FPGA. In developing BIST configurations for the double line segments, careful attention must be paid to the routing around these modules. Like the edges of the array, the edges of the Power PCs also have loopback connections that can be used by the double lines to change the direction of the routing path. Every Power PC row and column has a set of these loopback connections. Since the configurations for the FX device double lines are largely the same as those for the other Virtex-4 devices, a post-processing algorithm was developed to modify a dummy configuration with the changes needed for compatibility with the FX devices (Figure 4.10). The algorithm takes a completely routed array, generated with no consideration for the Power PC locations, and strips out all of the routing where the Power PCs are located. It then makes the appropriate connections and modifications to complete the routing BIST architecture using the loopback connections around the Power PC.

Figure 4.10: FX Configuration Generation Process (post-process step)

A few special considerations must be made for the FX device configurations based on the direction being tested. For the north and south PLB column double lines, the entire column on the right edge of the Power PCs must be treated like a non-PLB column. This is because that particular column contains both PLB and non-PLB sections (the non-PLB sections being alongside the Power PCs). The column is tested in the non-PLB column tests. For the non-PLB column configurations, the top and bottom rows in the Power PC area must be included. Aside from the small modifications made to the existing configurations by the post-processing algorithm, no additional configurations must be developed specifically for FX devices.
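The stripping step of the post-processing algorithm can be sketched as follows. The thesis does not give the algorithm's data structures, so everything here is an assumption made for illustration: routing is modeled as a list of hops between (row, column) switch matrix tiles, the Power PC is modeled as a rectangular region, and the names Hop, inside, and strip_region are hypothetical. The sketch removes hops that touch the region and reports the tiles just outside it where a loopback connection would be needed.

```python
from typing import NamedTuple


class Hop(NamedTuple):
    net: str      # name of the test signal carried by this hop
    src: tuple    # (row, col) of the driving switch matrix
    dst: tuple    # (row, col) of the receiving switch matrix


def inside(tile, region):
    """True if tile lies within region = (row0, col0, row1, col1), inclusive."""
    row, col = tile
    row0, col0, row1, col1 = region
    return row0 <= row <= row1 and col0 <= col <= col1


def strip_region(hops, region):
    """Drop hops touching the region; list the tiles that need loopbacks."""
    kept, dangling = [], []
    for hop in hops:
        src_in = inside(hop.src, region)
        dst_in = inside(hop.dst, region)
        if not src_in and not dst_in:
            kept.append(hop)                      # untouched by the region
        elif src_in != dst_in:
            # The hop crosses the region boundary: its outside end is
            # where the route must be completed with a loopback.
            outside = hop.dst if src_in else hop.src
            dangling.append((hop.net, outside))
        # Hops entirely inside the region are simply removed.
    return kept, dangling


# A northbound signal in column 2 crossing an assumed region on rows 2-3.
hops = [Hop("CU0", (r, 2), (r + 1, 2)) for r in range(5)]
region = (2, 1, 3, 3)
kept, dangling = strip_region(hops, region)
print(len(kept))   # 2
print(dangling)    # [('CU0', (1, 2)), ('CU0', (4, 2))]
```

A real implementation would also have to select the specific loopback PIPs on each Power PC row and column and reconnect the dangling ends through them, which this sketch leaves out.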
4.5 Virtex-4 Routing BIST Results

The implementation of the cross-coupled parity routing BIST approach in the Virtex-4 FPGA was presented in this chapter (Table 4.9). The adaptation of the approach to the Virtex-4 architecture was also shown. Specific focus was placed on the development of the PLB and non-PLB column double line BIST configurations. The handling of the Power PCs in the FX devices was also described.

Table 4.9: Summary of Virtex-4 Routing BIST Global Double Line Configs
Direction | Number of Configs
North | 2
South | 2
East | 2
West | 2
North/East Non-PLB | 4
South/West Non-PLB | 4
Additional Non-PLB | 1
Total Configs | 17

CHAPTER 5
SUMMARY AND CONCLUSIONS

5.1 Summary of Routing BIST Approaches

The work in this thesis presented an analysis and evaluation of previous routing BIST approaches. New approaches were proposed in an effort to eliminate assumptions that the previous approaches relied on to obtain high fault coverage. Some of the approaches were modeled and simulated to obtain their gate-level stuck-at as well as dominant and dominant-AND/OR bridging fault coverage. Emphasis was placed not only on fault coverage, but also on the number of wires that could be tested in a given configuration. Counter-based approaches that used a comparison-based ORA had a reasonable number of WUTs, but suffered in overall fault coverage. A number of different LFSR TPG/ORA combination approaches were simulated. Even though they were able to obtain high fault coverage, the number of WUTs was low and the configuration memory had to be read twice. Several CAR approaches were also simulated as TPG/ORA combinations. The CARs had many more feedback paths, which increased the total number of WUTs. However, a few of the CARs could not produce the number of test patterns needed to achieve good fault coverage.
The 8-bit maximal length sequence CAR was found to be one of the best approaches, as it was able to achieve very high fault coverage while only needing to read the configuration memory once. It also had one of the highest numbers of WUTs. Its drawback, however, was that it could not be constructed within a single Virtex-4 PLB due to the limited number of OMUXs in the switch matrices. In the end, the cross-coupled parity approach was the one chosen for Virtex-4 implementation.

5.2 Summary of Virtex-4 Routing BIST

The work in this thesis developed and implemented a routing BIST architecture for the Virtex-4 FPGA using the cross-coupled parity approach. Specific focus was given to the global double line routing resources of PLB and non-PLB columns. The double lines are characterized by the direction they source: north, south, east, or west. These lines span three rows or columns of the array, depending on direction, and have connection terminals at every switch matrix in their span. Every switch matrix has 30 double line wire segments per direction. The 30 segments are organized into three groups of terminals: BEG, MID, and END. Each group contains 10 wire segments to and from the switch matrix. To ensure adequate testing, all 10 terminals must be tested for each group of terminals and wire segments associated with a switch matrix.

The first routing BIST configurations developed for Virtex-4 were for the columns containing PLBs. The limited number of OMUXs in the Virtex-4 switch matrix meant that only six global lines could be tested per BIST configuration. Therefore, each direction required two BIST configurations to fully test all 10 wire segments and terminals for each group. One advantage of these BIST configurations was that the east and west BEG and END terminals for non-PLB columns were also tested during the east and west direction configurations.
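The claim that two six-wire configurations suffice for each direction can be checked against the wire numbers listed in the first column of Tables 4.1 through 4.8. The sketch below copies those wire numbers into sets and confirms that the two configurations for each direction together cover wires 0 through 9; the variable and function names are hypothetical.

```python
# Wire numbers from the "Wire" column of Tables 4.1 through 4.8,
# listed as [configuration 1, configuration 2] for each direction.
WIRES_TESTED = {
    "North": [{3, 4, 5, 6, 7, 8}, {0, 1, 2, 3, 8, 9}],
    "South": [{2, 3, 4, 5, 6, 7}, {0, 1, 2, 7, 8, 9}],
    "East":  [{1, 2, 3, 4, 5, 6}, {0, 1, 6, 7, 8, 9}],
    "West":  [{3, 4, 5, 6, 7, 8}, {0, 1, 2, 3, 8, 9}],
}

ALL_WIRES = set(range(10))   # 10 double line terminals per group


def uncovered(configs):
    """Wires not driven by any of the given configurations."""
    return ALL_WIRES - set().union(*configs)


for direction, configs in WIRES_TESTED.items():
    sizes = [len(c) for c in configs]
    print(direction, sizes, sorted(uncovered(configs)))
```

Each configuration drives exactly six wires, and no wire is left uncovered in any direction. Because 2 x 6 = 12 exceeds 10, two wires per direction are tested in both configurations.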
The non-PLB column BIST configurations were more involved than those of the columns containing PLBs. These configurations were able to test two directions at the same time: north/east and south/west. The ability to test two directions simultaneously was thought to be an advantage that would reduce the number of configurations needed to test the FPGA. However, it was observed that the MID and END terminals of the non-PLB columns had to be tested independently due to routing conflicts. Therefore, the non-PLB columns required four BIST configurations for each pair of directions to fully test all of the lines. Three of the lines that needed to be tested in the non-PLB columns were not able to change directions within a switch matrix. To handle this issue, an additional BIST configuration had to be developed to route the test signals on these wires to the edges of the array and use the loopback connections to route them to their respective ORAs.

One challenge faced in developing the routing BIST configurations for the global double lines was accommodating the area taken up by the Power PCs in the FX devices. To deal with that issue, a post-processing algorithm was developed that stripped the routing out of the area where a Power PC would be located and rerouted any signals needed to complete the configuration.

5.3 Future Work

Considerable work remains in the area of routing BIST for Virtex-4. Configurations must be developed to test hex and long lines for both PLB and non-PLB columns. Lines that branch from END terminals on global double lines need to be verified and tested. Tests also need to be developed for the lines connecting the switch matrices around the Power PCs in FX devices to the Power PCs themselves.

The work presented in this thesis could readily be applied to next generation FPGAs. The cross-coupled parity approach can be modified to utilize the full potential of any architecture.
For example, since most newer FPGAs have routing resources based on buffered multiplexer PIPs to prevent signal degradation, the cross-coupled parity test signals could be fanned out to source as many wires as the architecture would allow.

BIBLIOGRAPHY

[1] C. Stroud, S. Wijesuriya, C. Hamilton, and M. Abramovici, "Built-In Self-Test of FPGA Interconnect," Proc. IEEE International Test Conf., pp. 404-411, 1998.
[2] C. Stroud, A Designer's Guide to Built-In Self-Test, Boston: Springer, 2002.
[3] J. Smith, T. Xia, and C. Stroud, "An Automated BIST Architecture for Testing and Diagnosing FPGA Interconnect Faults," J. Electronic Testing: Theory & Applications, vol. 22, no. 4, pp. 239-253, 2006.
[4] M. Abramovici and C. Stroud, "BIST-Based Test and Diagnosis of FPGA Logic Blocks," IEEE Trans. on VLSI Systems, vol. 9, no. 1, pp. 159-172, 2001.
[5] C. Stroud, J. Nall, M. Lashinsky, and M. Abramovici, "BIST-Based Diagnosis of FPGA Interconnect," Proc. IEEE International Test Conf., pp. 618-627, 2002.
[6] J. Sunwoo and C. Stroud, "Built-In Self-Test of Configurable Cores in SoCs Using Embedded Processor Dynamic Reconfiguration," Proc. International System-on-Chip Design Conf., pp. 174-177, 2005.
[7] C. Stroud, J. Bailey, J. Emmert, D. Nickolic, and K. Chhor, "Bridging Fault Extraction from Physical Design Data for Manufacturing Test Development," Proc. IEEE International Test Conf., pp. 760-769, 2000.
[8] C. Stroud, K. Leach, and T. Slaughter, "BIST for Xilinx 4000 and Spartan Series FPGAs: A Case Study," Proc. IEEE International Test Conf., pp. 1258-1267, 2003.
[9] S. Dhingra, S. Garimella, A. Newalkar, and C. Stroud, "Built-In Self-Test for Virtex and Spartan II FPGAs Using Partial Reconfiguration," Proc. IEEE North Atlantic Test Workshop, pp. 7-14, 2005.
[10] I. Harris and R. Tessier, "Testing and Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 21, no. 11, pp. 1337-1343, 2002.
[11] X. Sun, J. Xu, B. Chan, and P. Trouborst, "Novel Technique for Built-In Self-Test of FPGA Interconnects," Proc. IEEE International Test Conf., pp. 795-803, 2000.
[12] J. Rose, A. El Gamal, and A. Sangiovanni-Vincentelli, "Architecture of Field-Programmable Gate Arrays," Proceedings of the IEEE, vol. 81, no. 7, pp. 1013-1029, 1993.
[13] G. Moore, "Cramming More Components onto Integrated Circuits," Proceedings of the IEEE, vol. 86, no. 1, pp. 82-85, 1998.
[14] B. Dixon and C. Stroud, "Analysis and Evaluation of Routing BIST Approaches for FPGAs," Proc. IEEE North Atlantic Test Workshop, pp. 85-91, 2007.
[15] L. Wang, C. Stroud, and N. Touba, System On Chip Test Architectures: Nanometer Design for Testability, Amsterdam: Elsevier, 2007.
[16] "Xilinx Delivers Virtex-4 FPGAs," Xilinx Press Release #0480, www.xilinx.com.
[17] "Virtex-4 User Guide," User Guide UG070, Xilinx, Inc., 2005.
[18] S. Dhingra, D. Milton, and C. Stroud, "BIST for Logic and Memory Resources in Virtex-4 FPGAs," Proc. IEEE North Atlantic Test Workshop, pp. 19-27, 2006.
[19] "Xilinx Corp FPGA Editor Software Manual," Software Manual 9.2i, Xilinx, Inc., 2005.
[20] "Virtex 4 Family Overview," Datasheet DS112, Xilinx, Inc., 2007.
[21] I. Kuon and J. Rose, "Measuring the Gap between FPGAs and ASICs," ACM International Symp. on FPGAs, pp. 21-30, 2006.
[22] V. Suthar and S. Dutt, "Mixed PLB and Interconnect BIST for FPGAs Without Fault-Free Assumptions," Proc. IEEE VLSI Test Symp., pp. 36-43, 2006.
[23] C. Stroud, "AUSIM: Auburn University SIMulator - Version L2.2," 2004.