## **Integrated Circuit for Time Domain Mixed Signal Processing**

by

Hechen Wang

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama

May 5<sup>th</sup>, 2018

Keywords: all digital phase-locked-loop (ADPLL), polar transceiver, signal processing, time domain, time-to-digital converter (TDC), wireless communication

Copyright 2018 by Hechen Wang

Approved by

Dr. Fa Foster Dai, Chair, Reynolds Professor of Electrical and Computer Engineering
Dr. Bogdan M. Wilamowski, Professor of Electrical and Computer Engineering
Dr. Guofu Niu, Alumni Professor of Electrical and Computer Engineering
Dr. Michael C. Hamilton, Associate Professor of Electrical and Computer Engineering

### Abstract

This work explores time-to-digital conversion techniques, designs, and applications for time domain signal processing. Although time-to-digital conversion devices and integrated circuits (ICs) have existed for decades, recent designs have been revolutionized by advances in deep sub-micron CMOS processes. Many of the performance limitations highlighted in early literature, such as the transition speed and power consumption of the transistors, no longer apply to designs in modern IC processes. Two time-to-digital converter (TDC) designs that employ a spiral comparator arrangement, $\Delta\Sigma$ modulators (SDMs), and a ring structure are proposed in this dissertation to offer picosecond (ps) level temporal resolution with a wide detectable range, an ultra-linear transfer function, and low power consumption. With the aid of high-performance TDCs, an all-digital phase-locked-loop (ADPLL) with automatic TDC linearization is presented to demonstrate the narrowing gap between digital PLLs and analog PLLs. The proposed ADPLL achieves an in-band phase noise of -107 dBc/Hz and a highest close-in fractional spur level of -55 dBc. Owing to their picosecond and even sub-picosecond time interval detectability, TDCs pave the way for time domain signal processing in many potential research directions. Two communication-system-related time domain signal processing applications are included in this work: a TDC-based digital super-regenerative bidirectional IoT node and a hybrid time-assisted data converter for a direct radio frequency (RF) polar receiver system.

Dedicated to My Family

### Acknowledgments

I have been extremely fortunate to be a graduate student in the electrical and computer engineering department at Auburn University and to work closely with my academic advisor, Prof. Fa Foster Dai. I would like to express my deepest gratitude and appreciation to him for his invaluable guidance, discussions, encouragement, and assistance during my study and research. His instruction and motivation led to the completion of this research work. He is also an important friend to me in my personal life and career. I would like to take this opportunity to thank my committee members, Prof. Bogdan M. Wilamowski, Prof. Guofu Niu and Prof. Michael C. Hamilton, for taking valuable time out of their busy schedules to serve on the committee. I would like to thank the university reader, Prof. Bo Liu, for sacrificing his time to help me finalize my dissertation. I want to thank Prof. Wilamowski for leading me into the area of neural networks. I would like to express my gratitude to Prof. Niu and Dr. Hamilton for their invaluable discussions and advice on my career. I would also like to thank Dr. Stuart M. Wentworth for consolidating my understanding of microwave operation and Prof. Reeves for the technical discussions in the DSP area. I would like to thank Dr. David Irwin for guidance and discussions during my PhD study. I extend my thanks to my great friends, who have made graduate study life intellectually and socially enriching. Lastly, I would like to sincerely thank my parents and family members, who instilled in me the virtues of perseverance and commitment and encouraged me to strive for excellence. All the work was made possible by the love and encouragement of my family.

### Table of Contents

| Section | Title |
|---------|-------|
| | Integrated Circuit for Time Domain Mixed Signal Processing |
| | Abstract |
| | Acknowledgments |
| | Table of Contents |
| | List of Figures |
| | List of Tables |
| Chapter 1 | Introduction |
| 1.1 | The Motivation of Time Information Extraction |
| 1.2 | The Influence of the Advanced Deep Submicron Silicon Technology on Time Interval Measurement |
| 1.3 | Time Domain Mixed Signal Processing Applications |
| 1.4 | Organization of Dissertation |
| Chapter 2 | Time Domain Information Digitization Techniques |
| 2.1 | Introduction |
| 2.2 | Time-to-Digital Converter Key Specifications |
| 2.2.1 | Basic Structural Parameters |
| 2.2.2 | Performance Characteristics |
| 2.3 | TDC Design Architectures |
| 2.3.1 | Counter Based TDC |
| 2.3.2 | Analog-to-Digital Conversion Based TDC |
| 2.3.3 | Flash (Single Delay Line) TDC |
| 2.3.4 | Vernier Delay Line TDC |
| 2.3.5 | Oversampling Gated Ring Oscillator (GRO) TDC |
| 2.3.6 | Time Amplifier TDC |
| 2.4 | Conclusions |
| Chapter 3 | Advanced Time-to-Digital Conversion Performance Improvement Solutions |
| 3.1 | Introduction |
| 3.2 | Linearity Issues Associated with 2-D Vernier TDC |
| 3.3 | Spiral Comparator Array Arrangement for 2-Dimensional Vernier TDC |
| 3.4 | Linearization Improvement Techniques |
| 3.4.1 | Delay Interpolation of Unit Delay Cells |
| 3.4.2 | 2-D Comparator Array Folding Error Randomization |
| 3.4.3 | TDC Delay Calibration |
| 3.5 | An Ultra Linear Spiral 2-D Vernier TDC Design |
| 3.5.1 | TDC Architecture and Circuit Implementation |
| 3.5.2 | Experiment Results of the Proposed TDC |
| 3.6 | A Wide Range Highly Linear Ring TDC Design |
| 3.6.1 | TDC Architecture and Circuit Implementation |
| 3.6.2 | Experiment Results of the Proposed TDC |
| 3.7 | Conclusion |
| Chapter 4 | Time-to-Digital Conversion in Phase-Locked-Loop Based Frequency Synthesizers |
| 4.1 | Introduction |
| 4.2 | A Low Spur Fractional-N All-Digital PLL Design with Automatic TDC Linearity Calibration |
| 4.2.1 | ADPLL Architecture |
| 4.2.2 | ADPLL Building Block Circuits Implementation |
| 4.2.3 | Automatic TDC Linearity Calibration |
| 4.2.4 | Experiment Results of the ADPLL |
| 4.3 | Conclusion |
| Chapter 5 | Time Domain Signal Processing in Advanced Communication Systems |
| 5.1 | Introduction |
| 5.2 | Time Domain Demodulation in Super Regenerative Receiver for Low Power Internet-of-Th… |
| 5.2.1 | Digital PA-less Super Regenerative Radio Implementation with OOK Modulation and TDC Based Demodulator |
| 5.2.2 | Experiment Results of the THz Wireless Communication System |
| 5.3 | Hybrid Time-Analog-to-Digital Conversion for Direct RF Polar Wireless Communication System |
| 5.3.1 | Sub-Sampling Direct RF Demodulation Data Converter for Polar Receiver |
| 5.3.2 | APSK Modulation for Polar Transceiver |
| 5.3.3 | Simulation Results of the Proposed Time-Based Polar Receiver |
| 5.4 | Conclusion |
| Chapter 6 | Summary of the Works |
| | References |

### List of Figures

| Fig. 1.1 Generic on-chip time domain signal processing system                                                                       |
|-------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 1.2 8-PSK modulation and 16-QAM modulation constellations                                                                      |
| Fig. 2.1 HP 5370A universal time interval counter with 20-ps resolution, 10 seconds range and frequency up to 100 MHz               |
| Fig. 2.2 Ideal transfer function of a TDC                                                                                           |
| Fig. 2.3 Gain error in TDC transfer function                                                                                        |
| Fig. 2.4 Offset error in TDC transfer function                                                                                      |
| Fig. 2.5 DNL in TDC transfer function                                                                                               |
| Fig. 2.6 INL in TDC transfer function 15                                                                                            |
| Fig. 2.7 Spectrum plot quantization noise floor for a 14-bit data converter (16384-point FFT) 16                                    |
| Fig. 2.8 Block diagram of basic ADC based time-to-digital converter                                                                 |
| Fig. 2.9 Block diagram of basic flash time-to-digital converter                                                                     |
| Fig. 2.10 Block diagram of basic Vernier delay line time-to-digital converter                                                       |
| Fig. 2.11 Block diagram of GRO based TDC                                                                                            |
| Fig. 2.12 SR-latch based time amplifier and its time domain gain characteristic                                                     |
| Fig. 3.1 Illustration of in-band phase noise and fractional spur level related to TDC performance<br>in a DPLL with ripple canceler |
| Fig. 3.2 Illustration of a conventional 2-D Vernier TDC topology with Vernier delay lines and a 2-D comparator array                |

| Fig. 3.3 Simulated TDC transfer curve, DNL, and INL with 4% delay mismatch                                                                                                                                                                                        |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 3.4 Measured TDC transfer functions and TDC outputs after the digital ripple canceler, still showing TDC nonlinearity caused by periodic folding errors of the 2-D comparator array.                                                                         |
| Fig. 3.5 Fourier transform reveals the relationship between filtered TDC output and up-converted fractional spur components at the DPLL output                                                                                                                    |
| Fig. 3.6 Proposed Vernier TDC with a 2-D spiral comparator array                                                                                                                                                                                                  |
| Fig. 3.7 Comparisons among proposed 2-D spiral comparator arrangement (scheme 1) and conventional 2-D comparator arrangements (scheme 2 and 3), indicating a better linearity achieved by using the spiral comparator array formation with less delay elements 37 |
| Fig. 3.8 Seven-bit digitally controlled tunable unit delay cell circuit diagram                                                                                                                                                                                   |
| Fig. 3.9 Digitally controlled tunable unit delay cell delay tuning transfer function                                                                                                                                                                              |
| Fig. 3.10 A $2^{nd}$ order $\Delta\Sigma$ modulator used for DTC delay interpolation with quantization error noise shaping                                                                                                                                        |
| Fig. 3.11 Four sets of delay settings used for folding error randomization and the resultant INLs.<br>Also shown are four sets of spiral comparator array configurations that meet the delay<br>requirement and their corresponding folding point locations       |
| Fig. 3.12 2-D folding point randomization with tunable delay and reconfigurable comparator array controlled by $2^{nd}$ order $\Delta\Sigma$ modulators                                                                                                           |
| Fig. 3.13 2-D TDC output periodic errors for individual configurations and modulated results. 46                                                                                                                                                                  |
| Fig. 3.14 Delay variations and $2^{nd}$ order $\Delta\Sigma$ randomization points                                                                                                                                                                                 |
| Fig. 3.15 Simulated TDC output spectrum without, with the 1 <sup>st</sup> order, and the 2 <sup>nd</sup> order $\Delta\Sigma$ modulators for folding error randomization. 47                                                                                      |
| Fig. 3.16 Automatic 2-D Vernier TDC close-loop and open-loop delay calibrations                                                                                                                                                                                   |

| Fig. 3.17 Block diagram of the proposed reconfigurable 2-D spiral Vernier TDC with $2^{nd}$ order<br>$\Delta\Sigma$ linearization                                                                                                                                                                 |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 3.18 Die photograph of the TDC prototype chip                                                                                                                                                                                                                                                |
| Fig. 3.19 Measured TDC full-range transfer curves with and without the $2^{nd}$ order $\Delta\Sigma$ modulation. 52                                                                                                                                                                               |
| Fig. 3.20 Measured TDC output power spectrum density with 1.01MHz input signals under three different measurement configurations: i) without $\Delta\Sigma$ modulation, ii) with the 1 <sup>st</sup> order $\Delta\Sigma$ modulation, and iii) with the 2 <sup>nd</sup> $\Delta\Sigma$ modulation |
| Fig. 3.21 Measured TDC output power spectrum density with 32.7MHz input signals under three different measurement configurations: i) without $\Delta\Sigma$ modulation, ii) with the 1 <sup>st</sup> order $\Delta\Sigma$ modulation, and iii) with the 2 <sup>nd</sup> $\Delta\Sigma$ modulation |
| Fig. 3.22 Single-shot precision measurement results with 10000 tests measured for four different input time intervals located at code 5, 64, 99 and 123, respectively                                                                                                                             |
| Fig. 3.23 Measured histogram plots of TDC output codes with a ramp input signal sweeping the entire detectable range under different settings                                                                                                                                                     |
| Fig. 3.24 Measured TDC DNL and INL without $\Delta\Sigma$ modulation, with the 1 <sup>st</sup> order $\Delta\Sigma$ modulation and with the 2 <sup>nd</sup> order $\Delta\Sigma$ modulation                                                                                                       |
| Fig. 3.25 Measured maximum INL results under voltage/temperature variations for five different TDC chip samples                                                                                                                                                                                   |
| Fig. 3.26 Performance summary and comparison with prior art TDC designs                                                                                                                                                                                                                           |
| Fig. 3.27 (a) Proposed 2-D spiral Vernier ring TDC with 2 <sup>nd</sup> order ΣΔ linearization, and (b)<br>ring/2D structure collaboration illustration                                                                                                                                           |
| Fig. 3.28 Signal propagation difference between (a) straight delay line and (b) end to end connected delay ring                                                                                                                                                                                   |
| Fig. 3.29 (a) Signal propagating issue in a delay ring when rising delay is not equal to falling delay. (b) Pulse generation timing diagram                                                                                                                                                       |

| Fig. 3.30 Unit delay cell circuit diagram                                                                                                                                                  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 3.31 Measured (a) INLs/ (b) DNLs under different delay over-sample ratio                                                                                                              |
| Fig. 3.32 Measured (a) INLs/ (b) DNLs with different SDM settings                                                                                                                          |
| Fig. 3.33 Die photograph of the TDC prototype chip                                                                                                                                         |
| Fig. 3.34 Measured TDC full-range transfer curves and INL                                                                                                                                  |
| Fig. 4.1 Simulated TDC output and the residue signal after the digi-phase canceller, showing that TDC resolution and TDC nonlinearity induced residue errors possess different periods. 73 |
| Fig. 4.2 (a) Fractional spur level due to residue error at the digi-phase canceller output. (b)<br>Measured residue error and its fundamental waveform                                     |
| Fig. 4.3 Proposed DPLL block diagram with automatic TDC linearity calibrations for fractional spur cancellation                                                                            |
| Fig. 4.4 Proposed three-step TDC block diagram                                                                                                                                             |
| Fig. 4.5 Timing diagrams of the (a) bang-bang TDC, (b) single delay line TDC, and (c) Vernier delay line TDC                                                                               |
| Fig. 4.6 Circuit diagrams of (a) TDC unit delay stage and (b) arbiter cell                                                                                                                 |
| Fig. 4.7 Proposed wide-tuning DCO architecture                                                                                                                                             |
| Fig. 4.8 Proposed DCO frequency tuning banks                                                                                                                                               |
| Fig. 4.9 Schematics and layout of digitally controlled capacitor unit                                                                                                                      |
| Fig. 4.10 Simulated TDC nonlinearity considering common mode error and differential mode error                                                                                             |
| Fig. 4.11 Proposed TDC automatic linearity calibration loops                                                                                                                               |
| Fig. 4.12 Measured convergence of TDC common mode and differential mode delays                                                                                                             |

| Fig. 4.13 Die photo of the DPLL in a low-power multi-standard wireless transceiver RFIC 88                                                                                                                                                     |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 4.14 Measured phase noise at a 2.08-GHz output with a loop bandwidth of 1 MHz 88                                                                                                                                                          |
| Fig. 4.15 Measured spectrum before and after digital calibrations with the fractionalities of 1/64.                                                                                                                                            |
| Fig. 4.16 Measured spectrum before and after digital calibrations with the fractionalities of 3/64.                                                                                                                                            |
| Fig. 4.17 Measured fractional spur near 2.4 GHz with a loop bandwidth of 1 MHz for different fractional frequencies with and without TDC calibrations                                                                                          |
| Fig. 4.18 Measured TDC transfer curve, INL, and DNL before and after digital calibrations 92                                                                                                                                                   |
| Fig. 5.1 Schematic diagram of a general super-regenerative receiver                                                                                                                                                                            |
| Fig. 5.2 A bidirectional digital-bits-in/-out CMOS THz pico-radio with on-chip antenna for wireless sensor nodes and Internet of Things (IoT)                                                                                                  |
| Fig. 5.3 Circuit schematic of the THz pico-radio configured as the transmitting mode (TX) 100                                                                                                                                                  |
| Fig. 5.4 Circuit schematic and operation principle of the THz pico-radio configured as the receiving mode (RX)                                                                                                                                 |
| Fig. 5.5 Circuit schematic of the TDC 103                                                                                                                                                                                                      |
| Fig. 5.6 (a) Monte Carlo simulations of the fast delay $(\tau_f)$ , slow delay $(\tau_s)$ and delay difference $(\tau_s - \tau_f)$ on the process variations. (b) Simulated delay variations over temperature, supply voltage and corners. 105 |
| Fig. 5.7 Timing diagram of the RX mode and the TDC, assuming 4Mbit/s OOK data rate 106                                                                                                                                                         |
| Fig. 5.8 Concept of the switching capacitor energy harvester                                                                                                                                                                                   |
| Fig. 5.9 Schematic of the adopted voltage doubler                                                                                                                                                                                              |
| Fig. 5.10 THz pico-radio chip microphotograph                                                                                                                                                                                                  |

| Fig. 5.11 THz pico-radio communication link using OOK modulations. (a) The results are<br>summarized as measured TRX peak DC power versus TX/RX communication distance<br>and OOK data rate at BER<10-7. (b) Measured BER and minimum RX DC power versus<br>data rate over a fixed TX/RX distance of 50cm. (c) Measured BER versus TX/RX<br>distance at different TX DC power for 4Mb/s OOK signals |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fig. 5.12 THz pico-radio communication link using ASK modulations with a symbol rate of 4 Msym/s. The results are summarized as maximum communication distance versus number of bits |
| Fig. 5.13 Traditional I/Q wireless transceiver architecture |
| Fig. 5.14 Existing polar direct modulation TX architecture |
| Fig. 5.15 Block diagram of TDC based hybrid polar data converter in RX architecture |
| Fig. 5.16 Polar RX working principle with 16-QAM baseband signal |
| Fig. 5.17 FFT results of the filtered and unfiltered signal |
| Fig. 5.18 Direct RF demodulation with 64-QAM signal |
| Fig. 5.19 A comparison between Cartesian I/Q converter and polar converter |
| Fig. 5.20 Minimum amplitude and phase number of bits for rectangular 64-QAM with 25 dB EVM |
| Fig. 5.21 Rectangular 64-QAM and 64-APSK constellations |
| Fig. 5.22 Rectangular 64-QAM and 64-APSK constellations with phase noise |
| Fig. 5.23 Rectangular 64-QAM and 64-APSK constellations with nonlinear distortion |
| Fig. 5.24 Rectangular 64-QAM and 64-APSK constellations with thermal noise |
| Fig. 5.25 Proposed polar receiver with 64-QAM signal: top, transmitter baseband I/Q signals; middle, transmitted RF signal; bottom, constellations of TX baseband, RF band, and RX received baseband |
| Fig. 5.26 Proposed polar receiver with proposed 64-APSK signal: top, transmitter baseband I/Q signals; middle, transmitted RF signal; bottom, constellations of TX baseband, RF band, and RX received baseband |
| Fig. 5.27 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of phase noise |
| Fig. 5.28 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of nonlinearity distortion |
| Fig. 5.29 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of thermal noise |

# List of Tables

| Table 3-1 Performance comparison with recently reported TDCs             |
|--------------------------------------------------------------------------|
| Table 3-2 Monte Carlo comparison of unit delay and number of delay cells |
| Table 4-1 Specifications of a three-step TDC                             |
| Table 4-2 Measured ADPLL performances and comparisons                    |
| Table 5-1 PAPR comparison between rectangular QAM and proposed APSK      |

# **Chapter 1** Introduction

# **1.1 The Motivation of Time Information Extraction**

Time, a unique presence, was related to the three-dimensional geometry of the universe by philosophers in Ancient Greece. The link between time and space was established scientifically in 1905 by Albert Einstein. Acting as a distinct dimension, time gradually affects human life and stores information in its own way.

Humans have measured time since their earliest days on Earth, and time measurement has always played a crucial role in the understanding of nature. Starting with sun-, sand-, or water-driven devices, we are now able to use temporal measurement equipment based on the most sophisticated cesium resonators.

For a long period of history, a quartz clock or a stopwatch was adequate for normal human activities. However, in the 21<sup>st</sup> century, on the edge of the transition from 4<sup>th</sup> generation (4G) to 5<sup>th</sup> generation (5G) wireless communication systems and the deployment of internet-of-things (IoT) networks, integrated circuit (IC) designers are seeking much more accurate time measurement solutions. Overcoming the unprecedented challenges of high-throughput data stream demands and portable or wearable capability in 5G and IoT systems requires seeing things differently. Utilizing time information in IC designs gives us another degree of freedom that can help us overcome these obstacles through a different dimension.

# **1.2** The Influence of the Advanced Deep Submicron Silicon Technology on Time Interval Measurement

Microelectronics design, especially the design of time-related circuits, relies heavily on process technology. The first integrated circuit was implemented back in 1958 [1]-[3]. Some widely used electronic circuits, such as analog-to-digital converters (ADCs) [4] and phase-locked loops (PLLs) [5], [6], were implemented in solid state shortly after the first IC was invented. However, time domain circuits, for instance time-to-digital converters (TDCs) and digital-to-time converters (DTCs), can barely be found until the late 1980s [7], [8].

Unlike analog circuits, which deal with information in the voltage domain, time-based signal processing circuits depend on the switching speed of transistors, because the information is stored in the time interval between certain events. Back when only large feature size processes were available, alternative approaches were invented to deal with the information embedded in the time domain. Time signal extraction leveraged analog-to-digital conversion techniques with large capacitors, which convert a time interval into the voltage domain. As a cumbersome passive component, a capacitor is sometimes too large to integrate on a microchip, for example the capacitors in an analog PLL's loop filter and the sampling capacitors in ADC designs. The performance of an analog circuit relies not on the speed of circuit transitions but rather on the shape of the transistor characteristics. In contrast to digital circuits, analog circuits suffer from technology scaling, which degrades those characteristics and leads to deteriorated intrinsic gain and output resistance.

From some perspectives, time domain signal processing is more similar to digital building blocks, which take full advantage of reduced transistor dimensions in terms of transition speed, area, and power consumption. Moreover, fast switching transitions reduce the susceptibility to noise, especially transistor flicker noise. After feature sizes shrank below 100 nm and reached the so-called deep submicron realm, the benefit of time domain signal processing finally emerged. The reduction of the gate delay results in continuously improving temporal resolution, which is exactly the opposite of the trend in amplitude resolution caused by the reduced supply voltage in deep submicron technology [9]. The resolution of direct time interval measurement using a TDC can reach the level of several pico-seconds (ps) or even femto-seconds (fs) [10]-[12]. Engineers have begun to steer the design flow back to exploiting time information directly. Converting time into the analog voltage domain and using an ADC for digitization is no longer a reasonable approach. Some ADC designers are even seeking alternative ADC architectures based on TDCs [13]-[15]. Equipped with pico-second level TDCs, ADPLLs are now able to replace traditional analog PLLs in many different circumstances [16]-[19].

There are definitely drawbacks to small feature size processes, such as the degradation of power supply rejection, increased leakage current, and a more complicated layout design procedure. Still, the advantages are overwhelming and suggest that we pay more attention to the time domain. Very often, time domain signal processing and time-to-digital implementations cause a one-time overhead in area, power, and design complexity. The new functionality and performance improvement justify this extra expense. Moreover, this expense will gradually vanish in advanced technology generations, and the full advantage of the increased performance as well as the new functionality will pay off within one or two generations.



Fig. 1.1 Generic on-chip time domain signal processing system.

# **1.3 Time Domain Mixed Signal Processing Applications**

The time-to-digital converter, together with the digital-to-time converter, plays a critical role in time domain signal processing, as depicted in Fig. 1.1. The purposes of these two modules are similar to those of the ADC and DAC in conventional voltage domain circuits, which convert between analog physical quantities and digital signals. The TDC itself is a hot topic in today's integrated circuit design field, and its derivatives are emerging rapidly thanks to technology scaling and are being fed into various applications in the IC industry.

Most often, the TDC is associated with all-digital PLL (ADPLL) frequency synthesizers, where the TDC serves as a phase detector [20]-[23]. The phase-locked loop (PLL) is widely employed in many integrated systems, such as wireline and wireless communication systems and microprocessors. It is used for system clock generation, frequency synthesis, data recovery, duty-cycle enhancement, etc. With the aid of a TDC, the ADPLL gains the extra ability to adjust loop parameters, such as loop bandwidth and response time, adaptively and automatically [24]. Furthermore, thanks to high-performance TDCs, some novel concepts such as direct phase modulation through a PLL have become reality [25]-[27].

While the ADPLL is likely the most popular TDC application, others are arising rapidly. In the beginning, analog-to-digital converters were adopted for extracting the information embedded in the time domain. However, a subtle turnaround is taking place right now. With the development of semiconductor technology, the switching speed of transistors has become much faster than ever. Time domain signal processing, especially the TDC, takes full advantage of technology scaling and results in a low-power, low-cost, and high-performance solution. Meanwhile, the room for traditional analog-to-digital conversion is shrinking gradually, owing to the reduced power supply in deep sub-micron processes. Converting a voltage signal into the time domain provides designers with a brand new way to solve these problems.



Fig. 1.2 8-PSK modulation and 16-QAM modulation constellations.

Together with the blooming of the personal electronic device market, wireless communication has become one of the biggest branches of the integrated circuit industry. Among all kinds of digital communication protocols, phase modulation is the most widely adopted scheme. The vast majority of commercial protocols, such as Wi-Fi, Bluetooth, 3G, 4G and even 5G, use phase-related modulations like phase-shift keying (PSK) or quadrature amplitude modulation (QAM). Information in these modulations is encoded entirely or partially in discrete phases, as shown in Fig. 1.2. Phase is essentially a time domain signal at a given frequency. Applying time domain signal processing techniques to wireless communication can partially remove the burden on the analog domain and relieve the design complexity of some of the modules in wireless transceivers, such as the power amplifier (PA) on the transmitter side and the ADC on the receiver side.
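Because a phase at a known carrier frequency maps directly to a time offset, the temporal resolution needed to separate adjacent constellation phases can be estimated with a short calculation. The Python sketch below is illustrative only: the function name `phase_to_time_ps` and the 2.4 GHz carrier are assumptions, not values taken from this work.

```python
import math

def phase_to_time_ps(phase_rad: float, carrier_hz: float) -> float:
    """Time offset (in ps) equivalent to a phase shift at a given carrier frequency."""
    # One full carrier period corresponds to 2*pi radians of phase.
    period_ps = 1e12 / carrier_hz
    return phase_rad / (2.0 * math.pi) * period_ps

# Adjacent 8-PSK symbols are separated by 2*pi/8 radians.
# The 2.4 GHz carrier is an assumed example value.
step_8psk = phase_to_time_ps(2.0 * math.pi / 8, carrier_hz=2.4e9)
print(f"8-PSK phase step at 2.4 GHz: {step_8psk:.2f} ps")
```

Under these assumptions, a phase step of one eighth of a carrier cycle corresponds to roughly 52 ps, which is within reach of a TDC with pico-second level resolution.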

Terahertz (THz) wireless communication is another field attracting great interest and is expected to meet the ever-increasing demand for high-throughput data streams [28]-[30]. The TDC is useful in this area as well. A low-power digitized super-regenerative terahertz wireless receiver is one successful case that demonstrates the combination of these two technologies [31]. The TDC's potential also shows in some emerging areas, such as the internet-of-things (IoT) and machine learning based artificial intelligence (AI). Embedded in the sensor nodes of an IoT network, a TDC can help the network plan the activation timing and duration, which directly affect the lifespan and accuracy of each IoT node.

Regarding artificial intelligence, one of the biggest challenges in deep neural networks is limited memory resources. Restricted memory volume and bandwidth have already become a significant obstacle to the progress of AI. However, even lower organisms with only a few hundred neurons exhibit some basic learning and memory behaviors. From this point of view, the actual bottleneck is not the memory limitation itself but the way the limited memory is used. Scientists are working hard to investigate and mimic the learning mechanisms of the human brain. A mechanism called long short-term memory is believed to be a crucial part of saving memory. The distinction between long-term and short-term memory is made in the time domain, where time domain signal processing can be utilized. Furthermore, most classic computer architectures and digital circuits today are synchronous systems. Memory is occupied and released in synchrony with a centralized clock, which is straightforward and easy to implement but inefficient and uneconomical. Conversely, our brain is not a synchronous system; there is evidently no organ in the brain working as a clock generator. Autonomous learning ability and the memorization process are related to asynchronous events triggered by time sequences. Hence, time domain signal processing can also be useful for improving artificial intelligence architectures.

# 1.4 Organization of Dissertation

This dissertation focuses on time-to-digital conversion based signal processing techniques. The advantages and importance of time domain signal representation have been emphasized above in this chapter. Chapter 2 introduces some basic information about the time-to-digital converter circuit, the key module in time domain processing. The history of TDC development and evolution is presented, and prior TDC architectures are listed and summarized.

In Chapter 3, several advanced TDC performance improvement techniques dealing with different TDC parameters, such as linearity, detection range, and power consumption, are presented and discussed individually. Two state-of-the-art TDC designs are explored. One targets linearity improvement with reduced power consumption. The other focuses on detection range expansion.

Three different applications utilizing TDC and time domain signal processing concept are presented and analyzed in the following chapters. The relationship between TDC linearity and the spur performance in ADPLL is revealed in Chapter 4. A super-regenerative terahertz wireless communication system utilizing TDC based digitization techniques is provided in Chapter 5. And Chapter 6 gives a new approach to demodulate quadrature amplitude modulation (QAM) signals in wireless communication system. Theoretical analysis, system and circuit level simulation, as well as silicon-based prototype measurement results are available in corresponding chapters. Finally, the dissertation concludes in Chapter 7 with future research topics suggested.

# Chapter 2 Time Domain Information Digitization Techniques

# **2.1 Introduction**

As shown in Fig. 1.1, the information embedded in the time domain can normally be utilized only after it has been converted into digital data. The first direct TDC predecessor was invented by Bruno Rossi in 1942 for measurement in physics experiments. Due to the technology limitations, it used a capacitor to transform the time interval into the voltage domain. The corresponding voltage was then converted to digital bits with other analog-to-digital devices [32]. Transistor-based time-to-digital equipment was invented in the 1970s. A universal time interval counter, shown in Fig. 2.1, was delivered by Hewlett-Packard (HP) in 1978. By adopting the Vernier concept, it achieved 20-ps resolution with a range of  $\pm 10$  seconds and frequency up to 100 MHz. However, the appearance of silicon-based integrated TDCs was postponed until the late 1980s by the excessive timing requirements placed on the process technology [7], [8], [33]. While some fundamental concepts of dividing time into measurable intervals, like the Vernier mechanism (Pierre Vernier 1584-1638) [34], [35] and time stretching or amplifying [36], are still up to date, the implementation has changed tremendously during the past few decades. Starting with vacuum tubes, those ideas are now implemented in deep sub-micron complementary metal-oxide-semiconductor (CMOS) technologies.



Fig. 2.1 HP 5370A universal time interval counter with 20-ps resolution, 10 seconds range and frequency up to 100 MHz.

This chapter introduces some basic information about the time-to-digital converter. Frequently used TDC performance indicators are listed and explained. Commonly adopted TDC architectures are presented and evaluated in the following sections as well. Their performance strengths and weaknesses are analyzed and compared.

# 2.2 Time-to-Digital Converter Key Specifications

Both TDCs and ADCs are data converters that convert analog physical quantities into the digital domain. Generally, most of the terms used to characterize TDCs can be borrowed directly from ADC design. Although engineers use common terms to describe these converters, the way designers specify TDC performance in data sheets is sometimes confusing, even for people inside this area. Still, to fully understand these data converters, it is necessary to understand the specifications. The following paragraphs list and explain the specifications that commonly appear in papers and data sheets to describe the performance of TDCs.

## 2.2.1 Basic Structural Parameters

### 2.2.1.1 Resolution

Resolution is the most elementary parameter in TDC design and should be fixed first to meet the system requirement. It represents the finest time difference that the TDC can detect, which is around 1 ps with current CMOS process technology. A finite resolution causes quantization noise, which directly affects the ADPLL's in-band phase noise, one of the crucial performance indicators of ADPLL designs. The relationship between TDC resolution and ADPLL in-band phase noise will be covered in the following chapters.

### 2.2.1.2 Detectable Range (Number of Bits)

Detectable range is the maximum time interval that a TDC can measure. It is likewise linked to ADPLL performance. An appropriate detectable range balances the ADPLL's locking time, lock accuracy, and total power consumption. The detectable range can also be expressed as a number-of-bits (NoB), which is commonly used in ADC applications. Since the resolution of a TDC is set in advance, the detectable range can be calculated as *resolution*× $2^{NoB}$ . In contrast, the NoB of an ADC normally implies its resolution, since the input range an ADC can handle is limited by its power supply voltage, which is only approximately 1 V in today's deep sub-micron CMOS technology. Consequently, an ADC with a larger NoB has a finer resolution.
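The different roles of the NoB in a TDC and in an ADC can be made concrete with a few lines of Python. This is a minimal sketch: the function names and the example values (2 ps resolution, 1000 mV full scale) are assumptions chosen for illustration.

```python
def tdc_detectable_range_ps(resolution_ps: float, nob: int) -> float:
    """TDC: the resolution is fixed first, so more bits extend the range."""
    return resolution_ps * (2 ** nob)

def adc_resolution_mv(full_scale_mv: float, nob: int) -> float:
    """ADC: the input range is fixed by the supply, so more bits refine the LSB."""
    return full_scale_mv / (2 ** nob)

# Assumed example values: 2 ps TDC resolution, 1000 mV ADC full scale.
for nob in (8, 10, 12):
    rng = tdc_detectable_range_ps(2.0, nob)
    lsb = adc_resolution_mv(1000.0, nob)
    print(f"NoB={nob:2d}: TDC range = {rng:7.0f} ps, ADC LSB = {lsb:.4f} mV")
```

In this sketch, increasing the NoB widens the TDC range at constant resolution, while it shrinks the ADC step at constant input range.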

## 2.2.2 Performance Characteristics

As a data converter, a TDC should ideally provide a linear conversion. However, deviation of the TDC measurement from the ideal is inevitable due to process-voltage-temperature (PVT) variations, mismatches in the manufacturing process common to all integrated circuits (ICs), and various sources of inaccuracy in the conversion from an analog time input to digital output codes. The TDC performance specifications quantify the nonlinearity caused by the TDC itself.

The linearity specifications are generally divided into two categories: DC accuracy and dynamic performance. TDCs are used to measure either a relatively static time interval signal (for instance, in an integer-*N* ADPLL) or a dynamic signal (such as in a fractional-*N* ADPLL or in a direct digital phase modulation loop). The application determines which specifications the designer will consider the most important.

### 2.2.2.1 DC Accuracy

The specifications of TDCs describing DC accuracy are gain error, offset error, differential nonlinearity (DNL), and integral nonlinearity (INL). These four specifications shape a complete description of TDCs' DC accuracy.



Fig. 2.2 Ideal transfer function of a TDC.

#### • Transfer Function

The transfer function or transfer curve of a TDC is a plot of the digital codes output by the TDC versus the measured time interval. Such a plot is a staircase of 2<sup>*NoB*</sup> discrete codes, where *NoB* is the number of bits covering the TDC's detectable range. An ideal transfer function for a 3-bit TDC is depicted in Fig. 2.2.

#### • Gain Error

Gain error describes the difference between the ideal code transition from code zero to the highest output code and the actual measured transition from code zero to the highest output code. This can also be observed as a drift in slope of the transfer function line as shown in Fig. 2.3.



Fig. 2.3 Gain error in TDC transfer function.

#### • Offset Error

An ideal transfer function line intersects the origin of the plot. The first code transition boundary occurs at 1/2 least-significant-bit (LSB), as shown in Fig. 2.2. When offset error exists, a left or right shift of the entire transfer function along the input time interval axis can be observed, as shown in Fig. 2.4.



Fig. 2.4 Offset error in TDC transfer function.



Fig. 2.5 DNL in TDC transfer function.

#### • DNL

Under ideal conditions, every step on a TDC's transfer function curve is uniform in width and equal to the TDC resolution. The deviation of each step width from the TDC resolution is the differential nonlinearity (DNL), expressed in LSBs. It can be observed as unevenly distributed step widths in the transfer curve. In Fig. 2.5, a selected digital output code is shown with a larger width than the previous code. This difference is DNL error. DNL is calculated as shown in Equation (2.1),

$$DNL = \frac{T_{n+1} - T_n}{Resolution_{TDC}} - 1 \tag{2.1}$$

where  $T_n$  is the input time interval at which the *n*-th code transition occurs.

#### • INL

The integral nonlinearity (INL) is the deviation of a TDC's transfer function from a straight line that connects the highest and lowest data points and can be treated as an ideal transfer curve with infinite resolution, as shown in Fig. 2.6. INL error is the difference between the ideal time intervals at which code transitions occur and the actual input time intervals. INL is related to DNL: the INL error at any given code of the TDC transfer function is the sum of the DNL errors of all previous codes.



Fig. 2.6 INL in TDC transfer function.
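The relationship between DNL and INL described above can be sketched in Python. This is an illustrative sketch, not the characterization procedure used in this work: the function name `dnl_inl`, the end-point choice of the LSB, and the example transition times are assumptions. DNL is taken as the deviation of each step width from one LSB, and INL as the running sum of DNL.

```python
def dnl_inl(transitions, resolution=None):
    """Return (dnl, inl) in LSB from measured code-transition times.

    transitions[n] is the input time interval at which the n-th code
    transition occurs. If resolution is None, the average step width
    (end-point fit through the first and last transition) is one LSB.
    """
    widths = [t1 - t0 for t0, t1 in zip(transitions, transitions[1:])]
    if resolution is None:
        resolution = (transitions[-1] - transitions[0]) / len(widths)
    # DNL: deviation of each step width from one LSB.
    dnl = [w / resolution - 1.0 for w in widths]
    # INL: accumulated DNL of all previous codes.
    inl, running = [], 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return dnl, inl

# Hypothetical transition times in ps: steps of 10, 12, 8 and 10 ps.
dnl, inl = dnl_inl([5.0, 15.0, 27.0, 35.0, 45.0])
print("DNL (LSB):", [round(d, 3) for d in dnl])
print("INL (LSB):", [round(i, 3) for i in inl])
```

With the end-point fit, the INL returns to zero at the last code by construction, which matches the straight line through the highest and lowest data points.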

### 2.2.2.2 Dynamic Performance

A data converter's dynamic performance is normally analyzed in the frequency domain and is typically calculated by performing a fast Fourier transform (FFT) on the output codes of the TDC. In Fig. 2.7, the fundamental frequency is the input signal frequency, that is, the signal measured with the TDC. The rest of the spectral energy is considered unwanted, including all kinds of noise and harmonic distortion. Signal distortion is mainly caused by nonlinearity in the conversion transfer function. Therefore, INL and DNL affect the dynamic performance of a TDC.



Fig. 2.7 Spectrum plot quantization noise floor for a 14-bit data converter (16384-point FFT).

#### • Signal-to-Noise Ratio (SNR)

The signal-to-noise ratio (SNR) is the ratio of the input signal power to the noise power (excluding harmonic distortion), expressed in decibels (dB), as shown in Equation (2.2).

$$SNR(dB) = 10\log_{10}\left(\frac{P_{signal}}{P_{noise}}\right) \tag{2.2}$$

where  $P_{signal}$  is the power of the signal and  $P_{noise}$  is the power of the noise. It compares the expected noise with the input signal. Theoretically, the only noise contributor in a TDC is quantization noise, which is determined by the TDC resolution and can be calculated directly.

The temporal quantum size  $A_q$  is the TDC's resolution, or one LSB:

$$A_q = \frac{Detectable \ Range}{2^{NoB}} = 1 \ \text{LSB} \tag{2.3}$$

Finite temporal resolution introduces a quantization error between the analog input time interval and the reconstructed output signal. Assuming the quantization error  $\epsilon$  is uniformly distributed from -1/2 LSB to +1/2 LSB, its mean-square value can be expressed as

$$E\{\epsilon^2\} = \frac{1}{A_q} \int_{-\frac{1}{2}A_q}^{\frac{1}{2}A_q} \epsilon^2 d\epsilon = \frac{1}{A_q} \left[\frac{\epsilon^3}{3}\right]_{-\frac{1}{2}A_q}^{\frac{1}{2}A_q} = \frac{A_q^2}{12} \tag{2.4}$$

This quantization noise is treated as uncorrelated noise spread over the entire frequency band. Using this result, the maximum theoretical SNR can be calculated. The root-mean-square (rms) value of a full-scale sinusoidal input time interval signal is

$$A_{rms} = \frac{2^{\text{NoB}}A_q}{2\sqrt{2}} \tag{2.5}$$

Thus, the theoretical SNR is

$$\operatorname{SNR}(\mathrm{dB}) = 10\log_{10}\left(\frac{A_{rms}^{2}}{E\{\epsilon^{2}\}}\right) = 10\log_{10}\left(\frac{3}{2}2^{2NoB}\right) = 6.02\,NoB + 1.76 \tag{2.6}$$
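Equation (2.6) can be checked numerically by quantizing a full-scale sinusoid with an ideal quantizer and comparing the measured signal-to-error power ratio with the closed-form result. The Python sketch below is only a sanity check under stated assumptions: the quantum is normalized to $A_q = 1$, the sample count is arbitrary, and the sampling frequency ratio is an assumed irrational value chosen to spread the samples over all phases.

```python
import math

def ideal_snr_db(nob: int) -> float:
    """Closed-form SNR of Equation (2.6): 10*log10(1.5 * 2^(2*NoB))."""
    return 10.0 * math.log10(1.5 * 2.0 ** (2 * nob))

def simulated_snr_db(nob: int, n_samples: int = 20000) -> float:
    """Quantize a full-scale sinusoid with 1-LSB steps and measure the SNR."""
    amplitude = 2.0 ** nob / 2.0  # peak amplitude in LSB, with A_q = 1
    signal_power = 0.0
    error_power = 0.0
    for i in range(n_samples):
        # Assumed irrational frequency ratio (golden ratio fraction).
        x = amplitude * math.sin(2.0 * math.pi * i * 0.6180339887498949)
        q = math.floor(x + 0.5)  # ideal uniform quantizer
        signal_power += x * x
        error_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / error_power)

for nob in (6, 8, 10):
    print(f"NoB={nob:2d}: ideal = {ideal_snr_db(nob):.2f} dB, "
          f"simulated = {simulated_snr_db(nob):.2f} dB")
```

The simulated values should land close to the closed-form ones, with small deviations because the error of a quantized sinusoid is only approximately uniform.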

#### • Signal-to-Noise and Distortion Ratio (SNDR)

Nonlinearity in the data converter results in harmonic distortion when analyzed in the frequency domain. Such distortion is observed as spurious tone or spur in the FFT at harmonics of the measured signal as illustrated in Fig. 2.7.

Signal-to-noise and distortion ratio (SNDR or SINAD) offers a more comprehensive picture by including the noise and harmonic distortion together. It can be calculated using Equation (2.7).

$$SNDR(dB) = 10\log_{10}\left[\frac{A_1^2}{A_2^2 + A_3^2 + \ldots + A_n^2 + A_{noise}^2}\right] \tag{2.7}$$

where  $A_1$  is the amplitude of the fundamental, and  $A_n$  is the amplitude of the n-th harmonic tone. For an *M*-point FFT result, if the fundamental is in frequency bin *m* (with amplitude of  $A_m$ ), the SNDR can be obtained from the FFT result as

$$SNDR(dB) = 10\log_{10}\left[A_m^2\left(\sum_{k=1}^{m-1}A_k^2 + \sum_{k=m+1}^{M/2}A_k^2\right)^{-1}\right] \tag{2.8}$$

To avoid any spectral leakage around the fundamental tone, often several bins around the fundamental are ignored. Effective-Number-of-Bits (ENoB) is simply the signal-to-noise-and-distortion ratio expressed in bits rather than decibels by solving the "ideal SNR" Equation (2.6) as

$$ENoB = \frac{SNDR - 1.76 \text{ dB}}{6.02 \text{ dB/bit}} \tag{2.9}$$

In the presentation of measured results, ENoB is equivalent to SNDR, differing only in the scaling of the vertical axis.
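Equations (2.8) and (2.9) translate directly into a few lines of Python operating on the single-sided FFT bin amplitudes. This is a minimal sketch: the function names, the `guard` parameter for skipping leakage bins around the fundamental, and the example spectrum are hypothetical and not measured data.

```python
import math

def sndr_db(bin_amplitudes, m, guard=0):
    """SNDR per Equation (2.8).

    bin_amplitudes[k] is the amplitude of FFT bin k for k = 0..M/2.
    Bin 0 (DC) is excluded; bins within `guard` of the fundamental bin m
    are ignored to avoid counting spectral leakage.
    """
    signal = bin_amplitudes[m] ** 2
    noise_and_distortion = sum(
        a * a for k, a in enumerate(bin_amplitudes)
        if k >= 1 and abs(k - m) > guard
    )
    return 10.0 * math.log10(signal / noise_and_distortion)

def enob(sndr: float) -> float:
    """Effective number of bits per Equation (2.9)."""
    return (sndr - 1.76) / 6.02

# Hypothetical spectrum: fundamental in bin 2, a harmonic in bin 4.
spectrum = [0.0, 0.001, 1.0, 0.001, 0.01, 0.001]
s = sndr_db(spectrum, m=2)
print(f"SNDR = {s:.2f} dB, ENoB = {enob(s):.2f} bits")
```

In practice the bin amplitudes would come from a windowed FFT of the TDC output codes, and the guard width would be chosen to match the window's main lobe.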

#### • Spurious-Free Dynamic Range (SFDR)

Spurious-free dynamic range (SFDR) is the difference, in dB, between the power of the fundamental tone and the power of its peak harmonic component or of the highest spurious tone. Spurious tones are generated at harmonics of the input signal due to the nonlinear conversion transfer function, or at sub-harmonics of the sampling frequency due to mismatch or on-chip clock coupling.
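As a brief illustration, SFDR can be read from the same kind of single-sided spectrum by comparing the fundamental with the largest remaining bin. The function name `sfdr_db` and the example amplitudes below are hypothetical.

```python
import math

def sfdr_db(bin_amplitudes, m):
    """SFDR in dB: fundamental (bin m) relative to the largest other non-DC bin."""
    worst_spur = max(
        a for k, a in enumerate(bin_amplitudes) if k >= 1 and k != m
    )
    return 20.0 * math.log10(bin_amplitudes[m] / worst_spur)

# Hypothetical spectrum: fundamental in bin 2, largest spur in bin 4.
spectrum = [0.0, 0.001, 1.0, 0.001, 0.01, 0.001]
print(f"SFDR = {sfdr_db(spectrum, m=2):.1f} dB")
```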

# 2.3 TDC Design Architectures

Similar to ADCs, various types of TDC architectures have been developed, with differing resolution, detectable range, linearity, conversion speed, bandwidth, design complexity, power consumption, and other economic concerns, to fulfill a broad range of time conversion application requirements.

## **2.3.1** Counter Based TDC

Before deep sub-micron CMOS technologies were available, two types of TDC, counter-based and ADC-based, were commonly used and are considered the first generation of TDCs. If high resolution is not required, that is, if a resolution coarser than about 1 ns is acceptable, a counter-based TDC is adequate for the conversion. The TDC is simply a high-frequency counter that increments every clock cycle. The implementation is straightforward, with two basic building blocks: a frequency generator and a counter. The counter is controlled by an enable signal, which is triggered by the input signal. The content of the counter at the end of the measurement represents the input time interval. With this approach, the measurement is an integer number of clock cycles. Thus, the resolution is limited by the clock period; for instance, a 1 ns temporal resolution requires a clock frequency of 1 GHz. Finer resolution can only be achieved with a faster clock from the frequency generator, and the accuracy of the measurement depends on the purity of the generated frequency.
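The behavior of a counter-based TDC can be modeled in a few lines of Python. This is a behavioral sketch under simplifying assumptions, not a circuit description: clock edges are placed at integer multiples of the clock period, times are integers in pico-seconds, and the `start_offset_ps` parameter (the delay between the last clock edge and the start of the interval) is a hypothetical addition used to show the count ambiguity.

```python
def counter_tdc(interval_ps: int, clock_period_ps: int, start_offset_ps: int = 0) -> int:
    """Count the clock edges that fall inside the measured interval.

    Edges are assumed at integer multiples of clock_period_ps. The interval
    begins start_offset_ps after an edge and lasts interval_ps.
    """
    start = start_offset_ps
    stop = start_offset_ps + interval_ps
    # Number of edges t = k * period with start < t <= stop.
    return stop // clock_period_ps - start // clock_period_ps

def required_clock_ghz(resolution_ps: float) -> float:
    """Clock frequency (GHz) whose period equals the desired resolution."""
    return 1000.0 / resolution_ps

# A 3.7 ns interval measured with a 1 GHz clock (1000 ps period).
print(counter_tdc(3700, 1000))                       # start aligned to an edge
print(counter_tdc(3700, 1000, start_offset_ps=500))  # start mid-period
print(required_clock_ghz(1000.0), "GHz for 1 ns resolution")
print(required_clock_ghz(1.0), "GHz for 1 ps resolution")
```

The two counts for the same interval differ by one, which illustrates that the result depends on the phase of the interval relative to the clock, and the last line shows why pico-second resolution would demand a terahertz clock.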

A PLL-based frequency synthesizer is commonly adopted owing to its ability to produce frequencies of several gigahertz with great stability. Although terahertz PLLs have been reported recently [37]-[39], which means pico-second resolution is realizable in principle, the power consumption of the PLL and the counter operating at such a high frequency disqualifies this approach for sub-nanosecond TDC applications.

## 2.3.2 Analog-to-Digital Conversion Based TDC

ADC-based time interval measurement is another approach from the period before advanced CMOS technology was available. By utilizing advanced techniques from the ADC area, such as sigma-delta modulation (SDM) and successive-approximation register (SAR) conversion, this type of TDC achieves much finer resolution even with large feature size technologies [32], [40], [41].

This type of TDC also consists of two major building blocks, an integrator and an ADC. The integrator is normally built with a capacitor and a constant current source. First, the capacitor is charged by the current source during the measured time interval, which converts the time interval into a voltage level. Second, the converted voltage is digitized by an ADC. A basic block diagram is presented in Fig. 2.8.
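The two-step conversion can be sketched as an ideal behavioral model in Python. All component values below (10 µA current, 1 pF capacitor, 1000 mV reference, 10-bit ADC) are assumptions chosen for illustration, and the model ignores the nonidealities of a real integrator and ADC.

```python
def adc_based_tdc(interval_ns, current_ua=10.0, cap_pf=1.0, vref_mv=1000.0, nob=10):
    """Ideal ADC-based TDC: charge a capacitor, then quantize the voltage."""
    # Step 1: time-to-voltage conversion, V = I * T / C.
    # With uA, ns and pF the result is directly in mV.
    v_mv = current_ua * interval_ns / cap_pf
    # Step 2: ideal ADC with 2^nob codes over the range 0..vref_mv.
    code = int(v_mv * (2 ** nob) // vref_mv)
    return min(code, 2 ** nob - 1)  # saturate at full scale

def time_lsb_ns(current_ua=10.0, cap_pf=1.0, vref_mv=1000.0, nob=10):
    """Time interval corresponding to one ADC LSB: (Vref / 2^nob) * C / I."""
    return (vref_mv / 2 ** nob) * cap_pf / current_ua

print(adc_based_tdc(50.0))          # 50 ns charges the capacitor to 500 mV
print(adc_based_tdc(200.0))         # beyond full scale, the code saturates
print(f"{time_lsb_ns() * 1000:.2f} ps per LSB")
```

With these assumed values, one ADC step corresponds to slightly less than 100 ps, which shows how the ADC resolution, the capacitor, and the charging current together set the temporal resolution.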

Although pico-second level resolution is achievable owing to well-developed ADC techniques, the TDC performance is degraded by several practical issues. Primarily, the time-to-voltage conversion has to meet the linearity demand of the overall TDC. The integrator used in the TDC is similar to a charge pump, in which a current source is connected to a capacitor during the measurement time interval. However, the output resistance of the current source is limited, especially in deep sub-micron technology, which degrades the linearity performance. This issue can be partially mitigated by employing an operational amplifier (op-amp) based active integrator. Still, the inherently limited bandwidth of the op-amp greatly reduces the conversion speed of the TDC.
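The effect of finite output resistance on the time-to-voltage conversion can be sketched numerically. The model below is an assumption made for illustration only: the current source is treated as an ideal source in parallel with a resistance `r_out_ohm`, so the capacitor voltage follows a first-order exponential instead of an ideal ramp. The component values are hypothetical.

```python
import math


def ideal_ramp_v(t_s: float, i_src_a: float, cap_f: float) -> float:
    """Capacitor voltage for an ideal current source: V = I * t / C."""
    return i_src_a * t_s / cap_f


def finite_rout_ramp_v(t_s: float, i_src_a: float, cap_f: float,
                       r_out_ohm: float) -> float:
    """Capacitor voltage when the source has a parallel output resistance.

    V(t) = I * R * (1 - exp(-t / (R * C))), an assumed first-order model.
    """
    tau = r_out_ohm * cap_f
    return i_src_a * r_out_ohm * (-math.expm1(-t_s / tau))


def relative_error(t_s: float, i_src_a: float, cap_f: float,
                   r_out_ohm: float) -> float:
    """Deviation from the ideal ramp, as a fraction of the ideal voltage."""
    ideal = ideal_ramp_v(t_s, i_src_a, cap_f)
    return (finite_rout_ramp_v(t_s, i_src_a, cap_f, r_out_ohm) - ideal) / ideal


if __name__ == "__main__":
    # Hypothetical values: 10 uA source, 1 pF capacitor, 1 Mohm output resistance.
    for t_ns in (1, 5, 10, 20):
        err = relative_error(t_ns * 1e-9, 10e-6, 1e-12, 1e6)
        print(f"{t_ns:2d} ns interval: relative error {err:.5f}")
```

In this model the error grows with the measured interval, so the transfer curve bends rather than shifting by a constant gain, and a larger output resistance reduces the curvature.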



Fig. 2.8 Block diagram of basic ADC based time-to-digital converter.



Fig. 2.9 Block diagram of basic flash time-to-digital converter.

### 2.3.3 Flash (Single Delay Line) TDC

Other solutions exist to push the TDC resolution below the bound set by the minimum feasible clock period [42]. Asynchronous, uniformly distributed sub-period clock phases can be generated with a tapped delay line as shown in Fig. 2.9.

The resolution of a delay line TDC is therefore determined by the propagation delay of a single delay cell. The delay cells in the delay chain are usually built with one CMOS inverter or a pair of inverters, and the resolution is limited to a few tens of pico-seconds with current technology. The TDC detectable range is the product of the number of delay cells and the resolution. This TDC structure is widely adopted in various applications due to its simple structure and low latency, as the output is available immediately at the end of the input time interval. However, just like the flash ADC, the number of elements (delay cells and comparators) grows exponentially as the number of bits increases. Additionally, mismatch between the delay cells and between the comparators directly affects the linearity performance of the TDC. The variation in propagation delay accumulates along the delay chain; thus, the jitter of the propagating signal increases towards the end of the delay line.
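A behavioral sketch of the tapped delay line may help to make the range and element-count relations concrete. It assumes ideal, identical delay cells and ideal comparators; the names `flash_tdc` and `cells_for_bits` are hypothetical, and the element count is approximated as $2^N$ for an $N$-bit converter.

```python
def flash_tdc(interval_ps: float, cell_delay_ps: float, n_cells: int) -> int:
    """Return the number of taps the start edge has passed when stop arrives."""
    # Arrival time of the start edge at each tap of the delay line.
    tap_times = [(k + 1) * cell_delay_ps for k in range(n_cells)]
    # Each comparator outputs 1 if the start edge reached its tap before stop.
    thermometer = [1 if interval_ps >= t else 0 for t in tap_times]
    return sum(thermometer)


def cells_for_bits(n_bits: int) -> int:
    """Approximate number of delay cells (and comparators) for n_bits."""
    return 2 ** n_bits


if __name__ == "__main__":
    # 16 cells of 20 ps each cover a 320 ps range with 20 ps resolution.
    print(flash_tdc(95, 20, 16))
    print(flash_tdc(1000, 20, 16))  # beyond the range: output saturates
    print(cells_for_bits(8))
```

The output code is limited by the number of cells, so extending the range at a fixed resolution requires a proportionally longer line.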

### 2.3.4 Vernier Delay Line TDC

The finest resolution that a flash TDC can reach is limited by the minimum gate delay, which is normally larger than 20 ps even with the most advanced technology, whereas the requirement set by the ADPLL's in-band phase noise is within 10 ps. To further improve resolution, the Vernier mechanism is therefore adopted, which provides a way to achieve sub-gate-delay resolution [34], [35].



Fig. 2.10 Block diagram of basic Vernier delay line time-to-digital converter.

As shown in Fig. 2.10, the Vernier TDC consists of two delay lines with slightly different unit delays. The start signal is assigned to the line with the larger delay (the slow delay line), and the stop signal is assigned to the other delay line. During the measurement, the stop signal propagates faster and eventually catches up with the start signal after passing a certain number of delay cells. The resolution of a TDC based on the Vernier delay line is the difference between the unit cell delays of the two delay chains. Therefore, a high resolution independent of the gate delay can be implemented.

Compared to the flash TDC, the latency of the Vernier delay line TDC is greatly increased and is equal to the time it takes for the two signals to catch up with each other. For a given full-scale range, the length of the delay lines in a Vernier TDC is the quotient of the maximum input time interval divided by the TDC resolution. Since the Vernier TDC resolution is normally much smaller than the gate delay, the total length can be considerably greater than that of the flash TDC. Essentially, the Vernier delay line TDC obtains its resolution at the expense of hardware cost, power consumption, conversion rate, and overall linearity performance.
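The catch-up mechanism and the resulting line length can be sketched as follows. This is a behavioral model under the assumption of ideal, matched delay cells; the delay values are hypothetical and the function names are introduced only for this example.

```python
def vernier_tdc(interval_ps: int, tau_slow_ps: int, tau_fast_ps: int,
                n_stages: int) -> int:
    """Count the stages at which the start edge still leads the stop edge."""
    code = 0
    for k in range(1, n_stages + 1):
        start_arrival = k * tau_slow_ps                # start travels the slow line
        stop_arrival = interval_ps + k * tau_fast_ps   # stop travels the fast line
        if stop_arrival <= start_arrival:
            break                                      # stop has caught up
        code = k
    return code


def stages_needed(full_range_ps: int, resolution_ps: int) -> int:
    """Line length = maximum input interval / resolution (rounded up)."""
    return -(-full_range_ps // resolution_ps)


if __name__ == "__main__":
    # Hypothetical cells: 30 ps slow, 25 ps fast, giving 5 ps resolution.
    print(vernier_tdc(12, 30, 25, 64))
    # Stages for a 500 ps range: Vernier (5 ps step) vs. flash (25 ps step).
    print(stages_needed(500, 5), stages_needed(500, 25))
```

For the same 500 ps range, the 5 ps Vernier line needs five times as many stages as a 25 ps flash line, which reflects the hardware and latency cost described above.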

### 2.3.5 Oversampling Gated Ring Oscillator (GRO) TDC

A counter-based TDC can also achieve sub-gate-delay resolution with the aid of a ring oscillator structure and oversampling. Instead of generating only one signal, the ring oscillator produces multiple phases evenly distributed over one oscillation period, and the temporal difference between adjacent phases is equal to the delay of the unit delay cells in the ring. Further resolution improvement is accomplished through oversampling [43]-[45]. According to the oversampling principle, if the signal is converted a total of $2^{2n}$ times, the combined $2^{2n}$ consecutive *m*-bit samples effectively add *n* bits and form an overall (*n*+*m*)-bit conversion result, namely a $2^n$ times resolution improvement.



Fig. 2.11 Block diagram of GRO based TDC.

In addition to oversampling, noise shaping can further improve the TDC resolution. During the $2^{2n}$ oversampled measurements, if the quantization residue left in the previous measurement cycle is preserved and carried over to the next cycle, a first-order noise shaping effect can be observed in the frequency domain. This quantization residue storage technique is embedded in the gated-ring-oscillator TDC [46]. As shown in Fig. 2.11, the delay elements in the ring are controlled by an enable signal generated according to the input time interval. When a measurement is finished, the system disables the oscillation and freezes the phases in the ring, which represent the quantization residue of the current measurement. When the next measurement begins, the oscillation starts from the previously frozen phases and the quantization residue is accumulated.

With first-order noise shaping and oversampling, the quantization noise at low frequencies is shaped to the high-frequency band. Once the high-frequency components are filtered out, the GRO TDC can achieve a temporal resolution below 1 ps. However, it is important to point out that this improved resolution is not achieved in a single-shot measurement; it is achieved only with a large oversampling ratio and after filtering the results. Consequently, the actual conversion bandwidth of this kind of TDC is limited to only a few megahertz, similar to SDM ADCs. Another drawback of the GRO TDC is caused by leakage currents. The phases preserved between measurements are stored as charge on capacitors in the delay cells, and discharge during the stop time is inevitable, especially in deep sub-micron technologies [47].
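The residue carry-over can be illustrated with a small behavioral sketch. It assumes an ideal ring whose frozen state is preserved perfectly between measurements (leakage is not modeled) and compares it with a converter that resets before every measurement. The function names and the 37 ps / 10 ps values are hypothetical.

```python
def gro_tdc(intervals_ps, cell_delay_ps):
    """Gated ring: phase transitions are counted across frozen, resumed runs."""
    elapsed_ps = 0           # total time the ring has been enabled so far
    codes = []
    for t in intervals_ps:
        before = elapsed_ps // cell_delay_ps
        elapsed_ps += t      # the ring resumes from its frozen state
        after = elapsed_ps // cell_delay_ps
        codes.append(after - before)
    return codes


def resetting_tdc(intervals_ps, cell_delay_ps):
    """Reference case: the residue is discarded after every measurement."""
    return [t // cell_delay_ps for t in intervals_ps]


def mean_estimate_ps(codes, cell_delay_ps):
    """Average the codes and convert back to picoseconds."""
    return cell_delay_ps * sum(codes) / len(codes)


if __name__ == "__main__":
    samples = [37] * 1024    # a constant 37 ps input, 10 ps cell delay
    print(mean_estimate_ps(gro_tdc(samples, 10), 10))
    print(mean_estimate_ps(resetting_tdc(samples, 10), 10))
```

Because the residue is carried forward, the accumulated error of the gated ring never exceeds one cell delay, so the averaged result converges to the input, whereas the resetting converter repeats the same truncation error in every sample. The improvement appears only after averaging, in line with the single-shot limitation noted above.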

### 2.3.6 Time Amplifier TDC

Another technique to accomplish sub-gate-delay resolution is to employ a time amplifier [36], [48], [49]. Since both ADCs and TDCs serve as data conversion modules, many techniques developed for ADCs are transferable to TDCs, and some of the ideas and architectures in the two fields are identical. Voltage amplifiers are commonly seen in ADC designs, especially in pipeline ADCs. The decision residue voltage from the previous stage is amplified and sent to the next stage. Owing to the amplification, the burden on the comparators is greatly alleviated. The compare-then-amplify cycle can continue until the residue reaches the noise floor.



Fig. 2.12 SR-latch based time amplifier and its time domain gain characteristic.

A linear amplifier with a large dynamic range is easy to implement in the voltage domain, but it is very challenging to implement in the time domain. An SR latch-based time amplifier is depicted in Fig. 2.12. With this approach, excellent resolution is reachable. However, the detectable range is limited due to the narrow dynamic range of the time amplifier [50].

## 2.4 Conclusions

The history of TDC evolution is introduced at the beginning of the chapter. Basic knowledge and design specifications, such as resolution, detectable range, INL, and DNL, are clarified and listed for reference. The relationship between TDC transfer function linearity and dynamic performance is explained.

Several commonly used TDC architectures are explained in detail and analyzed with respect to resolution, detectable range, linearity performance, and power consumption. Although counter-based TDCs and flash TDCs can be easily implemented and have high conversion speed, their resolutions are limited by the clock frequency and the gate delay, respectively. ADC-based TDCs improve resolution, but their linearity is determined by the time-to-analog conversion. A Vernier TDC formed by two delay chains with slightly different delays can achieve sub-gate-delay resolution with improved linearity. However, its detectable range and conversion rate are limited due to the reduced conversion step size. Consequently, a large number of delay stages is needed to cover the detection range, resulting in high power consumption. The gated-ring-oscillator TDC achieves fine resolution with a large range, but its nonlinearity is a drawback due to the device leakage issue. The time amplifier TDC can achieve fine time resolution and a high conversion rate, yet it suffers from limited detection range.

With different criteria, TDCs can be grouped into several categories, such as counter-based or gate-delay-based TDCs, Nyquist or oversampling TDCs, gate-delay or sub-gate-delay TDCs, and indirect-conversion or direct-conversion TDCs. However, unlike in ADC designs, the boundaries among those TDC categories are ambiguous. As for the choice of architecture, for some applications with relatively relaxed requirements just about any architecture could work well; for others, there is a "best choice" with regard to one or several specific needs. In some cases, the choice is simple because there is a clear-cut advantage to using one architecture over another.

# Chapter 3 Advanced Time-to-Digital Conversion Performance Improvement Solutions

## **3.1 Introduction**

As CMOS processes keep scaling down in modern deep sub-micron technologies, digital phase-locked loops (DPLLs) [51]-[53] have become more promising than their analog counterparts [54]-[57] owing to their capability of programming the loop parameters on the fly, performing direct digital modulation through the PLL, calibrating the loop for superior linearity performance, and scaling with technology migration. In DPLL designs, the time-to-digital converter (TDC) is one of the key building blocks [58], which directly affects the in-band phase noise and fractional spur level [59]. The in-band phase noise of a DPLL is related to the TDC quantization resolution as

$$\mathcal{L} = \frac{\left(2\pi\right)^2}{12} \left(\frac{\tau_{TDC}}{T_{DCO}}\right)^2 \frac{1}{f_{ref}}\tag{3.1}$$

where $\tau_{TDC}$ is the TDC resolution, $T_{DCO}$ is the oscillation period of the digitally controlled oscillator (DCO) output, and $f_{ref}$ is the reference frequency of the DPLL [60]. In order to lower the in-band phase noise, the TDC resolution has been reduced to around the 1 ps level according to recently reported data [61]-[64]. However, improving the TDC's linearity faces increasing challenges, particularly for high-resolution TDCs. TDC nonlinearity not only jeopardizes the DPLL's in-band phase noise, but also leads to a deteriorated fractional spur level in fractional-*N* DPLLs. When operating in fractional-*N* mode, a multi-modulus divider (MMD) toggles the division ratio between *N* and *N*+1 and generates a gradually increasing phase difference between the reference signal and the divided feedback signal [65]. Quantized by the TDC, this phase difference produces a staircase ramp signal at the TDC output, as shown in Fig. 3.1. Directly feeding this signal into the DCO through the loop filter may lead to an unacceptable spur level or unstable loop dynamics. In order to reduce the fractional spurious tones, a digital ripple cancelation technique is often employed [66]. As illustrated in Fig. 3.1, an ideal staircase ramp is generated following the variation of the loop division ratio in fractional mode. This ideal staircase ramp is subtracted from the raw TDC output to cancel its staircase component while extracting the needed DC component. As a result, a TDC output with less ripple is generated for DCO tuning, leading to improved fractional spur performance. However, this spur suppression technique is highly sensitive to the TDC's linearity.
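Equation (3.1) can be evaluated numerically to get a feel for the sensitivity to TDC resolution. The sketch below is only an illustration: the function name is hypothetical, and the operating point (1.25 ps resolution, 2.32125 GHz output, 80 MHz reference) simply combines numbers that appear elsewhere in this chapter rather than reproducing a reported measurement.

```python
import math


def inband_phase_noise_dbc_hz(tau_tdc_s: float, f_dco_hz: float,
                              f_ref_hz: float) -> float:
    """Evaluate Eq. (3.1) and return the result in dBc/Hz."""
    t_dco_s = 1.0 / f_dco_hz
    linear = ((2.0 * math.pi) ** 2 / 12.0) * (tau_tdc_s / t_dco_s) ** 2 / f_ref_hz
    return 10.0 * math.log10(linear)


if __name__ == "__main__":
    for tau_ps in (20.0, 10.0, 5.0, 1.25):
        noise = inband_phase_noise_dbc_hz(tau_ps * 1e-12, 2.32125e9, 80e6)
        print(f"tau_TDC = {tau_ps:5.2f} ps -> {noise:7.1f} dBc/Hz")
```

Since the resolution enters Eq. (3.1) squared, every halving of $\tau_{TDC}$ lowers the quantization-limited in-band phase noise by about 6 dB.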



Fig. 3.1 Illustration of in-band phase noise and fractional spur level related to TDC performance in a DPLL with ripple canceler.

Various TDC architectures for DPLL applications have been reported recently. A single delay line TDC, or flash TDC, is the most basic architecture, which quantizes the input time interval using a single chain of inverters. Although it can be easily implemented and has high conversion speed, its time resolution is limited by the CMOS gate delay, which is sensitive to process-voltage-temperature (PVT) variations [67]. A Vernier TDC formed by two delay chains with slightly different delays can achieve sub-gate-delay resolution with improved linearity, since the first-order mismatches are automatically cancelled [68]. However, its detectable range and conversion rate are greatly limited due to the reduced conversion step size. Consequently, a large number of delay stages is needed to cover the detection range, resulting in high power consumption. The Vernier ring TDC achieves fine resolution and large detectable range simultaneously with reduced hardware configured in a ring structure [69], [70], yet its conversion rate is low for large time intervals. ADC-based TDCs and $\Delta\Sigma$ TDCs achieve good linearity and resolution, but with an even poorer conversion rate [71], [72]. The time amplifier TDC can achieve fine time resolution and a high conversion rate, yet it suffers from limited detection range and high power consumption [73], [74]. The gated-ring-oscillator (GRO) TDC achieves fine resolution with a large range, but its nonlinearity is a drawback due to the device leakage issue [46].

This work presents an 8-bit, 1.25 ps resolution reconfigurable TDC based on the conventional 2-dimensional (2-D) Vernier TDC topology [75]. The proposed TDC utilizes a novel 2-D spiral comparator array whose folding points are reconfigured following the output sequence of a $2^{nd}$ order $\Delta\Sigma$ modulator in order to randomize the folding point errors that occur when the comparison path transits from one comparator line to another. The desired delays are interpolated using digital-to-time converters (DTCs), and the delay quantization errors are also randomized with a $2^{nd}$ order $\Delta\Sigma$ modulator.

Fabricated in a 45 nm CMOS SOI technology, the prototype TDC consumes 70-690 µW under a 1 V power supply at an 80 MHz conversion rate and achieves 0.4 ps maximum integral nonlinearity (INL), which compares favorably with state-of-the-art TDCs.

## 3.2 Linearity Issues Associated with 2-D Vernier TDC

To achieve reasonable in-band noise and fractional spur performance in DPLL designs, TDCs are required to have sub-gate-delay resolution according to Equation (3.1), while their detection range should cover at least one DCO oscillation cycle (e.g., 500 ps for a 2 GHz DPLL) with a reference frequency normally around 50 MHz. Considering all those constraints, the 2-D Vernier TDC is a preferred candidate architecture.



Fig. 3.2 Illustration of a conventional 2-D Vernier TDC topology with Vernier delay lines and a 2-D comparator array.

In order to achieve an improved detection range with fine resolution, a Vernier TDC with a 2-D comparator array [75], as illustrated in Fig. 3.2, was evolved from the prior-art Vernier TDC with a 1-D comparator line. The 2-D Vernier TDC has a reduced number of delay elements and a much higher conversion rate compared with other types of Vernier TDCs. However, the linearity of a 2-D Vernier TDC is more sensitive to delay variations than that of 1-D Vernier TDCs [76], [77].



Fig. 3.3 Simulated TDC transfer curve, DNL, and INL with 4% delay mismatch.

Taking a closer look at Fig. 3.2, the TDC consists of two delay chains with two different unit delays $\tau_{\rm S}$ and $\tau_{\rm F}$, respectively. The resolution is defined by the delay difference, for instance, $\tau_{\rm S} - \tau_{\rm F} = 5\Delta - 4\Delta = \Delta$ in this case. Instead of forming a single long comparator line, the 2-D Vernier TDC breaks the comparator line into multiple sections and forms a 2-D comparator array, so it uses fewer delay stages to cover the same detection range. Each of the segmented comparator lines contains *k* comparators, e.g., *k* is equal to 5 in this case. The folding points circled by the gray boxes in Fig. 3.2 indicate the locations where the comparison transits into the next comparator line. The extended sixth comparison point in the first comparator line is equal to $2\tau_{\rm S} - 1\tau_{\rm F} = 6\Delta$. In order to ensure a smooth transition between comparator lines, the two comparisons should produce identical delay responses to the input signals, namely, $6\tau_{\rm S} - 6\tau_{\rm F} = 2\tau_{\rm S} - 1\tau_{\rm F}$. In general, a linear conversion using the 2-D comparator array topology requires that

$$k\left(\tau_{S}-\tau_{F}\right)=\tau_{S}.\tag{3.2}$$

This condition demands precisely matched delays in both delay chains against PVT variations. A small delay deviation can lead to large periodic nonlinearity. Mismatches introduce slope errors, gaps, or overlaps between the comparator lines, producing periodic errors in both the differential nonlinearity (DNL) and the integral nonlinearity (INL) at the folding points of the 2-D comparator array topology. To illustrate the problem, a 4% delay mismatch is assumed in the simulation. Fig. 3.3 presents the simulated TDC transfer curve, DNL, and INL. These plots illustrate that a small delay mismatch can lead to a large nonlinearity in the 2-D Vernier TDC. Indeed, the number of periodic cycles in the nonlinearity plots corresponds to the number of comparator lines, with their peaks located at the folding points of the comparator array.
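The sensitivity of Equation (3.2) to mismatch can be checked with exact arithmetic. The sketch below uses the example values from the text ($\tau_S = 5\Delta$, $\tau_F = 4\Delta$, $k = 5$) and then applies a 4% error to the slow chain only, which is an assumption made for illustration; the text does not state how the simulated mismatch is distributed. The function name `folding_error` is hypothetical.

```python
from fractions import Fraction


def folding_error(tau_s: Fraction, tau_f: Fraction, k: int) -> Fraction:
    """Return k*(tau_s - tau_f) - tau_s; zero means Eq. (3.2) is satisfied."""
    return k * (tau_s - tau_f) - tau_s


if __name__ == "__main__":
    # Nominal delays in units of the resolution step (Delta).
    nominal = folding_error(Fraction(5), Fraction(4), 5)
    # Slow-chain delay 4% too large: 5.2 Delta instead of 5 Delta.
    mismatched = folding_error(Fraction(26, 5), Fraction(4), 5)
    print(nominal, mismatched)
```

Under this assumption, a 4% delay error produces a discontinuity of 0.8 of a nominal step at the folding point, because the error in the delay difference is multiplied by *k* along the comparator line.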

A nonlinear TDC transfer curve can lead to a high fractional spur level in a fractional-*N* DPLL. A 2-D Vernier TDC based DPLL with a digital ripple canceler was presented in [77]. Fig. 3.4 gives the measured TDC output in its fractional-*N* operation. The signal has a period of $1 / f_{\Delta} = 1 / (F \cdot f_{ref})$, where $f_{\Delta}$ is the closest fractional spur offset frequency and $F$ is the fractionality. In this case, the DPLL is running with a division ratio of 29+1/64 and a reference frequency of 80 MHz. The synthesized frequency is centered at 2.32125 GHz. The closest spur is located at $f_{\Delta} = (1/64) \times 80$ MHz = 1.25 MHz. The 2-D folding point errors can be clearly seen in the measured waveform given in Fig. 3.4. The measured TDC residue error, obtained by subtracting the ideal staircase waveform generated by the fractional accumulator from the TDC output, and its filtered version are also plotted in Fig. 3.4. Note that the residue error represents the TDC's INL. Even with the digital ripple cancelling technique, the filtered TDC residue error still shows nonlinearity associated with the folding point errors caused by the TDC's 2-D comparator array, which will be addressed later.
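The frequency arithmetic in this example can be reproduced with a short sketch. The function names are hypothetical; the numbers are the ones quoted in the text.

```python
from fractions import Fraction


def synthesized_frequency_hz(n_int: int, frac: Fraction, f_ref_hz: int) -> Fraction:
    """Output frequency for a division ratio of n_int + frac."""
    return (n_int + frac) * f_ref_hz


def closest_spur_offset_hz(frac: Fraction, f_ref_hz: int) -> Fraction:
    """Offset of the closest fractional spur: f_delta = F * f_ref."""
    return frac * f_ref_hz


if __name__ == "__main__":
    f_ref = 80_000_000
    frac = Fraction(1, 64)
    print(synthesized_frequency_hz(29, frac, f_ref))  # in Hz
    print(closest_spur_offset_hz(frac, f_ref))        # in Hz
```

Using exact fractions avoids rounding in the division ratio and returns 2 321 250 000 Hz for the carrier and 1 250 000 Hz for the spur offset, matching the values above.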



Fig. 3.4 Measured TDC transfer functions and TDC outputs after the digital ripple canceler, still showing TDC nonlinearity caused by periodic folding errors of the 2-D comparator array.



Fig. 3.5 Fourier transform reveals the relationship between filtered TDC output and up-converted fractional spur components at the DPLL output.

The fractional spurs are affected by TDC nonlinearity and can be analyzed by taking the Fourier transform of the DCO control signal, as illustrated in Fig. 3.5. The smoothed residue error in the time domain is mapped into the frequency domain as multiple tones located at $f_{\Delta}$, $2f_{\Delta}$, $3f_{\Delta}$, etc. The DCO output in the loop is modulated by the filtered residue error signal. The frequency components of this filtered control signal are up-converted to the DCO output, where they appear as fractional spurs. Due to the TDC nonlinearity, the fractional spur level is only around -40 dBc. The fractional spur level will be covered in Chapter 4.

## 3.3 Spiral Comparator Array Arrangement for 2-Dimensional Vernier TDC

The previous discussion reveals that the peak nonlinearity of a 2-D Vernier TDC appears at the comparator line folding locations. With the saw-tooth arrangement of a traditional 2-D array, the last comparator in the n<sup>th</sup> comparator line faces a much larger accumulated delay mismatch than the first comparator in the (n+1)<sup>th</sup> comparator line, leading to a discontinuous transfer curve. To break this trend, we propose to configure the 2-D comparator array in a spiral arrangement as shown in Fig. 3.6 [78]. In this arrangement, instead of folding the comparator line in a saw-tooth form in one direction, we rearrange the comparator path in a spiral shape. Referring to Fig. 3.6, the comparison points start by climbing up along the comparator line from node "0" in a similar way to the conventional 2-D Vernier TDC. When the folding point is reached, the comparison folds back counter-clockwise to the left side and continues downwards. The two sides of the comparator array have opposite comparison mechanisms: the right-hand side evaluates $m\tau_{\rm S} - n\tau_{\rm F}$ and is defined as the positive plane, while the left-hand side evaluates $n\tau_{\rm F} - m\tau_{\rm S}$ and is defined as the negative plane, where n and m are the unit delay indices of the fast and slow delay chains, respectively. As a result, the mismatches along the comparison path are partially compensated, resulting in improved linearity performance.



Fig. 3.6 Proposed Vernier TDC with a 2-D spiral comparator array.



Fig. 3.7 Comparisons among the proposed 2-D spiral comparator arrangement (scheme 1) and conventional 2-D comparator arrangements (schemes 2 and 3), indicating that better linearity is achieved by the spiral comparator array formation with fewer delay elements.

To compare different comparator arrangements, we use a reduced number of delay cells to illustrate the proposed spiral arrangement in comparison with two conventional 2-D arrangements, as shown in Fig. 3.7. Scheme '1' shows the proposed 2-D spiral comparator array and is used as the benchmark for evaluation. Schemes '2' and '3' provide two options that achieve the same resolution of "$1\tau$" and conversion range of "$20\tau$" using conventional arrangements. Scheme '2' uses the same number of unit delay cells as Scheme '1'. However, in order to maintain the "$1\tau$" resolution, the delay of its unit delay cells has to be reduced to $5\tau$, compared with $10\tau$ in Scheme '1'. Thus, Scheme '2' is more sensitive to delay mismatches and parasitic effects. Scheme '3' is built with the same unit delay as Scheme '1', yet its comparator line length is doubled to fulfill the 2-D TDC linearity requirement given in Equation (3.2), resulting in degraded linearity and increased power consumption due to longer delay lines.

From a topological point of view, the comparison path in our proposed 2-D spiral Vernier TDC forms a spiral shape. The comparison starts from the center of the comparator array and gradually fans out to the outer lines, alternating between the positive plane on the right and the negative plane on the left as the input time interval increases. If there are mismatches, the errors accumulated on the positive and negative planes partially cancel each other. In contrast, in conventional 2-D comparator arrays, the comparison path follows a saw-tooth shape, moving in one direction on the positive plane, which accumulates mismatch errors.

To further analyze the nonlinearity of the above three schemes, a theoretical model is built based on the theory presented in [75]. Two kinds of errors, an absolute delay error $\varepsilon_{Absolute}$ and a local delay error $\varepsilon_{Local}$, are added, where $\varepsilon_{Absolute}$ is a fixed delay error applied to all the unit delay cells in the delay chains and $\varepsilon_{Local}$ is a gradually increasing delay error along the delay chains, which models the unevenly distributed on-chip doping level. The simulated TDC INL in Fig. 3.7 reveals that Scheme '1' has two opposite INL slopes appearing alternately along the TDC detectable range. As a result, its INL is bounded around zero. The INL slope of Scheme '2' is the largest due to the reduced delay of its unit delay cells. Scheme '3' ends up with the largest INL due to its longest delay lines along one direction. Moreover, the transitions between comparator lines are much smoother in the spiral arrangement than in the traditional saw-tooth cases, which results in a much improved DNL as shown in Fig. 3.7.

Among these three options, the proposed spiral 2-D scheme achieves the best INL and DNL performance and has the smallest number of unit delay cells, which implies fewer mismatches and faster conversion.

## **3.4 Linearization Improvement Techniques**

### **3.4.1** Delay Interpolation of Unit Delay Cells

The nonlinearity of Vernier TDCs mainly comes from the delay errors of the delay units. Minimizing the delay error is the prerequisite for improving linearity. The unit delay cell in the delay chain comprises a pair of cascaded inverters as shown in Fig. 3.8. To reduce mismatch, both the fast and slow delay chains employ identical unit delay cells. In this design, the unit delay is tunable from 19 ps to 43 ps with seven digitally controlled bits to provide digital calibration compatibility and meet tuning requirements against PVT variations. The seven delay tuning bits are constructed with six pairs of NMOS and PMOS transistors sized with binary weights. The 1<sup>st</sup> and 2<sup>nd</sup> least-significant bits (LSBs) of the tuning bits share the same transistor pair. A pair of keep-life NMOS and PMOS transistors is connected in parallel with the delay tuning transistors. The median value of the delay tuning range can be varied by adjusting the size ratio between the keep-life transistors and the tuning transistors. Overall, each tunable delay cell is a 7-bit DTC with quantization errors due to its digitized tuning steps. Moreover, the quantization granularity is not evenly distributed due to the intrinsic nonlinearity of the MOSFETs. Indeed, the transfer curve of the delay cell approximately follows an exponential curve, as shown in Fig. 3.9. For instance, to achieve a 32 ps time delay, the closest reachable delay in the DTC is 31.9 ps as shown in Fig. 3.9. This 0.1 ps time difference introduces a 0.3% delay error that leads to an INL of more than 1.5 LSB according to simulations.

To deal with this issue, we propose to interpolate the precise delay amount by using $\Delta\Sigma$ modulation. The delay interpolation with $\Delta\Sigma$ noise shaping is illustrated in Fig. 3.10, where the unit delay cells can be digitally tuned to four adjacent reachable discrete delays. To obtain the desired interpolated delay of 32 ps, a 2<sup>nd</sup> order $\Delta\Sigma$ output sequence is used to sequentially select these four delays, as in the timing diagram shown at the bottom of Fig. 3.10, where the time-averaged value of the delay is 32 ps. The static delays corresponding to the four digital delay control words (DCWs) vary among four reachable levels. Controlled by the 2<sup>nd</sup> order $\Delta\Sigma$ modulator, the spectrum of the delay sequence demonstrates a 40 dB/dec noise shaping effect. The loading on the pseudo supply and ground lines of the delay cell (see Fig. 3.8) naturally provides a low-pass filter with about 2 MHz bandwidth that helps remove the shaped high-frequency noise, leading to a smooth time-averaged interpolated delay value. Note that this delay tuning bandwidth is determined by the loading of the pseudo supply and ground lines and is not related to the TDC conversion bandwidth, since the signal bandwidth of the delay cells is related to the inverter speed only.
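The interpolation idea can be sketched with a generic second-order modulator. The code below uses a MASH 1-1 structure with 10-bit accumulators as a stand-in; this is an assumption for illustration and not necessarily the modulator architecture of Fig. 3.10. The four delay levels are hypothetical values with a uniform 0.3 ps step around the 31.9 ps level mentioned in the text, whereas the real DTC steps are not uniform.

```python
def mash11(k: int, bits: int, n_samples: int) -> list:
    """Generic MASH 1-1 modulator; outputs lie in {-1, 0, 1, 2}."""
    mask = (1 << bits) - 1
    acc1 = acc2 = 0
    prev_c2 = 0
    out = []
    for _ in range(n_samples):
        acc1 += k
        c1 = acc1 >> bits        # carry of the first accumulator (0 or 1)
        acc1 &= mask
        acc2 += acc1             # second stage integrates the first residue
        c2 = acc2 >> bits        # carry of the second accumulator (0 or 1)
        acc2 &= mask
        out.append(c1 + c2 - prev_c2)  # first stage + differentiated second stage
        prev_c2 = c2
    return out


def interpolated_delay_ps(levels_ps: dict, k: int, bits: int,
                          n_samples: int) -> float:
    """Time-average of the delays selected by the modulator output."""
    seq = mash11(k, bits, n_samples)
    return sum(levels_ps[y] for y in seq) / n_samples


# Hypothetical reachable delays, keyed by modulator output value.
LEVELS_PS = {-1: 31.6, 0: 31.9, 1: 32.2, 2: 32.5}
# Fraction of one step needed: (32.0 - 31.9) / 0.3 = 1/3, about 341 / 1024.
K = 341

if __name__ == "__main__":
    print(interpolated_delay_ps(LEVELS_PS, K, 10, 4096))
```

The average of the selected delays settles very close to 32 ps even though no single level equals 32 ps; the instantaneous deviation is pushed to high frequencies, where a low-pass response such as the one described above can remove it.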



Fig. 3.8 Seven-bit digitally controlled tunable unit delay cell circuit diagram

The architecture of the $2^{nd}$ order $\Delta\Sigma$ modulator is also shown in Fig. 3.10 [79], [80]. In our case, the integer value *C* is set to 101. The fractional value *K* is determined by the distance between two adjacent quantized delay steps as well as the accumulator size, which is 10 bits in this design. Additionally, the adjustable fractional value *K* is used to compensate for the mismatch between the seven digitally controlled switches and the nonlinearity of the DTC transfer curve in this design.



Fig. 3.9 Digitally controlled tunable unit delay cell delay tuning transfer function.

The delay chain contributes more than 80% of the total power in a Vernier TDC design. The unit delay cells use only parasitic capacitance to generate the delay and are optimized for noise, mismatch, and power consumption. In traditional Vernier TDC designs, with a short input time interval, the conversion completes after the signal has passed only a few delay cells. However, the signals still propagate along the delay lines until they reach the end of each line. In this design, transmission gates are used to switch off the signal propagation through the remaining delay cells and dump the signal to ground once the comparison is completed. This adaptive power control scheme reduces the TDC power consumption by about 50% in fractional-*N* mode, where the input time interval sweeps over the TDC detection range in one fractional cycle. As a result, the TDC average power can be estimated as 50% of the power consumed when the maximum number of delay cells is active. In integer-*N* mode, the TDC input time interval is around zero when the loop is locked, leading to a very short conversion time and a power saving of more than 90%.



Fig. 3.10 A  $2^{nd}$  order  $\Delta\Sigma$  modulator used for DTC delay interpolation with quantization error noise shaping.

### 3.4.2 2-D Comparator Array Folding Error Randomization

As discussed in Section 3.2, 2-D Vernier TDCs suffer from periodic nonlinearity due to the transition errors at the folding points between different comparator lines in a 2-D array (see Fig. 3.2). Even with delay calibration and the spiral comparator arrangement, the delay mismatch between the delay lines still cannot be eliminated completely. As a result, the folding errors are inevitable and periodic INL peaks at the folding locations are still present, which results in a high fractional spur level at the DPLL output.



Fig. 3.11 Four sets of delay settings used for folding error randomization and the resultant INLs. Also shown are four sets of spiral comparator array configurations that meet the delay requirement and their corresponding folding point locations.

Comparator folding locations are fixed in hardware once the delay chains and comparator parameters are chosen. To reduce the nonlinearity, we propose to randomize the folding locations using multiple comparator configurations. If there are multiple sets of comparator line folding locations that satisfy Equation (3.2), we can choose different folding points in each comparison cycle, leading to a reconfigurable comparator array architecture that randomizes the mismatch errors. Fig. 3.11 illustrates four valid configurations of a spiral 2-D comparator array, in which "Config. 1" is the first arrangement with delays $\tau_F=25\tau$, $\tau_S=26\tau$. The enlarged squares labeled with "64$\tau$" and "65$\tau$" indicate one of the comparator line folding locations in Config. 1, where the maximum periodic error occurs. In "Config. 2, 3, and 4" with different delay settings, the "64$\tau$" nodes are moved to different locations and the corresponding folding points in the simulated INL curves are shifted to TDC output codes 67, 69, and 72, respectively, while the time resolution setting of '1$\tau$' is kept the same in all configurations. The randomization block diagram is shown in Fig. 3.12. With the tunable delay cell and reconfigurable comparator array, only one set of hardware is required. A $\Delta\Sigma$ output sequence generated by a 2<sup>nd</sup> order $\Delta\Sigma$ modulator preloads one of the four configuration settings to the TDC at each reference cycle and selects the corresponding output by controlling a 4-to-1 multiplexer. Fig. 3.13 presents the INL results of the four standalone TDC configurations as well as the result randomized with the $\Delta\Sigma$ modulation.



Fig. 3.12 2-D folding point randomization with tunable delay and reconfigurable comparator array controlled by  $2^{nd}$  order  $\Delta\Sigma$  modulators.



Fig. 3.13 2-D TDC output periodic errors for individual configurations and modulated results.



Fig. 3.14 Delay variations and  $2^{nd}$  order  $\Delta\Sigma$  randomization points.

Delay variation increases with the length of the delay chain. A signal that passes through a longer delay chain experiences a larger delay variation, as illustrated in Fig. 3.14. Moreover, due to layout geometry, the comparison points close to the end of the transfer curve, which correspond to large input time intervals, face larger delay deviations from layout mismatches and parasitics. When combined with the spiral comparator array architecture, the randomization level automatically adapts to the time interval of the input signal. As shown in Fig. 3.14, comparison point P4, located in the first comparator line, has the same location in all four configurations. In other words, it experiences no  $\Delta\Sigma$  randomization. The other three comparison points, P34, P94, and P124, each have four different locations in the four configurations. The second of the four locations is selected as the default "0" point of the 2<sup>nd</sup> order  $\Delta\Sigma$  output sequence, and the other three are assigned to "-1", "+1", and "+2", respectively. The distance between randomization points increases as the comparison point moves away from its nominal location.



Fig. 3.15 Simulated TDC output spectrum without, with the 1<sup>st</sup> order, and the 2<sup>nd</sup> order  $\Delta\Sigma$  modulators for folding error randomization.

The four configurations have the same resolution and therefore the same amount of quantization error. Their nonlinearity characteristics differ only in the folding locations, namely, where the error peaks. By selecting a different configuration in each cycle with a  $\Delta\Sigma$  modulator, the folding locations are randomized while the quantization noise remains the same. This randomization technique does not have a noise shaping effect, nor does it limit the conversion bandwidth. In the process, the spur caused by TDC nonlinearity is randomized, while the resolution is untouched. Fig. 3.15 gives the simulated TDC output spectrum for the cases without  $\Delta\Sigma$  modulation, with the 1<sup>st</sup> order and with the 2<sup>nd</sup> order  $\Delta\Sigma$  modulations. A 20-dB spurious-free dynamic range (SFDR) improvement is achieved in simulation with the 2<sup>nd</sup> order  $\Delta\Sigma$  modulation.
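The averaging effect of the randomization can be illustrated with a toy model. The sketch below is not the simulation behind Fig. 3.15. It assumes that each configuration contributes a single INL peak of equal height at its folding code (codes 67, 69, and 72 come from the text; code 64 for Config. 1 is an assumption based on the "64 $\tau$ " node), that the mapping from  $\Delta\Sigma$  output level to configuration is as listed in `FOLD_CODE`, and that `example_sequence` is a short hypothetical selection sequence rather than a real modulator output.

```python
from collections import Counter

# Hypothetical mapping: delta-sigma output level -> folding code of the selected configuration.
FOLD_CODE = {-1: 64, 0: 67, 1: 69, 2: 72}

def averaged_inl_peaks(sequence, peak_lsb=1.0):
    """Time-averaged INL peak height at each folding code.

    Each cycle uses exactly one configuration, so the long-term average of the
    peak at a given code is the single-configuration peak scaled by the fraction
    of cycles in which that configuration was selected.
    """
    counts = Counter(sequence)
    n = len(sequence)
    return {FOLD_CODE[level]: peak_lsb * c / n for level, c in counts.items()}

# Hypothetical eight-cycle selection sequence using all four levels.
example_sequence = [0, 1, 0, -1, 1, 2, 0, 1]
peaks = averaged_inl_peaks(example_sequence)
```

In this toy case the largest averaged peak is 0.375 of the single-configuration peak, because no configuration is used in more than three of the eight cycles. The quantization step is the same in every configuration, so it is unaffected by the selection.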



Fig. 3.16 Automatic 2-D Vernier TDC closed-loop and open-loop delay calibrations.

It should be pointed out that the reconfigurable structure comprises only one 2-D spiral comparator array in hardware, although four configurations are needed. Therefore, power consumption and area penalty are minimal for the proposed linearization technique. The four comparison configurations are always available in the 2-D spiral comparator array and one of the four valid configurations is selected based on the output sequence of a  $\Delta\Sigma$  modulator at the beginning of each comparison cycle. The comparator array outputs are further processed by thermometer to binary encoders to produce the final TDC output.

## 3.4.3 TDC Delay Calibration

Prior to its normal operation, the TDC needs to go through a delay calibration. Calibration is one of the commonly used TDC linearization techniques. A closed-loop automatic digital calibration technique based on the least-mean-square (LMS) algorithm is developed in [77], [81]. This work extends the technique with an additional open-loop calibration capability for stand-alone TDC applications. The block diagram of the TDC calibration circuit is shown in Fig. 3.16. The calibration is performed with a 40 MHz reference clock to ensure sufficient time for digital computation. The loop's output frequency is set to a fractional multiple of the reference with a minimal fractional part, such as 60+1/1024, as shown in Fig. 3.16. With a small fractional number, the quantization error generated by the fractional-*N* accumulator forms a staircase ramp waveform with a fine step size that can be used to sweep the TDC input time interval over one DCO cycle. The corresponding TDC output is then subtracted from the calculated ideal ramp signal, creating an error signal that corresponds to the TDC nonlinearity. This error is used to automatically adjust the TDC delays, with the optimization goal of minimizing it. Two LMS loops are designed to collect the differential and common error signals used for the fast and slow delay calibrations. Similarly, the open-loop calibration uses an external signal to provide the same fractional frequency as that used in the closed-loop calibration. However, although the frequency is pulled close to the desired value, the phase error can still be unknown in open-loop operation. A large phase error could saturate the TDC output and cause the calibration algorithm to fail. Thus, an out-range-flag is generated by the TDC to indicate whether the input phase error is outside the TDC detection range. This flag is used to validate the TDC input used for automatic calibration. A pair of optimized DCWs for the slow and fast delay lines is obtained during the calibration process and set as pair '0'. The  $2^{nd}$  order  $\Delta\Sigma$  modulator then selects among four adjacent DCWs, pair '0' together with pairs '-1', '1', and '2', for precise delay interpolation.
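The calibration idea can be sketched with a single-parameter model. The code below is a simplified illustration and not the two-loop (differential/common) implementation of [77], [81]: it assumes the TDC has one tunable step `tau`, ignores quantization and mismatch, and uses hypothetical names and a hypothetical step size `mu`. A fractional accumulator with fraction 1/1024 produces the ideal ramp, and an LMS-style update moves `tau` toward the target step until the measured ramp matches the ideal one.

```python
def lms_calibrate_step(tau_init_ps, tau_target_ps=1.25, frac_bits=10,
                       n_cycles=4096, mu=0.05):
    """Tune a single TDC step size so that the measured ramp matches the ideal ramp."""
    modulus = 1 << frac_bits
    acc = 0
    tau = tau_init_ps
    for _ in range(n_cycles):
        acc = (acc + 1) % modulus               # fractional-N accumulator, fraction 1/1024
        ramp = acc / modulus                    # ideal staircase ramp, normalized to one DCO cycle
        measured = ramp * tau_target_ps / tau   # normalized TDC reading when the step is tau
        error = measured - ramp                 # error signal: measured ramp minus ideal ramp
        tau += mu * error * ramp                # LMS-style update: step size * error * input
    return tau

tau_from_above = lms_calibrate_step(1.5)   # start with a step that is too large
tau_from_below = lms_calibrate_step(1.0)   # start with a step that is too small
```

When `tau` is larger than the target, the measured ramp is too shallow, the error is negative, and the update reduces `tau`; the opposite holds when `tau` is too small. In both cases the step converges to the 1.25 ps target within a few ramp periods under these assumptions.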



Fig. 3.17 Block diagram of the proposed reconfigurable 2-D spiral Vernier TDC with  $2^{nd}$  order  $\Delta\Sigma$  linearization.

# 3.5 An Ultra Linear Spiral 2-D Vernier TDC Design

### **3.5.1 TDC Architecture and Circuit Implementation**

Fig. 3.17 presents the block diagram of the proposed TDC system, including the proposed spiral comparator arrangement and two  $\Delta\Sigma$  modulation based TDC linearization techniques. In summary, the 2-D spiral comparator array improves both the linearity and the detection range of the TDC. A  $\Delta\Sigma$  modulator is employed in delay interpolation to minimize the quantization errors introduced by the digitally tuned delay cells. The folding point errors commonly seen in 2-D comparator arrays are randomized by using a reconfigurable comparator array controlled by the output of another  $\Delta\Sigma$  modulator. The 2-D spiral Vernier TDC produces seven output bits. A steering module detects the lead/lag (polarity) information and outputs the most-significant bit (MSB), forming the 8<sup>th</sup> bit of the TDC. To ensure there is no dead-zone around the zero-crossing point, the same comparator is used in the steering module; the standard deviation of its decision is around 0.2 ps based on Monte Carlo simulations. This decision error is smaller than the quantization error, i.e., half of an LSB (0.625 ps in our design).



Fig. 3.18 Die photograph of the TDC prototype chip.

### **3.5.2** Experiment Results of the Proposed TDC

The proposed TDC was fabricated in a 45nm CMOS SOI technology. As shown in the die photo of Fig. 3.18, the 2-D Vernier TDC core occupies an area of 0.03 mm<sup>2</sup>. Other auxiliary circuits occupy another 0.03 mm<sup>2</sup> space. The measured full-range transfer curves of the TDC with and without the 2<sup>nd</sup> order  $\Delta\Sigma$  modulator are given in Fig. 3.19. The TDC covers a conversion range from -160 ps to 160 ps, namely 8 bits output with a 1.25 ps resolution. Sinusoidal modulated delay signals are generated with an arbitrary waveform generator and are fed into the TDC to perform a spectrum measurement. Fig. 3.20 and Fig. 3.21 give the measured TDC output spectrum results with inputs equal to 1.01 MHz and 32.7 MHz under three different configurations: i) without  $\Delta\Sigma$ modulation, ii) with the 1<sup>st</sup> order  $\Delta\Sigma$  modulation, and iii) with the 2<sup>nd</sup> order  $\Delta\Sigma$  modulation. A 10 dB SFDR improvement is achieved with the 2<sup>nd</sup> order  $\Delta\Sigma$  modulation.



Fig. 3.19 Measured TDC full-range transfer curves with and without the  $2^{nd}$  order  $\Delta\Sigma$  modulation.



Fig. 3.20 Measured TDC output power spectrum density with 1.01MHz input signals under three different measurement configurations: i) without  $\Delta\Sigma$  modulation, ii) with the 1<sup>st</sup> order  $\Delta\Sigma$  modulation, and iii) with the 2<sup>nd</sup>  $\Delta\Sigma$  modulation.



Fig. 3.21 Measured TDC output power spectrum density with 32.7MHz input signals under three different measurement configurations: i) without  $\Delta\Sigma$  modulation, ii) with the 1<sup>st</sup> order  $\Delta\Sigma$  modulation, and iii) with the 2<sup>nd</sup>  $\Delta\Sigma$  modulation.



Fig. 3.22 Single-shot precision measurement results with 10000 tests measured for four different input time intervals located at code 5, 64, 99 and 123, respectively.



Fig. 3.23 Measured histogram plots of TDC output codes with a ramp input signal sweeping the entire detectable range under different settings.

Measured histogram plots of the TDC output codes with a ramp input signal sweeping the entire detectable range of the TDC under different settings are presented in Fig. 3.23, illustrating the efficacy of each of the proposed TDC linearization techniques. Without calibration, the TDC transfer curve is extremely nonlinear, with missing codes or code gaps due to the 2-D folding errors, layout mismatches, and unexpected parasitic effects. The 2-D folding error is greatly reduced by automatic delay calibration and delay interpolation using  $\Delta\Sigma$  modulators. The folding error residues, together with layout mismatches and unexpected parasitic effects, are further suppressed by the 2<sup>nd</sup> order  $\Delta\Sigma$  randomization of the folding locations. The last histogram plot, with evenly distributed code hits, indicates a highly linear transfer curve. Single-shot precision measurements are shown in Fig. 3.22, where 10000 tests are measured for each of four input time intervals located at codes 5, 64, 99, and 123.

For comparison, linearity performances are measured with the 2<sup>nd</sup> order, 1<sup>st</sup> order and no  $\Delta\Sigma$  modulations, as shown in Fig. 3.24. Periodic errors as large as 1.27 LSB are observed in the measured INL without  $\Delta\Sigma$  modulation, showing dominant nonlinearity associated with the folding point errors of 2-D comparator array. With the 2<sup>nd</sup> order  $\Delta\Sigma$  randomization, the measured INL and DNL have much smaller errors of 0.34 LSB and 0.25 LSB, respectively. Five different TDC chips are measured. The worst measured INL results over a temperature range from 25 °C to 125 °C and a voltage range from 0.8 V to 1.2 V are presented in Fig. 3.25, demonstrating the robustness of the TDC linearity performance against PVT variations with the proposed linearization techniques.



Fig. 3.24 Measured TDC DNL and INL without  $\Delta\Sigma$  modulation, with the 1<sup>st</sup> order  $\Delta\Sigma$  modulation and with the 2<sup>nd</sup> order  $\Delta\Sigma$  modulation.



Fig. 3.25 Measured maximum INL results under voltage/temperature variations for five different TDC chip samples.

The proposed 2-D spiral Vernier TDC is designed to cover 8 bits with a resolution of 1.25 ps. Taking the nonlinearity into consideration, the effective number of bits (ENoB) is 7.58 bits and the effective resolution is 1.67 ps. In the measurement, the TDC consumes 0.3 mW at a conversion rate of 80 MS/s and a 1 V power supply when the TDC input is fed with a staircase sweeping ramp signal similar to fractional-N mode operation. It consumes 0.7 mW if the input phase difference exceeds the TDC's full range in every cycle, and less than 0.1 mW when dealing with small input time intervals, for instance, in integer-N mode operation in a DPLL. A performance summary and comparison with prior-art TDC designs are listed in Table 3-1. The FoM is based on a well-accepted data converter figure of merit (FoM) evaluation criterion that takes the power consumption, detectable range, and conversion rate into consideration [82]. For TDC designs, effective resolution is an important factor that directly impacts the DPLL's performance. Considering both effective resolution and FoM, we summarize the performance of recently reported state-of-the-art TDC designs [83]-[89] and present the comparison in Fig. 3.26, which shows that the proposed design is very competitive among the state-of-the-art, with excellent linearity. The presented TDC provides precise time measurement and digitization of timing information with a resolution down to 1.25 ps, which supports a wide variety of applications, including DPLLs, direct digital modulators, time-based communication transceivers [90], [91], and mm-wave imaging radars [91], [92].



Fig. 3.26 Performance summary and comparison with prior art TDC designs.

|                               | L. Vercesi [75] | J. Yu [69]<br>ISSC'10 | W. Yu [72] | S. Liu [93]          | S. J. Kim [94] | A. Sai [41] | This work             |
|-------------------------------|-----------------|-----------------------|------------|----------------------|----------------|-------------|-----------------------|
| Topology                      | 2-D Vernier     | Vernier ring          | 1-3 MASH   | Parallel<br>sampling | Stochastic     | SS-ADC      | 2-D Spiral<br>Vernier |
| Process                       | 65nm            | 130nm                 | 65nm       | 65nm                 | 14nm           | 65nm        | 45nm                  |
| NoB                           | 7               | 12                    | 11         | 14                   | 10             | 6.1         | 8                     |
| ENoB <sup>(1)</sup>           | 4.90            |                       | 9.42       | 13.40                | 8.28           | 5.76        | 7.58                  |
| Resolution [ps]               | 4.8             | 8                     | 2.64       | 6                    | 1.17           | 6           | 1.25                  |
| ER <sup>(2)</sup> [ps]        | 20.58           |                       | 7.89       | 8.7                  | 3.85           | 7.60        | 1.67                  |
| Speed [MS/s]                  | 50              | 15                    | 150        | 1                    | 100            | 40          | 80                    |
| DNL<br>[LSB]/[ps]             | 0.9/4.32        |                       |            | 0.1/0.6              | 0.8/0.94       |             | 0.25/0.31             |
| INL<br>[LSB]/[ps]             | 3.3/15.8        |                       | 2.0        | 0.5/3                | 2.3/2.7        | 0.27/1.6    | 0.34/0.4              |
| Power [mW]                    | 1.7             | 7.5                   | 3.52       | 0.28                 | 0.78           | 0.36        | 0.07-0.69             |
| Area [mm<sup>2</sup>]         | 0.02            | 0.26                  | 0.03       | 0.12                 | 0.036          | 0.022       | 0.04                  |
| *FoM* <sup>(3)</sup>          | 0.266           |                       | 0.012      | 0.017                | 0.008          | 0.131       | 0.016                 |
| FoM<sub>TDC</sub><sup>(4)</sup> | -112.7        |                       | -130.2     | -128.3               | -135.1         | -120.0      | -135.7                |

(1) $ENoB = NoB - \log_2(INL + 1)$, with INL in LSB.
(2) Effective Resolution: $ER = \text{Resolution} \times 2^{(NoB - ENoB)}$.
(3) $FoM = FoM_{ADC} = Power / (2^{NoB} \times F_s)$ [pJ/conv-step].
(4) $FoM_{TDC} = 10\log_{10}(ER / 1\,\text{s}) + 10\log_{10}(FoM_{ADC} / 1\,\text{pJ/conv-step})$.

Table 3-1 Performance comparison with recently reported TDCs
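The derived entries of Table 3-1 can be cross-checked from the footnote definitions. The sketch below recomputes the "This work" column under two assumptions that are not stated in the table: the power is taken as the 0.3 mW measured with the staircase ramp input, and the effective-resolution term of FoM<sub>TDC</sub> is weighted by $10\log_{10}$, which is the weighting that reproduces the tabulated values to within rounding. The function and variable names are illustrative.

```python
import math

def tdc_metrics(nob, inl_lsb, resolution_ps, power_mw, fs_msps):
    """Recompute ENoB, effective resolution, FoM and FoM_TDC from Table 3-1 inputs."""
    enob = nob - math.log2(inl_lsb + 1.0)                  # footnote (1)
    er_ps = resolution_ps * 2.0 ** (nob - enob)            # footnote (2)
    energy_j = (power_mw * 1e-3) / (2 ** nob * fs_msps * 1e6)
    fom_pj = energy_j * 1e12                               # footnote (3), pJ per conversion step
    # Assumed weighting: 10*log10 on both terms (matches the tabulated values).
    fom_tdc = 10.0 * math.log10(er_ps * 1e-12) + 10.0 * math.log10(fom_pj)
    return enob, er_ps, fom_pj, fom_tdc

# "This work": 8 bits, INL 0.34 LSB, 1.25 ps resolution, 80 MS/s, assumed 0.3 mW.
enob, er_ps, fom_pj, fom_tdc = tdc_metrics(8, 0.34, 1.25, 0.3, 80)
```

With these inputs the sketch gives an ENoB of about 7.58 bits and an effective resolution of about 1.67 ps, in line with the table. The FoM values land close to, but not exactly on, the tabulated 0.016 pJ/conv-step and -135.7, since the exact power used for the table entry is not given.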

# 3.6 A Wide Range Highly Linear Ring TDC Design

Resolution and detectable range are two critical and conflicting parameters in TDC design. It is challenging to achieve fine resolution and large range simultaneously. Vernier based TDCs achieve good resolution, yet with limited range. Ring based TDCs have a wide detectable range, but their resolution and linearity are limited. The TDC presented in the previous section achieves 1.25 ps resolution and 0.4 ps nonlinearity by using a spiral 2-D comparator array and SDM linearization techniques. However, its 8-bit range of 320 ps cannot cover a full DCO period when the DPLL output frequency is below about 3 GHz.

### **3.6.1** TDC Architecture and Circuit Implementation

In order to expand the detectable range without penalizing resolution and linearity, we propose a tapped 2-D Vernier ring TDC, shown in Fig. 3.27, with  $2^{nd}$  order  $\Sigma\Delta$  linearization techniques and a spiral 2-D arbiter array to achieve large range, fine resolution and improved linearity simultaneously. The TDC contains two delay lines and a 2-D arbiter array, which together form a 2-D Vernier TDC. The ring TDC is built from part of the slow delay chain. The signals to be compared are fed into a steering block, which directs the leading and lagging signals to the slow delay line and the fast delay line, respectively. There are two switches in the slow delay line. The first switch is connected to the input signal node at the beginning of each conversion and is switched to the loop after the signal appears. The second switch is controlled by a "ring/2D flag". The time interval is first measured by the ring TDC. Once the interval residue falls within the range of the 2-D Vernier TDC, the flag is triggered to activate the 2-D Vernier TDC.



Fig. 3.27 (a) Proposed 2-D spiral Vernier ring TDC with  $2^{nd}$  order  $\Sigma\Delta$  linearization, and (b) ring/2D structure collaboration illustration.



Fig. 3.28 Signal propagation difference between (a) straight delay line and (b) end to end connected delay ring.

Once a delay line is closed into an end-to-end delay ring, a pulse generator is required to convert the input edge signal into a pulse signal. The reason is shown in Fig. 3.28. An edge signal is able to propagate in a delay line. However, when propagating in a delay ring, the state of each cell needs to be reset after the signal passes. Otherwise, there will be no more transitions when the signal comes back after the first lap. The falling edge of the pulse signal resets the delay cell.

Table 3-2 Monte Carlo comparison of unit delay and number of delay cells

| Option                     | 1     | 2     | 3     | 4     |
|----------------------------|-------|-------|-------|-------|
| $\tau_{s} / \tau_{F}$ [ps] | 15/16 | 25/26 | 35/36 | 45/46 |
| Delay cell number <b>n</b> | 16    | 26    | 36    | 46    |
| Mismatch $\sigma$ [fs]     | 37    | 12    | 18    | 31    |

To ensure a continuous transfer curve between the ring TDC and the 2-D Vernier TDC, only the first six delay cells in the slow delay chain are used to form the delay ring. In a 2-D Vernier TDC, the unit delays and the number of delay cells need to fulfill the previously mentioned Equation (3.2). The unit delays  $\tau_s$  and  $\tau_F$  are affected by mismatches, which can be analyzed with Monte Carlo simulations. Table 3-2 shows the simulated mismatch for different unit delays and numbers of delay cells. If the delay is too small, the mismatch is relatively large. In order to minimize mismatches, a larger delay is desirable. However, as revealed in Equation (3.2), the number of delay cells increases when the unit delay is enlarged, which also leads to larger mismatch. According to the simulations, we choose the unit delays  $\tau_s / \tau_F$  as 25 ps/26 ps, which give the smallest simulated mismatch ( $\sigma$  = 12 fs) in Table 3-2.



Fig. 3.29 (a) Signal propagating issue in a delay ring when rising delay is not equal to falling delay. (b) Pulse generation timing diagram.

Theoretically, the detectable range of the proposed ring TDC is limited only by the size of the output counter. However, the actual detectable range is limited by the mismatch between the transition times of the rising and the falling edges. In addition, unlike a delay-based ring oscillator, a ring TDC has no feedback to compensate the mismatch over time. As a result, the duty cycle of a pulse propagating in a delay ring will gradually either increase or decrease, which eventually causes the pulse to vanish after passing a certain number of delay cells, limiting the achievable detectable range of the ring TDC. As illustrated in Fig. 3.29 (a), the unmatched rising time  $\tau_r$  and falling time  $\tau_f$  lead to a progressively increased pulse width, which eventually overlaps with the feedback pulse.
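A back-of-the-envelope sketch of this limit follows. It assumes a constant per-cell difference between the falling-edge and rising-edge delays and uses hypothetical numbers (a 150 ps lap for six 25 ps cells, a 75 ps initial pulse, and a 0.1 ps edge mismatch), none of which are measured values from this design. Times are integers in femtoseconds to avoid rounding issues.

```python
def cells_until_pulse_fails(width_fs, rise_delay_fs, fall_delay_fs, lap_time_fs):
    """Estimate how many delay cells a pulse passes before it vanishes or overlaps.

    The leading (rising) edge is delayed by rise_delay_fs per cell and the
    trailing (falling) edge by fall_delay_fs, so the pulse width changes by
    their difference in every cell.
    """
    drift_fs = fall_delay_fs - rise_delay_fs
    if drift_fs == 0:
        return None                                    # matched edges: no limit from this effect
    if drift_fs > 0:
        return (lap_time_fs - width_fs) // drift_fs    # pulse widens until it fills one lap
    return width_fs // (-drift_fs)                     # pulse narrows until it disappears

# Hypothetical example: the pulse widens by 0.1 ps per cell.
widening_limit = cells_until_pulse_fails(75_000, 25_000, 25_100, 150_000)
# Hypothetical example: the pulse narrows by 0.1 ps per cell.
narrowing_limit = cells_until_pulse_fails(75_000, 25_100, 25_000, 150_000)
```

Under these assumed numbers both cases fail after 750 cells, which shows why even a sub-picosecond edge mismatch bounds the usable range unless the duty cycle is corrected.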

In this design, the rising and falling delays can be adjusted in the duty cycle tuning stage of the unit delay cell, shown in Fig. 3.30, in order to achieve the targeted large detectable range of 14 bits. The unit delay cells in both the slow and fast delay chains comprise three parts: a delay tuning stage, a duty cycle correction stage and a signal truncation stage. The rising and falling edges are adjustable to regulate the pulse duty cycle and to compensate for process, voltage, and temperature (PVT) variations. The truncation transmission gate switch stops the signal propagation to save power once the TDC conversion is completed. The cell delay is digitally controlled by 6 tuning bits and is tunable from 20 ps to 42 ps. Hence, each delay cell is a 6-bit digital-to-time converter (DTC) with quantization errors. For instance, to get a 25 ps delay, the closest reachable delay in the DTC is 24.8 ps. The 0.2 ps temporal error accumulates as the signal propagates in the delay ring and results in a poor INL of more than 5 least significant bits (LSBs). We therefore propose to interpolate the precise delay by toggling among a few adjacent delay control words following a sequence generated by a 2<sup>nd</sup> order SDM. The time-averaged value of those discrete delay steps gives the desired delay, and the quantization errors generated in the process are noise-shaped to the high frequency band by the SDM. As shown in Fig. 3.31, the measured INL is suppressed from over 5 LSB to less than 2 LSB with the 2<sup>nd</sup> order SDM running at an oversampling ratio of 8.
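The delay interpolation can be sketched numerically. The code below is an illustration under stated assumptions and not the implemented circuit: it assumes a uniform DTC step of (42 - 20)/64 ps, which reproduces the 24.8 ps nearest setting quoted above, and it uses a generic 2<sup>nd</sup> order error-feedback modulator with an integer quantizer as a stand-in for the SDM. All names are hypothetical.

```python
import math

BASE_PS = 20.0                    # minimum cell delay from the text
STEP_PS = (42.0 - 20.0) / 64      # assumed uniform DTC step (about 0.34 ps)

def sdm2(x, n):
    """2nd order error-feedback modulator, noise transfer function (1 - z^-1)^2."""
    e1 = e2 = 0.0
    out = []
    for _ in range(n):
        v = x + 2.0 * e1 - e2         # input plus shaped past quantization errors
        y = math.floor(v + 0.5)       # integer quantizer (round to nearest)
        e2, e1 = e1, v - y            # store the new quantization error
        out.append(y)
    return out

def interpolate_delay(target_ps, n):
    """Toggle among adjacent delay codes so that the average delay equals target_ps."""
    ideal_code = (target_ps - BASE_PS) / STEP_PS
    base_code = math.floor(ideal_code)
    frac = ideal_code - base_code                     # fractional part fed to the modulator
    codes = [base_code + y for y in sdm2(frac, n)]
    mean_delay = sum(BASE_PS + c * STEP_PS for c in codes) / n
    return codes, mean_delay

codes, mean_delay = interpolate_delay(25.0, 4096)
static_delay = BASE_PS + math.floor((25.0 - BASE_PS) / STEP_PS) * STEP_PS   # 24.8125 ps
```

For a fractional input between 0 and 1, this modulator only produces the offsets -1, 0, +1, and +2, so the cell toggles among four adjacent codes (13 to 16 here). The time-averaged delay converges to 25 ps, whereas the static setting stays at 24.8125 ps.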



Fig. 3.30 Unit delay cell circuit diagram.



Fig. 3.31 Measured (a) INLs/ (b) DNLs under different delay over-sample ratio.



Fig. 3.32 Measured (a) INLs/ (b) DNLs with different SDM settings.

The prior-art 2D arbiter array suffers from periodic nonlinearity associated with the transitions between arbiter lines (folding points) of the 2D structure. To further reduce the nonlinearity, the spiral 2D arbiter array and the SDM folding location randomization technique are used [78], [95]. Fig. 3.32 compares the results obtained with the 1<sup>st</sup> order SDM and the 2<sup>nd</sup> order SDM against the case without SDM linearization. The measured INL is suppressed from 1.8 LSB when no SDM is applied to 0.79 LSB with the 2<sup>nd</sup> order SDM.



Fig. 3.33 Die photograph of the TDC prototype chip.

### **3.6.2** Experiment Results of the Proposed TDC

The proposed TDC was fabricated in a 130nm CMOS technology. As shown in the die photo of Fig. 3.33, the TDC core occupies an area of 0.06 mm<sup>2</sup> and other auxiliary parts (I/O buffers and digital unit) occupy another 0.06 mm<sup>2</sup>. The TDC covers a conversion range over 1.6 ns or 14 bits with a 1 ps resolution. The measured full-range transfer curve and its corresponding INL are given in Fig. 3.34. The TDC consumes 2.4 mW at a conversion rate of 10 MS/s and a 1.2 V power supply. With a 2<sup>nd</sup> order delay SDM running at 80 MHz for delay interpolation and a 2<sup>nd</sup> order linearization SDM running at 10 MHz for folding error linearization, the proposed architecture achieves a very competitive linearity, with DNL/INL of 0.41/0.79 ps, when compared with state-of-the-art TDC designs.



Fig. 3.34 Measured TDC full-range transfer curves and INL.

# 3.7 Conclusion

In this chapter, two different high performance TDC designs are presented. The first one, achieving 1.25 ps temporal resolution with an 8-bit range, is implemented for high linearity applications. A spiral comparator array is proposed to enlarge the TDC detection range and improve the linearity. Two  $2^{nd}$  order  $\Delta\Sigma$  modulators are utilized to lower the quantization errors of the DTC based unit delay cells and to randomize the periodic folding errors of the 2-D comparator array. With an 80 MHz reference clock, the measured maximum DNL and INL of the proposed TDC are 0.25 LSB and 0.34 LSB, respectively. With the adaptive power control that switches off unused delay cells, the TDC power consumption is greatly reduced. Fabricated in a 45 nm CMOS technology, the TDC prototype consumes 70-690  $\mu$ W under a 1 V power supply at a conversion rate of 80 MS/s. It achieves 1.67 ps effective resolution and an FoM<sub>ADC</sub> of 0.016 pJ/conv-step, advancing the state-of-the-art in high-performance TDC designs.

The goal of the second TDC design is to accomplish a large detectable range. A total of 14 bits (1.6ns) is reached with a fine resolution of 1ps and excellent differential nonlinearity (DNL)/integral nonlinearity (INL) of 0.41ps/0.79ps owing to the following novel techniques: (i) a combined ring and 2D Vernier TDC is used to achieve a 14-bit detectable range and 1ps resolution; (ii) a  $2^{nd}$  order  $\Delta\Sigma$  modulator is adopted to mitigate the quantization errors introduced by the delay cell nonlinearity; (iii) the 2D arbiter comparison path is arranged in a spiral form in order to improve its INL; (iv) an additional  $2^{nd}$  order SDM is used to randomize the arbiter line folding errors associated with the 2-D arbiter array topology. Traditionally, technologies with small feature size are preferred for mixed-signal designs such as TDCs to achieve better performance and power efficiency. However, after five decades, it seems that Moore's law has come to a crossroads. As a result, there is an increased benefit in focusing on circuit innovations rather than simply pursuing technologies with smaller feature size to further improve the circuit performance figure of merit (FoM). In this work, by using a large feature size technology (130nm CMOS), we presented a TDC achieving improved performance compared with state-of-the-art TDC designs that use small feature size processes.

# Chapter 4 Time-to-Digital Conversion in Phase-Locked-Loop Based Frequency Synthesizers

## 4.1 Introduction

As semiconductor technology advances to finer feature sizes, digital circuits are becoming more efficient in both area and power. Integrating a conventional phase-locked loop (PLL) in such technologies makes it increasingly challenging and burdensome to maintain the analog components. On the other hand, a digital PLL uses devices similar to those used in digital circuits. Fully synthesizable digital PLLs have been proposed to take full advantage of advanced deep submicron technology while providing easy integration with digital systems [96]. Moreover, a digital PLL is highly flexible and programmable, which makes it capable of achieving functionalities that are very difficult to obtain with an analog PLL. As an example, various digital PLL architectures have been proposed to implement direct modulation for high-speed wireless polar transmitters [25], [97], which is a very challenging task for an analog PLL due to its nonlinear analog properties that are sensitive to process-voltage-temperature (PVT) variations.

When converting from the analog domain to the digital domain, PLL performance degradation is inevitable because of the quantization introduced by the TDC. Almost every critical performance indicator of an ADPLL is related to the TDC's specifications: the in-band phase noise is determined by the TDC resolution; the locking time and robustness are affected by the detectable range; the fractional spur level is limited by the linearity of the transfer function, etc. To date, with the aid of advanced deep sub-micron technology, multiple articles have shown that the gap between digital PLLs and analog PLLs is fading. The ADPLL is able to achieve an outstanding figure-of-merit (FoM) and fractional spur performance that are competitive with the best analog PLLs [98].

The main challenge for TDC design in digital PLLs is now probably the calibration. A TDC transfer function that is not characterized over PVT variations can lead to unacceptable performance. In this case the calibration has to be done continuously in the background. Various digital calibration techniques have to be applied to suppress fractional spurs in a digital PLL. The conventional fractional spur cancellation using a sigma-delta modulator (SDM) [80] requires a narrow loop bandwidth in order to suppress the noise-shaping component in the high frequency band. In addition, using a high order SDM to toggle the loop division ratio varies the feedback edge after the divider over multiple digitally controlled oscillator (DCO) cycles, which requires a TDC or DTC with a wider detectable range and thus leads to higher power consumption and more complicated hardware. These drawbacks motivate us to explore other spur cancellation methods, including the digi-phase technique. Regardless of the technique employed, the spurious level in a DPLL is highly dependent on the TDC linearity, necessitating accurate calibration. In this chapter, a wideband fractional-N DPLL with digital calibration for fractional spur suppression is presented for a low power Wi-Fi transceiver in the 802.11 a/b/g/n bands using a 55 nm CMOS technology. The nonlinearity of the two-dimensional (2D) Vernier TDC is automatically calibrated through the fractional frequency synthesis [81]. The implemented RFIC also includes an improved multi-modulus divider (MMD) that overcomes the division ratio skipping problem associated with prior-art designs.

# 4.2 A Low Spur Fractional-N All-Digital PLL Design with Automatic TDC Linearity Calibration

When the PLL generates a carrier frequency  $f_o$  that is an integer multiple of the reference frequency  $f_{ref}$ , i.e.,  $f_o = Nf_{ref}$ , the frequency divider generates one pulse after N DCO cycles. The divider output pulse is aligned with the reference pulse when the loop is in lock, thus the phase error measured by the TDC is zero. On the other hand, when the DPLL generates a fractional frequency, the divider toggles its division ratio between N and N+1 to achieve an equivalent fractional division ratio. Even though the average division ratio over time equals the desired fractional value, an instantaneous phase error exists between the divider output and the reference clock. This periodic error modulates the DCO control words, thus various fractional spurs arise along with the desired carrier tone.

To tackle this problem, the digi-phase technique was first introduced in [99] to suppress fractional spurs. Since the periodic phase error is deterministic for each fractional frequency, an exact replica can be subtracted from the TDC digital output to provide a constant control word at the DCO input. In our proposed design, the fractional accumulator output (quantization error) is used to generate an inverse staircase waveform that is subtracted from the output of the TDC. If the two paths maintain balanced gains, the injected staircase signal precisely cancels the original staircase at the output of the TDC, leaving only the DC component for frequency tuning. This achieves spur-free fractional operation under ideal conditions.
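The cancellation principle can be illustrated with a small behavioral sketch. It is not the implemented circuit: it assumes an ideal TDC with unlimited resolution, a first-order fractional accumulator, and a loop that is already locked. The function name `digiphase_residue` and its parameters are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def digiphase_residue(n_int, frac_word, modulus, cycles):
    """Return (phase_errors, residues) in units of one DCO period.

    Assumptions (illustrative only): ideal TDC, locked loop, and a
    first-order accumulator that selects division by N or N+1.
    """
    acc = 0                                   # fractional accumulator state
    div_edge = Fraction(0)                    # divider edge position
    ratio = Fraction(n_int) + Fraction(frac_word, modulus)  # N + k/M
    phase_errors, residues = [], []
    for n in range(1, cycles + 1):
        acc += frac_word
        carry = acc // modulus                # 1 on overflow: divide by N+1
        acc %= modulus
        div_edge += n_int + carry
        ref_edge = n * ratio                  # reference edge when locked
        err = div_edge - ref_edge             # what an ideal TDC would measure
        replica = Fraction(-acc, modulus)     # replica built from the accumulator
        phase_errors.append(err)
        residues.append(err - replica)        # signal left after cancellation
    return phase_errors, residues

errors, residues = digiphase_residue(n_int=30, frac_word=1, modulus=64, cycles=128)
print(errors[:4])                             # staircase: -1/64, -1/32, -3/64, -1/16
print(all(r == 0 for r in residues))          # exact cancellation in the ideal case
```

With a fractionality of 1/64 the staircase repeats every 64 reference cycles, and the residue is identically zero only because the model has no gain mismatch, quantization, or nonlinearity.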



Fig. 4.1 Simulated TDC output and the residue signal after the digi-phase canceller, showing that TDC resolution and TDC nonlinearity induced residue errors possess different periods.

Various techniques, including digi-phase [99], have been proposed to suppress fractional spurs. However, the cancellation at the TDC output is affected by non-ideal circuit characteristics that degrade the spur suppression performance. Due to limited TDC resolution and linearity, a small residue error may still exist after the cancellation. As shown in Fig. 4.1, assume that each TDC bit covers $2^{-tr}$ of one DCO cycle and each digital bit in the digi-phase cancellation signal covers $2^{-fr}$ of one DCO cycle. The TDC resolution induced residue then has a period of $2^{fr-tr}T_{ref}$, which corresponds to a fractional spur located at an offset frequency of $2^{-fr+tr}f_{ref}$. On the other hand, the divider output sweeps around the reference edge as the division ratio toggles between N and N+1, so the TDC output waveform repeats with a period of $2^{fr}T_{ref}$, which creates a fractional spur at an offset frequency of $2^{-fr}f_{ref}$. As a result, both TDC resolution and linearity have an impact on fractional spurs. Nevertheless, the fractional spur from the TDC nonlinearity is more critical since it is closer to the carrier tone in the spectrum. As an example, assuming a TDC resolution of 5 ps, a reference frequency of 80 MHz, a carrier frequency of 2.4 GHz and a fractionality $2^{-fr}$ of 1/256, the resolution and linearity induced fractional spurs are located at about 26 MHz and 0.31 MHz, respectively. The fractional spur generated by limited TDC resolution is therefore greatly attenuated by the loop filter, leaving spurs generated by TDC nonlinearity as the dominant source. Consequently, to implement a digital PLL with low fractional spurs, it is critical to have a highly linear TDC.
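The two spur offsets in this example can be reproduced with a few lines of arithmetic. The sketch below simply evaluates $2^{-fr+tr}f_{ref}$ and $2^{-fr}f_{ref}$, taking $2^{tr}$ as the number of TDC steps per DCO cycle; the function name `fractional_spur_offsets` is a hypothetical helper, not part of the design.

```python
def fractional_spur_offsets(f_ref, f_dco, t_res, fractionality):
    """Return (resolution_spur_hz, linearity_spur_hz).

    2**tr is taken as T_DCO / t_res (TDC steps per DCO cycle) and
    2**-fr as the fractionality.
    """
    t_dco = 1.0 / f_dco
    tdc_steps_per_cycle = t_dco / t_res                # 2**tr
    linearity_spur = fractionality * f_ref             # 2**-fr * f_ref
    resolution_spur = tdc_steps_per_cycle * linearity_spur  # 2**(tr-fr) * f_ref
    return resolution_spur, linearity_spur

res_spur, lin_spur = fractional_spur_offsets(
    f_ref=80e6, f_dco=2.4e9, t_res=5e-12, fractionality=1 / 256)
print(f"resolution spur: {res_spur / 1e6:.2f} MHz")   # about 26.04 MHz
print(f"linearity spur:  {lin_spur / 1e6:.4f} MHz")   # 0.3125 MHz
```

The linearity induced spur sits roughly 83 times closer to the carrier than the resolution induced spur, which is why it receives far less attenuation from the loop filter.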

The fractional spur due to TDC nonlinearity can be further analyzed as follows. Assuming that the residue error at the digi-phase canceller output can be expressed as $\varepsilon = A_1 \sin(2\pi f_m t) + A_2 \sin^2(2\pi f_m t) + A_3 \sin^3(2\pi f_m t) + \dots$, where $A_1$ is the magnitude of the error's fundamental tone and $f_m$ represents the fractional offset frequency, the power level of the closest fractional spur can be derived as:

$$P_{frac}(\mathrm{dBc}) = 20 \cdot \log_{10}\left(\frac{H(f_m) K_{DCO} \cdot A_1}{2f_m}\right) \tag{4.1}$$

where $K_{DCO}$ denotes the DCO gain and $H(f)$ represents the loop filter transfer function. As an example, assume a fractional frequency of 1.25 MHz, a loop bandwidth of 1 MHz such that the closest fractional spur experiences slight suppression from the loop filter, a $K_{DCO}$ of 10 kHz/bit and a TDC resolution of 5 ps/bit. The calculated and simulated spur levels are shown in Fig. 4.2 (a). The simulation result deviates slightly from the calculated value for small TDC residue, mainly because Equation (4.1) does not take into account the quantization effect of a digital PLL. Furthermore, using the measured TDC residue error shown in Fig. 4.2 (b), its fundamental waveform can be shown to have a peak-to-peak magnitude of 0.6 LSB, which corresponds to a peak magnitude $A_1$ of 0.3 LSB. Using the above analysis, a spur level of -54 dBc is expected at the closest fractional frequency, which is very close to our measured closest spur level of -56 dBc.



Fig. 4.2 (a) Fractional spur level due to residue error at the digi-phase canceller output. (b) Measured residue error and its fundamental waveform.

#### 4.2.1 ADPLL Architecture

The complete digital PLL architecture with digital calibration is shown in Fig. 4.3. The TDC adopts a three-step architecture to provide both fine and coarse measurements. The digi-phase cancellation signal is injected at the TDC output to cancel the instantaneous divider quantization errors. Ideally, the waveform after the cancellation block remains constant, with only a DC component. However, various non-ideal characteristics in the loop still cause a small residue phase error. In other words, the residue error after digi-phase subtraction is directly related to system imperfections including nonlinearity, mismatch and variation. This residue can therefore be used as the error signal for the digital calibrations adopted in this design. The gain applied on the digi-phase path is automatically adjusted by a TDC gain tracking module that correlates the error signal with the digi-phase gain; the optimal gain is reached when the error is minimized. Likewise, the TDC calibration uses the same error signal to adjust the delay cells for optimal TDC linearity. In summary, the proposed digital calibration scheme proceeds as follows. Step 1: the PLL is initially locked to a known fractional frequency with the digi-phase block enabled and the TDC gain tracking fixed at a preset value. Step 2: after lock-in, the TDC calibration block uses the ramp signal at the TDC output for TDC linearity calibration. Step 3: the linearity calibration is disabled, the TDC gain tracking is enabled, and the loop is relocked to the desired frequency.



Fig. 4.3 Proposed DPLL block diagram with automatic TDC linearity calibrations for fractional spur cancellation.



Fig. 4.4 Proposed three-step TDC block diagram.

## 4.2.2 ADPLL Building Block Circuits Implementation

#### 4.2.2.1 Three-Step TDC

Similar to the PFD in an analog PLL, the TDC measures the phase difference between the divided feedback signal and the reference clock. The measured result is quantized into digital bits and processed by the digital loop filter. The quantization step, or TDC resolution, directly determines the in-band phase noise at the DPLL output, as presented in Equation (3.1). Assuming a $T_{DCO}$ of 416 ps and an $f_{ref}$ of 80 MHz, Equation (3.1) gives a required TDC resolution of 5 ps to achieve an in-band noise floor of -110 dBc/Hz. Since the phase error ranges across [$-T_{ref}/2$, $T_{ref}/2$], the proposed TDC is segmented into three steps to cover all possible phases during the phase locking process, as shown in Fig. 4.4. The three-step structure includes a bang-bang TDC as the first stage, a single delay chain as the second stage and a 2D Vernier delay array as the third stage. The single delay chain is constructed as part of the Vernier delay chains in order to save area and power. Table 4-1 summarizes the specifications of these three sub-TDCs.

Table 4-1 Specifications of a three-step TDC

| Step        | Structure  | Range / Bits | Resolution      |
|-------------|------------|--------------|-----------------|
| Acquisition | Bang-bang  | $\infty$ / 1 | +/- (sign only) |
| Coarse      | Flash      | 2.08 ns / 5  | 65 ps           |
| Fine        | 2D Vernier | 520 ps / 7   | 5 ps            |

To ensure a robust lock, the TDC must cover an entire reference clock cycle. A bang-bang TDC, acting as a signal steering stage, is therefore employed as the first stage of the proposed three-step TDC. The bang-bang TDC can detect across an entire reference cycle (12.5 ns in an 80 MHz system). It uses the falling edge of the reference (REF) signal as a trigger. If the divided feedback (DIV) edge arrives between the rising and falling edges of the REF signal, as shown in Fig. 4.5, it is treated as a lagging event, and the two signals are passed directly to the later TDC stages. Otherwise, if the DIV edge arrives after the falling edge of the REF signal, it is treated as a leading event with respect to the following REF rising edge. In this case, the two signals are swapped to ensure normal operation of the subsequent TDC stages.
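The steering decision described above can be written as a short behavioral model. This is a sketch under stated assumptions rather than the circuit itself: the REF clock is assumed to have a 50% duty cycle, and the name `bang_bang_steer` and its return format are hypothetical.

```python
def bang_bang_steer(div_arrival_ns, t_ref_ns=12.5, duty=0.5):
    """Classify a DIV edge relative to REF.

    div_arrival_ns is measured from the current REF rising edge and must
    satisfy 0 <= div_arrival_ns < t_ref_ns. The 50% duty cycle is an
    assumption of this sketch.

    Returns (event, swap, interval_ns), where interval_ns is the time
    interval handed to the following TDC stages.
    """
    if not 0 <= div_arrival_ns < t_ref_ns:
        raise ValueError("DIV arrival must fall within one reference cycle")
    ref_falling_ns = duty * t_ref_ns
    if div_arrival_ns < ref_falling_ns:
        # DIV lands between the REF rising and falling edges: lagging event,
        # signals are passed through unchanged.
        return "lagging", False, div_arrival_ns
    # DIV lands after the REF falling edge: it leads the next REF rising
    # edge, so the two signals are swapped for the later stages.
    return "leading", True, t_ref_ns - div_arrival_ns

print(bang_bang_steer(1.0))    # ('lagging', False, 1.0)
print(bang_bang_steer(12.0))   # ('leading', True, 0.5)
```

Swapping on a leading event means the later stages always measure a non-negative interval, with the sign carried separately by the bang-bang decision.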



Fig. 4.5 Timing diagrams of the (a) bang-bang TDC, (b) single delay line TDC, and (c) Vernier delay line TDC.

In the second-stage TDC, a delay chain with 16 delay stages is adopted to provide a coarse measurement with a 4-bit binary output. In conjunction with the polarity detection provided by the bang-bang TDC, the coarse TDC provides a 5-bit output with a resolution of 65 ps. By reusing the delay stages of the 2D Vernier TDC, this coarse TDC requires no extra hardware or power consumption, while extending the TDC detectable range to 2.08 ns. With this coarse TDC, the proposed digital PLL achieves faster locking owing to the enlarged detectable range. In simulation, an initial frequency error of 40 MHz was introduced. With this frequency error, a loop using both the coarse TDC and the bang-bang TDC locks in about 5.6 µs, whereas a loop using only the bang-bang TDC takes more than 30 µs. The fine TDC is used to further lower the in-band phase noise.



Fig. 4.6 Circuit diagrams of (a) TDC unit delay stage and (b) arbiter cell.

The fine TDC is constructed using a Vernier structure with a 2D arbiter array. The per-stage delays of the fast and slow delay chains are set to 60 ps and 65 ps, respectively. This slight difference provides a sub-gate-delay time resolution as fine as 5 ps. The fine 2D TDC has a detectable range of 520 ps (7 bits), which is sufficiently large to cover an entire 2.4 GHz DCO cycle (about 417 ps). The circuit diagram of the TDC unit delay stage is shown in Fig. 4.6 (a). Each delay stage consists of two cascaded inverters to avoid mismatch between rising and falling edges. In order to tune each delay stage to the desired value (60 ps or 65 ps) for optimal TDC linearity, both delays are designed to be adjustable with 6-bit controls and a 0.5 ps step size covering a range of 50 ps. In addition, a first-order sigma-delta modulator is added at the delay control input to further improve the tuning accuracy. The arbiter cell structure is shown in Fig. 4.6 (b). The reference and feedback signals are fed into the "Start" and "Finish" ports. Based on simulation results, this arbiter structure can distinguish a minimum time difference of 200 fs with a propagation delay of less than 10 ps.
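The range and bit-width figures quoted for the coarse and fine TDCs follow directly from the stage delays. The short check below uses only numbers stated in the text; the constant names are hypothetical.

```python
from math import ceil, log2

COARSE_STAGES = 16       # delay stages in the coarse chain
SLOW_DELAY_PS = 65       # slow-chain stage delay, also the coarse resolution
FAST_DELAY_PS = 60       # fast-chain stage delay
FINE_RANGE_PS = 520      # fine 2D Vernier detectable range
F_DCO_HZ = 2.4e9

fine_res_ps = SLOW_DELAY_PS - FAST_DELAY_PS            # Vernier resolution
coarse_range_ps = 2 * COARSE_STAGES * SLOW_DELAY_PS    # both polarities
coarse_bits = 1 + int(log2(COARSE_STAGES))             # sign bit + 4 bits
fine_codes = FINE_RANGE_PS // fine_res_ps
fine_bits = ceil(log2(fine_codes))
t_dco_ps = 1e12 / F_DCO_HZ

print(fine_res_ps, coarse_range_ps, coarse_bits)       # 5 2080 5
print(fine_codes, fine_bits)                           # 104 7
print(FINE_RANGE_PS > t_dco_ps)                        # True: covers one DCO cycle
```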



Fig. 4.7 Proposed wide-tuning DCO architecture.

#### 4.2.2.2 DCO

As part of a multi-band wireless transceiver, a DCO with a wide tuning range is required to provide sufficient spectrum coverage. Moreover, a wide-tuning DCO can also be used in a DPLL with high-data-rate direct modulation. In our design, four capacitor banks (PVT, ACQ, TRK, and FIN) are implemented, as shown in Fig. 4.7. The DCO oscillates at 5 GHz and is able to generate a 2.4 GHz carrier with a divide-by-2 prescaler. The implemented DPLL is able to cover the 1.9~2.8 GHz and 3.8~5.6 GHz bands for multi-band applications. Thus, the DCO is equipped with a wide tuning range of 35%. In order to lock to the desired channel, the system first uses a successive approximation (SAR) algorithm to tune the PVT bank, which has the widest tuning range with 6 binary-weighted capacitors. Next, the other three banks, the acquisition (ACQ) bank (5-bit), the tracking (TRK) bank (6-bit), and the finest (FIN) bank (7-bit), are activated for further locking. A thermometer-weighted structure is adopted for these three banks to ensure a monotonic tuning characteristic. Fig. 4.8 shows the frequency range relationships among the tuning banks. In order to minimize the DCO quantization noise, two fixed capacitors are connected in series with the parasitic capacitor array to reduce the frequency tuning step. As a result, the finest bank achieves a frequency resolution of 10 kHz/step.



Fig. 4.8 Proposed DCO frequency tuning banks.



Fig. 4.9 Schematics and layout of digitally controlled capacitor unit.

In addition, we used a common-centroid layout scheme as shown in Fig. 4.9 to further improve the monotonicity of the DCO tuning curve. The thermometer-weighted capacitors in each bank are placed in an array surrounding unit bit 0. Unwired dummy unit capacitors are inserted at corners of the array to minimize layout mismatches. Moreover, for each unit capacitance, the capacitor has been split into four equal pieces with a common-centroid quadrature layout style as well.

#### 4.2.2.3 Loop Filter

As shown in Fig. 4.3, the digital loop filter consists of proportional and integral paths to achieve a programmable bandwidth from 200 kHz to 2 MHz. In addition, two IIR filters are added on the proportional path to create a second-order filter. Parameters including the gains on the proportional and integral paths of the digital loop filter can be programmed to achieve different natural frequencies $\omega_n$ and damping factors $\xi$, similar to an analog PLL:

$$\omega_n = \sqrt{\frac{K\beta}{T_{ref}}} \qquad \xi = \frac{\alpha}{2} \sqrt{K \frac{T_{ref}}{\beta}} \tag{4.2}$$

where K represents the total loop gain excluding the loop filter, $T_{ref}$ is the period of the reference clock, and $\alpha$ and $\beta$ represent the gains of the proportional and integral paths, respectively. The loop can be programmed to a wider loop bandwidth initially for faster frequency lock and then reconfigured to an optimal bandwidth that corresponds to the best phase noise performance.
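Equation (4.2) is straightforward to evaluate when exploring loop filter settings. The sketch below is illustrative only: the function name `loop_dynamics` is hypothetical, and the numerical values of K, $\alpha$ and $\beta$ are made-up examples (with K assumed to be expressed in 1/s), not the settings of the implemented loop.

```python
from math import sqrt

def loop_dynamics(k_loop, alpha, beta, f_ref):
    """Evaluate Eq. (4.2): natural frequency (rad/s) and damping factor.

    k_loop is the total loop gain excluding the loop filter (assumed here
    to be in 1/s); alpha and beta are the proportional and integral gains.
    """
    t_ref = 1.0 / f_ref
    omega_n = sqrt(k_loop * beta / t_ref)
    xi = (alpha / 2.0) * sqrt(k_loop * t_ref / beta)
    return omega_n, xi

# Hypothetical settings chosen only to give round numbers.
omega_n, xi = loop_dynamics(k_loop=1e7, alpha=1.0, beta=2 ** -5, f_ref=80e6)
print(f"omega_n = {omega_n:.3e} rad/s, xi = {xi:.2f}")   # 5.000e+06 rad/s, 1.00
```

A useful sanity check is that the two expressions satisfy $2\xi\omega_n = K\alpha$, so for a fixed K the proportional gain sets the product of damping and natural frequency, while $\beta$ trades one against the other.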

#### 4.2.3 Automatic TDC Linearity Calibration

Similar to the basic Vernier TDC, a 2D Vernier structure employs a fast and a slow delay chain. However, rather than using a single arbiter line, the 2D Vernier structure implements multiple arbiter lines to compare each fast delay stage with multiple slow delay stages. By reusing part of the delay stages, a larger detectable range can be achieved. However, a highly linear 2D Vernier TDC requires the delays of the fast and slow chains to satisfy the following conditions:

$$\begin{cases} n(d_s - d_f) = d_s \\ d_s - d_f = t_{res} \end{cases} \tag{4.3}$$

where $d_s$ and $d_f$ denote the delay of a single stage in the slow chain and the fast chain, respectively, and *n* is the number of stages in one arbiter line. The first equation is the condition for a continuous measurement with the 2D Vernier TDC, and the second equation sets the measurement resolution. Together, these two equations admit only one viable pair of $d_s$ and $d_f$. Any deviation of the two delays causes error relative to the ideal case. We define the common mode delay error as the deviation of the average of the two delays from its ideal value, and the differential mode delay error as the deviation of the difference of the two delays from its ideal value. As shown in Fig. 4.10, a common mode delay error introduces gaps at the turning points of each arbiter line, and a differential mode delay error leads to an incorrect slope for each line. Moreover, the TDC nonlinearity induced by common mode delay error is zero for small TDC inputs located within the first arbiter line, and the deviation from the ideal transfer curve accumulates as the TDC input gets larger. On the other hand, the nonlinearity from differential mode delay error shows up even within the first arbiter line but only repeats itself periodically for large TDC inputs. From these observations, we conclude that the dominant source of TDC nonlinearity for small TDC inputs is the differential mode delay error.
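The effect of the two error types can be seen in a simplified threshold model. The sketch assumes that arbiter line m compares fast stage i against slow stage i+m, so that its decision threshold is $i(d_s - d_f) + m\,d_s$; this is a common way to describe a 2D Vernier array and is an assumption of the example, as are the helper names. Delays are in picoseconds.

```python
def vernier_delays(n, t_res):
    """Solve Eq. (4.3): n*(d_s - d_f) = d_s and d_s - d_f = t_res."""
    d_s = n * t_res
    d_f = d_s - t_res
    return d_s, d_f

def arbiter_thresholds(d_s, d_f, n, lines):
    """Thresholds in arbiter-line order (line 0 first, then line 1, ...).

    Assumed model: threshold of arbiter i on line m is i*(d_s - d_f) + m*d_s.
    """
    step = d_s - d_f
    return [m * d_s + i * step for m in range(lines) for i in range(n)]

def spacings(thresholds):
    """Differences between consecutive thresholds."""
    return [b - a for a, b in zip(thresholds, thresholds[1:])]

print(vernier_delays(n=13, t_res=5))                        # (65, 60)
print(set(spacings(arbiter_thresholds(65, 60, 13, 3))))     # ideal: {5}
print(set(spacings(arbiter_thresholds(66, 61, 13, 3))))     # common mode error: {5, 6}
print(set(spacings(arbiter_thresholds(66, 59, 13, 3))))     # differential error: {7, -18}
```

With n = 13 and a 5 ps resolution, Equation (4.3) returns the 65 ps and 60 ps delays used in this design. In this model, shifting both delays by 1 ps keeps the 5 ps step inside each line but opens a 6 ps gap at every turning point, whereas changing the delay difference alters the step inside every line, including the first one.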



Fig. 4.10 Simulated TDC nonlinearity considering common mode error and differential mode error.



Fig. 4.11 Proposed TDC automatic linearity calibration loops.

On closer inspection, the quantization error generated by the fractional-N accumulator presents a staircase ramp waveform that can be used to sweep the TDC input from $-T_{DCO}/2$ to $T_{DCO}/2$. As illustrated in Fig. 4.11, the corresponding TDC output can be subtracted from an ideal ramp signal, creating an error signal that can be used to automatically adjust the TDC delays. As mentioned above, when the TDC input is within the range of the first arbiter line, only the difference between the fast and slow delays causes TDC measurement error. On the other hand, the average of the fast and slow delays dominates the TDC error when the TDC input is sufficiently large that multiple arbiter lines are used. As a result, the common and differential parts of the fast and slow delays can be calibrated separately according to the TDC input range. Two least-mean-square (LMS) loops are designed to collect the differential and common mode error signals used for the fast and slow delay chain calibrations. More specifically, the TDC generates a flag signal to indicate whether one or multiple arbiter lines are used in a measurement. This flag signal is then used to activate either the common or the differential LMS loop. In this way, the two types of errors are calibrated orthogonally without interfering with each other. As shown in Fig. 4.12, measured results show that the LMS loops for the common and differential delays converge after about 150 µs. The convergence speed depends on the step size of the LMS loop. Faster convergence can be achieved with a larger step size; however, an excessively large step size may jeopardize convergence stability.



Fig. 4.12 Measured convergence of TDC common mode and differential mode delays.

## 4.2.4 Experiment Results of the ADPLL

A prototype of the proposed digital PLL is implemented in a standard 55 nm CMOS technology, as shown in Fig. 4.13. The RFIC layout is separated into a digital part and an analog part to minimize their cross-talk. The entire digital PLL occupies 0.56 mm<sup>2</sup>, of which the two major components, the TDC and the DCO, take most of the area. When the loop is locked to an integer frequency at 2.08 GHz, the measured in-band phase noise is -107 dBc/Hz and the integrated rms jitter (from 10 kHz to 10 MHz) is 0.55 ps, as shown in Fig. 4.14. The in-band spur around 250 kHz is due to the power regulator used on the board. The loop bandwidth is set to 1 MHz in order to clearly show the in-band noise floor achieved.



Fig. 4.13 Die photo of the DPLL in a low-power multi-standard wireless transceiver RFIC.



Fig. 4.14 Measured phase noise at a 2.08-GHz output with a loop bandwidth of 1 MHz.

To demonstrate the effectiveness of the proposed TDC calibration, the digital PLL is configured to lock at various fractional frequencies. Two cases with fractionalities of 1/64 and 3/64 are shown in Fig. 4.15 and Fig. 4.16, respectively. The measured largest fractional spurs in the two cases, at 1.25 MHz and 3.75 MHz, were -45 dBc and -36 dBc before calibration. In both measurements, the digi-phase spur canceller was enabled. However, a small residual error still exists after the canceller due to TDC nonlinearity. When the proposed TDC calibration is completed, the fractional spur levels drop to below -55 dBc and -60 dBc, respectively, indicating a spur reduction of 10 dB and 25 dB owing to the proposed TDC calibration scheme. Since the TDC gain is proportional to the delay difference between the fast and slow chains, the loop bandwidth varies slightly after delay calibration, which could affect the final spur reduction as well. Furthermore, because the spur level before calibration depends on the initial TDC delay, which is PVT sensitive, different spur levels are observed for different frequency settings. However, with the proposed calibration turned on, the largest fractional spur level is always below -55 dBc. Additional measurements of the largest fractional spurs at different fractional frequencies before and after TDC calibration are shown in Fig. 4.17.



Fig. 4.15 Measured spectrum before and after digital calibrations with the fractionalities of 1/64.



Fig. 4.16 Measured spectrum before and after digital calibrations with the fractionalities of 3/64.



Fig. 4.17 Measured fractional spur near 2.4 GHz with a loop bandwidth of 1 MHz for different fractional frequencies with and without TDC calibrations.

The measured TDC transfer curve is shown in Fig. 4.18. Before TDC calibration, gaps between different arbiter lines can be clearly observed. They result from inaccurate delays in the two delay chains and cause high spurious tones at the DPLL output. After TDC calibration, the measured TDC transfer curve is very close to the ideal transfer curve. With auto-calibration, this 2D Vernier TDC achieves an average differential nonlinearity (DNL) of 1.13 LSB and an integral nonlinearity (INL) of 0.81 LSB, compared with a DNL of 1.32 LSB and an INL of 3.49 LSB without calibration. The DNL is mainly caused by the 2D arbiter topology, where the turning points of the arbiter chains correspond to the worst DNL. The proposed TDC gain and linearity calibration only needs to be carried out once initially and involves negligible extra power consumption.



Fig. 4.18 Measured TDC transfer curve, INL, and DNL before and after digital calibrations.

Table 4-2 Measured ADPLL performances and comparisons

|                            | Hsu [59]<br>ISSCC'08 | Tasca [20]<br>JSSC'11 | Elkholy [64]<br>JSSC'15 | Gao [53]<br>ISSCC'15 | This work [77]  |
|----------------------------|----------------------|-----------------------|-------------------------|----------------------|-----------------|
| Technology (nm)            | 130                  | 65                    | 65                      | 28                   | 55              |
| $f_{ref}$ (MHz)            | 50                   | 40                    | 50                      | 80                   | 80              |
| $f_o$ (GHz)                | 3.2-4.2              | 2.9-4.0               | 4.5                     | 5.8                  | 1.9~2.8/3.8~5.6 |
| DCO Tuning Range (%)       | 27.2                 | 31.9                  | 26.8                    | /                    | 32.1            |
| In-band PN (dBc/Hz)        | -108                 | -104                  | -106                    | -105                 | -107            |
| Fractional Spur (dBc)      | -53                  | -53                   | -51                     | /                    | -55             |
| Fractional Frequency (MHz) | 1                    | 1                     | 0.392                   | /                    | 1.25            |
| Loop Bandwidth (MHz)       | 1.1                  | 0.312                 | 2.5                     | /                    | 1               |
| RMS Jitter (fs)            | 204                  | 400                   | 490                     | 173                  | 549             |
| Fine TDC Number of Bits    | 11                   | /                     | 4                       | /                    | 7               |
| TDC Res (ps)               | 6                    | /                     | 0.9                     | /                    | 5               |
| DCO Res (kHz/bit)          | /                    | 94                    | 5                       | 100                  | 10              |
| Power (mW)                 | 46.7                 | 4.5                   | 3.7                     | 9.5                  | 9.9             |
| Area (mm<sup>2</sup>)      | 0.95                 | 0.22                  | 0.22                    | 0.3                  | 0.56            |

As part of a low-power 802.11 a/b/g/n wireless transceiver RFIC, the proposed digital PLL consumes a total power of 9.9 mW, of which the TDC, the DCO and the digital circuits (including the MMD) consume 4.7 mW, 4.2 mW and 1 mW, respectively. The reference signal is generated by an 80 MHz crystal oscillator. Performance comparisons are summarized in Table 4-2, demonstrating a DPLL design that is competitive with the state of the art.

### 4.3 Conclusion

As a typical TDC application, a fractional-*N* digital PLL using a 2D Vernier TDC with automatic linearity calibration is presented. By using a ramp signal generated from the existing fractional frequency synthesis blocks, the loop can automatically adjust the TDC's fast and slow delays to achieve the best linearity for fractional spur reduction. A digi-phase canceller with an automatic TDC gain tracking loop is implemented to further suppress the fractional spurs. A largest fractional spur of -55 dBc was measured over various fractional frequencies without using a traditional SDM for noise shaping. The proposed three-step TDC provides fine resolution and a wide detectable range with minimal hardware.

# Chapter 5 Time Domain Signal Processing in Advanced Communication Systems

## 5.1 Introduction

Although time-to-digital converters are used intensively in all kinds of digital PLL applications, their use is not limited to PLLs. With picosecond and even sub-picosecond time interval detectability, newly developed high-performance time-to-digital converters pave the way for time domain signal processing in many other potential directions.

With the advancement of the Internet of Things (IoT) concept, there is an increasing need for a new class of radio node to enable emerging distributed sensing, tagging and communication applications in commercial spaces that are based on ultra-densely deployed, collaborative, and secured sensor networks. The foundation of such a node network is a unified, highly miniaturized, low-power, and low-cost radio platform. One obstacle to miniaturization is the on-node antenna required by the wireless communication links. The physical sizes of commercial low-frequency (MHz to several GHz) wireless protocol radios are fundamentally limited by their antennas, which are at the millimeter to centimeter scale and prevent system-on-chip integration.

The continuous device scaling in silicon IC technologies has opened the door for low-cost silicon-based electronics and radios operating at Terahertz (THz) frequencies [100]-[123]. Using a THz carrier allows for a drastic antenna size reduction and enables extreme miniaturization of the whole radio to the sub-millimeter scale, which will enable future invisible sensor nodes. However, another obstacle appears. Most existing THz radios consume substantial DC power, often from hundreds of milliwatts to watts, which is incompatible with field-deployable radio nodes. Moreover, THz communication systems typically face difficulties with high path loss, signal generation, and high-sensitivity signal reception. Wired power lines are not available for field-deployed sensor nodes, so the overall power consumption directly affects the life span of each individual node and the performance of the entire IoT system. An integrated IoT communication node prototype is presented in the following sections. Transmitting at THz frequencies, the system achieves ultra-low power consumption by utilizing a TDC based digital super-regenerative technique.

Together with the boom in the personal electronic device market, wireless communication has become one of the biggest branches of the integrated circuit industry. Among digital communication protocols, phase modulation is the most widely adopted scheme. The vast majority of commercial protocols, such as Wi-Fi, Bluetooth, 3G, 4G and even 5G, use phase related modulations such as phase-shift keying (PSK) or quadrature amplitude modulation (QAM). In these modulations, information is encoded entirely or partially in discrete phases. Phase is essentially a time domain signal at a given frequency. Applying time domain signal processing techniques to wireless communication can therefore shift part of the burden away from the analog domain and relieve the complexity of some transceiver modules, for instance the power amplifier (PA) on the transmitter side and the ADC on the receiver side.

# 5.2 Time Domain Demodulation in Super Regenerative Receiver for Low Power Internet-of-Things Applications

Field-deployed IoT systems are characterized by one crucial property: low-power transmission in the range of a few milliwatts. Traditional RF transceivers require a relatively high sensitivity level to maintain the acquisition range between the transmitter (TX) and receiver (RX). A power amplifier (PA) on the TX side and a low-noise amplifier (LNA) on the RX side are indispensable to strengthen the signal along the transmission path and reach the required sensitivity level. For example, a commonly used PA for portable Wi-Fi applications consumes a few hundred milliwatts or even a few watts, which disqualifies such transceivers as IoT node candidates.



Fig. 5.1 Schematic diagram of a general super-regenerative receiver

The main motivation for using the super-regenerative receiver (SRR) is its high sensitivity, which can compensate for the large path loss of THz signal transmission. In addition, compared with other receivers, the SRR has a much simpler architecture, which reduces manufacturing cost and makes it suitable for mass production. As can be seen in the schematic diagram in Fig. 5.1, it consists of a passive matching network, an amplifier controlled by a system-driven quench signal, and a positive feedback path for oscillation.

The modulated THz signal is coupled to the input. The oscillator is a combination of an amplifier and a selective network formed by a band-pass filter on the positive feedback path. The positive feedback continuously raises the gain of the amplifier until oscillation starts. The positive feedback loop thereby significantly enhances the sensitivity of the receiver.

The amplifier gain is controlled by the system-synchronized quench signal: oscillation starts at a certain point, is quenched, and then starts again after some time. The demodulation mechanism in an SRR is based on the variation in the startup time of the oscillation. In the absence of the input THz signal, the startup process is slow, since the oscillation builds up from thermal noise. In the presence of an input signal at the target frequency, the startup process is much quicker. On-off keying (OOK) is used for modulation and demodulation in such a system.
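The startup-time mechanism lends itself to a simple numerical illustration. The sketch below assumes a textbook first-order model in which the oscillation envelope grows exponentially from its initial amplitude until it crosses a detection threshold, and it assumes that a TDC digitizes the resulting startup time. The model, the function names, and every numerical value (time constant, amplitudes, threshold, TDC resolution, decision code) are hypothetical and chosen only to show the principle.

```python
from math import log

def startup_time_ps(v_init, v_threshold, tau_ps):
    """Time for an envelope v_init * exp(t / tau) to reach v_threshold.

    Assumed first-order model of oscillator startup; not taken from the
    measured design.
    """
    return tau_ps * log(v_threshold / v_init)

def demodulate_ook(startup_times_ps, tdc_res_ps, decision_code):
    """Quantize startup times with a TDC and slice them into OOK bits.

    A short startup time (code below decision_code) is read as a '1'.
    """
    codes = [int(t // tdc_res_ps) for t in startup_times_ps]
    bits = [1 if code < decision_code else 0 for code in codes]
    return codes, bits

TAU_PS = 100.0          # hypothetical envelope growth time constant
V_THRESHOLD = 0.1       # hypothetical detection threshold (V)
V_NOISE = 1e-6          # hypothetical noise-only starting amplitude (V)
V_SIGNAL = 1e-4         # hypothetical starting amplitude with THz input (V)

t_on = startup_time_ps(V_SIGNAL, V_THRESHOLD, TAU_PS)    # about 691 ps
t_off = startup_time_ps(V_NOISE, V_THRESHOLD, TAU_PS)    # about 1151 ps
codes, bits = demodulate_ook([t_on, t_off], tdc_res_ps=10.0, decision_code=92)
print(codes, bits)      # [69, 115] [1, 0]
```

In this model the startup time depends logarithmically on the injected amplitude, so a 40 dB difference between the signal and noise starting levels translates into a time difference of a few time constants, which a TDC with picosecond-level resolution can resolve comfortably.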

A digital-bits-in/-out CMOS low-power super-regenerative THz radio system is presented in the following sections, as shown in Fig. 5.2. As a proof-of-concept demonstration, two THz pico-radio chips are used to establish a wireless communication link in a time division duplex manner. A bidirectional transmitter/receiver (TX/RX) circuit-sharing architecture configures the radio as a harmonic oscillator in the TX mode or as a super-harmonic super-regenerative receiver (SRR) in the RX mode. The TX is directly driven by On-Off Keying (OOK) digital data, and the super-harmonic SRR offers high sensitivity to compensate for the THz path loss, extending the maximum communication distance. A time-to-digital converter (TDC) is integrated on-chip to provide directly digitized RX outputs. With the time division duplex operation, an on-chip two-feed slot antenna is shared by the TX and RX, eliminating any duplexer circuit for further radio size reduction. The chip area is only 0.57 mm<sup>2</sup> including the on-chip antenna.



Fig. 5.2 A bidirectional digital-bits-in/-out CMOS THz pico-radio with on-chip antenna for wireless sensor nodes and Internet of Things (IoT).



Fig. 5.3 Circuit schematic of the THz pico-radio configured as the transmitting mode (TX).

# 5.2.1 Digital PA-less Super Regenerative Radio Implementation with OOK Modulation and TDC Based Demodulator

#### 5.2.1.1 TX Mode

In the TX mode, two pairs of cross-coupled oscillators oscillate at  $f_0$ , and the 2<sup>nd</sup> harmonic signal  $2f_0$  in the THz range is extracted at the common-mode nodes as the THz output, as shown in Fig. 5.3. The cross-coupled transistors M<sub>2</sub>–M<sub>3</sub> provide the differential negative  $g_m$  for oscillation. Transistors M<sub>4</sub> and M<sub>5</sub> are used as varactors for fine tuning of the carrier frequency. In the TX mode, the on-chip TDC and the RX injection transistor M<sub>1</sub> are turned off to save power. For TX modulation, the OOK data drives the tail current source M<sub>2</sub> for direct bits-to-THz transmission. The supply voltage of the TX-mode oscillators is 1.1V. The total 2<sup>nd</sup> harmonic output power delivered to the antenna is enhanced at  $2f_0$ . In the actual wireless link, the TX output power is optimized based on the power-performance requirement. For example, in short-distance communications, the TX output power is backed off to save TX DC power.



Fig. 5.4 Circuit schematic and operation principle of the THz pico-radio configured as the receiving mode (RX).

The wavelength of the THz signal is below 1 mm, which allows the transmitting and receiving antenna to be integrated on-chip. The antenna is shared between the TX mode and the RX mode to reduce the total radio size [124]-[129].

#### 5.2.1.2 RX Mode

Conventional THz RXs operating above transistor  $f_{max}$  can be generally divided into power detector based incoherent detection scheme [109], [110] and sub-harmonic mixer based coherent detection scheme [119], [120]. Conventional passive power detectors feature low DC power consumption, small chip area, and low overhead due to supplementary circuits, but they often suffer from poor sensitivity. On the other hand, sub-harmonic coherent detections often achieve much better sensitivity, especially using a low IF frequency. Coherent detections also provide phase information and support complex quadrature modulations. Nevertheless, coherent detections require synchronized LO and THz PLL designs that often lead to large DC power consumption and chip area.

Another incoherent power-based detection approach is to use an SRR [130]-[132], which is particularly suitable for applications with limited chip area and DC power. SRRs are also popular solutions for GHz low-power RXs and mm-wave/THz imagers [122], [123]. In this work, a super-harmonic super-regenerative architecture is proposed, where the input signal frequency is around the 2<sup>nd</sup> harmonic of the oscillation frequency, enabling THz reception above the transistor  $f_{max}$ . Moreover, its regenerative nature substantially improves the RX sensitivity over incoherent THz power detectors, and it exhibits competitive sensitivity at significantly lower DC power and chip area compared with coherent sub-harmonic mixer-based RXs.



Fig. 5.5 Circuit schematic of the TDC.

The RX mode operation is explained as follows. The TX/RX switch is closed. Once a  $2f_0$  input signal is received by the on-chip antenna, a  $2f_0$  current is injected into the resonator tank and creates a small asymmetry that perturbs the fundamental oscillation start-up at  $f_0$ , as shown in Fig. 5.4. Thus, the received OOK signal leads to different RX oscillation start-up times (Fig. 5.4). Transistors M<sub>7</sub> and M<sub>8</sub> are reconfigured as an envelope detector (ED), whose output subsequently triggers the TDC. By periodically quenching the tail current source M<sub>2</sub>, the received OOK signal is demodulated by measuring and digitizing the regenerative oscillator start-up time with the TDC, achieving direct THz-to-bits reception. Notably, the incoherent nature of the detection requires oversampling relative to the symbol rate at the RX to recover the data accurately. In our implementation, 4× oversampling is chosen for the RX quench signal. The RX oscillation start-up time gradually decreases with larger input power, and the time-encoded output can be readily resolved and digitized by the on-chip TDC.

Based on our THz super-harmonic SRR operation, a fine TDC timing resolution (~ tens of ps) is critical to achieve high RX sensitivity, whilst a large detectable timing range (~ tens of ns) is needed to capture the entire regenerative oscillation start-up event. These two timing requirements pose challenges for conventional TDCs, and judicious design considerations are thus required.

Our TDC is designed as a 2-step TDC with a coarse TDC and a fine TDC (Fig. 5.5), achieving a large conversion range and fine timing resolution simultaneously. The coarse TDC is implemented as a sampler with a timing resolution of 500 ps, while the fine TDC is implemented as a 2D Vernier structure, covering a 725 ps range at a timing resolution of 25 ps. The 2D Vernier TDC comprises two slightly different delay lines and a 2D comparator array, presenting several unique advantages. First, compared with a conventional single inverter-delay-line based TDC (flash TDC) that quantizes time information based on the propagation delay of each individual inverter, the Vernier TDC has greatly improved timing resolution because it uses the propagation delay difference of the unit delay cells in the two delay lines, i.e.,  $\tau_s$ - $\tau_f$ . Secondly, the Vernier delay line architecture is less sensitive to process-voltage-temperature (PVT) variations, since the 1<sup>st</sup> order PVT mismatches are automatically cancelled if the two delay lines are well matched. Monte Carlo simulations for the fast delay ( $\tau_f$ ), slow delay ( $\tau_s$ ), and delay difference ( $\tau_s$ - $\tau_f$ ) are shown in Fig. 5.6(a), and the PVT dependence is illustrated in Fig. 5.6(b). Both results indicate a significantly reduced PVT sensitivity for the Vernier delay line. In addition, by using a 2D comparator based 2D Vernier TDC, the number of delay stages can be substantially reduced.
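The Vernier principle described above can be illustrated with a small behavioral sketch. It models a plain 1D Vernier line rather than the 2D comparator array of the actual design, and the function name and the unit delays (100 ps slow, 75 ps fast) are assumptions chosen only so that their difference matches the 25 ps resolution quoted in the text.

```python
def vernier_tdc_code(interval_ps, tau_s_ps=100.0, tau_f_ps=75.0, n_stages=29):
    """Behavioral model of a 1D Vernier delay line (illustrative only).

    START travels down the slow line (tau_s per stage) and STOP down the
    fast line (tau_f per stage), so STOP gains tau_s - tau_f on START at
    every stage. The output code is the stage at which STOP catches up.
    """
    lsb_ps = tau_s_ps - tau_f_ps      # effective resolution, 25 ps here
    lead_ps = interval_ps             # how far START is ahead of STOP
    for stage in range(1, n_stages + 1):
        lead_ps -= lsb_ps
        if lead_ps <= 0.0:
            return stage
    return n_stages                   # input beyond the range: saturate


for t_ps in (25.0, 60.0, 700.0):
    print(t_ps, "ps ->", vernier_tdc_code(t_ps))
```

With these assumed values, 29 stages of 25 ps span the 725 ps fine range, whereas each individual cell delay is three to four times longer than the resolution it helps to achieve.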



Fig. 5.6 (a) Monte Carlo simulations of the fast delay ( $\tau_f$ ), slow delay ( $\tau_s$ ) and delay difference ( $\tau_s$ - $\tau_f$ ) on the process variations. (b) Simulated delay variations over temperature, supply voltage and corners.

The timing diagram of the RX mode and the TDC is explained as follows. Once the RX quench signal turns on the THz regenerative oscillator, its start-up time is first sampled by the coarse TDC. The ED output rises in proportion to the growing oscillator amplitude, and it eventually passes a threshold and triggers the fine TDC. The fine TDC then resolves the residue of the coarse TDC, i.e., the timing difference between the onset of the fine TDC and the immediately following clock edge of the coarse TDC (Fig. 5.7). The RX oscillation start-up time is thus the difference between the coarse and fine TDC outputs. At low data rates, the duty cycle of the quench signal is reduced to further save RX DC power.
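The reconstruction of the start-up time from the two TDC outputs can be sketched as below. The resolutions and the fine range are the values quoted in the text; the function name, and the convention that the coarse count includes the coarse clock edge immediately following the ED trigger, are assumptions made for illustration.

```python
COARSE_LSB_PS = 500.0   # coarse sampler resolution quoted in the text
FINE_LSB_PS = 25.0      # 2D Vernier resolution quoted in the text
FINE_RANGE_PS = 725.0   # fine TDC range quoted in the text


def startup_time_ps(coarse_count, fine_code):
    """Combine coarse and fine codes into one start-up time (hypothetical helper).

    coarse_count: coarse clock edges counted from quench turn-on up to and
                  including the edge that follows the ED trigger (assumed).
    fine_code:    fine TDC code for the gap between the ED trigger and
                  that coarse clock edge.
    """
    residue_ps = fine_code * FINE_LSB_PS
    if coarse_count < 0 or not 0.0 <= residue_ps <= FINE_RANGE_PS:
        raise ValueError("code outside the modelled TDC range")
    # Start-up time = time of the coarse edge minus the measured residue.
    return coarse_count * COARSE_LSB_PS - residue_ps


print(startup_time_ps(21, 8), "ps")
```

Because the fine range (725 ps) exceeds one coarse step (500 ps), the residue always fits in the fine TDC with some margin.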



Fig. 5.7 Timing diagram of the RX mode and the TDC, assuming 4Mbit/s OOK data rate.



Fig. 5.8 Concept of the switching capacitor energy harvester.

#### **5.2.1.3 Energy Harvester**

Wireless communication is one of the foundations of IoT networks. A power grid connection is not available to every terminal of the network, so self-powered capability becomes essential for IoT sensor nodes. Our proposed THz radio node possesses an energy harvesting function that can extract energy from an unstable, ultra-low-voltage power source, such as a solar cell. Switched capacitors are commonly used for this application. The concept is presented in Fig. 5.8. Two CMOS-inverter-like switch groups are stacked in the voltage domain. The input voltage is tied to the junction of the two groups, and all the gates are connected to a clock signal. The outputs of the two "inverters" are coupled by a capacitor C<sub>FLY</sub>. In state one, the bottom plate of C<sub>FLY</sub> is switched to ground, and the top plate is charged by V<sub>in</sub>. In state two, the bottom plate is lifted to V<sub>in</sub>, and the charge on the top plate is released to the node V<sub>out</sub>. After several cycles, the voltage on V<sub>out</sub> approaches twice the input voltage V<sub>in</sub>.
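The cycle-by-cycle build-up of the output voltage can be reproduced with an idealized charge-sharing model. This is only a sketch: it assumes lossless switches, no load current, and an output capacitor to ground (named `c_out` here, which is not a symbol from the text); the function name and the capacitor values are likewise hypothetical.

```python
def doubler_vout(v_in, c_fly, c_out, cycles):
    """Ideal two-state switched-capacitor doubler (no load, lossless switches)."""
    v_out = 0.0
    history = []
    for _ in range(cycles):
        # State one: bottom plate grounded, C_FLY charged to v_in.
        # State two: bottom plate lifted to v_in, so the top plate sits at
        # 2*v_in before it shares charge with the output capacitor.
        v_out = (c_fly * 2.0 * v_in + c_out * v_out) / (c_fly + c_out)
        history.append(v_out)
    return history


trace = doubler_vout(v_in=0.5, c_fly=1e-9, c_out=1e-9, cycles=30)
print([round(v, 4) for v in trace[:5]], "...", round(trace[-1], 6))
```

In this model the output approaches 2·V<sub>in</sub> geometrically, with a faster settling when C<sub>FLY</sub> is large relative to the output capacitor.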



Fig. 5.9 Schematic of the adopted voltage doubler.

The energy harvester adopted in our design consists of two series-connected voltage doublers that convert an input voltage of around 0.5 V to 1.8 V. The architecture, shown in Fig. 5.9, is provided in [133], [134]. Two stacked ring oscillators are coupled by capacitors. The ring oscillators serve as switched capacitors and clock generators simultaneously. A feedback loop is formed to adaptively adjust the output voltage.

#### 5.2.2 Experiment Results of the THz Wireless Communication System

The bidirectional THz pico-radio is designed and fabricated in a 45nm CMOS SOI process with high-resistivity substrate (Fig. 5.10). The miniaturized THz pico-radio only occupies  $950 \times 600 \mu m^2$  including all the pads. The CMOS chip is wire-bonded to a PCB to facilitate the testing.



Fig. 5.10 THz pico-radio chip microphotograph.

After characterizing the performance of the TX and RX individually, wireless communication based on OOK modulation is established between the two THz pico-radio chips. The measured data rate, TX/RX distance, and TRX total DC power consumption are summarized in Fig. 5.11. The maximum data rate of 4.4Mb/s (BER<10<sup>-7</sup>) over 50cm maximum TX/RX distance is supported at a peak TRX DC power of 49.3mW. The maximum data rate is limited by how fast the RX oscillator starts to oscillate, which can be potentially increased at the expense of reduced RX sensitivity and maximum communication distance.

The THz pico-radio TRX achieves power-performance scalable operation, where DC power is optimized based on the actual link distance and data rate [Fig. 5.11(a)]. The TX DC power and its output power can be lowered for closer communication distances, while RX quench signal duty cycle can be reduced for lower data rates. For example, the TRX total DC power reduces to 26.4mW when the data rate is lowered to 1Mb/s while keeping 50cm TX/RX distance. The TRX DC power further reduces to 18.7mW with 1Mb/s data rate when the communication distance is shrunk to 17cm. The measured BER and minimum RX DC power versus data rate over 50cm communication distance as well as the measured BER versus distance at different TX DC power for 4Mb/s OOK are shown in Fig. 5.11(b) and Fig. 5.11(c).

With the high-resolution on-chip TDC, the RX is also able to accurately distinguish the received signal strength based on the oscillation start-up time at a fixed TX/RX distance. Therefore, two THz pico-radio chips are used to establish an ASK wireless link to increase the total communication throughput. The measured maximum communication distance versus number of bits for BER<10<sup>-5</sup> and BER<10<sup>-7</sup> is shown in Fig. 5.12. The ASK link supports 4-bit 4Msym/s (16Mb/s) communication over 15cm and 3-bit 4Msym/s (12Mb/s) communication over 35cm, both with BER<10<sup>-5</sup>.



Fig. 5.11 THz pico-radio communication link using OOK modulations. (a) The results are summarized as measured TRX peak DC power versus TX/RX communication distance and OOK data rate at BER<10<sup>-7</sup>. (b) Measured BER and minimum RX DC power versus data rate over a fixed TX/RX distance of 50cm. (c) Measured BER versus TX/RX distance at different TX DC power for 4Mb/s OOK signals.

Note that this capability of distinguishing the received amplitude can be used to monitor the real-time THz transmission characteristics between the TX/RX pair, which can enable a wide variety of distributed sensing applications, such as very-large-scale position/motion tracking, temperature/humidity monitoring, vibration/deformation sensing, or non-contact THz electromagnetic "tactile sensing".



Fig. 5.12 THz pico-radio communication link using ASK modulations with a symbol rate of 4Msym/s. The results are summarized as maximum communication distance versus number of bits.

# 5.3 Hybrid Time-Analog-to-Digital Conversion for Direct RF Polar Wireless Communication System

Cartesian I/Q transceivers are widely adopted in today's wireless communication systems. Most of today's wireless communication standards, such as GSM, WLAN and LTE, involve signals with a large peak-to-average power ratio (PAPR), which places a heavy burden on all the modules in the signal path. To prevent signal distortion, those circuits need to be highly linear. Consequently, I/Q transceivers consume a large amount of power to meet the linearity specifications. Fig. 5.13 shows a conventional I/Q wireless transceiver architecture.



Fig. 5.13 Traditional I/Q wireless transceiver architecture.

The high PAPR of an I/Q architecture is due to the constantly varying signal amplitude. One solution to relieve the issue is to use a polar transmitter architecture. A polar direct modulation transmitter architecture is presented in Fig. 5.14. The original digitized I/Q signals are converted into amplitude and phase through the coordinate rotation digital computer (CORDIC) and fed to two paths, the amplitude path and the phase path. The digital phase data is directly converted to phase-modulated RF signals through ADPLL or direct-digital-synthesis (DDS) based direct phase modulators. The amplitude information is embedded into the transmitted signal at the PA stage, the very last stage of the transmitter, by using a digital power amplifier (DPA). The architecture diagram shows that the linearity requirement has been removed from most of the modules, since the phase path only deals with constant-envelope signals and the main parts of the amplitude path are located in the digital domain. The only module left facing the high linearity requirement is the DPA, and this requirement can also be relaxed with specific modulation types that will be covered in the following sections.
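As an illustration of the I/Q-to-polar step, the following is a minimal floating-point sketch of a vectoring-mode CORDIC. It is not taken from the transmitter in Fig. 5.14: the function name and iteration count are assumptions, and a hardware version would replace the multiplications by shifts and the `atan` calls by a small lookup table.

```python
import math


def cordic_to_polar(i, q, iterations=16):
    """Vectoring-mode CORDIC: returns (amplitude, phase in radians)."""
    # Pre-rotate by 180 degrees if needed so the vector starts in the right
    # half-plane, where the micro-rotations are guaranteed to converge.
    if i < 0:
        x, y = -i, -q
        angle = math.pi if q >= 0 else -math.pi
    else:
        x, y, angle = i, q, 0.0
    gain = 1.0
    for k in range(iterations):
        d = -1.0 if y > 0 else 1.0          # rotate towards the x-axis
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
        angle -= d * math.atan(2.0 ** -k)   # accumulate the removed angle
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * k))
    return x / gain, angle                  # undo the CORDIC gain (about 1.647)


amp, phase = cordic_to_polar(3.0, 4.0)
print(round(amp, 4), round(phase, 4))
```

The amplitude output would drive the DPA path and the phase output the phase modulator; each iteration adds roughly one bit of angular accuracy.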



Fig. 5.14 Existing polar direct modulation TX architecture.

Polar transmitters have been studied for years [25]. However, literature on the corresponding polar receivers can barely be found in the communication field. One of the main reasons is that the received signal normally becomes very small after long-distance transmission to the receiver. Thus, linearity turns out to be less important than in the transmitter, and the main burden of a receiver is shifted to the ADC and the baseband signal processing. A conventional WLAN receiver calls for an ADC in both the I and Q signal paths. These ADCs require around 10 bits and run with a minimum oversampling ratio of 4, which corresponds to an ADC sample rate of around 100Msps. The ADC specifications mentioned above are feasible, but come with considerable power consumption and cost.



Fig. 5.15 Block diagram of TDC based hybrid polar data converter in RX architecture.

## 5.3.1 Sub-Sampling Direct RF Demodulation Data Converter for Polar Receiver

Following the trend of digitizing everything, the polar direct modulation TX architecture is a good example in which the major portion has been moved into the digital domain, as shown in Fig. 5.14. The two remaining parts, the DCO and the DPA, can be treated as data converters that transfer digital information into RF signals. In contrast, the digitization process in the receiver is still unclear. Sampling the signal at the RF frequency with an ultra-high-speed ADC is one solution. However, its tremendous cost (hundreds or even thousands of dollars) and power consumption (a few watts) [xx-xx] stand in the way of commercialization for ordinary consumers.

In this work we propose a novel time-domain-information-assisted wireless receiver architecture, which uses both TDCs and ADCs to form a polar data converter and demodulator. The block diagram of the proposed architecture is shown in Fig. 5.15. First, a few amplifier and filter stages preprocess the RF signal captured by the antenna. Second, instead of being split into I/Q, the RF signal is fed directly into a TDC path and an ADC path. In the TDC path, the signal goes through a hysteresis buffer, which preserves the signal's phase information and removes its amplitude. By carefully adjusting the  $V_{TH}$  and  $V_{TL}$  of the buffer, the hysteresis effect reduces the noise in the phase domain. The m-bit TDC measures the time difference between the system clock and the immediately following zero-crossing point of the received signal. The signal in the other branch is fed into the ADC directly. The ADC sampling position is tuned by a DTC and determined from the TDC output data. The outputs of the ADC and the TDC represent the amplitude and phase of the baseband signal. Thus, the proposed architecture is able to demodulate the received signal directly.



Fig. 5.16 Polar RX working principle with 16-QAM baseband signal.

Fig. 5.16 demonstrates the proposed polar receiver's working principle. Take 'Symbol 1' as an example. First, the frequency of the baseband PLL is tuned and locked to the baseband frequency, and its phase is aligned with the edges of the baseband symbols. The TDC measures the time between the start of a baseband symbol and the baseband signal's first zero-crossing point, either rising or falling. The TDC output represents the phase information of the current symbol. Then we need to determine its amplitude, which is represented by the peak value of the sinusoidal waveform. A closer look at Fig. 5.16 shows that the signal peaks a quarter period ( $T/4$ ) after the rising zero-crossing point, as indicated by the solid blue line. Thus, if the ADC samples the signal at that specific point, the measured data represents the amplitude of the signal. With the gathered phase and amplitude information, the symbol is mapped to the constellation and the signal is therefore recovered. One additional characteristic worth pointing out is that oversampling is not necessary for the proposed architecture. When dealing with QAM or other phase-modulated signals, the ADC only needs to convert once per symbol to recover the signal with the information provided by the TDC.
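The working principle above can be sketched numerically for a single symbol. The example is a simplified model, not the receiver implementation: it assumes an ideal sinusoid over the symbol, uses only the rising zero crossing, computes that crossing analytically instead of modelling the hysteresis buffer, and all names and numeric values are hypothetical.

```python
import math


def polar_sample(amplitude, phase_rad, freq_hz, tdc_lsb_s):
    """One-symbol sketch: TDC gives the phase, one ADC sample gives the amplitude."""
    period = 1.0 / freq_hz

    def signal(t):
        return amplitude * math.sin(2.0 * math.pi * freq_hz * t + phase_rad)

    # First rising zero crossing after the symbol start (t = 0).
    t_zero = ((-phase_rad) % (2.0 * math.pi)) / (2.0 * math.pi * freq_hz)
    # The TDC quantizes that time with a finite LSB.
    tdc_code = round(t_zero / tdc_lsb_s)
    t_meas = tdc_code * tdc_lsb_s
    # The ADC is triggered a quarter period after the measured crossing,
    # where the sinusoid is at its peak.
    adc_sample = signal(t_meas + period / 4.0)
    # Phase estimate recovered from the TDC code alone.
    phase_est = (-2.0 * math.pi * freq_hz * t_meas) % (2.0 * math.pi)
    return tdc_code, phase_est, adc_sample


code, phase_est, amp_est = polar_sample(0.75, math.pi / 3.0, 1e6, 1e-9)
print(code, round(phase_est, 4), round(amp_est, 5))
```

Because the sinusoid is flat near its peak, a small TDC timing error has only a second-order effect on the sampled amplitude, which is one reason a single ADC conversion per symbol can suffice in this model.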

The baseband data in conventional protocols runs at several megasymbols per second. However, the abrupt transitions between symbols expand the bandwidth of the signal towards infinity in the frequency domain. In practice, pulse-shaping filters, such as the raised-cosine filter, are applied to the baseband signal to limit its bandwidth and meet the transmitter's requirements. The spectra of the unfiltered and filtered signals can be observed in Fig. 5.17. The power spectrum of the filtered signal is confined to the desired bandwidth, and the out-of-band frequency components are greatly suppressed. Furthermore, in order to meet the transmission spectrum restrictions, the baseband signal is up-converted to a specific RF carrier frequency according to the wireless communication protocol, for instance 2.4GHz for the 802.11g standard.



Fig. 5.17 FFT results of the filtered and unfiltered signal.

The solid black curve shown in Fig. 5.17 depicts the signal spectrum after filtering and up-conversion. Next, let us explore the changes in the time domain. An example is given in Fig. 5.18, where a 150 Msym/s baseband signal is filtered and up-converted to a 2.4 GHz carrier frequency. A section of the signal, which contains three sequential symbols, is depicted in both time-domain and constellation plots. Compared with the signal in Fig. 5.16, the actual RF transmission signal is clearly quite different from the baseband signal. Instead of only one oscillation cycle at the baseband frequency, 16 cycles at the carrier frequency fit into one symbol period after up-conversion. The pulse-shaping filter influences the time-domain signal in another way. The two time-domain RF signals, with and without the filter, are plotted in Fig. 5.18. The unfiltered signal is plotted in gray; abrupt phase and amplitude jumps can be observed at the edge of each symbol. The filtered signal (in black), by contrast, changes gradually from one symbol to the next instead of making a sudden transition, which confines the signal to a limited bandwidth as discussed earlier. Though the two signals seem totally different, they approximately overlap at the edge of each symbol. Thus, the TDC-based data conversion introduced earlier is valid only within the restricted time intervals marked with blue boxes in Fig. 5.18. The RF signal is fed directly into the converter without frequency down-conversion, yet both the TDC and the ADC sample the signal once per symbol and work at the baseband frequency. The outputs of the converter provide enough information for constellation reconstruction. This is why it is named a sub-sampling direct RF demodulation data converter.



Fig. 5.18 Direct RF demodulation with 64QAM signal.

There are two obvious benefits. First, without down-conversion the maximum phase variation is limited to one RF oscillation cycle, which means the required TDC detectable range is greatly reduced to one carrier period (416 ps in this example). Second, both the TDC and the ADC sample the signal once per symbol at the baseband frequency. Traditionally, either two ADCs at the baseband frequency or one ADC at an intermediate frequency (IF) is required, with a minimum oversampling ratio larger than two, to reconstruct the signal in the digital domain. Whichever scenario is chosen, a higher sample rate is required, which leads to larger power consumption than the proposed conversion technique.
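The numbers in this example follow from simple arithmetic, sketched below. The carrier frequency and symbol rate are those of the Fig. 5.18 example; the 1 ps TDC resolution used for the bit-count estimate is only an assumed value for illustration, and the helper name is hypothetical.

```python
import math

CARRIER_HZ = 2.4e9       # carrier frequency of the example
SYMBOL_RATE = 150e6      # baseband symbol rate of the example

cycles_per_symbol = CARRIER_HZ / SYMBOL_RATE   # carrier cycles in one symbol
carrier_period_ps = 1e12 / CARRIER_HZ          # required TDC detectable range


def tdc_bits(range_ps, lsb_ps):
    """Bits needed to cover range_ps with steps of lsb_ps."""
    return math.ceil(math.log2(range_ps / lsb_ps))


print(cycles_per_symbol, round(carrier_period_ps, 1))
print(tdc_bits(carrier_period_ps, lsb_ps=1.0))  # assumed 1 ps resolution
```

The carrier period evaluates to about 416.7 ps, consistent with the range quoted above, while both converters still run at only 150 MS/s.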

Similar to I/Q receivers, the proposed polar receiver requires phase and frequency alignment between the local clock signal and the received signal. Without proper alignment, phase and amplitude drift occur and result in an unacceptably large EVM. Therefore, the phase needs to be aligned through calibration in the initial state. A multi-phase clock, which generates multiple evenly distributed phases, is adopted for coarse alignment. The system automatically sweeps the phases during the initial communication setup and picks the phase closest to the symbol start time as the TDC sampling clock. Then, a tunable delay cell is used for further fine phase alignment. The use of a multi-phase clock relaxes the tuning range of the delay cell and reduces power consumption. This calibration is part of the receiver setup and is performed by automatic feedback calibration built into the digital baseband.
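The coarse-then-fine alignment can be sketched as follows. This is only an illustration of the selection logic under stated assumptions: the symbol start offset is taken as already known, whereas the real system would find it by sweeping the phases with feedback, and the function name, the eight clock phases, and the normalized time units are hypothetical.

```python
def align_phase(symbol_offset, clock_period, n_phases):
    """Pick the nearest clock phase and return the residual for the delay cell.

    Returns (phase_index, residual), where residual is the signed remaining
    offset that the fine tunable delay cell still has to absorb.
    """
    step = clock_period / n_phases
    offset = symbol_offset % clock_period
    nearest = round(offset / step)           # coarse choice: closest phase
    residual = offset - nearest * step       # bounded by +/- step / 2
    return nearest % n_phases, residual


index, residual = align_phase(symbol_offset=0.30, clock_period=1.0, n_phases=8)
print(index, round(residual, 3))
```

With N phases the fine delay cell only has to cover about one phase step instead of the full clock period, which is the relaxation of tuning range mentioned above. A practical delay cell can only add delay, so the signed residual would need a fixed offset in a real design.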

This intelligent control of the sampling point allows us to simplify the data converter architecture. In our proposed architecture, a conventional SAR ADC with a greatly reduced number of bits and low power is adopted for amplitude detection. This time-processing receiver architecture is suitable for low-power communications such as portable cellular and IoT communications. It simplifies not only the RF circuits and data converters, but also the baseband processor, leading to ultra-low power consumption and greatly extended battery life.



Fig. 5.19 A comparison between Cartesian I/Q converter and polar converter.

#### 5.3.2 APSK Modulation for Polar Transceiver

Another reason to pursue a polar receiver is that, compared with a conventional Cartesian I/Q receiver, it requires fewer data-converter bits for a signal with the same SNR [135], [136]. Fig. 5.19 explains this advantage in an intuitive way. The first plot illustrates a Cartesian I/Q conversion space. The dashed lines indicate the data converters' quantization levels. In this example, the signals in both the I and Q paths are quantized into 8 levels, giving 64 quanta in total. In the polar case, shown in the second plot, the space is quantized into 16 phases and 4 amplitude levels, again 64 quanta. Note that whether the input signal is large or small, the I/Q converter's quantization cell (small box) remains the same size, while the polar converter has coarser quantization cells when the input signal is large and finer ones when the signal amplitude is small. In other words, the polar data converter's quantization steps are automatically adjusted based on the input signal's amplitude, leading to improved SNR with the same number of bits for the data converters. Because all the phase boundaries converge at the origin, the phase resolution becomes arbitrarily fine as the signal approaches zero.
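The amplitude-dependent cell size can be made concrete with a short calculation. The sketch uses the 16 phases and 8 Cartesian levels of the example, and it assumes a full scale normalized to 1 for both converters; the function names are hypothetical.

```python
import math


def polar_cell_arc(radius, n_phases=16):
    """Arc length of one phase quantum at a given radius."""
    return radius * 2.0 * math.pi / n_phases


def cartesian_step(full_scale=1.0, n_levels=8):
    """Side of one I/Q quantization box over the range [-full_scale, +full_scale]."""
    return 2.0 * full_scale / n_levels


for r in (0.25, 0.5, 0.75, 1.0):
    print(r, round(polar_cell_arc(r), 3), cartesian_step())
```

Under these assumptions the polar cell is narrower than the Cartesian box at small radii and wider at large radii, which is the automatic scaling of the quantization step described above.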



Fig. 5.20 Minimum amplitude and phase number of bits for rectangular 64-QAM with 25dB EVM.

However, the combination of a polar transceiver and a rectangular QAM signal is not an efficient choice. The QAM modulation widely adopted in commercial wireless communication protocols, such as Wi-Fi and 4G cellular, was developed for I/Q transceiver architectures. For example, a minimum data length of 4 bits in both the I-path and the Q-path is adequate to construct a 64-QAM constellation. As illustrated in Fig. 5.20, when transformed to a polar architecture, at least six bits in the amplitude path and eight bits in the phase path are required to meet the 25dB EVM specification of the Wi-Fi 64-QAM modulation protocol. As a result, the benefits of the polar architecture cannot be fully utilized when dealing with rectangular modulations.

The polar architecture has intrinsic advantages with circular geometry. Just as circular and spherical problems are easier to solve in polar coordinates, a circularly arranged constellation is more suitable for a polar architecture. Amplitude and phase shift keying (APSK), also known as circular QAM, is therefore introduced to fit the polar transceiver. The proposed *M*-ary APSK (M = 4, 16, 64, 256, 1024, ...) constellation is defined as:

$$\chi = \frac{A}{2^{\left(\frac{\log_2 M}{2} - 1\right)}}\, p\, e^{j\left(\frac{2\pi}{2^{\left(\frac{\log_2 M}{2} + 1\right)}}\right)q}$$
$$= \frac{A}{\sqrt{M}/2}\, p\, e^{j\left(\frac{\pi}{\sqrt{M}}\right)q} \quad \text{for } p=1, 2, ..., \sqrt{M}/2;\ q=1, 2, ..., 2\sqrt{M}$$
(5.1)

where A is the maximum amplitude of the signal, p indexes the amplitude levels of the M-ary APSK, and q indexes its phases. In order to reduce the PAPR, the phase always has two more bits than the amplitude, namely  $\log_2(\max p) + 2 = \log_2(\max q)$ . Fig. 5.21 gives an example of an APSK constellation arrangement, which contains 4 amplitude levels and 16 phase levels. The PAPR is defined as the ratio of the signal peak power to the RMS power, where

$$\text{peak power} = \max_{i}\left(I(i)^2 + Q(i)^2\right)$$

$$\text{RMS power} = \frac{\sum_{i=1}^{M} \left(I(i)^2 + Q(i)^2\right)}{M}$$

$$PAPR(\text{dB}) = 10 \log_{10} \frac{\text{peak power}}{\text{RMS power}}, \quad i=1, 2, ..., M$$
(5.2)


Fig. 5.21 Rectangular 64-QAM and 64-APSK constellations.

A PAPR comparison between rectangular QAM and the proposed APSK is given in Table 5-1.

| M-ary           | 4 | 16   | 64   | 256  | 1024 | 4096 |
|-----------------|---|------|------|------|------|------|
| QAM PAPR (dB)   | 0 | 2.55 | 3.68 | 4.23 | 4.50 | 4.64 |
| APSK PAPR (dB)  | 0 | 2.04 | 3.29 | 4.00 | 4.37 | 4.57 |
| Difference (dB) | 0 | 0.51 | 0.39 | 0.23 | 0.13 | 0.07 |

Table 5-1 PAPR comparison between rectangular QAM and proposed APSK
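The entries of Table 5-1 can be checked with a short script that applies the PAPR definition of (5.2) to both constellations. The sketch assumes an APSK with  $\sqrt{M}/2$  equally spaced amplitude rings and  $2\sqrt{M}$  phases (4 rings and 16 phases for M = 64, as in the text); the function names are hypothetical.

```python
import cmath
import math


def papr_db(points):
    """PAPR in dB: peak power over mean (RMS) power of the constellation."""
    powers = [abs(z) ** 2 for z in points]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def rect_qam(m):
    """Rectangular M-QAM on the odd-integer grid (..., -3, -1, 1, 3, ...)."""
    side = math.isqrt(m)
    levels = [2 * k - side + 1 for k in range(side)]
    return [complex(i, q) for i in levels for q in levels]


def apsk(m, a=1.0):
    """M-ary APSK with sqrt(M)/2 amplitude rings and 2*sqrt(M) phases."""
    root = math.isqrt(m)
    n_amp, n_phase = root // 2, 2 * root
    return [a / n_amp * p * cmath.exp(1j * math.pi / root * q)
            for p in range(1, n_amp + 1)
            for q in range(1, n_phase + 1)]


for m in (4, 16, 64, 256, 1024, 4096):
    print(m, round(papr_db(rect_qam(m)), 2), round(papr_db(apsk(m)), 2))
```

For M = 64, for instance, the rectangular grid gives a peak-to-mean power ratio of 98/42 and the APSK rings give 16/7.5, which correspond to the 3.68 dB and 3.29 dB entries of the table.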

Although some studies [137], [138] suggest that it is difficult to claim that APSK outperforms QAM under ideal conditions, the situation reverses when the impact of non-ideal effects is considered, for instance phase noise, nonlinearity, and multipath [139]. In fact, APSK benefits from its low PAPR, which makes it more suitable for power amplifiers with nonlinear characteristics, such as CMOS-based DPAs, in portable or IoT applications.



Fig. 5.22 Rectangular 64-QAM and 64-APSK constellations with phase noise.



Fig. 5.23 Rectangular 64-QAM and 64-APSK constellations with nonlinear distortion.



Fig. 5.24 Rectangular 64-QAM and 64-APSK constellations with thermal noise.

The radii of the phase noise contribution and of the nonlinear distortion are proportional to the signal's amplitude. Correspondingly, the size of the APSK quantum also increases with the signal's amplitude, which partially compensates for these non-ideal effects. Fig. 5.22 and Fig. 5.23 show the comparison between rectangular 64-QAM and the proposed 64-APSK with phase noise and nonlinear distortion, indicating the robustness of APSK to these non-ideal effects. However, the absolute distance between constellation points in the lower-amplitude circles of the APSK is smaller than in the rectangular QAM constellation. As a result, the APSK modulation is less robust in a thermal-noise-dominated system, as shown in Fig. 5.24.

#### 5.3.3 Simulation Results of the Proposed Time-Based Polar Receiver

The proposed polar receiver is able to demodulate both I/Q-based QAM signals and polar-based APSK signals. By adopting the TDC technologies introduced in previous chapters, the phase discrimination ability is greatly improved. The proposed polar receiver supports up to 1024-ary APSK modulation at a baseband rate of 200Msps. System-level and schematic-level simulations indicate a promising future in low-power, high-throughput wireless applications. Fig. 5.25 and Fig. 5.26 present the simulation results for 64-QAM and 64-APSK signals.

Multiple simulations are carried out to compare the performance of rectangular QAM and APSK modulations with the proposed polar receiver in the presence of common non-ideal factors, such as phase noise, nonlinear distortion, and thermal noise. Bit error rate (BER) is chosen as the criterion to judge the performance of the two modulation types. As expected, 64-APSK outperforms conventional 64-QAM when phase noise or nonlinear distortion is present. The comparisons are given in Fig. 5.27 and Fig. 5.28. The thermal-noise case is different. Theoretically, rectangular QAM is less sensitive to thermal noise owing to its uniform quantizer. However, a modulation built on the Cartesian plane requires more bits to resolve in the polar domain, which results in degraded performance in practice, as shown in Fig. 5.29. The deviation between the simulated and theoretical results mainly comes from the finite TDC and ADC resolutions, namely quantization noise, and the limited sub-sampling gain, which is the ratio of the carrier frequency to the baseband frequency. Although an increased ratio can minimize the deviation, it unavoidably adds implementation complexity.



Fig. 5.25 Proposed polar receiver with 64-QAM signal: top, transmitter baseband I/Q signals; middle, transmitted RF signal; bottom, constellations of TX baseband, RF band, and RX received baseband.



Fig. 5.26 Proposed polar receiver with proposed 64-APSK signal: top, transmitter baseband I/Q signals; middle, transmitted RF signal; bottom, constellations of TX baseband, RF band, and RX received baseband.



Fig. 5.27 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of phase noise.



Fig. 5.28 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of nonlinearity distortion.


Fig. 5.29 Comparison of the BER vs. SNR performance of the rectangular 64-QAM signal and the proposed 64-APSK signal when only considering the effect of thermal noise.

## **5.4 Conclusion**

A CMOS THz pico-radio concept is proposed in this work that achieves extreme radio miniaturization to enable future "invisible" field-deployable sensor networks and IoT applications. The bidirectional circuit-sharing architecture configures the THz pico-radio as an OOK-modulated harmonic oscillator in the TX mode or as a super-harmonic SRR that offers high sensitivity at significantly lower DC power than a conventional coherent down-conversion RX. A TDC is integrated on-chip to directly produce digitized RX outputs with only 0.75 mW of power, minimizing any post signal processing. Compared with the reported Si-based 300 GHz TRXs, this bidirectional THz pico-radio achieves the lowest DC power (49.7 mW) and the longest communication distance (50 cm) without any Si lens. It also achieves the highest radio miniaturization (>7.8× radio size reduction), a competitive data rate (maximum 4.4 Mb/s with BER<10<sup>-7</sup>), and competitive DC power compared with existing low-power radios at MHz, GHz, and mm-wave frequencies. The reported TX/RX power consumption is the peak DC power during radio operation. Heavily duty-cycled operation, typical for sensor nodes and IoT devices, will substantially reduce the average DC power of the THz pico-radio to the $\mu$W level.
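As a rough illustration of the duty-cycling argument, the sketch below estimates the average power from the reported 49.7 mW peak figure. The duty-cycle values and the zero sleep-power default are assumptions made for illustration and are not measured results from this work.

```python
def average_power_uw(peak_mw, duty_cycle, sleep_uw=0.0):
    """Average DC power in microwatts for a duty-cycled radio.

    peak_mw    : peak DC power while the radio is active (mW)
    duty_cycle : fraction of time the radio is active, 0..1 (assumed value)
    sleep_uw   : power drawn while idle (uW); zero is an idealizing assumption
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    active_uw = peak_mw * 1e3  # convert mW to uW
    return duty_cycle * active_uw + (1.0 - duty_cycle) * sleep_uw


if __name__ == "__main__":
    for duty in (1e-2, 1e-3, 1e-4):  # hypothetical duty cycles
        print(f"duty cycle {duty:.2%}: {average_power_uw(49.7, duty):.2f} uW")
```

Under these assumptions, a duty cycle of about 0.1% or lower keeps the average at roughly 50 µW or less. A real node would also draw a nonzero sleep power, which sets a floor on the average.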

The other reported application is the time-to-digital converter (TDC) based hybrid polar data converter for receivers, together with the time-domain signal processing procedure that corresponds to it. Polar conversion achieves better SNR tolerance owing to phase convergence near the origin. The proposed architecture is formed with a TDC for phase detection and an ADC for amplitude capture. The ADC's sampling position is guided by the TDC's output. With this mechanism, both the ADC and the TDC can use fewer bits and convert data without oversampling. By adding precisely controlled tunable delay cells and a gain compensator, this hybrid data converter can demodulate the baseband signal directly without any digital CORDIC algorithm or other complicated digital signal processing. Thus, the proposed architecture achieves lower power consumption and design complexity than conventional I/Q receivers.
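The TDC-guided sampling idea can be sketched behaviorally as follows. This is a minimal model under stated assumptions, not the circuit implemented in this work: the carrier frequency, TDC step, ADC width, and the function name `polar_demod` are all hypothetical, and the received signal is idealized as a noiseless tone $s(t) = A\cos(2\pi f_c t + \varphi)$.

```python
import math


def polar_demod(amplitude, phase, f_c=1.0e9, tdc_lsb=1e-12,
                adc_bits=6, full_scale=1.0):
    """Recover (amplitude, phase) of s(t) = A*cos(2*pi*f_c*t + phase).

    All default parameter values are illustrative assumptions.
    """
    period = 1.0 / f_c
    # Rising zero crossing: 2*pi*f_c*t + phase = -pi/2 (mod 2*pi).
    t_zero = ((-math.pi / 2 - phase) / (2 * math.pi * f_c)) % period
    # TDC: quantize the crossing time to an integer number of LSBs.
    tdc_code = round(t_zero / tdc_lsb)
    t_meas = tdc_code * tdc_lsb
    # The phase follows directly from the TDC code.
    phase_est = (-math.pi / 2 - 2 * math.pi * f_c * t_meas) % (2 * math.pi)
    # The TDC output steers the ADC to the waveform peak, a quarter
    # period after the rising zero crossing, so one sample suffices.
    t_sample = t_meas + period / 4
    sample = amplitude * math.cos(2 * math.pi * f_c * t_sample + phase)
    adc_lsb = full_scale / (2 ** adc_bits)
    amp_est = round(sample / adc_lsb) * adc_lsb
    return amp_est, phase_est


if __name__ == "__main__":
    amp, ph = polar_demod(0.75, math.pi / 4)
    # I/Q values are computed here only to display the constellation point.
    print(amp, ph, amp * math.cos(ph), amp * math.sin(ph))
```

In this model the amplitude and phase are read out directly in polar form, which is why no rectangular-to-polar (CORDIC-style) conversion appears anywhere in the flow.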

## **Chapter 6** Summary of the Works

This work explores time-to-digital conversion techniques, designs, and applications for time-domain signal processing. Although TDC devices and integrated circuits have existed for decades, recent designs have been revolutionized by advances in deep sub-micron CMOS processes. Many of the performance limitations highlighted in early literature, such as the transition speed and power consumption of the transistors, no longer apply to designs in modern integrated circuit processes. The history of TDC evolution is introduced at the beginning of the chapter. Basic knowledge and design specifications, such as resolution, detectable range, INL, and DNL, are clarified and listed for reference. The relationship between TDC transfer function linearity and dynamic performance is explained. Several commonly used TDC architectures are explained in detail and analyzed with respect to resolution, detectable range, linearity performance, and power consumption.
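For reference, DNL and INL can be computed from measured code step widths as in the sketch below. It follows one common convention (LSB taken as the average step width, INL as the running sum of DNL); the function name and the sample data are assumptions for illustration, not measurements from this work.

```python
def dnl_inl(step_widths_ps):
    """Return (lsb, dnl, inl) from the measured width of each TDC code step.

    step_widths_ps : width of every code bin in picoseconds
    dnl and inl are expressed in LSB.
    """
    if not step_widths_ps:
        raise ValueError("need at least one step width")
    # Average step width serves as the LSB (end-point fit convention).
    lsb = sum(step_widths_ps) / len(step_widths_ps)
    # DNL: deviation of each step from the ideal LSB.
    dnl = [(w - lsb) / lsb for w in step_widths_ps]
    # INL: accumulated DNL up to and including each code.
    inl = []
    running = 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return lsb, dnl, inl


if __name__ == "__main__":
    lsb, dnl, inl = dnl_inl([1.0, 1.5, 0.5, 1.0])  # made-up widths in ps
    print(lsb, max(abs(d) for d in dnl), max(abs(i) for i in inl))
```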

Two different high-performance TDC designs are presented. The first one, achieving 1.25 ps temporal resolution with an 8-bit range, is implemented for high-linearity applications. A spiral comparator array is proposed to enlarge the TDC detection range and improve the linearity. Two  $2^{nd}$  order  $\Delta\Sigma$  modulators are utilized to lower the quantization errors of the DTC-based unit delay cells and to randomize the periodic folding errors of the 2-D comparator array. With an 80 MHz reference clock, the measured maximum DNL and INL of the proposed TDC are 0.25 LSB and 0.34 LSB, respectively. The TDC power consumption is greatly reduced with the adaptive power control switches. Fabricated in 45 nm CMOS technology, the TDC prototype consumes 70-690  $\mu$W under a 1 V power supply at a conversion rate of 80 MHz. It achieves 1.67 ps effective resolution and a FoM<sub>ADC</sub> of 0.016 pJ/conv-step, advancing the state of the art in high-performance TDC designs. The goal of the second TDC design is to accomplish a large detectable range. A total range of 14 bits (1.6 ns) is reached with a fine resolution of 1 ps and excellent differential nonlinearity (DNL)/integral nonlinearity (INL) of 0.41 ps/0.79 ps owing to the novel techniques. Using a large-feature-size technology (130 nm CMOS), the presented TDC achieves improved performance compared with state-of-the-art TDC designs in smaller-feature-size processes.
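The order of magnitude of the reported figure of merit can be checked with a Walden-style expression, $P / (2^{N} f_s)$. Both this formula and the 330 µW operating point (picked from inside the reported 70-690 µW range) are assumptions here; the exact FoM definition used for the measurement is not restated in this summary.

```python
def walden_fom_pj(power_w, n_bits, conv_rate_hz):
    """Energy per conversion step in picojoules: P / (2**N * f_s)."""
    if power_w < 0 or conv_rate_hz <= 0:
        raise ValueError("power must be >= 0 and conversion rate > 0")
    joules_per_step = power_w / ((2 ** n_bits) * conv_rate_hz)
    return joules_per_step * 1e12  # J -> pJ


if __name__ == "__main__":
    # Assumed operating point: 330 uW, 8 bits, 80 MHz conversion rate.
    print(f"{walden_fom_pj(330e-6, 8, 80e6):.4f} pJ/conv-step")
```

With these assumed inputs the expression evaluates to about 0.016 pJ per conversion step, in line with the reported value.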

With the aid of high-performance TDCs, a fractional-*N* ADPLL with automatic TDC linearization is presented to demonstrate the narrowing gap between digital and analog PLLs. The proposed ADPLL achieves an in-band phase noise of -107 dBc/Hz and a highest close-in fractional spur level of -55 dBc. By using a ramp signal generated from the existing fractional frequency synthesis blocks, the loop can automatically adjust the TDC's fast and slow delays to achieve the best linearity for fractional spur reduction. A digi-phase canceller with an automatic TDC gain tracking loop is implemented to further suppress the fractional spurs. The proposed 3-step TDC provides fine resolution and a wide detectable range with minimal hardware.

Owing to their pico-second and even sub-pico-second time interval detectability, TDCs pave the way for time-domain signal processing in many potential research directions. Two communication-system-related time-domain signal processing applications are included in this work: a TDC-based digital super-regenerative bidirectional IoT node and a hybrid time-assisted data converter for a polar receiver system. The THz pico-radio node achieves extreme radio miniaturization to enable future "invisible" field-deployable sensor networks and IoT applications. The bidirectional circuit-sharing architecture configures the THz pico-radio as an OOK-modulated harmonic oscillator in the TX mode or as a super-harmonic SRR that offers high sensitivity at significantly lower DC power than a conventional coherent down-conversion RX. A TDC is integrated on-chip to directly produce digitized RX outputs with only 0.75 mW of power, minimizing any post signal processing. It achieves high radio miniaturization, a competitive data rate, and competitive DC power. The reported TX/RX power consumption is the peak DC power during radio operation. Secondly, the time-to-digital converter (TDC) based hybrid polar data converter for receivers and the corresponding time-domain signal processing procedure are presented. Polar conversion achieves better SNR tolerance owing to phase convergence near the origin. The proposed architecture is formed with a TDC for phase detection and an ADC for amplitude capture. The ADC's sampling position is guided by the TDC's output. With this mechanism, both the ADC and the TDC can use fewer bits and convert data without oversampling. By adding precisely controlled tunable delay cells and a gain compensator, this hybrid data converter can demodulate the baseband signal directly without any complicated digital signal processing algorithm. Thus, the proposed architecture achieves lower power consumption and design complexity than conventional I/Q receivers.

## References

- [1] Jack S. Kilby, "Miniaturized Electronic Circuits," United States Patent Office, US Patent 3,138,743, filed 6 February 1959, issued 23 June 1964.
- [2] Winston, Brian. "Media Technology and Society: A History: From the Telegraph to the Internet," Routledge. p. 221. ISBN 978-0-415-14230-4, 1998.
- [3] "Milestones: First Semiconductor Integrated Circuit (IC), 1958". IEEE Global History Network. IEEE. Retrieved 3 August 2011.
- [4] Walt Kester, "The Data Conversion Handbook," Newnes, ISBN 0-7506-7841-0, Dec. 2005.
- [5] A. B. Gregene and H. R. Camenzind, "Frequency-selective integrated circuits using phaselock techniques," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 4, no. 4, pp. 216-225, Aug 1969.
- [6] A. Grebene and H. Camenzind, "Phase locking as a new approach for tuned integrated circuits," 1969 IEEE International Solid-State Circuits Conference (ISSCC). Digest of Technical Papers, Philadelphia, PA, USA, 1969, pp. 100-101.
- [7] Y. Arai and T. Baba, "A CMOS time to digital converter VLSI for high-energy physics," *Symposium 1988 on VLSI Circuits*, Tokyo, Japan, 1988, pp. 121-122.
- [8] T. Rahkonen, J. Kostamovaara and S. Saynajakangas, "A CMOS ASIC time-to-digital converter for short time interval measurements," *IEEE International Symposium on Circuits* and Systems, Portland, OR, 1989, pp. 2092-2095 vol.3.
- [9] R. B. Staszewski, S. Vemulapalli, P. Vallur, et al. "1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS," in *IEEE Transactions on Circuits and Systems II: Express Briefs (TCASII)*, vol. 53, no. 3, pp. 220-224, March 2006.
- [10] B. Shen, G. Unruh, M. Lugthart, et al. "An 8.5 mW, 0.07 mm<sup>2</sup> ADPLL in 28 nm CMOS with sub-ps resolution TDC and < 230 fs RMS jitter," 2013 Symposium on VLSI Circuits, Kyoto, 2013, pp. C192-C193.</p>

- [11] A. I. Hussein, S. Vasadi and J. Paramesh, "A 450 fs 65-nm CMOS millimeter-wave timeto-digital converter using statistical element selection for all-digital PLLs," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 53, no. 2, pp. 357-374, Feb. 2018.
- [12] S. Alahdab, A. Mäntyniemi and J. Kostamovaara, "A time-to-digital converter (TDC) with a 13-bit cyclic time domain successive approximation interpolator with sub-ps-level resolution using current DAC and differential switch," 2013 IEEE 56<sup>th</sup> International Midwest Symposium on Circuits and Systems (MWSCAS), Columbus, OH, 2013, pp. 828-831.
- [13] W. El-Halwagy, P. Mousavi and M. Hossain, "A 79dB SNDR, 10MHz BW, 675MS/s open-loop time-based ADC employing a 1.15ps SAR-TDC," 2016 IEEE Asian Solid-State Circuits Conference (A-SSCC), Toyama, 2016, pp. 321-324.
- [14] D. Uchida, M. Ikebe, J. Motohisa and E. Sano, "A 12-bit, 5.5-μW single-slope ADC using intermittent working TDC with multi-phase clock signals," 2014 21<sup>st</sup> IEEE International Conference on Electronics, Circuits and Systems (ICECS), Marseille, 2014, pp. 770-773.
- [15] J. Daniels, W. Dehaene, M. S. J. Steyaert and A. Wiesbauer, "A/D Conversion Using Asynchronous Delta-Sigma Modulation and Time-to-Digital Conversion," in *IEEE Transactions on Circuits and Systems I: Regular Papers (TCASI)*, vol. 57, no. 9, pp. 2404-2412, Sept. 2010.
- [16] R. B. Staszewski, D. Leipold, O. Eliezer, et al., "A 24mm<sup>2</sup> quad-band single-chip GSM radio with transmitter calibration in 90nm digital CMOS," 2008 IEEE International Solid-State Circuits Conference (ISSCC) - Digest of Technical Papers, San Francisco, CA, 2008, pp. 208-607.
- [17] R. Winoto, A. Olyaei, M. Hajirostam, et al., "9.4 A 2x2 WLAN and Bluetooth combo SoC in 28nm CMOS with on-chip WLAN digital power amplifier, integrated 2G/BT SP3T switch and BT pulling cancelation," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2016, pp. 170-171.
- [18] F. W. Kuo; S. B. Ferreira, H. R. Chen, *et al.*, "A Bluetooth low-energy transceiver with 3.7-mW all-digital transmitter, 2.75-mW high-IF discrete-time Receiver, and TX/RX switchable on-chip matching network," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 52, no. 4, pp. 1144-1162, April 2017.
- [19] M. Ding, X. Wang, P. Zhang, et al., "A 0.8V 0.8mm<sup>2</sup> bluetooth 5/BLE digital-intensive transceiver with a 2.3mW phase-tracking RX utilizing a hybrid loop filter for interference resilience in 40nm CMOS," 2018 IEEE International Solid - State Circuits Conference -(ISSCC), San Francisco, CA, 2018, pp. 446-448.

- [20] D. Tasca, M. Zanuso, G. Marzin, *et al.*, "A 2.9–4.0-GHz fractional-N digital PLL with bang-bang phase detector and 560-fsrms integrated jitter at 4.5-mW power," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 46, no. 12, pp. 2745-2758, Dec. 2011.
- [21] E. Temporiti, C. Weltin-Wu, D. Baldi, *et al.*, "A 3 GHz fractional all-digital PLL with a 1.8 MHz bandwidth implementing spur reduction techniques," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 44, no. 3, pp. 824-834, March 2009.
- [22] E. Temporiti, C. Weltin-Wu, D. Baldi, et al., "A 3.5 GHz wideband ADPLL with fractional spur suppression through TDC dithering and feedforward compensation," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 45, no. 12, pp. 2723-2736, Dec. 2010.
- [23] T. Tokairin, M. Okada, M. Kitsunezuka, et al., "A 2.1-to-2.8-GHz low-phase-noise alldigital frequency synthesizer with a time-windowed time-to-digital converter," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 45, no. 12, pp. 2582-2590, Dec. 2010.
- [24] L. Lou, B. Chen, K. Tang, *et al.*, "An ultra-wideband low-power ADPLL chirp synthesizer with adaptive loop bandwidth in 65nm CMOS," *2016 IEEE Radio Frequency Integrated Circuits Symposium (RFIC)*, San Francisco, CA, 2016, pp. 35-38.
- [25] G. Marzin, S. Levantino, C. Samori and A. L. Lacaita, "A 20 Mb/s phase modulator based on a 3.6 GHz digital PLL with -36 dB EVM at 5 mW power," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 47, no. 12, pp. 2974-2988, Dec. 2012.
- [26] F. W. Kuo, S. Pourmousavian, T Siriburanon, et al., "A 0.5V 1.6mW 2.4GHz fractional-N all-digital PLL for Bluetooth LE with PVT-insensitive TDC using switched-capacitor doubler in 28nm CMOS," 2017 Symposium on VLSI Circuits, Kyoto, 2017, pp. C178-C179.
- [27] H. Liu, Z. Sun, D. Tang, et al., "An ADPLL-centric bluetooth low-energy transceiver with 2.3mW interference-tolerant hybrid-loop receiver and 2.9mW single-point polar transmitter in 65nm CMOS," 2018 IEEE International Solid -State Circuits Conference - (ISSCC), San Francisco, CA, 2018, pp. 444-446.
- [28] H. J. Song and T. Nagatsuma, "Present and Future of Terahertz Communications," in *IEEE Transactions on Terahertz Science and Technology*, vol. 1, no. 1, pp. 256-263, Sept. 2011.
- [29] E. Ojefors, U. R. Pfeiffer, A. Lisauskas and H. G. Roskos, "A 0.65 THz Focal-Plane Array in a Quarter-Micron CMOS Process Technology," in *IEEE Journal of Solid-State Circuits* (JSSC), vol. 44, no. 7, pp. 1968-1976, July 2009.
- [30] T. W. Crowe, W. L. Bishop, D. W. Porterfield, *et al.*, "Opening the terahertz window with integrated diode circuits," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 40, no. 10, pp. 2104-2110, Oct. 2005.

- [31] T. Chi, H. Wang, M. Y. Huang, et al., "A bidirectional lens-free digital-bits-in/-out 0.57mm<sup>2</sup> Terahertz nano-radio in CMOS with 49.3mW peak power consumption supporting 50cm Internet-of-Things communication," 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, 2017, pp. 1-4.
- [32] B. K. Swann, B.J. Blalock, L.G. Clonts, et al., "A 100-ps time-resolution CMOS time-todigital converter for positron emission tomography imaging applications," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 39, no. 11, pp. 1839-1852, Nov. 2004.
- [33] E. J. Gerds, J. Van der Spiegel, R. Van Berg, *et al.*, "A CMOS time to digital converter IC with 2 level analog CAM," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 29, no. 9, pp. 1068-1076, Sep 1994.
- [34] M. A. Thompson, M. W. Werner, R. R. Egan, *et al.*, "Free running time to digital converter with 1 nanosecond resolution," in *IEEE Transactions on Nuclear Science*, vol. 35, no. 1, pp. 184-186, Feb. 1988.
- [35] P. Dudek, S. Szczepanski and J. V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 35, no. 2, pp. 240-247, Feb. 2000.
- [36] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse–fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 43, no. 4, pp. 769-777, April 2008.
- [37] P. Y. Chiang, Z. Wang, O. Momeni and P. Heydari, "A silicon-based 0.3 THz frequency synthesizer with wide locking range," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 49, no. 12, pp. 2951-2963, Dec. 2014.
- [38] M. Seo, M. Urteaga, J. Hacker, *et al.*, "InP HBT IC technology for terahertz frequencies: fundamental oscillators up to 0.57 THz," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 46, no. 10, pp. 2203-2214, Oct. 2011.
- [39] R. Han, C. Jiang, A. Mostajeran, et al., "25.5 A 320GHz phase-locked transmitter with 3.3mW radiated power and 22.5dBm EIRP for heterodyne THz imaging systems," 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, San Francisco, CA, 2015, pp. 1-3.
- [40] Z. Xu, M. Miyahara, K. Okada and A. Matsuzawa, "A 3.6 GHz low-noise fractional-N digital PLL using SAR-ADC-based TDC," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 51, no. 10, pp. 2345-2356, Oct. 2016.

- [41] A. Sai, S. Kondo, T. T. Ta, et al., "19.7 A 65nm CMOS ADPLL with 360uW 1.6ps-INL SS-ADC-based period-detection-free TDC," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2016, pp. 336-337.
- [42] T. Rahkonen, J. Kostamovaara and S. Saynajakangas, "Time interval measurements using integrated tapped CMOS delay lines," *Proceedings of the 32<sup>nd</sup> Midwest Symposium on Circuits* and Systems, Champaign, IL, 1989, pp. 201-205 vol.1.
- [43] A. Elshazly, S. Rao, B. Young and P. K. Hanumolu, "A noise-shaping time-to-digital converter using switched-ring oscillators—analysis, design, and measurement techniques," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 49, no. 5, pp. 1184-1197, May 2014.
- [44] J. P. Caram, J. Galloway and J. S. Kenney, "Time-to-digital converter with sample-andhold and quantization noise scrambling using harmonics in ring oscillators," in *IEEE Transactions on Circuits and Systems I: Regular Papers (TCASI)*, vol. 65, no. 1, pp. 74-83, Jan. 2018.
- [45] C. L. Wei and S. I. Liu, "A digital pll using oversampling delta-sigma TDC," in *IEEE Transactions on Circuits and Systems II: Express Briefs (TCASII)*, vol. 63, no. 7, pp. 633-637, July 2016.
- [46] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 44, no. 4, pp. 1089-1098, April 2009.
- [47] T. Konishi, K. Okuno, S. Izumi, *et al.*, "A 61-dB SNDR 700 μm<sup>2</sup> second-order all-digital TDC with low-jitter frequency shift oscillators and dynamic flipflops," 2012 Symposium on VLSI Circuits (VLSI), Honolulu, HI, 2012, pp. 190-191.
- [48] S. K. Lee, Y. H. Seo, H. J. Park and J. Y. Sim, "A 1 GHz ADPLL with a 1.25 ps minimumresolution sub-exponent TDC in 0.18um CMOS," in *IEEE Journal of Solid-State Circuits* (JSSC), vol. 45, no. 12, pp. 2874-2881, Dec. 2010.
- [49] K. Kim, Y. H. Kim, W. Yu and S. Cho, "A 7 bit, 3.75 ps resolution two-step time-to-digital converter in 65 nm CMOS using pulse-train time amplifier," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 48, no. 4, pp. 1009-1017, April 2013.
- [50] M. Lee, M. E. Heidari and A. A. Abidi, "A low-noise wideband digital phase-locked loop based on a coarse–fine time-to-digital converter with subpicosecond resolution," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 44, no. 10, pp. 2808-2816, Oct. 2009.

- [51] R. B. Staszewski, K. Muhammad, D. Leipold, *et al.*, "All-digital TX frequency synthesizer and discrete-time receiver for bluetooth radio in 130-nm CMOS," *IEEE J. Solid-state Circuits*, vol. 39, no. 12, pp. 2278–2291, Dec. 2004.
- [52] G. Marzin, S. Levantino, C. Samori, and A. Lacaita, "A 20Mb/s phase modulator based on a 3.6GHz digital PLL with -36dB EVM at 5mW power," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2012, pp. 342-344.
- [53] X. Gao, L. Tee, W. Wu, et al., "A 28nm CMOS digital fractional-N PLL with -245.5dB FOM and a frequency tripler for 802.11abgn/ac radio," in *IEEE Int. Solid-State Circuits Conf.* (ISSCC) Dig. Tech. Papers, Feb. 2015, pp. 1–3.
- [54] F. Bohn, H. Wang, A. Natarajan, S. Jeon, A. Hajimiri, "Fully integrated frequency and phase generation for a 6–18GHz tunable multi-band phased-array receiver in CMOS," in *Proc. IEEE RFIC Symp. Dig. Papers*, Jun. 2008, pp. 439-442.
- [55] R. Gu, A.-L. Yee, Y. Xie and W. Lee, "A 6.25GHz 1V LC-PLL in 0.13 um CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2006, pp. 2442-2451.
- [56] X. Gao, E. A. M. Klumperink, M. Bohsali and B. Nauta, "A 2.2GHz 7.6mW sub-sampling PLL with -126dBc/Hz in-band phase noise and 0.15psrms jitter in 0.18µm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2009, pp. 392-393.
- [57] F. Zhao and F. F. Dai "Low-Noise Low-Power Design for Phase-Locked Loops Multi-Phase High-Performance Oscillators," Springer International Publishing AG, ISBN 978-3-319-12199-4, New York, Nov. 2014.
- [58] R. B. Staszewski, J.L. Wallberg, C.-M. Hung, *et al.*, "All-digital PLL and transmitter for mobile phones," *IEEE J Solid-state Circuits*, vol. 40, no. 12, pp. 2469-2482, Dec. 2005.
- [59] C. M. Hsu, M. Z. Straayer and M. H. Perrott, "A low-noise wide-BW 3.6-GHz digital  $\Delta\Sigma$  fractional-N frequency synthesizer with a noise-shaping time-to-digital converter and quantization noise cancellation," *IEEE J Solid-state Circuits*, vol. 43, no. 12, pp. 2776-2786, Dec. 2008.
- [60] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, "A 2.1-to-2.8-GHz low-phase-noise all-digital frequency synthesizer with a time-windowed time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582–2590, Dec. 2010.
- [61] C. W. Yao, W. F. Loke, R. Ni, et al., "A 14nm fractional-N digital PLL with 0.14psrms jitter and -78dBc fractional spur for cellular RFICs," in *IEEE Int. Solid-State Circuits Conf.* (ISSCC) Dig. Tech. Papers, Feb. 2017, pp. 422-423.

- [62] Y. Wu, M. Shahmohammadi, Y. Chen, P. Lu and R. B. Staszewski, "A 3.5–6.8-GHz widebandwidth DTC-assisted fractional-N all-digital pll with a MASH ΔΣ-TDC for low in-band phase noise," *IEEE J. Solid-State Circuits*, vol. 52, no. 7, pp. 1885-1903, July 2017.
- [63] S. Zheng and H. C. Luong, "A WCDMA/WLAN digital polar transmitter with low-noise ADPLL, wideband PM/AM modulator, and linearized PA," *IEEE J. Solid-State Circuits*, vol. 50, no. 7, pp. 1645-1656, July 2015.
- [64] A. Elkholy, T. Anand, W. S. Choi, A. Elshazly and P. K. Hanumolu, "A 3.7 mW low-noise wide-bandwidth 4.5 GHz digital fractional-N PLL using time amplifier-based TDC," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 867-881, April 2015.
- [65] A. Elkholy, S. Saxena, R. K. Nandwana, A. Elshazly and P. K. Hanumolu, "A 2.0–5.5 GHz wide bandwidth ring-based digital fractional-N PLL with extended range multi-modulus divider," *IEEE J. Solid-State Circuits*, vol. 51, no. 8, pp. 1771-1784, Aug. 2016.
- [66] M. A. Wheatley, L. A. Lepper, and N. K. Webb, "Frequency modulated phase locked loop with fractional divider and jitter compensation," *U.S. Patent*, 5 038 120 A, Aug. 6, 1991.
- [67] R. B. Staszewski, S. Vemulapalli, P. Vallur, J. Wallberg, and P. T. Balsara, "Time-todigital converter for RF frequency synthesis in 90 nm CMOS," in *Proc. IEEE RFIC Symp. Dig. Papers*, Jun. 2005, pp. 473–476.
- [68] P. Dudek, S. Szczepanski, and J. V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," *IEEE J. Solid-State Circuits (JSSC)*, vol. 35, no. 2, pp. 240–247, Feb. 2000.
- [69] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in 0.13 m CMOS technology," *IEEE J. Solid-State Circuits (JSSC)*, vol. 45, no. 4, pp. 830-842, April 2010.
- [70] H. Wang and F. F. Dai, "A 14-Bit, 1-ps resolution, two-step ring and 2D Vernier TDC in 130nm CMOS technology," in *Proc. IEEE Eur. Solid-State Circuits Conf. (ESSCIRC)*, 2017, pp. 143-146.
- [71] Z. Xu, S. Lee, M. Miyahara and A. Matsuzawa, "A 0.84ps-LSB 2.47mW time-to-digital converter using charge pump and SAR-ADC," in *Proc. IEEE Custom Integr. Circuits Conf.* (CICC) Dig. Papers, Sept. 2013, pp. 1-4.
- [72] W. Yu, K. S. Kim and S. H. Cho, "A 0.22 psrms integrated noise 15 MHz bandwidth fourthorder ΔΣ time-to-digital converter using time-domain error-feedback filter," *IEEE J Solid-state Circuits*, vol. 50, no.5, pp. 1251-1262, May. 2015.

- [73] M. Lee and A. A. Abidi, "A 9b, 1.25ps resolution coarse-fine time-to-digital converter in 90nm CMOS that amplifies a time residue," in *Proc. IEEE Int. Symp. VLSI Circuits Dig.*, 2007, pp. 168-169.
- [74] K. Kim, W. Yu and S. Cho, "A 9 bit, 1.12 ps resolution 2.5 b/stage pipelined time-to-digital converter in 65 nm CMOS using time-register," *IEEE J Solid-state Circuits*, vol. 49, no. 4, pp. 1007-1016, April 2014.
- [75] L. Vercesi, A. Liscidini and R. Castello, "Two-dimensions Vernier time-to-digital converter," *IEEE J Solid-state Circuits (JSSC)*, vol. 45, no. 8, pp. 1504-1512, Aug. 2010.
- [76] J. Yu and F. F. Dai, "A 3-Dimensional Vernier Ring Time-to-digital Converter in 0.13µm CMOS," in *Proc. IEEE Custom Integrated Circuits Conf. (CICC) Dig. Papers*, Sept. 2010.
- [77] D. Liao, H. Wang, F. F. Dai, Y. Xu, et al., "An 802.11a/b/g/n digital fractional-N PLL with automatic TDC linearity calibration for spur cancellation," *IEEE J Solid-state Circuits*, vol. 52, no. 5, pp. 1210-1220, May 2017.
- [78] H. Wang, F. Dai, H. Wang, "A 330 $\mu$ W 1.25ps 400fs-INL Vernier time-to-digital converter with 2D reconfigurable spiral arbiter array and 2<sup>nd</sup>-order  $\Delta\Sigma$  linearization," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC) Dig. Papers*, April. 2017.
- [79] J. W. M. Rogers, C. Plett, and F. F. Dai, "Integrated Circuit Design for High-Speed Frequency Synthesis," ARTECH HOUSE PUBLISHERS, INC., ISBN: 1-58053-982-3, Norwood, MA, Feb. 2006.
- [80] J. Rogers, F. F. Dai, M. S. Cavin, and D. G. Rahn, "A multiband ΔΣ fractional-N frequency synthesizer for a MIMO WLAN transceiver FRFIC," *IEEE J Solid-state Circuits*, vol. 40, no. 3, pp. 678-689, March 2005.
- [81] D. Liao, H. Wang, F. F. Dai, Y. Xu and R. Berenguer, "An 802.11 a/b/g/n digital fractional-N PLL with automatic TDC linearity calibration for spur cancellation," in *Proc. IEEE RFIC Symp. Dig. Papers*, May 2016, pp. 134-137.
- [82] Y. J. Chen, K. H. Chang and C. C. Hsieh, "A 2.02–5.16 fJ/conversion step 10-bit hybrid coarse-fine SAR ADC with time-domain quantizer in 90 nm CMOS," *IEEE J Solid-state Circuits*, vol. 51, no. 2, pp. 357-364, Feb. 2016.
- [83] A. Elshazly, S. Rao, B. Young and P. K. Hanumolu, "A noise-shaping time-to-digital converter using switched-ring oscillators—analysis, design, and measurement techniques," *IEEE J Solid-state Circuits*, vol. 49, no. 5, pp. 1184-1197, May 2014.

- [84] S. J. Kim, T. Kim and H. Park, "A 0.63ps, 12b, synchronous cyclic TDC using a time adder for on-chip jitter measurement of a SoC in 28nm CMOS technology," *IEEE Int. Symp. VLSI Circuits Dig.*, 2014, pp. 1-2.
- [85] K. Kim, W. Yu and S. Cho, "A 9b, 1.12ps resolution 2.5b/stage pipelined time-to-digital converter in 65nm CMOS using time-register," *IEEE Int. Symp. VLSI Circuits Dig.*, 2013, pp. C136-C137.
- [86] K. Kim, Y. Kim, W. Yu and S. Cho, "A 7b, 3.75ps resolution two-step time-to-digital converter in 65nm CMOS using pulse-train time amplifier," *IEEE Int. Symp. VLSI Circuits Dig.*, 2012, pp. 192-193.
- [87] Y. H. Seo, J. S. Kim, H. J. Park and J. Y. Sim, "A 1.25 ps resolution 8b cyclic TDC in 0.13 um CMOS," *IEEE J Solid-state Circuits*, vol. 47, no. 3, pp. 736-743, Mar. 2012.
- [88] Y. H. Seo, J. S. Kim, H. J. Park and J. Y. Sim, "A 0.63ps resolution, 11b pipeline TDC in 0.13µm CMOS," *IEEE Int. Symp. VLSI Circuits Dig.*, 2011, pp. 152-153.
- [89] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse–fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *IEEE J Solid-state Circuits*, vol. 43, no. 4, pp. 769-777, April 2008.
- [90] T. Chi, H. Wang, M. Huang, F. Dai, and H. Wang, "A bidirectional lens-free digital-bitsin/-out 0.57mm<sup>2</sup> terahertz nano-radio in CMOS with 49.3mW peak power consumption supporting 50cm internet-of-things communication," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC) Dig. Papers*, April. 2017.
- [91] A. Tang and M. C. F. Chang, "183GHz 13.5mW/pixel CMOS regenerative receiver for mm-wave imaging applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2011, pp. 296-298.
- [92] T. Chi, M.Y. Huang, S Li, H. Wang, "A packaged 90-to-300GHz transmitter and 115-to-325GHz coherent receiver in CMOS for full-band continuous-wave mm-wave hyperspectral imaging," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 304-305
- [93] S. Liu and Y. Zheng, "A low-power and highly linear 14-bit parallel sampling TDC with power gating and DEM in 65-nm CMOS," *IEEE Transactions on VLSI Systems*, vol. 24, no. 3, pp. 1083-1091, March 2016.
- [94] S. J. Kim, W. Kim, M. Song, J. Kim, T. Kim and H. Park, "A 0.6V 1.17ps PVT-tolerant and synthesizable time-to-digital converter using stochastic phase interpolation with 16×

spatial redundancy in 14nm FinFET technology," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2015, pp. 1-3.

- [95] H. Wang, F. F. Dai and H. Wang, "A reconfigurable Vernier time-to-digital converter with 2-D spiral comparator array and second-order  $\Delta\Sigma$  linearization," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 53, no. 3, pp. 738-749, March 2018.
- [96] W. Deng et al., "A Fully Synthesizable All-Digital PLL With Interpolative Phase Coupled Oscillator, Current-Output DAC, and Fine-Resolution Digital Varactor Using Gated Edge Injection Technique," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 50, no. 1, pp. 68-80, Jan. 2015.
- [97] S. Zheng and H. C. Luong, "A CMOS WCDMA/WLAN Digital Polar Transmitter with AM Replica Feedback Linearization," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 48, no. 7, pp. 1701-1709, July 2013.
- [98] C. W. Yao, R. Ni, C. Lau, *et al.*, "A 14-nm 0.14-ps<sub>rms</sub> fractional-n digital PLL with a 0.2-ps resolution ADC-assisted coarse/fine-conversion chopping TDC and TDC nonlinearity calibration," in *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 52, no. 12, pp. 3446-3457, Dec. 2017.
- [99] M. A. Wheatley, L. A. Lepper, and N. K. Webb, "Frequency modulated phase locked loop with fractional divider and jitter compensation," U.S. Patent 5 038 120 A, Aug. 6, 1991.
- [100] H. Wang and K. Sengupta, *RF and Mm-Wave Power Generation in Silicon*. Cambridge, MA, USA: Academic Press, 2015.
- [101] K. Sengupta and A. Hajimiri, "A 0.28 THz power-generation and beam-steering array in CMOS based on distributed active radiators," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 3013–3031, Dec. 2012.
- [102] J. Park, S. Kang, and A. Niknejad, "A 0.38 THz fully integrated transceiver utilizing a quadrature push-push harmonic circuitry in SiGe BiCMOS," *IEEE Symp. VLSI Circuits*, pp. 22–23, June 2011.
- [103] N. Sarmah, et al., "A fully integrated 240-GHz direct-conversion quadrature transmitter and receiver chipset in SiGe technology," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 2, pp. 562–574, Feb. 2016.
- [104] Z. Wang, P.-Y. Chiang, P. Nazari, C.-C. Wang, Z. Chen, and P. Heydari, "A CMOS 210-GHz fundamental transceiver with OOK modulation," *IEEE J. Solid-State Circuits*, vol. 49, no. 3, pp. 564–580, Mar. 2014.

- [105] O. Momeni and E. Afshari, "High power terahertz and millimeter-wave oscillator design: a systematic approach," *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 583–597, Mar. 2011.
- [106] Y. Tousi, O. Momeni, and E. Afshari, "A novel CMOS high-power terahertz VCO based on coupled oscillators: theory and implementation," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 3032–3042, Dec. 2012.
- [107] R. Han *et al.*, "A SiGe terahertz heterodyne imaging transmitter with 3.3 mW radiated power and fully-integrated phase-locked loop," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 2935–2947, Dec. 2015.
- [108] H. Aghasi, A. Cathelin, and E. Afshari, "A 0.92-THz SiGe power radiator based on a nonlinear theory for harmonic generation," *IEEE J. Solid-State Circuits*, vol. 52, no. 2, pp. 406–422, Jan. 2017.
- [109] R. Han et al., "Active terahertz imaging using Schottky diodes in CMOS: array and 860-GHz pixel," IEEE J. Solid-State Circuits, vol. 48, no. 10, pp. 2296–2308, July 2013.
- [110] M. Uzunkol, O. Gurbuz, F. Golcuk, and G. Rebeiz, "A 0.32 THz SiGe 4×4 imaging array using high-efficiency on-chip antennas," *IEEE J. Solid-State Circuits*, vol. 48, no. 9, pp. 2056– 2066, Sep. 2013.
- [111] Y. Yang, O. Gurbuz, and G. Rebeiz, "An eight-element 370–410-GHz phased-array transmitter in 45-nm CMOS SOI with peak EIRP of 8–8.5 dBm," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 12, pp. 4241–4249, Dec. 2016.
- [112] P.-Y. Chiang, Z. Wang, O. Momeni, and P. Heydari, "A silicon-based 0.3 THz frequency synthesizer with wide locking range," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2951–2963, Dec. 2014.
- [113] T. Chi, J. Luo, S. Hu, and H. Wang, "A multi-phase sub-harmonic injection locking technique for bandwidth extension in silicon-based THz signal generation," in *Proc. IEEE Custom Integrated Circuits Conf. (CICC)*, Sep. 2014.
- [114] T. Chi, J. Luo, S. Hu, and H. Wang, "A multi-phase sub-harmonic injection locking technique for bandwidth extension in silicon-based THz signal generation," *IEEE J. Solid-State Circuits*, vol. 50, no. 8, pp. 1861–1873, Aug. 2015.
- [115] T. Chi and H. Wang, "A scalable active THz frequency multiplier chain in silicon with multi-phase sub-harmonic injection locking for bandwidth extension," in *Proc. IEEE Asia-Pacific Microwave Conf. (APMC)*, Dec. 2015.

- [116] Y. Zhao *et al.*, "A 0.56 THz phase-locked frequency synthesizer in 65 nm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 3005–3019, Dec. 2016.
- [117] J. Grzyb, B. Heinemann, and U. Pfeiffer, "A 0.55 THz near-field sensor with a μm-range lateral resolution fully integrated in 130 nm SiGe BiCMOS," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 3063–3076, Dec. 2016.
- [118] Z. Ahmad, M. Lee, and K. O, "1.4THz, -13dBm-EIRP frequency multiplier chain using symmetric- and asymmetric-CV varactors in 65nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2016, pp. 350–351.
- [119] Q. Zhong, W. Choi, C. Miller, R. Henderson, and K. O, "A 210-to-305GHz CMOS receiver for rotational spectroscopy," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2016, pp. 426–427.
- [120] T. Chi, M. Huang, S. Li, and H. Wang, "A packaged 90-to-300GHz transmitter and 115-to-325GHz coherent receiver in CMOS for full-band continuous-wave mm-wave hyperspectral imaging," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 304–305.
- [121] B. Yu, Y. Liu, Y. Ye, J. Ren, X. Liu, and Q. Gu, "High-efficiency micromachined sub-THz channels for low-cost interconnect for planar integrated circuits," *IEEE Trans. Microw. Theory Techn.*, vol. 64, no. 1, pp. 96–105, Jan. 2016.
- [122] A. Tang and M.-C. Chang, "183GHz 13.5mW/pixel CMOS regenerative receiver for mm-wave imaging applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2011, pp. 296–297.
- [123] A. Tang and M.-C. Chang, "Inter-modulated regenerative CMOS receivers operating at 349 and 495 GHz for THz imaging applications," *IEEE Trans. THz Sci. Technol.*, vol. 3, no. 2, pp. 134–140, Dec. 2012.
- [124] S. Li, T. Chi, J. Park, and H. Wang, "A multi-feed antenna for antenna-level power combining," in *Proc. IEEE APS/URSI Symp.*, June 2016.
- [125] S. Li, T. Chi, J. Park, Y. Wang, and H. Wang, "A millimeter-wave dual-feed square loop antenna for 5G communications," *IEEE Trans. Antennas Propag.*, vol. PP, no. 99, pp. 1–12, Oct. 2017.
- [126] W. Choi, C. Cheon, and Y. Kwon, "A V-band single-chip MMIC oscillator array using a 4-port microstrip patch antenna," in *Proc. IEEE MTT-S Int. Microw. Symp. (IMS)*, June 2003, pp. 881–884.

- [127] S. Bowers and A. Hajimiri, "Multi-port driven radiators," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 12, pp. 4428–4441, Dec. 2013.
- [128] P. Nazari, S. Jafarlou, and P. Heydari, "A fundamental-frequency 114GHz circular-polarized radiating element with 14dBm EIRP, -99.3dBc/Hz phase-noise at 1MHz offset and 3.7% peak efficiency," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 322–323.
- [129] T. Chi, F. Wang, S. Li, M. Huang, J. Park, and H. Wang, "A 60GHz on-chip linear radiator with single-element 27.9dBm P<sub>sat</sub> and 33.1dBm peak EIRP using multifeed antenna for direct on-antenna power combining," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 296–297.
- [130] J. Bohorquez, A. Chandrakasan, and J. Dawson, "Frequency-domain analysis of super-regenerative amplifiers," *IEEE Trans. Microw. Theory Techn.*, vol. 57, no. 12, pp. 2882–2894, Dec. 2009.
- [131] F. Moncunill-Geniz, P. Palà-Schönwälder, and O. Mas-Casals, "A generic approach to the theory of superregenerative reception," *IEEE Trans. Circuits Syst. I*, vol. 52, no. 1, pp. 54–70, Jan. 2005.
- [132] J. Bonet-Dalmau, F. Moncunill-Geniz, P. Palà-Schönwälder, F. Águila-López, and R. Giralt-Mas, "Frequency domain analysis of superregenerative receivers in the linear and logarithmic modes," *IEEE Trans. Circuits Syst. I*, vol. 59, no. 5, pp. 1074–1084, May 2012.
- [133] W. Jung, S. Oh, S. Bang, *et al.*, "An ultra-low power fully integrated energy harvester based on self-oscillating switched-capacitor voltage doubler," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2800–2811, Dec. 2014.
- [134] W. Jung, S. Oh, S. Bang, Y. Lee, D. Sylvester, and D. Blaauw, "A 3nW fully integrated energy harvester based on self-oscillating switched-capacitor DC-DC converter," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, San Francisco, CA, 2014, pp. 398–399.
- [135] P. Nazari, B. K. Chun, V. Kumar, *et al.*, "A 130nm CMOS polar quantizer for cellular applications," in *Proc. IEEE Radio Frequency Integrated Circuits Symp. (RFIC)*, Seattle, WA, 2013, pp. 155–158.
- [136] P. Nazari, B. K. Chun, F. Tzeng, and P. Heydari, "Polar quantizer for wireless receivers: theory, analysis, and CMOS implementation," *IEEE Trans. Circuits Syst. I*, vol. 61, no. 3, pp. 877–887, Mar. 2014.

- [137] S. O. Zafra, X. Pang, G. Jacobsen, *et al.*, "Phase noise tolerance study in coherent optical circular QAM transmissions with Viterbi-Viterbi carrier phase estimation," *Optics Express*, vol. 22, no. 25, pp. 30579–30585, 2014.
- [138] M. A. Tariq, H. Mehrpouyan, and T. Svensson, "Performance of circular QAM constellations with time varying phase noise," in *Proc. IEEE 23rd Int. Symp. Personal, Indoor and Mobile Radio Communications (PIMRC)*, Sydney, NSW, 2012, pp. 2365–2370.
- [139] M. Baldi, F. Chiaraluce, A. De Angelis, *et al.*, "A comparison between APSK and QAM in wireless tactical scenarios for land mobile systems," *EURASIP Journal on Wireless Communications and Networking*, vol. 2012, no. 1, p. 317, 2012.

The End