## Design of High Performance Radio-Frequency Phase-Locked Loops

by

Dongyi Liao

A dissertation submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy

Auburn, Alabama August 04, 2018

Keywords: Phase Lock Loop, PLL, Phase Noise, Radio-Frequency, Integrated Circuit

Copyright 2018 by Dongyi Liao

## Approved by

Fa Foster Dai, Chair, Professor of Electrical and Computer Engineering
Bogdan M. Wilamowski, Professor of Electrical and Computer Engineering
Guofu Niu, Professor of Electrical and Computer Engineering
Michael Hamilton, Professor of Electrical and Computer Engineering

## Abstract

The phase-locked loop (PLL) is widely used across electronic systems; wireless transceivers, for example, rely on PLLs to generate carrier signals. The demand for ever-increasing data rates places more stringent requirements on wireless networks, in which the PLL plays a critical role. The next-generation PLL for 5G networks is required to achieve lower phase noise and a higher spurious-free dynamic range while consuming less power and operating over a broader frequency range. In this dissertation, the classic type-II PLL structure is first reviewed and studied, and some of the most critical PLL performance metrics, including noise, power and spurs, are discussed. Next, three designs that I have been involved with during my PhD career are presented, one in each of the following chapters. These PLL designs vary in architecture and emphasize different aspects of performance. In my opinion, they represent some of the future trends in the architecture of next-generation PLLs. Each design adopts novel ideas aimed at improving key performance metrics relative to the conventional architecture. Along with the theoretical analysis and working mechanisms, the PLL design methodology and the simulation setup for key performance metrics are also covered.

## Acknowledgments

It always feels surprisingly sudden when I become aware of the forthcoming end of my career as a student. Having spent almost my entire life in school, I have become very familiar with academic life. On one hand, I am very much looking forward to putting my skills to proper use. On the other hand, looking back on the almost six years that I spent in Auburn, I have learned a lot, not only in knowledge but also in life. Only those who have pursued a Ph.D. degree alone in a foreign land know the hardship and bitterness one has to endure. Fortunately, I am blessed to have some very helpful friends, colleagues and professors. I acknowledge my major advisor, Dr. Fa Dai, for enlightening me about this field and for teaching me the right way to conduct scientific study. Discussions with him always sparked new ideas, which led to most of the designs in this dissertation. I must also acknowledge the other members of my committee: Dr. Bogdan M. Wilamowski, Dr. Guofu Niu and Dr. Michael Hamilton. I would also like to thank Dr. Minseo Park for reviewing this work as the university reader.

In addition, I must express my gratitude to my parents for supporting and loving me through all these years. Without their unconditional care and trust, I would never have come this far.

## Table of Contents

- Abstract
- Acknowledgments
- List of Tables
- List of Figures
- List of Abbreviations
- 1 Introduction
  - References
- 2 General PLL System Analysis
  - 2.1 A Classic Type-II PLL Review
  - 2.2 Phase Noise
  - 2.3 Spurs
  - 2.4 Conclusions
  - References
- 3 A Low-Noise Sub-Sampling Fractional-N PLL
  - 3.1 Robust Locking with Sub-sampling Technique
  - 3.2 Multi-phase generation for fractional-N mode
  - 3.3 System and Building Blocks
  - 3.4 Measurement Results
  - 3.5 Conclusions
  - References
- 4 A 10GHz Reference Sampling PLL in a 5G Synthesizer
  - 4.1 Phase Noise & Ref-Sampling Phase Detector
  - 4.2 Out-band Phase Noise & a Class-C VCO
  - 4.3 PLL Loop Dynamics
  - References
- 5 A Digital PLL with Automatic TDC Linearity Calibration for Spur Cancellation
  - 5.1 Design of a Digital PLL with Low Fractional Spur
  - 5.2 System and Building Blocks
  - 5.3 Measurement Results
  - 5.4 Conclusions
  - References
- 6 Conclusions

## List of Tables

Table 3.1 Measured SSPLL Performances and Comparisons.

Table 4.1. Technical Specification for CCPD-575

Table 4.2 Simulation results for different inductors at 10GHz.

Table 5.1 Measured DPLL performances and comparisons

## List of Figures

- 1.1. Noise injected at different spots of oscillation waveform.
- 2.1. Conceptual diagram of a classic analog PLL.
- 2.2. Architecture diagram of a classic analog PLL.
- 2.3. The phase domain model of a classic analog PLL.
- 2.4. Simulated frequency response of the type-II PLL.
- 2.5. A typical PLL output phase noise and contributors.
- 3.1. Simplified block diagram of the dual loop architecture.
- 3.2. Illustrations of frequency response of proposed architecture
- 3.3. Simplified schematic diagram of the proposed loop switching controller.
- 3.4. Simulated loop gain variation for the proposed soft loop switching scheme
- 3.5. Simulated relocking time versus the percentage of switching threshold over reference period
- 3.6. Simulated phase portrait of the proposed soft switching SSPLL
- 3.7. Realignment of VCO zero crossing and reference edge utilizing a multi-phase VCO.

- 3.8. Capacitive phase interpolation network
- 3.9. Effect of interpolating capacitance value on phase error reduction
- 3.10. Simulated phase error DNL of interpolated output phases
- 3.11. Block diagram of the proposed multi-phase fractional-N SSPLL.
- 3.12. Simplified schematic diagram of the SSPD and CP.
- 3.13. Schematic diagram of the reference buffer with tunable delays.
- 3.14. Schematic diagram of the CML multiplexer.
- 3.15. Schematic diagram of the quadrature capacitive coupled VCO.
- 3.17. Measured phase noise and reference spur
- 3.18. Measured in-band phase noise variation in integer mode
- 3.19. Testing setup for dual loop feedback delay mismatch calibration in integer mode
- 3.20. Measured fractional spur
- 3.21. Measured relocking transient behavior after a supply perturbation
- 3.22. Measured relocking transient behavior
- 4.1. Simplified architecture diagram of the proposed 5G synthesizer.
- 4.2. Schematic diagram of RSPD with the capacitor bank.
- 4.3. Test-bench for in-band phase noise simulation.

- 4.4. pss and pnoise setup for in-band phase noise simulation.
- 4.5. Simulated in-band phase noise.
- 4.6. Illustrations of voltage and current waveform in class-C operation.
- 4.7. Schematic diagram of the proposed class-C VCO.
- 4.8. Voltage and current waveform for class-C operation
- 4.9. Simulated VCO free-running phase noise and VCO specifications
- 4.10. PLL open loop gain and phase over offset frequency.
- 4.11. Simulated overall PLL phase noise with in-band and out-band noise contributions.
- 5.1. Comparison of two digital PLL architectures
- 5.2. Simulated TDC output and the residue signal after digi-phase canceller
- 5.3. Fractional spur level due to residue error at the digi-phase canceller output.
- 5.4. Proposed DPLL block diagram
- 5.5. Proposed 3-step TDC block diagram.
- 5.6. Simulated TDC non-linearity considering common mode error and differential mode error.
- 5.7. Proposed TDC automatic linearity calibration loops.
- 5.8. Measured convergence of TDC common mode and differential mode delays.
- 5.9. Incorrect divider state in the first reference period after ratio switching

- 5.10. Flowchart of the proposed divider with asynchronous counter
- 5.11. Die photo of the DPLL in a low power multi-standard wireless transceiver RFIC.
- 5.12. Measured phase noise at 2.08 GHz output with loop bandwidth of 1 MHz.
- 5.13. Measured spectrum before and after digital calibration with fractionality
- 5.14. Measured fractional spur near 2.4 GHz
- 5.15. Measured TDC transfer curve, INL and DNL before and after digital calibration.

## List of Abbreviations

ADC Analog-to-Digital Converter

CP Charge Pump

DPLL Digital PLL

FLL Frequency Lock Loop

FoM Figure of Merit

LO Local Oscillator

PFD Phase Frequency Detector

PLL Phase Locked Loop

RFIC Radio-Frequency Integrated Circuits

RSPD Reference Sampling Phase Detector

RSPLL Reference Sampling PLL

SNR Signal to Noise Ratio

SSL Sub-Sampling Loop

SSPD Sub-Sampling Phase Detector

TDC Time-to-Digital Converter

VCO Voltage Controlled Oscillator

## Chapter 1

### Introduction

The phase-locked loop (PLL), as its name suggests, is a closed-loop system capable of aligning the output phase of an oscillator to that of a reference clock. This phase-aligning property can be found in common things around us: large swarms of fireflies synchronize their flashing, two closely placed pendulums or tuning forks tend to oscillate at the same pace, and even applause in a show gradually merges into a common rhythm after some time. The earliest research related to the electronic phase-locked loop dates back to the 1930s, in the context of the homodyne, or direct-conversion, receiver [1]. The locking process is analogous to driving on the road: if the car is heading a bit to the right or left, you turn the wheel in the opposite direction to keep the car going straight ahead. Likewise, if the phase of the voltage-controlled oscillator (VCO) is leading the reference clock phase, the loop slows down the VCO by lowering its oscillation frequency; when the VCO phase is lagging behind the reference phase, the loop raises the VCO oscillation frequency so that it catches up. The entire locking process can usually be broken down into two stages. Initially, the VCO frequency gradually approaches that of the reference clock (or a multiple of it), reaching the "frequency-locked" state. Afterwards, the VCO phase also starts to align with the reference clock phase. Eventually, the VCO's rising/falling edge fluctuates within a small region around the reference clock's rising/falling edge, reaching the "phase-locked" state.

The most common application of the PLL is frequency multiplication, where the generated frequency equals a multiple of the reference clock frequency. This accurate high-frequency tone can be used as a carrier to up-convert baseband signals to the appropriate part of the spectrum. The ability of the PLL to generate an accurate frequency with a fine frequency step makes it very useful in both wired and wireless transmission. To avoid interference, each channel is assigned to a different part of the spectrum. Thus PLLs in transceivers are usually required to achieve a frequency step of a few kHz, while the residual frequency error can be further corrected in the digital domain. With the advent of 5G wireless communication standards, the next-generation PLL is required to cover even higher frequencies, reaching tens of GHz, with very little jitter to support complex modulation types and higher data rates. In addition to carrier generation, other PLL applications include providing clocks for synchronous digital circuits, clock and data recovery (CDR) in noisy environments, and demodulating frequency-modulated signals. Other clock generation methods also exist, including direct digital synthesis (DDS).

In a practical PLL, the output tone is usually accompanied by unwanted noise and distortion. The noise can be divided into two categories: amplitude-modulation (AM) noise and phase-modulation (PM) noise. As their names suggest, AM noise represents the fluctuation of the amplitude of the sinusoidal tone. PM noise, on the other hand, represents the random deviation of the instantaneous phase from that of an ideal sinusoidal waveform, whose phase grows linearly with time. Both types of noise stem from random fluctuations on either the voltage or the current waveform, as shown in Fig. 1.1 [2]. A disturbance injected at the peak of the sinusoidal waveform causes the largest amplitude deviation, whereas a disturbance injected at the zero-crossing causes the largest phase deviation. Since most oscillators used in PLLs operate in a "voltage-limiting" mode where the amplitude variation is limited, the PM noise usually becomes dominant and is what designers care about the most.



Fig. 1.1. Noise injected at different spots of oscillation waveform leads to different levels of AM and PM distortion.

Depending on the application, jitter can be measured in different ways. In digital circuits, where the setup and hold time between data and clock is most important, the critical quantity is the variation in clock period, or period jitter, which only accounts for the difference between adjacent clock edges. Another type of jitter, called time interval error (TIE), represents the timing error between the generated clock and an ideal clock. TIE accounts for the long-term accumulated timing error, which is critical in certain applications such as range finders. TIE is a time-domain representation of clock jitter, whereas phase noise is a frequency-domain representation.
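As a minimal sketch of the distinction between period jitter and TIE, the Python snippet below evaluates both quantities from a list of clock edge timestamps. The function names, the 1 ns nominal period and the 1 ps-per-cycle error are hypothetical values chosen only for illustration; they are not taken from any design in this work.

```python
def period_deviations(edges, t_nominal):
    """Deviation of each clock period from the nominal period (period jitter samples)."""
    return [(b - a) - t_nominal for a, b in zip(edges, edges[1:])]


def time_interval_error(edges, t_nominal):
    """Timing error of each edge relative to an ideal clock that starts at edges[0]."""
    return [t - (edges[0] + n * t_nominal) for n, t in enumerate(edges)]


# Hypothetical clock: 1 ns nominal period, every period 1 ps too long.
T_NOMINAL = 1e-9
edges = [n * (T_NOMINAL + 1e-12) for n in range(1001)]

dev = period_deviations(edges, T_NOMINAL)
tie = time_interval_error(edges, T_NOMINAL)

print(f"largest period deviation: {max(abs(d) for d in dev):.3e} s")
print(f"TIE after 1000 cycles:    {tie[-1]:.3e} s")
```

In this assumed case each period deviates by only about 1 ps, yet the TIE accumulates to about 1 ns after 1000 cycles, which is why the two measures matter for different applications.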

Spurious tones, or simply spurs, are the most common type of distortion in a PLL. Unlike noise, which results from purely random fluctuations, spurious tones usually come from repeated error patterns within the PLL, such as errors due to the non-linear transfer function of the phase detector. Due to their periodic nature, spurs usually show up as side tones close to the fundamental tone. Depending on its specific cause, the frequency of a spur is usually related to the output tone frequency and moves with the output tone on the spectrum. The most common spur in a PLL is the reference spur, which arises because the loop is updated periodically by the reference clock. Spurs also contribute deterministic jitter, which degrades the overall PLL jitter. A strong spur also makes it difficult for the transmitter output to pass the spectrum mask, since the non-linear behavior of the power amplifier usually exacerbates side tones.

Whether the carrier is mixed with the baseband signal for up-conversion or with the received RF signal for down-conversion, the noise and spurs on the carrier are also mixed into the output signal. The noise raises the noise floor and lowers the signal-to-noise ratio (SNR), and the spurs cause unwanted mixing, which leads to crosstalk between adjacent channels. While the PLL output noise and spurs are reduced, the power consumption also needs to be kept low, since most wireless devices operate on batteries. The most widely used metric of PLL performance, the "Figure of Merit" (FoM), is calculated from the output phase noise, expressed as jitter, and the power consumption:

$$\mathrm{FoM} = 10\log_{10} \left[ \left( \frac{\sigma_t}{1\,\mathrm{s}} \right)^2 \frac{P_{PLL}}{1\,\mathrm{mW}} \right] \tag{1.1}$$

where $\sigma_t$ represents the output jitter, i.e. the PLL phase noise measured in the time domain, and $P_{PLL}$ represents the PLL power consumption. The two terms are normalized to one second and one milliwatt, respectively.
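The following short Python sketch evaluates Eq. (1.1) numerically. The function name and the design point (100 fs rms jitter at 10 mW) are hypothetical and serve only to show the scale of the numbers involved.

```python
import math


def pll_fom_db(jitter_rms_s, power_w):
    """Jitter-power figure of merit of Eq. (1.1), in dB (a lower value is better)."""
    power_mw = power_w / 1e-3                 # normalize power to 1 mW
    jitter_norm = jitter_rms_s / 1.0          # normalize jitter to 1 s
    return 10.0 * math.log10(jitter_norm ** 2 * power_mw)


# Hypothetical design point: 100 fs rms jitter at 10 mW.
fom = pll_fom_db(100e-15, 10e-3)
print(f"FoM = {fom:.1f} dB")
```

Because the jitter enters squared, halving the jitter at constant power improves the FoM by about 6 dB, whereas halving the power at constant jitter improves it by about 3 dB.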

Future PLLs are required to provide a clean output spectrum with lower phase noise and lower spur levels while consuming less power. To reach this goal, many novel architectures and techniques have been proposed in recent years to keep improving the state-of-the-art PLL FoM. These novelties usually stem from a deeper understanding of the working mechanism of each individual module, or from a fresh look at the overall system architecture. Some of these novel PLL design trends are covered and discussed in this work. The ideas might be borrowed from circuits other than PLLs, or even from outside the field of electrical engineering. The wisdom behind a smarter and more concise way to achieve equivalent functionality never ceases to amaze me and keeps me passionate about integrated circuit design.

## References

- [1] Henri de Bellescize, "La réception synchrone," L'Onde Électrique (later: Revue de l'Electricité et de l'Electronique), vol. 11, pages 230–240 (June 1932).
- [2] A. Hajimiri and T. H. Lee, "A general theory of phase noise in electrical oscillators," *IEEE JSSC*, vol. 33, no. 2, pp. 179-194, Feb 1998.

## Chapter 2

### General PLL System Analysis

Depending on the application, a PLL may be targeted at different performance goals. Generally, it is preferable to reduce the jitter for improved SNR or to lower the spurs to suppress interference with adjacent channels. Other performance goals include a shorter locking time, as required in frequency hopping, a larger frequency range for wide-band applications, or extremely low power consumption for wearable devices. Pursuing good performance in all aspects usually leads to conflicts during circuit design, so compromises are made that trade some aspects for others. In this chapter, we discuss the overall PLL architecture, including both the analog PLL and the digital PLL, each of which has its own unique features. We also discuss the overall design process and some key parameters that require additional attention.

### 2.1 A Classic Type-II PLL Review



Fig. 2.1. Conceptual diagram of a classic analog PLL.

A general PLL architecture can be broken down into three parts, as shown in Fig. 2.1: the phase detector, which measures the phase/frequency difference between the oscillator and the reference clock; the loop filter, which filters out the high-frequency glitches produced by the phase detector; and the voltage-controlled oscillator (VCO), which tunes its own frequency according to the control signal from the loop filter. These parts form the feedback loop that drives the PLL toward the eventual "phase-locked" state. The most commonly used approach to studying PLL behavior is the phase-domain model, in which the phase at each node in the PLL is modeled and each module is replaced by its phase gain.



Fig. 2.2. Architecture diagram of a classic type-II analog PLL.

As shown in the more detailed diagram of a classic analog PLL (Fig. 2.2), a high-frequency divider at the VCO output first scales down the PLL output frequency by a factor of *N*, bringing it close to that of the reference crystal oscillator. Next, a tri-state phase/frequency detector (PFD) converts the time gap between the edges of the reference clock and the divided clock into a pulse-width-modulated (PWM) waveform that drives the charge pump (CP). Finally, the loop filter converts the charge pump current into a tuning voltage, which is fed back to the VCO input to modulate its oscillation frequency. Together, the divider, PFD and CP constitute the phase detector, which measures the phase difference between the VCO edge and the reference clock edge. This structure is usually called a type-II PLL since there are two integrators in the loop: one from the VCO, which integrates frequency into phase, and one from the charge pump, which dumps current onto the capacitor in the loop filter. Owing to these two poles at DC, it can ideally achieve zero phase error after phase locking.



Fig. 2.3. The phase domain model of the classic type-II analog PLL.

The aforementioned PLL circuit can be converted to its equivalent phase-domain model, in which every node is assigned a variable representing its phase or voltage and the small-signal gain of each module is calculated. Since every block is modeled only by its linear gain, the phase-domain model fails to reflect nonlinear behavior inside the loop such as dead-zone and cycle slipping. However, it remains useful for analyzing other aspects such as phase-locking behavior and noise. As shown in Fig. 2.3, $\theta_R$ and $\theta_o$ denote the reference phase and the divided phase, while $v_e$ and $v_c$ denote the phase detector output and the VCO tuning voltage. The VCO adds an inherent pole to the loop by accumulating frequency into phase. The loop filter $F(s)$ contributes additional poles and zeros, resulting in a higher PLL order. The open-loop gain can be derived as:

$$K_{open} = \frac{\theta_o}{\theta_R} = \frac{K_{phase}F(s)}{N} \cdot \frac{K_{VCO}}{s} = K\frac{F(s)}{s} \tag{2.1}$$

And the closed loop gain can be shown as:

$$K_{close} = \frac{\theta_o}{\theta_R} = \frac{K_{open}}{1 + K_{open}} = \frac{KF(s)}{s + KF(s)} \tag{2.2}$$

Assuming a second-order loop filter as shown in Fig. 2.2, it can be shown:

$$K_{close} = \frac{\theta_o}{\theta_R} = \frac{K(1 + sC_1R)}{s^2(C_1 + C_2)(1 + sC_sR) + K(1 + sC_1R)} \tag{2.3}$$

where $C_s = \frac{C_1 C_2}{C_1 + C_2}$ is the effective capacitance of $C_1$ and $C_2$ connected in series. As an example, we use a reference clock of 50 MHz to generate an RF carrier of 2.4 GHz ($N = 48$). Assuming a charge pump current of 0.5 mA and loop filter values of $R = 3\,\mathrm{k\Omega}$, $C_1 = 500$ pF and $C_2 = 50$ pF, we can plot the frequency response of this third-order PLL as shown in Fig. 2.4. Two poles exist at DC, together with a low-frequency zero at $1/RC_1$ and a high-frequency pole at $1/RC_s$. To maximize the phase margin, the unity-gain frequency is usually placed at the geometric mean of the low-frequency zero and the high-frequency pole.
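The example above can be checked numerically with a short Python sketch. It takes the open-loop gain implied by Eq. (2.3), places the unity-gain frequency at the geometric mean of the zero and the pole, and reports the resulting phase margin. The text does not state the VCO gain, so the script solves for the value that would put the crossover there, under the assumption that $K = (I_{CP}/2\pi) K_{VCO}/N$; the variable and function names are illustrative.

```python
import cmath
import math

# Values from the example in the text.
N = 48            # division ratio (2.4 GHz / 50 MHz)
I_CP = 0.5e-3     # charge pump current, A
R = 3e3           # loop filter resistor, ohm
C1 = 500e-12      # loop filter capacitor in series with R, F
C2 = 50e-12       # loop filter shunt capacitor, F

C_S = C1 * C2 / (C1 + C2)    # series combination of C1 and C2
W_Z = 1.0 / (R * C1)         # low-frequency zero, rad/s
W_P = 1.0 / (R * C_S)        # high-frequency pole, rad/s
W_C = math.sqrt(W_Z * W_P)   # target unity-gain frequency (geometric mean)


def open_loop_gain(w, k):
    """K_open(jw) = K (1 + s C1 R) / (s^2 (C1 + C2) (1 + s C_s R))."""
    s = 1j * w
    return k * (1 + s * C1 * R) / (s ** 2 * (C1 + C2) * (1 + s * C_S * R))


# Loop constant K that gives |K_open| = 1 at W_C, and the VCO gain it implies
# (assumption: K = (I_CP / 2 pi) * K_VCO / N with K_VCO in rad/s/V).
K = 1.0 / abs(open_loop_gain(W_C, 1.0))
K_VCO_HZ_PER_V = K * N / I_CP

phase_margin_deg = 180.0 + math.degrees(cmath.phase(open_loop_gain(W_C, K)))

print(f"zero at       {W_Z / (2 * math.pi) / 1e3:8.1f} kHz")
print(f"pole at       {W_P / (2 * math.pi) / 1e3:8.1f} kHz")
print(f"unity gain at {W_C / (2 * math.pi) / 1e3:8.1f} kHz")
print(f"phase margin  {phase_margin_deg:8.1f} deg")
print(f"implied K_VCO {K_VCO_HZ_PER_V / 1e6:8.1f} MHz/V")
```

With these component values the pole-to-zero ratio is $(C_1 + C_2)/C_2 = 11$, which gives a phase margin of about 56 degrees at the geometric mean and an implied VCO gain of roughly 78 MHz/V. The latter is a derived illustration, not a specification from the text.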

### 2.2 Phase Noise

Phase noise is essentially a random fluctuation of the instantaneous phase of an oscillator. It plays a critical role in the quality of the data received over a wireless link. To support higher-order modulations such as 64-QAM, the integrated phase noise of the PLL needs to be lower than -30 dBc. In addition, other modules in the transceiver, such as the LNA or mixer, also introduce extra noise on the signal. In this section, we analyze the major phase noise contributors inside the PLL.



Fig. 2.4. Simulated frequency response of the type-II PLL.



Fig. 2.5. A typical PLL output phase noise and contributors.

A typical PLL output phase noise can be divided into in-band phase noise and out-band phase noise, as shown in Fig. 2.5. Inside the PLL loop bandwidth, the total phase noise exhibits a relatively flat region, or "pedestal", where the noise of the major contributors, including the reference clock, PFD and CP, passes to the PLL output without being filtered by the loop. Beyond the PLL loop bandwidth, on the other hand, the total noise is dominated by the VCO phase noise and follows its -20 dB/dec slope with frequency.

Since the out-band noise follows the VCO phase noise, the PLL loop components play a more critical role in setting the in-band noise floor. The major in-band noise contributors are the reference clock and the charge pump; less significant noise sources are the divider, PFD and loop filter. The reference clock sets the lower limit on the in-band noise floor:

$$L_{PLL,REF} = L_{REF} \cdot N^2 \tag{2.4}$$

where $N$ is the frequency division ratio. Thus the PLL output in-band noise floor follows an $N^2$ relation with the input reference clock noise floor. Furthermore, the noise contribution from the CP can be derived as:

$$L_{PLL,CP} = \frac{S_{iCP,n}}{2\beta^2} \tag{2.5}$$

where $S_{iCP,n}$ denotes the noise power spectral density of the CP output current and $\beta$ represents the feedback gain from the VCO output to the CP output. These two variables can be found to be:

$$\beta = \frac{1}{N} \frac{I_{CP}}{2\pi} \tag{2.6}$$

$$S_{iCP,n} = 8kT\gamma g_m \frac{\tau}{T_{ref}} \tag{2.7}$$

where $I_{CP}$, $g_m$ and $\tau$ denote the CP current, the transconductance of the current source in the CP and the average turn-on time of the CP current source, respectively, and $T_{ref}$ is the reference period. Ideally, the CP should always remain off after phase locking, which would lead to zero noise contribution. However, various non-ideal circuit behaviors still turn on the CP for some time in every reference period after locking. Thus, to reduce the in-band phase noise, we need to increase the CP current while reducing the turn-on time $\tau$.
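To give a feel for the magnitudes involved, the Python sketch below evaluates Eqs. (2.4) to (2.7). The division ratio, CP current and 50 MHz reference follow the example in Section 2.1; the reference noise floor (-150 dBc/Hz), transconductance (5 mS), turn-on time (500 ps), noise factor $\gamma = 2/3$ and temperature (300 K) are assumed values for illustration only, as are the function names.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ref_noise_at_output_dbc(l_ref_dbc_hz, n):
    """Eq. (2.4) in dB: the reference noise floor is raised by 20*log10(N)."""
    return l_ref_dbc_hz + 20.0 * math.log10(n)


def cp_noise_at_output_dbc(i_cp, g_m, tau, t_ref, n, gamma=2.0 / 3.0, temp_k=300.0):
    """Eqs. (2.5)-(2.7): in-band phase noise due to charge pump current noise."""
    beta = i_cp / (2.0 * math.pi * n)                              # Eq. (2.6)
    s_icp = 8.0 * BOLTZMANN * temp_k * gamma * g_m * tau / t_ref   # Eq. (2.7)
    return 10.0 * math.log10(s_icp / (2.0 * beta ** 2))            # Eq. (2.5)


# N, I_CP and the 50 MHz reference follow the Section 2.1 example.
N, I_CP, T_REF = 48, 0.5e-3, 1.0 / 50e6
# Assumed (hypothetical) values: g_m, turn-on time, reference noise floor.
G_M, TAU, L_REF = 5e-3, 500e-12, -150.0

ref_dbc = ref_noise_at_output_dbc(L_REF, N)
cp_dbc = cp_noise_at_output_dbc(I_CP, G_M, TAU, T_REF, N)

print(f"reference contribution: {ref_dbc:.1f} dBc/Hz")
print(f"CP contribution:        {cp_dbc:.1f} dBc/Hz")
```

Under these assumptions, halving $\tau$ lowers the CP contribution by about 3 dB, and doubling $I_{CP}$ at fixed $g_m$ lowers it by about 6 dB. If $g_m$ instead scales in proportion to $I_{CP}$, which is an assumption about how the current source is sized rather than a statement from the text, the improvement from doubling the current is about 3 dB.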

For the classic type-II PLL with PFD and CP, the loop settles to a steady state in which the charge delivered by the up current cancels the charge drawn by the down current. The average turn-on time is determined by the reset delay in the PFD along with the switching time of the current sources in the CP. Unbalanced up/down currents in the CP also add turn-on time for the weaker current source, which must stay on longer to achieve zero net charge in every cycle. Thus, to reduce the turn-on time $\tau$, faster switching is required in both the PFD and the CP, which is most effectively achieved with a more advanced process technology. In addition, the current mismatch in the CP needs to be reduced, which can be achieved with various circuit techniques including dynamic closed-loop compensation [1].

### 2.3 Spurs

Another non-ideality that can show up in the PLL output spectrum is spurious tones, which appear as unwanted tones superimposed on the desired carrier. In contrast to noise, which results from purely random fluctuation, spurs usually stem from a periodic distortion of the waveform. The period of the distortion is inversely proportional to the offset frequency of the spur from the desired tone, so a longer period leads to a spur closer to the fundamental tone on the spectrum. Spurs have various causes, but in general they can be traced to some periodic disturbance generated internally or injected from external sources. The reference spur, a common type of spur encountered in PLLs, is mainly caused by the reference clock triggering the loop components to operate in a periodic manner: loop parameters, including the VCO tuning voltage, are updated every reference cycle. Thus, even when the average frequency equals the desired frequency after phase/frequency locking, the VCO frequency still experiences a transient shift in every reference cycle. Consequently, the reference spur shows up at an offset from the carrier equal to the reference clock frequency. Fortunately, since the reference clock frequency is usually much higher than the loop bandwidth, the reference spur can usually be sufficiently attenuated by the loop filter. Various techniques have also been proposed to minimize the disturbance that the reference clock introduces into the loop. Examples include balancing the current sources in the charge pump [1] and balancing the loading variation caused by the reference clock with dummy components [2].

Another type of spur is the fractional spur, which is a critical parameter for fractional-N PLLs. Unlike the reference spur, which has a fixed frequency offset and remains relatively far from the carrier, the frequency offset of the fractional spur varies with the fractional frequency. If we define the fractional frequency offset $\Delta f_{frac}$ as the offset between the fractional output frequency and the closest integer frequency, a series of fractional spurs usually arises at $\Delta f_{frac}$ and its harmonics $2\Delta f_{frac}$, $3\Delta f_{frac}$, and so on. Since a fractional-N PLL is usually required to provide a frequency step of several kHz, the associated fractional spur can be very close to the carrier, where it cannot be suppressed by the loop filter. Usually a sigma-delta divider is used to push the power of the fractional spurs to higher frequencies through noise shaping, where it can be filtered by the loop filter. However, this also increases the range of the divider ratio, which requires a larger detection range from the phase detector and a longer average CP turn-on time, which in turn causes higher in-band phase noise. Even though the detection range of the classic tri-state PFD spans the entire reference cycle, most novel phase detectors, such as the TDC or sampling-mode phase detectors, cannot provide such a wide detection range without non-trivial penalties. Fortunately, various techniques other than the sigma-delta divider exist that enable fractional mode without requiring additional detection range. However, most of these techniques show their highest fractional spur at low frequency offset. Thus the in-band fractional spur level is usually measured, as a fair comparison that excludes the suppression from the loop filter.
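The location of the fractional spurs described above can be illustrated with a small Python sketch. It assumes the 50 MHz reference of the Section 2.1 example and a hypothetical division ratio of 48.001; the function name is illustrative, and exact rational arithmetic is used so that the small fractional part is not distorted by floating-point rounding.

```python
from fractions import Fraction


def fractional_spur_offsets(f_ref_hz, ratio, harmonics=3):
    """Offsets (Hz) from the carrier of the first few fractional spurs.

    ratio is the division ratio as a Fraction; the offset is measured to the
    closest integer channel, as in the definition of delta f_frac.
    """
    frac = ratio - (ratio.numerator // ratio.denominator)  # fractional part
    nearest = min(frac, 1 - frac)                          # distance to closest integer
    delta_f = nearest * f_ref_hz
    return [float(k * delta_f) for k in range(1, harmonics + 1)]


F_REF = 50_000_000  # 50 MHz reference, as in the Section 2.1 example

# Hypothetical channel: division ratio 48.001, i.e. 2400.05 MHz output.
offsets = fractional_spur_offsets(F_REF, Fraction(48001, 1000))
print([f"{f / 1e3:.0f} kHz" for f in offsets])
```

For this assumed channel the spurs sit at 50 kHz, 100 kHz and 150 kHz from the carrier. A loop bandwidth of a few hundred kHz would leave all three in-band, which is why the loop filter cannot be relied upon to suppress them.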

### 2.4 Conclusions

The classic type-II analog PLL has been widely used, and its performance is still capable of meeting the specifications of most applications today. However, to further improve critical parameters such as noise, spur and power, and so meet the requirements of next-generation wireless networks such as 5G, novel architectures need to be applied. In the following chapters, three PLL designs are presented that show improved phase noise and spur performance by adopting architectures other than the classic PLL structure.

## References

- [1] J. W. Rogers, C. Plett, and F. F. Dai, "Integrated Circuit Design for High-Speed Frequency Synthesis," ARTECH HOUSE PUBLISHERS, INC., ISBN: 1-58053-982-3, Norwood, MA, February, 2006, pp. 189-190.
- [2] X. Gao, E. A. M. Klumperink, G. Socci, M. Bohsali and B. Nauta, "Spur Reduction Techniques for Phase-Locked Loops Exploiting A Sub-Sampling Phase Detector," IEEE JSSC, vol. 45, no. 9, pp. 1809-1821, Sept. 2010.

## Chapter 3

### A Low-Noise Sub-sampling Fractional-N PLL

Phase-locked loops (PLLs) are commonly applied for clock and carrier signal generation. Due to circuit non-idealities, the zero-crossing timing of the output clock from a PLL shows random jitter and periodic disturbances, which appear as phase noise and spurious tones in the frequency spectrum. These non-idealities impact systems in various ways, such as unwanted spectral emissions and reduced robustness to interference due to reciprocal mixing with phase noise and spurs [1]. Techniques and architectures that generate clean clocks are hence of great importance to electronic systems. In a digital PLL, the in-band phase noise can be improved with a high-resolution TDC or digital calibration [2-3], whereas in an analog PLL it is limited by the tri-state phase-frequency detector (PFD).

Recently, the sub-sampling phase detector (SSPD) has been proposed as an alternative to the PFD to achieve greatly improved in-band phase noise [4-6]. As an SSPD only detects phase, other means are needed for frequency detection and for switching between the two detector outputs to define which one controls the VCO. To this end, the sub-sampling PLL (SSPLL) in [4] uses a tri-state PFD with an intentionally large dead-zone to switch between the frequency lock loop (FLL) and the sub-sampling (phase) loop (SSL). Due to the narrow capture range of the SSL, the SSL may lose lock in the presence of large perturbations. Moreover, the relocking time may be prolonged, since the phase error needs to accumulate for quite a long time before it passes the dead-zone and triggers the FLL to switch on. This problem was partially solved in [7] by removing the dead-zone from the FLL. However, the revised FLL constantly injects its charge pump current, as well as its noise, into the loop filter. Depending on the amount of current injected from the FLL, the in-band phase noise of the PLL may be degraded. This work keeps to the original idea of the SSPLL in [4] by removing the FLL charge pump noise when the loop is in lock. We propose an automatic soft switching scheme that eliminates FLL noise in lock but still ensures agile and robust locking [8]. When the phase error approaches zero, the proposed scheme gradually increases the SSL gain and decreases the FLL gain, while maintaining a constant total loop gain during the loop transition. As a result, loop dynamics such as loop bandwidth and gain/phase margin do not vary much throughout the switching process. When the loop is locked, the FLL is effectively turned off while the SSL is fully turned on, eliminating the FLL noise contribution to the in-band phase noise.

Another feature of this work is the use of a SSL in the context of concurrent multi-phase clock generation. Such clocks are increasingly needed in various circuit building blocks, including the N-path filter, multi-path passive mixer, time-interleaved ADC/DAC, and phased array beam-former. Multi-phase clocks can be generated with ring oscillators [9-10], but their phase noise is inferior to that of an LC-based VCO. Another widely applied technique is to generate a frequency N times higher than the needed frequency, followed by frequency division by N [11-12]. This only works well up to a certain (technology dependent) frequency. This work proposes an alternative passive structure for multi-phase clock generation directly at the operational frequency, without involving any higher frequency oscillators and dividers.

This chapter is organized as follows: Sections 3.1 and 3.2 discuss our proposed loop gain switching scheme and fractional SSPLL architecture; Section 3.3 presents detailed circuit implementations of various building blocks in the system; measurement results are presented in Section 3.4.



Fig. 3.1. Simplified block diagram of the dual loop architecture.

### 3.1 Robust Locking with Sub-sampling Technique

A SSPD can achieve a gain much higher than a traditional tri-state PFD [4], and hence lower in-band noise, as there is more suppression of the noise from the charge pump, which is the major in-band noise contributor in a classical PLL. On the other hand, since it directly samples the VCO waveform without frequency downscaling, the SSPD maintains its high gain only within a small region around the zero crossing of the VCO waveform. If for some reason a relatively large phase error exists after the loop is locked, the in-band noise floor might be degraded due to the reduced SSPD gain. Furthermore, a SSPD operating on a sinusoidal VCO signal only works well for phase errors within $\pm \pi/2$ of the zero crossing. Beyond that, the SSPLL may lock to another VCO zero crossing after a long relocking time, or might even never regain lock on its own.

### 3.1.1 A Soft Loop Gain Switching Scheme

To improve the robustness of locking with a SSPD, a simplified SSPLL block diagram of our proposed dual loop gain switching scheme is shown in Fig. 3.1. The conversion gain of the feedback path from VCO output phase ( $\Delta \phi$ ) to charge pump (CP) output current ( $\Delta I$ ) in the SSL can be derived as [4]:

where $A_{VCO}$ denotes the magnitude of the VCO waveform, $\Delta \phi$ is the loop phase error, $\tau$ represents the output current pulse width and $T_{ref}$ represents the reference period. Similarly, the conversion gain of the feedback path from VCO output phase to CP output current in the FLL can be found as:

$$G_{\rm FLL} = \frac{\Delta I}{\Delta \varphi} = \frac{1}{N} \cdot \frac{I_{\rm PFD}}{2\pi} \tag{3.2}$$

where N and $I_{PFD}$ represent the division ratio and the charge pump current, respectively. Thus the total loop gain of the dual loop PLL shown in Fig. 3.1 is given by $G_{total} = G_{SSL} + G_{FLL}$.

Multiple approaches exist to combine the SSL and the FLL. Fig. 3.2(a) presents the total loop gain normalized by its maximum value versus static phase errors for some prior art designs and our proposed soft switching scheme. The SSPD assisted with dead-zone PFD [4] shows a periodic behavior because it can lock to any zero crossing of the VCO waveform. Since the phase error is still within the dead-zone, the FLL is not activated in this case [4]. Each null corresponds to a large drop in phase margin as shown in Fig. 3.2(b), which causes potential stability issues. Since $G_{SSL}$ follows a sinc function with respect to the phase error and $G_{FLL}$ shows a constant gain, directly summing these two terms as suggested in [7] leads to a gain profile of a sinc function with a DC offset, as shown in Fig. 3.2(a). Although this combined SSPFD approach [7] can avoid harmonic locking, the loop experiences large gain variation as the phase error changes from $\pm \pi$ to 0. Consequently, loop bandwidth and phase margin also vary dramatically, causing stability concerns. The problem can be further exacerbated in this architecture since the gain of the FLL needs to be very small in order to reduce the extra noise from the FLL. With the minimum FLL gain, the loop barely maintains a positive total gain for arbitrary phase error, making the total loop gain close to 0 when the phase error approaches $\pm \pi$.
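The gain profile of the directly summed scheme can be sketched numerically. The snippet below is a minimal illustration, assuming a unit-peak sinc profile for the SSL gain over the phase error (in VCO radians) and a constant FLL gain `g_fll` expressed relative to the SSL peak; the value 0.05 is a hypothetical choice, not taken from [7] or from this design.

```python
import math

def sinc(phi):
    """Unnormalized sinc, sin(phi)/phi, with the limit value 1 at phi = 0."""
    return 1.0 if phi == 0.0 else math.sin(phi) / phi

def combined_gain(phi, g_fll):
    """Total gain of the directly summed scheme: sinc-shaped SSL gain plus a constant FLL gain."""
    return sinc(phi) + g_fll

g_fll = 0.05  # hypothetical FLL gain relative to the SSL peak gain
peak = combined_gain(0.0, g_fll)
for phi in (0.0, math.pi / 2, math.pi):
    ratio = combined_gain(phi, g_fll) / peak
    print(f"phase error = {phi:.3f} rad -> normalized total gain = {ratio:.3f}")
```

Under these assumptions the total gain at a phase error of $\pm \pi$ collapses to the FLL gain alone, which is the large gain variation described above.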



Fig. 3.2. Illustrations of (a) the total loop gain normalized by its maximum value and its variation versus static phase error. A SSPD with dead-zone [4] shows a repeated profile due to harmonic locking; the combined SSPFD [7] shows a sinc profile and our proposed scheme features a rather constant gain; (b) phase margin variation versus static phase error; (c) transient loop locking behavior.

The core idea of our proposed gain switching scheme is to tune the currents $I_{SSPD}$ and $I_{PFD}$ dynamically with respect to the phase error such that, instead of a direct superposition, soft switching from one loop to the other is achieved. As a result, the total loop gain variation is reduced. Initially, only the FLL is activated by tuning $I_{PFD}$ to its maximum. As the loop drives towards locking, the SSL is activated and the FLL is turned off once the phase error is sufficiently small that the SSPD can safely lock on its own. After phase locking is achieved, only the SSL remains active. Note that turning off the FLL through $I_{PFD}$ only shuts down the CP in the FLL, whereas both the PFD and the divider still need to remain active to detect the phase error. Our proposed loop gain switching scheme shows only slight gain and phase margin variations during loop switching as shown in Fig. 3.2, leading to much improved stability. In addition, the CP in the FLL is totally off at lock-in, resulting in low in-band phase noise in the proposed SSPLL.

During the lock process, the phase error varies widely. A constant loop gain during the lock process avoids variations in loop dynamics. As a result, the system is robustly stable and the relocking time and overshoot are predictable. As shown in Fig. 3.2(c), the large variations in open loop gain of the combined PD scheme [7], in which the SSL and FLL loop gains are always added, cause difficulties during the locking process. If the total loop gain is matched to the desired gain (which leads to the desired locking time and overshoot) at its peak, the locking time will be prolonged (see Fig. 3.2(c)) since the gain for larger phase errors is too small. On the other hand, if the total loop gain is matched at its (sinc) side-lobe level, larger overshoots result since the peak gain at zero phase error is too large. In comparison, the proposed soft loop switching scheme gives a consistent locking behavior over the entire locking process due to the constant loop gain.

To ensure equal gain of the two loops in our proposed structure, the PFD needs to match the peak gain of the SSPD, which is usually much larger. Thus a large CP current $I_{PFD}$ is required to achieve this gain matching. Fortunately, the CP in the FLL is turned off after phase lock, avoiding its power consumption and the large noise associated with a large current. During the loop switching, it becomes more difficult to maintain a constant gain since $G_{SSL}$ follows a square-root relationship with respect to its current (Eq. 3.1) whilst $G_{FLL}$ follows a linear relationship with its current (Eq. 3.2). As a compromise between complexity and effectiveness, we propose to reduce gain variation during the switching by making the total current a constant:

$$I_{\text{total}} = I_{\text{SSPD}} + k \cdot I_{\text{PFD}} \tag{3.3}$$

where k represents the current ratio between the CPs in the SSPD and the PFD, since the current in the PFD needs to be larger than that in the SSPD for gain matching ( $k \approx \frac{1}{3}$ in our case). As we will show below, this can be conveniently achieved with a differential pair whose tail current source sets the total current.
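The effect of holding the total current constant can be checked with a small numerical sketch. It assumes an idealized model at zero phase error in which $G_{SSL}$ scales with the square root of $I_{SSPD}$, $G_{FLL}$ scales linearly with $I_{PFD}$, and both loops are matched to a normalized gain of 1 when fully on. The steering fraction `x` is a hypothetical variable introduced here for illustration.

```python
import math

def soft_switch_gain(x):
    """Normalized total gain when a fraction x of I_total is steered into the SSPD CP.

    Assumes I_total = I_SSPD + k * I_PFD is constant, G_SSL ~ sqrt(I_SSPD),
    G_FLL ~ I_PFD, and each loop has unit gain when it carries all the current.
    """
    g_ssl = math.sqrt(x)   # square-root dependence on I_SSPD / I_total
    g_fll = 1.0 - x        # k * I_PFD / I_total = 1 - x, linear dependence
    return g_ssl + g_fll

steps = 1000
gains = [soft_switch_gain(i / steps) for i in range(steps + 1)]
g_min, g_max = min(gains), max(gains)
print(f"min = {g_min:.3f}, max = {g_max:.3f}, variation = {(g_max - g_min) / g_max:.1%}")
```

In this simplified model the total gain equals 1 at both ends of the transition and peaks at 1.25 when a quarter of the current is steered to the SSPD, so the gain stays bounded throughout the switching rather than collapsing as in a direct superposition.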



Fig. 3.3. Simplified schematic diagram of the proposed loop switching controller.

### 3.1.2 Loop Switching Controller

As shown in Fig. 3.3, our proposed loop gain switching controller can be broken down into three parts. First, an XNOR gate is tied to the PFD output in the FLL. Together with a low-pass RC filter, it measures an averaged loop phase error $\varepsilon$ [13]. The output signal Vlock decreases as the phase error grows, meaning a smaller $\varepsilon$ leads to a higher Vlock. In the second part, an amplifier, or soft comparator, built around an operational amplifier compares Vlock with a programmable switching threshold. This threshold should be set sufficiently high to ensure that the switching from the FLL to the SSL occurs only after the phase error is within the locking range of the SSPD, i.e., one VCO period. Additionally, a high threshold also helps with fast switching from the SSPD to the PFD once the loop loses phase lock due to a perturbation. The third part consists of a PMOS differential pair which directly drives the current sources in the CPs of the FLL and the SSL. The differential pair ensures a constant sum of a scaled PFD current and the SSPD current for a constant loop gain. Resistors are used instead of current mirrors to ensure that the deactivated loop is entirely off. During loop switching, we assume the differential pair operates in its linear region, which defines a small-signal switching gain of the loop switching controller:

$$G_{sw} = \frac{\Delta V}{\Delta \varphi} = \left(\frac{V_{DD}}{2\pi} \cdot \frac{1}{1 + sR_1C_1}\right) \cdot \left(1 + \frac{R_3}{R_2}\right) \cdot \left(g_m R_L\right) \tag{3.4}$$

where $V_{DD}$ and $g_m$ denote the power supply voltage and the trans-conductance of the differential pair, respectively. The three bracketed factors represent the contributions of the three parts of the switching controller.
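As a rough numerical reading of Eq. (3.4), the sketch below evaluates the DC switching gain and its magnitude at a given frequency. Only the 1.3 V supply is taken from this chapter; every component value (`r1`, `c1`, `r2`, `r3`, `gm`, `rl`) is a hypothetical placeholder chosen for illustration and does not describe the fabricated circuit.

```python
import math

def switching_gain_dc(vdd, r2, r3, gm, rl):
    """DC value of Eq. (3.4) in V/rad: detector slope * amplifier gain * differential pair gain."""
    return (vdd / (2 * math.pi)) * (1 + r3 / r2) * (gm * rl)

def switching_gain_mag(f_hz, vdd, r1, c1, r2, r3, gm, rl):
    """Magnitude of Eq. (3.4) at frequency f_hz, including the single R1-C1 pole."""
    f_corner = 1.0 / (2 * math.pi * r1 * c1)
    return switching_gain_dc(vdd, r2, r3, gm, rl) / math.sqrt(1 + (f_hz / f_corner) ** 2)

# Hypothetical component values, for illustration only
vdd, r1, c1 = 1.3, 100e3, 10e-12
r2, r3, gm, rl = 10e3, 10e3, 1e-3, 2e3

f_corner = 1.0 / (2 * math.pi * r1 * c1)
g_dc = switching_gain_dc(vdd, r2, r3, gm, rl)
print(f"corner frequency = {f_corner / 1e3:.1f} kHz, DC gain = {g_dc:.3f} V/rad")
```

The corner frequency of the R1-C1 filter computed here plays the role of the switching control bandwidth fsw discussed in the locking analysis.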

The simulated loop gain normalized by its maximum value versus phase error (Vlock) over different process corners and temperatures is shown in Fig. 3.4. A tolerable worst-case peak-to-peak gain variation of 18% is observed. To compensate for different process corners, the bandwidths (i.e., open-loop gains) of the two loops are calibrated to be equal before normal operation. As a result, the loop only needs to tolerate the variations from temperature and voltage. The phase margin of the loop is designed with sufficient margin such that the loop will always be stable across PVT variations. The relocking time versus the switching threshold normalized by the reference period has been simulated as shown in Fig. 3.5. A voltage perturbation corresponding to a frequency step of 3 MHz is injected at the VCO tuning input after the loop is locked. This causes the SSL to lose lock, while the FLL is activated depending on the switching threshold. The relocking time increases with a larger switching threshold since the phase error needs to be accumulated for a longer time in order to trigger the FLL.



Fig. 3.4. Simulated loop gain variation for the proposed soft loop switching scheme over process corners and temperatures. ff, tt, ss denote fast, typical and slow corners, respectively and all temperatures are in Celsius.



Fig. 3.5. Simulated relocking time versus the percentage of switching threshold over reference period; division ratio N=48.

### 3.1.3 Locking Analysis for the Switched PLL

The proposed dual loop PLL architecture with soft switching scheme can be analyzed as a hybrid switched system exploiting existing control theory. The foregoing analysis on the loop gain and the phase margin variation is based on the assumption that the phase error has settled to a constant value, i.e. a quasi-static phase error. However, the implemented switching scheme is not controlled by the instantaneous phase error, but rather by its averaged value produced by R1 and C1 in Fig. 3.3. To estimate the low-pass filter effect, we used a simplified model with a type-I PLL for both the SSL and the FLL. The differential equation of the switched PLL can be described as:

$$\frac{d\phi_{av}}{dt} = \frac{\phi - \phi_{av}}{R_1 C_1}$$

$$\frac{d\phi}{dt} = -K_{SSL} \sin(\phi) \cdot \left(1 - \tanh\left(g \cdot (|\phi_{av}| - \phi_{th})\right)\right) - K_{FLL}\, \phi \left(1 + \tanh\left(g \cdot (|\phi_{av}| - \phi_{th} - \phi_{offset})\right)\right) \tag{3.5}$$

where $K_{SSL}$ and $K_{FLL}$ represent the SSL and FLL loop gains; $\phi$ and $\phi_{av}$ represent the instantaneous and averaged (filtered with R1 and C1) phase errors; and g represents a scaling parameter which is proportional to the switching gain $G_{sw}$ in Eq. (3.4). For simplicity, the loop soft switching behavior shown in Fig. 3.4 is modeled with a hyperbolic tangent function. Parameters $\phi_{th}$ and $\phi_{offset}$ represent the switching threshold in radians and the phase offset between the two loops. $\phi_{th}$ can be converted to the corresponding voltage with $V_{th} = \phi_{th} \frac{V_{DD}}{2\pi N}$, where $V_{DD}$ is the supply voltage and N is the division ratio. The simulated phase portrait of the switched PLL with different switching control-loop bandwidths fsw (defined as the bandwidth of the low-pass filter formed by R1 and C1) and a constant loop bandwidth floop is shown in Fig. 3.6(a). With a high fsw, $\phi_{av}$ quickly converges to the actual phase error. With a low fsw, the PLL does not switch to the FLL fast enough at large phase errors, driving itself away from locking in some regions. However, the loop is eventually still able to achieve locking.
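The behavior of the switched-PLL model in Eq. (3.5) can be explored with a simple forward-Euler integration. The sketch below is illustrative only: time is in normalized units, and all parameter values (`k_ssl`, `k_fll`, `g`, `phi_th`, `tau_av` standing for $R_1 C_1$, the step size and the initial phase error) are assumptions rather than values from the simulations behind Fig. 3.6.

```python
import math

def simulate_switched_pll(phi0, k_ssl=1.0, k_fll=1.0, g=10.0, phi_th=0.5,
                          phi_offset=0.0, tau_av=0.2, dt=1e-3, t_end=20.0):
    """Integrate Eq. (3.5) with forward Euler and return the (phi, phi_av) trajectory."""
    phi, phi_av = phi0, phi0  # start with the averaged error equal to the actual error
    trajectory = [(phi, phi_av)]
    for _ in range(int(round(t_end / dt))):
        # tanh-shaped soft switching weights driven by the averaged phase error
        w_ssl = 1.0 - math.tanh(g * (abs(phi_av) - phi_th))
        w_fll = 1.0 + math.tanh(g * (abs(phi_av) - phi_th - phi_offset))
        dphi = -k_ssl * math.sin(phi) * w_ssl - k_fll * phi * w_fll
        dphi_av = (phi - phi_av) / tau_av  # R1-C1 averaging of the phase error
        phi += dt * dphi
        phi_av += dt * dphi_av
        trajectory.append((phi, phi_av))
    return trajectory

traj = simulate_switched_pll(phi0=2.5)
phi_end, phi_av_end = traj[-1]
print(f"final phase error = {phi_end:.2e} rad, averaged = {phi_av_end:.2e} rad")
```

With these assumed values the FLL term dominates while the averaged error is above the threshold, and the SSL term takes over once it falls below, so the trajectory settles at zero phase error. Plotting `phi_av` against `phi` for several values of `tau_av` would give a rough counterpart to the phase portrait described above.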



Fig. 3.6. Simulated phase portrait of the proposed soft switching SSPLL with (a) different soft-switching control bandwidth fsw and constant loop bandwidth, (b) different phase offset between the SSL and the FLL. The center of each plot indicates the lock state.

Since the feedback path of the FLL includes an additional multi-modulus divider (MMD) compared to the SSL, the propagation delay in the feedback path can differ by hundreds of ps between the two loops. Unbalanced layout and PVT variation can further exacerbate the delay mismatch. Such mismatch not only causes a larger loop gain variation, but also might lead to multiple locking points. As shown in Fig. 3.6(b), as long as the offset remains below the switching threshold, only one stable point exists on the phase portrait. For a phase offset larger than the threshold, the nullcline of $\phi_{av}$ (where its derivative is zero) intersects that of $\phi$ at an additional point, creating an additional stable node, i.e. false locking where only the FLL is in lock while the SSL is not. To compensate for the propagation delay mismatch, we have implemented a calibration utilizing the tunable delay dT on the reference path, which will be covered in a later section.

### 3.2 Multi-Phase Generation for Fractional-N Mode

### 3.2.1 Fractional-N Mode with SSPD

In integer-N mode, the VCO zero crossing will always be aligned with the reference edge after phase locking since the VCO frequency is an integer multiple of the reference frequency. Extending the SSPLL to fractional-N mode requires the divider to switch between multiple integer division ratios in every reference cycle [14], causing instantaneous phase error while achieving a correct equivalent fractional division ratio on time average. Consequently, the VCO or the feedback edge will move periodically around the reference edge, creating instantaneous phase errors even after phase locking. In the case of a basic fractional operation where the divider switches between N and N+1, the maximum phase error, or the phase gap, can reach one VCO cycle. Considering the narrow locking range and even narrower high gain range of the SSPD, the feedback signal and the reference signal need to be properly realigned before feeding into the SSPD for fractional-N operation.

Prior art designs have proposed to utilize a tunable delay or digital-to-time converter (DTC) on the reference path [15-17] to build a fractional-N SSPLL. By appropriately delaying the reference clock in every cycle, the phase gap between the reference edge and the feedback edge can be closed. However, this DTC is required to cover one or multiple VCO cycles with a fine resolution to provide the required fractionality. Furthermore, the DTC needs to be highly linear; otherwise large fractional spurs will arise. Delays from inverters will also contribute extra noise proportional to the amount of delay inserted, as argued in [18-19]. Introducing a large amount of delay on the reference path might severely degrade the PLL's in-band noise floor since its jitter will be multiplied by $N^2$ when transferred to the PLL output. Other approaches have proposed using an active phase interpolator [19-20] to decrease the amount of delay required on the reference path. However, the phase interpolator still contributes additional noise and consumes a large portion of the total power of the entire PLL [19]. In addition, only one VCO phase can be generated at a time in this structure, whereas some applications require multi-phase clock outputs as discussed previously.
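The size of the $N^2$ penalty can be put into decibels with a short calculation. The sketch below reads the $N^2$ factor as a power gain applied to reference-path phase noise inside the loop bandwidth; N = 48 is the division ratio used in this chapter, while the reference-path noise floor of -155 dBc/Hz is a hypothetical number chosen only to illustrate the arithmetic.

```python
import math

def reference_noise_gain_db(n):
    """Gain in dB applied to reference-path phase noise at the PLL output: 10*log10(N^2)."""
    return 20.0 * math.log10(n)

def output_referred_floor(ref_floor_dbc_hz, n):
    """In-band output noise floor contributed by the reference path alone."""
    return ref_floor_dbc_hz + reference_noise_gain_db(n)

N = 48
ref_floor = -155.0  # hypothetical reference-path noise floor in dBc/Hz
print(f"gain = {reference_noise_gain_db(N):.1f} dB")
print(f"output-referred floor = {output_referred_floor(ref_floor, N):.1f} dBc/Hz")
```

Any noise added by delay cells on the reference path is raised by the same amount, which is why a long delay line can dominate the in-band floor.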



Fig. 3.7. Realignment of VCO zero crossing and reference edge utilizing a multi-phase VCO.

In our proposed PLL, edge alignment is achieved through utilizing multiple interpolated VCO phases uniformly spanning from $0^{\circ}$ to $360^{\circ}$. By selecting the VCO output phase, in each reference cycle, which provides a zero crossing that is closest to the reference edge, the phase gap can be decreased to $\pi/M$ where M denotes the number of available VCO phases. Furthermore, by using a fractionality of n/M where n denotes an arbitrary integer between 1 and M-1, it is possible to close the phase gap for every reference edge, as the phase error increment without selecting another VCO output is equal to the phase error difference between two VCO outputs. Ideally, the SSPD would see zero phase error thus exhibiting a clean output spectrum without any fractional spurs. Consider a simple example as shown in Fig. 3.7, a fractional PLL is achieved with M VCO phases P1-PM. By jumping one VCO cycle each time, the sampling reference edge is always aligned with one of the zero crossings of VCO waveform. To generalize this idea, the fractional frequency can be programmed as:

$$f_{\text{frac}} = \left(N + \frac{n}{M}\right) \cdot f_{\text{ref}} \tag{3.6}$$

where N, n and M represent the integer division ratio, the VCO phase jump in each cycle and the total number of available VCO phases, respectively. n can be an arbitrary integer number from 1 to M-1. Compared with prior art using extra delay on the reference path or an active interpolator, our proposed architecture involves no active components, thus minimizing the extra power or noise for fractional-N operation and multi-phase clock generation. Note that even though the interpolation network does not consume power, the phase selecting multiplexer consumes about 1 mA.
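Eq. (3.6) and the accompanying phase selection can be illustrated with exact rational arithmetic. The sketch below uses N = 48, M = 16 and a 50 MHz reference as in this chapter, with n = 3 as an arbitrary illustrative choice; the function names and the indexing convention (phase index 0 selected at the first edge) are assumptions made for this example.

```python
from fractions import Fraction

def fractional_frequency(n_int, n_jump, m_phases, f_ref_hz):
    """Eq. (3.6): f_frac = (N + n/M) * f_ref, evaluated exactly."""
    return (Fraction(n_int) + Fraction(n_jump, m_phases)) * f_ref_hz

def phase_selection(n_int, n_jump, m_phases, cycles):
    """Return (selected phase index, residual phase in VCO cycles) for each reference edge."""
    ratio = Fraction(n_int) + Fraction(n_jump, m_phases)
    result = []
    for k in range(cycles):
        vco_phase = (k * ratio) % 1          # fractional VCO phase at the k-th reference edge
        index = (k * n_jump) % m_phases      # jump n phases per reference cycle
        residual = (vco_phase - Fraction(index, m_phases)) % 1
        result.append((index, residual))
    return result

f_out = fractional_frequency(48, 3, 16, 50_000_000)
selection = phase_selection(48, 3, 16, 32)
print(f"f_frac = {float(f_out) / 1e6:.3f} MHz")
print("phase indices:", [index for index, _ in selection[:8]])
```

In this idealized model the residual phase seen at every reference edge is exactly zero, which corresponds to the statement that the SSPD would ideally see no phase error for fractionalities of the form n/M.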



Fig. 3.8. (a) Capacitive phase interpolation network (b) Interpolating arbitrary phases from a pair of quadrature signals with capacitance ratio  $\alpha = C2/C1$ .

### 3.2.2 Capacitive Interpolation for Multi-phase Clock Generation

In our proposed design, the generation of multiple clock phases is achieved through capacitive interpolation with a quadrature LC oscillator. A similar VCO architecture with fewer interpolated phases has been proposed in [21]. Consider a simple case of two capacitors connected in series between the in-phase (I) and the quadrature (Q) components of a quadrature VCO (QVCO) output. By tuning the ratio of the two capacitors, an arbitrary phase between $0^{\circ}$ and $90^{\circ}$ can be interpolated. In this PLL, the QVCO output is further extended to interpolate 16 phases as shown in Fig. 3.8. Four capacitors are connected in series between $0^{\circ}$ and $90^{\circ}$ from the QVCO to generate 3 additional sub-phases of $22.5^{\circ}$, $45^{\circ}$ and $67.5^{\circ}$, respectively. Let us define a capacitor ratio $\alpha = C_2/C_1$. The phase at each node can be determined using superposition, calculating the contributions from the I and Q components separately.

First, let us ignore the loading effects. Later, we will include parasitic loading effects and also consider the effect on oscillation frequency and Q of the tank in the VCO. From I+ to Q+, the phase at the first node can be found to be $\tan\theta = \alpha/(\alpha+2)$. Using $\theta$ of 22.5°, $\alpha$ can be calculated to be $\sqrt{2}$, which is approximated with 1.4 in the actual implementation. Even though the phase can be tuned to an arbitrary value, the magnitudes of the interpolated phases can vary slightly. With an $\alpha$ of $\sqrt{2}$, the magnitudes at 22.5°, 45° and 67.5° can be calculated to be approximately 0.765, 0.707 and 0.765, assuming a unity magnitude at I+ and Q+. Even though subsequent buffers will reshape the interpolated sinusoidal waveforms into square waves with similar magnitude, this non-uniform magnitude still causes a certain amount of phase error in the interpolated VCO phases. To match the magnitude of the interpolated phases, the magnitudes of the original four VCO phases are scaled down as well with capacitors C3 and C4 as shown in Fig. 3.8. With a capacitance ratio C4/C3 of ( $\sqrt{2}-1$ ), the output magnitudes of the four quadrature phases are scaled to 0.707. Again, this capacitance ratio is approximated with 0.4 in the actual design. In summary, with C as the unit capacitance, all capacitor values can be found as: C1=1.4C, C2=C3=C, C4=0.4C.
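The phase and magnitude values quoted above can be reproduced with a short phasor calculation. The sketch below is a simplified model with loading ignored: it assumes a series chain of four capacitors between I (taken as $0^{\circ}$) and Q (taken as $90^{\circ}$), with the two inner capacitors $\alpha$ times the two outer ones, which is the assignment that yields $\tan\theta = \alpha/(\alpha+2)$. The scaling of the quadrature outputs is modeled as a divider of the form C3/(C3+C4), which is likewise an assumption about the network in Fig. 3.8.

```python
import cmath
import math

def interpolated_nodes(alpha):
    """Phasors at the three internal nodes of the unloaded series chain from I (1) to Q (j)."""
    # Series elements from the I side: outer, inner, inner, outer; voltage drop scales with 1/C
    elastance = [1.0, 1.0 / alpha, 1.0 / alpha, 1.0]
    total = sum(elastance)
    nodes, accumulated = [], 0.0
    for s in elastance[:-1]:
        accumulated += s
        x = accumulated / total            # fraction of the I-to-Q drop before this node
        nodes.append((1 - x) * 1 + x * 1j)
    return nodes

def scaled_magnitude(c4_over_c3):
    """Magnitude after an assumed capacitive divider C3 / (C3 + C4) on a unit-amplitude phase."""
    return 1.0 / (1.0 + c4_over_c3)

for alpha in (math.sqrt(2), 1.4):
    nodes = interpolated_nodes(alpha)
    phases = [math.degrees(cmath.phase(v)) for v in nodes]
    mags = [abs(v) for v in nodes]
    print(f"alpha = {alpha:.4f}: phases = {[round(p, 2) for p in phases]}, "
          f"magnitudes = {[round(m, 3) for m in mags]}")
print(f"scaled quadrature magnitude = {scaled_magnitude(math.sqrt(2) - 1):.3f}")
```

With $\alpha = \sqrt{2}$ the model returns the ideal 22.5°, 45° and 67.5° nodes with magnitudes of about 0.765, 0.707 and 0.765, while $\alpha = 1.4$ shifts the first node by roughly 0.1° before any parasitic loading is considered.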

Having determined the capacitance ratio, we can now discuss how to choose the absolute value for these capacitors. Parasitic capacitance tends to create additional phase error in the interpolated phases. Larger interpolation capacitors alleviate the parasitic impact while reducing the achievable oscillation frequency. A load of 30 fF representing the input capacitance of the next stage is attached to each output of the interpolation network. The simulated phase error at the node  $22.5^{\circ}$  is shown in Fig. 3.9. Phase error larger than 2 degrees can be observed for small unit capacitance C. Approximating the ideal capacitance ratio  $\sqrt{2}$  with 1.4 caused a phase difference of about  $0.2^{\circ}$ . For very large C, the phase error is approaching zero with ideal capacitance ratio. However, larger unit capacitance C occupies more area which also leads to more parasitics. Furthermore, it also adds extra capacitive loading on the VCO, decreasing its oscillation frequency and tuning range. The simulated VCO tank quality factor, as shown in Fig. 3.9, decreases with larger C, requiring more VCO power to maintain the same oscillation frequency and phase noise. In our design, a unit capacitance of about 1.2 pF is chosen as a compromise in this tradeoff.



Fig. 3.9. Effect of interpolating capacitance value on phase error reduction with parasitic loading and VCO tank quality factor degradation.

The simulated phase error is shown in Fig. 3.10. In schematic-level simulation, due to the non-uniform interpolated magnitudes mentioned earlier, periodic phase maxima and minima can be observed in the DNL. In post-layout simulation with all the parasitics, the phase error between interpolated phases increases due to imbalanced layout and wiring, which is difficult to avoid entirely. Large fractional spurs will arise if these phase errors are left unresolved. We utilized the tunable delay dt on the reference clock path to compensate for these small variations. Based on the sampled voltage in the SSPD, a different delay is assigned on the reference path for each phase, which is achieved with on-chip digital logic. Details of this calibration will be covered in Section 3.4. However, this can only reduce the spurs in fractional-N mode. For multi-phase clock applications where the phases are required to be evenly spaced, a controllable delay running at the VCO frequency is needed at each interpolated node.



Fig. 3.10. Simulated phase error DNL of interpolated output phases before and after parasitic extraction.

### 3.3 System and Building Blocks

A block diagram of the proposed fractional sub-sampling PLL is shown in Fig. 3.11. The main phase lock loop, the SSL, uses a SSPD for low in-band phase noise operation. As explained in Section 3.1, the FLL uses a divider and PFD for their larger capture range to ensure robust locking, while the loop switching controller automatically tunes the gains of the two loops based on the current phase error. Due to the frequency divider, the delay of the feedback path in the FLL is slightly larger than that of the SSL. An unsynchronized feedback signal causes ambiguity in terms of locking between the two loops and makes it more difficult for the loop switching controller to decide when to transition between detectors. Thus a coarse controllable delay dT was inserted on the reference path of the PFD to calibrate for the delay difference. Another fine controllable delay dt is inserted in the reference clock path for the SSPD to calibrate for phase errors in the interpolated VCO phases.



Fig. 3.11. Block diagram of the proposed multi-phase fractional-N SSPLL.

As shown in Fig. 3.11, the multiplexer after the VCO is connected to the SSPD through a buffer. Similar to a sample and hold circuit, the SSPD consists of switches (M1, M7) and sampling capacitors (Cp ~ 0.12 pF) as shown in Fig. 3.12. Two shorted transistors M2 and M8 are connected to source and sink extra charges from the switching transistors. Their sizes are tuned to approximately half of M1 and M7. Two dummy paths consisting of M3~M6 with extra sampling capacitors are implemented to maintain a constant loading on the previous stages during sampling, which helps reduce the reference spur [22]. The gain of the SSL is controlled through tuning the tail current in M11 for loop switching. In addition, the sampled voltages at Sp and Sn are also connected to pins through several stages of buffers to probe the phase error of the selected VCO phase.



Fig. 3.12. Simplified schematic diagram of the SSPD and CP.

A low-noise off-chip crystal oscillator generates a 50 MHz sinusoidal waveform with a peak magnitude of 0.6 V as the reference clock. The first stage self-biased inverter is most critical in terms of additional noise in the whole clock chain. Thus the NMOS transistor is given a large width for higher gm and less flicker noise whereas the PMOS can maintain a normal size to save power as shown in Fig. 3.13. This also enables a faster rising edge at SSCLK+ and falling edge at SSCLK-, which are used as the sampling edges in the SSPD. The fine tunable delay for VCO phase error calibration on the SSPD clock path is implemented with a 5-bit binary weighted capacitor array. A series capacitor is connected between the capacitor array and the inverter output to improve the resolution. Simulation shows that a tuning range of 30 ps with a resolution around 1 ps is achieved. Likewise, the second delay on the PFD clock path for synchronizing the feedback signals in the two loops is also implemented with a capacitor array. It is designed to cover a larger range (400 ps) with coarse resolution (20 ps). Note that the jitter requirement here is relaxed since it is only used in the FLL.



Fig. 3.13. Schematic diagram of the reference buffer with tunable delays.

A current-mode-logic (CML) based multiplexer has been implemented for phase selection as shown in Fig. 3.14. P1-P16 denote the 16 interpolated phases while SEL1-SEL16 represent the one-hot (one bit high) phase selection word. Only one differential pair is ON and conducting current (~1 mA) at a time, while the other 15 pairs are shut down to minimize the total loading on the interpolation network and to save power. Unbalanced loading between the ON and OFF states causes extra phase error and a higher spur level in fractional-N mode. To minimize it, the CML based structure has been adopted, in which each of the P1-P16 nodes is loaded by two MOSFET gates. The loading variation then mainly comes from the different gate parasitic capacitance under different biasing currents in the ON and OFF states. When the bias current is on, the gate capacitance is about 27 fF in simulation, while it is 4 fF for the OFF state. Considering an interpolation capacitance of 1.2 pF, a load capacitance variation from 4 fF to 27 fF creates a phase error of about 0.5°, corresponding to 0.57 ps at 2.4 GHz. This residual error can be further corrected with the tunable delay dt, which will be discussed in detail later. In addition, even though the selected VCO phase is toggled at the reference clock rate, it will not significantly increase the reference spur since the minor loading variation due to phase switching is negligible compared to the total loading from the interpolation network onto the VCO tank. It should be noted that if all VCO phases are used as a multi-phase clock, each phase in the interpolation network will need to be buffered, alleviating the issue of unbalanced loading.



Fig. 3.14. Schematic diagram of the CML multiplexer.

An asymmetric buffer is inserted on the selection words to sharpen their rising edges while flattening their falling edges. This ensures a small amount of overlap between two adjacent selection bits to reduce glitches at the multiplexer output during phase switching. Since all the VCO phases are available in parallel at different ports of the interpolation network, the multiplexer only needs to activate a branch to select the desired VCO phase at the reference rate. In addition, considering that the selected phase is sampled by the SSPD at the rising edge of the reference clock, while its falling edge is used for phase selection, race conditions are avoided. Large DC biasing resistors are attached to the interpolation nodes (P1-P16) for proper operation of the differential pairs. Since the resulting parallel resistance seen by the tank is fairly large (~10 kΩ), this will not significantly degrade the VCO noise performance. In order to ensure a consistent propagation delay added onto each phase, a symmetric layout of the multiplexer was designed with care. Again, the residual phase error of the interpolated VCO phases can be calibrated with the tunable delay dt on the reference path.

The capacitively coupled QVCO is illustrated in Fig. 3.15, in which the oscillation signal of each oscillator core is coupled to the gates of the NMOS transistors in the next stage through the phase-coupling capacitor $C_{qc}$. The cross-coupling capacitor $C_{cc}$ path forms the $-g_m$ needed for oscillation. The combination of the coupling factor, defined as $m = C_{qc}/C_{cc}$, the source degeneration CS and gm can be used to tune the coupling path phase delay for minimum phase noise and phase error without multi-modal oscillation. We choose m=0.6 and a phase delay of $60^{\circ}$ to achieve the optimized phase noise and phase error [23]. The I/Q outputs from the QVCO are connected to the interpolation network for multi-phase signal generation, as shown in Fig. 3.11.

Capacitance value CS in the QVCO can be used to alter the phase shift of the quadrature-coupled signals and thus can be used for phase noise optimization [23-24]. It can be shown that both CS and the oscillating transistor's trans-conductance gm can alter the phase shifting relationship. While gm needs to be kept constant to overcome the tank loss for stable oscillation, CS provides a tunable parameter for phase shift adjustment. As shown in [23-24], phase noise can be minimized by shifting the peak value of the noise source current away from the zero crossing point of the VCO output signal. Thus, CS is adjusted to achieve the optimized phase noise at the center of the oscillation frequency band.



Fig. 3.15. Schematic diagram of the quadrature capacitive coupled VCO.

### 3.4 Measurement Results

The proposed SSPLL is implemented in a 130 nm CMOS technology with the die photo shown in Fig. 3.16. The total active area is approximately 0.43 mm². The system consumes 21 mW with a 1.3 V power supply. The system power breakdown is shown in Table I. Most of the power is consumed by the QVCO, which delivers a phase noise of -121 dBc/Hz and -140 dBc/Hz at 1 MHz and 10 MHz offset, respectively, achieving a VCO FoM of -178 dB. The QVCO is able to tune from 2.39 GHz to 2.46 GHz. We have not detected any effect on the VCO phase noise in measurements when turning the multiplexer on or off; thus the loading on the interpolation network causes no significant degradation of the VCO performance.

The measured phase noise of the reference clock, the SSPLL and the VCO is shown in Fig. 3.17. In integer-N mode, a low in-band noise floor of -120 dBc/Hz has been measured, as expected from the use of the SSPD. The loop bandwidth is set to around 1.5 MHz, where the in-band noise floor intersects the VCO free-running phase noise, to minimize the total PLL noise. An integrated jitter of 158 fs (10 kHz-10 MHz) has been measured at 2.4 GHz. With careful circuit and layout design, a very low reference spur of -72 dBc has been measured. In the fractional-N mode, limited by the number of VCO phases, the finest available fractionality is 1/16, leading to a fractional offset frequency of 3.125 MHz. With an integer division ratio of 48, the synthesized frequency equals 2.397 GHz. The measured phase noise at this frequency is shown in Fig. 3.17, maintaining an in-band noise floor around -120 dBc/Hz with an integrated jitter of 169 fs (10 kHz-10 MHz). The measured in-band phase noise variation as the PLL switches from the FLL to the SSL is presented in Fig. 3.18. In the measurement, the differential input voltage Vdiff to the differential pair in the switching controller (see Fig. 3.3) was swept from -0.5 V to 0.5 V, corresponding to a transition from the FLL to the SSL. The measured in-band phase noise improves from -102 dBc/Hz (FLL ON and SSL OFF) to -120 dBc/Hz (FLL OFF and SSL ON). The phase noise varies as the PLL transitions between these two cases and peaks around the middle point, where the simulated total loop gain drops.
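The fractional-N numbers quoted above follow from simple arithmetic, sketched below. The 50 MHz reference is the value listed for this work in Table 3.1. The placement of the fractional channel one step below 48 × f_ref is an assumption made here to reproduce the quoted 2.397 GHz, and all variable names are illustrative.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
F_REF_HZ = 50e6      # reference frequency (Table 3.1, this work)
NUM_PHASES = 16      # interpolated VCO phases, so the finest fractionality is 1/16
N_INT = 48           # integer division ratio quoted in the text

frac_step_hz = F_REF_HZ / NUM_PHASES      # fractional offset frequency
f_integer_hz = N_INT * F_REF_HZ           # integer-N channel
# Assumption: the fractional channel lies one fractional step below the integer channel.
f_frac_hz = f_integer_hz - frac_step_hz

print(f"fractional step : {frac_step_hz / 1e6:.3f} MHz")
print(f"integer channel : {f_integer_hz / 1e9:.3f} GHz")
print(f"frac-N channel  : {f_frac_hz / 1e9:.3f} GHz")
```

Under these assumptions the step is 3.125 MHz and the fractional channel rounds to 2.397 GHz, in line with the values reported in the measurement.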



Fig. 3.17. (a) Measured phase noise and reference spur at 2.4 GHz in integer-N mode; (b) Measured phase noise in fractional-N mode at 2.397 GHz.



Fig. 3.18. Measured in-band phase noise variation in integer mode with the simulated normalized loop gain versus differential input voltage that tunes the currents to switch the loop from FLL to SSL.

Due to the phase error of the interpolated VCO phases, the closest fractional spur with a fractionality of 1/16 was originally -37 dBc, as shown in Fig. 3.20. The phase error among the interpolated VCO phases can be detected from the sampled voltages Sp and Sn at the SSPD output, which are wired to pins and observed with an external oscilloscope as shown in Fig. 3.19(b). To calibrate for the phase error, the sampling reference edge can be delayed or advanced with the fine tunable delay cell dt in the reference path. The control bits are stored in an on-chip memory (D1-D16) and sequentially shifted onto dt by integrated digital logic. After calibration, the variation of the sampled voltage (Sp, Sn) across the VCO phases is greatly reduced, which lowers the fractional spur level. As a result, the closest fractional spur has been reduced by 15 dB to -52 dBc. The loop bandwidth in this case was set to 1 MHz, so the fractional spur experiences slight suppression from the loop filter both before and after the phase error calibration. For an automatic integrated calibration, a basic comparator would be able to provide the required information to a basic least-mean-square (LMS) algorithm.
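The automatic calibration suggested above can be illustrated with a small behavioural sketch. It is not the implemented circuit: it assumes a comparator that only reports the sign of the residual timing error of each phase, and it uses a unit-step, sign-based update in the spirit of LMS. The delay resolution, error range and all names are hypothetical.

```python
# Hypothetical behavioural model; none of these names or values come from the chip.
import random

NUM_PHASES = 16
DELAY_LSB_PS = 0.5   # assumed resolution of the fine delay cell dt
MAX_STEPS = 200      # assumed number of calibration passes

def calibrate(phase_err_ps, lsb_ps=DELAY_LSB_PS, steps=MAX_STEPS):
    """Return one delay code per VCO phase using sign-only updates.

    phase_err_ps[k] is the timing error of interpolated phase k. Only the sign
    of the residual error is used, as a 1-bit comparator decision would provide.
    """
    codes = [0] * len(phase_err_ps)
    for _ in range(steps):
        for k, err in enumerate(phase_err_ps):
            residual = err + codes[k] * lsb_ps   # error left after delay correction
            if residual > 0:
                codes[k] -= 1                    # positive residual: step the code down
            elif residual < 0:
                codes[k] += 1                    # negative residual: step the code up
    return codes

random.seed(0)
errors = [random.uniform(-10.0, 10.0) for _ in range(NUM_PHASES)]
codes = calibrate(errors)
residuals = [e + c * DELAY_LSB_PS for e, c in zip(errors, codes)]
print(f"worst residual error: {max(abs(r) for r in residuals):.3f} ps")
```

In this model each code settles to within one delay LSB of the ideal correction, so the residual spur would be bounded by the delay cell resolution rather than by the initial phase error.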



Fig. 3.19. Testing setup for (a) dual loop feedback delay mismatch calibration in integer mode (b) VCO interpolation phase error calibration in fractional-N mode.



Fig. 3.20. Fractional spur (a) before phase error calibration (b) after calibration (c) across fractional offset frequency.

The robustness of the proposed soft loop gain switching scheme has been tested as well. In the test setup, a periodic step voltage of approximately 150 mV was injected in a way similar to [7]. Through a large capacitor connected in series, the VCO supply voltage experiences periodic spikes resembling the perturbation from digital circuits. Such interference forces the SSL out of lock, so that the relocking behavior of the PLL can be repeated and observed. The exact amount of disturbance required to drive a PLL out of lock depends on many circuit and design parameters, which are difficult to model accurately. However, as a rule of thumb, as soon as the induced phase error exceeds the capture range of the SSPD (i.e., from  $-\pi$  to  $\pi$  or half VCO period), the SSL will lose lock.

Two experiments were conducted with the proposed PLL. In the first setup, the switching threshold is set to VDD/2, meaning that the loop starts switching as soon as the phase error reaches Tref/2. Under this configuration, the proposed PLL is very similar to [4], where the PFD has a dead-zone of Tref/2. As shown in Fig. 3.21(a), after the perturbation has been injected, the lock detection voltage drops instantly, indicating that the PLL is out of lock. However, the FLL remains inactive because the phase error at this time is not large enough to reach the switching threshold of Tref/2. Thus the PLL needs to wait for the phase error to accumulate before the frequency loop is activated and lock is regained. Note that soft switching is still applied here, which differs from the hard switching used in [4]. Nevertheless, the issue of delayed relocking is clearly demonstrated.

In the second setup, a high switching threshold close to VDD is applied with an estimated FLL dead-zone of  $\pm \pi/2$  or  $\pm 104$  ps at 2.4 GHz. This ensures that the SSL is enabled only when the phase error is within its detection range. As shown in Fig. 3.21(b), once the lock detection voltage drops below the switching threshold, the loop instantly switches to the FLL. The relocking time has been reduced by more than half compared to that in the first setup. After the loop is relocked, it is switched back to the SSL. Since the FLL is completely disconnected from the loop filter after phase lock, the interference and noise from the FLL are avoided. The relocking time of this design is longer than that of [7] due to the average phase error estimation (the low-pass pole from R1 and C1 in Fig. 3.3) and the off-chip op-amp used to control the loop switching behavior. Thus, to provide locking robustness against high-frequency disturbances, the bandwidth of the loop switching controller needs to be increased. As for noise on the supply voltage, if it is sufficiently large to cause the phase error to trigger the loop switching, the proposed soft-switching scheme has an advantage over prior-art designs in relocking time.
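The correspondence between the ±π/2 dead-zone and ±104 ps quoted for the second setup is a direct phase-to-time conversion at the VCO frequency. A minimal sketch, with a hypothetical helper name:

```python
import math

def phase_to_time_ps(phase_rad: float, f_vco_hz: float) -> float:
    """Convert a VCO phase offset in radians to a time offset in picoseconds."""
    return phase_rad / (2 * math.pi * f_vco_hz) * 1e12

# A quarter of the VCO period at 2.4 GHz.
dead_zone_ps = phase_to_time_ps(math.pi / 2, 2.4e9)
print(f"+/-pi/2 at 2.4 GHz corresponds to +/-{dead_zone_ps:.1f} ps")
```

At 2.4 GHz the VCO period is about 417 ps, so a quarter period is about 104 ps, matching the value in the text.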



Fig. 3.21. Measured relocking transient behavior after a supply perturbation for (a) low switching threshold, similar to a SSPLL using FLL with large dead-zone; (b) high switching threshold, demonstrating fast relock for the proposed SSPLL.

In order to compensate for the delay mismatch between the two loops, a coarse tunable delay dT on the reference clock of the PFD is used, as shown in Fig. 3.19(a). In this calibration, the PLL first locks to an integer frequency with the FLL and then switches to phase locking with the SSL. Note that the loop can still maintain lock after switching from the FLL to the SSL as long as the delay mismatch between the two loops is smaller than one VCO period. Next, dT is tuned based on Vlock from the FLL, where a larger Vlock indicates a smaller phase error in the FLL. Once dT is tuned to compensate for the extra propagation delay in the feedback path from the divider, the SSL is locked to the reference edge while the FLL is locked to the delayed reference edge. The phase offset calibration is limited by the resolution of the tunable delay dT (20 ps), which is much smaller than one VCO cycle (~400 ps). The measured relocking transient without loop delay calibration is shown in Fig. 3.22, from which we can see that the loop switches between the FLL and the SSL multiple times, prolonging the relocking time compared to that with loop delay calibration.



Fig. 3.22. Measured relocking transient behavior with (a) large phase offset between two loops (b) minimal phase offset after loop delay mismatch calibration.

A performance summary and comparison to other state-of-the-art SSPLL designs is given in Table 3.1. The fractionality (1/16 in this work) is limited by the number of available VCO output phases. If a finer step size is needed, the interpolated VCO phases can provide the coarse tuning for fractional-N operation, and additional phase tuning can be achieved by tuning the delay stages on the reference path. The use of interpolated VCO phases greatly reduces the maximum delay needed on the reference path. As a result, lower power and less degradation of the in-band phase noise can be achieved. The power consumption in this design is slightly larger than in other SSPLL designs, mainly due to the use of a larger feature size CMOS process. Even though only a decent FoM of -242 dB is achieved, the design generates 16 VCO phases simultaneously.

TABLE 3.1 MEASURED SSPLL PERFORMANCES AND COMPARISONS.

|                         | Gao [22]<br>JSSC-10<br>Integer-N | Hsu [7]<br>TCAS-15<br>Integer-N | Chang [15]<br>JSSC-14<br>Frac-N | Gao [16]<br>ISSCC-16<br>Frac-N | Narayanan [19]<br>JSSC-16<br>Frac-N | This work<br>Frac-N       |
|-------------------------|----------------------------------|---------------------------------|------------------------------------|--------------------------------|-----------------------------------------|---------------------------|
| Technology (nm)         | 180                              | 65                              | 180                                | 28                             | 65                                      | 130                       |
| Ref. (MHz)              | 55.25                            | 50                              | 48                                 | 40                             | 40                                      | 50                        |
| Output Freq. (GHz)      | 2.21                             | 1.9-2.3                         | 2.12~2.4                           | 2.7-4.3                        | 4.34-4.94                               | 2.39-2.46                 |
| In-band PN<br>(dBc/Hz)  | -121                             | -122                            | -112                               | -                              | -120                                    | -120                      |
| Int. RMS<br>Jitter (fs) | 300<br>(10kHz-<br>100MHz)        | 484<br>(10kHz-<br>40MHz)        | 266<br>(10kHz-<br>30MHz)           | 159*<br>(10kHz-<br>40MHz)      | 133*<br>(10kHz-<br>10MHz)               | 169*<br>(10kHz-<br>10MHz) |
| Ref. Spur (dBc)         | -80                              | -41                             | -55                                | -78                            | -70                                     | -72                       |
| Frac. Spur (dBc)        | -                                | -                               | -70<br>(3 MHz)                     | -54<br>(100 kHz)               | -59<br>(30 kHz)                         | -52<br>(3.125<br>MHz)     |
| Power (mW)              | 3.8                              | 8.8                             | 17.3                               | 8.2                            | 6.2                                     | 21                        |
| No. of Out.<br>Phases   | 2                                | 2                               | 2                                  | 2                              | 32                                      | 16                        |
| FoM (dB)                | -244                             | -236                            | -239                               | -247                           | -250                                    | -242                      |

$\text{FoM} = 10\log_{10}\left[\left(\frac{\sigma_{t}}{1\,\text{s}}\right)^2 \cdot \frac{\text{Power}}{1\,\text{mW}}\right]$; \* measured in fractional mode.
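As a quick check of the FoM definition used in the table, the sketch below evaluates it for the jitter and power reported for this work (169 fs, 21 mW). The function name is illustrative.

```python
import math

def pll_fom_db(jitter_rms_s: float, power_mw: float) -> float:
    """Jitter-power FoM: 10*log10((sigma_t / 1 s)^2 * (P / 1 mW))."""
    return 10 * math.log10(jitter_rms_s ** 2 * power_mw)

fom_this_work = pll_fom_db(169e-15, 21.0)
print(f"FoM = {fom_this_work:.1f} dB")
```

The result is about -242 dB, consistent with the last column of the table.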

### 3.5 Conclusions

A fractional-N subsampling PLL with fast robust locking has been presented in this chapter using a dual-loop structure with automatic soft loop switching. The potentially long relocking time of an SSPLL has been reduced without compromising the in-band phase noise. Compared with prior-art SSPLLs, the proposed loop switching scheme greatly reduces the variation in loop gain and phase margin over a large range of phase error. Utilizing a QVCO with a capacitive interpolation network, the proposed SSPLL can simultaneously generate 16 VCO phases for multi-phase clock applications. Using a passive interpolation network, the proposed SSPLL can be extended from integer-N mode to fractional-N mode with little overhead in noise or power. This SSPLL design has achieved a reference spur and fractional spur of -72 dBc and -52 dBc, respectively. The integrated jitter in integer-N and fractional-N modes is 158 fs and 169 fs at 2.4 GHz, respectively, while the PLL consumes 21 mW including the power consumed by the FLL.

#### References

- [3] M. Mikhemar, D. Murphy, A. Mirzaei and H. Darabi, "A Cancellation Technique for Reciprocal-Mixing Caused by Phase Noise and Spurs," IEEE J. Solid-state Circuits, vol. 48, no. 12, pp. 3080-3089, Dec. 2013.
- [4] H. Wang, F. Dai, H. Wang, "A 330µW 1.25ps 400fs-INL Vernier time-to-digital converter with 2D reconfigurable spiral arbiter array and 2nd-order ΔΣ linearization," IEEE Custom Integrated Circuits Conf. (CICC), Austin, TX, 2017.
- [5] D. Liao, H. Wang, F. F. Dai, Y. Xu, R. Berenguer and S. M. Hermoso, "An 802.11a/b/g/n digital fractional-N PLL with automatic TDC linearity calibration for spur cancellation," IEEE J. Solid-state Circuits, vol. 52, no. 5, pp. 1210-1220, May 2017.
- [6] X. Gao, E. A. M. Klumperink, M. Bohsali and B. Nauta, "A Low Noise Sub-Sampling PLL in Which Divider Noise is Eliminated and PD/CP Noise is Not Multiplied by N2," IEEE J. Solid-state Circuits, vol. 44, no. 12, pp. 3253-3263, Dec. 2009.
- [7] Z. Ru, P. Geraedts, E. Klumperink, X. He and B. Nauta, "A 12GHz 210fs 6mW digital PLL with sub-sampling binary phase detector and voltage-time modulated DCO," Symp. on VLSI Circuits, Kyoto, 2013, pp. C194-C195.
- [8] T. Siriburanon et al., "A 2.2 GHz -242 dB-FOM 4.2 mW ADC-PLL Using Digital Sub-Sampling Architecture," IEEE J. Solid-state Circuits, vol. 51, no. 6, pp. 1385-1397, June 2016.
- [9] C. W. Hsu, K. Tripurari, S. A. Yu and P. R. Kinget, "A Sub-Sampling-Assisted Phase-Frequency Detector for Low-Noise PLLs with Robust Operation under Supply Interference," IEEE Trans. Circuits Syst. I, vol. 62, no. 1, pp. 90-99, Jan. 2015.

- [10] D. Liao, F. Dai, B. Nauta and E. Klumperink, "Multi-Phase Sub-Sampling Fractional-N PLL with Soft Loop Switching for Fast Robust Locking," IEEE Custom Integrated Circuits Conf. (CICC), Austin, TX, 2017.
- [11] R. Wang and F. F. Dai, "A 1~1.5 GHz Capacitive Coupled Multi-ring Oscillator with Improved Phase Noise," 42nd European Solid-State Circuits Conf. (ESSCIRC), Geneva, Switzerland, Sept. 2016, pp. 377-380.
- [12] R. Wang and F. F. Dai, "A 0.8~1.3 GHz Multi-phase Injection-locked PLL Using Capacitive Coupled Multi-ring Oscillator with Reference Spur Suppression," IEEE Custom Integrated Circuits Conf. (CICC), Austin, TX, 2017.
- [13] D. Murphy et al., "A Blocker-Tolerant, Noise-Cancelling Receiver Suitable for Wideband Wireless Applications," IEEE J. Solid-state Circuits, vol. 47, no. 12, pp. 2943-2963, Dec. 2012.
- [14] Y. Lien, E. Klumperink, B. Tenbroek, J. Strange and B. Nauta, "A mixer-first receiver with enhanced selectivity by capacitive positive feedback achieving +39dBm IIP3 and <3dB noise figure for SAW-less LTE Radio," IEEE Radio Frequency Integrated Circuits Symp. (RFIC), Honolulu, HI, 2017, pp. 280-283.
- [15] J. W. Rogers, C. Plett, and F. F. Dai, "Integrated Circuit Design for High-Speed Frequency Synthesis," ARTECH HOUSE PUBLISHERS, INC., ISBN: 1-58053-982-3, Norwood, MA, February, 2006, pp. 189-190.
- [16] J. W. Rogers, F. F. Dai, M. S. Cavin, and D. G. Rahn, "A multiband ΔΣ fractional-N frequency synthesizer for a MIMO WLAN transceiver RFIC," IEEE J. Solid-state Circuits, vol. 40, no. 3, pp. 678-689, March 2005.

- [17] W. S. Chang, P. C. Huang and T. C. Lee, "A Fractional-N Divider-Less Phase-Locked Loop With a Subsampling Phase Detector," IEEE J. Solid-state Circuits, vol. 49, no. 12, pp. 2964-2975, Dec. 2014.
- [18] X. Gao, E. Klumperink and B. Nauta, "9.6 A 2.7-to-4.3GHz, 0.16 ps rms jitter, -246.8dB-FOM, digital fractional-N sampling PLL in 28nm CMOS," IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, 2016, pp. 174-175.
- [19] N. Pavlovic and J. Bergervoet, "A 5.3GHz Digital-to-Time Converter Based Fractional-N All-Digital PLL," IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, 2011, pp. 54-56.
- [20] X. Gao, E. A. M. Klumperink and B. Nauta, "Advantages of Shift Registers Over DLLs for Flexible Low Jitter Multiphase Clock Generation," IEEE Trans. Circuits Syst. II, vol. 55, no. 3, pp. 244-248, March 2008.
- [21] T. Narayanan, et al., "A Fractional-N Sub-Sampling PLL using a Pipelined Phase-Interpolator with an FoM of -250dB," IEEE J. Solid-state Circuits, vol. 51, no. 7, pp. 1630-1640, July 2016.
- [22] M. Zanuso, S. Levantino, C. Samori and A. L. Lacaita, "A Wideband 3.6 GHz Digital ΔΣ Fractional-N PLL With Phase Interpolation Divider and Digital Spur Cancellation," IEEE J. Solid-state Circuits, vol. 46, no. 3, pp. 627-638, March 2011.
- [23] H. S. Chen and L. H. Lu, "An Open-Loop Half-Quadrature Hybrid for Multiphase Signals Generation," IEEE Trans. Microwave Theory Tech., vol. 60, no. 1, pp. 131-138, Jan. 2012.
- [24] X. Gao, E. A. M. Klumperink, G. Socci, M. Bohsali and B. Nauta, "Spur Reduction Techniques for Phase-Locked Loops Exploiting A Sub-Sampling Phase Detector," IEEE J. Solid-state Circuits, vol. 45, no. 9, pp. 1809-1821, Sept. 2010.

- [25] F. Zhao and F. F. Dai, "A 0.6V quadrature VCO with optimized capacitive coupling for phase noise reduction," IEEE Trans. Circuits Syst. I, vol. 59, no. 8, pp. 1694-1705, Aug. 2012.
- [26] F. Zhao and F. F. Dai, "Low-Noise Low-Power Design for Phase-Locked Loops Multi-Phase High-Performance Oscillators," Springer International Publishing AG, ISBN 978-3-319-12199-4, New York, Nov. 2014.

# Chapter 4

# A 10GHz Reference Sampling PLL in a 5G Synthesizer

With the advent of 5G cellular networks, the allotted spectrum has been extended to 30GHz, while low phase noise is required to support complex modulation types including 256QAM. Directly generating such a high frequency with a single PLL is very challenging because it involves high-frequency dividers. In this design, a cascaded structure using two frequency synthesizer stages is implemented. The first stage uses a sub-sampling PLL to generate an output frequency around 10GHz. The second stage uses an injection-locked oscillator to further generate a 40GHz carrier. This chapter focuses on the first-stage 10GHz PLL.



Fig. 4.1. Simplified architecture diagram of the proposed 5G synthesizer.

To provide low phase noise at the PLL output, a sub-sampling phase detector has been utilized in the first stage as shown in Fig. 4.1. Owing to the large gain of the SSPD, in-band noise from the subsequent stages is largely suppressed. This phase noise improvement is most effective for a type-II PLL, since the charge pump usually contributes most of the in-band phase noise. However, the type-I PLL has recently gained popularity due to its simplicity and low power consumption with the removal of the charge pump [1-2]. In this design, a type-I architecture has been adopted to improve the overall PLL FoM. Compared to its type-II counterpart, a type-I PLL has two disadvantages. First, it has lower loop gain close to DC, which causes a larger residual phase error after locking. Second, due to the lack of a charge pump, it provides only 20dB/dec suppression of out-of-band noise and spurs. The first disadvantage is less of a problem since the gain of the SSPD is inherently much larger than that of a PFD, which reduces this residual phase error. In addition, strict zero phase error is only required in some special cases, and a static phase error does not compromise the frequency accuracy of the generated carrier, which is what matters in most applications. The second disadvantage makes the loop less effective in suppressing in-band VCO phase noise, which rises by at least 20dB/dec toward lower offset frequencies and even faster below the flicker noise corner frequency. Therefore, the PLL bandwidth needs to be larger than that of a type-II PLL to sufficiently suppress the VCO phase noise. On the other hand, the out-of-band spurs also experience less suppression, so a type-I PLL tends to have higher reference and fractional spurs. Even without the charge pump, the large gain provided by the SSPD can still suppress the kT/C noise from the voltage sampler and even the thermal noise from the loop filter. Therefore, the capacitor size can be reduced to save silicon area.

Another aspect this design has focused on is loop stability in the presence of external disturbances. In the original SSPD design, the VCO waveform is directly sampled by the reference edge, so the sampled voltage is valid only within one VCO cycle. To extend the valid range of the SSPD, a reference sampling PLL (RSPLL) structure has been proposed [2]. Instead of using the reference edge to sample the VCO waveform, the RSPLL uses the VCO edge to sample the reference waveform, so the valid range of the phase detector can be extended to one full reference cycle. Rather than directly sampling the sinusoidal input reference waveform, a differential pair is used to reshape the reference waveform into sharper rising and falling edges. An array of binary-weighted capacitors has been implemented in the sampler. By programming the effective capacitance in the sampler, the reference waveform can be reshaped with different slopes and linear ranges: a sharper slope gives a narrower linear range and vice versa. A faster slope increases the SSPD gain and improves the in-band phase noise, whereas a slower slope increases the linear range of the SSPD and thus improves its robustness against external interference. A fully differential architecture from the SSPD to the VCO has also been adopted, so any external interference on the power supply or ground is treated as common-mode noise and suppressed. At the time this dissertation was completed, the chip was still under fabrication. This chapter is therefore presented with calculation and simulation results only.

# 4.1 Phase Noise & Ref-Sampling Phase Detector

In this section, the design considerations and procedures for reducing PLL phase noise are discussed. Due to the high gain of the RSPD and the type-I architecture, the in-band phase noise is dominated by the phase noise of the reference clock, which sets the theoretical limit of the in-band noise floor. To achieve very low in-band phase noise, the ultra-low-noise crystal oscillator CCPD-575 from Crystek is taken as an example reference clock. Some of its technical details are listed below:

| Parameter    | Value               |
|--------------|---------------------|
| Frequency    | 100 MHz             |
| Power Supply | 3.3 V               |
| Output       | Differential LVPECL |
| PN@100kHz    | -158 dBc/Hz         |
| PN@1MHz      | -160 dBc/Hz         |

Table 4.1. Technical specifications of the CCPD-575.

As a rough estimate, the PLL in-band noise floor at 100kHz will be -158+20log<sub>10</sub>(100)=-118dBc/Hz in the best case, where the reference clock phase noise dominates the in-band phase noise. Another important parameter is the output waveform. "LVPECL", or low-voltage positive emitter-coupled logic, uses an emitter-coupled output buffer to speed up the output rise and fall times with a larger swing, which is very beneficial for driving high-frequency signals over lossy PCB traces. A typical LVPECL waveform has a swing of 800mV and a rise/fall time of approximately 200ps. These two parameters are important for a CMOS inverter based reference buffer, since slower edges induce more short-circuit current, which increases the additive noise from the reference buffer. This effect can be reduced with a larger g<sub>m</sub> in the reference buffer at the cost of higher power consumption. Techniques have been proposed to save power [3] by offsetting the PMOS and NMOS turn-on times to reduce the short-circuit current. In this design, a differential pair is used as the reference buffer in order to implement a fully differential architecture. The input clock slope is less critical here since the current flowing through the buffer is almost constant. Thus the noise contribution of the reference buffer in terms of phase noise is relatively constant over time compared to its CMOS counterpart.
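The best-case estimate above scales the reference phase noise by the frequency multiplication ratio. A minimal sketch of that calculation, assuming a 10 GHz output and the 100 MHz reference from Table 4.1 (the function name is illustrative):

```python
import math

def inband_floor_dbc_hz(ref_pn_dbc_hz: float, f_out_hz: float, f_ref_hz: float) -> float:
    """Best-case in-band floor: reference phase noise raised by 20*log10(N)."""
    n = f_out_hz / f_ref_hz
    return ref_pn_dbc_hz + 20 * math.log10(n)

floor_100k = inband_floor_dbc_hz(-158.0, 10e9, 100e6)  # reference PN at 100 kHz offset
floor_1m = inband_floor_dbc_hz(-160.0, 10e9, 100e6)    # reference PN at 1 MHz offset
print(f"floor at 100 kHz: {floor_100k:.0f} dBc/Hz")
print(f"floor at 1 MHz  : {floor_1m:.0f} dBc/Hz")
```

With N = 100 the multiplication penalty is 40 dB, which gives -118 dBc/Hz at 100 kHz and -120 dBc/Hz at 1 MHz when only the reference contributes.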



Fig. 4.2. Schematic diagram of RSPD with the capacitor bank.

The schematic diagram of the implemented RSPD is shown in Fig. 4.2. The first-stage differential pair converts the input reference clock voltage into a current, driving its own resistive load and the capacitor bank shown on the right. The second stage is a sample-and-hold circuit consisting of a PMOS switch (M1) and the sampling capacitor (Cs). The capacitor bank consists of three binary-weighted capacitors, each of which can be selected by passing through the gated clock signals  $\varphi_1$  and  $\varphi_2$ . These two clock signals are derived from the frequency divider and represent the VCO sampling edge. Two PMOS transistors (M2) with half the size of M1, each with drain and source shorted, absorb and supply the additional charge injected when M1 switches on and off. An additional PMOS switch (M3) then transfers the charge on Cs to the loop filter.



Fig. 4.3. Test-bench for in-band phase noise simulation.

Due to the lack of a charge pump, the SSPD along with the reference oscillator determines the in-band noise floor. We can use this module to build a test bench and simulate its in-band phase noise as shown in Fig. 4.3. With an open-loop structure, there is no need to run a long simulation waiting for the loop to lock first, which usually takes several  $\mu s$ , whereas the open-loop structure usually settles within a few reference cycles. In this case, the Spectre pss (periodic steady-state) analysis only takes about 50ns of simulation time to converge. Since the VCO mainly contributes to out-band phase noise, an ideal VCO can be used here to further save simulation time. The divider is included to simulate its noise contribution during frequency scaling. The reference clock source can be modeled with a port whose noise parameters are set up using the datasheet of the actual crystal oscillator used in the measurement. Further processing might be needed to reproduce the actual output waveform of the crystal oscillator (LVPECL in this case). Since the core of this test bench is the SSPD, finding the output noise is very similar to simulating the noise of a sampler. After a few reference cycles, the sampled voltage Sp/Sn quickly settles to a fixed value. Since the sampling process is highly non-linear and periodic after PLL phase lock, pss is well suited to this scenario. After finding the steady-state solution with pss, the noise in the sampled voltage Sp/Sn can be simulated with Spectre pnoise (periodic noise). Since the voltage at Sp/Sn is used as the VCO tuning voltage after the loop filter, it is the pnoise around DC that we care about, and the relative harmonic needs to be set to 0 in this setup. In addition, since the in-band phase noise has been converted to a voltage at Sp/Sn and we only care about its noise around DC (not around the reference clock frequency), the noise extracted at the SSPD output shall be the absolute voltage noise in dBV/Hz.

To convert the noise voltage at Sp/Sn to the actual in-band phase noise, it needs to be divided by the gain from the VCO output phase to the SSPD output. Defining the differential sampled voltage *Vs* as the difference between Sp and Sn, we find:

$$PN_{in\text{-}band} = PN_{Vs} - 20\log_{10} \left( \frac{\frac{d(Vs)}{dt}}{2\pi f_0} \right) \tag{4.1}$$
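Equation (4.1) can be applied as a small post-processing step on the pnoise result. The sketch below is illustrative only: the noise level at Vs, the slope and the carrier frequency are placeholder numbers chosen for the example, not simulation results from this design.

```python
import math

def inband_pn_dbc_hz(pn_vs_dbv_hz: float, slope_v_per_s: float, f0_hz: float) -> float:
    """Apply Eq. (4.1): refer the sampled-voltage noise back to output phase noise."""
    gain_v_per_rad = slope_v_per_s / (2 * math.pi * f0_hz)  # detector gain in V/rad
    return pn_vs_dbv_hz - 20 * math.log10(gain_v_per_rad)

# Placeholder inputs (assumptions, not measured or simulated values):
PN_VS_DBV_HZ = -140.0          # hypothetical pnoise level at Vs around DC
SLOPE_V_PER_S = 0.8 / 200e-12  # hypothetical slope: 0.8 V over 200 ps
F0_HZ = 10e9                   # assumed carrier frequency f0

pn_inband = inband_pn_dbc_hz(PN_VS_DBV_HZ, SLOPE_V_PER_S, F0_HZ)
print(f"in-band phase noise: {pn_inband:.2f} dBc/Hz")
```

A steeper sampled slope raises the detector gain and lowers the phase noise referred to the output for the same voltage noise, which is the tradeoff controlled by the capacitor bank.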

The simulation result is shown in Fig. 4.5, achieving an in-band noise floor around -114dBc/Hz, which is close to the theoretical limit of -118dBc/Hz calculated earlier. The noise summary, which lists each individual noise contributor, shows that the reference clock contributes about 70% of the total noise power at 1MHz and that the additional noise mainly comes from the reference clock buffer.



Fig. 4.4. pss and pnoise setup for in-band phase noise simulation.



Fig. 4.5. Simulated in-band phase noise.

# 4.2 Out-band Phase Noise & a Class-C VCO

In this section, we discuss the other half of the PLL phase noise, namely the out-band phase noise. Noise sources injected after the loop filter and before the PLL output experience a high-pass transfer function, and the closed-loop gain approaches 1 with increasing offset frequency. In most PLLs, the out-band phase noise is dominated by the VCO phase noise. Thus at far-off frequencies, the PLL output spectrum basically follows that of the VCO output.
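The high-pass shaping of VCO noise can be illustrated with a deliberately simplified model. The sketch assumes a first-order type-I loop with open-loop gain ω_c/s and no additional loop filter poles, so the VCO noise transfer function is s/(s + ω_c). The bandwidth value is hypothetical.

```python
import math

def vco_noise_suppression_db(f_offset_hz: float, loop_bw_hz: float) -> float:
    """Magnitude of s/(s + w_c) in dB for a first-order type-I loop model."""
    ratio = f_offset_hz / loop_bw_hz
    magnitude = ratio / math.sqrt(1.0 + ratio ** 2)
    return 20 * math.log10(magnitude)

LOOP_BW_HZ = 1e6  # hypothetical closed-loop bandwidth
for f_offset in (1e4, 1e5, 1e6, 1e7, 1e8):
    gain_db = vco_noise_suppression_db(f_offset, LOOP_BW_HZ)
    print(f"{f_offset:>12.0f} Hz : {gain_db:7.2f} dB")
```

In this model the VCO noise is attenuated by 20 dB per decade inside the loop bandwidth and passes essentially unchanged far outside it.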

The VCO structure adopted in this PLL is a class-C differential LC-tank based VCO as proposed in [4]. In a conventional differential pair based VCO, each transistor conducts the biasing current for 50% of each oscillation cycle. In other words, each transistor has a conduction angle of  $\pi$ . For a class-C VCO, on the other hand, the conduction angle is reduced to less than  $\pi$  such that the bias current is injected into the tank in a pulse-like waveform as shown in Fig. 4.6. The magnitude of each pulse is much larger than the biasing current while its duration is less than half of the oscillation period, such that the total area remains the same as that in a normal differential pair based oscillator. The benefit of this reduced conduction angle is twofold. Firstly, an impulse is more efficient than a square wave in generating the fundamental oscillation tone. More specifically, for an equal amount of biasing current, the fundamental tone generated with narrow and tall pulses is 3.9dB higher than that generated with square waves. This can be understood by considering that, for the same average current, the fundamental component of a narrow pulse train approaches twice the average current, whereas that of a square wave is only  $4/\pi$  times the average current. Thus, using the same amount of power, a class-C VCO can generate a larger oscillation magnitude than the conventional differential pair based VCO. The second benefit is that by injecting a large amount of current into the tank at the peak of its oscillation waveform, where its ISF is minimal, the phase noise from the switching transistors and the biasing transistor can be reduced, leading to improved VCO phase noise.
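The 3.9 dB figure can be checked with the Fourier series of an idealized rectangular current pulse. The sketch below assumes a rectangular pulse of conduction angle θ centered on the waveform peak and compares the fundamental amplitude for the same average current. The function name is illustrative.

```python
import math

def fundamental_amplitude(avg_current: float, conduction_angle_rad: float) -> float:
    """Fundamental amplitude of a rectangular pulse train with a given average current.

    A pulse of height h and angle theta has average h*theta/(2*pi) and a
    fundamental of (2*h/pi)*sin(theta/2), i.e. 4*avg*sin(theta/2)/theta.
    """
    if conduction_angle_rad == 0.0:
        return 2.0 * avg_current  # impulse limit
    half = conduction_angle_rad / 2.0
    return 4.0 * avg_current * math.sin(half) / conduction_angle_rad

I_AVG = 1.0  # normalized average current per transistor
square = fundamental_amplitude(I_AVG, math.pi)  # conduction angle pi (50% duty)
impulse = fundamental_amplitude(I_AVG, 0.0)     # ideal narrow pulses
gain_db = 20 * math.log10(impulse / square)
print(f"narrow-pulse advantage: {gain_db:.2f} dB")
```

The ratio is π/2, or about 3.9 dB, in agreement with the value quoted above. Practical conduction angles between 0 and π give an advantage between 0 and 3.9 dB.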



Fig. 4.6. Illustrations of voltage and current waveform in class-C operation.

The schematic diagram of the proposed VCO is shown in Fig. 4.7. In order to achieve class-C operation, two key modifications to the conventional differential pair based VCO are needed: 1. An RC circuit to bias the switching transistors' gates with an external DC voltage; 2. A large capacitor shunting the common node to ground. Even though the original paper derived the idea of the class-C VCO from the Colpitts oscillator, the basic mechanism of class-C operation can be more easily understood with the simulation results shown in Fig. 4.8. In a conventional differential pair based VCO as shown in (a), the switching transistor's gate is biased at VCC and the transistor is turned on for half of each oscillation period. When the drain voltage of the switching transistor falls to its minimum, its  $V_{DS}$  approaches zero and the transistor enters the triode region, causing a small notch in the conducting current  $I_{DS}$ . (Note that in deep sub-micron technologies, velocity saturation is more severe and the saturation voltage is usually lower than  $V_g$ - $V_{th}$ .) In (b), the gate bias of the switching transistor is lowered, causing it to stay in the active region for the entire VCO period. The notch in  $I_{DS}$  has disappeared and the current follows the gate voltage, which appears more like an impulse waveform. Next in (c), the tail capacitor has been added; it filters out the second-order harmonic on the virtual ground such that the source voltage no longer follows the gate voltage as in (b). As a result, the current pulses are further sharpened and the conduction angle is further reduced, leading to class-C operation.



Fig. 4.7. Schematic diagram of the proposed class-C VCO.

Referring to Fig. 4.8(c), it is useful to discuss the maximal swing allowed in this design. Due to velocity saturation in deep sub-micron technologies, the saturation voltage is lower than the overdrive voltage. We assume a saturation $V_{DS}$ of 150 mV for the switching transistor and 250 mV for the biasing transistor (its length is larger than the minimal length to suppress flicker noise). At the bottom of $V_{\rm d}$, keeping both the switching transistor and the biasing transistor in the active region requires $V_{CC} - \frac{V_{osc}}{2} \ge 0.15 + 0.25$. With a supply voltage of 1.1 V, the maximal allowable swing $V_{osc}$ is thus 1.4 V.
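The headroom constraint above can be written as a one-line calculation. This is a minimal sketch using only the voltages quoted in the text; the function and constant names are illustrative.

```python
VCC = 1.1            # supply voltage in volts (from the text)
VDSAT_SWITCH = 0.15  # assumed saturation voltage of the switching transistor
VDSAT_BIAS = 0.25    # assumed saturation voltage of the biasing transistor


def max_swing(vcc, vdsat_switch, vdsat_bias):
    """Largest V_osc that satisfies VCC - V_osc/2 >= vdsat_switch + vdsat_bias."""
    return 2.0 * (vcc - vdsat_switch - vdsat_bias)


v_osc_max = max_swing(VCC, VDSAT_SWITCH, VDSAT_BIAS)
print(f"maximal allowable swing: {v_osc_max:.2f} V")
```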



Fig. 4.8. Voltage and current waveform for class-C operation: (a) switching transistor entering triode region (b) switching transistor kept within active region (c) kept within active region and adding tail capacitance.

Another vital part of VCO phase noise is the tank noise which is determined by the quality factor Q of the VCO tank. It can be shown from the famous Leeson's equation that the noise from the VCO tank can be derived as:

$$L(\omega_m) = \frac{4kTR}{V_0^2} \left(\frac{\omega_0}{2Q\omega_m}\right)^2 \tag{4.2}$$

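Assuming a constant oscillation voltage $V_0$ and substituting the parallel tank resistance $R = Q\omega_0 L$, where $L$ on the right-hand side denotes the tank inductance, the above equation can be rewritten as:

$$L(\omega_m) = \frac{4kT}{V_0^2} \left(\frac{\omega_0}{2\omega_m}\right)^2 \cdot \frac{\omega_0 L}{Q} \tag{4.3}$$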

Thus for a fixed oscillation frequency  $\omega_0$  and offset frequency  $\omega_m$ , reducing tank phase noise requires a small inductance while maintaining a high tank Q. On the other hand, a smaller inductance requires larger VCO current to achieve maximal voltage swing due to the smaller tank resistance.

A simulation testbench can be used to evaluate the inductors available in the PDK library. Ideal capacitors are connected in parallel with the selected inductor. The inductance and tank Q can then be calculated by simulating the parallel LC tank impedance and extracting the resonant frequency and 3 dB bandwidth. Results including the outer dimension, number of turns, differential Q and differential inductance of some inductors are listed in Table 4.2. In general, a larger inductance increases the L/Q ratio, and hence the tank noise, but requires less power to achieve the maximal swing. Taking this tradeoff into consideration, the selected inductor is highlighted.

| Dimension (μm) | Turns | Differential Q | Differential Inductance (pH) | $\omega_0 L/Q$ ($\Omega$) |
|----------------|-------|----------------|------------------------------|------|
| 60             | 1     | 10.9           | 46                           | 0.26 |
| 100            | 1     | 14.5           | 100                          | 0.44 |
| 150            | 1     | 18.0           | 180                          | 0.63 |
| 200            | 1     | 19.5           | 269                          | 0.87 |
| 100            | 2     | 19.7           | 269                          | 0.86 |
| 150            | 2     | 25.1           | 560                          | 1.40 |
| 200            | 2     | 29.0           | 945                          | 2.05 |
| 100            | 3     | 18.3           | 413                          | 1.43 |

Table 4.2. Simulation results for different inductors at 10GHz. The highlighted represents the inductor adopted for this design.
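As a cross-check of Table 4.2, the sketch below recomputes the last column from the listed Q and inductance values. It assumes the column is $\omega_0 L/Q$ evaluated at 10 GHz and expressed in ohms, which reproduces the tabulated numbers to within rounding; the equivalent parallel tank resistance $R = Q\omega_0 L$ is added to illustrate the power tradeoff. All names are illustrative.

```python
import math

F0 = 10e9  # frequency at which the inductors were simulated (from the caption)

# (dimension in um, turns, differential Q, differential inductance in pH, listed value)
INDUCTORS = [
    (60, 1, 10.9, 46, 0.26),
    (100, 1, 14.5, 100, 0.44),
    (150, 1, 18.0, 180, 0.63),
    (200, 1, 19.5, 269, 0.87),
    (100, 2, 19.7, 269, 0.86),
    (150, 2, 25.1, 560, 1.40),
    (200, 2, 29.0, 945, 2.05),
    (100, 3, 18.3, 413, 1.43),
]


def tank_noise_metric(l_ph, q, f0=F0):
    """omega0 * L / Q in ohms; lower values mean lower tank noise."""
    return 2.0 * math.pi * f0 * l_ph * 1e-12 / q


def parallel_resistance(l_ph, q, f0=F0):
    """Equivalent parallel tank resistance R = Q * omega0 * L in ohms."""
    return q * 2.0 * math.pi * f0 * l_ph * 1e-12


computed = [tank_noise_metric(l, q) for _, _, q, l, _ in INDUCTORS]
for (dim, turns, q, l, listed), value in zip(INDUCTORS, computed):
    r_p = parallel_resistance(l, q)
    print(f"{dim:4d} um, {turns} turn(s): {value:.2f} ohm "
          f"(listed {listed:.2f}), R_p = {r_p:.0f} ohm")
```

A larger parallel resistance means less current is needed to reach a given swing, which is the power side of the tradeoff described above.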



Fig. 4.9. Simulated VCO free-running phase noise at 9.4 GHz and VCO specifications.

The frequency tuning circuit of the VCO consists of varactor based fine tuning and coarse tuning with capacitor banks, as shown in Fig. 4.7. In order to implement a fully differential PLL architecture, the fine tuning varactor is differentially tuned. In addition to the two AC coupling capacitors $C_{f1}$, another pair of capacitors $C_{f2}$ is included to balance the RC time constants at the positive and negative tuning points. The fine tuning covers a frequency range of about 50 MHz. The coarse tuning consists of 5 capacitor banks to achieve a tuning resolution of around 30 MHz. The entire VCO covers a frequency range of around 600 MHz. The simulated free-running phase noise and specifications are shown in Fig. 4.9.

# 4.3 PLL Loop Dynamics



Fig. 4.10. PLL open loop gain and phase over offset frequency. A phase margin of 58° has been achieved.

Having discussed the major noise sources, now we can further analyze the PLL overall phase noise. The PLL open loop gain can be calculated as:

$$K_{open} = \frac{2\frac{dV}{dt}}{2\pi f_0} \cdot \frac{1}{1 + s\frac{C_p}{C_s}\frac{1}{f_{ref}}} \cdot \frac{2\pi K_{VCO}}{s} \tag{4.4}$$

where $\frac{dV}{dt}$ denotes the slope of the SSPD sampled signal, $f_0$ and $f_{ref}$ denote the VCO frequency and reference frequency, $C_s$ and $C_p$ denote the SSPD sampling capacitance and loop filter capacitance, and $K_{VCO}$ denotes the VCO frequency tuning gain. The three factors in the equation represent the gains of the SSPD, loop filter and VCO, respectively. Being a type-I PLL, its loop dynamics are much simpler than those of its type-II counterpart, including only two poles: one at the origin and the other at $f_{ref} \frac{C_s}{C_p}$. The simulated closed loop gain and phase are shown in Fig. 4.11.
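For a two-pole loop of this form, the unity-gain frequency and phase margin have a closed-form solution. The sketch below evaluates Eq. (4.4) directly; every numeric value in `PARAMS` except the 9.4 GHz VCO frequency is a hypothetical placeholder, so the printed phase margin is not expected to match the 58° of Fig. 4.10.

```python
import math


def open_loop_gain(s, slope, f0, f_ref, c_s, c_p, k_vco):
    """Evaluate Eq. (4.4) at complex frequency s (rad/s)."""
    k_pd = 2.0 * slope / (2.0 * math.pi * f0)     # phase detector gain
    h_lf = 1.0 / (1.0 + s * (c_p / c_s) / f_ref)  # loop filter pole
    k_osc = 2.0 * math.pi * k_vco / s             # VCO integration
    return k_pd * h_lf * k_osc


def crossover_and_margin(slope, f0, f_ref, c_s, c_p, k_vco):
    """Unity-gain frequency (rad/s) and phase margin (degrees)."""
    k0 = 2.0 * slope * k_vco / f0  # gain constant of K0 / (s * (1 + s / w_p))
    w_p = f_ref * c_s / c_p        # second pole
    # Solve w^2 * (1 + w^2 / w_p^2) = k0^2 for w^2
    u = 0.5 * w_p ** 2 * (math.sqrt(1.0 + 4.0 * (k0 / w_p) ** 2) - 1.0)
    w_c = math.sqrt(u)
    pm = 90.0 - math.degrees(math.atan(w_c / w_p))
    return w_c, pm


# Hypothetical values for illustration only
PARAMS = dict(slope=3.0e8, f0=9.4e9, f_ref=100e6, c_s=1e-12, c_p=10e-12, k_vco=50e6)

w_c, pm = crossover_and_margin(**PARAMS)
print(f"unity-gain frequency: {w_c:.3e} rad/s, phase margin: {pm:.1f} deg")
```

Because the only non-integrating pole sits at $f_{ref} C_s / C_p$, the phase margin is set entirely by the ratio of the unity-gain frequency to that pole.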



Fig. 4.11. (a) PLL closed loop transfer function to PLL output for in-band and out-band noise sources. (b) Simulated overall PLL phase noise with in-band and out-band noise contributions.

# 4.4 Conclusions

By using the VCO edge to sample the reference clock, one of the major limitations of the subsampling phase detector, its small detection range, can be greatly relaxed. Even though the linear range of the RSPD is still quite limited, it shifts into bang-bang mode for larger phase errors and still generates an output with the correct polarity. Thus the RSPD constantly pushes the PLL toward lock during the locking phase, whereas the SSPD sometimes fails to lock due to its ambiguous output across multiple VCO cycles. Simulation results also show that the phase noise of the RSPD is comparable to that of the SSPD. The chip is still under fabrication at this time. Measurement results will be compared with the simulation results once available.

### References

- [1] A. Sharkia, S. Mirabbasi and S. Shekhar, "A 0.01mm² 4.6-to-5.6GHz sub-sampling type-I frequency synthesizer with -254dB FOM," *ISSCC Digest*, San Francisco, CA, 2018, pp. 256-258.
- [2] J. Sharma and H. Krishnaswamy, "A dividerless reference-sampling RF PLL with -253.5dB jitter FOM and <-67dBc Reference Spurs," *ISSCC Digest*, San Francisco, CA, 2018, pp. 258-260.
- [3] X. Gao, E. Klumperink, G. Socci, M. Bohsali and B. Nauta, "A 2.2GHz sub-sampling PLL with 0.16psrms jitter and -125dBc/Hz in-band phase noise at 700μW loop-components power," *VLSI*, Honolulu, HI, 2010, pp. 139-140.
- [4] A. Mazzanti and P. Andreani, "Class-C Harmonic CMOS VCOs, With a General Result on Phase Noise," *IEEE JSSC*, vol. 43, no. 12, pp. 2716-2729, Dec. 2008.

# Chapter 5

# A Digital PLL with Automatic TDC Linearity Calibration for Spur Cancellation

As semiconductor technology advances to finer feature sizes, digital circuits are becoming more efficient in both area and power. Integrating the conventional phase-locked loop (PLL), however, becomes more challenging because of the burden of maintaining its analog components. On the other hand, a digital PLL uses the same kind of devices as digital circuits. Fully synthesizable digital PLLs have been proposed to take full advantage of advanced deep submicron technology while providing easy integration with digital systems [1]. Moreover, a digital PLL is highly flexible and programmable, which makes it capable of functions that are very difficult to obtain with an analog PLL. As an example, various digital PLL architectures have been proposed to implement direct modulation for high-speed wireless polar transmitters [2], [3], which is a very challenging task for an analog PLL because its nonlinear analog properties are sensitive to process-voltage-temperature (PVT) variations.

A key aspect of digital PLL operation is how the phase error between the reference clock and the feedback signal is measured. Essentially, two approaches exist. The first type, as shown in Fig. 5.1, moves the feedback pulse very close to the reference clock with a digital-to-time converter (DTC), followed by a narrow-range TDC that provides a finer time measurement. The required DTC can be implemented with a phase interpolator or a delay locked loop (DLL) that provides multi-phase outputs of the feedback signal [4], [5]. The second type of DPLL architecture uses only a TDC to measure the phase error [6], [7], [20]. Since all the work of the DTC is now shifted to the TDC, a wide-range TDC covering at least one oscillator period is required. Fortunately, various TDC architectures are available to meet this requirement. In addition, part of the hardware of the fine TDC can be reused to implement the coarse measurement, which speeds up the loop locking process.



Fig. 5.1. Comparison of two digital PLL architectures: (a) DPLL with DTC and TDC (b) DPLL using TDC only.

Whether or not a DTC is used, various digital calibration techniques have to be applied to suppress fractional spurs in a digital PLL. The conventional fractional spur cancellation using a sigma-delta modulator (SDM) [8] requires a narrow loop bandwidth in order to suppress the noise-shaped component at high frequencies. In addition, using a high order SDM to toggle the loop division ratio varies the feedback edge after the divider over multiple digital-controlled oscillator (DCO) cycles, which requires a TDC or DTC with a wider detectable range and thus leads to higher power consumption and more complicated hardware. These drawbacks motivate us to explore other spur cancellation methods, including digi-phase. Regardless of the technique employed, the spurious level in a DPLL is highly dependent on the TDC linearity, necessitating accurate calibration. In this chapter, we present a wideband fractional-N DPLL with digital calibration for fractional spur suppression, designed for a low power Wi-Fi transceiver in the 802.11 a/b/g/n bands using a 55 nm CMOS technology. The nonlinearity of the two-dimensional (2D) Vernier TDC is automatically calibrated through the fractional frequency synthesis [9]. The implemented RFIC also includes an improved MMD that overcomes the division ratio skipping problem associated with prior art designs.

This chapter is organized as follows: Section 5.1 discusses the design principle to achieve low fractional spurs for digital PLLs. Section 5.2 describes the system architecture and the proposed TDC linearity calibration. Measurement results are presented in Section 5.3, followed by conclusions.

# 5.1 Design of a Digital PLL with Low Fractional Spur

When the PLL generates a carrier frequency $f_o$ that is an integer multiple of the reference frequency $f_{ref}$, i.e., $f_o=Nf_{ref}$, the frequency divider generates one pulse every N DCO cycles. The divider output pulse is aligned with the reference pulse when the loop is in lock, so the phase error measured by the TDC is zero. On the other hand, when the DPLL generates a fractional frequency, the divider toggles its division ratio between N and N+1 to achieve an equivalent fractional division ratio. Even though the average division ratio over time equals the desired fractional value, an instantaneous phase error exists between the divider output and the reference clock. This periodic error modulates the DCO control word, so fractional spurs arise alongside the desired carrier tone.

Various techniques including digi-phase [10] have been proposed to suppress fractional spurs. However, the cancellation at the TDC output is affected by various non-ideal circuit characteristics, which degrade the spur suppression performance. Due to limited TDC resolution and linearity, a small residue error might still exist after the cancellation. As shown in Fig. 5.2, assuming each TDC bit covers $2^{-tr}$ of one DCO cycle and each digital bit in the digi-phase cancellation signal covers $2^{-fr}$ of one DCO cycle, the TDC resolution induced residue has a period of $2^{fr-tr}T_{ref}$, which corresponds to a fractional spur located at an offset frequency of $2^{-fr+tr}f_{ref}$. On the other hand, the divider output sweeps around the reference edge as the division ratio toggles between N and N+1, so the TDC output waveform repeats with a period of $2^{fr}T_{ref}$, which creates a fractional spur at an offset frequency of $2^{-fr}f_{ref}$. As a result, both TDC resolution and linearity have an impact on fractional spurs. Nevertheless, the fractional spur from TDC nonlinearity is more critical since it is closer to the carrier tone in the spectrum. As an example, assuming a TDC resolution of 5 ps, a reference frequency of 80 MHz, a carrier frequency of 2.4 GHz and a fractionality $2^{-fr}$ of 1/256, the resolution and linearity induced fractional spurs will be located at 26 MHz and 0.31 MHz, respectively. Thus the fractional spur generated by limited TDC resolution will be greatly attenuated by the loop filter, leaving spurs generated by TDC nonlinearity as the dominant source. Therefore, to implement a digital PLL with low fractional spurs, it is critical to have a highly linear TDC.
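The two spur offsets in the example can be reproduced with a few lines. This is a minimal sketch using the numbers quoted in the paragraph; the variable names are illustrative.

```python
F_REF = 80e6               # reference frequency (Hz)
F_DCO = 2.4e9              # carrier frequency (Hz)
T_RES = 5e-12              # TDC resolution (s)
FRACTIONALITY = 1.0 / 256  # 2^-fr

t_dco = 1.0 / F_DCO
tdc_fraction = T_RES / t_dco  # 2^-tr: fraction of a DCO cycle per TDC bit

# Nonlinearity induced spur: 2^-fr * f_ref
f_spur_linearity = FRACTIONALITY * F_REF
# Resolution induced spur: 2^(-fr + tr) * f_ref
f_spur_resolution = FRACTIONALITY / tdc_fraction * F_REF

print(f"linearity induced spur:  {f_spur_linearity / 1e6:.4f} MHz")
print(f"resolution induced spur: {f_spur_resolution / 1e6:.2f} MHz")
```

With a loop bandwidth on the order of 1 MHz, only the first of these two offsets falls inside the loop bandwidth.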



Fig. 5.2. Simulated TDC output and the residue signal after digi-phase canceller, showing TDC resolution and TDC nonlinearity induced residue errors possess different periods.



Fig. 5.3. (a) Fractional spur level due to residue error at the digi-phase canceller output. (b) Measured residue error and its fundamental waveform.

The fractional spur due to TDC nonlinearity can be further analysed as follows. Assume that the residue error at the digi-phase canceller output can be expressed as $\varepsilon = A_1 \sin(2\pi f_m t) + A_2 \sin^2(2\pi f_m t) + A_3 \sin^3(2\pi f_m t) + \dots$, where $A_1$ is the magnitude of the error's fundamental tone and $f_m$ represents the fractional offset frequency. The power level of the closest fractional spur can then be derived as:

$$P_{frac}(dBc) = 20 \cdot \log_{10} \left( \frac{H(f_m)K_{DCO}}{2f_m} \cdot A_1 \right) \tag{5.1}$$

where $K_{DCO}$ denotes the gain of the DCO and $H(f)$ represents the loop filter transfer function. As an example, assuming a fractional frequency of 1.25 MHz, a loop bandwidth of 1 MHz such that the closest fractional spur experiences a slight suppression from the loop filter, a $K_{DCO}$ of 10 kHz/bit and a TDC resolution of 5 ps/bit, the calculated and simulated spur levels are shown in Fig. 5.3(a). The simulation result deviates slightly from the calculated value for small TDC residue, mainly because equation (5.1) does not take into account the quantization effect of a digital PLL. Furthermore, using the measured TDC residue error shown in Fig. 5.3(b), its fundamental waveform can be shown to have a peak-to-peak magnitude of 0.6 LSB, which corresponds to a peak magnitude $A_1$ of 0.3 LSB. Using the above analysis, a spur level of -54 dBc is expected at the closest fractional frequency, which is very close to our measured closest spur level of -56 dBc.
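Equation (5.1) is straightforward to evaluate once the loop filter gain at the spur offset is known. The sketch below is only illustrative: the text does not give $H(f_m)$, so a placeholder gain of 1 is used, and the result is therefore not expected to reproduce the -54 dBc quoted above. The function name is hypothetical.

```python
import math


def fractional_spur_dbc(h_gain, k_dco_hz_per_bit, f_m_hz, a1_lsb):
    """Closest fractional spur level from Eq. (5.1), in dBc."""
    return 20.0 * math.log10(h_gain * k_dco_hz_per_bit * a1_lsb / (2.0 * f_m_hz))


H_GAIN = 1.0   # assumption: placeholder for the unknown loop filter gain H(f_m)
K_DCO = 10e3   # DCO gain, Hz per bit (from the text)
F_M = 1.25e6   # fractional offset frequency, Hz (from the text)
A1 = 0.3       # fundamental residue magnitude, LSB (from the text)

spur_dbc = fractional_spur_dbc(H_GAIN, K_DCO, F_M, A1)
print(f"spur level with H(f_m) = {H_GAIN}: {spur_dbc:.1f} dBc")
```

Since the spur level scales as $20\log_{10}$ of each factor, halving the residue magnitude $A_1$ lowers the spur by about 6 dB.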

Several TDC topologies can be used for the proposed DPLL design. The traditional TDC architecture is a single delay line TDC, which can only achieve a resolution of one gate delay. To achieve finer resolution, the Vernier technique was developed [11]. By using two delay chains with a slight delay difference, this kind of TDC can achieve sub-gate-delay resolution. However, a large number of delay stages is required in order to cover a large detection range. An improved structure configures the Vernier delay chains into a ring [12]. By reusing the delay cells, the Vernier ring TDC can achieve a large detection range and fine resolution simultaneously. Alternatively, the gated ring oscillator (GRO) connects the delay cells together to form a ring oscillator [13]. Using multiple phases in the ring to clock a counter while holding the clock phases between measurement cycles allows accurate time measurement with intrinsic first-order quantization noise cancellation. Moreover, the time amplifier (TA) TDC [14] and the ADC based TDC [15] can both achieve fine resolution. However, they have their own drawbacks: the TA TDC is limited by the linearity of its TA, while the conversion rate of the ADC based TDC is limited. In this work, the 2D Vernier TDC structure was adopted to achieve sub-gate-delay resolution. Moreover, the 2D structure provides sufficient detectable range with a high conversion rate while consuming reasonable power and minimal hardware.

# 5.2 System and Building Blocks



Fig. 5.4. Proposed DPLL block diagram with automatic TDC linearity calibrations for fractional spur cancellation.

# 5.2.1 System Architecture

The complete digital PLL architecture with digital calibration is shown in Fig. 5.4. The TDC adopts a 3-step architecture to provide both fine and coarse measurements. The digi-phase cancellation signal is injected at the TDC output to cancel the instantaneous divider quantization errors. Ideally, the waveform after the cancellation block remains constant with only a DC component. However, various non-ideal characteristics in the loop still cause a small residue phase error. In other words, the residue error after digi-phase subtraction is directly related to system imperfections including non-linearity, mismatch and variation. Thus, this residue can be used as the error signal for the digital calibrations adopted in this design. The gain applied on the digi-phase path is automatically adjusted by a TDC gain tracking module that correlates the error signal with the digi-phase gain. The optimal gain is reached when the error is minimized. Likewise, the TDC calibration uses the same error signal to adjust the delay cells for optimal TDC linearity. In summary, the proposed digital calibration scheme can be described as follows. Step 1: initially, the PLL is locked to a known fractional frequency with the digi-phase block enabled and the TDC gain tracking fixed at a pre-set value. Step 2: after lock-in, the TDC calibration block utilizes the ramp signal at the TDC output for TDC linearity calibration. Step 3: the linearity calibration is disabled and the TDC gain tracking is enabled. Next, the loop is relocked to the desired frequency.

# 5.2.2 A 3-step TDC

Similar to the PFD in an analog PLL, TDC measures the phase difference between the divided feedback signal and the reference clock. The measured result will be further quantized into digital bits and processed by the digital loop filter. The quantization step, or TDC resolution, directly determines the in-band phase noise at DPLL output and can be shown as [16]:

$$\mathcal{L} = \frac{(2\pi)^2}{12} \left(\frac{\Delta t_{res}}{T_{DCO}}\right)^2 \frac{1}{f_{ref}} \tag{5.2}$$

where $T_{DCO}$ is the period of the DCO output and $\Delta t_{res}$ is the TDC resolution. Assuming a $T_{DCO}$ of 416 ps and an $f_{ref}$ of 80 MHz, (5.2) shows that a TDC resolution of 5 ps is sufficient to achieve an in-band noise floor of -110 dBc/Hz. Since the phase error ranges across $[-T_{ref}/2, T_{ref}/2]$, our proposed TDC is segmented into 3 steps to cover all possible phases during the phase locking process, as shown in Fig. 5.5. The three-step structure includes a bang-bang TDC as the first stage, a single delay chain as the second stage and a 2D Vernier delay array as the third stage. The single delay chain is constructed as part of the Vernier delay chains in order to save area and power.
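The numbers in this paragraph can be checked by evaluating (5.2) in both directions: the noise floor for a given resolution, and the coarsest resolution that still meets a noise target. This is a minimal sketch with illustrative function names.

```python
import math

T_DCO = 416e-12  # DCO period (s), from the text
F_REF = 80e6     # reference frequency (Hz), from the text


def inband_noise_dbc_hz(t_res, t_dco=T_DCO, f_ref=F_REF):
    """In-band phase noise floor from Eq. (5.2), in dBc/Hz."""
    lin = (2.0 * math.pi) ** 2 / 12.0 * (t_res / t_dco) ** 2 / f_ref
    return 10.0 * math.log10(lin)


def required_resolution(noise_dbc_hz, t_dco=T_DCO, f_ref=F_REF):
    """Coarsest TDC resolution (s) that meets a given noise floor."""
    lin = 10.0 ** (noise_dbc_hz / 10.0)
    return t_dco * math.sqrt(12.0 * f_ref * lin) / (2.0 * math.pi)


noise_5ps = inband_noise_dbc_hz(5e-12)
t_res_max = required_resolution(-110.0)
print(f"noise floor at 5 ps resolution: {noise_5ps:.1f} dBc/Hz")
print(f"resolution needed for -110 dBc/Hz: {t_res_max * 1e12:.2f} ps")
```

Under these assumptions a 5 ps resolution leaves roughly 2 dB of margin against the -110 dBc/Hz target.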



Fig. 5.5. Proposed 3-step TDC block diagram.

More specifically, a bang-bang TDC acting as a signal steering gear is employed as the first stage of the proposed 3-step TDC. The bang-bang TDC is able to detect across an entire reference cycle (12.5 ns in an 80 MHz system). In the second stage, a delay chain with 16 delay stages is adopted to provide a coarse measurement with a 4-bit binary output. In conjunction with the polarity detection provided by the bang-bang TDC, the coarse TDC provides a 5-bit output with a resolution of 65 ps. By reusing the delay stages of the 2D Vernier TDC, this coarse TDC requires no extra hardware or power consumption, while extending the TDC detectable range to 2.08 ns. With this coarse TDC, the proposed digital PLL can achieve faster locking owing to the enlarged detectable range. Finally, the fine TDC is constructed using a Vernier structure with a 2D arbiter array [17]. The delays of one stage in the fast delay chain and the slow delay chain are set to 60 ps and 65 ps, respectively. This slight difference provides a sub-gate-delay time resolution as fine as 5 ps. The fine 2D TDC has a detectable range of 520 ps (7 bits), which is sufficiently large to cover an entire 2.4 GHz DCO cycle (420 ps).
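The range and resolution figures of the coarse and fine stages follow from simple arithmetic on the stage delays. The sketch below recomputes them from the values quoted in the text; the constant names are illustrative.

```python
import math

COARSE_BITS = 5      # 4-bit delay chain output plus bang-bang polarity bit
D_SLOW_PS = 65       # slow chain stage delay (ps), also the coarse resolution
D_FAST_PS = 60       # fast chain stage delay (ps)
FINE_RANGE_PS = 520  # detectable range of the fine 2D Vernier TDC (ps)
F_DCO = 2.4e9        # DCO frequency (Hz)

coarse_range_ps = (2 ** COARSE_BITS) * D_SLOW_PS  # total coarse range
fine_res_ps = D_SLOW_PS - D_FAST_PS               # Vernier resolution
fine_codes = FINE_RANGE_PS // fine_res_ps         # number of fine output codes
fine_bits = math.ceil(math.log2(fine_codes))      # bits needed for those codes
t_dco_ps = 1e12 / F_DCO                           # one DCO period in ps

print(f"coarse range: {coarse_range_ps / 1000:.2f} ns")
print(f"fine resolution: {fine_res_ps} ps, {fine_codes} codes, {fine_bits} bits")
print(f"fine range covers one DCO period: {FINE_RANGE_PS > t_dco_ps}")
```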

# 5.2.3 Automatic TDC Linearity Calibration

Similar to the basic Vernier TDC, a 2D Vernier structure employs a fast and a slow delay chain. However, rather than using a single arbiter line, multiple arbiter lines are implemented in a 2D Vernier structure to compare each fast delay stage with multiple slow delay stages. By reusing part of the delay stages, a larger detectable range can be achieved. However, a highly linear 2D Vernier TDC requires the delays of the fast and slow chains to satisfy the following conditions:

$$\begin{cases} n(d_s - d_f) = d_s \\ d_s - d_f = t_{res} \end{cases} \tag{5.3}$$

where $d_s$ and $d_f$ denote the delays of a single stage in the slow chain and the fast chain, respectively, and n is the number of stages in one arbiter line. The first equation is the condition for a continuous measurement with the 2D Vernier TDC and the second equation sets the measurement resolution. Given these two equations, only one pair of $d_s$ and $d_f$ is a viable solution. Any deviation of the two delays causes an error compared with the ideal case. We define the common mode delay error as the deviation of the average of the two delays from its ideal value, and the differential mode delay error as the deviation of the difference of the two delays from its ideal value. As shown in Fig. 5.6, a common mode delay error introduces gaps at the turning points of each arbiter line, and a differential mode delay error leads to an incorrect slope for each line. Moreover, the TDC nonlinearity induced by the common mode delay error is zero for small TDC inputs located within the first arbiter line, and the deviation from the ideal transfer curve accumulates as the TDC input gets larger. On the other hand, the nonlinearity from the differential mode delay error shows up even within the first arbiter line but only repeats itself periodically for large TDC inputs. From these observations we conclude that the dominant source of TDC nonlinearity is the differential mode delay error for small TDC inputs and the common mode delay error for large TDC inputs.
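The two conditions in (5.3) determine both delays once n and the resolution are chosen. The sketch below solves them in closed form; the value n = 13 is an inference from the 65 ps and 60 ps stage delays quoted in the previous section, not a number stated in the text.

```python
def vernier_delays(n, t_res):
    """Solve Eq. (5.3): n * (d_s - d_f) = d_s and d_s - d_f = t_res."""
    d_s = n * t_res        # substitute the second condition into the first
    d_f = d_s - t_res      # equivalently (n - 1) * t_res
    return d_s, d_f


N_STAGES = 13  # assumption: inferred from 65 ps / 5 ps
T_RES_PS = 5   # target resolution in ps (from the text)

d_s_ps, d_f_ps = vernier_delays(N_STAGES, T_RES_PS)
print(f"d_s = {d_s_ps} ps, d_f = {d_f_ps} ps")
```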



Fig. 5.6. Simulated TDC non-linearity considering common mode error and differential mode error.



Fig. 5.7. Proposed TDC automatic linearity calibration loops.



Fig. 5.8. Measured convergence of TDC common mode and differential mode delays.

On closer inspection, the quantization error generated by the fractional-N accumulator presents a staircase ramp waveform that can be used to sweep the TDC input from $-T_{DCO}/2$ to $T_{DCO}/2$. As illustrated in Fig. 5.7, the corresponding TDC output can be subtracted from an ideal ramp signal, creating an error signal that can be used to automatically adjust the TDC delays. As mentioned above, when the TDC input is within the range of the first arbiter line, only the difference between the fast and slow delays causes TDC measurement error. On the other hand, the average of the fast and slow delays dominates the TDC error when the TDC input is sufficiently large that multiple arbiter lines are used. As a result, the common and differential parts of the fast and slow delays can be calibrated separately according to the TDC input range. Two least-mean-square (LMS) loops are designed to collect the differential and common mode error signals used for the fast and slow delay chain calibrations. More specifically, the TDC generates a flag signal to indicate whether one or multiple arbiter lines are used in a measurement. This flag signal is used to activate either the common or the differential LMS loop. In this way, the two types of errors are calibrated orthogonally without interfering with each other. As shown in Fig. 5.8, measured results show that the LMS loops for the common and differential delays converge after about 150 μs. The convergence speed depends on the step size of the LMS loop. Faster convergence can be achieved with a larger step size. However, an excessively large step size might jeopardize convergence stability.
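The idea of two flag-gated LMS loops can be illustrated with a toy behavioral model. The sketch below is not the implemented hardware: it assumes an unquantized, noise-free TDC transfer curve, sweeps the input from 0 to one DCO period instead of a symmetric range, uses n = 13 stages per arbiter line (inferred from 65 ps / 5 ps), and uses hypothetical step sizes and initial delays.

```python
T_RES = 5.0      # target TDC resolution in ps (from the text)
N_LINE = 13      # assumption: stages per arbiter line
T_DCO = 416.0    # DCO period in ps (from the text)
FRAC = 1.0 / 64  # fractionality that generates the staircase ramp
MU_D = 1e-4      # hypothetical step size of the differential loop
MU_C = 5e-3      # hypothetical step size of the common loop


def tdc_code(t, d_s, d_f):
    """Toy transfer curve: arbiter line index k, then fine Vernier steps."""
    k = int(t // d_s)  # index of the arbiter line in use (the flag is k > 0)
    return k * N_LINE + (t - k * d_s) / (d_s - d_f), k


def calibrate(d_s, d_f, steps=20000):
    """Run both LMS-style loops and return the calibrated (d_s, d_f)."""
    avg, diff = (d_s + d_f) / 2.0, d_s - d_f  # common and differential parts
    for i in range(steps):
        t = ((i * FRAC) % 1.0) * T_DCO        # staircase ramp at the TDC input
        code, k = tdc_code(t, avg + diff / 2.0, avg - diff / 2.0)
        err = t / T_RES - code                # ideal ramp minus TDC output
        if k == 0:
            diff -= MU_D * err * t            # first line only: trim the difference
        else:
            avg -= MU_C * err * k             # multiple lines: trim the average
    return avg + diff / 2.0, avg - diff / 2.0


d_s_cal, d_f_cal = calibrate(68.0, 62.0)      # hypothetical initial delays in ps
print(f"calibrated d_s = {d_s_cal:.3f} ps, d_f = {d_f_cal:.3f} ps")
```

In this model the first-line error depends only on the delay difference, so the differential loop settles independently, and the common loop then removes the remaining gap at the arbiter line turning points.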

# 5.2.4 A Second Order Digital Loop Filter

As shown in Fig. 5.4, the digital loop filter consists of proportional and integral paths to achieve a programmable bandwidth from 200 kHz to 2 MHz. In addition, two IIR filters are added on the proportional path to create a second order filter. Parameters including the gains of the proportional and integral paths can be programmed to achieve different natural frequencies $\omega_n$ and damping factors $\xi$, similar to the analog PLL:

$$\omega_n = \sqrt{\frac{K\beta}{T_{ref}}} \qquad \xi = \frac{\alpha}{2} \sqrt{K \frac{T_{ref}}{\beta}} \tag{5.4}$$

where K represents total loop gain except loop filter,  $T_{ref}$  is the period of reference clock,  $\alpha$  and  $\beta$  represent gains in the proportional and integral paths, respectively. The loop can be programmed to a wider loop bandwidth initially for faster frequency lock and reconfigured to an optimal bandwidth that corresponds to the best phase noise performance afterwards.
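Equation (5.4) maps the programmable gains directly to the loop dynamics. The sketch below evaluates it for one hypothetical setting; the values of K, $\alpha$ and $\beta$ are placeholders chosen only to give plausible numbers, and only the 80 MHz reference comes from the text.

```python
import math


def loop_parameters(k_loop, alpha, beta, t_ref):
    """Natural frequency (rad/s) and damping factor from Eq. (5.4)."""
    omega_n = math.sqrt(k_loop * beta / t_ref)
    xi = 0.5 * alpha * math.sqrt(k_loop * t_ref / beta)
    return omega_n, xi


T_REF = 1.0 / 80e6     # reference period (s)
K_LOOP = float(2**23)  # hypothetical total loop gain excluding the filter
ALPHA = 0.5            # hypothetical proportional path gain
BETA = 2.0**-6         # hypothetical integral path gain

omega_n, xi = loop_parameters(K_LOOP, ALPHA, BETA, T_REF)
print(f"natural frequency: {omega_n / (2 * math.pi) / 1e3:.0f} kHz, damping: {xi:.2f}")
```

Note that the product $2\xi\omega_n$ equals $\alpha K$, so the proportional gain sets the damping term independently of $\beta$, while $\beta$ sets the natural frequency.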

# 5.2.5 Error-Free Multi-Modulus Divider

A conventional MMD uses a chain of 2/3 cells connected in series [18]. In this type of divider, the frequency of the DCO waveform is scaled down by 2 or 3 in one cell and propagated to the next cell to be further divided. By controlling the division ratio of each 2/3 cell, a chain of n cells can achieve a continuous division ratio range from $2^n$ to $2^{n+1}-1$. An extended division ratio range can be achieved with extra extension cells. Such an extended divider chain can achieve a range from $2^m$ to $2^{n+1}-1$, where m is the chain length when all extension cells are turned off and n is the chain length when all extension cells are turned on. However, this architecture might generate a glitch in the divider output during the first cycle after the chain length is modified.

Consider the case shown in Fig. 5.9. Assume that the division ratio P is initially set to 01...1, so that all the stages are configured as divide-by-3. Since $P_N$ is 0, the OR gate blocks the feedback signal from the last stage and generates a constant high. Equivalently, the second-to-last stage cannot see the last stage since its $mod_{in}$ signal remains constantly high. In this case, the last 2/3 cell is still running as a divide-by-3 counter, but its feedback signal $mod_{out}$ is blocked by the OR gate. Now if P is switched to 10...0, all the stages are configured as divide-by-2. Since the OR gate no longer blocks the feedback signal from the last stage, the equivalent length of the entire chain is increased by one. Depending on the time at which P switches, the last stage might still need to finish its current divide-by-3 cycle in the first period before successfully switching to divide-by-2 mode. This might cause the feedback signal $mod_{out}$ to be delayed or advanced by one reference period, thus generating an incorrect edge at the divider output. Such a glitch can cause failure of locking at a fractional frequency for which the division ratio toggles between $2^n-1$ and $2^n$.

To resolve this issue, some division ratio dependent solutions have been proposed for a limited number of extension bits [19], but extending them to more bits remains non-trivial. In this design, we propose to use a single synthesizable state machine to replace all the stages with ratio extension logic, as shown in Fig. 5.10. This new MMD uses an asynchronous counter to count the divided edges from the previous stages. As shown in the flowchart, the asynchronous counter is set to zero when the divider extension bits are disabled. Thus it always counts from zero when enabled, which avoids generating a glitch in the output. When activated, it functions as a counter triggered by the divided clock from the last 2/3 cell stage. The upper limit of the counter is set by the assigned higher bits of the division ratio word *P*. The remapped division ratio with control word P is also shown in Fig. 5.10. The 3 LSBs are used to control the high speed 2/3 cells while the upper 4 MSBs set the upper limit of the asynchronous counter. A division range programmable from 8 to 127 is achieved with no division ratio switching error.
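To illustrate how a 3-bit prescaler control and a 4-bit counter limit can cover every ratio from 8 to 127, the sketch below uses one hypothetical mapping: the 2/3-cell chain divides by 8 on every counter step and by 8 plus the LSB value on the last step. This pulse-swallow-style mapping is an assumption for illustration and is not necessarily the remapping shown in Fig. 5.10.

```python
def split_control_word(p):
    """Split a 7-bit ratio word into (counter limit, prescaler offset)."""
    if not 8 <= p <= 127:
        raise ValueError("ratio outside the 8..127 range quoted in the text")
    return p >> 3, p & 0b111  # upper 4 MSBs, lower 3 LSBs


def count_dco_cycles(p):
    """DCO cycles per divider output period for the hypothetical mapping."""
    msb, lsb = split_control_word(p)
    total = 0
    counter = 0  # the counter always restarts from zero
    while counter < msb:
        # Three 2/3 cells divide by 8; the last step adds the LSB offset
        total += 8 + (lsb if counter == msb - 1 else 0)
        counter += 1
    return total


ratios = [count_dco_cycles(p) for p in range(8, 128)]
print(f"covered ratios: {ratios[0]} to {ratios[-1]}, {len(ratios)} values")
```

Because the counter restarts from zero on every output period, a change of the control word only takes effect on a full count, which is the property the text relies on to avoid glitches.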



Fig. 5.9. Incorrect divider state in the first reference period after ratio switching associated with conventional MMD using extension cells.



Fig. 5.10. Flowchart of the proposed divider with asynchronous counter and the remapped control words.



Fig. 5.11. Die photo of the DPLL in a low power multi-standard wireless transceiver RFIC.



Fig. 5.12. Measured phase noise at 2.08 GHz output with loop bandwidth of 1 MHz.

# 5.3 Measurement Results

A prototype of the proposed digital PLL is implemented in a standard 55 nm CMOS technology, as shown in Fig. 5.11. The RFIC is separated into a digital part and an analog part on the layout to minimize cross-talk. The entire digital PLL occupies 0.56 mm², of which the two major components, the TDC and the DCO, take most of the area. When the loop is locked to an integer frequency at 2.08 GHz, the measured in-band phase noise is -107 dBc/Hz and the integrated rms jitter (from 10 kHz to 10 MHz) is 0.55 ps, as shown in Fig. 5.12. The in-band spur around 250 kHz is due to the power regulator used on the board. The loop bandwidth is set to 1 MHz in order to clearly show the in-band noise floor achieved.



Fig. 5.13. Measured spectrum before and after digital calibration with fractionality of (a) 1/64 (b) 3/64, respectively.

To demonstrate the effectiveness of the proposed TDC calibration, the digital PLL is configured to lock at various fractional frequencies. Two cases with fractionalities of 1/64 and 3/64 are shown in Fig. 5.13 (a) and (b), respectively. The largest fractional spurs measured in the two cases, at 1.25 MHz and 3.75 MHz, were -45 dBc and -36 dBc before calibration. In both measurements, the digi-phase spur canceller was enabled. However, a small residual error still exists after the canceller due to TDC non-linearity. When the proposed TDC calibration is completed, the fractional spur levels drop to below -55 dBc and -60 dBc, respectively, indicating spur reductions of 10 dB and 25 dB owing to the proposed TDC calibration scheme. Since the TDC gain is proportional to the delay difference between the fast and slow chains, the loop bandwidth varies slightly after delay calibration, which could affect the final spur reduction as well. Furthermore, because the spur level before calibration depends on the initial TDC delay, which is PVT sensitive, different spur levels are observed for various frequency settings. However, with the proposed calibration turned on, the largest fractional spur level is always below -55 dBc. Additional measurements of the largest fractional spurs at different fractional frequencies before and after TDC calibration are shown in Fig. 5.14.



Fig. 5.14. Measured fractional spur near 2.4 GHz with a loop bandwidth of 1 MHz for different fractional frequencies with and without TDC calibrations.



Fig. 5.15. Measured TDC transfer curve, INL and DNL before and after digital calibration.

The measured TDC transfer curve is shown in Fig. 5.15. Before TDC calibration, gaps between different arbiter lines can be clearly observed; they result from inaccurate delays in the two delay chains and cause high spurious tones in the DPLL output. After TDC calibration, the measured TDC transfer curve is very close to the ideal transfer curve. With auto-calibration, this 2D Vernier TDC achieves an average differential nonlinearity (DNL) of 1.13 LSB and an integral nonlinearity (INL) of 0.81 LSB, whereas the DNL and INL are 1.32 LSB and 3.49 LSB without calibration, respectively. The DNL is mainly caused by the 2D arbiter topology, where the turning points of the arbiter chains correspond to the worst DNL. The proposed TDC gain and linearity calibration only needs to be carried out once initially and involves negligible extra power consumption.

TABLE 5.1. MEASURED DPLL PERFORMANCES AND COMPARISONS

|                         | Hsu [24]<br>ISSCC08 | Tasca [8]<br>JSSC11 | Elkholy [25]<br>JSSC15 | Narayanan [9]<br>JSSC16 | Gao [26]<br>ISSCC15 | This work              |
|-------------------------|---------------------|---------------------|------------------------|-------------------------|---------------------|------------------------|
| Architecture            | Frac. DPLL          | Frac. DPLL          | Frac. DPLL             | Frac. SSPLL             | Integer SSDPLL      | Frac. DPLL             |
| Technology (nm)         | 130                 | 65                  | 65                     | 65                      | 28                  | 55                     |
| $f_{ref}$ (MHz)         | 50                  | 40                  | 50                     | 40                      | 80                  | 80                     |
| $f_o$ (GHz)             | 3.2-4.2             | 2.9-4.0             | 4.5                    | 4.34-4.94               | 5.8                 | 1.9-2.8 /<br>3.8-5.6   |
| DCO Tuning Range (%)    | 27.2                | 31.9                | 26.8                   | 12.9                    | /                   | 38.3                   |
| In-band PN (dBc/Hz)     | -108                | -104                | -106                   | -120                    | -105                | -107                   |
| Fractional Spur (dBc)   | -53                 | -53                 | -51                    | -59                     | /                   | -55                    |
| Closest Spur (MHz)      | 1                   | 1                   | 0.392                  | 0.03                    | /                   | 1.25                   |
| Loop Bandwidth (MHz)    | 1.1                 | 0.312               | 2.5                    | 1                       | /                   | 1                      |
| RMS Jitter (fs)         | 204                 | 400                 | 490                    | 133                     | 173                 | 549                    |
| Power (mW)              | 46.7                | 4.5                 | 3.7                    | 6.2                     | 9.5                 | 9.9                    |
| Area (mm<sup>2</sup>)   | 0.95                | 0.22                | 0.22                   | 0.2                     | 0.3                 | 0.56                   |

As part of a low-power 802.11 a/b/g/n wireless transceiver RFIC, the proposed digital PLL consumes 9.9 mW in total, of which the TDC, the DCO and the digital circuits (including the MMD) consume 4.7 mW, 4.2 mW and 1 mW, respectively. The reference signal is generated by an 80 MHz crystal oscillator. Performance comparisons are summarized in Table 5.1, demonstrating a DPLL design that is competitive with the state of the art.
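The power breakdown can be checked against the total, and the jitter and power entries of Table 5.1 can be folded into a single number for comparison. The sketch below does both. The jitter-power figure of merit used here, FoM = 10·log10[(σ/1 s)² · (P/1 mW)], is a widely used metric that this chapter does not itself report, so the resulting value is a derived illustration rather than a measured result.

```python
import math


def jitter_power_fom_db(rms_jitter_s, power_w):
    """Jitter-power figure of merit: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 20.0 * math.log10(rms_jitter_s) + 10.0 * math.log10(power_w / 1e-3)


# Block-level power of this work, in mW, as stated in the text.
power_mw = {"TDC": 4.7, "DCO": 4.2, "digital (incl. MMD)": 1.0}
total_mw = sum(power_mw.values())

for block, p in power_mw.items():
    print(f"{block:>20s}: {p:4.1f} mW ({100.0 * p / total_mw:4.1f} % of total)")
print(f"{'total':>20s}: {total_mw:4.1f} mW")

# RMS jitter and total power of this work from Table 5.1.
fom_db = jitter_power_fom_db(549e-15, 9.9e-3)
print(f"jitter-power FoM: {fom_db:.1f} dB")  # roughly -235.3 dB
```

The breakdown shows that the TDC and the DCO together account for about 90 % of the power, so they are the natural targets for any further power reduction.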

### 5.4 Conclusions

A fractional-N digital PLL using a 2D Vernier TDC with automatic linearity calibration has been presented. By using a ramp signal generated from the existing fractional frequency synthesis blocks, the loop automatically adjusts the TDC's fast and slow delays to achieve the best linearity for fractional spur reduction. A digi-phase canceller with an automatic TDC gain tracking loop is implemented to further suppress the fractional spurs. A largest fractional spur of -55 dBc was measured over various fractional frequencies without using a traditional SDM for noise shaping. The proposed 3-step TDC provides fine resolution and a wide detectable range with minimal hardware. This chapter also presents an improved divider structure that resolves the glitch issues that conventional MMDs exhibit during division ratio switching. This novel divider structure provides a wide division range from 8 to 127, free of transient switching glitches, to support the wide DCO tuning range of 38%.
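As a quick consistency check on the numbers above, the sketch below computes the fractional tuning range from the band edges in Table 5.1 and the division ratios needed with the 80 MHz reference. It assumes the usual definition of tuning range, 2(f_max − f_min)/(f_max + f_min), and, as a simplification, that the divider operates directly on the listed output frequency. The actual divider input frequency is not stated in this section.

```python
def tuning_range_percent(f_min, f_max):
    """Fractional tuning range relative to the band centre, in percent."""
    return 200.0 * (f_max - f_min) / (f_max + f_min)


def division_ratio(f_out_hz, f_ref_hz):
    """Average feedback division ratio N = f_out / f_ref."""
    return f_out_hz / f_ref_hz


F_REF_HZ = 80e6             # crystal reference of this work
DIV_MIN, DIV_MAX = 8, 127   # division range of the proposed divider

# Output bands of this work, in GHz, from Table 5.1.
for f_lo, f_hi in [(1.9, 2.8), (3.8, 5.6)]:
    tr = tuning_range_percent(f_lo, f_hi)
    n_lo = division_ratio(f_lo * 1e9, F_REF_HZ)
    n_hi = division_ratio(f_hi * 1e9, F_REF_HZ)
    covered = DIV_MIN <= n_lo and n_hi <= DIV_MAX
    print(f"{f_lo}-{f_hi} GHz: tuning range {tr:.1f} %, "
          f"N from {n_lo:.2f} to {n_hi:.2f}, within divider range: {covered}")
```

Both bands give the same tuning range of about 38.3 %, as expected when one band is exactly twice the other, and the required ratios fall well inside the 8 to 127 range under the stated assumption.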

### References

- [5] W. Deng et al., "A Fully Synthesizable All-Digital PLL With Interpolative Phase Coupled Oscillator, Current-Output DAC, and Fine-Resolution Digital Varactor Using Gated Edge Injection Technique," *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 68-80, Jan. 2015.
- [6] G. Marzin, S. Levantino, C. Samori and A. L. Lacaita, "A 20 Mb/s Phase Modulator Based on a 3.6 GHz Digital PLL With –36 dB EVM at 5 mW Power," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 2974-2988, Dec. 2012.
- [7] S. Zheng and H. C. Luong, "A CMOS WCDMA/WLAN Digital Polar Transmitter with AM Replica Feedback Linearization," *IEEE J. Solid-State Circuits*, vol. 48, no. 7, pp. 1701-1709, July 2013.
- [8] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori and A. L. Lacaita, "A 2.9–4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560-fs<sub>rms</sub> Integrated Jitter at 4.5-mW Power," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2745-2758, Dec. 2011.
- [9] A. Tharayil Narayanan et al., "A Fractional-N Sub-Sampling PLL using a Pipelined Phase-Interpolator With an FoM of -250 dB," *IEEE J. Solid-State Circuits*, vol. 51, no. 7, pp. 1630-1640, July 2016.
- [10] Z. Ru, P. Geraedts, E. Klumperink, X. He and B. Nauta, "A 12GHz 210fs 6mW digital PLL with sub-sampling binary phase detector and voltage-time modulated DCO," *VLSI Symp. Dig.*, 2013, pp. 194-195.
- [11] J. Borremans, K. Vengattaramane, V. Giannini, B. Debaillie, W. Van Thillo and J. Craninckx, "A 86 MHz–12 GHz Digital-Intensive PLL for Software-Defined Radios, Using a 6 fJ/Step TDC in 40 nm Digital CMOS," *IEEE J. Solid-State Circuits*, vol. 45, no. 10, pp. 2116-2129, Oct. 2010.
- [12] John W. M. Rogers, Foster F. Dai, Mark S. Cavin, and Dave G. Rahn, "A Multi-Band Delta-sigma Fractional-N Frequency Synthesizer for a MIMO WLAN Transceiver RFIC," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 678-689, March, 2005.
- [13] D. Liao, H. Wang, F. F. Dai, Y. Xu and R. Berenguer, "An 802.11 a/b/g/n digital fractional-N PLL with automatic TDC linearity calibration for spur cancellation," *IEEE Radio Frequency Integrated Circuits Symp.*, May, 2016, pp. 134-137.
- [14] R. Dana and M. A. Wheatly et al., "Frequency-modulated PLL with fractional-N frequency divider and jitter compensation," U.S. patent 5038120.
- [15] P. Dudek, S. Szczepanski, and J. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 240–247, Feb. 2000.
- [16] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in 0.13 μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.
- [17] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009.
- [18] M. Lee, M. E. Heidari, and A. A. Abidi, "A low noise, wideband digital phase-locked loop based on a new time-to-digital converter with subpicosecond resolution," *VLSI Symp. Dig.*, 2008, pp. 112–113.
- [19] Z. Xu, S. Lee, M. Miyahara and A. Matsuzawa, "A 0.84ps-LSB 2.47mW time-to-digital converter using charge pump and SAR-ADC," *Proc. IEEE Custom Integrated Circuits Conf.*, May, 2013, pp. 1-4.

- [20] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda and M. Fukaishi, "A 2.1-to-2.8-GHz Low-Phase-Noise All-Digital Frequency Synthesizer With a Time-Windowed Time-to-Digital Converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582-2590, Dec. 2010.
- [21] A. Liscidini, L. Vercesi and R. Castello, "Time to digital converter based on a 2-dimensions Vernier architecture," *Proc. IEEE Custom Integrated Circuits Conf.*, May 2009, pp. 45-48.
- [22] C. S. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli and Z. Wang, "A family of low-power truly modular programmable dividers in standard 0.35-μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 35, no. 7, pp. 1039-1045, July 2000.
- [23] P. Nuzzo, K. Vengattaramane, M. Ingels, V. Giannini, M. Steyaert and J. Craninckx, "A 0.1–5GHz Dual-VCO software-defined ∑∆ frequency synthesizer in 45nm digital CMOS," *Proc. IEEE Radio Frequency Integrated Circuits Symp.*, May 2009, pp. 321-324.
- [24] C. M. Hsu, M. Z. Straayer and M. H. Perrott, "A Low-Noise Wide-BW 3.6-GHz Digital ∑Δ Fractional-N Frequency Synthesizer With a Noise-Shaping Time-to-Digital Converter and Quantization Noise Cancellation," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2776-2786, Dec. 2008.
- [25] A. Elkholy, T. Anand, W. S. Choi, A. Elshazly and P. K. Hanumolu, "A 3.7 mW Low-Noise Wide-Bandwidth 4.5 GHz Digital Fractional-N PLL Using Time Amplifier-Based TDC," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 867-881, April 2015.
- [26] X. Gao et al., "A 28nm CMOS digital fractional-N PLL with -245.5dB FOM and a frequency tripler for 802.11abgn/ac radio," *IEEE ISSCC Dig. Tech. Papers*, Feb. 2015, pp. 1-3.

### Chapter 6

### Conclusions

In this work, three high-performance PLL designs have been presented. A low-noise analog PLL based on a sub-sampling phase detector was presented in Chapter 3. The proposed soft loop switching controller improves the locking robustness of the SSPD while avoiding the injection of additional noise. Compared to prior art, the proposed architecture achieves minimal loop gain variation during loop switching, thus enabling controllable and precise loop dynamics during locking. Furthermore, the proposed multi-phase VCO based on a capacitive interpolation network achieves fractional-N operation in an SSPLL without extra noise or power penalty compared to integer mode.

The second design builds on the previous SSPLL design. Not only has the technology moved from 130 nm bulk CMOS to 45 nm SOI, which inherently improves power efficiency, but the sub-sampling architecture has also been modified for improved stability. Compared with the original SSPD, the reference-sampling phase detector (RSPD) achieves a much larger capture range, on the scale of a fraction of the reference cycle. Consequently, the locking stability issue of the original SSPLL is largely resolved. By adaptively programming the signal slope in the RSPD, a tradeoff between capture range and gain, and hence between stability and phase noise, can be made under different external interference environments. Furthermore, migrating from type-II to type-I leads to less hardware, an easier design, fewer noise sources and, most important of all, much smaller filter sizes, which make it much more economical to integrate the entire PLL.

In contrast to these two analog PLLs, the third design presented is a digital PLL. Even though the general architecture remains the same, some of the modules are entirely replaced in a DPLL. The TDC, for example, which substitutes for the phase detector of the analog PLL, proves to be a critical component that determines several aspects of PLL performance, including in-band phase noise and spur level. Exploiting the flexibility of the digital PLL, which turns out to be one of its biggest advantages, we have implemented an automatic calibration loop that dynamically adjusts circuit parameters in the TDC to improve its linearity, which in turn reduces the fractional spur level at the PLL output. Such digitally assisted analog circuits can potentially achieve much improved performance.

Again, these three designs focus on different aspects of the PLL, depending on the specific application. Exploring these different areas of PLL design can lead to deeper insight into this simple but useful circuit. PLLs are used everywhere in wireless networks, and the field has fortunately seen some exciting new inventions in recent years, keeping it one of the most active areas of RFIC research. Plenty of work is still needed to make a PLL with lower phase noise, lower spur level, smaller frequency step and lower power consumption that remains stable under external disturbances. This work represents my humble attempt to push state-of-the-art PLL design one step closer to this goal.