Ankit Seedher<sup>\*</sup> and Gerald E. Sobelman Department of Electrical and Computer Engineering University of Minnesota Minneapolis, MN 55455 e-mail: seedher@ti.com, sobelman@ece.umn.edu

Abstract—Some recent PLL designs utilize a half-rate phase detector so that the VCO operates at a frequency that is one-half of the input data rate. In this paper, a technique is proposed to extend the half-rate phase detector structure to a rate of  $1/2^n$ , for integer n > 1. The concept is explained using a rate 1/8 implementation and simulation results are presented to verify the scheme. These rate  $1/2^n$  phase detectors can be used to raise the maximum operating frequency of clock and data recovery circuits in a given CMOS process technology.

## I. INTRODUCTION

The use of optical fiber as a transmission medium in communications networks allows very high bandwidth to be achieved. However, dispersion effects within the fiber require that the data be regenerated periodically. Optical links typically convert from the optical domain to the electronic domain to perform the regeneration and then reconvert the signal back into the optical domain for subsequent transmission. The relatively lower speed of these electronic circuits poses significant design challenges. Additionally, the ability to implement analog front-end functions together with digital processing in a mainstream CMOS technology would have cost advantages compared to a BiCMOS design. However, the relatively lower  $f_T$  of MOS transistors limits the circuit speed that can be achieved in such an approach.

In the receiver front-end, clock information is required for synchronized sampling, so that the input signal is sampled at the optimum time. Figure 1 shows a generic Clock and Data Recovery (CDR) circuit that is based on a charge pump Phase-Locked Loop (PLL) [1]. A PLL can be made to operate at a higher speed than the technology would normally allow if the phase detector (PD) can support an input data rate that is a multiple of the Voltage Controlled Oscillator (VCO) clock frequency. In particular, designs in which the clock runs at one-half of the input data rate have been described [2]-[11]. In this paper, we propose an extension of the half-rate phase detection scheme to one that can support a data rate that is  $2^n$  times the VCO frequency, for n > 1, thereby facilitating a faster analog front-end. As an example, a rate 1/8 design is studied in detail.



Fig. 1. Block diagram of a clock/data recovery system. The phase detector, which is the focus of this work, is circled.

## II. DESIGN OF RATE $1/2^n$ PHASE DETECTORS

In the half-rate PD design of Ref. [2], the incoming data is applied to two master/slave latch pairs which are triggered on opposite polarities of the clock signal. A pair of auxiliary signals, called Error and Reference, are generated by performing XOR operations on the latch outputs. These signals are used to extract two data streams comprising even and odd samples of the input signal but at half of the input data rate. Thus, the circuit also contains a built-in 1:2 demultiplex function, which may be useful for downstream digital processing.

The basic approach that we employ is to instantiate multiple true/complement data paths, each of which is similar in structure to the design used in [2]. Then, instead of having only one Error signal and one Reference signal, we produce a pair of Error and Reference signals from each of these data paths. These are then used to control the charge pump of the PLL in the desired fashion.

### A. Rate 1/8 Phase Detector

The structure of the proposed rate 1/8 PD is shown in Figure 2. The input data signal is applied to four pairs of master/slave latches. Four VCO clocks are used (together with their complements), where each clock is offset by 45 degrees with respect to the adjacent clocks, so that the eight rising and falling edges are evenly spaced over one VCO period. The input data stream is sampled at both the rising and falling edges of each of these four clocks. The input data rate is 8 times as fast as the VCO clock frequency and the four pairs of VCO clocks are used to sample eight consecutive bits of the input stream. Four latches that are transparent during the high time of each clock and four latches that are transparent during the low time of each clock are used in the first stage of the phase detector.

The Q outputs of these initial four pairs of latches are labeled as  $\{O,O'\}$ ,  $\{T,T'\}$ ,  $\{Th,Th'\}$  and  $\{F,F'\}$ , respectively. Within each set, the first signal is the output of the latch that



fa-fd are the Error Signals; ra-rd are the Reference Signals. These serve as inputs to the Charge Pump.

Fig. 2. Schematic diagram of the proposed 1/8 rate phase detector.



Fig. 3. Error signals f1 - f8 in the locked state.

is transparent when the corresponding clock is high while the second is the output of the latch that is transparent when the clock is low. Consider what would happen if a latch becomes transparent exactly in the middle of a bit time interval. In that situation, it would output some bit i during the first half of its transparent phase and bit i+1 during the second half. The corresponding complementary latch, on the other hand, would hold bit i for the entire time interval. If the XOR function is performed on these two signals, we would obtain 0 for the first half of this interval and the XOR of bit i with bit i+1 during the second half of the interval. Since only one of the latches becomes transparent in the middle of each bit period, we can expect to form the XOR of two successive bits for exactly one-half of  $T_{bit}$  in each bit period. We would therefore have eight XOR functions, one in each of the eight bit periods in one period of the VCO clock. These eight XOR functions are shown as f1 to f8 in Figure 3. The duration of an XOR pulse is equal to  $T_{bit}/2$  in the locked state and it would be proportional to the phase lag/lead otherwise.
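The behavior of a single latch pair described above can be checked with a small discrete-time model. The following is only a behavioral sketch, not the circuit of Figure 2: the function name `error_pulse_width`, the sample resolution and the `offset` parameter (clock edge displacement from mid-bit, in samples) are assumptions introduced for illustration, and the gating window is assumed to last exactly one bit time after the rising clock edge.

```python
def error_pulse_width(bits, samples_per_bit=10, offset=0):
    """Width, in samples, of the first XOR pulse produced by one latch pair.

    Assumed model: ideal level-sensitive latches, a clock whose period
    spans eight bit times, and a rising edge placed at mid-bit + offset.
    """
    spb = samples_per_bit
    period = 8 * spb                 # one VCO period covers eight bits
    rise = spb // 2 + offset         # rising edge, nominally in mid-bit
    n = len(bits) * spb
    data = [bits[t // spb] for t in range(n)]
    clk = [((t - rise) % period) < period // 2 for t in range(n)]

    q_hi = q_lo = 0                  # latch outputs (arbitrary initial state)
    xor = []
    for t in range(n):
        if clk[t]:
            q_hi = data[t]           # latch transparent while clock is high
        else:
            q_lo = data[t]           # complementary latch, transparent when low
        xor.append(q_hi ^ q_lo)

    # Observe the XOR only during the bit time that follows the rising edge.
    return sum(xor[rise:rise + spb])


pattern = [0, 1, 1, 0, 1, 0, 0, 1]
print(error_pulse_width(pattern))             # locked: half a bit time
print(error_pulse_width(pattern, offset=2))   # clock edge late: wider pulse
print(error_pulse_width(pattern, offset=-2))  # clock edge early: narrower pulse
```

In this idealized model the pulse width is half a bit time when the edge sits at mid-bit and changes linearly with the edge displacement, which is the property the text relies on. No pulse appears when the two successive bits are equal.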

The Reference signals are used to remove the data dependencies of the Error signals. The eight waveforms in the four pairs {O,O'}, {T,T'}, {Th,Th'} and {F,F'} are sampled by a set of slave latches, each of which is complementary to the previous latch in its path. The signals {OO', O'O}, {TT',T'T}, {ThTh',Th'Th} and {FF',F'F} are produced at the outputs of latches that are transparent on positive and negative levels of clk1, clk2, clk3 and clk4, respectively. These eight second-stage waveforms are XORed to obtain the eight Reference signals r1-r8, as shown in Figure 4.



Fig. 4. Reference signals r1 - r8 in the locked state.

The signals f1 - f8 and r1 - r8 can be created by using the following generic logic function:  $(x1 \oplus x2)(c1.c2)$ . Note that the operation (c1.c2) is not a commutative Boolean AND. Rather, it is asserted only during the interval between a low-to-high transition on c1 and a high-to-low transition on c2. Each of the sixteen functions f1 - f8 and r1 - r8 required for generating the Error and Reference signals can be implemented using the above logic function with x1, x2,

| Function | x1 | x2  | c1    | c2    |
|----------|----|-----|-------|-------|
| f1       | O  | O'  | clk1  | -clk2 |
| f2       | O  | O'  | -clk1 | clk2  |
| f3       | T  | T'  | clk2  | -clk3 |
| f4       | T  | T'  | -clk2 | clk3  |
| f5       | Th | Th' | clk3  | -clk4 |
| f6       | Th | Th' | -clk3 | clk4  |
| f7       | F  | F'  | clk4  | clk1  |
| f8       | F  | F'  | -clk4 | -clk1 |

TABLE I

VARIABLES USED TO IMPLEMENT THE ERROR FUNCTIONS f1 - f8.

| Function | x1    | x2    | c1    | c2    |
|----------|-------|-------|-------|-------|
| r1       | O'O   | T'T   | clk2  | -clk3 |
| r2       | OO'   | TT'   | -clk2 | clk3  |
| r3       | T'T   | Th'Th | clk3  | -clk4 |
| r4       | TT'   | ThTh' | -clk3 | clk4  |
| r5       | Th'Th | F'F   | clk4  | clk1  |
| r6       | ThTh' | FF'   | -clk4 | -clk1 |
| r7       | F'F   | OO'   | -clk1 | clk2  |
| r8       | FF'   | O'O   | clk1  | -clk2 |

TABLE II

VARIABLES USED TO IMPLEMENT THE REFERENCE FUNCTIONS r1 - r8.

c1 and c2 as shown in Tables I and II. The required boosting of the Error signal amplitudes can be done in a manner similar to that described in [2].
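The generic function $(x1 \oplus x2)(c1.c2)$ can be illustrated on sampled waveforms. The sketch below is a behavioral model under stated assumptions, not the circuit template used in the implementation: the name `gated_xor` and the list-of-samples representation are hypothetical, and the window is assumed to open on a low-to-high transition of c1 and close on a high-to-low transition of c2.

```python
def gated_xor(x1, x2, c1, c2):
    """Return (x1 XOR x2) gated by the non-commutative window (c1.c2).

    All arguments are equal-length sequences of 0/1 samples. The window
    opens on a rising edge of c1 and closes on a falling edge of c2.
    """
    out = []
    window = False
    prev_c1, prev_c2 = c1[0], c2[0]
    for a, b, k1, k2 in zip(x1, x2, c1, c2):
        if k1 and not prev_c1:      # low-to-high transition on c1
            window = True
        if prev_c2 and not k2:      # high-to-low transition on c2
            window = False
        out.append((a ^ b) if window else 0)
        prev_c1, prev_c2 = k1, k2
    return out


x1 = [1, 1, 1, 1, 1, 1, 1, 1]
x2 = [0, 0, 0, 0, 0, 0, 0, 0]
c1 = [0, 0, 1, 1, 1, 1, 1, 1]
c2 = [1, 1, 1, 1, 1, 0, 0, 0]
print(gated_xor(x1, x2, c1, c2))  # XOR visible only inside the window
print(gated_xor(x1, x2, c2, c1))  # swapping c1 and c2 gives a different result
```

Swapping the two clock arguments changes the output, which is the sense in which (c1.c2) differs from an ordinary Boolean AND.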

### B. Generalization to Rate $1/2^n$

The rate 1/8 scheme can be generalized to rate  $1/2^n$  in a straightforward fashion. For integer n > 1, an input data rate of  $2^n$  times the VCO clock frequency is supported by using  $2^{n-1}$  clocks (together with their complements), each having a phase difference of  $\pi/2^{n-1}$  from its neighbor. Waveforms similar to those in Figures 3 and 4 can then be drawn, and the signals to be used for x1, x2, c1 and c2 can be deduced for generating the  $2^{n-1}$  Error and Reference functions. Larger values of *n* give higher speed, but at the expense of increased area, power and design complexity.
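The clocking requirements implied by this generalization can be tabulated with a few lines of code. This is a minimal sketch that only applies the relations stated above ($2^{n-1}$ clocks spaced by $\pi/2^{n-1}$, data rate equal to $2^n$ times the VCO frequency). The function name `sampling_plan` and the returned dictionary keys are assumptions for illustration, and the falling edges are taken to be the rising edges of the complementary clocks.

```python
from fractions import Fraction


def sampling_plan(n, data_rate_hz):
    """Clock phases and sampling edges for a rate 1/2^n phase detector."""
    if n < 2:
        raise ValueError("the scheme is described for integer n > 1")
    num_clocks = 2 ** (n - 1)
    step = Fraction(180, num_clocks)          # pi / 2^(n-1), in degrees
    rising = [k * step for k in range(num_clocks)]
    falling = [p + 180 for p in rising]       # edges of the complement clocks
    edges = sorted(rising + falling)
    return {
        "vco_hz": data_rate_hz / 2 ** n,
        "clock_phases_deg": [float(p) for p in rising],
        "edge_phases_deg": [float(p) for p in edges],
    }


plan = sampling_plan(3, 2e9)   # rate 1/8 at a 2 Gb/s input
print(plan["vco_hz"])
print(plan["clock_phases_deg"])
print(plan["edge_phases_deg"])
```

For n = 3 this yields four clock phases and eight evenly spaced sampling edges per VCO period, one per input bit.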

## III. IMPLEMENTATION DETAILS AND SIMULATION RESULTS

A circuit simulation of the rate 1/8 phase detector was done using rail-to-rail CMOS circuits in the 0.18 micron process from TSMC that is available through MOSIS. The latches are implemented using transmission gates and inverters. The aforementioned logic function has been implemented using the circuit template of Figure 5.

The amplitudes of the Error signal pulses are boosted by a factor of two using the symmetric current-steered XOR gate design of [2]. Four charge pumps are employed as shown in Figure 6 and their currents are added at nodes X and Y to produce a final differential drive signal for the VCO.

The simulations were carried out for an input data rate of 2 Gbps and therefore a VCO that operates in the neighborhood of 250 MHz is sufficient. Figure 7 shows the Error and Reference signals, fa and ra, in the locked condition. (The waveforms for the three other pairs of Error and Reference



Fig. 5. Circuit template used for implementing the 16 Error and Reference functions



Fig. 6. Connection of the four charge pumps to create a differential drive for the VCO.

signals are similar.) The figure shows that, after an initial startup time, the Error signal pulse has approximately twice the amplitude and half the width of the Reference pulse, as expected. Thus, in the locked state, these two signals can have the same average value.
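The equality of the averages can be made explicit with a short calculation. It assumes idealized rectangular pulses, with $A$ denoting the Reference pulse amplitude and $T_{avg}$ the averaging interval (both symbols are introduced here for illustration), an Error pulse of amplitude $2A$ and width $T_{bit}/2$, and a Reference pulse of width $T_{bit}$:

$$
\overline{\text{Error}} = \frac{1}{T_{avg}} \cdot 2A \cdot \frac{T_{bit}}{2} = \frac{A\,T_{bit}}{T_{avg}},
\qquad
\overline{\text{Reference}} = \frac{1}{T_{avg}} \cdot A \cdot T_{bit} = \frac{A\,T_{bit}}{T_{avg}}
$$

Under the same assumptions, if a timing error $\Delta t$ changes the Error pulse width to $T_{bit}/2 + \Delta t$ while the Reference pulse is unchanged, the difference of the averages becomes $2A\,\Delta t / T_{avg}$, which is linear in $\Delta t$ and vanishes in the locked state.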



Fig. 7. The simulated Error and Reference signal waveforms fa and ra.

The PD characteristic is shown in Figure 8. The input data was given various phase lags and leads, and the average value of Error - Reference over one bit period was calculated from the simulation results. This average is plotted against the phase error in nanoseconds. The PD is observed to have an approximately linear characteristic. The graph reveals that the averages of the Error and Reference signals become equal at a phase error of about 0.04 nanoseconds. For a bit period of 0.5 nanoseconds, this amounts to a systematic offset error of 8%.
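The same offset can also be expressed relative to the VCO clock. Assuming the VCO runs at exactly 250 MHz (the text states only that it operates in that neighborhood), with $T_{VCO}$ and $t_{off}$ introduced here as shorthand for the VCO period and the measured offset:

$$
T_{bit} = \frac{1}{2\ \text{Gb/s}} = 0.5\ \text{ns}, \qquad
T_{VCO} = 8\,T_{bit} = 4\ \text{ns}, \qquad
\frac{t_{off}}{T_{bit}} = \frac{0.04}{0.5} = 8\%, \qquad
\frac{t_{off}}{T_{VCO}} = \frac{0.04}{4} = 1\% \;\approx\; 3.6^\circ
$$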

## IV. CONCLUSIONS

An extension strategy for the half-rate phase detection scheme of [2] has been proposed. As a specific design example, the implementation of a rate 1/8 phase detector has been



Fig. 8. Simulated PD output characteristic

described. Simulations have been performed to demonstrate that the circuit operates as desired. The results show that an approximately linear PD characteristic is achieved and a built-in 1:8 demultiplex operation is automatically obtained. Moreover, other fractional rate phase detectors can be constructed using similar techniques. These architectures can be used to extend the maximum operating frequency of clock and data recovery circuits in any given CMOS technology.

## V. ACKNOWLEDGEMENTS

This work was partially supported by an equipment donation from Intel Corp.

## REFERENCES

- [1] J. Savoj and B. Razavi, "High-Speed CMOS Circuits for Optical Receivers," Kluwer Academic Publishers, 2001.
- [2] J. Savoj and B. Razavi, "A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector," *IEEE Journal of Solid-State Circuits*, pp. 761-768, May 2001.
- [3] J. Savoj and B. Razavi, "Design of Half-Rate Clock and Data Recovery Circuits for Optical Communication Systems," *Design Automation Conference*, pp. 121-126, 2001.
- [4] J. Savoj and B. Razavi, "A 10-Gb/s CMOS Clock and Data Recovery Circuit," *Symposium on VLSI Circuits*, pp. 136-139, 2000.
- [5] J. Savoj and B. Razavi, "A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Binary Phase/Frequency Detector," *IEEE Journal of Solid-State Circuits*, Vol. 38, No. 1, pp. 13-21, 2003.
- [6] B. Razavi, "Challenges in the Design of High-Speed Clock and Data Recovery Circuits," *IEEE Communications Magazine*, Vol. 40, No. 8, pp. 94-101, 2002.
- [7] M. Wurzer, et al., "40-Gb/s Integrated Clock and Data Recovery Circuit in a Silicon Bipolar Technology," *Proceedings of the Bipolar/BiCMOS Circuits and Technology Meeting*, pp. 136-139, Sept. 1998.
- [8] M. Rau, et al., "Clock/Data Recovery PLL using Half-Frequency Clock," *IEEE Journal of Solid-State Circuits*, Vol. 32, pp. 1156-1159, July 1997.
- [9] P. Larsson, "An Offset-Cancelled CMOS Clock-Recovery/Demux with a Half-Rate Linear Phase Detector for 2.5 Gb/s Optical Communication," *IEEE International Solid-State Circuits Conference*, pp. 74-75 and 434, 2001.
- [10] M. Ramezani, et al., "Analysis of a Half-Rate Bang-Bang Phase-Locked-Loop," *IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing*, Vol. 49, No. 7, pp. 505-509, 2002.
- [11] J. Rogers and J. Long, "A 10-Gb/s CDR/DEMUX with LC Delay Line VCO in 0.18-μm CMOS," *IEEE Journal of Solid-State Circuits*, Vol. 37, No. 12, pp. 1781-1789, 2002.