A 38-GS/s 7-bit Pipelined-SAR ADC With Speed-Enhanced Bootstrapped Switch and Output Level Shifting Technique in 22-nm FinFET

Abstract: Efficient time-interleaved (TI) analog-to-digital converters (ADCs) that operate at high sample rates with wide input bandwidths are necessary to support increasing wireline transceiver data rates. This article presents a 7-bit 38-GS/s 32-way TI ADC that utilizes an eight-way interleaver architecture based on a speed-enhanced bootstrapped switch that increases input bandwidth. ADC sample rate and efficiency are improved with pipelined-successive approximation register (SAR) unit ADCs that employ an output level shifting (OLS) settling technique in the dynamic residue amplifier to achieve settling in only 33% of the time required for a conventional current-mode logic (CML) amplifier. Using parallel comparators in the two 4-bit asynchronous pipeline stages allows for further improvements in ADC conversion speed. Fabricated in 22-nm FinFET, the proposed ADC occupies an area of 0.107 mm². Operating at 38 GS/s, the ADC achieves 41.9 fJ/conv.-step with low input frequencies and 64.1 fJ/conv.-step at Nyquist, and has a 20-GHz 3-dB input bandwidth.

I. INTRODUCTION
HIGH-SPEED time-interleaved (TI) analog-to-digital converters (ADCs) are becoming more popular in wireline receiver front ends in order to enable powerful digital equalization and support modulation schemes with improved spectral efficiency [1], [2], [3], [4], [5]. Fig. 1 shows a common implementation of a TI ADC [6] that utilizes an interleaver architecture with two switching ranks. The first rank is a high-speed track and hold (T/H) circuit that must sample the full input signal bandwidth with a number of low-jitter, precisely spaced $\Phi_{T/H}$ clocks. This Rank 1 interleave factor is typically limited to between 4 and 16 due to the difficulty of generating and distributing the $\Phi_{T/H}$ clocks and also to reduce input buffer loading. Each input T/H is followed by parallel second rank switches sampling with the $\Phi_{ADC}$ clocks to form the unit ADC inputs. The ADC's overall interleave factor is equal to the number of parallel Rank 2 switches, with each unit ADC performing conversion at a rate of the full ADC sample rate divided by the interleave factor. While this two-rank interleaver architecture relaxes the number of critical $\Phi_{T/H}$ clocks, sampling the wideband input signal with sufficient high-speed linearity is difficult with conventional bootstrapped switch T/H circuits. A reason for this is that low duty-cycle sampling clocks are often utilized to avoid sampling crosstalk between the TI channels. This shortens the tracking time and degrades the performance improvement offered by bootstrapped switches relative to simple Rank 1 NMOS switches [7], [8], [9], [10]. However, utilizing simple NMOS switches can restrict the input buffer's output common-mode level and result in a reduced ADC full-scale range. Low supply voltages (VDDs) can also result in relatively large switches to satisfy settling requirements, resulting in increased input buffer loading and reduced bandwidth. This motivates improved bootstrapped switch topologies that can improve high-speed linearity when operating with low duty-cycle sampling clocks.
The development of high-speed low-power unit ADCs can also reduce the interleaving factor for a given effective sampling rate, resulting in a reduced number of critical Rank 1 clocks, higher input bandwidth, smaller area, and an overall simpler design. Successive approximation register (SAR) ADC architectures are popular due to their low comparator count and simple digital logic implementation, making them suitable for compact and power-efficient mid-resolution TI ADCs [11]. However, the conversion speed is limited in the most common implementation of the successive approximation algorithm, which performs sequential single-bit conversion cycles. As shown in Fig. 2, introducing pipelining in the SAR ADC provides improved speed by decreasing the number of conversion cycles per input sampling event, and employing redundancy relaxes comparator performance requirements for improved power efficiency. A critical block in this architecture is the amplifier that transfers the residue signal between the two pipeline stages. In high-speed converters, conventional op-amp-based amplifiers are not suitable due to the excessive static power required to meet settling time requirements. An alternative approach is to use a dynamic residue amplifier that is only activated once over the entire conversion process [12].
0018-9200 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
While dynamic residue amplifiers have the potential to save power, the traditional settling process limits the speed because most of the time is consumed by the second half of the settling (Fig. 2). Thus, these topologies require a small $\tau$ to achieve fast settling times. Satisfying this while maintaining a given gain can result in large dynamic tail current values and increased input transistor size that will load the first pipeline stage capacitive digital-to-analog converter (CDAC). Given that the smallest CDAC that satisfies the $kT/C$ noise requirement is desired to reduce input buffer power, this loading can cause significant reference attenuation that must be compensated with an increased-range reference buffer that is difficult to implement with low VDDs. Another issue is kickback noise due to coupling through the large dynamic amplifier input transistors.
This article presents a 38-GS/s 7-bit TI ADC that improves interleaver bandwidth with a speed-enhanced bootstrapped switch [13] and utilizes high-speed and low-power pipelined-SAR unit ADCs [14]. The ADC architecture and interleaver structure with the proposed speed-enhanced bootstrapped switch are described in Section II. Section III discusses the pipelined-SAR unit ADC that utilizes a novel output level shifting (OLS) settling technique to enable high-speed operation and low-power consumption of the dynamic residue amplifier with reasonable hardware overhead. Experimental results from a 22-nm CMOS FinFET prototype are presented in Section IV. Finally, Section V concludes this article.

II. ADC OVERVIEW
A. ADC Architecture
Fig. 3 shows the proposed 38-GS/s 7-bit 32-way TI ADC architecture. T-coil termination is utilized at the differential ADC input to distribute the capacitance of the input pads, electrostatic discharge (ESD) diodes, and 100-Ω differential termination resistor [15]. Two input buffers then each drive four parallel Rank 1 odd/even speed-enhanced bootstrapped T/H switches that sample the input onto capacitors. As shown in the timing diagram of Fig. 4, these T/Hs are clocked by $f_s/8$ 25% duty-cycle $\Phi_{T/H}$ pulses in order to avoid sampling crosstalk between adjacent odd/even T/H channels. Each T/H sampled voltage is then buffered to drive four parallel 7-bit pipelined-SAR unit ADCs. The Rank 2 switches sample the buffered signal onto the unit ADC's first-stage CDAC using parallel non-overlapping $\Phi_{ADC}$ clocks, such that the buffer always sees an effective load of only a single CDAC. Each pipelined-SAR unit ADC then performs the 7-bit conversion at the $f_s/32$ rate of 1.1875 GS/s. ADC clock generation is realized with a differential external $f_s/2$ clock that is connected to an input current-mode logic (CML) buffer and then passed through a CML divider. The four CML divider output phases are then converted to CMOS levels and fed into a clock generation block that produces the eight T/H clocks and 32 unit ADC clocks. For testing, the 32 channels of 7-bit digital output data and the 32 clock phases from the unit ADCs are captured by a synchronization block and then decimated to a few tens of MHz to be efficiently driven off chip.
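The clock rates quoted above follow directly from the interleave factors. The short Python sketch below reproduces that arithmetic; the function and dictionary key names are illustrative only, and the track window assumes that the 25% duty-cycle pulse corresponds to the tracking phase.

```python
# Rates implied by the architecture description (numeric values from the text).
FS = 38e9          # aggregate sample rate, samples/s
RANK1_WAYS = 8     # Rank 1 T/H channels
TOTAL_WAYS = 32    # unit ADCs
DUTY = 0.25        # T/H clock duty cycle (assumed to be the track phase)


def interleaver_rates(fs, rank1_ways, total_ways, duty):
    """Return the main clock rates and time windows of a two-rank interleaver."""
    th_clock_hz = fs / rank1_ways
    return {
        "external_clock_hz": fs / 2,                 # f_s/2 input clock
        "th_clock_hz": th_clock_hz,                  # f_s/8 T/H clocks
        "unit_adc_rate": fs / total_ways,            # f_s/32 unit ADC rate
        "adcs_per_th": total_ways // rank1_ways,     # unit ADCs behind each T/H
        "track_window_s": duty / th_clock_hz,        # width of one T/H pulse
        "sample_period_s": 1.0 / fs,
    }


rates = interleaver_rates(FS, RANK1_WAYS, TOTAL_WAYS, DUTY)
for name, value in rates.items():
    print(f"{name}: {value:.6g}")
```

Under these assumptions the T/H pulse spans two aggregate sample periods (about 52.6 ps), which is why adjacent odd/even channels do not track at the same time.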

B. Interleaver Design
Achieving high ADC input bandwidth with reasonable power consumption requires careful optimization of the Rank 1 T/H, Rank 2 buffer and switch network, and duty cycles of the $\Phi_{T/H}$ and $\Phi_{ADC}$ clocks. Fig. 5(a) shows an effective schematic of an interleaver channel, with the buffered input signal first sampled and held on the 40-fF Rank 1 sampling capacitor $C_S$ that is connected to the gate $V_G$ of the Rank 2 source follower buffer that drives the following unit ADCs. The Rank 2 sample and hold switch inside the unit ADC is enabled when the Rank 1 T/H switch is in the hold phase. The Rank 2 buffer dominates the interleaver power consumption because, to achieve the desired settling error, it must drive long routing parasitics, $C_P$, that originate from area mismatches between the Rank 1 T/H switches and the unit ADCs. These routing parasitics are primarily capacitive due to the utilization of wide top-layer metal routing, with the total 151-fF $C_P$ consisting of the routing parasitic capacitance (64%), source follower buffer output capacitance (17%), and the input capacitance of the four connected bootstrapped switches (19%). Thus, it is critical to optimize the Rank 2 buffer bandwidth to achieve a target settling error with reasonable power consumption and to minimize its signal-dependent input capacitance for improved linearity. Fig. 5(b) shows the small-signal equivalent circuit during the Rank 1 T/H hold phase. The settling error on the unit ADC CDAC is a function of the Rank 2 buffer bandwidth $BW = g_{m1}/(2\pi C_P)$, the switch time constant $\tau_{unit\_ADC} = R_{on3} C_{DAC}$, and the allotted settling time during this hold phase. Fig. 5(c) plots the simulated worst-case output settling error versus hold time for different buffer bandwidths with a Nyquist rate input and $C_{DAC}$ initially reset to avoid any memory effect. A 6.25-ps switch time constant and a $C_P/C_{DAC}$ of ≈2, obtained from layout extraction, are utilized.
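To give a feel for the bandwidth versus hold-time tradeoff described above, the following Python sketch evaluates the step-settling error of two cascaded first-order lags: one for the Rank 2 buffer and one for the 6.25-ps switch time constant. This is a simplified stand-in, not the simulation behind Fig. 5(c): it ignores the loading interaction between $C_P$ and $C_{DAC}$, uses a step rather than a Nyquist-rate input, and the swept bandwidths and hold times are hypothetical values chosen for illustration.

```python
import math


def settling_error(t_hold, buffer_bw_hz, tau_switch):
    """Relative step-settling error of two cascaded first-order lags."""
    tau_buf = 1.0 / (2.0 * math.pi * buffer_bw_hz)
    if math.isclose(tau_buf, tau_switch):
        # Repeated-pole case.
        return (1.0 + t_hold / tau_buf) * math.exp(-t_hold / tau_buf)
    num = (tau_buf * math.exp(-t_hold / tau_buf)
           - tau_switch * math.exp(-t_hold / tau_switch))
    return num / (tau_buf - tau_switch)


TAU_SWITCH = 6.25e-12                      # switch time constant from the text
for bw in (5e9, 10e9, 20e9):               # hypothetical buffer bandwidths
    for t_hold in (50e-12, 100e-12, 150e-12):  # hypothetical hold times
        err = settling_error(t_hold, bw, TAU_SWITCH)
        print(f"BW={bw / 1e9:4.0f} GHz  hold={t_hold * 1e12:5.0f} ps  "
              f"error={err:.2e}  (~{-math.log2(err):.1f} bits)")
```

In this simplified model, a lower buffer bandwidth can be tolerated when more hold time is available, which is the relaxation that the low duty-cycle Rank 1 clocks provide.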
Utilizing the Rank 1 interleave factor of 8, which corresponds to 4.75-GHz $\Phi_{T/H}$ clocks at 38 GS/s, the required Rank 2 buffer bandwidth for
While using 25% duty-cycle sampling clocks allows for effectively driving only one activated Rank 1 T/H switch at a time, resulting in relaxed input buffer bandwidth and reduced power consumption, it does necessitate that the T/H has a fast startup time. This is difficult because the T/H is loaded by the Rank 2 buffer, which has to be sized sufficiently to drive long routing parasitics to the Rank 2 switches. Recently, alternative bootstrapped switch topologies have been proposed that improve upon the original implementation [16]. One implementation adds two additional PMOS devices and modifies the gate connection to the NMOS charging the boosting capacitor to improve the transition times of the main switch gate [17]. However, one of these additional PMOS devices is directly connected to the input and adds loading. Another design utilizes feedback biasing of the main switch gate by passing the switch output through a PMOS source follower [18]. However, this only yields a $V_{GS}$ boost in the gate voltage.
The proposed bootstrapped switch topology, shown in Fig. 6, modifies the $M_{N1}$ gate connection by directly driving this node with the $\phi$ clock signal. As soon as the clock is enabled, the low-threshold-voltage $M_{N1}$ device turns on to transfer the boosted voltage to the $M_{NSW}$ gate to reduce startup time and offer better tracking of the high-speed input. $M_{N5}$ is also added to rapidly pull up the $M_{NSW}$ gate signal upon entering track mode to further improve the startup time. While ideally the $M_{NSW}$ gate voltage is boosted by a full VDD value, there is some capacitive voltage division due to the $M_{P2}$ and $M_{P1}$ n-well loading. Post-layout transient simulation waveforms show that the proposed topology has a wider switch-on pulse, faster startup, and better tracking relative to a conventional bootstrapped switch [16]. At the effective $f_s/8$ T/H frequency of 4.75 GHz for 38-GS/s operation, this results in 0.75- and 1.1-b improvement in effective number of bits (ENOB) with 20- and 30-GHz input signals. This shows that the Rank 1 25% duty-cycle sampling clock is not limiting the performance. Projecting this bootstrapped switch operation in ADCs with higher speed 16-GHz clocks shows further improvement of 1.8 b with both 20- and 30-GHz input signals.
Hold-phase signal feedthrough is compensated with a dummy switch transistor connected to the opposite input signal polarity. This dummy switch utilizes a gate grounding network that yields an impedance similar to the network on the main $M_{NSW}$ switch transistor gate. In Fig. 7, the T/H simulation results with a Nyquist input show that the hold-phase feedthrough improves from 1.7 mVppd to 36 µVppd when this dummy grounding network is employed instead of an ideal ground connection, offering approximately 1-bit improvement in the ENOB.
The ADC input buffers are NMOS source followers that have a low output common mode that allows the proposed speed-enhanced bootstrapped switch to operate with the nominal clock that toggles between ground and VDD. PMOS source followers are then utilized for the Rank 2 buffers to yield a high input common mode for high-speed operation of the unit ADC comparators. As discussed previously, relaxing the Rank 2 buffer bandwidth with the 25% duty-cycle sampling clocks allows for a reduction in its signal-dependent input capacitance and limits the resulting ENOB degradation to only 0.95 b with a Nyquist input.

C. Multi-Phase Clock Generator
An external $f_s/2$ clock passes through a T-coil termination network with reduced capacitance ESD, a CML input buffer, and then a CML divide-by-2 block to generate four-phase $f_s/4$ clocks with 90° phase spacing. These four-phase CML clocks are then fed to a CML-to-CMOS converter and pass through CMOS divide-by-2 blocks to output eight-phase $f_s/8$ clocks with 50% duty cycle and 45° phase spacing. The eight-phase $f_s/8$ clocks enable pass gates in the $f_s/4$ clock paths to generate the eight 25% duty-cycle clocks that are distributed to the Rank 1 T/Hs. Skew calibration is performed with programmable capacitive loads placed at the output of several buffer stages to maintain reasonable edge rates for simulated 105-fs rms jitter performance. As shown in the corner simulation results of Fig. 9, 7-bit control provides a skew calibration range in excess of the 6σ phase mismatch, with a resolution that ranges from 65 to 123 fs over temperature and ±100-mV VDD variations.
As shown in Fig. 8, the 32-phase $f_s/32$ unit ADC clocks are derived by passing each eight-phase $f_s/8$ clock through a divide-by-4 block and four shift registers with NAND/NOR logic gates that set the ADC sampling pulsewidth equal to the T/H clock pulse hold width. Appropriate alignment of the unit ADC clock sampling and T/H clock hold phases is achieved with the 3-b phase rotator block inserted between the T/H and unit ADC clock generator blocks with static control settings.
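Two quick sanity checks on the clocking numbers above can be sketched in Python. The first estimates the skew tuning range implied by a 7-bit control word, assuming (hypothetically) a uniform delay step across all codes. The second applies the standard jitter-limited SNR expression for a full-scale sinewave, $-20\log_{10}(2\pi f_{in}\sigma_t)$, to the simulated 105-fs rms jitter; the 19-GHz input frequency is simply $f_s/2$ and is used here as a stand-in for a Nyquist input.

```python
import math


def skew_range_s(bits, step_s):
    """Tuning range of a delay control with a uniform step (an assumption)."""
    return (2 ** bits - 1) * step_s


def jitter_limited_snr_db(f_in_hz, sigma_t_s):
    """SNR ceiling set by rms sampling jitter for a full-scale sinewave."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_t_s)


range_min = skew_range_s(7, 65e-15)    # finest quoted resolution
range_max = skew_range_s(7, 123e-15)   # coarsest quoted resolution
snr_nyquist = jitter_limited_snr_db(19e9, 105e-15)

print(f"skew range: {range_min * 1e12:.2f} ps to {range_max * 1e12:.2f} ps")
print(f"jitter-limited SNR at 19 GHz: {snr_nyquist:.1f} dB")
```

Under these assumptions the jitter ceiling lands near 38 dB, which suggests that sampling jitter is one of the contributors at the highest input frequencies rather than a negligible term.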

III. UNIT ADC DESIGN
A. OLS Settling Technique
While the pipelined-SAR ADC architecture reduces the required settling accuracy of the residue amplifier, it is still challenging to achieve this at high speeds. Upon activation, as shown in Fig. 2, the conventional dynamic amplifier output settles as $V_{Amp} = A_{CML} V_{in} (1 - e^{-t/\tau})$, where $A_{CML}$ is the amplifier gain. The output settles to 50% of the steady-state value in a rapid $0.69\tau$, but requires an additional $2.77\tau$ to settle to the 4-bit accuracy required in the second pipeline stage. The brute-force method of reducing this settling time is to reduce the load resistor to decrease $\tau$, but this leads to increased tail current values and large input transistor sizes.
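The two settling times quoted above can be recovered from the exponential response. The worked steps below assume that "4-bit accuracy" means a residual error of $1/32$ of the final value (half an LSB at 4 bits); this interpretation is an assumption, chosen because it reproduces the quoted $0.69\tau$ and $2.77\tau$ figures.

```latex
\begin{align}
V_{Amp}(t) &= A_{CML} V_{in}\left(1 - e^{-t/\tau}\right), \\
t_{50\%} &= \tau \ln 2 \approx 0.69\,\tau, \\
t_{1/32} &= \tau \ln 32 = 5\,\tau \ln 2 \approx 3.47\,\tau, \\
t_{1/32} - t_{50\%} &= \tau \ln 16 \approx 2.77\,\tau .
\end{align}
```

The first half of the output swing therefore takes only about one fifth of the total settling time, which is the observation that the OLS technique exploits.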
Previously, OLS techniques were developed to reduce errors in feedback amplifiers that occur from finite op-amp gain [19], [20]. In these works, an initial estimate of the desired output voltage is sampled on a level shifting capacitor, and then, this capacitor is switched in series with the op-amp output and the feedback amplifier output to improve settling accuracy. This concept is modified in the proposed pipelined-SAR unit ADC to dramatically improve the settling time of the open-loop dynamic residue amplifier by utilizing the second pipeline stage CDAC2 as the level shifting capacitor. Fig. 10 gives an overview of the proposed OLS settling technique. When $\phi_{Amp}$ is high and the amplifier is activated, the differential output voltage is stored on both sides of CDAC2 by connecting the nominal amplifier output to the top plate and the opposite output to the bottom plate. This $\phi_{Amp}$ duration should nominally match the rapid 50% settling time. After this, $\phi_{OLS}$ is enabled to switch the CDAC2 bottom plate to the common mode. Charge conservation during this phase produces a rapid doubling of the amplifier output signal. Thus, the amplifier output voltage only needs to initially settle to 50% of the steady-state value, and the long second half of the settling is avoided. The significant speedup offered by the OLS technique is achieved with the reasonable hardware overhead of only one extra bottom-plate CDAC2 switch and enable logic.
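The doubling that results from charge conservation can be illustrated with a single-ended half-circuit model in Python. This is a minimal sketch under stated assumptions: voltages are measured relative to the common mode, the amplifier output is treated as floating once $\phi_{Amp}$ goes low, and `c_par` is a hypothetical top-plate parasitic capacitance that is not specified in the text. The 16-fF value is the CDAC2 size given later in the article.

```python
def ols_top_plate(v_amp, c_dac, c_par=0.0):
    """Top-plate voltage after the bottom plate is switched to common mode.

    During the amplification phase the top plate sits at +v_amp and the
    bottom plate at -v_amp (both relative to common mode). The top node then
    floats, so its charge is conserved when the bottom plate moves to 0.
    """
    q_top = c_dac * (v_amp - (-v_amp)) + c_par * v_amp  # charge before switching
    return q_top / (c_dac + c_par)                      # voltage after switching


C_DAC2 = 16e-15                                         # CDAC2 value from the text
ideal = ols_top_plate(10e-3, C_DAC2)                    # no parasitic
with_parasitic = ols_top_plate(10e-3, C_DAC2, c_par=4e-15)  # hypothetical 4 fF
print(f"ideal: {ideal * 1e3:.1f} mV, with parasitic: {with_parasitic * 1e3:.1f} mV")
```

With no parasitic the 10-mV example doubles to 20 mV. A top-plate parasitic reduces the effective multiplication slightly, which in this simplified picture behaves like a static gain change.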
A simplified schematic of the OLS dynamic amplifier, which offers several improvements relative to a conventional CML dynamic amplifier, is shown in Fig. 11. Instead of utilizing a simple resistive-loaded differential pair, this inverter-based amplifier structure provides both PMOS and NMOS transconductance to provide a higher gain of $A = (g_{mp} + g_{mn})(r_{on} \parallel r_{op}) \approx 2A_{CML}$ at lower VDDs. While the amplifier has high-impedance outputs, a stable output common mode is achieved by resetting the CDAC2 top plate to the common mode prior to activation. One downside of this OLS amplifier is that the equivalent capacitive loading is 4× larger than that of the conventional CML amplifier due to both sides of CDAC2 being connected to each amplifier output and each capacitor experiencing Miller multiplication. Even with this larger load, the increased amplifier gain means that the time for the OLS amplifier to achieve 50% settling relative to the original CML amplifier is only 33% of the $3.46\tau$ required by the conventional CML amplifier at 4-bit resolution with the same tail current. This also results in lower average power due to the dynamic amplifier's reduced activation time. Moreover, the required amplifier's linear output swing range is decreased by a factor of 2. One potential issue with the proposed amplifier is matching the duration of $\phi_{Amp}$ with the 50% settling point. However, high precision is not necessary, as any inaccuracy simply results in a modified gain value that is easily compensated with the adjustment of the second-stage reference voltages.
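The 33% figure can be cross-checked with a short derivation. The steps below are a reconstruction under stated assumptions rather than the expression from the original article: the OLS amplifier is modeled with a gain of exactly $2A_{CML}$ and a time constant of $4\tau$ (the same output resistance with 4× the capacitive load), and it only needs to reach half of the conventional amplifier's final value $A_{CML}V_{in}$ before the level shift doubles it.

```latex
\begin{align}
V_{OLS}(t) &= 2A_{CML} V_{in}\left(1 - e^{-t/(4\tau)}\right), \\
V_{OLS}(t_{OLS}) = \tfrac{1}{2} A_{CML} V_{in}
  \;&\Rightarrow\; 1 - e^{-t_{OLS}/(4\tau)} = \tfrac{1}{4}, \\
t_{OLS} &= 4\tau \ln\tfrac{4}{3} \approx 1.15\,\tau, \\
\frac{t_{OLS}}{3.46\,\tau} &\approx 0.33 .
\end{align}
```

Under these assumptions the amplifier is switched off after covering only a quarter of its own steady-state swing, which is consistent with the statement that the required linear output range is reduced.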
The $\phi_{Amp}$ pulsewidth jitter $\sigma_t$ is another issue that needs to be considered. As shown in Fig. 12(a), $\phi_{Amp}$ pulsewidth jitter causes timing variance on the level shifting point that results in output voltage noise. Intuitively, a small amplifier time constant $\tau$ gives a very fast transition time that makes the output voltage more sensitive to the timing variance. Thus, the residue amplifier output-referred noise power is a function of the amplifier time constant $\tau$ and the jitter.
Due to the OLS doubling effect, the jitter-induced voltage noise also doubles. Assuming a sampled sinusoidal input with amplitude $v_{in,peak}$ and that $2v_{o,peak}$ is set equal to the 125-mV full-scale range of the second pipeline stage, Fig. 12(b) plots the average output-referred jitter-induced noise power versus jitter with reasonable amplifier time constants for 38-GS/s operation. This shows that the $\phi_{Amp}$ jitter specifications are not prohibitive, as a relatively large 5.4-ps rms jitter can be tolerated to achieve 4-bit accuracy with an 80-ps amplifier time constant. This is far larger than the simulated 410-fs rms $\phi_{Amp}$ jitter, which corresponds to 0.17-mV rms output-referred noise. As the total output-referred noise is 0.97 mV rms, this jitter-induced component is only 3.0% of the total output noise power.
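The jitter numbers above can be cross-checked with a few lines of Python. This sketch makes two assumptions that are not stated in the text: that the jitter-induced noise voltage scales linearly with rms jitter (a first-order approximation), and that "4-bit accuracy" means the jitter noise equals the quantization noise of a 4-bit converter over the 125-mV range.

```python
import math

FULL_SCALE_V = 125e-3    # second-stage full-scale range (from the text)
BITS = 4                 # second-stage resolution
SIM_JITTER_S = 410e-15   # simulated rms jitter (from the text)
SIM_NOISE_V = 0.17e-3    # jitter-induced rms noise at that jitter (from the text)
TOTAL_NOISE_V = 0.97e-3  # total output-referred rms noise (from the text)


def quantization_noise_rms(full_scale, bits):
    """RMS quantization noise of an ideal converter: LSB / sqrt(12)."""
    return full_scale / (2 ** bits) / math.sqrt(12.0)


def scaled_noise(noise_ref, jitter_ref, jitter):
    """First-order assumption: noise voltage proportional to rms jitter."""
    return noise_ref * jitter / jitter_ref


q_noise = quantization_noise_rms(FULL_SCALE_V, BITS)
noise_at_limit = scaled_noise(SIM_NOISE_V, SIM_JITTER_S, 5.4e-12)
power_fraction = (SIM_NOISE_V / TOTAL_NOISE_V) ** 2

print(f"4-bit quantization noise: {q_noise * 1e3:.2f} mV rms")
print(f"jitter noise at 5.4 ps:   {noise_at_limit * 1e3:.2f} mV rms")
print(f"jitter share of noise power: {power_fraction * 100:.1f} %")
```

Under these assumptions the noise extrapolated to 5.4-ps jitter lands within about 1% of the 4-bit quantization noise, and the simulated 0.17 mV accounts for roughly 3% of the total noise power, in line with the figures quoted above.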

B. Unit ADC Architecture and Building Blocks
The 7-bit unit pipelined-SAR ADC is shown in Fig. 13. Both pipeline stages convert 4 bits, with 1-bit redundancy between the two stages to relax the offset requirements of the first stage. The proposed speed-enhanced bootstrapped switch is also used as the Rank 2 switch at the unit ADC input to reduce the sampling time constant and improve the settling performance [Fig. 5(d)]. Setting CDAC1 and CDAC2 to 32 and 16 fF, respectively, satisfies $kT/C$ noise requirements, sufficiently attenuates comparator kickback, and allows for reasonable reference attenuation. Monte Carlo simulations of the worst-case differential nonlinearity (DNL) error due to DAC capacitive mismatch show a 6σ value less than 0.3 LSB.
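As a rough illustration of why these CDAC sizes are compatible with the target resolution, the Python sketch below compares $kT/C$ sampling noise against ideal quantization noise. Several inputs are assumptions: a temperature of 300 K, a differential sampling network with the quoted capacitance on each side, a 500-mVppd first-stage full scale (the input amplitude used in the measurements), and the 125-mV second-stage full scale mentioned in the jitter discussion.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise_rms(c_farads, temp_k=300.0, differential=True):
    """RMS sampling noise of a capacitor; doubled power for a differential pair."""
    factor = 2.0 if differential else 1.0
    return math.sqrt(factor * K_B * temp_k / c_farads)


def quantization_noise_rms(full_scale, bits):
    """RMS quantization noise of an ideal converter: LSB / sqrt(12)."""
    return full_scale / (2 ** bits) / math.sqrt(12.0)


stage1_ktc = ktc_noise_rms(32e-15)               # CDAC1 = 32 fF
stage1_q = quantization_noise_rms(500e-3, 7)     # assumed 500-mVppd, 7 bits
stage2_ktc = ktc_noise_rms(16e-15)               # CDAC2 = 16 fF
stage2_q = quantization_noise_rms(125e-3, 4)     # 125 mV, 4 bits

print(f"stage 1: kT/C {stage1_ktc * 1e3:.2f} mV vs quantization {stage1_q * 1e3:.2f} mV")
print(f"stage 2: kT/C {stage2_ktc * 1e3:.2f} mV vs quantization {stage2_q * 1e3:.2f} mV")
```

In both stages the estimated sampling noise stays below the corresponding quantization noise under these assumptions, which is consistent with the statement that the chosen capacitances satisfy the $kT/C$ requirement.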
Each unit ADC has four independent on-chip reference DACs with flipped-voltage followers [21] to provide the two sets of CDAC reference voltages for the first and second pipeline stages. These buffered reference voltages are locally decoupled with MOS capacitors and generated with 7-b R-2R DACs with 300-mV range that consume 200 µW per unit ADC. Adjusting the first pipeline stage reference DAC values allows for channel gain mismatch calibration, while the second-stage reference voltage values are tuned to accommodate any small static deviations in the dynamic amplifier gain due to the exact $\phi_{Amp}$ pulsewidth.
Fig. 14(a) shows the unit ADC timing diagram. After input sampling, the first stage converts 4 bits and holds the residue voltage for partial amplification when $\phi_{Amp}$ goes high. $\phi_{Amp}$ then transitions low, and the level-shifted gain is achieved when the $\phi_{OLS}$ pulse is activated with minor modifications in the second-stage reference switch logic. The second stage then converts the final bits. Both stage CDACs are reset to $V_{CM}$ after their conversions are complete to avoid any memory effects, with these reset signals internally generated by the comparator ready signal and input clocks.
A clocked inverter-based buffer that achieves a gain of ≈4 serves as the residue amplifier stage, shown in detail in Fig. 14(b). In addition to the main transconductance transistors $M_{1/2}$ and $M_{7/8}$, $M_3$ acts as a current source that is switched on and short the differential output, respectively. As shown in the simulation results of Fig. 14(c), resetting CDAC2 allows the amplifier output to start separating from the common mode and then experience an effective doubling after the level shifting.
Fig. 15 illustrates how the OLS technique is included with minor logic changes in the reference switch control to allow both the CDAC2 bottom and top plates to connect to the amplifier output during the amplification phase. As shown in detail for the negative DAC MSB switches, there is an extra rightmost switch that connects the bottom plate to the positive input signal when $\phi_{Amp}$ is high. An AND gate then produces the $\phi_{OLS}$ signal to switch the bottom plate to the common mode when $\phi_{Amp}$ goes low and the comparators' ready signals are enabled. Since the extra switch is added to the CDAC bottom plate, there is no speed penalty or reference attenuation issue due to the top plate sampling in the main signal path. Dynamic two-stage comparators [22] are used to allow for low-voltage operation in the two pipeline stages. These comparators are foreground offset calibrated with current-mode DACs.
As shown in Fig. 16(a), employing an on-chip DAC for foreground calibration of the OLS gain stage tail current allows for gain variations of only 0.23 dB and an output common mode between 60% and 70% of the supply over PVT variations. However, Fig. 16(b) and (c) shows that the unit ADC ENOB can degrade if the temperature or VDD drifts after calibration. The design is not too sensitive to temperature variations over a wide −40 °C to 90 °C range, with 6-b ENOB performance maintained and a maximum degradation of 0.55 b. A larger performance degradation to near 4 b is observed as the supply drifts by ±100 mV due to gain variation in the dynamic residue amplifier. A potential solution to this is to employ a replica amplifier bias scheme [23], which can stabilize the dynamic amplifier operation over dynamic PVT variations.
Fig. 17 shows the chip microphotograph and layout floor plan of the 7-bit 38-GS/s ADC prototype, which was fabricated in a 22-nm FinFET process. The core TI ADC, consisting of the two input buffers, eight-way T/Hs, multi-phase clock generator block, and 32 unit pipelined-SAR ADCs, occupies 0.107 mm² of active area. The even and odd channels are split on the left-hand and right-hand sides for symmetric routing and to allow close placement of the front-end T/Hs and unit ADCs, thereby minimizing the high-speed signal routing from the Rank 2 buffers to the unit ADCs. H-tree style wiring is utilized to balance the signal routing to the unit ADCs. Placing the clock generator block close to the T/Hs reduces the critical eight-phase $\Phi_{T/H}$ routing and the number of required buffers for improved jitter performance. The unit ADCs, which each occupy 54 µm × 12 µm, have the bootstrapped switch, first pipelined-SAR stage, residue amplifier, and second pipelined-SAR stage placed in sequence to minimize signal routing. Each unit ADC has independent reference buffers to reduce reference coupling noise.

IV. EXPERIMENTAL RESULTS
A chip-on-board test setup is used to characterize the ADC prototype, as shown in Fig. 18. Foreground calibration is performed for comparator offset, TI channel gain mismatch, and sampling clock phase skew. Compensation of bandwidth mismatch between the two input buffers is provided with independent current DAC biasing. A script running on the PC captures the ADC output data from the logic analyzer, calculates the error, and updates the on-chip programmable scan cell codes automatically, with several iterations performed to allow convergence of the calibration codes. Comparator offset calibration is performed with the ADC differential inputs set to the dc common-mode level and each comparator output sequentially multiplexed off chip for monitoring with an oscilloscope. The offset calibration code for each comparator is adjusted to make the output have an average value of half the output buffer swing level, implying an equal number of 1s and 0s and that the comparator is close to a metastable point. TI channel gain and initial skew calibration are performed with a sinewave fitting algorithm [24], which calculates the amplitude, offset, and phase of each unit ADC output when the input is a sinewave. The first pipeline stage reference DAC values are adjusted for channel gain mismatch calibration, and the inter-stage gain for each unit ADC is estimated from the point of peak unit ADC ENOB as this factor is swept. This information is utilized to first set the dynamic residue amplifier tail current to obtain the nominally correct gain over PVT and then to optimize the second pipeline stage reference DAC settings to accommodate any small static deviations in the dynamic amplifier gain. The skew calibration programmable delay lines and on-chip reference DACs are adjusted accordingly until each unit ADC output has the same amplitude and equally spaced phases. Additional fine-tuning of the skew calibration settings is then performed to minimize the timing spurs of the aggregated discrete Fourier transform (DFT) spectrum. The offset of each unit ADC is subtracted off chip.
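The per-channel amplitude, phase, and offset extraction described above can be illustrated with a known-frequency least-squares sine fit. The Python sketch below is one common way to perform such a fit and is not necessarily the algorithm of [24]; the function names, the synthetic test signal, and its mismatch values are all hypothetical. It fits $y[n] = a\cos(\omega n) + b\sin(\omega n) + c$ by solving the 3×3 normal equations.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def sine_fit(samples, cycles_per_sample):
    """Return (amplitude, phase, offset) of A*cos(w*n + phase) + offset."""
    w = 2.0 * math.pi * cycles_per_sample
    basis = [(math.cos(w * n), math.sin(w * n), 1.0) for n in range(len(samples))]
    ata = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    aty = [sum(b[i] * y for b, y in zip(basis, samples)) for i in range(3)]
    a, b, c = solve3(ata, aty)
    return math.hypot(a, b), math.atan2(-b, a), c


# Hypothetical sub-ADC output: 2% gain error, 0.1-rad phase, 0.5-LSB offset.
FREQ = 0.123  # input cycles per sub-ADC sample, assumed known
data = [1.02 * 60.0 * math.cos(2.0 * math.pi * FREQ * n + 0.1) + 0.5
        for n in range(256)]
amp, phase, offset = sine_fit(data, FREQ)
print(f"amplitude={amp:.3f}  phase={phase:.4f} rad  offset={offset:.3f}")
```

Comparing the fitted amplitudes and phases across channels gives the gain and skew errors that the reference DACs and delay lines are then adjusted to remove, while the fitted offsets correspond to the values subtracted off chip.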
A sinewave histogram technique [25] is utilized for ADC static characterization with a 500-mVppd input at a 550-mV common-mode level. As shown in Fig. 19, the aggregated maximum DNL and integral nonlinearity (INL) after calibration are +0.2/−0.39 and +0.71/−0.55 LSB, respectively.
Fig. 20 shows measured results with 2.366-GHz and Nyquist sinusoidal inputs at 38 GS/s. The TI ADC achieves a signal-to-noise and distortion ratio (SNDR) and spurious-free dynamic range (SFDR) of 38.52 and 49.8 dB, respectively, with a 2.366-GHz input. This moderate-frequency performance is mainly limited by thermal noise, residual gain mismatch spurs, and unit ADC non-linearity [Fig. 20(b)] from incomplete residue amplifier settling. The residual time-interleave error spurs near 11.9 and 16.6 GHz have both fundamental and third-harmonic distortion frequency components [26]. In Fig. 20(c), the Nyquist response shows that the SNDR and SFDR are 35.6 and 43.8 dB, respectively. This is limited by spurs coming from skew and bandwidth mismatch, non-linearity, and sampling clock jitter. The spurs near 6 and 17.8 GHz are believed to be due to channel gain mismatch caused by variations in the Rank 2 switch settling behavior. Monte Carlo simulations of the Rank 2 switch bandwidth show a σ = 2.2% mismatch, which is partially compensated by independent 4-bit current DAC biasing of the Rank 2 buffers. The measured SNDR and SFDR with various input frequencies are shown in Fig. 21, with the SNDR and SFDR dropping from their low-frequency values by 3.7 and 6.3 dB at Nyquist, respectively. Fig. 22 shows that the measured 3-dB bandwidth of the TI ADC is 20 GHz, which includes insertion loss from the wire bonds; 28-GHz bandwidth is potentially possible if these wire-bond parasitics are de-embedded. Table I summarizes the TI ADC performance and compares this work against previous 7- to 8-bit ADCs operating above 28 GS/s.
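The observation that the spurs near 11.9 and 16.6 GHz contain both fundamental and third-harmonic components can be checked with the usual time-interleaving image formula, $k f_s/M \pm f_{tone}$, folded into the first Nyquist zone. The Python sketch below assumes that these spurs follow the eight-way Rank 1 periodicity ($M = 8$); the function names are illustrative, and frequencies are in GHz.

```python
def fold(f, fs):
    """Alias a frequency into the first Nyquist zone [0, fs/2]."""
    f = f % fs
    return fs - f if f > fs / 2 else f


def ti_spurs(f_tone, fs, ways):
    """Folded image frequencies k*fs/ways +/- f_tone for k = 1..ways-1."""
    spurs = set()
    for k in range(1, ways):
        for sign in (+1, -1):
            spurs.add(round(fold(k * fs / ways + sign * f_tone, fs), 3))
    return sorted(spurs)


FS_GHZ = 38.0
F_IN_GHZ = 2.366

fund_images = ti_spurs(F_IN_GHZ, FS_GHZ, 8)
hd3_images = ti_spurs(fold(3 * F_IN_GHZ, FS_GHZ), FS_GHZ, 8)

print("images of the fundamental:", fund_images)
print("images of the third harmonic:", hd3_images)
```

With these assumptions, images of the fundamental fall at about 11.87 and 16.62 GHz, and images of the third harmonic fall at about 11.85 and 16.60 GHz, so both contributions overlap near the two spur locations noted in the text.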
Total ADC power consumption is 119.7 mW, with 82.65 mW dissipated in the pipelined-SAR unit ADCs and clock generation circuitry operating on a 0.85-V supply and 37.05 mW from the 0.9-V proposed interleaver. This lowest reported interleaver power consumption is enabled by the proposed speed-enhanced bootstrapped switch, which also allows for a wide 20-GHz bandwidth. The additional efficiency gains from employing the OLS settling in the pipelined-SAR unit ADC allow for the best low-frequency and Nyquist figure of merit (FoM) among these high-speed TI ADCs.
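The reported efficiency figures can be reproduced from the measured power and SNDR with the Walden figure of merit, $P / (2^{ENOB} f_s)$, where $ENOB = (SNDR - 1.76)/6.02$. In the Python sketch below, the low-frequency SNDR is not quoted directly in the text; it is inferred here as the Nyquist SNDR plus the stated 3.7-dB drop, which is an assumption.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2.0 ** enob(sndr_db) * fs_hz)


POWER_W = 82.65e-3 + 37.05e-3      # unit ADCs and clocking plus interleaver
FS_HZ = 38e9
SNDR_NYQUIST_DB = 35.6
SNDR_LOW_FREQ_DB = SNDR_NYQUIST_DB + 3.7   # inferred from the stated drop

fom_nyquist = walden_fom(POWER_W, FS_HZ, SNDR_NYQUIST_DB)
fom_low_freq = walden_fom(POWER_W, FS_HZ, SNDR_LOW_FREQ_DB)

print(f"total power: {POWER_W * 1e3:.1f} mW")
print(f"Nyquist FoM: {fom_nyquist * 1e15:.1f} fJ/conv.-step")
print(f"low-frequency FoM: {fom_low_freq * 1e15:.1f} fJ/conv.-step")
```

The results land within a fraction of a femtojoule of the 64.1 and 41.9 fJ/conv.-step values quoted in the abstract, with the small difference attributable to rounding of the SNDR values.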

V. CONCLUSION
This article has presented a 7-bit 38-GS/s 32-way TI ADC. The proposed speed-enhanced bootstrapped switch that operates with a low duty-cycle sampling clock reduces interleaver power consumption and achieves high input bandwidth. Pipelined-SAR unit ADCs utilize an OLS settling technique in the dynamic residue amplifier and parallel asynchronous comparators to improve sample rate and efficiency. These combined techniques allow for significant improvement in the Nyquist rate FoM.