IEEE 2006 Custom Integrated Circuits Conference (CICC)
Comparison and Impact of Substrate Noise Generated by Clocked and Clockless Digital Circuitry
Jim Le, Christopher Hanken, Martin Held, Mike Hagedorn∗, Kartikeya Mayaram, and Terri S. Fiez
School of EECS, Oregon State University, Corvallis, OR 97331
∗Theseus Logic, Inc., Orlando, FL 32826
Abstract— A pseudo-random number generator implemented in asynchronous logic generates one-fifth the RMS substrate noise compared to the equivalent design in synchronous logic. An asynchronous 8051 processor generates one-third the RMS substrate noise as the equivalent synchronous design. The SNR of a second order delta-sigma modulator (DSM) is not affected by substrate noise due to an asynchronous processor while it experiences 15 dB degradation when the synchronous 8051 processor is clocked near integer multiples of the DSM sampling frequency. Keywords: substrate noise, synchronous circuit, asynchronous circuit, null conventional logic, delta sigma modulator.
1-4244-0076-7/06/$20.00 ©2006 IEEE

I. INTRODUCTION

The trend toward integrated Systems-on-a-Chip (SoC) has resulted in combining analog and digital components on a single chip. Due to this integration, switching noise generated by the digital circuitry is coupled to the chip substrate through transistor junction, interconnect, and bond-pad capacitances [1]. This generates noise currents that may degrade analog performance by changing the transistor body potential and by altering the power and ground voltage levels [2], [3].

In recent years, many techniques have been developed to suppress the substrate noise coupling to the analog block. Most of these methods attempt to reduce noise coupling by either blocking or actively canceling the noise in the substrate [4].

An alternative approach is to reduce the amount of noise that is injected into the substrate. For a typical clocked Boolean logic (CBL) design, the main sources of noise injection are the clock tree and synchronous switching. The clock tree, used to distribute the clock across the chip, represents a large capacitive load in terms of both power and noise generation. Synchronous switching noise is the result of thousands of digital gates switching so close together in time that their effects tend to accumulate. Both of these problems can be mitigated with an asynchronous design approach such as Null Conventional Logic (NCL). With this clockless logic, data is assessed and propagated independently by each gate. Thus, switching is localized and, for the most part, independent of activity elsewhere on the chip [5].

In this paper, the substrate noise generated by a simple synchronous circuit and a simple asynchronous circuit is compared and analyzed. Next, the analysis is expanded to examine the noise from a typical large digital block such as a synchronous CBL 8051 processor and an asynchronous NCL 8051 processor. In order to gauge the practical impact of the substrate noise on an analog block, the performance degradation of a delta-sigma modulator (DSM) is evaluated in the presence of the substrate noise from each processor. These measurements provide insight into noise-tolerant analog/RF circuit design techniques.

II. CBL VERSUS NCL

As mentioned above, one of the largest sources of noise generation for CBL is the clock tree. An NCL design circumvents this problem by using a building block called a threshold gate, whose output is in either a DATA state or a NULL state. A threshold gate starting with its output in the NULL state will remain in the NULL state until the specified number of inputs is placed in the DATA state. Once the gate reaches the DATA state, it remains in this state until all of the inputs return to the NULL state. A combination of clockless threshold gates can be used to build any conventional Boolean gate. Since threshold gates need to hold state information in a latch in addition to performing their logic function, they are typically larger than the traditional Boolean gates that perform the same function. They do, however, hold several distinct advantages over synchronous circuits, especially with respect to noise generation: they avoid clock-correlated switching noise, large peak currents on the power rails and the resulting supply noise, and the extra power consumption due to unnecessary clock-induced switching.

III. SIMULATION METHODOLOGY

Simulation was used to validate the comparison between synchronous and asynchronous circuits. For very large digital blocks, simulation of substrate noise coupling is not practical at the transistor level. An efficient methodology presented in [6] uses a gate-level VHDL description of the digital system to generate transition information. This information is then combined with a noise signature library for each gate-level block to determine cell noise currents. Finally, the cell noise currents can be used in the transistor-level simulation of the analog block in order to determine the noise coupling effects. The complete design and simulation flow is illustrated in Fig. 1. For synchronous and asynchronous blocks, each gate is characterized by a noise signature library and an equivalent rail parasitic library. The latter is used to simulate the parasitic effects of the gate transitions on the power rails when performing the final simulation with the cell noise currents.
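As a rough illustration of how transition information and a noise signature library could be combined into a cell noise current, the following Python sketch superimposes a per-gate current signature at every recorded switching event. The gate names, signature samples, time step, and event list are hypothetical placeholders, and the function name `cell_noise_current` is chosen for illustration only; the actual flow of [6] is more elaborate and is not reproduced here.

```python
import numpy as np


def cell_noise_current(events, signatures, dt, t_end):
    """Superimpose per-gate noise signatures at their switching times.

    events     : list of (time_s, gate_type) pairs from an event simulation
    signatures : dict mapping gate_type -> 1-D array of current samples (A),
                 spaced dt seconds apart (hypothetical library format)
    dt         : sample spacing in seconds
    t_end      : duration of the output waveform in seconds
    """
    n = int(round(t_end / dt))
    current = np.zeros(n)
    for t, gate in events:
        sig = signatures[gate]
        start = int(round(t / dt))
        if start < 0 or start >= n:
            continue  # event lies outside the simulated window
        stop = min(start + len(sig), n)
        current[start:stop] += sig[: stop - start]
    return current


# Hypothetical signature library and switching events (illustration only).
signatures = {
    "NAND2": np.array([0.0, 1.0e-4, 0.5e-4]),
    "DFF": np.array([0.0, 3.0e-4, 1.0e-4, 0.5e-4]),
}
events = [(0.0, "DFF"), (0.0, "NAND2"), (2e-10, "NAND2")]
i_cell = cell_noise_current(events, signatures, dt=1e-10, t_end=1e-9)
print(i_cell)
```

Gates that switch at the same instant add directly in this model, which is the accumulation effect attributed to synchronous switching in the Introduction.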
Authorized licensed use limited to: University of Central Florida. Downloaded on November 8, 2008 at 21:39 from IEEE Xplore. Restrictions apply.
Fig. 1. Design and simulation flow incorporating substrate noise analysis. [Flowchart blocks: Behavioral VHDL/Verilog; Synthesis & Timing Analysis/Verification; Test Vectors; VHDL Timing Info; Event Simulation; Timing Extraction & Verification; Noise Signature Library; Cell Noise Currents; Generate Substrate Parasitic Network; Final Layout; Noise Vector Generation; Final Transient Simulation; Output Waveforms.]

Fig. 2. Linear congruential random number generator. [Block labels: Constant, Multiplier, 8 LSB, Adder, Register, Output.]

Fig. 3. Die photograph of the synchronous (CBL) and asynchronous (NCL) pseudo-random number generators (PRNG). The effective die areas are 0.32 mm² and 0.6 mm², respectively.

Fig. 4. Measured substrate noise for the synchronous (left) and asynchronous (right) PRNGs in the time domain. [Axes: magnitude (V), −0.04 to 0.04, versus time (s), 0 to 6 × 10⁻⁷.]
The final cell noise currents and parasitics for each implementation of the processor were simulated together with an extracted netlist of the appropriate analog block. An equivalent resistor network was used to simulate the substrate. A 3-dimensional Green's function solver was used to calculate the resistance values [7]. The connections from the circuit to the substrate network were determined with the use of Silencer! [8]. All equivalent package, bondwire, and PCB parasitics were also included in the final simulation.

IV. PSEUDO-RANDOM NUMBER GENERATION BLOCKS

In order to compare and contrast the substrate noise induced by synchronous and asynchronous circuits, CBL and NCL versions of an 8-bit linear congruential pseudo-random number generator (PRNG) were implemented in a heavily-doped 0.25 µm process. A block diagram of this circuit is shown in Fig. 2. This easily scalable circuit consists of a multiplier, an adder, and a register to generate a sequence of 256 unique data values in a continuous loop. The die photograph of the chip with the PRNGs is shown in Fig. 3. A total of 36 blocks of the CBL PRNG and 36 blocks of the NCL PRNG were placed on the die to emulate the switching noise from a large digital block. Rows of the synchronous and asynchronous PRNGs were interdigitated on the die to provide equivalent distance to the sensing circuits. The die itself is mounted directly on the PCB as a chip-on-board to eliminate package parasitics. Separate power rails
Fig. 5. Measured substrate noise for the synchronous (left) and asynchronous (right) PRNGs in the frequency domain. [Axes: magnitude (dB), −160 to 0, versus frequency (MHz), 25 to 275.]
were used for the NCL and CBL circuits to allow measurement of the noise of one block while the other is inactive. The substrate noise is measured with on-chip probing at the output of a wideband amplifier with a unity-gain bandwidth of approximately 1 GHz [2]. Figs. 4 and 5 show the measured substrate noise in the time and frequency domains, respectively. For these measurements, the CBL PRNG is clocked at the equivalent operating speed of the NCL PRNG (approximately 50 MHz). The time domain plot shows the obvious differences between the noise generated by the synchronous and asynchronous logic. The RMS noise voltage of the asynchronous circuit is found to be 14 dB lower than that of the synchronous circuit. The frequency domain plot for the synchronous PRNG shows large clock tones at the operating frequency of 50 MHz and other smaller tones corresponding to the synchronous switching. Conversely, the frequency domain plot for the asynchronous PRNG shows the noise is
V. 8051 MICROPROCESSOR CORES

Although the PRNG blocks are useful for analyzing the differences in the substrate noise injected by synchronous and asynchronous circuits, they are in general not a good measure of the substrate noise that would be present in a typical mixed-signal chip. A more realistic comparison can be made with a microprocessor. Microprocessors are commonly incorporated onto large mixed-signal chips as application-specific functional blocks. To extend the analysis, another test chip was fabricated with a synchronous and an asynchronous version of a generic 8051 microprocessor. The die photo of the test chip is shown in Fig. 6. The chip was fabricated in a heavily-doped 0.25 µm process and packaged in a CPGA132 package. Aside from the two microprocessor cores, the designs share a 256 byte program memory, a 256 byte data memory, and an external memory interface for reading from and writing to an off-chip source. The common components of the design are physically placed between the two cores to maintain layout symmetry for substrate noise comparisons between the CBL and NCL designs. I/O pins and peripherals have also been kept to a bare minimum to maintain the integrity of the substrate noise analysis. As in the PRNG case, the substrate noise is measured with on-chip probing at the output of a wideband amplifier with a unity-gain bandwidth of approximately 1 GHz.

Time domain plots of the substrate noise generated by the synchronous and asynchronous 8051s are shown in Fig. 7. For these measurements the 8051 processors are loaded with an equivalent software implementation of the pseudo-random number algorithm used in the PRNGs. The synchronous 8051 is clocked at 33 MHz to obtain the equivalent operating speed of the asynchronous 8051. Measurements show that the RMS
spread across the spectrum. There are noticeable tones at the equivalent operating frequency and its harmonics. The skirting seen at these frequencies is caused by the NCL logic switching at different times. Also notable is the size of the second harmonic at 100 MHz. The size of this tone can be explained by the return-to-zero behavior of the NCL outputs, which doubles the frequency of the noise from some gates.
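The threshold-gate behavior described in Section II, and the return-to-zero argument above, can be illustrated with a small behavioral model. The Python sketch below is only an illustration under stated assumptions, not the NCL cell implementation: the class name `ThresholdGate`, the helper `count_transitions`, and the two-input, threshold-two example are all hypothetical. The output is set to DATA once at least the threshold number of inputs are in DATA, and returns to NULL only when every input has returned to NULL.

```python
class ThresholdGate:
    """Behavioral model of an m-of-n threshold gate with hysteresis."""

    def __init__(self, n_inputs, threshold):
        if not 1 <= threshold <= n_inputs:
            raise ValueError("threshold must be between 1 and n_inputs")
        self.n_inputs = n_inputs
        self.threshold = threshold
        self.output = 0  # 0 represents NULL, 1 represents DATA

    def evaluate(self, inputs):
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        asserted = sum(1 for x in inputs if x)
        if asserted >= self.threshold:
            self.output = 1  # enough inputs in DATA: output goes to DATA
        elif asserted == 0:
            self.output = 0  # all inputs back in NULL: output goes to NULL
        # otherwise the gate holds its previous output (hysteresis)
        return self.output


def count_transitions(values):
    """Number of output changes in a sequence of samples."""
    return sum(1 for a, b in zip(values, values[1:]) if a != b)


gate = ThresholdGate(n_inputs=2, threshold=2)
# Three DATA/NULL cycles applied to the two inputs (hypothetical stimulus).
wavefronts = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)] * 3
outputs = [gate.evaluate(w) for w in wavefronts]
print(outputs)
print(count_transitions(outputs))
```

In this toy run the output rises and falls once in every DATA/NULL cycle, so it makes two transitions per cycle, which is consistent with the return-to-zero explanation for the strong second-harmonic tone.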
Fig. 7. Measured substrate noise for the synchronous (left) and asynchronous (right) 8051s in the time domain.
Fig. 6. Die photograph of the synchronous (CBL) and asynchronous (NCL) 8051s. The die areas of the cores are 0.5 mm² and 0.62 mm², respectively.
Fig. 8. Measured substrate noise for the synchronous (left) and asynchronous (right) 8051s in the frequency domain.
substrate noise generated by the asynchronous design was 9.5 dB less than that generated by the synchronous design.

Further analysis of the measured substrate noise reveals two main contributors to the observed waveform: the noise due to the architectural implementation of the processor and the noise resulting from the software that is loaded in the processor. These individual contributions can readily be seen in the frequency spectrum of the noise, shown in Fig. 8.

In the synchronous implementation, the architectural contributors to the injected substrate noise are the clock and the instruction cycle. As expected, the clock is the dominant source of noise in the synchronous 8051. This can be seen in the form of large noise tones in the frequency spectrum at 33 MHz and its harmonics. The instruction cycle in this implementation of the 8051 is four clock cycles long and can be seen in the frequency spectrum as slightly smaller tones at one quarter of the clock frequency, or 8.25 MHz (and its harmonics).

The contribution of software to the measured substrate noise arises from repetitive structures in the program. In the pseudo-random number program for the synchronous 8051, the loop that generates the random number sequence is 12 instruction cycles long. The effect of this loop can be seen in the frequency spectrum as tones at 0.68 MHz and its harmonics.

In the asynchronous 8051, frequency components due to the clock are indeed absent from the spectrum. The dominant noise source for this processor is the memory accesses in the RAM. In the spectrum, tones similar to those in the synchronous spectrum can be seen at 0.68 MHz and its harmonics. This result is to be expected since the asynchronous 8051 is running the same pseudo-random number program. Note that the noise is spread out across the spectrum, as for the asynchronous PRNG.
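The tone locations quoted above follow from simple division, and the measured RMS reductions can be cross-checked against the ratios stated in the abstract. The short Python sketch below reproduces that arithmetic, assuming a nominal clock of exactly 33 MHz; the function names are illustrative and not taken from the paper.

```python
def architectural_tones(f_clk_hz, clocks_per_instruction, instructions_per_loop):
    """Return the instruction-cycle and program-loop tone frequencies in Hz."""
    f_instr = f_clk_hz / clocks_per_instruction
    f_loop = f_instr / instructions_per_loop
    return f_instr, f_loop


def db_to_voltage_ratio(db):
    """Convert a level difference in dB to an RMS voltage ratio."""
    return 10 ** (db / 20)


# Values from the text: 33 MHz clock, 4 clocks per instruction cycle,
# 12 instruction cycles per program loop.
f_instr, f_loop = architectural_tones(33e6, 4, 12)
ratio_prng = db_to_voltage_ratio(14.0)   # PRNG measurement
ratio_8051 = db_to_voltage_ratio(9.5)    # 8051 measurement

print(f"instruction-cycle tone: {f_instr / 1e6:.2f} MHz")
print(f"program-loop tone:      {f_loop / 1e6:.4f} MHz")
print(f"14 dB  corresponds to a voltage ratio of {ratio_prng:.2f}")
print(f"9.5 dB corresponds to a voltage ratio of {ratio_8051:.2f}")
```

With the assumed 33 MHz clock the loop tone evaluates to about 0.69 MHz; the small difference from the reported 0.68 MHz is presumably due to rounding of the quoted values. The two voltage ratios come out close to 5 and 3, matching the one-fifth and one-third figures in the abstract.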
Fig. 9. Nominal DSM spectrum (left) and DSM spectrum with substrate noise injection from the synchronous 8051 (right). [Axes: magnitude (dB), −140 to 0, versus frequency (kHz), 0 to 16.]
VI. SUBSTRATE NOISE EFFECTS ON A DELTA-SIGMA MODULATOR

A fully differential second-order DSM was implemented on the chip to examine substrate noise effects on the performance of a typical analog block. For a sampling frequency of 4 MHz, an OSR of 128, and an input signal at 10 kHz, the nominal SNR of the DSM is 83 dB. Measurement of the DSM with the synchronous 8051 active shows that the performance is most sensitive to noise frequencies around integer multiples of the DSM sampling clock. At these frequencies, the difference between the 8051 clock and the sampling frequency is aliased down into the passband of the DSM. Fig. 9 shows the nominal DSM spectrum and the spectrum with noise from the synchronous 8051 clocked at 200 Hz below the DSM sampling frequency. As seen in this spectrum, tones 200 Hz apart show up in the passband due to aliasing. After sweeping the synchronous 8051 clock over frequencies close to the sampling clock, as shown in Fig. 10, it was found that the SNR degradation can be up to 15 dB. Simulation shows that substrate noise coupling at the input dominates the performance degradation. At exact integer multiples of the clock frequency, the aliasing results in a dc offset, which causes little SNR degradation. This is consistent with previously published work [9]. Measurement of the DSM with the asynchronous 8051 active shows no noticeable effect on the SNR performance.
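The aliasing mechanism described in this section can be checked numerically. The Python sketch below folds an interferer frequency into the first Nyquist zone of the sampler and compares it with the signal band fs/(2·OSR). Treating the 8051 clock and its harmonics as ideal tones is a simplifying assumption, and the function names are illustrative.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency in [0, fs/2] at which a tone at f_hz appears after sampling at fs_hz."""
    r = f_hz % fs_hz
    return min(r, fs_hz - r)


def signal_bandwidth(fs_hz, osr):
    """Signal band of an oversampled converter: fs / (2 * OSR)."""
    return fs_hz / (2 * osr)


fs = 4e6           # DSM sampling frequency from the text
osr = 128          # oversampling ratio from the text
bw = signal_bandwidth(fs, osr)

f_clk = fs - 200.0  # 8051 clocked 200 Hz below the sampling frequency
aliases = [alias_frequency(k * f_clk, fs) for k in range(1, 6)]
in_band = [a for a in aliases if a <= bw]

print(f"signal bandwidth: {bw} Hz")
print(f"aliased clock harmonics: {aliases}")
print(f"alias of a clock at exactly fs: {alias_frequency(fs, fs)} Hz")
```

For a clock 200 Hz below the 4 MHz sampling frequency, the first five harmonics fold to multiples of 200 Hz, all inside the 15.625 kHz signal band, which is consistent with the 200 Hz tone spacing described for Fig. 9. A clock at an exact multiple of the sampling frequency folds to dc.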
Fig. 10. DSM SNR with substrate noise from the synchronous 8051 clocked close to the DSM sampling frequency. [Axes: SNR (dB), 65 to 90, versus offset frequency (kHz), −10 to 10; annotation at zero offset: "Exact integer multiple of the clock frequency".]
VII. DISCUSSION AND CONCLUSION

Generalizing the results from the sampled-data DSM, intermodulation terms generated by the substrate noise tones and the clock frequency that are above the circuit noise floor and in the bandwidth of interest may degrade the analog circuit performance. Additionally, as measurements revealed, providing a phase offset for the digital clock does not alter these effects. Generally, continuous-time analog circuits have relatively high linearity, and thus only the in-band substrate noise that exceeds the specified noise floor degrades the performance. This information can be applied to RF circuits and in particular to an RF LNA [10]. There are both harmonic and intermodulation (IM) tones at the output of an RF LNA. The intermodulation terms are from the harmonics of the clock mixing with the RF carrier in the active devices, whereas the harmonics are coupled through passive circuitry directly into the amplifier output [10]. Based on this observation, it is clear that the digital clock harmonics should be reduced, which is possible with the use of clockless digital circuitry.

VIII. ACKNOWLEDGEMENTS

This research was supported in part by grants under the DARPA TEAM and CLASS programs. The authors would also like to thank Triet Le for the DSM design, Husni Habal for his work on the PRNG, and James Ayers for the chip-on-board packaging.

REFERENCES

[1] N. Verghese, T. Schmerbeck, and D. Allstot, Simulation Techniques and Solutions for Mixed-Signal Coupling in Integrated Circuits. Kluwer Academic Publishers, 1995.
[2] M. van Heijningen, J. Compiet, P. Wambacq, S. Donnay, M. Engels, and I. Bolsens, "Analysis and experimental verification of digital substrate noise generation for epi-type substrates," IEEE J. Solid-State Circuits, pp. 1002–1008, Jul. 2000.
[3] M. Nagata, K. Hijikata, J. Nagai, T. Morie, and A. Iwata, "Reduced substrate noise digital design for improving embedded analog performance," in IEEE International Solid-State Circuits Conference, pp. 224–225, Feb. 2000.
[4] M. Peng and H. Lee, "Study of substrate noise and techniques for minimization," IEEE J. Solid-State Circuits, vol. 39, pp. 2080–2086, Nov. 2004.
[5] K. Fant and S. Brandt, "NULL conventional logic: A complete and consistent logic for asynchronous digital circuit systems," in International Conference on Application-specific Systems, Architectures, and Processors, pp. 261–273, 1996.
[6] H. Habal, T. Fiez, and K. Mayaram, "Accurate and efficient simulation of synchronous digital switching noise in systems on a chip," IEEE Trans. VLSI, vol. 13, pp. 330–338, Mar. 2005.
[7] C. Xu, EPIC: A Program for Extraction of the Resistance and Capacitance of Substrate With the Green's Function Method, ECE Dept., Oregon State Univ., 2002.
[8] P. Birrer, T. Fiez, and K. Mayaram, "Silencer!: a tool for substrate noise coupling analysis," in IEEE International SOC Conference, pp. 105–108, Sept. 2004.
[9] T. Blalack and B. Wooley, "The effects of switching noise on an oversampling A/D converter," in IEEE International Solid-State Circuits Conference, pp. 200–201, Feb. 1995.
[10] S. Hazenboom, T. Fiez, and K. Mayaram, "Digital noise coupling mechanisms in a 2.4GHz LNA for heavily and lightly doped CMOS substrates," in Proc. Custom Integrated Circuits Conference, pp. 367–370, Oct. 2004.