8
SRAM TECHNOLOGY
Overview

An SRAM (Static Random Access Memory) is designed to fill two needs: to provide a direct interface with the CPU at speeds not attainable by DRAMs, and to replace DRAMs in systems that require very low power consumption. In the first role, the SRAM provides cache memory, interfacing between DRAMs and the CPU. An SRAM can be accessed in as little as a few nanoseconds (ns), versus 50ns to 80ns for a DRAM. The second application, low power, is found in most portable equipment, where the DRAM refresh current is several orders of magnitude greater than the low-power SRAM standby current. In the low-power parts, the access time is comparable to that of a DRAM.

How the Device Works

The SRAM cell consists of a bi-stable flip flop connected to two access transistors (Figure 8-1). The data is latched into the flip flop.

[Diagram labels: Word Line; bit lines B and B̄; To Sense Amplifier]
Source: ICE, "Memory 1996"
Figure 8-1. SRAM Cell
The data in an SRAM cell is volatile (i.e., the data is lost when the power is removed). However, the data does not “leak away” as it does in a DRAM, so the SRAM does not require a refresh cycle.
INTEGRATED CIRCUIT ENGINEERING CORPORATION
Figure 8-2 shows the read/write operations of an SRAM. To select a cell, the word line is set to Vcc (X address). The bit lines (B and B̄) are connected to the sense amplifier or to the write circuitry, depending on whether the device is in read or write mode, by the column decode transistors (Y address). In read mode, the cell data is applied to the sense amplifier, a voltage comparator that recognizes the data. In write mode, the write circuitry forces the data onto the cell, because the write drivers are stronger than the cell flip flop transistors.
[Diagram labels, read operation: Word Line; Column Decode; Sense Amplifier (Voltage Comparator); D Out. Write operation: Word Line; Column Decode; Write Circuitry; D In]
Source: ICE, "Memory 1996"
Figure 8-2. Read/Write Operations
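The select-then-read-or-write sequence described above can be sketched as a small behavioral model. This is only an illustration, not a circuit simulation: the class name `SramArray`, its method names, and the array dimensions are hypothetical, and each cell's flip flop is reduced to a stored bit.

```python
class SramArray:
    """Behavioral sketch of a small SRAM array (hypothetical model)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows = rows
        self.cols = cols
        # Each cell's cross-coupled flip flop is modeled as a stored 0 or 1.
        self.cells = [[0] * cols for _ in range(rows)]

    def _check(self, x: int, y: int) -> None:
        if not (0 <= x < self.rows and 0 <= y < self.cols):
            raise IndexError("address outside the array")

    def write(self, x: int, y: int, d_in: int) -> None:
        """Raise word line x; column decode y routes the write circuitry to the cell."""
        self._check(x, y)
        # The write drivers are stronger than the cell, so the new value always wins.
        self.cells[x][y] = d_in & 1

    def read(self, x: int, y: int) -> int:
        """Raise word line x; column decode y routes the bit-line pair to the sense amplifier."""
        self._check(x, y)
        bit = self.cells[x][y]
        b, b_bar = bit, 1 - bit        # complementary bit lines
        return 1 if b > b_bar else 0   # sense amplifier acts as a comparator


array = SramArray(rows=4, cols=8)
array.write(2, 5, 1)
print(array.read(2, 5), array.read(2, 4))
```

Reading does not disturb the stored value in this model, which mirrors the point that an SRAM needs no refresh or write-back after a read.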
MEMORY CELL

4T Cell

The most common SRAM cell consists of four transistors and two poly-load resistors (Figure 8-3). This design is called the 4T cell SRAM. The storage cell of a 4T cell SRAM contains four transistors and is about four times as large as the cell of a comparable generation DRAM.
[Diagram labels: word line W; +V supply; bit lines B and B̄; To Sense Amps]
Source: ICE, "Memory 1996"
Figure 8-3. SRAM 4T (Four-Transistor) Cell
4T cells have several limitations:

• Each cell has a current flowing in one resistor, i.e., high standby current.
• The cell is sensitive to noise and soft errors because the resistor values are so large.
• The cell is not the fastest cell design available.

6T Cell

A different cell design that eliminates these limitations is the 6T cell. This SRAM cell is composed of six transistors: two NMOS and two PMOS transistors connected as a flip flop, plus two NMOS access transistors. This structure is shown in Figure 8-4. It offers better performance (speed, noise immunity, standby current) than a 4T structure. These devices are used when extremely low power consumption is mandatory, such as in palmtop computers operating from AA batteries.

TFT (Thin Film Transistor) Cell

A mix of the 4T cell and the 6T cell structures has been developed. The load is formed by using polysilicon as a PMOS device. This PMOS transistor is called a Thin Film Transistor (TFT), and it is formed by depositing several layers of polysilicon above the silicon surface.
[Diagram labels: word line W; +V supply; bit lines B and B̄; To Sense Amps]
Source: ICE, "Memory 1996"
Figure 8-4. SRAM 6T (Six-Transistor) Cell
The performance of this TFT PMOS transistor is not as good as that of a standard PMOS silicon transistor used in a 6T cell. It is more realistically compared to the characteristics of a linear polysilicon resistor, as TFT cell technology is more an improvement on, and replacement of, the 4T technology than the 6T technology. Process and cell size are closer to a 4T cell technology than a 6T cell technology because the TFT transistors are located above the NMOS transistors. Figure 8-5 shows the TFT characteristics. In actual use, the effective resistance would range from about 11 x 10^13Ω to 5 x 10^9Ω. Figure 8-6 shows the TFT cell schematic.

[Plot: drain current Id (A), logarithmic scale from -10^-12 to -10^-6, versus gate voltage Vg (V) from 2 to -8, at Vd = -4V; Tox = 25nm, Tpoly = 38nm, L/W = 1.6/0.6µm]
Source: Hitachi
Figure 8-5. TFT (Thin Film Transistor) Characteristics
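To get a feel for why the load resistance matters for standby current, a first-order estimate can be sketched with Ohm's law: if one load per cell conducts, the array draws roughly the number of cells times Vcc / R. The function name, the 3.3V supply, the 1Mbit array size, and the 10^12Ω value below are assumptions chosen for illustration; only the 5 x 10^9Ω figure is taken from the resistance range quoted in the text.

```python
def standby_current(vcc: float, r_load: float, n_cells: int) -> float:
    """First-order standby current in amperes.

    Assumes one load per cell conducts about vcc / r_load and ignores
    leakage and peripheral circuits.
    """
    if r_load <= 0:
        raise ValueError("load resistance must be positive")
    return n_cells * vcc / r_load


N_CELLS = 1024 * 1024   # assumed 1Mbit array
VCC = 3.3               # assumed supply voltage in volts

# 5e9 ohms is the low end of the range quoted in the text; 1e12 ohms is a hypothetical value.
for r_load in (5e9, 1e12):
    amps = standby_current(VCC, r_load, N_CELLS)
    print(f"R = {r_load:.0e} ohm -> {amps * 1e6:.2f} uA")
```

Under these assumptions the estimate is roughly 0.7mA at 5 x 10^9Ω and a few microamperes at 10^12Ω, which illustrates why higher load resistance is attractive for battery-powered equipment.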
[Diagram labels: Word Line; Poly-Si PMOS loads; complementary bit lines BL]
Source: ICE, "Memory 1996"
Figure 8-6. SRAM TFT (Thin Film Transistor) Cell
Figure 8-7 shows a cross section drawing of the TFT cell. The TFT technology requires the deposition of two more films and at least three more photolithography steps.
[Cross-section labels: 1st Poly-Si (gate electrode of bulk transistor); 2nd Poly-Si (gate electrode of TFT); 3rd Poly-Si (channel of TFT); 4th Poly-Si (internal connection); 1st Metal (bit line); contact (W-plug); 2nd direct contact; TiSi2; isolation; N+ diffusion region (GND line); driver transistor; access transistor]
Source: IEDM 91
Figure 8-7. Cross Section of TFT SRAM Cell
Cell Size and Die Size

SRAM performance targets have a dramatic effect on cell size. Figure 8-8 shows cell sizes and other characteristics of SRAM parts analyzed in ICE’s laboratory in 1995. None of the devices analyzed made use of thin-film-transistor (TFT) pullups. Each supplier used the standard 4T cell with resistive pullups. It is interesting to note that the die size of the only 4Mbit part (Toshiba’s CMOS SRAM, datecoded 9509) was larger than that of an NEC 4Mbit SRAM analyzed in 1994. In fact, Toshiba’s 4Mbit cell is actually larger than the cache SRAM cell on Intel’s Pentium microprocessor.
Supplier    Part Number      Density    Date Code  Technology  Die Size                Min Gate (N)  Cell Pitch    Cell Area  Cell Type  VCC   Access Time
Toshiba     TC554161FTL-70L  4Mb (x16)  9509       CMOS        7.7 x 18.7mm (144mm2)   0.65µm        3.7 x 6µm     22µm2      4T         5V    70ns
Samsung     KM732V588        1Mb (x32)  9524       BiCMOS      5 x 6.6mm (33mm2)       0.5µm         3.0 x 4.75µm  14.25µm2   4T         3.3V  -
Galvantech  GVT7132C32Q7     1Mb (x32)  1995       CMOS        4.5 x 6.8mm (31mm2)     0.4µm         3.4 x 4.9µm   16.5µm2    4T         3.3V  7ns
Hitachi     HM67W1664JP-12   1Mb (x16)  9539       BiCMOS      6.4 x 10.1mm (64mm2)    0.45µm        3.3 x 5.7µm   19µm2      4T         3.3V  12ns
NEC         D461018LG5-A12   1Mb (x18)  9436       BiCMOS      5.7 x 11.7mm (67mm2)    0.6µm         3.5 x 5.5µm   19µm2      4T         3.3V  12ns
Motorola    MCM67C618FN7     1Mb (x18)  9443       BiCMOS      9.2 x 11.8mm (108mm2)   0.6µm         4.9 x 8.2µm   40µm2      4T         5V    7ns

Source: ICE, "1996 Successful Technologies Review"
Figure 8-8. Physical Geometries of SRAMs
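The relationship between cell size and die size can be made concrete with a little arithmetic: multiplying the bit count by the cell area gives the area of the cells alone, which can then be compared with the die area. The sketch below uses the Toshiba 4Mbit figures from Figure 8-8 (22µm2 cell, 144mm2 die); the function name is hypothetical, and the calculation ignores decoders, sense amplifiers, redundancy, and pads.

```python
def array_area_fraction(bits: int, cell_area_um2: float, die_area_mm2: float) -> float:
    """Fraction of the die occupied by the memory cells alone."""
    array_area_mm2 = bits * cell_area_um2 * 1e-6   # 1 mm2 = 1e6 um2
    return array_area_mm2 / die_area_mm2


# Toshiba 4Mbit part: 22 um2 cell on a 144 mm2 die (values from Figure 8-8).
toshiba = array_area_fraction(4 * 1024 * 1024, 22.0, 144.0)
print(f"cells cover about {toshiba:.0%} of the die")
```

For this part the cells account for roughly 64% of the die, so the remainder is peripheral circuitry and overhead.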
Figure 8-9 shows the trends of the SRAM cell size. There is a tradeoff between the performance of the cell and the process complexity. Most manufacturers believe that the TFT-cell SRAM manufacturing process is too difficult, regardless of its advantages. The 6T cell gives a better performance but has a much larger cell size. This cell will only be used in high performance and ultra low-power SRAMs. As we will see in Section 13 (embedded memories), the 6T cell is very common for microprocessor or on-chip cache applications. The design offers extremely high performance and the process is virtually identical to the microprocessor process. Cell size remains an issue, however, so some microprocessor manufacturers utilize the 4T cell. Figure 8-10 shows the field applications of SRAMs.
[Plot: cell size (µm2), logarithmic scale from 1 to 1,000, versus technology generation (1 micron, 0.8 micron, 0.5-0.6 micron, 0.35 micron, 0.25 micron); one curve for the 6T cell and one for the 4T (and TFT) cell]
Source: ICE, "Memory 1996"
Figure 8-9. Trend of SRAM Cell Sizes
[Chart labels, machine classes: OA, Small Machine, Personal Computer, Work Station, MiniComputer, Super Mini-Com, Main Frame, Super Computer. Memory roles: External Memory (Magnetic Memory), Main Memory (MOS DRAM, SDRAM, CDRAM), Disk Cache, Cache Memory. SRAM types: ECL RAM 3~10ns; ECL SRAM 10~15ns; Fast MOS SRAM 10~20ns (x4, x8, x9, x16, x18); Fast MOS SRAM 15~35ns; Slow MOS SRAM 55~150ns (x8). Notes: 32-bit/64-bit PC have cache, 15~25ns (x8 asyn., x32 sync. burst); trend to use SRAM for main memory, 15~45ns (x1, x4); disk cache (x1, x4, L version); notebook PC will use SRAM]
Source: Mitsubishi
Figure 8-10. Application of DRAM/SRAM
CONFIGURATION

The SRAM can be classified in three main categories:

• Asynchronous SRAMs: low speed and high speed
• Synchronous SRAMs: standard and pipelined
• Special SRAMs: cache tag, FIFO, and multiport

Figure 8-11 shows the SRAM classification.
SRAMs
    Asynchronous: Low Speed, High Speed
    Synchronous: Standard, Pipelined
    Special: Multiport, FIFO, Cache Tag

Source: ICE, "Memory 1996"
Figure 8-11. Overview of SRAM Types
Asynchronous SRAMs

Figure 8-12 shows a functional block diagram of an asynchronous SRAM. The memory is managed by three control signals:

- Chip Select (CS) selects or deselects the chip. When the chip is deselected, the part is in standby mode (minimum current consumption) and the outputs are in a high impedance state.
- Output Enable (OE) controls the outputs (valid data or high impedance).
- Write Enable (WE) selects a read or a write cycle.
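The way these three signals combine can be sketched as a small decoding function. This is a simplified illustration: the names `Mode` and `decode` are hypothetical, the arguments are logical values (True means asserted) rather than pin levels, and giving WE priority over OE during a write is an assumption that should be checked against the data sheet of any specific part.

```python
from enum import Enum


class Mode(Enum):
    STANDBY = "standby"                    # chip deselected, outputs high impedance
    WRITE = "write"                        # data is written into the array
    READ = "read"                          # outputs drive valid data
    OUTPUT_DISABLED = "output disabled"    # selected, not writing, outputs high impedance


def decode(cs: bool, oe: bool, we: bool) -> Mode:
    """Map asserted/deasserted control inputs to an operating mode.

    Arguments are logical values (True = asserted); pin polarity is not modeled.
    """
    if not cs:
        return Mode.STANDBY
    if we:
        return Mode.WRITE   # assumed: a write cycle overrides OE
    return Mode.READ if oe else Mode.OUTPUT_DISABLED


print(decode(cs=True, oe=True, we=False).value)
print(decode(cs=False, oe=True, we=True).value)
```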
[Block diagram labels: Address Decoder (A0 to A14); 262,144 Bit Memory Array; Input Data Circuit; I/O Control (I/O0 to I/O7); Control Circuit (CS, OE, WE); VCC; GND]
Source: IDT
Figure 8-12. Functional Block Diagram of a Standard SRAM
Low-Speed SRAM

Low-speed SRAM devices offer low power and low cost. The main applications are high density hard disk drives (HDD) and industrial control systems. The access time of these devices is greater than 55ns.
High-Speed SRAM

Definitions vary, but high-speed SRAM devices generally have access times ranging from 15ns to 35ns. Applications are local storage in telecommunications systems and cache memory in computer systems.

Synchronous SRAM

As computer system clocks have increased, the demand for very fast SRAMs has necessitated variations on the standard asynchronous fast SRAM. The result is the synchronous SRAM (SSRAM). Synchronous SRAMs have their read or write cycles synchronized with the microprocessor clock and therefore can be used in very high speed applications. An important growing application is the cache SRAM used in Pentium- or PowerPC-based PCs and workstations.
Figure 8-13 shows the configuration of an SSRAM. The RAM array, which forms the heart of an asynchronous SRAM, is also found in the SSRAM. Since the operations take place on the rising edge of the clock signal, it is unnecessary to hold the address and write data state throughout the entire cycle.
[Block diagram labels: Address; Address Register (A); Data; Data Register (D); WE; WE Register; Write Pulse Generator (W); Clock; RAM Array; output Q]
Source: Electronic Design
Figure 8-13. Configuration of a Synchronous SRAM
Burst Mode

The SSRAM can be addressed in burst mode for faster speed. In burst mode, the address for the first data is placed on the address bus. The three following data blocks are addressed by an internal built-in counter. Data is available at the microprocessor clock rate. These devices are offered in wide word organizations, such as a 1Mbit device in a 32K x 32 organization. Figure 8-14 shows SSRAM timing.
[Timing diagram: clock, address, and output waveforms for synchronous mode and for burst mode]
Source: ICE, "Memory 1996"
Figure 8-14. SSRAM Timing
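The burst addressing described above, where only the first address is supplied and an internal counter produces the next three, can be sketched as follows. The function name is hypothetical, and the counting order is an assumption: this sketch uses a linear count that wraps within the aligned four-word block, whereas some processors expect an interleaved order instead.

```python
def burst_addresses(start: int, length: int = 4) -> list[int]:
    """Addresses for one burst: the supplied start address plus counter-generated ones.

    Assumes a linear count that wraps inside the aligned `length`-word block.
    """
    base = start - (start % length)   # first address of the aligned block
    return [base + (start + i) % length for i in range(length)]


print([hex(a) for a in burst_addresses(0x100)])   # start on a block boundary
print([hex(a) for a in burst_addresses(0x102)])   # start mid-block; the counter wraps
```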
Pipelined SRAMs

Pipelined SRAMs (sometimes called register-to-register mode SRAMs) add a register between the memory array and the output. Pipelined SRAMs are less expensive than standard SRAMs for equivalent electrical performance. The pipelined design does not require the aggressive manufacturing process of a standard SRAM, which contributes to its better overall yield. Figure 8-15 shows burst timing for both pipelined and standard SRAMs. With the pipelined SRAM, a four-word burst read takes five clock cycles. With a standard synchronous SRAM, the same four-word burst read takes four clock cycles. Figure 8-16 shows the SRAM performance comparison of these same products. Above 66 MHz, pipelined SRAMs have an advantage by allowing single-cycle access for burst cycles after the first read. However, pipelined SRAMs require a one-cycle delay when switching from reads to writes in order to prevent bus contention.

Cache Tag RAMs

The implementation of cache memory requires the use of special circuits that keep track of which data is in both the SRAM cache memory and the main memory (DRAM). This function acts like a directory that tells the CPU what is or is not in cache. The directory function can be designed with standard logic components plus small (and very fast) SRAM chips for the data storage. An alternative is the use of special memory chips called cache tag RAMs, which perform the entire function. Figure 8-17 shows both the cache tag RAM and the cache buffer RAM along with the main memory and the CPU (processor). As processor speeds increase, the demands on cache tag and buffer chips increase as well. Figure 8-18 shows the internal block diagram of a cache-tag SRAM.
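The directory function of a cache tag RAM can be sketched as a tag store plus a comparator. The model below is a hypothetical illustration that assumes a direct-mapped organization: the low address bits select an entry, the high bits are stored as the tag, and a lookup reports a match when the stored tag equals the incoming one. The class and method names are invented for this sketch, and the default of 13 index bits simply mirrors the A0 to A12 address inputs in Figure 8-18.

```python
class CacheTagRam:
    """Direct-mapped tag store with a built-in comparator (behavioral sketch)."""

    def __init__(self, index_bits: int = 13) -> None:
        self.index_bits = index_bits
        self.tags = [None] * (1 << index_bits)   # None marks an invalid entry

    def _split(self, address: int) -> tuple[int, int]:
        index = address & ((1 << self.index_bits) - 1)   # low bits pick the entry
        tag = address >> self.index_bits                 # high bits are stored and compared
        return index, tag

    def update(self, address: int) -> None:
        """Record that the block at this address is now held in the cache."""
        index, tag = self._split(address)
        self.tags[index] = tag

    def match(self, address: int) -> bool:
        """Comparator output: True when the stored tag equals the address tag."""
        index, tag = self._split(address)
        return self.tags[index] == tag

    def reset(self) -> None:
        """Invalidate every entry."""
        self.tags = [None] * len(self.tags)


tag_ram = CacheTagRam()
address = (0x2A << 13) | 0x123
tag_ram.update(address)
print(tag_ram.match(address), tag_ram.match((0x2B << 13) | 0x123))
```

Two addresses that share the same low bits compete for one entry in this scheme, which is why the comparator is needed to tell a hit from a miss.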
[Timing diagrams: two panels covering Clock 1 through Clock 5, each showing the clock, the addresses A to A+3, and the data words Data A to Data A+3. Panel titles: "A 4-word burst read from pipelined SRAMs" and "A 4-word burst read from synchronous SRAMs"]
Source: Electronic Design
Figure 8-15. Pipelined Versus Non-Pipelined Timings
FIFO SRAMs

A FIFO (First In, First Out) memory is a specialized memory used for temporary storage, which aids in the timing of non-synchronized events. A good example of this is the interface between a computer system and a Local Area Network (LAN). Figure 8-19 shows the interface between a computer system and a LAN using a FIFO memory to buffer the data.

Synchronous and asynchronous FIFOs are available. Figures 8-20 and 8-21 show the block diagrams of these two configurations. Asynchronous FIFOs encounter some problems when used in high speed systems. One problem is that the read and write clock signals must often be specially shaped to achieve high performance. Another problem is the asynchronous nature of the flags. A synchronous FIFO is made by combining an asynchronous FIFO with registers. For an equivalent level of technology, synchronous FIFOs will be faster.
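The buffering behavior of a FIFO can be sketched in software as a ring buffer with separate write and read counters and full/empty flags. This is a behavioral illustration only: the class name and methods are hypothetical, the default depth of 4096 words follows the array size shown in the block diagrams, and a single Python object cannot reproduce the two independent clock domains of real hardware.

```python
class Fifo:
    """Ring-buffer FIFO with separate read/write counters and full/empty flags."""

    def __init__(self, depth: int = 4096) -> None:
        self.depth = depth
        self.ram = [0] * depth    # stands in for the dual-port RAM array
        self.write_count = 0      # total words written so far
        self.read_count = 0       # total words read so far

    @property
    def empty(self) -> bool:
        return self.write_count == self.read_count

    @property
    def full(self) -> bool:
        return self.write_count - self.read_count == self.depth

    def write(self, word: int) -> bool:
        """Store a word; returns False when the write is inhibited because the FIFO is full."""
        if self.full:
            return False
        self.ram[self.write_count % self.depth] = word
        self.write_count += 1
        return True

    def read(self):
        """Return the oldest word, or None when the read is inhibited because the FIFO is empty."""
        if self.empty:
            return None
        word = self.ram[self.read_count % self.depth]
        self.read_count += 1
        return word


fifo = Fifo(depth=4)
for word in (11, 22, 33):
    fifo.write(word)
print(fifo.read(), fifo.read(), fifo.empty)
```

Deriving the flags from the two counters is the part that becomes awkward in an asynchronous design, since each counter changes on a different clock.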
(Speed, cycle time, and access time in ns.)

Bus Freq.  3.3V 32K x 8                      32K x 32 Pipelined         32K x 32 Non-Pipelined
(MHz)      Speed  Banks  Read     Write      Cycle  Read     Write      Access  Cycle  Read     Write
50         20     1      3-2-2-2  4-2-2-2    20     3-1-1-1  2-1-1-1    12      20     2-1-1-1  2-1-1-1
60         15     1      3-3-3-3  4-3-3-3    16.7   3-1-1-1  2-1-1-1    10      16.7   2-1-1-1  2-1-1-1
           15     2      3-2-2-2  4-2-2-2
66         12     1      3-3-3-3  4-4-4-4    15     3-1-1-1  2-1-1-1    9       15     2-1-1-1  2-1-1-1
           15     2      3-2-2-2  4-2-2-2
75         15     2      3-2-2-2  4-2-2-2    13.3   3-1-1-1  2-1-1-1    9       13.3   3-2-2-2  3-2-2-2
83         12     2      3-2-2-2  4-2-2-2    12     3-1-1-1  2-1-1-1    9       12     3-2-2-2  3-2-2-2
100        10     2      3-2-2-2  4-2-2-2    10     3-1-1-1  2-1-1-1    9       10     3-2-2-2  3-2-2-2
125        8      2      3-2-2-2  4-2-2-2    8      3-1-1-1  2-1-1-1    9       8      3-2-2-2  3-2-2-2

Source: Micron
Figure 8-16. SRAM Performance Comparison
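The burst patterns in Figure 8-16 give the number of clocks for each of the four transfers, so the total burst time follows from adding them up and multiplying by the clock period. The helper below is a hypothetical sketch of that arithmetic, using the 100 MHz read patterns from the figure as sample inputs.

```python
def burst_time_ns(pattern: str, bus_mhz: float) -> float:
    """Total time in ns for a burst such as "3-1-1-1" at the given bus frequency."""
    clocks = sum(int(n) for n in pattern.split("-"))
    return clocks * 1000.0 / bus_mhz   # clock period in ns is 1000 / MHz


# Read patterns at 100 MHz, taken from Figure 8-16.
pipelined = burst_time_ns("3-1-1-1", 100)
non_pipelined = burst_time_ns("3-2-2-2", 100)
print(pipelined, non_pipelined)
```

At 100 MHz the pipelined part completes the burst in 6 clocks (60ns) against 9 clocks (90ns) for the non-pipelined part, which is the advantage the text describes for bus speeds above 66 MHz.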
[Block diagram labels: Processor; Cache Buffer RAM; Cache Tag RAM; Main Memory; Data Bus; Address Bus]
Source: TI
Figure 8-17. Typical Memory System With Cache
[Block diagram labels: Address Decoder (A0 to A12); 65,536-Bit Memory Array; I/O Control (I/O0-7); Comparator with Match (Open Drain) output; Control Logic (WE, OE, CS); RESET; VCC; GND]
Source: IDT
Figure 8-18. Block Diagram of Cache-Tag SRAM
[Block diagram labels: Microprocessor; Memory; Disk Drive; System Bus; FIFO; LAN]
Source: IDT
Figure 8-19. FIFO Memory Solutions for File Servers
[Block diagram labels: Write Data; Write Clock; Write Enable; Write Address Counter; Write Data Register; Write Latch; Write Pulse Gen; Dual Port RAM Array, 4096 Words x 18 Bits; Flag Logic with Full and Empty outputs; Read Enable; Read Clock; Read Address Counter; Read Data Register; Read Data]
Source: Paradigm
Figure 8-20. Synchronous FIFO Block Diagram
[Block diagram labels: Write Data; Write Clock; Write Counter; Inhibit; Dual Port RAM Array, 4096 Words x 18 Bits; Flag Logic with Full and Empty outputs; Read Clock; Read Counter; Inhibit; Read Data]
Source: Paradigm
Figure 8-21. Asynchronous FIFO Block Diagram
Multiport SRAMs

Multiport fast SRAMs (usually two-port, but sometimes four-port) are specially designed chips that use fast SRAM memory cells with special on-chip circuitry that allows multiple ports (paths) to access the same data at the same time. Figure 8-22 shows such an application, with four CPUs sharing a single memory. Each cell in the memory uses an additional six transistors to allow the four CPUs to access the data (i.e., a 10T cell in place of a 4T cell). Figure 8-23 shows the block diagram of a 4-port SRAM.
[Block diagram: CPU #1, CPU #2, CPU #3, and CPU #4, each connected to a single 4-Port SRAM]
Source: IDT
Figure 8-22. Shared Memory Using 4-Port SRAM
RELIABILITY CONCERNS

SRAMs are susceptible to alpha particle radiation. As designs have reduced the load currents in the 4T cell structures by raising the value of the load resistance, the energy required to switch the cell to the opposite state (soft error) has been reduced. This, in turn, has made the devices more sensitive to radiation. The TFT cell reduces this susceptibility, as the active load has a low resistance when the TFT is “on,” and a much higher resistance when the TFT is “off.” Due to process complexity, the TFT design is not widely used today.
[Block diagram labels: a central Memory Array with four ports (P1 to P4); each port has Address Decode Logic (A0 to A11), Column I/O (I/O0 to I/O7), and R/W, CE, and OE control inputs]
Source: IDT
Figure 8-23. Block Diagram of a 4-Port SRAM