The memory of a computer is essentially a temporary working place, where information and data are kept while tasks are being performed. When a task is complete, the memory is cleared and the space becomes available for the next task. When the power is switched off, everything stored in memory is erased and cannot be recalled. Memory holds the program code and data that the CPU processes, and it is this intimate relationship between memory and the CPU that forms the basis of computer performance. As larger and faster CPUs are constantly introduced, more complex software is developed to take advantage of that processing power, which in turn demands larger amounts of faster memory. With the explosive growth of Windows (and more recently, Windows 95), the demands made on memory performance are more acute than ever. These demands have resulted in a proliferation of memory types that go far beyond the simple, traditional DRAM. Cache (SRAM), fast page-mode (FPM) memory, extended data output (EDO) memory, video memory (VRAM), synchronous DRAM (SDRAM), flash BIOS, and other exotic memory types (such as RAMBUS) now compete for the attention of PC technicians.
Back in the 1980s, PCs were equipped with RAM in quantities of 64 KB, 256 KB, 512 KB, and finally 1 MB. Think of a home computer like the Commodore 64 (64 KB of RAM). Around 1990, advanced operating systems such as Windows appeared on the market and started the RAM race. The PC needed more and more RAM. The first Windows-based PCs could address 2 MB of RAM, but 4 MB soon became the standard. The race has continued through the 1990s as RAM prices have dropped dramatically. Today it would be foolish to consider less than 32 MB of RAM in a PC, and many machines have much more.
All memory is basically an array organized as rows and columns. Each row is known as an address; one million or more addresses might be on a single memory IC. The columns represent data bits: a typical high-density memory IC is 1 bit wide, but might provide 2 or 4 bits, depending on the overall amount of memory required. The intersection of each column and row is an individual memory bit (known as a cell). This is important because the number of components in a cell, and the way those components are fabricated onto the memory IC, has a profound impact on memory performance. For example, a classic DRAM cell is a single MOS transistor and capacitor, while a static RAM (SRAM) cell packs several transistors and other components onto the IC.
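To make the row-and-column idea concrete, here is a minimal sketch in C modeling a hypothetical 1M x 1 array (1,024 rows by 1,024 columns); the array size and the split of the address into row and column bits are illustrative, not taken from any particular device:

#include <stdio.h>
#include <stdint.h>

/* Illustrative 1M x 1 memory array: 1,048,576 cells arranged as
 * 1024 rows x 1024 columns. One bit per cell, stored in a byte here. */
#define ROWS 1024
#define COLS 1024

static uint8_t cells[ROWS][COLS];

/* Split a 20-bit address into a row (upper 10 bits) and a column
 * (lower 10 bits), mimicking the decoder circuitry inside the IC. */
static uint8_t read_bit(uint32_t address)
{
    uint32_t row = (address >> 10) & 0x3FF;  /* which row is driven active */
    uint32_t col = address & 0x3FF;          /* which column is sensed */
    return cells[row][col];
}

int main(void)
{
    cells[5][17] = 1;                        /* set one cell directly */
    uint32_t addr = (5u << 10) | 17u;        /* the equivalent linear address */
    printf("bit at address %u = %u\n", addr, read_bit(addr));   /* prints 1 */
    return 0;
}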
A memory IC communicates with the “outside world” through three sets of signals:
1. Address lines define which row of the memory array will be active. In actuality, the address is specified as a binary number, and conversion circuitry inside the memory IC translates that binary number into a specific row signal.
2. Data lines pass binary values (data) back and forth to the defined address.
3. Control lines operate the memory IC. The Read/Write (R/W) line defines whether data is being read from the specified address or written to it. The Chip Select (CS) signal makes a memory IC active or inactive; this ability to “disconnect” from a circuit is what allows a myriad of memory ICs to share common address and data signals in the computer. Some memory types require additional signals, such as Row Address-Select (RAS) and Column Address-Select (CAS), for refresh operations.
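The following sketch models these three signal groups in C. The MemoryIC structure, the bus_cycle function, and the signal conventions are hypothetical, chosen only to show how the Chip Select line lets many ICs share one set of address and data lines:

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

/* Illustrative model of one memory IC on a shared bus. The chip ignores
 * the bus entirely unless its Chip Select (CS) line is asserted, which is
 * what lets many ICs share the same address and data signals. */
#define MEM_WORDS 256

typedef struct {
    uint8_t storage[MEM_WORDS];
    bool    chip_select;          /* CS: true means "this IC responds" */
} MemoryIC;

/* One bus cycle: read_write true = read, false = write (a common convention). */
static uint8_t bus_cycle(MemoryIC *ic, uint8_t address,
                         uint8_t data_in, bool read_write)
{
    if (!ic->chip_select)
        return 0xFF;                     /* deselected: the IC stays off the bus */
    if (read_write)
        return ic->storage[address];     /* drive the data lines with cell contents */
    ic->storage[address] = data_in;      /* latch the data lines into the cell */
    return data_in;
}

int main(void)
{
    MemoryIC ic = { .chip_select = true };
    bus_cycle(&ic, 0x42, 0xA5, false);               /* write 0xA5 to address 0x42 */
    printf("read back: 0x%02X\n", bus_cycle(&ic, 0x42, 0, true));
    return 0;
}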
By the time 386 systems took hold in the PC industry, proprietary memory modules had been largely abandoned in favor of the standardized memory module. A SIMM (Single Inline Memory Module) is light and small, and contains a relatively large block of memory, but perhaps the greatest advantage of a SIMM is standardization: using a standard pin layout, a SIMM from one PC can be installed in any other PC. The 30-pin SIMM provides 8 data bits and generally holds up to 4 MB of RAM; it eventually fell short when later-model PCs demanded more memory. The 72-pin SIMM provides 32 data bits and can hold 32 MB (or more), and it introduced the use of Error-Correction Code (ECC) in place of simple parity. DIMMs (Dual Inline Memory Modules) appear similar to SIMMs, but they are larger. Where a SIMM ties each front-side contact to the corresponding contact on the back, a DIMM keeps the front and back contacts electrically separate, effectively doubling the number of signals available on the device. For example, a 72-pin SIMM has 72 contacts on each side of the device, but each front/back pair is tied together, so there are only 72 signals. Today, virtually all DIMMs provide 168 pins (84 pins on each side). DIMMs are appearing in high-end 64-bit data-bus PCs (such as Pentiums and PowerPC RISC workstations).
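The practical consequence of these data widths is how many modules it takes to fill one memory bank. A small worked calculation, using the module widths quoted above and assuming a 64-bit Pentium-class data bus:

#include <stdio.h>

/* Worked example: a memory bank must be as wide as the CPU's data bus,
 * so the number of modules per bank is bus width / module data width.
 * Widths are the figures quoted in the text. */
int main(void)
{
    int bus_width = 64;                          /* Pentium-class data bus, in bits */
    int simm30 = 8, simm72 = 32, dimm168 = 64;   /* module data widths, in bits */

    printf("30-pin SIMMs per bank:  %d\n", bus_width / simm30);   /* 8 */
    printf("72-pin SIMMs per bank:  %d\n", bus_width / simm72);   /* 2 */
    printf("168-pin DIMMs per bank: %d\n", bus_width / dimm168);  /* 1 */
    return 0;
}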
Finally, you might see SIMMs and DIMMs referred to as composite or non-composite modules. These terms are used infrequently to describe the technology level of the memory module. A composite module uses older, lower-density memory, so more ICs are required to achieve a given storage capacity. Conversely, a non-composite module uses newer memory technology, so fewer ICs are needed to reach the same storage capacity.
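As a rough illustration of the difference, here is the IC-count arithmetic for a hypothetical 8-MB module built from 4-Mbit chips (composite) versus 16-Mbit chips (non-composite); all figures are invented for the example:

#include <stdio.h>

/* Illustrative composite vs. non-composite arithmetic: the IC count is
 * simply module capacity divided by per-chip density. */
int main(void)
{
    int module_mbits = 8 * 8;          /* an 8-MB module = 64 Mbits */
    int old_chip = 4, new_chip = 16;   /* chip densities, in Mbits */

    printf("composite:     %d ICs of %d Mbit\n", module_mbits / old_chip, old_chip);  /* 16 ICs */
    printf("non-composite: %d ICs of %d Mbit\n", module_mbits / new_chip, new_chip);  /* 4 ICs */
    return 0;
}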
• Dynamic Random-Access Memory: DRAM achieves a good mix of speed and density, while being relatively simple and inexpensive; only a single transistor and capacitor are needed to hold a bit. However, DRAM contents must be refreshed every few milliseconds or the contents of each bit location will decay. DRAM performance is also limited by relatively long access times.
• Static Random-Access Memory: SRAM does not require regular refresh operations, and can be made to operate at access speeds much faster than DRAM. However, SRAM uses six transistors (or more) to hold a single bit, which reduces its density and increases power demands. Still, because of its high speed, SRAM is widely used as cache.
• Asynchronous SRAM: ASRAM is the traditional form of L2 cache, introduced with i386 systems. Its contents can be accessed much faster (20 ns, 15 ns, or 12 ns) than DRAM, but ASRAM is not fast enough to be accessed synchronously, and it has been replaced by better types of cache.
• Synchronous-Burst SRAM: SBSRAM is used as L2 cache for intermediate-speed motherboards (60 to 66 MHz), with access times of 8.5 ns and 12 ns. It can provide synchronous bursts of cache information in 2-1-1-1 cycles (i.e., 2 clock cycles for the first access, then 1 cycle per access, in time with the CPU clock); a worked timing example follows this list.
• Pipelined-Burst SRAM: PBSRAM (4.5 to 8 ns) is the fastest form of high-performance cache now available for 75 MHz+ motherboards. PBSRAM requires an extra clock cycle for “lead off,” but can then sync with the motherboard clock (with timing such as 3-1-1-1).
• Video RAM: Originally developed by Samsung Electronics, VRAM achieves speed improvements by using a “dual data bus” scheme. Ordinary RAM uses a single data bus: data enters or leaves the RAM through a single set of signals. Video RAM provides an “input” data bus and an “output” data bus, which allows data to be read from video RAM at the same time new information is being written to it.
• Fast-Page Mode DRAM: An ordinary DRAM access must locate a memory “page” first, then find the contents within it, and every access requires re-locating the page. Fast-page mode operation overcomes this delay by allowing the CPU to access multiple pieces of data on the same page without having to re-locate the page: as long as the subsequent read or write cycle falls on the previously located page, FPM DRAM can access the specific location on that page directly.
• Enhanced DRAM: EDRAM was developed by Ramtron International and United Memories. It eliminates an external cache by placing a small amount of SRAM in each EDRAM device, so the cache is distributed within the system RAM; as more memory is added to the PC, more cache is effectively added as well. If data is in the EDRAM’s cache (known as a hit), the data is made available in about 15 ns. If data is not in the cache (called a miss), the data is accessed from the DRAM portion of memory in about 35 ns, which is still much faster than ordinary DRAM.
• Synchronous DRAM: Typical memory can only transfer data during certain portions of a clock cycle. SDRAM modifies memory operation so that outputs can be valid at any point in the clock cycle. SDRAM also provides a “pipeline burst” mode that allows a second access to begin before the current access is complete. This “continuous” memory access offers effective access speeds as fast as 10 ns, can transfer data at up to 100 MB/s in a 5-1-1-1 burst cycle, and is supported by the Intel VX chipset and the VIA 580VP, 590VP, and 680VP chipsets.
• Cached DRAM: Like EDRAM, CDRAM from Mitsubishi incorporates cache and DRAM on the same IC. The difference is that CDRAM uses a “set-associative” cache approach that can be 15 to 20% more efficient than the EDRAM cache scheme.
• Windows RAM: Samsung Electronics has recently introduced WRAM as a new video-specific memory device. WRAM uses multiple-bit arrays connected by an extensive internal bus, with high-speed registers that can transfer data continuously. Other specialized registers support attributes such as foreground color, background color, write-block control bits, and true-byte masking. Samsung claims data-transfer rates of up to 640 MB/s, and WRAM devices are cheaper than their VRAM counterparts.
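To put the burst notations above in perspective, here is a small calculation assuming a 66-MHz clock and a 64-bit (8-byte) bus moving four transfers per burst. These are theoretical burst peaks, so sustained figures (such as the 100 MB/s quoted for SDRAM) will be lower:

#include <stdio.h>

/* Illustrative burst arithmetic: an x-1-1-1 burst moves four bus-width
 * transfers in (x + 3) clocks. Bus width and clock rate are assumptions. */
static void show_burst(const char *name, int lead_off_clocks)
{
    double clock_mhz = 66.0;
    int clocks = lead_off_clocks + 3;            /* 4 transfers total */
    double ns = clocks * (1000.0 / clock_mhz);   /* burst duration in ns */
    double mbps = (4 * 8) / (ns / 1000.0);       /* 32 bytes per burst -> MB/s */
    printf("%s: %d clocks, %.0f ns, ~%.0f MB/s peak\n", name, clocks, ns, mbps);
}

int main(void)
{
    show_burst("SBSRAM 2-1-1-1", 2);
    show_burst("PBSRAM 3-1-1-1", 3);
    show_burst("SDRAM  5-1-1-1", 5);
    return 0;
}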
• Paged memory: This approach divides system RAM into small groups called pages, from 512 bytes to several KB long. Memory-management circuitry on the motherboard allows subsequent memory accesses on the same page to be accomplished with zero wait states. If a subsequent access occurs outside the current page, one or more wait states might be added while the new page is found.
• Interleaved memory: Interleaved memory combines two banks of memory into one: the first bank holds even addresses and the second holds odd addresses, so memory contents alternate between the two banks. This allows a memory access in the second bank to begin before the access in the first bank has finished, which can effectively double memory performance. The drawback is that you must provide twice the amount of memory, installed as matched pairs. Most PCs that use interleaving will allow you to add memory one bank at a time, but interleaving will then be disabled and system performance will suffer.
• Memory caching: Cache is a small amount of very fast SRAM that forms an interface between the CPU and DRAM. The SRAM operates on the order of 5 to 15 ns, which is fast enough to keep pace with a CPU using zero wait states. A cache-controller IC on the motherboard keeps track of frequently accessed memory locations (and predicted memory locations) and copies their contents into the cache. When the CPU reads from memory, it checks the cache first. If the needed contents are present (a cache hit), the data is read at zero wait states. If the needed contents are not present (a cache miss), the data must be read directly from DRAM at a cost of one or more wait states.
A well-designed caching system can achieve a hit ratio of 95%; in other words, memory can run without wait states 95% of the time. Contemporary PCs use two levels of cache: CPUs from the i486 onward have a small internal cache (the L1 cache), while external cache (SRAM installed as DIPs or COAST modules on the motherboard) is referred to as L2 cache. The i386 CPUs have no internal cache (although IBM’s 386SLC offered 8 KB of L1 cache).
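The arithmetic behind such hit ratios is straightforward. A minimal sketch, assuming illustrative access times of 10 ns for the SRAM cache and 60 ns for DRAM (neither figure is from the text):

#include <stdio.h>

/* Average access time for a cached memory system:
 * t_avg = hit_ratio * t_cache + (1 - hit_ratio) * t_dram */
int main(void)
{
    double hit_ratio = 0.95;   /* the 95% figure quoted above */
    double t_cache = 10.0;     /* ns, fast SRAM cache (illustrative) */
    double t_dram  = 60.0;     /* ns, ordinary DRAM (illustrative) */

    double t_avg = hit_ratio * t_cache + (1.0 - hit_ratio) * t_dram;
    printf("average access time: %.1f ns\n", t_avg);   /* prints 12.5 ns */
    return 0;
}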
ROM devices are frustratingly slow, with access times often exceeding several hundred nanoseconds. ROM access therefore requires a large number of wait states, which slows the system’s performance. The problem is compounded because the routines stored in BIOS ROM are some of the most frequently accessed memory in your computer. Beginning with i386-class computers, some designs used a memory technique called shadowing: ROM contents are loaded into an area of fast RAM during system initialization, and the computer then maps that fast RAM into the memory locations used by the ROM devices. Whenever ROM routines must be accessed during run time, the information is taken from the “shadowed” copy instead of the actual ROM IC, improving ROM performance by at least 300%. Shadow memory is also useful for ROM devices that do not use the full available data-bus width. For example, a 16-bit computer might hold an expansion board that contains an 8-bit ROM IC; the system would have to access the ROM not once but twice to extract a single 16-bit word. If the computer is a 32-bit machine, that 8-bit ROM would have to be addressed four times to assemble a complete 32-bit word. Loading the ROM into shadow memory in advance virtually eliminates such delays. Shadowing can usually be turned on or off through the system’s CMOS Setup routines.
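Here is a small model of why shadowing pays off on a wide bus: an 8-bit ROM must be read four times to build one 32-bit word, while the shadow copy in RAM delivers the word in a single wide access. The array sizes and contents are hypothetical:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ROM_SIZE 64

static uint8_t  rom[ROM_SIZE];        /* slow 8-bit-wide ROM */
static uint32_t shadow[ROM_SIZE / 4]; /* fast 32-bit-wide shadow RAM */

/* Assembling one 32-bit word costs four separate 8-bit ROM cycles. */
static uint32_t read_rom_32(unsigned word)
{
    unsigned base = word * 4;
    return (uint32_t)rom[base]
         | (uint32_t)rom[base + 1] << 8
         | (uint32_t)rom[base + 2] << 16
         | (uint32_t)rom[base + 3] << 24;
}

int main(void)
{
    memcpy(rom, "BIOS", 4);                   /* pretend ROM contents */
    /* Shadowing at initialization: copy the ROM into RAM once... */
    for (unsigned w = 0; w < ROM_SIZE / 4; w++)
        shadow[w] = read_rom_32(w);
    /* ...so run-time reads take one fast cycle instead of four slow ones. */
    printf("word 0 via ROM:    0x%08X (4 narrow reads)\n", read_rom_32(0));
    printf("word 0 via shadow: 0x%08X (1 wide read)\n", shadow[0]);
    return 0;
}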
All memory is rated in terms of speed, specifically access time. Access time is the delay from the moment data in memory is successfully addressed to the point at which the data has been successfully delivered to the data bus. A wait state orders the CPU to pause for one clock cycle to give memory additional time to operate. Typical PCs use one wait state, although very old systems might require two or three; the latest PC designs, with caching, might be able to operate with no (zero) wait states. More wait states mean lower system performance, and zero wait states allow optimum performance. Wait states are selected in several ways: the number might be fixed in the motherboard design, or selected with one or more jumpers on the motherboard. Current systems, such as i486 and Pentium computers, place wait-state control in the CMOS Setup routine; you might have to look in an “advanced settings” area to find the entry. When optimizing a computer, be sure to set the minimum number of wait states.
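The relationship between access time, clock speed, and wait states can be worked out directly. A sketch assuming a 66-MHz bus and 60-ns DRAM (both figures are illustrative):

#include <stdio.h>
#include <math.h>

/* Illustrative wait-state arithmetic: memory must respond within the CPU's
 * bus cycle, so the wait states needed are however many extra clocks it
 * takes to cover the access time beyond the one clock a zero-wait cycle
 * already provides. */
int main(void)
{
    double clock_mhz = 66.0;
    double clock_ns  = 1000.0 / clock_mhz;   /* ~15 ns per clock */
    double access_ns = 60.0;                 /* typical DRAM access time */

    int wait_states = (int)ceil(access_ns / clock_ns) - 1;
    printf("%.0f ns DRAM on a %.0f MHz bus needs %d wait states\n",
           access_ns, clock_mhz, wait_states);   /* prints 3 */
    return 0;
}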
It is often necessary to check SIMMs or DIMMs for proper memory speed during troubleshooting, or when selecting replacement parts. Unfortunately, it can be very difficult to determine memory speed accurately from part markings. Speeds are normally marked cryptically by adding a number to the end of the part number; for example, a part number ending in -6 often means 60 ns. Still, the only way to be absolutely certain of the memory speed is to cross-reference the memory part number with a manufacturer’s catalog and read the speed from the catalog’s description.

PRESENCE DETECT (PD)

Another feature of modern memory devices is a series of signals known as the presence-detect (PD) lines. By setting the appropriate conditions on the PD signals, a computer can immediately recognize the characteristics of the installed memory devices and configure itself accordingly. Presence-detect lines typically specify three operating characteristics of memory: size, device layout, and speed.
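As a rough illustration of how PD lines might be read at power-on, consider the sketch below. The 2-bit speed encoding is invented purely for the example; the real pin assignments and encodings are fixed by the module standard and vary by module type:

#include <stdio.h>

/* Hypothetical presence-detect decoding: two speed lines sampled at
 * power-on map to an access-time rating. The encoding is illustrative. */
static int decode_speed_ns(unsigned pd_bits)
{
    switch (pd_bits & 0x3) {
    case 0x0: return 80;
    case 0x1: return 70;
    case 0x2: return 60;
    default:  return 50;
    }
}

int main(void)
{
    printf("PD speed code 2 -> %d ns module\n", decode_speed_ns(2));
    return 0;
}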